Question about the inconsistent length of train_dataset and sample_weights #4

@Sapphire-356

Hello,

When I tried to reproduce the results for "domainnet real-clipart" using your code, I found an inconsistency in methods/ours/mi_dis_pm.py: len(train_dataset) and len(sample_weights) are not equal — they are 189818 and 187226, respectively.

I suspect this mismatch comes from a difference in the subsampling result produced by this line:

subsample_indices2 = subsample_balanced_instances(deepcopy(train_dataset2), prop_indices_to_subsample=0.3)

I carefully checked my DomainNet dataset against the one (Original version) provided at https://ai.bu.edu/M3SDA/, but I did not find any differences.

The number of images in my DomainNet dataset is:

| | Clipart | Infograph | Painting | Quickdraw | Real | Sketch |
| --- | --- | --- | --- | --- | --- | --- |
| Num. | 48,834 | 53,201 | 75,759 | 172,500 | 175,327 | 70,386 |

Could you kindly provide your subsample_indices2, so that I can properly reproduce your results?

Here are the results on several other datasets:

| Dataset | len(train_dataset) | len(sample_weights) |
| --- | --- | --- |
| cubc | 266724 | 266724 |
| fgvcc | 156652 | 156652 |
| scarsc | 369989 | 369989 |
| domainnet real-clipart | 189818 | 187226 |
| domainnet real-sketch | 196284 | 193524 |
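
To make the size of each gap explicit, here is a minimal sketch that recomputes the difference for every row. The dictionary only restates the numbers from the table above, and `length_gaps` is a hypothetical helper written for this illustration, not a function from the repository.

```python
# Lengths reported in the table above, as
# (len(train_dataset), len(sample_weights)) per dataset.
reported = {
    "cubc": (266724, 266724),
    "fgvcc": (156652, 156652),
    "scarsc": (369989, 369989),
    "domainnet real-clipart": (189818, 187226),
    "domainnet real-sketch": (196284, 193524),
}


def length_gaps(lengths):
    """Return len(train_dataset) - len(sample_weights) for each dataset.

    Hypothetical helper for illustration only.
    """
    return {
        name: n_data - n_weights
        for name, (n_data, n_weights) in lengths.items()
    }


gaps = length_gaps(reported)
for name, gap in gaps.items():
    status = "OK" if gap == 0 else f"mismatch of {gap} samples"
    print(f"{name}: {status}")
```

Only the two DomainNet settings differ: real-clipart is off by 2592 samples and real-sketch by 2760, while the other three datasets match exactly.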

Thank you in advance!
