Question about the inconsistent length of `train_dataset` and `sample_weights`


Hello,

When I tried to reproduce the results for "domainnet real-clipart" using your code, I found an inconsistency in `methods/ours/mi_dis_pm.py`: `len(train_dataset)` and `len(sample_weights)` are not equal — they are `189818` and `187226`, respectively.

I suspect this mismatch comes from a difference in the subsampling result:

```
subsample_indices2 = subsample_balanced_instances(deepcopy(train_dataset2), prop_indices_to_subsample=0.3)
```


I carefully checked my DomainNet dataset against the one (`Original` version) provided at https://ai.bu.edu/M3SDA/, but I did not find any differences.

The number of images in my DomainNet dataset is:

|  | Clipart | Infograph | Painting | Quickdraw | Real | Sketch |
| -- | -- | -- | -- | -- | -- | -- |
| Num. | 48,834 | 53,201 | 75,759 | 172,500 | 175,327 | 70,386 |
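
For reference, counts like these can be reproduced with a short script along the following lines. This is only a minimal sketch: it assumes the `<root>/<domain>/<class>/<image>` folder layout, and `count_images_per_domain` and `IMAGE_SUFFIXES` are names I made up for illustration; they are not part of the repository.

```python
from collections import Counter
from pathlib import Path

# Assumption: images are stored as .jpg/.jpeg/.png files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images_per_domain(root):
    """Count image files under each top-level domain folder of `root`.

    Assumes a layout of <root>/<domain>/<class>/<image>; files that sit
    directly in `root` (e.g. split lists) are ignored.
    """
    counts = Counter()
    for domain_dir in sorted(Path(root).iterdir()):
        if not domain_dir.is_dir():
            continue
        # Walk the domain folder recursively and keep only image files.
        counts[domain_dir.name] = sum(
            1
            for path in domain_dir.rglob("*")
            if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES
        )
    return dict(counts)
```

Calling `count_images_per_domain("path/to/domainnet")` (the path is a placeholder) returns a dict mapping each domain folder name to its image count, which can be compared against the table above.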




Could you kindly provide your `subsample_indices2` so that I can properly reproduce your results?


Here are the lengths I observed on several datasets; only the two DomainNet settings show a mismatch:

|  | `len(train_dataset)` | `len(sample_weights)` |
| -- | -- | -- |
| cubc | 266724 | 266724 |
| fgvcc | 156652 | 156652 |
| scarsc | 369989 | 369989 |
| domainnet real-clipart | 189818 | 187226 |
| domainnet real-sketch | 196284 | 193524 |
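
To make the size of each mismatch explicit, here is a small sketch that computes the gap from the numbers in the table. `length_gaps` is a hypothetical helper written only for this post, not something from the repository.

```python
def length_gaps(lengths):
    """Return len(train_dataset) - len(sample_weights) for mismatched entries.

    `lengths` maps a dataset name to a (len(train_dataset), len(sample_weights)) pair.
    """
    return {
        name: n_data - n_weights
        for name, (n_data, n_weights) in lengths.items()
        if n_data != n_weights
    }


# Values copied from the table above.
lengths = {
    "cubc": (266724, 266724),
    "fgvcc": (156652, 156652),
    "scarsc": (369989, 369989),
    "domainnet real-clipart": (189818, 187226),
    "domainnet real-sketch": (196284, 193524),
}

gaps = length_gaps(lengths)
print(gaps)
# {'domainnet real-clipart': 2592, 'domainnet real-sketch': 2760}
```

So `train_dataset` is longer than `sample_weights` by 2592 samples for real-clipart and by 2760 samples for real-sketch, while the other three datasets match exactly.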


Thank you in advance!

