The test script https://github.com/trueagi-io/minecraft-experiments/blob/main/vla/benchmarks/block_type_classification/los_block_classifier.py
contains a lot of model-specific imports and code, and it is not always clear whether it is model-specific or general-purpose, e.g.
from extract_features import extract_dino_features_center_crop, crop_center_features, extract_full_features, base_transform
If these are model-specific functions, they should be encapsulated into the model itself, which should just take an image and produce features. Otherwise, they should be separated from the model example.
I'm not sure what TRANSFORM = transforms.Compose(...) is doing in this script. Say, normalization might not be needed by another model. And I don't see where TRANSFORM is used. It's confusing.
Besides importing all those functions, there is also a class
class Dinov2Features:
def encode(img):
return extract_full_features(img)
It seems that extract_full_features is only used by it, and extract_dino_features_center_crop, crop_center_features are not used. Possibly, they are supposed to replace extract_full_features in Dinov2Features for experiments.
I believe, Dinov2Features should be defined in the model class - not in the training script. All these functions should not be imported, and should be selected via Dinov2Features constructor.
I also see that base_transform (which is similar to TRANSFORM) from the model file is passed to BlockTypeDatas. It may make sense to perform image transformation at load time to save memory, and to make the preprocessing model-specific. But base_transform sounds a little confusing. Preprocessor or ImageProcessor (or just Processor) would be more conventional (e.g., transformers have AutoProcessor class for such purposes). If we suppose that the model and the processor are created in the model file to be tested, then we can agree on the following format:
from my_model import model, preprocessor
and nothing else model-specific. Then, changing my_model would be enough to test another model.
The test script https://github.com/trueagi-io/minecraft-experiments/blob/main/vla/benchmarks/block_type_classification/los_block_classifier.py
contains a lot of model-specific imports and code, and it is not always clear whether it is model-specific or general-purpose, e.g.
from extract_features import extract_dino_features_center_crop, crop_center_features, extract_full_features, base_transformIf these are model-specific functions, they should be encapsulated into the model itself, which should just take an image and produce features. Otherwise, they should be separated from the model example.
I'm not sure what
TRANSFORM = transforms.Compose(...)is doing in this script. Say, normalization might not be needed by another model. And I don't see whereTRANSFORMis used. It's confusing.Besides importing all those functions, there is also a class
It seems that
extract_full_featuresis only used by it, andextract_dino_features_center_crop,crop_center_featuresare not used. Possibly, they are supposed to replaceextract_full_featuresinDinov2Featuresfor experiments.I believe,
Dinov2Featuresshould be defined in the model class - not in the training script. All these functions should not be imported, and should be selected viaDinov2Featuresconstructor.I also see that
base_transform(which is similar toTRANSFORM) from the model file is passed toBlockTypeDatas. It may make sense to perform image transformation at load time to save memory, and to make the preprocessing model-specific. Butbase_transformsounds a little confusing.PreprocessororImageProcessor(or justProcessor) would be more conventional (e.g.,transformershaveAutoProcessorclass for such purposes). If we suppose that the model and the processor are created in the model file to be tested, then we can agree on the following format:and nothing else model-specific. Then, changing
my_modelwould be enough to test another model.