Hello,
We are using 'from tabpfn_extensions.embedding import TabPFNEmbedding' to get embeddings, as follows:
train_embeddings = embedding_extractor.get_embeddings(X_train, y_train, X_train, data_source='train')
test_embeddings = embedding_extractor.get_embeddings(X_train, y_train, X_test, data_source='test')
I have a question about what data_source is and how it works in the background. I could not work out its purpose or behavior from the source code or the documentation.
Also, if I have n_estimators = 8, is there a recommended approach for combining the 8 different embedding datasets into a single dataset for further analysis? In your examples, you use n_estimators = 1 to bypass this problem.
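To make the second question concrete, the two obvious candidates seem to be averaging across estimators or concatenating them along the feature axis. The sketch below is only an illustration: it assumes the returned array has shape (n_estimators, n_samples, embedding_dim), which I have not confirmed, and it uses random stand-in data rather than real TabPFN output. The helper names combine_mean and combine_concat are hypothetical and not part of tabpfn_extensions.

```python
import numpy as np


def combine_mean(emb):
    """Average over the estimator axis -> (n_samples, embedding_dim)."""
    return emb.mean(axis=0)


def combine_concat(emb):
    """Place each estimator's embedding side by side per sample
    -> (n_samples, n_estimators * embedding_dim)."""
    n_estimators, n_samples, dim = emb.shape
    # Move the sample axis first so each row holds all estimators for one sample.
    return emb.transpose(1, 0, 2).reshape(n_samples, n_estimators * dim)


# Stand-in for the get_embeddings output. ASSUMED shape:
# (n_estimators=8, n_samples=5, embedding_dim=4); the values are random.
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(8, 5, 4))

mean_emb = combine_mean(fake_embeddings)      # shape (5, 4)
concat_emb = combine_concat(fake_embeddings)  # shape (5, 32)
```

Averaging keeps the dimensionality fixed but may blur differences between estimators, while concatenation preserves everything at the cost of an 8x wider feature matrix. Is either of these the recommended approach, or is there a better one?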