Reads filtration change seq length + RF model update #5

Open
TsabarM wants to merge 363 commits into controlled_shuffles from reads_filtration_change_seq_length

Conversation

@TsabarM TsabarM commented Sep 30, 2020

No description provided.

@TsabarM TsabarM changed the base branch from master to controlled_shuffles September 30, 2020 09:42
Comment thread IgOmeProfiling_pipeline.py Outdated
  module_parameters = [fastq_path, first_phase_output_path, first_phase_logs_path,
  barcode2samplename_path, left_construct, right_construct,
- max_mismatches_allowed, min_sequencing_quality, first_phase_done_path,
+ max_mismatches_allowed, min_sequencing_quality, minimal_length_required,first_phase_done_path,
Member

This does not match the definition in read_filtration/module_wrapper.py.
There you define it as a named parameter (it starts with --), but here you pass it as a positional parameter.
The order is also wrong: the minimal length ends up in the position where the done path is expected.

To summarize, this change is wrong and doesn't work.
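
A minimal sketch of the mismatch described above, assuming the wrapper parses its arguments with `argparse`. The flag name `--minimal_length_required`, its default, and the helper names are hypothetical; the actual definition in read_filtration/module_wrapper.py is not shown in this thread. The point is only that a named option has to be passed as a flag/value pair, while the done path stays positional.

```python
import argparse


def build_parser():
    # Hypothetical stand-in for the wrapper's argument definitions.
    parser = argparse.ArgumentParser(description='read filtration wrapper (sketch)')
    # Positional parameter: identified purely by its position on the command line.
    parser.add_argument('done_file_path', help='path of the file that marks the phase as done')
    # Named parameter: must be passed as "--minimal_length_required <value>".
    parser.add_argument('--minimal_length_required', type=int, default=3,
                        help='hypothetical default; shortest sequence length to keep')
    return parser


def build_module_parameters(done_path, minimal_length_required):
    # Caller side: positionals first, then the named option as a flag/value pair.
    return [done_path, '--minimal_length_required', str(minimal_length_required)]


args = build_parser().parse_args(build_module_parameters('first_phase_done.txt', 5))
print(args.done_file_path, args.minimal_length_required)
```

Passing the length as a bare value in front of the done path would instead shift every positional after it by one slot, which is the failure the comment points out.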

Comment thread model_fitting/random_forest.py Outdated

def get_hyperparameters_grid(seed):
# Number of trees in random forest
n_estimators = [int(x) for x in np.linspace(start=100, stop=2000, num=20)]
Member

Set these parameters via command-line arguments; they shouldn't be hardcoded.
This remark applies to the entire file, not just this line.
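
One way this could look, as a sketch only: the grid bounds come from `argparse` options whose defaults reproduce the currently hardcoded values (100 to 2000 in 20 steps). The option names and the changed signature of `get_hyperparameters_grid` are assumptions, and the pure-Python spacing below stands in for `np.linspace` so the example has no dependencies.

```python
import argparse


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description='random forest fitting (sketch)')
    # Hypothetical option names; defaults mirror the values hardcoded in the diff.
    parser.add_argument('--n_estimators_start', type=int, default=100)
    parser.add_argument('--n_estimators_stop', type=int, default=2000)
    parser.add_argument('--n_estimators_num', type=int, default=20)
    return parser.parse_args(argv)


def get_hyperparameters_grid(args):
    start, stop, num = args.n_estimators_start, args.n_estimators_stop, args.n_estimators_num
    # Evenly spaced values from start to stop inclusive (stand-in for np.linspace).
    step = (stop - start) / (num - 1) if num > 1 else 0
    n_estimators = [round(start + i * step) for i in range(num)]
    return {'n_estimators': n_estimators}


grid = get_hyperparameters_grid(parse_args([]))
print(grid['n_estimators'])
```

The remaining hyperparameters in the file would follow the same pattern, each with its own option and a default equal to the current literal.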

Comment thread model_fitting/random_forest.py Outdated
for i in range(num_of_configurations_to_sample):
configuration = {}
for key in hyperparameters_grid:
configuration[key] = np.random.choice(hyperparameters_grid[key], size=1)[0]
Member

Is this seeded? Would we get the same results on every experiment run?
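
A sketch of what seeded sampling could look like, so that the same seed always yields the same configurations. It uses the standard library's `random.Random` to stay dependency-free; the code under review calls `np.random.choice`, where the analogous change would be to draw from a generator created with `np.random.default_rng(seed)`. The function name and the example grid are hypothetical.

```python
import random


def sample_configurations(hyperparameters_grid, num_of_configurations_to_sample, seed):
    # A local, seeded generator: results depend only on `seed`,
    # not on whatever global random state other code has touched.
    rng = random.Random(seed)
    configurations = []
    for _ in range(num_of_configurations_to_sample):
        configuration = {key: rng.choice(values)
                         for key, values in hyperparameters_grid.items()}
        configurations.append(configuration)
    return configurations


# Hypothetical grid, for illustration only.
example_grid = {'n_estimators': [100, 200, 300], 'max_depth': [10, 20, None]}
first_run = sample_configurations(example_grid, 5, seed=42)
second_run = sample_configurations(example_grid, 5, seed=42)
print(first_run == second_run)
```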

Comment thread model_fitting/random_forest.py Outdated
data.drop(['sample_name', 'label'], axis=1, inplace=True)
# a matrix of the actual feature values
X_train = data[train_rows_mask].values
X_test = data[test_rows_mask].values
Member

Neither X_test nor Y_test is used anywhere in the (modified) code.

yael1994 and others added 30 commits May 3, 2022 12:19
change the path of the wsl tutorial
add new script for summary reads in one csv file
The motif samples were sorted by sort_by_num_samples, sort_by_unique_memebers, sort_by_cluster_size.

Now the order is sort_by_num_samples, sort_by_cluster_size, sort_by_unique_memebers,
with unique members going from low to high.
changed the order of the samples
fixed bug of biological condition type value

4 participants