Update evaluation logic for dashboard support #62
Open
prateekdesai04 wants to merge 1 commit into autogluon:master from
Conversation
tonyhoo approved these changes on Oct 23, 2023
Innixma reviewed on Oct 23, 2023
Comment on lines +160 to +168

    dataframes = []
    for path in paths:
        path = path if is_s3_url(path) else os.path.join(self.results_dir_input, path)
        dataframe = pd.read_csv(path)
        dataframes.append(dataframe)
    # Discarding extra folds
    min_num_rows = min(len(df) for df in dataframes)
    trimmed_dataframes = [df[:min_num_rows] for df in dataframes]
    return pd.concat(trimmed_dataframes, ignore_index=True, sort=True)

Contributor
This will not discard extra folds properly. Please add a unit test and separate out the filtering logic so it is not hard-coded into the load_results_raw method.
- Not all DataFrames loaded will have the same number of methods or datasets, so trimming by length of rows will not work.
- We don't want to always filter extra folds. This should be a post-load operation that is optional.
- You are assuming the input file is sorted by fold. This is not a valid assumption.
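
One way the three points above could be addressed is a standalone, optional post-load filter that works on fold values rather than row positions. The following is only a minimal sketch, not the project's implementation: the function name `filter_to_common_folds` and the column names `dataset`, `framework`, and `fold` are assumptions for illustration, and the sample rows are made up.

```python
import pandas as pd


def filter_to_common_folds(results, dataset_col="dataset",
                           framework_col="framework", fold_col="fold"):
    """Keep only (dataset, fold) pairs reported by every framework on that dataset.

    Hypothetical helper; the column names are assumptions, not the repo's schema.
    """
    # Number of distinct frameworks that report anything for each dataset.
    n_frameworks = results.groupby(dataset_col)[framework_col].transform("nunique")
    # Number of distinct frameworks that report this particular (dataset, fold).
    n_with_fold = results.groupby([dataset_col, fold_col])[framework_col].transform("nunique")
    # A fold survives only if all frameworks on that dataset ran it.
    return results[n_with_fold == n_frameworks].reset_index(drop=True)


# Made-up, deliberately unsorted rows: framework X ran folds 0-2 on dataset A,
# framework Y ran only folds 0-1; both ran fold 0 on dataset B.
raw = pd.DataFrame({
    "dataset":   ["A", "A", "A", "B", "A", "A", "B"],
    "framework": ["X", "X", "X", "X", "Y", "Y", "Y"],
    "fold":      [2, 0, 1, 0, 1, 0, 0],
})
common = filter_to_common_folds(raw)
```

Because the filter matches on the fold value within each dataset, it does not depend on row order or on every file covering the same datasets, and callers that want all folds can simply skip it. A unit test could assert on small frames like `raw` above.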
suzhoum reviewed on Oct 23, 2023

        dataframe = pd.read_csv(path)
        dataframes.append(dataframe)
    # Discarding extra folds
    min_num_rows = min(len(df) for df in dataframes)

Collaborator
What if there are multiple datasets in a results file? min() will not do what is intended, right?
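
A small counterexample may make this concern concrete. The sketch below reuses the trimming expression from the diff on two made-up frames; the column names `dataset`, `framework`, and `fold` are assumptions for illustration.

```python
import pandas as pd

# Made-up results: framework X ran 2 folds on two datasets (4 rows),
# framework Y ran 3 folds on one dataset (3 rows).
frame_x = pd.DataFrame({
    "dataset":   ["A", "A", "B", "B"],
    "framework": ["X", "X", "X", "X"],
    "fold":      [0, 1, 0, 1],
})
frame_y = pd.DataFrame({
    "dataset":   ["A", "A", "A"],
    "framework": ["Y", "Y", "Y"],
    "fold":      [0, 1, 2],
})
frames = [frame_x, frame_y]

# Same trimming logic as in the diff: cut every frame to the shortest row count.
min_num_rows = min(len(df) for df in frames)
trimmed = pd.concat([df[:min_num_rows] for df in frames], ignore_index=True, sort=True)

# Y's extra fold 2 on dataset A survives the trim...
extra_fold_kept = ((trimmed["framework"] == "Y") & (trimmed["fold"] == 2)).any()
# ...while X's legitimate fold 1 on dataset B is silently dropped.
valid_row_lost = not ((trimmed["dataset"] == "B") & (trimmed["fold"] == 1)).any()
```

Here the row count says nothing about the number of folds, so the trim both keeps an extra fold and discards a valid result.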
Description of changes:
This PR handles the case where multiple cleaned CSVs that were run on different numbers of folds are evaluated together.
Previously, evaluation was only possible if all of them used the same number of folds.
This change sets the number of folds to the minimum across all the cleaned CSVs being evaluated.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.