Matminer2023 preset does not include all 2023 Matminer features #138

@gbrunin


Hi,

I have already discussed this with @ml-evs and @ppdebreuck, but raising it here seems better for future reference.
The very new Matminer2023Featurizer preset updates the previous one (DeBreuck2020Featurizer): it keeps the same features but adapts their use to the current Matminer version. From a user's perspective, I believe a preset with this name should include all the features available in Matminer, not only those present in DeBreuck2020.

  • The first possibility is to simply augment Matminer2023Featurizer with the more recent features. This would not break previously created models, since all previous features would still be included. While working on this, I realized that it's not that simple: for instance, the WenAlloys featurizer was not present before and adds a few interesting features. However, it fully includes the features from YangSolidSolution, which could then be removed from the preset, and removing them would break existing models. Of course, it's still possible to select individual features from each featurizer, but that would not be very clean.
  • The second possibility is to update Matminer2023Featurizer to include all 2023 Matminer featurizers and not care (or care less) that this could break things. After all, this new preset is only 1-2 months old.
  • Finally, a new preset could be created with all of this. The only problem I see is how clear the preset names would be for a user, which I think is an important aspect.
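
To make the compatibility concern in the first two options concrete, here is a minimal sketch of the check involved: a saved model keeps working only if every feature it was trained on is still produced by the preset. The column names are hypothetical placeholders, not the actual Matminer or MODNet labels, and `missing_features` is an illustrative helper, not part of either library.

```python
# Hypothetical feature columns, for illustration only; the real presets
# produce many more columns, with different names.
OLD_PRESET = {
    "ElementProperty|mean_X",
    "YangSolidSolution|omega",
    "YangSolidSolution|delta",
}


def missing_features(trained_on, preset_columns):
    """Return the features a saved model expects but the preset no longer produces."""
    return sorted(set(trained_on) - set(preset_columns))


# Augmenting: keep every old column and add the (assumed) WenAlloys ones.
augmented = OLD_PRESET | {"WenAlloys|omega", "WenAlloys|delta", "WenAlloys|VEC"}

# De-duplicating: drop YangSolidSolution because WenAlloys covers the same quantities.
deduplicated = {c for c in augmented if not c.startswith("YangSolidSolution|")}

# A model trained on OLD_PRESET still finds all its inputs in the augmented preset...
print(missing_features(OLD_PRESET, augmented))
# ...but not once the redundant featurizer is removed.
print(missing_features(OLD_PRESET, deduplicated))
```

Under these assumptions, augmenting is safe for existing models, while removing the redundant YangSolidSolution columns is what breaks them, even though the same quantities remain available under the WenAlloys names.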

Note that, whichever option is chosen, I'm also adding the option to keep only continuous features for composition features, as mentioned in #137. For now I'm using a separate preset (gbrunin#1), but I'm actually in favor of updating Matminer2023Featurizer. This specific part should not break anything, though.

Cheers!
