Conversation
Summary of ChangesHello @michaelteboul, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request expands the available basket recommendation models by introducing the Item2vec/Prod2vec model, enhancing the system's capability to generate item embeddings and predict purchases. Concurrently, an existing self-attention model has been refactored with a more concise name, improving clarity and organization within the model pool. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces the Item2vec/Prod2vec model and renames SelfAttentionModel to AttRec. The new Item2vec model implementation appears to be a mix of new logic and code copied from other models, which has resulted in several critical issues. There are uninitialized attributes that will cause AttributeError at runtime, a potential crash from a nullable parameter that isn't checked, and significant amounts of dead or confusing code. My review focuses on fixing these critical bugs and improving code clarity by removing unused logic and correcting documentation.
| momentum: float = 0.0, | ||
| epsilon_price: float = 1e-5, | ||
| l2_regularization: float = 0.0, | ||
| item_counts: list[int] = None, |
There was a problem hiding this comment.
| if n_stores == 0 and self.price_effects: | ||
| # To take into account the price effects, the number of stores must be > 0 | ||
| # to have a delta embedding | ||
| # (By default, the store id is 0) | ||
| n_stores = 1 |
There was a problem hiding this comment.
self.price_effects is used here but is not initialized in __init__, which will cause an AttributeError. Since price effects are not implemented in this model, this check is incorrect. The logic should be adjusted to ensure n_stores is at least 1, as it's needed for the theta embedding.
| if n_stores == 0 and self.price_effects: | |
| # To take into account the price effects, the number of stores must be > 0 | |
| # to have a delta embedding | |
| # (By default, the store id is 0) | |
| n_stores = 1 | |
| if n_stores == 0: | |
| # To have a theta embedding, the number of stores must be > 0 | |
| # (By default, the store id is 0) | |
| n_stores = 1 |
| if self.item_intercept: | ||
| weights.append(self.alpha) | ||
| if self.price_effects: | ||
| weights.extend([self.beta, self.delta]) | ||
|
|
||
| if self.seasonal_effects: | ||
| weights.extend([self.mu, self.nu]) | ||
|
|
There was a problem hiding this comment.
This code accesses uninitialized attributes self.item_intercept, self.price_effects, and self.seasonal_effects, which will cause an AttributeError. The weights these blocks would add (alpha, beta, etc.) are also not defined. Since these features are not part of the model, this logic should be removed.
return weights| item_intercept: bool, optional | ||
| Whether to include item intercept in the model, by default True | ||
| Corresponds to the item intercept | ||
| price_effects: bool, optional | ||
| Whether to include price effects in the model, by default True | ||
| seasonal_effects: bool, optional | ||
| Whether to include seasonal effects in the model, by default False |
There was a problem hiding this comment.
The parameters item_intercept, price_effects, and seasonal_effects are mentioned in the docstring but are not part of the function signature, nor are they implemented in the model. This is misleading and contributes to AttributeErrors in other methods. Please remove this part of the docstring to align with the model's actual implementation.
| by default {"preferences": 4, "price": 4, "season": 4} | ||
| n_negative_samples: int, optional | ||
| Number of negative samples to draw for each positive sample for the training, | ||
| by default 2 | ||
| Must be > 0 | ||
| optimizer: str, optional | ||
| Optimizer to use for training, by default "adam" | ||
| callbacks: tf.keras.callbacks.Callbacklist, optional | ||
| List of callbacks to add to model.fit, by default None and only add History | ||
| lr: float, optional | ||
| Learning rate, by default 1e-3 | ||
| epochs: int, optional | ||
| Number of epochs, by default 100 |
There was a problem hiding this comment.
The docstrings for latent_sizes and epochs are inconsistent with their default values in the function signature.
latent_sizes: The docstring default is{"preferences": 4, "price": 4, "season": 4}, but the code default is{"preferences": 4}.epochs: The docstring default is100, but the code default is10.
Please update the docstrings to match the code.
| price_batch: np.ndarray | ||
| Batch of prices (floats) for each purchased item | ||
| Shape must be (batch_size,) |
There was a problem hiding this comment.
The docstring for price_batch states its shape is (batch_size,). However, based on its usage (e.g., line 492) and how batches are generated, the shape is actually (batch_size, n_items). Please correct the docstring here and in compute_batch_utility.
price_batch: np.ndarray
Batch of prices for all items, for each purchased item.
Shape must be (batch_size, n_items)|
|
||
| def __init__( | ||
| self, | ||
| latent_sizes: dict[str] = {"preferences": 4}, |
There was a problem hiding this comment.
| n_items: int, | ||
| n_stores: int = 0, | ||
| ) -> None: | ||
| """Instantiate the Shopper model. |
| # --- NOUVEAU CODE (VECTORISÉ) --- | ||
| negative_samples = self.get_negative_samples_sgns( | ||
| item_batch=item_batch, basket_batch=basket_batch, n_samples=self.n_negative_samples | ||
| ) # Shape: (batch_size, n_negative_samples) | ||
| # ---------------------------------------------- |
There was a problem hiding this comment.
| # ----------------Loss computation----------------# | ||
| bce = tf.keras.backend.binary_crossentropy( | ||
| # Target: 1 for positive samples, 0 for negative samples | ||
| target=tf.concat( | ||
| [ | ||
| tf.ones_like(positive_samples_utilities), | ||
| tf.zeros_like(negative_samples_utilities), | ||
| ], | ||
| axis=1, | ||
| ), | ||
| output=tf.nn.sigmoid(all_utilities), | ||
| ) # Shape: (batch_size * (n_negative_samples + 1),) | ||
|
|
||
| # --------------------------------------------------# |
There was a problem hiding this comment.
These comments appear to be temporary developer notes and should be removed from the final code.
) # Shape of loglikelihood: (1,))
bce = tf.keras.backend.binary_crossentropy(
# Target: 1 for positive samples, 0 for negative samples
target=tf.concat(
[
tf.ones_like(positive_samples_utilities),
tf.zeros_like(negative_samples_utilities),
],
axis=1,
),
output=tf.nn.sigmoid(all_utilities),
) # Shape: (batch_size * (n_negative_samples + 1),)
Coverage Report for Python 3.11
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Coverage Report for Python 3.9
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Coverage Report for Python 3.12
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Description of the goal of the PR
Description:
Changes this PR introduces (fill it before implementation)