
Conversation

@jvachier (Owner) commented May 4, 2025

Describe your changes

Provide a clear and concise description of the changes made in this pull request. Include any relevant context or background information.

  • What is the purpose of this pull request?
  • What problem does it solve?
  • What functionality does it add or improve?

Issue ticket number and link

Type of Change

Check the type of change your pull request introduces:

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Refactoring (code improvements without changing functionality)

Checklist before requesting a review

Before submitting your pull request, ensure the following:

  • I have performed a self-review of my code.
  • I have added tests that prove my fix is effective or that my feature works.
  • I have added or updated relevant documentation (if applicable).
  • My changes do not introduce any new warnings or errors.
  • I have checked for security vulnerabilities in the code.
  • I have ensured backward compatibility (if applicable).

@jvachier jvachier self-assigned this May 4, 2025
@jvachier jvachier requested a review from Copilot May 4, 2025 11:15

Copilot AI left a comment


Pull Request Overview

This pull request introduces a new transformer model with hyperparameter tuning using Optuna, alongside relevant documentation updates.

  • Adds a new module for building and tuning a transformer model.
  • Implements data processing, model construction, and hyperparameter optimization logic.
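
For readers unfamiliar with the pattern, Optuna-based tuning typically wraps model construction and training in an objective function that the study calls once per trial. A minimal self-contained sketch of that pattern follows; the toy model, data, and search ranges are illustrative, not taken from optuna_transformer.py:

```python
import numpy as np
import optuna
import tensorflow as tf

# Toy data so the sketch runs standalone; the real module tunes the
# French-to-English transformer on the translation dataset instead.
x = np.random.randint(0, 100, size=(256, 10))
y = np.random.randint(0, 100, size=(256, 10))

def objective(trial: optuna.Trial) -> float:
    # Search spaces here are illustrative, not the ones in optuna_transformer.py.
    embed_dim = trial.suggest_categorical("embed_dim", [32, 64, 128])
    dropout_rate = trial.suggest_float("dropout_rate", 0.1, 0.5)

    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(100, embed_dim),
        tf.keras.layers.Dropout(dropout_rate),
        tf.keras.layers.Dense(100, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    history = model.fit(x, y, epochs=1, validation_split=0.2, verbose=0)
    # The study maximizes this value because its direction is "maximize".
    return history.history["val_accuracy"][-1]

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5)
print(study.best_params)
```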

```python
# Train the model
with tf.device("/GPU:0"):
```

Copilot AI May 4, 2025


Consider checking for GPU availability before explicitly setting the device to '/GPU:0' to ensure compatibility on systems without a GPU. This can help avoid potential runtime errors on CPU-only machines.

Suggested change:

```diff
-with tf.device("/GPU:0"):
+device = "/GPU:0" if tf.config.list_physical_devices('GPU') else "/CPU:0"
+with tf.device(device):
```
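As a design note, tf.config.list_physical_devices('GPU') returns an empty list when no GPU is visible, so the ternary quietly falls back to the CPU device and the training script stays portable to CPU-only machines.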

Comment on lines 43 to 50
```python
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)

decoder_inputs = tf.keras.Input(shape=(None,), dtype="int32", name="french")
x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)
x = tf.keras.layers.Dropout(dropout_rate)(x)
decoder_outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(x)
```

Copilot AI May 4, 2025


[nitpick] Consider using a more descriptive variable name instead of 'x' to improve code readability, especially within the transformer pipeline where multiple transformations are applied.

Suggested change:

```diff
-x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
-encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(x)
-decoder_inputs = tf.keras.Input(shape=(None,), dtype="int32", name="french")
-x = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
-x = TransformerDecoder(embed_dim, dense_dim, num_heads)(x, encoder_outputs)
-x = tf.keras.layers.Dropout(dropout_rate)(x)
-decoder_outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(x)
+encoder_embeddings = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(encoder_inputs)
+encoder_outputs = TransformerEncoder(embed_dim, dense_dim, num_heads)(encoder_embeddings)
+decoder_inputs = tf.keras.Input(shape=(None,), dtype="int32", name="french")
+decoder_embeddings = PositionalEmbedding(sequence_length, vocab_size, embed_dim)(decoder_inputs)
+decoder_outputs = TransformerDecoder(embed_dim, dense_dim, num_heads)(decoder_embeddings, encoder_outputs)
+dropout_outputs = tf.keras.layers.Dropout(dropout_rate)(decoder_outputs)
+final_outputs = tf.keras.layers.Dense(vocab_size, activation="softmax")(dropout_outputs)
```
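
With the renamed outputs, the functional model would then be assembled roughly as follows (a sketch; the optimizer and loss are common choices for this kind of sequence-to-sequence setup, not confirmed from the PR):

```python
# Assemble the encoder-decoder pipeline into a single trainable model.
transformer = tf.keras.Model(
    [encoder_inputs, decoder_inputs], final_outputs, name="transformer"
)
transformer.compile(
    optimizer="rmsprop",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
```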

@jvachier jvachier requested a review from Copilot May 4, 2025 14:15

Copilot AI left a comment


Pull Request Overview

This pull request focuses on tuning the hyperparameters for the Transformer model used for French-to-English translation and updating related documentation. Key changes include:

  • Adding a new argument (transformer_model_path) to load a saved model.
  • Adjusting the model hyperparameters (embed_dim, dense_dim, num_heads, and dropout_rate) in the transformer_model function.
  • Introducing a new module (optuna_transformer.py) for hyperparameter optimization using Optuna.

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/translation_french_english.py | Updated transformer_model: added the transformer_model_path parameter and modified hyperparameters. |
| src/modules/optuna_transformer.py | Added a new module to enable hyperparameter tuning with Optuna. |

@jvachier jvachier requested a review from Copilot May 4, 2025 14:30

Copilot AI left a comment


Pull Request Overview

This PR implements transformer hyperparameter tuning: it introduces new test scripts and an Optuna-based optimization module, updates the Transformer model parameters, and refreshes the documentation. Key changes include:

  • New tests in tests/test_transformer_model.py for model building, training, evaluation, BLEU scoring, and loading.
  • Updates to the transformer_model in src/translation_french_english.py with reduced hyperparameter values and dropout addition.
  • Introduction of src/modules/optuna_transformer.py for Optuna hyperparameter optimization and workflow updates in .github/workflows/test.yaml.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| tests/test_transformer_model.py | Added tests to validate model build, training, evaluation, BLEU score computation, and loading. |
| src/translation_french_english.py | Modified model hyperparameters and added dropout in the Transformer architecture. |
| src/modules/optuna_transformer.py | Created an Optuna optimization module for hyperparameter tuning of the Transformer model. |
| .github/workflows/test.yaml | Updated workflow file to include additional file paths for triggering tests. |

Comments suppressed due to low confidence (2)

.github/workflows/test.yaml:10

  • [nitpick] The file path 'src/modules/transformer_component.py' may be a typo; consider confirming if the intended filename is 'transformer_components.py' to ensure the workflow triggers correctly.

```yaml
- "src/modules/transformer_component.py"
```

src/translation_french_english.py:59

  • [nitpick] The reduced embedding dimensions, dense layer size, and number of heads significantly lower the model's capacity; please verify that these changes are intentional and sufficient for achieving the desired performance.

```python
embed_dim = 64
```
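
To put numbers on that concern, here is a back-of-the-envelope comparison of per-block weight counts implied by the old (128/2048/8) and new (64/1536/2) settings quoted later in this thread. It assumes the standard sizing where the attention projections total 4 * embed_dim^2 regardless of num_heads, and ignores biases and layer norms; it is an illustrative estimate, not output from the PR:

```python
# Approximate trainable weights in one encoder/decoder block:
# Q/K/V/output projections plus the two feed-forward layers.
def approx_block_params(embed_dim: int, dense_dim: int) -> int:
    return 4 * embed_dim * embed_dim + 2 * embed_dim * dense_dim

print(approx_block_params(128, 2048))  # old settings -> 589824
print(approx_block_params(64, 1536))   # new settings -> 212992
```

That is roughly a 64% reduction per block, which supports the reviewer's request to confirm the change is intentional.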

@jvachier jvachier requested a review from Copilot May 4, 2025 14:39

Copilot AI left a comment


Pull Request Overview

This PR introduces new features and tests for tuning Transformer hyperparameters, updates the Transformer model implementation, and enhances CI workflow triggers.

  • Added comprehensive tests for building, training, evaluating, and loading the Transformer model.
  • Updated model hyperparameters and refined model architecture in the translation module.
  • Introduced a new module for hyperparameter tuning using Optuna and updated GitHub Actions paths for testing.

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| tests/test_transformer_model.py | Added unit tests for building, training, evaluation, and BLEU scoring of the model. |
| src/translation_french_english.py | Updated Transformer model parameters and architecture for improved tuning. |
| src/modules/optuna_transformer.py | Introduces hyperparameter optimization using Optuna for the Transformer model. |
| .github/workflows/test.yaml | Updated workflow paths to include new and modified module files for testing. |

Comments suppressed due to low confidence (1)

tests/test_transformer_model.py:99

  • [nitpick] Consider using explicit metric names or indices rather than relying on the number of elements in 'results', which would make the test more robust if additional metrics are ever included.

```python
assert len(results) == 2, "Evaluation did not return loss and accuracy."
```
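
One way to act on this suggestion (a sketch; `val_ds` is a stand-in for whatever dataset the test evaluates on, and the metric names assume the model was compiled with an accuracy metric):

```python
# return_dict=True keys results by metric name, so the test no longer
# depends on how many metrics the model happens to be compiled with.
results = model.evaluate(val_ds, return_dict=True, verbose=0)
assert "loss" in results, "Evaluation did not return a loss."
assert "accuracy" in results, "Evaluation did not return an accuracy metric."
```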

Comment on lines 58 to +61

```diff
 # Define model parameters
-embed_dim = 128
-dense_dim = 2048
-num_heads = 8
+embed_dim = 64
+dense_dim = 1536
+num_heads = 2
```

Copilot AI May 4, 2025


[nitpick] Consider adding a comment or updating the docstring to explain the rationale behind the updated hyperparameter values (e.g., embed_dim, dense_dim, and num_heads) for improved clarity and maintainability.

@jvachier jvachier merged commit 9f3a1c9 into main May 4, 2025
1 check passed
@jvachier jvachier deleted the jv/translation_optuna branch May 4, 2025 14:40