This was moved to the main mozilla/translations repo.
Track and extract data from the training system of Firefox Translations.
Logs are extracted from Marian training tasks running in Taskcluster.
This POC works offline, using a text log sample within the samples directory. It outputs an instance of the TrainingLog dataclass with the following attributes:
info: Marian information as a dict
configuration: Runtime configuration as a dict
training: List of Training dataclass instances, each with:
    epoch, up, sen, cost, time, rate, gnorm
validation: List of Validation dataclass instances, each with:
    epoch, up, chrf, ce_mean_words, bleu_detok
logs: Log lines as a dict, indexed by their header (e.g. marian, data, memory)
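As a rough illustration of the structure described above, the output could be modelled as follows. This is a minimal sketch, not the package's actual code: the class and attribute names come from the description, but the field types, the `group_by_header` helper, and the assumed `[timestamp] [header] message` line shape are hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Training:
    # Field names follow the README; the types are assumptions.
    epoch: int
    up: int
    sen: int
    cost: float
    time: float
    rate: float
    gnorm: float


@dataclass
class Validation:
    # Field names follow the README; the types are assumptions.
    epoch: int
    up: int
    chrf: float
    ce_mean_words: float
    bleu_detok: float


@dataclass
class TrainingLog:
    info: dict
    configuration: dict
    training: list[Training] = field(default_factory=list)
    validation: list[Validation] = field(default_factory=list)
    # Log lines grouped by header, e.g. {"marian": [...], "data": [...]}
    logs: dict[str, list[str]] = field(default_factory=dict)


# Assumed line shape: "[timestamp] [header] message" (hypothetical).
HEADER_RE = re.compile(r"^\[[^\]]*\]\s+\[(?P<header>[^\]]+)\]\s+(?P<text>.*)$")


def group_by_header(lines):
    """Hypothetical helper: index log messages by their bracketed header."""
    grouped = defaultdict(list)
    for line in lines:
        match = HEADER_RE.match(line)
        if match:  # lines without a header are skipped in this sketch
            grouped[match.group("header")].append(match.group("text"))
    return dict(grouped)


# Made-up sample lines, for illustration only.
sample = [
    "[2024-01-01 00:00:00] [marian] starting",
    "[2024-01-01 00:00:01] [data] loading vocabulary",
    "[2024-01-01 00:00:02] [marian] running",
    "line without a header",
]
log = TrainingLog(info={}, configuration={}, logs=group_by_header(sample))
```

Keeping the per-step metrics in small dataclasses rather than raw dicts makes each attribute explicit and easy to forward to a tracker such as Weights & Biases.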
In a virtual environment, you can install the package using pip:
$ pip install .
Run the parser with the local sample:
$ parse_tc_logs -i samples/<log_file>
Publish data to Weights & Biases:
$ parse_tc_logs -i samples/<log_file> --wandb-project <project> --wandb-group=<group> --wandb-run-name=<run>
Run the parser on a directory containing experiments and publish to Weights & Biases:
$ parse_experiment_dir -d models
A developer may want to install the package in editable mode (i.e., install from the local path directly):
$ pip install -e .
Pre-commit rules run automatically once the pre-commit hooks have been installed:
$ pip install pre-commit
$ pre-commit install
$ pre-commit run -a # Run pre-commit once