Releases: google-deepmind/optax
Optax 0.2.8
What's Changed
- Following the JAX 0.9.2 release, the `jax_pmap_shmap_merge` config flag was removed, so the `jax.pmap` implementation is always based on `jax.jit` and `jax.shard_map`, and opting into the old `jax.pmap` behavior is no longer possible. Optax had opted into the old behavior to give users time to migrate; as of Optax 0.2.8 this is no longer supported. This change shouldn't impact most users, but if you experience errors or performance regressions as a result of it, you can update your code following JAX's migration guide (or use JAX 0.9.2 or earlier and set `jax.config.update("jax_pmap_shmap_merge", False)`).
- Explicitly specify the dtype of the gradient accumulator in the MultiStep transform. by @copybara-service[bot] in #1605
- feat: add preconditioning and coef presets to muon by @massena-t in #1602
- Backwards compatibility export for the newton schulz iterator by @copybara-service[bot] in #1608
- Remove TensorFlow dependency in `Adversarial training` example by @rajasekharporeddy in #1609
- Improve lookahead docstrings with example and usage notes by @rdyro in #1619
- Make sure `inject_hyperparams` uses the dtype inferred from parameters... by @copybara-service[bot] in #1615
- Memory-optimization for microbatching. by @copybara-service[bot] in #1623
- Remove TensorFlow dependency and migrate mlp_mnist to Flax NNX by @selamw1 in #1536
- Let inject use the highest dtype found in the params as the default dtype of params. by @copybara-service[bot] in #1628
- Support scheduling alpha for AdEMAmix by @copybara-service[bot] in #1630
- [JAX] Suppress type errors found by pytype after correcting definition of jax.typing.ArrayLike. by @copybara-service[bot] in #1629
- [JAX] Suppress type errors found by pytype after correcting definition of jax.typing.ArrayLike. by @copybara-service[bot] in #1633
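The MultiStep accumulator-dtype entry above revolves around the usual gradient-accumulation pattern. A rough pure-Python sketch of that pattern (illustrative names only, not the optax API):

```python
def multi_steps(k):
    """Sketch of MultiSteps-style gradient accumulation: collect k
    microbatch gradients, then emit their average as one real update.
    Illustrative pure-Python stand-in, not optax's implementation."""
    state = {"acc": 0.0, "count": 0}

    def update(grad):
        # In optax 0.2.8 the dtype of this accumulator can be set explicitly.
        state["acc"] += grad
        state["count"] += 1
        if state["count"] == k:
            avg = state["acc"] / k
            state["acc"], state["count"] = 0.0, 0
            return avg        # real update on every k-th step
        return 0.0            # no-op update in between

    return update

step = multi_steps(4)
updates = [step(g) for g in [1.0, 2.0, 3.0, 6.0]]
```

The real transform does this per-leaf on pytrees of arrays, which is where the accumulator dtype matters.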
New Contributors
- @massena-t made their first contribution in #1602
- @selamw1 made their first contribution in #1536
Full Changelog: v0.2.7...v0.2.8
Optax 0.2.7
What's Changed
- Update Optax version to 0.2.7.dev. by @copybara-service[bot] in #1420
- Fix doctest by @copybara-service[bot] in #1424
- Fix piecewise interpolate with 0 first split by @rdyro in #1425
- Expose weight decay as a schedule option in all alias optimizers. by @copybara-service[bot] in #1427
- Skip schedule free tests casting from complex to float. by @copybara-service[bot] in #1428
- Add `monitor` and `measure_with_ema` to Optax transformations. by @copybara-service[bot] in #1430
- Add consistent rms scaling for muon update by @shuningjin in #1435
- Add microbatch transformation to optax/experimental. by @copybara-service[bot] in #1434
- Add experimental aggregators in optax. by @copybara-service[bot] in #1436
- Add missing exports to optax/init.py. by @carlosgmartin in #1433
- Adding microbatch to the docs by @copybara-service[bot] in #1443
- Add a few sharding-related tests to optax. by @copybara-service[bot] in #1450
- Allow instantiating optimizers before jax has been initialized. by @copybara-service[bot] in #1454
- Adding gradient variance tracking. by @copybara-service[bot] in #1451
- Resolve remaining sharding test failures in optax. by @copybara-service[bot] in #1457
- Add optional "in_axes" and "argnames" kwargs to microbatch, which will allow for natural composition with jax.vmap. by @copybara-service[bot] in #1442
- Honor inject_hyperparams dtypes if passed as jax.Arrays. by @copybara-service[bot] in #1460
- Update reshape_batch_axis to include sharding information when in explicit sharding mode. by @copybara-service[bot] in #1459
- Fix typo in L-BFGS debug information section by @partev in #1465
- Deprecate second order utilities. by @copybara-service[bot] in #1466
- Deprecate optax.global_norm in favor of optax.tree.norm. by @carlosgmartin in #1368
- Add signum optimizer by @copybara-service[bot] in #1463
- Use internal `warn_deprecated_function` instead of the Chex version. by @copybara-service[bot] in #1469
- Define internal `assert_trees_all_{close, equal}` functions and use them in tests instead of the Chex versions. by @copybara-service[bot] in #1481
- Replace (non-test) Chex assertions, usually with ValueErrors. by @copybara-service[bot] in #1483
- Replace `chex.Numeric` with `jax.typing.ArrayLike`. Replace `chex.Scalar` with `float`/`int` as appropriate. by @copybara-service[bot] in #1479
- Update readme by @copybara-service[bot] in #1496
- Enforce keyword-only arguments for optional parameters in Optax losses. Disable or fix existing Pytype bugs that surfaced as a result of this change. by @copybara-service[bot] in #1505
- Remove use of cast_tree in favor of optax.tree.cast by @copybara-service[bot] in #1477
- Minor doc fixes: Include hyperlinks to functions, classes references and correct typos by @rajasekharporeddy in #1511
- [optax] Remove `jax_pmap_shmap_merge=False` flag in `contrib/_complex_valued`. #jax-fixit by @copybara-service[bot] in #1515
- Enforce keyword-only arguments for optional parameters in Optax losses. Disable or fix existing Pytype bugs that surfaced as a result of this change. by @copybara-service[bot] in #1516
- Fix argument order in scale_by_distance_over_gradients by @Aaryan-549 in #1501
- Fix momo crash when loss value is a Python float by @Aaryan-549 in #1502
- Clarify difference between kl_divergence and convex_kl_divergence by @zer-art in #1514
- Fix broken test in numerics_test. by @copybara-service[bot] in #1522
- Add guidelines on contributing AI generated code, the same as JAX's. by @copybara-service[bot] in #1526
- Move _microbatching.py to microbatching.py and update init.py so we can directly import microbatching without causing pytype errors. by @copybara-service[bot] in #1525
- Fix up microbatching documentation. by @copybara-service[bot] in #1527
- Fix conjugation in Newton-Schulz iterator and update tests for comple… by @eirikfagerbakke in #1506
- Fix microbatching.Accumulator in docs. by @copybara-service[bot] in #1529
- Upgrade GitHub Actions to latest versions by @salmanmkc in #1534
- Add axis and where arguments to smooth_labels function. by @carlosgmartin in #1492
- Fix _projections_test.py: Remove prints and use optax.tree.allclose. by @carlosgmartin in #1439
- updating github actions versions by @copybara-service[bot] in #1540
- Fix typo in alias.py documentation by @partev in #1464
- Added Refined Lion Optimizer and Tests by @raghulchandramouli in #1497
- Add a bias_correction_v flag to scale_by_amsgrad to align with the original AMSGrad paper and Pytorch/tensorflow impl by @vvsvictor in #1423
- fix failing CI by @copybara-service[bot] in #1543
- Optimize tree_sum compile time using tree_reduce_associative by @Aaryan-549 in #1503
- Tests to warmup_cosine_decay_schedule edge cases by @edawite in #1413
- resolve DeprecationWarnings by @rdyro in #1547
- state dtype consistency in multi-step by @copybara-service[bot] in #1554
- Fix optax CI after merging galore. by @copybara-service[bot] in #1556
- Replace unicode escaped characters in ipynb files by @copybara-service[bot] in #1557
- Add an internal definition of `ArrayTree` and use it instead of `chex.ArrayTree`. by @copybara-service[bot] in #1484
- Optional auxiliary learning rate for Adam within Muon by @RaphaelRe in #1565
- Upstream num_real_microbatches to micro_vmap and micro_grad and add unit tests. by @copybara-service[bot] in #1570
- Allow early stopping when num_real_microbatches is dynamic (Tracer). by @copybara-service[bot] in #1571
- Fix `multiply_by_parameter_scale` type in adafactor optimizer. by @copybara-service[bot] in #1504
- Generalize dice_loss with alpha/beta weighting by @aymuos15 in #1458
- adding additional non-2D array test in galore (reference to #1541) by @yash194 in #1574
- optax fix CI by @copybara-service[bot] in #1579
- Raise error on unused extra kwargs in backtracking linesearch by @TanmayThakur2209 in #1559
- Add madgrad optimizer by @divye-joshi in #1581
- Fix optax CI by @copybara-service[bot] in #1586
- Remove TF dependency in `Lookahead Optimizer on MNIST` example by @rajasekharporeddy in #1568
- Add example demonstrating the microbatching api and comparing it against optax.MultiSteps. by @copybara-service[bot] in #1573
- Add an internal definition of `ArrayTree` and use it instead of `chex.ArrayTree`. by @copybara-service[bot] in #1588
- clipping: support default unitwise_norm for 5D params by @staticpayload in #1576
- Fix CI by @copybara-service[bot] in #1592
- Finish removing Chex dependency. by @copybara-service[bot] in #1590
- attempt at fixing docs building by @copybara-service[bot] in #1593
- Release optax 0.2.7 by @copybara-ser...
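The tree_sum compile-time work in #1503 rests on replacing a sequential reduce with a balanced, associative one, which traces to O(log n) depth instead of O(n). A pure-Python sketch of the idea (helper name is illustrative, not optax's):

```python
def pairwise_reduce(xs, op):
    """Balanced reduction over a list using an associative op.

    Combines elements pairwise level by level, so the reduction tree
    has logarithmic depth -- the same idea behind speeding up
    tree_sum compilation. Illustrative sketch, not the optax API.
    """
    xs = list(xs)
    while len(xs) > 1:
        nxt = [op(xs[i], xs[i + 1]) for i in range(0, len(xs) - 1, 2)]
        if len(xs) % 2:           # carry the odd element to the next level
            nxt.append(xs[-1])
        xs = nxt
    return xs[0]

total = pairwise_reduce([1, 2, 3, 4, 5], lambda a, b: a + b)
```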
Optax 0.2.6
What's Changed
- Fix for #1328 by @copybara-service[bot] in #1329
- Make pip quiet in a notebook by @copybara-service[bot] in #1330
- Clean up freezing doctests by @rdyro in #1333
- Fix rendering issue of `Freezing` in `transformations api page` in the documentation by @rajasekharporeddy in #1331
- Remove the reference to `optax.transforms` in `freezing` documentation by @rajasekharporeddy in #1334
- Fix for #1335 by @rdyro in #1336
- Add Salimans et al. 2017 citation to make_perturbed_fun docstring. by @carlosgmartin in #1325
- Add tree utility functions. by @carlosgmartin in #1321
- Add tests to verify cross_entropy_losses accept per-logit masks. by @copybara-service[bot] in #1343
- Remove reliance on chex.dataclass since it's not supported in newest JAX by @copybara-service[bot] in #1350
- Add line too long (E501) to optax source code by @rdyro in #1347
- Simplify code by using new tree.size function. by @carlosgmartin in #1354
- Enable adaptive gradient clipping for high-dimensional tensors by @aymuos15 in #1340
- Extend the fromage optimizer to allow a learning rate schedule by @rdyro in #1359
- Fix ruff to check line-length=80 by @rdyro in #1360
- Add function tree_allclose. by @carlosgmartin in #1352
- fix CI failure from line-too-long by @copybara-service[bot] in #1361
- Fix gradient NaN issues in sigmoid_focal_loss for extreme logits by @leochlon in #1346
- Internal changes by @copybara-service[bot] in #1367
- Clean up and fix errors in DoG implementation and documentation. by @carlosgmartin in #1292
- Trimming the library. by @copybara-service[bot] in #1370
- Address optimistic_adam interface re-work in the documentation. by @copybara-service[bot] in #1381
- Small docs fixes by @copybara-service[bot] in #1382
- Add missing entry for tree_cast_like in utilities.rst. by @carlosgmartin in #1377
- Remove type hint in test to align with new jax.nn annotations by @copybara-service[bot] in #1385
- Bump jax version for optax by @copybara-service[bot] in #1392
- Simplify l2 projection by @copybara-service[bot] in #1394
- Make init_empty_state public by @copybara-service[bot] in #1395
- Use OrderedDict in named_chain to preserve transformation order in the state object through jax.jit. by @copybara-service[bot] in #1397
- Fix hlo equivalence test for abs_sqr, fix broken html links by @copybara-service[bot] in #1404
- Add pyink config for external PRs (optional) by @copybara-service[bot] in #1409
- Expose scale by muon mask in the muon alias by @copybara-service[bot] in #1407
- add segmentation based (dice) loss by @aymuos15 in #1366
- fix CI by fixing pylint errors by @copybara-service[bot] in #1411
- Add explanation to Newton Schulz step by @copybara-service[bot] in #1410
- Fix doctests: add necessary dependency for sphinx-collections by @copybara-service[bot] in #1417
- Add missing equations to optax.optimistic_gradient_descent. by @carlosgmartin in #1400
- Fix dtype casting inside tree_add_scale. by @carlosgmartin in #1376
- Update version number for release. by @copybara-service[bot] in #1419
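The NaN fix for sigmoid_focal_loss with extreme logits (#1346) is an instance of a standard trick: rewrite the loss in terms of a numerically stable log-sigmoid. A pure-Python sketch of that kind of reformulation (illustrative, not optax's implementation):

```python
import math

def log_sigmoid(x):
    """Numerically stable log(sigmoid(x)).

    Picks the branch where the exponential's argument is non-positive,
    so exp() never overflows for extreme logits. Illustrative sketch
    of the stability idea, not the optax implementation.
    """
    if x >= 0:
        return -math.log1p(math.exp(-x))   # exp(-x) <= 1, safe
    return x - math.log1p(math.exp(x))     # exp(x) <= 1, safe

# A naive -log(1 + exp(1000)) would overflow to inf / produce nan:
val = log_sigmoid(-1000.0)
```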
New Contributors
- @rajasekharporeddy made their first contribution in #1331
- @aymuos15 made their first contribution in #1340
- @leochlon made their first contribution in #1346
Full Changelog: v0.2.5...v0.2.6
Optax 0.2.5
What's Changed
- Extend example gallery entry for linear assignment problem with optimal transport example. by @carlosgmartin in #1143
- In flatname space import from the subpackage not from _src. by @copybara-service in #1147
- Fix docs for `optax.partition`. by @copybara-service in #1150
- Fix complex support for L-BFGS by @gautierronan in #1142
- Add sophia-h optimizer by @evanatyourservice in #979
- Exposing named_chain in docs by @rdyro in #1153
- Split github workflows for lower latency, add ruff by @rdyro in #1156
- Remove longtime deprecated functions. by @copybara-service in #1149
- Fix typo in docstring by @KhawajaAbaid in #1157
- Changes to test.sh and pyproject.toml. by @carlosgmartin in #1110
- Minor grammar fix. by @copybara-service in #1158
- Add Muon Optimizer to `contrib` by @leloykun in #1126
- Fix axis should be int or None in softmax_cross_entropy_with_integer_labels by @rdyro in #1164
- [Triplet Margin Loss] Issue 1118 by @cvnad1 in #1120
- Fix Muon implementation by @leloykun in #1170
- removing `scalar_type_of` - random.key breaks it by @rdyro in #1169
- Updating optax to use jax.random.key instead of PRNGKey by @copybara-service in #1172
- CI: test against nightly JAX by @jakevdp in #1173
- optax doctests fix by @copybara-service in #1176
- minor change so backtracking init and update return the same state types by @bafflingbits in #1175
- Patching transforms:trace issue due to JAX tree map deprecation by @mustass in #1181
- Adds AdEMAMix Optimizer to `contrib` by @mathDR in #1104
- Fixes to docs, prelude to docs build CI by @rdyro in #1184
- Add improved version of Hungarian algorithm. by @carlosgmartin in #1140
- Add 'plus' option to polyak_sgd. by @carlosgmartin in #1180
- Add optax.tree_utils.tree_batch_shape. by @carlosgmartin in #1161
- Move configuration for Pylint from .pylintrc to pyproject.toml. by @carlosgmartin in #1160
- Fix Typo. by @copybara-service in #1191
- Fix optax CI by relaxing float32 rtol comparison to 1e-6 by @copybara-service in #1194
- Fixing doctests and removing etils.lazy_import for compatibility with Python 3.9 by @copybara-service in #1195
- Add non-negative least squares (NNLS) solver. by @carlosgmartin in #1155
- fix: Unexpected tracer in muon optimizer when sharding. by @hlzl in #1193
- Stop gradient in linesearch identity scaling by @younik in #1190
- fix: Correct string formatting for edge labels in linear assignment e… by @k22036 in #1211
- Fixes #1210 by @copybara-service in #1217
- Deprecate multi_transform in favor of partition by @copybara-service in #1216
- Remove unused dependency by @NeilGirdhar in #1219
- Improved docstring for optax.centralize with explanation and example by @shreyans413 in #1220
- Fix doctest error by @copybara-service in #1221
- Ensure optimizers accept extra args by @copybara-service in #1212
- Update _schedule.py by @dpoteryayev in #1222
- allow triggering CI tests manually by @rdyro in #1226
- fixed PartitionState import issue while building html docs by @julurisaichandu in #1232
- Add tree_utils functions to docs that are missing. by @carlosgmartin in #1233
- Improve docstring for optax.power_iteration. by @carlosgmartin in #1215
- added mathematical description for the ada delta optimizer. by @julurisaichandu in #1235
- Allow learning_rate of None, addressing #1242 by @copybara-service in #1244
- [#1196] renaming tree_scalar_mul to tree_scale and tree_add_scalar_mul to tree_add_scale by @aryanmahawar205 in #1246
- fixing optax CI by @copybara-service in #1251
- Adding pre-commit by @copybara-service in #1250
- Adding pre-commit, uv in test.sh - better caching by @rdyro in #1247
- Fix scaling factor in muon by @stephen-huan in #1249
- Update references to JAX's GitHub repo by @copybara-service in #1252
- Add defaults for min_value and max_value to tree_utils.tree_clip. by @carlosgmartin in #1234
- Add how to read lr by @copybara-service in #1253
- simplify centralize by @copybara-service in #1254
- type stability of state in optimizers, tested via jax.lax.cond by @copybara-service in #1218
- Add Simplified AdEMAMix. by @carlosgmartin in #1206
- Change type of num_linesearch_steps for backtracking linesearch by @copybara-service in #1258
- Fix nnls to handle batch shapes correctly. by @carlosgmartin in #1200
- Edit sigmoid_binary_cross_entropy and sigmoid_focal_loss to work correctly with non-array label arguments. by @carlosgmartin in #1261
- Minor improvements to tree_batch_shape. by @carlosgmartin in #1260
- Fix docs for optax.contrib.dog. by @carlosgmartin in #1262
- Improve docstrings for schedule functions in optax/schedules/_schedule.py by @DakshBegani in #1255
- Fix docs building by @copybara-service in #1263
- Note that default eps values of Adabelief don't match paper. by @copybara-service in #1264
- Seed key v3 by @Tomas542 in #1240
- Abandon python 3.9, test on 3.12 by @copybara-service in #1266
- Clean dependencies by @copybara-service in #1270
- Muon: Add weight decay by @leloykun in #1275
- Fix MuonState docstring by @EIFY in #1276
- Fix duplicate-bases error upon running test.sh by directing pylint to run on optax directory only. by @carlosgmartin in #1278
- add freezing parameters examples [Addresses Issue #296] by @pranavagrawaI in #1274
- Implementing ADOPT in Optax by @tinker495 in #1257
- Automated Code Change by @copybara-service in #1281
- Update JAX nightly index usage by @copybara-service in #1282
- Configure tests to explicitly use jax_threefry_partitionable=False. by @copybara-service in #1289
- Typos in reduce_on_plateau.ipynb by @ddrous in #1293
- Add license check for all Python files to pre-commit by @copybara-service in #1296
- Remove clutter from MNIST examples by @Balint-H in #1285
- Add Freezing Utilities: `freeze`, `selective_optimizer`, and Prefix-Mask Support by @pranavagrawaI in #1284
- Unify tree_(l1|l2|linf)_norm into tree_norm & introduce optax.tree.* aliases. by @copybara-service in #1299
- update optax getting started colab link by @copybara-service in #1301
- update freezing parameter examples with new utilities `freeze` and `selective_transform` by @pranavagrawaI in https://...
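The deprecation of multi_transform in favor of partition (#1216), and the freezing utilities above, both hinge on routing each parameter's update through a transform chosen by a label. A pure-Python sketch of that routing on a flat dict (illustrative stand-in, not the optax API, which works on pytrees and optimizer states):

```python
def partition_updates(updates, labels, fns):
    """Route each leaf's update through the function its label selects.

    Sketch of the labeling idea behind optax.partition / freezing;
    names and shapes are illustrative, not optax's API.
    """
    return {name: fns[labels[name]](g) for name, g in updates.items()}

updates = {"w": 2.0, "b": 2.0}
labels = {"w": "train", "b": "freeze"}
fns = {
    "train": lambda g: -0.1 * g,   # SGD-like step for trainable leaves
    "freeze": lambda g: 0.0,       # zero update for frozen leaves
}
out = partition_updates(updates, labels, fns)
```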
Optax 0.2.4
What's Changed
- Beginning of 0.2.4 development by @copybara-service in #1003
- Fix gallery display, add images by @vroulet in #1005
- Fix docs for dog by @jungtaekkim in #1008
- docs: remove `multi_normal` from utilities.rst by @Abhinavcode13 in #1009
- Fix docs for dowg by @jungtaekkim in #1013
- feat: add a mathematical description of AdaGrad optimizer by @Abhinavcode13 in #1011
- fix: refactor AdaGrad optimizer recent changes by @Abhinavcode13 in #1016
- Make masking compatible with callable pytrees a la Equinox by @copybara-service in #1015
- Enable zero lrate for schedule-free optimization by @copybara-service in #1018
- keep a local .pylintrc file by @fabianp in #1024
- Add bias_correction and eps_in_sqrt options to rmsprop and associated transforms by @copybara-service in #1019
- Replace adam(b1=0) by rmsprop for schedule_free by @copybara-service in #1025
- Update init value for zakharov problem from 1e4 to 1e3 by @copybara-service in #1027
- Fix typo by @gil2rok in #1030
- Updated docs cosine_decay_schedule by @bhargavyagnik in #1032
- feat: add mathematical notation docs of SM3 optimizer by @Abhinavcode13 in #1012
- DOC: misc improvements in docstring of softmax_cross_entropy* by @copybara-service in #1033
- add doctest to constant_schedule by @fabianp in #1034
- Add axis and where arguments to loss functions. by @carlosgmartin in #912
- Fix doctest error in make_fenchel_young_loss by @copybara-service in #1035
- add doctest for polynomial_schedule by @fabianp in #1037
- add missing schedule_free_* methods by @fabianp in #1043
- fix error in softmax_cross_entropy formula by @fabianp in #1041
- Fix typo in formula of cosine_decay_schedule by @fabianp in #1044
- schedule_free: fix broadcasting of scalar arrays to 1d arrays by @n-gao in #1042
- Update polynomial_schedule doctest per @vroulet's feedback by @fabianp in #1045
- Fix linting schedule_free_test by @copybara-service in #1048
- more robust tests by @copybara-service in #1050
- Generalizes safe_int32_increment to safe_increment by @copybara-service in #1054
- Add dtype option to tree_random_like by @copybara-service in #1056
- Add double precision tests for safe_increment and fix warnings on float64_test.py by @copybara-service in #1055
- Add optax.tree_utils.tree_random_split. by @carlosgmartin in #1063
- Fix test.sh, which uses set -u, so that it works when JAX_VERSION is unset. by @carlosgmartin in #1070
- Migrate from jax.tree_util legacy APIs to new jax.tree API. by @carlosgmartin in #1066
- Ensure optimizers return updates of same dtype as params. by @copybara-service in #1060
- Fix test.sh to not modify .pylintrc. by @carlosgmartin in #1071
- Replace deprecated typing.Hashable with collections.abc.Hashable. by @carlosgmartin in #1068
- Relax absolute tolerance for failing tests involving chex.assert_trees_all_close. by @carlosgmartin in #1069
- Fix doctests by @copybara-service in #1073
- Tidy up test.sh and make it clean up properly. by @carlosgmartin in #1074
- Add missing initializer argument of 0 to tree_reduce in tree_vdot and tree_sum. by @carlosgmartin in #1065
- bump chex version for #1076 by @fabianp in #1078
- correct RST references by @fabianp in #1079
- deprecate methods in `optax/monte_carlo`. by @copybara-service in #1076
- schedule-free optimizer: ensure it's possible to donate both the state and the params by @enolan in #1059
- add link to examples from docstring by @fabianp in #1085
- Adopt US spelling for documentation and fix typos by @miguelcsx in #1087
- Update docs: Note RMSprop usage instead of Adam for memory savings in… by @nasyxx in #1086
- adding a perturbations module. Can take pytrees as inputs by @copybara-service in #827
- Fix initial step of scale_by_optimistic_gradient. by @carlosgmartin in #1088
- Add Hungarian algorithm for the linear assignment problem. by @carlosgmartin in #1083
- Allow safe_increment to handle unsigned integers. by @carlosgmartin in #1092
- Fix formatting issues with gallery entry for linear assignment problem. by @carlosgmartin in #1095
- use sphinx references instead of hardcoded links. by @fabianp in #1096
- Remove dtype safeguards by @copybara-service in #1099
- cosmetic improvements perturbations module by @fabianp in #1097
- update jax.tree.map to comply with jax 0.4.34 by @a1302z in #1094
- Add Adan optimizer. by @carlosgmartin in #1090
- Fix typo in projection_simplex docstring. by @copybara-service in #1105
- add config for link checking, and mark 429 (too many requests) as fine by @fabianp in #1103
- Fix docstring for hungarian_algorithm. by @carlosgmartin in #1102
- Add optax.optimistic_adam. by @carlosgmartin in #1089
- Add projection_l1_sphere and projection_l1_ball. by @copybara-service in #1106
- Add projection_l2_sphere and projection_l2_ball. by @copybara-service in #1114
- Add tree_max. by @copybara-service in #1115
- Add projection_linf_ball. by @copybara-service in #1117
- remove test that leaked jax tracers by @copybara-service in #1123
- Add a mathematical description for Lion by @aman2304 in #1121
- Fix the sign for the update in the math equation for nadam in the docs. by @carlosgmartin in #1128
- ntxent fix by @GrantMcConachie in #946
- Add Nesterov momentum to AdaBelief optimizer. by @carlosgmartin in #1127
- fix: Coherent dtypes of updates with and without MultiSteps by @hlzl in #1122
- Fix AdaBelief implementation. by @carlosgmartin in #1130
- Revisiting linesearches and LBFGS. by @copybara-service in #1133
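The generalization of safe_int32_increment to safe_increment (#1054, #1092) is about incrementing a step counter without integer overflow: saturate at the type's maximum instead of wrapping around. A pure-Python sketch with an explicit maximum (illustrative, not optax's API):

```python
def safe_increment(count, max_value):
    """Increment a counter by one, saturating at max_value.

    Sketch of the safe_increment idea: avoids wraparound when a step
    counter reaches its dtype's maximum. Pure-Python illustration with
    an explicit max argument, not the optax implementation.
    """
    return count if count >= max_value else count + 1

INT32_MAX = 2**31 - 1
a = safe_increment(5, INT32_MAX)          # normal increment
b = safe_increment(INT32_MAX, INT32_MAX)  # saturates instead of wrapping
```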
New Contributors
- @vroulet made their first contribution in #1005
- @jungtaekkim made their first contribution in #1008
- @Abhinavcode13 made their first contribution in #1009
- @gil2rok made their first contribution in #1030
- @bhargavyagnik made their first contribution in #1032
- @n-gao made their first contribution in #1042
- @enolan made their first contribution in #1059
- @miguelcsx made their first contribution in #1087
- @a1302z made their first contribution in #1094
- @aman2304 made their first contribution in #1121
- @hlzl made their first contribution in #1122
Full Changelog: https://githu...
Optax 0.2.3
What's Changed
- Fix the KeyboardInterrupt exception from #860 by removing the timeout by @copybara-service in #886
- Beginning of 0.2.3 development by @copybara-service in #893
- Add a mathematical description of AdamW by @gbruno16 in #894
- Suppress not-callable pylint error for now since is being flagged erroneously all over the place. by @copybara-service in #908
- Fix doc link by @yixiaoer in #903
- Fixed pseudocode for Nesterov in description of SGD. by @satyenkale in #901
- Fix softmax_cross_entropy to handle -inf logits correctly when corresponding label is 0. by @carlosgmartin in #898
- Upstream sparsemax jaxopt loss to optax. by @copybara-service in #899
- Reorganize tree_utils. by @copybara-service in #914
- Revert of #898. by @copybara-service in #915
- Fix jax.tree_map deprecation warnings. by @copybara-service in #917
- Correct handling of -inf in softmax_cross_entropy. Fix #898. by @copybara-service in #916
- Added mathematical documentation to AdaMax by @hmludwig in #918
- Fix pip install command for doc dependencies. by @mblondel in #922
- Start documentation for projections. by @mblondel in #921
- Add projection_simplex. by @copybara-service in #919
- Move gradient transformations to optax.transforms sub-package - 1/N by @copybara-service in #923
- Added a NTXent loss by @GrantMcConachie in #897
- fix(docs): broken link in README by @jeertmans in #940
- Add a deprecation module to warn or raise errors for deprecations (following jax semantics). by @copybara-service in #931
- chore(ci): add markdown-link-check action by @jeertmans in #939
- Implementation of MoMo algorithm by @fabian-sp in #721
- Weight decay for COCOB by @albcab in #945
- Add a nesterov flag to radam optimizer. by @carlosgmartin in #949
- Formatting in momo docstring + doctest by @fabianp in #950
- docstring formatting by @fabianp in #952
- Port schedule_free optimizer to optax. Original pytorch repo: https://github.com/facebookresearch/schedule_free by @copybara-service in #911
- Fix RST formatting issues. by @fabianp in #953
- remove duplicated BATCH_SIZE argument by @fabianp in #956
- Replace deprecated `jax.tree_*` functions with `jax.tree.*` by @copybara-service in #963
- remove residues from previous builds before running tests by @fabianp in #967
- Fix docs errors by @copybara-service in #941
- Removing sophia optimizer by @copybara-service in #973
- move clipping transforms to optax.transforms. by @copybara-service in #926
- Expose components in sub-package by @copybara-service in #978
- Add multiclass_sparsemax_loss. by @copybara-service in #971
- Remove useless inner jit by @copybara-service in #957
- Fix memory leak in radam optimizer by @lukekulik in #974
- Add end_scale argument by @stefanocortinovis in #975
- Fix error with x64 loss by @stefanocortinovis in #976
- LBFGS solver part 1: chainable preconditioner. by @copybara-service in #980
- Fix docs errors (following warnings displayed in doc logs of github actions) by @copybara-service in #984
- [JAX] Update users of jax.tree.map() to be more careful about how they handle Nones. by @copybara-service in #983
- LBFGS solver part 2: implementing linesearch ensuring sufficient decrease and small curvature by @copybara-service in #981
- CI: add test against oldest supported JAX version by @jakevdp in #987
- Internal change by @copybara-service in #988
- Ignore some linesearch tests on gpu/tpu by @copybara-service in #986
- LBFGS part 3: combine lbfgs and zoom linesearch by @copybara-service in #989
- Add arxiv reference to schedule_free optimizer. by @copybara-service in #997
- LBFGS part 4: notebook illustrating how to use lbfgs with linesearch as a solver. by @copybara-service in #991
- Add common schedule_free wrappers. by @copybara-service in #998
- Add schedule_free check for b1 != 0. by @copybara-service in #999
- feat: add `normalize_by_update_norm` by @SauravMaheshkar in #958
- Saurav maheshkar saurav/scale by grad norm by @fabianp in #1000
- Fix doctest normalize_by_update_norm by @copybara-service in #1002
- Release v0.2.3 by @copybara-service in #1001
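The -inf handling in softmax_cross_entropy (#898, #916) addresses a concrete failure mode: a logit of -inf paired with a zero label produces `0 * (-inf) = nan` under naive evaluation, so the fix drops zero-label terms. A pure-Python sketch of that convention (illustrative, not optax's implementation):

```python
import math

def softmax_cross_entropy(logits, labels):
    """Cross entropy that tolerates -inf logits on zero-label classes.

    Terms with a zero label are skipped entirely, so 0 * (-inf) is
    never evaluated. Pure-Python sketch of the convention, not the
    optax implementation (which is vectorized over arrays).
    """
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return -sum(p * (l - log_z)
                for l, p in zip(logits, labels) if p != 0)

# Masked-out class with logit -inf and label 0 contributes nothing:
loss = softmax_cross_entropy([0.0, float("-inf")], [1.0, 0.0])
```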
New Contributors
- @gbruno16 made their first contribution in #894
- @satyenkale made their first contribution in #901
- @GrantMcConachie made their first contribution in #897
- @fabian-sp made their first contribution in #721
- @lukekulik made their first contribution in #974
- @jakevdp made their first contribution in #987
Full Changelog: v0.2.2...v0.2.3
Optax 0.2.2
What's Changed
- Added mathematical description to Noisy SGD by @hmludwig in #857
- Use sphinx-contributors for an automated contributors list. by @fabianp in #841
- Implementation of the Polyak SGD solver by @copybara-service in #718
- Document the extra args of the update function in docstring by @copybara-service in #864
- Utility to set value in a pytree (and so in state) by @copybara-service in #865
- Added mathematical description to AdaBelief docstring by @hmludwig in #869
- FIX RST formatting in inject hyperparams by @fabianp in #867
- Warn that in future arguments after the initial (prediction, ground_truth) positional arguments will become keyword-only in optax losses. by @copybara-service in #863
- Upstream missing jaxopt losses to optax - Part 2/N by @copybara-service in #872
- Fix error `reduce_on_plateau.ipynb:20002: WARNING: No source code lexer found for notebook cell` by @copybara-service in #875
- docstring cosmetic improvements by @fabianp in #879
- Extend capabilities of tree_get, tree_set. by @copybara-service in #878
- [DOC] Add to the gallery an example on a small language model by @copybara-service in #866
- Update reduce_on_plateau to handle training average loss. by @copybara-service in #883
- Fix notebook reduce_on_plateau by @copybara-service in #887
- ENH: extend power_iteration to accept a matrix in implicit form by @copybara-service in #858
- Document changes in power_iteration by @copybara-service in #889
- Release of version 0.2.2 by @copybara-service in #892
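The power_iteration extension (#858) accepts the matrix in implicit form, i.e. as a matrix-vector product function rather than a dense array. A pure-Python sketch of power iteration in that style (illustrative, not optax's API):

```python
def power_iteration(matvec, v, steps=100):
    """Estimate the dominant eigenvalue given only a matvec function.

    Sketch of power iteration with the matrix in implicit form: the
    caller supplies `matvec`, never the matrix itself. Illustrative
    pure-Python version, not optax's implementation.
    """
    for _ in range(steps):
        w = matvec(v)
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]            # renormalize each step
    # Rayleigh quotient of the converged vector
    return sum(x * y for x, y in zip(v, matvec(v)))

# The matrix diag(2, 1) given implicitly as a product function:
eig = power_iteration(lambda v: [2 * v[0], 1 * v[1]], [1.0, 1.0])
```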
Full Changelog: v0.2.1...v0.2.2
v0.2.1
What's Changed
- Begin of 0.2.1 development by @copybara-service in #845
- Add mathematical formula in docstring of linear_schedule by @copybara-service in #839
- Reference reduce on plateau example from docstring by @copybara-service in #851
- Fix sign discrepancy in tolerance in scale_by_backtracking_linesearch by @copybara-service in #853
- 0.2.1 release by @copybara-service in #855
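The tolerance fixed in scale_by_backtracking_linesearch enters through the Armijo sufficient-decrease test, which can be sketched on a 1-D problem in plain Python (a simplified illustration, not optax's implementation):

```python
def backtracking_linesearch(f, x, g, lr=1.0, c=1e-4, shrink=0.5, max_steps=20):
    """Backtracking (Armijo) linesearch on a 1-D objective.

    Shrinks the step until the sufficient-decrease condition
    f(x - lr*g) <= f(x) - c*lr*g^2 holds. Simplified sketch of the
    idea, not optax's scale_by_backtracking_linesearch.
    """
    fx = f(x)
    for _ in range(max_steps):
        if f(x - lr * g) <= fx - c * lr * g * g:
            return lr          # sufficient decrease achieved
        lr *= shrink           # otherwise backtrack
    return lr

# f(x) = x^2 at x = 10, gradient 20: lr=1.0 overshoots, lr=0.5 is accepted
lr = backtracking_linesearch(lambda x: x * x, 10.0, 20.0)
```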
Full Changelog: v0.2.0...v0.2.1
v0.2.0
What's Changed
- Begin of 0.2.0 development by @copybara-service in #766
- Remove unnecessary %pip install statements from the notebooks by @JayJayleee in #762
- Updates warmup_cosine_decay_schedule to allow 0 as peak_value. Currently it errors out as divide by 0. by @copybara-service in #770
- fixed math flex direction by @copybara-service in #775
- Docstrings for all solvers with Objective Function values by @copybara-service in #771
- Misc improvements in `adversarial_training.ipynb` by @copybara-service in #776
- Solver API (wip) by @copybara-service in #777
- Detail that learning_rate can be scalar or a schedule. by @copybara-service in #778
- workaround for pylint issues by @fabianp in #783
- correct type annotation for `params` by @yixiaoer in #785
- Ensure no execution of slow notebooks. by @copybara-service in #788
- [DOC] merge development and design_docs into a single file by @fabianp in #789
- Add development information in README. by @mblondel in #790
- Solve #767, some examples not being displayed on the webpage by @fabianp in #791
- Add missing examples to contrib examples section. by @fabianp in #794
- Make EmptyState a valid ArrayTree by @copybara-service in #752
- Improve error message for `cosine_decay_schedule`. by @copybara-service in #797
- move DPSGD docs to contrib/ section by @WangHanSolo in #782
- Adjusted 'test.sh' for universal macOS and Linux Compatibility by @hmludwig in #804
- Benchmark recommendations. by @copybara-service in #807
- Removed 'examples/quick_start.ipynb' and adjusted references. by @hmludwig in #811
- [DOC] streamline sidebar of gallery by @fabianp in #810
- [FIX] bad indentation in the docstring of nadam by @fabianp in #812
- Backtracking linesearch. by @copybara-service in #795
- Fixing backtracking linesearch support on gpus by @copybara-service in #814
- allow wrappers.flatten to work with 0-dim arrays by @copybara-service in #816
- Add Example, Args and Returns section to optax.linear_schedule by @copybara-service in #818
- FIX doctest format in TransformUpdateExtraArgsFn by @copybara-service in #820
- Refactoring and link fixes in development.md by @fabianp in #822
- Clarify formula for cosine learning rate. by @copybara-service in #823
- Detail SGD with momentum equations. by @copybara-service in #830
- Fixed the rendering of equations in jupyter notebooks by @hmludwig in #836
- Fixed the Sharpness aware minimization docstring by @hmludwig in #834
- FIX "gallery.rst:5: WARNING: Title underline too short." by @copybara-service in #838
- Moved linear algebra operations into utilities by @hmludwig in #837
- adding tree utils for generating trees with random values and averaging across trees. by @copybara-service in #825
- v0.2.0 release by @copybara-service in #844
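The cosine learning-rate formula clarified in #823 is the standard cosine decay, value(t) = init * (alpha + (1 - alpha) * 0.5 * (1 + cos(pi * t / T))). A pure-Python sketch (argument names mirror optax's schedule, but this is an illustration, not its implementation):

```python
import math

def cosine_decay_schedule(init_value, decay_steps, alpha=0.0):
    """Cosine decay from init_value down to alpha * init_value.

    Sketch of the standard cosine schedule formula; argument names
    follow optax's cosine_decay_schedule but this is illustrative,
    not the optax implementation.
    """
    def schedule(step):
        frac = min(step, decay_steps) / decay_steps
        cosine = 0.5 * (1.0 + math.cos(math.pi * frac))
        return init_value * (alpha + (1.0 - alpha) * cosine)
    return schedule

s = cosine_decay_schedule(1.0, 100)
```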
New Contributors
- @JayJayleee made their first contribution in #762
- @yixiaoer made their first contribution in #785
- @mblondel made their first contribution in #790
- @WangHanSolo made their first contribution in #782
- @hmludwig made their first contribution in #804
Full Changelog: v0.1.9...v0.2.0
v0.1.9
What's Changed
- update URL github.com/deepmind -> github.com/google-deepmind and branch to main by @copybara-service in #710
- Show CI results only for the main branch by @fabianp in #716
- typo nesterov -> Nesterov by @fabianp in #722
- Add `atol` option to `contrib.reduce_on_plateau()` by @stefanocortinovis in #698
- add docs for tree_utils module by @amosyou in #724
- Simplifications on the doc by @copybara-service in #727
- add nadam and nadamw optimizers by @copybara-service in #723
- Add `versionadded` and `seealso` metadata to (n)adam(w) solvers by @copybara-service in #729
- Enable doctests in sphinx and fix failing doctests by @fabianp in #733
- Add missing members to the doc by @copybara-service in #734
- FIX sphinx warnings "this document is not included in any toctree" by @fabianp in #736
- feat(ci): drop `setup.py` from publishing CI by @SauravMaheshkar in #737
- Minor tweak in pypi-publish.yml by @fabianp in #739
- [TEST] Install virtual environment in current directory instead of /tmp by @fabianp in #746
- migrate to pyproject by @copybara-service in #747
- Deprecate optax.inject_stateful_hyperparameters and replace it with optax.inject_hyperparameters by @copybara-service in #730
- Clarify inclusion criteria into optax and optax.contrib by @copybara-service in #742
- fix the default learning rate in prodigy by @konstmish in #740
- update and merge quickstart notebooks by @amosyou in #726
- Remove redundant examples from README by @mmhamdy in #754
- Instructions to build the doc and option to build the docs without executing the notebooks. by @copybara-service in #759
- Remove license statement from notebooks by @renuka010 in #764
New Contributors
- @amosyou made their first contribution in #724
- @konstmish made their first contribution in #740
- @renuka010 made their first contribution in #764
Full Changelog: v0.1.8...v0.1.9