Add tree_bits function. by carlosgmartin · Pull Request #1341 · google-deepmind/optax

carlosgmartin · 2025-06-18T23:42:41Z

Add a tree_bytes function to the tree utilities, analogous to Haiku's.

rdyro · 2025-06-20T17:40:05Z

I like the idea!

Can we check for the presence of on_device_size_in_bytes() on the object (e.g. jnp.ones(10).on_device_size_in_bytes())? For int4 arrays, itemsize=1, but the on-device size is half a byte per element.

carlosgmartin · 2025-06-21T00:36:54Z

@rdyro Are you suggesting adding a flag on_device to tree_bytes that replaces jnp.asarray(leaf).nbytes with jnp.asarray(leaf).on_device_size_in_bytes()?

rdyro · 2025-06-22T17:52:09Z

I was thinking of using Python's hasattr. However, this will fail for tracers, which are backend-independent and will report that they don't have on_device_size_in_bytes.

I'm not sure about the name of this function now: size * itemsize does not correctly report the byte size, and on_device_size_in_bytes is not available on tracers.

carlosgmartin · 2025-06-22T18:21:04Z

I'm now using nbytes. That seems like the simplest, most straightforward approach. Let's go with that for now.

In the meantime, I left a question about on_device_size_in_bytes here.

rdyro · 2025-06-22T22:07:33Z

Hmmm, nbytes appears to just use size * itemsize.

Can you test this on the following case:

fn = lambda: tree_bytes(jnp.ones((1024,), dtype=jnp.int32))
fn()
jax.jit(fn)()

It should report 512 in both cases.

carlosgmartin · 2025-06-22T22:11:34Z

@rdyro Shouldn't it be 4096?

An int32 has 32 bits / 8 bits per byte = 4 bytes. And 1024 * 4 = 4096.

I get 4096 in both cases.

rdyro · 2025-06-22T23:17:36Z

Should have been int4, that's the whole problem here :/

fn = lambda: tree_bytes(jnp.ones((1024,), dtype=jnp.int4))
fn()
jax.jit(fn)()

carlosgmartin · 2025-06-22T23:23:29Z

For int4, I get 1024 in both cases.

optax/tree_utils/_tree_math_test.py

rdyro · 2025-06-22T23:26:24Z

> For int4, I get 1024 in both cases.

.nbytes and .size * .itemsize both give the wrong number. Currently the only way to get the true byte size for all dtypes is to call .on_device_size_in_bytes() in eager mode.

carlosgmartin · 2025-06-23T00:32:09Z

This still works for the most common dtypes, so let's go with the nbytes-based implementation for now.

I added a warning to the function's docstring. Feel free to reword this warning.

rdyro · 2025-06-23T00:41:11Z

Let’s wait until we have a solution on the JAX side. I’ll keep the PR open for now.

carlosgmartin · 2025-06-23T00:42:32Z

Will you be opening an issue for that?

rdyro · 2025-06-23T00:44:42Z

I'll follow up on this internally. For the JAX issue you opened, can you explicitly ask about the use case of getting int4 byte size under jit?

carlosgmartin · 2025-06-24T21:33:08Z

@rdyro In your opinion, what would be the ideal output of jnp.ones(1, dtype=jnp.int4).nbytes, or whatever the equivalent to nbytes is? Seems ugly to mix float and int values.

Perhaps this suggests that, more generally, we ought to be counting in bits rather than bytes?

rdyro · 2025-06-25T03:09:54Z

> @rdyro In your opinion, what would be the ideal output of jnp.ones(1, dtype=jnp.int4).nbytes, or whatever the equivalent to nbytes is? Seems ugly to mix float and int values.
>
> Perhaps this suggests that, more generally, we ought to be counting in bits rather than bytes?

Currently on CPU, GPU and TPU the byte size of jnp.ones(1, dtype=jnp.int4) is 1, but the byte size of jnp.ones(2, dtype=jnp.int4) is also 1, since it's packed.
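The packed sizes quoted above follow from rounding the total bit count up to whole bytes. Here is a minimal sketch in plain Python, assuming tight packing with no padding or alignment. The helper names are hypothetical and are not JAX or optax APIs.

```python
def nominal_nbytes(num_elements: int, itemsize: int = 1) -> int:
    """Byte count computed as size * itemsize (int4 reports itemsize=1 in this thread)."""
    return num_elements * itemsize


def packed_nbytes(num_elements: int, bits_per_element: int = 4) -> int:
    """Byte count if elements are stored back to back, rounded up to whole bytes.

    Assumption: the backend packs tightly and adds no padding.
    """
    total_bits = num_elements * bits_per_element
    return (total_bits + 7) // 8  # integer ceiling of total_bits / 8


one = packed_nbytes(1)          # a single int4 still occupies one byte
two = packed_nbytes(2)          # two int4 values share that byte
packed = packed_nbytes(1024)    # half a byte per element
nominal = nominal_nbytes(1024)  # what size * itemsize would report
```

Under these assumptions the two counts differ by a factor of two for the 1024-element int4 array discussed earlier in the thread.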

However, it's possible that a platform doesn't guarantee packing for int4, so I don't think it's currently possible to have a jit-compatible function that counts bytes. I believe users interested in RAM/VRAM size should use a custom lambda with jax.tree.map to explicitly state their own assumptions.
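As a rough illustration of that suggestion, the sketch below maps a user-written size function over a pytree with jax.tree.map. The ASSUMED_BITS table and the assumed_bytes helper are hypothetical; the table encodes the caller's own storage assumptions and is not something JAX guarantees for any backend.

```python
import jax
import jax.numpy as jnp

# Caller-supplied assumption: bits stored per element, keyed by dtype name.
# Illustrative values only; adjust them to match the target platform.
ASSUMED_BITS = {"float32": 32, "bfloat16": 16, "int8": 8, "int4": 4}


def assumed_bytes(leaf):
    """Bytes for one leaf under ASSUMED_BITS, rounded up to whole bytes."""
    leaf = jnp.asarray(leaf)
    # dtype and size are static properties, so no device values are read here.
    total_bits = ASSUMED_BITS[leaf.dtype.name] * leaf.size
    return (total_bits + 7) // 8


params = {
    "w": jnp.ones((3, 4), dtype=jnp.float32),
    "b": jnp.ones((4,), dtype=jnp.int8),
}

per_leaf = jax.tree.map(assumed_bytes, params)  # same structure, bytes per leaf
total = sum(jax.tree.leaves(per_leaf))          # total bytes under the assumptions
```

Keeping the table in user code makes the packing assumption visible at the call site rather than hiding it inside a library function.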

I believe fp4 will suffer from the same problem as int4; I'm not sure there's any difference between integer and floating-point representations here.

We'd typically assume that 1 byte is 8 bits, so counting in bits shouldn't change the calculation, and it doesn't solve the packing representation problem of fp4/int4.

Perhaps this function could be tree_nbytes, but I'm really wary of exposing bytes counting in optax because it can be a source of issues and can be potentially confusing for users working with large models.

I'd prefer not to merge this function into optax. I find the haiku version actively confusing when working with int4 quantized models.

carlosgmartin · 2025-06-25T17:45:11Z

What does @vroulet think?

A potential alternative is to say that we're interested only in how much information there is in a pytree (not how it will be packed or laid out on devices, which is hardware-dependent). We can do this by counting bits. A tree_bits function would be able to do so exactly, even for fractional-byte types like int4, int2, float4, bool, etc.
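A bit-counting function along these lines might look like the following sketch. It is not the implementation from this PR: the names bits_per_element and tree_bits_sketch are hypothetical, and treating bool as 1 bit is an assumption taken from the comment above.

```python
import jax
import jax.numpy as jnp


def bits_per_element(dtype) -> int:
    """Information content of one element in bits (sketch, not optax's code)."""
    if dtype.kind == "b":
        return 1  # assumption: a bool carries one bit of information
    if jnp.issubdtype(dtype, jnp.integer):
        return jnp.iinfo(dtype).bits
    if jnp.issubdtype(dtype, jnp.complexfloating):
        return dtype.itemsize * 8  # finfo reports the width of one component only
    return jnp.finfo(dtype).bits


def tree_bits_sketch(tree) -> int:
    """Sum of size * bits_per_element over all leaves of a pytree."""
    total = 0
    for leaf in jax.tree.leaves(tree):
        leaf = jnp.asarray(leaf)
        total += leaf.size * bits_per_element(leaf.dtype)
    return total


tree = {
    "w": jnp.ones((1024,), dtype=jnp.int4),  # 1024 * 4 bits
    "b": jnp.ones((3,), dtype=jnp.float32),  # 3 * 32 bits
    "mask": jnp.array([True, False]),        # 2 * 1 bit
}
n_bits = tree_bits_sketch(tree)  # 4096 + 96 + 2 = 4194 bits
```

Because the sketch only reads each leaf's dtype and size, the result does not depend on how a backend packs or lays out the data.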

carlosgmartin · 2025-07-11T18:05:41Z

@rdyro I fixed the issue by switching to a tree_bits function.

carlosgmartin force-pushed the tree_bytes branch from 6b8a952 to 6d0d300 · June 18, 2025 23:44

carlosgmartin force-pushed the tree_bytes branch from 6d0d300 to a24fe51 · June 21, 2025 00:35

carlosgmartin force-pushed the tree_bytes branch 2 times, most recently from e206e87 to 7956e45 · June 21, 2025 23:20

carlosgmartin force-pushed the tree_bytes branch 2 times, most recently from 4d48682 to 4bc029b · June 22, 2025 18:34

carlosgmartin force-pushed the tree_bytes branch from 4bc029b to 780739d · June 22, 2025 22:13

carlosgmartin force-pushed the tree_bytes branch from 780739d to 85ef02b · June 22, 2025 23:22

rdyro reviewed · Jun 22, 2025

optax/tree_utils/_tree_math_test.py (outdated)

carlosgmartin force-pushed the tree_bytes branch 2 times, most recently from 96697d4 to 53e365f · June 23, 2025 00:31

carlosgmartin force-pushed the tree_bytes branch from 53e365f to 44714c9 · June 23, 2025 01:09

carlosgmartin force-pushed the tree_bytes branch 2 times, most recently from 89d298f to f3944d4 · July 11, 2025 18:03

carlosgmartin changed the title from "Add tree_bytes function." to "Add tree_bits function." · Jul 11, 2025

carlosgmartin force-pushed the tree_bytes branch from f3944d4 to cbbd7ca · July 11, 2025 18:05

carlosgmartin requested a review from rdyro · July 11, 2025 18:05

carlosgmartin force-pushed the tree_bytes branch from cbbd7ca to ad15aa0 · July 11, 2025 19:48

Add tree_bits function. (ebe94c1)

carlosgmartin force-pushed the tree_bytes branch from ad15aa0 to ebe94c1 · July 11, 2025 19:49


2 participants
