Trying to predict band splitting by jmvalin · Pull Request #15 · AOMediaCodec/oac

jmvalin · 2026-02-13T16:01:51Z

No description provided.

thirv · 2026-02-19T01:16:05Z

I'm a bit confused about the branches. You want a review of how this PR affects jmvalin/theta_cleanup, while that branch is itself used in another PR to main? Should PRs be layered this way? It seems difficult to interpret the final effect on main.

jmvalin · 2026-02-19T15:28:20Z

I just thought it made more sense to already have the changes stacked one on top of another, i.e.

main -> A -> B -> C -> D

rather than

       /-->A
main -/--> B
      \--> C
       \-> D

It's more logical in terms of development, it reduces the number of conflicts, and it's necessary in some cases where PRs depend on earlier PRs. For example, among the current PRs, #17 requires #16 to work.

thirv · 2026-02-19T17:22:13Z

If you open a PR on top of an ongoing PR, doesn't that assume that the first PR works, or at least that any changes made to the first PR during review will not affect the later one? Would it be better to wait for the first PR to merge, or do you think this is not an issue?

jmvalin · 2026-02-19T19:27:18Z

Well, I'm assuming there's more than a 50% chance the PR lands in one form or another. Having all PRs based on main would also mean having to adapt as soon as something lands in main. There's no way to avoid rebasing unless you only ever allow a single open PR to exist at any given time. In this case, I stacked the PRs to minimize the expected amount of complications. PRs #14, #15 and #16 are mostly independent (any of them can be applied separately), but they all change the same files, so I think this is simpler. As for PR #17, that one just cannot exist without #16, so it couldn't be based on main even if I wanted to.

thirv · 2026-02-20T00:07:19Z

It is clear that multiple open PRs must be allowed. I was just trying to learn why stacking is beneficial compared to opening PRs against main and merging changes from main as they appear. No problem if this way is preferred.

thirv

It would help if at least a brief description of the logic and the motivation for the PR were included. Please correct me if any of this is wrong:

The transient detector decides whether to split into short frames before the MDCT, for both mono and stereo, with B = number of subframes. theta is a generic rotation angle between the two halves of a split. Its meaning changes depending on whether the split happens in the channel, time, or frequency domain, and a different PDF is used in each case.

| Mode | B | Theta domain | Uses split_mem? | PDF type |
|---|---|---|---|---|
| Mono | 1 | Frequency (sub-band split) | No | Triangular (B0>1, balanced), Uniform (B0>1, biased), or Uniform (B0=1) |
| Mono | >1 | Time (MDCT temporal split) | Yes | Triangular (balanced, split_mem=3) or Uniform (biased, split_mem=1/2) |
| Stereo | 1 | Channel (mid/side spatial) | No | Special stereo PDF (p0=3 for bias, 1 for rest) |
| Stereo | >1 | Both: time (MDCT) + channel (mid/side) | No | Special stereo PDF (p0=3 for bias, 1 for rest) |

Assuming this is right, using split_mem to select the PDF in the one case where it is used seems to save bits (given typical audio and the PDFs used). Should it also be used in the other cases? Do you disagree with:

| Case | Currently uses split_mem? | Should use it? | Expected benefit | Implementation complexity |
|---|---|---|---|---|
| Mono B=1 (freq) | No | No | <0.1% | High (need per-band memory) |
| Mono B>1 (time) | Yes | Yes | 0.1-0.3% | Low (already implemented) |
| Stereo B=1 (mid/side) | No | Maybe | 0.5-1.0% | Medium (need stereo memory) |
| Stereo B>1 (time+channel) | No | Yes | 0.2-0.6% | Low (extend existing code) |
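To make the bit-saving argument concrete, here is a minimal sketch of how the cost of coding one quantized angle differs between a uniform and a triangular PDF. Everything in it is an assumption for illustration only: the names `uniform_pdf`, `triangular_pdf` and `cost_bits`, the step count `qn`, and the exact triangular shape are hypothetical, and the tables actually used in OAC may differ.

```python
import math

def uniform_pdf(qn):
    """Equal probability for each of the qn + 1 quantized angle steps."""
    return [1.0 / (qn + 1)] * (qn + 1)

def triangular_pdf(qn):
    """Hypothetical triangular shape that peaks at the middle step (balanced split)."""
    weights = [min(i, qn - i) + 1 for i in range(qn + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def cost_bits(pdf, symbol):
    """Ideal entropy-coding cost of one symbol, in bits."""
    return -math.log2(pdf[symbol])

if __name__ == "__main__":
    qn = 8  # assumed number of quantization steps, for illustration
    for name, pdf in (("uniform", uniform_pdf(qn)), ("triangular", triangular_pdf(qn))):
        print(name, "centre:", round(cost_bits(pdf, qn // 2), 2),
              "edge:", round(cost_bits(pdf, 0), 2))
```

With `qn = 8`, the centre step costs about 2.32 bits under the triangular PDF against about 3.17 bits under the uniform one, while an edge step costs about 4.64 bits under the triangular PDF. A PDF that matches the typical distribution of angles saves bits on average, and a mismatched one costs more, which is why choosing the PDF from context matters.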

jmvalin · 2026-02-20T20:51:35Z

Just to clarify one thing, the "stereo" flag is set when the angle itself corresponds to mid/side (exactly one split per band). So the commit still applies to stereo audio when coding the mid and side channels. Basically, when B=1 it means we're splitting along frequency, so there's no reason for one band to have the same split as the previous band. That's why B=1 is already using a triangular pdf. That leaves the stereo (mid/side) split itself. That one may eventually benefit from inter-band memory, but I haven't looked into that yet.

thirv · 2026-02-20T21:26:17Z

Are you saying that in this PR the Stereo B>1 case already uses split_mem for the temporal split, plus the stereo PDF for the spatial split?

I did not understand what "angle itself corresponds to mid/side" means.

jmvalin · 2026-02-21T15:57:00Z

So if you have a stereo input, each band will be coded with oaci_quant_band_stereo(), which computes the mid and side signals. It then calls oaci_compute_theta() with stereo=1 to compute/encode an "angle" theta=atan(|side|/|mid|), as explained in Section 4.5.1 of https://jmvalin.ca/papers/aes135_opus_celt.pdf.
From there, the mid and side signals are normalized and each of them is coded with oaci_quant_band() (which in turn calls oaci_quant_partition()) like any mono signal. Now oaci_quant_partition() decides to split any band in half (recursively) if it requires too many bits. To do that, it also uses oaci_compute_theta(), but with stereo=0 this time. In this case, the angle is between the magnitudes of the left and right parts of the vector.
Now when it comes to coding the angle, the probability of any given angle depends on the context of the split. stereo=1 means that we're splitting mid and side. B==1 means we're splitting along different frequencies, and B>1 means we're splitting along time.
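The two uses of the angle and the three split contexts described above can be sketched as follows. This is only an illustration under stated assumptions, not the OAC code: `stereo_theta`, `partition_theta` and `split_context` are hypothetical names, the mid/side definition `(l ± r) / 2` and the choice of which half goes in the numerator are assumptions, and treating any B greater than 1 as a time split follows the table earlier in the thread.

```python
import math

def norm(v):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(x * x for x in v))

def stereo_theta(left, right):
    """theta = atan(|side| / |mid|) for one band of a stereo signal (stereo=1 case)."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]    # assumed mid definition
    side = [(l - r) / 2 for l, r in zip(left, right)]   # assumed side definition
    return math.atan2(norm(side), norm(mid))

def partition_theta(x):
    """Angle between the magnitudes of the two halves of a band (stereo=0 case)."""
    half = len(x) // 2
    # Assumption: second half in the numerator, first half in the denominator.
    return math.atan2(norm(x[half:]), norm(x[:half]))

def split_context(stereo, B):
    """Name the kind of split that a given theta describes."""
    if stereo:
        return "mid/side"
    return "frequency" if B == 1 else "time"

if __name__ == "__main__":
    print(stereo_theta([1.0, 2.0], [1.0, 2.0]))    # identical channels: no side energy
    print(partition_theta([1.0, 1.0, 1.0, 1.0]))   # equal energy in both halves
    print(split_context(False, 1), split_context(False, 4), split_context(True, 4))
```

An angle of 0 means all the energy is in the first component (mid, or the first half), and pi/2 means it is all in the second. The context string is what would select the PDF used to code the angle.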

thirv

OK thanks, I was a bit confused by this but now it seems clear. In the future it would be interesting to look at split memory for the channel coding theta, if it is not meaningful now.

jmvalin · 2026-02-23T18:00:15Z

Thanks for the review. When you say "split memory for the channel coding theta", do you mean adding memory for the theta that's used for mid/side splitting? If so, then yes, it's definitely something to look at in the future. The potential gains are likely a bit smaller just because there are fewer of those angles, but it's probably still worth doing something simple.

thirv · 2026-02-23T21:07:10Z

> When you say "split memory for the channel coding theta", do you mean adding memory for the theta that's used for mid/side splitting?

Yes that was the meaning.

jmvalin self-assigned this Feb 13, 2026

jmvalin force-pushed the jmvalin/theta_cleanup branch from 226eb18 to c2ccc64 on February 17, 2026 22:50

jmvalin force-pushed the jmvalin/split_mem2 branch from 085456e to 5595048 on February 18, 2026 15:28

Trying to predict band splitting

413ed98

jmvalin force-pushed the jmvalin/split_mem2 branch from 5595048 to 413ed98 on February 18, 2026 18:20

thirv self-requested a review February 18, 2026 21:42

thirv reviewed Feb 20, 2026

thirv approved these changes Feb 23, 2026

jmvalin wants to merge 1 commit into jmvalin/theta_cleanup from jmvalin/split_mem2
