
Use uv to run CI#973

Closed
adenzler-nvidia wants to merge 1 commit into google-deepmind:main from adenzler-nvidia:dev/adenzler/uv-ci

Conversation

@adenzler-nvidia
Collaborator

This PR changes the CI to use the dependency versions locked in the uv.lock file. Right now CI tests against the latest dev versions of warp and mujoco, which means the uv workflow is untested and we have no test against known-good dependencies.

I made this a draft because there's definitely a discussion to be had here. My stance is that we should have nightlies for dev versions of mujoco/warp and use the uv workflow for the package that ships and that users will use. That way, any change that needs a newer minimum version has to include the version upgrade as well, and we explicitly know when we are breaking dependencies.

But I probably don't have the full picture here, so let's discuss.

Signed-off-by: Alain Denzler <adenzler@nvidia.com>
@adenzler-nvidia adenzler-nvidia marked this pull request as draft January 5, 2026 08:48
@thowell thowell requested a review from erikfrey January 5, 2026 10:57
@thowell
Collaborator

thowell commented Jan 5, 2026

thanks @adenzler-nvidia!
#917 is probably relevant to this discussion

@thowell thowell requested a review from btaba January 5, 2026 10:59
@adenzler-nvidia
Collaborator Author

yes, looks super relevant. Not sure if everything on this matrix needs to be tested on each PR, I think for forward compatibility it's ok to have nightlies (for uv-locked mujoco + nightly warp, or uv-locked warp and nightly mujoco).

@adenzler-nvidia
Collaborator Author

The other thing we need (which I didn't include but we probably should) is to make sure pyproject.toml specifies correct minimum versions that actually work. For example, the warp-lang requirement of >= 1.9.0 likely doesn't work.
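One way to test the declared minimums, sketched here under assumptions: a CI job could turn each `>=` lower bound into an exact pin and install those pins before running the tests. The helper name `pin_to_minimums` is hypothetical, the `mujoco` bound in the usage line is illustrative only, and the sketch handles only simple `>=` specifiers (upper bounds and environment markers are dropped).

```python
import re

# Matches "name>=version" at the start of a requirement string, with optional extras.
# Assumption: only simple lower bounds are handled; anything after the first
# version (upper bounds, environment markers) is intentionally discarded.
_LOWER_BOUND = re.compile(
    r"^\s*([A-Za-z0-9._-]+)\s*(\[[^\]]*\])?\s*>=\s*([A-Za-z0-9.*+!_-]+)"
)


def pin_to_minimums(requirements):
    """Rewrite 'name>=X' requirements as 'name==X' pins (hypothetical helper)."""
    pins = []
    for req in requirements:
        match = _LOWER_BOUND.match(req)
        if match is None:
            # No simple lower bound found: keep the requirement unchanged.
            pins.append(req.strip())
            continue
        name, extras, version = match.group(1), match.group(2) or "", match.group(3)
        pins.append(f"{name}{extras}=={version}")
    return pins
```

For example, `pin_to_minimums(["warp-lang>=1.9.0", "mujoco >= 3.3.0, <4"])` yields pins that could be written to a constraints file for a "minimum versions" job. If that job fails, the declared lower bound is too low.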

@adenzler-nvidia
Collaborator Author

@thowell wondering what we should do with this. I think the dev/release questions are probably going to be much easier to answer once there is a proper release schedule.

But are the versions we have in uv.lock considered dev or release versions? I think it makes sense to test with the versions we have in uv.lock for the PR CI, rather than just getting the latest. Isn't that the whole point of having a uv.lock file? What do you think?

@thowell
Collaborator

thowell commented Jan 13, 2026

@btaba what are your thoughts on this and what do you recommend?

Comment thread .github/workflows/ci.yml
Comment on lines 66 to 67
python -m pip install --upgrade pip
pip install uv
Collaborator

It's also simpler just to use the setup-uv action as recommended in the uv docs (https://docs.astral.sh/uv/guides/integration/github/) instead of installing via pip

@btaba
Collaborator

btaba commented Jan 16, 2026

To answer @thowell , I went through uv docs a bit and here's what I came up with. I'm not a uv expert.

  • Single uv.lock (as uv intended)

  • Prod/release build: uv sync --no-sources --prerelease=disallow

  • Dev/nightly: uv sync

with this config to pyproject.toml

[tool.uv]
upgrade-package = ["mujoco", "warp-lang"]
prerelease = "allow"
index-url = "https://pypi.org/simple/"

dev installs with nightlies, and prod installs from PyPI. We should test against both to make sure we maintain backwards compatibility at HEAD (mainly a constraint of our google import workflow). Before release, uv.lock should be updated to include PyPI versions only.

wdyt @thowell @adenzler-nvidia @shi-eric

@shi-eric
Collaborator

@btaba I lean towards simplifying more. Keep the uv.lock for development builds + merge requests. The main purpose of uv.lock is to make sure that the main branch always "just works" with pulling dependencies from the lockfile via a uv run ... or uv sync. Having the uv.lock file do double duty with dev/release testing is asking a bit much of it IMO.

Now when we start talking about prod/release builds that pull only from PyPI, I would suggest a GitHub job that doesn't use uv at all. The job will just use ordinary pip install commands (which by default don't get pre-release versions). This job might run in pull requests (in which failures are tolerated) or it can run on the main branch only for informative purposes.

@adenzler-nvidia
Collaborator Author

I agree with Eric here - uv.lock should just enable always pulling a working version for development. As a consequence, the PR checks should use that lock file to get the dependencies. I tend to also update pyproject.toml to these versions on the main branch because not all devs are that used to using uv yet, but that's not strictly necessary and I don't mind if we don't do that.

And then I think what remains is that we need other CI jobs that check forward/backward compatibility. If we plan to release with a dependency version that is not released yet, we test against its dev packages; for backward compatibility, we have a check that makes sure people don't increase the uv.lock versions beyond the requirement for that dependency.

@btaba
Collaborator

btaba commented Jan 16, 2026

@shi-eric sounds good to me.

To summarize: with the pyproject mod in my OP, uv sync would pull nightlies for dev for expected packages. And for prod, instead of uv sync --no-sources --prerelease=disallow, we use pip, which is what pip does anyways.

I don't think failures should be tolerated for the prod CI workflows (just because of an already existing requirement on our end, that we don't want MuJoCo releases blocked on warp-lang or mujoco_warp releases – transitive dep through MJX – and we need the mujoco_warp imports to always work at HEAD).

RE @adenzler-nvidia: if folks still use pip for dev, pyproject.toml should not pin to pre-release versions. I find that pip install --pre causes more friction.

@shi-eric
Collaborator

> I don't think failures should be tolerated for the prod CI workflows (just because of an already existing requirement on our end, that we don't want MuJoCo releases blocked on warp-lang or mujoco_warp releases – transitive dep through MJX – and we need the mujoco_warp imports to always work at HEAD).

Can you clarify this point a bit more? Also I don't know what happened to @kevinzakka's comment. Maybe there are some nomenclature differences. Sorry in advance for the wall of text.

Current Situation

mujoco_warp has two main unit test workflows that are run on a pull request called Build and test (putting Python versions aside). One installs the dependencies satisfying project.dependencies list (defaults to the latest prod versions of the dependencies) and the other installs the dependencies that must satisfy project.optional-dependencies.dev.

The first workflow (prod dependencies) already has continue-on-error while the second does not. This makes sense because the main branch is the bleeding-edge development branch, typically making use of unreleased (nightly) builds of Warp and MuJoCo.

Dev workflows

The second workflow (Build and test with dev dependencies) is required to pass for pull requests, but it's not making use of the uv.lock file since it is using uv pip install. This means that the uv.lock file isn't really being utilized the way it's supposed to be (to have a set of version-resolved dependencies that are consistent with pyproject.toml requirements and have been tested to work with the current state of the main branch).

Since the dev version of the Build and test workflow doesn't use the lockfile, what the GitHub CI/CD checks actually ensure is that mujoco_warp's main branch works with the latest dev versions of warp-lang and mujoco when a pull request is made. This is useful information, but it can derail the scope of pull requests (e.g. someone has a pull request to fix bug Y, but because the latest dev version of mujoco removed Z, they now have to fix that to get the pipelines green again). This is just my guess of what happens a lot with the current setup; I don't pay close attention to the contributions.

The thing that should be fixed here is that the mandatory-pass workflow in CI/CD needs to install dependencies from the uv.lock file. Since all changes to main happen through pull requests, this virtually guarantees that we always know the versions of mujoco and warp-lang that are known to work with the current state of the main branch. (I say virtually because an exception to this is when a dependency is no longer available, such as when an older nightly mujoco wheel is removed.)

Prod workflows

This is more ambiguous since I don't fully understand what @btaba means by a prod workflow and your release process. To me, a prod workflow means testing mujoco_warp using release-version dependencies (no .dev in the version string) available from PyPI. From this meaning, it is highly unlikely that requiring a pass in CI/CD for pull requests is what you want. This would mean that no changes can be merged into the main branch until a prod release of the mujoco and warp-lang packages has the feature available.

Depending on your release process, it's usually only right before a release that this job is brought into the passing state, as verification that the version of the mujoco_warp library you want to release is compatible with prod versions of the dependencies. I recommended pip install here since it won't use third-party mujoco/nvidia Python indexes and it won't pull dev dependencies unless you opt in. You can get this behavior with uv commands, but conceptually I find the pip behavior a lot simpler to understand than the uv-based commands.

@adenzler-nvidia
Collaborator Author

Thanks Eric - very helpful.

I think we need to understand more about the forward/backward compatibility requirements that are imposed by the Google workflow.

I was assuming we could have something like this:

  • main (and the next release branched off main) always targets a release version for each dependency.
  • uv.lock contains known-good dev packages of these target release versions, and enforces this in CI
  • we have a nightly pipeline that runs against the latest dev versions of the target releases, which helps inform us about potential breaking changes we need to act on. Acting on one of these means increasing the versions in uv.lock as well.
  • once a target release version is released, uv.lock is switched to the release version and pyproject.toml is updated.
  • once the release branch is created, pyproject.toml and uv.lock are updated to point to non-dev packages on the release branch, and the target versions for the next release branch off main are defined.

Does that make sense?

@erikfrey
Collaborator

erikfrey commented Jan 26, 2026

It looks like we can create a github action that updates uv.lock nightly. We can probably make it clever enough to create a PR that merges automatically if the tests pass, or leaves the PR open with failure log if a test fails.

So I generally subscribe to the flow proposed by @adenzler-nvidia and @shi-eric

This is some work though.

For now let's get in whatever is the smallest change that at least ensures that our matrix is testing dev against nightlies and "release" against non-nightlies, where the latter can continue on error, and then make an issue to track this idea of relying on uv.lock as a source of truth for dev testing, which seems reasonable to me.

@shi-eric
Collaborator

> It looks like we can create a github action that updates uv.lock nightly. We can probably make it clever enough to create a PR that merges automatically if the tests pass, or leaves the PR open with failure log if a test fails.
>
> So I generally subscribe to the flow proposed by @adenzler-nvidia and @shi-eric
>
> This is some work though.
>
> For now let's get in whatever is the smallest change that at least ensures that our matrix is testing dev against nightlies and "release" against non-nightlies, where the latter can continue on error, and then make an issue to track this idea of relying on uv.lock as a source of truth for dev testing, which seems reasonable to me.

Thanks for weighing in. With my "bleeding-edge" statement I didn't mean that the uv.lock file should always have the most recent versions of every dependency. I would recommend against developing a GitHub Actions workflow that auto-updates the uv.lock file, due to the extra noise in the commit history and the removal of explicit developer intent when it comes to updating a dependency version. For example, the uv.lock file doesn't need to be bumped every day to pick up the most recent Warp nightly. A developer should instead submit a pull request to update the Warp nightly only when an upgrade is needed (e.g. to pick up a new feature).

@erikfrey
Collaborator

erikfrey commented Jan 27, 2026

A property of our CI that I would like to maintain is that we are testing against the latest nightly build to catch regressions in dependencies - which are still quite frequent - so that we can address them right away.

It seems like you are recommending against testing against the latest nightlies in CI and also recommending against frequently updating the lock file. Then how do we maintain this property of rapidly catching breaking behavior in our dependencies?

@shi-eric
Collaborator

> A property of our CI that I would like to maintain is that we are testing against the latest nightly build to catch regressions in dependencies - which are still quite frequent - so that we can address them right away.
>
> It seems like you are recommending against testing against the latest nightlies in CI and also recommending against frequently updating the lock file. Then how do we maintain this property of rapidly catching breaking behavior in our dependencies?

Oh, I still recommend having a CI/CD job that tests against the bleeding edge dependencies to catch issues early. We do this in Warp with testing against MJWarp main and also in Newton when testing against the nightly Warp. But I think this job shouldn't automatically increment the uv lockfile if it passes (after all, CI/CD doesn't test GPU and it's good to have a human in the loop). I think this was also what Alain had in mind when he said:

> we have a nightly pipeline that runs against the latest dev versions of the target releases, which helps inform us about potential breaking changes we need to act on. Acting on one of these means increasing the versions in uv.lock as well.

@shi-eric
Collaborator

Also wanted to mention that https://docs.astral.sh/uv/guides/integration/pre-commit/ will be useful to enable.

@erikfrey
Collaborator

Ah OK thanks @shi-eric - let me summarize what I think are the concrete changes:

  1. PRs should run tests on whatever dep versions are in uv.lock
  2. To satisfy @btaba 's MJX requirements that MJWarp HEAD works against release versions of warp and mujoco, PRs should also run tests against the latest released deps too. I think @btaba would ideally like a failure here to block merging. This means we occasionally have code in MJWarp that gates some functionality if it requires unreleased features in a dependency.
  3. We set up a nightly CI that also tests against nightly deps.

Is that a fair summary of where we have landed?

cc @btaba @thowell

@shi-eric
Collaborator

> Ah OK thanks @shi-eric - let me summarize what I think are the concrete changes:
>
> 1. PRs should run tests on whatever dep versions are in `uv.lock`
> 2. To satisfy @btaba 's MJX requirements that MJWarp HEAD works against release versions of warp and mujoco, PRs should also run tests against the latest released deps too. I think @btaba would ideally like a failure here to block merging. This means we occasionally have code in MJWarp that gates some functionality if it requires unreleased features in a dependency.
> 3. We set up a nightly CI that also tests against nightly deps.
>
> Is that a fair summary of where we have landed?
>
> cc @btaba @thowell

Yes, though I am still concerned that having #2 be a required pass is burdensome and will make it hard for new MJWarp features to be used in Newton HEAD.

For Warp and Newton, we make (or plan to make) releases off of a separate branch where commits can be cherry-picked onto it so that code using unreleased features from a dependency can be safely merged into main without interfering with an upcoming release of the library.

But this is just a simple toggle in the CI/CD config so it's a decision that can be trivially changed if you find that the required/optional pass decision needs to be changed.

I can also open a pull request to make these changes.

@erikfrey
Collaborator

@shi-eric a PR would be amazing thanks.

@erikfrey
Collaborator

This is addressed in #1084 - closing this PR. Thanks folks for the helpful discussion.

@erikfrey erikfrey closed this Jan 28, 2026

5 participants