[3/6 Deps Update] Install UV + update dev usage to UV#398
[3/6 Deps Update] Install UV + update dev usage to UV#398svij-sc merged 15 commits intosvij/update-deps-migrate-uvfrom
Conversation
python/gigl/scripts/post_install.py
Outdated
| if result != 0: | ||
| raise RuntimeError( | ||
| f"Post-install script finished running, with return code: {result}" | ||
| ) |
There was a problem hiding this comment.
Does this change need to be part of these prs?
There was a problem hiding this comment.
I can pull it out if you prefer.
This was failing at some point during testing and I wanted to ensure that it didn't silently fail.
There was a problem hiding this comment.
I'd prefer to just address these unrelated things out of the scope of this uv+py3.11 change as the core changes are already pretty massive.
e.g. #409
| if [[ "${UV_SYSTEM_PYTHON}" == "true" ]] | ||
| then | ||
| echo "Recognized using system python." | ||
| echo "Will use inexact match for dependencies so we don't override system packages." | ||
| flag_use_inexact_match="--inexact" | ||
| fi |
There was a problem hiding this comment.
seems risky? Is this ever true? I'd imagine we'd want to error out?
Or do we need this for prod?
There was a problem hiding this comment.
Left a comment, lso as an example here:
GiGL/containers/Dockerfile.cuda.base
Lines 9 to 11 in b771633
There was a problem hiding this comment.
hmmm - should we be doing this? I feel like this makes our environment less hermetic and more susceptible to e.g. gcp changes.
There was a problem hiding this comment.
Could you expand what you mean by gcp changes?
The alternative would be we create a complicated docker image that is able to build / install torch and cuda libs closely matching how pytorch genreates their docker images: https://hub.docker.com/layers/pytorch/pytorch/2.9.1-cuda12.6-cudnn9-runtime/images/sha256-28fb54912f51b59d80f696da9e551ac1da601ce761195f5ff8dd59dccdcd1516
I'd say the preference we want to have is leave the heavy lifting of creating optimized docker base images to pytorch and cuda engineers and just to use the existing installed python interpreter and the related cuda / pytorch libs.
There was a problem hiding this comment.
Do the system python and non-system python executables have different binaries? Or is your point that torch/cuda libs are installed specially and just doing pip install torch is not going to give us the same behavior?
My concern here is that if torch/etc end up putting some other deps in their "system install" then we may run into conflicts. This is what I meant by "gcp changes" e.g. our upstream sources change and now we're broken for mysterious reasons.
But if there is in fact special (if not secret) sauce in the dep installation it makes sense to resuse that.
There was a problem hiding this comment.
torch/cuda libs are installed specially and just doing pip install torch is not going to give us the same behavior?
exactly this.
Note that we have the same issue w/ the dataflow image. I need to use system python as it has apache beam installed. For whatever reason I am unable to just "install the right version of apache beam" and it working for dataflow workers. See: #399 (comment)
The older version of building Dataflow base image does not work anymore i.e. COPY --from=apache/beam_python3.9_sdk:2.53.0 /opt/apache/beam /opt/apache/beam with the updated python and sdk version creates runtime failures when running dataflow jobs. Thus, we are now just using apache/beam_python3.11_sdk:2.56.0 as base image instead of trying to copy from it.
Re:
torch/cuda libs are installed specially and just doing pip install torch is not going to give us the same behavior?
Sure, but that would mean we update the docker source image - but this happens very rarely.
And, when we update that image likely other things might also be dreprecated that will need to be fixed beyond dep resolution.
| git diff --quiet || { echo Branch is dirty, please commit changes and ensure branch is clean; exit 1; } | ||
| $(eval GIT_COMMIT=$(shell git log -1 --pretty=format:"%H")) | ||
|
|
||
| initialize_environment: |
There was a problem hiding this comment.
this is documented in a couple .md files as well, and need to be removed. My understanding is that with uv we don't need conda virtualenv anymore right
There was a problem hiding this comment.
good catch, updated
| COPY requirements gigl_deps/requirements | ||
| COPY python/gigl/scripts gigl_deps/python/gigl/scripts | ||
| RUN pip install --upgrade pip | ||
| RUN cd gigl_deps && bash ./requirements/install_py_deps.sh --no-pip-cache --dev |
There was a problem hiding this comment.
uv cache is diff how pip cache is handled.
pip cache gave us issues in the past due to some libs being preinstalled w/ the conda python interpreters.
This is not an issue anymore as we are not using pip anymore and have replaced it with uv
|
|
||
| check_if_valid_env: | ||
| #@command -v docker >/dev/null 2>&1 || { echo >&2 "docker is required but it's not installed. Aborting."; exit 1; } | ||
| @command -v docker >/dev/null 2>&1 || { echo >&2 "docker is required but it's not installed. Aborting."; exit 1; } |
There was a problem hiding this comment.
Nit. Should probably go in a separate PR.
There was a problem hiding this comment.
I created a pr for other related comment.
Leaving this in since its 1 line / nit.
python/gigl/scripts/post_install.py
Outdated
| if result != 0: | ||
| raise RuntimeError( | ||
| f"Post-install script finished running, with return code: {result}" | ||
| ) |
There was a problem hiding this comment.
I'd prefer to just address these unrelated things out of the scope of this uv+py3.11 change as the core changes are already pretty massive.
e.g. #409
| if [[ "${UV_SYSTEM_PYTHON}" == "true" ]] | ||
| then | ||
| echo "Recognized using system python." | ||
| echo "Will use inexact match for dependencies so we don't override system packages." | ||
| flag_use_inexact_match="--inexact" | ||
| fi |
There was a problem hiding this comment.
hmmm - should we be doing this? I feel like this makes our environment less hermetic and more susceptible to e.g. gcp changes.
kmontemayor2-sc
left a comment
There was a problem hiding this comment.
FYI there are some other references to pip install in our code base - let's update them too (at some point) https://github.com/search?q=repo%3ASnapchat%2FGiGL+%22pip+install%22&type=code
| pip install -e ./python/ | ||
|
|
||
| # Can only be run on an arm64 mac, otherwise generated hashed req file will be wrong | ||
| generate_mac_arm64_cpu_hashed_requirements: |
There was a problem hiding this comment.
should we add a make generate_deps or something and update this? http://github.com/Snapchat/GiGL/pull/397/files#diff-8cd8c57e79e3bb16584ae59589a0bb144d5f9ca8ccc32a972b659601f16b9e65R15
| if [[ "${UV_SYSTEM_PYTHON}" == "true" ]] | ||
| then | ||
| echo "Recognized using system python." | ||
| echo "Will use inexact match for dependencies so we don't override system packages." | ||
| flag_use_inexact_match="--inexact" | ||
| fi |
There was a problem hiding this comment.
Do the system python and non-system python executables have different binaries? Or is your point that torch/cuda libs are installed specially and just doing pip install torch is not going to give us the same behavior?
My concern here is that if torch/etc end up putting some other deps in their "system install" then we may run into conflicts. This is what I meant by "gcp changes" e.g. our upstream sources change and now we're broken for mysterious reasons.
But if there is in fact special (if not secret) sauce in the dep installation it makes sense to resuse that.
| # 10.0 = Blackwell support i.e. B200 - CUDA 12.6 or later | ||
| # 12.0 = Blackwell support i.e. RTX6000 - CUDA 12.8 or later | ||
| # List of Nvidia GPUS: https://developer.nvidia.com/cuda-gpus | ||
| TORCH_CUDA_ARCH_LIST="7.5" WITH_CUDA="ON" python setup.py bdist_wheel |
There was a problem hiding this comment.
hmmm do we not need to bump this too if we want to use newer GPUs (more efficiently)? Or is that something we want to do later?
There was a problem hiding this comment.
probably as a seperate change.
Hopefully it does become more efficient.?
But let me add that to the tracker - this is a good call.
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Background Context: #405
Some guidance on stack of PRs: #405 (comment)
Scope of PR
requirements/install_py_deps.shsouvis installed. Introduced a couple new flags that can help with finer grain control when installing gigl deps, installing glt, etc. Subsequently, also cleaned up this script quite a bit as it was lacking love.Makefileto now usuv