Set current job as environment variable, and add util to get job type by kmontemayor2-sc · Pull Request #431 · Snapchat/GiGL

kmontemayor2-sc · 2026-01-09T23:20:55Z

Scope of work done

We set the current job as an environment variable so we can infer the job type downstream, e.g. for is_inference [1].

I thought about making this a CLI flag that we inject, similar to use_cuda [2]. But I strongly feel that we should not be injecting CLI flags, and that the CLI flags should only be controlled by users.

If I'm a user, I expect there to be all sorts of stuff in the environment variables, but I'd expect the CLI flags to be mine (and I don't think we should be using the ones we have anyway...).

We could still inject the CLI flag, but I think it's going to cause more pain down the road, and it's going to be unsafe to add to the colocated (i.e. non-graphstore) jobs.

Another note: I think we should use the "job type" rather than is_inference, as we may have other job types in the future (e.g. some GLT preprocessor step to compute PEs or similar).
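For illustration only, here is a minimal sketch of what a "job type" utility backed by an environment variable could look like. The variable name `EXAMPLE_JOB_TYPE`, the `JobType` enum, and both helper functions are assumptions made for this example; the actual names in python/gigl/env/distributed.py are not shown in this thread.

```python
import os
from enum import Enum
from typing import Mapping, Optional

# Hypothetical variable name, chosen for this sketch only.
JOB_TYPE_ENV_VAR = "EXAMPLE_JOB_TYPE"


class JobType(Enum):
    """Hypothetical job types; an enum leaves room for future job kinds."""

    TRAINING = "training"
    INFERENCE = "inference"


def get_job_type(environ: Optional[Mapping[str, str]] = None) -> JobType:
    """Read the job type from the environment (defaults to os.environ)."""
    env = os.environ if environ is None else environ
    valid = [job.value for job in JobType]
    raw = env.get(JOB_TYPE_ENV_VAR)
    if raw is None:
        raise ValueError(f"{JOB_TYPE_ENV_VAR} is not set; expected one of {valid}")
    try:
        return JobType(raw.lower())
    except ValueError:
        raise ValueError(
            f"{JOB_TYPE_ENV_VAR}={raw!r} is not valid; expected one of {valid}"
        ) from None


def is_inference(environ: Optional[Mapping[str, str]] = None) -> bool:
    """A boolean check derived from the job type, not stored separately."""
    return get_job_type(environ) is JobType.INFERENCE
```

With this shape, adding a new job kind means adding one enum member, while a boolean flag would need a second flag or a breaking change.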

Where is the documentation for this feature?: N/A

Did you add automated tests or write a test plan?

Updated Changelog.md? NO

Ready for code review?: NO

kmontemayor2-sc · 2026-01-09T23:21:28Z

/unit_test_py

kmontemayor2-sc · 2026-01-09T23:21:33Z

/integration_test

kmontemayor2-sc · 2026-01-09T23:21:39Z

/e2e_test

github-actions · 2026-01-09T23:21:40Z

GiGL Automation

@ 23:21:40UTC : 🔄 Python Unit Test started.

@ 24:36:24UTC : ✅ Workflow completed successfully.

github-actions · 2026-01-09T23:21:46Z

GiGL Automation

@ 23:21:45UTC : 🔄 Integration Test started.

@ 24:30:51UTC : ✅ Workflow completed successfully.

github-actions · 2026-01-09T23:21:50Z

GiGL Automation

@ 23:21:49UTC : 🔄 E2E Test started.

@ 24:36:14UTC : ✅ Workflow completed successfully.

svij-sc

I am not a fan of introducing env vars as it may limit flexibility to move to other backends for train/infer. Specifically we want to support K8s training/inference sometime later this year.

Btw, how does this affect if a user is doing local training/inference?
Will we have to set this var?
If it doesn't affect local training I am fine with this change.

python/gigl/env/distributed.py

kmontemayor2-sc · 2026-01-13T17:23:26Z

> I am not a fan of introducing env vars as it may limit flexibility to move to other backends for train/infer. Specifically we want to support K8s training/inference sometime later this year.

I think that it's pretty easy to migrate the env vars to k8s/etc. If we're not able to set env vars at all, then I think using that as a backend would be a non-starter.

Regardless, in the migration to k8s we're going to need to migrate the injected CLI flags (among other things like labels [1]).

> Btw, how does this affect if a user is doing local training/inference?

What do you mean by this? AFAIK people aren't running glt_trainer locally, they directly run the training loops? If they are running the loops they already need to set a lot of env vars like RANK, WORLD_SIZE etc.
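To make the local-run case concrete, here is a rough sketch of the environment a user might assemble before calling a training loop directly. `RANK`, `WORLD_SIZE`, `MASTER_ADDR`, and `MASTER_PORT` are the standard variables read by `torch.distributed` env-based initialization; `EXAMPLE_JOB_TYPE` and both helper functions are hypothetical names used only for this example.

```python
import os
from typing import Dict

# Hypothetical variable name, chosen for this sketch only.
JOB_TYPE_ENV_VAR = "EXAMPLE_JOB_TYPE"


def build_local_env(
    rank: int,
    world_size: int,
    job_type: str,
    master_addr: str = "127.0.0.1",
    master_port: int = 29500,
) -> Dict[str, str]:
    """Return the env vars one local process would need, as strings."""
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} must be in [0, {world_size})")
    return {
        "RANK": str(rank),
        "WORLD_SIZE": str(world_size),
        "MASTER_ADDR": master_addr,
        "MASTER_PORT": str(master_port),
        # One extra variable alongside the ones distributed training already needs.
        JOB_TYPE_ENV_VAR: job_type,
    }


def apply_env(env: Dict[str, str]) -> None:
    """Set each variable unless the user has already exported it."""
    for key, value in env.items():
        os.environ.setdefault(key, value)
```

Under these assumptions, the job type is one more entry in a dict the user already has to populate for a manual distributed run.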

svij-sc · 2026-01-13T18:43:09Z

> What do you mean by this? AFAIK people aren't running glt_trainer locally, they directly run the training loops? If they are running the loops they already need to set a lot of env vars like RANK, WORLD_SIZE etc.

Yes, but those are expected for any dist training.
Ideally we have a local launcher that takes care of it for us too, but we are not there yet. Ideally we make use of something like torchrun, but I think that's not possible right now: https://docs.pytorch.org/docs/stable/elastic/run.html

Anyways, I am just concerned, but this is not blocking.
I do agree it's nice to have a utility to see what component you are in.

mkolodner-sc

This LGTM, thanks Kyle!

kmontemayor2-sc · 2026-01-13T18:50:36Z

> Yes, but those are expected for any dist training.

Is your concern that we may now need to require users to provide more env vars? I guess that's fair, but given that we build datasets differently based on the component [1], I'm not sure what the other approach here is. We need to signal somehow, either via CLI flag or env var, and I feel that an env var is less intrusive.

FWIW I don't expect we'll require users to set this all the time, but I do think it'll be useful for graphstore mode (which is sort of weird to run locally anyways...)
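As a rough illustration of why an env var is less intrusive than an injected flag, the sketch below has a launcher pass the job type through the child's environment while forwarding the user's CLI arguments untouched. `EXAMPLE_JOB_TYPE` and `launch` are hypothetical names, and the child process is a stand-in one-liner rather than a real trainer entrypoint.

```python
import os
import subprocess
import sys
from typing import List

# Hypothetical variable name, chosen for this sketch only.
JOB_TYPE_ENV_VAR = "EXAMPLE_JOB_TYPE"


def launch(user_args: List[str], job_type: str) -> str:
    """Run a stand-in child process and return what it observed."""
    # The signal travels in the environment, not in argv.
    child_env = {**os.environ, JOB_TYPE_ENV_VAR: job_type}
    # Stand-in for the user's entrypoint: report the env var and its own args.
    child_code = (
        "import os, sys; "
        f"print(os.environ['{JOB_TYPE_ENV_VAR}'], sys.argv[1:])"
    )
    result = subprocess.run(
        [sys.executable, "-c", child_code, *user_args],
        env=child_env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The child sees exactly the flags the user wrote, plus the env var.
    print(launch(["--learning_rate=0.01"], "inference"))
```

The child's argument list contains only what the user supplied, so a strict argument parser in user code would not need to know about anything the launcher adds.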

Set current job as environment variable, and add util to get job type

86ae5d6

kmontemayor2-sc requested review from mkolodner-sc, nshah-sc, svij-sc, xgao4-sc, yliu2-sc and zfan3-sc as code owners January 9, 2026 23:20

svij-sc approved these changes Jan 13, 2026


python/gigl/env/distributed.py (outdated, resolved)

kmonte added 3 commits January 13, 2026 17:33

Merge branch 'main' into kmonte/add-inf-env-var

3d86e9e

rename

2c4baa8

update error message

8eac2e8

mkolodner-sc approved these changes Jan 13, 2026


mkolodner-sc mentioned this pull request Jan 13, 2026

Setup DistNeighborloader for graph store sampling #432 (Merged)

kmontemayor2-sc added this pull request to the merge queue Jan 13, 2026

kmontemayor2-sc removed this pull request from the merge queue due to a manual request Jan 13, 2026

kmontemayor2-sc mentioned this pull request Jan 26, 2026

[Custom Storage 2/3] Implement custom storage main #462 (Merged)


kmontemayor2-sc wants to merge 4 commits into main from kmonte/add-inf-env-var


4 participants
