Set current job as environment variable, and add util to get job type#431
Set current job as environment variable, and add util to get job type#431kmontemayor2-sc wants to merge 4 commits intomainfrom
Conversation
|
/unit_test_py |
|
/integration_test |
|
/e2e_test |
GiGL Automation@ 23:21:40UTC : 🔄 @ 24:36:24UTC : ✅ Workflow completed successfully. |
GiGL Automation@ 23:21:45UTC : 🔄 @ 24:30:51UTC : ✅ Workflow completed successfully. |
GiGL Automation@ 23:21:49UTC : 🔄 @ 24:36:14UTC : ✅ Workflow completed successfully. |
svij-sc
left a comment
There was a problem hiding this comment.
I am not a fan of introducing env vars as it may limit flexibility to move to other backends for train/infer. Specifically we want to support K8s training/inference sometime later this year.
Btw, how does this affect if a user is doing local training/inference?
Will we have to set this var?
If it doesn't affect local training I am fine with this change.
I think that it's pretty easy to migrate the env vars to k8s/etc, if we're not able to set env vars at all then I think using that as a backend would be a non-starter. Regardless in the migration to k8s we're going to need to migrate the injected CLI flags (among other things like labels 1).
What do you mean by this? AFAIK people aren't running |
Yes, but those are expected for any dist training. Anyways, I am just concerned but this is not blocking. |
mkolodner-sc
left a comment
There was a problem hiding this comment.
This LGTM, thanks Kyle!
Is your concern that we may now need to require users to provide more env vars? I guess that's fair - but given that we build datasets differently based on the component 1 I'm not sure what the other approach here is, we need to signal somehow, either via CLI flag or env var, and I feel that env var is less intrusive. FWIW I don't expect we'll require users to set this all the time, but I do think it'll be useful for graphstore more (which is sort of weird to run locally anyways...) |
Scope of work done
We do this so we can infer the job type downstream, e.g. for
is_inference1.I thought about making this a CLI flag that we inject, similar to
use_cuda2. But I strongly feel that we should not be injecting CLI flags, and that the CLI flags should only be controled by users.If I'm a user, I expect there to be all sorts of stuff in the environment variables, but I'd expect the CLI flags to be mine, (and I don't thikn we should be using the ones we have anyways...)
We could still inject the CLI flag but I think it's going to cause more pain down the road, and its going to be unsafe to add to the colcateed (e.g. non-graphstore jobs).
Another note: I think we should use the "job type" vs
is_inferenceas we may have other "job" types in the future (e.g. some GLT preprocessor step to compute PEs or similar.Where is the documentation for this feature?: N/A
Did you add automated tests or write a test plan?
Updated Changelog.md? NO
Ready for code review?: NO