Remove upper bound for azure-datalake-store dependency #526
base: main
|
Hi @hutch3232. Thanks for the PR! I'm hesitant to remove the upper bound cap completely as it protects That being said Azure Data Lake Storage Gen 1 has been retired since February 2024 and @hutch3232 could you also elaborate more on why you are trying to use |
|
Thanks, @kyleknap. To be honest, I don't have an immediate need to use the new version of the dependency. I'm new to Azure and was poking around to try to understand how all these packages fit together and I was surprised when my resolver wasn't getting me the latest version I had seen on GitHub. Found that |
|
We'd love this to be merged and released @kyleknap -> it holds us back in Airflow from upgrading the
This is far better than the current "they are limited by you not to upgrade it even if they need it for something else". This is precisely the issue we have in Airflow: even if our users do not use Azure with fsspec, Airflow itself (and particularly the microsoft-azure provider) uses it for other things, and the sheer fact that we use fsspec limits us from upgrading the Azure library. So my recommendation is to accept the "no upper-binding" approach for those libraries, and also likely to add the missing tests for adlfs. It's not your users' fault that there are no tests, and, well, as maintainers of the library you chose to depend on it (as a required dependency) and expose Azure functionality through adlfs. And it's no secret what's changed: https://github.com/Azure/azure-data-lake-store-python/releases
But how it impacts your code, it's likely an assessment maintainers of This is the code comparison: Azure/azure-data-lake-store-python@v0.0.53...v1.0.1 -> it does not seem a lot, so I guess knowing the integration points there, it should be easy to assess |
|
@hutch3232 @potiuk Thanks for the feedback here. For the short term, I'd still prefer for now just increasing the ceiling to the major version Long term, I'd actually prefer |
Sounds good. I was about to propose that as an option as well, but I did not know how strong a tie it has with adlfs |
|
As discussed, I've added back the cap but bumped it to 2.0.0. Agreed it'd be great to remove this dependency entirely once ADLS Gen 1 support is dropped. |
|
Should we merge/release it? |
|
The diff looks good, but before merging it, I'd like to pull it down and try to vet whether the upgrade has any negative impact, especially since there are no tests for it. I'm hoping to do that in the next few days. |
Maybe we can help with that. If you have a version of azure-data-lake-store (say alpha/beta/rc) that we can test, we can run it through our (Apache Airflow) testing suite for our provider, and maybe other users can be asked to do the same. We do not have full coverage of it of course, but when you combine inputs from multiple users, your own testing can be more limited |
|
@hutch3232 Thanks! That sounds good to me. I'm thinking we just use the latest version (1.0.1). That should suffice. Does your test suite include end to end tests that make API requests to Azure DataLake Gen 1? I have not actually used Gen 1 before so I'm curious with its retirement how much of the API we will be able to use in testing its filesystem class. |
Good question: We do have system tests that test the "real" Azure service, but likely not datalake-store https://github.com/apache/airflow/tree/main/providers/microsoft/azure/tests/system/microsoft/azure - and those rely on someone who would like to run them. We tried to encourage Microsoft to take stewardship of the azure provider, with limited success so far, unfortunately, so if the tests are really "end-2-end" we likely cannot help much. Unless of course we can get hold of the azure-datalake-store team, who might be interested in spending time on contributing to and testing the system test suite. |
|
@hutch3232 Got it. Thanks for the context. I still think running it through your test suite would be helpful. I was mainly curious whether you had an end-to-end setup. I plan to try to get a working setup when I pull down the PR and see how far I get. |

I noticed there is a 1.x.x release (June, 2025) of this dependency but this upper bound prevents its usage. However, it is being tested against already (the latest.txt requirements are uncapped).
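
For illustration only, here is a minimal sketch of how an upper bound decides whether a resolver may pick the 1.x release. The floor (`0.0.53`) and the old ceiling (`0.1`) are assumptions made up for the example; only the `2.0.0` cap and the versions `0.0.53` and `1.0.1` appear in this thread. The comparison is a simplified stand-in for real PEP 440 handling (in practice `packaging.specifiers.SpecifierSet` does this properly) and only understands plain dotted releases.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn a plain dotted release like '1.0.1' into a 3-part comparable tuple."""
    parts = tuple(int(part) for part in version.split("."))
    # Pad short versions so that '0.1' compares equal to '0.1.0'.
    return parts + (0,) * (3 - len(parts))


def allowed(version: str, lower: str, upper: str) -> bool:
    """True if lower <= version < upper (inclusive floor, exclusive cap)."""
    return parse(lower) <= parse(version) < parse(upper)


# Hypothetical bounds for illustration; only the 2.0.0 cap comes from the thread.
FLOOR = "0.0.53"   # assumed lower bound
OLD_CAP = "0.1"    # assumed previous ceiling
NEW_CAP = "2.0.0"  # ceiling agreed in the discussion

for candidate in ("0.0.53", "1.0.1", "2.0.0"):
    print(
        candidate,
        "old cap:", allowed(candidate, FLOOR, OLD_CAP),
        "new cap:", allowed(candidate, FLOOR, NEW_CAP),
    )
```

Under these assumed bounds, `1.0.1` is rejected by the old ceiling and accepted by the new one, while a future `2.0.0` would still be excluded until the cap is raised or removed.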