2 changes: 1 addition & 1 deletion docs/requirements-docs.txt
@@ -4,4 +4,4 @@ sphinx-autobuild==2025.8.25
 sphinx-copybutton==0.5.2
 sphinx-sitemap==2.9.0
 sphinx-click==6.2.0
-sphinx-autodoc-typehints==3.9.9
+sphinx-autodoc-typehints==3.10.4
16 changes: 8 additions & 8 deletions requirements-dev.txt
@@ -1,9 +1,9 @@
-setuptools==80.9.0
-pytest==8.3.5
-respx==0.22.0
+setuptools==82.0.1
+pytest==9.0.3
+respx==0.23.1
 pytest-git==1.8.0
-pytest-env==1.1.5
-pytest-mock==3.14.0
-fiftyone==1.5.2
-datasets==3.6.0
-ultralytics==8.3.237
+pytest-env==1.6.0
+pytest-mock==3.15.1
+fiftyone==1.16.0
+datasets==4.8.5
⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Find datasets library usage to assess impact of major version upgrade

echo "=== Searching for datasets library imports and usage ==="
rg -n --type py -C3 'from datasets import|import datasets' --glob '!requirements*.txt'

echo -e "\n=== Checking for common datasets API patterns ==="
rg -n --type py -C2 'load_dataset|DatasetDict|Dataset\.' --glob '!requirements*.txt' | head -50

Repository: DagsHub/client

Length of output: 3637


🏁 Script executed:

#!/bin/bash
set -euo pipefail

echo "=== Show requirements-dev.txt around datasets line(s) ==="
if [ -f requirements-dev.txt ]; then
  rg -n "datasets==" requirements-dev.txt || true
  sed -n '1,40p' requirements-dev.txt
else
  echo "requirements-dev.txt not found at repo root"
fi

echo -e "\n=== Search repo for datasets pin(s) ==="
rg -n --hidden --glob '!**/node_modules/**' "datasets==[0-9]+\.[0-9]+\.[0-9]+" . || true

echo -e "\n=== Inspect datasets usage in query_result.py ==="
FILE="dagshub/data_engine/model/query_result.py"
if [ -f "$FILE" ]; then
  rg -n "import datasets as hf_ds|hf_ds\.|load_dataset|DatasetDict|Dataset\." "$FILE" || true
  # print a focused window around key usages
  sed -n '40,120p' "$FILE"
  sed -n '280,380p' "$FILE"
fi

Repository: DagsHub/client

Length of output: 7655


Test required for datasets major bump (datasets==4.8.5)

requirements-dev.txt now pins datasets==4.8.5, up from 3.6.0. The repo's datasets usage is primarily QueryResult.as_hf_dataset() (dagshub/data_engine/model/query_result.py), which lazily loads datasets and calls hf_ds.Dataset.from_pandas(df), so API or dtype compatibility changes would most likely surface there. Add or adjust test coverage for QueryResult.as_hf_dataset(), including its documented "download paths then cast_column" workflow.
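
As a starting point, a smoke test along these lines could catch the most obvious breakage. This is a minimal sketch, not the repo's actual test: it exercises `Dataset.from_pandas` and `cast_column` directly instead of going through `QueryResult.as_hf_dataset()`, and the helper name `hf_roundtrip`, the column names, and the sample paths are hypothetical.

```python
import importlib.util


def hf_roundtrip(records):
    """Convert records to a Hugging Face Dataset and cast one column.

    Hypothetical helper: mirrors the from_pandas + cast_column pattern
    described in the review, not the real as_hf_dataset() implementation.
    """
    # Imported lazily, similar in spirit to the lazy loading the review mentions.
    import pandas as pd
    import datasets as hf_ds

    df = pd.DataFrame(records)
    ds = hf_ds.Dataset.from_pandas(df)
    # Cast the integer column to float64 to check that dtype casting still works.
    return ds.cast_column("size", hf_ds.Value("float64"))


def test_from_pandas_roundtrip():
    # Return early (effectively a skip) if the optional dependencies are missing.
    if any(importlib.util.find_spec(m) is None for m in ("pandas", "datasets")):
        return
    records = [
        {"path": "images/a.png", "size": 10},  # made-up sample rows
        {"path": "images/b.png", "size": 20},
    ]
    ds = hf_roundtrip(records)
    assert ds.num_rows == 2
    assert sorted(ds.column_names) == ["path", "size"]
    assert ds.features["size"].dtype == "float64"
    assert ds[0]["path"] == "images/a.png"
```

A real test would presumably build a QueryResult fixture and call as_hf_dataset() on it, so that the lazy import and any path handling are covered too.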

+ultralytics==8.4.59
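
To see which of the requirements-dev.txt pins above cross a major version, a small stdlib-only check can help. This is a rough sketch under simple assumptions: every line is a plain `name==X.Y.Z` pin, and "major" means the first dot-separated component. The function names `parse_pins` and `major_bumps` are hypothetical.

```python
def parse_pins(text):
    """Parse 'name==version' lines into a dict, ignoring blanks and comments."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def major_bumps(old_text, new_text):
    """Return {name: (old, new)} for pins whose first version component changed."""
    old, new = parse_pins(old_text), parse_pins(new_text)
    bumps = {}
    for name, new_version in new.items():
        old_version = old.get(name)
        if old_version is None:
            continue  # newly added package, nothing to compare against
        if old_version.split(".")[0] != new_version.split(".")[0]:
            bumps[name] = (old_version, new_version)
    return bumps


# Pins copied from the requirements-dev.txt diff in this PR.
OLD = """setuptools==80.9.0
pytest==8.3.5
respx==0.22.0
pytest-git==1.8.0
pytest-env==1.1.5
pytest-mock==3.14.0
fiftyone==1.5.2
datasets==3.6.0
ultralytics==8.3.237"""

NEW = """setuptools==82.0.1
pytest==9.0.3
respx==0.23.1
pytest-git==1.8.0
pytest-env==1.6.0
pytest-mock==3.15.1
fiftyone==1.16.0
datasets==4.8.5
ultralytics==8.4.59"""

for name, (old_v, new_v) in sorted(major_bumps(OLD, NEW).items()):
    print(f"{name}: {old_v} -> {new_v}")
```

On these pins the check flags datasets, pytest, and setuptools. It does not flag 0.x packages such as respx, where a minor bump can also be breaking, so it is a first filter and not a replacement for reading changelogs.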