Commit 0dd349a

SQL retriever enhancements for Dell AIDp (#357)
* Initial set of changes
  Signed-off-by: Vineeth Kalluru <vikalluru@nvidia.com>
  # Conflicts:
  # industries/asset_lifecycle_management_agent/INSTALLATION.md
  # industries/asset_lifecycle_management_agent/config_examples.yaml
  # industries/asset_lifecycle_management_agent/configs/README.md
  # industries/asset_lifecycle_management_agent/example_eval_output/multimodal_eval_output.json
  # industries/asset_lifecycle_management_agent/example_eval_output/workflow_output.json
  # industries/asset_lifecycle_management_agent/frontend/README.md
  # industries/asset_lifecycle_management_agent/frontend/app.js
  # industries/asset_lifecycle_management_agent/frontend/package-lock.json
  # industries/asset_lifecycle_management_agent/frontend/package.json
  # industries/asset_lifecycle_management_agent/frontend/server.js
  # industries/asset_lifecycle_management_agent/frontend/styles.css
  # industries/asset_lifecycle_management_agent/prompts.md
* Adding version file
  Signed-off-by: Vineeth Kalluru <vikalluru@nvidia.com>
* Trying to rebase with main and commit the files again
  Signed-off-by: Vineeth Kalluru <vikalluru@nvidia.com>
  # Conflicts:
  # industries/asset_lifecycle_management_agent/README.md
  # industries/asset_lifecycle_management_agent/configs/config-reasoning.yml
* Remove frontend, example outputs, and config examples from PR
* Fix unresolved merge conflicts in config-reasoning.yml
  Signed-off-by: Vineeth Kalluru <vikalluru@nvidia.com>
* Fix code generation sandbox errors and add workspace utils template
  - Fix database path in code_generation_assistant.py to use '/workspace/database/nasa_turbo.db'
  - Make utils import conditional and only for RUL transformations
  - Fix sys.path.append to use '/workspace' instead of '.'
  - Add utils_template folder with pre-built RUL transformation utilities
  - Update README with clear setup instructions for workspace utilities
  - Addresses customer issues: ModuleNotFoundError for utils and mysql modules
  Signed-off-by: Vineeth Kalluru <vikalluru@nvidia.com>
* Tested SQL retriever
  Signed-off-by: Vineeth Kalluru <vikalluru@nvidia.com>

---------

Signed-off-by: Vineeth Kalluru <vikalluru@nvidia.com>
1 parent 9bbcadb commit 0dd349a

18 files changed: +2332 -425 lines changed

industries/asset_lifecycle_management_agent/.cursor.rules.md

Lines changed: 114 additions & 0 deletions
@@ -542,6 +542,120 @@ configs/
 .env # Environment variables
 ```
 
+## NAT Version Compatibility
+
+### NAT 1.2.1 vs 1.3.0
+
+**Current Version**: NAT 1.2.1 (with pydantic 2.10.x)
+
+**Key Compatibility Rules**:
+
+1. **Optional String Fields**:
+```python
+# ❌ WRONG - Will fail validation
+elasticsearch_url: str = Field(default=None)
+
+# ✅ CORRECT - Use Optional for nullable strings
+from typing import Optional
+elasticsearch_url: Optional[str] = Field(default=None)
+```
+
+2. **Reference Field Types (NAT 1.2.1)**:
+```python
+# NAT 1.2.1 uses plain strings for references
+llm_name: str = Field(description="LLM reference")
+embedding_name: str = Field(description="Embedder reference")
+```
+
+3. **Reference Field Types (NAT 1.3.0 - Future)**:
+```python
+# NAT 1.3.0 requires typed references
+from nat.data_models.component_ref import LLMRef, EmbedderRef, FunctionRef
+
+llm_name: LLMRef = Field(description="LLM reference")
+embedding_name: EmbedderRef = Field(description="Embedder reference")
+code_execution_tool: FunctionRef = Field(description="Function reference")
+```
+
+4. **YAML Configuration Quoting**:
+```yaml
+# Always quote string references in YAML configs for pydantic 2.10+
+functions:
+  sql_retriever:
+    llm_name: "sql_llm" # Quoted
+    embedding_name: "vanna_embedder" # Quoted
+    vector_store_type: "chromadb" # Quoted
+    db_type: "sqlite" # Quoted
+
+  data_analysis_assistant:
+    tool_names: [
+      "sql_retriever", # All tool names quoted
+      "predict_rul",
+      "plot_distribution"
+    ]
+```
+
+### Pydantic 2.10+ Best Practices
+
+**Type Annotations**:
+```python
+from typing import Optional
+
+class ToolConfig(FunctionBaseConfig):
+    # Required fields
+    required_param: str = Field(description="Must be provided")
+
+    # Optional fields with None default
+    optional_param: Optional[str] = Field(default=None, description="Can be None")
+
+    # Optional fields with non-None default
+    param_with_default: str = Field(default="default_value", description="Has default")
+
+    # Numeric fields (can use None without Optional if you want)
+    max_retries: int = Field(default=3, description="Number of retries")
+```
+
+**Common Validation Errors**:
+```
+ValidationError: Input should be a valid string [input_value=None, input_type=NoneType]
+→ Solution: Use Optional[str] instead of str for fields with default=None
+
+ValidationError: functions: Input should be a valid string (4 times)
+→ Solution: Quote all string values in YAML config, especially references
+```
+
+### Upgrading to NAT 1.3.0 (Future)
+
+When upgrading, you'll need to:
+
+1. Update pyproject.toml:
+```toml
+dependencies = [
+"nvidia-nat[profiling,langchain,telemetry]==1.3.0",
+"pydantic>=2.11.0,<3.0.0",
+]
+```
+
+2. Update all tool configs:
+```python
+# Before (NAT 1.2.1)
+llm_name: str = Field(...)
+
+# After (NAT 1.3.0)
+from nat.data_models.component_ref import LLMRef
+llm_name: LLMRef = Field(...)
+```
+
+3. Update evaluator configs:
+```python
+# multimodal_llm_judge_evaluator_register.py
+# llm_judge_evaluator_register.py
+from nat.data_models.component_ref import LLMRef
+llm_name: LLMRef = Field(...)
+```
+
+4. Keep Optional[str] for nullable fields (both versions need this)
+
 ## Debugging and Troubleshooting
 
 ### Common Issues and Solutions
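
Review note on the "Optional String Fields" rule in the hunk above: the following is a minimal sketch of how that rule behaves, assuming pydantic 2.x is installed. It uses plain `BaseModel` as a stand-in for `FunctionBaseConfig`, and the names `StrictConfig`, `NullableConfig`, and `accepts` are hypothetical, chosen only for illustration.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


# Stand-in for a tool config; the real configs subclass FunctionBaseConfig.
class StrictConfig(BaseModel):
    # Declared as str but defaulted to None (the pattern flagged as wrong).
    elasticsearch_url: str = Field(default=None)
    max_retries: int = Field(default=3)


class NullableConfig(BaseModel):
    # Optional[str] lets an explicit None through validation.
    elasticsearch_url: Optional[str] = Field(default=None)
    max_retries: int = Field(default=3)


def accepts(model_cls, **values) -> bool:
    """Return True if model_cls validates the given field values."""
    try:
        model_cls(**values)
        return True
    except ValidationError:
        return False


print(accepts(StrictConfig))                            # field omitted, default used
print(accepts(StrictConfig, elasticsearch_url=None))    # explicit None for a str field
print(accepts(NullableConfig, elasticsearch_url=None))  # explicit None for Optional[str]
print(accepts(StrictConfig, max_retries=None))          # explicit None for an int field
```

Under pydantic 2.x the first and third checks should pass and the second and fourth should fail. Defaults are not validated unless `validate_default` is enabled, so a `str`-typed field with `default=None` only breaks when a config supplies `None` explicitly (for example a YAML `null`). The same holds for `int` fields, which also reject an explicit `None` unless declared `Optional[int]`.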
Lines changed: 109 additions & 32 deletions
@@ -1,46 +1,123 @@
-# macOS system files
+# Misc
+config_examples.yml
+config_examples.yaml
+env.sh
+frontend/
+prompts.md
+
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+*.so
+.Python
+*.egg
+*.egg-info/
+dist/
+build/
+*.whl
+pip-wheel-metadata/
 .DS_Store
-.DS_Store?
-._*
-.Spotlight-V100
-.Trashes
-ehthumbs.db
-Thumbs.db
 
-# Database and vector store files
-database/
-*.db
-*.sqlite3
+# Virtual environments
+.venv/
+venv/
+ENV/
+env/
+
+# IDEs and Editors
+.vscode/
+.idea/
+*.swp
+*.swo
+*~
+.DS_Store
+
+# Testing
+.pytest_cache/
+.coverage
+htmlcov/
+.tox/
+.hypothesis/
+
+# Jupyter Notebook
+.ipynb_checkpoints/
+*.ipynb_checkpoints/
 
-# Output and generated files
+# Output and Data Directories
 output_data/
-moment/
-readmes/
-*.html
-*.csv
-*.npy
+eval_output/
+example_eval_output/
+output/
+results/
+logs/
 
-# Python package metadata
-src/**/*.egg-info/
-*.egg-info/
+# Database files
+*.db
+*.sqlite
+*.sqlite3
+database/*.db
+database/*.sqlite
 
-# Environment files (if they contain secrets)
-env.sh
+# Vector store data (ChromaDB)
+database/
+chroma_db/
+vector_store/
+vanna_vector_store/
 
-# Model files (if large/binary)
+# Model files (large binary files)
 models/*.pkl
-models/*.joblib
-models/*.model
+models/*.h5
+models/*.pt
+models/*.pth
+models/*.ckpt
+*.pkl
+*.h5
+*.pt
+*.pth
+moment/
 
-# Logs
-*.log
-logs/
+# Data files (CSV, JSON, etc. - be selective)
+*.csv
+*.json
+!training_data.json
+!vanna_training_data.yaml
+!config*.json
+!config*.yaml
+!config*.yml
+!pyproject.toml
+!package.json
+
+# Frontend build artifacts
+frontend/node_modules/
+frontend/dist/
+frontend/build/
+frontend/.next/
+frontend/out/
+
+# Environment and secrets
+.env
+.env.local
+.env.*.local
+*.secret
+secrets/
+credentials/
 
 # Temporary files
 *.tmp
 *.temp
-.pytest_cache/
-__pycache__/
+*.log
+*.cache
+
+# OS specific
+Thumbs.db
+Desktop.ini
+
+# Experiment tracking
+mlruns/
+wandb/
 
-# dot env
-mydot.env
+# Documentation builds
+docs/_build/
+docs/.doctrees/
+site/
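
Review note on the "Data files" block in the hunk above: the sketch below models how the ignore and negation (`!`) patterns in that block interact. It is a deliberately simplified approximation, not Git's implementation. It assumes slash-free patterns are matched against the file's basename and that the last matching pattern decides. The helper `is_ignored` and the sample file names are hypothetical.

```python
from fnmatch import fnmatchcase

# Slash-free patterns from the "Data files" block, in file order.
PATTERNS = [
    "*.csv",
    "*.json",
    "!training_data.json",
    "!vanna_training_data.yaml",
    "!config*.json",
    "!config*.yaml",
    "!config*.yml",
    "!pyproject.toml",
    "!package.json",
]


def is_ignored(filename: str, patterns=PATTERNS) -> bool:
    """Simplified model: the last pattern matching the basename decides."""
    ignored = False
    for pattern in patterns:
        negated = pattern.startswith("!")
        glob = pattern[1:] if negated else pattern
        if fnmatchcase(filename, glob):
            # A later match overrides any earlier one.
            ignored = not negated
    return ignored


for name in ["results.json", "training_data.json", "config_eval.json", "sensor.csv"]:
    print(name, is_ignored(name))

# An earlier exact-name entry followed by the later "!config*.yaml" negation.
print(is_ignored("config_examples.yaml", ["config_examples.yaml"] + PATTERNS))
```

Two caveats follow from this model. First, the later `!config*.yaml` and `!config*.yml` negations appear to re-include `config_examples.yaml` and `config_examples.yml`, which the `# Misc` block lists as ignored, so it may be worth confirming the intended result with `git check-ignore -v`. Second, Git cannot re-include a file whose parent directory is excluded, so negations have no effect on files under directories such as `database/` or `frontend/`.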
