Alemusica/nico

🛰️ SLCCI Satellite Altimetry + Causal Discovery Platform

A comprehensive Python toolkit for satellite altimetry analysis and intelligent causal discovery with LLM-powered explanations.

🎯 Overview

This project combines oceanographic data analysis with AI-powered causal discovery:

Core Features

  • 🛰️ Satellite Altimetry - DOT, SLA, SSH analysis from Jason/CMEMS/AVISO
  • 🔬 Causal Discovery - PCMCI algorithm to find cause-effect relationships with time lags
  • 🤖 LLM Integration - Ollama (qwen3-coder) for automatic data interpretation
  • ⚡ Physics Validation - Validate patterns against physical laws (wind setup, inverse barometer)
  • 📊 Pattern Detection - tsfresh features, association rules, anomaly detection
  • 🤖 Multi-Agent Audit - Parallel quality assurance system (8 specialized agents)

New: Multi-Agent Audit System (Dec 2025)

8 Specialized Agents for comprehensive quality monitoring:

# Run full parallel audit (all 8 agents)
python audit_agents/run_all.py

# Or test individual agent
python audit_agents/data_flow_auditor.py

Agents:

  1. 🌊 DataFlowAuditor - CMEMS/ERA5/Cache (✅ 11/13 checks)
  2. InvestigationAuditor - Pipeline E2E
  3. 📚 KnowledgeAuditor - Services/Persistence
  4. APIAuditor - Security/Performance
  5. 🧠 CausalAuditor - PCMCI/Tigramite
  6. 🧪 QualityAuditor - Tests/Coverage
  7. 🎨 FrontendAuditor - React/TypeScript
  8. 🚀 OpsAuditor - Docker/Monitoring

Output: JSON + Markdown reports in audit_reports/

See audit_agents/README.md for details.
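
As a rough illustration of the parallel-audit idea (not the actual code in audit_agents/), the sketch below runs independent auditor callables concurrently and aggregates their check counts into a JSON summary. The names `AuditResult`, `run_audit`, and the two toy auditors are hypothetical, and the APIAuditor numbers are placeholders rather than real results.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from dataclasses import asdict, dataclass
from typing import Callable


@dataclass
class AuditResult:
    agent: str
    passed: int
    total: int


def data_flow_checks() -> AuditResult:
    # Toy stand-in: mirrors the "11/13 checks" status listed above.
    return AuditResult("DataFlowAuditor", passed=11, total=13)


def api_checks() -> AuditResult:
    # Toy stand-in with placeholder numbers, purely for illustration.
    return AuditResult("APIAuditor", passed=3, total=4)


def run_audit(auditors: list[Callable[[], AuditResult]]) -> dict:
    """Run every auditor in its own thread and aggregate the results."""
    with ThreadPoolExecutor(max_workers=len(auditors)) as pool:
        # pool.map preserves the input order of the auditors.
        results = list(pool.map(lambda fn: fn(), auditors))
    return {
        "agents": [asdict(r) for r in results],
        "passed": sum(r.passed for r in results),
        "total": sum(r.total for r in results),
    }


report = run_audit([data_flow_checks, api_checks])
print(json.dumps(report, indent=2))
```

Threads suit this sketch because audit checks are typically I/O-bound (HTTP calls, file reads); CPU-heavy checks would be better served by processes.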

Intelligent Causal Discovery Pipeline

Dataset → LLM Interprets → Find Time Dimension → PCMCI Discovery → Physics Validation → LLM Explains

Example: Load flood data → LLM identifies "sea_level_anomaly" as target → PCMCI finds "precipitation → river_level (lag=2 days)" → Physics confirms wind setup mechanism → LLM explains the Atlantic storm track connection.
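
To make the notion of a lagged link concrete, here is a minimal sketch that scans lags with plain Pearson correlation on synthetic data. It is a simplified stand-in, not PCMCI: PCMCI additionally conditions on other variables and on each series' own past to remove spurious links. The function names and the synthetic series are assumptions for illustration only.

```python
import random


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def best_lag(source: list[float], target: list[float], max_lag: int) -> tuple[int, float]:
    """Return (lag, r) with the largest |r| between source[t - lag] and target[t]."""
    scores = {}
    for lag in range(max_lag + 1):
        x = source[: len(source) - lag]  # source values at time t - lag
        y = target[lag:]                 # target values at time t
        scores[lag] = pearson(x, y)
    lag = max(scores, key=lambda k: abs(scores[k]))
    return lag, scores[lag]


rng = random.Random(0)
precipitation = [rng.gauss(0, 1) for _ in range(200)]
# Synthetic river level: follows precipitation two steps later, plus small noise.
river_level = [0.0, 0.0] + [p + 0.1 * rng.gauss(0, 1) for p in precipitation[:-2]]

lag, r = best_lag(precipitation, river_level, max_lag=7)
print(f"strongest link at lag={lag}, r={r:.2f}")
```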


🚀 Quick Start

1. Installation

# Clone repository
git clone <repo-url>
cd nico

# Use Python 3.12 (recommended - 3.14 has compatibility issues)
python3.12 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

# For causal discovery features
pip install tigramite networkx fastapi uvicorn ollama

2. Start Ollama (for LLM features)

# Install Ollama: https://ollama.ai
ollama pull qwen3-coder:30b  # or llama3.2 for faster inference
ollama serve

3. Run the API Server

# Start FastAPI backend
uvicorn api.main:app --reload --port 8000

4. Run Headless Test

python test_headless.py

Expected output:

✅ PASS: llm (Ollama connected, data interpreted)
✅ PASS: causal (Found precipitation→river_level, wind→surge)
✅ PASS: satellite (Loaded AVISO/CMEMS data)
✅ PASS: llm_explain (Physics validation: 0.95)

Project Structure

nico/
├── api/                          # 🔌 FastAPI Backend (NEW)
│   ├── main.py                   # REST endpoints
│   └── services/
│       ├── llm_service.py        # Ollama LLM integration
│       ├── causal_service.py     # PCMCI causal discovery
│       └── data_service.py       # Dataset loading/preprocessing
│
├── src/                          # 🧠 Core Analysis Modules
│   ├── analysis/                 # DOT, slope, statistics
│   ├── core/                     # Config, coordinates, resolvers
│   ├── data/                     # Loaders, filters, geoid
│   ├── visualization/            # Plotly/Matplotlib charts
│   ├── pattern_engine/           # Pattern detection (tsfresh, mlxtend)
│   │   ├── core/                 # Pattern dataclasses
│   │   ├── detection/            # ML detectors, association rules
│   │   ├── physics/              # Domain rules (flood, manufacturing)
│   │   └── output/               # Gray zone detector
│   └── surge_shazam/             # Physics-informed ML
│       ├── physics/              # Shallow water equations (PyTorch)
│       └── causal/               # PCMCI integration (stubs)
│
├── app/                          # 📱 Streamlit Dashboard
│   └── components/               # UI tabs (analysis, spatial, profiles)
│
├── data/                         # 📂 Satellite Data
│   ├── aviso/                    # AVISO altimetry
│   ├── cmems/                    # CMEMS L3/L4
│   ├── slcci/                    # SLCCI Jason-1/2
│   └── geoid/                    # TUM geoid model
│
├── gates/                        # 🌊 Strait Shapefiles
│
├── test_headless.py              # 🧪 Integration tests
└── gradio_app.py                 # Alternative Gradio UI

🔬 API Endpoints

Core Endpoints

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/health | GET | Check API + Ollama status |
| /api/v1/data/files | GET | List available data files |
| /api/v1/data/upload | POST | Upload CSV/NetCDF |
| /api/v1/data/load/{path} | GET | Load file from data/ |
| /api/v1/interpret | POST | LLM interprets dataset structure |
| /api/v1/discover | POST | Run PCMCI causal discovery |
| /api/v1/discover/correlations | POST | Cross-correlation analysis |
| /api/v1/chat | POST | Chat with LLM about data |
| /api/v1/hypotheses | POST | Generate causal hypotheses |
| /api/v1/ws/chat | WebSocket | Stream LLM responses |

Knowledge Base

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/knowledge/stats | GET | Knowledge base statistics |
| /api/v1/knowledge/papers | GET/POST | Scientific papers CRUD |
| /api/v1/knowledge/events | GET/POST | Historical events |
| /api/v1/knowledge/patterns | GET/POST | Causal patterns |

Investigation (WebSocket Streaming)

| Endpoint | Method | Description |
|---|---|---|
| /api/v1/investigate/ws | WebSocket | Real-time investigation streaming |
| /api/v1/investigate/status | GET | Investigation components status |

Note: As of v1.8.0, all endpoints require the /api/v1 prefix.

Example: Causal Discovery

curl -X POST http://localhost:8000/api/v1/discover \
  -H "Content-Type: application/json" \
  -d '{
    "dataset_name": "flood_data",
    "max_lag": 7,
    "alpha_level": 0.05,
    "domain": "flood",
    "use_llm": true
  }'

Response:

{
  "variables": ["precipitation", "wind_speed", "pressure", "river_level", "flood_index"],
  "links": [
    {
      "source": "precipitation",
      "target": "river_level",
      "lag": 2,
      "strength": 0.95,
      "p_value": 0.0001,
      "explanation": "Heavy precipitation causes river levels to rise with a 2-day lag...",
      "physics_valid": true,
      "physics_score": 0.92
    }
  ]
}
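
The same request can be issued from Python. The sketch below only builds the request with the standard library and shows one way to filter the returned links. `build_discover_request` and `significant_links` are hypothetical helper names; the payload and response fields are copied from the example above.

```python
import json
import urllib.request


def build_discover_request(base_url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /api/v1/discover endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/discover",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def significant_links(response: dict, alpha: float = 0.05) -> list[dict]:
    """Keep links with p_value below alpha that also passed physics validation."""
    return [
        link
        for link in response.get("links", [])
        if link["p_value"] < alpha and link.get("physics_valid", False)
    ]


payload = {
    "dataset_name": "flood_data",
    "max_lag": 7,
    "alpha_level": 0.05,
    "domain": "flood",
    "use_llm": True,
}
request = build_discover_request("http://localhost:8000", payload)

# To send it against a running server:
# with urllib.request.urlopen(request) as resp:
#     result = json.load(resp)

# Offline demonstration using the response shape shown above.
sample_response = {
    "links": [
        {"source": "precipitation", "target": "river_level", "lag": 2,
         "strength": 0.95, "p_value": 0.0001, "physics_valid": True},
    ]
}
for link in significant_links(sample_response):
    print(f'{link["source"]} -> {link["target"]} (lag={link["lag"]})')
```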

🧠 LLM Service Features

The OllamaLLMService provides:

1. Data Interpretation

result = await llm.interpret_dataset(columns_info, filename)
# Returns: domain="flood", temporal_column="timestamp", suggested_targets=["sea_level"]

2. Causal Explanation

explanation = await llm.explain_causal_relationship(
    source="wind_speed", target="storm_surge", lag=1, strength=0.52
)
# Returns: "Wind speed causes storm surge through the wind setup mechanism (τ ∝ U²)..."

3. Physics Validation

validation = await llm.validate_pattern_physics(
    pattern="wind → surge", domain="flood", confidence=0.99
)
# Returns: {"is_valid": True, "physics_score": 0.95, "supporting_evidence": ["wind stress formula"]}

4. Hypothesis Generation

hypotheses = await llm.generate_hypotheses(variables, domain="flood")
# Returns: [{"source": "NAO_index", "target": "storm_surge", "expected_lag": "3-5 days"}]
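
The snippets above use `await`, so they must run inside a coroutine. The sketch below shows that calling pattern with a stub in place of the real service. `StubLLMService` and its canned return value are hypothetical; the real OllamaLLMService constructor and return types may differ.

```python
import asyncio


class StubLLMService:
    """Hypothetical stand-in that mimics one method of the LLM service."""

    async def explain_causal_relationship(
        self, source: str, target: str, lag: int, strength: float
    ) -> str:
        await asyncio.sleep(0)  # yield control, as a real network call would
        return f"{source} -> {target} (lag={lag}, strength={strength:.2f})"


async def main() -> str:
    llm = StubLLMService()
    return await llm.explain_causal_relationship(
        source="wind_speed", target="storm_surge", lag=1, strength=0.52
    )


explanation = asyncio.run(main())
print(explanation)
```

Inside a FastAPI route handler declared with `async def`, the `await` calls can be used directly without `asyncio.run`.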

⚡ Physics Rules

Built-in physics validation for multiple domains:

Flood/Storm Surge

| Rule | Formula | Typical Lag |
|---|---|---|
| Wind Setup | η ∝ U²·L/(g·h) | 6-24 hours |
| Inverse Barometer | Δη ≈ -1 cm/hPa | 12-48 hours |
| Pressure Effect | Low pressure → surge | 24-72 hours |
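
The two formulas above can be evaluated directly. The sketch below is illustrative and not taken from the project's physics module: it uses the -1 cm/hPa rule as stated, and turns the wind setup proportionality into the standard steady-state form η = τ·L/(ρ_w·g·h) with wind stress τ = ρ_a·C_d·U². The densities, drag coefficient, reference pressure, and function names are assumed values chosen for the example.

```python
G = 9.81            # gravitational acceleration, m/s^2
RHO_AIR = 1.225     # air density, kg/m^3 (assumed)
RHO_WATER = 1025.0  # seawater density, kg/m^3 (assumed)


def inverse_barometer_cm(pressure_hpa: float, reference_hpa: float = 1013.25) -> float:
    """Sea-level change in cm from the -1 cm/hPa rule (positive = rise)."""
    return -1.0 * (pressure_hpa - reference_hpa)


def wind_setup_m(
    wind_speed: float, fetch_m: float, depth_m: float, drag_coefficient: float = 1.3e-3
) -> float:
    """Steady-state wind setup in metres: eta = tau * L / (rho_w * g * h)."""
    tau = RHO_AIR * drag_coefficient * wind_speed ** 2  # wind stress, N/m^2
    return tau * fetch_m / (RHO_WATER * G * depth_m)


# A low 30 hPa below the reference pressure raises sea level by about 30 cm.
print(inverse_barometer_cm(983.25))

# 20 m/s wind over a 100 km fetch in 20 m of water.
print(round(wind_setup_m(20.0, 100_000.0, 20.0), 3))
```

Because the setup scales with U², doubling the wind speed quadruples the surge contribution, which is why shallow basins with long fetch are so sensitive to storms.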

Manufacturing

| Rule | Effect |
|---|---|
| Temperature | Arrhenius: rate ×2 per 10°C |
| Viscosity | Decreases with temperature |
| Speed | Optimal range for quality |
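
The "×2 per 10°C" entry is the common Q10 rule of thumb, which approximates Arrhenius behaviour over modest temperature ranges rather than reproducing it exactly. A minimal sketch of that rule follows; the function name `rate_factor` is hypothetical.

```python
def rate_factor(delta_t_celsius: float, q10: float = 2.0) -> float:
    """Multiplicative change in reaction rate for a temperature change (Q10 rule)."""
    return q10 ** (delta_t_celsius / 10.0)


print(rate_factor(10.0))   # one 10 °C step up
print(rate_factor(20.0))   # two steps up
print(rate_factor(-10.0))  # one step down
```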

🛠️ Development

Run Tests

# Headless integration test
python test_headless.py

# Unit tests
pytest tests/

# Multi-agent audit (NEW)
python audit_agents/run_all.py

Code Quality

black src/ api/
ruff check src/ api/

# Check audit status
python audit_agents/data_flow_auditor.py

Known Issues

⚠️ Python 3.14 Compatibility: NetworkX and some libraries have issues with Python 3.14. Use Python 3.12 for now.

⚠️ Cache Stats: DataManager returns 503 - investigation required

⚠️ ERA5 Humidity Variables: Verification check pending


🗺️ Roadmap

✅ Completed (v1.8)

  • FastAPI backend with REST endpoints + /api/v1 versioning
  • Ollama LLM integration (qwen3-coder, llama3.2)
  • PCMCI causal discovery with correlation fallback
  • Physics validation rules (flood, manufacturing)
  • Data interpretation and explanation generation
  • NetCDF/CSV loading with auto-detection
  • Headless test pipeline
  • Pattern engine (tsfresh, mlxtend, pyod)
  • Modular routers (7 routers, 75% code reduction)
  • Production middleware (logging, security, rate limiting)
  • Multi-Agent Audit System (8 agents, parallel execution)
  • MinimalKnowledgeService (in-memory, production-ready)
  • Investigation pipeline with WebSocket streaming

🚧 In Progress (v1.9)

  • Complete 7 remaining audit agents
  • React frontend with PHI spacing layout (partial)
  • Interactive causal graph visualization (D3.js)
  • Real-time chat with WebSocket streaming (functional)
  • Knowledge base persistence (SurrealDB/Neo4j)
  • Test coverage 40% → 80%

📋 Planned (v2.0)

  • Neo4j for causal graph persistence
  • RAG with scientific papers (ChromaDB)
  • Multi-dataset correlation analysis
  • Export to standard causal formats (TETRAD, DOT)
  • Teleconnection patterns (NAO, ENSO)
  • Automated report generation
  • Full audit coverage (156+ checks)
  • Docker Compose deployment

📚 Architecture

┌─────────────────────────────────────────────────────────────────┐
│                      React Frontend (TODO)                      │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                │
│  │ Causal Graph│ │ Chat (LLM)  │ │ Time Series │                │
│  │ (D3.js)     │ │ Interface   │ │ Explorer    │                │
│  └─────────────┘ └─────────────┘ └─────────────┘                │
└─────────────────────────────────────────────────────────────────┘
                           │ REST/WebSocket
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│                     FastAPI Backend (/api)                      │
│  ┌─────────────┐ ┌─────────────┐ ┌─────────────┐                │
│  │ LLM Service │ │ Causal      │ │ Data        │                │
│  │ (Ollama)    │ │ Discovery   │ │ Service     │                │
│  └─────────────┘ └─────────────┘ └─────────────┘                │
└─────────────────────────────────────────────────────────────────┘
                           │
                           ▼
┌─────────────────────────────────────────────────────────────────┐
│              Core Analysis + Pattern Engine                     │
│  (DOT analysis, tsfresh, mlxtend, physics validation)           │
└─────────────────────────────────────────────────────────────────┘

📄 License

MIT License


🤝 Contributing

Key areas for contribution:

  1. React Frontend - Build the PHI-spaced dashboard with D3.js graphs
  2. Physics Rules - Add domain-specific validation rules
  3. LLM Prompts - Improve scientific explanation quality
  4. Test Data - Contribute synthetic/real datasets

About

Surge Analysis System - Multi-resolution causal discovery platform for extreme weather events
