98 changes: 98 additions & 0 deletions skills/analyze-data/SKILL.md
@@ -0,0 +1,98 @@
---
name: Asta Data Analysis
description: Analyze data using DataVoyager AI agent. Use when the user asks to "analyze data", "explore dataset", "visualize data", "run data analysis", or needs help with data science tasks.
allowed-tools:
- Bash(asta analyze-data *)
- Bash(mkdir -p .asta/analyze-data/*)
- Read(.asta/analyze-data/*)
- TaskOutput
---

# Analyze Data with DataVoyager

Analyze datasets using the DataVoyager AI agent. This skill provides an interactive AI-powered data analysis environment that can explore datasets, create visualizations, and perform statistical analysis.

## Installation

If the `asta` command is not available, install it with `uv tool install git+ssh://git@github.com/allenai/asta-plugins.git`.

**Prerequisites:** Python 3.11+ and [uv package manager](https://docs.astral.sh/uv/)

Verify the installation with `asta analyze-data --help`.
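
As a rough illustration of the availability check described above, the following sketch tests whether an `asta` executable is on `PATH` before anything else runs. The helper name `asta_available` is hypothetical and not part of the asta codebase; it relies only on the standard library's `shutil.which`.

```python
import shutil


def asta_available(command: str = "asta") -> bool:
    """Return True if `command` resolves to an executable on PATH.

    Hypothetical helper for illustration; not part of asta itself.
    """
    return shutil.which(command) is not None


if __name__ == "__main__":
    if asta_available():
        print("asta found; verify with: asta analyze-data --help")
    else:
        print("asta not found; install it as described in this section")
```

A check like this only confirms that the executable exists; running `asta analyze-data --help` is still the way to confirm that the subcommand itself is installed.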

## Workflow

The user will typically:
1. Provide a dataset file path or ask to analyze data
2. Describe the analysis they want to perform
3. Request visualizations or statistical summaries

### Default Output Locations

**IMPORTANT**: Always specify output locations to keep analyses organized in `.asta/analyze-data/`:

- **OUTPUTS_DIR**: `.asta/analyze-data/<YYYY-MM-DD-slug>/` where:
  - `YYYY-MM-DD` is the current date
  - `slug` is a short descriptive name derived from the analysis task (e.g., "sales-analysis", "customer-segmentation")

**Example directory structure:**
```
.asta/analyze-data/
├── 2024-01-15-sales-analysis/
│   ├── plots/
│   └── [analysis outputs]
└── 2024-01-16-customer-segmentation/
    ├── plots/
    └── [analysis outputs]
```
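
The naming convention above can be sketched in Python as follows. This is a minimal sketch, not the skill's actual implementation: the function names `slugify` and `outputs_dir`, the four-word limit, and the `"analysis"` fallback are all assumptions made for illustration.

```python
import re
from datetime import date
from pathlib import Path
from typing import Optional


def slugify(task: str, max_words: int = 4) -> str:
    """Lowercase the task description and join its first words with hyphens.

    The word limit and the fallback value are illustrative assumptions.
    """
    words = re.findall(r"[a-z0-9]+", task.lower())
    return "-".join(words[:max_words]) or "analysis"


def outputs_dir(
    task: str,
    today: Optional[date] = None,
    root: str = ".asta/analyze-data",
) -> Path:
    """Build the <root>/<YYYY-MM-DD>-<slug> path for an analysis task."""
    day = today or date.today()
    return Path(root) / f"{day.isoformat()}-{slugify(task)}"


if __name__ == "__main__":
    target = outputs_dir("Sales analysis")
    target.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    print(target)
```

Passing `today` explicitly keeps the function deterministic, which makes it easier to test than a version that always reads the system clock.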

### Running DataVoyager

DataVoyager runs in interactive mode by default. The basic command is:

```bash
# Run DataVoyager with default Docker backend (recommended)
asta analyze-data
```

**Backend Options:**
- `--backend docker` (default): Local Docker container for isolated execution
- `--backend modal`: Remote serverless execution

**Configuration:**
- `--config path/to/config.yaml`: Use custom configuration
- `--log-level INFO`: Set logging level (DEBUG, INFO, WARNING, ERROR)
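
To show how the options listed above combine, here is a small sketch that assembles the argument list for `asta analyze-data`. The function `build_command` is hypothetical; only the flags `--backend`, `--config`, and `--log-level` and their listed values come from this document, and the validation is an assumption about what a caller might want to check up front.

```python
from typing import List, Optional

# Values taken from the option lists in this document.
BACKENDS = {"docker", "modal"}
LOG_LEVELS = {"DEBUG", "INFO", "WARNING", "ERROR"}


def build_command(
    backend: str = "docker",
    config: Optional[str] = None,
    log_level: Optional[str] = None,
) -> List[str]:
    """Return an argv list for `asta analyze-data` (hypothetical helper)."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    if log_level is not None and log_level not in LOG_LEVELS:
        raise ValueError(f"unknown log level: {log_level!r}")

    cmd = ["asta", "analyze-data", "--backend", backend]
    if config is not None:
        cmd += ["--config", config]
    if log_level is not None:
        cmd += ["--log-level", log_level]
    return cmd


if __name__ == "__main__":
    print(" ".join(build_command("modal", log_level="DEBUG")))
```

Returning a list rather than a single string means the result can be handed to `subprocess.run` without shell quoting concerns.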

### Example Usage

**Basic interactive analysis:**
```bash
# Start DataVoyager in interactive mode
asta analyze-data

# With specific backend
asta analyze-data --backend docker

# With custom config
asta analyze-data --config .asta/analyze-data/config.yaml
```

**Creating organized output directories:**
```bash
# Create output directory with date and slug
OUTPUTS_DIR=".asta/analyze-data/$(date +%Y-%m-%d)-sales-analysis"
mkdir -p "$OUTPUTS_DIR"

# Run DataVoyager (outputs will be saved by the agent)
cd "$OUTPUTS_DIR"
asta analyze-data
```

### Notes

- **Output Directory**: Create the `.asta/analyze-data/<YYYY-MM-DD-slug>/` directory before running the analysis.
- **Task Slug**: Create a short descriptive slug from the analysis task (e.g., "sales-analysis", "data-exploration"). Keep it lowercase with hyphens.
- **Docker Backend**: Recommended for safe, isolated code execution. Requires Docker to be installed and running.
- **Modal Backend**: Serverless execution option for remote computation.
- **Interactive Mode**: The agent will prompt you for dataset paths and analysis instructions.
- Always inform the user where outputs were saved after the analysis completes.
5 changes: 5 additions & 0 deletions src/asta/analyze_data/__init__.py
@@ -0,0 +1,5 @@
"""Pass-through to DataVoyager (dv) CLI for data analysis"""

from asta.analyze_data.passthrough import analyze_data

__all__ = ["analyze_data"]
18 changes: 18 additions & 0 deletions src/asta/analyze_data/passthrough.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,18 @@
"""Pass-through command for DataVoyager (dv) CLI"""

from asta.utils.config import get_config
from asta.utils.passthrough import create_passthrough_command

# Load configuration from asta.conf
config = get_config()["passthrough"]["analyze-data"]

# Create the analyze-data passthrough command
analyze_data = create_passthrough_command(
    tool_name=config["tool_name"],
    install_type=config["install_type"],
    install_source=config["install_source"],
    minimum_version=config["minimum_version"],
    command_name=config["command_name"],
    friendly_name=config["friendly_name"],
    docstring=config["docstring"],
)
2 changes: 2 additions & 0 deletions src/asta/cli.py
@@ -3,6 +3,7 @@
import click

from asta import __version__
from asta.analyze_data import analyze_data
from asta.commands.auth import auth
from asta.documents import documents
from asta.experiment import experiment
@@ -39,6 +40,7 @@ def papers():
cli.add_command(auth)

# Register passthrough commands
cli.add_command(analyze_data)
cli.add_command(documents)
cli.add_command(experiment)

10 changes: 10 additions & 0 deletions src/asta/utils/asta.conf
@@ -66,4 +66,14 @@ passthrough {
        friendly_name = "panda"
        docstring = "Run computational experiments"
    }

    analyze-data {
        tool_name = "dv"
        install_type = "local"
        install_source = "/Users/rodneyk/workspace/dv-core-asta-integration"
        minimum_version = "0.1.0"
        command_name = "analyze-data"
        friendly_name = "DataVoyager"
        docstring = "Analyze data using DataVoyager AI agent"
    }
}