SkyDeck now includes comprehensive automated testing using Playwright, allowing you (Claude) to test the dashboard UI programmatically.
- Playwright: Browser automation framework
- pytest-playwright: Pytest integration for Playwright
- Chromium: Headless browser for testing
- Test fixtures: Pre-configured database and server setup
# Run all tests
uv run pytest
# Run only UI tests
uv run pytest tests/test_dashboard_ui.py -v
# Run specific test
uv run pytest tests/test_dashboard_ui.py::test_dashboard_loads -v
# Run with visible browser (for debugging)
# Edit tests/conftest.py: headless=False
uv run pytest tests/test_dashboard_ui.py -v

- Database Fixtures
  - temp_db: Empty temporary database
  - db_with_experiments: Pre-populated with 3 test experiments
- Playwright Fixtures
  - browser: Chromium browser instance
  - page: New browser page for each test
  - dashboard: Running server with test database
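As a rough illustration of what the database fixtures provide, the sketch below creates a throwaway database, optionally seeds it, and deletes it afterwards. It assumes the database is SQLite; the `experiments` schema, the `temporary_database` name, and the seeded row values are hypothetical and are not taken from SkyDeck's code.

```python
import contextlib
import sqlite3
import tempfile
from pathlib import Path


@contextlib.contextmanager
def temporary_database(num_experiments: int = 0):
    """Yield the path to a throwaway SQLite database, then delete it."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        db_path = Path(tmp_dir) / "skydeck_test.db"
        conn = sqlite3.connect(db_path)
        try:
            # Hypothetical schema, for illustration only.
            conn.execute(
                "CREATE TABLE experiments (id TEXT PRIMARY KEY, name TEXT)"
            )
            conn.executemany(
                "INSERT INTO experiments VALUES (?, ?)",
                [
                    (f"test_exp_{i}", f"Test experiment {i}")
                    for i in range(1, num_experiments + 1)
                ],
            )
            conn.commit()
        finally:
            conn.close()
        # The directory (and the database in it) is removed after the yield.
        yield db_path
```

In a conftest.py, a pytest fixture could wrap this helper and yield the path, so that something like temp_db corresponds to `temporary_database()` and db_with_experiments to `temporary_database(num_experiments=3)`.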

- test_models.py: Unit tests for data models
- test_database.py: Database operation tests
- test_dashboard_ui.py: End-to-end UI tests

- test_dashboard_loads: Verifies dashboard loads and displays header
- test_experiments_table_renders: Checks experiment table with data
- test_expand_experiment_row: Tests expandable row functionality
- test_flag_columns_displayed: Verifies dynamic flag columns
- test_create_experiment_modal: Tests modal open/close
- test_action_buttons_present: Verifies Start/Edit/Delete buttons
- test_health_status_displayed: Checks health status display
- test_clusters_section_exists: Verifies clusters section

- Server Lifecycle
    async with DashboardServer(db_path=test_db.path) as server:
        # Server runs on http://127.0.0.1:8765
        # Automatically starts/stops
- Database Setup
    # Each test gets a clean database
    # Pre-populated with test experiments
    # Automatically cleaned up after test
- Browser Interaction
    await page.goto(dashboard.base_url)
    await page.click("button#my-button")
    text = await page.text_content("#result")
    assert "expected" in text
- Screenshots on Failure
  - Automatically saved to tests/screenshots/
  - Named after the failing test
  - Useful for debugging

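The "automatically starts/stops" behaviour of the server lifecycle is the standard async context manager pattern. The sketch below shows that pattern with a toy server built on the standard library. `SketchServer` is a hypothetical stand-in and says nothing about how `DashboardServer` is actually implemented; it also binds to an OS-chosen port rather than 8765.

```python
import asyncio


class SketchServer:
    """Toy stand-in for a dashboard server: starts on enter, stops on exit."""

    def __init__(self, host: str = "127.0.0.1", port: int = 0):
        self.host = host
        self.port = port  # 0 asks the OS for any free port
        self.base_url = None
        self._server = None

    async def _handle(self, reader, writer):
        # Read the request headers, then send a fixed HTTP response.
        await reader.readuntil(b"\r\n\r\n")
        writer.write(
            b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok"
        )
        await writer.drain()
        writer.close()
        await writer.wait_closed()

    async def __aenter__(self):
        self._server = await asyncio.start_server(
            self._handle, self.host, self.port
        )
        # Record the port that was actually bound.
        self.port = self._server.sockets[0].getsockname()[1]
        self.base_url = f"http://{self.host}:{self.port}"
        return self

    async def __aexit__(self, exc_type, exc, tb):
        # Runs even if the test body raised, so the port is always released.
        self._server.close()
        await self._server.wait_closed()
```

Because `__aexit__` runs on both success and failure, a failing test cannot leave a server process holding the port for the next test.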
import pytest
from tests.utils import wait_for_element, screenshot_on_failure

@pytest.mark.asyncio
async def test_my_feature(dashboard, page):
    """Test description."""
    try:
        # Navigate
        await page.goto(dashboard.base_url)
        # Wait for element
        await wait_for_element(page, "#my-element")
        # Interact
        await page.click("button")
        await page.fill("input#name", "test")
        # Assert
        result = await page.text_content("#result")
        assert "success" in result
    except Exception:
        # Capture the page state, then re-raise so the test still fails
        await screenshot_on_failure(page, "test_my_feature")
        raise

- Always use try/except with screenshot_on_failure
- Wait for elements before interacting: wait_for_element()
- Use specific selectors: ID > class > tag
- Add delays only when necessary: page.wait_for_timeout(1000)
- Keep test data minimal: 2-3 experiments is usually enough

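For readers wondering what the two helpers imported from tests.utils might do, here is a minimal sketch. It assumes they are thin wrappers around Playwright's `page.wait_for_selector` and `page.screenshot`; the parameter names, the default timeout, and the `directory` argument are assumptions, and the real helpers may differ.

```python
from pathlib import Path

# Assumed default location, matching the screenshots directory named above.
SCREENSHOT_DIR = Path("tests/screenshots")


async def wait_for_element(page, selector: str, timeout_ms: int = 5000):
    """Wait until `selector` is visible; Playwright raises on timeout."""
    return await page.wait_for_selector(
        selector, state="visible", timeout=timeout_ms
    )


async def screenshot_on_failure(
    page, test_name: str, directory: Path = SCREENSHOT_DIR
) -> Path:
    """Save a full-page screenshot named after the failing test."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{test_name}.png"
    await page.screenshot(path=str(path), full_page=True)
    return path
```

Waiting on a selector instead of sleeping for a fixed time keeps tests fast when the element appears quickly and still tolerant when it appears late, which is why fixed delays are listed as a last resort.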
Edit tests/conftest.py:
browser = await p.chromium.launch(headless=False, slow_mo=1000)

content = await page.content()
print(content)

open tests/screenshots/

Run with -s to see output:
uv run pytest -s tests/test_dashboard_ui.py

Tests are designed for CI:
- ✅ Headless mode by default
- ✅ Automatic cleanup
- ✅ No manual setup required
- ✅ Fast parallel execution
- ✅ Clear error messages with screenshots
- Unit tests: ~0.01s each
- UI tests: ~2-5s each (includes server startup)
- Full suite: ~30s
All test state is stored in ~/.skydeck/ (same as production):
- Database: ~/.skydeck/skydeck.db
- Logs: ~/.skydeck/skydeck.log
- PID file: ~/.skydeck/skydeck.pid
Tests use temporary databases that don't affect your production data.
To add group drag-and-drop tests:
@pytest.mark.asyncio
async def test_drag_experiment_to_group(dashboard, page):
    """Test dragging experiment between groups."""
    await page.goto(dashboard.base_url)
    # Get experiment row as a Locator (drag_to is a Locator method,
    # not available on the ElementHandle returned by query_selector)
    row = page.locator("tr.main-row[data-exp-id='test_exp_1']")
    # Drag to group
    target = page.locator(".group-header[data-group='new_group']")
    await row.drag_to(target)
    # Verify
    # ... check group assignment

# Experiments table
"#experiments-table"
"tr.main-row"
"tr.expanded-row"
# Columns
".col-id"
".col-name"
".col-flag"
# Buttons
"button:has-text('Start')"
"button:has-text('Edit')"
"button:has-text('Delete')"
# Modals
"#create-modal.show"
"#flags-modal.show"
# Status
".status-badge.running"
".status-badge.stopped"

When testing the dashboard:
- Start with simple tests: Load page, check elements exist
- Build up complexity: Click buttons, fill forms, verify results
- Use screenshots: They show exactly what's wrong
- Test error cases: Missing data, failed API calls, etc.
- Verify responsive behavior: Table scrolling, modal sizing
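For the error-case tip, one possible approach is to intercept requests with Playwright's `page.route` and answer them with an error status, so the UI's failure handling can be checked without breaking the real server. The helper name `simulate_api_failure` and the `**/api/**` URL pattern are hypothetical; the dashboard's actual endpoint paths are not given here.

```python
async def simulate_api_failure(
    page, url_pattern: str = "**/api/**", status: int = 500
):
    """Make every request matching `url_pattern` return an error response."""

    async def handler(route):
        # Answer the request ourselves instead of letting it reach the server.
        await route.fulfill(status=status, body="simulated failure")

    await page.route(url_pattern, handler)
```

A test would call this before `page.goto(...)` and then assert that the dashboard shows whatever error state it is expected to show.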
You can now test UI changes without manual browser interaction!