Geospatial Atlas

This is a fork of Embedding Atlas adapted for geospatial data. Because embeddings, or rather their 2D projections, pose the same visualization challenges as 2D geospatial data, Embedding Atlas and all its functionality work very well for geospatial data exploration.

It can visualize up to ~200M points in a WebGPU-enabled browser. Use Chrome or Safari, or enable the WebGPU flag in Firefox.

Find various example apps here. Try for example the 6M GlobalGeoTree explorer! Load your own data (up to around 6M points) at https://do-me.github.io/geospatial-atlas/app/

You can also load data from a remote URL by passing it in the data parameter of the app link. Opening the following link downloads about 100 MB of geolocated Wikipedia articles: https://do-me.github.io/geospatial-atlas/app/#?data=https://pub-016504dd3a4d419a9c17a8939840935e.r2.dev/v1/wikipedia_geotagged.parquet
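As a minimal sketch of how such a link is put together, the snippet below appends a dataset URL to the app URL after `#?data=`. The helper name `viewer_link` is hypothetical and not part of the project. The dataset URL is passed unencoded because that is how the example link above is written; whether percent-encoded URLs are also accepted is not stated here.

```python
APP_URL = "https://do-me.github.io/geospatial-atlas/app/"


def viewer_link(data_url: str, app_url: str = APP_URL) -> str:
    """Return an app link that loads `data_url` when opened (hypothetical helper)."""
    # The example link in this README passes the dataset URL unencoded
    # after '#?data=', so this sketch does the same.
    return f"{app_url}#?data={data_url}"


link = viewer_link(
    "https://pub-016504dd3a4d419a9c17a8939840935e.r2.dev/v1/wikipedia_geotagged.parquet"
)
print(link)
```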

LinkedIn Post for more context

Desktop App Download (v0.0.3)

One-click download of the latest desktop app. All bundles are unsigned, so every platform needs a one-off bypass step after install — shown per-row below.

More background in Desktop app releases below.

Example screenshots

[8 example screenshots]

Installation

git clone https://github.com/do-me/geospatial-atlas.git
cd geospatial-atlas
npm install
npm run build

Running on an Intel Mac? Then add this line to packages/backend/pyproject.toml:

required-environments = ["sys_platform == 'darwin' and platform_machine == 'x86_64'"]

On Windows, Apple Silicon Macs and Linux, everything should work out of the box.

Usage (after installation above)

Run this command from the root directory of the repository. The parquet file must contain either a geometry column or a pair of coordinate columns named lat / lon or latitude / longitude.

uv --directory packages/backend run geospatial-atlas your_dataset_with_lat_lon_coords.parquet
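To check a file's columns before loading it, something like the following sketch could be used. The function `coordinate_columns` is a hypothetical helper, not part of geospatial-atlas. Matching names case-insensitively and preferring a geometry column over lat/lon are assumptions made for illustration; the actual detection logic in the backend may differ.

```python
def coordinate_columns(columns):
    """Return the column(s) that could carry coordinates, or None.

    Hypothetical pre-flight check; it only mirrors the requirement stated
    above (a geometry column, or lat/lon, or latitude/longitude).
    """
    # Map lowercase names back to the original spelling (assumed matching rule).
    names = {c.lower(): c for c in columns}
    if "geometry" in names:
        return (names["geometry"],)
    for lat, lon in (("lat", "lon"), ("latitude", "longitude")):
        if lat in names and lon in names:
            return (names[lat], names[lon])
    return None


print(coordinate_columns(["name", "lat", "lon"]))
print(coordinate_columns(["name", "x", "y"]))
```

The column list itself can come from any parquet reader, for example the schema names reported by pyarrow.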

If you have a small dataset (<5M places), you can add the --text flag to name a text column. The values in that column are then indexed and searchable. For large files this might cause out-of-memory errors.

uv --directory packages/backend run geospatial-atlas your_dataset_with_lat_lon_coords.parquet --text your_name_column

Alternatively you can cd into the backend folder and run it from there:

cd packages/backend
uv run geospatial-atlas your_dataset_with_lat_lon_coords.parquet --text your_name_column

The screenshots above were created with these two datasets:

Desktop app releases

Pre-built native apps (Tauri 2 shell + bundled Python sidecar) are on the releases page.

| Platform | File |
| --- | --- |
| macOS (Apple Silicon) | geospatial-atlas-macos-arm64.dmg |
| Linux x86_64 (Debian/Ubuntu) | geospatial-atlas-linux-x64.deb |
| Linux x86_64 (Fedora/RHEL) | geospatial-atlas-linux-x64.rpm |
| Windows x86_64 (MSI) | geospatial-atlas-windows-x64.msi |
| Windows x86_64 (NSIS setup) | geospatial-atlas-windows-x64-setup.exe |

Bundles are unsigned, so Gatekeeper (macOS) and SmartScreen (Windows) will warn on first launch. There is no pre-built bundle for Intel Macs because of the lack of GitHub runners (waiting times of 11 hours and more); Intel Mac users should fall back to the CLI path described under Installation and Usage above.

macOS: "app is damaged and can't be opened"

That's a misleading Gatekeeper message for unsigned apps downloaded from the internet. After dragging Geospatial Atlas into Applications, strip the quarantine attribute once:

xattr -cr "/Applications/Geospatial Atlas.app"

Then double-click as usual. (Alternative: System Settings → Privacy & Security → Open Anyway after a failed launch attempt.)

Linux / Windows

sudo dpkg -i geospatial-atlas-linux-x64.deb   # Debian / Ubuntu
sudo rpm  -i geospatial-atlas-linux-x64.rpm   # Fedora / RHEL

On Windows, SmartScreen says "unrecognized publisher" — click More info → Run anyway.

Connect an LLM agent (Claude Desktop, Cursor, …)

The app (CLI and desktop, starting with v0.0.2) ships a Model Context Protocol server at /mcp. LLM clients can drive the viewer live: run SQL, add charts, fly to coordinates, grab screenshots of regions, cross-filter by bounding box, and more. Full setup: docs/MCP.md.

Tool surface at a glance (31 tools):

  • Data: get_data_schema, run_sql_query
  • Charts: list_charts, add_chart, delete_chart, get_chart_spec/set_chart_spec, get_chart_state/set_chart_state/clear_chart_state, get_chart_screenshot
  • Layout: get_layout_type/set_layout_type, get_layout_state/set_layout_state, get_full_screenshot
  • Rendering: list_renderers, get_column_styles, set_column_style
  • Geospatial (v0.0.2): get_map_viewport, fly_to_point, fly_to_bbox, get_map_screenshot, get_map_screenshot_at, select_bbox, clear_selection, count_in_bbox, find_nearby, density_grid, highlight_points, set_basemap_style

Quick start with the CLI:

uv --directory packages/backend run geospatial-atlas your.parquet --mcp
# → URL: http://localhost:5055
# → MCP server: http://localhost:5055/mcp

Open the viewer at that URL in a real browser tab (for WebGPU), then point your LLM client's MCP config at http://localhost:5055/mcp. For autonomous / CI use, a Playwright-based headless viewer harness lives in scripts/mcp_harness/.
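For readers curious what an MCP client sends to that endpoint: MCP messages are JSON-RPC 2.0, and a client asks for the available tools with the `tools/list` method after an initial `initialize` handshake. The sketch below only builds that message and does not send it. The helper `jsonrpc_request` is hypothetical, and the transport details of this particular server (headers, session handling) are not covered here, so treat it as an illustration rather than a working client.

```python
import json

# Port taken from the quick-start output above; yours may differ.
MCP_URL = "http://localhost:5055/mcp"


def jsonrpc_request(method, request_id, params=None):
    """Build a JSON-RPC 2.0 request object (hypothetical helper)."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message


# 'tools/list' is the MCP method a client uses to discover tools.
list_tools = jsonrpc_request("tools/list", 1)
print(json.dumps(list_tools))
```

In practice an MCP-aware client (Claude Desktop, Cursor, and so on) handles this exchange for you; the config shown in the next section is all it needs.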

The desktop app surfaces a copyable MCP URL directly in the UI after a dataset is loaded (enabled by default, toggleable on the picker).

Client config

Most clients (Claude Desktop, Claude Code, Cursor, Continue, …) take a JSON entry with a single url field:

{
  "mcpServers": {
    "geospatial-atlas": {
      "url": "http://localhost:5055/mcp"
    }
  }
}

Claude Desktop config file locations:

| OS | Path |
| --- | --- |
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |

Fully quit and reopen the client — the 31 tools above should appear in the tool picker. Swap the port if the server picked a different one (the URL banner prints the actual port on launch).
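If the config file already lists other MCP servers, the entry has to be merged in rather than pasted over the file. A minimal sketch of that merge, using only the standard library, is shown below. The helper names `add_mcp_server` and `update_config_file` are hypothetical, and the sketch assumes the file is plain JSON with a top-level `mcpServers` object as in the example above.

```python
import json
from pathlib import Path


def add_mcp_server(config: dict, name: str, url: str) -> dict:
    """Return a copy of `config` with an MCP server entry added or replaced."""
    servers = dict(config.get("mcpServers", {}))  # keep existing servers
    servers[name] = {"url": url}
    return {**config, "mcpServers": servers}


def update_config_file(path: Path, name: str, url: str) -> None:
    """Read the JSON config at `path` (if present), merge the entry, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_mcp_server(config, name, url), indent=2))


existing = {"mcpServers": {"other": {"url": "http://localhost:9000/mcp"}}}
updated = add_mcp_server(existing, "geospatial-atlas", "http://localhost:5055/mcp")
print(json.dumps(updated, indent=2))
```

Back up the config file before calling `update_config_file` on it, since the sketch rewrites the whole file.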

Build & Deploy GitHub Pages

The static web app is deployed manually (no CI). To rebuild and deploy:

# 1. Install dependencies (first time only)
npm install

# 2. Build all packages (utils, component, table, viewer, docs)
npm run build

# 3. Deploy the built site to the gh-pages branch
./scripts/deploy-gh-pages.sh

Then in GitHub → Settings → Pages, set the source to the gh-pages branch (root /).

The live site is available at: https://do-me.github.io/geospatial-atlas/

Testing

End-to-end tests use Playwright and cover both runtime modes (server mode with Python backend, and frontend-only mode with Vite dev server + DuckDB WASM).

Prerequisites:

npm run build              # server-mode tests need the built viewer
npx playwright install chromium

On first run the test suite auto-downloads a ~29 MB parquet fixture (GISCO Education) and caches it in e2e/.data/ (git-ignored). Override with E2E_PARQUET_FILE=/path/to/file.parquet if needed.

Run all tests:

npx playwright test

Run a single mode:

npx playwright test --project server-mode
npx playwright test --project frontend-mode

View the HTML report (generated on every run):

npx playwright show-report e2e/playwright-report

Test artifacts (traces, screenshots on failure, HTML report) are written to e2e/test-results/ and e2e/playwright-report/ — both git-ignored.

Test structure

e2e/
├── helpers.ts                # Auto-download, server lifecycle, page helpers
├── server-mode.spec.ts       # Full-stack: Python backend + pre-built viewer
│   ├── API                   #   Metadata endpoint, DuckDB query
│   ├── Rendering             #   Scatter canvas, MapLibre basemap, sidebar
│   ├── Basemap Alignment     #   Mercator formula, point-vs-map consistency
│   ├── Interaction           #   Scroll-to-zoom
│   └── Zoom Drift            #   Scatter-vs-map pixel alignment across zoom levels
└── frontend-mode.spec.ts     # Browser-only: Vite dev server + DuckDB WASM
    ├── File Upload           #   Drop zone, parquet upload transition
    └── Test Data Viewer      #   Synthetic data scatter, UI controls
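The Basemap Alignment tests mention a Mercator formula. For reference, the standard Web Mercator projection maps longitude and latitude to a unit square as sketched below. This is the textbook formula, not code taken from the test suite, and the function name `web_mercator_unit` is hypothetical; which exact expression the tests assert against is an assumption.

```python
import math


def web_mercator_unit(lon_deg: float, lat_deg: float) -> tuple[float, float]:
    """Project lon/lat (degrees) to Web Mercator coordinates in [0, 1].

    x grows eastwards from the antimeridian; y grows southwards from
    roughly 85.05 degrees north, matching the usual tile convention.
    """
    x = (lon_deg + 180.0) / 360.0
    lat = math.radians(lat_deg)
    # ln(tan(lat) + sec(lat)) is the Mercator "stretched" latitude.
    y = (1.0 - math.log(math.tan(lat) + 1.0 / math.cos(lat)) / math.pi) / 2.0
    return x, y


print(web_mercator_unit(0.0, 0.0))
```

Multiplying both values by the world size in pixels at a given zoom level gives pixel coordinates, which is the kind of quantity a point-versus-map alignment check would compare.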

To Do

  • Disallow zooming out further than zoom level 0 to avoid weird shifting effects
  • Adapt density and point radius ranges
  • Add basemap attribution
  • Release own "geospatial-atlas" pip package?
  • And much more! Feel free to open PRs!

Original Embedding Atlas Readme


Embedding Atlas is a tool that provides interactive visualizations for large embeddings. It allows you to visualize, cross-filter, and search embeddings and metadata.

Features

  • 🏷️ Automatic data clustering & labeling: Interactively visualize and navigate overall data structure.

  • 🫧 Kernel density estimation & density contours: Easily explore and distinguish between dense regions of data and outliers.

  • 🧊 Order-independent transparency: Ensure clear, accurate rendering of overlapping points.

  • 🔍 Real-time search & nearest neighbors: Find similar data to a given query or existing data point.

  • 🚀 WebGPU implementation (with WebGL 2 fallback): Fast, smooth performance (up to a few million points) with a modern rendering stack.

  • 📊 Multi-coordinated views for metadata exploration: Interactively link and filter data across metadata columns.

Please visit https://apple.github.io/embedding-atlas for a demo and documentation.

[Screenshot of Embedding Atlas]

Get started

To use Embedding Atlas with Python:

pip install embedding-atlas

embedding-atlas <your-dataset.parquet>

In addition to the command line tool, Embedding Atlas is also available as a Python Notebook (e.g., Jupyter) widget:

from embedding_atlas.widget import EmbeddingAtlasWidget

# Show the Embedding Atlas widget for your data frame:
EmbeddingAtlasWidget(df)

Finally, components from Embedding Atlas are also available in an npm package:

npm install embedding-atlas

import { EmbeddingAtlas, EmbeddingView } from "embedding-atlas";

// or with React:
import { EmbeddingAtlas, EmbeddingView } from "embedding-atlas/react";

// or Svelte:
import { EmbeddingAtlas, EmbeddingView } from "embedding-atlas/svelte";

For more information, please visit https://apple.github.io/embedding-atlas/overview.html.

BibTeX

For the Embedding Atlas tool:

@misc{ren2025embedding,
  title={Embedding Atlas: Low-Friction, Interactive Embedding Visualization},
  author={Donghao Ren and Fred Hohman and Halden Lin and Dominik Moritz},
  year={2025},
  eprint={2505.06386},
  archivePrefix={arXiv},
  primaryClass={cs.HC},
  url={https://arxiv.org/abs/2505.06386},
}

For the algorithm that automatically produces clusters and labels in the embedding view:

@misc{ren2025scalable,
  title={A Scalable Approach to Clustering Embedding Projections},
  author={Donghao Ren and Fred Hohman and Dominik Moritz},
  year={2025},
  eprint={2504.07285},
  archivePrefix={arXiv},
  primaryClass={cs.HC},
  url={https://arxiv.org/abs/2504.07285},
}

Development

For development instructions, please visit https://apple.github.io/embedding-atlas/develop.html, or check out packages/docs/develop.md.

License

This code is released under the MIT license.
