Read a GEDCOM file and geocode its place names into GPS coordinates.
- Produces KML map types with timelines and movement visualization.
- Generates interactive HTML maps.
- Summarizes places and locations with high-accuracy geocoding.
- Visualizes family lineage: ascendants and descendants.
- Generates comprehensive statistics reports with demographics, temporal patterns, family relationships, and data quality metrics.
- Supports both command-line and GUI interfaces (GUI tested on Windows, macOS, and WSL).
- Now uses a modern services-based architecture: all exporters, renderers, and core logic require service objects implementing `IConfig`, `IState`, and `IProgressTracker` (see `services.py`).
- The legacy global options object (`gOp`) has been fully removed; all code and tests use dependency injection for configuration and state.
Originally forked from [https://github.com/lmallez/gedcom-to-map], now in collaboration with colin0brass.
- Optional Processing Steps: Added configuration options to disable enrichment and statistics processing during GEDCOM load (`EnableEnrichment` and `EnableStatistics` in Configuration Options -> Processing Options). When disabled, these steps are skipped entirely, significantly speeding up loading of very large GEDCOM files (e.g., WilliamLongsword.ged). Summary report checkboxes for disabled features are automatically grayed out
- Background Worker Robustness: Enhanced background processing thread with comprehensive error handling, automatic recovery from failures, and proper state reset. Worker thread now always returns to ready state even when exceptions occur, preventing stuck UI states
- Memory Monitoring Improvements: Memory tracking now shows only NEW allocations since app start using baseline snapshots, providing accurate memory usage analysis. Tracemalloc baseline comparison eliminates misleading cumulative statistics
- File Opening Infrastructure: Fixed KML files opening in correct system default handlers (Google Earth) instead of text editors. All result types (HTML, KML, KML2, SUM) now open in appropriate applications based on platform-specific defaults
- Pedigree Collapse Instrumentation: Added comprehensive tracking for pedigree collapse scenarios where same ancestors appear via multiple paths. Logs unique people vs total Line objects, helping users understand extreme line counts in royal genealogy datasets
- Logging System Refinement: Reorganized logging levels with verbose internal state tracking moved to DEBUG level. Production logs now show only user-relevant information at INFO level, reducing log noise while maintaining full debug capability
- Large Dataset Warnings: Enhanced AllEntities checkbox with two-tier warning system: critical errors for >10K people (prevents crashes) and standard warnings for 200-10K people. Explains memory implications and processing time
- Progress Reporting Enhancements: Added detailed progress logging during KML generation phases and all createothers() operations, reporting every 1000 people to keep users informed during long operations
- Memory Optimizations: Significant performance improvements in genealogical trace generation (creator.py) including event caching and optimized list operations, reducing memory allocations by 205k+ operations and improving performance for large family trees. Fixed double-traversal bug that was creating duplicate Line objects
- Configuration GUI Enhancements: Configuration Options dialog now includes controls for `earliest_credible_birth_year` (Statistics section) and `EnableTracemalloc` (Performance section) for detailed memory tracking during development
- Portable Pre-commit Setup: Pre-commit hooks now use the `.pre-commit-pytest.sh` wrapper script that automatically finds Python with pytest installed, making setup portable across different development environments without hardcoded paths
- Statistics Enhancements: Added `total_generations` field to statistics YAML output (calculated as generation span + 1) for comprehensive genealogy metrics
- Automated Testing Infrastructure: Added GitHub Actions CI/CD, pre-commit hooks, and Makefile commands for automated testing across multiple OS platforms (Ubuntu, Windows, macOS) and Python versions (3.10-3.13). See docs/automated-testing.md
- CI Platform Notes: For details on why Ubuntu core CI excludes wxPython while Windows/macOS include it, see the CI platform notes (wxPython).
- GUI Service Integration Tests: Added test coverage for GUI-service layer attribute consistency to prevent AttributeError bugs
- Dark Mode GUI Fixes: Fixed grid background colors to refresh reliably when switching between light and dark modes. Fixed Configuration Options dialog contrast issues in dark mode
- Statistics Summary Bug Fix: Fixed AttributeError in Actions -> Statistics Summary menu (incorrect attribute name `selected_people` vs `selectedpeople`)
- Statistics Configuration: Added configurable `earliest_credible_birth_year` threshold (default: 1000) in `gedcom_options.yaml` to filter implausible birth dates from statistics reports. Prevents data entry errors like year "1" from appearing in Executive Summary metrics
- Dark Mode Support: Comprehensive dark mode for HTML statistics reports and wxPython GUI with automatic system appearance detection on macOS. GUI colors use standard X11 color names for better readability and maintainability
- Cross-Platform Testing: Fixed UTF-8 encoding issues in statistics tests to ensure compatibility with Windows (cp1252), macOS, and Linux. All file operations now explicitly specify `encoding='utf-8'`
- Geocoding UI Improvement: Configuration dialog now uses mutually exclusive radio buttons for geocoding mode (Normal/Geocode only/Cache only) instead of checkboxes, preventing confusing combinations
- Windows Compatibility: Fixed Windows-specific crash during family record processing where accessing partner records could fail with "'NoneType' object has no attribute 'xref_id'" error
- Cache-Only Mode Fixes:
  - Cache-only mode no longer retries previously failed geocode lookups, ensuring true read-only behavior
  - geo_cache.csv file is not saved in cache-only mode, preventing timestamp updates
- Progress Reporting Infrastructure: Comprehensive progress tracking across all major operations with stop request support
  - GEDCOM parsing displays accurate progress metrics (counter, target, ETA) in the GUI
  - Statistics pipeline reports progress for all 14 collectors with "Statistics (X/Y): operation" format
  - Enrichment pipeline shows progress for all 3 rules with "Enrichment (X/Y): operation" format
  - Geocoding operations include progress for cache separation and location processing
  - Progress reports every 100 records for optimal UI responsiveness
  - Stop button properly interrupts long-running operations at all stages
- Configuration System: Refactored configuration handling with separated concerns: dedicated loader classes for YAML, INI, and logging configuration for better structure, reliability, and testability
- Dependency Injection: Updated `GVConfig` to support dependency injection pattern, making testing and maintenance easier
- Test Coverage: Added 26 new unit tests for configuration loaders, improving test isolation and coverage
- Logging Improvements: Log file now always writes to application root directory (not dependent on working directory)
- Photo Path Handling: Fixed cross-platform image display in HTML maps (Windows paths with backslashes now work correctly)
- Progress Messaging: Added early progress messages during HTML generation for better user feedback
- Loop Detection: Updated genealogical line creators to support pedigree collapse (same person in multiple branches)
- Logging System: Simplified logging configuration with 12 core loggers, WARNING default level, and Clear Log File option
- Configuration Dialog: Added "Set All Levels" control and improved logging grid display
- ResultType Refactoring: Moved ResultType enum to dedicated render/result_type.py module for better organization
- HTML Generation: Fixed issue where HTML output files weren't being saved during batch processing
- Output File Paths: Corrected file path construction for all output types (HTML, KML, KML2, SUM)
- Image Handling: Fixed cross-platform image loading to handle Windows paths in GEDCOM files
- Architecture: Removed legacy `gedcom_options.py` dependency; all code now uses services-based architecture
- Genealogy hobbyists wanting spatial context for life events.
- Historians and researchers mapping migrations and demographic clusters.
- Developers and data scientists seeking GEDCOM-derived geodata for analysis or visualization.
- Be cautious with the AllEntities checkbox: For datasets >10K people, the app shows a critical warning about hours-long processing and potential crashes. For royal genealogy files that trace to biblical figures, extreme pedigree collapse can create hundreds of thousands of Line objects even for relatively few unique people.
- Outputs (HTML/KML/SUM) are written next to your selected GEDCOM file; change the output filename in the GUI if you need a different base name. Generated files automatically open in appropriate applications (KML in Google Earth, HTML in browser).
- Double-left-click a person in the GUI list to set the starting person for traces and timelines.
- Edit `geo_cache.csv` to correct or refine geocoding, then save and re-run to apply fixes.
- Export KML to inspect results in Google Earth Pro, Google MyMaps, or ArcGIS Earth.
- Generate tables and CSV files listing people, places, and lifelines.
- Enable debug logging (Configuration Options -> Logging) to see detailed internal operations including pedigree collapse statistics, memory tracking, and background worker state transitions.
- Use the GUI to pick your GEDCOM, choose output type, then click Load to parse and resolve places.
- Choose geocoding mode: Normal (uses cache and geocodes new addresses), Geocode only (always geocode, ignoring cache), or Cache only (read-only, no network requests).
- When using exporters or renderers in your own scripts or tests, always provide service objects for configuration, state, and progress tracking. See the `render/tests` directory for up-to-date examples.
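As a rough illustration of the service-object pattern mentioned above, the sketch below injects a configuration object and a progress tracker into an exporter. Only the interface names `IConfig` and `IProgressTracker` come from this README; the method names (`get`, `step`), the `DictConfig` and `RecordingProgress` test doubles, and the `PlaceListExporter` class are hypothetical and are not taken from `services.py`. `IState` is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Protocol


class IConfig(Protocol):
    """Read access to settings (the method name is an assumption)."""

    def get(self, name: str, default=None): ...


class IProgressTracker(Protocol):
    """Receives progress updates (the method name is an assumption)."""

    def step(self, message: str) -> None: ...


@dataclass
class DictConfig:
    """Hypothetical test double backed by a plain dict."""

    values: dict = field(default_factory=dict)

    def get(self, name: str, default=None):
        return self.values.get(name, default)


@dataclass
class RecordingProgress:
    """Hypothetical test double that records every message it receives."""

    messages: list = field(default_factory=list)

    def step(self, message: str) -> None:
        self.messages.append(message)


class PlaceListExporter:
    """Hypothetical exporter: its dependencies arrive via the constructor."""

    def __init__(self, config: IConfig, progress: IProgressTracker):
        self.config = config
        self.progress = progress

    def export(self, places: list[str]) -> str:
        separator = self.config.get("separator", ", ")
        for place in places:
            self.progress.step(f"exported {place}")
        return separator.join(places)


config = DictConfig({"separator": "; "})
progress = RecordingProgress()
result = PlaceListExporter(config, progress).export(["Paris", "Oslo"])
print(result)
```

Because nothing is read from a global object, a test can pass in small fakes like these and inspect `progress.messages` afterwards.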
Assuming you have Python installed (see Install-Python if not):
- Clone the repository:
      git clone --recurse-submodules https://github.com/D-Jeffrey/gedcom-to-visualmap
      cd gedcom-to-visualmap
      git submodule update --init --recursive
  Or download and unzip the latest release.
  If you get an error because you have not set up git SSH, use the commands:
      git clone --recurse-submodules https://github.com/D-Jeffrey/gedcom-to-visualmap
      cd gedcom-to-visualmap
      git config --global url."https://github.com/".insteadOf git@github.com:
      git submodule update --init --recursive
This project now uses a dependency-injection/services pattern for all configuration, state, and progress tracking. See services.py for interface details. All new code and tests should use this pattern.
- Create and activate a virtual environment:
  For Windows (PowerShell):
      python3 -m venv venv
      venv\Scripts\activate.ps1
  For Linux and Mac:
      python3 -m venv venv
      source venv/bin/activate
- Install dependencies:
      pip install -r requirements.txt
      pip install -r gedcom-to-map/geo_gedcom/requirements.txt
  For development (includes testing tools, linting, pre-commit hooks):
      pip install -r requirements-dev.txt
      pre-commit install # Sets up git hooks
  Or use the Makefile shortcut:
      make install-dev # Installs dev tools and sets up pre-commit hooks
  Note: Pre-commit will auto-install tool environments (black, flake8, etc.) on first run. This takes a few minutes but only happens once.
- Run the GUI interface:
      cd gedcom-to-map
      python3 gv.py
  Or run the command-line interface:
      cd gedcom-to-map
      python3 gedcom-to-map.py /path/to/your.ged myTree -main "@I500003@"
- Update your code and dependencies:
      git config pull.rebase false
      git pull https://github.com/D-Jeffrey/gedcom-to-visualmap
      pip install -r requirements.txt
      pip install -r gedcom-to-map/geo_gedcom/requirements.txt
- Click `Input File` to select your .ged file.
- Set your options in the GUI, including:
  - Geocoding mode: Choose between Normal (use cache and geocode), Geocode only (ignore cache), or Cache only (no geocode)
  - Days between retrying failed lookups
  - Default country for geocoding
  - Processing options (via Configuration Options dialog): Enable/disable enrichment and statistics processing to speed up loading of very large files
- Click `Load` to parse and resolve addresses; progress displays with counter, target, and ETA.
- Use `Draw Update` to save and view results.
- `Open GPS` opens the CSV in Excel (close it before re-running).
- `Stop` aborts loading at any stage (GEDCOM parsing, statistics, enrichment, or geocoding) without closing the GUI.
- Double-left-click a person to set the starting person for traces.
- Use `Geo Table` to edit and manage resolved/cached names.
- Use `Trace` to create a list of individuals from the starting person.
- Use `Browser` to open the last output HTML file.
- Right-click a person for details and geocoding info.
- Progress tracking shows detailed status: during processing, you'll see messages like "Statistics (3/14): Analyzing demographics : 35% (700/2000)" or "Enrichment (2/3): Applying geographic rules : 50% (150/300)"
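The progress message layout quoted above can be reproduced with a few lines of Python. This is only a sketch of the string format; the function name `format_progress` and its parameters are hypothetical and do not describe the application's actual progress-tracking code.

```python
def format_progress(stage: str, step: int, total_steps: int,
                    operation: str, done: int, target: int) -> str:
    """Build a message such as 'Statistics (3/14): ... : 35% (700/2000)'."""
    # Integer arithmetic avoids floating-point rounding; guard against target == 0.
    percent = 100 * done // target if target else 0
    return f"{stage} ({step}/{total_steps}): {operation} : {percent}% ({done}/{target})"


line = format_progress("Statistics", 3, 14, "Analyzing demographics", 700, 2000)
print(line)
```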
- `geo_cache.csv` is created automatically when addresses are looked up.
- You can use an alternative address file (e.g., `my_family.csv` for `my_family.ged`).
- Do not keep CSV files open in Excel or other apps while running the program.
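To make the interaction between the geocoding modes and the cache concrete, here is a minimal sketch. The three mode names follow this README, but the `GeocodeMode` enum, the `resolve` function, the in-memory dict standing in for `geo_cache.csv`, and the choice to store fresh results back into the cache are all assumptions made for illustration.

```python
from enum import Enum
from typing import Callable, Optional

Coordinates = tuple[float, float]


class GeocodeMode(Enum):
    NORMAL = "normal"              # use the cache, geocode anything new
    GEOCODE_ONLY = "geocode_only"  # always geocode, ignoring the cache
    CACHE_ONLY = "cache_only"      # read-only, no network requests


def resolve(place: str,
            cache: dict[str, Coordinates],
            lookup: Callable[[str], Optional[Coordinates]],
            mode: GeocodeMode) -> Optional[Coordinates]:
    """Return coordinates for a place according to the selected mode."""
    if mode is not GeocodeMode.GEOCODE_ONLY and place in cache:
        return cache[place]
    if mode is GeocodeMode.CACHE_ONLY:
        return None  # never call the geocoder, never modify the cache
    result = lookup(place)
    if result is not None:
        cache[place] = result  # assumption: fresh results are cached
    return result
```

A stub geocoder is enough to try it out: `resolve("Oslo", {}, lambda place: (59.91, 10.75), GeocodeMode.NORMAL)` calls the stub once and stores the result, while the same call with `GeocodeMode.CACHE_ONLY` returns `None` and leaves the cache untouched.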
The comprehensive statistics report provides detailed demographic analysis, temporal patterns, family relationships, and data quality metrics:
- Demographics Section - Gender distribution with bar charts, popular names visualization, age statistics, and birth patterns
- Temporal Analysis - Historical timelines, longevity trends, mortality rates, and event distribution
- Family Relationships - Marriage statistics, children per family, divorce rates, and relationship path analysis
- Geographic Insights - Birth/death location distribution and migration patterns
- Data Quality - Event completeness metrics showing date and place coverage percentages
Reports are generated in both markdown (.md) and HTML (.html) formats, with the HTML version automatically opening in your browser for easy viewing.
- Set CSV or KML viewer in Options -> Setup.
- `SummaryOpen` controls whether SUM outputs auto-open; the app uses your configured file commands (Options -> Setup) and writes all summary CSV/PNG/HTML files beside the GEDCOM input.
- KML2 is an improved version of KML.
- SUM is a summary CSV and plot of birth vs death by continent/country.
- SUM also generates comprehensive statistics reports with visualizations, charts, and detailed demographic analysis in both markdown and HTML formats.
The SUM results type generates a comprehensive genealogical statistics report that includes:
- Executive Summary - Key metrics including total people, living/deceased counts, average lifespan, gender distribution, total marriages, number of generations, earliest birth year, and time span
- Demographics - Population analysis with gender distribution charts and popular names with bar charts
- Temporal Patterns - Historical timelines, birth/death patterns, and longevity trends
- Family Relationships - Marriage statistics, children per family, divorce rates, and relationship paths
- Geographic Distribution - Birth and death locations, migration patterns
- Data Quality Metrics - Event completeness, date/place coverage percentages
- Markdown (.md) - Editable source format with ASCII charts and tables
- HTML (.html) - Beautiful browser-rendered version with GitHub styling, automatically opened after generation
- Select your GEDCOM file in the GUI
- Choose `SUM` as the Results Type
- Enable `SummaryOpen` in options to automatically open the report
- Click `Draw Update` to generate the report
- The HTML report opens automatically in your browser
Statistics and performance settings can be customized in gedcom_options.yaml or via the GUI Configuration Options dialog (click the "Configuration Options..." button):
statistics_options:
  earliest_credible_birth_year: {type: 'int', default: 1000, ini_section: 'Statistics'}
performance_options:
  EnableTracemalloc: {type: 'bool', default: false, ini_section: 'Performance'}
- `earliest_credible_birth_year` (default: 1000): Filters birth years older than this threshold from the "earliest birth year" metric in the Executive Summary. This prevents data entry errors (like year "1" or "800") from appearing in reports. Configure via GUI or edit YAML file. Settings persist in INI `[Statistics]` section. Adjust this value based on your dataset:
  - Use `1000` for general genealogy (filters medieval and ancient errors)
  - Use `500-800` for legitimate medieval European genealogy
  - Use `1500-1700` for modern datasets where older dates are unlikely
- `EnableTracemalloc` (default: false): Enables detailed Python memory allocation tracking using the `tracemalloc` module. Useful for debugging memory issues but adds ~15% performance overhead. Configure via GUI or edit YAML file. Settings persist in INI `[Performance]` section. Only enable when investigating memory usage.
GUI Access: Open Configuration Options dialog -> Statistics Options section (birth year) and Performance Options section (memory tracking)
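For readers unfamiliar with `tracemalloc`, the baseline-snapshot idea behind the memory tracking option looks roughly like this. It is a generic standard-library sketch, not the application's own monitoring code; the `payload` list merely simulates work done after the baseline is taken.

```python
import tracemalloc

tracemalloc.start()
baseline = tracemalloc.take_snapshot()  # everything allocated so far is the baseline

# Simulated workload: allocations made after the baseline was taken.
payload = [str(i) * 10 for i in range(50_000)]

current = tracemalloc.take_snapshot()
# compare_to() reports differences relative to the baseline, grouped by source line,
# so allocations that already existed at the baseline do not inflate the numbers.
stats = current.compare_to(baseline, "lineno")
new_bytes = sum(stat.size_diff for stat in stats)

for stat in stats[:3]:
    print(stat)
print(f"net new allocations: {new_bytes / 1024:.1f} KiB")

tracemalloc.stop()
```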
The report provides comprehensive insights into your genealogical data with 14 different statistical collectors analyzing demographics, events, names, ages, births, longevity, timelines, marriages, children, relationships, divorces, and geographic patterns.
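The effect of the `earliest_credible_birth_year` threshold on the "earliest birth year" metric can be sketched as follows. The function below is illustrative only: its name and signature are hypothetical, and treating a year equal to the threshold as credible is an assumption.

```python
from typing import Iterable, Optional


def earliest_birth_year(birth_years: Iterable[int],
                        earliest_credible_birth_year: int = 1000) -> Optional[int]:
    """Return the earliest birth year at or after the threshold, if any."""
    credible = [year for year in birth_years
                if year >= earliest_credible_birth_year]
    return min(credible) if credible else None


years = [1, 800, 1066, 1850]
print(earliest_birth_year(years))       # default threshold of 1000
print(earliest_birth_year(years, 500))  # lower threshold for medieval data
```

With the default threshold the entries "1" and "800" are ignored; lowering the threshold to 500 lets the year 800 through.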
| Project | GitHub Repo | Documentation | Purpose |
|---|---|---|---|
| wxPython | Phoenix | wxpython.org | GUI toolkit |
| ged4py | ged4py | docs | GEDCOM parser |
| simplekml | simplekml | docs | KML generation |
| geopy | geopy | docs | Geocoding |
| folium | folium | docs | Interactive maps |
| xyzservices | xyzservices | docs | Tile services |
| pyyaml | pyyaml | | YAML processing |
| rapidfuzz | RapidFuzz | | Fuzzy string matching |
| pycountry | pycountry | | Country data |
| pycountry-convert | pycountry-convert | | Country/continent conversion |
| pandas | pandas | | Data analysis |
| seaborn | seaborn | docs | Visualization |
| matplotlib | matplotlib | docs | Visualization |
The project includes comprehensive test coverage across all major components with 581 passing tests ensuring cross-platform compatibility (macOS, Windows, Linux).
# Run all tests
make test # or: pytest --quiet
# Run fast tests only (skip slow performance tests)
make test-fast # or: pytest -m "not slow"
# Run GUI service integration tests
make test-gui
# Run with coverage report
make test-cov # or: pytest --cov=gedcom-to-map --cov-report=html
The project includes multiple levels of automated testing:
- GitHub Actions CI/CD: Runs automatically on every push/PR, tests across 3 OS platforms and 4 Python versions
- Pre-commit Hooks: Runs fast tests automatically before each commit (install with `make install-dev`)
- Make Commands: Convenient shortcuts for common test operations
- CI platform behavior: Ubuntu core CI excludes wxPython for runner stability, while Windows/macOS core CI include wxPython. Details: CI platform notes (wxPython)
See docs/automated-testing.md for complete documentation.
- Unit Tests: Fast, isolated tests for individual components
  - `gedcom-to-map/services/tests/` - Configuration, state, and progress tracking (131 tests)
  - `gedcom-to-map/gui/tests/` - GUI service integration and attribute consistency tests
  - `gedcom-to-map/models/tests/` - Data models and core structures
  - `gedcom-to-map/render/tests/` - Rendering and export functionality (UTF-8 encoding verified for Windows compatibility)
  - `gedcom-to-map/geo_gedcom/statistics/tests/` - Statistics collectors and pipeline (68 tests including configurable threshold tests)
- Integration Tests: Test component interactions
  - Configuration loading and migration
  - GEDCOM parsing with geocoding
  - Output generation (HTML, KML, statistics)
- Performance Tests: Marked with `@pytest.mark.slow` (see below)
The configuration system includes dedicated test coverage:
- `test_config_loader.py` - 26 unit tests for YAML, INI, and logging configuration loaders
- `test_config_service.py` - Integration tests for the main `GVConfig` service
- All tests use dependency injection and proper isolation (temporary files, mocked dependencies)
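An isolated configuration test in this style might look like the sketch below. It uses only the standard library and does not reproduce the project's loader classes: `load_birth_year_threshold` is a hypothetical helper, and the INI key name is assumed to match the YAML option name.

```python
import configparser
import tempfile
from pathlib import Path


def load_birth_year_threshold(ini_path: Path, default: int = 1000) -> int:
    """Hypothetical loader: read the threshold from the [Statistics] section."""
    parser = configparser.ConfigParser()
    parser.read(ini_path, encoding="utf-8")  # a missing file is silently skipped
    return parser.getint("Statistics", "earliest_credible_birth_year",
                         fallback=default)


def test_threshold_is_read_from_ini():
    # The temporary directory isolates the test from any real settings file.
    with tempfile.TemporaryDirectory() as tmp:
        ini_path = Path(tmp) / "settings.ini"
        ini_path.write_text(
            "[Statistics]\nearliest_credible_birth_year = 800\n",
            encoding="utf-8",
        )
        assert load_birth_year_threshold(ini_path) == 800


def test_default_is_used_when_file_is_missing():
    with tempfile.TemporaryDirectory() as tmp:
        assert load_birth_year_threshold(Path(tmp) / "missing.ini") == 1000
```

Writing the file with an explicit `encoding="utf-8"` mirrors the cross-platform encoding practice described earlier in this README.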
To run the address book/geocoding performance test and see detailed output in the terminal, use:
pytest -s -m slow gedcom-to-map/tests/test_addressbook_performance.py
This test benchmarks address book and geocoding operations across multiple GEDCOM samples, for both fuzzy and exact matching. It prints a markdown table of results to the terminal and also writes structured results to gedcom-to-map/tests/addressbook_performance_results.yaml for further analysis.
The -s option ensures that all print statements from the test are shown in the terminal.
Note: It requires some geo_cache files that may not be checked-in to the repo by default, so you might need to generate them manually using the "SUM" output option first.
To run the GeolocatedGedcom initialization performance test:
pytest -s -m slow gedcom-to-map/tests/test_geolocatedgedcom_performance.py
This test measures the initialization time and basic stats for the GeolocatedGedcom class across the same set of GEDCOM samples, for both fuzzy and exact matching. It prints a markdown table of results to the terminal and writes structured results to gedcom-to-map/tests/geolocatedgedcom_performance_results.yaml.
Some tests are marked with the @pytest.mark.slow decorator to indicate that they are slow or intended for manual/performance runs only. By default, these tests are skipped unless explicitly requested.
To run only the slow tests:
pytest -m slow
To run all tests except those marked as slow:
pytest -m 'not slow'
To mark a test as slow, add the following decorator above your test function:
import pytest

@pytest.mark.slow
def test_large_dataset():  # hypothetical test name
    ...

You may want to register the marker in your pytest.ini to avoid warnings:
[pytest]
markers =
slow: marks tests as slow (deselect with '-m "not slow"')
See the releases page for detailed changelogs.
- @colin0brass
- @lmallez
- @D-Jeffrey
See the main repository LICENSE.txt for details.








