
Commit f1248c6

feat: historical data backfill (Milestone 9) — v0.6.0
- Backfill orchestrator: 24-month history split into 90-day chunks, one job per (chunk × index) with staggered countdowns
- Auto-trigger on field creation with sentinel job pattern
- Weekly auto-compute via Celery Beat (Monday 06:00 UTC)
- Manual backfill API (POST /fields/{id}/backfill-indices, admin-only)
- Backfill status API (GET /fields/{id}/backfill-status)
- Chunk/index-level dedup and 409 Conflict endpoint guard
- Alert suppression for backfill jobs (is_backfill flag)
- Frontend: Backfill History button with status detection and banner
- Batch backfill task for retroactive processing of existing fields
- Config: index_backfill_months (24), index_backfill_chunk_days (90)
- i18n: English + Spanish translations for all backfill strings
1 parent 3cfff12 commit f1248c6
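
The orchestrator bullet in the commit message can be sketched roughly as follows. This is an illustration only, not the project's code: the names `split_into_chunks` and `plan_backfill_jobs`, the half-open chunk boundaries, the 30-day month approximation, and the 30-second stagger are all assumptions. Only the 24-month window, the 90-day chunk size, and the four index names come from the commit.

```python
from datetime import date, timedelta

INDICES = ("NDVI", "EVI", "SAVI", "NDWI")  # the four indices named in the changelog


def split_into_chunks(start: date, end: date, chunk_days: int = 90) -> list[tuple[date, date]]:
    """Split [start, end) into consecutive half-open chunks of at most chunk_days."""
    if chunk_days <= 0:
        raise ValueError("chunk_days must be positive")
    chunks = []
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + timedelta(days=chunk_days), end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


def plan_backfill_jobs(end: date, months: int = 24, chunk_days: int = 90,
                       stagger_seconds: int = 30) -> list[dict]:
    """Plan one job per (chunk x index), each with a larger countdown than the last."""
    # Assumption: a month is approximated as 30 days for this sketch.
    start = end - timedelta(days=months * 30)
    jobs = []
    for chunk_start, chunk_end in split_into_chunks(start, end, chunk_days):
        for index in INDICES:
            jobs.append({
                "index": index,
                "start": chunk_start,
                "end": chunk_end,
                # Hypothetical stagger: job n waits n * stagger_seconds before running.
                "countdown": len(jobs) * stagger_seconds,
            })
    return jobs
```

Under these assumptions the defaults plan 8 chunks and 32 jobs. Staggering the countdowns spreads the requests over time instead of sending them all at once.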

15 files changed

Lines changed: 871 additions & 152 deletions


CHANGELOG.md

Lines changed: 19 additions & 0 deletions
@@ -8,6 +8,25 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/).

 ---

+## [0.6.0] - 2026-03-19
+
+### Added
+- **Historical data backfill** — automatic 24-month vegetation index backfill for all four indices (NDVI, EVI, SAVI, NDWI) on field creation.
+- Backfill orchestrator task (`backfill_indices_for_field`) that splits date ranges into 90-day chunks and dispatches one job per (chunk × index) pair with staggered countdowns.
+- Chunk-level and index-level deduplication — skips chunks/indices where raster data already exists.
+- Weekly auto-compute Celery Beat schedule (`schedule_weekly_index_compute`) — runs every Monday at 06:00 UTC, skips fields with recent data (< 5 days old).
+- Manual backfill API endpoint: `POST /fields/{id}/backfill-indices` with configurable month range (admin/owner only, rate-limited 1/minute).
+- Backfill status API endpoint: `GET /fields/{id}/backfill-status` — returns pending, running, and completed job counts.
+- Endpoint-level deduplication — returns 409 Conflict if a backfill is already in progress for the field.
+- Sentinel job pattern — synchronous placeholder job created before async Celery dispatch to prevent race conditions with status checks.
+- Alert suppression for backfill jobs — backfill pipeline runs do not create alerts (controlled via `is_backfill` flag in `params_json`).
+- Backfill History button on field Indices tab with loading state, active-backfill detection, and progress banner.
+- Batch backfill task (`backfill_all_existing_fields`) for retroactive backfill of all org fields.
+- Configuration settings: `index_backfill_months` (default 24), `index_backfill_chunk_days` (default 90).
+- English and Spanish translations for all backfill UI strings.
+
+---
+
 ## [0.5.0] - 2026-03-17

 ### Added
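
The sentinel job, the 409 guard, and the status counts from the 0.6.0 changelog entries can be sketched together as below. Everything here is hypothetical scaffolding: `JobStore`, `Job`, `BackfillConflict`, and the in-memory list stand in for whatever the real jobs table and API layer look like. The sketch only shows the ordering the changelog describes: check for an active backfill, record a placeholder job synchronously, then dispatch.

```python
from collections import Counter
from dataclasses import dataclass, field
from itertools import count
from typing import Callable

ACTIVE_STATUSES = {"pending", "running"}


class BackfillConflict(Exception):
    """Raised when a backfill is already active; an API layer could map this to HTTP 409."""


@dataclass
class Job:
    id: int
    field_id: str
    status: str = "pending"
    is_sentinel: bool = False


@dataclass
class JobStore:
    """In-memory stand-in for a jobs table (hypothetical)."""
    jobs: list[Job] = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def has_active_backfill(self, field_id: str) -> bool:
        return any(j.field_id == field_id and j.status in ACTIVE_STATUSES for j in self.jobs)

    def start_backfill(self, field_id: str, dispatch: Callable[[str], None]) -> Job:
        # Endpoint-level guard: refuse a second backfill for the same field.
        if self.has_active_backfill(field_id):
            raise BackfillConflict(f"backfill already in progress for field {field_id}")
        # Sentinel: recorded synchronously, before the async dispatch, so a status
        # check issued right after this call already sees a pending job.
        sentinel = Job(id=next(self._ids), field_id=field_id, is_sentinel=True)
        self.jobs.append(sentinel)
        dispatch(field_id)  # stand-in for the asynchronous task dispatch
        return sentinel

    def status_counts(self, field_id: str) -> dict[str, int]:
        counts = Counter(j.status for j in self.jobs if j.field_id == field_id)
        return {s: counts.get(s, 0) for s in ("pending", "running", "completed")}
```

A database-backed version would need the check and the sentinel insert to be atomic, for example inside one transaction, which this in-memory sketch does not attempt.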

README.md

Lines changed: 2 additions & 2 deletions
@@ -36,7 +36,7 @@ Open source self-hostable and reproducible Crop Intelligence Platform

 ## Why OpenFarm
 - Self-hostable stack with clear service boundaries (Next.js ↔ FastAPI ↔ TiTiler ↔ MinIO ↔ PostGIS)
-- Multi-index vegetation monitoring — NDVI, EVI, SAVI (configurable L factor), and NDWI from Sentinel-2 imagery
+- Multi-index vegetation monitoring — NDVI, EVI, SAVI (configurable L factor), and NDWI from Sentinel-2 imagery with automatic 24-month historical backfill
 - ML-powered automatic field boundary detection (FTW model) with interactive review workflow
 - Daily weather data with agricultural indices (GDD, water balance, drought index) via Open-Meteo
 - Reproducible pipeline with provenance (Element84 STAC → COG → TiTiler tiles)
@@ -145,7 +145,7 @@ ruff format --check .
 - **Auth**: Google OAuth via NextAuth → JWT bridge (`/api/auth/token`)
 - **Orgs & RBAC**: owner/admin/member/viewer with audit logging
 - **Farms & Fields**: draw/upload GeoJSON/KML, area calc, soft delete
-- **Vegetation Monitoring**: NDVI, EVI, SAVI (configurable L factor), NDWI — STAC search → COG → TiTiler tiles → time-series stats
+- **Vegetation Monitoring**: NDVI, EVI, SAVI (configurable L factor), NDWI — STAC search → COG → TiTiler tiles → time-series stats, with automatic 24-month historical backfill on field creation and weekly auto-compute
 - **Boundary Detection**: automatic field boundary detection from Sentinel-2 imagery using FTW deep learning model — draw area, review results, accept as fields
 - **Weather Integration**: daily historical + 7-day forecast weather data per field — temperature, precipitation, ET₀, soil moisture/temperature, VPD, GDD, water balance, drought index
 - **Per-Index Alerts**: configurable threshold and drop-percentage rules, enriched with weather context
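
As a rough illustration of how the threshold and drop-percentage rules might interact with the `is_backfill` flag in `params_json` described in the changelog, consider the sketch below. The function name `evaluate_alerts`, the default threshold of 0.3, and the 20% drop are assumptions made for the example, not values taken from the project.

```python
import json


def evaluate_alerts(value: float, rolling_avg: float, params_json: str,
                    threshold: float = 0.3, drop_pct: float = 20.0) -> list[str]:
    """Return the alert kinds triggered by one index reading."""
    params = json.loads(params_json or "{}")
    # Per the changelog, backfill pipeline runs do not create alerts.
    if params.get("is_backfill", False):
        return []
    alerts = []
    # Threshold rule: the reading fell below a fixed floor.
    if value < threshold:
        alerts.append("threshold")
    # Drop rule: percentage drop relative to the rolling average
    # (skipped when the average is not positive, to avoid dividing by zero).
    if rolling_avg > 0 and (rolling_avg - value) / rolling_avg * 100 >= drop_pct:
        alerts.append("drop")
    return alerts
```

Checking the flag before any rule runs keeps historical imagery from producing alerts about conditions that are long past.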

ROADMAP.md

Lines changed: 77 additions & 37 deletions
@@ -6,9 +6,22 @@ This document outlines where OpenFarm is today and where it's headed. If you'd l

 ---

+## Platform Vision
+
+### Layer A — Observation Infrastructure
+Satellite ingestion, weather, radar, sensor plugins, field boundaries, temporal storage, provenance, exports, APIs.
+
+### Layer B — Intelligence Engine
+Phenology, crop type, stress segmentation, anomaly detection, disease/pest risk signals, irrigation/nutrient heuristics, yield forecasting, uncertainty scoring, benchmarking.
+
+### Layer C — Delivery Surfaces
+Map UI, reports, API, webhooks, partner integrations, lightweight mobile scouting, LLM/MCP interfaces.
+
+---
+
 ## Current Status

-OpenFarm **Phase 4 (Weather Integration) is complete**. The platform delivers end-to-end satellite-powered crop intelligence with four vegetation indices (NDVI, EVI, SAVI, NDWI), ML-powered automatic field boundary detection, and daily weather data integration with agricultural indices — all functional and deployed. Weather data (historical + 7-day forecast) is fetched per-field from Open-Meteo, stored with pre-computed agricultural indices (GDD, water balance, drought index), and surfaced across the entire UI: dedicated weather tab, alert enrichment, scouting snapshots, share reports, and NDVI+weather overlay charts. The focus now shifts to testing, documentation, and building the agricultural intelligence layer. See [Future Ideas](#future-ideas-post-mvp) for what's next.
+OpenFarm **Phase 5 (Historical Data Backfill) is complete**. The platform delivers end-to-end satellite-powered crop intelligence with four vegetation indices (NDVI, EVI, SAVI, NDWI), ML-powered automatic field boundary detection, daily weather data with agricultural indices, and automatic 24-month historical index backfill on field creation — all functional and deployed. New fields automatically receive two years of vegetation index history, and a weekly Celery Beat schedule keeps all fields up to date. The focus now shifts to testing, documentation, and building the agricultural intelligence layer. See [Future Ideas](#future-ideas-post-mvp) for what's next.

 ---

@@ -112,6 +125,22 @@ OpenFarm **Phase 4 (Weather Integration) is complete**. The platform delivers en
 - [x] Alembic migrations for detected boundaries, nullable job field_id, updated_at
 - [x] i18n translations (English + Spanish) for all detection UI

+## Milestone 9 — Historical Data Backfill ✅
+
+- [x] Backfill orchestrator task — splits 24-month range into 90-day chunks, dispatches one job per (chunk × index) with staggered countdowns
+- [x] Chunk-level and index-level deduplication — skips chunks/indices where raster data already exists
+- [x] Alert suppression — backfill pipeline runs do not create alerts (`is_backfill` flag in `params_json`)
+- [x] Auto-trigger on field creation — new fields automatically get 24 months of all 4 vegetation indices
+- [x] Weekly auto-compute Celery Beat schedule (`schedule_weekly_index_compute`) — Monday 06:00 UTC, skips fresh fields
+- [x] Manual backfill API: `POST /fields/{id}/backfill-indices` (admin/owner, rate-limited 1/min)
+- [x] Backfill status API: `GET /fields/{id}/backfill-status` — pending, running, completed job counts
+- [x] Endpoint-level deduplication — 409 Conflict if backfill already in progress
+- [x] Sentinel job pattern — synchronous placeholder job before async dispatch to prevent race conditions
+- [x] Batch backfill task (`backfill_all_existing_fields`) for retroactive backfill of all org fields
+- [x] Backfill History button on Indices tab with active-backfill detection and progress banner
+- [x] Config settings: `index_backfill_months` (24), `index_backfill_chunk_days` (90)
+- [x] i18n translations (English + Spanish) for all backfill UI strings
+
 ## Milestone 8 — Weather Data Integration ✅

 - [x] `weather_daily` table with 18 raw variables + 5 derived indices + metadata
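
The weekly auto-compute item in Milestone 9 combines a schedule (Monday 06:00 UTC) with a freshness check (skip fields whose data is less than 5 days old). The sketch below shows both pieces in plain Python. The function names and the shape of `latest_data` are assumptions, and the actual Celery Beat configuration is not shown in this commit view.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def next_weekly_run(now: datetime) -> datetime:
    """Return the next Monday 06:00 (in now's timezone) strictly after now."""
    candidate = now.replace(hour=6, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(0 - candidate.weekday()) % 7)  # Monday == 0
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


def fields_needing_compute(latest_data: dict[str, Optional[datetime]], now: datetime,
                           fresh_days: int = 5) -> list[str]:
    """Return IDs of fields whose newest data is missing or at least fresh_days old."""
    cutoff = now - timedelta(days=fresh_days)
    return sorted(
        field_id for field_id, latest in latest_data.items()
        if latest is None or latest <= cutoff
    )
```

A short usage example, with made-up field IDs:

```python
now = datetime(2026, 3, 23, 6, 0, tzinfo=timezone.utc)  # a Monday
latest = {"a": now - timedelta(days=2), "b": now - timedelta(days=10), "c": None}
stale = fields_needing_compute(latest, now)  # "a" is fresh and therefore skipped
```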
@@ -138,60 +167,71 @@ OpenFarm **Phase 4 (Weather Integration) is complete**. The platform delivers en

 ## Future Ideas (Post-MVP)

-These are under consideration but not yet committed. Grouped by theme and roughly ordered by priority.
+These are under consideration but not yet committed. Grouped by theme and ordered by dependency — items higher in each list are prerequisites for items below them.

 ### Platform Foundations
+- ~~**Historical data backfill**~~ — ✅ completed in Milestone 9
+- **Data export** — export field data, stats, and reports in CSV, GeoJSON, PDF formats
+- **Custom index builder** — UI for users to define custom indices from available bands with formula editor and visualization
+- **User roles and permissions** — more granular permissions (e.g., field-level access, read-only API keys) and user groups
 - **Email/Microsoft & Enterprise SSO** — support email/password, Microsoft OAuth, and SAML/OIDC for enterprise identity providers
 - **Direct API integration** — stable, versioned public API with API keys for integrating OpenFarm into existing farm management software
-- **Multi-satellite support** — Landsat, Planet (currently Sentinel-2 only)
+- **Multi-satellite support** — Landsat, Planet (currently Sentinel-2 only) _(prerequisite for higher-frequency monitoring)_
 - **Higher-frequency monitoring** — support for daily revisit satellites (e.g., PlanetScope) for near real-time crop monitoring
-- **Higher-res imagery** — support for sub-meter commercial imagery for detailed crop monitoring
-- **Custom index builder** — UI for users to define custom indices from available bands with formula editor and visualization
-- **Data export** — export field data, stats, and reports in CSV, GeoJSON, PDF formats
-- **User roles and permissions** — more granular permissions (e.g., field-level access, read-only API keys) and user groups
-- ~~**Weather data integration**~~ — ✅ Done (Milestone 8)
-
+- **Higher-res imagery** — support for sub-meter commercial imagery for detailed crop monitoring _(prerequisite for tree canopy analysis, individual tree detection)_
+- **Drone imagery support** — ingest and process high-res drone imagery for field-level insights _(depends on higher-res imagery pipeline)_

 ### Agricultural Intelligence
-- **Crop detection and classification** — ML-based crop type identification from spectral data
-- **tree detection and classification** — ML-based tree crop identification and health monitoring
-- **Phenology tracking** — track crop growth stages and phenological events from satellite data
-- **tree canopy analysis** — canopy cover, leaf area index, and tree height estimation from high-res imagery
-- **drought monitoring** — integrate drought indices and soil moisture data for water stress assessment
-- **water stress monitoring** — integrate NDWI and soil moisture data for irrigation management
-- **soil moisture estimation** — integrate soil moisture data from remote sensing and in-situ sensors
-- **Disease/pest risk signals** — risk scoring framework combining vegetation anomalies, weather, and regional pest data
-- **nutrient deficiency detection** — identify spectral signatures of common nutrient deficiencies for early intervention
-- **Fertilizer and irrigation recommendations** — actionable insights based on crop health trends, weather forecasts, and agronomic models
-- **Anomaly detection** — unsupervised ML to identify unusual patterns in field health data that may indicate emerging issues
-- **Historical data backfill** — backfill historical vegetation index data for existing fields to enable trend analysis from day one
-- **Yield analysis and forecasting** — predict yield from historical NDVI trends, weather, and field data
-- **Harvest timing recommendations** — optimal harvest windows based on crop maturity models and vegetation indices
-- **Climate impact modeling** — estimate carbon sequestration, emissions, and climate impact of farming practices using field data and agronomic models
-- **Carbon/sustainability reporting** — track and report carbon sequestration, emissions, and sustainability metrics
+
+Items are ordered by dependency. Each layer builds on the one above it.
+
+**Layer 1 — Statistical analysis (CPU-only, builds on existing indices + weather)**
+- **Anomaly detection** — statistical detection of unusual patterns in vegetation index time series (z-score, moving average deviation from historical baseline) _(historical backfill ✅)_
+- **Phenology tracking** — track crop growth stages and phenological events from NDVI/EVI temporal curves _(historical backfill ✅)_
+- **Drought and water stress monitoring** — composite risk scoring from existing drought index, NDWI, soil moisture, ET₀, and water balance data _(extends existing weather + NDWI features)_
+
+**Layer 2 — Composite intelligence (builds on Layer 1)**
+- **Disease/pest risk signals** — risk scoring framework combining vegetation anomalies, weather conditions, and regional pest data _(depends on anomaly detection + weather)_
+- **Soil moisture estimation** — enhanced soil moisture modeling combining remote sensing indices with in-situ sensor data _(depends on drought/water stress monitoring)_
+
+**Layer 3 — Actionable recommendations (builds on Layer 2)**
+- **Yield analysis and forecasting** — predict yield from historical NDVI trends, weather, and field data _(depends on phenology + anomaly detection + weather)_
+- **Harvest timing recommendations** — optimal harvest windows based on crop maturity models and vegetation indices _(depends on phenology tracking)_
+- **Fertilizer and irrigation recommendations** — actionable insights based on crop health trends, weather forecasts, and agronomic models _(depends on disease risk + water stress)_
+
+**Layer 4 — ML-powered classification (requires GPU + fine-tuned models)**
+- **Crop detection and classification** — ML-based crop type identification from multi-temporal spectral data using foundation models _(see [research doc](docs/features/crop-tree-detection.md))_
+- **Tree detection and classification** — ML-based tree crop identification and health monitoring _(depends on crop classification pipeline)_
+- **Nutrient deficiency detection** — identify spectral signatures of common nutrient deficiencies for early intervention _(depends on additional spectral bands from crop classification pipeline)_
+- **Tree canopy analysis** — canopy cover, leaf area index, and tree height estimation _(depends on higher-res imagery)_
+
+**Layer 5 — Advanced modeling (builds on everything above)**
+- **Climate impact modeling** — estimate carbon sequestration, emissions, and climate impact of farming practices using field data and agronomic models _(depends on yield forecasting + crop classification)_
+- **Carbon/sustainability reporting** — track and report carbon sequestration, emissions, and sustainability metrics _(depends on climate impact modeling)_

 ### Analytics & Workflows
-- **Advanced analytics and reporting framework** — customizable dashboards, scheduled reports, and data export
 - **Field comparison** — side-by-side health comparison across fields
-- **Historical analytics** — season-over-season trend analysis
-- **Advanced workflows** — rule-based automation (e.g., auto-trigger analysis on new imagery, scheduled monitoring)
-- **Webhook/notification system** — email, Slack, or SMS on alerts
+- **Historical analytics** — season-over-season trend analysis _(historical backfill ✅)_
+- **Webhook/notification system** — email, Slack, or SMS on alerts _(prerequisite for advanced workflows)_
+- **Advanced analytics and reporting framework** — customizable dashboards, scheduled reports, and data export _(depends on field comparison + historical analytics)_
+- **Advanced workflows** — rule-based automation (e.g., auto-trigger analysis on new imagery, scheduled monitoring) _(depends on webhook/notification system)_

 ### Ecosystem & Integrations
-- **AI agent integration** — connect to LLMs for natural language insights, recommendations, and conversational interfaces
-- **Model Context Protocol (MCP) server** — standardized interface for AI agents to query field data and trigger analysis
-- **Device/Sensor plugin framework** — connect soil sensors, weather stations, and IoT devices
-- **Machinery telemetry integration** — ingest GPS tracks and operational data from farm equipment
+- **Direct API integration** — stable, versioned public API with API keys _(prerequisite for MCP + AI agent integration)_
+- **Plugin system** — extensible processing pipelines for custom analysis _(prerequisite for device/sensor framework)_
+- **Model Context Protocol (MCP) server** — standardized interface for AI agents to query field data and trigger analysis _(depends on API integration)_
+- **AI agent integration** — connect to LLMs for natural language insights, recommendations, and conversational interfaces _(depends on MCP server)_
+- **Device/Sensor plugin framework** — connect soil sensors, weather stations, and IoT devices _(depends on plugin system)_
+- **Machinery telemetry integration** — ingest GPS tracks and operational data from farm equipment _(depends on device/sensor framework)_
 - **Supply chain / traceability integrations** — link field data to downstream logistics and compliance systems
-- **Plugin system** — extensible processing pipelines for custom analysis
 - **Community data sharing** — opt-in anonymized data sharing for regional insights and benchmarking

 ### Enterprise & Scale
 - **Enterprise admin controls** — SSO enforcement, audit policies, usage quotas, multi-tenant admin
-- **Custom analytics packs per vertical** — tailored modules for tree crops, forestry, viticulture, etc.
-- **Modular ERP** — lightweight farm operations management (inventory, tasks, financials)
 - **Mobile app** — React Native companion for field scouting
-- **Hosted offering** — managed cloud version
+- **Custom analytics packs per vertical** — tailored modules for tree crops, forestry, viticulture, etc. _(depends on analytics framework)_
+- **Modular ERP** — lightweight farm operations management (inventory, tasks, financials)
+- **Hosted offering** — managed cloud version _(depends on enterprise admin controls)_

 ---
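
The Layer 1 anomaly detection item in the roadmap names z-scores against a historical baseline, where a reading x is scored as z = (x - mean) / standard deviation of the baseline. Since that feature is still only an idea, the following is just one possible shape for it: the function name `zscore_anomalies`, the threshold of 2.0, and the use of the sample standard deviation are all assumptions.

```python
from statistics import mean, stdev


def zscore_anomalies(history: list[float], recent: list[float],
                     z_threshold: float = 2.0) -> list[tuple[int, float]]:
    """Flag recent values whose z-score against the historical baseline is large.

    Returns (position in recent, rounded z-score) pairs.
    """
    if len(history) < 2:
        raise ValueError("need at least two historical values for a baseline")
    baseline_mean = mean(history)
    baseline_std = stdev(history)  # sample standard deviation
    if baseline_std == 0:
        return []  # a flat baseline gives no scale to measure deviation against
    flagged = []
    for i, value in enumerate(recent):
        z = (value - baseline_mean) / baseline_std
        if abs(z) >= z_threshold:
            flagged.append((i, round(z, 2)))
    return flagged
```

With a backfilled history the baseline could be taken from the same weeks in earlier seasons, which would make the comparison season-aware. The sketch leaves that choice to the caller.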

apps/web/messages/en.json

Lines changed: 8 additions & 0 deletions
@@ -322,6 +322,14 @@
 "saviThreshold": "SAVI fell below threshold ({value})",
 "ndwiDrop": "NDWI dropped {pct}% below rolling average",
 "ndwiThreshold": "NDWI fell below threshold ({value})"
+},
+"backfill": {
+"buttonLabel": "Backfill History",
+"buttonTitle": "Backfill 24 months of historical data for all indices",
+"processing": "Historical data is being processed",
+"processingDesc": "Satellite imagery for the past 24 months is being analyzed. Data will appear automatically as it's ready.",
+"started": "Historical analysis started. Data will appear over the next few hours.",
+"failed": "Failed to start backfill"
 }
 },
 "scoutingTab": {

apps/web/messages/es.json

Lines changed: 8 additions & 0 deletions
@@ -322,6 +322,14 @@
 "saviThreshold": "SAVI cayó por debajo del umbral ({value})",
 "ndwiDrop": "NDWI cayó {pct}% por debajo del promedio móvil",
 "ndwiThreshold": "NDWI cayó por debajo del umbral ({value})"
+},
+"backfill": {
+"buttonLabel": "Historial de datos",
+"buttonTitle": "Rellenar 24 meses de datos históricos para todos los índices",
+"processing": "Se están procesando datos históricos",
+"processingDesc": "Se están analizando imágenes satelitales de los últimos 24 meses. Los datos aparecerán automáticamente.",
+"started": "Análisis histórico iniciado. Los datos aparecerán en las próximas horas.",
+"failed": "Error al iniciar el relleno de datos"
 }
 },
 "scoutingTab": {
