The docStreamer API provides endpoints for monitoring the migration status and managing on-demand data validation.
All endpoints are hosted by the internal API server (default: port 8080, configurable via config.yaml or environment variables).
Retrieves a detailed snapshot of the migration's current state, including full load progress, CDC lag, and high-level validation summaries.
| Detail | Value |
|---|---|
| Path | /status |
| Method | GET |
| Use Case | Health checks, dashboard integration, and general migration progress monitoring |
Example
curl -X GET http://localhost:8080/statusResponse Fields (Partial):
ok:trueunless the state is error.state: Current state (e.g.,running,starting,error).info: Detailed message about the state.cdcLagSeconds: Delay between the source event time and the applied time on the target.initialSync.cloneCompleted:trueif the initial data load is finished.
Triggers the background creation of any secondary indexes that were deferred during the Full Load phase (postpone_index_creation: true). This runs asynchronously, allowing the CDC stream to continue operating without interruption.
| Detail | Value |
|---|---|
| Path | /index |
| Method | POST |
| Mandatory Prerequisite | Full Load (Clone) must be completed. |
Example
curl -X POST http://localhost:8080/indexInitiates a graceful shutdown of the CDC stream. It waits for the stream to safely drain all in-flight operations, builds any deferred secondary indexes, and permanently marks the migration as "Finalized" in the metadata.
| Detail | Value |
|---|---|
| Path | /finalize |
| Method | POST |
| Mandatory Prerequisite | Full Load (Clone) must be completed. |
Example
curl -X POST http://localhost:8080/finalizeManually triggers the flow control mechanism to safely pause data extraction. The pipeline will finish writing any in-flight batches and then idle.
| Detail | Value |
|---|---|
| Path | /pause |
| Method | POST |
Example:
curl -X POST http://localhost:8080/pauseExample Response:
{
"status": "paused"
}Releases the manual flow control pause, allowing the data reader to wake up and resume the migration pipeline exactly where it left off.
| Detail | Value |
|---|---|
| Path | /resume |
| Method | POST |
Example:
curl -X POST http://localhost:8080/resumeExample Response:
{
"status": "resumed"
}Endpoints for managing the data consistency check process.
Performs a synchronous, real-time comparison for a specific set of documents between the source and target databases.
| Detail | Value |
|---|---|
| Path | /validate/adhoc |
| Method | POST |
| Mandatory Prerequisite | Full Load (Clone) must be completed — otherwise returns 403 Forbidden |
| Use Case | Manually verifying consistency of key documents after a known operation or error |
Request Body
| Field | Type | Description |
|---|---|---|
namespace |
string | The collection namespace (e.g., my_db.users) |
ids |
array | List of document _id values to check |
Example
curl -X POST http://localhost:8080/validate/adhoc \
-H "Content-Type: application/json" \
-d '{
"namespace": "ecommerce.orders",
"ids": [
"ObjectId(\"656e18471f468903c18b63a6\")",
"order_12345"
]
}'Response Format
Returns an array of ValidationResult objects:
[
{
"DocID": "order_12345",
"Status": "mismatch",
"Reason": "Field content differs"
}
]Retrieves the current aggregated counters for the entire validation process.
| Detail | Value |
|---|---|
| Path | /validate/stats |
| Method | GET |
| Use Case | Monitoring validation throughput and cumulative mismatch counts |
Example
curl -X GET http://localhost:8080/validate/statsResponse Format (Simplified)
[
{
"namespace": "db.collection",
"totalChecked": 100000,
"validCount": 99990,
"mismatchCount": 10,
"lastValidatedAt": "2025-12-05T18:00:00Z"
}
]Triggers a background job to re-validate every document currently recorded as a failure.
| Detail | Value |
|---|---|
| Path | /validate/retry |
| Method | POST |
| Use Case | Recalculate mismatch status after cleaning or patching data |
Example
curl -X POST http://localhost:8080/validate/retryResponse Format
{
"status": "accepted",
"message": "Queued re-validation check for 15 documents. This process updates status but does not repair data."
}Resets or reconciles validation statistics and failure logs.
| Detail | Value |
|---|---|
| Path | /validate/reset |
| Method | POST |
| Use Case | Correcting drift in validation counters or clearing all validation history |
| Mode | Query Parameter | Description |
|---|---|---|
| Reconcile (Default) | (none) | Re-calculates cumulative stats based on the current count of records in validation_failures. |
| Erase (Hard Reset) | ?mode=erase |
Wipes all validation stats and failure records from the metadata DB. |
Example (Default – Reconcile)
curl -X POST http://localhost:8080/validate/resetExample (Hard Reset)
curl -X POST http://localhost:8080/validate/reset?mode=eraseThe /status endpoint now provides granular metrics for CDC operations and the Flow Control system.
New Response Fields:
flowControl:isPaused: Boolean indicating if docStreamer is currently throttling writes.pauseReason: The specific threshold (Queued Ops or Resident MB) that triggered the pause.currentQueuedOps: The highest global lock queue depth detected across the cluster.
insertedDocs/updatedDocs/deletedDocs: Individual counters for each operation type applied during CDC.validation.pendingMismatches: The real-time count of document IDs currently flagged as out of sync.migrationFinalized:trueif the migration has been successfully finalized.indexing: An object tracking the live progress of index creation (isIndexing,currentNamespace,completedCollections,totalCollections,completed).
Retrieves a list of all document IDs currently recorded as validation mismatches.
| Detail | Value |
|---|---|
| Path | /validate/failures |
| Method | GET |
# Example Request
curl -X GET http://localhost:8080/validate/failuresReturns real-time metrics on the internal validation buffer and its backpressure state.
| Detail | Value |
|---|---|
| Path | /validate/queue |
| Method | GET |
# Example Response
{
"queue_used": 150,
"queue_capacity": 2000,
"is_throttled": false,
"status_message": "Healthy"
}Triggers a full background scan of a collection to find mismatches or "orphans" (records that exist on the target but not the source).
| Detail | Value |
|---|---|
| Path | /validate/scan |
| Method | POST |
Request Body:
namespace: The collection to scan (e.g.,db.users).scan_type:"source"(Standard check) or"orphans"(Finds extra records on target).
# Example: Orphan Scan
curl -X POST http://localhost:8080/validate/scan \
-H "Content-Type: application/json" \
-d '{
"namespace": "inventory.products",
"scan_type": "orphans"
}'The /validate/adhoc endpoint now supports complex shard keys via the keys array.
# Example: Validating by Shard Key
curl -X POST http://localhost:8080/validate/adhoc \
-H "Content-Type: application/json" \
-d '{
"namespace": "ecommerce.orders",
"keys": [
{ "_id": "order_123", "region": "US" },
{ "_id": "order_456", "region": "EU" }
]
}'