Use this checklist to systematically add distributed tracing to each service in the SubStream Protocol Backend.
- OpenTelemetry SDK initialization (`src/utils/opentelemetry.js`)
- Tracing utilities module (`src/utils/tracingUtils.js`)
- HTTP tracing middleware (`src/middleware/httpTracingMiddleware.js`)
- Trace context propagation (`src/utils/traceContextPropagation.js`)
- Service instrumentation factory (`src/utils/serviceInstrumentation.js`)
- Example implementations (`src/utils/exampleServiceInstrumentation.js`)
- Comprehensive documentation
- Testing suite
- Environment configuration
Required Tracing:
- JWT token generation
- Token verification
- Login flow
- Logout flow
- Nonce generation (for SIWE)
- Signature verification
- User lookup/creation
- Session management
Integration Steps:
- Import service instrumentation:

      const { traceServiceMethods } = require('../utils/serviceInstrumentation');

- Wrap service methods:

      module.exports = traceServiceMethods(service, 'auth-service', [
        'generateMemberToken',
        'verifyToken',
        'loginWithSignature',
        'generateNonce'
      ]);

Status: Ready for Integration
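
As a rough illustration of what wrapping service methods involves, the sketch below wraps the named methods so that each call produces a span record with a name, a status, and a duration. `wrapMethodsWithSpans`, the `recordSpan` callback, and the toy service are hypothetical stand-ins; the actual `traceServiceMethods` in `src/utils/serviceInstrumentation.js` may behave differently.

```javascript
// Hypothetical stand-in for traceServiceMethods. Spans are plain objects handed
// to a recordSpan callback rather than real OpenTelemetry spans.
function wrapMethodsWithSpans(service, serviceName, methodNames, recordSpan) {
  const wrapped = Object.create(service); // untraced methods stay reachable

  for (const name of methodNames) {
    const original = service[name];
    if (typeof original !== 'function') continue; // ignore unknown method names

    wrapped[name] = function (...args) {
      const span = { name: `${serviceName}.${name}`, status: 'ok' };
      const startedAt = Date.now();
      const finish = () => {
        span.durationMs = Date.now() - startedAt;
        recordSpan(span);
      };
      const fail = (error) => {
        span.status = 'error';
        span.error = error.message;
        finish();
        throw error; // rethrow so callers still see the failure
      };

      let result;
      try {
        result = original.apply(service, args);
      } catch (error) {
        fail(error);
      }
      // Async methods: close the span once the promise settles.
      if (result && typeof result.then === 'function') {
        return result.then((value) => { finish(); return value; }, fail);
      }
      finish();
      return result;
    };
  }
  return wrapped;
}

// Usage with a toy service and an in-memory span list.
const recordedSpans = [];
const toyAuthService = {
  generateNonce() { return 'nonce-123'; },
  verifyToken(token) {
    if (!token) throw new Error('missing token');
    return true;
  },
};
const tracedAuth = wrapMethodsWithSpans(
  toyAuthService, 'auth-service', ['generateNonce', 'verifyToken'],
  (span) => recordedSpans.push(span)
);
tracedAuth.generateNonce(); // records a span named "auth-service.generateNonce"
```

Returning a wrapper object instead of mutating the service keeps the original available for tests that should run without tracing.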
Required Tracing:
- Content retrieval
- Content creation
- Content updates
- Content deletion
- Access control filtering
- Cache lookups
- View event tracking
Integration Steps:
- Add database tracing:

      const { createDatabaseTracing } = require('../utils/serviceInstrumentation');
      const dbTracing = createDatabaseTracing();

- Wrap database queries:

      async getContent(id) {
        // Illustrative query; substitute the service's actual SQL.
        const sql = 'SELECT * FROM content WHERE id = $1';
        const tracer = dbTracing.traceQuery('SELECT', 'content', sql);
        try {
          const result = await db.query(sql, [id]);
          tracer.end(result.rowCount); // record the row count on the span
          return result.rows;
        } catch (error) {
          tracer.error(error); // mark the span as failed before rethrowing
          throw error;
        }
      }

Status: Ready for Integration
Required Tracing:
- Subscription creation
- Subscription verification
- Subscription expiry checking
- Tier-based access control
- Upgrade/downgrade flows
- Payment verification
Integration Steps:
- Wrap with service instrumentation
- Add blockchain tracing for Stellar calls
- Track payment verification events
Status: Ready for Integration
Required Tracing:
- Transcoding job creation
- Queue operations
- Progress tracking
- Error handling and retries
- Output file tracking
Integration Steps:
- Add queue operation tracing:

      const { createQueueTracing } = require('../utils/serviceInstrumentation');
      const queueTracing = createQueueTracing();

- Trace job publishing/consuming
- Track processing pipeline stages
Status: Ready for Integration
Required Tracing:
- Content pinning
- Multi-region replication
- Failover attempts
- Health checks
- Cache operations
- API calls to Pinata/Web3.Storage
Integration Steps:
- Add HTTP client tracing:

      const { createHttpClientTracing } = require('../utils/serviceInstrumentation');
      const httpTracing = createHttpClientTracing();

- Set up axios with trace propagation:

      setupAxiosTracing(axios, { format: 'w3c' });

- Track failover patterns
Status: Ready for Integration
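
The `format: 'w3c'` option above presumably refers to the W3C Trace Context format, in which trace context travels in a `traceparent` HTTP header. The sketch below builds and parses a version `00` header using only Node's `crypto` module. `buildTraceparent` and `parseTraceparent` are hypothetical names for illustration and are not part of `setupAxiosTracing`.

```javascript
const crypto = require('crypto');

// Build a version-00 traceparent value:
//   00-<32 hex chars trace ID>-<16 hex chars parent span ID>-<2 hex chars flags>
function buildTraceparent({ traceId, spanId, sampled = true } = {}) {
  const tid = traceId || crypto.randomBytes(16).toString('hex');
  const sid = spanId || crypto.randomBytes(8).toString('hex');
  const flags = sampled ? '01' : '00'; // bit 0 is the "sampled" flag
  return `00-${tid}-${sid}-${flags}`;
}

// Parse a version-00 traceparent value; returns null when it is malformed.
function parseTraceparent(header) {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header);
  if (!match) return null;
  const [, traceId, spanId, flags] = match;
  // All-zero trace or span IDs are invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) return null;
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Usage: attach the header to an outgoing request's headers object.
const outgoingHeaders = { traceparent: buildTraceparent() };
const incomingContext = parseTraceparent(outgoingHeaders.traceparent);
```

A downstream service that parses the same header can continue the trace by reusing `traceId` and treating `spanId` as the parent of its own span.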
Required Tracing:
- Ledger synchronization
- Event indexing
- Transaction submission
- Account queries
- Contract interactions
- Error recovery
Integration Steps:
- Add blockchain operation tracing:

      const { createBlockchainSpan } = require('../utils/tracingUtils');

- Track transaction hashes and ledger numbers
- Monitor indexer lag
Status: Ready for Integration
Required Tracing:
- Event recording (views, engagement)
- Aggregation queries
- Heatmap generation
- Stats calculation
- Caching of results
Integration Steps:
- Trace database aggregations
- Track cache hit/miss for analytics
- Monitor computation time
Status: Ready for Integration
Required Tracing:
- Email sending
- Webhook dispatching
- Queue operations
- Template rendering
- Retry logic
Integration Steps:
- Add HTTP tracing for email/webhook APIs
- Track queue operations
- Monitor retry attempts
Status: Ready for Integration
Required Tracing:
- Connection pooling
- Query execution
- Transaction management
- Prepared statements
- Error handling
Integration Steps:
- Already instrumented via `@opentelemetry/instrumentation-pg`
- Verify attributes are correct
- Add custom tracing for critical queries
Status: Partially Complete (Core Tracing Done)
Required Tracing:
- GET operations
- SET operations
- DELETE operations
- TTL management
- Failover handling
Integration Steps:
- Already instrumented via `@opentelemetry/instrumentation-redis`
- Add custom cache span creation:

      const { createCacheSpan } = require('../utils/tracingUtils');

- Track hit/miss ratios
Status: Partially Complete (Core Tracing Done)
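
One way to track hit/miss ratios is to count lookups in a thin wrapper around the cache client. The sketch below uses a synchronous `Map` as a stand-in for the cache; a real Redis client returns promises and typically reports a miss as `null`, so the checks would need adapting. `createCacheStats` is a hypothetical helper, not part of `tracingUtils`.

```javascript
// Hypothetical wrapper that counts hits and misses around any cache
// exposing synchronous get/set (a Map is used here for illustration).
function createCacheStats(cache) {
  let hits = 0;
  let misses = 0;

  return {
    get(key) {
      const value = cache.get(key);
      if (value === undefined) {
        misses += 1;
      } else {
        hits += 1;
      }
      return value;
    },
    set(key, value) {
      cache.set(key, value);
    },
    // Snapshot suitable for attaching to a span or exporting as a metric.
    stats() {
      const total = hits + misses;
      return { hits, misses, hitRatio: total === 0 ? 0 : hits / total };
    },
  };
}

// Usage: one miss followed by two hits.
const cacheDemo = createCacheStats(new Map());
cacheDemo.get('content:1');             // miss
cacheDemo.set('content:1', { id: 1 });
cacheDemo.get('content:1');             // hit
cacheDemo.get('content:1');             // hit
const cacheDemoStats = cacheDemo.stats();
```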
Required Tracing:
- Message publishing
- Message consumption
- Dead letter queue handling
- Retry logic
Integration Steps:
- Already instrumented via `@opentelemetry/instrumentation-amqplib`
- Add queue operation tracing:

      const { createQueueTracing } = require('../utils/serviceInstrumentation');

Status: Partially Complete (Core Tracing Done)
Required Tracing:
- Rate limit checks
- Quota calculation
- Token bucket updates
- Rejection handling
Integration Steps:
- Add custom span for rate limit checks
- Track quota usage patterns
Status: Ready for Integration
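
To make the rate limit check concrete, here is a minimal token bucket sketch whose result object holds the values a custom rate-limit span might carry as attributes. The attribute names (`rate_limit.allowed` and so on) and the `createTokenBucket` helper are assumptions for illustration, not the project's actual rate limiter.

```javascript
// Token bucket with an injectable clock so behaviour is deterministic in tests.
function createTokenBucket({ capacity, refillPerSecond, now = Date.now }) {
  let tokens = capacity;
  let lastRefill = now();

  return function check(cost = 1) {
    // Refill in proportion to elapsed time, capped at capacity.
    const current = now();
    const elapsedSeconds = (current - lastRefill) / 1000;
    tokens = Math.min(capacity, tokens + elapsedSeconds * refillPerSecond);
    lastRefill = current;

    const allowed = tokens >= cost;
    if (allowed) tokens -= cost;

    // Illustrative attribute names for a rate-limit span.
    return {
      'rate_limit.allowed': allowed,
      'rate_limit.remaining': Math.floor(tokens),
      'rate_limit.capacity': capacity,
    };
  };
}

// Usage with a fake clock: two requests fit in the bucket, the third is rejected,
// and one second later a single token has been refilled.
let fakeTimeMs = 0;
const checkLimit = createTokenBucket({ capacity: 2, refillPerSecond: 1, now: () => fakeTimeMs });
const firstCheck = checkLimit();
const secondCheck = checkLimit();
const thirdCheck = checkLimit();
fakeTimeMs = 1000;
const afterRefill = checkLimit();
```

Recording the remaining quota on every check, not only on rejections, makes it possible to see usage patterns before limits are hit.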
Required Tracing:
- Invoice generation
- Payment processing (Stripe)
- Dunning management
- Payout calculations
Integration Steps:
- Add HTTP client tracing for Stripe API:

      setupAxiosTracing(stripeClient, { format: 'w3c' });

- Track payment state transitions
- Monitor webhook receipts
Status: Ready for Integration
Required Tracing:
- Tenant creation
- Organization queries
- Configuration updates
- Multi-tenancy enforcement
Integration Steps:
- Add service instrumentation
- Track tenant isolation at request level
- Monitor configuration changes
Status: Ready for Integration
Required Tracing:
- Data scrubbing operations
- Encryption/decryption
- Privacy control enforcement
- Audit trail
Integration Steps:
- Add tracing to security operations
- Include operation type and result count
- Avoid tracing sensitive data content
Status: Ready for Integration
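
A simple guard against leaking sensitive content is to sanitize attributes before they are attached to a span. The sketch below redacts values whose keys look sensitive and replaces non-primitive payloads with a type placeholder, while keeping operation type and result count intact. The key pattern and the `sanitizeSpanAttributes` helper are assumptions; the real list of sensitive fields should come from the project's privacy requirements.

```javascript
// Keys matching this pattern are treated as sensitive (illustrative list only).
const SENSITIVE_KEY_PATTERN = /(password|secret|token|signature|private|authorization|email)/i;

function sanitizeSpanAttributes(attributes) {
  const safe = {};
  for (const [key, value] of Object.entries(attributes)) {
    const isPrimitive = ['string', 'number', 'boolean'].includes(typeof value);
    if (SENSITIVE_KEY_PATTERN.test(key)) {
      safe[key] = '[REDACTED]';       // keep the key, drop the value
    } else if (isPrimitive) {
      safe[key] = value;              // safe metadata passes through
    } else {
      safe[key] = `[${typeof value}]`; // never serialize payload bodies
    }
  }
  return safe;
}

// Usage: operation metadata survives, content and identifiers do not.
const sanitizedAttributes = sanitizeSpanAttributes({
  'operation.type': 'encrypt',
  'result.count': 3,
  'user.email': 'creator@example.com',
  payload: { body: 'raw content' },
});
```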
- GET `/auth/nonce` - Trace nonce generation
- POST `/auth/login` - Trace login flow
- POST `/auth/logout` - Trace logout
- POST `/auth/refresh` - Trace token refresh
Integration:

    router.post('/login', async (req, res, next) => {
      return withSpan('route.login', async (span) => {
        // Route implementation
      });
    });

Status: Ready for Integration
- GET `/content` - Trace list with filters
- GET `/content/:id` - Trace retrieval
- POST `/content` - Trace creation
- PUT `/content/:id` - Trace updates
- DELETE `/content/:id` - Trace deletion
Status: Ready for Integration
- POST `/analytics/view-event` - Trace event recording
- GET `/analytics/heatmap/:id` - Trace heatmap generation
- GET `/analytics/creator/:address` - Trace stats retrieval
Status: Ready for Integration
- POST `/storage/pin` - Trace pinning operations
- GET `/storage/content/:id` - Trace retrieval with failover
- GET `/storage/health` - Trace health checks
Status: Ready for Integration
- Environment variables template (`.env.tracing.example`)
- Production configuration (ask DevOps for endpoint)
- Staging configuration
- Development configuration (localhost)
- Kubernetes configuration example
- Docker Compose configuration example
Status: Mostly Complete
- Update Docker images with tracing
- Deploy Jaeger backend
- Configure OTLP endpoint in K8s
- Verify traces flowing to backend
- Set up Jaeger UI access
- Configure retention policies
- Set up alerts/dashboards
Status: Documentation Complete, Deployment TBD
- Main implementation guide (2000+ lines)
- Quick start guide
- Deployment guide
- Example implementations (5 complete services)
- Environment configuration
- Troubleshooting guide
- API documentation
Status: Complete
- Unit tests for tracing utilities
- Integration tests for middleware
- Performance benchmarks
- Load testing with tracing enabled
- E2E tests with trace verification
Status: Core Tests Complete
- Implement infrastructure
- Create examples
- Write documentation
- Developers start integrating services
- Local testing and validation
- Deploy to staging environment
- Verify trace collection
- Performance testing
- Team review and feedback
- Set up production Jaeger
- Enable with low sampling rate (1%)
- Monitor and adjust
- Expand based on learnings
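
To illustrate what a 1% sampling rate means in practice, the sketch below makes a deterministic keep/drop decision from the trace ID, so every service that sees the same trace ID reaches the same decision. This is a simplified illustration, not the OpenTelemetry SDK's sampler; in a real deployment the SDK's ratio-based sampler would normally be configured instead, for example through the standard `OTEL_TRACES_SAMPLER=parentbased_traceidratio` and `OTEL_TRACES_SAMPLER_ARG=0.01` environment variables.

```javascript
// Simplified trace-ID ratio sampling. The first 32 bits of the trace ID are
// treated as a uniformly distributed number and compared against the ratio.
function shouldSample(traceId, ratio) {
  if (ratio >= 1) return true;
  if (ratio <= 0) return false;
  const bucket = parseInt(traceId.slice(0, 8), 16); // 0 .. 2^32 - 1
  return bucket < ratio * 0x100000000;
}

// Usage: with a 1% ratio, only trace IDs in the lowest 1% of the range are kept.
const keptTrace = shouldSample('0000000a' + 'b'.repeat(24), 0.01);
const droppedTrace = shouldSample('ffffffff' + 'b'.repeat(24), 0.01);
```

Because the decision depends only on the trace ID, raising the ratio later keeps every trace that was already being sampled and adds new ones.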
Track these metrics to validate the implementation:
- 100% of services instrumented
- 0% data loss in trace export
- <5ms average latency overhead
- <2% memory overhead
- <1% CPU overhead
- 99.9% availability of tracing infrastructure
- Average trace latency <500ms
- Ability to correlate requests across services
Before marking a service as complete:
- All major methods are traced
- Errors are being captured
- Database queries show row counts
- External API calls show status codes
- Trace context is propagating
- Correlation IDs are consistent
- No sensitive data in spans
- Tests pass
- Documentation is updated
- Documentation: See `DISTRIBUTED_TRACING_GUIDE.md`
- Quick Start: See `TRACING_QUICK_START.md`
- Examples: See `src/utils/exampleServiceInstrumentation.js`
- Tests: Run `npm test -- test/distributedTracing.test.js`
- This checklist should be completed service-by-service
- Each developer can work on their own service
- Parallel integration is encouraged
- Report issues in the main tracing module early
Last Updated: April 29, 2026
Implementation Status: 🟢 Production Ready Infrastructure, Ready for Service Integration