Problem
The database schema includes retry fields (retry_count, max_retries) but the worker doesn't implement retry logic:
- Worker immediately marks jobs as FAILED without checking retry count
- No exponential backoff between retries
- BullMQ built-in retry not configured
- TIMEOUT state in schema is unused
Action Items
- Worker-level retries:
  - Check retry_count < max_retries before marking FAILED
  - Increment retry_count and re-enqueue job with backoff
  - Only mark FAILED when max retries exceeded
- BullMQ configuration:
  - Configure job attempts in queue options
  - Add backoff strategy (exponential)
  - Set job timeout to use TIMEOUT state
- Error classification:
  - Distinguish transient (network) vs permanent (invalid data) errors
  - Only retry transient errors
  - Fail fast for permanent errors
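The worker-level and error-classification items above could be combined into one decision function. The following is a minimal sketch, not the project's implementation: only `retry_count`, `max_retries`, and the FAILED state come from the schema. The helper names (`isTransient`, `backoffDelay`, `nextAction`), the `RETRY` marker, the list of error codes, and the 2 s base / 60 s cap are assumptions chosen for illustration.

```javascript
// Hypothetical list of error codes treated as transient network failures.
const TRANSIENT_CODES = new Set(['ECONNRESET', 'ETIMEDOUT', 'ECONNREFUSED', 'EAI_AGAIN']);

// Treat known network error codes and HTTP 429/5xx responses as transient.
// Everything else (e.g. validation errors) is considered permanent.
function isTransient(err) {
  if (err && TRANSIENT_CODES.has(err.code)) return true;
  const status = err && err.status;
  return status === 429 || (status >= 500 && status < 600);
}

// Exponential backoff capped at maxMs: base, 2*base, 4*base, ...
function backoffDelay(retryCount, baseMs = 2000, maxMs = 60000) {
  return Math.min(baseMs * 2 ** retryCount, maxMs);
}

// Decide what the worker should do with a job that just threw `err`.
// `job` is assumed to carry the schema fields retry_count and max_retries.
function nextAction(job, err) {
  if (!isTransient(err)) {
    return { status: 'FAILED', reason: 'permanent' }; // fail fast
  }
  if (job.retry_count >= job.max_retries) {
    return { status: 'FAILED', reason: 'retries_exhausted' };
  }
  return {
    status: 'RETRY', // hypothetical marker: caller increments and re-enqueues
    retry_count: job.retry_count + 1,
    delayMs: backoffDelay(job.retry_count),
  };
}
```

The caller would persist the new `retry_count` and re-enqueue with `delayMs` when the result is `RETRY`, and write FAILED otherwise. Keeping the decision pure (no database or queue calls) makes it easy to unit test separately from the worker.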
Example Implementation
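const analysisWorker = new Worker(
  'analysis',
  async (job) => { /* ... */ },
  {
    connection: createBullMQConnection(),
    settings: {
      // Custom strategy: BullMQ applies it only to jobs enqueued with
      // backoff: { type: 'custom' }. Capped at 60 seconds.
      backoffStrategy: (attemptsMade) => Math.min(1000 * 2 ** attemptsMade, 60000),
    },
  }
);

// In job options when enqueuing:
await analysisQueue.add('analyze', data, {
  attempts: 3, // total attempts, including the first run
  timeout: 300000, // 5 minutes; verify that the installed BullMQ version honors this option
  // Built-in exponential strategy (delay doubles per attempt). Use
  // { type: 'custom' } instead to route through backoffStrategy above.
  backoff: { type: 'exponential', delay: 2000 },
});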
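If the queue library in use does not enforce a per-job timeout, the TIMEOUT state can still be populated by timing out inside the processor. Below is a minimal sketch under that assumption. `JobTimeoutError`, `withTimeout`, `runWithStatus`, and the `COMPLETED` status name are hypothetical; only TIMEOUT and FAILED appear in this issue.

```javascript
// Hypothetical error type signalling that a job should be recorded as TIMEOUT.
class JobTimeoutError extends Error {
  constructor(ms) {
    super(`Job exceeded ${ms} ms`);
    this.name = 'JobTimeoutError';
  }
}

// Reject with JobTimeoutError if `work` does not settle within `ms`.
// The timer is cleared once either side of the race settles.
function withTimeout(work, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new JobTimeoutError(ms)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Run a processor function and map its outcome to a job status string.
async function runWithStatus(processor, ms) {
  try {
    await withTimeout(processor(), ms);
    return 'COMPLETED'; // hypothetical success status name
  } catch (err) {
    return err instanceof JobTimeoutError ? 'TIMEOUT' : 'FAILED';
  }
}
```

Note that `Promise.race` only stops waiting; it does not cancel the underlying work. A long-running analysis would need its own cancellation mechanism (for example an `AbortSignal`) to release resources after a timeout.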
Files
backend/worker/analysis.worker.js
backend/controllers/webhook/handleWebhook.js
backend/prisma/schema.prisma
Related
Requested by: @yb175