Description
The goal is to reduce the number of messages sent to the database. Right now, every line like
`log.debug('abc')`
does an insert into the database. So, one web request might cause a dozen or even a hundred tiny inserts. This might have performance penalties.
It might be better to batch up a whole bunch of logs and only send them into the database every few seconds / minutes / some configurable amount.
Of course, if something crashes, we might lose data this way.
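One way to sketch that batching idea: a custom `logging.Handler` that buffers records and flushes when the buffer hits a size cap or an age cap. Everything below is illustrative; `flush_to_db` is a hypothetical stand-in for whatever bulk insert the database layer would provide, and here it just collects batches so the sketch is self-contained.

```python
import logging
import time

class BatchingHandler(logging.Handler):
    """Buffer log records and flush them in one batch (sketch)."""

    def __init__(self, capacity=50, interval=5.0):
        super().__init__()
        self.capacity = capacity      # flush when this many records are buffered...
        self.interval = interval      # ...or when this many seconds have elapsed
        self.buffer = []
        self.last_flush = time.monotonic()
        self.batches = []             # stands in for the database

    def emit(self, record):
        self.buffer.append(self.format(record))
        overdue = time.monotonic() - self.last_flush >= self.interval
        if len(self.buffer) >= self.capacity or overdue:
            self.flush()

    def flush(self):
        if self.buffer:
            self.flush_to_db(self.buffer)
            self.buffer = []
        self.last_flush = time.monotonic()

    def flush_to_db(self, records):
        # Hypothetical: one bulk insert instead of len(records) tiny ones.
        self.batches.append(list(records))

log = logging.getLogger("batch-demo")
handler = BatchingHandler(capacity=3)
log.addHandler(handler)
log.setLevel(logging.DEBUG)

for i in range(7):
    log.debug("message %d", i)
handler.flush()  # final flush covers the crash-loss window only partially
```

Note the trade-off from above is visible here: anything still in `self.buffer` when the process dies is lost, which is why `capacity` and `interval` would need to be configurable.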
There's a QueueHandler and a QueueListener in the standard library:
- https://docs.python.org/3/library/logging.handlers.html#queuehandler
- https://docs.python.org/3/library/logging.handlers.html#queuelistener
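For reference, the two stdlib pieces wire together like this within a single process: `QueueHandler` puts records on a queue without doing any I/O on the request thread, and `QueueListener` drains them on a background thread. The `ListHandler` sink is illustrative; in our case it would be the handler that writes to the database.

```python
import logging
import logging.handlers
import queue

class ListHandler(logging.Handler):
    """Illustrative sink; stands in for the real database handler."""
    def __init__(self):
        super().__init__()
        self.records = []
    def emit(self, record):
        self.records.append(record.getMessage())

q = queue.Queue()
sink = ListHandler()
listener = logging.handlers.QueueListener(q, sink)

log = logging.getLogger("queue-demo")
log.addHandler(logging.handlers.QueueHandler(q))
log.setLevel(logging.DEBUG)

listener.start()
log.debug("handled off the request thread")
listener.stop()  # drains remaining records before returning
```

This keeps the insert off the request thread, but note it does not batch by itself: the listener still handles records one at a time unless the sink handler buffers them.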
It might be nice if logs were sent to a queue on the box (or on a separate box?) by whatever process is doing the logging, and then a different process popped logs off that queue and sent a batch of them to the database.
It's not clear that the queue handler can work with separate processes, however.
Maybe we'd need the Python process to write to a TCP handler, and a separate listener process would build up a batch of logs and insert them.
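The stdlib does have a TCP sending side, `logging.handlers.SocketHandler`, which sends each record as a 4-byte length prefix followed by a pickled dict. The receiving side is not provided, so the listener below is a minimal sketch (adapted from the pattern in the logging cookbook); here it just collects messages into a list, where a real listener would accumulate a batch and bulk-insert it.

```python
import logging
import logging.handlers
import pickle
import socketserver
import struct
import threading
import time

received = []  # stands in for the listener's batch buffer

class LogRecordReceiver(socketserver.StreamRequestHandler):
    """Unpickle length-prefixed records sent by SocketHandler."""
    def handle(self):
        while True:
            header = self.rfile.read(4)
            if len(header) < 4:
                break
            size = struct.unpack(">L", header)[0]
            data = self.rfile.read(size)
            record = logging.makeLogRecord(pickle.loads(data))
            received.append(record.getMessage())
            # A real listener would append to a batch here and bulk-insert
            # once the batch is large or old enough.

# Port 0 = pick any free port; in practice the listener would run
# as its own process on a well-known port.
server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), LogRecordReceiver)
port = server.server_address[1]
threading.Thread(target=server.serve_forever, daemon=True).start()

log = logging.getLogger("socket-demo")
log.addHandler(logging.handlers.SocketHandler("127.0.0.1", port))
log.setLevel(logging.DEBUG)
log.debug("sent over TCP to the listener")

time.sleep(0.5)  # give the listener thread a moment to consume the record
server.shutdown()
server.server_close()
```

Because the transport is a socket, sender and listener can be separate processes, or separate boxes, which sidesteps the question of whether `QueueHandler` works across processes.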
It's likely worth researching what's already out there in terms of queues and batches. Maybe all of this is already written in some forgotten package.
It's also probably good to build some kind of benchmarking setup so we can measure the costs of the different approaches.
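A starting point for that benchmark could compare per-message inserts against one batched insert. The sketch below uses an in-memory SQLite database as a stand-in for the real one, so the absolute numbers are meaningless; the point is the shape of the comparison, which the real benchmark would run against the actual database.

```python
import sqlite3
import time

def bench(n=2000):
    """Time n single-row inserts vs. one executemany batch (sketch)."""
    rows = [("debug", f"message {i}") for i in range(n)]

    # Per-message inserts: one statement and one commit per log line.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (level TEXT, msg TEXT)")
    start = time.perf_counter()
    for row in rows:
        conn.execute("INSERT INTO log VALUES (?, ?)", row)
        conn.commit()
    single = time.perf_counter() - start
    conn.close()

    # Batched: one executemany and one commit for the whole batch.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE log (level TEXT, msg TEXT)")
    start = time.perf_counter()
    conn.executemany("INSERT INTO log VALUES (?, ?)", rows)
    conn.commit()
    batched = time.perf_counter() - start
    conn.close()

    return single, batched

single, batched = bench()
print(f"single: {single:.4f}s  batched: {batched:.4f}s")
```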