For now we will stick with no transaction log, to optimize for distributed readers and minimal IO during reads. The problem with a transaction log in our case is that distributed readers would have to reapply it on every read, a disadvantage we also had with the hitchhiker-tree (in a fractal form). We have a single-writer architecture, and writing is generally fast enough that writing each new path into the tree indices in parallel during a transaction does not increase latency. The main price we pay is therefore higher storage requirements and more data written per transaction. The storage requirement can be handled by running gc regularly. Write throughput would not be much better with a transaction log either: as soon as the throughput is high enough that the log overflows on most transactions, you pay the same costs. So for now let's postpone the transaction log in favor of simplicity and optimal read performance. Under high write load Datahike also auto-batches transactions, which reduces write amplification.
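To make the trade-off concrete, here is a toy Python model of the two designs. It is only a sketch under simplifying assumptions and not Datahike's implementation: `DirectStore`, `LoggedStore`, and `capacity` are hypothetical names, the index is a plain dict rather than a persistent tree, and "cost" is just a count of writes and replayed log entries.

```python
from dataclasses import dataclass, field


@dataclass
class DirectStore:
    """Toy model (hypothetical): every transaction is written straight into the index."""
    index: dict = field(default_factory=dict)
    index_writes: int = 0

    def transact(self, changes: dict) -> None:
        for key, value in changes.items():
            self.index[key] = value
            self.index_writes += 1

    def read(self, key):
        # Returns (value, number of log entries the reader had to replay).
        return self.index.get(key), 0


@dataclass
class LoggedStore:
    """Toy model (hypothetical): transactions go to a bounded log, flushed on overflow."""
    capacity: int
    index: dict = field(default_factory=dict)
    log: list = field(default_factory=list)
    index_writes: int = 0
    log_writes: int = 0

    def transact(self, changes: dict) -> None:
        self.log.append(dict(changes))
        self.log_writes += 1
        if len(self.log) > self.capacity:
            self.flush()

    def flush(self) -> None:
        # Apply all pending log entries to the index, then empty the log.
        for entry in self.log:
            for key, value in entry.items():
                self.index[key] = value
                self.index_writes += 1
        self.log.clear()

    def read(self, key):
        # A reader must replay the pending log on top of the index on every read.
        value = self.index.get(key)
        for entry in self.log:  # oldest to newest, so later entries win
            if key in entry:
                value = entry[key]
        return value, len(self.log)
```

In this model a `LoggedStore` with spare capacity saves index writes but makes every read replay the pending entries, while a `LoggedStore` whose log overflows on each transaction performs the same index writes as `DirectStore` plus the extra log writes.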
SUMMARY
Fixes #718
Checks
Bugfix
fixes #number
Feature
fixes #number
ADDITIONAL INFORMATION