Conversation
| * @private | ||
| */ | ||
| checkTornWrite() { | ||
| let position = this.storage.length; |
To fix torn writes (which should be a storage-level concern) we need to infer the position from the actual documents in the storage rather than from the information in the index. Hence the storage needs to read the last document of each partition and return the highest sequenceNumber found in those document headers.
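A minimal sketch of that idea, assuming a hypothetical in-memory model in which each partition is an array of documents with a `header.sequenceNumber` field and sequence numbers start at 1. The function name `highestSequenceNumber` is illustrative and not part of the library:

```javascript
// Hypothetical model: a partition is an array of documents, each carrying
// the sequenceNumber it was assigned at write time in its header.
function highestSequenceNumber(partitions) {
    let highest = 0; // assumption: 0 means "no documents at all"
    for (const documents of partitions) {
        if (documents.length === 0) {
            continue; // an empty partition contributes nothing
        }
        // Only the last document of each partition needs to be read.
        const last = documents[documents.length - 1];
        highest = Math.max(highest, last.header.sequenceNumber);
    }
    return highest;
}

const partitions = [
    [{ header: { sequenceNumber: 1 } }, { header: { sequenceNumber: 4 } }],
    [{ header: { sequenceNumber: 2 } }, { header: { sequenceNumber: 3 } }],
    []
];
const inferredPosition = highestSequenceNumber(partitions);
```

The position is derived from the documents alone, so it stays correct even if the index was flushed before or after the partition data.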
| const indexes = []; | ||
| let leastConsistentEntry = this.storage.index.lastEntry; | ||
| this.storage.forEachSecondaryIndex(index => { |
Note: Updating secondary indexes is problematic, because they can fall behind the primary index. However, we do not know whether an index has actually fallen behind or whether it simply did not match any later documents. In the worst case, a secondary index matches only the first document written and no other, which would mean the whole storage needs to be reindexed.
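The worst case can be illustrated with a small sketch. It assumes, hypothetically, that a secondary index is an ascending array of primary sequence numbers (starting at 1) and that `primaryLength` is the number of entries in the primary index; `rescanRange` is an illustrative name, not a library function:

```javascript
// Because a secondary index cannot tell "fell behind" apart from
// "nothing matched", everything after its last entry has to be rescanned.
function rescanRange(secondaryEntries, primaryLength) {
    const lastIndexed = secondaryEntries.length > 0
        ? secondaryEntries[secondaryEntries.length - 1]
        : 0;
    const from = lastIndexed + 1;
    // Number of documents that must be re-read and re-matched.
    const documentsToScan = Math.max(0, primaryLength - from + 1);
    return { from, until: primaryLength, documentsToScan };
}

// Worst case: the index matched only the very first document.
const worstCase = rescanRange([1], 1000);
// Best case: the index matched the most recent document.
const upToDate = rescanRange([1, 999, 1000], 1000);
```

In the worst case 999 of 1000 documents are scanned again even though the index may have been perfectly consistent.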
| }; | ||
| for (let event of events) { | ||
| const eventMetadata = Object.assign({ commitId, committedAt }, metadata, { commitVersion, streamVersion }); | ||
| const eventMetadata = Object.assign({ commitId, committedAt }, metadata, { commitVersion, commitSize, streamVersion }); |
This needs to be rewritten to only deal with unfinished commits, as torn writes have already been covered by #155
@copilot continue working on this. Take the previous comments into consideration and check why the tests are currently not finishing
Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/f62b2596-872d-4f02-aedf-f9da349f654b
Fix auto-repair: remove broken incomplete code causing infinite loop in checkUnfinishedCommits
@copilot add tests that check that a repair (torn write/commit) will not lose any previously committed data.
Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/312ca871-a86e-472c-ace8-51858ee39989
Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/06d32a8b-3e1a-4041-a3a3-4a81c46bd21a
Repair secondary indexes when opened after primary index truncation
@copilot find uncovered lines in this change and improve test coverage, since coveralls is still reporting -0.3% overall coverage on this PR
…remove dead createStorage, add stale-index test Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/3c271323-430d-4355-a4ff-66d289bf2e48
| try { | ||
| this.readFrom(after); | ||
| // copy all truncated documents to some delete log | ||
| this.branchOff('truncated-' + Date.now(), after); |
@copilot actually, putting this into the try was wrong. It needs to be guaranteed that when the partition file is truncated, the truncated part is stored into a new file (which is not supposed to be read by the library, but for manual recovery attempts). This is important when recovery finds corrupted data. The event store needs to return to a consistent state, but the malformed data might still contain valuable information.
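One way to obtain that guarantee is to copy the raw bytes rather than parsed documents, and to truncate only after the copy has been written. The sketch below uses plain Node.js `fs` calls; `truncateWithBackup` and the backup file naming are hypothetical and do not reflect the library's `branchOff` implementation:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Copies everything after byte offset `after` into a side file, then
// truncates the original. The tail is copied as raw bytes, so it is
// preserved even if it does not parse as valid documents.
function truncateWithBackup(fileName, after) {
    const size = fs.statSync(fileName).size;
    if (after >= size) {
        return null; // nothing to truncate
    }
    const backupName = fileName + '.truncated-' + Date.now();
    const fd = fs.openSync(fileName, 'r');
    try {
        const tail = Buffer.alloc(size - after);
        fs.readSync(fd, tail, 0, tail.length, after);
        // If this write throws, the truncate below is never reached.
        fs.writeFileSync(backupName, tail);
    } finally {
        fs.closeSync(fd);
    }
    fs.truncateSync(fileName, after);
    return backupName;
}

// Usage with a throwaway file in the OS temp directory.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'truncate-demo-'));
const partitionFile = path.join(dir, 'partition');
fs.writeFileSync(partitionFile, 'valid-documents|torn-garbage');
const backupFile = truncateWithBackup(partitionFile, 'valid-documents'.length);
```

The ordering is the important design choice here: the partition is only shortened once the removed bytes exist in the side file, so a failure during the copy leaves the original untouched.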
…data Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/bd5b05a2-8387-4400-b7d5-aa21091a308e
Guarantee truncated data is saved before partition truncation, even when corrupted
@copilot take a look at the failing test
…y error Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/44a7f6dc-8449-4945-b8c7-edd9acbddaee
Fix bench-index: use per-call unique filenames to avoid consistency error
@copilot there is still a failing test:
Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/892aa17c-c679-47ec-8a06-df725d6f9a7c
Fix uncaught error when preRead hook throws during checkUnfinishedCommits
@copilot there is one more thing to consider: is there some scenario in which the primary index can fall short of the actual documents stored in the storage? If so, how can we detect this state efficiently at startup time? Once we can detect this condition, #254 will also allow us to repair the primary index. But in this PR we just want to detect the situation, if it's even realistic, not yet resolve it.
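One cheap startup check, sketched here under stated assumptions, compares where the primary index believes each partition ends with the actual size of that partition. Both input maps are hypothetical: `lastIndexedEnd` holds, per partition name, the byte offset just past the last indexed document, and `partitionSizes` holds the file sizes after torn writes have been repaired. `findLaggingPartitions` is an illustrative name, not the approach this PR necessarily takes:

```javascript
// Returns the names of partitions that contain bytes the primary index
// does not know about. Only sizes are compared, no documents are read.
function findLaggingPartitions(lastIndexedEnd, partitionSizes) {
    const lagging = [];
    for (const [name, size] of Object.entries(partitionSizes)) {
        // A partition missing from the index counts as indexed up to 0.
        const indexedEnd = lastIndexedEnd[name] || 0;
        if (size > indexedEnd) {
            lagging.push(name);
        }
    }
    return lagging;
}

const lagging = findLaggingPartitions(
    { a: 120, b: 64 },
    { a: 120, b: 96, c: 32 }
);
```

This only detects the condition. It must run after torn-write repair, since a torn write also leaves bytes beyond the last indexed document.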
…ent in checkTornWrites() Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/a683727c-251a-4224-9d79-a1a6f3a68323
…n convention Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/0a866da0-b3ca-485a-a159-2fe01e2a8443
…Header in checkTornWrite() Co-authored-by: albe <4259532+albe@users.noreply.github.com> Agent-Logs-Url: https://github.com/albe/node-event-storage/sessions/cd6d455a-7dd3-462e-a3be-61c6eccf8112
Detect primary index lagging behind partition data at startup
Implements automatic recovery from unfinished commits and ensures that indexes do not contain entries for documents that were lost.
Also detects if the primary index has fallen behind. A repair for that case will be added after #254 is merged.
Related to #31