This document captures all learnings from two attempted migration sessions (objectId2 -> objectId3). Use it to plan a cleaner migration after refreshing upstream.
The goal: flip type ObjectId = objectId2 to type ObjectId = objectId3 in
go/internal/echo/ids/main.go:51.
    type objectId2 struct {
        virtual     bool
        genre       genres.Genre
        middle      byte          // '/', '!', '-', '%', '@', '.'
        left, right catgut.String // e.g. left="one", middle='/', right="dos"
        repoId      catgut.String
    }

5-field struct using catgut.String. Max 255 chars. Encodes identity through
left/middle/right decomposition.
    type objectId3 struct {
        Genre genres.Genre
        Seq   doddish.Seq // token sequence, e.g. [ident("one"), op('/'), ident("dos")]
    }

2-field struct using doddish.Seq tokens. Simpler, more flexible.
Hidden Dependencies That Must Be Addressed BEFORE Flipping
These are the blocking issues discovered during the migration attempt. They should ideally be refactored in preparatory PRs on master before the actual flip.
Problem: sku.Transacted (which contains ObjectId) is gob-serialized in
multiple places. Gob encodes struct field layout. Switching ObjectId from
objectId2 to objectId3 changes the binary layout, making existing gob files
unreadable.
Where gob is used:
- romeo/store_config/persist.go:171 - gob.NewDecoder loads config-mutable
- romeo/store_config/persist.go:249 - gob.NewEncoder saves config-mutable
- juliett/sku/main.go:12 - gob.Register(Transacted{})
- juliett/sku/collections.go:20-21 - gob.Register for TransactedSet
- juliett/sku/keyers.go:17 - gob.Register for keyers
- papa/store_fs/main.go:28 - duplicate gob.Register(sku.Transacted{})
config-mutable is especially tricky: This gob file is rebuilt during
loadMutableConfig and is NOT version-gated. If the struct layout changes, the
file simply fails to decode. The fixture only has config-seed (text), and
config-mutable is rebuilt at runtime - but it's cached, so stale files from
a previous version will cause failures.
Recommended approach (pick one):
- Add a version header to config-mutable and rebuild on version mismatch
- Switch config-mutable to a non-gob format before the migration
- Delete config-mutable as part of the store version upgrade path
objectId2 WriteTo format:
genre(1 byte) | [total_len, left_len](2 bytes) | left_bytes | middle(1 byte) | right_bytes
objectId3 needs a WriteTo/ReadFrom: Currently objectId3 only has
MarshalBinary/UnmarshalBinary (which delegate to Seq's binary marshaler). It
needs WriteTo/ReadFrom that write genre_byte + seq_binary for the stream
index writeFieldWriterTo call.
Stream index pages are an INDEX, not source of truth. The inventory lists (text format) are the source of truth. So changing the binary format just requires a version bump and reindex - the pages get rebuilt from inventory lists.
Key discovery from debugging: During BATS tests, the stream index binary
decoder was NEVER called during init-workspace or organize. Objects were
being served from somewhere other than the stream index pages. Investigation
suggests objects come from inventory list parsing during store initialization,
not from pre-existing binary pages.
VCurrent is still V12 in charlie/store_version/main.go:22. The plan said
to bump to V13 but it was never done. This means:
- BATS test fixtures are copied from migration/v12/
- No reindex is triggered on version change
- Old objectId2 binary data in stream index pages is NOT rebuilt
The version bump MUST happen to ensure the stream index is rebuilt and old binary pages are discarded. Without it, the stream index pages contain objectId2 binary data that objectId3 can't read.
After bumping: Run just test-bats-generate to regenerate fixtures for the
new version.
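The reason the bump matters can be sketched as a version-gated rebuild: the index is derived data, so a version mismatch means "discard and rebuild from the inventory lists". The code below only illustrates that idea. storeVersion, index, and openIndex are hypothetical names, and strings stand in for real objects.

```go
package main

import "fmt"

// storeVersion is a hypothetical stand-in for the real store version type.
type storeVersion int

const (
	v12 storeVersion = 12
	v13 storeVersion = 13
)

// index models the stream index as derived data: it can always be rebuilt
// from the inventory lists, which remain the source of truth.
type index struct {
	builtFor storeVersion
	ids      []string
}

// openIndex reuses the existing index only when it was built for the
// current version. Otherwise it discards the old pages and rebuilds from
// the inventory entries. The bool reports whether a rebuild happened.
func openIndex(existing *index, current storeVersion, inventory []string) (*index, bool) {
	if existing != nil && existing.builtFor == current {
		return existing, false
	}
	rebuilt := &index{builtFor: current}
	rebuilt.ids = append(rebuilt.ids, inventory...)
	return rebuilt, true
}

func main() {
	inventory := []string{"one/dos", "!md"}
	stale := &index{builtFor: v12, ids: []string{"<objectId2 bytes>"}}

	idx, rebuilt := openIndex(stale, v13, inventory)
	fmt.Println(rebuilt, idx.ids)
}
```

If the version stays at V12, the first branch is taken and the stale pages are reused, which is exactly the failure mode described above.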
objectId2 has SetLeft(string) and SetRight(string) which decompose the
ID into its constituent parts. objectId3 doesn't have these.
Only external caller: sierra/store_browser/item.go:40:
    func (item *Item) GetObjectId() *ids.ObjectId {
        var oid ids.ObjectId
        errors.PanicIfError(oid.SetLeft(item.GetKey()))
        return &oid
    }

Fix: There's already a TODO to replace this with ExternalObjectId. Do it
before the migration. Alternatively, call Set() with the full ID string
instead of SetLeft.
ids/main.go:219-231 has a fast path that type-asserts *objectId2 and copies
fields directly:
    if other, ok := other.(*objectId2); ok {
        id.genre = other.genre
        other.left.CopyTo(&id.left)
        // ...
    }

Fix: Replace with objectId3 equivalent using ResetWith or SetWithSeq.
The generic string-based fallback (lines 234+) works for any Id type, so the
fast path just needs updating.
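The pattern of a type-asserted fast path with a string-based fallback can be sketched as follows. None of these names come from the ids package: Id, seqId, textId, and copyFrom are hypothetical, and the toy Set parser only understands a single '/' separator.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Id is a hypothetical minimal interface; the real one has more methods.
type Id interface {
	String() string
}

// seqId stands in for objectId3: a genre plus a token slice.
type seqId struct {
	genre  byte
	tokens []string
}

func (id *seqId) String() string { return strings.Join(id.tokens, "") }

// Set is a toy parser: "one/dos" becomes ["one", "/", "dos"].
func (id *seqId) Set(s string) error {
	if s == "" {
		return errors.New("empty id")
	}
	id.tokens = id.tokens[:0]
	if left, right, ok := strings.Cut(s, "/"); ok {
		id.tokens = append(id.tokens, left, "/", right)
	} else {
		id.tokens = append(id.tokens, s)
	}
	return nil
}

// copyFrom copies fields directly when the source has the same concrete
// type, and otherwise falls back to re-parsing the string form. Note that
// the fallback in this sketch does not set the genre.
func (id *seqId) copyFrom(other Id) error {
	if src, ok := other.(*seqId); ok {
		id.genre = src.genre
		id.tokens = append(id.tokens[:0], src.tokens...)
		return nil
	}
	return id.Set(other.String())
}

// textId is any other Id implementation, handled by the fallback.
type textId string

func (t textId) String() string { return string(t) }

func main() {
	var a, b seqId
	if err := a.copyFrom(textId("one/dos")); err != nil { // fallback path
		panic(err)
	}
	if err := b.copyFrom(&a); err != nil { // fast path
		panic(err)
	}
	fmt.Println(len(a.tokens), b.String()) // prints: 3 one/dos
}
```

Because the fallback goes through String() and Set(), it stays correct for any Id type; the fast path is only an optimization and is the only part tied to the concrete struct.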
objectId2 has a StringSansRepo() that returns the ID without the repo prefix.
objectId3 has no repo concept in its Seq, so StringSansRepo() should just
return String().
Callers:
- echo/ids/id_stringer.go:14 - StringerSansRepo wrapper
- kilo/box_format/transacted.go:169 - formats ObjectId for box output
Fix: Add StringSansRepo() to objectId3 that returns id.Seq.String().
- echo/ids/tag.go:144-146 - TodoSetFromObjectId
- echo/ids/type.go:114-116 - TodoSetFromObjectId
Callers:
- november/queries/build_state.go:539
- romeo/store_config/main.go:197
Fix: Replace with tag.Set(objectId.String()) directly. These are already
marked TODO for removal.
These are methods that objectId2 has that external code calls on ObjectId:
| Method | Purpose | Status |
|---|---|---|
| SetBlob(string) error | Set genre to Blob | DONE |
| StringSansRepo() string | String without repo | DONE |
| MarshalText() / UnmarshalText() | Text serialization | DONE |
| Clone() *objectId3 | Pool-managed clone | DONE |
| WriteTo(io.Writer) (int64, error) | Binary stream write | DONE (genre + seq_len + seq_data) |
| ReadFrom(io.Reader) (int64, error) | Binary stream read | DONE (with Seq reset) |
| SetWithGenre(string, GenreGetter) error | Set with genre hint | DONE (genre dispatch) |
All 5 bugs have been resolved. Bugs 1 and 3 were fixed by implementing the missing methods with the correct behavior. Bugs 2, 4, and 5 were already addressed in the existing code.
ReadFrom resets id.Seq at the start before reading new data.
Committed in 7d03bd7e0.
UnmarshalBinary already handles ErrEmptySeq (object_id3.go lines 255-258).
SetWithGenre dispatches to SetType for types, SetBlob for blobs, and
falls back to generic Set for other genres. Committed in 7d03bd7e0.
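The dispatch described here can be sketched as a switch on the genre hint. This is an illustration only: the genre constants, the genreGetter interface and its GetGenre method, and the inference rules inside set are assumptions of this sketch, not the ids package API.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type genre int

const (
	genreUnknown genre = iota
	genreZettel
	genreType
	genreBlob
)

// genreGetter models a genre hint; the method name is assumed.
type genreGetter interface {
	GetGenre() genre
}

func (g genre) GetGenre() genre { return g }

// sketchId is a hypothetical stand-in for objectId3.
type sketchId struct {
	genre genre
	value string
}

// setType stores the value with a leading '!' whether or not one was given.
func (id *sketchId) setType(v string) error {
	if v == "" {
		return errors.New("empty type id")
	}
	id.genre = genreType
	id.value = "!" + strings.TrimPrefix(v, "!")
	return nil
}

func (id *sketchId) setBlob(v string) error {
	if v == "" {
		return errors.New("empty blob id")
	}
	id.genre, id.value = genreBlob, v
	return nil
}

// set infers the genre from the string alone (toy rules for this sketch).
func (id *sketchId) set(v string) error {
	switch {
	case strings.HasPrefix(v, "!"):
		return id.setType(v)
	case strings.Contains(v, "/"):
		id.genre, id.value = genreZettel, v
		return nil
	default:
		return fmt.Errorf("cannot infer genre for %q", v)
	}
}

// setWithGenre uses the hint where the string alone is ambiguous and falls
// back to the generic set for every other genre.
func (id *sketchId) setWithGenre(v string, hint genreGetter) error {
	switch hint.GetGenre() {
	case genreType:
		return id.setType(v)
	case genreBlob:
		return id.setBlob(v)
	default:
		return id.set(v)
	}
}

func main() {
	var id sketchId
	fmt.Println(id.set("abc123") != nil)              // no sigil: inference fails
	fmt.Println(id.setWithGenre("abc123", genreBlob)) // the hint resolves it
	fmt.Println(id.setWithGenre("md", genreType), id.value)
}
```

The hint matters most for strings that carry no sigil of their own, such as a blob digest, where inference from the text alone has nothing to go on.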
ValidateSeqAndGetGenre already matches ident.ident with numeric parts as
genres.InventoryList (main.go lines 369-385).
Current IsEmpty() behavior (id.Seq.Len() == 0) is correct. The "/"
placeholder has a non-empty Seq [op('/')] with objectId3, which is the
desired behavior.
File extensions treated as seq when filenames have spaces. The doddish scanner
parses add.md as an unsupported seq pattern instead of treating it as a
filename/path.
Truncated numeric/path seqs rejected during config loading. When inventory list
ObjectIds (TAI timestamps) are parsed, the trailing . causes issues.
Abbreviation index (zettel_id_index) rejects non-zettel seqs.
Output shows [2149773475. !toml-type-v1] instead of [!md !toml-type-v1].
The type definition's ObjectId shows as a TAI timestamp instead of the actual
type ID like !md. This was the most pervasive and hardest to debug.
Root cause investigation: The organize command's data path goes through
QueryTransactedAsSkuType -> executeInternalQuerySkuType ->
FuncPrimitiveQuery -> streamIndex.ReadPrimitiveQuery. But debug logging
showed the stream index binary decoder was NEVER called. Objects must be
coming from somewhere else - likely from inventory list text parsing during
store initialization.
Possible root causes:
- The store initialization reads inventory lists to populate the stream index. During this process, inventory_list_store/main.go:185 sets object.ObjectId.SetWithSeq(tai.ToSeq()) for the inventory LIST object itself. If pool management is incorrect, this TAI could leak into content objects.
- The box format text parser within inventory list coders may be calling Set() on objectId3 with a string that gets misidentified by ValidateSeqAndGetGenre.
- Config loading via gob may produce corrupted ObjectIds that propagate.
Type resolution failure - the coder system can't find a handler for !md.
Empty type panic, type mismatch, missing output.
    organize command
      -> store.QueryTransactedAsSkuType(query)
        -> executor.ExecuteTransactedAsSkuType()
          -> if isDotOperatorActive() && WorkspaceStore != nil:
               executeExternalQueryCheckedOut()  // workspace files
             else:
               executeInternalQuerySkuType()     // stream index
                 -> FuncPrimitiveQuery()
                   -> streamIndex.ReadPrimitiveQuery()
                     -> for each page: makeStreamPageReader()
                       -> makeSeqObjectFromReader()
                         -> decoder.readFormatAndMatchSigil()  // binary decode
- Source of truth: Inventory lists (text format, in .dodder/local/share/inventory_lists_log)
- Index (rebuilt): Stream index binary pages (in .dodder/local/share/objects_index/Page-N)
- Cache (rebuilt): Config mutable (gob, in .dodder/local/share/config-mutable)
- Initialize inventory list store
- Make working list
- Create zettel ID index
- Create stream index via stream_index.MakeIndex (lazy - doesn't read pages)
- Stream index pages are read lazily on first query
- Explicit dodder reindex command
- Store version change triggers SetNeedsFlushHistory during Unlock()
- Unlock() calls FlushInventoryList() which only adds the inventory list object itself (not contents) to the stream index
- Refactor store_browser: Replace item.GetObjectId()/SetLeft(). DONE (replaced with i.String(), deleted GetObjectId() method)
- Remove TodoSetFromObjectId: Replace callers with tag.Set(oid.String()). DONE (deleted methods from tag.go and type.go)
- Add version-aware config-mutable: Either add a version header or switch to a rebuildable format (DEFERRED, needs separate plan)
- Add StringSansRepo to objectId3: Trivial method returning String(). DONE
- SetBlob: DONE
- WriteTo/ReadFrom (genre byte + seq_len + seq binary): DONE
- SetWithGenre with proper genre dispatch: DONE
- MarshalText/UnmarshalText: DONE
- Clone: DONE
- Change VCurrent from V12 to V13
- Regenerate BATS fixtures with just test-bats-generate
- Verify migration tests cover the version bump path
- Change type ObjectId = objectId2 to type ObjectId = objectId3
- Change GetObjectIdPool() to return getObjectIdPool3()
- Update SetObjectIdOrBlob type assertion
- Fix compilation errors
- Run go test ./... first (unit tests)
- Run just test-bats sequentially (--jobs 1) to avoid test interference
- Debug failures category by category
- Delete objectId2.go
- Delete poolObjectId2 / getObjectIdPool2()
- Consider removing SeqId alias (redundant with ObjectId)
- Remove temporary debug logging
- Always run BATS tests with --jobs 1 to eliminate parallel test interference
- BATS tests need DODDER_BIN set and the debug binary on PATH
- Use nix develop .#go --command bash -c "just build" to build
- Regenerate fixtures after any store version or binary format change
- The info store-version command determines which fixture directory is used
- Debug logging with fmt.Fprintf(os.Stderr, ...) works; ui.Log().Print() requires debug level flags