Description
I had a test that failed with the usual:
WARN: dropped CockroachInstance without cleaning it up first (there may still be a child process running and a temporary directory leaked)
WARN: temporary directory leaked: "/dangerzone/omicron_tmp/.tmpFf8lhP"
If you would like to access the database for debugging, run the following:
# Run the database
cargo xtask db-dev run --no-populate --store-dir "/dangerzone/omicron_tmp/.tmpFf8lhP/data"
# Access the database. Note the port may change if you run multiple databases.
cockroach sql --host=localhost:32221 --insecure
When I loaded up the database, its contents were inconsistent with what I expected based on the logging. More precisely:
- my test loads up 13 blueprints and then deletes 4
- the logging was pretty conclusive that it did delete the 4 before failing
- when I loaded up the database, all 13 blueprints were still there
I really couldn't see how this could happen, so I added a tokio::time::sleep right before the blown assertion to give me long enough to connect to the live database. Sure enough, the 4 blueprints were gone.
In chat, @jmpesp reported having run into this before and said that it's a result of #8275. It sounds like he routinely patches that out.
I had thought that 8275 was just disabling fsync. That would be alright because there's no host OS crash on the scene here. However, that's not the only thing this option does:
This not only disables fsync, but also disables flushing writes to the OS buffer.
That would explain things -- if cockroach is going down ungracefully, it may not have written this stuff to files at all, let alone fsync'd it.
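To make the distinction concrete, here is a minimal std-only Rust sketch of the three stages: data sitting in a process-local buffer, data flushed to the OS, and data fsync'd. It is only an analogy. It uses `BufWriter` as a stand-in for an in-process write buffer and assumes nothing about how cockroach actually structures its writes; the function name `visible_lengths` and the file name are made up for illustration.

```rust
use std::fs::{self, File};
use std::io::{BufWriter, Write};
use std::path::Path;

/// Writes `payload` through a user-space buffer and reports the file length
/// that other readers can observe at three points:
/// after the buffered write, after flush(), and after sync_all().
/// Assumes `payload` is smaller than the 64 KiB buffer.
fn visible_lengths(path: &Path, payload: &[u8]) -> std::io::Result<(u64, u64, u64)> {
    let file = File::create(path)?;
    let mut writer = BufWriter::with_capacity(64 * 1024, file);

    // The bytes are only in this process's memory; the OS has not seen them.
    writer.write_all(payload)?;
    let after_write = fs::metadata(path)?.len();

    // flush() hands the bytes to the OS (page cache). They now survive a
    // process crash, but not necessarily a host OS crash.
    writer.flush()?;
    let after_flush = fs::metadata(path)?.len();

    // sync_all() is the fsync step: it asks the OS to persist to storage.
    writer.get_ref().sync_all()?;
    let after_sync = fs::metadata(path)?.len();

    Ok((after_write, after_flush, after_sync))
}

fn main() -> std::io::Result<()> {
    let path = std::env::temp_dir().join("flush_vs_fsync_demo.bin");
    let (w, f, s) = visible_lengths(&path, b"delete 4 blueprints")?;
    println!("after write: {w}, after flush: {f}, after sync_all: {s}");
    fs::remove_file(&path)?;
    Ok(())
}
```

If the process died between the write and the flush, a later reader would see none of the payload. That matches what I saw: deletes that the live database served correctly but that were missing after reopening the store.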
We discussed this a bit in 8275 and I think we should revisit it. The Helios CI already runs on ZFS with sync=disabled, so I expect that re-enabling these writes shouldn't affect Helios CI time much. It would affect:
- local tests
- Linux CI
But aren't we putting this stuff in TMPDIR, which is usually in-memory anyway? What do you think @smklein @sunshowers?