feat: Cache Steam save file hashes between syncs#1227
Conversation
📝 WalkthroughWalkthroughThe PR introduces a hash caching system for Steam file operations by adding a new Room database entity to persist SHA-1 hashes with file metadata, implementing cache lookup logic during local scanning with hit/miss tracking, seeding the cache during cloud downloads, and integrating cleanup when apps are deleted. Changes
Sequence DiagramsequenceDiagram
participant SteamAutoCloud
participant SteamFileHashCacheDao
participant Database
participant FileSystem
rect rgba(100, 150, 200, 0.5)
note over SteamAutoCloud,FileSystem: Local File Scanning with Caching
SteamAutoCloud->>SteamAutoCloud: Collect local file paths
loop For each file
SteamAutoCloud->>SteamFileHashCacheDao: Query cache by appId & path
alt Cache Hit (size & mtime match)
SteamFileHashCacheDao-->>SteamAutoCloud: Return cached SHA
SteamAutoCloud->>SteamAutoCloud: Increment cache hit counter
else Cache Miss (not found or metadata differs)
SteamAutoCloud->>FileSystem: Compute SHA-1 hash
FileSystem-->>SteamAutoCloud: SHA bytes
SteamAutoCloud->>SteamFileHashCacheDao: Insert/update cache entry
SteamFileHashCacheDao->>Database: INSERT OR REPLACE
SteamAutoCloud->>SteamAutoCloud: Increment cache miss counter
end
end
end
rect rgba(200, 150, 100, 0.5)
note over SteamAutoCloud,FileSystem: Cloud Download with Cache Seeding
SteamAutoCloud->>FileSystem: Download file
FileSystem-->>SteamAutoCloud: Downloaded bytes
alt Size matches expected
SteamAutoCloud->>SteamFileHashCacheDao: Seed cache with SHA & metadata
SteamFileHashCacheDao->>Database: INSERT OR REPLACE
else Size mismatch
SteamAutoCloud->>SteamAutoCloud: Skip cache seeding
end
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
app/src/main/java/app/gamenative/service/SteamService.kt (1)
1180-1189: Also clear hash-cache rows for indirect DLC app IDs during app deletion.You already clear other per-app sync tables for indirect DLC entries in the loop; extending hash-cache cleanup there keeps DB state fully consistent.
♻️ Suggested patch
val indirectDlcAppIds = getDownloadableDlcAppsOf(appId).orEmpty().map { it.id } indirectDlcAppIds.forEach { dlcAppId -> appInfoDao.deleteApp(dlcAppId) changeNumbersDao.deleteByAppId(dlcAppId) fileChangeListsDao.deleteByAppId(dlcAppId) + db.steamFileHashCacheDao().deleteByAppId(dlcAppId) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/main/java/app/gamenative/service/SteamService.kt` around lines 1180 - 1189, The indirect DLC cleanup loop misses removing rows from the Steam file-hash cache, causing stale entries; update the loop that iterates indirectDlcAppIds (from getDownloadableDlcAppsOf) to also call db.steamFileHashCacheDao().deleteByAppId(dlcAppId) for each dlcAppId alongside appInfoDao.deleteApp, changeNumbersDao.deleteByAppId and fileChangeListsDao.deleteByAppId so hash-cache rows for indirect DLCs are cleared during app deletion.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/main/java/app/gamenative/service/SteamAutoCloud.kt`:
- Around line 561-569: The code seeds the hash cache with the remote SHA
(file.shaFile) without verifying the downloaded bytes; update the logic around
SteamAutoCloud to only insert a SteamFileHashCache row after validating the file
on disk—either verify Files.size(actualFilePath) matches the expected size
and/or compute the SHA from the actualFilePath bytes (or compute during
streaming) and use that computed digest for hashCacheDao.insert; ensure this
prevents getCachedShaOrHash or hasHashConflicts from returning unverified remote
SHAs by replacing file.shaFile with the verified digest and only inserting after
validation (use actualFilePath, Files.size, Files.getLastModifiedTime, and
SteamFileHashCache in your checks).
In `@app/src/test/java/app/gamenative/service/SteamAutoCloudTest.kt`:
- Around line 593-602: The three cache assertions calling
db.steamFileHashCacheDao().getByAppIdAndPath(...) are inside a disabled test
(the `@Test` is commented out), so they never execute; fix by either re-enabling
that test (uncomment the `@Test` annotation) or moving these assertions into an
enabled test such as firstBoot_multipleCloudFiles_downloadsAll() so the
"download seeds steam_file_hash_cache" behavior is covered; ensure you keep the
exact calls/assertArrayEquals/assertNotNull checks (cacheEntry1/2/3 and
cloudFile1Sha/cloudFile2Sha/cloudFile3Sha) and any required setup variables in
scope when relocating.
---
Nitpick comments:
In `@app/src/main/java/app/gamenative/service/SteamService.kt`:
- Around line 1180-1189: The indirect DLC cleanup loop misses removing rows from
the Steam file-hash cache, causing stale entries; update the loop that iterates
indirectDlcAppIds (from getDownloadableDlcAppsOf) to also call
db.steamFileHashCacheDao().deleteByAppId(dlcAppId) for each dlcAppId alongside
appInfoDao.deleteApp, changeNumbersDao.deleteByAppId and
fileChangeListsDao.deleteByAppId so hash-cache rows for indirect DLCs are
cleared during app deletion.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: aeab14cd-65a8-4253-9857-7c576748db3e
📒 Files selected for processing (9)
app/schemas/app.gamenative.db.PluviaDatabase/20.jsonapp/src/main/java/app/gamenative/data/PostSyncInfo.ktapp/src/main/java/app/gamenative/data/SteamFileHashCache.ktapp/src/main/java/app/gamenative/db/PluviaDatabase.ktapp/src/main/java/app/gamenative/db/dao/SteamFileHashCacheDao.ktapp/src/main/java/app/gamenative/di/DatabaseModule.ktapp/src/main/java/app/gamenative/service/SteamAutoCloud.ktapp/src/main/java/app/gamenative/service/SteamService.ktapp/src/test/java/app/gamenative/service/SteamAutoCloudTest.kt
There was a problem hiding this comment.
1 issue found across 9 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="app/src/main/java/app/gamenative/service/SteamAutoCloud.kt">
<violation number="1" location="app/src/main/java/app/gamenative/service/SteamAutoCloud.kt:567">
P1: Download path seeds hash cache with remote SHA (`file.shaFile`) instead of hashing the saved file, which can hide corrupted/truncated local downloads on later cache hits.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
app/src/main/java/app/gamenative/service/SteamAutoCloud.kt (2)
61-64: Consider using a regular class or addingcontentEqualsforByteArraycomparison.
HashLookupResultis adata classwith aByteArrayfield. Kotlin's generatedequals()andhashCode()will use reference equality for the array, so two instances with identical SHA content won't compare equal. This isn't a problem with the current usage (callers only access properties directly), but it's a subtle gotcha if the class is ever used in collections or comparisons.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/main/java/app/gamenative/service/SteamAutoCloud.kt` around lines 61 - 64, HashLookupResult is defined as a data class containing a ByteArray (sha), which causes generated equals()/hashCode() to use reference equality; to fix, replace the data class with a regular class or keep it but override equals() and hashCode() to use sha.contentEquals(other.sha) and sha.contentHashCode(), ensuring logical equality and stable hashing for HashLookupResult instances when compared or used in collections.
414-417: Cache stats may include multiple scans.
getLocalUserFilesAsPrefixMapcan be called twice during a sync (initial scan at line 853, and post-download validation at line 890). The second invocation logs cumulative stats from both scans, which might be misleading. Consider either resetting counters before each scan or adjusting the log message to indicate cumulative stats.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/main/java/app/gamenative/service/SteamAutoCloud.kt` around lines 414 - 417, The logged cache stats for getLocalUserFilesAsPrefixMap are cumulative across multiple scans because hashCacheHits and hashCacheMisses are not reset between the initial scan and the post-download validation; either reset these AtomicInteger counters at the start of each scan (clear hashCacheHits and hashCacheMisses before calling getLocalUserFilesAsPrefixMap) or change the Timber.i message near the end of getLocalUserFilesAsPrefixMap to indicate the values are cumulative (e.g., prepend "cumulative" or include a scan identifier) so callers know the stats span multiple runs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@app/src/main/java/app/gamenative/service/SteamAutoCloud.kt`:
- Around line 61-64: HashLookupResult is defined as a data class containing a
ByteArray (sha), which causes generated equals()/hashCode() to use reference
equality; to fix, replace the data class with a regular class or keep it but
override equals() and hashCode() to use sha.contentEquals(other.sha) and
sha.contentHashCode(), ensuring logical equality and stable hashing for
HashLookupResult instances when compared or used in collections.
- Around line 414-417: The logged cache stats for getLocalUserFilesAsPrefixMap
are cumulative across multiple scans because hashCacheHits and hashCacheMisses
are not reset between the initial scan and the post-download validation; either
reset these AtomicInteger counters at the start of each scan (clear
hashCacheHits and hashCacheMisses before calling getLocalUserFilesAsPrefixMap)
or change the Timber.i message near the end of getLocalUserFilesAsPrefixMap to
indicate the values are cumulative (e.g., prepend "cumulative" or include a scan
identifier) so callers know the stats span multiple runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 3e8850a5-18e2-45ee-8385-0cd471e4631a
📒 Files selected for processing (1)
app/src/main/java/app/gamenative/service/SteamAutoCloud.kt
There was a problem hiding this comment.
1 issue found across 9 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="app/src/main/java/app/gamenative/service/SteamService.kt">
<violation number="1" location="app/src/main/java/app/gamenative/service/SteamService.kt:1180">
P2: deleteApp() cleans hash cache only for the parent appId, but also removes DLC app records, leaving possible stale steam_file_hash_cache rows for DLC appIds.</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
|
There are conflicts. I'll review this again with fresh eyes since it needs to be bug-free |
57c6f82 to
c2bda32
Compare
|
@coderabbitai resume |
✅ Actions performedReviews resumed. |
|
@kiequoo conflicts |
- Fix ByteArray equals/hashCode in SteamFileHashCache data class - Replace runBlocking with runBlocking(Dispatchers.IO) to prevent potential deadlock under IO thread pool pressure - Replace var hashCacheHits/hashCacheMisses with AtomicInteger vals - Remove unused getAllByAppId DAO method - Wire deleteByAppId into deleteApp transaction for cache eviction - Add test for NoSuchFileException on missing file - Use imported SteamFileHashCacheDao type in getCachedShaOrHash signature
Replace runBlocking(Dispatchers.IO) bridge calls inside Java stream lambdas with a suspend lambda + for-loop pattern, eliminating the thread-starvation risk. Hoist hashCacheDao to the outer async scope so downloadFiles can reuse it instead of fetching the DAO again.
Trust the on-disk bytes rather than the manifest SHA: check the written file size matches the expected raw size before inserting into the cache, and compute the SHA from the actual file content. This prevents a truncated download from poisoning the cache with a remote SHA that could make hasHashConflicts() treat a bad file as valid on the next sync.
…sAll The assertions verifying that downloads seed steam_file_hash_cache were in a disabled test. Move them into the active firstBoot test and update them to compare against sha1(fileContent) rather than the manifest SHA, consistent with the cache now computing the digest from actual on-disk bytes.
The indirect DLC cleanup loop was missing a deleteByAppId call for the hash cache, leaving stale steam_file_hash_cache rows for DLC appIds.
8f7a5d9 to
9ed4d59
Compare
@utkarshdalal rebased! |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
app/src/main/java/app/gamenative/service/SteamAutoCloud.kt (1)
325-439: Prune cache rows for save files that disappeared.This scan only upserts paths that still exist. Deleted or rotating autosave files never remove their old
steam_file_hash_cacherows, so the table will grow monotonically until an app-level clear. Please collect the seen absolute paths here and delete the remainder forappInfo.idafter the scan.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@app/src/main/java/app/gamenative/service/SteamAutoCloud.kt` around lines 325 - 439, Collect all absolute file paths seen during both the per-pattern loop and the SteamUserData recursive scan inside getLocalUserFilesAsPrefixMap (capture path.toString() for each path processed), then after the scans and before returning result call a hash-cache cleanup on hashCacheDao to remove any steam_file_hash_cache rows for appInfo.id whose path is NOT in the seen set (implement a DAO method such as deleteByAppIdExcludingPaths(appId: Int, keepPaths: Collection<String>) or deleteWhereAppIdAndPathNotIn(appId, keepPaths) if missing), ensuring the deletion runs once per scan and uses the same appInfo.id used in getCachedShaOrHash.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@app/src/main/java/app/gamenative/service/SteamAutoCloud.kt`:
- Around line 434-437: The current Timber.i inside
getLocalUserFilesAsPrefixMap() logs hashCacheHits/hashCacheMisses per scan but
those Atomic counters are shared across the whole sync, producing misleading
partial totals; fix by logging once per sync instead: remove or stop logging
from getLocalUserFilesAsPrefixMap() and instead emit the Timber.i after the full
sync completes (in the method that orchestrates the sync), or alternatively
snapshot and reset hashCacheHits/hashCacheMisses at the start of each
getLocalUserFilesAsPrefixMap() invocation so the logged values represent only
that scan; reference getLocalUserFilesAsPrefixMap(), hashCacheHits,
hashCacheMisses, and the sync orchestration method to relocate the log.
---
Nitpick comments:
In `@app/src/main/java/app/gamenative/service/SteamAutoCloud.kt`:
- Around line 325-439: Collect all absolute file paths seen during both the
per-pattern loop and the SteamUserData recursive scan inside
getLocalUserFilesAsPrefixMap (capture path.toString() for each path processed),
then after the scans and before returning result call a hash-cache cleanup on
hashCacheDao to remove any steam_file_hash_cache rows for appInfo.id whose path
is NOT in the seen set (implement a DAO method such as
deleteByAppIdExcludingPaths(appId: Int, keepPaths: Collection<String>) or
deleteWhereAppIdAndPathNotIn(appId, keepPaths) if missing), ensuring the
deletion runs once per scan and uses the same appInfo.id used in
getCachedShaOrHash.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1e401448-41dd-48ed-bb5d-41aa81ef6f47
📒 Files selected for processing (9)
app/schemas/app.gamenative.db.PluviaDatabase/21.jsonapp/src/main/java/app/gamenative/data/PostSyncInfo.ktapp/src/main/java/app/gamenative/data/SteamFileHashCache.ktapp/src/main/java/app/gamenative/db/PluviaDatabase.ktapp/src/main/java/app/gamenative/db/dao/SteamFileHashCacheDao.ktapp/src/main/java/app/gamenative/di/DatabaseModule.ktapp/src/main/java/app/gamenative/service/SteamAutoCloud.ktapp/src/main/java/app/gamenative/service/SteamService.ktapp/src/test/java/app/gamenative/service/SteamAutoCloudTest.kt
✅ Files skipped from review due to trivial changes (2)
- app/src/main/java/app/gamenative/di/DatabaseModule.kt
- app/src/main/java/app/gamenative/db/dao/SteamFileHashCacheDao.kt
🚧 Files skipped from review as they are similar to previous changes (3)
- app/src/main/java/app/gamenative/data/PostSyncInfo.kt
- app/src/main/java/app/gamenative/db/PluviaDatabase.kt
- app/src/main/java/app/gamenative/data/SteamFileHashCache.kt
| Timber.i( | ||
| "Local save hash cache stats for ${appInfo.id} (${appInfo.name}): " + | ||
| "hits=${hashCacheHits.get()}, misses=${hashCacheMisses.get()}, files=${hashCacheHits.get() + hashCacheMisses.get()}", | ||
| ) |
There was a problem hiding this comment.
Log cache stats once per sync, not once per scan.
getLocalUserFilesAsPrefixMap() runs more than once in some sync paths, while hashCacheHits/hashCacheMisses are shared across the whole sync. Logging here emits partial totals on the first scan and cumulative totals on later rescans, so the advertised per-sync metric is misleading.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@app/src/main/java/app/gamenative/service/SteamAutoCloud.kt` around lines 434
- 437, The current Timber.i inside getLocalUserFilesAsPrefixMap() logs
hashCacheHits/hashCacheMisses per scan but those Atomic counters are shared
across the whole sync, producing misleading partial totals; fix by logging once
per sync instead: remove or stop logging from getLocalUserFilesAsPrefixMap() and
instead emit the Timber.i after the full sync completes (in the method that
orchestrates the sync), or alternatively snapshot and reset
hashCacheHits/hashCacheMisses at the start of each
getLocalUserFilesAsPrefixMap() invocation so the logged values represent only
that scan; reference getLocalUserFilesAsPrefixMap(), hashCacheHits,
hashCacheMisses, and the sync orchestration method to relocate the log.
Description
Each sync,
SteamAutoCloudneeds the SHA-1 of every local save file to diff against what Steam Cloud has. Previously it streamed and hashed every file on every sync, even when nothing had changed.This PR adds a Room-backed hash cache (
steam_file_hash_cachetable, DB v20). Each row stores the absolute path, file size, last-modified time, and SHA-1 for a given app's save file. Before hashing a file, the cache is checked: if the stored size and mtime match what's on disk, the file hasn't changed and the cached SHA is returned directly — no file read needed. On a miss, the file is hashed as before and the result is written back to the cache.After a download, the cache is seeded so the next sync gets an immediate hit. The SHA is computed from the actual written bytes (not the manifest value), and only inserted if the on-disk file size matches the expected size — so a truncated or corrupted download can't poison the cache.
When an app is fully re-downloaded, the cache for that app is cleared so stale entries don't survive a game update.
Hit/miss counts are tracked per-sync and logged at the end:
Recording
Type of Change
Checklist
#code-changes, I have discussed this change there and it has been green-lighted. If I do not have access, I have still provided clear context in this PR. If I skip both, I accept that this change may face delays in review, may not be reviewed at all, or may be closed.CONTRIBUTING.md.Summary by CodeRabbit
New Features
Infrastructure