README.md: 69 additions & 0 deletions
@@ -232,6 +232,75 @@ See [uffs-mft README](crates/uffs-mft/README.md) for detailed documentation.

---
## 🔥 What Makes UFFS Blazing Fast
UFFS employs multiple layers of optimization to achieve maximum performance when reading the NTFS Master File Table:
### 1. Direct MFT Access with `FILE_FLAG_NO_BUFFERING`
Instead of using Windows file enumeration APIs, UFFS opens the raw volume and reads the MFT directly using unbuffered I/O. This bypasses the Windows file system cache and gives us full control over read patterns.
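One practical consequence of `FILE_FLAG_NO_BUFFERING` is that Windows requires read offsets and lengths to be multiples of the volume sector size. The sketch below shows how a requested byte range could be widened to sector boundaries before a read is issued. The helper name `aligned_read_range` is hypothetical and is not taken from the UFFS source.

```rust
/// Expands `[offset, offset + len)` to the smallest sector-aligned range
/// that contains it and returns `(aligned_offset, aligned_len)`.
///
/// Hypothetical helper for illustration. `sector_size` must be a power of
/// two (512 and 4096 are common values). The destination buffer address
/// must be sector-aligned as well, which this sketch does not cover.
pub fn aligned_read_range(offset: u64, len: u64, sector_size: u64) -> (u64, u64) {
    assert!(sector_size.is_power_of_two(), "sector size must be a power of two");
    let mask = sector_size - 1;
    let start = offset & !mask; // round the start down
    let end = (offset + len + mask) & !mask; // round the end up
    (start, end - start)
}
```

A caller would read the widened range and then slice out the bytes it originally asked for.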
### 2. SSD/HDD-Aware I/O Tuning
UFFS automatically detects whether a drive is an SSD or HDD using Windows storage APIs (`IOCTL_STORAGE_QUERY_PROPERTY`) and tunes I/O parameters accordingly:
| Drive Type | Chunk Size | Rationale |
|------------|------------|-----------|
| **SSD** | 8 MB | Large sequential reads, no seek penalty |
| **HDD** | 4 MB | Balance between syscall overhead and seek time |
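The mapping in the table can be sketched as a small enum. This assumes the detection step yields a single "incurs seek penalty" flag, which is what the seek-penalty property of `IOCTL_STORAGE_QUERY_PROPERTY` reports on Windows. The type and method names are illustrative and may not match the UFFS source.

```rust
const MIB: usize = 1024 * 1024;

/// Illustrative drive classification; the names are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum DriveType {
    Ssd,
    Hdd,
}

impl DriveType {
    /// Assumption: a drive that reports no seek penalty is treated as an SSD.
    pub fn from_seek_penalty(incurs_seek_penalty: bool) -> Self {
        if incurs_seek_penalty {
            DriveType::Hdd
        } else {
            DriveType::Ssd
        }
    }

    /// Chunk sizes taken from the table above.
    pub fn chunk_size(self) -> usize {
        match self {
            DriveType::Ssd => 8 * MIB,
            DriveType::Hdd => 4 * MIB,
        }
    }
}
```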
### 3. Minimal System Calls
By using large chunk sizes (4-8 MB instead of the typical 1 MB), UFFS reduces the number of `ReadFile` system calls by 4-8x. For a 4.5 GB MFT, this means roughly 580-1,150 calls instead of about 4,600.
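The call counts follow from a ceiling division of the MFT size by the chunk size. A minimal sketch, assuming binary units (1 MB = 2^20 bytes) and one `ReadFile` call per chunk; the function name is hypothetical:

```rust
/// Number of reads needed to cover `mft_bytes` when every read
/// transfers at most `chunk_bytes` (ceiling division).
pub fn read_calls(mft_bytes: u64, chunk_bytes: u64) -> u64 {
    assert!(chunk_bytes > 0, "chunk size must be non-zero");
    (mft_bytes + chunk_bytes - 1) / chunk_bytes
}
```

For a 4.5 GB MFT (4,608 MB) this gives 4,608 calls at 1 MB, 1,152 at 4 MB, and 576 at 8 MB.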
### 4. Zero-Allocation Record Parsing
Each thread uses a thread-local buffer for record parsing, eliminating per-record heap allocations. This is critical when processing millions of MFT records:
The `PrefetchMftReader` uses two alternating buffers to overlap I/O with processing:
- Read into buffer A while processing buffer B
- Swap buffers and repeat
- The CPU parses one chunk while the disk is already fetching the next, so it rarely waits on I/O
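The alternating-buffer pattern can be sketched with a background reader thread and two buffers that circulate through a pair of channels. This is a generic illustration over any `Read` source, not the actual `PrefetchMftReader`; the function names and the channel-based design are assumptions.

```rust
use std::io::{self, Read};
use std::sync::mpsc;
use std::thread;

/// Fills `buf` as far as possible; returns the number of bytes read (0 at EOF).
fn read_full<R: Read>(src: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match src.read(&mut buf[filled..]) {
            Ok(0) => break,
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Reads `source` on a background thread in `chunk_size` pieces while the
/// calling thread runs `process` on the previously filled buffer.
/// Returns the total number of bytes processed.
pub fn double_buffered<R, F>(mut source: R, chunk_size: usize, mut process: F) -> io::Result<u64>
where
    R: Read + Send + 'static,
    F: FnMut(&[u8]),
{
    let (full_tx, full_rx) = mpsc::channel::<io::Result<(Vec<u8>, usize)>>();
    let (empty_tx, empty_rx) = mpsc::channel::<Vec<u8>>();

    // Exactly two buffers circulate: one being filled, one being processed.
    for _ in 0..2 {
        empty_tx.send(vec![0u8; chunk_size]).expect("receiver is alive");
    }

    let reader = thread::spawn(move || {
        // Waits for a free buffer, fills it, and hands it to the consumer.
        for mut buf in empty_rx {
            match read_full(&mut source, &mut buf) {
                Ok(0) => break, // end of input
                Ok(n) => {
                    if full_tx.send(Ok((buf, n))).is_err() {
                        break; // consumer went away
                    }
                }
                Err(e) => {
                    let _ = full_tx.send(Err(e));
                    break;
                }
            }
        }
    });

    let mut total = 0u64;
    let mut result = Ok(());
    for msg in full_rx {
        match msg {
            Ok((buf, n)) => {
                process(&buf[..n]);
                total += n as u64;
                let _ = empty_tx.send(buf); // return the buffer for reuse
            }
            Err(e) => {
                result = Err(e);
                break;
            }
        }
    }
    drop(empty_tx);
    reader.join().expect("reader thread panicked");
    result.map(|_| total)
}
```

The overlap only hides I/O latency when processing a chunk takes about as long as reading one; if parsing is much faster than the disk, the consumer still waits on the reader.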
### 6. Parallel Record Processing with Rayon
After reading chunks from disk, UFFS uses Rayon's parallel iterators to parse records across all CPU cores. Each core processes a portion of the chunk simultaneously.
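To illustrate the idea without external crates, the sketch below splits a chunk into fixed-size records and fans them out over scoped standard-library threads. It uses `std::thread::scope` in place of Rayon, so it is not how UFFS does it; with Rayon the same split would typically be written with `par_chunks`. The 1,024-byte record size and the `FILE` signature check reflect the common NTFS layout, and the function name is hypothetical.

```rust
use std::thread;

/// Common NTFS file record size; real code should read it from the boot sector.
const RECORD_SIZE: usize = 1024;

/// Counts records in `chunk` that start with the `FILE` signature,
/// spreading the work over up to `workers` threads.
pub fn count_file_records(chunk: &[u8], workers: usize) -> usize {
    let records: Vec<&[u8]> = chunk.chunks_exact(RECORD_SIZE).collect();
    if records.is_empty() {
        return 0;
    }
    let workers = workers.max(1);
    let per_worker = (records.len() + workers - 1) / workers;

    thread::scope(|scope| {
        // Each worker gets a contiguous slice of records.
        let handles: Vec<_> = records
            .chunks(per_worker)
            .map(|slice| {
                scope.spawn(move || slice.iter().filter(|r| r.starts_with(b"FILE")).count())
            })
            .collect();
        handles
            .into_iter()
            .map(|h| h.join().expect("worker panicked"))
            .sum::<usize>()
    })
}
```

Because every record is parsed independently, no locking is needed; each worker only reads its own slice and returns a partial result.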
### 7. Fragmented MFT Support
The MFT can be scattered across multiple non-contiguous extents on disk. UFFS handles this by:
1. Getting the extent map via `FSCTL_GET_RETRIEVAL_POINTERS`
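As a rough illustration of what an extent map provides: `FSCTL_GET_RETRIEVAL_POINTERS` reports a starting VCN plus a list of `(NextVcn, Lcn)` pairs, and each pair can be turned into a byte range on the volume. The sketch below does that conversion with plain structs. The type and function names are hypothetical, and the code is not taken from UFFS.

```rust
/// One entry of the extent map: the run ends at `next_vcn` (exclusive)
/// and starts at logical cluster `lcn` on the volume.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Extent {
    pub next_vcn: u64,
    pub lcn: u64,
}

/// A contiguous byte range on the volume.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ByteRun {
    pub volume_offset: u64,
    pub length: u64,
}

/// Converts an extent map into byte ranges, in file order.
/// Assumes no sparse runs, which holds for this illustration.
pub fn extents_to_byte_runs(starting_vcn: u64, extents: &[Extent], cluster_size: u64) -> Vec<ByteRun> {
    let mut runs = Vec::with_capacity(extents.len());
    let mut vcn = starting_vcn;
    for extent in extents {
        let clusters = extent.next_vcn - vcn; // length of this run in clusters
        runs.push(ByteRun {
            volume_offset: extent.lcn * cluster_size,
            length: clusters * cluster_size,
        });
        vcn = extent.next_vcn;
    }
    runs
}
```

A reader can then walk the resulting runs in order and issue large sequential reads within each one.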
Query operations use Polars' lazy API, which optimizes the query plan before execution. Filters are pushed down, columns are pruned, and operations are parallelized automatically.
### Performance Summary
| Optimization | Impact |
|--------------|--------|
| Direct MFT access | Bypasses slow Windows APIs |
| Large chunk sizes | 4-8x fewer syscalls |
| SSD/HDD detection | Optimal I/O parameters per drive |
| Thread-local buffers | ~0 allocations during parsing |
| Double-buffering | Overlapped I/O with processing |
| **$MFT Bitmap** | In-use record flags | ~64 KB | <10 ms |
| **Full MFT** | All file records | 500 MB - 5 GB | 5-30 s |
## Comparison with Windows Tools
You can verify `uffs_mft` output against built-in Windows tools:
```powershell
# Volume geometry and MFT metadata
fsutil fsinfo ntfsinfo C:

# Fragmentation analysis
defrag C: /A /V
```
### Count Differences Explained
| Metric | uffs_mft | Windows defrag |
|--------|----------|----------------|
| Directories | Higher | Lower |
| Files | Higher | Lower |

**Why?** `uffs_mft` parses **all** MFT records including:
- Deleted file entries (not yet overwritten)
- System metadata files ($MFT, $Bitmap, $LogFile, $Secure, etc.)
- NTFS internal structures

Windows `defrag` counts only **active, movable** user files and folders.
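The gap between the two counts is easier to see when records are classified by their header flags. The sketch below assumes the commonly documented NTFS record layout: a `FILE` signature at offset 0 and a 16-bit flags field at offset `0x16`, where bit `0x0001` means "in use" and bit `0x0002` means "directory". The struct and function names are hypothetical, and the sketch does not model which records Windows `defrag` actually counts.

```rust
const FLAGS_OFFSET: usize = 0x16;
const FLAG_IN_USE: u16 = 0x0001;
const FLAG_DIRECTORY: u16 = 0x0002;

/// Tally of parsed records, split by the in-use and directory flags.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct RecordCounts {
    pub active_files: usize,
    pub active_dirs: usize,
    pub deleted_files: usize,
    pub deleted_dirs: usize,
}

/// Classifies raw MFT records; records without a `FILE` signature are skipped.
pub fn count_records<'a, I>(records: I) -> RecordCounts
where
    I: IntoIterator<Item = &'a [u8]>,
{
    let mut counts = RecordCounts::default();
    for record in records {
        if record.len() < FLAGS_OFFSET + 2 || !record.starts_with(b"FILE") {
            continue;
        }
        let flags = u16::from_le_bytes([record[FLAGS_OFFSET], record[FLAGS_OFFSET + 1]]);
        let in_use = (flags & FLAG_IN_USE) != 0;
        let is_dir = (flags & FLAG_DIRECTORY) != 0;
        match (in_use, is_dir) {
            (true, false) => counts.active_files += 1,
            (true, true) => counts.active_dirs += 1,
            (false, false) => counts.deleted_files += 1,
            (false, true) => counts.deleted_dirs += 1,
        }
    }
    counts
}
```

A tool that reports the sum of all four fields will always show totals at least as high as one that reports only the two `active_*` fields.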
### MFT Fragmentation Note
Windows `defrag /A /V` may report "0 MFT fragments" while `uffs_mft` shows multiple extents. Why the difference? Look at defrag's note:
> *"File fragments larger than 64MB are not included in the fragmentation statistics."* (Windows `defrag` output)

Example: an MFT of 4.44 GB spread across 28 extents averages **~162 MB per extent**. If every extent is larger than 64 MB, Windows `defrag` counts none of them as fragments.

`uffs_mft` uses `FSCTL_GET_RETRIEVAL_POINTERS`, which returns the actual physical extent map, so it correctly reports that the MFT is spread across 28 non-contiguous disk regions.

**Bottom line:** Both are correct. `uffs_mft` shows the true physical layout, while `defrag` focuses on performance-impacting fragmentation (small fragments that cause excessive disk seeks). Large extents like these don't significantly impact read performance.
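The arithmetic behind the example, plus a simple model of the 64 MB rule, can be sketched as follows. It assumes binary units and that `defrag` ignores any fragment strictly larger than the threshold, as its note suggests; the function names are hypothetical.

```rust
const MIB: u64 = 1024 * 1024;

/// Average extent size in MiB.
pub fn average_extent_mib(total_bytes: u64, extent_count: u64) -> f64 {
    assert!(extent_count > 0, "need at least one extent");
    total_bytes as f64 / extent_count as f64 / MIB as f64
}

/// Number of extents that would still count as fragments when every
/// extent larger than `threshold_bytes` is excluded from the statistics.
pub fn counted_fragments(extent_sizes: &[u64], threshold_bytes: u64) -> usize {
    extent_sizes
        .iter()
        .filter(|&&size| size <= threshold_bytes)
        .count()
}
```

With 4.44 GB across 28 extents the average comes out at about 162.4 MB. Note that an average above 64 MB does not guarantee that every individual extent exceeds the threshold, so the per-extent check is the one that matters.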
## Requirements
- **Windows only** - Uses Windows APIs for raw disk access