Commit a329a3b
feat: Implement shared delete file loading and caching for ArrowReader (#1941)
## Which issue does this PR close?
- Closes #.
## What changes are included in this PR?
Currently, ArrowReader instantiates a new CachingDeleteFileLoader (and
consequently a new DeleteFilter) for each FileScanTask when calling
load_deletes. This results in the DeleteFilter state being isolated per
task. If multiple tasks reference the same delete file (common in
positional deletes), that delete file is re-read and re-parsed for every
task, leading to significant performance overhead and redundant I/O.
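The cost described above can be illustrated with a small, dependency-free sketch. It is not the code from this commit: `DeleteCache`, `get_or_load`, the file path, and the fake "parse" step are all hypothetical stand-ins, chosen only to contrast one cache per task with one cache shared by the whole reader.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a parsed positional delete file.
type DeleteVector = Vec<u64>;

/// Hypothetical cache keyed by delete file path (not the real DeleteFilter).
#[derive(Default)]
struct DeleteCache {
    entries: Mutex<HashMap<String, Arc<DeleteVector>>>,
    reads: AtomicUsize, // how many times a file was actually "read"
}

impl DeleteCache {
    /// Returns the cached vector for `path`, reading it only on a miss.
    fn get_or_load(&self, path: &str) -> Arc<DeleteVector> {
        let mut entries = self.entries.lock().unwrap();
        if let Some(hit) = entries.get(path) {
            return Arc::clone(hit);
        }
        self.reads.fetch_add(1, Ordering::SeqCst);
        let parsed = Arc::new(vec![1, 5, 9]); // pretend to read and parse
        entries.insert(path.to_string(), Arc::clone(&parsed));
        parsed
    }
}

/// One cache per task: every task re-reads the same delete file.
fn loads_with_per_task_cache(tasks: usize) -> usize {
    (0..tasks)
        .map(|_| {
            let cache = DeleteCache::default();
            cache.get_or_load("deletes/pos-0001.parquet");
            cache.reads.load(Ordering::SeqCst)
        })
        .sum()
}

/// One cache for the whole reader: the file is read once.
fn loads_with_shared_cache(tasks: usize) -> usize {
    let cache = DeleteCache::default();
    for _ in 0..tasks {
        cache.get_or_load("deletes/pos-0001.parquet");
    }
    cache.reads.load(Ordering::SeqCst)
}
```

With four tasks pointing at the same file, the per-task variant performs four reads and the shared variant performs one.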
Changes
* Shared State: Moved the DeleteFilter instance into the
  CachingDeleteFileLoader struct. Since ArrowReader holds a single
  CachingDeleteFileLoader instance across its lifetime, the DeleteFilter
  state is now effectively shared across all file scan tasks processed
  by that reader.
* Positional Delete Caching: Implemented a state machine for loading
  positional delete files (PosDelState) in DeleteFilter.
  * Added try_start_pos_del_load: Coordinates concurrent access to the
    same positional delete file.
  * Added finish_pos_del_load: Signals completion of loading.
* Synchronization: Introduced a WaitFor state. Unlike equality deletes
  (which are accessed asynchronously), positional deletes are accessed
  synchronously by ArrowReader. Therefore, if a task encounters a file
  that is currently being loaded by another task, it must asynchronously
  wait (notify.notified().await) during the loading phase to ensure the
  data is fully populated before ArrowReader proceeds.
* Refactoring: Updated load_file_for_task and related types in
  CachingDeleteFileLoader to support the new caching logic and carry
  file paths through the loading context.
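The load coordination described in the list above can be sketched roughly as follows. This is a simplified analog, not the commit's implementation: it swaps the async `notify.notified().await` for std threads and a `Condvar` so that it has no dependencies. The names `try_start_pos_del_load`, `finish_pos_del_load`, and `PosDelState` come from the description, but their signatures and variants here are assumptions, and `LoadAction`, `SharedDeleteState`, `wait_for`, and `run_tasks` are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

type DeleteVector = Vec<u64>; // hypothetical parsed delete file

/// Per-file state (assumed variants, loosely following the description).
enum PosDelState {
    Loading,
    Loaded(Arc<DeleteVector>),
}

/// What the caller should do next (hypothetical shape).
enum LoadAction {
    Load,          // first caller: must read the file
    WaitFor,       // another task is loading it
    AlreadyLoaded, // result is already cached
}

#[derive(Default)]
struct SharedDeleteState {
    files: Mutex<HashMap<String, PosDelState>>,
    loaded: Condvar, // stands in for the async Notify
}

impl SharedDeleteState {
    fn try_start_pos_del_load(&self, path: &str) -> LoadAction {
        let mut files = self.files.lock().unwrap();
        let action = match files.get(path) {
            Some(PosDelState::Loading) => LoadAction::WaitFor,
            Some(PosDelState::Loaded(_)) => LoadAction::AlreadyLoaded,
            None => LoadAction::Load,
        };
        if matches!(action, LoadAction::Load) {
            // Claim the file so later callers wait instead of re-reading.
            files.insert(path.to_string(), PosDelState::Loading);
        }
        action
    }

    fn finish_pos_del_load(&self, path: &str, parsed: Arc<DeleteVector>) {
        let mut files = self.files.lock().unwrap();
        files.insert(path.to_string(), PosDelState::Loaded(parsed));
        self.loaded.notify_all(); // wake every waiting task
    }

    /// Blocks until `path` has reached the Loaded state.
    fn wait_for(&self, path: &str) -> Arc<DeleteVector> {
        let mut files = self.files.lock().unwrap();
        loop {
            if let Some(PosDelState::Loaded(v)) = files.get(path) {
                return Arc::clone(v);
            }
            files = self.loaded.wait(files).unwrap();
        }
    }
}

/// Runs `tasks` threads against one file; returns (reads, all results shared).
fn run_tasks(tasks: usize) -> (usize, bool) {
    let state = Arc::new(SharedDeleteState::default());
    let reads = Arc::new(AtomicUsize::new(0));
    let path = "deletes/pos-0001.parquet"; // hypothetical path
    let handles: Vec<_> = (0..tasks)
        .map(|_| {
            let (state, reads) = (Arc::clone(&state), Arc::clone(&reads));
            thread::spawn(move || match state.try_start_pos_del_load(path) {
                LoadAction::Load => {
                    reads.fetch_add(1, Ordering::SeqCst);
                    let parsed = Arc::new(vec![2, 3, 7]); // pretend to parse
                    state.finish_pos_del_load(path, Arc::clone(&parsed));
                    parsed
                }
                LoadAction::WaitFor | LoadAction::AlreadyLoaded => state.wait_for(path),
            })
        })
        .collect();
    let results: Vec<_> = handles.into_iter().map(|h| h.join().unwrap()).collect();
    let shared = results.windows(2).all(|w| Arc::ptr_eq(&w[0], &w[1]));
    (reads.load(Ordering::SeqCst), shared)
}
```

Calling `run_tasks(8)` performs a single read and every task receives the same `Arc`. The sketch ignores load failures: if the loading task never calls `finish_pos_del_load`, waiters block forever, so a real implementation would need an error path.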
## Are these changes tested?
Added test_caching_delete_file_loader_caches_results to verify that
repeated loads of the same delete file return shared memory objects.
1 parent b7ba2e8 commit a329a3b
2 files changed in crates/iceberg/src/arrow: +155 -22 lines changed