@@ -57,21 +57,19 @@ The following address space operations can be wrapped easily:
5757 * ``bmap ``
5858 * ``swap_activate ``
5959
60- ``struct iomap_folio_ops ``
60+ ``struct iomap_write_ops ``
6161--------------------------
6262
63- The ``->iomap_begin `` function for pagecache operations may set the
64- ``struct iomap::folio_ops `` field to an ops structure to override
65- default behaviors of iomap:
66-
6763.. code-block :: c
6864
69- struct iomap_folio_ops {
65+ struct iomap_write_ops {
7066 struct folio *(*get_folio)(struct iomap_iter *iter, loff_t pos,
7167 unsigned len);
7268 void (*put_folio)(struct inode *inode, loff_t pos, unsigned copied,
7369 struct folio *folio);
7470 bool (*iomap_valid)(struct inode *inode, const struct iomap *iomap);
71+ int (*read_folio_range)(const struct iomap_iter *iter,
72+ struct folio *folio, loff_t pos, size_t len);
7573 };
7674
7775 iomap calls these functions:
@@ -127,6 +125,10 @@ iomap calls these functions:
127125 ``->iomap_valid ``, then the iomap should considered stale and the
128126 validation failed.
129127
128+ - ``read_folio_range ``: Called to synchronously read in the range that will
129+ be written to. If this function is not provided, iomap will default to
130+ submitting a bio read request.
131+
130132These ``struct kiocb `` flags are significant for buffered I/O with iomap:
131133
132134 * ``IOCB_NOWAIT ``: Turns on ``IOMAP_NOWAIT ``.
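
Review note on the ``read_folio_range`` hunk above: the new hook is optional, and iomap falls back to a bio read when it is absent. The following is a minimal userspace sketch of that "optional callback with a default" dispatch. All `toy_*` names are hypothetical stand-ins invented for illustration; they are not the kernel's types or helpers, and the real default path submits a bio rather than filling memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel's types. */
struct toy_folio {
    char data[64];
};

struct toy_write_ops {
    /* Optional: synchronously fill [pos, pos + len) of the folio. */
    int (*read_folio_range)(struct toy_folio *folio, size_t pos, size_t len);
};

/* Stand-in for the default path (the kernel submits a bio read here). */
static int toy_default_bio_read(struct toy_folio *folio, size_t pos, size_t len)
{
    memset(folio->data + pos, 'B', len);
    return 0;
}

/* Example filesystem-provided hook; marks the bytes it filled with 'F'. */
static int toy_fs_read_folio_range(struct toy_folio *folio, size_t pos, size_t len)
{
    memset(folio->data + pos, 'F', len);
    return 0;
}

/* Use the filesystem's hook when present, otherwise the default. */
static int toy_read_range(const struct toy_write_ops *ops,
                          struct toy_folio *folio, size_t pos, size_t len)
{
    assert(folio != NULL);

    /* Reject ranges that do not fit inside the folio. */
    if (pos > sizeof(folio->data) || len > sizeof(folio->data) - pos)
        return -1;

    if (ops && ops->read_folio_range)
        return ops->read_folio_range(folio, pos, len);
    return toy_default_bio_read(folio, pos, len);
}
```

Passing an ops structure whose ``read_folio_range`` member is NULL (or no ops structure at all) takes the default path; setting the member to `toy_fs_read_folio_range` routes the read through the filesystem instead.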
@@ -269,7 +271,7 @@ writeback.
269271It does not lock ``i_rwsem `` or ``invalidate_lock ``.
270272
271273The dirty bit will be cleared for all folios run through the
272- ``->map_blocks `` machinery described below even if the writeback fails.
274+ ``->writeback_range `` machinery described below even if the writeback fails.
273275This is to prevent dirty folio clots when storage devices fail; an
274276``-EIO `` is recorded for userspace to collect via ``fsync ``.
275277
@@ -281,15 +283,14 @@ The ``ops`` structure must be specified and is as follows:
281283.. code-block :: c
282284
283285 struct iomap_writeback_ops {
284- int (*map_blocks)(struct iomap_writepage_ctx *wpc, struct inode *inode,
285- loff_t offset, unsigned len);
286- int (*prepare_ioend)(struct iomap_ioend *ioend, int status);
287- void (*discard_folio)(struct folio *folio, loff_t pos);
286+ int (*writeback_range)(struct iomap_writepage_ctx *wpc,
287+ struct folio *folio, u64 pos, unsigned int len, u64 end_pos);
288+ int (*writeback_submit)(struct iomap_writepage_ctx *wpc, int error);
288289 };
289290
290291 The fields are as follows:
291292
292- - ``map_blocks ``: Sets ``wpc->iomap `` to the space mapping of the file
293+ - ``writeback_range ``: Sets ``wpc->iomap `` to the space mapping of the file
293294 range (in bytes) given by ``offset `` and ``len ``.
294295 iomap calls this function for each dirty fs block in each dirty folio,
295296 though it will `reuse mappings
@@ -304,28 +305,26 @@ The fields are as follows:
304305 This revalidation must be open-coded by the filesystem; it is
305306 unclear if ``iomap::validity_cookie `` can be reused for this
306307 purpose.
307- This function must be supplied by the filesystem.
308-
309- - ``prepare_ioend ``: Enables filesystems to transform the writeback
310- ioend or perform any other preparatory work before the writeback I/O
311- is submitted.
312- This might include pre-write space accounting updates, or installing
313- a custom ``->bi_end_io `` function for internal purposes, such as
314- deferring the ioend completion to a workqueue to run metadata update
315- transactions from process context.
316- This function is optional.
317308
318- - ``discard_folio ``: iomap calls this function after ``->map_blocks ``
319- fails to schedule I/O for any part of a dirty folio.
320- The function should throw away any reservations that may have been
321- made for the write.
309+ If this method fails to schedule I/O for any part of a dirty folio, it
310+ should throw away any reservations that may have been made for the write.
322311 The folio will be marked clean and an ``-EIO `` recorded in the
323312 pagecache.
324313 Filesystems can use this callback to `remove
325314 <https://lore.kernel.org/all/20201029163313.1766967-1-bfoster@redhat.com/> `_
326315 delalloc reservations to avoid having delalloc reservations for
327316 clean pagecache.
328- This function is optional.
317+ This function must be supplied by the filesystem.
318+
319+ - ``writeback_submit ``: Submit the previously built writeback context.
320+ Block-based file systems should use the iomap_ioend_writeback_submit
321+ helper; other file systems can implement their own.
322+ File systems can optionally hook into writeback bio submission.
323+ This might include pre-write space accounting updates, or installing
324+ a custom ``->bi_end_io `` function for internal purposes, such as
325+ deferring the ioend completion to a workqueue to run metadata update
326+ transactions from process context before submitting the bio.
327+ This function must be supplied by the filesystem.
329328
330329Pagecache Writeback Completion
331330~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -339,10 +338,9 @@ If the write failed, it will also set the error bits on the folios and
339338the address space.
340339This can happen in interrupt or process context, depending on the
341340storage device.
342-
343341Filesystems that need to update internal bookkeeping (e.g. unwritten
344- extent conversions) should provide a `` ->prepare_ioend `` function to
345- set `` struct iomap_end::bio::bi_end_io `` to its own function.
342+ extent conversions) should set their own bi_end_io on the bios
343+ submitted by ``->writeback_submit``.
346344This function should call ``iomap_finish_ioends `` after finishing its
347345own work (e.g. unwritten extent conversion).
348346
@@ -515,18 +513,33 @@ IOMAP_WRITE`` with any combination of the following enhancements:
515513
516514 * ``IOMAP_ATOMIC ``: This write is being issued with torn-write
517515 protection.
518- Only a single bio can be created for the write, and the write must
519- not be split into multiple I/O requests, i.e. flag REQ_ATOMIC must be
520- set.
516+ Torn-write protection may be provided based on HW-offload or by a
517+ software mechanism provided by the filesystem.
518+
519+ For HW-offload based support, only a single bio can be created for the
520+ write, and the write must not be split into multiple I/O requests, i.e.
521+ flag REQ_ATOMIC must be set.
521522 The file range to write must be aligned to satisfy the requirements
522523 of both the filesystem and the underlying block device's atomic
523524 commit capabilities.
524525 If filesystem metadata updates are required (e.g. unwritten extent
525- conversion or copy on write), all updates for the entire file range
526+ conversion or copy-on-write), all updates for the entire file range
526527 must be committed atomically as well.
527- Only one space mapping is allowed per untorn write.
528- Untorn writes must be aligned to, and must not be longer than, a
529- single file block.
528+ Untorn-writes may be longer than a single file block. In all cases,
529+ the mapping start disk block must have at least the same alignment as
530+ the write offset.
531+ The filesystem must set IOMAP_F_ATOMIC_BIO to inform the iomap core of an
532+ untorn-write based on HW-offload.
533+
534+ For untorn-writes based on a software mechanism provided by the
535+ filesystem, all the disk block alignment and single bio restrictions
536+ which apply for HW-offload based untorn-writes do not apply.
537+ The mechanism would typically be used as a fallback for when
538+ HW-offload based untorn-writes may not be issued, e.g. the range of the
539+ write covers multiple extents, meaning that it is not possible to issue
540+ a single bio.
541+ All filesystem metadata updates for the entire file range must be
542+ committed atomically as well.
530543
531544Callers commonly hold ``i_rwsem `` in shared or exclusive mode before
532545calling this function.
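
Review note on the ``IOMAP_ATOMIC`` hunk above: the text distinguishes HW-offload untorn writes (single bio, aligned disk blocks) from a software fallback used when, for example, the write spans multiple extents. A rough userspace sketch of that decision is shown below. The `toy_*` names are hypothetical and not part of iomap, and the sketch assumes the caller has already checked that the write length is a power of two and that the write offset is aligned to it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one space mapping; this is NOT struct iomap. */
struct toy_mapping {
    uint64_t offset; /* file offset of the mapping, in bytes */
    uint64_t length; /* length of the mapping, in bytes */
    uint64_t addr;   /* disk address of the mapping, in bytes */
};

enum toy_untorn_mode {
    TOY_UNTORN_HW, /* single bio with REQ_ATOMIC set */
    TOY_UNTORN_SW, /* filesystem-provided software mechanism */
};

static bool toy_is_pow2(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

static enum toy_untorn_mode toy_pick_untorn_mode(const struct toy_mapping *m,
                                                 uint64_t pos, uint64_t len)
{
    /* Assumption: the caller validated a naturally aligned write. */
    assert(toy_is_pow2(len) && (pos & (len - 1)) == 0);

    /* One mapping must cover the whole write to allow a single bio. */
    if (pos < m->offset || len > m->length ||
        pos - m->offset > m->length - len)
        return TOY_UNTORN_SW;

    /* The disk address must be aligned at least as well as the write. */
    uint64_t disk = m->addr + (pos - m->offset);
    if (disk & (len - 1))
        return TOY_UNTORN_SW;

    return TOY_UNTORN_HW;
}
```

In this model a 16 KiB write at file offset 16 KiB into a 64 KiB extent that starts on a 1 MiB disk boundary qualifies for the HW path, while the same write into an extent whose disk address is shifted by 4 KiB, or into an extent shorter than the write, falls back to the software mechanism.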