Add support for reading/writing VTK XML ImageData (.vti) format #6032

Draft
dzenanz wants to merge 5 commits into InsightSoftwareConsortium:main from dzenanz:vtiSupport

@dzenanz
Member

@dzenanz dzenanz commented Apr 9, 2026

Prompt for Claude Sonnet 4.6: Add support for .vti image file format. Look at how similar .vtk image file format is implemented in itk::VTKImageIO class.

Full implementation supporting:

  • Reading: ASCII, base64-binary, and raw-appended data formats; little-endian and big-endian files; scalar, vector, RGB, RGBA, and symmetric tensor pixel types
  • Writing: base64-binary (default) and ASCII formats; all standard scalar/vector/tensor pixel types
  • Byte-swapping to/from system native byte order
  • Proper block-size header handling (UInt32/UInt64) for base64 data

Note: I have not reviewed this myself. Opening PR to check automated testing results.

@github-actions github-actions bot added type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots area:IO Issues affecting the IO module labels Apr 9, 2026
@blowekamp
Member

It would be good to use an XML parsing library like expat which is already in ITK.

@github-actions github-actions bot added area:Python wrapping Python bindings for a class type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct labels Apr 10, 2026
@hjmjohnson
Member

Force-pushed ddf579b15f (replacing the prior 1c67ef7c34). This is a focused rewrite that addresses the CI failures and @blowekamp's review comment.

CI failures fixed

| Failure | Root cause | Fix |
|---|---|---|
| KWStyle (ARMBUILD-*, Pixi-Cxx (windows-2022), etc.): itkVTIImageIO.h:124: error: comment doesn't have \class | The enum class DataEncoding had a /** ... */ doxygen block; KWStyle's class-comment rule treats enum class like a class. | Replaced with a plain // comment. |
| KeyError: 'VTIImageIOFactory' (ARMBUILD-Python, ITK.{Linux,macOS}.Python) | The Python wrapping registration was missing entirely. | Added Modules/IO/VTK/wrapping/itkVTIImageIO.wrap registering both VTIImageIO and VTIImageIOFactory (matching itkVTKImageIO.wrap). |
| ghostflow-check-main | Commit subject started with WIP:, not in kwrobot's allowed prefix list. | New commit uses ENH:. |
| (Linker would fail once parser was added) | ITKExpat was not in the module deps. | Added ITKExpat as PRIVATE_DEPENDS of ITKIOVTK. Added ImageIO::VTI to FACTORY_NAMES. |

Review feedback addressed

@blowekamp: It would be good to use an XML parsing library like expat which is already in ITK.

The XML header parsing is now done by expat (the same library Modules/IO/XML/src/itkXMLFile.cxx already uses). The InternalReadImageInformation() flow:

  1. Slurps the file once.
  2. If the file contains <AppendedData>, the XML view fed to expat is truncated at that element and replaced with a self-closing <AppendedData/></VTKFile> so the parser sees a well-formed document. The byte offset of the _ marker is recorded for later seek-and-read of the binary block.
  3. The truncated XML view is parsed with XML_ParserCreate + XML_SetElementHandler + XML_SetCharacterDataHandler. Element handlers populate a VTIParseState with the VTKFile, ImageData, PointData, and active DataArray attributes; the character-data handler captures inline ASCII or base64 contents.
  4. After parsing, geometry / pixel type / encoding are populated on ImageIOBase.

Switching to expat removes all the ad-hoc content.find("<...") scans from the previous version and fixes a class of latent bugs (attribute ordering, comments, whitespace, multiple DataArrays). The only remaining string scan is for the <AppendedData> boundary itself, which is unavoidable: the raw binary inside that element is not legal XML and would make any XML parser fail, so it is read directly via seek instead.
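The truncation step described above can be sketched with the standard library alone. This is only an illustration, not the PR's code: `TruncatedXml`, `markerOffset`, and `TruncateAtAppendedData` are hypothetical names, and the sketch assumes the first `_` after the opening tag is the marker.

```cpp
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type: the XML text that is safe to hand to a parser,
// plus the byte offset of the '_' marker (raw bytes start at markerOffset + 1).
struct TruncatedXml
{
  std::string                xml;
  std::optional<std::size_t> markerOffset; // empty when there is no <AppendedData>
};

TruncatedXml
TruncateAtAppendedData(const std::string & content)
{
  TruncatedXml      result;
  const std::size_t tagPos = content.find("<AppendedData");
  if (tagPos == std::string::npos)
  {
    result.xml = content; // nothing to cut; the file can be parsed as-is
    return result;
  }
  // Assumption: the marker is the first '_' after the '>' of the opening tag.
  const std::size_t tagEnd = content.find('>', tagPos);
  if (tagEnd != std::string::npos)
  {
    const std::size_t marker = content.find('_', tagEnd + 1);
    if (marker != std::string::npos)
    {
      result.markerOffset = marker;
    }
  }
  // Replace everything from the tag onward with a well-formed tail.
  result.xml = content.substr(0, tagPos) + "<AppendedData/></VTKFile>";
  return result;
}
```

Note that replacing the opening tag with a self-closing one also drops its attributes, so anything the reader needs from that tag has to be captured before the cut.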

Tests added

The previous commit shipped no VTI tests at all. This commit adds Modules/IO/VTK/test/itkVTIImageIOTest.cxx with three classes of cases:

1. Round-trip via ImageFileWriter → ImageFileReader (region, spacing, origin, per-pixel bit-equivalence):

  • unsigned char 1D / 2D, short 3D, float 3D, double 3D, RGBPixel<uchar> 2D, Vector<float,3> 3D — both ASCII and binary (base64) encodings where applicable.

2. Behavior tests:

  • Symmetric tensor ASCII round-trip (writes 9-component layout, reads back into ITK's 6-component layout).
  • Symmetric tensor binary write must throw (silent layout corruption guard: verifies that the on-disk NumberOfComponents="9" header doesn't get paired with a 6-component memory buffer).

3. Hand-crafted-file readability tests for code paths the writer never produces but the reader must support (cross-checked against VTK's TestDataObjectXMLIO.cxx coverage matrix):

  • XML robustness: comments at multiple positions, attribute reordering, multiple DataArrays in PointData with Scalars="..." pointing at the second array. Verifies the active-array selector logic.
  • header_type="UInt64" base64 file: exercises the 8-byte block-size header path.
  • format="appended" with raw binary in <AppendedData>: exercises the file-truncation + offset-seek path.
  • byte_order="BigEndian" base64 file with data and block-size header pre-swapped: exercises the byte-swap-on-read path.
  • CanReadFile / CanWriteFile sanity.

This matches the relevant subset of VTK's TestDataObjectXMLIO.cxx coverage matrix (DataMode × ByteOrder × HeaderType × DataObjectType) for the features this PR claims. ZLIB/LZ4 compression and the VTK direction matrix are intentionally not exercised because this PR does not claim those features (separate follow-ups).
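For readers unfamiliar with the UInt32/UInt64 block-size header that these hand-crafted files exercise, here is a minimal sketch of stripping it from an already base64-decoded buffer. It assumes a single, uncompressed block whose header sits directly in front of the payload in the same decoded buffer; `ReadUnsigned` and `StripBlockSizeHeader` are hypothetical helpers, not functions from the PR.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Reads an unsigned integer of `width` bytes stored in the given byte order,
// independent of the host's own endianness.
std::uint64_t
ReadUnsigned(const unsigned char * p, std::size_t width, bool bigEndian)
{
  std::uint64_t value = 0;
  for (std::size_t i = 0; i < width; ++i)
  {
    // Always consume the most significant byte first.
    const std::size_t idx = bigEndian ? i : (width - 1 - i);
    value = (value << 8) | p[idx];
  }
  return value;
}

// Returns the payload that follows the block-size header, after checking the
// declared size against the number of bytes actually present.
std::vector<unsigned char>
StripBlockSizeHeader(const std::vector<unsigned char> & decoded, bool headerIsUInt64, bool fileBigEndian)
{
  const std::size_t headerBytes = headerIsUInt64 ? 8 : 4;
  if (decoded.size() < headerBytes)
  {
    throw std::runtime_error("Decoded block is shorter than its header.");
  }
  const std::uint64_t declared = ReadUnsigned(decoded.data(), headerBytes, fileBigEndian);
  if (declared > decoded.size() - headerBytes)
  {
    throw std::runtime_error("Block-size header exceeds the decoded payload.");
  }
  const auto first = decoded.begin() + static_cast<std::ptrdiff_t>(headerBytes);
  return std::vector<unsigned char>(first, first + static_cast<std::ptrdiff_t>(declared));
}
```

Decoding the header byte by byte avoids a separate swap of the header itself when the file is big-endian.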

Local results

$ cmake --build build-ssim -j48 --target ITKIOVTKTestDriver ITKIOVTKHeaderTest1
[20/20] Linking CXX executable bin/ITKIOVTKTestDriver

$ ctest -R itkVTIImageIOTest --output-on-failure
Test #1259: itkVTIImageIOTest .... Passed   0.03 sec
100% tests passed, 0 tests failed out of 1

Test output (every assertion passes):

  Round-trip OK: vti_uchar2d_binary.vti
  Round-trip OK: vti_uchar2d_ascii.vti
  Round-trip OK: vti_short3d_binary.vti
  Round-trip OK: vti_short3d_ascii.vti
  Round-trip OK: vti_float3d_binary.vti
  Round-trip OK: vti_float3d_ascii.vti
  Round-trip OK: vti_double3d_binary.vti
  Round-trip OK: vti_rgb2d_binary.vti
  Round-trip OK: vti_vec3d_binary.vti
  Round-trip OK: vti_uchar1d_binary.vti
  Round-trip OK: vti_uchar1d_ascii.vti
  Tensor ASCII round-trip parsed without exception
  Binary tensor write correctly rejected
  XML robustness OK: comments, attribute reordering, multi-DataArray active selector
  UInt64 header_type base64 read OK
  Raw-appended-data read OK
  BigEndian byte-swap read OK

pre-commit (gersemi, clang-format, kw-pre-commit) is clean on every touched file.

Out-of-scope items (potential follow-ups)

  • ZLIB / LZ4 compressed <AppendedData> blocks
  • Direction matrix attribute (VTK 9 added it to <ImageData>)
  • Streaming reads of pieces (VTK splits a file into multiple <Piece> elements)

@dzenanz
Member Author

dzenanz commented Apr 10, 2026

We should add a few test files converted into .vti format by ParaView, and regression test them against the existing .nrrd/.mha versions. And of course, manually review this.

@dzenanz
Member Author

dzenanz commented Apr 10, 2026

Legacy removed tests failed:

Modules/IO/VTK/test/itkVTIImageIOTest.cxx:100:56: warning: 'itk::ImageConstIterator::IndexType itk::ImageConstIterator::GetIndex() const [with TImage = itk::Image<unsigned char, 1>; itk::ImageConstIterator::IndexType = itk::Index<1>]' is deprecated: Please use ComputeIndex() instead, or use an iterator with index, like ImageIteratorWithIndex! [-Wdeprecated-declarations]

hjmjohnson and others added 3 commits April 13, 2026 15:48
Closes InsightSoftwareConsortium#6030's parent issue thread; replaces the WIP implementation in
the original PR with one that addresses both the CI failures and the
review feedback.

== Why a rewrite ==

The original WIP commit on this branch failed CI on every C++ platform
and on Python wrapping, and Bradley Lowekamp's review asked for the XML
parsing to use ITK's existing expat library rather than ad-hoc string
matching.  Rather than patch the WIP implementation, this commit
restarts the file from the same upstream/main base with:

  - expat-based XML parsing for the file header
  - Python wrapping registration
  - KWStyle-clean doxygen comments
  - an ENH: prefix that satisfies kwrobot/ghostflow
  - a comprehensive round-trip and corner-case test suite

== Algorithm ==

VTIImageIO inherits from itk::ImageIOBase and supports the
serial-piece subset of the VTK XML ImageData (.vti) format.

Reading
  1. The whole file is slurped into memory.
  2. If the file contains an `<AppendedData>` element, the XML view fed
     to expat is truncated at that element and replaced with a
     self-closing `<AppendedData/></VTKFile>` so the parser sees a
     well-formed document.  The byte offset of the `_` marker that
     introduces the raw binary block is recorded for later seek-and-read.
  3. The truncated XML view is parsed with expat (XML_Parser /
     XML_SetElementHandler / XML_SetCharacterDataHandler).  Element
     handlers populate a VTIParseState with the VTKFile, ImageData,
     PointData, and active DataArray attributes, and the character data
     handler captures inline ASCII or base64 contents.
  4. After parsing, geometry/spacing/origin/component-type/pixel-type
     are populated on the ImageIOBase from the captured state.
  5. Read() then routes to one of three branches:
       - ASCII: parse the cached character data via ReadBufferAsASCII
       - base64: decode the cached base64 string, strip the
         UInt32/UInt64 block-size header, memcpy into the user buffer,
         then byte-swap if file != host endianness
       - raw appended: open the file, seek to
         m_AppendedDataOffset + m_DataArrayOffset + headerBytes, read,
         then byte-swap if necessary

The expat-based reader handles the things string matching cannot:
attribute reordering, optional whitespace, XML comments at any depth,
the standalone <?xml ... ?> declaration, and PointData with multiple
DataArrays where only one is the active scalar/vector/tensor.

Writing
  - ASCII or binary (base64) DataMode, selectable via SetFileType().
  - SymmetricSecondRankTensor pixels are expanded from ITK's 6-component
    storage to VTK's full 9-component layout for ASCII output.  Binary
    tensor writing is intentionally rejected with an exception because
    the on-disk NumberOfComponents="9" header would silently disagree
    with a 6-component memory buffer; this is verified by a test.
  - Output is always written in the host byte order; the file declares
    its byte_order accordingly.
  - The block-size header for base64 binary is currently UInt32 (the
    UInt64 reader path is exercised by a hand-crafted test file below).

Compression (zlib/lz4) and the VTK direction matrix are intentionally
out of scope for this PR.

== CI failures fixed ==

  - itkVTIImageIO.h:124  KWStyle:  comment doesn't have \class
      The DataEncoding `enum class` had a `/** ... */` doxygen comment;
      KWStyle's class-comment rule treats `enum class` like a class.
      Replaced with a plain `//` comment.

  - KeyError: 'VTIImageIOFactory' (Python tests)
      Added Modules/IO/VTK/wrapping/itkVTIImageIO.wrap registering both
      VTIImageIO and VTIImageIOFactory with the auto-loaded wrap module.

  - ghostflow-check-main: 'WIP:' prefix not in allowed list
      Commit message now uses ENH: prefix.

  - Module dependency: ITKExpat added as a PRIVATE_DEPENDS of ITKIOVTK
    so the new XML parser actually links.  Updated the FACTORY_NAMES to
    include `ImageIO::VTI`.

== Test strategy and coverage ==

The new itkVTIImageIOTest.cxx exercises the filter through 3 classes
of cases:

1. Round-trip tests via ImageFileWriter -> ImageFileReader for the
   common pixel type and dimensionality combinations:
     uchar 1D / 2D, short 3D, float 3D, double 3D, RGB 2D, vector<float,3>
     in 3D, both ASCII and binary (base64) encodings where applicable.
     Each round-trip checks region, spacing, origin, and per-pixel
     bit-equivalence.

2. Behavior tests:
     - Symmetric tensor ASCII round-trip (writes 9-component layout,
       reads back into 6-component ITK layout).
     - Symmetric tensor binary write must throw (silent layout
       corruption guard).

3. Hand-crafted-file readability tests for code paths the writer never
   produces but the reader must support:
     - XML robustness: comments at multiple positions, attribute
       reordering, multiple DataArrays in PointData with the active
       Scalars selector pointing at the second array.  Verifies
       dimensions, spacing, origin, AND that the correct active array
       is selected.
     - header_type="UInt64": base64 file with an 8-byte block-size
       header; verifies the dual UInt32/UInt64 header path.
     - format="appended" with raw binary in <AppendedData>: verifies
       the file-truncation + offset-seek path.
     - byte_order="BigEndian": base64 file with the data and the
       block-size header pre-swapped to big-endian; verifies the
       byte-swap-on-read path.
     - CanReadFile / CanWriteFile sanity.

This matches the relevant subset of VTK's own
TestDataObjectXMLIO.cxx coverage matrix (DataMode x ByteOrder x
HeaderType x DataObjectType) for the features this PR claims.  ZLIB/LZ4
compression and the VTK direction matrix are intentionally not exercised
since this PR does not claim those features.

== Local results ==

  $ cmake --build build-ssim -j48 --target ITKIOVTKTestDriver
  [4/4] Linking CXX executable bin/ITKIOVTKTestDriver

  $ ctest -R itkVTIImageIOTest --output-on-failure
  Test #1259: itkVTIImageIOTest .... Passed   0.03 sec
  100% tests passed, 0 tests failed out of 1

  pre-commit (gersemi, clang-format, kw-pre-commit) clean on every
  touched file.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This appears when legacy code is removed. The exact error message was:

Modules/IO/VTK/test/itkVTIImageIOTest.cxx:100:56: warning: 'itk::ImageConstIterator::IndexType itk::ImageConstIterator::GetIndex() const [with TImage = itk::Image<unsigned char, 1>; itk::ImageConstIterator::IndexType = itk::Index<1>]' is deprecated: Please use ComputeIndex() instead, or use an iterator with index, like ImageIteratorWithIndex! [-Wdeprecated-declarations]
@dzenanz
Member Author

dzenanz commented Apr 13, 2026

@greptileai review this.

@greptile-apps
Contributor

greptile-apps bot commented Apr 13, 2026

Greptile Summary

This PR adds a new VTIImageIO class that reads and writes VTK XML ImageData (.vti) files using an expat-based XML parser, supporting ASCII, base64-binary, and raw-appended data formats. The infrastructure (factory, CMake wiring, wrapping, test data coverage) is solid, but two correctness defects in the core implementation need to be fixed before merge.

  • Tensor read-back is wrong: after writing a symmetric tensor as 9 per-pixel ASCII values (full 3×3), the reader calls SetNumberOfComponents(6) and reads only the first 6 stream values per pixel, producing incorrect index mapping (e.g. T[3] gets t[1] instead of t[3]). The round-trip test only asserts no exception, masking the corruption.
  • SwapBufferIfNeeded is a no-op for LE files on BE hosts: SwapRangeFromSystemToBigEndian is always used, but on a big-endian host it does nothing, leaving little-endian file data unswapped.

Confidence Score: 3/5

Not safe to merge as-is: two P1 data-correctness bugs must be fixed first.

Two P1 defects in the core implementation: tensor ASCII read-back silently produces wrong pixel values, and the read-path byte-swap is a no-op when a little-endian file is read on a big-endian system. Both affect data integrity. The PR author explicitly noted it has not been self-reviewed. Infrastructure, scalar/vector/RGB round-trips, and CMake wiring are correct and well-tested.

Modules/IO/VTK/src/itkVTIImageIO.cxx — specifically SwapBufferIfNeeded (lines ~723–749) and tensor component handling in InternalReadImageInformation (lines ~659–664).

Important Files Changed

| Filename | Overview |
|---|---|
| Modules/IO/VTK/src/itkVTIImageIO.cxx | Core implementation: two P1 bugs (tensor read maps wrong indices; byte-swap is a no-op on BE hosts reading LE files); P2 issues include entire-file slurp for binary paths, uint32_t block-size truncation, and base64-appended encoding not distinguished from raw-appended. |
| Modules/IO/VTK/include/itkVTIImageIO.h | New header following ITK conventions with SmartPointer, itkNewMacro, itkOverrideGetNameOfClassMacro; private state members well-documented; no issues found. |
| Modules/IO/VTK/itk-module.cmake | Correctly adds ITKExpat as PRIVATE_DEPENDS and registers the VTI factory name; description updated. |
| Modules/IO/VTK/test/itkVTIImageIOTest.cxx | Comprehensive round-trip tests for scalar/vector/RGB/tensor types; however, tensor value correctness is not verified (the test only checks that no exception is thrown), so the broken tensor mapping is not covered. |
| Modules/IO/VTK/test/itkVTIImageIOReadWriteTest.cxx | Integration test using real ITK data files with pixel-level comparison; well structured. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[VTIImageIO::ReadImageInformation] --> B[SlurpFile into memory]
    B --> C{AppendedData in file?}
    C -- No --> D[Parse full XML with expat]
    C -- Yes --> E[Truncate XML at AppendedData tag / Record _ marker offset]
    E --> D
    D --> F{DataArray format?}
    F -- ascii --> G[Store ASCII text in m_AsciiDataContent]
    F -- binary/base64 --> H[Store base64 text in m_Base64DataContent]
    F -- appended --> I[Store m_AppendedDataOffset + m_DataArrayOffset]
    J[VTIImageIO::Read] --> K{m_DataEncoding?}
    K -- ASCII --> L[ReadBufferAsASCII from cached string]
    K -- Base64 --> M[DecodeBase64 / skip block-size header / memcpy]
    K -- RawAppended --> N[Seek to offset in file / read raw bytes]
    M --> O[SwapBufferIfNeeded]
    N --> O
    O --> P{fileBigEndian == sysBigEndian?}
    P -- Yes --> Q[No swap needed]
    P -- No --> R[SwapRangeFromSystemToBigEndian - no-op on BE host]
    S[VTIImageIO::Write] --> T{IOFileEnum?}
    T -- ASCII --> U[WriteBufferAsASCII / Expand tensor 6 to 9 components]
    T -- Binary --> V[Prepend UInt32 block header / EncodeBase64]

Reviews (1): Last reviewed commit: "COMP: Fix GetIndex() is deprecated: Plea..."

Comment on lines +659 to +664
if (isTensor)
{
  // VTK tensors are 3x3 = 9 components on disk; ITK uses 6 (symmetric).
  this->SetPixelType(IOPixelEnum::SYMMETRICSECONDRANKTENSOR);
  this->SetNumberOfComponents(6);
}
Contributor

P1 Tensor round-trip produces wrong values

SetNumberOfComponents(6) after finding a 9-component (3×3) tensor DataArray causes ReadBufferAsASCII to read only 6 of the 9 per-pixel ASCII values, mapping them to the wrong tensor indices.

The writer emits rows as t[0] t[1] t[2] | t[1] t[3] t[4] | t[2] t[4] t[5] (full 3×3), so the flat stream is [t0,t1,t2,t1,t3,t4,t2,t4,t5]. Reading 6 of these yields [t0,t1,t2,t1,t3,t4], not [t0,t1,t2,t3,t4,t5]. Symmetric indices 3, 4, and 5 are silently wrong after round-trip. The test only checks that no exception is thrown, not that pixel values are correct.
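One way to make the mapping explicit is to pair the 6-to-9 expansion with its inverse. The sketch below uses plain `std::array` instead of ITK types, and `ExpandSymmetric` and `CollapseToSymmetric` are hypothetical names; it assumes the row-major 3×3 layout described in this comment.

```cpp
#include <array>

using Sym6 = std::array<double, 6>;  // symmetric storage: t0..t5
using Full9 = std::array<double, 9>; // row-major 3x3 as written to disk

// Expands the 6 stored components into the full 3x3 layout
// [t0,t1,t2 | t1,t3,t4 | t2,t4,t5].
Full9
ExpandSymmetric(const Sym6 & t)
{
  return { t[0], t[1], t[2], t[1], t[3], t[4], t[2], t[4], t[5] };
}

// Picks the upper-triangle entries of a row-major 3x3:
// flat positions 0, 1, 2, 4, 5, 8 rather than the first six values.
Sym6
CollapseToSymmetric(const Full9 & m)
{
  return { m[0], m[1], m[2], m[4], m[5], m[8] };
}
```

A round-trip test that compares `CollapseToSymmetric(ExpandSymmetric(t))` against `t` with distinct component values would catch the index mix-up that a "no exception" check misses.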

Comment on lines +723 to +749
SwapBufferIfNeeded(void * buffer, std::size_t componentSize, std::size_t numComponents, IOByteOrderEnum fileOrder)
{
  const bool fileBigEndian = (fileOrder == IOByteOrderEnum::BigEndian);
  const bool sysBigEndian = ByteSwapper<uint16_t>::SystemIsBigEndian();
  if (fileBigEndian == sysBigEndian)
  {
    return;
  }
  switch (componentSize)
  {
    case 1:
      break;
    case 2:
      ByteSwapper<uint16_t>::SwapRangeFromSystemToBigEndian(static_cast<uint16_t *>(buffer), numComponents);
      break;
    case 4:
      ByteSwapper<uint32_t>::SwapRangeFromSystemToBigEndian(static_cast<uint32_t *>(buffer), numComponents);
      break;
    case 8:
      ByteSwapper<uint64_t>::SwapRangeFromSystemToBigEndian(static_cast<uint64_t *>(buffer), numComponents);
      break;
    default:
    {
      ExceptionObject e_(__FILE__, __LINE__, "Unknown component size for byte swap.", ITK_LOCATION);
      throw e_;
    }
  }
Contributor

P1 Byte-swap is a no-op when reading a little-endian file on a big-endian system

SwapRangeFromSystemToBigEndian performs a byte reversal only when the host is little-endian; on a big-endian host it is a no-op. The guard (fileBigEndian != sysBigEndian) correctly enters the swap branch when file=LE, sys=BE, but then the selected swap function does nothing, leaving the data in little-endian order while the caller expects system (big-endian) byte order.

The correct call for the file=LE path is SwapRangeFromSystemToLittleEndian, which byte-reverses on a big-endian host:

if (fileBigEndian)
{
  // data is BE, system is LE
  ByteSwapper<uintN_t>::SwapRangeFromSystemToBigEndian(ptr, n);
}
else
{
  // data is LE, system is BE
  ByteSwapper<uintN_t>::SwapRangeFromSystemToLittleEndian(ptr, n);
}
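An alternative that sidesteps the choice of swap function is to reverse the bytes unconditionally once a mismatch has been detected. The following is a standard-library sketch of that idea, not ITK's ByteSwapper API; `HostIsBigEndian` and `SwapToHostOrder` are hypothetical names, and the host order is passed in explicitly so both directions can be unit-tested on any machine.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Runtime probe of the host byte order.
bool
HostIsBigEndian()
{
  const std::uint16_t probe = 0x0102;
  unsigned char       first = 0;
  std::memcpy(&first, &probe, 1);
  return first == 0x01;
}

// Reverses every component in place when the file order differs from the
// host order; does nothing for matching orders or single-byte components.
void
SwapToHostOrder(void *      buffer,
                std::size_t componentSize,
                std::size_t numComponents,
                bool        fileBigEndian,
                bool        hostBigEndian)
{
  if (fileBigEndian == hostBigEndian || componentSize < 2)
  {
    return;
  }
  auto * bytes = static_cast<unsigned char *>(buffer);
  for (std::size_t i = 0; i < numComponents; ++i)
  {
    std::reverse(bytes + i * componentSize, bytes + (i + 1) * componentSize);
  }
}
```

In production code the last argument would simply be `HostIsBigEndian()`; keeping it a parameter lets a little-endian CI machine exercise the "LE file on BE host" branch.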

Comment on lines +220 to +229
else if (std::strcmp(name, "AppendedData") == 0)
{
  st->sawAppendedData = true;
  // We do not consume any character data from inside AppendedData via
  // expat -- the binary content following the `_` marker is XML-illegal
  // and is read directly from the file by the caller. We don't even
  // get this far on a real raw-appended file because the binary bytes
  // would have caused a parser error before reaching this start tag.
  // The caller pre-truncates the XML at <AppendedData> to handle that.
}
Contributor

P2 Base64-encoded appended data silently misread as raw binary

The encoding attribute of <AppendedData> (e.g., encoding="base64") is not captured. Any file with format="appended" and encoding="base64" in the AppendedData element is assumed to be raw binary and will produce garbage pixel data without any error or warning.

Consider capturing the encoding attribute and either handling base64-appended or throwing a clear itkExceptionMacro("AppendedData encoding=\"base64\" is not supported…").
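A sketch of what such a check could look like, using only the attribute layout that expat hands to a start-element handler (a null-terminated array of alternating names and values). `FindAttribute` and `RequireRawAppendedEncoding` are hypothetical helpers, and rejecting a missing attribute is a conservative assumption of this sketch rather than documented VTK behavior.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Looks up an attribute value in an expat-style array:
// { name0, value0, name1, value1, ..., nullptr }.
const char *
FindAttribute(const char ** atts, const char * name)
{
  for (std::size_t i = 0; atts && atts[i] && atts[i + 1]; i += 2)
  {
    if (std::strcmp(atts[i], name) == 0)
    {
      return atts[i + 1];
    }
  }
  return nullptr;
}

// Throws unless the AppendedData element explicitly declares encoding="raw".
void
RequireRawAppendedEncoding(const char ** atts)
{
  const char * encoding = FindAttribute(atts, "encoding");
  if (encoding == nullptr || std::strcmp(encoding, "raw") != 0)
  {
    throw std::runtime_error(std::string("Unsupported AppendedData encoding: ") +
                             (encoding ? encoding : "(missing)"));
  }
}
```

Because the reader replaces the opening tag with a self-closing `<AppendedData/>` before parsing, the original tag's attributes would have to be preserved in the truncated XML, or scanned separately, for this check to see them.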

Comment on lines +492 to +497
const std::string content = SlurpFile(m_FileName);
if (content.empty())
{
  itkExceptionMacro("Cannot open or read file: " << m_FileName);
}

Contributor

P2 Entire file slurped into memory for raw-appended paths

SlurpFile reads the complete file content (including all binary pixel data) into a std::string for every call path, including raw-appended files where only the XML header preceding <AppendedData> is needed. For large 3-D volumes this can double peak memory use. Consider reading only up to (and including) the <AppendedData …> tag when the appended-data path is detected.

Member Author

I just manually checked this. This remark is not accurate. But the entire file will be read twice.

Member Author

@dzenanz dzenanz Apr 13, 2026

This could of course be more performant. But a good first step would be to just make it work.

Member Author

content variable at line 492 is local to InternalReadImageInformation method, so it will go out of scope, thus not increasing the peak memory requirement.
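If avoiding the second full read ever becomes worthwhile, one possible follow-up is to stop reading once the `<AppendedData` tag has been seen. The sketch below is a standard-library illustration under that assumption; `ReadXmlHeader` is a hypothetical helper, not part of the PR, and it returns only the text that precedes the tag.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Reads the file in chunks and returns everything before "<AppendedData".
// If the tag never appears, the whole file is returned.
std::string
ReadXmlHeader(const std::string & fileName, std::size_t chunkSize = 4096)
{
  static const std::string tag = "<AppendedData";
  std::ifstream            in(fileName, std::ios::binary);
  std::string              header;
  std::string              chunk(chunkSize, '\0');
  while (in)
  {
    in.read(&chunk[0], static_cast<std::streamsize>(chunk.size()));
    const std::size_t got = static_cast<std::size_t>(in.gcount());
    if (got == 0)
    {
      break;
    }
    // Re-scan a small overlap so a tag split across two chunks is still found.
    const std::size_t searchFrom = header.size() > tag.size() ? header.size() - tag.size() : 0;
    header.append(chunk, 0, got);
    const std::size_t pos = header.find(tag, searchFrom);
    if (pos != std::string::npos)
    {
      header.resize(pos); // drop the tag and anything read past it
      break;
    }
  }
  return header;
}
```

For inline ASCII or base64 files the data lives inside the XML, so this only helps the raw-appended case.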

file << ">\n";

// Prepend a UInt32 block-size header (number of raw data bytes).
const auto blockSize = static_cast<uint32_t>(totalBytes);
Contributor

P2 uint32_t block-size header silently truncates for images >~4 GB

static_cast<uint32_t>(totalBytes) wraps around for images whose serialised byte count exceeds UINT32_MAX (~4 GiB). The resulting header value is wrong but no exception is thrown. An explicit range check would make the overflow visible:

if (totalBytes > static_cast<SizeType>(std::numeric_limits<uint32_t>::max()))
{
  itkExceptionMacro("Image exceeds 4 GB; use header_type=\"UInt64\" (not yet supported for writing).");
}
const auto blockSize = static_cast<uint32_t>(totalBytes);

Labels

area:Core Issues affecting the Core module area:IO Issues affecting the IO module area:Python wrapping Python bindings for a class type:Infrastructure Infrastructure/ecosystem related changes, such as CMake or buildbots type:Testing Ensure that the purpose of a class is met/the results on a wide set of test cases are correct

3 participants