Fix ConcurrentModificationException in ISO components #717

Open
chhil wants to merge 2 commits into jpos:main from TransactilityInc:fix/concurrent-modification-exceptions

@chhil chhil commented May 11, 2026

Add thread-safe dump() operations to ISOMsg, TLVList, FSDMsg, ISODatasetField, SecureKeyBlock, SecureKeySpec, CryptographicServiceMessage, and SimpleMsg.

Fixes race conditions where dump() iterates collections (Map.entrySet(), ArrayList) while other threads mutate them via set()/append()/addField().

Pattern applied:

  • dump() operations: create snapshot under lock, iterate snapshot
  • mutate operations: atomic synchronized blocks
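
The two bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the class `SnapshotDumpSketch`, its `String` values, and the output format are hypothetical stand-ins for the real message classes.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for a message class; not the actual ISOMsg code.
class SnapshotDumpSketch {
    private final Map<Integer, String> fields = new TreeMap<>();

    // Mutation: the whole update runs while holding the map's monitor.
    void set(int fieldNumber, String value) {
        synchronized (fields) {
            fields.put(fieldNumber, value);
        }
    }

    // Dump: copy the map while holding the lock, then iterate the private
    // copy without the lock, so writers are never blocked by slow output.
    String dump() {
        Map<Integer, String> snapshot;
        synchronized (fields) {
            snapshot = new TreeMap<>(fields);
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, String> e : snapshot.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        return sb.toString();
    }
}
```

The copy costs one allocation per dump, which is the trade-off against holding the lock for the full duration of the iteration.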

Includes concurrent tests demonstrating the fix with 2000 iterations per thread, 4 threads, and CountDownLatch coordination.
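
A test of the shape described here might look like the sketch below. It assumes a hypothetical `Msg` class and uses plain threads rather than the project's test framework; only the thread count, the iteration count, and the use of `CountDownLatch` come from the description.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

class ConcurrentDumpHarness {
    static final int THREADS = 4;
    static final int ITERATIONS = 2000;

    // Hypothetical object under test: mutators and dump share one monitor.
    static final class Msg {
        private final Map<Integer, String> fields = new TreeMap<>();
        void set(int n, String v) { synchronized (fields) { fields.put(n, v); } }
        void unset(int n) { synchronized (fields) { fields.remove(n); } }
        int dumpLength() {
            Map<Integer, String> snapshot;
            synchronized (fields) { snapshot = new TreeMap<>(fields); }
            int len = 0;
            for (String v : snapshot.values()) len += v.length();
            return len;
        }
    }

    // Returns the first exception thrown by any worker, or null if none.
    static Throwable run() {
        Msg msg = new Msg();
        CountDownLatch start = new CountDownLatch(1);       // start gate
        CountDownLatch done = new CountDownLatch(THREADS);  // completion gate
        AtomicReference<Throwable> failure = new AtomicReference<>();
        for (int t = 0; t < THREADS; t++) {
            final int id = t;
            new Thread(() -> {
                try {
                    start.await(); // all workers begin together
                    for (int i = 0; i < ITERATIONS; i++) {
                        if (id % 2 == 0) {           // even threads mutate
                            msg.set(i % 64, "v" + i);
                            if (i % 3 == 0) msg.unset((i + 1) % 64);
                        } else {                     // odd threads dump
                            msg.dumpLength();
                        }
                    }
                } catch (Throwable e) {
                    failure.compareAndSet(null, e);
                } finally {
                    done.countDown();
                }
            }).start();
        }
        start.countDown();
        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return e;
        }
        return failure.get();
    }
}
```

A run that returns `null` shows only that no exception surfaced in this run; a passing run of this kind is evidence, not proof, that a race is absent.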

@chhil chhil force-pushed the fix/concurrent-modification-exceptions branch from cc14feb to 99ca6c6 on May 11, 2026 06:27

Add thread-safe dump() operations to ISOMsg, TLVList, FSDMsg, ISODatasetField,
SecureKeyBlock, SecureKeySpec, CryptographicServiceMessage, and SimpleMsg.

Fixes race conditions where dump() iterates collections (Map.entrySet(),
ArrayList) while other threads mutate them via set()/append()/addField().

Pattern applied:
- dump() operations: create snapshot under lock, iterate snapshot
- mutate operations: atomic synchronized blocks

Includes concurrent tests demonstrating the fix with 2000 iterations
per thread, 4 threads, and CountDownLatch coordination.

Note: getFieldNumbers() uses TreeSet to maintain field ordering for XMLPackager.

Assisted-by: OpenClaw:minimax-coding-plan/MiniMax-M2.7
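
The note about `getFieldNumbers()` can be illustrated with a small sketch. The class name, the `HashMap` backing store, and the method bodies are assumptions made for illustration; only the use of a `TreeSet` for ordering comes from the commit message.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch; the real ISOMsg internals may differ.
class FieldNumbersSketch {
    // An unordered map is used here on purpose to show what TreeSet adds.
    private final Map<Integer, Object> fields = new HashMap<>();

    void set(int fieldNumber, Object value) {
        synchronized (fields) {
            fields.put(fieldNumber, value);
        }
    }

    // Copies the keys under the lock; TreeSet sorts them in ascending order,
    // so a caller that emits fields one by one sees a stable field order.
    Set<Integer> getFieldNumbers() {
        synchronized (fields) {
            return new TreeSet<>(fields.keySet());
        }
    }
}
```

The returned set is a copy, so callers can iterate it without holding the lock.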

@chhil chhil force-pushed the fix/concurrent-modification-exceptions branch from 99ca6c6 to e0a42fe on May 11, 2026 06:37

@alcarraz alcarraz left a comment

An alternative to this would be to just state that dump is not thread-safe. My only concern is about the extra pressure we may be putting on the GC.

I don't have a strong opinion on this, and I don't know if this would have an appreciable impact; I'm only mentioning it in case it wasn't considered. This only affects the dump changes.

I think the synchronized blocks are needed for set to be thread-safe. I'm not sure if we need that too, or how much the monitor locking would impact performance, as that method is heavily used.

Comment on lines +247 to +253
synchronized (fields) {
    Integer i = (Integer) c.getKey();
    fields.put (i, c);
    if (i > maxField)
        maxField = i;
    dirty = true;
}

I know this won't make much of a difference, but why not put the synchronized block just around fields.put?

Suggested change
- synchronized (fields) {
-     Integer i = (Integer) c.getKey();
-     fields.put (i, c);
-     if (i > maxField)
-         maxField = i;
-     dirty = true;
- }
+ Integer i = (Integer) c.getKey();
+ synchronized (fields) {
+     fields.put (i, c);
+ }
+ if (i > maxField)
+     maxField = i;
+ dirty = true;

Contributor Author

> An alternative to this would be to just state that dump is not thread-safe.

We see CMEs logged as warnings. At least for jPOS components it should work; user-defined components are not jPOS's responsibility.

> but why not put the synchronized block just around fields.put?

My tests were failing. I distinctly recall having to push the dirty flag inside too.

maxField must stay inside the synchronized block. It has no volatile keyword and no synchronization on read (only via recalcBitMap() → getMaxField()). If maxField = i moved outside the lock, another thread could enter getFieldNumbers() (which does acquire the lock) and call recalcBitMap(), which iterates fields.keySet() — but sees a stale maxField. This would cause the bitmap to be recalculated with an incorrect range, potentially missing fields.
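
The argument above, that the map update and the derived state must change together, can be sketched as follows. The class, the `String` values, and the `invariantHolds()` helper are hypothetical; the field names `fields`, `maxField`, and `dirty` are taken from the diff under discussion.

```java
import java.util.TreeMap;

// Hypothetical sketch of keeping derived state consistent with the map.
class CompoundUpdateSketch {
    private final TreeMap<Integer, String> fields = new TreeMap<>();
    private int maxField = -1;   // guarded by the monitor of `fields`
    private boolean dirty;       // guarded by the monitor of `fields`

    // The put, the maxField update, and the dirty flag form one atomic step.
    void set(int fieldNumber, String value) {
        synchronized (fields) {
            fields.put(fieldNumber, value);
            if (fieldNumber > maxField) {
                maxField = fieldNumber;
            }
            dirty = true;
        }
    }

    // Readers take the same monitor, so they never observe a map that
    // already contains a key larger than the published maxField.
    int getMaxField() {
        synchronized (fields) {
            return maxField;
        }
    }

    boolean isDirty() {
        synchronized (fields) {
            return dirty;
        }
    }

    // Checks the invariant: maxField matches the largest key in the map.
    boolean invariantHolds() {
        synchronized (fields) {
            return fields.isEmpty() ? maxField == -1
                                    : maxField == fields.lastKey();
        }
    }
}
```

The guarantee holds only if every reader of `maxField` and `dirty` also takes the lock, which is the point raised in the reply that follows.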

@alcarraz alcarraz May 11, 2026

> > but why not put the synchronized block just around fields.put?
>
> My tests were failing. I distinctly recall having to push the dirty flag inside too.
>
> maxField must stay inside the synchronized block. It has no volatile keyword and no synchronization on read (only via recalcBitMap() → getMaxField()). If maxField = i moved outside the lock, another thread could enter getFieldNumbers() (which does acquire the lock) and call recalcBitMap(), which iterates fields.keySet() but sees a stale maxField. This would cause the bitmap to be recalculated with an incorrect range, potentially missing fields.

Yes, I was thinking only about the CME when I wrote that, since I didn't see any other place where the dirty flag was accessed under synchronized(fields). Shouldn't recalcBitMap also be synchronized on fields then? And unset and getMaxField?

ar-agt commented May 11, 2026

I think we need to be careful with the framing here. ISOMsg itself is not currently designed to be thread-safe; in normal jPOS usage, thread isolation is provided at a higher level by the transaction-manager pattern. So I'm reluctant to add partial synchronization to ISOMsg and related classes, because it may create a stronger "thread-safe" impression than the codebase actually guarantees, which could lead users into unsafe usage patterns.

I also found at least one concrete issue in the current changes: the new ISODatasetField test fails locally. The implementation snapshots datasets under synchronized (datasets), but addDataset, removeDataset, and setValue still mutate the same ArrayList without that lock, so the synchronization does not reliably protect the collection.
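
To make the gap concrete: a snapshot taken under a lock is only protected if every path that touches the list takes the same lock. The sketch below shows the consistent form, with a hypothetical `DatasetHolderSketch` class and `String` elements standing in for ISODatasetField and its datasets.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; not the actual ISODatasetField implementation.
class DatasetHolderSketch {
    private final List<String> datasets = new ArrayList<>();

    // Every mutator takes the same monitor as the snapshot below. If even
    // one mutator skipped the lock, the copy in snapshot() could still race.
    void addDataset(String dataset) {
        synchronized (datasets) {
            datasets.add(dataset);
        }
    }

    boolean removeDataset(String dataset) {
        synchronized (datasets) {
            return datasets.remove(dataset);
        }
    }

    // Returns an independent copy that is safe to iterate without the lock.
    List<String> snapshot() {
        synchronized (datasets) {
            return new ArrayList<>(datasets);
        }
    }
}
```

Even in this form the elements themselves are shared, so the sketch covers only structural changes to the list, not mutation of the objects it holds.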

More broadly, several changes appear to protect only the dump-side iteration while leaving other access/mutation paths unsynchronized or using different locking scopes. That makes the fix incomplete if the goal is true thread safety.

Given that, I don't think we should merge this as-is. The safer direction is probably to document/flag that these objects are not thread-safe and should not be mutated while being dumped, rather than introducing partial synchronization that suggests a guarantee we do not actually provide.

This is still a useful report and helped clarify an important boundary in the API.

…ferences

Fixes a lock mismatch in ISOMsg where pack()/unpack() used
synchronized(this) while set()/unset() used synchronized(fields),
allowing concurrent map modification during bitmap recalculation.
Unifies all mutating and reading paths to synchronize on the same
monitor (fields).

ISOMsg.java:
- Changed pack(), unpack(byte[]), unpack(InputStream) from
  synchronized(this) → synchronized(fields)
- Added synchronized(fields) to getMaxField() and recalcBitMap() (were
  unsynchronized)
- Added synchronized(fields) to unset(int) (was unsynchronized, raced
  with set())
- Made maxField volatile for visibility across threads
- Comprehensive javadoc on all changed methods explaining the
  synchronization contract

ISODatasetField.java:
- Added synchronized(datasets) to addDataset(), removeDataset(),
  hasDatasets(), getDataset(int), getDatasets(int), setValue();
  pre-existing race where only dump() was locked
- Documented that all getter methods return live references, not copies,
  requiring external synchronization for concurrent mutation
- Optimized getDatasets(int) zero-match path to return
  Collections.emptyList() (zero allocation)

ISOMsgPackConcurrentTest.java:
- New 6-test suite targeting the pack/unpack + set/unset concurrency
  race that existing tests (ISOMsgConcurrentTest) did not exercise
- Each test isolates a specific synchronization path with detailed
  javadoc explaining what it verifies and why
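
The lock mismatch described in this commit message comes down to `synchronized(this)` and `synchronized(fields)` being two different monitors. A minimal sketch, using a hypothetical class and method names, makes this visible with `Thread.holdsLock`:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the mismatch; not the actual ISOMsg code.
class LockMismatchSketch {
    private final Map<Integer, String> fields = new TreeMap<>();

    // "Before" shape: a synchronized method locks the monitor of `this`.
    // A thread inside it does NOT hold the monitor of `fields`, so a
    // concurrent set() guarded by synchronized (fields) is not excluded.
    synchronized boolean packLockedOnThis() {
        return Thread.holdsLock(fields);
    }

    // "After" shape: the same monitor that the mutators use.
    boolean packLockedOnFields() {
        synchronized (fields) {
            return Thread.holdsLock(fields);
        }
    }

    void set(int fieldNumber, String value) {
        synchronized (fields) {
            fields.put(fieldNumber, value);
        }
    }
}
```

In this sketch `packLockedOnThis()` returns `false` and `packLockedOnFields()` returns `true`, which is why moving every path to one monitor matters more than the choice of which monitor it is.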

chhil commented May 12, 2026

> I also found at least one concrete issue in the current changes: the new ISODatasetField test fails locally. The implementation snapshots datasets under synchronized (datasets), but addDataset, removeDataset, and setValue still mutate the same ArrayList without that lock, so the synchronization does not reliably protect the collection.

Fixed in 9f6f891

3 participants