Fix ConcurrentModificationException in ISO components#717
Conversation
cc14feb to
99ca6c6
Compare
Add thread-safe dump() operations to ISOMsg, TLVList, FSDMsg, ISODatasetField, SecureKeyBlock, SecureKeySpec, CryptographicServiceMessage, and SimpleMsg. Fixes race conditions where dump() iterates collections (Map.entrySet(), ArrayList) while other threads mutate them via set()/append()/addField(). Pattern applied: - dump() operations: create snapshot under lock, iterate snapshot - mutate operations: atomic synchronized blocks Includes concurrent tests demonstrating the fix with 2000 iterations per thread, 4 threads, and CountDownLatch coordination. Note: getFieldNumbers() uses TreeSet to maintain field ordering for XMLPackager. Assisted-by: OpenClaw:minimax-coding-plan/MiniMax-M2.7
99ca6c6 to
e0a42fe
Compare
There was a problem hiding this comment.
An alternative to this would be to just state that dump is not thread-safe. My only concern is about the extra pressure we may be putting on the GC.
I don't have a strong opinion on this, and I don't know if this would have an appreciable impact; I'm only mentioning it in case it wasn't considered. This only affects the dump changes.
I think the synchronized blocks are needed for set to be thread-safe. I'm not sure if we need that too, or how much the monitor locking would impact performance, as that method is heavily used.
| synchronized (fields) { | ||
| Integer i = (Integer) c.getKey(); | ||
| fields.put (i, c); | ||
| if (i > maxField) | ||
| maxField = i; | ||
| dirty = true; | ||
| } |
There was a problem hiding this comment.
I know this woun't make much of a difference, but why not putting the synchronized block just arround fields.put?
| synchronized (fields) { | |
| Integer i = (Integer) c.getKey(); | |
| fields.put (i, c); | |
| if (i > maxField) | |
| maxField = i; | |
| dirty = true; | |
| } | |
| Integer i = (Integer) c.getKey(); | |
| synchronized (fields) { | |
| fields.put (i, c);5 | |
| } | |
| if (i > maxField) | |
| maxField = i; | |
| dirty = true; |
There was a problem hiding this comment.
An alternative to this would be to just state that dump is not thread-safe.
We see CME's as warning in the log. At least for jpos components it should work, but user ones is not jpos responsibility.
but why not putting the synchronized block just arround fields.put?
My test were failing. I distinctly recall having to push the dirty flag inside too.
maxField must stay inside the synchronized block. It has no volatile keyword and no synchronization on read (only via recalcBitMap() → getMaxField()). If maxField = i moved outside the lock, another thread could enter getFieldNumbers() (which does acquire the lock) and call recalcBitMap(), which iterates fields.keySet() — but sees a stale maxField. This would cause the bitmap to be recalculated with an incorrect range, potentially missing fields.
There was a problem hiding this comment.
but why not putting the synchronized block just arround fields.put?
My test were failing. I distinctly recall having to push the dirty flag inside too.
maxField must stay inside the synchronized block. It has no volatile keyword and no synchronization on read (only via recalcBitMap() → getMaxField()). If maxField = i moved outside the lock, another thread could enter getFieldNumbers() (which does acquire the lock) and call recalcBitMap(), which iterates fields.keySet() — but sees a stale maxField. This would cause the bitmap to be recalculated with an incorrect range, potentially missing fields.
Yes, I was thinking only about the CME when I wrote that. Since I didn't see any other place where synchronized(fields) was accessing the dirty flags. Shouldn't recalcBitMap also be synchronized on fields then? And unset, getMaxField?
|
I think we need to be careful with the framing here. I also found at least one concrete issue in the current changes: the new More broadly, several changes appear to protect only the dump-side iteration while leaving other access/mutation paths unsynchronized or using different locking scopes. That makes the fix incomplete if the goal is true thread safety. Given that, I don't think we should merge this as-is. The safer direction is probably to document/flag that these objects are not thread-safe and should not be mutated while being dumped, rather than introducing partial synchronization that suggests a guarantee we do not actually provide. This is still a useful report and helped clarify an important boundary in the API. |
…ferences Fixes a lock mismatch in ISOMsg where pack()/unpack() used synchronized(this) while set()/unset() used synchronized(fields), allowing concurrent map modification during bitmap recalculation. Unifies all mutating and reading paths to synchronize on the same monitor (fields). ISOMsg.java: - Changed pack(), unpack(byte[]), unpack(InputStream) from synchronized(this) → synchronized(fields) - Added synchronized(fields) to getMaxField() and recalcBitMap() (were unsynchronized) - Added synchronized(fields) to unset(int) (was unsynchronized, raced with set()) - Made maxField volatile for visibility across threads - Comprehensive javadoc on all changed methods explaining the synchronization contract ISODatasetField.java: - Added synchronized(datasets) to addDataset(), removeDataset(), hasDatasets(), getDataset(int), getDatasets(int), setValue() — pre-existing race where only dump() was locked - Documented that all getter methods return live references, not copies, requiring external synchronization for concurrent mutation - Optimized getDatasets(int) zero-match path to return Collections.emptyList() (zero allocation) ISOMsgPackConcurrentTest.java: - New 6-test suite targeting the pack/unpack + set/unset concurrency race that existing tests (ISOMsgConcurrentTest) did not exercise - Each test isolates a specific synchronization path with detailed javadoc explaining what it verifies and why
Fixed in 9f6f891 |
Add thread-safe dump() operations to ISOMsg, TLVList, FSDMsg, ISODatasetField, SecureKeyBlock, SecureKeySpec, CryptographicServiceMessage, and SimpleMsg.
Fixes race conditions where dump() iterates collections (Map.entrySet(), ArrayList) while other threads mutate them via set()/append()/addField().
Pattern applied:
Includes concurrent tests demonstrating the fix with 2000 iterations per thread, 4 threads, and CountDownLatch coordination.