Optimization: Parallelize GXS message deserialization using OpenMP (4x speedup) #246

jolavillette · 2026-01-19T16:58:44Z

Code by Antigravity

This PR requires RetroShare pr/3136

Description

This PR significantly optimizes the loading performance of GXS services (Channels, Forums, Boards) by parallelizing the message deserialization process.
Profiling identified RsGenExchange::getMsgData as a major bottleneck during the loading phase, where deserialization was performed sequentially on a single thread. This PR introduces OpenMP to parallelize this workload across available CPU cores.

Changes

Global OpenMP Support (retroshare.pri, see the corresponding RetroShare PR): Enabled -fopenmp compiler and linker flags globally for Linux and Windows (MSYS2) builds. This ensures consistent OpenMP support across libretroshare and the GUI executable.
Parallel Deserialization (libretroshare): Refactored the main loop in RsGenExchange::getMsgData to use #pragma omp parallel for. This allows concurrent deserialization of RsGxsMsgItem objects.
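
As an illustration of the loop shape described above, here is a minimal sketch, not the actual RsGenExchange code. `RawMsg`, `Item`, `deserialise` and `deserialiseAll` are hypothetical stand-ins; the only assumption carried over from the PR is that each iteration reads its own input buffer and writes its own slot of a pre-sized result vector.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for a serialized blob and for RsGxsMsgItem.
struct RawMsg { std::vector<std::uint8_t> bytes; };
struct Item   { std::string payload; };

// Hypothetical deserializer: reads only its argument, touches no shared state.
std::unique_ptr<Item> deserialise(const RawMsg& raw)
{
    if (raw.bytes.empty())
        return nullptr;                       // treat an empty blob as malformed
    auto item = std::make_unique<Item>();
    item->payload.assign(raw.bytes.begin(), raw.bytes.end());
    return item;
}

std::vector<std::unique_ptr<Item>> deserialiseAll(const std::vector<RawMsg>& raws)
{
    // Pre-size the output so that iteration i writes only slot i
    // and the vector is never resized inside the parallel region.
    std::vector<std::unique_ptr<Item>> tempItems(raws.size());

    // A signed loop index keeps the loop acceptable to older OpenMP versions.
    const long n = static_cast<long>(raws.size());
    #pragma omp parallel for schedule(dynamic)
    for (long i = 0; i < n; ++i)
    {
        const std::size_t idx = static_cast<std::size_t>(i);
        tempItems[idx] = deserialise(raws[idx]);
    }
    return tempItems;
}
```

When the file is compiled without -fopenmp the pragma is ignored and the loop simply runs serially, so the result is the same either way.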

Performance Results

Benchmarks were performed on an Intel Xeon E3-1230 v6 (4 cores / 8 threads) running Ubuntu 24.04, and on an Intel 4790K (4 cores / 8 threads) running Windows 10.
Example: deserialization time on the Xeon:

| Dataset Size | Serial (Before) | Parallel (After) | Speedup |
| --- | --- | --- | --- |
| Large (~6000 items) | ~1900 ms | ~440 ms | 4.3x |
| Medium (~3000 items) | ~265 ms | ~65 ms | 4.0x |
| Small (~500 items) | ~388 ms | ~80 ms | 4.8x |
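
The speedup column is the serial time divided by the parallel time. A minimal sketch of how such a measurement could be taken is shown below; `elapsedMs` and `speedup` are hypothetical helpers, not the tracing code added in this PR.

```cpp
#include <chrono>

// Hypothetical helper: wall-clock time of one call to `work`, in milliseconds.
template <typename Fn>
double elapsedMs(Fn&& work)
{
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Speedup as used in the table: serial time over parallel time.
inline double speedup(double serialMs, double parallelMs)
{
    return serialMs / parallelMs;
}
```

For the large dataset this gives 1900 / 440, or about 4.3.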

Impact

Reduced Loading Time: The "User Perceived" total loading time for large groups is reduced by more than 50% (e.g., from ~13s to ~6s) on Win10 4790K for a channel with 6000 records.
Scalability: On the two 4-core test machines the speedup was about 4x, which suggests the gain scales with the number of physical cores.
Scope: Applies to all GXS-based services (Channels, Forums, Boards, Wikis).

Notes

UI performance was also analyzed. Moving std::sort operations from the UI thread to a worker thread was tested but yielded negligible gains (~50ms) compared to Qt's rendering cost. Therefore, UI-specific changes were reverted to maintain code simplicity.

csoler · 2026-01-19T18:39:26Z

OpenMP is a great tool for parallelizing (I use it often), but you need to make sure that:

- there's no static involved below the parallel calls
- the code below is 100% re-entrant. In particular, is that the case for sqlcipher?

According to what you can find on the net, the following is unsafe for sqlite (and therefore sqlcipher):

- Recursive calls into SQLCipher using the same sqlite3*
- Using a connection simultaneously from multiple threads without coordination

So make sure that your code does things properly. It may work on some machines and not on others.
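
To make the first concern concrete, here is a small sketch of why a static below a parallel call is a problem. Both functions are hypothetical and unrelated to the RetroShare code base: the first shares one static buffer between all callers, the second keeps its buffer on the caller's stack and is therefore re-entrant.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// NOT re-entrant: every caller shares the same static buffer, so two threads
// calling this at the same time can overwrite each other's data.
std::string formatIdUnsafe(int id)
{
    static char buf[32];
    std::snprintf(buf, sizeof(buf), "msg-%d", id);
    return std::string(buf);
}

// Re-entrant: the buffer lives on each caller's own stack.
std::string formatIdSafe(int id)
{
    char buf[32];
    std::snprintf(buf, sizeof(buf), "msg-%d", id);
    return std::string(buf);
}

std::vector<std::string> formatAll(int count)
{
    std::vector<std::string> out(static_cast<std::size_t>(count));
    // Only the re-entrant variant is safe to call from the parallel loop.
    #pragma omp parallel for
    for (int i = 0; i < count; ++i)
        out[static_cast<std::size_t>(i)] = formatIdSafe(i);
    return out;
}
```

The unsafe variant gives correct results when called from a single thread, which is why this kind of bug may show up on some machines and not on others.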

jolavillette · 2026-01-19T18:56:20Z

Note that in this PR only rsgenexchange is changed.
All other involved files have been modified only to trace the code and measure the speed improvement.

So here is what Antigravity says:

"I have reviewed the OpenMP parallelized code in libretroshare/src/gxs/rsgenexchange.cc (around line 1559) considering Cyril's feedback.

Safety Analysis:

1. No static variables involved below:
   - The parallel loop calls mSerialiser->deserialise(...).
   - I verified the call chain: RsSerialiser -> RsServiceSerializer -> RsGxsForumSerialiser (for forums).
   - These methods utilize local contexts (SerializeContext) and do not modify the state of the serializer object itself.
   - Memory allocation (new RsItem) uses the standard allocator (thread-safe) or a static mutex (if DO_STATISTICS is enabled), so it is safe.
2. 100% Re-entrant:
   - The deserialization methods (RsTypeSerializer::serial_process) operate on memory buffers passed as arguments.
   - No shared memory areas are modified without locks (the results vector tempItems is accessed via disjoint indices).
3. SQLCipher (Recursive/Simultaneous usage):
   - The database access (mDataAccess->getMsgData) is performed BEFORE the parallel loop.
   - The loop itself performs only CPU-intensive processing (binary deserialization) in memory. It does not touch SQLite/SQLCipher.

Conclusion: The code appears to comply with the stated constraints and should not cause crashes or corruption related to SQLCipher or static variables. The parallelization here is strictly computational (deserialization).

However, please ensure that all services using RsGenExchange (Forums, Channels, etc.) strictly use "clean" serializers (like the Forums one I verified), which follows the standard pattern in the current codebase."
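
The ordering described in the analysis (database access first, CPU-only work afterwards) can be sketched as a two-phase function. This is an illustration under stated assumptions, not the RetroShare implementation: `BlobStore`, `checksum` and `loadAndProcess` are hypothetical, and the store merely stands in for a database connection that must not be used from several threads at once.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical stand-in for a database layer; access is serialized by a mutex.
class BlobStore {
public:
    explicit BlobStore(std::vector<std::vector<std::uint8_t>> blobs)
        : blobs_(std::move(blobs)) {}

    std::vector<std::vector<std::uint8_t>> fetchAll()
    {
        std::lock_guard<std::mutex> lock(mtx_);
        return blobs_;                         // copy out; callers own the data
    }

private:
    std::mutex mtx_;
    std::vector<std::vector<std::uint8_t>> blobs_;
};

// CPU-only work on one in-memory buffer; no I/O, no shared state.
std::uint32_t checksum(const std::vector<std::uint8_t>& blob)
{
    return std::accumulate(blob.begin(), blob.end(), std::uint32_t{0});
}

std::vector<std::uint32_t> loadAndProcess(BlobStore& store)
{
    // Phase 1 (serial): the only place where the store is touched.
    const auto blobs = store.fetchAll();

    // Phase 2 (parallel): each iteration reads blobs[i] and writes results[i].
    std::vector<std::uint32_t> results(blobs.size());
    const long n = static_cast<long>(blobs.size());
    #pragma omp parallel for
    for (long i = 0; i < n; ++i)
    {
        const std::size_t idx = static_cast<std::size_t>(i);
        results[idx] = checksum(blobs[idx]);
    }
    return results;
}
```

Keeping the store out of the parallel region means the loop does not depend on whether the underlying connection is thread-safe.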

jolavillette added 7 commits January 14, 2026 22:33

add trace (46b221e)
Merge branch 'master' into GxsPerfV1 (b3f9b4c)
remove std::endl in RsDbg() calls (448c67c)
remove debug message in retrodb.cc (0b5f793)
Optimize getMsgData using OpenMP parallelization (66261ee)
Build: Enable OpenMP support in .pro file (c2945fe)
Build: Remove OpenMP flags from project file (moved to global pri) (2e95b5b)
