kosumosu commented May 29, 2022
- Fixed a multithreading problem in l_studio.cpp where a cache entry was not locked and could be destroyed in a callback on the async IO thread (should affect all architectures).
- Added an alternative thread-safe collections implementation, since e2k does not support 128-bit atomics. The new implementation could be improved, though, since atomic shared_ptr operations seem to use mutexes internally. At least it works.
- Simplified the code handling sequential jobs in viewrender.cpp, since the thread pool supports serialization of jobs anyway.
- Plus some smaller changes (see commit messages).
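To illustrate the first point, here is a minimal sketch of how shared ownership keeps a cache entry alive while an async callback is still using it. This is not the code from l_studio.cpp: `CacheEntry`, `Cache`, and `LoadAsyncThenEvict` are hypothetical names made up for the example, and a plain `std::thread` stands in for the async IO thread.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <unordered_map>

// Hypothetical stand-in for a cached object such as CColorMeshData.
struct CacheEntry {
    int payload = 0;
};

// Hypothetical cache that owns its entries through shared_ptr, so that
// callers and callbacks can each hold their own reference.
class Cache {
public:
    std::shared_ptr<CacheEntry> FindOrCreate(int key) {
        std::lock_guard<std::mutex> lock(m_mutex);
        auto &slot = m_entries[key];
        if (!slot)
            slot = std::make_shared<CacheEntry>();
        return slot;
    }

    // Drops the cache's reference; the object is destroyed only when the
    // last outstanding shared_ptr goes away.
    void Evict(int key) {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_entries.erase(key);
    }

private:
    std::mutex m_mutex;
    std::unordered_map<int, std::shared_ptr<CacheEntry>> m_entries;
};

// Starts a "callback" on another thread, then evicts the entry while the
// callback may still be running.
int LoadAsyncThenEvict(Cache &cache, int key) {
    std::shared_ptr<CacheEntry> entry = cache.FindOrCreate(key);
    assert(entry != nullptr);

    int result = 0;
    // The lambda captures its own copy of the shared_ptr, so the entry
    // cannot be destroyed underneath it.
    std::thread io([entry, &result] {
        entry->payload = 42;
        result = entry->payload;
    });

    entry.reset();     // caller drops its reference
    cache.Evict(key);  // cache drops its reference
    io.join();         // callback still saw a live object
    return result;
}
```

With a raw pointer in the capture instead, the `Evict` call could free the entry while the callback is still writing to it, which is the kind of use-after-free described above.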
  void InitColormeshParams( ModelInstance_t &instance, studiohwdata_t *pStudioHWData, colormeshparams_t *pColorMeshParams );
- CColorMeshData *FindOrCreateStaticPropColorData( ModelInstanceHandle_t handle );
+ std::shared_ptr<CColorMeshData> FindOrCreateStaticPropColorData( ModelInstanceHandle_t handle );
I would advise against using std:: stuff in the engine. You should instead use one of the custom classes in public/tierX (maybe CRefPtr); that way someone can later go and optimize that class and have it affect the entire codebase.
I found no ready-to-use primitives there. CRefPtr and the other classes would require CColorMeshData to implement ref counting inside it, and that ref counting would also have to be thread-safe. I'd rather rely on std::shared_ptr instead. First, correctness is more important than speed. Second, I don't think it will impact performance here: there's plenty of code between creation and release.
I would insist on accepting it as is and improving it in another PR.
//-----------------------------------------------------------------------------
PLATFORM_INTERFACE bool RunTSQueueTests( int nListSize = 10000, int nTests = 1 );
These 2 lines need to go in your tslist_alternative file; it's redefined in tslists_atomics.h.
So what FPS boost do we get going from one core to more cores (4 or 8)?
On an Elbrus-801 PC (Elbrus-8C, 32 GB DDR3-ECC, Radeon R7 240 2 GB), FPS increases by 2-3 times.
...since e2k does not support 128-bit atomics. Seems like atomics for shared_ptr are NOT lock-free though...
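For anyone who wants to check the lock-free claim on their own toolchain, here is a small sketch using the standard free-function atomics for `std::shared_ptr` from `<memory>`. These exist in C++11 through C++17 and are deprecated in C++20. `Node`, `SharedPtrAtomicsAreLockFree`, and `PublishAndRead` are hypothetical names for this example, not code from the PR, and the result of the lock-free query depends on the standard library implementation.

```cpp
#include <cassert>
#include <memory>

// Hypothetical payload type for the example.
struct Node {
    int value;
};

// Asks the standard library whether atomic operations on this shared_ptr
// are lock-free. The answer is implementation-defined.
bool SharedPtrAtomicsAreLockFree() {
    std::shared_ptr<Node> p = std::make_shared<Node>(Node{1});
    return std::atomic_is_lock_free(&p);
}

// Publishes a new node through an atomic store and reads it back through
// an atomic load, the way a thread-safe container might swap its head.
int PublishAndRead(int value) {
    std::shared_ptr<Node> head;
    std::atomic_store(&head, std::make_shared<Node>(Node{value}));
    std::shared_ptr<Node> snapshot = std::atomic_load(&head);
    assert(snapshot != nullptr);
    return snapshot->value;
}
```

The operations stay correct either way; whether they are lock-free only affects how much contention they add on a hot path.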
Unlocked access caused access to an object already deleted from another thread
Reverted this commit. I cannot check how it impacts performance, since the game just doesn't work with nvidia drivers on linux amd64. So far I have figured out that it locks a vertex buffer while it is still locked on another thread. This usually happens when rendering quads for text, but that is an unrelated issue.
Can confirm that e2k performance improved significantly. On Elbrus-8CB (e2k_v5 arch) with a Radeon RX 570, playing on the cs_office map, I used to get 11-13 fps in open spaces and 19-20 fps inside buildings. After building the e2k_fixup branch I get 20-25 and 30-40 fps respectively.
Hi everyone, so I've compiled both the main branch and this PR and haven't found any measurable drop in performance on my system (AMD Ryzen 5800HS, Radeon Vega 8 Integrated Graphics, Arch Linux kernel 6.5.2). Both builds felt pretty much the same while playing (de_dust2, CT respawn -> main gates with bots). I've recorded metrics for the first 30 seconds of the match using MangoHUD, and here are the results:
Run 1: Main branch
Run 2: PR #32
Hope this helps!
csgo_linux64_2023-09-18_02-43-43_summary.csv
Considering this project hasn't gotten any commits in 15 months, you should fork this and continue work on it :)


