Description
I am using the OpenCL build, because the latest CUDA build still requires the CUDA 12 shared libraries even though CUDA 13 was released two months before KataGo 1.16.4 came out, and I cannot downgrade my CUDA installation for this.
$ ./katago runtests
Running rng and hash tests
Running fancy math tests
Running base64 tests
Running thread tests
Running board IO tests
Running board basic tests
Running board area tests
Running rules tests
Running board undo test
Running board handicap test
Running board stress test
Running sgf tests
Running basic symmetries tests
Running board symmetry tests
Running symmetry difference tests
Running board replay test
Not being run out of git repo, skipping config parsing tests
All tests passed
$ ./katago gtp -config default_gtp.cfg -model katago-network.bin.gz
KataGo v1.16.4
Using TrompTaylor rules initially, unless GTP/GUI overrides this
version
quit
KataGo seems to start up, but it does not react to any commands and has to be terminated with Ctrl+C. As shown above, I tried issuing the "version" and "quit" commands.
Also, is it normal that there is no first-time startup tuning run? Just wondering.
The model is the "stable latest - recommended" katago network.
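To make the unresponsive GTP session above easier to reproduce, the same two commands can be piped in non-interactively with a time limit. This is only a sketch: `probe_gtp` is a hypothetical helper name, it assumes `timeout` from GNU coreutils is available, and the time limit is an arbitrary choice that may need to be generous if a tuning run does start.

```shell
#!/bin/sh
# Hypothetical helper (not part of KataGo): send the GTP commands "version"
# and "quit" on stdin, and give up if the program has not exited in time.
# Usage: probe_gtp SECONDS COMMAND [ARGS...]
probe_gtp() {
    secs=$1
    shift
    # The pipeline's exit status is that of `timeout`, the last command in it.
    printf 'version\nquit\n' | timeout "$secs" "$@"
    status=$?
    # GNU timeout exits with 124 when the time limit was reached.
    if [ "$status" -eq 124 ]; then
        echo "probe: no exit within ${secs}s, the process appears to hang" >&2
    fi
    return "$status"
}
```

For this report it would be called as `probe_gtp 120 ./katago gtp -config default_gtp.cfg -model katago-network.bin.gz`. An exit status of 124 would indicate that KataGo never acted on "quit" within the limit.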
Trying to re-tune via the "benchmark" command hangs at this point:
$ ./katago benchmark -config default_gtp.cfg -model kata1-b28c512nbt-s12283775232-d5679728027.bin.gz
2026-01-25 13:48:15+0100: Running with following config:
allowResignation = true
lagBuffer = 1.0
logAllGTPCommunication = true
logDir = gtp_logs
logSearchInfo = true
logSearchInfoForChosenMove = false
logToStderr = false
maxTimePondering = 60.0
maxVisits = 500
numSearchThreads = 6
ponderingEnabled = false
resignConsecTurns = 3
resignThreshold = -0.90
rules = tromp-taylor
searchFactorAfterOnePass = 0.50
searchFactorAfterTwoPass = 0.25
searchFactorWhenWinning = 0.40
searchFactorWhenWinningThreshold = 0.95
2026-01-25 13:48:15+0100: Loading model and initializing benchmark...
2026-01-25 13:48:15+0100: Testing with default positions for board size: 19
2026-01-25 13:48:15+0100: nnRandSeed0 = 9546287206378380450
2026-01-25 13:48:15+0100: After dedups: nnModelFile0 = kata1-b28c512nbt-s12283775232-d5679728027.bin.gz useFP16 auto useNHWC auto
2026-01-25 13:48:15+0100: Initializing neural net buffer to be size 19 * 19 exactly
Running the above under strace shows this as the final line:
futex(0x763de08a4878, FUTEX_WAIT_PRIVATE, 1, NULL
so KataGo may be stuck in a deadlock.
nvidia-smi shows almost no GPU or VRAM utilization, so this does not look like a resource shortage.