Problem
MCMC is unusably slow on typical metagenomics datasets (e.g. 1980 features, 180 samples). With the default n_iter=10000, the process is OOM-killed in Docker. Even with n_iter reduced to 1000, it is orders of magnitude slower than GA, ACO, or SA.
Root causes
- Not parallelized — runs entirely sequentially on one core
- SBS (Sequential Backward Selection) iterates from all features down to nmin, running full MCMC at each step — O(features × n_iter) total iterations
- Memory usage — stores trace data for all iterations × all features
- No GPU support — unlike GA/Beam
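To put rough numbers on the SBS and trace-memory points above, here is a back-of-envelope sketch. It is not taken from the codebase: the function name `sbs_mcmc_cost`, the default `nmin=1`, the inclusive step count, and the 8 bytes per stored value (a dense float64 trace) are all assumptions made for illustration.

```python
def sbs_mcmc_cost(n_features, n_iter, nmin=1, bytes_per_value=8):
    """Rough cost estimate for SBS wrapped around MCMC (hypothetical helper)."""
    # Assumption: one full MCMC run per SBS step, counting every feature-set
    # size from n_features down to nmin inclusive.
    sbs_steps = n_features - nmin + 1
    total_iterations = sbs_steps * n_iter
    # Assumption: a dense trace stores one value per feature per iteration
    # for a single MCMC run (8 bytes each if float64).
    trace_bytes = n_iter * n_features * bytes_per_value
    return total_iterations, trace_bytes


if __name__ == "__main__":
    for n_iter in (10_000, 1_000):
        iters, nbytes = sbs_mcmc_cost(1980, n_iter)
        print(f"n_iter={n_iter}: ~{iters:,} MCMC iterations, "
              f"single-run trace ~{nbytes / 1e6:.1f} MB")
```

Under these assumptions, the default n_iter=10000 on 1980 features works out to about 19.8 million sequential MCMC iterations and roughly 158 MB for a single dense trace, before any other allocations. The real figures depend on how the trace is actually stored and on the configured nmin.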
Benchmark comparison (Qin2014, 1980 features, 180 samples)
| Method | Time | Memory |
|---|---|---|
| ILS | 0.05s | ~10MB |
| LASSO | 0.1s | ~10MB |
| SA | 0.3s | ~10MB |
| GA | 0.5s | ~50MB |
| ACO | 7.1s | ~50MB |
| MCMC | OOM (>256MB) or minutes | >>256MB |
Potential fixes