feat: JOSS paper draft - paper/, figures, CONTRIBUTING.md, Python data prep scripts #20
Draft
OldCrow wants to merge 22 commits into
- paper/paper.md: JOSS submission draft (v3.7.0)
- paper/paper.bib: 16-entry bibliography
- paper/figures/generate_figures.py: generates all three paper figures
(speedup comparison, ECME convergence, VonMises boundary analysis)
- paper/figures/figure{1,2,3}_{speedup,convergence,wind_boundary}.{pdf,png}
- CONTRIBUTING.md: build/test instructions for macOS and Windows
- scripts/prepare_dax_data.py: Python equivalent of prepare_dax_data.R
- scripts/prepare_wind_data.py: Python equivalent of prepare_wind_data.R
Co-Authored-By: Oz <oz-agent@warp.dev>
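For readers unfamiliar with the quantity the data prep scripts are described as producing, a log-return series can be sketched as below. This is a minimal, standard-library-only illustration and is not taken from scripts/prepare_dax_data.py: the function name `log_returns` and the sample prices are hypothetical.

```python
import math


def log_returns(prices):
    """Return r_t = ln(p_t / p_{t-1}) for each pair of consecutive prices."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


# Hypothetical closing prices, for illustration only (not real DAX data).
sample = [100.0, 101.0, 99.5, 99.5]
returns = log_returns(sample)
```

Log-returns are additive over time, so the sum of the series equals the log of the last price over the first, which is a convenient sanity check on prepared data.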
- paper.md: add Python bindings paragraph in Software Design section
- paper.md: replace [Author Name] placeholder with Gary Wolfman
- paper.bib: add @software{pylibhmm} entry (v0.4.0)
Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
- Summary: note JAHMM port origin and MASc research context
- Acknowledgements: RMC Computer Security Lab, JAHMM, Kevin Forest (YASWIN)
- paper.bib: add @software{JAHMM} entry (attribution to confirm)
Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
- paper/arxiv/libhmm_arxiv.tex: full draft (~772 lines)
- paper/arxiv/libhmm_arxiv.bib: 20-entry bibliography
Co-Authored-By: Oz <oz-agent@warp.dev>
… data
Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
GHMM 0.9-rc3 vs libhmm on macOS Catalina (Intel i7-3820QM, AppleClang, AVX):
  GHMM: ~20,545 obs/ms vs libhmm: ~2,235 obs/ms, a 9.2x ratio
Same hardware, same compiler, same benchmark harness.
Larger ratio vs HMMLib (9.2x vs 3.2x) explained by weaker SIMD tier on macOS (AVX-only vs AVX-512 on Ryzen 7).
Co-Authored-By: Oz <oz-agent@warp.dev>
…ments)
Co-Authored-By: Oz <oz-agent@warp.dev>
… accuracy
- Section order: Elk → DAX → S&P 500 → Earthquake → Wind
- Timing table reordered to match
- Hardware note: correctly identifies DAX/S&P/Wind as Windows Ryzen 7, Elk/Earthquake as macOS Catalina (marked with dag, pending re-run)
- Earthquake: note data is embedded (no CSV needed for Windows re-run)
- AMD Ryzen 7 7745 specified by model in hardware note
Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
…ents
- Elk: obs count 14,394 -> 725 (moveHMM::elk_data bundled subset)
  Travelling state params corrected (pre-fix local-optimum replaced):
    libhmm step mean 1741 -> 3189 m, SD 1519 -> 4392 m, kappa 0.782 -> 0.204
    moveHMM reference 1751 -> 3247 m, SD 1527 -> 4394 m, kappa 0.780 -> 0.208
  Wall time 99 ms -> 55 ms (libhmm), ~2000 ms -> ~1270 ms (moveHMM, same machine)
  Speedup ~20x -> ~23x (same-machine comparison)
- Earthquake: wall time 4 ms -> 2 ms (warm), speedup ~5x -> ~10x
- Remove dagger notation; all libhmm timings now on Windows Ryzen 7 / AVX-512
- Regenerate figure1_speedup with corrected data
- Old elk travelling-state numbers were from a pre-fix run (VonMises kappa convergence bug, fixed in v3.7.0); new numbers match moveHMM reference within 2% on the same dataset
Co-Authored-By: Oz <oz-agent@warp.dev>
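The "within 2%" claim for the corrected elk travelling-state parameters can be checked directly from the figures quoted in the commit message above. The snippet below is only an arithmetic sketch; the dictionary keys and the helper name `rel_diff` are illustrative, and the values are the post-fix (libhmm, moveHMM) pairs as quoted.

```python
# (libhmm, moveHMM reference) pairs copied from the commit message.
pairs = {
    "step mean (m)": (3189.0, 3247.0),
    "step SD (m)": (4392.0, 4394.0),
    "kappa": (0.204, 0.208),
}


def rel_diff(value, reference):
    """Relative difference of value with respect to reference."""
    return abs(value - reference) / abs(reference)


diffs = {name: rel_diff(value, ref) for name, (value, ref) in pairs.items()}

for name, d in diffs.items():
    print(f"{name}: {100 * d:.2f}%")
```

All three relative differences come out below 2%, with kappa the largest of the three.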
…4.9x
Fresh benchmark on Intel Core i7-7820HQ (Kaby Lake), macOS 13.7.8 Ventura, AppleClang, march=native (AVX2+FMA). Same hardware for both libraries.
GHMM 0.9-rc3 vs libhmm v3.7.0 (Dishonest Casino + Weather, T in 10^3..10^6):
  libhmm: 4,277 obs/ms average
  GHMM: 20,775 obs/ms average
  Ratio: 4.86x (GHMM faster)
All log-likelihoods match to machine precision.
Replaces macOS Catalina/Ivy Bridge/AVX result (9.2x) with same-hardware AVX2 measurement. Removes speculation about Ryzen 7 ratio. Attributes GHMM advantage to flat C array layout + scaled FB, consistent with the HMMLib explanation already in the paper.
Co-Authored-By: Oz <oz-agent@warp.dev>
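Since the commit attributes part of the GHMM advantage to a scaled forward-backward pass, a minimal sketch of the scaled forward half may help readers. This is a pure-Python illustration, not the libhmm or GHMM implementation. The function name and the Dishonest Casino parameters (the commonly quoted textbook values) are assumptions and are not confirmed to be the ones used in the benchmark.

```python
import math


def scaled_forward_loglik(pi, A, B, obs):
    """Log-likelihood of obs under a discrete HMM using per-step scaling.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][k] : probability of emitting symbol k in state i
    """
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    loglik = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        c = sum(alpha)  # scaling factor for step t
        if c == 0.0:
            raise ValueError("observation sequence has zero probability")
        alpha = [a / c for a in alpha]  # renormalise to avoid underflow
        loglik += math.log(c)  # log-likelihood is the sum of log scale factors
    return loglik


# Assumed textbook casino: state 0 = fair die, state 1 = loaded die.
pi = [0.5, 0.5]
A = [[0.95, 0.05], [0.10, 0.90]]
B = [[1.0 / 6.0] * 6, [0.1] * 5 + [0.5]]
obs = [0, 5, 5, 2, 5]  # die faces, 0-indexed

loglik = scaled_forward_loglik(pi, A, B, obs)
```

Renormalising at every step keeps the forward variables in a safe floating-point range for long sequences such as T = 10^6, which an unscaled product of probabilities would underflow long before reaching.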
…e version improvement
- libhmm DAX timing: 2 s -> 1.1 s (Ryzen 7 / AVX-512 / Windows)
- fHMM: 1360 s (v1.2.0, Ivy Bridge) -> 5.5 s / 13 s (v1.4.3, same machine, 1 run / 10 restarts)
- Speedup: ~680x -> ~5x / ~12x (same-machine, current versions)
- Both papers: add version-comparison note acknowledging fHMM improvements v1.2.0 -> v1.4.3
- JOSS: fix 'on the same hardware' error; fill date (26 May 2026); update Figure 1 caption
- arXiv: update Table 2 and hardware note; add wall-time paragraph to Section 6.2
- Regenerate figure1_speedup with corrected data
Co-Authored-By: Oz <oz-agent@warp.dev>
Co-Authored-By: Oz <oz-agent@warp.dev>
- Author metadata: name -> given-names/surname/suffix (avoids parser issues)
- AI disclosure: add model/platform version info per updated JOSS AI policy
- Acknowledgements: add no-funding statement; restore JAHMM sentence; suggested
- Add .github/workflows/paper-draft.yml: builds JOSS draft PDF via openjournals/inara on every push to joss-paper touching paper/ files
Co-Authored-By: Oz <oz-agent@warp.dev>
- Flatten figure paths: copy figure2/figure3 into arxiv/ dir, remove ../figures/ prefix
- Add bookmarks=false to hyperref (guards against PDF conversion failure)
- Add typeout 4-pass hint after \end{document}
Co-Authored-By: Oz <oz-agent@warp.dev>
…d array/placeins/caption packages
Co-Authored-By: Oz <oz-agent@warp.dev>
- std::cyl_bessel_i: unavailable on ALL Apple platforms (not just Catalina).
  Apple libc++ has not implemented C++17 special math functions on any macOS version (absent as of Xcode 15 / macOS 14). Fixed in both papers. Verified against include/libhmm/math/bessel.h and CMakeLists.txt.
- JOSS wind boundary count: 990 -> 730 hours in 330-360 bin, matching the arXiv table, generate_figures.py annotation, and analysis output.
Co-Authored-By: Oz <oz-agent@warp.dev>
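Both points in this commit touch the von Mises density, whose normalising constant is the modified Bessel function I0 (the quantity std::cyl_bessel_i would supply where available). The sketch below is a hedged illustration in Python. It is not a description of include/libhmm/math/bessel.h, and the power-series fallback, the function names, and the parameter values (kappa = 2, sigma = 30 degrees, mean direction 0 degrees) are all assumptions chosen for the example.

```python
import math


def bessel_i0(x, tol=1e-16, max_terms=500):
    """Modified Bessel function I0 via its power series.

    I0(x) = sum over k >= 0 of (x/2)^(2k) / (k!)^2
    """
    q = (x / 2.0) ** 2
    term = 1.0
    total = 1.0
    for k in range(1, max_terms):
        term *= q / (k * k)
        total += term
        if term < tol * total:
            break
    return total


def von_mises_pdf(theta, mu, kappa):
    """Von Mises density on the circle; theta and mu in radians."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def normal_pdf(x, mu, sigma):
    """Normal density on the real line (here applied to raw degrees)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


# Hypothetical northerly wind regime centred on 0 degrees.
vm_350 = von_mises_pdf(math.radians(350.0), 0.0, 2.0)
vm_10 = von_mises_pdf(math.radians(10.0), 0.0, 2.0)
n_350 = normal_pdf(350.0, 0.0, 30.0)
n_10 = normal_pdf(10.0, 0.0, 30.0)
```

Under these assumed parameters the von Mises density treats 350 degrees and 10 degrees symmetrically, while a Normal fitted to raw degrees assigns 350 degrees a vanishing density. This is the kind of boundary failure at 0°/360° that the wind figure is described as showing.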
JOSS paper draft
Adds the initial JOSS submission materials for libhmm v3.7.0.
Contents
- paper/
  - paper.md: JOSS submission draft
  - paper.bib: bibliography (16 entries)
  - figures/generate_figures.py: generates all three paper figures from embedded/CSV data
  - figures/figure1_speedup.{pdf,png}: wall-time comparison, libhmm vs R packages (log scale)
  - figures/figure2_convergence.{pdf,png}: ECME EM convergence on DAX 2000–2022; shows 152-nat improvement over kurtosis MOM and fHMM surpassed at iteration 5
  - figures/figure3_wind_boundary.{pdf,png}: VonMisesDistribution vs Normal boundary failure at 0°/360°; wind rose + disagreement rate per 30° bin
- CONTRIBUTING.md: build/test instructions for macOS (Catalina+, Apple Clang) and Windows (VS 2022)
- scripts/prepare_dax_data.py: Python equivalent of prepare_dax_data.R; downloads DAX and S&P 500 log-returns via yfinance (for systems without R)
- scripts/prepare_wind_data.py: Python equivalent of prepare_wind_data.R; downloads NOAA ISD O'Hare 2015 wind data via urllib
Figure notes
All figures generated on Windows (MSVC/AVX-512) from live benchmark runs:
- examples/README.md
- dax_regime_example (5,838 observations, 200 EM iterations); MOM reference from v3.6.0 run on same data
- ohare_wind_2015.csv (11,894 hourly observations) using fitted VonMises parameters from wind_direction_example
TODO before submission
- paper.md
Warp conversation
Co-Authored-By: Oz <oz-agent@warp.dev>