prob_conf_mat is a Python package for performing statistical inference with confusion matrices. It quantifies the amount of uncertainty present, aggregates semantically related experiments into experiment groups, and compares experiments against each other for significance.
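To make the idea concrete, here is a minimal NumPy-only sketch of what quantifying uncertainty in a confusion matrix and comparing two experiments can look like. It does not use the prob_conf_mat API. The function name `sample_accuracy`, the uniform Dirichlet prior over the matrix cells, and the example counts are all assumptions made for illustration; the package's actual model and interface are described in the documentation.

```python
import numpy as np


def sample_accuracy(conf_mat, num_samples=10_000, prior=1.0, seed=0):
    """Draw posterior samples of accuracy for one confusion matrix.

    Assumption: the cell counts follow a multinomial distribution with a
    symmetric Dirichlet prior (hypothetical choice, for illustration only).
    """
    rng = np.random.default_rng(seed)
    counts = np.asarray(conf_mat, dtype=float)

    # Dirichlet posterior over the joint probabilities p(true, predicted)
    alpha = counts.ravel() + prior
    samples = rng.dirichlet(alpha, size=num_samples)
    samples = samples.reshape(num_samples, *counts.shape)

    # Accuracy is the total probability mass on the diagonal
    return np.trace(samples, axis1=1, axis2=2)


# Two made-up experiments (rows: true class, columns: predicted class)
conf_mat_a = [[45, 5], [8, 42]]
conf_mat_b = [[40, 10], [12, 38]]

acc_a = sample_accuracy(conf_mat_a, seed=0)
acc_b = sample_accuracy(conf_mat_b, seed=1)

# Uncertainty: a 95% credible interval instead of a single point estimate
low, high = np.quantile(acc_a, [0.025, 0.975])
print(f"A: median accuracy {np.median(acc_a):.3f}, 95% interval [{low:.3f}, {high:.3f}]")

# Comparison: posterior probability that A is more accurate than B
prob_a_better = float(np.mean(acc_a - acc_b > 0))
print(f"P(accuracy A > accuracy B) = {prob_a_better:.3f}")
```

The point of the sketch is that a metric computed from a finite confusion matrix is a distribution rather than a single number, and that differences between experiments can be judged against that spread.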
Installation from PyPI can be done using pip:

    pip install prob_conf_mat

Or, if you're using uv, simply run:

    uv add prob_conf_mat

The project currently depends on the following packages:
Dependency tree:

    prob-conf-mat
    ├── jaxtyping
    ├── matplotlib
    ├── numpy
    ├── scipy
    └── tabulate

Additionally, pandas is an optional dependency for some reporting functions.
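Since pandas is optional, code that relies on it may want to check for it first. The following is a small sketch using only the standard library; the helper name `has_pandas` is hypothetical and is not part of prob_conf_mat.

```python
from importlib.util import find_spec


def has_pandas() -> bool:
    """Return True if pandas can be imported in the current environment."""
    return find_spec("pandas") is not None


if has_pandas():
    print("pandas is available; pandas-based reporting can be used.")
else:
    print("pandas is not installed; install it to enable pandas-based reporting.")
```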
This project was developed using uv. To install the development environment, first clone the GitHub repository:

    git clone https://github.com/ioverho/prob_conf_mat.git

Then run the uv sync --dev command:

    uv sync --dev

The development dependencies should automatically install into the .venv folder.
For more information about the package, motivation, how-to guides and implementation, please see the documentation website. We try to use Daniele Procida's structure for Python documentation.
The documentation is broadly divided into 4 sections:
- Getting Started: a collection of small tutorials to help new users get started
- How To: more expansive guides on how to achieve specific things
- Reference: in-depth information about how to interface with the library
- Explanation: explanations about why things are the way they are
| | Learning | Coding |
|---|---|---|
| Practical | Getting Started | How-To Guides |
| Theoretical | Explanation | Reference |
This project was developed using the following (amazing) tools:
- Package management: uv
- Linting: ruff
- Static Type-Checking: basedpyright
- Documentation: zensical
- Pre-commit: prek
Most of the common development commands are included in ./Makefile. If make is installed, you can immediately run the following commands:

    Usage:
      make <target>

    Utility
      help          Display this help
      hello-world   Tests uv and make
      clean         Clean up caches and build artifacts

    Environment
      install       Install default dependencies
      install-dev   Install dev dependencies
      upgrade       Upgrade installed dependencies
      export        Export uv to requirements.txt file

    Testing, Linting, Typing & Formatting
      test          Runs all tests
      coverage      Checks test coverage
      lint          Run linting
      type          Run static typechecking
      commit        Run pre-commit checks
      commit-log    Run pre-commit checks in verbose mode and log output to external file

    Documentation
      docs-build    Update the docs
      docs-serve    Serve documentation site

The following are some packages and libraries which served as inspiration for aspects of this project: arviz, bayestestR, BERTopic, jaxtyping, mici, python-ci, statsmodels.
A lot of the approaches and methods used in this project come from published works. Some especially important works include:
- Goutte, C., & Gaussier, E. (2005). A probabilistic interpretation of precision, recall and F-score, with implication for evaluation. In European conference on information retrieval (pp. 345-359). Berlin, Heidelberg: Springer Berlin Heidelberg.
- Tötsch, N., & Hoffmann, D. (2021). Classifier uncertainty: evidence, potential impact, and probabilistic treatment. PeerJ Computer Science, 7, e398.
- Kruschke, J. K. (2013). Bayesian estimation supersedes the t test. Journal of Experimental Psychology: General, 142(2), 573.
- Makowski, D., Ben-Shachar, M. S., Chen, S. A., & Lüdecke, D. (2019). Indices of effect existence and significance in the Bayesian framework. Frontiers in psychology, 10, 2767.
- Hill, T. (2011). Conflations of probability distributions. Transactions of the American Mathematical Society, 363(6), 3351-3372.
- Chandler, J., Cumpston, M., Li, T., Page, M. J., & Welch, V. J. H. W. (2019). Cochrane handbook for systematic reviews of interventions. Hoboken: Wiley, 4.
@software{ioverho_prob_conf_mat,
author = {Verhoeven, Ivo},
license = {MIT},
title = {{prob\_conf\_mat}},
url = {https://github.com/ioverho/prob_conf_mat}
}