active-learning

Labels cost money. Active learning asks: instead of labeling random examples, label the ones the model learns most from — and reach the same accuracy with fewer labels. This project measures exactly how many labels each query strategy needs to hit a target accuracy, against a random baseline, on a real dataset.

active-learning                 # labels-to-target table
active-learning --target-acc 0.95 --batch 10
active-learning --json

The loop

Start from a tiny labeled seed (2 examples per class), a large pool of unlabeled digits, and a held-out test set. Each round: fit a logistic-regression model on the labeled set, score it on the test set, then query a batch from the pool with the chosen strategy, "label" those examples, and repeat, recording the learning curve (labels used → accuracy) along the way. The query strategies:

random — the baseline (label a random batch).
least_confidence — label where the top class probability is lowest.
margin — label where the top-two classes are closest (smallest margin).
entropy — label where the probability distribution is flattest.
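
The loop and the four query rules above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `query_scores` and `run_loop`, the `rounds` parameter, the small epsilon inside the logarithm, and the tie-breaking order are all assumptions. The model is assumed to follow the scikit-learn estimator interface (`fit`, `predict_proba`, `score`).

```python
import numpy as np


def query_scores(proba, strategy, rng):
    """Return one score per pool row; a higher score means 'query this first'."""
    if strategy == "random":
        return rng.random(proba.shape[0])
    if strategy == "least_confidence":
        # Low top-class probability -> high score.
        return 1.0 - proba.max(axis=1)
    if strategy == "margin":
        # Small gap between the two largest probabilities -> high score.
        top_two = np.sort(proba, axis=1)[:, -2:]
        return -(top_two[:, 1] - top_two[:, 0])
    if strategy == "entropy":
        # Flat distribution -> high Shannon entropy (epsilon avoids log(0)).
        return -(proba * np.log(proba + 1e-12)).sum(axis=1)
    raise ValueError(f"unknown strategy: {strategy}")


def run_loop(model, X_seed, y_seed, X_pool, y_pool, X_test, y_test,
             strategy, batch=10, rounds=5, seed=0):
    """Pool-based loop; returns the learning curve as (labels used, accuracy) pairs."""
    rng = np.random.default_rng(seed)
    X_lab, y_lab = X_seed.copy(), y_seed.copy()
    remaining = np.arange(len(X_pool))  # pool indices not yet labeled
    curve = []
    for _ in range(rounds):
        model.fit(X_lab, y_lab)
        curve.append((len(y_lab), float(model.score(X_test, y_test))))
        if len(remaining) == 0:
            break
        scores = query_scores(model.predict_proba(X_pool[remaining]), strategy, rng)
        picked = np.argsort(scores)[-batch:]  # positions of the highest scores
        chosen = remaining[picked]
        # "Labeling" here just reveals the held-back pool labels.
        X_lab = np.vstack([X_lab, X_pool[chosen]])
        y_lab = np.concatenate([y_lab, y_pool[chosen]])
        remaining = np.delete(remaining, picked)
    return curve
```

With scikit-learn installed, passing `sklearn.linear_model.LogisticRegression` as `model` and splits of the digits data would approximate the setup described above; the exact splits and hyperparameters the project uses are not shown here.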

Measured results

active-learning on scikit-learn digits (10 classes, seed 20 / pool 1237 / test 540), target 90%:

strategy	labels → 90%	saved vs random	AUC	final acc
random	130	—	0.885	94.6%
least_confidence	150	−15% (worse)	0.899	97.0%
margin	80	+38%	0.928	97.6%
entropy	180	−38% (worse)	0.883	96.7%
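
The "saved vs random" column is consistent with a relative saving of (random labels − strategy labels) / random labels. The sketch below recomputes it from the reported label counts and shows one plausible way to read labels-to-target off a learning curve. The helper names are hypothetical, and "first point on the curve at or above the target" is an assumed definition, not one confirmed by the project.

```python
def labels_to_target(curve, target):
    """First label count whose accuracy meets the target, or None if never reached.

    `curve` is a list of (labels used, accuracy) pairs in increasing label order.
    """
    for n_labels, acc in curve:
        if acc >= target:
            return n_labels
    return None


def saving_vs_baseline(baseline_labels, strategy_labels):
    """Percent of labels saved relative to the baseline (negative means worse)."""
    return 100.0 * (baseline_labels - strategy_labels) / baseline_labels


# Label counts taken from the results table above.
reported = {"random": 130, "least_confidence": 150, "margin": 80, "entropy": 180}
savings = {
    name: round(saving_vs_baseline(reported["random"], n))
    for name, n in reported.items()
}
print(savings)
```

For margin this gives (130 − 80) / 130 ≈ 38%, and for entropy (130 − 180) / 130 ≈ −38%, matching the table.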

The honest finding has two halves:

Active learning works: margin sampling reaches 90% accuracy with 80 labels versus random's 130, a 38% labeling saving, and it ends at a higher final accuracy (97.6% vs 94.6%) at the same budget. In a label-scarce setting that's real money saved.
But "use active learning" is not the advice; "use the right strategy" is. On this dataset the two most popular uncertainty heuristics, least-confidence and entropy, actually underperform random on labels-to-target (needing 15% and 38% more labels to reach 90%), even though both end at a higher final accuracy than random (97.0% and 96.7% vs 94.6%). They fixate on inherently ambiguous points, such as near-duplicate or outlier digits, that the model can never get right and that don't move the decision boundary. Margin sampling avoids that trap by targeting points right on a class boundary (top-two nearly tied), which is where a label actually reshapes the model. Pick the wrong query rule and active learning costs you labels.
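
A small worked example of how the three uncertainty rules can disagree on the same pair of points. The two probability vectors are made up for illustration, not taken from the digits run: one is a clean two-way tie, the other spreads its mass over several classes.

```python
import math


def least_confidence(p):
    return 1.0 - max(p)


def margin(p):
    top, second = sorted(p, reverse=True)[:2]
    return top - second


def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0)


# Hypothetical model outputs for two pool points (4 classes).
points = {
    "two_way_tie": [0.5, 0.5, 0.0, 0.0],  # sits between exactly two classes
    "diffuse": [0.4, 0.2, 0.2, 0.2],      # vaguely uncertain about everything
}

picks = {
    # Least-confidence and entropy query the highest score; margin the smallest gap.
    "least_confidence": max(points, key=lambda k: least_confidence(points[k])),
    "margin": min(points, key=lambda k: margin(points[k])),
    "entropy": max(points, key=lambda k: entropy(points[k])),
}
print(picks)
```

Here margin picks the two-way tie (gap 0 vs 0.2), while least-confidence and entropy both pick the diffuse point. This only shows that the rules rank points differently; it does not by itself establish why they behave as they do on digits.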

Why it matters

The label budget is the binding constraint on most applied ML projects. This shows the discipline to (a) measure label efficiency as a learning curve rather than assume it, and (b) report the uncomfortable part — that a fashionable technique can backfire, and the win comes from the specific strategy, validated on real data, not the buzzword.

Install & test

pip install -e ".[dev]"
pytest -q          # 6 passed — incl. "an active strategy beats random on labels-to-target"

Stack

scikit-learn (logistic regression, digits), NumPy. Pool-based active-learning loop with least-confidence / margin / entropy query strategies, learning-curve + labels-to-target scoring.

License

MIT
