German-qualified lawyer and former NLP data scientist. I turn EU financial and AI regulation — MiCAR, the EU AI Act, DORA — into tested, cited, reviewable software.
Most engineers who build legal AI cannot read a regulation at the Article level. Most lawyers who can read one cannot ship code. I do both: I trained as a Volljurist, practised at Hengeler Mueller, Freshfields and Cleary Gottlieb, and built Python NLP pipelines at Dudenverlag before that. Today I build the layer that makes legal AI safe to rely on: evaluation, supervised workflows, and source-grounded regulatory automation.
These are synthetic proof-of-work prototypes, built in 2026 to show how I structure legal AI workflows. They are not production systems and not client work. Every example uses synthetic data only: no client data, no privileged material, no candidate data, no personal data.
If you are evaluating me for a Legal Engineer, Forward-Deployed, or Product role, start with the one repo that runs end to end:

    git clone https://github.com/sebastianfoerste/contract-review-eval-harness
    cd contract-review-eval-harness
    make install && make test   # 8 unit tests; the scorer itself is tested
    make demo                   # writes scorecard.md for a synthetic NDA

The scorecard grades AI contract-review output against a hand-authored answer set: clause F1 0.91, citation grounding 4/5, and a hallucination count of 1, which is a fabricated citation the harness catches and marks for rejection. That is the whole thesis: legal AI quality is measured, not asserted. The demo runs offline and deterministically, with no API key required.
Then skim two more:
- legal-ops-agent: a supervised legal workflow with typed intake, deterministic risk triage, reviewer routing, and an approval gate that blocks export until a human signs off.
- eu-ai-act-classifier: deterministic first-pass EU AI Act classification with cited risk tiers, obligations, and explicit review status.
Legal AI quality should be tested, not promised. I build harnesses that check whether an output is grounded, complete, and safe to rely on, and that count the failures lawyers care about — a risk flagged at the wrong severity, a citation that is not in the document.
| Repository | Focus |
|---|---|
| contract-review-eval-harness | Scores contract-review output against a gold answer set: clause precision/recall, risk-flag accuracy, citation grounding, hallucination count. |
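
The metrics named in the table can be illustrated with a minimal sketch. This is not the harness's actual code: the `Finding` type, the `score` function, and the rule that a citation is grounded only when it appears verbatim in the document are all assumptions made for illustration, and the sample data is synthetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    clause: str    # hypothetical clause label, e.g. "confidentiality"
    citation: str  # text the reviewer output quotes as support


def score(predicted: list[Finding], gold_clauses: set[str], document: str) -> dict:
    """Compare predicted findings with a gold clause set and the source text."""
    predicted_clauses = {f.clause for f in predicted}
    true_positives = len(predicted_clauses & gold_clauses)
    precision = true_positives / len(predicted_clauses) if predicted_clauses else 0.0
    recall = true_positives / len(gold_clauses) if gold_clauses else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # Assumption: a citation is grounded only if it appears verbatim in the document.
    grounded = sum(1 for f in predicted if f.citation in document)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "grounded": grounded,
        "hallucinated": len(predicted) - grounded,
    }


# Synthetic example: the second citation does not occur in the document.
document = (
    "The Recipient shall keep all Confidential Information secret. "
    "This Agreement is governed by German law."
)
predicted = [
    Finding("confidentiality", "The Recipient shall keep all Confidential Information secret."),
    Finding("governing_law", "This Agreement is governed by the laws of England."),
]
result = score(predicted, {"confidentiality", "governing_law", "term"}, document)
```

Here the output finds two of three gold clauses and one of its two citations is absent from the document, so the sketch reports one hallucination. Exact string matching is the strictest possible grounding test; a real scorer would probably need to normalise whitespace and quotation marks first.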
Useful legal AI keeps intake structured, assumptions visible, and human judgment in the loop. These prototypes explore how agentic legal work stays trustworthy without skipping review, provenance, or approval.
| Repository | Focus |
|---|---|
| legal-ops-agent | Supervised workflow: typed matter intake, risk triage, reviewer routing, approval-controlled export, audit trail. |
| legal-ai-adoption-dashboard | Adoption signals after the demo: account health, practice-group usage, blockers, product feedback. |
| ai-saas-legal-ops-starter-kit | Operating layer for recurring AI SaaS legal work: contract intake, DPA triage, vendor review, launch governance. |
| legal-ai-workshop-kit | Enablement materials for legal AI workshops, workflow discovery, and adoption follow-up. |
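
An approval-controlled export with an audit trail, as described for legal-ops-agent, could look roughly like the sketch below. The class names, state names, and audit messages are hypothetical and are not taken from the repository; the only point is that export raises unless a human has approved, and that every attempt is recorded.

```python
from dataclasses import dataclass, field
from enum import Enum


class ReviewState(Enum):
    DRAFT = "draft"
    CHECKED = "checked"
    APPROVED = "approved"
    BLOCKED = "blocked"


class ExportBlocked(Exception):
    """Raised when export is attempted without human approval."""


@dataclass
class Matter:
    title: str
    state: ReviewState = ReviewState.DRAFT
    audit_trail: list[str] = field(default_factory=list)

    def approve(self, reviewer: str) -> None:
        # Only an explicit human sign-off moves the matter to APPROVED.
        self.state = ReviewState.APPROVED
        self.audit_trail.append(f"approved by {reviewer}")

    def export(self) -> str:
        # The gate: any state other than APPROVED refuses export and logs the attempt.
        if self.state is not ReviewState.APPROVED:
            self.audit_trail.append(f"export refused in state {self.state.value}")
            raise ExportBlocked(f"{self.title!r} is {self.state.value}, not approved")
        self.audit_trail.append("exported")
        return f"EXPORT: {self.title}"


matter = Matter("Synthetic NDA review")
try:
    matter.export()
    blocked = False
except ExportBlocked:
    blocked = True

matter.approve("reviewer@example.com")
exported = matter.export()
```

Raising an exception rather than returning an empty result makes a skipped review hard to overlook, and the refused attempt still lands in the audit trail.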
This is the part no generalist engineer can fake. I encode EU regulation into deterministic checks with cited findings and a visible review state — designed as a review packet, never as legal advice.
| Repository | Focus |
|---|---|
| eu-ai-act-classifier | First-pass EU AI Act classification with cited risk tiers, obligations, timelines, and review status. |
| eu-financial-reg-horizon-scanner | Source-aware monitoring for EU financial regulation. |
| micar-whitepaper-linter | Deterministic MiCAR white-paper checks with cited findings and remediation output. |
| MiCAR-Authorization-Co-Pilot | Source-anchored MiCAR authorisation drafting and review workflow. |
| dora-third-party-register-and-resilience-workbench | DORA third-party register and resilience-testing workbench with board-pack export. |
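
A deterministic check with a cited finding and a visible review state might be structured as in this sketch. It does not reproduce eu-ai-act-classifier: the intake flags, tier labels, and two-row rule table are assumptions chosen for illustration, and the citations shown should be verified against the text of Regulation (EU) 2024/1689 before anyone relies on them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rule:
    trigger: str   # hypothetical intake flag
    tier: str      # hypothetical tier label
    citation: str  # illustrative citation; verify against the Regulation


# Illustrative rule table. First match wins, so order encodes priority.
RULES = [
    Rule("used_for_recruitment", "high-risk", "Art. 6(2) with Annex III, point 4"),
    Rule("interacts_with_humans", "transparency", "Art. 50(1)"),
]


@dataclass(frozen=True)
class Classification:
    tier: str
    citation: Optional[str]
    review_status: str


def classify(flags: set[str]) -> Classification:
    """Return the first matching rule as a cited, reviewable finding."""
    for rule in RULES:
        if rule.trigger in flags:
            return Classification(rule.tier, rule.citation, "needs_human_review")
    # No rule matched: report that explicitly instead of implying the system is low-risk.
    return Classification("unclassified", None, "needs_human_review")


first_pass = classify({"used_for_recruitment", "interacts_with_humans"})
no_match = classify(set())
```

Every result carries `needs_human_review`, which mirrors the idea of a review packet: the output is a starting point with its source attached, and an unmatched input is labelled as unclassified.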
Useful legal AI is not about generating text. The harder questions are the ones I build around:
- Is the legal intake structured before drafting begins?
- Are assumptions, sources, and gaps visible?
- Can a user see what is draft, checked, approved, or blocked?
- Can quality be tested instead of merely asserted?
- Can the workflow make a lawyer faster without pretending judgment has disappeared?
That is why these projects lean on deterministic checks, evaluation scripts, explicit review states, blocked exports, and audit trails — not just prompts.
Partner at gunnercooke in Germany, advising on AI, SaaS, crypto, capital markets, payments, and EU financial regulation. German-qualified lawyer, admitted 2012; trained at Hengeler Mueller, Freshfields Bruckhaus Deringer, and Cleary Gottlieb. Earlier, data scientist at Dudenverlag building Python NLP pipelines.
Languages: German (native), English (fluent), French (professional working knowledge).
Synthetic examples only. No client data, no privileged material, no confidential negotiation history, no candidate data, no personal data. Public outputs are draft and review artifacts; they are not legal advice.


