L-MARS

L-MARS stands for Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search.

📄 Paper: L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search

L-MARS is a multi-agent legal question answering system designed for grounded answers over current legal information. It combines structured query decomposition, agentic web search, evidence filtering, and cited answer synthesis. The project also includes optional local retrieval over user-provided documents and CourtListener integration for case-law search.

What L-MARS does

L-MARS supports two operating modes:

Simple Mode: a single-pass retrieval pipeline that decomposes the question, searches for evidence, and synthesizes a grounded answer.
Multi-Turn Mode: an iterative search-and-verify loop that refines queries until the evidence is sufficient or a maximum number of iterations is reached.

The system can use the following evidence sources:

Web search via Serper
Local RAG over user-provided documents using BM25
CourtListener for case-law retrieval
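
For readers unfamiliar with BM25, the ranking idea behind the local retrieval source can be sketched in a few lines of plain Python. This is a generic illustration of the scoring formula, not the project's code: the function names, the whitespace tokenizer, the `k1` and `b` defaults, and the sample documents are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (an assumption; real systems usually do more).
    return text.lower().split()


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    if not documents:
        return []
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(doc) for doc in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in tokenized for term in set(doc))
    query_terms = set(tokenize(query))

    scores = []
    for doc in tokenized:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            df = doc_freq[term]
            # Smoothed inverse document frequency (always positive).
            idf = math.log((n_docs - df + 0.5) / (df + 0.5) + 1.0)
            # Term-frequency saturation with document-length normalization.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical user-provided documents.
docs = [
    "the tenant must give thirty days notice before ending the lease",
    "a copyright lasts for the life of the author plus seventy years",
    "the landlord may keep the deposit to cover unpaid rent",
]
scores = bm25_scores("lease notice tenant", docs)
best = max(range(len(docs)), key=scores.__getitem__)
print(docs[best])
```

Documents that share no terms with the query score zero, so only the first sample document is returned as relevant here.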

Pipeline overview

Query Agent parses the question into structured search intents.
Search Agent retrieves evidence from the enabled sources.
Judge Agent checks whether the evidence is sufficient and flags missing information.
Summary Agent writes the final answer with citations and rationale.
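
The four stages above, together with the two operating modes, can be sketched as a small control loop. This is a toy illustration under stated assumptions, not the L-MARS implementation: every function, the `Verdict` class, the word-overlap search, and the `required_terms` sufficiency check are hypothetical stand-ins for what would be LLM and search calls in the real system. Simple Mode is modelled here as a single iteration of the same loop.

```python
from dataclasses import dataclass, field


@dataclass
class Verdict:
    sufficient: bool
    missing: list = field(default_factory=list)


def query_agent(question, missing=()):
    # Hypothetical: one intent for the question plus one per flagged gap.
    return [question] + [f"{question} {gap}" for gap in missing]


def search_agent(intents, corpus):
    # Hypothetical: a source matches if it shares any word with an intent.
    hits = []
    for intent in intents:
        words = set(intent.lower().split())
        for source, text in corpus.items():
            if words & set(text.lower().split()) and (source, text) not in hits:
                hits.append((source, text))
    return hits


def judge_agent(evidence, required_terms):
    # Hypothetical sufficiency check: every required term appears somewhere.
    missing = [
        term for term in required_terms
        if not any(term in text.lower() for _, text in evidence)
    ]
    return Verdict(sufficient=not missing, missing=missing)


def summary_agent(question, evidence):
    # Hypothetical synthesis: quote each passage with its source as citation.
    cited = "; ".join(f"{text} [{source}]" for source, text in evidence)
    return f"Q: {question}\nA: {cited}"


def run(question, corpus, required_terms, multi=False, max_iterations=3):
    evidence, missing = [], []
    rounds = max_iterations if multi else 1
    for _ in range(rounds):
        intents = query_agent(question, missing)
        for hit in search_agent(intents, corpus):
            if hit not in evidence:
                evidence.append(hit)
        verdict = judge_agent(evidence, required_terms)
        if verdict.sufficient:
            break
        missing = verdict.missing  # refine the next round's queries
    return summary_agent(question, evidence)


corpus = {
    "doc-a": "security deposit must be returned within 21 days",
    "doc-b": "landlords may deduct unpaid rent",
}
question = "when must a deposit be returned"
required = ["21 days", "deduct"]
print(run(question, corpus, required, multi=False))
print(run(question, corpus, required, multi=True))
```

In this toy run the single pass finds only `doc-a`, while the iterative run picks up `doc-b` on the second round because the judge flagged "deduct" as missing and the query was refined accordingly.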

Evaluation

The paper evaluates L-MARS in two settings:

LegalSearchQA: a 50-question benchmark that requires time-sensitive legal knowledge postdating the model's training data.
Bar Exam QA: a reasoning-focused benchmark where retrieval provides only limited gains.

The metrics reported in the paper focus on accuracy. The benchmarks are designed for grounded legal QA rather than for classification-style evaluation with metrics such as micro F1.
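
As a minimal sketch of the accuracy metric, the function below compares predicted answers to gold answers. It assumes answers can be compared as short labels (for example multiple-choice letters), which is an assumption for illustration and not a description of how the paper's evaluation scripts score answers; the function name and sample data are hypothetical.

```python
def accuracy(predictions, gold):
    """Fraction of predictions that match the gold label exactly."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    # Compare labels case-insensitively, ignoring surrounding whitespace.
    correct = sum(
        p.strip().upper() == g.strip().upper()
        for p, g in zip(predictions, gold)
    )
    return correct / len(gold)


# Hypothetical labels: three of the four predictions match.
print(accuracy(["A", "c", "B", "D"], ["A", "C", "D", "D"]))  # 0.75
```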

Installation

pip install -r requirements.txt

Run L-MARS

Simple Mode

Quick legal research with online search only:

python main.py "Your legal question"

Enable offline RAG for local documents:

python main.py --offline-rag "Your legal question"

Enable all sources (offline RAG + CourtListener + web search):

python main.py --all-sources "Your legal question"

Verbose output:

python main.py -v "Your legal question"

Multi-Turn Mode

Run iterative research with refinement:

python main.py --multi "Complex contract dispute..."

Set a custom number of iterations:

python main.py --multi --max-iterations 5 "Your question"

Benchmark scripts

To reproduce the paper's evaluation pipeline, generate predictions and then evaluate them:

python run/single_turn_pipeline.py \
  --dataset legalsearchqa \
  --model openai:gpt-4o-mini \
  --use-cache true \
  --output results/lmars_preds.jsonl

python eval/run_eval.py \
  --preds results/lmars_preds.jsonl \
  --judge-sample 20 \
  --llm_model openai:gpt-4o-mini

Citation

If you use L-MARS in your research, please cite:

@misc{wang2025lmarslegalmultiagentworkflow,
  title={L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search},
  author={Ziqi Wang and Boqin Yuan},
  year={2025},
  eprint={2509.00761},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2509.00761},
}
