bird-benchmark

Here are 8 public repositories matching this topic...

Production-grade Asymmetric Multi-Agent Text-to-SQL on BIRD-SQL. Offline CHESS/FAISS pruning + MCI-SQL enrichment feed ≤3 Groq API calls: gpt-oss-120b generator · llama-4-scout reflector · gpt-oss-20b critic. LangGraph · SQLAlchemy sandbox · LangSmith · Docker · MIT.

  • Updated May 24, 2026
  • Python

Portfolio NL→SQL assistant: 94.0% execution accuracy on BIRD Mini-Dev (above the 92.96% human-expert baseline), schema-RAG + sqlglot AST guards + deterministic chart picker, $0 free-tier LLM stack. Streamlit chat UI, live demo.

  • Updated Jun 17, 2026
  • HTML
