Add experimental natural-language query parser prototype for malariagen_data #1254

Open
CharithKalasi wants to merge 1 commit into malariagen:master from CharithKalasi:nlp-query-demo

Conversation

@CharithKalasi
Overview

This PR introduces an experimental prototype that translates natural-language queries into structured calls to the malariagen_data API.

The goal is to explore approaches for lowering the barrier to accessing genomic data, particularly for users without programming experience (e.g. public health practitioners and entomologists).

What’s Included

  • A lightweight rule-based NLP parser for common genomic queries
  • Structured query representation (ParsedQuery)
  • Mapping layer to malariagen_data API calls
  • Example workflows demonstrating end-to-end usage
  • Basic tests covering parsing and query mapping
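
To make the first two items concrete, here is a minimal sketch of what a rule-based parser and a ParsedQuery structure could look like. It is not the code in this PR: the field names (`analysis`, `country`), the `parse_query` function and the small vocabularies are assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical vocabularies; the rules in the PR may differ.
ANALYSES = {
    "snp frequency": "snp_frequency",
    "snp frequencies": "snp_frequency",
}
COUNTRIES = ("Uganda", "Kenya", "Ghana")


@dataclass(frozen=True)
class ParsedQuery:
    """Structured form of a query (field names are assumptions)."""
    analysis: Optional[str] = None
    country: Optional[str] = None


def parse_query(text: str) -> ParsedQuery:
    """Match known phrases deterministically; unmatched fields stay None."""
    lowered = text.lower()
    # First analysis phrase that occurs as a substring of the query.
    analysis = next((v for k, v in ANALYSES.items() if k in lowered), None)
    # First country that occurs as a whole word in the query.
    country = next(
        (c for c in COUNTRIES if re.search(rf"\b{c.lower()}\b", lowered)),
        None,
    )
    return ParsedQuery(analysis=analysis, country=country)
```

Because every rule is a plain lookup, the same input always yields the same ParsedQuery, and a field left as `None` shows exactly which part of the query was not understood.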

Motivation

The malariagen_data API provides powerful access to genomic datasets, but currently requires familiarity with Python and data structures.

This prototype explores how natural-language interfaces could:

  • Simplify data access
  • Enable faster insights
  • Broaden usability of the platform

Example

Input:
"Show SNP frequency in Uganda"

Pipeline:
Natural language → Parsed query → API call → Result
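
The "Parsed query → API call" step could be sketched as follows. This is an illustration under assumptions, not the mapping layer in this PR: `to_api_call` is a hypothetical helper, a minimal ParsedQuery is repeated so the snippet runs on its own, and the target method name `snp_allele_frequencies` with a pandas-style `sample_query` string is assumed to match the malariagen_data client API. The function only builds a description of the call, so it runs without data access.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass(frozen=True)
class ParsedQuery:
    """Minimal stand-in for the PR's ParsedQuery (fields are assumptions)."""
    analysis: Optional[str] = None
    country: Optional[str] = None


def to_api_call(query: ParsedQuery) -> Dict[str, Any]:
    """Describe the API call for a parsed query without executing it."""
    if query.analysis != "snp_frequency":
        raise ValueError(f"unsupported analysis: {query.analysis!r}")
    kwargs: Dict[str, Any] = {}
    if query.country is not None:
        # Assumed filter syntax: a pandas-style query on sample metadata.
        kwargs["sample_query"] = f"country == '{query.country}'"
    # Assumed method name on a malariagen_data client object.
    return {"method": "snp_allele_frequencies", "kwargs": kwargs}


plan = to_api_call(ParsedQuery(analysis="snp_frequency", country="Uganda"))
print(plan)
```

A caller could then dispatch the plan with something like `getattr(client, plan["method"])(**plan["kwargs"])`, where `client` is a malariagen_data client object. A real call would likely need further arguments that the example query does not specify, such as the gene or transcript of interest.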

Design Principles

  • Non-intrusive: no changes to core API
  • Deterministic and transparent parsing
  • Easily extensible for future NLP/LLM integration

Scope

This PR is intended as an exploratory prototype.

Future Work

  • Ontology-based entity normalization
  • LLM-assisted parsing with guardrails
  • Interactive interfaces
