Skip to content

absfish/quantum-grammer-recognizer

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Quantum Grammar Recognizer (QGR)

A web application that determines whether a string belongs to a Context-Free Grammar (CFG) — but instead of using a plain yes/no answer, it models the problem as a probabilistic state-evolution process inspired by quantum computing.

You define a grammar, type a string, and the engine tells you the probability that the string is accepted, along with a full breakdown of how it arrived at that answer.


What does "quantum-inspired" mean here?

Real quantum computers manipulate qubits — quantum states that exist in superposition until measured. This project borrows three ideas from that model and applies them to grammar parsing:

Quantum concept How it maps to parsing
Superposition All possible derivation paths are explored at the same time, not one after another
Amplitude Each path carries a numerical weight that decays with depth and branching factor
Interference Valid paths that reach the target string reinforce the acceptance probability; invalid ones cancel
Measurement At the end, all amplitudes are collapsed into a single probability value in [0, 1]

This is a simulation running on ordinary Python — no quantum hardware or quantum libraries are used.


How the engine works (step by step)

Input string + Grammar
        │
        ▼
1. CNF Conversion
   The grammar is transformed into Chomsky Normal Form (CNF).
   CNF requires every rule to be either  A → BC  or  A → a.
   This is needed for the CYK algorithm to work.
        │
        ▼
2. CYK Amplitude Table  (the "ground truth" pass)
   The CYK (Cocke–Younger–Kasami) algorithm fills an n×n table
   where each cell [i][j] stores the quantum amplitude for every
   non-terminal that can span tokens i through j.
   Amplitudes multiply like a tensor product:
     amp(A→BC, i..j) += amp(B, i..k) × amp(C, k+1..j)
        │
        ▼
3. Derivation Path Simulation  (the "visual" pass)
   A breadth-first expansion of all leftmost derivations runs
   in parallel. Each thread carries an amplitude that:
     - splits evenly across sibling rules  (1/√branching)
     - prefers shorter rule bodies         (1/√|body|)
     - decays gently with depth            (e^(−0.05·depth))
   Threads whose terminal prefix contradicts the target are
   terminated immediately (destructive interference).
   When too many threads exist, the weakest are pruned (decoherence).
        │
        ▼
4. Measurement
   CYK result (authoritative, 70%) and simulation result (30%)
   are blended into a final acceptance probability.
   If probability ≥ threshold  →  Accept
   Otherwise                   →  Reject

Project structure

QGR/
├── engine.py          Core computation: CNF, CYK, simulator, recognizer
├── server.py          Flask API server + built-in example grammars
├── requirements.txt   Python dependencies (Flask + Flask-CORS)
├── .gitignore
├── README.md          This file
└── UI/
    ├── index.html     Single-page frontend
    ├── style.css      Stylesheet
    └── app.js         Frontend logic (calls Flask API)

Setup and running

Prerequisites

  • Python 3.10 or newer — check with python --version
  • pip — usually bundled with Python

1. Install dependencies

Open a terminal in the project folder and run:

pip install -r requirements.txt

This installs Flask (the web server) and Flask-CORS (allows the browser to talk to the server).

2. Start the server

python server.py

You should see:

══════════════════════════════════════════════════
  Quantum Grammar Recognizer
  http://127.0.0.1:5050
══════════════════════════════════════════════════

3. Open the app

Go to http://127.0.0.1:5050 in any modern browser.


Using the app

Sidebar

Click any example in the left sidebar to load a pre-built grammar. Five examples are included:

Example Language
aⁿbⁿ Strings like a b, a a b b, a a a b b b
Palindromes Strings that read the same forwards and backwards
Balanced Parentheses Properly nested ( )
Arithmetic Expressions id + id * id style expressions
wwᴿ Strings of the form w followed by w reversed

Defining your own grammar

Fill in the four fields on the left:

Non-Terminals — the variables in your grammar, comma-separated. Example: S, A, B

Terminals — the actual symbols the grammar produces, comma-separated. Example: a, b

Start Symbol — the non-terminal the derivation starts from. Example: S

Production Rules — one rule per line. Use -> or as the arrow. Use ε (or leave the right side blank) for an empty production.

S -> a S b
S -> a b

Input String — the string to test, with each token separated by a space.

a a b b

Important: tokens are space-separated. For the aⁿbⁿ grammar, write a b not ab.

Acceptance Threshold — the slider sets the minimum probability required to declare Accept. The default is 0.010 (1%). Raising it makes the recognizer stricter.

Reading the results

After clicking Run Recognition, the result panel shows:

  • Accepted / Rejected — the final decision
  • P(accept) — the blended acceptance probability
  • Path counts — how many derivation threads were explored, and how many succeeded

Four tabs provide detailed diagnostics:

Amplitudes — a bar chart of all explored paths ranked by their quantum amplitude. Green bars are successful paths; amber bars are failed ones.

Derivation Paths — the actual step-by-step derivation for each thread (up to 20 paths shown).

CYK Table — the filled dynamic-programming table. Green cells indicate the start symbol spanning the full input — that is, acceptance evidence. Teal cells show other non-terminals with non-zero amplitude.

Interference Log — a chronological list of interference events:

  • + constructive: a rule was applied, adding a new thread
  • destructive: a thread was terminated because its terminal prefix contradicted the input
  • ~ decoherence: a thread was pruned because the population exceeded the maximum

API reference

The Flask server exposes a small REST API. You can call it directly with curl or any HTTP client.

GET /api/health

Returns engine status.

GET /api/examples

Returns a list of all built-in grammars (name, description, test strings).

GET /api/example/<key>

Returns the full grammar definition for one example. Valid keys: anbn, palindromes, balanced_parens, arithmetic, wwr.

POST /api/recognize

Run recognition on a custom grammar.

Request body (JSON):

{
  "grammar": {
    "variables":   ["S"],
    "terminals":   ["a", "b"],
    "start":       "S",
    "productions": [
      {"head": "S", "body": ["a", "S", "b"]},
      {"head": "S", "body": ["a", "b"]}
    ]
  },
  "string":    "a a b b",
  "threshold": 0.01
}

Response (JSON):

{
  "accept":            true,
  "probability":       0.7,
  "threshold":         0.01,
  "total_paths":       12,
  "successful_paths":  2,
  "failed_paths":      10,
  "amplitude_distribution": [...],
  "cyk_table":         [...],
  "derivation_paths":  [...],
  "interference_log":  [...],
  "step_log":          [...]
}

POST /api/recognize_example

Same as /api/recognize but uses a built-in grammar by key.

{"example_key": "anbn", "string": "a a b b", "threshold": 0.01}

Theoretical background

Context-Free Grammars (CFGs)

A CFG is a 4-tuple G = (V, Σ, P, S) where:

  • V is the set of non-terminal symbols (variables)
  • Σ is the set of terminal symbols
  • P is the set of production rules of the form A → α
  • S ∈ V is the start symbol

The language L(G) is the set of all strings over Σ that can be derived from S by applying rules in P.

Chomsky Normal Form (CNF)

The CYK algorithm requires every rule to be in one of two forms:

  • A → BC (two non-terminals)
  • A → a (one terminal)

Any CFG can be converted to CNF without changing its language. The engine does this automatically using four steps: adding a new start, removing ε-productions, removing unit productions, and binarising long rules.

The CYK Algorithm

CYK is an O(n³ · |G|) algorithm that fills a triangular table. Cell [i][j] contains all non-terminals that can derive the substring from position i to j. If the start symbol appears in cell [0][n−1], the string is in the language.

In this engine, cells store amplitudes rather than boolean membership — the amplitude of a non-terminal in a cell reflects its "quantum weight" for spanning that substring.

Quantum inspiration

The simulation models the parser as a quantum system:

  1. The initial state is |S⟩ with amplitude 1.
  2. Applying a production rule is a transformation of the state.
  3. All possible rule applications at each step coexist as a superposition.
  4. Paths that produce a terminal prefix inconsistent with the target undergo destructive interference and are removed.
  5. At the end, all surviving amplitudes are measured (squared and summed) to produce the acceptance probability.

Troubleshooting

"Could not reach server" in the sidebar The Flask server is not running. Start it with python server.py.

"Recognition failed" error in the UI Usually a malformed grammar. Check that:

  • every symbol used in a production is declared in the non-terminals or terminals list
  • the start symbol is in the non-terminals list
  • each production line has the format Head -> body

Probability is 0 for a string that should be accepted Make sure tokens are space-separated. The grammar S -> a b produces the two-token string a b, not the single-token string ab.

The server port is already in use Change the port in the last line of server.py:

app.run(debug=True, port=5051)   # change 5050 to anything free

License

MIT — use, modify, and distribute freely with attribution.

About

A web app that treats CFG membership as a quantum-inspired probability problem. It explores all derivation paths in parallel, assigns amplitudes, and collapses them into an acceptance probability — built with Python, Flask, and vanilla JS. No quantum hardware, just the ideas.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors