A web application that determines whether a string belongs to a Context-Free Grammar (CFG) — but instead of using a plain yes/no answer, it models the problem as a probabilistic state-evolution process inspired by quantum computing.
You define a grammar, type a string, and the engine tells you the probability that the string is accepted, along with a full breakdown of how it arrived at that answer.
Real quantum computers manipulate qubits — quantum states that exist in superposition until measured. This project borrows three ideas from that model and applies them to grammar parsing:
| Quantum concept | How it maps to parsing |
|---|---|
| Superposition | All possible derivation paths are explored at the same time, not one after another |
| Amplitude | Each path carries a numerical weight that decays with depth and branching factor |
| Interference | Valid paths that reach the target string reinforce the acceptance probability; invalid ones cancel |
| Measurement | At the end, all amplitudes are collapsed into a single probability value in [0, 1] |
This is a simulation running on ordinary Python — no quantum hardware or quantum libraries are used.
Input string + Grammar
│
▼
1. CNF Conversion
The grammar is transformed into Chomsky Normal Form (CNF).
CNF requires every rule to be either A → BC or A → a.
This is needed for the CYK algorithm to work.
│
▼
2. CYK Amplitude Table (the "ground truth" pass)
The CYK (Cocke–Younger–Kasami) algorithm fills an n×n table
where each cell [i][j] stores the quantum amplitude for every
non-terminal that can span tokens i through j.
Amplitudes multiply like a tensor product:
amp(A→BC, i..j) += amp(B, i..k) × amp(C, k+1..j)
│
▼
3. Derivation Path Simulation (the "visual" pass)
A breadth-first expansion of all leftmost derivations runs
in parallel. Each thread carries an amplitude that:
- splits evenly across sibling rules (1/√branching)
- prefers shorter rule bodies (1/√|body|)
- decays gently with depth (e^(−0.05·depth))
Threads whose terminal prefix contradicts the target are
terminated immediately (destructive interference).
When too many threads exist, the weakest are pruned (decoherence).
│
▼
4. Measurement
CYK result (authoritative, 70%) and simulation result (30%)
are blended into a final acceptance probability.
If probability ≥ threshold → Accept
Otherwise → Reject
QGR/
├── engine.py Core computation: CNF, CYK, simulator, recognizer
├── server.py Flask API server + built-in example grammars
├── requirements.txt Python dependencies (Flask + Flask-CORS)
├── .gitignore
├── README.md This file
└── UI/
├── index.html Single-page frontend
├── style.css Stylesheet
└── app.js Frontend logic (calls Flask API)
- Python 3.10 or newer — check with
python --version - pip — usually bundled with Python
Open a terminal in the project folder and run:
pip install -r requirements.txtThis installs Flask (the web server) and Flask-CORS (allows the browser to talk to the server).
python server.pyYou should see:
══════════════════════════════════════════════════
Quantum Grammar Recognizer
http://127.0.0.1:5050
══════════════════════════════════════════════════
Go to http://127.0.0.1:5050 in any modern browser.
Click any example in the left sidebar to load a pre-built grammar. Five examples are included:
| Example | Language |
|---|---|
| aⁿbⁿ | Strings like a b, a a b b, a a a b b b |
| Palindromes | Strings that read the same forwards and backwards |
| Balanced Parentheses | Properly nested ( ) |
| Arithmetic Expressions | id + id * id style expressions |
| wwᴿ | Strings of the form w followed by w reversed |
Fill in the four fields on the left:
Non-Terminals — the variables in your grammar, comma-separated.
Example: S, A, B
Terminals — the actual symbols the grammar produces, comma-separated.
Example: a, b
Start Symbol — the non-terminal the derivation starts from.
Example: S
Production Rules — one rule per line. Use -> or → as the arrow. Use ε (or leave the right side blank) for an empty production.
S -> a S b
S -> a b
Input String — the string to test, with each token separated by a space.
a a b b
Important: tokens are space-separated. For the aⁿbⁿ grammar, write
a bnotab.
Acceptance Threshold — the slider sets the minimum probability required to declare Accept. The default is 0.010 (1%). Raising it makes the recognizer stricter.
After clicking Run Recognition, the result panel shows:
- Accepted / Rejected — the final decision
- P(accept) — the blended acceptance probability
- Path counts — how many derivation threads were explored, and how many succeeded
Four tabs provide detailed diagnostics:
Amplitudes — a bar chart of all explored paths ranked by their quantum amplitude. Green bars are successful paths; amber bars are failed ones.
Derivation Paths — the actual step-by-step derivation for each thread (up to 20 paths shown).
CYK Table — the filled dynamic-programming table. Green cells indicate the start symbol spanning the full input — that is, acceptance evidence. Teal cells show other non-terminals with non-zero amplitude.
Interference Log — a chronological list of interference events:
+constructive: a rule was applied, adding a new thread−destructive: a thread was terminated because its terminal prefix contradicted the input~decoherence: a thread was pruned because the population exceeded the maximum
The Flask server exposes a small REST API. You can call it directly with curl or any HTTP client.
Returns engine status.
Returns a list of all built-in grammars (name, description, test strings).
Returns the full grammar definition for one example. Valid keys: anbn, palindromes, balanced_parens, arithmetic, wwr.
Run recognition on a custom grammar.
Request body (JSON):
{
"grammar": {
"variables": ["S"],
"terminals": ["a", "b"],
"start": "S",
"productions": [
{"head": "S", "body": ["a", "S", "b"]},
{"head": "S", "body": ["a", "b"]}
]
},
"string": "a a b b",
"threshold": 0.01
}Response (JSON):
{
"accept": true,
"probability": 0.7,
"threshold": 0.01,
"total_paths": 12,
"successful_paths": 2,
"failed_paths": 10,
"amplitude_distribution": [...],
"cyk_table": [...],
"derivation_paths": [...],
"interference_log": [...],
"step_log": [...]
}Same as /api/recognize but uses a built-in grammar by key.
{"example_key": "anbn", "string": "a a b b", "threshold": 0.01}A CFG is a 4-tuple G = (V, Σ, P, S) where:
- V is the set of non-terminal symbols (variables)
- Σ is the set of terminal symbols
- P is the set of production rules of the form A → α
- S ∈ V is the start symbol
The language L(G) is the set of all strings over Σ that can be derived from S by applying rules in P.
The CYK algorithm requires every rule to be in one of two forms:
- A → BC (two non-terminals)
- A → a (one terminal)
Any CFG can be converted to CNF without changing its language. The engine does this automatically using four steps: adding a new start, removing ε-productions, removing unit productions, and binarising long rules.
CYK is an O(n³ · |G|) algorithm that fills a triangular table. Cell [i][j] contains all non-terminals that can derive the substring from position i to j. If the start symbol appears in cell [0][n−1], the string is in the language.
In this engine, cells store amplitudes rather than boolean membership — the amplitude of a non-terminal in a cell reflects its "quantum weight" for spanning that substring.
The simulation models the parser as a quantum system:
- The initial state is
|S⟩with amplitude 1. - Applying a production rule is a transformation of the state.
- All possible rule applications at each step coexist as a superposition.
- Paths that produce a terminal prefix inconsistent with the target undergo destructive interference and are removed.
- At the end, all surviving amplitudes are measured (squared and summed) to produce the acceptance probability.
"Could not reach server" in the sidebar
The Flask server is not running. Start it with python server.py.
"Recognition failed" error in the UI Usually a malformed grammar. Check that:
- every symbol used in a production is declared in the non-terminals or terminals list
- the start symbol is in the non-terminals list
- each production line has the format
Head -> body
Probability is 0 for a string that should be accepted
Make sure tokens are space-separated. The grammar S -> a b produces the two-token string a b, not the single-token string ab.
The server port is already in use
Change the port in the last line of server.py:
app.run(debug=True, port=5051) # change 5050 to anything freeMIT — use, modify, and distribute freely with attribution.