A compiler for a Persian-syntax programming language, built with Flex and Bison. Source programs are written entirely in Persian script (including keywords, identifiers, and numerals), and the compiler produces intermediate C code via a quadruple-based intermediate representation.
- Full Persian keyword set: all reserved words are valid Persian words
- Persian & ASCII numerals: mix `۱۲۳` and `123` freely
- Persian & ASCII punctuation: semicolons (`؛`/`;`), commas (`،`/`,`), and question marks (`؟`/`?`)
- Four primitive types: integer, real, character, boolean
- Structs, global/local variables, constants, and arrays
- Functions with typed parameters and return values
- Control flow: if/else, while, switch/case/default
- Rich operator set: arithmetic, relational, logical, assignment shortcuts, increment/decrement
- Intermediate code generation: outputs a valid C file (`generatedCode.c`) that can be compiled with any C compiler
- Parse trace: writes grammar rule reductions to `output.txt` for debugging
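
The mixed-numeral feature can be pictured as a normalisation step that maps Persian digits onto their ASCII counterparts before a number is converted. The real handling lives in the Flex rules of `lex.lex` and may work differently. The helper below is only a sketch: the name `normalize_persian_digits` is hypothetical, and it assumes UTF-8 input, where the Persian digits U+06F0..U+06F9 are encoded as the byte pairs `0xDB 0xB0` .. `0xDB 0xB9`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the repository.
 * Copies `in` to `out`, replacing every UTF-8 encoded Persian digit
 * (U+06F0..U+06F9, bytes 0xDB 0xB0..0xB9) with its ASCII digit.
 * Returns the length written, or (size_t)-1 if `out` is too small. */
size_t normalize_persian_digits(const char *in, char *out, size_t out_size)
{
    assert(in != NULL && out != NULL);
    if (out_size == 0)
        return (size_t)-1;

    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    while (*p != '\0') {
        unsigned char c;
        if (p[0] == 0xDB && p[1] >= 0xB0 && p[1] <= 0xB9) {
            /* Two-byte Persian digit: emit one ASCII digit. */
            c = (unsigned char)('0' + (p[1] - 0xB0));
            p += 2;
        } else {
            /* Any other byte is copied through unchanged. */
            c = *p;
            p += 1;
        }
        if (n + 1 >= out_size)   /* keep room for the terminator */
            return (size_t)-1;
        out[n++] = (char)c;
    }
    out[n] = '\0';
    return n;
}
```

With this sketch, an input such as `۱۲۳and123` comes out as `123and123`, so a single ASCII-only number rule could handle both digit sets.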
| Persian | English equivalent |
|---|---|
| `برنامه` | program (entry point declaration) |
| `ساختار` | struct |
| `ثابت` | const |
| `صحیح` | int |
| `اعشاری` | real / float |
| `حرف` | char |
| `منطقی` | bool |
| `درست` | true |
| `غلط` | false |
| `اگر` | if |
| `آنگاه` | then |
| `وگرنه` | else |
| `کلید` | switch |
| `حالت` | case |
| `پیشفرض` | default |
| `تمام` | end (closes a switch block) |
| `وقتی` | while |
| `برگردان` | return |
| `بشکن` | break |
| `و` | and (`&&`) |
| `یا` | or (`\|\|`) |
| `یاوگرنه` | xor |
| `وهمچنین` | also (short-circuit and) |
| `خلاف` | not (`!`) |
| Category | Operators |
|---|---|
| Arithmetic | `+` `-` `*` `/` `%` |
| Relational | `<` `>` `<=` `>=` `==` |
| Assignment | `=` `+=` `-=` `*=` `/=` |
| Increment / Decrement | `++` `--` |
| Unary | `-` (negation), `*` (array length), `؟`/`?` (random element) |

```
برنامه فاکتوریل

صحیح محاسبه_فاکتوریل (صحیح عدد) {
    اگر عدد < ۰ آنگاه برگردان -۱؛
    وگرنه اگر عدد == ۱ آنگاه برگردان ۱؛
    وگرنه {
        صحیح قبلی = محاسبه_فاکتوریل (عدد - ۱)؛
        برگردان قبلی * عدد؛
    }
}

صحیح اصلی () {
    صحیح نتیجه = محاسبه_فاکتوریل(۵)؛
}
```

```
برنامه جمله

صحیح اصلی () {
    صحیح کنترلگر = ۱۰؛
    صحیح مقدار = ۱؛
    وقتی (درست) {
        اگر کنترلگر < ۰ آنگاه بشکن؛
        وگرنه {
            مقدار *= کنترلگر؛
            کنترلگر--؛
        }
    }

    صحیح عملگر = ۲؛
    اعشاری اول = ۲.۵؛
    صحیح دوم = ۴؛
    کلید (عملگر)
        حالت ۱: { اول += دوم؛ بشکن؛ }؛
        حالت ۲: { اول = اول - دوم؛ بشکن؛ }؛
        پیشفرض: اول = ۰؛
    تمام
}
```

```
برنامه من

ساختار ماهی {
    ثابت صحیح آ = ۲۳؛
    اعشاری ب = ۰.۰۳؛
    حرف ث = 'آ'؛
    منطقی ت = درست؛
}
```

| Tool | Purpose |
|---|---|
| Flex | Lexer generator |
| Bison | Parser generator |
| `clang++` (C++14) or `g++` | Compiles the generated scanner and parser |

Install on Debian/Ubuntu:

```bash
sudo apt-get install flex bison clang
```

```bash
# Build the compiler and run it against input.txt
make
# Clean all generated files
make clean
# Build with verbose parser output (parser.output)
make debug
```

The compiler reads from `input.txt` and produces:

| Output file | Contents |
|---|---|
| `output.txt` | Grammar rule trace (one line per reduction) |
| `generatedCode.c` | Intermediate C code ready to compile |

To execute the generated program:

```bash
clang generatedCode.c -o program && ./program
```

```
Persian-Compiler/
├── lex.lex          # Flex lexer: tokenises Persian source code
├── parser.y         # Bison parser: grammar rules + intermediate code generation
├── llist.h          # Linked-list header (used for backpatching)
├── llist.cpp        # Linked-list implementation
├── Makefile         # Build rules
├── input.txt        # Sample program (main test input)
├── expression.txt   # Sample: expressions and arrays
├── function.txt     # Sample: functions and structs
└── statement.txt    # Sample: control-flow statements
```
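
The tree lists `llist.h` as the linked list used for backpatching. As a reminder of the technique: when a jump is emitted before its destination is known, its index is kept on a list, and every jump on that list is filled in once the target becomes known. The sketch below illustrates that idea in plain C. The names `Node`, `makelist`, `merge`, `backpatch`, and `jump_target` are hypothetical and are not the interface of `llist.h`, which may be organised differently.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration only. */
typedef struct Node {
    int quad_index;          /* index of a jump whose target is still unknown */
    struct Node *next;
} Node;

#define MAX_QUADS 256
#define UNRESOLVED (-1)
static int jump_target[MAX_QUADS];   /* jump_target[i] = destination of quad i */

/* Create a one-element list for quad_index and mark that jump unresolved. */
Node *makelist(int quad_index)
{
    assert(quad_index >= 0 && quad_index < MAX_QUADS);
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        abort();
    n->quad_index = quad_index;
    n->next = NULL;
    jump_target[quad_index] = UNRESOLVED;
    return n;
}

/* Concatenate two lists of pending jumps and return the combined head. */
Node *merge(Node *a, Node *b)
{
    if (a == NULL)
        return b;
    Node *tail = a;
    while (tail->next != NULL)
        tail = tail->next;
    tail->next = b;
    return a;
}

/* Fill in `target` for every jump on the list, freeing the list as it goes. */
void backpatch(Node *list, int target)
{
    while (list != NULL) {
        Node *next = list->next;
        jump_target[list->quad_index] = target;
        free(list);
        list = next;
    }
}
```

For instance, the `break` statements inside a `while` body could each be added to one list with `makelist` and `merge`, and a single `backpatch` call after the loop would point them all at the first quadruple following it.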

```
Persian source (.txt)
        │
        ▼
Lexer (Flex)        lex.lex:  converts Persian text to tokens,
        │                     handles both Persian & ASCII digits/punctuation
        ▼
Parser (Bison)      parser.y: validates grammar, builds symbol table,
        │                     emits quadruples (4-tuple intermediate representation)
        ▼
Code Generator      parser.y: translates quadruples to C statements,
        │                     writes generatedCode.c
        ▼
generatedCode.c               standard C file, compilable with clang/gcc
```

The intermediate representation uses quadruples of the form (operation, arg1, arg2, result), covering arithmetic, comparisons, control-flow jumps, array access, type casts, and print helpers.
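
To make the (operation, arg1, arg2, result) shape concrete, here is a minimal sketch of how one quadruple might be stored and rendered as a C statement. The `Quad` struct, the `quad_to_c` function, and the operation spellings (`"goto"`, `"if"`, `"="`) are assumptions made for illustration. The actual representation and emitted code are defined in `parser.y` and may differ.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout; field names are illustrative only. */
typedef struct {
    const char *op;       /* e.g. "+", "<", "=", "if", "goto" */
    const char *arg1;
    const char *arg2;
    const char *result;   /* destination variable or jump label */
} Quad;

/* Render one quadruple as a single C statement into buf.
 * Returns the value of snprintf (characters that would be written). */
int quad_to_c(const Quad *q, char *buf, size_t size)
{
    assert(q != NULL && buf != NULL);

    if (strcmp(q->op, "goto") == 0)        /* unconditional jump */
        return snprintf(buf, size, "goto %s;", q->result);
    if (strcmp(q->op, "if") == 0)          /* conditional jump on arg1 */
        return snprintf(buf, size, "if (%s) goto %s;", q->arg1, q->result);
    if (strcmp(q->op, "=") == 0)           /* plain copy */
        return snprintf(buf, size, "%s = %s;", q->result, q->arg1);

    /* Everything else is treated as a binary operator. */
    return snprintf(buf, size, "%s = %s %s %s;",
                    q->result, q->arg1, q->op, q->arg2);
}
```

Under these assumptions, the quadruple `("+", "a", "b", "t1")` renders as `t1 = a + b;`, and `("if", "t2", NULL, "L3")` renders as `if (t2) goto L3;`.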