Commit 01380c9

Add BOOL().
Add styling to SPECIFICATION.md
1 parent d271df8 commit 01380c9

4 files changed: 37 additions and 17 deletions.

SPECIFICATION.md

Lines changed: 30 additions & 15 deletions
@@ -1,5 +1,21 @@
 # Abstract State Machine Language Specification

+<style>
+@import url('https://fonts.googleapis.com/css2?family=Open+Sans:wght@400;700&display=swap');
+@import url('https://fonts.googleapis.com/css2?family=Source+Code+Pro:wght@400;700&display=swap');
+
+:root { --asm-default-text-color: rgb(153,221,255); }
+:root { --asm-default-background-color: rgb(34,34,34); }
+:root { --asm-keyword-background-color: rgb(60,60,60); }
+
+body { color: var(--asm-default-text-color); background-color: var(--asm-default-background-color); font-family: 'Open Sans', sans-serif; }
+code { color: var(--asm-default-text-color); background-color: var(--asm-keyword-background-color); font-family: 'Source Code Pro', monospace; }
+</style>
+
+<p align="center">
+<img alt="ASM-Lang icon" src="./icon.png">
+</p>
+
 This document specifies an imperative programming language with a binary integer data model and an explicit small-step execution semantics. A program is compiled into an initial machine state (a seed configuration) and then executed solely by repeatedly applying a fixed, program-independent state-transition (rewrite) function. All intermediate states are (semi-)human-readable and serializable, and execution can be traced and replayed exactly, including I/O and nondeterministic choices, which are modeled explicitly for deterministic replay.

 ## 1. Overview
@@ -123,13 +139,7 @@ Every state produced during execution is serializable and human-readable. The in

 ## 8. Standard Library

-The standard library is a group of modules for common use cases distributed with the interpreter.
-It includes modules for:
-- CSPRNG: Cryptographically secure (ChaCha20) pseudorandom number generation with seed management.
-- Decimal: convert decimal strings to/from binary integers.
-- Prime: primality testing and prime number generation, along with factorization and other common prime utilities.
-- PRNG: Fast (LCG) pseudorandom number generation with seed management.
-- Waveforms: Waveforms generator functions.
+The standard library is a group of modules for common use cases, distributed with the interpreter in the ASM-Lang directory `\lib` (or `/lib` on non-Windows systems).

 ## 9. Tracebacks and Error Reporting

@@ -138,11 +148,11 @@ When a runtime error or exception occurs the interpreter must produce a determin
 ### 9.1 Traceback semantics (high level)


-GOTOPOINT and GOTO: Two additional control-flow primitives allow program execution to jump to a dynamically-registered program location. The statement form `GOTOPOINT(n)` evaluates the expression `n` and registers a gotopoint with that identifier at the point where the statement executes. Registration occurs at runtime when the `GOTOPOINT` statement executes; subsequent execution (including execution within loops or after imports) may depend on whether the gotopoint has been registered. Identifiers may be `INT` or `STR`; negative `INT` identifiers are invalid.
+`GOTOPOINT` and `GOTO`: Two additional control-flow primitives allow program execution to jump to a dynamically-registered program location. The statement form `GOTOPOINT(n)` evaluates the expression `n` and registers a gotopoint with that identifier at the point where the statement executes. Registration occurs at runtime when the `GOTOPOINT` statement executes; subsequent execution (including execution within loops or after imports) may depend on whether the gotopoint has been registered. Identifiers may be `INT` or `STR`; negative `INT` identifiers are invalid.

-The statement form `GOTO(n)` evaluates `n` at runtime and transfers execution to the previously-registered gotopoint whose identifier equals `n`, matching both type and value. If no gotopoint with that identifier has been registered in a scope visible to the jump target, the interpreter raises a runtime error. GOTO may jump forward or backward relative to the target gotopoint; jumping to an unregistered identifier is an error. Gotopoints are not restricted to a single lexical block: they are visible across the containing function (or top-level program scope) in which they are defined. Implementations may choose to expose an even broader scope (for example, process-wide), but by default gotopoints registered within a function or at top-level are available to any GOTO executed within the same function or top-level code. This change enables cross-block jumps while preserving a clear containment model: a GOTO cannot target a gotopoint defined in an unrelated function or module unless the implementation explicitly exposes that mapping.
+The statement form `GOTO(n)` evaluates `n` at runtime and transfers execution to the previously-registered gotopoint whose identifier equals `n`, matching both type and value. If no gotopoint with that identifier has been registered in a scope visible to the jump target, the interpreter raises a runtime error. `GOTO` may jump forward or backward relative to the target gotopoint; jumping to an unregistered identifier is an error. Gotopoints are not restricted to a single lexical block: they are visible across the containing function (or top-level program scope) in which they are defined. Implementations may choose to expose an even broader scope (for example, process-wide), but by default gotopoints registered within a function or at top-level are available to any `GOTO` executed within the same function or top-level code. This change enables cross-block jumps while preserving a clear containment model: a `GOTO` cannot target a gotopoint defined in an unrelated function or module unless the implementation explicitly exposes that mapping.

-GOTO and GOTOPOINT are intended to be low-level primitives and their use can make programs harder to reason about. They are serialized in the stat log like other statements so that execution is fully replayable for debugging and tracing.
+`GOTO` and `GOTOPOINT` are intended to be low-level primitives and their use can make programs harder to reason about. They are serialized in the state log like other statements so that execution is fully replayable for debugging and tracing.
 - Trigger: a traceback is produced when a runtime error occurs that prevents normal forward execution (e.g., an assertion failure, divide by zero, undefined variable reference, executing `RETURN` outside of a function, or any other interpreter-defined runtime error).
 - Content: the traceback must list frames in chronological call order from the outermost (earliest) frame to the innermost (where the error occurred), and for each frame include: function name (or `<top-level>` for global code), precise source location, a short excerpt of the offending statement, and identifiers linking to the corresponding states in the state log.
 - State linkage: every frame must reference at least one serialized state snapshot from the state log that corresponds to the machine configuration immediately before the failing rewrite step for that frame. The innermost frame must also reference the rewrite (transition) record that produced the error (that is, the failing step).
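Review note: the `GOTOPOINT`/`GOTO` rules in this hunk (runtime registration, lookup by type and value, negative `INT` identifiers rejected, unregistered targets are errors) can be sketched as a small registry. This is a minimal Python illustration, not the interpreter's code: `GotoRegistry`, `register`, `resolve`, and the use of a statement index as the jump target are all hypothetical names and choices, and overwriting on re-registration is an assumption the spec text above does not settle.

```python
from typing import Dict, Tuple, Union

GotoId = Union[int, str]


class GotoRegistry:
    """Hypothetical per-scope gotopoint table (one per function or top-level)."""

    def __init__(self) -> None:
        # Key on (type name, value) so INT 1 and STR "1" never collide.
        self._points: Dict[Tuple[str, GotoId], int] = {}

    def register(self, ident: GotoId, statement_index: int) -> None:
        """Model GOTOPOINT(n): runs when the statement executes."""
        if isinstance(ident, bool) or not isinstance(ident, (int, str)):
            raise TypeError("gotopoint identifiers must be INT or STR")
        if isinstance(ident, int) and ident < 0:
            raise ValueError("negative INT identifiers are invalid")
        # Assumption: a later registration of the same identifier overwrites.
        self._points[(type(ident).__name__, ident)] = statement_index

    def resolve(self, ident: GotoId) -> int:
        """Model GOTO(n): return the registered target or raise."""
        key = (type(ident).__name__, ident)
        if key not in self._points:
            raise RuntimeError(f"GOTO to unregistered gotopoint {ident!r}")
        return self._points[key]
```

Keeping one registry per function (plus one for top-level code) would give the default containment model described above; a process-wide scope would simply share a single instance.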
@@ -163,10 +173,10 @@ To make tracebacks precise and implementable, the state log (and the machine sta

 The following format is recommended for the concise textual traceback:

-Traceback (most recent call last):
+<code>Traceback (most recent call last):
 File "<file>", line <line>, in <function_or_<top-level>>
 <statement excerpt>
-State log index: <step_index> State id: <state_id>
+State log index: <step_index> State id: <state_id></code>

 The final (innermost) frame is then followed by a short error message that includes the failing rewrite (for example: `DivisionByZero`), the rewrite rule name and the `step_index` at which it failed. In verbose mode, each frame block is followed by a labelled `State snapshot:` and a `State transformation:` section that contain the serialized `env_snapshot` and the `rewrite_record` (including `from_state_id` and `to_state_id`) so that the failure can be reproduced by replaying the states and the associated inputs.

@@ -191,14 +201,14 @@ The trace must clearly identify the point of origin (the expression or statement

 ### 9.8 Example (concise textual traceback)

-Traceback (most recent call last):
+<code>Traceback (most recent call last):
 File "prog.asmln", line 21, in <top-level>
 result = compute(foo, bar)
 State log index: 121 State id: s_0121
 File "lib.asmln", line 88, in compute
 x = DIV(a, b)
 State log index: 123 State id: s_0123
-DivisionByZero: attempted to DIV by zero at step_index=123 (rewrite: DIV)
+DivisionByZero: attempted to DIV by zero at step_index=123 (rewrite: DIV)</code>

 This example shows the outer call at the program top-level and the innermost failing call in `compute`. A verbose report would additionally print the `env_snapshot` for `compute` showing `a: 17, b: 0` and the full `rewrite_record` describing the DIV operation that failed.
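Review note: the concise format in this example is regular enough to generate mechanically. The sketch below shows one way to render it in Python; the `Frame` dataclass, its field names, and `format_traceback` are hypothetical and only mirror the fields the spec lists (source location, excerpt, state log index, state id). Any leading indentation of the frame lines is omitted because it is not visible in the diff.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    filename: str
    line: int
    function: str     # "<top-level>" for global code
    excerpt: str      # short excerpt of the offending statement
    step_index: int   # index into the state log
    state_id: str


def format_traceback(frames: List[Frame], error: str, message: str, rewrite: str) -> str:
    """Render frames outermost-first, then the error line for the innermost frame."""
    if not frames:
        raise ValueError("need at least one frame")
    out = ["Traceback (most recent call last):"]
    for f in frames:
        out.append(f'File "{f.filename}", line {f.line}, in {f.function}')
        out.append(f.excerpt)
        out.append(f"State log index: {f.step_index} State id: {f.state_id}")
    failing = frames[-1]  # the innermost frame is where the rewrite failed
    out.append(f"{error}: {message} at step_index={failing.step_index} (rewrite: {rewrite})")
    return "\n".join(out)
```

Called with the two frames from the example (`prog.asmln` line 21 and `lib.asmln` line 88), the error `DivisionByZero`, the message `attempted to DIV by zero`, and the rewrite `DIV`, this produces the eight lines shown above, minus the `<code>` wrapper.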

@@ -300,6 +310,7 @@ Tensor indexing uses the expression form `tensor[i1, i2, ..., iN]`. The number o
 - `OR(ANY: a1, ..., ANY: aN):INT` ; Boolean OR
 - `XOR(ANY: a1, ..., ANY: aN):INT` ; Boolean XOR
 - `NOT(ANY: a):INT` ; Boolean NOT
+- `BOOL(ANY: item):INT` ; returns `1` if `item` is truthy (`INT`: non-zero; `STR`: non-empty; `TNS`: any element is truthy), otherwise `0`

 ### Comparisons
 - `EQ(ANY: a, ANY: b):INT` ; 1 if a == b else 0
@@ -326,7 +337,11 @@ Tensor indexing uses the expression form `tensor[i1, i2, ..., iN]`. The number o
 - `MADD/MSUB/MMUL/MDIV(TNS: x, TNS: y):TNS` — Elementwise addition, subtraction, multiplication, and integer division. Shapes must match; all elements must be `INT`. `MDIV` raises on division by zero.
 - `MSUM(TNS: t1, ..., TNS: tN):TNS` — Elementwise sum across tensors. Shapes must match; elements must be `INT`.
 - `MPROD(TNS: t1, ..., TNS: tN):TNS` — Elementwise product across tensors. Shapes must match; elements must be `INT`.
-- `TADD/TSUB/TMUL/TDIV/TPOW(TNS: x, INT: y):TNS` — Elementwise `INT`-scalar add, subtract, multiply, divide, and exponentiate. Division by zero and negative exponents are errors (matching `DIV`/`POW` semantics).
+- `TADD(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar addition.
+- `TSUB(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar subtraction.
+- `TMUL(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar multiplication.
+- `TDIV(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar integer division. Division by zero is an error.
+- `TPOW(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar exponentiation. Negative exponents are an error.

 ### Logarithms
 - `LOG(INT: a):INT` ; floor(log2(a)) for a > 0
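Review note: the five scalar tensor operators split out in this hunk share one shape, which is to apply an `INT` operation with a fixed scalar to every element. The Python sketch below illustrates that under stated assumptions: tensors are modelled as nested lists of `int`, the lowercase function names are hypothetical, and `//` (floor division) stands in for ASM-Lang's integer division, whose rounding for negative operands is not stated in this diff.

```python
from typing import Callable, List, Union

Tensor = Union[int, List["Tensor"]]  # nested lists stand in for TNS


def _map_scalar(t: Tensor, fn: Callable[[int], int]) -> Tensor:
    """Apply fn to every INT element, preserving the tensor's shape."""
    if isinstance(t, list):
        return [_map_scalar(e, fn) for e in t]
    return fn(t)


def tadd(x: Tensor, y: int) -> Tensor:
    return _map_scalar(x, lambda e: e + y)


def tsub(x: Tensor, y: int) -> Tensor:
    return _map_scalar(x, lambda e: e - y)


def tmul(x: Tensor, y: int) -> Tensor:
    return _map_scalar(x, lambda e: e * y)


def tdiv(x: Tensor, y: int) -> Tensor:
    if y == 0:
        raise ZeroDivisionError("TDIV by zero")
    # Assumption: floor division; the real DIV rounding may differ.
    return _map_scalar(x, lambda e: e // y)


def tpow(x: Tensor, y: int) -> Tensor:
    if y < 0:
        raise ValueError("TPOW with a negative exponent")
    return _map_scalar(x, lambda e: e ** y)
```

Checking the scalar once, before mapping, means an invalid divisor or exponent is rejected even for an empty tensor.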

asm-lang.exe

1.07 KB
Binary file not shown.

asm-lang.py

Lines changed: 2 additions & 2 deletions
@@ -19,7 +19,7 @@ def _parse_statements_from_source(text: str, filename: str) -> List[Statement]:


 def run_repl(verbose: bool) -> int:
-    print("\x1b[38;2;153;221;255mASM-Lang\033[0m REPL. Enter statements, blank line to run buffer.")
+    print("\x1b[38;2;153;221;255mASM-Lang\033[0m REPL. Enter statements, blank line to run buffer.") # "ASM-Lang" in light blue
     # Use "<string>" as the REPL's effective source filename so that MAIN() and imports behave
     had_output = False
     def _output_sink(text: str) -> None:
@@ -35,7 +35,7 @@ def _output_sink(text: str) -> None:
     buffer: List[str] = []

     while True:
-        prompt = "\x1b[38;2;153;221;255m>>>\033[0m " if not buffer else "\x1b[38;2;153;221;255m..>\033[0m "
+        prompt = "\x1b[38;2;153;221;255m>>>\033[0m " if not buffer else "\x1b[38;2;153;221;255m..>\033[0m " # light blue
         if had_output:
             # Ensure prompt starts on a fresh line if the program printed anything
             print()
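Review note: the escape sequences annotated in this file are 24-bit ("truecolor") SGR codes of the form `ESC[38;2;R;G;Bm`, followed by `ESC[0m` to reset. `\x1b` and `\033` are two spellings of the same ESC byte. A small helper would keep the colour in one place. The sketch below is only a suggestion: `colorize`, `LIGHT_BLUE`, and `prompt` are hypothetical names that do not appear in `asm-lang.py`.

```python
def colorize(text: str, r: int, g: int, b: int) -> str:
    """Wrap text in a 24-bit foreground colour escape and reset afterwards."""
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("colour channels must be in 0..255")
    return f"\x1b[38;2;{r};{g};{b}m{text}\x1b[0m"


# Same RGB triple as the literals in the diff above.
LIGHT_BLUE = (153, 221, 255)


def prompt(continuation: bool) -> str:
    """Return the primary prompt, or the continuation prompt while buffering."""
    return colorize("..>" if continuation else ">>>", *LIGHT_BLUE) + " "
```

With this, the loop's prompt expression would reduce to `prompt(bool(buffer))`, and the trailing comments explaining the colour would no longer be needed.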

interpreter.py

Lines changed: 5 additions & 0 deletions
@@ -329,6 +329,7 @@ def __init__(self) -> None:
         self._register_custom("OR", 2, 2, self._or)
         self._register_custom("XOR", 2, 2, self._xor)
         self._register_custom("NOT", 1, 1, self._not)
+        self._register_custom("BOOL", 1, 1, self._bool)
         self._register_custom("ARGV", 0, 0, self._argv)
         self._register_custom("EQ", 2, 2, self._eq)
         self._register_custom("IN", 2, 2, self._in)
@@ -706,6 +707,10 @@ def _xor(self, _: "Interpreter", args: List[Value], __: List[Expression], ___: E
     def _not(self, _: "Interpreter", args: List[Value], __: List[Expression], ___: Environment, ___loc: SourceLocation) -> Value:
         return Value(TYPE_INT, 1 if self._as_bool_value(args[0]) == 0 else 0)

+    def _bool(self, _: "Interpreter", args: List[Value], __: List[Expression], ___: Environment, __loc: SourceLocation) -> Value:
+        # BOOL(ANY: item):INT -> truthiness of item (INT: nonzero, STR: non-empty, TNS: any true element)
+        return Value(TYPE_INT, 1 if self._as_bool_value(args[0]) != 0 else 0)
+
     def _eq(self, interpreter: "Interpreter", args: List[Value], __: List[Expression], ___: Environment, ___loc: SourceLocation) -> Value:
         a, b = args
         if a.type != b.type:
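Review note: `_bool` delegates to `_as_bool_value`, whose body is not part of this diff. The standalone Python sketch below models the truthiness rule as the spec line for `BOOL` states it (`INT`: non-zero, `STR`: non-empty, `TNS`: any element true). It is an illustration under assumptions, not the interpreter's implementation: values are plain `int`, `str`, and nested lists rather than `Value` objects, and `as_bool_value`, `bool_op`, and `not_op` are hypothetical names.

```python
from typing import List, Union

Item = Union[int, str, List["Item"]]  # nested lists stand in for TNS


def as_bool_value(item: Item) -> int:
    """Hypothetical stand-in for the interpreter's `_as_bool_value` helper."""
    if isinstance(item, int):
        return 1 if item != 0 else 0
    if isinstance(item, str):
        return 1 if item != "" else 0
    # TNS: true if any element (recursively) is true.
    return 1 if any(as_bool_value(e) for e in item) else 0


def bool_op(item: Item) -> int:
    """BOOL(item): 1 if item is truthy, otherwise 0."""
    return as_bool_value(item)


def not_op(item: Item) -> int:
    """NOT(item): the complement of BOOL(item)."""
    return 1 - as_bool_value(item)
```

Under this reading `BOOL(x)` is equivalent to `NOT(NOT(x))`, which matches how `_bool` and `_not` differ only in the comparison against `0`.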
