|
45 | 45 |
|
46 | 46 | ## 1. Overview |
47 | 47 |
|
48 | | -The language is a familiar statement-based, imperative language. Programs consist of variable declarations via assignment, expressions, and control-flow constructs such as `IF`, `ELSIF`, `ELSE`, `WHILE`, and `FOR`. ASM-Lang now has three runtime data types: binary integers (`INT`), strings (`STR`), and non-scalar tensors (`TNS`). Identifiers, function parameters, and return values are statically typed; the type of every symbol must be declared when it is first introduced. Computation proceeds by evaluating expressions and executing statements in sequence, with explicit constructs for branching and looping. Input and output are modeled through built-in operators, in particular `INPUT` and `PRINT`. |
| 48 | +The language is a familiar statement-based, imperative language. Programs consist of variable declarations via assignment, expressions, and control-flow constructs such as `IF`, `ELSIF`, `ELSE`, `WHILE`, and `FOR`. ASM-Lang has four runtime data types: binary integers (`INT`), binary floating-point numbers (`FLT`, IEEE754), strings (`STR`), and non-scalar tensors (`TNS`). Identifiers, function parameters, and return values are statically typed; the type of every symbol must be declared when it is first introduced. Computation proceeds by evaluating expressions and executing statements in sequence, with explicit constructs for branching and looping. Input and output are modeled through built-in operators, in particular `INPUT` and `PRINT`. |
49 | 49 |
|
50 | 50 | The interpreter compiles source code into a single initial configuration (the seed state), which includes the program code, an empty variable environment, and an initial I/O history. It then advances execution by repeatedly applying a single, fixed small-step transition function that is independent of the particular program. A disassembler and log view expose all intermediate states so that every step and every control-flow decision is inspectable and replay-able. |
51 | 51 |
|
|
93 | 93 |
|
94 | 94 | ## 3. Data Model |
95 | 95 |
|
96 | | -ASM-Lang supports three literal data types: binary integers, strings, and non-scalar tensors. |
| 96 | +ASM-Lang supports four runtime data types: binary integers, binary floating-point numbers, strings, and non-scalar tensors. |
97 | 97 |
|
98 | 98 | Binary integer literal: an unsigned non-empty sequence of `{0,1}` (for example, `0`, `1`, `1011`), or a signed literal formed by a leading `-` (the dash is part of the literal, not an operator) followed by optional spaces, tabs, or carriage returns and then a non-empty sequence of `{0,1}`. A `-` that does not immediately introduce a literal is a syntax error. |
99 | 99 |
|
| 100 | +Binary floating-point literal: an IEEE754 floating-point value written in binary fixed-point notation `n.n`, where both sides of the radix point are non-empty sequences of `{0,1}`. Examples: |
| 101 | + |
| 102 | +- `0.1` denotes one-half. |
| 103 | +- `0.01` denotes one-quarter. |
| 104 | +- `0.11` denotes three-quarters. |
| 105 | + |
| 106 | +FLT literals MUST NOT begin with the radix point (so `.1` is invalid). A leading `-` may prefix a FLT literal using the same rules as for integers (the dash is part of the literal and is not an operator). |
| 107 | + |
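The FLT literal rule above can be modeled with a short Python sketch. This is an illustration only, not the interpreter's lexer: the helper name `parse_flt_literal` and the regular expression are assumptions introduced here, and the handling of whitespace after a leading `-` follows the integer-literal rule quoted earlier.

```python
import re

# Hypothetical helper (not part of the specification): models the FLT literal rule.
# Optional "-", optional spaces/tabs/carriage returns, then bits "." bits.
_FLT_LITERAL = re.compile(r"(-?)[ \t\r]*([01]+)\.([01]+)")


def parse_flt_literal(text: str) -> float:
    """Convert a binary fixed-point literal such as '0.11' to a Python float."""
    match = _FLT_LITERAL.fullmatch(text)
    if match is None:
        # Covers '.1' (empty integer part), '1.' and any non-binary digit.
        raise ValueError(f"invalid FLT literal: {text!r}")
    sign, whole, frac = match.groups()
    # The integer part is read in base 2; the fraction is its bits divided by 2**len(frac).
    value = int(whole, 2) + int(frac, 2) / (1 << len(frac))
    return -value if sign else value
```

Under these assumptions `parse_flt_literal("0.11")` evaluates to `0.75`, matching the three-quarters example, while `parse_flt_literal(".1")` raises `ValueError`.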
100 | 108 | String literal: a sequence of ASCII characters enclosed in either double quotation marks (`"`) or single quotation marks (`'`) with no escape processing. A string opened with one delimiter must be closed with the same delimiter; other quotation characters appearing inside the string are treated as ordinary characters. Newlines are not permitted inside string literals. The literal's value is the contained character sequence. |
101 | 109 |
|
102 | 110 | Tensor literal: a non-empty bracketed collection of expressions. Each pair of matching brackets introduces a dimension; nested brackets must form a rectangular shape (all sublists at a given depth have the same length) or a syntax error is raised. The outermost bracket corresponds to dimension 1. Examples (lengths shown in binary): |
|
111 | 119 |
|
112 | 120 | All other tensor operations are non-mutating: tensor literals and tensor-valued built-ins produce new tensor values rather than mutating existing ones. Because indexed assignment mutates a tensor object, if the same tensor value is aliased (bound to multiple identifiers, passed as an argument, or stored inside another tensor), all aliases observe the mutation. |
113 | 121 |
|
114 | | -Every runtime value has a static type: `INT`, `STR`, or `TNS`. Integers are conceptually unbounded mathematical integers. Strings are byte strings of ASCII characters. Tensors are non-scalar aggregates whose elements may be `INT`, `STR`, or `TNS`. When a Boolean interpretation is required, `INT` treats 0 as false and non-zero as true; `STR` treats the empty string as false and any non-empty string as true; `TNS` is true if any contained element is true by these rules, otherwise false. Control-flow conditions (`IF`, `ELSIF`, `WHILE`) and `ASSERT` convert strings to integers using the same rules as the `INT` built-in; tensors are first reduced to their Boolean truth value (1 or 0). |
| 122 | +Every runtime value has a static type: `INT`, `FLT`, `STR`, or `TNS`. Integers are conceptually unbounded mathematical integers. Floats are IEEE754 binary floating-point numbers. Strings are byte strings of ASCII characters. Tensors are non-scalar aggregates whose elements may be `INT`, `FLT`, `STR`, or `TNS`. |
| 123 | + |
| 124 | +When a Boolean interpretation is required, `INT` treats 0 as false and non-zero as true; `FLT` treats 0.0 as false and any non-zero value as true; `STR` treats the empty string as false and any non-empty string as true; `TNS` is true if any contained element is true by these rules, otherwise false. Control-flow conditions (`IF`, `ELSIF`, `WHILE`) and `ASSERT` convert strings to integers using the same rules as the `INT` built-in; tensors are first reduced to their Boolean truth value (1 or 0). |
| 125 | + |
| 126 | +`INT` and `FLT` are not interoperable: no implicit conversion occurs. Operators that accept both types require that all numeric arguments have the same numeric type. |
115 | 127 |
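The Boolean interpretation and the no-mixing rule can be sketched in Python as follows. The mapping of ASM-Lang values onto Python values (`int` for `INT`, `float` for `FLT`, `str` for `STR`, nested `list` for `TNS`) and both function names are assumptions made for this illustration; the sketch covers only the general truth rule, not the string-to-integer conversion that control-flow conditions apply.

```python
def is_true(value) -> bool:
    """Boolean interpretation of a modeled ASM-Lang value (illustrative only)."""
    if isinstance(value, (int, float)):
        return value != 0  # INT 0 and FLT 0.0 are false, anything else is true
    if isinstance(value, str):
        return value != ""  # only the empty string is false
    if isinstance(value, list):
        # A tensor is true if any contained element is true (recursing into sub-tensors).
        return any(is_true(element) for element in value)
    raise TypeError(f"unsupported value: {value!r}")


def check_same_numeric_type(*args) -> type:
    """Return int or float if all arguments share that type; otherwise raise."""
    kinds = {type(arg) for arg in args}
    if kinds == {int}:
        return int
    if kinds == {float}:
        return float
    raise TypeError("INT and FLT arguments cannot be mixed")
```

In this model `is_true([[0, ""], [0.0, 0]])` is false because no element is true, and `check_same_numeric_type(1, 2.0)` raises because no implicit conversion takes place.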
|
116 | 128 | ## 4. Statements and Control Flow |
117 | 129 |
|
|
343 | 355 | - `MUL(INT:a, INT:b):INT` ; a * b |
344 | 356 | - `DIV(INT: a, INT: b):INT` ; floor(a / b) |
345 | 357 | - `CDIV(INT: a, INT: b):INT` ; ceil(a / b) |
346 | | -- `POW(INT: a, INT: b):INT` ; a ^ b (b >= 0) |
| 358 | +- `POW(INT: a, INT: b):INT` ; a ^ b |
| 359 | +- `ROOT(INT|FLT: x, INT|FLT: n):INT|FLT` ; nth root of `x`. No mixing of `INT` and `FLT` is allowed. For `INT` arguments `n` must be non-zero; positive `n` returns the integer nth root (largest integer r with r^n <= x for x >= 0); negative `n` yields an integer result only for `x` equal to `1` or `-1` (reciprocal is integer), and `x < 0` requires odd `n`. For `FLT` arguments the result is `x^(1/n)` (negative `n` allowed); negative `x` is allowed only when `n` is an odd integer. Division by zero is an error. |
347 | 360 | - `MOD(INT: a, INT: b):INT` ; remainder of a / b |
348 | 361 | - `NEG(INT: a):INT` ; -a (additive inverse) |
349 | 362 | - `ABS(INT: a):INT` ; absolute value of a |
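The integer case of `ROOT` for `x >= 0` and positive `n` (the largest integer `r` with `r^n <= x`) can be computed exactly with a binary search. The Python sketch below is one possible approach, not the interpreter's implementation; the function name `int_root` is hypothetical, and the negative-`n`, negative-`x`, and `FLT` cases described above are deliberately left out.

```python
def int_root(x: int, n: int) -> int:
    """Largest integer r with r**n <= x, for x >= 0 and n > 0 (illustrative only)."""
    if n <= 0 or x < 0:
        raise ValueError("this sketch only covers x >= 0 and n > 0")
    lo, hi = 0, 1
    # Double hi until hi**n exceeds x, so the answer lies in [lo, hi).
    while hi ** n <= x:
        hi *= 2
    # Invariant: lo**n <= x < hi**n. Shrink the interval to width 1.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if mid ** n <= x:
            lo = mid
        else:
            hi = mid
    return lo
```

Because it uses only integer arithmetic, the search stays exact for the unbounded integers the data model describes, which a floating-point `x ** (1 / n)` would not.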
|
366 | 379 |
|
367 | 380 | ### Comparisons |
368 | 381 | - `EQ(ANY: a, ANY: b):INT` ; 1 if a == b else 0 |
369 | | -- `GT(INT: a, INT: b):INT` ; 1 if a > b else 0 |
370 | | -- `LT(INT: a, INT: b):INT` ; 1 if a < b else 0 |
371 | | -- `GTE(INT: a, INT: b):INT` ; 1 if a >= b else 0 |
372 | | -- `LTE(INT: a, INT: b):INT` ; 1 if a <= b else 0 |
| 382 | +- `GT(INT|FLT: a, INT|FLT: b):INT` ; 1 if a > b else 0 (no mixing INT/FLT) |
| 383 | +- `LT(INT|FLT: a, INT|FLT: b):INT` ; 1 if a < b else 0 (no mixing INT/FLT) |
| 384 | +- `GTE(INT|FLT: a, INT|FLT: b):INT` ; 1 if a >= b else 0 (no mixing INT/FLT) |
| 385 | +- `LTE(INT|FLT: a, INT|FLT: b):INT` ; 1 if a <= b else 0 (no mixing INT/FLT) |
373 | 386 |
|
374 | 387 | ### Aggregates / Utilities |
375 | | -- `MAX(INT|STR: a1, ..., INT|STR: aN):INT|STR` ; `INT` -> numeric max; `STR` -> longest string; mixing `INT` and `STR` or supplying tensors is an error |
376 | | -- `MIN(INT|STR: a1, ..., INT|STR: aN):INT|STR` ; `INT` -> numeric min; `STR` -> shortest string; mixing `INT` and `STR` or supplying tensors is an error |
377 | | -- `SUM(INT: a1, ..., INT: aN):INT` ; sum of the arguments |
| 388 | +- `MAX(INT|FLT|STR: a1, ..., INT|FLT|STR: aN):INT|FLT|STR` ; numeric max for `INT`/`FLT`, longest for `STR`; supplying tensors or mixing types is an error |
| 389 | +- `MIN(INT|FLT|STR: a1, ..., INT|FLT|STR: aN):INT|FLT|STR` ; numeric min for `INT`/`FLT`, shortest for `STR`; supplying tensors or mixing types is an error |
| 390 | +- `SUM(INT|FLT: a1, ..., INT|FLT: aN):INT|FLT` ; sum of the arguments (no mixing INT/FLT) |
378 | 391 | - `LEN(INT|STR: a1, ..., INT|STR: aN):INT` ; number of arguments (N), rejects tensors |
379 | 392 | - `ALL(ANY: a1, ..., ANY: aN):INT` ; Boolean AND (empty string -> false, non-empty -> true) |
380 | 393 | - `ANY(ANY: a1, ..., ANY: aN):INT` ; Boolean OR (empty string -> false, non-empty -> true) |
381 | 394 | - `JOIN(INT|STR: a1, INT|STR: a2, ..., INT|STR: aN):INT|STR` ; `INT` -> concatenate binary spellings with consistent sign; `STR` -> concatenate strings; mixing `INT` and `STR` or supplying tensors raises an error |
382 | 395 | - `PROD(INT: a1, ..., INT: aN):INT` ; product of the arguments |
| 396 | +- `PROD(INT|FLT: a1, ..., INT|FLT: aN):INT|FLT` ; product of the arguments (no mixing INT/FLT) |
383 | 397 |
|
384 | 398 | ### Tensor operations |
385 | 399 | - `SHAPE(TNS: tensor):TNS` — Returns the tensor's shape as a 1D `TNS` (vector) of `INT` lengths (one entry per dimension). |
386 | 400 | - `TLEN(TNS: tensor, INT: dim):INT` — Returns the length of the specified 1-based dimension. Errors if `dim` is out of range. |
387 | 401 | - `FILL(TNS: tensor, ANY: value):TNS` — Returns a new tensor with the same shape as `tensor`, filled with `value`. The supplied value's type must match the existing element type at every position. |
388 | 402 | - `TNS(TNS: shape, ANY: value):TNS` — Creates a new `TNS` with the shape described by a 1D `TNS` of positive `INT` lengths, filled with `value`. |
389 | | -- `MADD/MSUB/MMUL/MDIV(TNS: x, TNS: y):TNS` — Elementwise addition, subtraction, multiplication, and integer division. Shapes must match; all elements must be `INT`. `MDIV` raises on division by zero. |
390 | | -- `MSUM(TNS: t1, ..., TNS: tN):TNS` — Elementwise sum across tensors. Shapes must match; elements must be `INT`. |
391 | | -- `MPROD(TNS: t1, ..., TNS: tN):TNS` — Elementwise product across tensors. Shapes must match; elements must be `INT`. |
392 | | -- `TADD(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar addition |
393 | | -- `TSUB(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar subtraction |
394 | | -- `TMUL(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar multiplication |
395 | | -- `TDIV(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar integer division. Division by zero is an error. |
396 | | -- `TPOW(TNS: x, INT: y):TNS` - Elementwise `INT`-scalar exponentiation. Negative exponents are an error. |
| 403 | +- `MADD/MSUB/MMUL/MDIV(TNS: x, TNS: y):TNS` — Elementwise addition, subtraction, multiplication, and division. Shapes must match; all elements must be `INT` or all `FLT` (no mixing). Division by zero is an error. |
| 404 | +- `MSUM(TNS: t1, ..., TNS: tN):TNS` — Elementwise sum across tensors. Shapes must match; elements must be all `INT` or all `FLT` (no mixing). |
| 405 | +- `MPROD(TNS: t1, ..., TNS: tN):TNS` — Elementwise product across tensors. Shapes must match; elements must be all `INT` or all `FLT` (no mixing). |
| 406 | +- `TADD/TSUB/TMUL/TDIV/TPOW(TNS: x, INT|FLT: y):TNS` — Tensor-scalar arithmetic. Tensor elements and scalar must both be `INT` or both be `FLT` (no mixing). Division by zero is an error. |
397 | 407 |
|
398 | 408 | ### Logarithms |
399 | 409 | - `LOG(INT: a):INT` ; floor(log2(a)) for a > 0 |
| 410 | +- `LOG(FLT: a):FLT` ; floor(log2(a)) for a > 0 |
400 | 411 | - `CLOG(INT: a):INT` ; ceil(log2(a)) for a > 0 |
401 | 412 |
|
| 413 | +### Numeric conversions / predicates |
| 414 | +- `INT(ANY: a):INT` — Explicit conversion to integer. If `a` is `FLT`, conversion truncates toward zero. |
| 415 | +- `FLT(ANY: a):FLT` — Explicit conversion to float. |
| 416 | +- `ISFLT(SYMBOL: name):INT` — 1 if `name` is bound and has type `FLT`, otherwise 0. |
| 417 | + |
| 418 | +### Rounding |
| 421 | +- `ROUND(FLT: float, STR: mode="floor", INT: ndigits=0):FLT` — Round `float` to `ndigits` places right of the radix point (binary places; `ndigits` may be negative). When exactly two arguments are supplied and the second is an `INT`, it is treated as `ndigits` with the mode defaulting to `"floor"`. Modes are: |
| 422 | + |
| 423 | + - `"floor"` — round toward $-\infty$ |
| 424 | + - `"ceiling"` or `"ceil"` — round toward $+\infty$ |
| 425 | + - `"zero"` — round toward zero |
| 426 | + - `"logical"` or `"half-up"` — round half away from zero |
| 427 | + |
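The four rounding modes can be modeled in Python by scaling the value by a power of two, rounding to an integer, and scaling back. This is a sketch under stated assumptions rather than the interpreter's code: the name `round_binary` is hypothetical, only finite inputs are handled, and the two-argument form in which an `INT` second argument is read as `ndigits` is not modeled.

```python
import math


def round_binary(value: float, mode: str = "floor", ndigits: int = 0) -> float:
    """Round a finite float to ndigits binary places (illustrative only)."""
    # Scaling by 2**ndigits only changes the exponent, so it is exact
    # unless the result overflows or underflows.
    scaled = math.ldexp(value, ndigits)
    if mode == "floor":
        rounded = math.floor(scaled)  # toward negative infinity
    elif mode in ("ceiling", "ceil"):
        rounded = math.ceil(scaled)  # toward positive infinity
    elif mode == "zero":
        rounded = math.trunc(scaled)  # toward zero
    elif mode in ("logical", "half-up"):
        # Round half away from zero: work on the magnitude, then restore the sign.
        whole = math.floor(abs(scaled))
        if abs(scaled) - whole >= 0.5:
            whole += 1
        rounded = math.copysign(whole, scaled)
    else:
        raise ValueError(f"unknown rounding mode: {mode!r}")
    return math.ldexp(float(rounded), -ndigits)
```

For example, with one binary place, three-quarters (`0.11`) rounds down to one-half under `"floor"` and up to one under `"half-up"`; a negative `ndigits` rounds to a multiple of a power of two greater than one.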
402 | 428 | ### Module operations: |
403 | 429 | - `IMPORT(MODULE: name)` or `IMPORT(MODULE: name, SYMBOL: alias)` — Loads another source file and exposes it as a distinct module namespace. When an optional alias identifier is supplied, the imported module's bindings are exposed under the `alias` prefix rather than the module's own name (for example, `IMPORT(mod, ali)` makes `ali.F()` valid while `mod.F()` is not). |
404 | 430 |
|
|
433 | 459 | - `ARGV():TNS` — Returns the interpreter's argument vector as a one-dimensional `TNS` of `STR`. The tensor's elements are the command-line argument strings supplied to the process, in the same order as the process `argv`, with index 1 holding the interpreter's invocation entry (TNS indices are 1-based). |
434 | 460 |
|
435 | 461 | ### Control / Function / Statement Signatures (statement position) |
436 | | -- Assignment: `TYPE : identifier = expression` on first use; subsequent assignments omit TYPE but must match the original type. TYPE is `INT`, `STR`, or `TNS`. Tensor elements may be reassigned with `identifier[i1,...,iN] = expression` (indices are 1-based with negative-index support, and the stored type at that element must not change). |
| 462 | +- Assignment: `TYPE : identifier = expression` on first use; subsequent assignments omit TYPE but must match the original type. TYPE is `INT`, `FLT`, `STR`, or `TNS`. Tensor elements may be reassigned with `identifier[i1,...,iN] = expression` (indices are 1-based with negative-index support, and the stored type at that element must not change). |
437 | 463 | - Block: `{ statement1 ... statementN }` |
438 | 464 | - `IF(condition){ block }` (optional `ELSIF(condition){ block }` ... `ELSE{ block }`) |
439 | 465 | - `WHILE(condition){ block }` |
|
447 | 473 | - `GOTO(n)` ; jump to a previously-registered gotopoint with identifier `n` (`INT` or `STR`) within the same function or top-level scope; runtime error if not registered in that scope |
448 | 474 |
|
449 | 475 | ### Notes |
450 | | -- Built-ins are statically typed. Boolean contexts treat `INT` 0 as false and non-zero as true; `STR` is false when empty and true when non-empty unless a rule explicitly converts via `INT`. A `TNS` is true if any element is true by those `INT`/`STR` rules. |
| 476 | +- Built-ins are statically typed. Boolean contexts treat `INT` 0 as false and non-zero as true; `FLT` is false when 0.0 and true otherwise; `STR` is false when empty and true when non-empty unless a rule explicitly converts via `INT`. A `TNS` is true if any element is true by those rules. |
451 | 477 | - Argument evaluation order: left-to-right. |
452 | 478 | - User-defined functions use the same call syntax as built-ins; keyword arguments are permitted only after positional arguments and only for parameters that declare defaults. Built-ins reject keyword arguments except that `READFILE` and `WRITEFILE` accept an optional `coding=` keyword. When a keyword parameter is omitted, its default expression is evaluated at call time in the function's defining environment. |
453 | 479 |
|
|