|
45 | 45 |
|
46 | 46 | ## 1. Overview |
47 | 47 |
|
48 | | -The language is a familiar statement-based, imperative language. Programs consist of variable declarations via assignment, expressions, and control-flow constructs such as `IF`, `ELSIF`, `ELSE`, `WHILE`, and `FOR`. ASM-Lang has four runtime data types: binary integers (`INT`), binary floating-point numbers (`FLT`, IEEE754), strings (`STR`), and non-scalar tensors (`TNS`). Identifiers, function parameters, and return values are statically typed; the type of every symbol must be declared when it is first introduced. Computation proceeds by evaluating expressions and executing statements in sequence, with explicit constructs for branching and looping. Input and output are modeled through built-in operators, in particular `INPUT` and `PRINT`. |
| 48 | +The language is a familiar statement-based, imperative language. Programs consist of variable declarations via assignment, expressions, and control-flow constructs such as `IF`, `ELSEIF`, `ELSE`, `WHILE`, and `FOR`. ASM-Lang has four runtime data types: binary integers (`INT`), binary floating-point numbers (`FLT`, IEEE754), strings (`STR`), and non-scalar tensors (`TNS`). Identifiers, function parameters, and return values are statically typed; the type of every symbol must be declared when it is first introduced. Computation proceeds by evaluating expressions and executing statements in sequence, with explicit constructs for branching and looping. Input and output are modeled through built-in operators, in particular `INPUT` and `PRINT`. |
49 | 49 |
|
50 | 50 | The interpreter compiles source code into a single initial configuration (the seed state), which includes the program code, an empty variable environment, and an initial I/O history. It then advances execution by repeatedly applying a single, fixed small-step transition function that is independent of the particular program. A disassembler and log view expose all intermediate states so that every step and every control-flow decision is inspectable and replayable. |
51 | 51 |
|
|
115 | 115 |
|
116 | 116 | Expressions inside a tensor literal may themselves evaluate to `INT`, `FLT`, `STR`, or `TNS`. If an element expression evaluates to a tensor value, it occupies a single position and does not contribute additional dimensions; only the explicit bracket structure determines the shape. A tensor literal that mixes sub-brackets of differing lengths is invalid (for example, `[[0,0,0],[1,1]]`). |
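The shape rule for tensor literals can be sketched in Python. This is an illustrative model, not the interpreter's code: brackets are modeled as Python lists, `TensorValue` is a hypothetical wrapper for an element expression that already evaluated to a tensor, and rejecting empty brackets is an assumption the text does not state.

```python
class TensorValue:
    """Hypothetical stand-in for an element that evaluated to a TNS value."""
    def __init__(self, data):
        self.data = data


def literal_shape(literal):
    """Return the shape implied by the explicit bracket structure only."""
    if not isinstance(literal, list):
        # A leaf (INT, FLT, STR, or an opaque TensorValue) occupies one
        # position and contributes no dimensions.
        return ()
    if not literal:
        # Assumption: an empty bracket is rejected.
        raise ValueError("empty bracket in tensor literal")
    child_shapes = [literal_shape(item) for item in literal]
    if any(shape != child_shapes[0] for shape in child_shapes):
        raise ValueError("sub-brackets of differing lengths")
    return (len(literal),) + child_shapes[0]
```

Under this model, `[[0,0,0],[1,1]]` is rejected, while a literal whose elements are tensor values of different sizes is still one-dimensional.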
117 | 117 |
|
118 | | -Tensor values carry a fixed shape (dimension count and length per dimension). Each location in a tensor is statically typed by the value it first receives; attempting to write a different type to that location is a runtime error. Tensors are indexed with one-based indices. Negative indices address from the end of the corresponding dimension (for example, index -1 is the last element of that dimension); index 0 is invalid. Tensors may be re-assigned in two ways: binding a new tensor to the variable name, or writing through an index such as `tensor[dim1,...,dimN] = expr`. The latter mutates the existing tensor value in place at the indexed location and does not construct a new tensor object. |
| 118 | +Tensor values carry a fixed shape (dimension count and length per dimension). Each location in a tensor is statically typed by the value it first receives; attempting to write a different type to that location is a runtime error. Tensors are indexed with one-based indices. Negative indices are allowed and count backward from the end of the dimension: index `-k` resolves to position `len - k + 1` (so `-1` is the last element, and `-10`, the binary spelling of decimal -2, is the second-to-last). Index `0` is invalid. Any index whose absolute value exceeds the length of the dimension is out of range, even if negative. Tensors may be re-assigned in two ways: binding a new tensor to the variable name, or writing through an index such as `tensor[dim1,...,dimN] = expr`. The latter mutates the existing tensor value in place at the indexed location and does not construct a new tensor object. |
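The index rules can be sketched in Python as follows. The function name `resolve_index` is hypothetical, and the indices here are ordinary Python integers rather than ASM-Lang binary literals.

```python
def resolve_index(index, length):
    """Map a one-based, possibly negative index to a one-based position."""
    if index == 0:
        raise IndexError("index 0 is invalid")
    if abs(index) > length:
        raise IndexError(f"index {index} is out of range for length {length}")
    # Positive indices are already positions; -k maps to length - k + 1.
    return index if index > 0 else length + index + 1
```

For a dimension of length 5, `resolve_index(-1, 5)` gives position 5 and `resolve_index(-2, 5)` gives position 4.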
119 | 119 |
|
120 | 120 | All other tensor operations are non-mutating: tensor literals and tensor-valued built-ins produce new tensor values rather than mutating existing ones. Because indexed assignment mutates a tensor object, if the same tensor value is aliased (bound to multiple identifiers, passed as an argument, or stored inside another tensor), all aliases observe the mutation. |
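The aliasing behavior can be illustrated in Python, where nested lists stand in for `TNS` values. This is a sketch only: `set_element` is a hypothetical helper that models indexed assignment, and Python's `type` check approximates the rule that an element's type cannot change.

```python
def set_element(tensor, indices, value):
    """Model `tensor[i1,...,iN] = value` with one-based, possibly negative indices."""
    target = tensor
    for depth, index in enumerate(indices):
        length = len(target)
        if index == 0 or abs(index) > length:
            raise IndexError(f"index {index} is out of range for length {length}")
        position = index - 1 if index > 0 else length + index
        if depth == len(indices) - 1:
            if type(target[position]) is not type(value):
                raise TypeError("element type cannot change")
            target[position] = value  # in-place write; no new tensor is built
        else:
            target = target[position]


a = [[1, 2], [3, 4]]
b = a            # second identifier bound to the same tensor
holder = [a]     # the same tensor stored inside another tensor
set_element(a, [1, -1], 9)
# a, b, and holder[0] all refer to the one mutated object.
```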
121 | 121 |
|
122 | 122 | Every runtime value has a static type: `INT`, `FLT`, `STR`, or `TNS`. Integers are conceptually unbounded mathematical integers. Floats are IEEE754 binary floating-point numbers. Strings are byte strings of ASCII characters. Tensors are non-scalar aggregates whose elements may be `INT`, `FLT`, `STR`, or `TNS`. |
123 | 123 |
|
124 | | -When a Boolean interpretation is required, `INT` treats 0 as false and non-zero as true; `FLT` treats 0.0 as false and any non-zero value as true; `STR` treats the empty string as false and any non-empty string as true; `TNS` is true if any contained element is true by these rules, otherwise false. Control-flow conditions (`IF`, `ELSIF`, `WHILE`) and `ASSERT` convert strings to integers using the same rules as the `INT` built-in; tensors are first reduced to their Boolean truth value (1 or 0). |
| 124 | +When a Boolean interpretation is required, `INT` treats 0 as false and non-zero as true; `FLT` treats 0.0 as false and any non-zero value as true; `STR` treats the empty string as false and any non-empty string as true; `TNS` is true if any contained element is true by these rules, otherwise false. Control-flow conditions (`IF`, `ELSEIF`, `WHILE`) and `ASSERT` convert strings to integers using the same rules as the `INT` built-in; tensors are first reduced to their Boolean truth value (1 or 0). |
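The general truthiness rules can be sketched in Python. The sketch models tensors as nested lists and uses the hypothetical name `is_true`; it covers only the general Boolean interpretation, not the extra string-to-integer conversion applied in control-flow conditions and `ASSERT`.

```python
def is_true(value):
    """Boolean interpretation of an INT, FLT, STR, or TNS value."""
    if isinstance(value, list):
        # TNS: true if any contained element is true, recursively.
        return any(is_true(element) for element in value)
    if isinstance(value, str):
        # STR: only the empty string is false.
        return value != ""
    # INT and FLT: zero is false, everything else is true.
    return value != 0
```

Under these rules the string `"0"` is true because it is non-empty, and a tensor containing only `0`, `0.0`, and empty strings is false.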
125 | 125 |
|
126 | 126 | `INT` and `FLT` are not interoperable: no implicit conversion occurs. Operators that accept both types require that all numeric arguments have the same numeric type. |
127 | 127 |
|
128 | 128 | ## 4. Statements and Control Flow |
129 | 129 |
|
130 | | -A program consists of zero or more statements separated by newlines. Each top-level expression or assignment must appear on its own line. The basic statement forms are assignments of the form `identifier = expression`, expression statements such as calls to `PRINT` whose result is ignored, control-flow constructs (`IF`, `ELSIF`, `ELSE`, `WHILE`, and `FOR`), and function definitions (`FUNC` declarations; see Section 6). |
| 130 | +A program consists of zero or more statements separated by newlines. Each top-level expression or assignment must appear on its own line. The basic statement forms are assignments of the form `identifier = expression`, expression statements such as calls to `PRINT` whose result is ignored, control-flow constructs (`IF`, `ELSEIF`, `ELSE`, `WHILE`, and `FOR`), and function definitions (`FUNC` declarations; see Section 6). |
131 | 131 |
|
132 | 132 | Blocks group one or more statements and are enclosed in curly brackets: `{ statement1 ... statementN }`. Curly braces must match (that is, `{` closes with `}`). Blocks serve as the bodies of control-flow constructs and functions. |
133 | 133 |
|
134 | 134 | Assignments have the syntax `TYPE : identifier = expression` on first use, where TYPE is `INT`, `FLT`, `STR`, or `TNS`. Spaces around the colon and equals sign are optional. Subsequent assignments to an existing identifier may omit the type but must preserve the original type. Variables are deallocated only when `DEL(identifier)` is executed. |
135 | 135 |
|
136 | 136 | Tensor elements can be reassigned with the indexed form `identifier[i1,...,iN] = expression`. The base must be a previously-declared `TNS` binding. The indices must match the tensor's dimensionality, follow the same one-based/negative-index rules as ordinary indexing, and must reference an existing position. The element's original type cannot change: attempting to store a different type at that position is a runtime error. Indexed assignment mutates the `TNS`. |
137 | 137 |
|
138 | | -The language provides `IF`, `ELSIF`, and `ELSE` constructs for conditional execution. An `IF` statement has the general form `IF(condition){ block }`. Optional chained branches may follow: one or more `ELSIF(condition){ block }` clauses and an optional terminal `ELSE{ block }` clause. An `ELSIF` or `ELSE` must immediately follow an `IF` or another `ELSIF`; otherwise it is a syntax error. At most one `ELSE` may appear in a given chain. Evaluation proceeds by first evaluating the condition of the initial `IF`. If it is non-zero, the associated block executes and the rest of the chain is skipped. Otherwise, the conditions of subsequent `ELSIF` clauses are evaluated in order until one is non-zero; its block then executes and the chain terminates. If no `IF` or `ELSIF` condition is satisfied and an `ELSE` is present, the `ELSE` block executes; if there is no `ELSE`, no block in the chain executes. |
| 138 | +The language provides `IF`, `ELSEIF`, and `ELSE` constructs for conditional execution. An `IF` statement has the general form `IF(condition){ block }`. Optional chained branches may follow: one or more `ELSEIF(condition){ block }` clauses and an optional terminal `ELSE{ block }` clause. An `ELSEIF` or `ELSE` must immediately follow an `IF` or another `ELSEIF`; otherwise it is a syntax error. At most one `ELSE` may appear in a given chain. Evaluation proceeds by first evaluating the condition of the initial `IF`. If it is non-zero, the associated block executes and the rest of the chain is skipped. Otherwise, the conditions of subsequent `ELSEIF` clauses are evaluated in order until one is non-zero; its block then executes and the chain terminates. If no `IF` or `ELSEIF` condition is satisfied and an `ELSE` is present, the `ELSE` block executes; if there is no `ELSE`, no block in the chain executes. |
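The evaluation order of a conditional chain can be sketched in Python. This is a model under stated assumptions, not the interpreter's implementation: conditions and blocks are passed as zero-argument callables, conditions are assumed to have already been reduced to integers, and `run_chain` is a hypothetical name.

```python
def run_chain(branches, else_block=None):
    """Run the first branch whose condition is non-zero.

    `branches` is a list of (condition, block) pairs: the IF first,
    then any ELSEIF clauses in source order. Returns True if some
    block executed.
    """
    for condition, block in branches:
        if condition() != 0:
            block()
            return True  # the rest of the chain is skipped
    if else_block is not None:
        else_block()
        return True
    return False  # no ELSE present and no condition was satisfied
```

Because the loop returns as soon as one condition is non-zero, the conditions of later clauses are never evaluated.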
139 | 139 |
|
140 | 140 | Conditions accept `INT`, `STR`, or `TNS`. `STR` conditions are first converted to `INT` using the `INT` built-in rules (empty -> 0; binary string -> that integer; other non-empty -> 1) before truthiness is checked. A `TNS` condition is true if any of its elements is true by the `INT`/`STR` truthiness rules. |
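The `STR`-to-`INT` conversion applied to conditions can be sketched in Python. The helper name `str_condition_to_int` is hypothetical, and the definition of a "binary string" used here (one or more `0`/`1` digits with an optional leading `-`) is an assumption, since the text does not spell it out.

```python
import re

# Assumption: a binary string is an optional "-" followed by 0/1 digits.
_BINARY = re.compile(r"-?[01]+")


def str_condition_to_int(text):
    """Convert a STR condition to an INT before truthiness is checked."""
    if text == "":
        return 0
    if _BINARY.fullmatch(text):
        return int(text, 2)  # binary string -> that integer
    return 1  # any other non-empty string


def condition_holds(text):
    return str_condition_to_int(text) != 0
```

One consequence worth noting: the string `"0"` is non-empty but converts to the integer 0, so it is false when used as a condition.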
141 | 141 |
|
|
401 | 401 | - `TLEN(TNS: tensor, INT: dim):INT` — Returns the length of the specified 1-based dimension. Errors if `dim` is out of range. |
402 | 402 | - `FLIP(INT|STR: obj):INT|STR` — For `INT` input, returns an `INT` whose binary-digit spelling is the reverse of the absolute-value binary spelling of `obj` (sign is preserved). For `STR` input, returns the character-reversed `STR`. |
403 | 403 | - `TFLIP(TNS: obj, INT: dim):TNS` — Returns a new `TNS` with the elements along 1-based dimension `dim` reversed. Errors if `dim` is out of range. |
| 404 | +- `SCATTER(TNS: src, TNS: dst, TNS: ind):TNS` — Returns a copy of `dst` with a rectangular slice replaced by `src`. `ind` must be a 2D tensor of `INT` pairs with shape `[rank, 10]` (binary `10` = decimal 2), where `rank` is the number of dimensions of `dst` (for example, `rank = TLEN(SHAPE(dst), 1)`); that is, `ind` holds one `[lo, hi]` row per destination dimension. Indices are 1-based; negatives follow the tensor indexing rules (for example, `-1` is the last element) and `0` is invalid. For each dimension, the inclusive span `hi - lo + 1` must equal the corresponding `src` dimension length, and all bounds must fall within `dst`. Elements outside the slice are copied from `dst` unchanged. |
404 | 405 | - `FILL(TNS: tensor, ANY: value):TNS` — Returns a new tensor with the same shape as `tensor`, filled with `value`. The supplied value's type must match the existing element type at every position. |
405 | 406 | - `TNS(TNS: shape, ANY: value):TNS` — Creates a new `TNS` with the shape described by a 1D `TNS` of positive `INT` lengths, filled with `value`. |
406 | 407 | - `CONVOLVE(TNS: x, TNS: kernel):TNS` — N-dimensional discrete convolution producing an output tensor with the same shape as `x`. The `kernel` must have the same rank as `x` and each kernel dimension length must be odd (so the kernel has a well-defined center). At the boundaries, out-of-range sample coordinates are clamped to the nearest valid index (replicate padding). Both tensors must contain only `INT` or only `FLT` elements (no mixed element types within a tensor). If both tensors are `INT`-valued, the output is an `INT` tensor; otherwise the output is a `FLT` tensor. |
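The `SCATTER` behavior described in the list above can be sketched in Python. This is a rough model, not the built-in's implementation: tensors are nested lists with scalar elements, indices are ordinary Python integers rather than binary literals, `ind` is taken to hold one `[lo, hi]` row per dimension of `dst`, and the span check is assumed to run after negative indices are resolved. The helper names are hypothetical.

```python
import copy


def shape_of(tensor):
    """Shape of a rectangular nested list with scalar elements."""
    shape = []
    while isinstance(tensor, list):
        shape.append(len(tensor))
        tensor = tensor[0]
    return shape


def resolve(index, length):
    """One-based index with negative support; 0 and out-of-range are errors."""
    if index == 0 or abs(index) > length:
        raise IndexError(f"index {index} is out of range for length {length}")
    return index if index > 0 else length + index + 1


def scatter(src, dst, ind):
    dst_shape, src_shape = shape_of(dst), shape_of(src)
    if len(src_shape) != len(dst_shape) or len(ind) != len(dst_shape):
        raise ValueError("src, dst, and ind must agree on rank")
    starts = []
    for (lo, hi), dst_len, src_len in zip(ind, dst_shape, src_shape):
        lo, hi = resolve(lo, dst_len), resolve(hi, dst_len)
        if hi - lo + 1 != src_len:
            raise ValueError("slice span does not match src length")
        starts.append(lo - 1)  # zero-based start for Python lists
    out = copy.deepcopy(dst)  # dst itself is left untouched

    def write(out_part, src_part, dim):
        for offset, item in enumerate(src_part):
            if dim == len(starts) - 1:
                out_part[starts[dim] + offset] = item
            else:
                write(out_part[starts[dim] + offset], item, dim + 1)

    write(out, src, 0)
    return out
```

For example, scattering a 2x2 `src` into a 3x4 `dst` with `ind = [[2, 3], [-2, -1]]` overwrites rows 2 to 3 and the last two columns, leaving every other element as it was in `dst`.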
|
476 | 477 | ### Control / Function / Statement Signatures (statement position) |
477 | 478 | - Assignment: `TYPE : identifier = expression` on first use; subsequent assignments omit TYPE but must match the original type. TYPE is `INT`, `FLT`, `STR`, or `TNS`. Tensor elements may be reassigned with `identifier[i1,...,iN] = expression` (indices are 1-based with negative-index support, and the stored type at that element must not change). |
478 | 479 | - Block: `{ statement1 ... statementN }` |
479 | | -- `IF(condition){ block }` (optional `ELSIF(condition){ block }` ... `ELSE{ block }`) |
| 480 | +- `IF(condition){ block }` (optional `ELSEIF(condition){ block }` ... `ELSE{ block }`) |
480 | 481 | - `WHILE(condition){ block }` |
481 | 482 | - `FOR(counter, INT: target){ block }` ; `counter` initialized to 0, loop until counter >= target |
482 | 483 | - `FUNC name(T1:arg1, T2:arg2, ..., TN:argN):R{ block }` ; typed function definition with return type R (`INT`, `STR`, or `TNS`); optional defaults use `Tk:arg=expr` and must appear only after all positional parameters. Functions with return type `TNS` must explicitly execute `RETURN(value)`; there is no implicit default tensor value. |
|