Commit 1f3fad5

Fix issue 2
Implement first-class functions.
1 parent 54d7b23 commit 1f3fad5

5 files changed: +208 -115 lines changed


SPECIFICATION.html

Lines changed: 12 additions & 8 deletions
@@ -45,7 +45,7 @@

 ## 1. Overview

-The language is a familiar statement-based, imperative language. Programs consist of variable declarations via assignment, expressions, and control-flow constructs such as `IF`, `ELSEIF`, `ELSE`, `WHILE`, and `FOR`. ASM-Lang has four runtime data types: binary integers (`INT`), binary floating-point numbers (`FLT`, IEEE754), strings (`STR`), and non-scalar tensors (`TNS`). Identifiers, function parameters, and return values are statically typed; the type of every symbol must be declared when it is first introduced. Computation proceeds by evaluating expressions and executing statements in sequence, with explicit constructs for branching and looping. Input and output are modeled through built-in operators, in particular `INPUT` and `PRINT`.
+The language is a familiar statement-based, imperative language. Programs consist of variable declarations via assignment, expressions, and control-flow constructs such as `IF`, `ELSEIF`, `ELSE`, `WHILE`, and `FOR`. ASM-Lang has five runtime data types: binary integers (`INT`), binary floating-point numbers (`FLT`, IEEE754), strings (`STR`), non-scalar tensors (`TNS`), and first-class user-defined functions (`FUNC`). Identifiers, function parameters, and return values are statically typed; the type of every symbol must be declared when it is first introduced. Computation proceeds by evaluating expressions and executing statements in sequence, with explicit constructs for branching and looping. Input and output are modeled through built-in operators, in particular `INPUT` and `PRINT`.

 The interpreter compiles source code into a single initial configuration (the seed state), which includes the program code, an empty variable environment, and an initial I/O history. It then advances execution by repeatedly applying a single, fixed small-step transition function that is independent of the particular program. A disassembler and log view expose all intermediate states so that every step and every control-flow decision is inspectable and replay-able.

@@ -95,7 +95,7 @@

 ## 3. Data Model

-ASM-Lang supports four runtime data types: binary integers, binary floating-point numbers, strings, and non-scalar tensors.
+ASM-Lang supports five runtime data types: binary integers, binary floating-point numbers, strings, non-scalar tensors, and first-class functions.

 Binary integer literal: an unsigned non-empty sequence of `{0,1}` (for example, `0`, `1`, `1011`), or a signed literal formed by a leading `-` (the dash is part of the literal, not an operator) followed by optional spaces, tabs, or carriage returns and then a non-empty sequence of `{0,1}`. A `-` that does not immediately introduce a literal is a syntax error.

@@ -125,7 +125,9 @@

 All other tensor operations are non-mutating: tensor literals and tensor-valued built-ins produce new tensor values rather than mutating existing ones. Because indexed assignment mutates a tensor object, if the same tensor value is aliased (bound to multiple identifiers, passed as an argument, or stored inside another tensor), all aliases observe the mutation.

-Every runtime value has a static type: `INT`, `FLT`, `STR`, or `TNS`. Integers are conceptually unbounded mathematical integers. Floats are IEEE754 binary floating-point numbers. Strings are byte strings of ASCII characters. Tensors are non-scalar aggregates whose elements may be `INT`, `FLT`, `STR`, or `TNS`.
+Every runtime value has a static type: `INT`, `FLT`, `STR`, `TNS`, or `FUNC`. Integers are conceptually unbounded mathematical integers. Floats are IEEE754 binary floating-point numbers. Strings are byte strings of ASCII characters. Tensors are non-scalar aggregates whose elements may be `INT`, `FLT`, `STR`, `FUNC`, or `TNS`.
+
+Function value (`FUNC`): a reference to a user-defined function body (including its lexical closure). A `FUNC` value can be stored in variables or tensors, passed as an argument, or returned from a function. The call syntax applies to any expression that evaluates to `FUNC`; for example, `alias()` calls the function bound to `alias`, and `tns[1]()` calls the function stored in that tensor element. `FUNC` values are always truthy; equality compares object identity (two references are equal only if they refer to the same function definition). String rendering produces an implementation-defined placeholder such as `<func name>`.

 When a Boolean interpretation is required, `INT` treats 0 as false and non-zero as true; `FLT` treats 0.0 as false and any non-zero value as true; `STR` treats the empty string as false and any non-empty string as true; `TNS` is true if any contained element is true by these rules, otherwise false. Control-flow conditions (`IF`, `ELSEIF`, `WHILE`) and `ASSERT` convert strings to integers using the same rules as the `INT` built-in; tensors are first reduced to their Boolean truth value (1 or 0).

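Reviewer note: the `FUNC` rules in the hunk above (always truthy, identity-based equality, an implementation-defined string form) can be illustrated with a small Python model. This is only a sketch and not the interpreter's code: `FuncValue`, `truthy`, and the use of nested Python lists to stand in for `TNS` values are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(eq=False)  # eq=False keeps Python's default identity-based equality
class FuncValue:
    """Hypothetical stand-in for a FUNC runtime value."""
    name: str
    body: Callable[..., object]

    def render(self) -> str:
        # The spec leaves the string form implementation-defined; this is one option.
        return f"<func {self.name}>"


Value = Union[int, float, str, list, FuncValue]


def truthy(value: Value) -> bool:
    """Boolean interpretation following the rules quoted in the hunk above."""
    if isinstance(value, FuncValue):
        return True                       # FUNC values are always truthy
    if isinstance(value, list):           # a nested list models a TNS here
        return any(truthy(item) for item in value)
    if isinstance(value, str):
        return value != ""                # empty string is false
    return value != 0                     # INT and FLT: zero is false


def double(x: int) -> int:
    return x * 2


f = FuncValue("double", double)
alias = f                                 # second reference to the same object
other = FuncValue("double", double)       # distinct object with the same body
```

In this model `f == alias` holds while `f == other` does not, which mirrors the "same function object" wording; a tensor such as `[0, "", f]` is truthy only because of the `FUNC` element.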
@@ -139,7 +141,7 @@

 ASYNC blocks: The `ASYNC` statement introduces a background task whose body is a regular block: `ASYNC{ ... }`. The statements inside the block execute synchronously with respect to each other but asynchronously relative to the rest of the program: the interpreter begins executing the block in a background task and immediately continues with the next statement in the current thread. The block shares the main program namespace (it executes with the same lexical environment), so assignments and mutations performed inside an `ASYNC` block are visible to the rest of the program immediately. Runtime errors raised inside an `ASYNC` block are reported via the interpreter's error hooks and recorded in the state log; they do not synchronously abort the main thread. Note that language invariants still apply inside the `ASYNC` block (for example, `RETURN` outside a function is a runtime error).

-Assignments have the syntax `TYPE : identifier = expression` on first use, where TYPE is `INT`, `STR`, or `TNS`. Spaces around the colon and equals sign are optional. Subsequent assignments to an existing identifier may omit the type but must preserve the original type. Variables are deallocated only when `DEL(identifier)` is executed.
+Assignments have the syntax `TYPE : identifier = expression` on first use, where TYPE is `INT`, `FLT`, `STR`, `TNS`, or `FUNC`. Spaces around the colon and equals sign are optional. Subsequent assignments to an existing identifier may omit the type but must preserve the original type. Variables are deallocated only when `DEL(identifier)` is executed.

 Tensor elements can be reassigned with the indexed form `identifier[i1,...,iN] = expression`. The base must be a previously-declared `TNS` binding. The indices must match the tensor's dimensionality, follow the same one-based/negative-index rules as ordinary indexing, and must reference an existing position. The element's original type cannot change: attempting to store a different type at that position is a runtime error. Indexed assignment mutates the `TNS`.

@@ -158,17 +160,19 @@

 ## 5. Functions

-Functions are defined using the `FUNC` keyword with explicit parameter and return types. The canonical positional-only form is `FUNC name(T1:arg1, T2:arg2, ..., TN:argN):R{ block }`, where each `Tk` and `R` is `INT`, `STR`, or `TNS`. Parameters may also declare a call-time default value using `Tk:arg=expr`. A parameter without a default is positional; a parameter with a default is keyword-capable. Positional parameters must appear before any parameters with defaults. Defining a function binds `name` to a callable body with the specified typed formal parameters. Function names must not conflict with the names of built-in operators or functions.
+Functions are defined using the `FUNC` keyword with explicit parameter and return types. The canonical positional-only form is `FUNC name(T1:arg1, T2:arg2, ..., TN:argN):R{ block }`, where each `Tk` and `R` is `INT`, `FLT`, `STR`, `TNS`, or `FUNC`. Parameters may also declare a call-time default value using `Tk:arg=expr`. A parameter without a default is positional; a parameter with a default is keyword-capable. Positional parameters must appear before any parameters with defaults. Defining a function binds `name` to a callable body with the specified typed formal parameters. Function names must not conflict with the names of built-in operators or functions.
+
+A user-defined function is called with the same syntax as a built-in: `callee(expr1, expr2, ..., exprN)`. The callee may be any expression that evaluates to `FUNC`, including identifiers, tensor elements, or intermediate expressions. Calls may supply zero or more positional arguments (left-to-right) followed by zero or more keyword arguments of the form `param=expr`. Keyword arguments can only appear after all positional arguments. At the call site, every positional argument is bound to the next positional parameter; keyword arguments must match the name of a parameter that declared a default value. Duplicate keyword names, supplying too many positional arguments, or providing a keyword for an unknown parameter are runtime errors. If a keyword-capable parameter is omitted from the call, its default expression is evaluated at call time in the function's lexical environment after earlier parameters have been bound. The evaluated default must match the parameter's declared type. Built-in functions do not accept keyword arguments except that `READFILE` and `WRITEFILE` allow a single optional `coding=` keyword; attempting to pass any other keyword raises a runtime error. Arguments are evaluated left-to-right. The function body executes in a new environment (activation record) that closes over the defining environment. If a `RETURN(v)` statement is executed, the function terminates immediately and yields `v`; the returned value must match the declared return type. If control reaches the end of the body without `RETURN`, the function returns a default value of the declared return type (0 for `INT`, 0.0 for `FLT`, "" for `STR`). Functions whose return type is
+`TNS` or `FUNC` must execute an explicit `RETURN` of the declared type; reaching the end of the body without returning is a runtime error for `TNS`- or `FUNC`-returning functions.

-A user-defined function is called with the same syntax as a built-in: `name(expr1, expr2, ..., exprN)`. Calls may supply zero or more positional arguments (left-to-right) followed by zero or more keyword arguments of the form `param=expr`. Keyword arguments can only appear after all positional arguments. At the call site, every positional argument is bound to the next positional parameter; keyword arguments must match the name of a parameter that declared a default value. Duplicate keyword names, supplying too many positional arguments, or providing a keyword for an unknown parameter are runtime errors. If a keyword-capable parameter is omitted from the call, its default expression is evaluated at call time in the function's lexical environment after earlier parameters have been bound. The evaluated default must match the parameter's declared type. Built-in functions do not accept keyword arguments except that `READFILE` and `WRITEFILE` allow a single optional `coding=` keyword; attempting to pass any other keyword raises a runtime error. Arguments are evaluated left-to-right. The function body executes in a new environment (activation record) that closes over the defining environment. If a `RETURN(v)` statement is executed, the function terminates immediately and yields `v`; the returned value must match the declared return type. If control reaches the end of the body without `RETURN`, the function returns a default value of the declared return type (0 for `INT`, "" for `STR`). Functions whose return type is
-`TNS` must execute an explicit `RETURN` with a tensor value; reaching the end of the body without returning is a runtime error for `TNS`-returning functions.
+Because `FUNC` is a first-class type, functions can be assigned to variables, stored inside tensors, passed as arguments, or returned from other functions. Calling `alias()` invokes the function bound to `alias`, while `tns[1]()` invokes the `FUNC` stored in the first tensor slot. Equality compares identity: two `FUNC` values are equal only if they refer to the same function object.

 Built-in operators and functions can be viewed as pre-defined functions provided by the runtime environment. User-defined functions share the same call syntax and are distinguished only by their names and bodies. Because of the shared namespace, a user-defined function is not permitted to use any name already reserved for a built-in. Attempting to violate this will raise an exception.


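Reviewer note: the argument-binding rules in Section 5 can be sketched in Python as follows. This is a minimal model under stated assumptions, not the interpreter's implementation: `bind_arguments`, `CallError`, and the `(name, default)` parameter encoding are hypothetical. It follows a literal reading of the spec in which positional arguments fill only parameters without defaults, and it treats a missing positional argument as an error even though the spec text does not list that case. Type checking of defaults is omitted.

```python
from typing import Callable, Dict, List, Optional, Tuple

# (name, default): default is None for a positional parameter, otherwise a
# callable that receives the already-bound parameters and returns the value.
Param = Tuple[str, Optional[Callable[[Dict[str, object]], object]]]


class CallError(Exception):
    """Hypothetical runtime error type used only in this sketch."""


def bind_arguments(params: List[Param],
                   positional: List[object],
                   keywords: List[Tuple[str, object]]) -> Dict[str, object]:
    positional_names = [name for name, default in params if default is None]
    default_names = {name for name, default in params if default is not None}

    if len(positional) > len(positional_names):
        raise CallError("too many positional arguments")
    if len(positional) < len(positional_names):
        # Assumption: the spec does not spell this case out.
        raise CallError("missing positional argument")

    supplied: Dict[str, object] = {}
    for name, value in keywords:
        if name not in default_names:
            raise CallError(f"unknown keyword: {name}")
        if name in supplied:
            raise CallError(f"duplicate keyword: {name}")
        supplied[name] = value

    bound: Dict[str, object] = {}
    remaining = iter(positional)
    for name, default in params:          # declaration order
        if default is None:
            bound[name] = next(remaining)
        elif name in supplied:
            bound[name] = supplied[name]
        else:
            # Default evaluated at call time; it can see earlier parameters.
            bound[name] = default(bound)
    return bound


# Example signature: FUNC f(INT:base, INT:step=<base plus one>)
PARAMS: List[Param] = [("base", None), ("step", lambda env: env["base"] + 1)]
```

With `PARAMS`, a call that passes only `base` gets `step` computed from the bound `base`, which is the "after earlier parameters have been bound" behavior described above.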
 ## 6. Variables and Memory Model

-A variable is created only when it is first assigned with an explicit type annotation of the form `TYPE : name = expression`, where TYPE is `INT`, `STR`, or `TNS`. For example: `INT : counter = 0` or `STR : message = "hi"`. Subsequent assignments to the same name must match the declared type and may omit the type annotation (`counter = ADD(counter,1)`). Assigning to an undeclared name without a type annotation is a runtime error. A variable exists until `DEL(name)` is executed. Referencing a variable that has never been declared, or that has been deleted, is a runtime error.
+A variable is created only when it is first assigned with an explicit type annotation of the form `TYPE : name = expression`, where TYPE is `INT`, `FLT`, `STR`, `TNS`, or `FUNC`. For example: `INT : counter = 0` or `FUNC : handler = DISPATCH`. Subsequent assignments to the same name must match the declared type and may omit the type annotation (`counter = ADD(counter,1)`). Assigning to an undeclared name without a type annotation is a runtime error. A variable exists until `DEL(name)` is executed. Referencing a variable that has never been declared, or that has been deleted, is a runtime error.

 The language assumes at least a global typed environment mapping identifiers to (type, value) pairs. Function calls create new environments for parameters and local variables, as described in Section 6.2; the precise details of name resolution depend on the chosen scoping rules.
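Reviewer note: the typed environment described in Section 6 (identifiers mapped to (type, value) pairs, with declaration on first assignment and removal via `DEL`) might look roughly like the Python sketch below. All names here (`Environment`, `AsmRuntimeError`, `TYPE_CHECKS`) are hypothetical, Python lists and callables stand in for `TNS` and `FUNC` values, and rejecting a re-annotation with a different type is an assumption about how "must preserve the original type" is enforced.

```python
from typing import Callable, Dict, Optional, Tuple


class AsmRuntimeError(Exception):
    """Hypothetical runtime error type used only in this sketch."""


# Assumed mapping from ASM-Lang type names to Python-side checks.
TYPE_CHECKS: Dict[str, Callable[[object], bool]] = {
    "INT": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "FLT": lambda v: isinstance(v, float),
    "STR": lambda v: isinstance(v, str),
    "TNS": lambda v: isinstance(v, list),
    "FUNC": callable,
}


class Environment:
    def __init__(self) -> None:
        self._bindings: Dict[str, Tuple[str, object]] = {}  # name -> (type, value)

    def assign(self, name: str, value: object,
               declared_type: Optional[str] = None) -> None:
        if name in self._bindings:
            expected = self._bindings[name][0]
            if declared_type is not None and declared_type != expected:
                raise AsmRuntimeError(f"{name} is already declared as {expected}")
        elif declared_type is None:
            raise AsmRuntimeError(f"assignment to undeclared name: {name}")
        else:
            expected = declared_type
        if not TYPE_CHECKS[expected](value):
            raise AsmRuntimeError(f"{name} expects a {expected} value")
        self._bindings[name] = (expected, value)

    def lookup(self, name: str) -> object:
        if name not in self._bindings:
            raise AsmRuntimeError(f"undefined name: {name}")
        return self._bindings[name][1]

    def delete(self, name: str) -> None:
        if name not in self._bindings:
            raise AsmRuntimeError(f"undefined name: {name}")
        del self._bindings[name]


def dispatch() -> str:
    return "dispatched"


env = Environment()
env.assign("counter", 0, "INT")          # INT : counter = 0
env.assign("counter", 1)                 # later assignment omits the type
env.assign("handler", dispatch, "FUNC")  # FUNC : handler = DISPATCH
```

After this setup, `env.lookup("handler")()` calls through the stored function, while assigning a string to `counter` or reading `counter` after `env.delete("counter")` raises `AsmRuntimeError`.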

asm-lang.exe

1.26 KB
Binary file not shown.

extensions.py

Lines changed: 1 addition & 0 deletions
@@ -397,6 +397,7 @@ def build_default_services() -> RuntimeServices:
     services.type_registry.seal("FLT")
     services.type_registry.seal("STR")
     services.type_registry.seal("TNS")
+    services.type_registry.seal("FUNC")
     return services

