Commit a6759cd

Add operators: ARGV, SPLIT, STRIP, REPLACE.
Rename DRETURN to POP. Add lib\path.asmln
1 parent 3c6ae72 commit a6759cd

9 files changed, +167 -12 lines changed

SPECIFICATION.md

Lines changed: 9 additions & 1 deletion
@@ -87,17 +87,25 @@ After the module finishes executing the first time, the interpreter caches the m
 
 This caching behavior ensures that importing a module multiple times produces the same shared namespace instance for all importers. The interpreter does not automatically perform cycle detection beyond using the cached instance once execution has completed; careful module design should avoid import cycles where possible.
 
+`ARGV()` returns the interpreter's argument vector as a one-dimensional `TNS` of `STR`. The tensor's elements are the command-line argument strings supplied to the process, in the same order as the process `argv`, with index 1 holding the interpreter's invocation entry (TNS indices are 1-based). This lets programs inspect their launch arguments from within ASM-Lang.
+
 `ASSERT(a)` checks that its argument is true in the Boolean sense. If `a` is non-zero, execution proceeds normally; if `a` is `0`, the program crashes with an assertion failure.
 
 `BYTES(n)` converts a non-negative integer into its big-endian byte representation. The result is a one-dimensional `TNS` whose elements are `INT` values in the range `0..11111111` (0-255 decimal), ordered most-significant byte first. The tensor length is `max(1, ceil(bit_length(n)/8))`; `BYTES(0)` returns a single zero byte. Supplying a negative integer is a runtime error.
 
+`SPLIT(string, delimiter = " ")` splits `string` on the exact substring `delimiter` and returns the parts as a one-dimensional `TNS` of `STR`. The delimiter defaults to a single space. The delimiter must be non-empty; otherwise a runtime error is raised. Consecutive delimiters and trailing delimiters are preserved, so empty-string elements may appear in the result. The tensor length equals the number of resulting segments.
+
+`STRIP(string, remove)` returns a `STR` formed by removing every occurrence of the substring `remove` from `string`. The `remove` argument must be a non-empty string; supplying an empty `remove` raises a runtime error. `STRIP` does not mutate its inputs and always returns a new `STR` value.
+
+`REPLACE(string, a, b)` returns a `STR` formed by replacing every occurrence of the substring `a` in `string` with `b`. The `a` argument must be a non-empty string; supplying an empty `a` raises a runtime error. `REPLACE` does not mutate its inputs and always returns a new `STR` value.
+
 `MAIN()` returns `1` when the call site belongs to the primary program file (the file passed as the interpreter's first argument, or `<string>` when `-source` is used). It returns `0` when executed from code that came from an `IMPORT` (including nested imports). The result is determined solely by the source file that contains the call expression, not by the caller's call stack.
 
 Program termination is exposed via `EXIT`. `EXIT()` or `EXIT(code)` requests immediate termination of the interpreter. If an integer `code` is supplied, it is used as the interpreter's process exit code; otherwise `0` is used. Execution stops immediately when `EXIT` is executed (no further statements run), and an entry is recorded in the state log to make deterministic replay possible. Using `EXIT` inside a function terminates the entire program (not just the function).
 
 Memory-management and function-return behavior are also exposed via operators. `DEL(x)` deletes the variable `x` from the current environment, freeing its memory; any subsequent reference to `x` is an error unless `x` is re-assigned. `RETURN(a)`, when executed inside a function body, immediately terminates that function and returns the value of `a` to the caller. Executing `RETURN` outside of a function is a runtime error.
 
-`DRETURN(x)` is a convenience operator combining `RETURN` and `DEL`: when executed inside a function body it retrieves the current value of the identifier `x`, deletes the binding `x` from the environment (so subsequent references are an error), and returns the retrieved value to the caller. Using `DRETURN` outside of a function is a runtime error. If `x` is frozen or undefined, `DRETURN(x)` raises the same runtime errors as `DEL(x)` or a reference to `x` would.
+`POP(x)` is a convenience operator combining `RETURN` and `DEL`: when executed inside a function body it retrieves the current value of the identifier `x`, deletes the binding `x` from the environment (so subsequent references are an error), and returns the retrieved value to the caller. Using `POP` outside of a function is a runtime error. If `x` is frozen or undefined, `POP(x)` raises the same runtime errors as `DEL(x)` or a reference to `x` would.
 
 
 ## 5. Statements and Control Flow
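The three string operators added to the specification map closely onto Python's `str.split` and `str.replace`, which is also what the `interpreter.py` changes in this commit call. The following is a minimal sketch of the specified behavior, not the interpreter's code: the function names `asm_split`, `asm_strip`, and `asm_replace` are hypothetical, plain Python lists stand in for `TNS`, and `ValueError` stands in for the interpreter's runtime error.

```python
def asm_split(string: str, delimiter: str = " ") -> list[str]:
    """SPLIT: split on the exact substring; empty segments are kept."""
    if delimiter == "":
        raise ValueError("SPLIT delimiter must not be empty")
    return string.split(delimiter)


def asm_strip(string: str, remove: str) -> str:
    """STRIP: delete every occurrence of `remove`, not only at the ends."""
    if remove == "":
        raise ValueError("STRIP: remove substring must not be empty")
    return string.replace(remove, "")


def asm_replace(string: str, a: str, b: str) -> str:
    """REPLACE: substitute every occurrence of `a` with `b`."""
    if a == "":
        raise ValueError("REPLACE: substring must not be empty")
    return string.replace(a, b)


if __name__ == "__main__":
    print(asm_split("a,,b,", ","))            # ['a', '', 'b', '']
    print(asm_strip("banana", "an"))          # ba
    print(asm_replace("a\\b\\c", "\\", "/"))  # a/b/c
```

Note that `STRIP` as specified is unlike Python's `str.strip`: it removes the substring wherever it occurs, so it behaves like `REPLACE(string, remove, "")`.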
161 KB
Binary file not shown.

__pycache__/lexer.cpython-314.pyc

12.2 KB
Binary file not shown.

__pycache__/parser.cpython-314.pyc

31.9 KB
Binary file not shown.

interpreter.py

Lines changed: 69 additions & 4 deletions
@@ -34,7 +34,7 @@
     Parser,
     Program,
     ReturnStatement,
-    DReturnStatement,
+    PopStatement,
     SourceLocation,
     Statement,
     TensorLiteral,
@@ -329,6 +329,7 @@ def __init__(self) -> None:
         self._register_custom("OR", 2, 2, self._or)
         self._register_custom("XOR", 2, 2, self._xor)
         self._register_custom("NOT", 1, 1, self._not)
+        self._register_custom("ARGV", 0, 0, self._argv)
         self._register_custom("EQ", 2, 2, self._eq)
         self._register_custom("IN", 2, 2, self._in)
         self._register_int_only("GT", 2, lambda a, b: 1 if a > b else 0)
@@ -345,12 +346,15 @@ def __init__(self) -> None:
         self._register_custom("SLEN", 1, 1, self._slen)
         self._register_custom("ILEN", 1, 1, self._ilen)
         self._register_variadic("JOIN", 1, self._join)
+        self._register_custom("SPLIT", 1, 2, self._split)
         self._register_int_only("LOG", 1, self._safe_log)
         self._register_int_only("CLOG", 1, self._safe_clog)
         self._register_custom("INT", 1, 1, self._int_op)
         self._register_custom("STR", 1, 1, self._str_op)
         self._register_custom("UPPER", 1, 1, self._upper)
         self._register_custom("LOWER", 1, 1, self._lower)
+        self._register_custom("STRIP", 2, 2, self._strip)
+        self._register_custom("REPLACE", 3, 3, self._replace)
         self._register_custom("MAIN", 0, 0, self._main)
         self._register_custom("OS", 0, 0, self._os)
         self._register_custom("IMPORT", 1, 1, self._import)
@@ -669,6 +673,23 @@ def _join(self, values: List[Value], location: SourceLocation) -> Value:
         bits = "".join("0" if v == 0 else format(v, "b") for v in ints)
         return Value(TYPE_INT, int(bits or "0", 2))
 
+    def _split(
+        self,
+        _: "Interpreter",
+        args: List[Value],
+        __: List[Expression],
+        ___: Environment,
+        location: SourceLocation,
+    ) -> Value:
+        text = self._expect_str(args[0], "SPLIT", location)
+        delimiter = " " if len(args) == 1 else self._expect_str(args[1], "SPLIT", location)
+        if delimiter == "":
+            raise ASMRuntimeError("SPLIT delimiter must not be empty", location=location, rewrite_rule="SPLIT")
+
+        parts = text.split(delimiter)
+        data = np.array([Value(TYPE_STR, part) for part in parts], dtype=object)
+        return Value(TYPE_TNS, Tensor(shape=[len(parts)], data=data))
+
     # Boolean-like operators treating strings via emptiness
     def _and(self, _: "Interpreter", args: List[Value], __: List[Expression], ___: Environment, location: SourceLocation) -> Value:
         a, b = args
@@ -777,6 +798,37 @@ def _lower(
         # Convert ASCII letters to lower-case; other bytes are unchanged
         return Value(TYPE_STR, s.lower())
 
+    def _strip(
+        self,
+        interpreter: "Interpreter",
+        args: List[Value],
+        __: List[Expression],
+        ___: Environment,
+        location: SourceLocation,
+    ) -> Value:
+        # STRIP(STR: string, STR: remove):STR -> remove all occurrences of `remove` from `string`
+        s = self._expect_str(args[0], "STRIP", location)
+        rem = self._expect_str(args[1], "STRIP", location)
+        if rem == "":
+            raise ASMRuntimeError("STRIP: remove substring must not be empty", location=location, rewrite_rule="STRIP")
+        return Value(TYPE_STR, s.replace(rem, ""))
+
+    def _replace(
+        self,
+        interpreter: "Interpreter",
+        args: List[Value],
+        __: List[Expression],
+        ___: Environment,
+        location: SourceLocation,
+    ) -> Value:
+        # REPLACE(STR: string, STR: a, STR: b):STR -> replace all occurrences of `a` in `string` with `b`
+        s = self._expect_str(args[0], "REPLACE", location)
+        a = self._expect_str(args[1], "REPLACE", location)
+        b = self._expect_str(args[2], "REPLACE", location)
+        if a == "":
+            raise ASMRuntimeError("REPLACE: substring must not be empty", location=location, rewrite_rule="REPLACE")
+        return Value(TYPE_STR, s.replace(a, b))
+
     def _safe_log(self, value: int) -> int:
         if value <= 0:
             raise ASMRuntimeError("LOG argument must be > 0", rewrite_rule="LOG")
@@ -1011,6 +1063,19 @@ def _print(
         interpreter.io_log.append({"event": "PRINT", "values": [arg.value for arg in args]})
         return Value(TYPE_INT, 0)
 
+    def _argv(
+        self,
+        interpreter: "Interpreter",
+        args: List[Value],
+        __: List[Expression],
+        ___: Environment,
+        ____: SourceLocation,
+    ) -> Value:
+        # Return the process argument vector as a 1-D TNS of STR values.
+        entries: List[str] = [str(s) for s in sys.argv]
+        data = np.array([Value(TYPE_STR, s) for s in entries], dtype=object)
+        return Value(TYPE_TNS, Tensor(shape=[len(data)], data=data))
+
     def _assert(
         self,
         interpreter: "Interpreter",
@@ -1891,14 +1956,14 @@ def _execute_statement(self, statement: Statement, env: Environment) -> None:
                 closure=env,
             )
             return
-        if isinstance(statement, DReturnStatement):
+        if isinstance(statement, PopStatement):
             frame: Frame = self.call_stack[-1]
             if frame.name == "<top-level>":
-                raise ASMRuntimeError("DRETURN outside of function", location=statement.location, rewrite_rule="DRETURN")
+                raise ASMRuntimeError("POP outside of function", location=statement.location, rewrite_rule="POP")
             # Expect identifier expression to delete a symbol
             expr = statement.expression
             if not isinstance(expr, Identifier):
-                raise ASMRuntimeError("DRETURN expects identifier", location=statement.location, rewrite_rule="DRETURN")
+                raise ASMRuntimeError("POP expects identifier", location=statement.location, rewrite_rule="POP")
             name = expr.name
             try:
                 value = env.get(name)
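The last hunk is cut off after `env.get(name)`, so the rest of the `POP` handler is not visible here. Going only by the specification text (fetch the value, delete the binding, return the value), the core of the operation could be sketched as follows. This is an illustration under assumptions: a plain `dict` stands in for `Environment`, `pop_binding` is a hypothetical name, and `NameError` stands in for the interpreter's runtime error.

```python
def pop_binding(env: dict, name: str):
    """Sketch of POP(x): read x, remove its binding, hand the value back."""
    if name not in env:
        # The spec says POP raises the same error a reference to x would.
        raise NameError(f"undefined identifier: {name}")
    value = env[name]   # retrieve the current value
    del env[name]       # later references to `name` are now errors
    return value        # the caller receives the retrieved value


if __name__ == "__main__":
    env = {"abspath": "a/b/"}
    print(pop_binding(env, "abspath"))  # a/b/
    print("abspath" in env)             # False
```

The sketch leaves out the parts the diff does show, namely the check that `POP` is not executed at top level and that its argument is an identifier.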

lexer.py

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@ class Token:
     "FOR",
     "FUNC",
     "RETURN",
-    "DRETURN",
+    "POP",
     "BREAK",
     "CONTINUE",
     "GOTO",

lib/path.asmln

Lines changed: 71 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,71 @@
1+
# Path utilities for ASM-Lang
2+
3+
FUNC NORMALIZE_PATH(STR: path):STR{
4+
RETURN(REPLACE(path,"\","/"))
5+
}
6+
7+
FUNC BASEPATH(STR: path):STR{
8+
TNS: tpath = SPLIT(NORMALIZE_PATH(path), "/")
9+
STR: abspath = ""
10+
FOR(i,SUB(TLEN(tpath,1),1) ){
11+
IF(EQ(i,0)){CONTINUE()}
12+
abspath = JOIN(abspath,tpath[i],"/")
13+
}
14+
DEL(tpath)
15+
POP(abspath)
16+
}
17+
18+
FUNC BASENAME(STR: path):STR{
19+
TNS: tpath = SPLIT(NORMALIZE_PATH(path), "/")
20+
STR: basename = tpath[MAX(TLEN(tpath,1),1)]
21+
DEL(tpath)
22+
POP(basename)
23+
}
24+
25+
FUNC SPLITEXT(STR: path):TNS{
26+
STR: npath = NORMALIZE_PATH(path)
27+
STR: base = BASENAME(npath)
28+
STR: dir = BASEPATH(npath)
29+
TNS: parts = SPLIT(base, ".")
30+
# If there is no dot, return the path unchanged and empty ext
31+
IF(EQ(TLEN(parts,1),1)){
32+
DEL(base)
33+
TNS: result = [npath, ""]
34+
POP(result)
35+
}
36+
# Extension is the last segment (1-based indexing)
37+
STR: ext = parts[ TLEN(parts,1) ]
38+
# Name is base without the final "." + ext suffix
39+
STR: name = REPLACE(base, JOIN(".", ext), "")
40+
DEL(parts)
41+
STR: delex = ""
42+
IF(EQ(dir, "")){
43+
delex = name
44+
} ELSE {
45+
delex = JOIN(dir, name)
46+
}
47+
DEL(dir)
48+
DEL(name)
49+
TNS: result = [delex, ext]
50+
DEL(ext)
51+
DEL(delex)
52+
POP(result)
53+
}
54+
55+
FUNC EXTNAME(STR: path):STR{
56+
RETURN(SPLITEXT(path)[10])
57+
}
58+
59+
FUNC DELEXT(STR: path):STR{
60+
RETURN(SPLITEXT(path)[1])
61+
}
62+
63+
STR: interpreter = ARGV()[1]
64+
STR: interpreter_dir = NORMALIZE_PATH(BASEPATH(ARGV()[1]))
65+
IF(GTE(TLEN(ARGV(),1),10)){
66+
STR: script = ARGV()[10]
67+
STR: script_dir = NORMALIZE_PATH(BASEPATH(ARGV()[10]))
68+
} ELSE {
69+
STR: script = ""
70+
STR: script_dir = ""
71+
}
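For readers who do not know ASM-Lang, here is a rough Python rendering of what `lib/path.asmln` appears intended to compute. It is a sketch of the apparent intent, not a trace of the ASM-Lang code. All Python names are hypothetical. `basepath` assumes the `FOR` loop collects every segment except the last, each followed by `/`; the diff does not define `FOR`'s iteration bounds, so that is an assumption. Integer literals in ASM-Lang appear to be binary, so `[10]` is read here as 1-based index 2.

```python
def normalize_path(path: str) -> str:
    return path.replace("\\", "/")


def basepath(path: str) -> str:
    # Assumed intent: directory portion with a trailing "/", or "" if none.
    parts = normalize_path(path).split("/")
    return "".join(segment + "/" for segment in parts[:-1])


def basename(path: str) -> str:
    return normalize_path(path).split("/")[-1]


def splitext(path: str) -> list[str]:
    npath = normalize_path(path)
    base = basename(npath)
    parts = base.split(".")
    if len(parts) == 1:
        return [npath, ""]             # no dot: path unchanged, empty extension
    ext = parts[-1]
    # Mirrors REPLACE, which removes EVERY occurrence of "." + ext,
    # not only the final suffix.
    name = base.replace("." + ext, "")
    return [basepath(npath) + name, ext]


def launch_paths(argv: list[str]) -> tuple[str, str]:
    # ARGV()[1] is argv[0] here; ARGV()[10] (binary 10 = 2) is argv[1].
    script = argv[1] if len(argv) >= 2 else ""
    return argv[0], script


if __name__ == "__main__":
    print(splitext("foo.bar"))    # ['foo', 'bar']
    print(splitext("foo"))        # ['foo', '']
    print(basename("a/b/c.txt"))  # c.txt
```

One consequence of building the name with a replace-all: under this rendering, a base name such as `x.tar.tar` yields the name `x` rather than `x.tar`, because both `.tar` occurrences are removed. The tests added in `test.asmln` do not cover that case.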

parser.py

Lines changed: 6 additions & 6 deletions
@@ -92,7 +92,7 @@ class ReturnStatement(Statement):
 
 
 @dataclass
-class DReturnStatement(Statement):
+class PopStatement(Statement):
     expression: "Expression"
 
 
@@ -197,8 +197,8 @@ def _parse_statement(self) -> Statement:
             return self._parse_for()
         if token.type == "RETURN":
             return self._parse_return()
-        if token.type == "DRETURN":
-            return self._parse_dreturn()
+        if token.type == "POP":
+            return self._parse_pop()
         if token.type == "BREAK":
             return self._parse_break()
         if token.type == "CONTINUE":
@@ -297,10 +297,10 @@ def _parse_return(self) -> ReturnStatement:
         expression: Expression = self._parse_parenthesized_expression()
         return ReturnStatement(location=self._location_from_token(keyword), expression=expression)
 
-    def _parse_dreturn(self) -> DReturnStatement:
-        keyword = self._consume("DRETURN")
+    def _parse_pop(self) -> PopStatement:
+        keyword = self._consume("POP")
         expression: Expression = self._parse_parenthesized_expression()
-        return DReturnStatement(location=self._location_from_token(keyword), expression=expression)
+        return PopStatement(location=self._location_from_token(keyword), expression=expression)
 
     def _parse_break(self) -> BreakStatement:
         keyword = self._consume("BREAK")

test.asmln

Lines changed: 11 additions & 0 deletions
@@ -2,6 +2,7 @@ IMPORT(prng)
 IMPORT(csprng)
 IMPORT(prime)
 IMPORT(decimal)
+IMPORT(path)
 IMPORT(waveforms)
 
 FUNC RUN_TESTS():INT{
@@ -54,6 +55,16 @@ FUNC RUN_TESTS():INT{
 
     PRINT("Decimal library tests passed.")
 
+    # Path library tests
+    ASSERT( EQ(path.NORMALIZE_PATH("a\b\c"), "a/b/c") )
+    ASSERT( EQ(path.BASENAME("a/b/c.txt"), "c.txt") )
+    ASSERT( EQ(path.EXTNAME("foo.bar"), "bar") )
+    ASSERT( EQ(path.DELEXT("foo.bar"), "foo") )
+    ASSERT( EQ(path.EXTNAME("foo"), "") )
+    ASSERT( EQ(path.DELEXT("foo"), "foo") )
+
+    PRINT("Path library tests passed.")
+
     # Waveform library tests
     INT: freq = 1100100 # 100
     INT: ms = 1010 # 10
