generating site

nilaykumar · nilaykumar · commit 0f1cfff70a7d · 2025-12-21T17:49:03.000-05:00
diff --git a/content/garden/december-adventure.md b/content/garden/december-adventure.md
@@ -2,7 +2,7 @@
 title = "december adventure 2025"
 author = ["Nilay Kumar"]
 date = 2025-12-08T00:00:00-05:00
-lastmod = 2025-12-14T02:07:32-05:00
+lastmod = 2025-12-21T17:46:28-05:00
 tags = ["december-adventure", "code", "japanese", "chinese", "calligraphy", "photography"]
 draft = false
 progress = "in-progress"
@@ -17,13 +17,13 @@ I've taken the liberty of retroactively logging the days I missed.
 
 <div class="ox-hugo-table calendar-table">
 
-| 日                | 月                | 火                  | 水                  | 木                 | 金                 | 土                 |
-|------------------|------------------|--------------------|--------------------|-------------------|-------------------|-------------------|
-|                   | [01](#december-1) | [02](#december-2-3) | [03](#december-2-3) | [04](#december-4)  | [05](#december-5)  | [06](#december-6)  |
-| [07](#december-7) | [08](#december-8) | [09](#december-9)   | [10](#december-10)  | [11](#december-11) | [12](#december-12) | [13](#december-13) |
-| 14                | 15                | 16                  | 17                  | 18                 | 19                 | 20                 |
-| 21                | 22                | 23                  | 24                  | 25                 | 26                 | 27                 |
-| 28                | 29                | 30                  | 31                  |                    |                    |                    |
+| 日                    | 月                    | 火                    | 水                    | 木                    | 金                 | 土                 |
+|----------------------|----------------------|----------------------|----------------------|----------------------|-------------------|-------------------|
+|                       | [01](#december-1)     | [02](#december-2-3)   | [03](#december-2-3)   | [04](#december-4)     | [05](#december-5)  | [06](#december-6)  |
+| [07](#december-7)     | [08](#december-8)     | [09](#december-9)     | [10](#december-10)    | [11](#december-11)    | [12](#december-12) | [13](#december-13) |
+| [14](#december-14-18) | [15](#december-14-18) | [16](#december-14-18) | [17](#december-14-18) | [18](#december-14-18) | [19](#december-19) | [20](#december-20) |
+| [21](#december-21)    | 22                    | 23                    | 24                    | 25                    | 26                 | 27                 |
+| 28                    | 29                    | 30                    | 31                    |                       |                    |                    |
 
 </div>
 
@@ -558,6 +558,315 @@ In additional to traversal there's a query API, but I'll scope that out later.
 I'll leave this here for now. Tomorrow I'll see if I have the time to put
 together an example static site and get started on the actual site generation.
 
+
+## december 14-18 {#december-14-18}
+
+Bedridden due to the flu. No adventuring was attempted.
+
+
+## december 19 {#december-19}
+
+Had a bit of energy by the time evening rolled around, so I decided to learn a
+bit more `uxntal`, with the short-term goal of improving my keffiyeh-drawing
+pattern to be less hard-coded. I started pretty basic, just trying to learn a
+little bit about defining subroutines and branching. I wrote a bit of code to
+print the byte or short that's at the top of the working stack.
+
+```uxntal
+@print-short ( a* -- a* )
+  DUP2 SWP print-byte POP print-byte POP
+  JMP2r
+
+@print-byte ( a -- a )
+  DUP #f0 AND #04 SFT print-nibble
+  DUP #0f AND print-nibble
+  JMP2r
+
+@print-nibble ( a -- )
+  DUP #0a LTH ?{ !print-hex-digit } !print-digit
+  JMP2r
+
+@print-digit ( a -- )
+  #30 ADD .Console/write DEO
+  JMP2r
+
+@print-hex-digit ( a -- )
+  #57 ADD .Console/write DEO
+  JMP2r
+```
+
+As far as I understand, the `.Console/write` port only supports printing ascii
+characters, so the task was to grab the byte (the case of the short is clearly
+reduced to that of the byte in `print-short`), break it up into the two nibbles,
+figure out the corresponding ascii values, and print those.
+
+Now that I'm looking back at the code, it looks absolutely trivial, but I
+learned a lot writing it. I'm starting to get comfortable with the working stack
+(still don't really know much about the return stack) and I now understand the
+`?{ }` construction. In `print-nibble` we use it to branch based on whether the
+nibble is 0-9 or a-f.
+
+To make sure the code worked, I set up some test data in memory and looped
+through it, printing the bytes encountered:
+
+```uxntal
+;test-data
+@byte-loop
+  DUP2 ;test-data SUB2 #00ff GTH2 ?&end
+  DUP2 LDA print-byte POP #20 .Console/write DEO
+  INC2 !byte-loop
+&end
+#0a .Console/write DEOk DEO
+```
+
+There's `ff` bytes worth of test data, so the first line of the loop breaks
+us out of the loop if we've passed the last relevant memory address. The next
+line does the printing, and the last line moves us to the next memory address
+before looping. There's a similar chunk of code that moves through the data 2
+bytes at a time and uses `print-short`.
+
+I used the sublabel `&end` twice (once in the byte-printing test and once in the
+short-printing test), and was curious how things work under the hood. And
+indeed, if you look at the `.rom.sym` file generated by the assembler (say with
+`cat` or `hexdump -C`), you'll find separate symbols `byte-loop/end` and
+`short-loop/end`, so the sublabels are effectively namespaced by the parent label.
+
+Looking around at the `uxntal` documentation I later realized that a slick code
+snippet to do this is already provided on the [software page](https://wiki.xxiivv.com/site/uxntal_software.html). It's quite clever
+and I haven't yet understood how it works. I'll return to this later (see day
+21).
+
+
+## december 20 {#december-20}
+
+In the process of learning the
+basics of `uxntal`, I've been writing code in `emacs` with no syntax highlighting or
+other conveniences. I thought today might be a good time to try to change that.
+Let's write an `emacs` major mode for `uxntal` using `tree-sitter`!
+
+This is a bit of an experiment in hubris, given that I don't know much about
+`emacs`, `emacs-lisp`, `uxntal`, or `tree-sitter`. But I've found a great [article](https://www.masteringemacs.org/article/lets-write-a-treesitter-major-mode)
+by Mickey Petersen on exactly this topic. In case it's useful to
+anyone else, the code will be [here](https://git.sr.ht/~nilaykumar/uxntal-ts-mode). Here's what it looks like so far:
+
+{{< figure src="images/december-adventure-2025/uxntal-ts-major-mode.jpg" alt="screenshot of some dark-mode themed, syntax highlighted uxntal code" >}}
+
+I won't go through all the code in detail,
+just some important aspects worth highlighting -- definitely read Mickey's
+article if you're interested in writing your own `tree-sitter` major mode. What
+follows is more of a log than a walkthrough.
+
+We start by defining a new major mode derived from `prog-mode`, which is the
+generalized "major mode for editing programming language source code".
+
+```emacs-lisp
+(define-derived-mode uxntal-ts-mode prog-mode "uxntal[ts]"
+  "Tree-sitter major mode for editing uxntal code."
+  (setq-local font-lock-defaults nil)
+  (when (treesit-ready-p 'uxntal)
+    (treesit-parser-create 'uxntal)
+    (uxntal-ts-setup)))
+```
+
+Here we set `font-lock-defaults` to `nil`, which I think turns off the default font-locking
+system (a somewhat dated regex-based system for syntax highlighting). We're
+going to be using `tree-sitter` to do that. Next, we make sure that we have the
+`uxntal` grammar available, and then create a parser and install it into the
+buffer with `treesit-parser-create`. Finally, we'll do a bunch of setup in a
+separate function.
+
+In particular, we're going to tell `emacs` which piece of code to highlight in
+what color. As far as I can tell, the colors -- or rather, faces -- available to
+use are listed in the manual [here](https://www.gnu.org/software/emacs/manual/html_node/elisp/Faces-for-Font-Lock.html). For example, let's start by displaying
+comments using `font-lock-comment-face`. To do this, we need to add a new feature
+to `treesit-font-lock-feature-list`, and then define it by telling `tree-sitter`
+which pieces of code to apply which face to.
+
+```emacs-lisp
+(setq-local treesit-font-lock-feature-list '((comment)))
+
+(defvar uxntal-ts-font-lock-rules
+  '(
+    :language uxntal
+    :override t
+    :feature comment
+    ((comment) @font-lock-comment-face)))
+
+(setq-local treesit-font-lock-settings
+            (apply #'treesit-font-lock-rules uxntal-ts-font-lock-rules))
+```
+
+The key point to focus on here is the query `((comment) @font-lock-comment-face)`.
+This is telling tree-sitter that every comment found in the syntax tree should
+be decorated with the `font-lock-comment-face`.
+
+Let's take a moment to understand how `tree-sitter` views `uxntal` code. Using `M-x
+treesit-explore-mode`, we can see that the code
+
+```uxntal
+@print-short ( a* -- a* )
+  DUP2 SWP print-byte POP print-byte POP
+  JMP2r
+```
+
+has a syntax tree that looks like:
+
+```nil
+(subroutine
+  (label @ (identifier))
+  (comment)
+  (opcode DUP2)
+  (opcode SWP)
+  (identifier)
+  (opcode POP)
+  (identifier)
+  (opcode POP)
+  (opcode JMP2r))
+```
+
+Pretty easy to understand. So far we've got the comment highlighted. Next let's
+highlight the word/subroutine/label `print-short` we're defining. I don't know if
+it'd be fine just to write a query for label, but out of an abundance of
+caution let's highlight occurrences of label that are children of
+subroutine. To do this, we can use
+
+```emacs-lisp
+:language uxntal
+:override t
+:feature label
+((subroutine (label) @font-lock-function-name-face))
+```
+
+Note that if the query had been `((subroutine (label))
+@font-lock-function-name-face)` instead, any subroutine with a label child would
+be highlighted. Since we don't want to highlight the code of the whole
+subroutine, we make sure that we place our face name right after the element to
+be highlighted.
+
+Similarly, we can highlight sublabels with:
+
+```emacs-lisp
+:language uxntal
+:override t
+:feature sublabel
+((rune (rune_char "&") @font-lock-variable-name-face (identifier) @font-lock-variable-name-face))
+```
+
+Note how we single out only runes with `&`, and make sure to
+highlight both the rune's character and the identifier itself. We do _not_
+highlight the full rune element, though, because that will lead to expressions
+like `?&end` being fully highlighted. I'd prefer just `&end` to be highlighted, with
+the rune `?` left alone.
+
+As one last example, we can highlight square brackets with the comment face
+since they are purely aesthetic:
+
+```emacs-lisp
+:language uxntal
+:override t
+:feature sqbrackets
+((brackets [ "[" "]" ] @font-lock-comment-face))
+```
+
+Note that in the syntax above we can specify the language every time we're
+describing a new feature -- this comes in handy in the case where you're dealing
+with multiple languages in a single buffer. The `:override t` acts sort of as a
+cascading behavior -- features defined later can override those defined earlier.
+This could be useful, for instance, if you wanted to single out
+comments that are defined on the same line as a label.
+
+One final note here: currently my feature list looks like:
+
+```emacs-lisp
+  (setq-local treesit-font-lock-feature-list
+              '((comment opcode hex_literal raw_ascii label sublabel relpad abspad sqbrackets)))
+```
+
+We could instead write this as
+
+```emacs-lisp
+  (setq-local treesit-font-lock-feature-list
+              '((comment opcode hex_literal raw_ascii)
+                (label)
+                (sublabel)
+                (relpad)
+                (abspad)
+                (sqbrackets)
+              ))
+```
+
+where we've rearranged how many sublists our features are split into. There's a
+variable `treesit-font-lock-level` that controls how far down this list
+`tree-sitter` will go when actually executing features. I wasted a lot of time
+because I had arranged each feature into its own sublist and then was very
+confused when all but the first 4 of my features weren't activating. For now
+I've thrown everything in a flat list. Later on, when I know a bit more about
+`uxntal`, I'll think about how to group them more carefully. The documentation
+tells us:
+
+> Major modes categorize their fontification features into levels,
+> from 1 which is the absolute minimum, to 4 that yields the maximum
+> fontifications.
+>
+> Level 1 usually contains only comments and definitions.
+> Level 2 usually adds keywords, strings, data types, etc.
+> Level 3 usually represents full-blown fontifications, including
+> assignments, constants, numbers and literals, etc.
+> Level 4 adds everything else that can be fontified: delimiters,
+> operators, brackets, punctuation, all functions, properties,
+> variables, etc.
+
+Another thing I found slightly confusing at first was the syntax tree query
+system. The [manual](https://www.gnu.org/software/emacs/manual/html_node/elisp/Pattern-Matching.html) was kinda helpful, though I wish there were more concrete
+examples. Mostly I wish `treesit-explore-mode` allowed for running queries
+directly. Instead, I had to play around with `treesit-query-capture` and
+`treesit-node-on` manually, which was a bit painful. Still, the flexibility,
+experimentability, and level of documentation in `emacs` is super impressive. The
+fact that I could get syntax highlighting working as a complete beginner is a
+testament to `emacs`'s strengths (a lot of `M-x helpful...` commands were run).
+
+With this small success under my belt, I'm a bit more confident about
+implementing other features that a `uxntal` major mode might have, so I hope to
+come back to this soon.
+
+
+## december 21 {#december-21}
+
+Let's return to the `uxntal` code snippet for printing hex that we were looking at
+on day 19:
+
+```uxntal
+@<phex> ( short* -: )
+  SWP /b
+  &b ( byte -: )
+    DUP #04 SFT /c
+  &c ( byte -: )
+    #0f AND DUP #09 GTH #27 MUL ADD [ LIT "0 ] ADD #18 DEO
+    JMP2r
+```
+
+Two key things to understand here:
+
+1.  instead of branching we can just use a piecewise formula to determine the
+    ascii character of a nibble. Easy enough.
+2.  the subroutine/sublabel calls `/b` and `/c` are crucial here. What happens is
+    that when we jump to `/b`, we've pushed the address of the call location to the
+    return stack, so by the time we print out the first nibble (out of 4 total to
+    be printed), we have on the return stack the address of the calls to `/b` and
+    `/c`. The `JMP2r` call pops the `/c` and has us then go print the second nibble.
+    Since the top of the return stack is now `/b`, the `JMP2r` after printing the
+    second nibble takes us back to (just after) the `/b` call, and we thus repeat.
+    We end up printing two more nibbles, but this time of the next byte (as the
+    working stack has changed).
+
+Very very clever: thanks to `d_m` on `irc` for helping me understand some of this. I
+was confused at first because I couldn't see how the print (`#18 DEO`) was getting
+called 4 times, which would be necessary for printing the nibbles of a short.
+What I hadn't understood is that the jumps to the sublabels here do more than
+just move the program counter -- they also modify the return stack. In other
+words, `JMP2r` is not a simple `return`-from-`@subroutine`, the way I had been
+conceptualizing it.
+
 ---
 
 Down here I'm collecting the little project ideas that tend to pop into