PROJECT KNOWLEDGE BASE

Generated: 2026-01-23 | Commit: 081e5b6 | Branch: main

OVERVIEW

LLM/NLP educational course repository: 15 browser-based interactive demos + Marp/LaTeX lecture slides + Jupyter assignments. Dartmouth PSYC 51.17 "Models of Language and Communication".

STRUCTURE

llm-course/
├── demos/           # 15 interactive web demos (vanilla JS, Transformers.js)
│   ├── shared/      # CSS theme + visualization-utils.js
│   └── 01-15/       # ELIZA → Embeddings Comparison
├── slides/          # Marp markdown lectures
│   ├── week1-10/    # Each week symlinks to template_deck/themes/
│   └── template_deck/  # Build engine (compile.sh, autoscale.js)
├── tests/           # 1500+ tests, custom TestRunner (no framework)
├── assignments/     # Assignment web pages + GitHub Classroom submodules
│   ├── assignment-X/    # Rendered HTML instruction pages
│   └── *-llm-course/    # Submodules (template repos with notebooks)
├── notes/           # Internal debugging logs (not student-facing)
└── admin/           # Syllabus, fonts, compile scripts

SUBMODULES

Assignment template repositories are git submodules under assignments/:

| Submodule | Notebook | Data Files |
|---|---|---|
| eliza-llm-course | Assignment1_ELIZA.ipynb | instructions.txt |
| spam-classifier-llm-course | Assignment2_SPAM_Classifier.ipynb | training.zip |
| embeddings-llm-course | Assignment3_Wikipedia_Embeddings.ipynb | - |
| customer-service-bot-llm-course | Assignment4_Customer_Service_Chatbot.ipynb | - |
| gpt-llm-course | Assignment5_GPT.ipynb | - |
| final-project-llm-course | FinalProject_Template.ipynb | - |

Each notebook has an "Open in Colab" badge. Students fork via GitHub Classroom, not direct clone.

Assignment instruction pages (GitHub Pages):

  • Pattern: https://contextlab.github.io/llm-course/assignments/assignment-X/
  • Example: https://contextlab.github.io/llm-course/assignments/assignment-1/
  • Final project: https://contextlab.github.io/llm-course/assignments/final-project/

Submodule commands:

git submodule update --init --recursive  # Clone submodules after fresh clone
git submodule update --remote            # Pull latest from submodule remotes

WHERE TO LOOK

| Task | Location | Notes |
|---|---|---|
| Add/edit demo | demos/XX-name/ | Follow 4-step: dir → HTML → index entry → README |
| Edit demo visuals | demos/shared/css/demo-styles.css | CSS variables for dark/light |
| Demo utilities | demos/shared/js/visualization-utils.js | ThemeManager, CanvasUtils, MatrixUtils |
| Compile slides | slides/template_deck/compile.sh | Run from week dir: ../template_deck/compile.sh lecture.md |
| Edit slide theme | slides/template_deck/themes/cdl-theme.css | Propagates via symlinks |
| Run tests | npm test or npm run test:demoXX | Custom TestRunner per file |
| CI workflows | .github/workflows/ | test-demos, deploy-demos, build-slides |

CONVENTIONS

Indentation: 2-space for JS/HTML/CSS.

Demo structure: Each demo has index.html, js/, optional css/, README.md. Import shared utils via relative paths.

Slide compilation: Markdown → process_markdown.py (splits long content) → Marp → autoscale.js injected into HTML.

Testing: Self-contained TestRunner class in each .mjs file. No Jest/Mocha. Tests mock DOM globals for browser code.
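
A self-contained runner of this kind can be quite small. The sketch below is illustrative only: the class shape, the method names (`test`, `assertEqual`, `run`), and the stubbed `document` global are assumptions, not the repository's actual TestRunner, and it handles synchronous tests only.

```javascript
// Hypothetical minimal runner; the real TestRunner in tests/ may differ.
class TestRunner {
  constructor(name) {
    this.name = name;
    this.tests = [];
  }

  // Register a named (synchronous) test function.
  test(description, fn) {
    this.tests.push({ description, fn });
  }

  // Throw if the two values are not strictly equal.
  assertEqual(actual, expected) {
    if (actual !== expected) {
      throw new Error(
        `expected ${JSON.stringify(expected)}, got ${JSON.stringify(actual)}`
      );
    }
  }

  // Run every registered test and return pass/fail counts.
  run() {
    let passed = 0;
    const failures = [];
    for (const { description, fn } of this.tests) {
      try {
        fn();
        passed += 1;
      } catch (err) {
        failures.push({ description, error: err.message });
      }
    }
    return { passed, failed: failures.length, failures };
  }
}

// Stub just enough of the DOM for browser code under test (assumed shape).
globalThis.document = globalThis.document ?? {
  getElementById: () => ({ textContent: '' }),
};

const runner = new TestRunner('example');
runner.test('adds numbers', () => runner.assertEqual(1 + 1, 2));
runner.test('detects a failure', () => runner.assertEqual('a', 'b'));
const summary = runner.run();
console.log(`${summary.passed} passed, ${summary.failed} failed`);
```

Keeping the runner inside each .mjs file means a test can be run directly with `node` and needs no shared harness, at the cost of some duplication.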

Naming:

  • Demos: XX-kebab-name/ (eliza, chatbot-evolution)
  • Tests: test-demoXX-description.mjs
  • Slides: lectureN.md per week directory

ANTI-PATTERNS (THIS PROJECT)

| Pattern | Why Forbidden |
|---|---|
| git checkout/restore/reset without backup | CRITICAL: Always git stash or commit first. Violating this destroys uncommitted work. |
| Viewing recompiled slides without a hard refresh | The browser caches the old build. MUST hard-refresh (e.g. Ctrl+F5) after compile. |
| Modifying .diagram-container in autoscale | Fixed-size elements break if scaled. |
| Partial regex matches in chatbots | Use word boundaries (\b) to avoid "you" matching "young". |
| PARRY emotions outside [0,20] | Breaks historical fidelity to Colby 1972. |
| Bidirectional substitution without placeholders | Replacing "I"→"you" and then "you"→"me" rewrites the first result. Use temp placeholders. |
| ALICE botmaster = master | Template "My {{BOT:botmaster}} is {{BOT:master}}" needs different values. botmaster="creator" (role), master="Dr. Richard Wallace" (name). |
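
The word-boundary and placeholder rules can be sketched together. The reflection table and both function names below are hypothetical and are not taken from eliza-engine.js; the point is only to show why a naive chain of replacements overwrites itself and how a two-pass approach avoids that.

```javascript
// Hypothetical reflection table; the real ELIZA rules live elsewhere.
// Keys are plain letters, so they need no regex escaping.
const REFLECTIONS = { i: 'you', you: 'me', my: 'your', your: 'my', am: 'are' };

// Anti-pattern: each replacement sees the output of the previous one,
// so "I" becomes "you" and is then rewritten to "me".
function naiveReflect(text) {
  let out = text;
  for (const [from, to] of Object.entries(REFLECTIONS)) {
    out = out.replace(new RegExp(`\\b${from}\\b`, 'gi'), to);
  }
  return out;
}

// Two passes: whole words -> indexed placeholders -> final words.
function reflect(text) {
  const words = Object.keys(REFLECTIONS);
  let out = text;
  words.forEach((word, i) => {
    // \b keeps "you" from matching inside "young".
    out = out.replace(new RegExp(`\\b${word}\\b`, 'gi'), `\u0000${i}\u0000`);
  });
  // Swap each placeholder for its target word.
  return out.replace(/\u0000(\d+)\u0000/g, (_, i) => REFLECTIONS[words[Number(i)]]);
}

console.log(naiveReflect('I like you'));       // me like me
console.log(reflect('I like you'));            // you like me
console.log(reflect('I think you are young')); // you think me are young
```

The NUL character is used as the placeholder delimiter because it is unlikely to appear in chat input; any marker that cannot collide with user text would serve.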

UNIQUE STYLES

Autoscaling philosophy: Start at max font size, scale DOWN only when overflow detected. Never scale up.
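
A minimal sketch of the shrink-only rule, under stated assumptions: `fitFontSize`, its option names, and the `overflows` callback are hypothetical and do not describe how autoscale.js is actually written. In a browser the callback might compare an element's `scrollHeight` with its `clientHeight` after applying the candidate size.

```javascript
// Hypothetical helper; autoscale.js in template_deck may work differently.
// `overflows(size)` is an assumed callback that reports whether content
// rendered at `size` px spills outside its slide.
function fitFontSize(overflows, { max = 32, min = 12, step = 1 } = {}) {
  let size = max; // start at the maximum size
  while (size > min && overflows(size)) {
    size -= step; // only ever shrink, never grow
  }
  return Math.max(size, min); // never go below the floor
}

// Usage with a stand-in measurement: content fits at 20px or below.
const fitted = fitFontSize((px) => px > 20);
console.log(fitted); // 20
```

Because the loop starts at the maximum and only decrements, content that already fits is left at full size, which matches the "never scale up" rule.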

Cross-demo imports: Demo 02 imports ElizaEngine from ../../eliza/js/eliza-engine.js. Demos share /demos/data/gutenberg/ corpus.

Slide preprocessing: process_markdown.py auto-splits tables/code >N lines, adds "continued..." markers, renders ```flow blocks as SVG. Per-slide override: <!-- split: N --> or `` before a code block or table. Splits inside callout boxes auto-wrap continuations in the same box type.

COMMANDS

# Development
npm install                     # Install dependencies (just 'compromise')
python -m http.server 8000      # Serve demos locally

# Testing
npm test                        # Run all 1500+ tests
npm run test:demoXX            # Run specific demo tests (01-15)
npm run test:chatbot           # Runs ELIZA, PARRY, ALICE subtests

# Slides
cd slides/weekN
../template_deck/compile.sh lectureN.md   # Compile single lecture
cd slides && ./compile_all_slides.sh      # Compile all LaTeX slides

# Syllabus
cd admin && ./compile.sh       # Rebuild syllabus.pdf

CHATBOT EVOLUTION (Demo 02)

The chatbot evolution demo has multiple bot implementations:

| Bot | File | Patterns | Notes |
|---|---|---|---|
| ELIZA | Imports from demos/eliza/js/eliza-engine.js | ~200 rules | Pattern-matching therapist |
| PARRY | js/parry.js | State machine | Emotional state [0,20], paranoid responses |
| ALICE | js/alice.js + data/alice-patterns-original.json | 41,380 | Full AIML pattern set |
| GPT | js/gpt-bot.js | Neural | SmolLM2 auto-selects by RAM |
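
One way to keep PARRY's emotional state inside [0,20] is to clamp on every update. This is a sketch only: the function names and the `fear`/`anger` field names are assumptions for illustration and may not match js/parry.js.

```javascript
// Bounds taken from the [0,20] range stated above.
const EMOTION_MIN = 0;
const EMOTION_MAX = 20;

// Pin a single value to the allowed range.
function clampEmotion(value) {
  return Math.min(EMOTION_MAX, Math.max(EMOTION_MIN, value));
}

// Hypothetical update step: apply deltas, clamping each result.
// Returns a new object so the previous state is left untouched.
function applyDeltas(state, deltas) {
  const next = { ...state };
  for (const [name, delta] of Object.entries(deltas)) {
    next[name] = clampEmotion((next[name] ?? 0) + delta);
  }
  return next;
}

const nextState = applyDeltas({ fear: 18, anger: 3 }, { fear: 5, anger: -10 });
console.log(nextState); // { fear: 20, anger: 0 }
```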

ALICE Template Syntax:

  • {{BOT:property}} - Bot properties from this.context
  • {{STAR:N}} - Wildcard captures (1-indexed)
  • {{SRAI:pattern}} - Recursive pattern redirect
  • {{GET:var}} / {{SET:var:value}} - Context variables
  • {{RANDOM:["a","b","c"]}} - Random selection
  • {{THINK:...}} - Silent execution (no output)

Critical: Bot properties botmaster and master MUST differ. See anti-patterns.
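
To make the tag syntax concrete, here is a sketch that expands only `{{BOT:...}}` and `{{STAR:N}}` and checks the botmaster/master rule up front. `expandTemplate` and its signature are hypothetical; js/alice.js may parse and evaluate these tags quite differently.

```javascript
// Hypothetical expander covering only {{BOT:...}} and {{STAR:N}}.
function expandTemplate(template, context, stars = []) {
  return template
    // Bot properties are looked up in the context object.
    .replace(/\{\{BOT:(\w+)\}\}/g, (_, prop) => context[prop] ?? '')
    // Wildcard captures are 1-indexed in the template syntax.
    .replace(/\{\{STAR:(\d+)\}\}/g, (_, n) => stars[Number(n) - 1] ?? '');
}

const context = { botmaster: 'creator', master: 'Dr. Richard Wallace' };

// Guard against the anti-pattern: the two properties must differ.
if (context.botmaster === context.master) {
  throw new Error('botmaster and master must differ');
}

const reply = expandTemplate('My {{BOT:botmaster}} is {{BOT:master}}', context);
console.log(reply); // My creator is Dr. Richard Wallace
console.log(expandTemplate('You said {{STAR:1}}', context, ['hello'])); // You said hello
```

With identical values the first template would read "My creator is creator", which is why the guard runs before any expansion.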

NOTES

  • Demo 07 (RAG) has known memory issues in CI. Uses NODE_OPTIONS="--max-old-space-size=4096".
  • GitHub Pages auto-deploys on push to main. Live at contextlab.github.io/llm-course/demos/ and contextlab.github.io/llm-course/assignments/.
  • Root index.html is dynamic: parses /slides/README.md client-side to build course outline.
  • Fonts: Duplicated in /admin/ and /fonts/. Both needed for different build targets (LaTeX vs HTML).
  • notes/ is dev logs, not course notes. Don't expose to students.
  • Cross-demo imports: Demo 02 (chatbot-evolution) imports from Demo 01 (eliza). Don't refactor eliza without checking downstream.