From 67b096ce35d499faee4b7209577ce3159afc72d0 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Mateusz=20S=C5=82uszniak?= <mateusz.sluszniak@swmansion.com>
Date: Tue, 3 Feb 2026 19:35:49 +0100
Subject: [PATCH 1/3] docs: Add glossary of terms

---
 .cspell-wordlist.txt                          |  1 +
 .../01-fundamentals/04-glossary-of-terms.md   | 59 +++++++++++++++++++
 2 files changed, 60 insertions(+)
 create mode 100644 docs/docs/01-fundamentals/04-glossary-of-terms.md
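
Note on the Quantization entry: the 32-bit to 8-bit conversion it mentions can be sketched as below. This is a minimal illustration that assumes symmetric, per-tensor int8 quantization. The helper names `quantizeInt8` and `dequantizeInt8` are hypothetical and are not part of React Native ExecuTorch or ExecuTorch, whose actual quantization schemes may differ.

```typescript
// Hypothetical helpers: symmetric per-tensor int8 quantization (an assumption,
// not the scheme used by any particular ExecuTorch backend).
function quantizeInt8(values: Float32Array): { q: Int8Array; scale: number } {
  // The scale maps the largest magnitude onto the int8 limit of 127.
  let maxAbs = 0;
  for (let i = 0; i < values.length; i++) {
    maxAbs = Math.max(maxAbs, Math.abs(values[i]));
  }
  const scale = maxAbs === 0 ? 1 : maxAbs / 127;

  const q = new Int8Array(values.length);
  for (let i = 0; i < values.length; i++) {
    // Round to the nearest step and clamp into [-127, 127].
    q[i] = Math.max(-127, Math.min(127, Math.round(values[i] / scale)));
  }
  return { q, scale };
}

function dequantizeInt8(q: Int8Array, scale: number): Float32Array {
  const out = new Float32Array(q.length);
  for (let i = 0; i < q.length; i++) {
    out[i] = q[i] * scale;
  }
  return out;
}

const weights = new Float32Array([0.5, -1.0, 0.25, 0.0]);
const { q, scale } = quantizeInt8(weights);
const restored = dequantizeInt8(q, scale);

// Storage drops from 4 bytes to 1 byte per value.
console.log(weights.byteLength, q.byteLength);
console.log(Array.from(restored));
```

The restored values differ from the originals by at most about half a quantization step (`scale / 2`), which is the accuracy trade-off the entry describes.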

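Note on the Tensor entry: the `[3, 224, 224]` example can be made concrete with a flat buffer. The sketch below assumes 32-bit floats and a contiguous row-major (channel, height, width) layout. Both are assumptions for illustration, and `flatIndex` is a hypothetical helper, not a library function.

```typescript
// Shape taken from the glossary example: [channels, height, width].
const shape = [3, 224, 224];

// Total number of elements is the product of all dimensions.
const elementCount = shape.reduce((acc, dim) => acc * dim, 1);

// Assumption: values are stored as 32-bit floats in one contiguous buffer.
const data = new Float32Array(elementCount);

// Hypothetical helper: maps (channel, row, column) to a position in the
// flat buffer, assuming row-major order.
function flatIndex(c: number, h: number, w: number): number {
  const height = shape[1];
  const width = shape[2];
  return (c * height + h) * width + w;
}

// Write one value: channel 1, row 0, column 5.
data[flatIndex(1, 0, 5)] = 0.75;

console.log(elementCount, data.byteLength);
```

Under these assumptions a single image tensor holds 150528 values and occupies 602112 bytes.
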
diff --git a/.cspell-wordlist.txt b/.cspell-wordlist.txt
index 30fede370..7804b7cfd 100644
--- a/.cspell-wordlist.txt
+++ b/.cspell-wordlist.txt
@@ -98,3 +98,4 @@ ocurred
 libfbjni
 libc
 gradlew
+TTFT
diff --git a/docs/docs/01-fundamentals/04-glossary-of-terms.md b/docs/docs/01-fundamentals/04-glossary-of-terms.md
new file mode 100644
index 000000000..9a1420577
--- /dev/null
+++ b/docs/docs/01-fundamentals/04-glossary-of-terms.md
@@ -0,0 +1,59 @@
+# Glossary of Terms
+
+This glossary defines key concepts used throughout the React Native ExecuTorch ecosystem, covering high-level machine learning terms and library-specific components.
+
+## Backend
+
+The execution engine responsible for running the actual computations of a model on specific hardware.
+
+- **XNNPACK:** A highly optimized library for floating-point neural network inference on ARM, x86, and WebAssembly. It is the default CPU backend for ExecuTorch.
+
+- **Core ML:** Apple's framework for optimizing and running machine learning models on iOS, macOS, and iPadOS devices. Using the Core ML backend lets ExecuTorch delegate supported operations to Core ML, which can run them on the CPU, GPU, or Apple Neural Engine (ANE). Running on the ANE is often faster and more energy-efficient than CPU-only inference.
+
+## Forward Function
+
+The primary method of a PyTorch module (usually `forward()`) that defines the computation performed at every call. In the context of ExecuTorch, this is the logic that gets exported and compiled. When you run inference in React Native, you are essentially invoking this compiled forward function with new inputs.
+
+## Inference
+
+The process of using a trained machine learning model to make predictions or generate outputs based on new, unseen input data. Unlike training (which updates the model's weights), inference is static and computationally lighter, making it suitable for running directly on mobile devices.
+
+## Out-of-the-Box Support
+
+Refers to features, models, or architectures that work with React Native ExecuTorch without requiring custom compilation, manual kernel registration, or complex configuration. For example, standard Llama architectures have out-of-the-box support, meaning you can download the `.pte` file and run it without further setup.
+
+## Prefill
+
+The initial phase of generating text with an LLM (Large Language Model) where the model processes the entire input prompt (context) at once.
+
+- **Why it matters:** This step is computationally intensive because the model must process every token of the prompt before it can produce any output, so its cost grows with prompt length.
+
+- **Performance Metric:** "Time to First Token" (TTFT) is dominated by the prefill phase, so it is the usual measure of prefill speed.
+
+## Quantization
+
+A technique to reduce the size of a model and speed up inference by representing weights and activations with lower-precision data types (e.g., converting 32-bit floating-point numbers to 8-bit integers).
+
+- **Benefits:** Lowers memory usage (8-bit integers take a quarter of the space of 32-bit floats) and reduces energy consumption on mobile devices.
+
+- **Trade-off:** Slight reduction in model accuracy, though often negligible for deployment.
+
+## Tensor
+
+The fundamental data structure in PyTorch and ExecuTorch. A tensor is a multi-dimensional array (like a matrix) that holds the inputs, weights, and outputs of a model.
+
+- **Example:** An image might be represented as a tensor of shape `[3, 224, 224]` (3 color channels, 224 pixels high, 224 pixels wide).
+
+## Token
+
+The basic unit of text that an LLM reads and generates. A token can be a word, part of a word, or even a single character.
+
+- **Rule of thumb:** 1,000 tokens is roughly equivalent to 750 words in English.
+
+- **Context:** Models have a "Context Window" limit (e.g., 2048 tokens), which is the maximum number of tokens, prompt and generated output combined, that they can take into account at once.
+
+## Tokenization
+
+The process of converting raw text (strings) into a sequence of numerical IDs (tokens) that the model can understand.
+
+- **Tokenizer (Component):** In React Native ExecuTorch, the `Tokenizer` is a utility class that handles encoding text into tensors and decoding output tensors back into readable text strings.

From e05ebcabc24670ac6ba91e5b2004c6070459c7a5 Mon Sep 17 00:00:00 2001
From: Mateusz Sluszniak <56299341+msluszniak@users.noreply.github.com>
Date: Wed, 4 Feb 2026 11:37:52 +0100
Subject: [PATCH 2/3] Update docs/docs/01-fundamentals/04-glossary-of-terms.md

Co-authored-by: Jakub Chmura <92989966+chmjkb@users.noreply.github.com>
---
 docs/docs/01-fundamentals/04-glossary-of-terms.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/docs/01-fundamentals/04-glossary-of-terms.md b/docs/docs/01-fundamentals/04-glossary-of-terms.md
index 9a1420577..b3ccd240b 100644
--- a/docs/docs/01-fundamentals/04-glossary-of-terms.md
+++ b/docs/docs/01-fundamentals/04-glossary-of-terms.md
@@ -16,7 +16,7 @@ The primary method of a PyTorch module (usually `forward()`) that defines the co
 
 ## Inference
 
-The process of using a trained machine learning model to make predictions or generate outputs based on new, unseen input data. Unlike training (which updates the model's weights), inference is static and computationally lighter, making it suitable for running directly on mobile devices.
+The process of using a trained machine learning model to make predictions or generate outputs for given input data.
 
 ## Out-of-the-Box Support
 

From 36b59973c10043ef90b80c53c30dd22cc04c4977 Mon Sep 17 00:00:00 2001
From: Mateusz Sluszniak <56299341+msluszniak@users.noreply.github.com>
Date: Wed, 4 Feb 2026 11:38:29 +0100
Subject: [PATCH 3/3] Update docs/docs/01-fundamentals/04-glossary-of-terms.md

---
 docs/docs/01-fundamentals/04-glossary-of-terms.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
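
Note on the Tokenization entry touched by this patch: the encode/decode round trip it describes can be sketched with a toy vocabulary. Everything below is hypothetical and for illustration only. `toyEncode`, `toyDecode`, and the four-entry vocabulary are not the `TokenizerModule` API, and real tokenizers use learned subword vocabularies rather than whole-word lookup.

```typescript
// Toy vocabulary (an assumption for illustration); index 0 is the
// fallback for anything the vocabulary does not contain.
const vocab = ["<unk>", "hello", "world", "!"];

const tokenToId = new Map<string, number>();
vocab.forEach((token, id) => tokenToId.set(token, id));

// Encoding: split text into pieces, then look up each piece's numeric ID.
function toyEncode(text: string): number[] {
  const pieces = text.toLowerCase().match(/[a-z]+|[^a-z\s]/g) ?? [];
  return pieces.map((piece) => tokenToId.get(piece) ?? 0);
}

// Decoding: map each ID back to its token and join the tokens with spaces.
function toyDecode(ids: number[]): string {
  return ids.map((id) => vocab[id] ?? "<unk>").join(" ");
}

const ids = toyEncode("Hello world!");
console.log(ids);            // [1, 2, 3]
console.log(toyDecode(ids)); // "hello world !"
```

The round trip is lossy here (case and spacing are not preserved), which is one reason production tokenizers carry extra rules for whitespace and special tokens.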

diff --git a/docs/docs/01-fundamentals/04-glossary-of-terms.md b/docs/docs/01-fundamentals/04-glossary-of-terms.md
index b3ccd240b..40d7e049e 100644
--- a/docs/docs/01-fundamentals/04-glossary-of-terms.md
+++ b/docs/docs/01-fundamentals/04-glossary-of-terms.md
@@ -56,4 +56,4 @@ The basic unit of text that an LLM reads and generates. A token can be a word, p
 
 The process of converting raw text (strings) into a sequence of numerical IDs (tokens) that the model can understand.
 
-- **Tokenizer (Component):** In React Native ExecuTorch, the `Tokenizer` is a utility class that handles encoding text into tensors and decoding output tensors back into readable text strings.
+- **TokenizerModule (Component):** In React Native ExecuTorch, the `TokenizerModule` is a utility class that handles encoding text into token IDs and decoding token IDs back into readable text strings.