GGML has explicit support for view-style and layout-style operations.

The public API includes `ggml_view_tensor(...)`, and the repo also exposes shaped view helpers such as `ggml_view_1d(...)` and `ggml_view_2d(...)`. In the GPT-2 example, GGML uses `ggml_view_1d`, `ggml_view_2d`, `ggml_reshape_3d`, and `ggml_permute` to slice KV-cache memory and reinterpret layout without rebuilding tensors from scratch.
So the right mental model is: a view operation creates a new tensor object that points back to existing storage through `view_src` and an offset/layout description, while `ne[]` and `nb[]` describe the new logical shape and strides. That is what enables zero-copy or low-copy transformations such as slicing, reshaping, permutation, and transpose-style layout reinterpretation.
## Computation Graph Construction
### Graph Structure
- `include/ggml.h`
- `src/ggml.c`
- `src/ggml-opt.cpp`
`ggml_cgraph` is GGML’s computation graph object. The public API exposes graph construction through functions such as `ggml_new_graph(...)`, `ggml_new_graph_custom(...)`, and `ggml_build_forward_expand(...)`. The workflow is: define tensor expressions first, build a graph from the final output tensor, and execute it later.
The graph stores more than a single output pointer. The implementation uses graph fields such as `size`, `n_nodes`, `n_leafs`, `nodes`, `leafs`, `visited_hash_set`, `grads`, and `grad_accs`, which shows that the graph contains ordered execution nodes, leaf tensors, traversal state, and gradient-related bookkeeping.
In GGML, tensor dependencies are represented through each tensor’s `src[]` pointers, while `ggml_cgraph` stores the ordered result of traversing those dependencies so the graph can be executed in dependency order.
### Graph Building Process
- `include/ggml.h`
- `src/ggml.c`
The main entry point for forward graph construction is `ggml_build_forward_expand(gf, output_tensor)`. The documented usage pattern is:
1. create tensors in a `ggml_context`
2. compose tensor operations
3. create a `ggml_cgraph`
4. expand the graph from the final output tensor
5. execute the graph later
At a high level, graph construction works like this:
- start from the output tensor
- follow dependencies through `src[]`
- collect leaf tensors separately from computed nodes
- build an ordered node list so dependencies come before the nodes that consume them
- maintain visited-state and optional gradient mappings inside the graph
This organization ensures that sequential graph execution can process nodes in dependency order.
### Forward and Backward Graphs
GGML supports both forward and backward graph workflows. The public header notes that users can define a function graph once and compute forward or backward graphs multiple times using the same memory buffer. It also documents `ggml_set_param(...)` for marking tensors as optimization parameters.
The implementation also includes gradient-related graph state through `grads` and `grad_accs`, which is used when duplicating graphs and preserving gradient mappings.
### Graph Planning and Execution
- `include/ggml-backend.h`
After construction, a `ggml_cgraph` can be planned and executed through the backend layer. The backend API includes:
- `ggml_backend_graph_plan_create(...)`
- `ggml_backend_graph_plan_free(...)`
- `ggml_backend_graph_plan_compute(...)`
- `ggml_backend_graph_compute(...)`
- `ggml_backend_graph_compute_async(...)`
So the overall model is:
- `ggml_tensor` objects encode dependencies
- `ggml_build_forward_expand(...)` collects them into a `ggml_cgraph`
- the graph stores ordered nodes, leaves, and optional gradient metadata
- backend APIs plan and execute the graph on one or more devices