README.md (1 addition, 1 deletion)
```diff
@@ -11,7 +11,7 @@ The TensorFlow wheel used in this project can be downloaded from PyPI: https://p

 ## Model Design

-The model uses a **Gated Recurrent Unit (GRU)** architecture with ~700,000 trainable parameters, comprised of a 500-character scanning window, each with a 256-dimension embedding and 1024-dimension GRU, in addition to a 109-dimension output layer matching the vobulary size.
+The model uses a **Gated Recurrent Unit (GRU)** architecture with ~700,000 trainable parameters, comprised of a 500-character scanning window, each with a 256-dimension embedding and 1024-dimension GRU, in addition to a 109-dimension output layer matching the vocabulary size.

 The purpose of this project largely being a learning exercise for the author is the primary reason behind the choice of GRU over other architectures (e.g., Transformer) — it consumes relatively few computing resources while offering reasonable ability to capture medium-range dependencies, appropriate for paragraph-length academic abstracts.
```
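The architecture described in the changed paragraph can be sketched with standard `tf.keras` layers. This is a minimal illustration, not the project's actual code: the layer sizes (256-dimension embedding, 1024-unit GRU, 109-way output, 500-character window) come from the README, while the layer arrangement and all names are assumptions.

```python
import tensorflow as tf

# Sizes taken from the README; names are illustrative.
VOCAB_SIZE = 109   # output layer matches the vocabulary size
EMBED_DIM = 256    # per-character embedding dimension
GRU_UNITS = 1024   # GRU hidden-state dimension
SEQ_LEN = 500      # scanning-window length in characters


def build_model() -> tf.keras.Model:
    """Hypothetical character-level GRU model matching the described shapes."""
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),
        tf.keras.layers.GRU(GRU_UNITS, return_sequences=True),
        tf.keras.layers.Dense(VOCAB_SIZE),  # per-position logits over the vocabulary
    ])


model = build_model()
# Run a dummy batch of one 500-character window of token ids through the model.
out = model(tf.zeros((1, SEQ_LEN), dtype=tf.int32))
print(out.shape)
```

With `return_sequences=True` the model emits one 109-way logit vector per input position, i.e. an output of shape `(batch, 500, 109)`, which is the usual setup for next-character prediction over a sliding window.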