Skip to content

Commit ad35f98

Browse files
committed
transformerless_lm: 100x compression RE-VERIFIED on fresh run
Independent re-run confirms the original v2 result: arch params compression val vs dense dense_crt 801,664 1.0x 2.5602 - fibgen_K16_separable 8,064 100.4x 2.9020 +13.3% fibgen_K32_separable 9,216 87.9x 2.7282 +6.6% Both compressed variants are substantially below the uniform-random floor of 4.17 (ln(65)), confirming they LEARN the corpus structure despite having 100x and 88x fewer stored parameters than the dense baseline. This is THE validated headline result for the substrate framework: - 100x weight compression - +13% val loss penalty (single digit at 88x) - 90-93% of dense throughput at inference (validated separately) - 10-37x less RAM at deployment (grows with d_model) The training-speed claims for the various substrate OPERATORS (Subsim L1-distance, FSM recurrence, etc.) remain scale-bound to larger T or d than fits in our CPU bench budget. The training-speed substrate wins exist asymptotically but are not realized in pure PyTorch on CPU at our test scale. The deployment-side compression story stands as the substrate framework's most concretely validated result.
1 parent ad546bd commit ad35f98

1 file changed

Lines changed: 176 additions & 0 deletions

File tree

Lines changed: 176 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,176 @@
1+
{
2+
"dense_crt": {
3+
"arch": "dense_crt",
4+
"n_params": 801664,
5+
"final_val": 2.5602230206131935,
6+
"wall_time": 49.024354457855225,
7+
"val_history": [
8+
[
9+
0,
10+
92.43370771408081,
11+
0.10876584053039551
12+
],
13+
[
14+
300,
15+
2.98776376247406,
16+
5.3295581340789795
17+
],
18+
[
19+
600,
20+
2.749406397342682,
21+
11.435869932174683
22+
],
23+
[
24+
900,
25+
2.6236911714076996,
26+
17.805394649505615
27+
],
28+
[
29+
1200,
30+
2.625833183526993,
31+
23.5450336933136
32+
],
33+
[
34+
1500,
35+
2.5822636634111404,
36+
29.158379316329956
37+
],
38+
[
39+
1800,
40+
2.5743977576494217,
41+
34.97354340553284
42+
],
43+
[
44+
2100,
45+
2.560856521129608,
46+
40.879321813583374
47+
],
48+
[
49+
2400,
50+
2.5528121292591095,
51+
46.79512023925781
52+
],
53+
[
54+
2499,
55+
2.575013518333435,
56+
48.8493754863739
57+
]
58+
]
59+
},
60+
"fibgen_K16_separable": {
61+
"arch": "fibgen_K16_separable",
62+
"n_params": 8064,
63+
"final_val": 2.902008831501007,
64+
"wall_time": 58.964789390563965,
65+
"val_history": [
66+
[
67+
0,
68+
5.270428001880646,
69+
0.15264534950256348
70+
],
71+
[
72+
300,
73+
3.4643709510564804,
74+
7.071675539016724
75+
],
76+
[
77+
600,
78+
3.3865112960338593,
79+
14.172980785369873
80+
],
81+
[
82+
900,
83+
3.305111363530159,
84+
21.31161069869995
85+
],
86+
[
87+
1200,
88+
3.2289299070835114,
89+
28.397788286209106
90+
],
91+
[
92+
1500,
93+
3.0719925612211227,
94+
35.42947506904602
95+
],
96+
[
97+
1800,
98+
2.9877236634492874,
99+
42.33727979660034
100+
],
101+
[
102+
2100,
103+
2.954200729727745,
104+
49.212780714035034
105+
],
106+
[
107+
2400,
108+
2.8961553126573563,
109+
56.331286907196045
110+
],
111+
[
112+
2499,
113+
2.9088927507400513,
114+
58.72295355796814
115+
]
116+
]
117+
},
118+
"fibgen_K32_separable": {
119+
"arch": "fibgen_K32_separable",
120+
"n_params": 9216,
121+
"final_val": 2.7281879782676697,
122+
"wall_time": 66.19854712486267,
123+
"val_history": [
124+
[
125+
0,
126+
5.49858295917511,
127+
0.1640477180480957
128+
],
129+
[
130+
300,
131+
3.3001507967710495,
132+
7.89574122428894
133+
],
134+
[
135+
600,
136+
3.0636631697416306,
137+
15.41629934310913
138+
],
139+
[
140+
900,
141+
2.8641782253980637,
142+
23.382102489471436
143+
],
144+
[
145+
1200,
146+
2.831878826022148,
147+
31.46604561805725
148+
],
149+
[
150+
1500,
151+
2.796993166208267,
152+
40.040778160095215
153+
],
154+
[
155+
1800,
156+
2.7588206231594086,
157+
47.84621453285217
158+
],
159+
[
160+
2100,
161+
2.753335326910019,
162+
55.48466753959656
163+
],
164+
[
165+
2400,
166+
2.7218541502952576,
167+
63.28416299819946
168+
],
169+
[
170+
2499,
171+
2.7392602413892746,
172+
65.91092014312744
173+
]
174+
]
175+
}
176+
}

0 commit comments

Comments
 (0)