@copilot Verify each finding against the current code and only fix it if needed.
In @transformer_lattice.c around lines 266 - 273, new_lattice(...) allocates
per-layer and per-head tensors that are never freed, so each call leaks memory.
Add a destructor chain and call it:
- free_head: call free_tensor on Wq, Wk, Wv and Wo of a Head.
- free_mha: call free_head on each head, then free the heads array.
- free_ffn: free the FFN weight tensors.
- free_block: call free_mha and free_ffn for a Block.
- free_lattice: call free_block on each layer, then free the layers array and
  set the pointer to NULL.
In the API path that calls new_lattice (the block around lattice_forward,
memcpy, free_tensor(&X)), call free_lattice(&net) after the output has been
copied and before returning, so the per-call allocations from new_lattice are
released.
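
A minimal sketch of the destructor chain described above, for orientation only. The struct layouts, the field names (n_heads, heads, W1, W2, n_layers, layers, mha, ffn) and the body of free_tensor are assumptions made up for illustration; the real definitions in transformer_lattice.c may differ, and the existing free_tensor should be used instead of the stand-in here.

```c
#include <stdlib.h>

/* Hypothetical layouts: assumptions for this sketch, not the real structs. */
typedef struct { int rows, cols; float *data; } Tensor;
typedef struct { Tensor Wq, Wk, Wv, Wo; } Head;
typedef struct { int n_heads; Head *heads; } MHA;
typedef struct { Tensor W1, W2; } FFN;
typedef struct { MHA mha; FFN ffn; } Block;
typedef struct { int n_layers; Block *layers; } Lattice;

/* Stand-in for the project's existing free_tensor (assumed behavior). */
static void free_tensor(Tensor *t) {
    if (!t) return;
    free(t->data);            /* free(NULL) is a no-op */
    t->data = NULL;
    t->rows = t->cols = 0;
}

/* Release the four projection matrices of one attention head. */
static void free_head(Head *h) {
    if (!h) return;
    free_tensor(&h->Wq);
    free_tensor(&h->Wk);
    free_tensor(&h->Wv);
    free_tensor(&h->Wo);
}

/* Release every head, then the heads array itself. */
static void free_mha(MHA *m) {
    if (!m) return;
    if (m->heads) {
        for (int i = 0; i < m->n_heads; i++)
            free_head(&m->heads[i]);
    }
    free(m->heads);
    m->heads = NULL;
    m->n_heads = 0;
}

/* Release the feed-forward weights (two matrices assumed here). */
static void free_ffn(FFN *f) {
    if (!f) return;
    free_tensor(&f->W1);
    free_tensor(&f->W2);
}

/* Release both sub-modules of one transformer block. */
static void free_block(Block *b) {
    if (!b) return;
    free_mha(&b->mha);
    free_ffn(&b->ffn);
}

/* Release every layer, then the layers array; safe to call twice. */
static void free_lattice(Lattice *net) {
    if (!net) return;
    if (net->layers) {
        for (int i = 0; i < net->n_layers; i++)
            free_block(&net->layers[i]);
    }
    free(net->layers);
    net->layers = NULL;
    net->n_layers = 0;
}
```

Nulling each pointer after freeing makes a second free_lattice(&net) harmless, which helps if the API path has more than one exit. At the call site, free_lattice(&net) would sit after the memcpy and free_tensor(&X) and before the return.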
Originally posted by @drQedwards in #229 (comment)