Commit 79962f6

model : Add tokenizer from LFM2.5-Audio-1.5B
[LFM2.5-Audio-1.5B](https://huggingface.co/LiquidAI/LFM2.5-Audio-1.5B) introduced a lightweight audio tokenizer. The tokenizer is based on the LFM2 architecture and acts as an "embedding" model whose input size `n_embd` differs from its output size `n_embd_out`.

To be used in ggml-org#18641.

To convert, use:

```shell
python3 convert_hf_to_gguf.py /path/to/LFM2.5-Audio-1.5B/audio_detokenizer
```
1 parent afa6bfe commit 79962f6

1 file changed

Lines changed: 19 additions & 0 deletions

convert_hf_to_gguf.py

@@ -10891,6 +10891,25 @@ def modify_tensors(self, data_torch, name, bid):
         yield from super().modify_tensors(data_torch, name, bid)
 
 
+@ModelBase.register("Lfm25AudioTokenizer")
+class LFM25AudioTokenizer(LFM2Model):
+    model_arch = gguf.MODEL_ARCH.LFM2
+
+    def set_gguf_parameters(self):
+        super().set_gguf_parameters()
+        self.gguf_writer.add_sliding_window(self.hparams["sliding_window"])
+        self.gguf_writer.add_embedding_length_out(self.hparams.get("output_size"))
+
+    def modify_tensors(self, data_torch: Tensor, name: str, bid: int | None) -> Iterable[tuple[str, Tensor]]:
+        if name == "istft.window" or name.startswith("emb.emb"):
+            return []
+
+        if name.startswith("lin"):
+            name = name.replace("lin", "dense_2_out")
+
+        yield from super().modify_tensors(data_torch, name, bid)
+
+
 @ModelBase.register("SmallThinkerForCausalLM")
 class SmallThinkerModel(TextModel):
     model_arch = gguf.MODEL_ARCH.SMALLTHINKER
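
The skip and rename rules in `modify_tensors` can be tried in isolation. The following is a minimal sketch, not the converter's code: `remap_detokenizer_names` is a hypothetical helper that works on bare tensor names, and the sample names (`emb.emb.weight`, `lin.weight`, `layers.0.conv.weight`) are made up for illustration. The real method yields `(name, tensor)` pairs and hands the final mapping to `LFM2Model.modify_tensors`.

```python
from typing import Iterable, Iterator


def remap_detokenizer_names(names: Iterable[str]) -> Iterator[str]:
    """Apply the diff's skip/rename rules to bare tensor names.

    Hypothetical helper for illustration only; it does not touch tensor data
    and does not call into convert_hf_to_gguf.py.
    """
    for name in names:
        # Names the diff skips: the exact "istft.window" tensor and
        # anything whose name starts with "emb.emb".
        if name == "istft.window" or name.startswith("emb.emb"):
            continue
        # Names starting with "lin" are rewritten to "dense_2_out".
        # Like the diff, str.replace rewrites every "lin" occurrence,
        # not only the leading one.
        if name.startswith("lin"):
            name = name.replace("lin", "dense_2_out")
        yield name


if __name__ == "__main__":
    # Sample names are assumptions, not taken from the checkpoint.
    sample = ["istft.window", "emb.emb.weight", "lin.weight", "layers.0.conv.weight"]
    print(list(remap_detokenizer_names(sample)))
```

Two details of the committed method are worth keeping in mind when reading the sketch. Because `modify_tensors` contains `yield from`, it is a generator, so `return []` simply ends iteration without yielding anything; the list itself is never seen by the caller. And since `str.replace` is not anchored to the prefix, a name with a second `lin` substring would have both occurrences rewritten.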
