[E2E Model Support] Add TileGym kernel integration for LFM2 MoE #154

@iamanishx

Overview

This issue tracks adding end-to-end TileGym kernel support for Liquid AI's LFM2 MoE model family, starting with LiquidAI/LFM2-8B-A1B.

LFM2-8B-A1B is a hybrid MoE model with 8.3B total parameters and 1.5B active parameters. It combines full-attention blocks with convolution-style blocks and uses GQA, RoPE, RMSNorm-style normalization, SwiGLU-style expert FFNs, and top-k MoE routing.
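For reference while patching, the two simplest building blocks named above can be written as a tiny pure-Python baseline. This is only a sketch for sanity-checking fused kernels on toy inputs: the function names are hypothetical, the weight matrices are assumed to be stored as rows of output features, and the `eps` default is an assumption rather than a value taken from the LFM2 config.

```python
import math


def rms_norm(x, weight, eps=1e-6):
    """RMSNorm over one vector: x / sqrt(mean(x^2) + eps) * weight.

    eps=1e-6 is an assumed default, not a value read from the model config.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def matvec(matrix, x):
    """Multiply a row-major matrix (rows = output features) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU-style FFN: down(silu(gate(x)) * up(x)), without biases."""
    gate = matvec(w_gate, x)
    up = matvec(w_up, x)
    hidden = [silu(g) * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)
```

Comparing a fused kernel against a baseline like this on small random inputs is one way to catch a swapped gate/up projection early, since swapping the two changes which branch goes through SiLU.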

Planned Steps

  • Add apply_tilegym_kernel_to_lfm2_moe in monkey_patch.py
  • Register lfm2_moe in MODEL_TYPE_TO_APPLY_TILEGYM_FN
  • Patch compatible common kernels where applicable:
    • RoPE
    • RMSNorm
    • attention path
    • SwiGLU / expert MLP path
  • Add an LFM2 MoE wrapper that reuses TileGym fused_moe
  • Verify expert weight layout and routing semantics:
    • gate/up projection order
    • down projection layout
    • top-k routing weights
    • norm_topk_prob
    • routed_scaling_factor
    • expert bias handling
  • Add an E2E inference / benchmark script for LiquidAI/LFM2-8B-A1B
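The routing checks in the list above could be pinned down with a small reference function before wiring in fused_moe. The sketch below is one plausible reading of the semantics, not the confirmed LFM2 behavior: it assumes sigmoid scoring, an expert bias that affects only which experts are selected (not their weights), optional renormalization of the selected weights, and a final multiplication by routed_scaling_factor. The function name `route_token` and its signature are hypothetical; the real order of operations is exactly what this issue proposes to verify against the reference implementation.

```python
import math


def route_token(logits, top_k, expert_bias=None,
                norm_topk_prob=True, routed_scaling_factor=1.0):
    """Pick top_k experts for one token and return (indices, weights).

    Assumed semantics (to be verified against the reference model):
      * scores are sigmoid(logits)
      * expert_bias, if given, is added only for expert selection
      * weights come from the unbiased scores
      * norm_topk_prob rescales the selected weights to sum to 1
      * routed_scaling_factor is applied last
    """
    scores = [1.0 / (1.0 + math.exp(-z)) for z in logits]
    if expert_bias is not None:
        selection = [s + b for s, b in zip(scores, expert_bias)]
    else:
        selection = scores

    # Highest selection score first; ties broken by lower expert index.
    order = sorted(range(len(scores)), key=lambda i: (-selection[i], i))
    chosen = order[:top_k]

    weights = [scores[i] for i in chosen]
    if norm_topk_prob:
        total = sum(weights)  # strictly positive: sigmoid outputs are > 0
        weights = [w / total for w in weights]
    weights = [w * routed_scaling_factor for w in weights]
    return chosen, weights
```

A usage sketch: `route_token([0.0, 2.0, -1.0, 1.0], top_k=2)` selects experts 1 and 3 under these assumptions. If the fused path and this reference disagree on indices, the bias handling is the likely suspect; if they disagree only on weights, the normalization or scaling order is.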

Questions

  • Is LiquidAI/LFM2-8B-A1B an acceptable validation target for a 5090-friendly E2E model integration?
  • Are there existing TileGym conventions for hybrid architectures with non-attention convolution blocks that this integration should follow?
