This document covers the major breaking upgrade paths.
No public API break, but Android arm64 native packaging defaults changed.
- Shorthand config such as `android-arm64: [vulkan]` is still supported.
- If no CPU policy is set, Android arm64 now defaults to `cpu_profile: full` (all CPU variants).
- If you want smaller baseline-only packaging, set `cpu_profile: compact` explicitly.
- `cpu_variants: [...]` (when provided) overrides `cpu_profile`.
Example (preserve compact baseline-style packaging):
```yaml
hooks:
  user_defines:
    llamadart:
      llamadart_native_backends:
        platforms:
          android-arm64:
            backends: [vulkan]
            cpu_profile: compact
```

The legacy custom handler/override registry APIs were removed:
- `ChatTemplateEngine.registerHandler(...)`
- `ChatTemplateEngine.unregisterHandler(...)`
- `ChatTemplateEngine.clearCustomHandlers(...)`
- `ChatTemplateEngine.registerTemplateOverride(...)`
- `ChatTemplateEngine.unregisterTemplateOverride(...)`
- `ChatTemplateEngine.clearTemplateOverrides(...)`
Legacy per-call handler routing fields were also removed:
- render param: `customHandlerId`
- parse param: `handlerId`
Template render/parse paths no longer silently downgrade to content-only fallback when a handler/parser fails. Failures are now surfaced to the caller.
Audit call sites that previously relied on silent fallback behavior and handle exceptions explicitly.
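A minimal sketch of explicit failure handling, assuming `session` and `messages` are already set up; the catch-all `on Exception` clause is an illustrative assumption, since this guide does not specify which exception types the render/parse paths throw:

```dart
// Sketch: handle surfaced template failures explicitly instead of
// relying on the removed silent content-only fallback.
try {
  await for (final chunk in session.create(messages)) {
    stdout.write(chunk.choices.first.delta.content ?? '');
  }
} on Exception catch (e) {
  // Previously this path could silently downgrade; now the failure
  // reaches the caller and must be handled (or rethrown) here.
  stderr.writeln('template render/parse failed: $e');
  rethrow;
}
```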
- Old pattern (string-in, string-out helpers): `session.chat(...)`, `session.chatText(...)`
- New pattern: `session.create(List<LlamaContentPart> ...)`, which streams `LlamaCompletionChunk`
Example migration:
```dart
// Before
await for (final token in session.chat('Hello')) {
  stdout.write(token);
}

// After
await for (final chunk in session.create([LlamaTextContent('Hello')])) {
  stdout.write(chunk.choices.first.delta.content ?? '');
}
```

- `LlamaChatMessage.text(...)` -> `LlamaChatMessage.fromText(...)`
- `LlamaChatMessage.multimodal(...)` -> `LlamaChatMessage.withContent(...)`
Example migration:
```dart
// Before
LlamaChatMessage.text(role: LlamaChatRole.user, content: 'Hi');

// After
LlamaChatMessage.fromText(role: LlamaChatRole.user, text: 'Hi');
```

- Removed: `ModelParams(logLevel: ...)`
- Use engine-level controls instead:
  - `await engine.setDartLogLevel(...)`
  - `await engine.setNativeLogLevel(...)`
  - or `await engine.setLogLevel(...)` to set both
Example migration:
```dart
// Before
await engine.loadModel(path, modelParams: ModelParams(logLevel: LlamaLogLevel.info));

// After
await engine.setNativeLogLevel(LlamaLogLevel.info);
await engine.loadModel(path);
```

- `loadModel(...)` now throws if a model is already loaded.
- Call `await engine.unloadModel()` (or `dispose()`) before loading another model.
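Swapping models therefore becomes an explicit unload-then-load sequence (a sketch; `otherPath` is a placeholder for your model path):

```dart
// loadModel(...) now throws if a model is already loaded,
// so unload the current model first when switching models.
await engine.unloadModel();
await engine.loadModel(otherPath);
```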
The package root (`package:llamadart/llamadart.dart`) no longer exports some previous internals. In particular:

- `ToolRegistry`
- `LlamaTokenizer`
- `ChatTemplateProcessor`

Use `LlamaEngine`, `ChatSession`, `ToolDefinition`, and the template APIs as the supported surface.
If you maintain your own `LlamaBackend` implementation, update it to match the current interface:

- Add `getVramInfo()`.
- Update the `applyChatTemplate(...)` signature/return type (string-based prompt rendering input/output).
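The shape of the change might look roughly like the sketch below. The parameter and return types here (`VramInfo`, the `applyChatTemplate` parameters) are illustrative assumptions, not the real llamadart signatures; consult the actual `LlamaBackend` interface:

```dart
// Illustrative sketch only: signatures are assumed, not authoritative.
class MyBackend implements LlamaBackend {
  @override
  Future<VramInfo> getVramInfo() async {
    // Newly required: report VRAM availability for this backend.
    throw UnimplementedError();
  }

  @override
  Future<String> applyChatTemplate(String template, List<LlamaChatMessage> messages) async {
    // Updated: string-based prompt rendering input/output.
    throw UnimplementedError();
  }
}
```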
Template/render/parse behavior is now strict llama.cpp parity:

- `customTemplate` remains supported for per-call template overrides.
- Legacy `customHandlerId` / parse `handlerId` routing was removed.
- `ChatTemplateEngine.registerHandler(...)` and `ChatTemplateEngine.registerTemplateOverride(...)` were removed.
- Render/parse paths no longer silently downgrade to content-only fallback when a handler/parser fails; failures are surfaced to the caller.
- Replace old `ChatSession` chat helpers with `create(...)` streaming.
- Rename `LlamaChatMessage` named constructors.
- Remove `ModelParams.logLevel` usage.
- Audit imports that depended on removed root exports.
- For custom backends, implement the latest `LlamaBackend` interface.