
Releases: SomeOddCodeGuy/WilmerAI

v0.62.2

18 Apr 20:14
8d5c76c


This patch release carries forward the full v0.62 "Tool Calling" notes and the v0.62.1 concurrency fix; see those releases below. New in this release:

Bug Fixes

  1. Dependabot Bumps — Dependabot bumped a couple of dependency versions to address CVEs.

v0.62.1 - Tool Calling

13 Apr 01:03
9ce5b98


This patch release carries forward the full v0.62 "Tool Calling" notes; see v0.62 below. New in this release:

Bug Fixes

  1. Concurrency Limiting — Fixed an issue where the concurrency limiter was blocking GET endpoints from responding, so models wouldn't load in front ends while another call was in flight. Now only the POST endpoints, which actually hit the LLM APIs, are limited.

v0.62 - Tool Calling

12 Apr 21:25
7dc535d


Major New Features

  1. End-to-End Tool Call Passthrough — Full tool calling support across all LLM handlers (Claude, OpenAI, Ollama) with both streaming and non-streaming paths. Frontend API handlers extract tools and tool_choice from incoming requests, thread them through the workflow pipeline, and forward them to backend LLM handlers. Backend handlers parse tool call data from LLM responses and return it through the response pipeline to the frontend client. A new allowTools boolean on workflow node configs (default false) gates which nodes forward tools, so memory nodes, summarizers, and categorizers silently suppress tool calls during internal processing. OpenAI format is used as the internal standard; Claude and Ollama handlers convert to/from their native formats. Streaming tool call chunks bypass all text processing (prefix stripping, think-block removal, group-chat reconstruction) and are emitted directly as SSE. A node-config sketch follows this list.

  2. DelimitedChunker Workflow Node — New node type that splits content on a delimiter and returns the first N (head) or last N (tail) chunks, rejoined with the same delimiter. Useful for trimming logs, CSV rows, or section-separated documents. Configurable via content, delimiter, mode ("head"/"tail"), and count properties. Supports variable substitution in both content and delimiter fields. An example node config follows this list.

  3. Conversation Variable Formatting Controls — Two new formatting options for chat_user_prompt_* workflow variables. Node-level addUserAssistantTags (boolean) prepends User: / Assistant: / System: role prefixes to each message in conversation variable strings. User-level separateConversationInVariables (boolean) with conversationSeparationDelimiter (string) replaces the default newline between messages with a custom delimiter. Sketches follow this list.

  4. Node-Level Image Controls — Standard nodes can now control image passthrough via acceptImages (boolean, preserves images on conversation messages sent to the LLM) and maxImagesToSend (integer, limits total images sent keeping the most recent; 0 = no limit). Images are trimmed oldest-first. An example follows this list.

  5. /v1/chat/completions Versioned Route — Added /v1/chat/completions as the primary versioned route for the OpenAI-compatible API. The existing /chat/completions is kept as an alias for backward compatibility.
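
For item 1, a hedged sketch of opting a node into tool forwarding; allowTools is the flag described above, while the other node fields are illustrative:

```json
{
  "title": "Final responder",
  "type": "Standard",
  "endpointName": "MyEndpoint",
  "allowTools": true
}
```

Memory, summarizer, and categorizer nodes simply leave the flag at its default of false, so tool calls never leak into internal processing.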
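
For item 2, a hypothetical DelimitedChunker node that keeps the last 50 newline-separated entries of a prior agent output (the node title and the {agent1Output} variable are illustrative):

```json
{
  "title": "Trim log to recent entries",
  "type": "DelimitedChunker",
  "content": "{agent1Output}",
  "delimiter": "\n",
  "mode": "tail",
  "count": 50
}
```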
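
For item 3, hedged sketches of where each control lives (surrounding fields and values are illustrative). On a workflow node:

```json
{
  "type": "Standard",
  "addUserAssistantTags": true
}
```

And in the user's settings file:

```json
{
  "separateConversationInVariables": true,
  "conversationSeparationDelimiter": "\n----\n"
}
```

With both enabled, each message in a chat_user_prompt_* variable gets a User:/Assistant:/System: prefix, and messages are joined with the custom delimiter instead of a newline.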
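
For item 4, a node that accepts images but forwards only the three most recent (other fields illustrative):

```json
{
  "type": "Standard",
  "acceptImages": true,
  "maxImagesToSend": 3
}
```

Setting maxImagesToSend to 0 leaves the count unlimited; trimming always drops the oldest images first.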

Bug Fixes

  1. Curly Brace Escaping in Agent Outputs/Inputs — Fixed str.format() crashes when agent outputs, agent inputs, or enriched tool call text contain literal curly braces (e.g., JSON from tool calls or files loaded by GetCustomFile). Uses a two-pass sentinel-escaping system: literal braces in variable values are replaced with sentinel tokens before formatting, then restored afterward. A minimal sketch follows this list.

  2. Category Matching With Underscores — Fixed _match_category failing to match category keys containing underscores (e.g., NEW_INSTRUCTION). The existing code stripped punctuation (including underscores) from the LLM output but compared against raw keys. Both sides are now normalized before comparison. A sketch follows this list.
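
For fix 1, a minimal sketch of the two-pass sentinel approach (helper and sentinel names are illustrative, not WilmerAI's actual code):

```python
# Sentinels chosen to be vanishingly unlikely in real LLM or tool output.
_OPEN = "\x00LBRACE\x00"
_CLOSE = "\x00RBRACE\x00"

def safe_format(template: str, **variables: str) -> str:
    """Format a prompt template whose variable values may contain
    literal curly braces (e.g. JSON from a tool call)."""
    # Pass 1: hide braces inside the values so str.format() ignores them.
    escaped = {
        name: value.replace("{", _OPEN).replace("}", _CLOSE)
        for name, value in variables.items()
    }
    formatted = template.format(**escaped)
    # Pass 2: restore the literal braces.
    return formatted.replace(_OPEN, "{").replace(_CLOSE, "}")

# The JSON braces survive instead of crashing str.format():
print(safe_format("Tool result: {agent1Output}", agent1Output='{"temp": 21}'))
```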
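
For fix 2, a sketch of normalizing both sides before comparing (function body illustrative):

```python
import string

_PUNCT_TABLE = str.maketrans("", "", string.punctuation)  # includes "_"

def _normalize(text: str) -> str:
    return text.translate(_PUNCT_TABLE).strip().lower()

def match_category(llm_output: str, category_keys: list[str]) -> str | None:
    normalized_output = _normalize(llm_output)
    for key in category_keys:
        # Previously only the LLM output was normalized, so a key like
        # "NEW_INSTRUCTION" (-> "newinstruction") could never match.
        if _normalize(key) == normalized_output:
            return key
    return None
```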

Code Quality

  1. Numeric Config Field Resolution — Replaced ad-hoc maxResponseSizeInTokens variable resolution with a table-driven _resolve_numeric_config_fields() method that handles all ~30 integer config fields and 1 float field in a single pass at the start of node processing.
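
A sketch of the table-driven idea; the field table is abbreviated and the float field's name is a placeholder:

```python
_INT_FIELDS = (
    "maxResponseSizeInTokens",
    "nMessagesToIncludeInVariable",
    "maxImagesToSend",
    # ... ~30 integer fields in the real table
)
_FLOAT_FIELDS = ("someFloatField",)  # one float field; name illustrative

def _resolve_numeric_config_fields(config, resolve_variable):
    """One pass at the start of node processing: any numeric field whose
    value is still a string (i.e. a variable reference) gets resolved
    and cast to its proper type."""
    for fields, cast in ((_INT_FIELDS, int), (_FLOAT_FIELDS, float)):
        for field in fields:
            value = config.get(field)
            if isinstance(value, str):
                config[field] = cast(resolve_variable(value))
    return config
```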

v0.61 - Dependabot update

05 Apr 15:32
56230a6


What's Changed

Full Changelog: v0.6...v0.61

v0.6 - Multi-user improvements, more memory and consistency improvements, and lots of bug fixes

29 Mar 22:57
1df4274


v0.6 - March 2026

Major New Features

  1. ContextCompactor Workflow Node — New node type that summarizes conversation messages into two rolling summaries (Old + Oldest) using token-based windowing. Separate from the memory system; designed for recency-aware conversation compaction. Uses XML-style tags and configurable via a settings file.

  2. Automatic Memory Condensation — Optional condensation layer for file-based memories. After enough new memories accumulate (configurable threshold), the oldest batch is LLM-summarized into a single condensed entry, reducing file bloat over long conversations.

  3. Per-Message Image Association — Major refactor replacing synthetic {"role": "images"} messages with a per-message "images" key. Images now stay associated with their originating message from ingestion through to LLM dispatch. Includes OpenAI multimodal content parsing on ingestion. A before-and-after sketch follows this list.

  4. Claude API Image Support — Full image support for the Claude handler. Supports base64, data URIs, and HTTP URLs. Uses PIL/Pillow for format detection (optional, falls back to JPEG). Images placed before text per Anthropic recommendation. A content-block sketch follows this list.

  5. Per-User Encryption — When an API key is provided via Authorization: Bearer, files are stored in isolated per-key directories. Optional Fernet encryption (AES-128-CBC + HMAC-SHA256, PBKDF2 key derivation) can be enabled per user. Transparent plaintext-to-encrypted migration. Includes a re-keying script. A key-derivation sketch follows this list.

  6. Multi-User Support — A single WilmerAI instance can now serve multiple users via repeated --User flags. Full per-user isolation: per-user config reads, request-scoped user identification, per-user log directories, aggregated models/tags endpoints.

  7. WSGI Concurrency Limiting Middleware — New --concurrency (default: 1) and --concurrency-timeout (default: 900s) CLI flags on all entry points. Requests exceeding the limit queue until a slot opens or timeout (503). Implemented at the WSGI layer so the semaphore holds across streaming responses. A middleware sketch follows this list.
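
For item 3, a hedged before-and-after sketch of the message shape (the old synthetic format's exact content key is an assumption):

Before:

```json
[
  {"role": "user", "content": "What is in this picture?"},
  {"role": "images", "content": ["<base64 image>"]}
]
```

After:

```json
[
  {"role": "user", "content": "What is in this picture?", "images": ["<base64 image>"]}
]
```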
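
For item 4, the Claude handler ultimately emits Anthropic-style content blocks, with the image block placed before the text block per the recommendation noted above:

```json
{
  "role": "user",
  "content": [
    {
      "type": "image",
      "source": {"type": "base64", "media_type": "image/jpeg", "data": "<base64 image>"}
    },
    {"type": "text", "text": "What is in this picture?"}
  ]
}
```

When Pillow isn't available for format detection, media_type falls back to image/jpeg.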
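
For item 5, a minimal sketch of deriving a per-user Fernet key from the bearer token with PBKDF2; the salt handling, iteration count, and function name are assumptions, not WilmerAI's actual code:

```python
import base64

from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC

def fernet_for_user(api_key: str, salt: bytes) -> Fernet:
    """Derive a per-user Fernet key from the Authorization: Bearer token.
    Fernet is AES-128-CBC plus HMAC-SHA256 under the hood, matching the
    release note."""
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                     salt=salt, iterations=600_000)
    return Fernet(base64.urlsafe_b64encode(kdf.derive(api_key.encode("utf-8"))))

f = fernet_for_user("sk-example-key", salt=b"per-user-16-byte")
token = f.encrypt(b'{"memories": []}')           # encrypt on write
assert f.decrypt(token) == b'{"memories": []}'   # decrypt on read
```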
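
For item 7, a minimal sketch of the WSGI-layer pattern (class and wrapper names are illustrative); holding the semaphore inside a wrapper around the response iterable is what keeps the slot occupied for the whole streaming response. The POST-only check reflects the v0.62.2 follow-up fix:

```python
import threading

class ConcurrencyLimitMiddleware:
    """Cap in-flight requests at the WSGI layer."""

    def __init__(self, app, limit=1, timeout=900.0):
        self.app = app
        self.semaphore = threading.BoundedSemaphore(limit)
        self.timeout = timeout

    def __call__(self, environ, start_response):
        # Only gate POSTs (the routes that hit LLM APIs), so GET
        # endpoints like /v1/models keep responding during a call.
        if environ.get("REQUEST_METHOD") != "POST":
            return self.app(environ, start_response)
        if not self.semaphore.acquire(timeout=self.timeout):
            start_response("503 Service Unavailable",
                           [("Content-Type", "text/plain")])
            return [b"Timed out waiting for a request slot."]
        try:
            body = self.app(environ, start_response)
        except BaseException:
            self.semaphore.release()
            raise
        return _ReleaseOnClose(body, self.semaphore)

class _ReleaseOnClose:
    """WSGI servers call close() once the response, streaming or not,
    has been fully sent or aborted; release the slot then."""

    def __init__(self, body, semaphore):
        self._body = body
        self._semaphore = semaphore
        self._released = False

    def __iter__(self):
        return iter(self._body)

    def close(self):
        if hasattr(self._body, "close"):
            self._body.close()
        if not self._released:
            self._released = True
            self._semaphore.release()
```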

Bug Fixes

  1. SillyTavern Streaming Hang — Fixed streaming hang when using SillyTavern as a front end.

  2. Open WebUI Streaming Error — Restored JSON heartbeat format (was changed to bare newline, causing JSONDecodeError in Open WebUI's NDJSON parser).

  3. Memory Generation Stalling — Fixed memory generation never triggering after the first run due to empty-message hash collision when front-end injects Author's Note with only a [DiscussionId] tag.

  4. GetCurrentMemoryFromFile Returning Wrong Data — Was sharing a code path with GetCurrentSummaryFromFile and returning rolling chat summary instead of memory chunks. Now correctly returns memory chunks.

  5. Image Lookback Default Regression — Restored default lookback window from 5 back to 10 (was silently halved).

  6. Multi-Word Prefix Detection in Streaming — Fixed StreamingResponseHandler failing to strip multi-word response prefixes (e.g., "AI: ").

  7. Data URI Stripping Before LLM Dispatch — Hardened image key stripping to cover all image formats when llm_takes_images is False.

Hardening and Security

  1. Dependency Pinning — All dependencies pinned to exact versions (==) to mitigate supply chain attacks. Updated several packages including requests (CVE fix), cryptography (reverted to 46.0.5, pre-supply-chain-attack window).

  2. Thread Safety — Per-discussion locks in timestamp service, context compactor, and RAG tool. Thread-safe globals via threading.local(). Lock dictionaries capped at 500 with LRU eviction. Atomic file writes (temp + rename). A sketch of the capped lock dictionary follows this list.

  3. Sensitive Logging / Prompt Redaction — New sensitive_logging_utils module. All log statements exposing user content converted to redactable versions. Redaction activates when encryption is enabled or redactLogOutput: true.

  4. JSON Parsing Hardening — Incoming API handlers now use get_json(force=True, silent=True) returning 400 instead of unhandled 500 on invalid JSON. An example handler follows this list.

  5. Configurable Categorization Retries — Removed hardcoded 4-retry loop; now configurable via maxCategorizationAttempts (default: 1).
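
For item 2, a minimal sketch of a capped, LRU-evicting lock dictionary (names and eviction details are assumptions):

```python
import threading
from collections import OrderedDict

_MAX_LOCKS = 500
_locks: "OrderedDict[str, threading.Lock]" = OrderedDict()
_locks_guard = threading.Lock()

def get_discussion_lock(discussion_id: str) -> threading.Lock:
    """Return the lock for a discussion, evicting the least recently
    used entries once the dictionary grows past the cap."""
    with _locks_guard:
        lock = _locks.get(discussion_id)
        if lock is None:
            lock = threading.Lock()
            _locks[discussion_id] = lock
        else:
            _locks.move_to_end(discussion_id)
        while len(_locks) > _MAX_LOCKS:
            _locks.popitem(last=False)  # drop the oldest entry
        return lock
```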
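
And for item 4, the hardened ingestion pattern as a minimal Flask sketch (route body illustrative):

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

@app.post("/v1/chat/completions")
def chat_completions():
    # force=True ignores the Content-Type header; silent=True makes
    # malformed JSON return None instead of raising, so the client
    # gets a clean 400 rather than an unhandled 500.
    data = request.get_json(force=True, silent=True)
    if data is None:
        return jsonify({"error": "Request body was not valid JSON"}), 400
    # ... hand the parsed request off to the workflow engine here
    return jsonify({"ok": True})
```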

Code Quality

  1. Optimized variable generation — Conversation-slice variables only computed when referenced in the prompt.

  2. Lazy-load time_context_summary — Skips file I/O when the variable isn't referenced.

v0.5 - Better message variables for prompts, some new nodes, and memory fixes

09 Feb 03:26
4771775


Summary

NOTE: This introduces new variables to help deprecate variables like "chat_user_prompt_last_twenty". I'm not getting rid of those, for backwards compatibility purposes, but going forward we don't need them as much.

New Workflow Nodes

  • JsonExtractor node: extracts fields from JSON in LLM responses without an additional LLM call
  • TagTextExtractor node: extracts content between XML/HTML-style tags without an additional LLM call

Configurable Prompt Variables

  • nMessagesToIncludeInVariable: node property to control how many messages are included in chat/templated prompt variables
  • estimatedTokensToIncludeInVariable: token-budget-based message selection, accumulates recent messages up to a token limit
  • minMessagesInVariable + maxEstimatedTokensInVariable: combo mode pulling a minimum message count then filling up to a token budget
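
Hedged node sketches of the three selection modes (values illustrative):

```json
[
  {"nMessagesToIncludeInVariable": 10},
  {"estimatedTokensToIncludeInVariable": 2000},
  {"minMessagesInVariable": 4, "maxEstimatedTokensInVariable": 2000}
]
```

The first takes a fixed message count, the second fills a token budget with the most recent messages, and the combo form guarantees a minimum count before filling the remaining budget.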

Token Estimation

  • Recalibrated rough_estimate_token_length word ratio (1.538 -> 1.35 tokens/word)
  • Added configurable safety_margin parameter (default 1.10)
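
A minimal sketch of the recalibrated estimator (the real function's signature and rounding may differ):

```python
def rough_estimate_token_length(text: str, safety_margin: float = 1.10) -> int:
    # ~1.35 tokens per word (recalibrated from 1.538), padded by the
    # configurable safety margin so budgets err on the generous side.
    return int(len(text.split()) * 1.35 * safety_margin)

# e.g. 100 words -> int(100 * 1.35 * 1.10) = 148 estimated tokens
```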

Memory System Fixes

  • Fixed file_exists check that was permanently disabling message-threshold triggers for new conversations
  • Fixed off-by-one in trigger comparisons (> to >=)
  • Added HTTP session cleanup via close() to prevent keep-alive connections from blocking llama.cpp slots
  • Split timeouts into (connect, read) tuples
  • Added diagnostic logging for memory trigger decisions

Code Quality

  • Narrowed bare except clauses to except Exception in cancellation paths
  • Added prompt-aware info logging for configurable variable slicing

Example Workflow Configs

  • Updated all example workflow JSON files to use new configurable variable syntax

v0.4.1 - Small hotfix for memories

05 Jan 03:53
a437d1e


What's Changed

  • Corrected an issue with the memory system caused by the recent change removing the image-specific handlers. by @SomeOddCodeGuy in #82

v0.4 - Workflow collections, bug fixes, test UI, and some simplification

04 Jan 21:26
f9f2a6e


What's Changed

  • Fix oldest message chunk being silently discarded in memory generation
  • Fix incorrect new message count causing duplicate processing of memorized messages
  • Fix pytest.ini test path case sensitivity

Features:

  • Add shared workflow collections and workflow selection via API model field (/v1/models and /api/tags endpoints)
  • Add workflow node execution summary logging with timing info
  • Add workflowConfigsSubDirectoryOverride for shared workflow folders
  • Add sharedWorkflowsSubDirectoryOverride for custom shared folder names
  • Add {Discussion_Id} and {YYYY_MM_DD} variables for file paths
  • Add variable substitution support for maxResponseSizeInTokens
  • Add web-based setup wizard (setup_wizard_web.py) (this is a WIP and may be temporary/replaced)
  • Add vector memory resumability with per-chunk hash logging

Refactors:

  • Consolidated image handlers into standard handlers (removed ~700 lines)
  • Standardize preset/workflow naming convention (hyphenated)
  • Archive legacy workflows to _archive subdirectories
  • Add pre-configured shared workflow folders

Simplification:

  • Updated preset names to match endpoint names. Now it makes more sense, as you can more easily use presets to make sure each endpoint gets the appropriate settings.
  • The _example_general_workflow is the one-stop shop for example productivity workflows, and thanks to the custom workflow system it's easy to spin more off. You can just drop new folders into _shared within workflows and suddenly have new workflows available to you as models. I'll make a video about this later.
  • Dropped the image-specific handlers. Finally. Those were something I did early on, and I just kept putting off dealing with them, but they always annoyed me. Regular handlers have the image frameworks added in, if they support it.

Tests:

  • Update tests for corrected memory hash behavior
  • Added tests for new workflow override features

v0.3.1

07 Dec 23:14
ac447fc


What's Changed

Full Changelog: v0.3.0...v0.3.1

v0.3.0 - API swapped, Claude Support added, other fixes

13 Oct 02:50
8b4963b


  • Added support for the Claude llm_api
  • Replaced the exposed Flask development server with Eventlet on macOS/Linux and Waitress on Windows
  • Fixed the unit tests not running in Windows properly
  • Corrected two places where a trailing slash (at the end of the llm_api URL or the ConfigDirectory folder name) caused a break
  • Added a first attempt at proper cancellation, where pressing "stop" in Open WebUI or other front ends will appropriately end a workflow and cascade down to the LLM
    • Some LLM APIs work with this, some don't. This should appropriately kill Wilmer and its workflows, but an LLM API in the middle of processing a prompt may not be compelled to stop.
  • Added the ability to replace Endpoints and Presets with variables
    • Limited to hardcoded variables at top of workflow, or agentXInputs from parent workflows