Conversation

@dtrawins dtrawins commented Dec 22, 2025

🛠 Summary

#3844

  • Fix convert_tokenizer configuration
  • Use only non-restricted test models
  • Pin the version of optimum-intel

🧪 Checklist

  • Unit tests added.
  • Documentation updated.
  • Change follows security best practices.
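One of the items above amounts to dependency pinning. The requirements entry would take roughly this shape; the commit hash is elided here because the exact pin lives in the PR diff, so treat the line as illustrative, not authoritative:

```
# Pin optimum-intel to a specific commit rather than a floating branch.
# <commit-sha> is a placeholder; the real hash is in the PR's requirements.txt.
optimum-intel @ git+https://github.com/huggingface/optimum-intel.git@<commit-sha>
```

Pinning to a commit rather than a version range keeps the model-export toolchain reproducible across CI runs.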

RH-steve-grubb added a commit to RH-steve-grubb/openvino_model_server that referenced this pull request Dec 22, 2025
Fix convert_tokenizer configuration
@mzegla mzegla requested a review from Copilot January 7, 2026 09:49

Copilot AI left a comment


Pull request overview

This PR updates the build and test configuration for RedHat images to fix tokenizer handling and switch to non-restricted test models. The changes address issue #3844 by fixing the convert_tokenizer configuration, updating to a specific version of optimum-intel, and replacing restricted HuggingFace models with publicly accessible alternatives.

Key changes:

  • Switched from facebook/opt-125m to HuggingFaceTB/SmolLM2-360M-Instruct as the primary LLM test model
  • Changed meta-llama/Llama-3.1-8B-Instruct to unsloth/Llama-3.1-8B-Instruct for tokenizer testing
  • Fixed openvino_tokenizers installation process in RedHat Dockerfile to build from source and install Python bindings correctly

Reviewed changes

Copilot reviewed 14 out of 15 changed files in this pull request and generated 3 comments.

Summary per file:

  • windows_prepare_llm_models.bat: added a new LLM_MODEL variable and updated LLAMA3_MODEL to use non-restricted model sources
  • src/test/pull_gguf_hf_model_test.cpp: increased the timeout and removed the modelscope.cn test case
  • src/test/llm/output_parsers/llama3_output_parser_test.cpp: updated tokenizer paths to use the unsloth Llama model instead of meta-llama
  • src/test/llm/lm_legacy_regular.pbtxt: changed models_path to the new SmolLM2 model
  • src/test/llm/lm_cb_regular.pbtxt: changed models_path to the new SmolLM2 model
  • src/test/llm/llmnode_test.cpp: updated test model paths and adjusted token limits for the new model's max length (8192)
  • run_unit_tests.sh: added proxy configuration support for bazel tests
  • prepare_llm_models.sh: updated model variables, pinned the optimum-intel version, and added facebook/opt-125m as a separate download
  • docs/deploying_server_baremetal.md: updated release version references from v2025.4 to v2025.4.1
  • demos/python_demos/requirements.txt: pinned optimum-intel to a specific commit and updated the transformers version constraint
  • demos/common/export_models/requirements.txt: updated the optimum-intel commit and openvino versions to 2025.4.1
  • create_package.sh: fixed convert_tokenizer script generation and updated version metadata
  • ci/build_test_OnCommit.groovy: enabled tests for the RedHat release image build (RUN_TESTS=1)
  • Dockerfile.redhat: refactored openvino_tokenizers installation to always build from source with proper Python bindings
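The Dockerfile.redhat change (building openvino_tokenizers from source together with its Python bindings) could be sketched roughly as follows; the repository URL is real, but the path and the exact steps used by the PR are assumptions:

```
# Illustrative sketch only, not the PR's actual Dockerfile.redhat content.
# Build openvino_tokenizers from source and install the Python bindings.
RUN git clone https://github.com/openvinotoolkit/openvino_tokenizers.git /opt/openvino_tokenizers && \
    cd /opt/openvino_tokenizers && \
    pip install --no-deps .
```

Installing from the source tree via pip drives the native build and places the resulting libraries alongside the Python package, which is one way to keep the native libraries and Python bindings in sync.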


Comment on lines +1487 to +1488
// creating prompt that will be tokenized to 8194 tokens when model max length is 8192.
for (int i = 0; i < 8192 - 29; i++) {

Copilot AI Jan 7, 2026


The comment states '8194 tokens', but the loop runs 8192 - 29 = 8163 iterations. For the total to exceed the stated max length of 8192, the chat template would have to contribute the remaining tokens. Clarify whether the total including chat-template tokens equals 8194.
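The arithmetic behind this review comment can be checked directly. Note two assumptions in the sketch below: each loop iteration is taken to add exactly one token, and the chat-template overhead of 31 tokens is chosen purely so the comment's stated total works out; neither value comes from the PR:

```python
# Token accounting for the reviewed loop in llmnode_test.cpp.
loop_tokens = 8192 - 29      # loop bound: 8163 iterations, assumed 1 token each
template_overhead = 31       # ASSUMPTION: tokens added by the chat template
total = loop_tokens + template_overhead

print(loop_tokens)  # 8163
print(total)        # 8194, which exceeds the model max length of 8192
```

If the template overhead differs, the total shifts accordingly, which is exactly why the review asks the author to clarify how 8194 is reached.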


mzegla commented Jan 7, 2026

This PR will likely need the same adjustments that were done for: #3890

@dtrawins dtrawins requested a review from mzegla January 7, 2026 12:45
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@dtrawins dtrawins merged commit 83bc7bb into releases/2025/4 Jan 7, 2026
1 check passed