Add multimodal embedding support for VertexAI (Phase 1) by Ndunge-Makau · Pull Request #590 · crmne/ruby_llm

Ndunge-Makau · 2026-01-30T13:30:32Z

What this does

This PR adds support for multimodal embeddings in RubyLLM, enabling embedding images and videos alongside text for use cases like semantic image search, video content analysis, and cross-modal retrieval.

Changes:

Added Vertex AI's multimodalembedding model to the list of known models
Implemented multimodal embedding support for Vertex AI provider
Updated models.json (auto-generated from rake task)
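
For context on what the provider has to send, here is a rough sketch of a request body in the shape Vertex AI documents for the multimodalembedding `:predict` endpoint. It is not the code from this PR: `multimodal_payload` and its keyword arguments are hypothetical names used only for illustration, and the field names (`instances`, `bytesBase64Encoded`, `parameters.dimension`) are assumed from the public Vertex AI documentation.

```ruby
require "base64"
require "json"

# Hypothetical helper (not from this PR): builds a request body in the shape
# assumed for Vertex AI's multimodalembedding :predict endpoint.
def multimodal_payload(text: nil, image_bytes: nil, video_bytes: nil, dimension: nil)
  instance = {}
  instance[:text] = text if text
  # Media is sent inline as Base64; the API also accepts Cloud Storage URIs.
  instance[:image] = { bytesBase64Encoded: Base64.strict_encode64(image_bytes) } if image_bytes
  instance[:video] = { bytesBase64Encoded: Base64.strict_encode64(video_bytes) } if video_bytes
  raise ArgumentError, "provide text, an image, or a video" if instance.empty?

  payload = { instances: [instance] }
  # Only include parameters when a dimension was requested.
  payload[:parameters] = { dimension: dimension } if dimension
  payload
end

payload = multimodal_payload(
  text: "Ruby is elegant and expressive",
  image_bytes: "fake image bytes", # placeholder, not a real image
  dimension: 512
)
puts JSON.pretty_generate(payload)
```

Keeping text optional mirrors the "Enable embeddings with and without text" commit in this PR, though the real implementation may structure this differently.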

Example usage:

# Create embeddings
RubyLLM.embed "Ruby is elegant and expressive"

# Embedding: #<RubyLLM::Embedding:0x00... >
# Embedding.vectors: {text: [0.214382708, -0.126103446, ... ]}


# Create multimodal embeddings (with supported models)
RubyLLM.embed "Ruby is elegant and expressive", with: ["image.png", "video.mp4"], model: "multimodalembedding"

# Embedding: #<RubyLLM::Embedding:0x00... >
# Embedding.vectors: {text: [-0.00527974777, ...], image: [0.0258393418, ... ], video: [{"endOffsetSec" => 0, "startOffsetSec" => 16, "embedding" => [-0.0250446387, 0.0323432237, ...]}, ... ]}
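
To illustrate the cross-modal retrieval use case, the sketch below compares a text vector against an image vector and picks the closest video segment using cosine similarity. It assumes the `vectors` hash has the shape shown in the example output above (`:text`, `:image`, and `:video` segments with an `"embedding"` key). The numbers are toy values rather than real model output, and `cosine_similarity` is a helper written for this example, not part of RubyLLM.

```ruby
# Cosine similarity between two equal-length numeric arrays.
def cosine_similarity(a, b)
  raise ArgumentError, "vectors must have the same length" unless a.length == b.length

  dot = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })
  return 0.0 if norm_a.zero? || norm_b.zero?

  dot / (norm_a * norm_b)
end

# Toy stand-in for the vectors hash; real embeddings have far more dimensions.
vectors = {
  text: [1.0, 0.0, 0.0],
  image: [0.8, 0.6, 0.0],
  video: [
    { "startOffsetSec" => 0, "endOffsetSec" => 16, "embedding" => [0.0, 1.0, 0.0] },
    { "startOffsetSec" => 16, "endOffsetSec" => 32, "embedding" => [0.6, 0.0, 0.8] }
  ]
}

# How close is the text to the image?
text_image_score = cosine_similarity(vectors[:text], vectors[:image])

# Which video segment is closest to the text?
best_segment = vectors[:video].max_by do |segment|
  cosine_similarity(vectors[:text], segment["embedding"])
end

puts format("text vs image similarity: %.3f", text_image_score)
puts "best video segment starts at #{best_segment['startOffsetSec']}s"
```

Because text, image, and video vectors share one embedding space in this model family, the same similarity function can be used across modalities.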

Type of change

Scope check

I read the Contributing Guide
This aligns with RubyLLM's focus on LLM communication
This isn't application-specific logic that belongs in user code
This benefits most users, not just my specific use case

Required for new features

I opened an issue before writing code and received maintainer approval
Linked issue: #529

PRs for new features or enhancements without a prior approved issue will be closed.

Quality check

I ran overcommit --install and all hooks pass
I tested my changes thoroughly
- For provider changes: Re-recorded VCR cassettes with bundle exec rake vcr:record[provider_name]
- All tests pass: bundle exec rspec
I updated documentation if needed
I didn't modify auto-generated files manually (models.json, aliases.json)

AI-generated code

I used AI tools to help write this code
I have reviewed and understand all generated code (required if above is checked)

API changes

Breaking change
New public methods/classes
Changed method signatures
No API changes

kaka-ruto

LGTM. Well done @Ndunge-Makau

kaka-ruto · 2026-02-02T13:59:08Z

lib/ruby_llm/providers/vertexai/models.rb

+              modalities: Capabilities.modalities_for(model_id),
+              capabilities: Capabilities.capabilities_for(model_id),
+              pricing: Capabilities.pricing_for(model_id),
+              metadata: Capabilities.determine_metadata(model_id)


Good job here

crmne

Thanks for the contribution.

My preference is multimodal embeddings for all providers that support them, not only Vertex AI. The issue/PR scope sounded provider-wide, which is why I approved it.

If you want to ship Vertex-only, please make that explicit in both issue and PR (title + description) as "phase 1: Vertex AI only," and add follow-up issues for other providers.

Please confirm direction before I do line-by-line review:

multi-provider in this PR, or
explicit Vertex-only phase 1.

Also please fix:

potential regression in Vertex text embeddings
unrelated models.json churn

Ndunge-Makau · 2026-03-04T13:05:55Z

Hi @crmne,

Thanks for the feedback. I'd like to confirm that this PR is explicitly for Vertex AI only as phase 1, and I'll create follow-up issues for other providers.

I've also fixed the regression in Vertex text embeddings. Regarding the models.json changes: these were auto-generated by the rake task after I added Vertex's multimodalembedding model to the list of known models. Let me know if you'd still like the previous version restored.

Ndunge-Makau and others added 21 commits November 27, 2025 19:06

Add support for multimodal embedding

46704cf

Remove brackets

1df5fe5

Enable embeddings with and without text

ffe5c2a

Clean up comments

8ce3654

Fix dimensions parameter for VertexAI embeddings

da901df

Merge pull request #1 from Ndunge-Makau/multimodal-embeddings

ae5925c

Multimodal embeddings

Merge branch 'crmne:main' into main

7f2902c

Fix dimension argument to allow embedding of any size

9297de7

Merge pull request #2 from Ndunge-Makau/Fix-dimension-bug

4cf79a1

Fix dimension argument to allow embedding of any size

Merge branch 'crmne:main' into main

71ecbff

Merge branch 'crmne:main' into main

4c58f83

Merge branch 'crmne:main' into main

4406bd3

Merge branch 'crmne:main' into main

4ef2e6a

Merge branch 'crmne:main' into main

37a2da2

Add the multimodalembedding model by VertexAI

979b176

Merge pull request #3 from Ndunge-Makau/add-multimodalembedding-model

42f2052

Add the multimodalembedding model by VertexAI

Add with parameter to hold all media arguments

62a4dd7

Fix image and video input bug

8caa042

Merge pull request #4 from Ndunge-Makau/add-with-parameter-to-multimo…

a3b9b11

…dal-embeddings Add with parameter to multimodal embeddings

Edit README.md

73f2e5e

Merge pull request #5 from Ndunge-Makau/add-with-parameter-to-multimo…

9cb2957

…dal-embeddings Edit README.md

Ndunge-Makau mentioned this pull request Jan 30, 2026

[FEATURE] Add multimodal embedding support (image and video) #529

Open

6 tasks

kaka-ruto reviewed Feb 2, 2026

View reviewed changes

crmne linked an issue Feb 28, 2026 that may be closed by this pull request

[FEATURE] Add multimodal embedding support (image and video) #529

Open

6 tasks

crmne requested changes Mar 1, 2026

View reviewed changes

Fix potential regression issue

ef4c6a2

Ndunge-Makau changed the title ~~Add multimodal embedding support~~ Add multimodal embedding support for VertexAI (Phase 1) Mar 4, 2026

Ndunge-Makau requested a review from crmne March 4, 2026 13:06

Ndunge-Makau wants to merge 22 commits into crmne:main from Ndunge-Makau:add-multimodal-embedding-support

3 participants
