Embedding lambda sends updateEmbedding message to Thrall #4631
Merged
ellenmuller merged 59 commits into main on Mar 11, 2026
joelochlann
reviewed
Feb 19, 2026
Not related to this PR, but for reference while we have it: this should count the vectors in the vector store in TEST:
aws s3vectors list-vectors \
  --vector-bucket-name image-embeddings-test \
  --index-name cohere-embed-english-v3 \
  --profile media-service \
  --region eu-central-1 \
  --page-size 1000 | jq '.vectors | length'
9aacaa9 to 03517c3
Seen on auth, image-loader, metadata-editor, leases, cropper, collections, media-api, kahuna (merged by @ellenmuller 9 minutes and 49 seconds ago) Please check your changes!
Seen on usage (merged by @ellenmuller 9 minutes and 56 seconds ago) Please check your changes!
Seen on thrall (merged by @ellenmuller 10 minutes and 1 second ago) Please check your changes!
What does this change?
Completes the pipeline for persisting image embeddings into Elasticsearch by having the image-embedder lambda send embedding data to Thrall via its Kinesis stream, where Thrall writes it to ES using the existing migrationAwareUpdater pattern.
Previously, the image-embedder lambda generated embeddings and stored them in the S3 Vector Store, but they were not indexed in Elasticsearch. This PR closes that gap: embeddings are now written to both.
How it works
The image-embedder lambda creates an UpdateEmbeddingMessage and publishes it to the Thrall Kinesis stream using PutRecords. Failed Kinesis publishes are reported back as batchItemFailures so SQS can retry them.
In Thrall, the UpdateEmbeddingMessage case is handled, delegating to ElasticSearch.updateEmbedding.
updateEmbedding uses migrationAwareUpdater with a Painless script to set ctx._source.embedding on the image document.
Kinesis publishing is currently gated off on PROD: embeddings are only written to Thrall on non-PROD stages while testing is in progress.
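To illustrate the retry path described above, here is a minimal TypeScript sketch of turning a PutRecords result into an SQS partial batch response. It is not the code from this PR: the helper name toBatchItemFailures and the messageIds array are hypothetical, and the interfaces are local stand-ins for the relevant fields of the AWS response shapes. It relies on the documented behaviour that PutRecords returns one result entry per request record, in request order, with ErrorCode set on entries that failed.

```typescript
// Local stand-in for the per-record entries in a Kinesis PutRecords response.
interface PutRecordsResultEntry {
  SequenceNumber?: string;
  ShardId?: string;
  ErrorCode?: string; // present only when this record failed
  ErrorMessage?: string;
}

// Shape Lambda expects when partial batch responses are enabled for SQS.
interface SqsBatchResponse {
  batchItemFailures: { itemIdentifier: string }[];
}

// Hypothetical helper: messageIds[i] is the SQS message whose embedding
// became the i-th record in the PutRecords request.
function toBatchItemFailures(
  messageIds: string[],
  results: PutRecordsResultEntry[],
): SqsBatchResponse {
  if (messageIds.length !== results.length) {
    throw new Error("PutRecords results must align one-to-one with request records");
  }
  const batchItemFailures: { itemIdentifier: string }[] = [];
  results.forEach((result, i) => {
    if (result.ErrorCode !== undefined) {
      // Only failed records are returned to SQS for redelivery.
      batchItemFailures.push({ itemIdentifier: messageIds[i] });
    }
  });
  return { batchItemFailures };
}

const response = toBatchItemFailures(
  ["msg-1", "msg-2", "msg-3"],
  [
    { SequenceNumber: "1", ShardId: "shardId-000000000000" },
    { ErrorCode: "ProvisionedThroughputExceededException", ErrorMessage: "Rate exceeded" },
    { SequenceNumber: "2", ShardId: "shardId-000000000000" },
  ],
);
console.log(JSON.stringify(response));
// prints {"batchItemFailures":[{"itemIdentifier":"msg-2"}]}
```

Returning an empty batchItemFailures array tells SQS the whole batch succeeded, so only messages whose Kinesis publish failed are retried.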
We will monitor TEST after this is merged and start writing embeddings to ES on PROD once we are confident that all is well!
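For reviewers less familiar with scripted updates, the Painless step in updateEmbedding could correspond to an Elasticsearch update body along these lines. This is a sketch in TypeScript for illustration only: Thrall's actual implementation goes through migrationAwareUpdater, and the function name buildEmbeddingUpdate, the flat number[] embedding and the exact script source are assumptions rather than code from this PR.

```typescript
// Body of an Elasticsearch scripted update request (only the fields used here).
interface ScriptedUpdateBody {
  script: {
    lang: "painless";
    source: string;
    params: { embedding: number[] };
  };
}

// Hypothetical builder: passes the vector via params instead of inlining it
// into the script source, so the script text is identical for every image.
function buildEmbeddingUpdate(embedding: number[]): ScriptedUpdateBody {
  return {
    script: {
      lang: "painless",
      // Assumed script: overwrite the embedding field on the image document.
      source: "ctx._source.embedding = params.embedding",
      params: { embedding },
    },
  };
}

const update = buildEmbeddingUpdate([0.12, -0.03, 0.5]);
console.log(JSON.stringify(update, null, 2));
```

Keeping the vector in params rather than in the script source means Elasticsearch can compile the script once and reuse it across updates.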
How should a reviewer test this change?
How can success be measured?
Who should look at this?
Tested? Documented?