Feature/build embeddings#180

Merged
kuraisle merged 43 commits into main from feature/build-embeddings on Mar 3, 2026

Conversation

@kuraisle
Member

@kuraisle kuraisle commented Jan 7, 2026

✨ Feature

PR Description

Previously, if you didn't have a parquet file of embeddings and wanted some in OMOP with PGVector, there was nothing you could do about it. This code adds the ability to read concepts from Athena vocabulary CSVs or from postgres, create embeddings from some string representation of the concepts, and then either load them into the db or write them to a parquet file.
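
As a rough illustration of the flow described above (not the code in this PR), the sketch below reads concepts in batches from a tab-delimited, Athena-style `CONCEPT.csv`, turns each row into a string, embeds it, and yields `(concept_id, vector)` pairs that could then be loaded into the db or written to parquet. The function names, the use of `concept_name` as the string representation, the sample rows, and the hash-based `toy_embed` stand-in for a real embedding model are all assumptions made for the example.

```python
import csv
import hashlib
import io
from itertools import islice
from typing import Iterator

# Hypothetical sample in the tab-delimited layout used by Athena vocabulary files.
SAMPLE_CONCEPTS = "\n".join([
    "\t".join(["concept_id", "concept_name", "domain_id"]),
    "\t".join(["1", "Aspirin", "Drug"]),
    "\t".join(["2", "Ibuprofen", "Drug"]),
    "\t".join(["3", "Paracetamol", "Drug"]),
]) + "\n"


def read_concept_batches(handle, batch_size: int) -> Iterator[list[dict]]:
    """Yield lists of at most batch_size rows from a tab-delimited concept file."""
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    while True:
        batch = list(islice(reader, batch_size))
        if not batch:
            return
        yield batch


def concept_to_text(row: dict) -> str:
    """Assumed string representation: just the concept name."""
    return row["concept_name"]


def toy_embed(texts: list[str], dim: int = 8) -> list[list[float]]:
    """Placeholder for a real embedding model: deterministic hash-based vectors."""
    if not 1 <= dim <= 32:
        raise ValueError("dim must be between 1 and 32 for this toy embedder")
    vectors = []
    for text in texts:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        vectors.append([byte / 255.0 for byte in digest[:dim]])
    return vectors


def build_embeddings(handle, batch_size: int = 2, dim: int = 8):
    """Yield (concept_id, vector) pairs, embedding one batch at a time."""
    for batch in read_concept_batches(handle, batch_size):
        texts = [concept_to_text(row) for row in batch]
        for row, vector in zip(batch, toy_embed(texts, dim)):
            yield int(row["concept_id"]), vector


if __name__ == "__main__":
    for concept_id, vector in build_embeddings(io.StringIO(SAMPLE_CONCEPTS)):
        print(concept_id, len(vector))
```

Processing in batches keeps memory bounded for large vocabularies, and the generator leaves the choice of sink (database insert or parquet writer) to the caller.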

I've meant to do this for a while. Getting it done was prompted by someone wanting to run lettuce and getting stuck because they couldn't make embeddings.

Related Issues or other material

Related #179
Closes #179

Screenshots, example outputs/behaviour etc.

✅ Added/updated tests?

  • This PR contains relevant tests / Or doesn't need to per the below explanation

Collaborator

@CodeByKarthik CodeByKarthik left a comment

Hi James, thanks for your commit. The PR doesn't have any issues, and I have left a few comments. Let me know your thoughts on them.

Comment thread build-embeddings/embedding_utils/fetch_concept_batches.py Outdated
Comment thread build-embeddings/embedding_utils/fetch_concept_batches.py Outdated
Comment thread build-embeddings/embedding_utils/fetch_concept_batches.py Outdated
Comment thread build-embeddings/embedding_utils/fetch_concept_batches.py
Collaborator

@CodeByKarthik CodeByKarthik left a comment

Looks good to me, thanks James.

@kuraisle kuraisle merged commit a18f7cb into main Mar 3, 2026
6 checks passed
@kuraisle kuraisle deleted the feature/build-embeddings branch March 3, 2026 10:46
Development

Successfully merging this pull request may close these issues.

Pipeline for embeddings creation

2 participants