feat: retrieve embeddings from database only when necessary #119

Open
Fangyi Zhou (fangyi-zhou) wants to merge 2 commits into langchain-ai:main from fangyi-zhou:select-columns-to-query

Conversation

@fangyi-zhou

When performing a similarity search without maximal marginal relevance, the database query includes the embeddings by default, even though the retrieved embeddings are then discarded unused.

This can be very inefficient when retrieving a large number of documents, because of the communication overhead.
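
The idea can be illustrated with a minimal sketch. It is not the code in this PR: it uses the standard-library `sqlite3` module in place of the project's database layer, and the `items` table, its columns, and the `build_query` and `fetch` helpers are all hypothetical names chosen for illustration. The embedding column is added to the `SELECT` list only when the caller asks for it, as a maximal marginal relevance search would.

```python
import sqlite3


def build_query(include_embedding: bool) -> str:
    """Build the SELECT statement, adding the embedding column only on request."""
    columns = ["id", "document"]
    if include_embedding:
        columns.append("embedding")
    # ORDER BY id is a stand-in for ordering by vector distance, which a real
    # vector store would compute inside the database.
    return f"SELECT {', '.join(columns)} FROM items ORDER BY id LIMIT ?"


def fetch(conn: sqlite3.Connection, k: int, include_embedding: bool = False):
    """Return up to k rows, with or without the embedding payload."""
    return conn.execute(build_query(include_embedding), (k,)).fetchall()


# Hypothetical in-memory table with a large binary column per row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE items (id INTEGER PRIMARY KEY, document TEXT, embedding BLOB)"
)
conn.executemany(
    "INSERT INTO items (document, embedding) VALUES (?, ?)",
    [("doc a", bytes(6144)), ("doc b", bytes(6144))],
)

plain_rows = fetch(conn, 2)                        # plain similarity search
mmr_rows = fetch(conn, 2, include_embedding=True)  # MMR needs the vectors
```

In this sketch each row of `plain_rows` carries two fields and each row of `mmr_rows` carries three, so the large binary column is transferred only when it will actually be used.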

Fangyi Zhou (fangyi-zhou) changed the title from "feat: retrieve embeddings only from database when necessary" to "feat: retrieve embeddings from database only when necessary" on Sep 18, 2024
@fangyi-zhou
Author

Fangyi Zhou (fangyi-zhou) commented Sep 23, 2024

Hello, can I get a review of this PR? Eugene Yurtsev (@eyurtsev)

@eyurtsev
Collaborator

Looks reasonable. Could you add unit tests?

@fangyi-zhou
Author

I'm not sure how to add a unit test for this performance patch. Any ideas?
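
One possible approach, sketched here under assumptions rather than taken from the thread, is to test the behaviour instead of the timing: record the SQL statements a search issues and assert that the embedding column is selected only when it is requested. The sketch below uses the standard-library `sqlite3` module and its `Connection.set_trace_callback` hook as a stand-in for the project's database layer. The `items` table, `fetch_documents`, and `statements_issued` are hypothetical names.

```python
import sqlite3


def fetch_documents(conn, k, include_embedding=False):
    """Hypothetical search helper: selects the embedding column only on request."""
    columns = "id, document, embedding" if include_embedding else "id, document"
    return conn.execute(f"SELECT {columns} FROM items LIMIT ?", (k,)).fetchall()


def statements_issued(conn, action):
    """Run action() and return the SQL statements SQLite executed meanwhile."""
    seen = []
    conn.set_trace_callback(seen.append)
    try:
        action()
    finally:
        conn.set_trace_callback(None)  # stop tracing
    return seen


def test_embedding_only_selected_when_requested():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (id INTEGER PRIMARY KEY, document TEXT, embedding BLOB)"
    )
    conn.execute(
        "INSERT INTO items (document, embedding) VALUES (?, ?)",
        ("doc a", bytes(16)),
    )
    conn.commit()

    plain = statements_issued(conn, lambda: fetch_documents(conn, 1))
    with_vectors = statements_issued(
        conn, lambda: fetch_documents(conn, 1, include_embedding=True)
    )

    # The plain search must not mention the embedding column at all.
    assert not any("embedding" in sql for sql in plain)
    # The search that needs vectors must still select it.
    assert any("embedding" in sql for sql in with_vectors)
```

A test of this shape checks which columns are queried, so it needs no benchmark and does not depend on how fast the machine running it is.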
