Join the Embedbase Discord server!
- Embeddings
  - OpenAI embeddings
  - Cohere embeddings
  - Google PaLM embeddings
  - local (BERT, LLaMA, Vicuna, etc.)
- Vector database
  - Supabase
  - Postgres
  - Qdrant
  - local (memory, SQLite, etc.)
- FastAPI
- Authentication (optional)
We have a growing task list of issues. Find an issue that appeals to you and make a comment that you'd like to work on it. Include in your comment a brief description of how you'll solve the problem and if there are any open questions you want to discuss.
If the issue is currently unclear but you are interested, please post in Discord and someone can help clarify the issue in more detail.
We write the documentation using Nextra in the docs folder of https://github.com/different-ai/embedbase. It is automatically indexed in Embedbase Cloud whenever it changes, and it provides a GPT-4 Q&A interface.
We're all working on different parts of Embedbase together. To make contributions smoothly we recommend the following:
- Fork this project repository and clone it to your local machine (read more in About Forks), or use gitpod.io.
- Before working on any changes, try to sync the forked repository to keep it up-to-date with the upstream repository.
- On a new branch in your fork (aka a "feature branch" and not `main`), work on a small focused change that only touches a few files.
- Package up a small bit of work that solves part of the problem into a Pull Request and send it out for review.
- If you're lucky, we can merge your change into `main` without any problems. If there are changes to files you're working on, resolve them by:
  - First try to rebase as suggested in these instructions.
  - If rebasing feels too painful, merge as suggested in these instructions.
- Once you've resolved conflicts (if any), finish the review and squash and merge your PR (when squashing, try to clean up or update the individual commit messages into one sensible message).
- Merge in your change and move on to a new issue or the second step of your current issue.
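The sync, branch, and rebase steps above can be sketched as a small shell script. This is a minimal sketch, not the project's official tooling: the remote name `upstream`, the helper names `run`, `sync_and_branch`, and `rebase_on_upstream`, and the `DRY_RUN` and `BRANCH` variables are all assumptions made for illustration, and the rebase instructions linked above may recommend different commands. By default the script only prints the git commands so you can review them first.

```shell
#!/bin/sh
# Sketch only: prints the git commands by default.
# Set DRY_RUN=0 to actually run them inside a clone of your fork.
set -eu

UPSTREAM_URL="https://github.com/different-ai/embedbase.git"
BRANCH="${BRANCH:-my-example-feature-branch}"  # placeholder branch name
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Bring the fork's main up to date, then start a feature branch from it.
sync_and_branch() {
  run git remote add upstream "$UPSTREAM_URL"  # "upstream" is an assumed remote name
  run git fetch upstream
  run git checkout main
  run git merge --ff-only upstream/main        # fast-forward only; fails if main has diverged
  run git checkout -b "$BRANCH"
}

# Replay the feature branch on top of the latest upstream main.
rebase_on_upstream() {
  run git fetch upstream
  run git rebase upstream/main
}

sync_and_branch
```

Call `rebase_on_upstream` later from the feature branch when `main` has moved on. If the `upstream` remote already exists in your clone, `git remote add` will fail when run for real, so drop that line in that case.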
Additionally, if someone is working on an issue that interests you, ask if they need help on it or would like suggestions on how to approach the issue. If so, share wildly. If they seem to have a good handle on it, let them work on their solution until a challenge comes up.
- At any point you can compare your feature branch to the upstream/main of `different-ai/embedbase` by using a URL like this: https://github.com/different-ai/embedbase/compare/main...bobm4894:embedbase:my-example-feature-branch. Just replace `bobm4894` with your own GitHub user name and `my-example-feature-branch` with whatever you called the feature branch you are working on, so something like `https://github.com/different-ai/embedbase/compare/main...<your_github_username>:embedbase:<your_branch_name>`. This will show the changes that would appear in a PR, so you can check this to make sure only the files you have changed or added will be part of the PR.
- Try not to work on the `main` branch in your fork; ideally you can keep this as just an updated copy of `main` from `different-ai/embedbase`.
- If your feature branch gets messed up, just update the `main` branch in your fork and create a fresh new clean "feature branch" where you can add your changes one by one in separate commits or all as a single commit.
- When working on GitHub Actions, you can test locally using act like so: `act -W .github/workflows/ci_core.yml --container-architecture linux/amd64` (`--container-architecture` is necessary if you use a Mac M series).
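The compare URL from the first tip can be generated rather than typed by hand. The helper below is a small sketch; `compare_url` is a hypothetical function name, and it assumes your fork kept the repository name `embedbase`, as in the example URL above.

```shell
#!/bin/sh
# Sketch: print the GitHub compare URL for a feature branch in your fork.
# compare_url is a hypothetical helper, not part of the Embedbase repository.
compare_url() {
  user="$1"    # your GitHub user name
  branch="$2"  # your feature branch name
  printf 'https://github.com/different-ai/embedbase/compare/main...%s:embedbase:%s\n' "$user" "$branch"
}

# Example with the placeholder values used in the tip above.
compare_url bobm4894 my-example-feature-branch
```

Open the printed URL in a browser to check that only the files you changed or added would be part of the PR.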
A review finishes when all blocking comments are addressed and at least one owning reviewer has approved the PR. Be sure to acknowledge any non-blocking comments either by making the requested change, explaining why it's not being addressed now, or filing an issue to handle it later.
- Bump the version in `pyproject.toml`
- `make release`

For the Python SDK, `cd sdk/embedbase-py`, then:
- Bump the version in `pyproject.toml`
- `make release`

For the JavaScript SDK, just push to main. We use semantic-release (https://semantic-release.gitbook.io/semantic-release/) under the hood to release automatically whenever a change is made to the main branch.

`cd hosted`, then run `make release`.

Just push
Just push
FYI, the documentation GPT-4 extension uses a dataset that is automatically synced with all Embedbase content (GitHub, Discord, etc.).