feat: [NCS] Integrate NCS support into DiskANN #1418
Changes from all commits
dcff880
8347129
1f54ba4
dbad931
4cd5ce1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,53 @@ | ||
| FROM ubuntu:22.04 | ||
|
|
||
| ENV DEBIAN_FRONTEND=noninteractive | ||
| ENV CMAKE_VERSION="v3.28.5" | ||
| ENV CMAKE_TAR="cmake-3.28.5-linux-x86_64.tar.gz" | ||
| ENV CCACHE_VERSION="v4.9.1" | ||
| ENV CCACHE_DIR="ccache-4.9.1-linux-x86_64" | ||
| ENV CCACHE_TAR="ccache-4.9.1-linux-x86_64.tar.xz" | ||
| ENV BFLOAT16_WHL="bfloat16-1.4.0-cp311-cp311-linux_x86_64.whl" | ||
|
|
||
| RUN apt update \ | ||
| && apt install -y ca-certificates apt-transport-https software-properties-common lsb-release \ | ||
| && apt install -y --no-install-recommends wget curl git make gfortran gcc g++ swig \ | ||
| && apt install -y gcc-12 g++-12 \ | ||
| && apt install -y python3.11 python3.11-dev python3.11-distutils \ | ||
| && apt install -y python3-setuptools \ | ||
| && apt install -y openssh-client \ | ||
| && apt-get install -y --no-install-recommends libpci3 libpci-dev redis-server \ | ||
| && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-11 110 \ | ||
| && update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-12 120 \ | ||
| && update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-11 110 \ | ||
| && update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-12 120 \ | ||
| && update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.10 310 \ | ||
| && update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 311 \ | ||
| && curl -sS https://bootstrap.pypa.io/get-pip.py | python3 \ | ||
| && export PATH=$PATH:$HOME/.local/bin \ | ||
| && pip3 install wheel \ | ||
| && apt remove --purge -y \ | ||
| && rm -rf /var/lib/apt/lists/* | ||
|
|
||
| # install cmake, ccache and bfloat16 | ||
| RUN cd /tmp \ | ||
| && wget https://github.com/Kitware/CMake/releases/download/${CMAKE_VERSION}/${CMAKE_TAR} \ | ||
| && tar --strip-components=1 -xz -C /usr/local -f ${CMAKE_TAR} \ | ||
| && rm -f ${CMAKE_TAR} \ | ||
| && wget https://github.com/ccache/ccache/releases/download/${CCACHE_VERSION}/${CCACHE_TAR} \ | ||
| && tar -xf ${CCACHE_TAR} \ | ||
| && cp ${CCACHE_DIR}/ccache /usr/local/bin \ | ||
| && rm -f ${CCACHE_TAR} \ | ||
| && wget https://github.com/zilliztech/knowhere/releases/download/v2.3.1/${BFLOAT16_WHL} \ | ||
| && pip3 install ${BFLOAT16_WHL} \ | ||
| && rm -f ${BFLOAT16_WHL} | ||
|
|
||
| # install knowhere dependencies | ||
| RUN apt update \ | ||
| && apt install -y libopenblas-openmp-dev libcurl4-openssl-dev libaio-dev libevent-dev lcov \ | ||
| && pip3 install conan==1.61.0 \ | ||
| && conan remote add default-conan-local https://milvus01.jfrog.io/artifactory/api/conan/default-conan-local | ||
|
|
||
| WORKDIR /workspace | ||
|
|
||
| # Default entrypoint is an interactive shell so you can run build steps manually. | ||
| ENTRYPOINT ["/bin/bash"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| #!/usr/bin/env bash | ||
|
|
||
| # Parse arguments | ||
| FULL_MODE="" | ||
|
|
||
| # Parse flags (support both in any order) | ||
| while [[ $# -gt 0 ]]; do | ||
| case "$1" in | ||
| --full) | ||
| if [[ -z "$2" || ("$2" != "Debug" && "$2" != "Release") ]]; then | ||
| echo "Usage: $0 [--full <Debug|Release>]" | ||
| exit 1 | ||
| fi | ||
| FULL_MODE="$2" | ||
| shift 2 | ||
| ;; | ||
| *) | ||
| echo "Usage: $0 [--full <Debug|Release>]" | ||
| exit 1 | ||
| ;; | ||
| esac | ||
| done | ||
|
|
||
| docker build -f Dockerfile.builder -t knowhere-builder:latest . | ||
|
|
||
| # Set entrypoint based on mode | ||
| ENTRYPOINT="/bin/bash" | ||
| if [[ -n "$FULL_MODE" ]]; then | ||
| ENTRYPOINT="/workspace/builder_entrypoint.sh" | ||
| fi | ||
|
|
||
| docker run --rm -it \ | ||
| -v "$(pwd)":/workspace \ | ||
| -v "${HOME}/.conan":/root/.conan \ | ||
| -w /workspace \ | ||
| --entrypoint "$ENTRYPOINT" \ | ||
| knowhere-builder:latest ${FULL_MODE:+"$FULL_MODE"} |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| #!/usr/bin/env bash | ||
|
|
||
| BUILD_TYPE=$1 | ||
|
|
||
| # Trap to ensure we always drop into shell, even on error | ||
| trap 'echo "==> Error occurred! Entering interactive shell for debugging..."; exec /bin/bash' ERR | ||
|
|
||
| set -e | ||
|
|
||
| echo "==> Starting full build with mode: $BUILD_TYPE" | ||
|
|
||
| echo "==> Step 1: Cleaning build directory" | ||
| rm -rf build | ||
|
|
||
| echo "==> Step 2: Creating build directory" | ||
| mkdir build && cd build | ||
|
|
||
| echo "==> Step 3: Running conan install" | ||
| conan install .. --build=missing \ | ||
| -o with_ut=True \ | ||
| -o with_diskann=True \ | ||
| -o with_asan=True \ | ||
| -s compiler.libcxx=libstdc++11 \ | ||
| -s build_type="$BUILD_TYPE" | ||
|
|
||
| echo "==> Step 4: Running conan build" | ||
| conan build .. | ||
| echo "==> Build complete!" | ||
|
|
||
| echo "==> Step 5: Starting Redis server for NCS tests" | ||
| redis-server --daemonize yes | ||
|
|
||
| echo "Entering interactive shell..." | ||
| exec /bin/bash |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -111,6 +111,7 @@ def requirements(self): | |
| self.requires("libcurl/8.2.1") | ||
| self.requires("simde/0.8.2") | ||
| self.requires("xxhash/0.8.3") | ||
| self.requires("hiredis/1.2.0") | ||
|
Collaborator
Where is this used exactly?
Author
Hiredis is a client library for the Redis database. It is used by diskann when the Redis implementation of NCS is enabled. Specifically, Hiredis is used in the
|
||
| if self.settings.os == "Android": | ||
| self.requires("openblas/0.3.27") | ||
| if not self.options.with_light: | ||
|
|
||
Please fix this one appropriately to https://github.com/zilliztech/milvus-common.git or let us know about the proposed change there.
I have three related PRs across three repositories: milvus-common, knowhere, and milvus. The high-level idea behind them is described in the linked GitHub issue.

From the point of view of DiskANN, NCS (Near Compute Storage) is an alternative to a locally attached SSD for storing index data. Specifically, NCS can replace local storage when querying the index; the build process remains unchanged.

The proposed change in milvus-common provides the NCS infrastructure, including:
- Ncs class for NCS bucket management
- NcsConnector class for writing and reading to/from the NCS backend

NCS is used by both knowhere (to access the data) and milvus (the coordinator manages NCS). Thus, we added the NCS infrastructure to milvus-common.
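
The split described above, with one class managing buckets and another reading and writing data, can be sketched roughly as follows. This is only an illustration in Python with an in-memory stand-in for the backend. The class names mirror Ncs and NcsConnector, but every method name and signature here (`create_bucket`, `delete_bucket`, `bucket_exists`, `write`, `read`) and the `InMemoryBackend` type are hypothetical and are not taken from the milvus-common PR.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InMemoryBackend:
    """Hypothetical stand-in for a real NCS backend (for example, Redis)."""

    buckets: dict[str, dict[str, bytes]] = field(default_factory=dict)


class Ncs:
    """Bucket management. Hypothetical API, for illustration only."""

    def __init__(self, backend: InMemoryBackend) -> None:
        self._backend = backend

    def create_bucket(self, name: str) -> None:
        # Creating an existing bucket is a no-op in this sketch.
        self._backend.buckets.setdefault(name, {})

    def delete_bucket(self, name: str) -> None:
        self._backend.buckets.pop(name, None)

    def bucket_exists(self, name: str) -> bool:
        return name in self._backend.buckets


class NcsConnector:
    """Reads and writes inside one bucket. Hypothetical API."""

    def __init__(self, backend: InMemoryBackend, bucket: str) -> None:
        if bucket not in backend.buckets:
            raise KeyError(f"bucket {bucket!r} does not exist")
        self._data = backend.buckets[bucket]

    def write(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def read(self, key: str) -> bytes:
        return self._data[key]


# The coordinator side (milvus) would manage buckets ...
backend = InMemoryBackend()
ncs = Ncs(backend)
ncs.create_bucket("index-42")

# ... while the query side (knowhere) would only read and write data.
connector = NcsConnector(backend, "index-42")
connector.write("node/0", b"\x01\x02")
payload = connector.read("node/0")
```

Keeping bucket lifecycle separate from data access matches the stated reason for placing the code in milvus-common: both consumers can share one library while each uses only the part it needs.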