Automating the generation of subgraph documentation using Large Language Models.
We utilize poetry for dependency management. Please run poetry install to install the dependencies from within the relevant directory (e.g. graphdoc or mlflow-manager). You can also run poetry shell to activate the virtual environment. Please see the poetry documentation for more information.
We utilize commitizen for commit messages and semantic versioning. Please run cz commit to commit your changes. Commitizen can be installed with pip install commitizen or brew install commitizen.
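As a rough illustration of what cz commit produces, the sketch below checks a message against the Conventional Commits shape (type, optional scope, colon, subject). It assumes commitizen is running with its default rule set; the helper name is_conventional_commit is hypothetical and the type list is an approximation, so defer to your actual commitizen configuration.

```shell
# Hypothetical helper, not part of this repository.
# Assumes commitizen's default Conventional Commits style: type(scope): subject
is_conventional_commit() {
  # Only the first line (the commit header) is checked.
  printf '%s\n' "$1" | head -n 1 | grep -Eq \
    '^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\([^)]+\))?!?: .+'
}
```

For example, is_conventional_commit "feat(parser): add schema loader" succeeds, while a free-form message such as "updated stuff" fails.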
We utilize docker for managing the tracking of our service and associated experiments through mlflow. In our docker image, we spin up mlflow, postgres, and minio instances. This closely mirrors our production setup and allows for a smooth development flow between local and prod. Please ensure that docker is installed and running in the background on your machine.
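Before starting the services, it can help to confirm that docker is installed and that its daemon is reachable. The following is a minimal sketch; the function names have_cmd and check_docker are hypothetical and not part of this repository. It relies only on command -v and docker info.

```shell
# Hypothetical preflight helpers, not part of this repository.

# Return 0 if the named command is available on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Report whether the docker CLI is installed and the daemon is reachable.
check_docker() {
  if ! have_cmd docker; then
    echo "docker is not installed" >&2
    return 1
  fi
  # docker info exits non-zero when the daemon cannot be reached.
  if ! docker info >/dev/null 2>&1; then
    echo "docker is installed but the daemon is not running" >&2
    return 2
  fi
  echo "docker is running"
}
```

Running check_docker before any docker-dependent step gives an early, readable failure instead of an error partway through setup.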
Here are some quick commands for getting started:
brew install poetry
brew install commitizen
cd graphdoc
poetry install
cd ../mlflow-manager
poetry install
There are two .env files that we expect the user to set up. They are divided between mlflow-manager and graphdoc. First, let's set up the mlflow-manager .env file. You can leave these values as they are, or modify them as you see fit:
# navigate to the docker root
cd mlflow-manager
cd docker
# copy the .env.example for setup
cp .env.example .env
# set values directly in your newly created .env file
Next, let's set up the .env file to be used by our graphdoc program.
# navigate to the graphdoc root
cd ../..
# copy the .env.example for setup
cp .env.example .env
# set values directly in your newly created .env file
The run.sh script is a convenience script for development. It provides a few shortcuts for running useful commands.
# make sure you are in the root of the repository
# ensure that the script is executable
chmod +x run.sh
# install both packages in development mode
./run.sh dev
To set up the mlflow-manager services, run the following command:
# default username: admin
# default password: password
./run.sh mlflow-setup
Below, we provide an overview of the commands available in the run.sh script.
| Command | Description |
|---|---|
| ./run.sh dev | Install both packages (graphdoc and mlflow-manager) in development mode |
| ./run.sh install | Install both packages (graphdoc and mlflow-manager) in production mode |
| ./run.sh mlflow-setup | Set up the mlflow-manager services |
| ./run.sh mlflow-teardown | Tear down the mlflow-manager services |
| ./run.sh doc-quality-train | Train a document quality model (using the default config in graphdoc/assets/configs/single_prompt_doc_quality_trainer.yaml) |
| ./run.sh build-and-run-doc-quality-trainer | Build and run the document quality trainer (./run.sh mlflow-setup && ./run.sh doc-quality-train) |
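The last row of the table chains two other commands with &&, so training starts only if the mlflow setup succeeds. The sketch below illustrates that pattern with a small case-based dispatcher. It is not the project's actual run.sh: the function names are hypothetical and each step only echoes its name in place of the real work.

```shell
# Hypothetical sketch of a command dispatcher; NOT the project's actual run.sh.
# Each step only echoes its name, so the sketch is safe to run anywhere.

mlflow_setup()      { echo "mlflow-setup"; }
doc_quality_train() { echo "doc-quality-train"; }

dispatch() {
  case "$1" in
    mlflow-setup)      mlflow_setup ;;
    doc-quality-train) doc_quality_train ;;
    build-and-run-doc-quality-trainer)
      # Mirrors the table: train only if setup succeeded.
      mlflow_setup && doc_quality_train ;;
    *)
      echo "unknown command: $1" >&2
      return 1 ;;
  esac
}
```

With this structure, dispatch build-and-run-doc-quality-trainer runs both steps in order, and an unrecognized command returns a non-zero status.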
We utilize python==3.13.0 for this project. Please see the python documentation for more information. We recommend using pyenv to install python.
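To check whether an interpreter already matches the pinned version, a small comparison along these lines may be useful. This is a sketch under assumptions: the helper names and the REQUIRED_PYTHON variable are hypothetical, and the parser expects output in the usual "Python X.Y.Z" form.

```shell
# Hypothetical helpers, not part of this repository.
REQUIRED_PYTHON="3.13.0"

# Extract "X.Y.Z" from output such as "Python 3.13.0".
parse_python_version() {
  printf '%s\n' "$1" | sed -n 's/^Python \([0-9][0-9.]*\).*$/\1/p'
}

# Compare a "python --version" string against the pinned version.
check_python_version() {
  found="$(parse_python_version "$1")"
  if [ "$found" = "$REQUIRED_PYTHON" ]; then
    echo "ok: $found"
  else
    echo "mismatch: found '${found}', expected $REQUIRED_PYTHON" >&2
    return 1
  fi
}
```

It could be called as check_python_version "$(python --version 2>&1)"; a non-zero status indicates that the active interpreter is not 3.13.0.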
Below, we include some useful commands for installing python using pyenv.
# install pyenv
brew install pyenv
# install python 3.13.0
pyenv install 3.13.0
# set the python version
pyenv local 3.13.0
# check the python version
python --version
Your system may have python installed under the name python3. If this is the case, the following commands may help resolve the python command to the correct installation.
# export the pyenv path
export PYENV_ROOT="$HOME/.pyenv"
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
# reload the shell
source ~/.zshrc
# check the versions
python --version # Should show 3.13.0
which python # Should show ~/.pyenv/shims/python
This project is licensed under the Apache 2.0 License.