This guide provides detailed instructions for setting up your development environment, configuring LLMs, and integrating various tools necessary for your project.
We recommend using Python 3.10.13.
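If you are unsure which interpreter is active, a quick check like the one below may help. It is only a sketch: the helper name `matches_recommended` is hypothetical and not part of the project, and it compares just the major and minor version against the recommended 3.10 series.

```python
import sys

# Major and minor parts of the recommended Python 3.10.13.
RECOMMENDED = (3, 10)


def matches_recommended(version_info=sys.version_info, recommended=RECOMMENDED):
    """Return True when the major.minor version equals the recommended one."""
    return tuple(version_info[:2]) == tuple(recommended)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "matches" if matches_recommended() else "differs from"
    print(f"Python {current} {status} the recommended 3.10 series")
```

Patch releases within 3.10 are treated as equivalent here, which is an assumption of this sketch rather than a statement about what the project supports.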
We recommend installing using Conda:
conda env create -f environment_dev.yml
conda activate AutoPrompt
Install using pip directly:
pip install -r requirements.txt
Install using pipenv:
pip install pipenv
pipenv sync
Set your OpenAI API key in the configuration file config/llm_env.yml. For assistance locating your API key, visit this link.
- For the LLM, we recommend using OpenAI's GPT-4. Alternatively, configure Azure by setting the llm type in config/config_default.yml to "Azure" and specifying the key in config/llm_env.yml. Our system also supports various LLMs, including open-source models, through a LangChain pipeline. Change the llm type to "HuggingFacePipeline" and specify the model ID in the llm name field.
- Configure your predictor. We employ a predictor to estimate prompt performance. The default predictor LLM is GPT-3.5. The configuration is located in the predictor section of config/config_default.yml.
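To make the two fields concrete, the sketch below builds an llm block as a Python dictionary using the same type and name keys that appear in the annotator example later in this guide. The helper `make_llm_block` and the model ID `my-org/my-model` are hypothetical placeholders, and the set of types lists only the values named in this guide; the project may accept others.

```python
# LLM types mentioned in this guide; the project may accept more than these.
LLM_TYPES_IN_GUIDE = {"OpenAI", "Azure", "HuggingFacePipeline"}


def make_llm_block(llm_type, name):
    """Return an llm mapping with the type and name fields filled in."""
    if llm_type not in LLM_TYPES_IN_GUIDE:
        raise ValueError(f"llm type {llm_type!r} is not one of {sorted(LLM_TYPES_IN_GUIDE)}")
    if not name:
        raise ValueError("llm name (the model ID) must not be empty")
    return {"llm": {"type": llm_type, "name": name}}


# Hypothetical model ID, used only to show where the value goes.
hf_block = make_llm_block("HuggingFacePipeline", "my-org/my-model")
print(hf_block)
```

The resulting mapping has the same shape you would write by hand in config/config_default.yml; the validation step is there only to catch typos in the type string early.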
Our pipeline incorporates a human-in-the-loop annotation process using Argilla. Follow these steps to set it up:
- Set Up Argilla Server and UI: Follow the instructions to install and set up an Argilla server and user interface.
- Quick Installation Option: For a faster setup, we recommend deploying Argilla on a Hugging Face space.
- Configure API Settings: After setting up the server, modify the api_url and api_key in the config/config_default.yml file. For instance, if using the recommended Hugging Face space, your API URL should be formatted as follows: api_url: 'https://<your-argilla-space-name>.hf.space'.
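The URL format from the last step can be sketched as a small helper. The function name `argilla_space_url` is hypothetical, and the character check (letters, digits, and hyphens) is an assumption about what a space name looks like, not a rule taken from Argilla or Hugging Face.

```python
import re


def argilla_space_url(space_name):
    """Build the api_url value for an Argilla server hosted on a Hugging Face space."""
    name = space_name.strip()
    # Assumed naming rule: letters, digits, and hyphens, not starting with a hyphen.
    if not re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9-]*", name):
        raise ValueError(f"unexpected space name: {space_name!r}")
    return f"https://{name}.hf.space"


# Placeholder space name; substitute your own.
print(argilla_space_url("your-argilla-space-name"))
```

The printed value is what goes into the api_url field; the api_key still has to be copied from your Argilla deployment.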
To specify an LLM as the annotation tool in your pipeline, update the annotator section in the config/config_default.yml file as follows:
annotator:
    method: 'llm'
    config:
        llm:
            type: 'OpenAI'
            name: 'gpt-4-1106-preview'
        instruction:
            'Assess whether the text contains a harmful topic.
            Answer Yes if it does and No otherwise.'
        num_workers: 5
        prompt: 'prompts/predictor_completion/prediction.prompt'
        mini_batch_size: 1
        mode: 'annotation'
We recommend using a robust LLM, like GPT-4, for annotation purposes. In the instruction field, you specify the task instructions for the annotation. The mini_batch_size field determines the number of samples processed in a single annotation pass, allowing you to balance efficiency with LLM token usage.
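The effect of mini_batch_size can be illustrated with a generic chunking sketch. This is not the project's implementation, which is not shown in this guide; the helper `mini_batches` and the sample texts are hypothetical and only demonstrate how the batch size changes the number of annotation passes.

```python
def mini_batches(samples, mini_batch_size):
    """Yield consecutive slices of samples, each at most mini_batch_size long."""
    if mini_batch_size < 1:
        raise ValueError("mini_batch_size must be at least 1")
    for start in range(0, len(samples), mini_batch_size):
        yield samples[start:start + mini_batch_size]


# Placeholder samples standing in for the texts to annotate.
samples = ["text a", "text b", "text c", "text d", "text e"]

one_per_pass = list(mini_batches(samples, 1))
two_per_pass = list(mini_batches(samples, 2))

print(len(one_per_pass), "passes with mini_batch_size=1")
print(len(two_per_pass), "passes with mini_batch_size=2")
```

A larger batch means fewer LLM calls but a longer prompt per call, which is the trade-off between efficiency and token usage described above.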
To effectively track your optimization process, including metrics like score, prompt instances, and error analysis across iterations, we recommend using Weights and Biases.
- Sign Up for Weights and Biases: Visit their website and follow the instructions to create an account.
- Enable wandb in Your Configuration: In your project's config/config_default.yml file, set use_wandb to True to activate wandb support.
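As a rough picture of what such a flag typically gates, the sketch below starts a wandb run only when use_wandb is enabled. How the project itself wires this up is not shown in this guide, so the helper `maybe_init_wandb` and the project name `autoprompt-runs` are assumptions made for illustration.

```python
def maybe_init_wandb(config, project="autoprompt-runs"):
    """Start a wandb run when config enables it; return None otherwise."""
    if not config.get("use_wandb", False):
        return None
    # Imported lazily so the wandb package is only required when tracking is on.
    import wandb

    return wandb.init(project=project, config=config)


# With the flag off, nothing is imported and no run is created.
run = maybe_init_wandb({"use_wandb": False})
print("tracking enabled:", run is not None)
```

When the flag is on, you also need to be logged in to Weights and Biases on the machine (for example through the wandb login command) before a run can be created.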