# pip install py_nl2sql
from py_nl2sql import LLM,DBInstance,NL2SQLWorkflow
llm = LLM()
instance = DBInstance(
db_type="mysql",
db_name="classicmodels",
need_sql_sample=True,
db_user="root",
db_password="",
db_host="127.0.0.1",
db_port="3306",
llm=llm,
)
query = "what is price of `1968 Ford Mustang`"
service = NL2SQLWorkflow(instance, query, llm)
res = service.get_response()
print(res)When using this project, users need to provide the following three pieces of information:
Users can provide api_key and base_url in two ways: directly pass the parameters, or set OPENAI_API_KEY and OPENAI_BASE_URL in the environment variables.
- Currently, the large model used in this project is only compatible with OpenAI models, and local models and other models will be supported in the future.
- Due to the need to use OpenAI's structured output feature, the default model is set to
gpt-4o-mini.
from py_nl2sql.models.llm import LLM base_url
llm = LLM(api_key="sk-xx",base_url="https://xxx")When creating a new database instance, you need to pass in the LLM (Large Language Model). During the instantiation process, the following operations will be executed:
- Use the embedding model to embed the database table information and store the results in the vector database (this step is mandatory).
- Generate sample SQL based on the database information, which will serve as a reference for converting user queries into SQL later (this step is optional). By default, this feature is enabled; if you do not need to generate sample SQL, you can set
need_sql_sampletoFalse.
instance = DBInstance(
db_type="mysql",
db_name="classicmodels",
need_sql_sample=True,
db_user="root",
db_password="",
db_host="127.0.0.1",
db_port="3306",
llm=llm,
)Features:
- Supports the initialization of multiple database instances for management in scenarios with multiple databases.
- If there are changes to the database, you can directly call the
instance.db_update()method to update the database. During the update process, the database table information will be re-embedded and stored in the vector database.
Note:
The design of DBInstance adopts a multi-instance pattern combined with a state machine. Different objects are instantiated based on db_type + db_name.
User only needs to pass the database instance to be queried and the corresponding query statement, and then call the get_response() method to get the final result.
service = NL2SQLWorkflow(instance, query)
res = service.get_response()Meantime, the NL2SQLWorkflow object saves a series of intermediate process metadata, such as
service.text_to_sql_query # used for sql generation
service.interpretation_query # used for final response generation
service.related_table_summary # Table information related to the query
service.first_sql_query # SQL query generated from the query for the first time
service.final_sql_query # SQL query generated from the query using the similarity SQL
...
The MIT License (MIT)
