Performance Reproduction for DDL Schema #4

@katieluo88

Hello, thanks for releasing the model weights! I'm trying to reproduce the results for XiYanSQL-QwenCoder-32B by running the released ModelScope weights with the Chinese prompt, following the provided example. However, appending the SQL schemas in DDL format yields very low execution accuracy (14.47 EX on the full dev set):

                     simple               moderate             challenging          total
count                925                  464                  145                  1534
======================================    ACCURACY    =====================================
accuracy             19.68                7.54                 3.45                 14.47
===========================================================================================
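
As a quick sanity check on the numbers above, the total should be the count-weighted average of the three difficulty buckets. A minimal sketch, assuming the evaluation script aggregates that way (the variable names are mine, not from the script):

```python
# Per-difficulty counts and accuracies copied from the table above.
counts = {"simple": 925, "moderate": 464, "challenging": 145}
accuracy = {"simple": 19.68, "moderate": 7.54, "challenging": 3.45}

# Count-weighted average across the three buckets.
n_total = sum(counts.values())
total_acc = sum(counts[k] * accuracy[k] for k in counts) / n_total

print(n_total, round(total_acc, 2))  # 1534 14.47
```

So the low total is not an aggregation artifact; every bucket is low on its own.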

Should I be using a different prompt? I'm running the model with vLLM, using the following prompt template:

NL2SQL_TEMPLATE_CN = f"""你是一名{{dialect}}专家,现在需要阅读并理解下面的【数据库schema】描述,以及可能用到的【参考信息】,并运用{{dialect}}知识生成sql语句回答【用户问题】。
【用户问题】
{{question}}

【数据库schema】
{{db_schema}}

【参考信息】
{{evidence}}

【用户问题】
{{question}}

```sql"""

where db_schema is the concatenation of the DDL statements of all tables in the database: "\n\n".join(f"{ddl}" for _, ddl in db_schemas.items()).
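
To make the prompt construction concrete, here is a minimal sketch of how the placeholders get filled. The helper names `join_ddl` and `build_prompt`, the short stand-in template, the example tables, and the `SQLite` dialect are all assumptions for illustration; in my runs the template argument is `NL2SQL_TEMPLATE_CN` from above.

```python
from typing import Mapping


def join_ddl(db_schemas: Mapping[str, str]) -> str:
    """Concatenate the per-table DDL statements, separated by blank lines."""
    return "\n\n".join(ddl.strip() for ddl in db_schemas.values())


def build_prompt(template: str, question: str, db_schemas: Mapping[str, str],
                 evidence: str = "", dialect: str = "SQLite") -> str:
    """Fill the template placeholders: dialect, question, db_schema, evidence."""
    return template.format(
        dialect=dialect,
        question=question,
        db_schema=join_ddl(db_schemas),
        evidence=evidence,
    )


# Short stand-in template with the same placeholders (not the real prompt).
demo_template = "[{dialect}]\n{question}\n\n{db_schema}\n\n{evidence}\n\n{question}"

# Hypothetical two-table schema, keyed by table name.
demo_schemas = {
    "users": "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);",
    "orders": "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER);",
}

prompt = build_prompt(demo_template, "How many users are there?", demo_schemas)
print(prompt)
```

Printing the filled prompt for a single example is an easy way to confirm that no placeholder is left unfilled and that the schema text arrives in the expected form.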

I'd like to check whether this is how the model should be prompted, or whether the prompt needs to be modified. Many thanks!
