Hi,
Thank you for sharing this interesting project and the accompanying paper! I am trying to replicate the results reported in the paper, specifically the citation quality score, but I am seeing a significant discrepancy.
The paper reports a citation quality score of around 0.8. However, when I replicate the experiments using the 20 papers provided in the dataset, my results are much lower, fluctuating between 0.3 and 0.4.
Here are the steps I followed for my experiment:
- I followed the instructions in the paper and considered the first 1500 tokens of the full text when generating the write-up.
- Other parts of the pipeline, such as literature retrieval, outline generation, and scoring, were used as-is without modifications.
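For reference, the truncation step I used is essentially the following (a minimal sketch; the whitespace split is only a stand-in tokenizer for illustration, since the paper does not specify which tokenizer defines "1500 tokens" — that choice could itself account for part of the discrepancy):

```python
# Sketch of how I truncate the full text before generating the write-up.
# NOTE: str.split() is a stand-in for a real tokenizer; the paper does not
# say which tokenizer "1500 tokens" refers to.
def truncate_to_tokens(text: str, max_tokens: int = 1500) -> str:
    tokens = text.split()
    return " ".join(tokens[:max_tokens])
```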
Additionally, I ran the evaluation on my own custom dataset, but the citation quality score remained in the 0.3–0.4 range.
To ensure my setup was correct, I used the following command for the evaluation:
```shell
python evaluation.py --topic "Domain Specialization of LLMs" \
    --gpu 0 \
    --saving_path ./output/ \
    --model gpt-4o-2024-05-13 \
    --db_path ./database \
    --embedding_model ./model/nomic-embed-text-v1 \
    --api_url \
    --api_key sk-
```

Could you please provide more details on how the citation quality score is computed, and on any additional considerations or settings that might help achieve the scores reported in the paper? I'd appreciate any guidance on how to ensure an accurate reproduction of the results.
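For context, my current guess is that citation quality is something like citation precision — the fraction of generated citations that are actually supported by the retrieved references. This is purely an assumption on my part (which is exactly why I'm asking), but it's the interpretation my replication is based on:

```python
# My assumed definition of citation quality: the fraction of citations in
# the generated write-up that match a reference in the retrieved set.
# This is a guess, NOT the repository's actual metric.
def citation_quality(predicted_citations: set, retrieved_refs: set) -> float:
    if not predicted_citations:
        return 0.0
    return len(predicted_citations & retrieved_refs) / len(predicted_citations)
```

If the actual metric differs (e.g. an entailment-based check of each cited claim rather than simple set overlap), that would likely explain the gap I'm seeing.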
Thank you!