RAGSynth

We will open-source the dataset and scripts after the review is finished.

wash_data_scripts: Examples of the data cleaning steps we applied during data collection.
src: The datasets, organized to show which are used for evaluation and which serve as corpora.
knowledge_enhancement: An implementation of RAGSynth, organized as follows:
1. assemble: The process of synthesizing data using the content in the components.
2. components: A specific implementation of RAGSynth in the form of a pipeline, where each step produces an output that serves as the input for the next step. This includes the synthesis of both single-hop and multi-hop data.
3. chunk_by_files: The logic for chunking documents.
4. rag: Training scripts and logs.
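
The pipeline structure described for components, where each step's output is the next step's input, can be sketched as below. This is a minimal illustration only, not the code in this repository: the names `chunk_by_chars`, `drop_empty`, and `run_pipeline`, the record fields `source` and `text`, and the fixed-size chunking rule are all assumptions made for the example.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]
Step = Callable[[List[Record]], List[Record]]


def drop_empty(records: List[Record]) -> List[Record]:
    """Hypothetical cleaning step: discard records whose text is blank."""
    return [r for r in records if r["text"].strip()]


def chunk_by_chars(records: List[Record], size: int = 40) -> List[Record]:
    """Hypothetical chunking step: split each text into fixed-size pieces."""
    chunks: List[Record] = []
    for rec in records:
        text = rec["text"]
        for start in range(0, len(text), size):
            chunks.append({"source": rec["source"], "text": text[start:start + size]})
    return chunks


def run_pipeline(records: List[Record], steps: List[Step]) -> List[Record]:
    """Feed the output of each step into the next one, in order."""
    for step in steps:
        records = step(records)
    return records
```

Calling `run_pipeline(docs, [drop_empty, chunk_by_chars])` first removes blank documents and then chunks the remainder; further synthesis steps (single-hop, multi-hop) would be appended to the same list under this design.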
