EachSheep/RAGSynth

RAGSynth

We will open-source the dataset and scripts after the review is finished.

  1. wash_data_scripts: Illustrates some of the data-cleaning steps we applied during data collection.
  2. src: Distinguishes between datasets, indicating which are used for evaluation and which serve as corpora.
  3. knowledge_enhancement: An implementation of RAGSynth.
    1. assemble: Synthesizes data using the content in components.
    2. components: A specific implementation of RAGSynth as a pipeline, in which the output of each step is the input to the next. It covers the synthesis of both single-hop and multi-hop data.
    3. chunk_by_files: The logic for chunking documents.
    4. rag: Training scripts and logs.
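
The pipeline structure described for components (each step's output becomes the next step's input) can be sketched as follows. This is a minimal illustration only: `run_pipeline`, `chunk`, `draft_questions`, and `assemble` are hypothetical stand-ins invented for this example, not the repository's actual functions, and the paragraph-based chunking and placeholder questions are assumptions rather than the method RAGSynth uses.

```python
from functools import reduce
from typing import Any, Callable, Iterable

Step = Callable[[Any], Any]


def run_pipeline(steps: Iterable[Step], initial: Any) -> Any:
    """Pass `initial` through each step in order; each output feeds the next step."""
    return reduce(lambda data, step: step(data), steps, initial)


# Hypothetical steps for illustration; the real components are not shown in this README.
def chunk(document: str) -> list[str]:
    # Assumption: split on blank lines and drop empty chunks.
    return [part.strip() for part in document.split("\n\n") if part.strip()]


def draft_questions(chunks: list[str]) -> list[dict]:
    # Placeholder for a synthesis step: attach a dummy question to every chunk.
    return [{"chunk": c, "question": f"What does chunk {i} state?"}
            for i, c in enumerate(chunks)]


def assemble(records: list[dict]) -> list[dict]:
    # Final step: give each synthesized record a stable integer id.
    return [{"id": i, **record} for i, record in enumerate(records)]


result = run_pipeline(
    [chunk, draft_questions, assemble],
    "First paragraph.\n\nSecond paragraph here.",
)
```

Keeping every step a plain function from one value to the next means steps can be reordered, replaced, or run individually on saved intermediate outputs.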

About

The implementation of RAGSynth: Synthetic Data for Robust and Faithful RAG Component Optimization
