
kitab-project-org/one_to_all


This repository contains text reuse data generated by passim, organized by text version.

Each text version ("book 1") has two TSV files:

  • one in the "stats" folder, which contains a single row for each other text version ("book 2") in the corpus with which passim has detected text reuse. Columns:
    • id: version ID (without language component and extension) of book 2
    • book: book URI of book 2
    • alignments: number of alignments with book 2
    • ch_match: number of characters in book 1 that are matched in book 2
  • one in the "msdata" folder, which contains a row for each text reuse alignment passim found for book 1. Columns:
    • ms1: milestone number in book 1
    • b1: character offset of the start of the alignment in book 1
    • e1: character offset of the end of the alignment in book 1
    • id2: version ID (without language component and extension) of book 2
    • ms2: milestone number in book 2
    • b2: character offset of the start of the alignment in book 2
    • e2: character offset of the end of the alignment in book 2
    • ch_match: number of characters in ms1 that are matched in ms2
    • matches_percent: percentage of characters in ms1 matched in ms2
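
The column layout above can be read with Python's standard `csv` module. The following is a minimal sketch, not the project's own tooling. It assumes each file starts with a header row using the column names listed above, and it treats the offsets and `ch_match` as integers. The function names `read_msdata` and `summarize_by_book2`, the IDs `BOOK2A` and `BOOK2B`, and all values in `SAMPLE` are invented for illustration.

```python
import csv
import io

# Columns converted to int. ASSUMPTION: offsets and ch_match are whole numbers.
INT_COLUMNS = ("b1", "e1", "b2", "e2", "ch_match")


def read_msdata(handle):
    """Parse an msdata TSV into a list of dicts.

    ASSUMPTION: the file has a header row with the column names
    listed in the README (ms1, b1, e1, id2, ms2, b2, e2, ...).
    """
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        for col in INT_COLUMNS:
            row[col] = int(row[col])
        row["matches_percent"] = float(row["matches_percent"])
        rows.append(row)
    return rows


def summarize_by_book2(rows):
    """Count alignments and sum ch_match for each book 2 (id2).

    This is a rough aggregate in the spirit of the stats columns;
    the README does not confirm that the stats files are computed this way.
    """
    summary = {}
    for row in rows:
        entry = summary.setdefault(row["id2"], {"alignments": 0, "ch_match": 0})
        entry["alignments"] += 1
        entry["ch_match"] += row["ch_match"]
    return summary


# Invented sample rows, NOT real data from the repository.
SAMPLE = (
    "ms1\tb1\te1\tid2\tms2\tb2\te2\tch_match\tmatches_percent\n"
    "1\t0\t120\tBOOK2A\t5\t40\t160\t110\t91.7\n"
    "2\t30\t130\tBOOK2A\t6\t10\t110\t90\t90.0\n"
    "3\t0\t50\tBOOK2B\t1\t0\t50\t50\t100.0\n"
)

rows = read_msdata(io.StringIO(SAMPLE))
summary = summarize_by_book2(rows)
```

To read a real file, pass an open file handle instead of the `io.StringIO` object, for example `open(path, encoding="utf-8", newline="")`, where `path` points to a file in the "msdata" folder.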

About

passim text reuse data: one book compared to the entire corpus
