handling dually annotated data #71

@keighrim

Description

Because

(I'm using the term "dual annotation" to mean manual annotation done redundantly by two or more annotators.)


So far, all the annotation projects we've worked on used single annotation. Accordingly, we designed the workflow for processing annotation data (raw >> gold, organization under batches and dates, etc.) without considering

  • IAA measurement
  • adjudication/curation for merging dual annotation

However, in the latest annotation effort - RFB - we started dual annotation, at least for a subset of the whole dataset. I think it's now time to discuss how we want to host the dual annotations and the adjudicated single set of "raw" data in this public repo. Concretely,

  1. We need fixed terms to indicate
    1. raw manual annotation (currently called raw, hereinafter "raw")
    2. adjudicated "gold" annotation (currently no such thing, hereinafter "gold")
    3. machine-ready "public" annotation (currently called gold, hereinafter "release")
  2. Do we want to host both "raw" and "gold", or "gold" only?
  3. How do we publish the adjudication process, if any? I can imagine both all-manual adjudication and code-assisted adjudication. In the latter case, should we consider special handling of the adjudication code, just like process.py?
  4. Where should the IAA calculation results be reported? In README, or a separate file/directory?
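To make the "code-assisted adjudication" option in question 3 more concrete, here is a minimal sketch of what such a step might look like. It is only an illustration, not a proposal for the actual implementation: the function name `adjudicate`, the annotator keys, and the item-id-to-label data shape are all hypothetical, and real "raw" files in this repo may be structured differently. The idea is to auto-accept items on which all annotators agree and to set the rest aside for manual review.

```python
def adjudicate(annotations):
    """Split dual annotations into auto-accepted "gold" items and conflicts.

    `annotations` (hypothetical shape) maps annotator name -> {item id: label}.
    An item is accepted only if every annotator labeled it and all labels match;
    everything else is returned as a conflict for manual adjudication.
    """
    item_ids = set()
    for labels in annotations.values():
        item_ids.update(labels)

    gold, conflicts = {}, {}
    for item_id in sorted(item_ids):
        # Collect the label each annotator gave to this item, if any.
        votes = {
            name: labels[item_id]
            for name, labels in annotations.items()
            if item_id in labels
        }
        distinct = set(votes.values())
        if len(votes) == len(annotations) and len(distinct) == 1:
            gold[item_id] = distinct.pop()
        else:
            conflicts[item_id] = votes
    return gold, conflicts


if __name__ == "__main__":
    raw = {
        "annotator1": {"item1": "A", "item2": "B", "item3": "C"},
        "annotator2": {"item1": "A", "item2": "X"},
    }
    gold, conflicts = adjudicate(raw)
    print("gold:", gold)
    print("needs manual review:", conflicts)
```

If we went this way, the conflicts output would be the part that still needs a human decision, and the script itself would be the thing to document (or handle like process.py).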

And maybe more questions.

Starting this issue to discuss details. Any input is welcome!

Done when

We set a guideline or template for handling

  1. dual "raw" annotation files
  2. IAA reports
  3. documentation of adjudication process
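For item 2, one possible building block for an IAA report is Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is an assumption on my part about which metric we might use, not a decision: it only covers exactly two annotators labeling the same items with categorical labels, and the function name `cohen_kappa` and the input shape are hypothetical. With more than two annotators, a different measure (e.g. Fleiss' kappa or Krippendorff's alpha) would be needed.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators over the same items, in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: based on each annotator's own label distribution.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one and the same label everywhere.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    annotator1 = ["x", "x", "y", "y"]
    annotator2 = ["x", "y", "y", "y"]
    print(f"kappa = {cohen_kappa(annotator1, annotator2):.2f}")
```

Whatever metric we settle on, the report template should probably state the metric, the number of annotators, and the subset of data it was computed on.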

Additional context

No response
