This repository was archived by the owner on Jun 11, 2026. It is now read-only.

Problem about the label of NER dataset #5

Description

@WangYuxuan93

Hi,
I want to reproduce the results in the paper, so I downloaded the dataset from https://xglue.blob.core.windows.net/xglue/xglue_full_dataset.tar.gz
However, I found that the labels in the NER dataset differ from those in CoNLL 03.
For example, for the entity "Peter Blackburn", the labels in the "en.train" file are:

Peter I-PER
Blackburn I-PER

In the original CoNLL 03 file, the labels are:

Peter B-PER
Blackburn I-PER

In short, almost all labels that start with 'B' have been replaced with 'I', and only a few labels starting with 'B' remain in the xglue NER dataset.
Is there a problem with the NER dataset in xglue?
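
The pattern described above (mostly 'I' labels, with only a few 'B' labels) is what the IOB1 tagging scheme would produce, since IOB1 uses 'B' only when an entity directly follows another entity of the same type. Whether the xglue files actually use IOB1 is an assumption and is not confirmed in this issue. Under that assumption, a minimal Python sketch for converting one sentence of tags to the 'B'-initial (IOB2) form could look like this; the function name `iob1_to_iob2` is hypothetical:

```python
def iob1_to_iob2(tags):
    """Convert one sentence of IOB1 tags to IOB2 tags.

    Assumption: the input really is IOB1, which this issue does not confirm.
    """
    converted = []
    prev_type = None  # entity type of the previous token; None after "O"
    for tag in tags:
        if tag == "O":
            converted.append(tag)
            prev_type = None
            continue
        prefix, entity_type = tag.split("-", 1)
        # In IOB2 every entity starts with "B-". An "I-" tag opens a new
        # entity when the previous token was "O" or had a different type.
        if prefix == "B" or entity_type != prev_type:
            converted.append("B-" + entity_type)
        else:
            converted.append("I-" + entity_type)
        prev_type = entity_type
    return converted


# The "Peter Blackburn" example from this issue
print(iob1_to_iob2(["I-PER", "I-PER"]))
```

Applied to the two labels from "en.train", this yields the labels shown for the other CoNLL 03 file. Comparing the converted tags against that file over a whole split would be one way to check whether the two differ only in tagging scheme.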
