
Closed-ended classifiers: Inconsistency in handling blank strings #4

@erleholgersen


Blank strings, and strings that contain none of the feature words, are currently handled differently by the three closed-ended classifiers.

For form and target, such strings are predicted to belong to the most common class in the training set (rally/demonstration and domestic government, respectively). For issue, they are classified as none, which is not the most common class in the training set.

See page 19 of chapter 2 of Alex's thesis, and the following example code:

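import pandas as pd
from mpeds.classify_protest import MPEDS

test_classifier = MPEDS()

# One blank string and one string with none of the feature words
test_data = pd.Series(['', 'avocados and grapefruits'])

# Compare how each closed-ended classifier labels these inputs
test_classifier.getIssue(test_data)
test_classifier.getForm(test_data)
test_classifier.getTarget(test_data)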
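One possible way to make the behaviour consistent is to detect such strings before classification and assign them a single, explicitly chosen fallback label. The sketch below is only an illustration of that idea and does not use the mpeds code: `has_feature_words`, `most_common_class`, `predict_with_fallback`, the toy vocabulary, the toy labels, and the stand-in `toy_predict` function are all hypothetical. Whether the fallback should be the most common training class or none is exactly the open question in this issue.

```python
import re
from collections import Counter


def has_feature_words(text, vocabulary):
    """Return True if the text contains at least one word from the vocabulary."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return any(token in vocabulary for token in tokens)


def most_common_class(training_labels):
    """Return the label that occurs most often in the training labels."""
    return Counter(training_labels).most_common(1)[0][0]


def predict_with_fallback(texts, predict, vocabulary, fallback):
    """Call predict only on texts with feature words; otherwise use fallback."""
    return [
        predict(text) if has_feature_words(text, vocabulary) else fallback
        for text in texts
    ]


# Hypothetical toy setup, standing in for a trained classifier
vocabulary = {"march", "strike"}
training_labels = ["rally", "strike", "rally"]


def toy_predict(text):
    # Stand-in for a real model's prediction on a single string
    return "strike" if "strike" in text.lower() else "rally"


texts = ["", "avocados and grapefruits", "workers strike downtown"]

# Option 1: fall back to the most common training class
by_majority = predict_with_fallback(
    texts, toy_predict, vocabulary, most_common_class(training_labels)
)

# Option 2: fall back to an explicit "none" label
by_none = predict_with_fallback(texts, toy_predict, vocabulary, "none")
```

Applying the same wrapper with the same fallback rule to all three classifiers would remove the inconsistency, whichever of the two options is chosen.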
