Blank strings and strings that contain none of the feature words are currently handled differently by the three closed-ended classifiers.
For form and target, such strings are assigned the most common class in the training set (rally/demonstration and domestic government, respectively). For issue, they are classified as none, which is not the most common class in the training set.
See page 19 of chapter 2 of Alex's thesis, and the following example code:
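import pandas as pd
from mpeds.classify_protest import MPEDS

test_classifier = MPEDS()
# One blank string and one string with none of the feature words
test_data = pd.Series(['', 'avocados and grapefruits'])
# Run the same inputs through each closed-ended classifier to compare
test_classifier.getIssue(test_data)
test_classifier.getForm(test_data)
test_classifier.getTarget(test_data)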
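
One way to make the handling consistent would be to detect these strings before they reach any classifier and give them the same label in all three cases. The following is only a minimal sketch of that detection step, not part of MPEDS: the helper names and the `vocabulary` set are hypothetical, and in practice the feature words would have to come from whatever vectorizer each trained classifier uses.

```python
import re


def has_feature_words(text, vocabulary):
    """Return True if `text` contains at least one word from `vocabulary`."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return any(token in vocabulary for token in tokens)


def split_by_feature_coverage(texts, vocabulary):
    """Split positions in `texts` into (covered, uncovered) index lists.

    "Uncovered" means blank, or containing none of the feature words.
    """
    covered, uncovered = [], []
    for i, text in enumerate(texts):
        if has_feature_words(text, vocabulary):
            covered.append(i)
        else:
            uncovered.append(i)
    return covered, uncovered


# Hypothetical vocabulary, for illustration only; not the real MPEDS features.
vocabulary = {"rally", "march", "police", "strike"}
texts = ["", "avocados and grapefruits", "students rally downtown"]

covered, uncovered = split_by_feature_coverage(texts, vocabulary)
print("covered:", covered)
print("uncovered:", uncovered)
```

The uncovered positions could then be assigned a single agreed label (for example none) for issue, form, and target alike, while only the covered positions are passed to the classifiers.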