UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte #17

@shawnwang95

Prefix dict has been built succesfully.
Traceback (most recent call last):
  File "predict.py", line 23, in <module>
    lstm_predict(sentence)
  File "code/Sentiment_lstm.py", line 187, in lstm_predict
    data=input_transform(string)
  File "code/Sentiment_lstm.py", line 173, in input_transform
    model=gensim.models.Word2Vec.load_word2vec_format('lstm_data/Word2vec_model.pkl', binary = True, unicode_errors='ignore')
  File "/anaconda3/lib/python3.6/site-packages/gensim/models/word2vec.py", line 1172, in load_word2vec_format
    header = utils.to_unicode(fin.readline(), encoding=encoding)
  File "/anaconda3/lib/python3.6/site-packages/gensim/utils.py", line 217, in any2unicode
    return unicode(text, encoding, errors=errors)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

I also tried the call without binary=True, "model=gensim.models.Word2Vec.load_word2vec_format('lstm_data/Word2vec_model.pkl', unicode_errors='ignore')", but I still get the same error.
