Process mosei_senti_data.pkl to match the text id in mosei.hdf5 #39

@ZhuoZHI-UCL

If you are using mosei_senti_data.pkl and want to recover the raw text by matching ids against mosei.hdf5, consider processing the data with the following script. As written, it rewrites only the 'test' split.

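import pickle

import numpy as np
from tqdm import tqdm

# Load the original pickle.
with open('data/mosei_senti_data.pkl', 'rb') as f:
    file1 = pickle.load(f)

data = file1['test']['id']

# Keep the first element of each id and append a per-key running index,
# giving ids such as "key[0]", "key[1]", ...
modified_data = []
counters = {}
for element in tqdm(data, desc="Processing elements"):
    key = element[0]
    if key not in counters:
        counters[key] = 0
    modified_data.append(f"{key}[{counters[key]}]")
    counters[key] += 1

file1['test']['id'] = np.array(modified_data)

# Save the result under a new name so the original file is left untouched.
with open('data/mosei_new.pkl', 'wb') as f:
    pickle.dump(file1, f)

print('all done!')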
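
If other splits need the same treatment, the renumbering step in the script above could be wrapped in a helper and applied to every split that has an 'id' entry. This is only a sketch: the function names `renumber_ids` and `renumber_all_splits` are hypothetical, the split name 'train' and the toy values are made up, and the counter restarts for each split, as it does for 'test' in the script above. Whether that per-split numbering matches mosei.hdf5 for splits other than 'test' is an assumption you should check against your own files.

```python
from collections import defaultdict

import numpy as np


def renumber_ids(id_rows):
    """Return ids of the form 'key[n]', where n counts earlier rows with the same key."""
    counters = defaultdict(int)
    new_ids = []
    for row in id_rows:
        key = row[0]  # first field of each id row, as in the script above
        new_ids.append(f"{key}[{counters[key]}]")
        counters[key] += 1
    return np.array(new_ids)


def renumber_all_splits(dataset):
    """Apply renumber_ids in place to every split that contains an 'id' entry."""
    for split in dataset.values():
        if isinstance(split, dict) and "id" in split:
            split["id"] = renumber_ids(split["id"])
    return dataset


# Toy data standing in for the loaded pickle (all values are made up).
toy = {
    "train": {"id": np.array([["vidA", "0.0", "1.5"], ["vidB", "0.0", "2.0"]])},
    "test": {"id": np.array([["vidA", "1.5", "3.0"], ["vidA", "3.0", "4.2"]])},
}
renumber_all_splits(toy)
print(toy["test"]["id"].tolist())  # ['vidA[0]', 'vidA[1]']
```

To use it on the real file, load the pickle as in the script above, call `renumber_all_splits(file1)`, and dump the result to a new file.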
