In this exercise, your task is to classify research papers in the citation network Cora. The Cora dataset consists of 2708 scientific publications classified into one of seven classes. The citation network consists of 5429 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 unique words.
The dataset is available at https://graphsandnetworks.com/the-cora-dataset/ and is also available through the torch_geometric package.
- Load the dataset
- Perform multiclass node classification using NetworkX
- Perform multiclass node classification using PyG
- Evaluate and compare the classification results
- To evaluate the result of a multiclass classification model, we can use the classification_report function in sklearn. Check its documentation online.
- Think about why some methods perform worse than others.
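As a rough illustration of the evaluation and comparison steps, the sketch below runs a structure-only baseline on a tiny made-up graph and computes per-class precision, recall, F1 and support by hand. These are the same quantities that classification_report in sklearn prints for each class. Everything here is an assumption made for illustration: the toy edges and labels are not Cora, the neighbour-vote classifier is only one possible reading of the NetworkX task, and the helper names predict_by_neighbour_vote and per_class_metrics are hypothetical. Plain dictionaries stand in for a NetworkX graph so the example runs with the standard library only.

```python
from collections import Counter

# Hypothetical toy citation graph (NOT Cora): undirected edges between paper ids.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 5), (3, 5), (5, 6), (0, 7)]

# Build an adjacency list; with NetworkX, G.neighbors(node) would play this role.
adjacency = {}
for u, v in edges:
    adjacency.setdefault(u, []).append(v)
    adjacency.setdefault(v, []).append(u)

# Labels of the training nodes, and held-out true labels of the test nodes.
train_labels = {0: "A", 1: "A", 4: "B", 5: "B"}
test_labels = {2: "A", 3: "B", 6: "B", 7: "B"}


def predict_by_neighbour_vote(node, adjacency, known_labels, default):
    """Return the most common label among the node's labelled neighbours."""
    votes = Counter(known_labels[n] for n in adjacency[node] if n in known_labels)
    if not votes:
        return default  # no labelled neighbour: fall back to the default class
    return votes.most_common(1)[0][0]


def per_class_metrics(y_true, y_pred):
    """Compute precision, recall, F1 and support for every class."""
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in pairs)
        fp = sum(t != cls and p == cls for t, p in pairs)
        fn = sum(t == cls and p != cls for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        metrics[cls] = {"precision": precision, "recall": recall,
                        "f1": f1, "support": tp + fn}
    return metrics


# Fallback class: the most frequent training label.
default_label = Counter(train_labels.values()).most_common(1)[0][0]

test_nodes = sorted(test_labels)
y_true = [test_labels[n] for n in test_nodes]
y_pred = [predict_by_neighbour_vote(n, adjacency, train_labels, default_label)
          for n in test_nodes]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
report = per_class_metrics(y_true, y_pred)

print("accuracy:", accuracy)
for cls, row in report.items():
    print(cls, row)
```

In this toy graph, node 7 is misclassified because its only neighbour belongs to the other class. A method that relies on graph structure alone has no way to correct such a case, whereas a model that also uses the word vectors might. This kind of observation can help when comparing the two approaches on Cora.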