TaxoExt

Extend a taxonomy using labeled documents. In this repo, we extend the three-level NSFC discipline taxonomy with keywords from NSFC projects.

Dependencies

  • Python 3

Run

For raw text, you can extract keywords using the same approach as in HierRec. This repo already provides a processed file, so you can just run:

We use pointwise mutual information (PMI) to measure the relatedness between a word and a discipline. The softmax of a word's PMI scores across disciplines is then used to represent the discipline distribution of that word.
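
The PMI and softmax steps can be sketched roughly as follows. This is a minimal illustration, not the code in this repo: the function names `pmi_table` and `softmax`, the input layout (a list of `(discipline, keywords)` pairs), and the discipline codes `A01` and `F02` are all assumptions made for the example. It also assumes that a word's distribution covers only the disciplines it actually co-occurs with, so unseen pairs get zero probability.

```python
import math
from collections import Counter


def pmi_table(docs):
    """Compute PMI(word, discipline) from (discipline, keywords) pairs.

    Returns {word: {discipline: pmi}} for observed pairs only.
    The input layout is an assumption made for this sketch.
    """
    pair_counts = Counter()
    word_counts = Counter()
    disc_counts = Counter()
    total = 0
    for discipline, keywords in docs:
        for word in keywords:
            pair_counts[(word, discipline)] += 1
            word_counts[word] += 1
            disc_counts[discipline] += 1
            total += 1

    table = {}
    for (word, discipline), n in pair_counts.items():
        # PMI(w, d) = log( p(w, d) / (p(w) * p(d)) )
        #           = log( n * total / (count(w) * count(d)) )
        pmi = math.log(n * total / (word_counts[word] * disc_counts[discipline]))
        table.setdefault(word, {})[discipline] = pmi
    return table


def softmax(scores):
    """Turn {discipline: score} into a probability distribution."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {d: math.exp(s - top) for d, s in scores.items()}
    norm = sum(exps.values())
    return {d: e / norm for d, e in exps.items()}


# Toy data with made-up discipline codes and keywords.
docs = [
    ("A01", ["graph", "algebra"]),
    ("F02", ["graph", "network"]),
    ("F02", ["network"]),
]
table = pmi_table(docs)
graph_dist = softmax(table["graph"])
print(graph_dist)
```

Here `graph_dist` maps each discipline that contains the keyword "graph" to a probability, and the values sum to 1. Smoothing for rare words or a PMI threshold may be worth adding on real data, but neither is described in this README.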