Open
Labels
enhancement: New feature or request
Description
Hi, thanks for this great package.
Right now I use KeyphraseCountVectorizer to extract keywords based on different POS patterns.
Here is my code:
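    from keyphrase_vectorizers import KeyphraseCountVectorizer

    # custom_pos and stop_words are defined elsewhere
    def kph_extr(docs: list, patt: str) -> list:
        # Fit a vectorizer for a single POS pattern and return the extracted keyphrases
        vectorizer = KeyphraseCountVectorizer(custom_pos_tagger=custom_pos, stop_words=stop_words, pos_pattern=patt)
        vectorizer.fit(docs)
        return list(vectorizer.get_feature_names_out())

And here are my POS patterns: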
    pos_patterns = ["<NOUN><NOUN><NOUN>", "<NOUN><NOUN>", "<NOUN><ADJ>", "<NOUN><ADJ><NOUN>",
                    "<NOUN><NOUN><NOUN><NOUN>", "<NOUN><NOUN><NOUN><NOUN><NOUN>",
                    "<ADJ><NOUN><ADJ><NOUN>", "<NOUN><NOUN><ADJ>", "<NOUN><NOUN><NOUN><ADJ>"]
Is there a way to pass a list of POS patterns in a single call? I want to run this on a large data set, and calling the function once per pattern takes a long time.
I think the POS tagging accounts for most of the time, so if it could be done only once per document and reused for every pattern, the runtime would go down.
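To make the request concrete, here is a minimal sketch of the "tag once, match every pattern" idea. It is not the package's API: `parse_pattern` and `match_all_patterns` are hypothetical helpers, the tagged tokens are made-up sample data, and the matcher is a simple sliding window that only understands plain tag sequences such as `<NOUN><ADJ>` (no regex operators), so its results may differ from what KeyphraseCountVectorizer extracts.

```python
from typing import Dict, List, Sequence, Tuple

TaggedToken = Tuple[str, str]  # (word, POS tag)


def parse_pattern(pattern: str) -> Tuple[str, ...]:
    """Turn '<NOUN><ADJ>' into ('NOUN', 'ADJ').

    Hypothetical helper: handles plain tag sequences only.
    """
    return tuple(part for part in pattern.strip("<>").split("><") if part)


def match_all_patterns(
    tagged: Sequence[TaggedToken], patterns: Sequence[str]
) -> Dict[str, List[str]]:
    """Match every pattern against one already-tagged document.

    The document is tagged once (outside this function); each pattern
    is then checked at every start position of the same tag list.
    """
    parsed = {p: parse_pattern(p) for p in patterns}
    words = [word for word, _ in tagged]
    tags = [tag for _, tag in tagged]
    results: Dict[str, List[str]] = {p: [] for p in patterns}

    for start in range(len(tags)):
        for pattern, seq in parsed.items():
            end = start + len(seq)
            if end <= len(tags) and tuple(tags[start:end]) == seq:
                phrase = " ".join(words[start:end])
                if phrase not in results[pattern]:  # keep each phrase once
                    results[pattern].append(phrase)
    return results


# Made-up tagged tokens standing in for the output of a POS tagger.
sample = [("machine", "NOUN"), ("learning", "NOUN"),
          ("model", "NOUN"), ("fast", "ADJ")]
found = match_all_patterns(
    sample, ["<NOUN><NOUN>", "<NOUN><ADJ>", "<NOUN><NOUN><NOUN>"]
)
for pattern, phrases in found.items():
    print(pattern, phrases)
```

The point of the sketch is only the cost structure: the expensive tagging step runs once per document, and adding another pattern costs one more cheap comparison per position.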
Thanks