PyNLP is designed to perform simple NLP procedures such as tokenizing, removing stop words, and frequency analysis.
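To make these procedures concrete, here is a minimal sketch of tokenizing, stop word removal, and frequency analysis using only the Python standard library. It does not use PyNLP's own API: the function names (tokenize, remove_stop_words, term_frequencies) and the tiny stop word list are hypothetical and chosen for illustration only.

```python
import re
from collections import Counter

# Hypothetical stop word list for illustration; not PyNLP's own list.
STOP_WORDS = {"a", "an", "and", "is", "of", "the", "to"}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop word set."""
    return [t for t in tokens if t not in stop_words]


def term_frequencies(tokens):
    """Count how often each token occurs."""
    return Counter(tokens)


text = "The cat and the hat: the cat is back."
tokens = remove_stop_words(tokenize(text))
freqs = term_frequencies(tokens)
print(freqs.most_common(2))
```

Keeping each step as a separate function lets you swap in a different tokenizer or stop word list without touching the counting code.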
Design Features:
- Lightweight
- PyNLP is multi-core friendly. It uses all available cores for its operations.
- Unit tests are provided for each module as part of the source code. They cover most of the functions and modules.
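The multi-core idea can be sketched with Python's standard multiprocessing module. This is an assumption about the general approach, not PyNLP's actual implementation: the helper names (count_tokens, merge_counts, parallel_frequencies) are hypothetical. Each document is counted in a separate worker process and the partial counts are merged at the end.

```python
from collections import Counter
from multiprocessing import Pool


def count_tokens(document):
    """Count whitespace-separated, lowercased tokens in one document."""
    return Counter(document.lower().split())


def merge_counts(partial_counts):
    """Sum a sequence of Counters into one Counter."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def parallel_frequencies(documents, processes=None):
    """Count tokens across documents using a pool of worker processes.

    With processes=None, Pool picks the worker count from the number
    of CPUs the system reports.
    """
    with Pool(processes=processes) as pool:
        partial = pool.map(count_tokens, documents)
    return merge_counts(partial)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this demo
    # on platforms that start workers by importing the main module.
    docs = ["a b a", "b c", "a c c"]
    print(parallel_frequencies(docs))
```

Because count_tokens and merge_counts are plain functions with no shared state, they can also be checked in ordinary unit tests without starting any worker processes.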
Ideas for the future:
- Combine Tokenize and Preprocess into one package
Help:
- preprocess contains functions for tokenizing
- dataTools helps you import your data
Please note: This project is still in very early stages. Please report bugs to faridani@berkeley.edu
Siamak Faridani
UC Berkeley
July 2010
faridani@berkeley.edu