# UEF
- data
    - http://data.computational-advertising.org/
    - ftp://140.113.213.5/02_LAB/2_畢業資料/碩士/2016/王俊儫/data (a mirror I downloaded from the link above, because the original download is very slow)
- dependent libraries
    - pymongo, scikit-learn
## Library
- lib
    - classifier: CTR prediction models
    - featureDistribution: calculate the conditional mean for each hashed feature
    - featureHasher: a streaming feature hasher for nested dicts, based on sklearn's FeatureHasher
    - featureSelector
        - poissonInclusion.py
        - significantFeatureSelector.py: select features using featureDistribution
    - fileDB: read data
- lr: LR
- lr_fh: LR_FH
- ftrlProximal: FTRL-Proximal
- sem: UEF_SEM
- ssem: UEF_SSEM
- standord: evaluate AUC and log-likelihood for the model
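
To illustrate the ideas behind featureHasher and featureDistribution, here is a minimal sketch. It is not the code in lib. It assumes each record is a nested dict paired with a 0/1 click label, flattens it into `path=value` tokens, and uses CRC32 modulo a bucket count as a stand-in for sklearn's FeatureHasher so that it runs without dependencies. The names `flatten`, `hash_features`, `conditional_means`, and `n_buckets` are hypothetical.

```python
import zlib
from collections import defaultdict


def flatten(record, prefix=""):
    """Yield 'path=value' tokens from a nested dict (token format is an assumption)."""
    for key, value in record.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested dicts, extending the dotted path.
            yield from flatten(value, prefix=path + ".")
        else:
            yield f"{path}={value}"


def hash_features(record, n_buckets=2 ** 20):
    """Hashing trick: map every token to a bucket index (CRC32 is a stand-in hash)."""
    return sorted({zlib.crc32(tok.encode("utf-8")) % n_buckets
                   for tok in flatten(record)})


def conditional_means(stream, n_buckets=2 ** 20):
    """Single pass over (record, click) pairs: mean click rate per hashed feature."""
    clicks = defaultdict(int)
    counts = defaultdict(int)
    for record, click in stream:
        for idx in hash_features(record, n_buckets):
            counts[idx] += 1
            clicks[idx] += click
    return {idx: clicks[idx] / counts[idx] for idx in counts}
```

Because `conditional_means` consumes an iterator, the records never need to be held in memory at once; only the per-bucket counters are kept.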
## Plot Data
- plot
    - e1: Stability of Index Set
    - e2_e3: Select size k vs AUC, Log-likelihood
    - e4: Online Memory Usage
    - e5: Index Set Size k vs Offline Memory Usage
    - e6: Online Run Time
    - e7: Index Set Size k vs Offline Run Time
    - e8_e9: AUC, Log-likelihood
    - e10: Distinguishability of Each Feature
    - e11: Index Set Size k vs Online Memory Usage
    - e12: Offline Memory Usage
    - e13: Index Set Size k vs Online Run Time
    - e14: Offline Run Time
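
The AUC and log-likelihood values plotted in e2_e3 and e8_e9 can be computed along these lines. This is a dependency-free sketch, assuming labels are 0/1 and predictions are click probabilities; it is not the repository's evaluation script, and the function names are hypothetical. scikit-learn's `roc_auc_score` and `log_loss` cover the same ground (`log_loss` returns the negative mean log-likelihood by default).

```python
import math


def log_likelihood(labels, probs, eps=1e-15):
    """Sum of Bernoulli log-likelihoods; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total


def auc(labels, scores):
    """Rank-based AUC (Mann-Whitney U statistic); tied scores get average ranks."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
```

For example, `auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` evaluates to 0.75.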
