
Commit 7f93dfc

DOC: Removing duplicate examples and cross-referencing (#471)
1 parent 56b624c commit 7f93dfc

38 files changed: +124 −1313 lines

doc/combine.rst

Lines changed: 17 additions & 8 deletions
@@ -15,11 +15,11 @@ from over-sampling.
 
 In this regard, Tomek's link and edited nearest-neighbours are the two cleaning
 methods that have been added to the pipeline after applying SMOTE over-sampling
-to obtain a cleaner space. The two ready-to use classes imbalanced-learn implements
-for combining over- and undersampling methods are: (i) :class:`SMOTETomek`
-and (ii) :class:`SMOTEENN`.
+to obtain a cleaner space. The two ready-to-use classes imbalanced-learn
+implements for combining over- and under-sampling methods are: (i)
+:class:`SMOTETomek` [BPM2004]_ and (ii) :class:`SMOTEENN` [BBM2003]_.
 
-Those two classes can be used like any other sampler with parameters identical
+Those two classes can be used like any other sampler with parameters identical
 to their former samplers::
 
     >>> from collections import Counter
@@ -50,7 +50,16 @@ noisy samples than :class:`SMOTETomek`.
    :scale: 60
    :align: center
 
-See :ref:`sphx_glr_auto_examples_combine_plot_smote_enn.py`,
-:ref:`sphx_glr_auto_examples_combine_plot_smote_tomek.py`,
-and
-:ref:`sphx_glr_auto_examples_combine_plot_comparison_combine.py`.
+.. topic:: Examples
+
+  * :ref:`sphx_glr_auto_examples_combine_plot_comparison_combine.py`
+
+.. topic:: References
+
+  .. [BPM2004] G. Batista, R. C. Prati, M. C. Monard. "A study of the behavior
+               of several methods for balancing machine learning training
+               data," ACM SIGKDD Explorations Newsletter 6 (1), 20-29, 2004.
+
+  .. [BBM2003] G. Batista, B. Bazzan, M. Monard, "Balancing Training Data for
+               Automated Annotation of Keywords: a Case Study," In WOB, 10-18,
+               2003.
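To orient readers of this commit, a minimal sketch of the two combined samplers referenced above; the toy dataset built with scikit-learn's ``make_classification`` is an illustrative assumption, not taken from the docs:

    >>> from sklearn.datasets import make_classification
    >>> from imblearn.combine import SMOTEENN, SMOTETomek
    >>> # assumed toy 3-class imbalanced problem
    >>> X, y = make_classification(n_samples=5000, n_classes=3, n_informative=4,
    ...                            weights=[0.01, 0.05, 0.94], random_state=10)
    >>> X_enn, y_enn = SMOTEENN(random_state=0).fit_resample(X, y)  # SMOTE then ENN cleaning
    >>> X_tl, y_tl = SMOTETomek(random_state=0).fit_resample(X, y)  # SMOTE then Tomek cleaning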

doc/ensemble.rst

Lines changed: 3 additions & 3 deletions
@@ -50,7 +50,6 @@ takes the same parameters than the scikit-learn
 ``sampling_strategy`` and ``replacement`` to control the behaviour of the
 random under-sampler::
 
-
     >>> from imblearn.ensemble import BalancedBaggingClassifier
     >>> bbc = BalancedBaggingClassifier(base_estimator=DecisionTreeClassifier(),
     ...                                 sampling_strategy='auto',
@@ -115,8 +114,9 @@ ensemble as::
     >>> balanced_accuracy_score(y_test, y_pred) # doctest: +ELLIPSIS
     0.62484778593026025
 
-See
-:ref:`sphx_glr_auto_examples_ensemble_plot_comparison_ensemble_classifier.py`.
+.. topic:: Examples
+
+  * :ref:`sphx_glr_auto_examples_ensemble_plot_comparison_ensemble_classifier.py`
 
 .. topic:: References
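As a companion to this hunk, a hedged end-to-end sketch of the balanced-bagging usage shown above; the dataset and split are assumptions for illustration, and only the estimator parameters come from the docs:

    >>> from sklearn.datasets import make_classification
    >>> from sklearn.model_selection import train_test_split
    >>> from sklearn.tree import DecisionTreeClassifier
    >>> from sklearn.metrics import balanced_accuracy_score
    >>> from imblearn.ensemble import BalancedBaggingClassifier
    >>> X, y = make_classification(n_samples=10000, weights=[0.99, 0.01],
    ...                            random_state=10)
    >>> X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    >>> bbc = BalancedBaggingClassifier(base_estimator=DecisionTreeClassifier(),
    ...                                 sampling_strategy='auto',
    ...                                 replacement=False,
    ...                                 random_state=0)
    >>> y_pred = bbc.fit(X_train, y_train).predict(X_test)
    >>> score = balanced_accuracy_score(y_test, y_pred)  # value depends on the split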

doc/miscellaneous.rst

Lines changed: 4 additions & 0 deletions
@@ -149,3 +149,7 @@ will be passed to ``fit_generator``::
     ...     X, y, sampler=RandomUnderSampler(), batch_size=10, random_state=42)
     >>> callback_history = model.fit_generator(generator=training_generator,
     ...                                        epochs=10, verbose=0)
+
+.. topic:: References
+
+  * :ref:`sphx_glr_auto_examples_applications_porto_seguro_keras_under_sampling.py`
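The call above is truncated by the diff; in recent imbalanced-learn releases the helper shown in this section is ``balanced_batch_generator`` from ``imblearn.keras`` (an assumption here, and it requires Keras). A minimal sketch with random data purely for illustration:

    >>> import numpy as np
    >>> from imblearn.keras import balanced_batch_generator  # assumed helper; requires keras
    >>> from imblearn.under_sampling import RandomUnderSampler
    >>> X = np.random.uniform(size=(1000, 20)).astype(np.float32)
    >>> y = np.random.binomial(1, 0.05, size=1000)   # roughly 5% positives
    >>> training_generator, steps_per_epoch = balanced_batch_generator(
    ...     X, y, sampler=RandomUnderSampler(), batch_size=10, random_state=42)
    >>> # then, e.g.: model.fit_generator(generator=training_generator,
    >>> #                                 steps_per_epoch=steps_per_epoch, epochs=10)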

doc/over_sampling.rst

Lines changed: 33 additions & 16 deletions
@@ -9,6 +9,9 @@ Over-sampling
 A practical guide
 =================
 
+You can refer to
+:ref:`sphx_glr_auto_examples_over-sampling_plot_comparison_over_sampling.py`.
+
 .. _random_over_sampler:
 
 Naive random over-sampling
@@ -68,18 +71,15 @@ In addition, :class:`RandomOverSampler` allows to sample heterogeneous data
     >>> print(y_resampled)
     [0 0 1 1]
 
-See :ref:`sphx_glr_auto_examples_over-sampling_plot_random_over_sampling.py`
-for usage example.
-
 .. _smote_adasyn:
 
 From random over-sampling to SMOTE and ADASYN
 ---------------------------------------------
 
 Apart from the random sampling with replacement, there are two popular methods
-to over-sample minority classes: (i) the Synthetic Minority Oversampling Technique
-(SMOTE) and (ii) the Adaptive Synthetic (ADASYN) sampling method. These algorithms
-can be used in the same manner::
+to over-sample minority classes: (i) the Synthetic Minority Oversampling
+Technique (SMOTE) [CBHK2002]_ and (ii) the Adaptive Synthetic (ADASYN)
+[HBGL2008]_ sampling method. These algorithms can be used in the same manner::
 
     >>> from imblearn.over_sampling import SMOTE, ADASYN
     >>> X_resampled, y_resampled = SMOTE().fit_resample(X, y)
@@ -91,16 +91,25 @@ can be used in the same manner::
     [(0, 4673), (1, 4662), (2, 4674)]
     >>> clf_adasyn = LinearSVC().fit(X_resampled, y_resampled)
 
-The figure below illustrates the major difference of the different over-sampling
-methods.
+The figure below illustrates the major difference of the different
+over-sampling methods.
 
 .. image:: ./auto_examples/over-sampling/images/sphx_glr_plot_comparison_over_sampling_003.png
    :target: ./auto_examples/over-sampling/plot_comparison_over_sampling.html
    :scale: 60
    :align: center
 
-See :ref:`sphx_glr_auto_examples_over-sampling_plot_smote.py` and
-:ref:`sphx_glr_auto_examples_over-sampling_plot_adasyn.py` for usage example.
+.. topic:: References
+
+  .. [HBGL2008] He, Haibo, Yang Bai, Edwardo A. Garcia, and Shutao Li. "ADASYN:
+                Adaptive synthetic sampling approach for imbalanced learning,"
+                In IEEE International Joint Conference on Neural Networks (IEEE
+                World Congress on Computational Intelligence), pp. 1322-1328,
+                2008.
+
+  .. [CBHK2002] N. V. Chawla, K. W. Bowyer, L. O. Hall, W. P. Kegelmeyer,
+                "SMOTE: synthetic minority over-sampling technique," Journal of
+                artificial intelligence research, 321-357, 2002.
 
 Ill-posed examples
 ------------------
@@ -143,25 +152,33 @@ nearest neighbors class. Those variants are presented in the figure below.
    :align: center
 
 
-The :class:`BorderlineSMOTE` and :class:`SVMSMOTE` offer some variant of the SMOTE
-algorithm::
+The :class:`BorderlineSMOTE` [HWB2005]_ and :class:`SVMSMOTE` [NCK2009]_ offer
+some variant of the SMOTE algorithm::
 
     >>> from imblearn.over_sampling import BorderlineSMOTE
     >>> X_resampled, y_resampled = BorderlineSMOTE().fit_resample(X, y)
     >>> print(sorted(Counter(y_resampled).items()))
     [(0, 4674), (1, 4674), (2, 4674)]
 
-See :ref:`sphx_glr_auto_examples_over-sampling_plot_comparison_over_sampling.py`
-to see a comparison between the different over-sampling methods.
+.. topic:: References
+
+  .. [HWB2005] H. Han, W. Wen-Yuan, M. Bing-Huan, "Borderline-SMOTE: a new
+               over-sampling method in imbalanced data sets learning," Advances
+               in intelligent computing, 878-887, 2005.
+
+  .. [NCK2009] H. M. Nguyen, E. W. Cooper, K. Kamei, "Borderline over-sampling
+               for imbalanced data classification," International Journal of
+               Knowledge Engineering and Soft Data Paradigms, 3(1), pp. 4-21,
+               2009.
 
 Mathematical formulation
 ========================
 
 Sample generation
 -----------------
 
-Both SMOTE and ADASYN use the same algorithm to generate new
-samples. Considering a sample :math:`x_i`, a new sample :math:`x_{new}` will be
+Both SMOTE and ADASYN use the same algorithm to generate new samples.
+Considering a sample :math:`x_i`, a new sample :math:`x_{new}` will be
 generated considering its k nearest-neighbors (corresponding to
 ``k_neighbors``). For instance, the 3 nearest-neighbors are included in the
 blue circle as illustrated in the figure below. Then, one of these
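The text above describes how SMOTE and ADASYN interpolate a synthetic sample between :math:`x_i` and one of its nearest neighbours :math:`x_{zi}`, i.e. :math:`x_{new} = x_i + \lambda (x_{zi} - x_i)` with :math:`\lambda \in [0, 1]`. A minimal numpy sketch of that generation step; names and values are illustrative:

    >>> import numpy as np
    >>> rng = np.random.RandomState(0)
    >>> x_i = np.array([1.0, 2.0])         # sample being over-sampled
    >>> x_zi = np.array([2.0, 3.0])        # one of its k nearest neighbours
    >>> lam = rng.uniform(0, 1)            # lambda drawn uniformly in [0, 1]
    >>> x_new = x_i + lam * (x_zi - x_i)   # synthetic sample on the segment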

doc/under_sampling.rst

Lines changed: 59 additions & 50 deletions
@@ -6,6 +6,9 @@ Under-sampling
 
 .. currentmodule:: imblearn.under_sampling
 
+You can refer to
+:ref:`sphx_glr_auto_examples_under-sampling_plot_comparison_under_sampling.py`.
+
 .. _cluster_centroids:
 
 Prototype generation
@@ -55,9 +58,6 @@ original one.
    generated are not specifically sparse. Therefore, even if the resulting
    matrix will be sparse, the algorithm will be inefficient in this regard.
 
-See :ref:`sphx_glr_auto_examples_under-sampling_plot_cluster_centroids.py` and
-:ref:`sphx_glr_auto_examples_under-sampling_plot_comparison_under_sampling.py`.
-
 Prototype selection
 ===================

@@ -116,13 +116,9 @@ In addition, :class:`RandomUnderSampler` allows to sample heterogeneous data
     >>> print(y_resampled)
     [0 1]
 
-See :ref:`sphx_glr_auto_examples_plot_sampling_strategy_usage.py`,
-:ref:`sphx_glr_auto_examples_under-sampling_plot_comparison_under_sampling.py`,
-and :ref:`sphx_glr_auto_examples_under-sampling_plot_random_under_sampler.py`.
-
-:class:`NearMiss` adds some heuristic rules to select
-samples. :class:`NearMiss` implements 3 different types of heuristic which can
-be selected with the parameter ``version``::
+:class:`NearMiss` adds some heuristic rules to select samples [MZ2003]_.
+:class:`NearMiss` implements 3 different types of heuristic which can be
+selected with the parameter ``version``::
 
     >>> from imblearn.under_sampling import NearMiss
     >>> nm1 = NearMiss(version=1)
@@ -137,10 +133,12 @@ from scikit-learn. The former parameter is used to compute the average distance
 to the neighbors while the latter is used for the pre-selection of the samples
 of interest.
 
-See
-:ref:`sphx_glr_auto_examples_applications_plot_multi_class_under_sampling.py`,
-:ref:`sphx_glr_auto_examples_under-sampling_plot_comparison_under_sampling.py`,
-and :ref:`sphx_glr_auto_examples_under-sampling_plot_nearmiss.py`.
+.. topic:: References
+
+  .. [MZ2003] I. Mani, I. Zhang. "kNN approach to unbalanced data
+              distributions: a case study involving information extraction," In
+              Proceedings of workshop on learning from imbalanced datasets,
+              2003.
 
 Mathematical formulation
 ^^^^^^^^^^^^^^^^^^^^^^^^
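Before the mathematical formulation continues below, a hedged sketch of the three NearMiss heuristics discussed in this hunk; the toy dataset is an assumption:

    >>> from sklearn.datasets import make_classification
    >>> from imblearn.under_sampling import NearMiss
    >>> X, y = make_classification(n_samples=5000, weights=[0.1, 0.9],
    ...                            random_state=0)
    >>> for version in (1, 2, 3):  # the three heuristics selected via ``version``
    ...     X_res, y_res = NearMiss(version=version).fit_resample(X, y)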
@@ -194,9 +192,6 @@ affected by noise due to the first step sample selection.
    :scale: 60
    :align: center
 
-See
-:ref:`sphx_glr_auto_examples_under-sampling_plot_illustration_nearmiss.py`.
-
 Cleaning under-sampling techniques
 ----------------------------------

@@ -209,9 +204,9 @@ which will clean the dataset.
 Tomek's links
 ^^^^^^^^^^^^^
 
-:class:`TomekLinks` detects the so-called Tomek's links. A Tomek's link
-between two samples of different class :math:`x` and :math:`y` is defined such
-that there is no example :math:`z` such that:
+:class:`TomekLinks` detects the so-called Tomek's links [T2010]_. A Tomek's
+link between two samples of different class :math:`x` and :math:`y` is defined
+such that there is no example :math:`z` such that:
 
 .. math::
@@ -238,10 +233,10 @@ figure illustrates this behaviour.
    :scale: 60
    :align: center
 
-See
-:ref:`sphx_glr_auto_examples_under-sampling_plot_illustration_tomek_links.py`
-and
-:ref:`sphx_glr_auto_examples_under-sampling_plot_tomek_links.py`.
+.. topic:: References
+
+  .. [T2010] I. Tomek, "Two modifications of CNN," In Systems, Man, and
+             Cybernetics, IEEE Transactions on, vol. 6, pp. 769-772, 2010.
 
 .. _edited_nearest_neighbors:
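Before the edited-nearest-neighbours section below, a hedged usage sketch for :class:`TomekLinks` to go with the [T2010]_ reference above; the dataset is assumed, and by default only the majority-class member of each link is removed:

    >>> from sklearn.datasets import make_classification
    >>> from imblearn.under_sampling import TomekLinks
    >>> X, y = make_classification(n_samples=5000, weights=[0.1, 0.9],
    ...                            random_state=0)
    >>> X_res, y_res = TomekLinks().fit_resample(X, y)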

@@ -250,7 +245,7 @@ Edited data set using nearest neighbours
 
 :class:`EditedNearestNeighbours` applies a nearest-neighbors algorithm and
 "edit" the dataset by removing samples which do not agree "enough" with their
-neighboorhood. For each sample in the class to be under-sampled, the
+neighbourhood [W1972]_. For each sample in the class to be under-sampled, the
 nearest-neighbours are computed and if the selection criterion is not
 fulfilled, the sample is removed. Two selection criteria are currently
 available: (i) the majority (i.e., ``kind_sel='mode'``) or (ii) all (i.e.,
@@ -270,8 +265,8 @@ The parameter ``n_neighbors`` allows to give a classifier subclassed from
 the decision to keep a given sample or not.
 
 :class:`RepeatedEditedNearestNeighbours` extends
-:class:`EditedNearestNeighbours` by repeating the algorithm multiple times.
-Generally, repeating the algorithm will delete more data::
+:class:`EditedNearestNeighbours` by repeating the algorithm multiple times
+[T1976]_. Generally, repeating the algorithm will delete more data::
 
     >>> from imblearn.under_sampling import RepeatedEditedNearestNeighbours
     >>> renn = RepeatedEditedNearestNeighbours()
@@ -281,7 +276,7 @@ Generally, repeating the algorithm will delete more data::
 
 :class:`AllKNN` differs from the previous
 :class:`RepeatedEditedNearestNeighbours` since the number of neighbors of the
-internal nearest neighbors algorithm is increased at each iteration::
+internal nearest neighbors algorithm is increased at each iteration [T1976]_::
 
     >>> from imblearn.under_sampling import AllKNN
     >>> allknn = AllKNN()
@@ -297,19 +292,24 @@ impact by cleaning noisy samples next to the boundaries of the classes.
    :scale: 60
    :align: center
 
-See
-:ref:`sphx_glr_auto_examples_pipeline_plot_pipeline_classification.py`,
-:ref:`sphx_glr_auto_examples_under-sampling_plot_comparison_under_sampling.py`,
-and :ref:`sphx_glr_auto_examples_under-sampling_plot_enn_renn_allknn.py`.
+.. topic:: References
+
+  .. [W1972] D. Wilson, "Asymptotic Properties of Nearest Neighbor Rules Using
+             Edited Data," In IEEE Transactions on Systems, Man, and
+             Cybernetics, vol. 2 (3), pp. 408-421, 1972.
+
+  .. [T1976] I. Tomek, "An Experiment with the Edited Nearest-Neighbor
+             Rule," IEEE Transactions on Systems, Man, and Cybernetics, vol.
+             6(6), pp. 448-452, June 1976.
 
 .. _condensed_nearest_neighbors:
 
 Condensed nearest neighbors and derived algorithms
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 :class:`CondensedNearestNeighbour` uses a 1 nearest neighbor rule to
-iteratively decide if a sample should be removed or not. The algorithm is
-running as followed:
+iteratively decide if a sample should be removed or not [H1968]_. The
+algorithm runs as follows:
 
 1. Get all minority samples in a set :math:`C`.
 2. Add a sample from the targeted class (class to be under-sampled) in
@@ -331,10 +331,10 @@ However as illustrated in the figure below, :class:`CondensedNearestNeighbour`
 is sensitive to noise and will add noisy samples.
 
 On the contrary, :class:`OneSidedSelection` will use :class:`TomekLinks` to
-remove noisy samples. In addition, the 1 nearest neighbor rule is applied to
-all samples and the one which are misclassified will be added to the set
-:math:`C`. No iteration on the set :math:`S` will take place. The class can be
-used as::
+remove noisy samples [KM1997]_. In addition, the 1 nearest neighbor rule is
+applied to all samples and the ones which are misclassified will be added to the
+set :math:`C`. No iteration on the set :math:`S` will take place. The class can
+be used as::
 
     >>> from imblearn.under_sampling import OneSidedSelection
     >>> oss = OneSidedSelection(random_state=0)
@@ -346,9 +346,9 @@ Our implementation offer to set the number of seeds to put in the set :math:`C`
 originally by setting the parameter ``n_seeds_S``.
 
 :class:`NeighbourhoodCleaningRule` will focus on cleaning the data rather than
-condensing them. Therefore, it will used the union of samples to be rejected
-between the :class:`EditedNearestNeighbours` and the output a 3 nearest
-neighbors classifier. The class can be used as::
+condensing them [J2001]_. Therefore, it will use the union of samples to be
+rejected between the :class:`EditedNearestNeighbours` and the output of a 3
+nearest neighbors classifier. The class can be used as::
 
     >>> from imblearn.under_sampling import NeighbourhoodCleaningRule
     >>> ncr = NeighbourhoodCleaningRule()
@@ -361,11 +361,18 @@ neighbors classifier. The class can be used as::
    :scale: 60
    :align: center
 
-See
-:ref:`sphx_glr_auto_examples_under-sampling_plot_comparison_under_sampling.py`,
-:ref:`sphx_glr_auto_examples_under-sampling_plot_condensed_nearest_neighbour.py`,
-:ref:`sphx_glr_auto_examples_under-sampling_plot_one_sided_selection.py`, and
-:ref:`sphx_glr_auto_examples_under-sampling_plot_neighbourhood_cleaning_rule.py`.
+.. topic:: References
+
+  .. [H1968] P. Hart, "The condensed nearest neighbor rule,"
+             In Information Theory, IEEE Transactions on, vol. 14(3), pp.
+             515-516, 1968.
+
+  .. [KM1997] M. Kubat, S. Matwin, "Addressing the curse of imbalanced training
+              sets: one-sided selection," In ICML, vol. 97, pp. 179-186, 1997.
+
+  .. [J2001] J. Laurikkala, "Improving identification of difficult small
+             classes by balancing class distribution," Springer Berlin
+             Heidelberg, 2001.
 
 .. _instance_hardness_threshold:
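Before the instance-hardness section below, a hedged sketch tying the three references above back to code; the dataset is an assumption, and these samplers can be slow on large data:

    >>> from sklearn.datasets import make_classification
    >>> from imblearn.under_sampling import (CondensedNearestNeighbour,
    ...                                      OneSidedSelection,
    ...                                      NeighbourhoodCleaningRule)
    >>> X, y = make_classification(n_samples=1000, weights=[0.1, 0.9],
    ...                            random_state=0)
    >>> X_cnn, y_cnn = CondensedNearestNeighbour(random_state=0).fit_resample(X, y)
    >>> X_oss, y_oss = OneSidedSelection(random_state=0).fit_resample(X, y)
    >>> X_ncr, y_ncr = NeighbourhoodCleaningRule().fit_resample(X, y)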

@@ -374,7 +381,7 @@ Instance hardness threshold
 
 :class:`InstanceHardnessThreshold` is a specific algorithm in which a
 classifier is trained on the data and the samples with lower probabilities are
-removed. The class can be used as::
+removed [SMMG2014]_. The class can be used as::
 
     >>> from sklearn.linear_model import LogisticRegression
     >>> from imblearn.under_sampling import InstanceHardnessThreshold
@@ -403,6 +410,8 @@ The figure below gives another examples on some toy data.
    :scale: 60
    :align: center
 
-See
-:ref:`sphx_glr_auto_examples_under-sampling_plot_comparison_under_sampling.py`,
-:ref:`sphx_glr_auto_examples_under-sampling_plot_instance_hardness_threshold.py`.
+.. topic:: References
+
+  .. [SMMG2014] M. R. Smith, T. Martinez, C. Giraud-Carrier. "An instance
+                level analysis of data complexity." Machine Learning 95.2
+                (2014): 225-256.
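Finally, a hedged sketch for :class:`InstanceHardnessThreshold`, matching the imports shown in this hunk; the dataset and solver choice are assumptions:

    >>> from sklearn.datasets import make_classification
    >>> from sklearn.linear_model import LogisticRegression
    >>> from imblearn.under_sampling import InstanceHardnessThreshold
    >>> X, y = make_classification(n_samples=5000, weights=[0.1, 0.9],
    ...                            random_state=0)
    >>> iht = InstanceHardnessThreshold(
    ...     estimator=LogisticRegression(solver='lbfgs'), random_state=0)
    >>> X_res, y_res = iht.fit_resample(X, y)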
