22Quick start guide
33=================
44
5- In the following we provide some pointers about which functions and classes
5+ In the following we provide some pointers about which functions and classes
66to use for different problems related to optimal transport (OT) and machine
77learning. We refer when we can to concrete examples in the documentation that
88are also available as notebooks on the POT Github.
99
1010This document is not a tutorial on numerical optimal transport. For that we strongly
11- recommend reading the very nice book [15]_.
11+ recommend reading the very nice book [15]_.
1212
1313
1414Optimal transport and Wasserstein distance
@@ -55,8 +55,8 @@ solver is quite efficient and uses sparsity of the solution.
5555 Examples of use for :any: `ot.emd ` are available in :
5656
5757 - :any: `auto_examples/plot_OT_2D_samples `
58- - :any: `auto_examples/plot_OT_1D `
59- - :any: `auto_examples/plot_OT_L1_vs_L2 `
58+ - :any: `auto_examples/plot_OT_1D `
59+ - :any: `auto_examples/plot_OT_L1_vs_L2 `
6060
6161
6262Computing Wasserstein distance
@@ -102,13 +102,13 @@ distance.
102102 An example of use for :any: `ot.emd2 ` is available in :
103103
104104 - :any: `auto_examples/plot_compute_emd `
105-
105+
106106
107107Special cases
108108^^^^^^^^^^^^^
109109
110110Note that the OT problem and the corresponding Wasserstein distance can in some
111- special cases be computed very efficiently.
111+ special cases be computed very efficiently.
112112
113113For instance, when the samples are in 1D, the OT problem can be solved in
114114:math:`O(n\log(n))` by simply sorting the samples. In this case we provide the
@@ -117,13 +117,13 @@ matrix and value. Note that since the solution is very sparse the :code:`sparse`
117117parameter of :any: `ot.emd_1d ` allows for solving and returning the solution for
118118very large problems. Note that in order to compute directly the :math: `W_p`
119119Wasserstein distance in 1D we provide the function :any: `ot.wasserstein_1d ` that
120- takes :code: `p ` as a parameter.
120+ takes :code: `p ` as a parameter.
121121
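The sorting argument can be illustrated with a minimal sketch in plain Python. It assumes that both distributions have the same number of uniformly weighted samples, in which case the optimal plan matches the sorted samples one to one. The helper name ``wasserstein_1d_sorted`` is hypothetical and is not part of POT.

```python
def wasserstein_1d_sorted(x, y, p=1):
    """W_p between two 1D empirical distributions.

    Illustrative helper (not a POT function). Assumes len(x) == len(y)
    and uniform weights on the samples.
    """
    if len(x) != len(y):
        raise ValueError("this sketch assumes the same number of samples")
    # Sorting costs O(n log n); the optimal plan pairs the sorted samples.
    xs, ys = sorted(x), sorted(y)
    cost = sum(abs(xi - yi) ** p for xi, yi in zip(xs, ys)) / len(xs)
    return cost ** (1.0 / p)


x = [2.0, 0.0, 1.0]
y = [3.0, 1.0, 2.0]  # x shifted by 1
w1 = wasserstein_1d_sorted(x, y, p=1)
w2 = wasserstein_1d_sorted(x, y, p=2)
```

Since ``y`` is a translation of ``x`` by 1, every matched pair is at distance 1, so the distance does not depend on ``p`` in this example.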
122122Another special case for estimating OT and Monge mapping is between Gaussian
123123distributions. In this case there exists a closed-form solution given in Remark
1241242.29 in [15]_ and the Monge mapping is an affine function that can
125125also be computed from the covariances and means of the source and target
126- distributions. When the finite sample dataset is assumed to be Gaussian, we provide
126+ distributions. When the finite sample dataset is assumed to be Gaussian, we provide
127127:any: `ot.da.OT_mapping_linear ` that returns the parameters for the Monge
128128mapping.
129129
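To make the affine structure concrete, here is a minimal sketch restricted to one dimension, where the Monge mapping between two Gaussians reduces to :math:`T(x) = m_t + \frac{\sigma_t}{\sigma_s}(x - m_s)`. The function name ``fit_linear_monge_1d`` is hypothetical and only mimics the idea of estimating the mapping parameters from samples. It is not the POT implementation, which works with full covariance matrices.

```python
import statistics


def fit_linear_monge_1d(xs, xt):
    """Estimate (a, b) such that T(x) = a * x + b maps the Gaussian fitted
    on xs to the Gaussian fitted on xt (1D only, illustrative helper)."""
    ms, mt = statistics.mean(xs), statistics.mean(xt)
    # Population standard deviations of the source and target samples.
    ss, st = statistics.pstdev(xs), statistics.pstdev(xt)
    a = st / ss        # ratio of standard deviations
    b = mt - a * ms    # shift so that the source mean maps to the target mean
    return a, b


xs = [0.0, 1.0, 2.0, 3.0]
xt = [10.0, 12.0, 14.0, 16.0]
a_lin, b_lin = fit_linear_monge_1d(xs, xt)
mapped = [a_lin * x + b_lin for x in xs]
```

In this toy example the target samples are an exact affine transform of the source samples, so the estimated mapping sends each source sample onto the corresponding target sample.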
@@ -176,7 +176,7 @@ solution of the resulting optimization problem can be expressed as:
176176 where :math: `u` and :math: `v` are vectors and :math: `K=\exp (-M/\lambda )` where
177177the :math: `\exp ` is taken component-wise. In order to solve the optimization
178178problem, one can use an alternating projection algorithm called Sinkhorn-Knopp that can be very
179- efficient for large values of regularization.
179+ efficient for large values of regularization.
180180
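The scaling form of the solution suggests a very short algorithm. The following is a minimal sketch of the Sinkhorn-Knopp iterations in plain Python, written for readability rather than speed. The function name ``sinkhorn_knopp_sketch`` and the fixed number of iterations are assumptions made for this illustration, and the code is not the POT implementation.

```python
import math


def sinkhorn_knopp_sketch(a, b, M, reg, n_iter=1000):
    """Entropic OT plan diag(u) K diag(v) with K = exp(-M / reg).

    Illustrative helper: a and b are lists of marginal weights and M is a
    list of rows of the cost matrix.
    """
    n, m = len(a), len(b)
    K = [[math.exp(-M[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows so that the row marginal of the plan equals a.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        # Rescale columns so that the column marginal of the plan equals b.
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


a = [0.5, 0.5]
b = [0.25, 0.75]
M = [[0.0, 1.0], [1.0, 0.0]]
G = sinkhorn_knopp_sketch(a, b, M, reg=0.5)
row_sums = [sum(row) for row in G]
col_sums = [sum(G[i][j] for i in range(len(a))) for j in range(len(b))]
```

After convergence the row and column sums of ``G`` match ``a`` and ``b``. For small values of the regularization the entries of ``K`` underflow, which is why stabilized variants of the algorithm exist.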
181181The Sinkhorn-Knopp algorithm is implemented in :any: `ot.sinkhorn ` and
182182:any: `ot.sinkhorn2 ` that return respectively the OT matrix and the value of the
@@ -201,12 +201,12 @@ More details about the algorithms used are given in the following note.
201201 + :code: `method='sinkhorn' ` calls :any: `ot.bregman.sinkhorn_knopp ` the
202202 classic algorithm [2 ]_.
203203 + :code: `method='sinkhorn_stabilized' ` calls :any: `ot.bregman.sinkhorn_stabilized ` the
204- log stabilized version of the algorithm [9 ]_.
204+ log stabilized version of the algorithm [9 ]_.
205205 + :code: `method='sinkhorn_epsilon_scaling' ` calls
206206 :any: `ot.bregman.sinkhorn_epsilon_scaling ` the epsilon scaling version
207- of the algorithm [9 ]_.
207+ of the algorithm [9 ]_.
208208 + :code: `method='greenkhorn' ` calls :any: `ot.bregman.greenkhorn ` the
209- greedy Sinkhorn version of the algorithm [22]_.
209+ greedy Sinkhorn version of the algorithm [22]_.
210210
211211 In addition to all those variants of sinkhorn, we have another
212212 implementation solving the problem in the smooth dual or semi-dual in
@@ -236,7 +236,7 @@ of algorithms in [18]_ [19]_.
236236 Examples of use for :any: `ot.sinkhorn ` are available in :
237237
238238 - :any: `auto_examples/plot_OT_2D_samples `
239- - :any: `auto_examples/plot_OT_1D `
239+ - :any: `auto_examples/plot_OT_1D `
240240 - :any: `auto_examples/plot_OT_1D_smooth `
241241 - :any: `auto_examples/plot_stochastic `
242242
@@ -248,13 +248,13 @@ While entropic OT is the most common and favored in practice, there exist other
248248kinds of regularization. We provide in POT two specific solvers for other
249249regularization terms, namely quadratic regularization and group lasso
250250regularization. But we also provide in :any:`ot.optim` two generic solvers that allow solving any
251- smooth regularization in practice.
251+ smooth regularization in practice.
252252
253253Quadratic regularization
254254""""""""""""""""""""""""
255255
256256The first general regularization term we can solve is the quadratic
257- regularization of the form
257+ regularization of the form
258258
259259.. math ::
260260 \Omega (\gamma )=\sum _{i,j} \gamma _{i,j}^2
@@ -264,7 +264,7 @@ densifying the OT matrix but it keeps some sort of sparsity that is lost with
264264entropic regularization as soon as :math: `\lambda >0 ` [17 ]_. This problem can be
265265solved with POT using solvers from :any: `ot.smooth `, more specifically
266266functions :any: `ot.smooth.smooth_ot_dual ` or
267- :any: `ot.smooth.smooth_ot_semi_dual ` with parameter :code: `reg_type='l2' ` to
267+ :any: `ot.smooth.smooth_ot_semi_dual ` with parameter :code: `reg_type='l2' ` to
268268choose the quadratic regularization.
269269
270270.. hint ::
@@ -300,7 +300,7 @@ gradient algorithm [7]_ in function
300300.. hint ::
301301 Examples of group Lasso regularization are available in :
302302
303- - :any: `auto_examples/plot_otda_classes `
303+ - :any: `auto_examples/plot_otda_classes `
304304 - :any: `auto_examples/plot_otda_d2 `
305305
306306
@@ -311,7 +311,7 @@ Finally we propose in POT generic solvers that can be used to solve any
311311regularization as long as you can provide a function computing the
312312regularization and a function computing its gradient (or sub-gradient).
313313
314- In order to solve
314+ In order to solve
315315
316316.. math ::
317317 \gamma ^* = arg\min _\gamma \quad \sum _{i,j}\gamma _{i,j}M_{i,j} + \lambda\Omega (\gamma )
@@ -336,12 +336,12 @@ Another generic solver is proposed to solve the problem
336336 where :math: `\Omega _e` is the entropic regularization. In this case we use a
337337generalized conditional gradient [7 ]_ implemented in :any: `ot.optim.gcg ` that
338338does not linearize the entropic term but
339- relies on :any: `ot.sinkhorn ` for its iterations.
339+ relies on :any: `ot.sinkhorn ` for its iterations.
340340
341341.. hint ::
342342 Examples of generic solvers are available in :
343343
344- - :any: `auto_examples/plot_optim_OTreg `
344+ - :any: `auto_examples/plot_optim_OTreg `
345345
346346
347347Wasserstein Barycenters
@@ -382,7 +382,7 @@ solver :any:`ot.lp.barycenter` that rely on generic LP solvers. By default the
382382function uses :any: `scipy.optimize.linprog `, but more efficient LP solvers from
383383cvxopt can also be used by changing the parameter :code:`solver`. Note that this problem
384384requires solving a very large linear program and can be very slow in
385- practice.
385+ practice.
386386
387387Similarly to the OT problem, OT barycenters can be computed in the regularized
388388case. When entropic regularization is used, the problem can be solved with a
@@ -403,11 +403,11 @@ operators. We provide an implementation of this algorithm in function
403403 Examples of Wasserstein (:any: `ot.lp.barycenter `) and regularized Wasserstein
404404 barycenter (:any: `ot.bregman.barycenter `) computation are available in :
405405
406- - :any: `auto_examples/plot_barycenter_1D `
407- - :any: `auto_examples/plot_barycenter_lp_vs_entropic `
406+ - :any: `auto_examples/plot_barycenter_1D `
407+ - :any: `auto_examples/plot_barycenter_lp_vs_entropic `
408408
409409 An example of convolutional barycenter
410- (:any: `ot.bregman.convolutional_barycenter2d `) computation
410+ (:any: `ot.bregman.convolutional_barycenter2d `) computation
411411 for 2D images is available
412412 in :
413413
@@ -451,13 +451,13 @@ optimal mapping is still an open problem in the general case but has been proven
451451for smooth distributions by Brenier in his eponymous `theorem
452452<https://who.rocq.inria.fr/Jean-David.Benamou/demiheure.pdf> `__. We provide in
453453:any: `ot.da ` several solvers for smooth Monge mapping estimation and domain
454- adaptation from discrete distributions.
454+ adaptation from discrete distributions.
455455
456456Monge Mapping estimation
457457^^^^^^^^^^^^^^^^^^^^^^^^
458458
459459We now discuss several approaches that are implemented in POT to estimate or
460- approximate a Monge mapping from finite distributions.
460+ approximate a Monge mapping from finite distributions.
461461
462462First note that when the source and target distributions are assumed to be Gaussian
463463distributions, there exists a closed-form solution for the mapping and it is an
@@ -513,16 +513,16 @@ A list of the provided implementation is given in the following note.
513513
514514 Here is a list of the OT mapping classes inheriting from
515515 :any: `ot.da.BaseTransport `
516-
516+
517517 * :any: `ot.da.EMDTransport ` : Barycentric mapping with EMD transport
518518 * :any: `ot.da.SinkhornTransport ` : Barycentric mapping with Sinkhorn transport
519519 * :any: `ot.da.SinkhornL1l2Transport ` : Barycentric mapping with Sinkhorn +
520520 group Lasso regularization [5 ]_
521521 * :any: `ot.da.SinkhornLpl1Transport ` : Barycentric mapping with Sinkhorn +
522- non convex group Lasso regularization [5 ]_
522+ non convex group Lasso regularization [5 ]_
523523 * :any: `ot.da.LinearTransport ` : Linear mapping estimation between Gaussians
524524 [14 ]_
525- * :any: `ot.da.MappingTransport ` : Nonlinear mapping estimation [8 ]_
525+ * :any: `ot.da.MappingTransport ` : Nonlinear mapping estimation [8 ]_
526526
527527.. hint ::
528528
@@ -550,7 +550,7 @@ consist in finding a linear projector optimizing the following criterion
550550.. math ::
551551 P = \text {arg}\min _P \frac {\sum _i OT_e(\mu _i\# P,\mu _i\# P)}{\sum _{i,j\neq i}
552552 OT_e(\mu _i\# P,\mu _j\# P)}
553-
553+
554554 where :math: `\#` is the push-forward operator, :math: `OT_e` is the entropic OT
555555loss and :math: `\mu _i` is the
556556distribution of samples from class :math: `i`. :math: `P` is also constrained to
@@ -575,10 +575,10 @@ respectively. Note that we also provide the Fisher discriminant estimator in
575575Unbalanced optimal transport
576576^^^^^^^^^^^^^^^^^^^^^^^^^^^^
577577
578- Unbalanced OT is a relaxation of the original OT problem where the violation of
578+ Unbalanced OT is a relaxation of the entropy regularized OT problem where the violation of
579579the constraint on the marginals is added to the objective of the optimization
580580problem:
581-
581+
582582.. math ::
583583 \min _\gamma \quad \sum _{i,j}\gamma _{i,j}M_{i,j} + reg\cdot\Omega (\gamma ) + \alpha KL(\gamma 1 , a) + \alpha KL(\gamma ^T 1 , b)
584584
@@ -589,9 +589,24 @@ where KL is the Kullback-Leibler divergence. This formulation allows for
589589computing approximate mapping between distributions that do not have the same
590590amount of mass. Interestingly the problem can be solved with a generalization of
591591the Bregman projections algorithm [10 ]_. We provide a solver for unbalanced OT
592- in :any: `ot.unbalanced ` and more specifically
593- in function :any: `ot.sinkhorn_unbalanced `. A solver for unbalanced OT barycenter
594- is available in :any: `ot.barycenter_unbalanced `.
592+ in :any: `ot.unbalanced `. Computing the optimal transport
593+ plan or the transport cost is similar to the balanced case. The Sinkhorn-Knopp
594+ algorithm is implemented in :any: `ot.sinkhorn_unbalanced ` and :any: `ot.sinkhorn_unbalanced2 `
595+ that return respectively the OT matrix and the value of the
596+ linear term. Note that the regularization parameter :math: `\alpha ` in the
597+ equation above is given to those functions with the parameter :code: `reg_m `.
598+
599+ Similarly, unbalanced OT barycenters can be computed using :any:`ot.barycenter_unbalanced`.
600+
601+ .. note ::
602+ The main function to solve entropic regularized unbalanced OT is :any:`ot.sinkhorn_unbalanced`.
603+ This function is a wrapper and the parameter :code:`method` helps you select
604+ the actual algorithm used to solve the problem:
605+
606+ + :code: `method='sinkhorn' ` calls :any: `ot.unbalanced.sinkhorn_knopp_unbalanced `
607+ the generalized Sinkhorn algorithm [10 ]_.
608+ + :code: `method='sinkhorn_stabilized' ` calls :any: `ot.unbalanced.sinkhorn_stabilized_unbalanced `
609+ the log stabilized version of the algorithm [10 ]_.
595610
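As an illustration of how the KL relaxation changes the iterations, here is a minimal sketch of the generalized Sinkhorn scaling in plain Python. It assumes the standard update for KL marginal penalties, in which each scaling vector is raised to the power ``reg_m / (reg_m + reg)``. The function name ``unbalanced_sinkhorn_sketch`` and the fixed number of iterations are hypothetical choices for this example, and the code is not the POT implementation.

```python
import math


def unbalanced_sinkhorn_sketch(a, b, M, reg, reg_m, n_iter=2000):
    """Entropic unbalanced OT plan with KL penalties on both marginals.

    Illustrative helper: a and b may have different total mass.
    """
    n, m = len(a), len(b)
    K = [[math.exp(-M[i][j] / reg) for j in range(m)] for i in range(n)]
    # Exponent that softens the marginal constraints; it tends to 1
    # (balanced Sinkhorn) when reg_m goes to infinity.
    fi = reg_m / (reg_m + reg)
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        u = [(a[i] / sum(K[i][j] * v[j] for j in range(m))) ** fi
             for i in range(n)]
        v = [(b[j] / sum(K[i][j] * u[i] for i in range(n))) ** fi
             for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


M = [[0.0, 1.0], [1.0, 0.0]]

# Marginals with different total mass (1 versus 2).
G_unb = unbalanced_sinkhorn_sketch([0.5, 0.5], [1.0, 1.0], M, reg=0.5, reg_m=1.0)

# With a very large reg_m the plan approaches the balanced solution.
a_bal, b_bal = [0.5, 0.5], [0.25, 0.75]
G_bal = unbalanced_sinkhorn_sketch(a_bal, b_bal, M, reg=0.5, reg_m=1e6)
rows_bal = [sum(row) for row in G_bal]
```

The value of ``reg_m`` controls how strongly the marginals of the plan are tied to ``a`` and ``b``: small values tolerate large violations, while large values recover the behaviour of the balanced solver.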
596611
597612.. hint ::
@@ -636,7 +651,7 @@ barycenters that can be expressed as
636651
637652 where :math: `Ck` is the distance matrix between samples in distribution
638653:math: `k`. Note that interestingly the barycenter is defined as a symmetric
639- positive matrix. We provide a block coordinate optimization procedure in
654+ positive matrix. We provide a block coordinate optimization procedure in
640655:any: `ot.gromov.gromov_barycenters ` and
641656:any: `ot.gromov.entropic_gromov_barycenters ` for non-regularized and regularized
642657barycenters respectively.
@@ -654,19 +669,19 @@ The implementations of FGW and FGW barycenter is provided in functions
654669 Examples of computation of GW, regularized GW and FGW are available in :
655670
656671 - :any: `auto_examples/plot_gromov `
657- - :any: `auto_examples/plot_fgw `
672+ - :any: `auto_examples/plot_fgw `
658673
659674 Examples of GW, regularized GW and FGW barycenters are available in :
660675
661676 - :any: `auto_examples/plot_gromov_barycenter `
662- - :any: `auto_examples/plot_barycenter_fgw `
677+ - :any: `auto_examples/plot_barycenter_fgw `
663678
664679
665680GPU acceleration
666681^^^^^^^^^^^^^^^^
667682
668683We provide several implementations of our OT solvers in :any:`ot.gpu`. Those
669- implementations use the :code:`cupy` toolbox, which needs to be installed.
684+ implementations use the :code:`cupy` toolbox, which needs to be installed.
670685
671686
672687.. note ::
7017161. **How to solve a discrete optimal transport problem ? **
702717
703718 The solver for discrete OT is the function :py:mod: `ot.emd ` that returns
704- the OT transport matrix. If you want to solve a regularized OT you can
719+ the OT transport matrix. If you want to solve a regularized OT you can
705720 use :py:mod: `ot.sinkhorn `.
706721
707722
714729 T= ot.emd(a,b,M) # exact linear program
715730 T_reg= ot.sinkhorn(a,b,M,reg) # entropic regularized OT
716731
717- A more detailed example is available in:
732+ A more detailed example is available in:
718733 :doc: `auto_examples/plot_OT_2D_samples `
719-
734+
720735
7217362. **pip install POT fails with error : ImportError: No module named Cython.Build **
722737
726741 installing POT.
727742
728743 Note that this problem does not occur when using conda-forge since the packages
729- there are pre-compiled.
744+ there are pre-compiled.
730745
731746 See `Issue #59 <https://github.com/rflamary/POT/issues/59 >`__ for more
732747 details.
751766 In order to limit import time and hard dependencies in POT, we do not import
752767 some sub-modules automatically with :code:`import ot`. In order to use the
753768 acceleration in :any:`ot.gpu` you first need to import it with
754- :code: `import ot.gpu `.
769+ :code: `import ot.gpu `.
755770
756771 See `Issue #85 <https://github.com/rflamary/POT/issues/85 >`__ and :any: `ot.gpu `
757772 for more details.
@@ -763,7 +778,7 @@ References
763778.. [1 ] Bonneel, N., Van De Panne, M., Paris, S., & Heidrich, W. (2011,
764779 December). `Displacement interpolation using Lagrangian mass transport
765780 <https://people.csail.mit.edu/sparis/publi/2011/sigasia/Bonneel_11_Displacement_Interpolation.pdf> `__.
766- In ACM Transactions on Graphics (TOG) (Vol. 30, No. 6, p. 158). ACM.
781+ In ACM Transactions on Graphics (TOG) (Vol. 30, No. 6, p. 158). ACM.
767782
768783 .. [2 ] Cuturi, M. (2013). `Sinkhorn distances: Lightspeed computation of
769784 optimal transport <https://arxiv.org/pdf/1306.0895.pdf> `__. In Advances
@@ -874,4 +889,4 @@ References
874889 .. [24 ] Vayer, T., Chapel, L., Flamary, R., Tavenard, R. and Courty, N.
875890 (2019). `Optimal Transport for structured data with application on
876891 graphs <http://proceedings.mlr.press/v97/titouan19a.html> `__ Proceedings
877- of the 36th International Conference on Machine Learning (ICML).
892+ of the 36th International Conference on Machine Learning (ICML).