
Commit c5e7bb0

Update post

1 parent e4b9978 commit c5e7bb0

1 file changed: 12 additions & 11 deletions

_posts/2025-10-27-subspaces.md

@@ -23,9 +23,9 @@ $$
\min_{\mathbf{x} \in \mathcal{X}} p(\mathbf{x})
$$

-where $p(\mathbf{x})$ is a polynomial and $\mathcal{X}$ is a set defined by polynomial equalities, for instance, $\mathcal{X} = \\{ \mathbf{x} \in \mathbb{R}^d \mid q_i(\mathbf{x}) = 0, \,i\in[n] \\}$, where we introduced the shorthand $[n]$ for $\\{1, \ldots, n\\}$. This problem may in general be hard to solve due to the non-convexity of the objective and the feasible set.
+where $p(\mathbf{x})$ is a polynomial and $\mathcal{X}$ is a set defined by polynomial equalities, for instance, $\mathcal{X} = \\{ \mathbf{x} \in \mathbb{R}^d \mid q_i(\mathbf{x}) = 0, \,i\in[n] \\}$, where we introduced the shorthand $[n]$ for $\\{1, \ldots, n\\}$. This problem may in general be hard to solve when the objective and/or feasible set are non-convex.

-A powerful technique to tackle such problems is to solve a series of convex relaxations. To do so, we first rewrite the problem using "lifted" variables. We define a vector of monomials, $\phi(\mathbf{x})^\top$, which in machine learning would be called the feature vector. The objective can then be written as an inner product $p(\mathbf{x}) = \langle \mathbf{C}, \phi(\mathbf{x})\phi(\mathbf{x})^\top \rangle$ for some matrix $\mathbf{C}$, where $\langle, \rangle$ is trace inner product. Our problem becomes:
+A powerful technique to tackle such problems is to solve a series of increasingly tight convex relaxations. To do so, we first rewrite the problem using "lifted" variables. We define a vector of monomials, $\phi(\mathbf{x})$, akin to a feature vector in machine learning. We choose the feature vector such that the objective can be written as an inner product $p(\mathbf{x}) = \langle \mathbf{C}, \phi(\mathbf{x})\phi(\mathbf{x})^\top \rangle$ for some matrix $\mathbf{C}$, where $\langle \cdot, \cdot \rangle$ is the trace inner product. Our problem becomes:

$$
\min_{\mathbf{x} \in \mathcal{X}} \langle \mathbf{C}, \phi(\mathbf{x})\phi(\mathbf{x})^\top \rangle
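
To make the lifting concrete, here is a minimal numpy sketch on a hypothetical one-variable instance (not from the post): $p(x) = x^4 - 3x^2 + 2$ with monomial vector $\phi(x) = (1, x, x^2)$. The matrix $\mathbf{C}$ below is one valid (non-unique) choice.

```python
import numpy as np

# Hypothetical toy instance: p(x) = x^4 - 3x^2 + 2,
# lifted with the monomial vector phi(x) = (1, x, x^2).
def phi(x):
    return np.array([1.0, x, x**2])

# One valid symmetric C: the (2,2) entry produces x^4, the two off-diagonal
# (0,2)/(2,0) entries together produce -3x^2, and the (0,0) entry produces 2.
C = np.array([[ 2.0, 0.0, -1.5],
              [ 0.0, 0.0,  0.0],
              [-1.5, 0.0,  1.0]])

for x in np.linspace(-2.0, 2.0, 9):
    lifted = np.trace(C @ np.outer(phi(x), phi(x)))  # <C, phi phi^T>
    assert np.isclose(lifted, x**4 - 3 * x**2 + 2)
```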
@@ -66,15 +66,16 @@ $$
\mathcal{V} = \text{span} \{ \phi(\mathbf{x})\phi(\mathbf{x})^\top \mid \mathbf{x} \in \mathcal{X} \}
$$

-Every moment matrix $\mathbf{M}(\mathbf{x})$ corresponding to a feasible point of our problem lies within this subspace $\mathcal{V}$. In other words, we can define a basis $\\{ \mathbf{B}_i \\}\_{i\in[n\_b]}$, so that every element $\mathbf{X}$ of $\mathcal{V}$ can be written as
+Every moment matrix $\mathbf{M}(\mathbf{x})$ corresponding to a feasible point of our problem lies within this subspace $\mathcal{V}$. In other words, we can define a basis $\\{ \mathbf{B}_i \\}\_{i\in[n\_b]}$, so that every element $\mathbf{M}$ in $\mathcal{V}$ can be written as

$$
-\mathbf{X} = \sum_i \alpha_i \mathbf{B}_i,
+\mathbf{M} = \sum_i \alpha_i \mathbf{B}_i,
$$

-for some choices $\alpha_i$. In particular, there exist some $\alpha$ that allow to characterize each element of the feasible set $\mathcal{X}$.
+for some coefficients $\alpha_i$.

-If we call $\mathcal{K}$ the space of all admissible moment matrices, i.e., matrices $\mathbf{X}$ for which there exists a positive measure $\mu$ such that $\mathbf{X}=\int \phi(\mathbf{x})\phi(\mathbf{x})^\top d\mu(\mathbf{x})$, that space corresponds to the closure of the convex hull of all $\\{\phi(\mathbf{x})\phi(\mathbf{x})^\top \mid \mathbf{x}\in\mathcal{X}\\}$ --- we call this set for convenience $\bar{\mathcal{X}}$ (see [this post](https://francisbach.com/sums-of-squares-for-dummies/) for more details, and below for the visualization of our toy example).
+We call $\mathcal{K}$ the space of all admissible (pseudo) moment matrices.
+<!--i.e., matrices $\mathbf{X}$ for which there exists a positive measure $\mu$ such that $\mathbf{X}=\int \phi(\mathbf{x})\phi(\mathbf{x})^\top d\mu(\mathbf{x})$,--> This space corresponds to the closure of the convex hull of $\\{\phi(\mathbf{x})\phi(\mathbf{x})^\top \mid \mathbf{x}\in\mathcal{X}\\}$ --- we call this set for convenience $\bar{\mathcal{X}}$ (see [this post](https://francisbach.com/sums-of-squares-for-dummies/) for more details, and below for the visualization of our toy example).
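
As a sketch of how such a basis $\\{\mathbf{B}_i\\}$ could be obtained from feasible samples alone (an assumption in the spirit of the post, using a hypothetical feasible set --- the unit circle $x_1^2 + x_2^2 = 1$ with $\phi(\mathbf{x}) = (1, x_1, x_2)$ --- rather than the post's toy example): stack vectorized outer products of sampled feasible points and read off a basis of $\mathcal{V}$ from an SVD.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feasible set: the unit circle {x in R^2 : x1^2 + x2^2 = 1},
# with feature vector phi(x) = (1, x1, x2), parameterized by the angle t.
def phi(t):
    return np.array([1.0, np.cos(t), np.sin(t)])

# Stack vec(phi(x) phi(x)^T) over many feasible samples.
S = np.array([np.outer(phi(t), phi(t)).flatten()
              for t in rng.uniform(0.0, 2.0 * np.pi, 200)])

# Right singular vectors with nonzero singular value span V.
_, sing, Vt = np.linalg.svd(S)
rank = int(np.sum(sing > 1e-9 * sing[0]))
B = [Vt[i].reshape(3, 3) for i in range(rank)]  # a basis {B_i} of V

# The single relation x1^2 + x2^2 - 1 = 0 cuts one dimension out of the
# 6-dimensional space of symmetric 3x3 matrices, so the rank printed is 5.
print(rank)
```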

{% include figure.liquid
path="/assets/images/blog/2025-10-27/subspaces-export.svg"
@@ -108,7 +109,7 @@ $$
The subspace $\mathcal{V}$ is the span of these two matrices, $\mathcal{V} = \text{span}(\mathbf{B}_1, \mathbf{B}_2)$. This is a 2-dimensional subspace within the ambient space of $4 \times 4$ symmetric matrices, which has dimension $\frac{4 \times 5}{2} = 10$. For a more compact notation, we use the half-vectorization operator and define $\mathbf{b}_i:=\mathrm{vech}(\mathbf{B}_i)\in\mathbb{R}^{10}$, where we scale off-diagonal elements by $\sqrt{2}$ to ensure $\langle \mathbf{A}, \mathbf{B}\rangle = \mathbf{a}^\top\mathbf{b}$.
</div>
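
A minimal sketch of this scaled half-vectorization, checking that the trace inner product of symmetric matrices becomes an ordinary dot product:

```python
import numpy as np

def vech(A):
    """Half-vectorization of a symmetric matrix, with off-diagonal entries
    scaled by sqrt(2) so that <A, B> = vech(A) @ vech(B)."""
    rows, cols = np.triu_indices(A.shape[0])
    scale = np.where(rows == cols, 1.0, np.sqrt(2.0))
    return scale * A[rows, cols]

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)); A = A + A.T  # random symmetric matrices
B = rng.standard_normal((4, 4)); B = B + B.T
assert vech(A).size == 4 * 5 // 2                      # dimension 10
assert np.isclose(np.trace(A @ B), vech(A) @ vech(B))  # inner products agree
```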

We will also need a basis for $\mathcal{V}^{\perp}$, the nullspace of the span of $\\{ \phi(\mathbf{x})\phi(\mathbf{x})^T, \mathbf{x} \in \mathcal{X} \\}$. Let's call the basis vectors $\\{\mathbf{U}_j\\}\_{j\in [n\_u]}$. By definition of the nullspace, for any $\mathbf{x} \in \mathcal{X}$, we must have:

$$
\langle \mathbf{U}_j, \phi(\mathbf{x})\phi(\mathbf{x})^T \rangle = 0
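
Numerically, $\mathcal{V}^{\perp}$ falls out of the same SVD as the zero-singular-value directions. A sketch continuing the hypothetical circle example from above (self-contained, so the sampling is repeated):

```python
import numpy as np

rng = np.random.default_rng(0)
phi = lambda t: np.array([1.0, np.cos(t), np.sin(t)])  # circle example again

S = np.array([np.outer(phi(t), phi(t)).flatten()
              for t in rng.uniform(0.0, 2.0 * np.pi, 200)])
_, sing, Vt = np.linalg.svd(S)
rank = int(np.sum(sing > 1e-9 * sing[0]))

# Symmetric parts of the zero-singular-value directions lie in V-perp; for
# the circle it is spanned by diag(-1, 1, 1), i.e. x1^2 + x2^2 - 1 = 0.
U = [0.5 * (Vt[i].reshape(3, 3) + Vt[i].reshape(3, 3).T)
     for i in range(rank, Vt.shape[0])]
U = [Uj for Uj in U if np.linalg.norm(Uj) > 1e-9]  # drop antisymmetric ones

# Verify <U_j, phi(x) phi(x)^T> = 0 on fresh feasible points.
for t in rng.uniform(0.0, 2.0 * np.pi, 10):
    M = np.outer(phi(t), phi(t))
    assert all(abs(np.trace(Uj @ M)) < 1e-8 for Uj in U)
```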
@@ -152,13 +153,13 @@ $$
Now we want to find a computationally tractable outer approximation of the set

$$
-\mathcal{K} = \{\mathbf{X} \;| \mathbf{M} = \int_\mathcal{X} \phi(\mathbf{x})\phi(\mathbf{x})^\top d\mu(x)\ \text{for some measure $\mu$.}\}
+\mathcal{K} = \{\mathbf{X} \;|\; \mathbf{X} = \int_\mathcal{X} \phi(\mathbf{x})\phi(\mathbf{x})^\top d\mu(\mathbf{x})\ \text{for some positive measure $\mu$}\}
$$

An intuitive choice is to impose all constraints satisfied by this set that are computationally easy to handle:

$$
-\mathcal{\widehat{K}} = \{\mathbf{M} \;| \begin{cases}
+\mathcal{\widehat{K}} = \{\mathbf{X} \;| \begin{cases}
& \mathbf{X} = \sum_i \alpha_i \mathbf{B}_i & \text{(want to lie in the same subspace)} \\
& \mathbf{X} \succeq 0 & \text{(an integral of outer products $\phi(\mathbf{x})\phi(\mathbf{x})^\top \succeq 0$ against a measure $\mu \geq 0$ is PSD)} \\
& \langle \mathbf{A}_0, \mathbf{X} \rangle = 1 & \text{(we assume normalization and that $\phi(\mathbf{x})_0=1$)}
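
Putting the pieces together for the hypothetical circle example, here is a sketch of the resulting semidefinite program, written with cvxpy as a stand-in modeling tool (the post does not prescribe one). Minimizing $p(\mathbf{x}) = x_1$ over the circle, the relaxation recovers the true optimum $-1$:

```python
import numpy as np
import cvxpy as cp  # assumed modeling tool; any SDP solver would do

phi = lambda t: np.array([1.0, np.cos(t), np.sin(t)])  # circle example again

# Basis {B_i} of V, estimated from feasible samples as before.
S = np.array([np.outer(phi(t), phi(t)).flatten()
              for t in np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)])
_, sing, Vt = np.linalg.svd(S)
rank = int(np.sum(sing > 1e-9 * sing[0]))
B = [0.5 * (Vt[i].reshape(3, 3) + Vt[i].reshape(3, 3).T) for i in range(rank)]

# Objective p(x) = x1 written as <C, phi phi^T>.
C = np.zeros((3, 3))
C[0, 1] = C[1, 0] = 0.5

# Minimize <C, X> over the outer approximation K-hat.
X = cp.Variable((3, 3), symmetric=True)
alpha = cp.Variable(rank)
constraints = [X == sum(alpha[i] * B[i] for i in range(rank)),  # X in V
               X >> 0,                                          # X PSD
               X[0, 0] == 1]  # <A_0, X> = 1 with A_0 = e_0 e_0^T
prob = cp.Problem(cp.Minimize(cp.trace(C @ X)), constraints)
prob.solve()
print(prob.value)  # approximately -1, attained at x = (-1, 0)
```

In this toy instance the relaxation is tight: the optimal $\mathbf{X}$ is the rank-one moment matrix $\phi(\mathbf{x}^\star)\phi(\mathbf{x}^\star)^\top$ at $\mathbf{x}^\star = (-1, 0)$.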
@@ -375,9 +376,9 @@ This kernelization of the problem allows for different kernels to be applied, an

### Conclusion and Discussion

-We've seen that by taking a subspace perspective, we can derive alternative but equivalent relaxations for polynomial optimization problems. Depending on the dimensions of the subspace $\mathcal{V}$ and its complement $\mathcal{V}^\perp$, one form might be more computationally efficient than the other. For instance, if the nullspace $\mathcal{V}^\perp$ has a very small dimension, the primal kernel form might have far fewer constraints than the image form.
+We've seen that by taking a subspace perspective, we can derive relaxations for polynomial optimization problems using only information from feasible samples. Depending on the dimensions of the subspace $\mathcal{V}$ and its complement $\mathcal{V}^\perp$, one form might be more computationally efficient than the other. For instance, if the nullspace $\mathcal{V}^\perp$ has a very small dimension, the primal kernel form might have far fewer constraints than the image form.

-An obvious limitation of this approach is that it cannot easily deal with inequality constraints. However, at least the equality-constrained part of polynomial optimization problems can handled in a very elegant way through this subspace view.
+An obvious limitation of this approach is that it cannot easily deal with inequality constraints. However, at least the equality-constrained part of polynomial optimization problems can be handled in a very elegant way through this subspace view. To some extent, inequalities can be dealt with by introducing slack variables, but this does not scale easily to many inequalities.

### Appendix 1: Verification via the Duals {#appendix1}