For some reason, I suddenly thought of the non-measurable Vitali set from measure theory

Copied from an article I wrote on Zhihu

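What I have written on Zhihu so far has, surprisingly, mostly been about political topics concerning Chinese people in America, ABCs, and Jews; I have nearly become an ethnic activist. For someone in science and engineering, though, "ethnic activist" is a fairly pejorative label. Ethnic-activist-style speech and activity, especially in America, is naturally looked down upon by capable people. The principle is simple: it is fundamentally "unprofessional" behavior, and one could even call it a kind of thuggish, unreasonable conduct. In America, Chinese people are politically very well-behaved: they never make trouble, never protest, and just obediently keep their heads down and work hard. By contrast, I once saw a math graduate student who was not dark-skinned at all but had Black ancestry. His mathematical level was actually quite poor compared with others, yet he publicly supported Black Lives Matter, and the school's media strongly supported him and used him to advertise its diversity; the video they published even has him saying "I didn't have to think about race." Interestingly, he is not dark at all; if he did not say so, you could not tell he was Black. As expected, nobody at the school dares to say any of this, for fear of bringing trouble on themselves. Many people are actually unhappy about it but have to let it go, so there is no doubt who the political winner is in the end.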

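Exactly why I started writing these things is hard to say. One root cause is that by nature I especially detest playing dumb and talking nonsense, and American politics in my eyes is one big show of playing dumb (of course, it cannot be ruled out that many people really are that dumb, or that I am just too smart, haha). In any case, many American views about China and Chinese people are really too foolish, and most Chinese people there cannot be bothered to "correct" them and mostly just let it go, as do I, so many negative stereotypes that do not quite match objective facts gradually take shape. I have also written much of this in English on my blog; Americans who like can read it. The point is only to let some people know, to leave a record, and to give Chinese people some inspiration.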

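OK, speaking of mathematics, I would like to write a little about my experience with a notion and example that is very pure-mathematical in nature and also very fundamental, important, and beautiful: the non-measurable set. I remember once sending its English Wikipedia page to someone who graduated in physics from Peking University; his reaction was that only mathematicians detached from reality would care about such a weird thing. In fact everyone has the notion of measure; even for people who dislike math, length and width are very intuitive. To formalize a bit, we model with the set of real numbers. The measure of a subset E of the reals, the Lebesgue measure, is defined as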

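\mu^*(E) = \inf\left\{\displaystyle\sum_{k=1}^{\infty} |I_k| : (I_k)_{k \in \mathbb{N}} \text{ open intervals}, \displaystyle\bigcup_{k=1}^{\infty} I_k \supset E\right\}

Here |I_k| is the length of the interval I_k. This is actually the Lebesgue outer measure. Whether E is Lebesgue measurable is subject to a condition, namely that every subset A of the reals satisfies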

\displaystyle \mu ^{*}(A)=\mu ^{*}(A\cap E)+\mu ^{*}(A\cap E^{c})

If E is Lebesgue measurable, its measure is the value of the outer measure above.

OK, let us go find a set that is not Lebesgue measurable. For this, we will use three observations.

  • Measure is translation invariant, that is, \mu(S) = \mu(x+S), \forall x \in \mathbb{R}
  • If \{A_k\}_{k \in \mathbb{N}} are pairwise disjoint, then \mu(\bigcup_{k=1}^{\infty} A_k) = \sum_{k=1}^{\infty} \mu(A_k)
  • S \subset T \implies \mu(S) \leq \mu(T)

These are intuitively obvious, and formal proofs are not hard either.

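By the first two points, we find that any set expressible as a countably infinite disjoint union of translates of a single measurable set must have measure 0 or \infty, because all the disjoint pieces have the same measure, so the total is a sum of infinitely many zeros or of infinitely many copies of the same positive number.
So, if we find such a union and, via the third point, bound its measure above and below by finite positive values, we obtain a contradiction, and hence the set being translated is not measurable.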

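We first take the additive quotient group \mathbb{R} / \mathbb{Q} and, using the axiom of choice, pick from each of its equivalence classes one element lying in [0,1] to build a set, which we call V . We then translate V by each element of \mathbb{Q} \cap [-1,1] . These translates are pairwise disjoint, and their union contains [0,1] (which is why we chose [-1,1] , neither too big nor too small) but lies within [-1,2] , so by the third point its measure is between 1 and 3. So if V were measurable, we would have a contradiction.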
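The counting argument above can be written out as a short chain. This is only a sketch: q_1, q_2, \ldots is an assumed enumeration of \mathbb{Q} \cap [-1,1] (a label introduced here), and V is assumed measurable for the sake of contradiction.

```latex
\begin{align*}
[0,1] \;\subset\; \bigcup_{k=1}^{\infty} (q_k + V) \;\subset\; [-1,2]
&\implies 1 \;\le\; \mu\Bigl(\bigcup_{k=1}^{\infty} (q_k + V)\Bigr) \;\le\; 3 \\
&\implies 1 \;\le\; \sum_{k=1}^{\infty} \mu(q_k + V)
   = \sum_{k=1}^{\infty} \mu(V) \;\le\; 3 .
\end{align*}
```

If \mu(V) = 0 the sum is 0 < 1, and if \mu(V) > 0 the sum is \infty > 3, so neither case is possible. The two set-theoretic facts used are quick to check. Disjointness: if q + v = q' + v' with v, v' \in V, then v - v' \in \mathbb{Q}, so v and v' represent the same class, hence v = v' and q = q'. Covering: for x \in [0,1], let v \in V be the representative of the class of x; then x - v \in \mathbb{Q} \cap [-1,1].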

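I think I first saw this in my third year of college, on the English Wikipedia page, on my own. At the time, thinking about it made my head spin a bit. I was still too mathematically immature; I thought hard but remained confused, and had not yet seen through the key elements of the construction. Later, however, my mathematics improved a lot. Early this year, without consulting any reference, I rederived it in about ten minutes, and then explained it on a board to someone who graduated in economics from the University of Chicago, but this time it was that person who had difficulty understanding.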

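The night before last, I thought of this again and found it quite marvelous that, within mathematics' abstraction and formalization detached from the material world, there exist such non-measurable sets. Of course, one point is that it relies on the axiom of choice, which is somewhat philosophically controversial; I am not currently qualified to discuss that. This time I could see the construction clearly in my head without thinking at all, which naturally brought me back to the exact opposite, the time when I could not understand it, and I felt that my brain back then was in a half-asleep state. This is perhaps one of the marvels of mathematics. Some theorems inevitably have somewhat complicated and tedious proofs no matter what, but there are also theorems or notions that, though simple, are genius-level and revolutionary as mathematical ideas, which is why they took so long to be discovered, and why they are hard to understand at first yet, after correct deep thought, become clear at a glance and are never forgotten.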

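I feel that pure mathematics is still the discipline that most demands intelligence, mainly because of its degree of abstraction, which only a small fraction of people have the innate mental capacity to absorb. This kind of thing is very different from algorithms in computer science. Algorithms are relatively concrete and not too far from everyday life. When I first encountered moderately hard algorithm problems I had no trouble, but certain abstract mathematical notions I could never fully digest, which made me feel that I just was not smart enough and that my talent was limited. But later, suddenly, I had a mathematical awakening. These objective, rigorously provable abstract mathematical truths finally became real in my mind; before that, though they had always existed, for my then still-deficient mind they did not. Mainly, the constructions and proofs of these theorems all rest on some simple abstract mathematical ideas, which are nonetheless hard to grasp. Until you have mastered them, no amount of mental effort helps, but once you see through them they feel really simple. So I greatly admire mathematical geniuses. Their results rely more on genius-level intelligence and imagination than on mere hard work. They can see a far higher realm, and this is not some man-made world of belief like Mao Zedong Thought or Jesus, but a kind of absolute scientific truth.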

Riesz-Thorin interpolation theorem

I had, a while ago, the great pleasure of going through the proof of the Riesz-Thorin interpolation theorem. I believe I understand the general strategy of the proof, though for sure, I glossed over some details. It is my hope that in writing this, I can fill in the holes for myself at the more microscopic level.

Let us begin with a statement of the theorem.

Riesz-Thorin Interpolation Theorem. Suppose that (X,\mathcal{M}, \mu) and (Y, \mathcal{N}, \nu) are measure spaces and p_0, p_1, q_0, q_1 \in [1, \infty]. If q_0 = q_1 = \infty, suppose also that \nu is semifinite. For 0 < t < 1, define p_t and q_t by

\frac{1}{p_t} = \frac{1-t}{p_0} + \frac{t}{p_1}, \qquad  \frac{1}{q_t} = \frac{1-t}{q_0} + \frac{t}{q_1}.

If T is a linear map from L^{p_0}(\mu) + L^{p_1}(\mu) into L^{q_0}(\nu) + L^{q_1}(\nu) such that \left\|Tf\right\|_{q_0} \leq M_0 \left\|f\right\|_{p_0} for f \in L^{p_0}(\mu) and \left\|Tf\right\|_{q_1} \leq M_1 \left\|f\right\|_{p_1} for f \in L^{p_1}(\mu), then \left\|Tf\right\|_{q_t} \leq M_0^{1-t}M_1^t \left\|f\right\|_{p_t} for f \in L^{p_t}(\mu), 0 < t < 1.

We begin by noticing that in the special case where p = p_0 = p_1,

\left\|Tf\right\|_{q_t} \leq \left\|Tf\right\|_{q_0}^{1-t} \left\|Tf\right\|_{q_1}^t \leq M_0^{1-t}M_1^t \left\|f\right\|_p,

wherein the first inequality is a consequence of Hölder's inequality. Thus we may assume that p_0 \neq p_1 and in particular that p_t < \infty.
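For the record, the Hölder step can be spelled out as follows. This is a sketch assuming q_0, q_1 < \infty (the cases with an infinite exponent are similar but simpler), with h standing for Tf; the exponents a and b are labels introduced here, not part of the statement above.

```latex
\begin{align*}
\frac{1}{a} &= \frac{(1-t)\,q_t}{q_0}, \qquad
\frac{1}{b} = \frac{t\,q_t}{q_1}, \qquad
\frac{1}{a} + \frac{1}{b} = q_t\left(\frac{1-t}{q_0} + \frac{t}{q_1}\right) = 1, \\
\int |h|^{q_t}\, d\nu
  &= \int |h|^{(1-t)q_t}\, |h|^{t q_t}\, d\nu
  \le \left(\int |h|^{q_0}\, d\nu\right)^{1/a}
      \left(\int |h|^{q_1}\, d\nu\right)^{1/b} \\
  &= \|h\|_{q_0}^{(1-t)q_t}\, \|h\|_{q_1}^{t q_t}.
\end{align*}
```

Taking q_t-th roots of both ends gives \left\|h\right\|_{q_t} \leq \left\|h\right\|_{q_0}^{1-t} \left\|h\right\|_{q_1}^{t}.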

Observe that the space of all simple functions on X that vanish outside sets of finite measure is dense in L^p(\mu) for p < \infty, and the analogous statement holds for Y. To show this, take any f \in L^p(\mu) and a sequence of simple functions f_n with |f_n| \leq |f| that converges to f pointwise. Then f_n \in L^p(\mu), from which it follows that each f_n is nonzero only on a set of finite measure, and \left\|f_n - f\right\|_p \to 0 by the dominated convergence theorem. Denote the respective spaces of such simple functions by \Sigma_X and \Sigma_Y.

To show that \left\|Tf\right\|_{q_t} \leq M_0^{1-t}M_1^t \left\|f\right\|_{p_t} for all f \in \Sigma_X, we use the fact that

\left\|Tf\right\|_{q_t} = \sup \left\{\left|\int (Tf)g d\nu \right| : g \in \Sigma_Y, \left\|g\right\|_{q_t'} = 1\right\},

where q_t' is the conjugate exponent to q_t. We can rescale f such that \left\|f\right\|_{p_t} = 1.

From this it suffices to show that across all f \in \Sigma_X, g \in \Sigma_Y with \left\|f\right\|_{p_t} = 1 and \left\|g\right\|_{q_t'} = 1, |\int (Tf)g d\nu| \leq M_0^{1-t}M_1^t.

For this, we use the three lines lemma, the inequality of which has the same value on its RHS.

Three Lines Lemma. Let \phi be a bounded continuous function on the strip 0 \leq \mathrm{Re} z \leq 1 that is holomorphic on the interior of the strip. If |\phi(z)| \leq M_0 for \mathrm{Re} z = 0 and |\phi(z)| \leq M_1 for \mathrm{Re} z = 1, then |\phi(z)| \leq M_0^{1-t} M_1^t for \mathrm{Re} z = t, 0 < t < 1.

This is proven via application of the maximum modulus principle to \phi_{\epsilon}(z) = \phi(z)M_0^{z-1} M_1^{-z} \exp(\epsilon z(z-1)) for \epsilon > 0. The factor \exp(\epsilon z(z-1)) serves to ensure that |\phi_{\epsilon}(z)| \to 0 as |\mathrm{Im} z| \to \infty for any \epsilon > 0.
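The boundary estimates behind this can be sketched as follows, assuming M_0, M_1 > 0 and writing z = x + iy (the names x and y are introduced here for illustration).

```latex
\begin{align*}
\mathrm{Re}\,\bigl(\epsilon z(z-1)\bigr) &= \epsilon\bigl(x(x-1) - y^2\bigr) \le -\epsilon y^2
  \qquad (z = x + iy,\ 0 \le x \le 1), \\
|\phi_\epsilon(z)| &= |\phi(z)|\, M_0^{x-1} M_1^{-x}\, e^{\epsilon(x(x-1) - y^2)}, \\
|\phi_\epsilon(iy)| &\le M_0 \cdot M_0^{-1} \cdot e^{-\epsilon y^2} \le 1, \qquad
|\phi_\epsilon(1+iy)| \le M_1 \cdot M_1^{-1} \cdot e^{-\epsilon y^2} \le 1 .
\end{align*}
```

Since \phi is bounded and M_0^{x-1} M_1^{-x} is bounded for 0 \leq x \leq 1, the second line shows |\phi_\epsilon| \to 0 as |y| \to \infty, so the maximum modulus principle on a tall enough rectangle gives |\phi_\epsilon| \leq 1 on the whole strip. Letting \epsilon \to 0 then yields |\phi(z)| \leq M_0^{1-x} M_1^{x}.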

We construct a family f_z, indexed by z in the strip, such that f_t = f, where t \in (0, 1) is the value corresponding to the interpolated p_t. To do this, we can express for convenience f = \sum_1^m |c_j|e^{i\theta_j} \chi_{E_j} and g = \sum_1^n |d_k|e^{i\psi_k} \chi_{F_k}, where the c_j's and d_k's are nonzero and the E_j's and F_k's are disjoint in X and Y respectively, and raise each |c_j| to the power \alpha(z) / \alpha(t) for some function \alpha with \alpha(t) > 0. With this, we have

f_z = \displaystyle\sum_1^m |c_j|^{\alpha(z)/\alpha(t)}e^{i\theta_j}\chi_{E_j}.

Needless to say, we can do similarly for g, with \beta(t) < 1,

g_z = \displaystyle\sum_1^n |d_k|^{(1-\beta(z))/(1-\beta(t))}e^{i\psi_k}\chi_{F_k}.

Together these turn the LHS of the inequality we desire to prove to a complex function that is

\phi(z) = \int (Tf_z)g_z d\nu.

To use the three lines lemma, we must satisfy

|\phi(is)| \leq \left\|Tf_{is}\right\|_{q_0}\left\|g_{is}\right\|_{q_0'} \leq M_0 \left\|f_{is}\right\|_{p_0}\left\|g_{is}\right\|_{q_0'} \leq M_0 \left\|f\right\|_{p_t}\left\|g\right\|_{q_t'} = M_0.

It is not hard to make it such that \left\|f_{is}\right\|_{p_0} = 1 = \left\|g_{is}\right\|_{q_0'}. A sufficient condition is that |f_{is}| = |f|^{p_t/p_0} and |g_{is}| = |g|^{q_t'/q_0'}, so that the integrands |f_{is}|^{p_0} and |g_{is}|^{q_0'} in the norms equal |f|^{p_t} and |g|^{q_t'} respectively. With the normalization \alpha(t) = 1/p_t and 1 - \beta(t) = 1/q_t', this equates to \mathrm{Re} \alpha(is) = 1 / p_0 and \mathrm{Re} (1-\beta(is)) = 1 / q_0'. Similarly, we find that \mathrm{Re} \alpha(1+is) = 1 / p_1 and \mathrm{Re} (1-\beta(1+is)) = 1 / q_1'. From this, we can solve that

\alpha(z) = (1-z)p_0^{-1} + zp_1^{-1}, \qquad \beta(z) = (1-z)q_0^{-1} + zq_1^{-1}.
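As a sanity check, here is the computation on the line \mathrm{Re} z = 0 for f, taking \alpha(z) = (1-z)p_0^{-1} + zp_1^{-1}, the affine function matching the boundary conditions, and assuming p_0 < \infty. The computation for g and for the line \mathrm{Re} z = 1 is analogous.

```latex
\begin{align*}
\alpha(t) &= \frac{1-t}{p_0} + \frac{t}{p_1} = \frac{1}{p_t}, \qquad
\mathrm{Re}\,\alpha(is) = \frac{1}{p_0}, \\
|f_{is}| &= \sum_{j=1}^{m} |c_j|^{\mathrm{Re}\,\alpha(is)/\alpha(t)} \chi_{E_j}
          = \sum_{j=1}^{m} |c_j|^{p_t/p_0} \chi_{E_j} = |f|^{p_t/p_0}, \\
\|f_{is}\|_{p_0}^{p_0} &= \int |f|^{p_t}\, d\mu = \|f\|_{p_t}^{p_t} = 1 .
\end{align*}
```

The second line uses that the E_j are disjoint and that \bigl| |c|^{w} \bigr| = |c|^{\mathrm{Re}\, w} for |c| > 0.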

With these functions inducing a \phi(z) that satisfies the hypothesis of the three lines lemma, our interpolation theorem is shown for such simple functions, from which we extend our result to all f \in L^{p_t}(\mu).

To extend this to all of L^{p_t}(\mu) (writing p = p_t and q = q_t below), it suffices that Tf_n \to Tf a.e. for some sequence of measurable simple functions f_n with |f_n| \leq |f| and f_n \to f pointwise. Why? With this, we can invoke Fatou's lemma (and also that \left\|f_n\right\|_p \to \left\|f\right\|_p by the dominated convergence theorem) to obtain the desired result, which is

\left\|Tf\right\|_q \leq \liminf \left\|Tf_n\right\|_q \leq \liminf M_0^{1-t} M_1^t\left\|f_n\right\|_p \leq M_0^{1-t} M_1^t \left\|f\right\|_p.

Recall that convergence in measure yields a subsequence that converges a.e. So it is enough to show that \displaystyle\lim_{n \to \infty} \nu(\{|Tf_n - Tf| > \epsilon\}) = 0 for all \epsilon > 0. This can be done by upper bounding with something that goes to zero. By Chebyshev's inequality, we have, for any exponent 1 \leq r < \infty,

\nu(\{|Tf_n - Tf| > \epsilon\}) \leq \frac{\left\|Tf_n - Tf\right\|_r^r}{\epsilon^r}.

However, the bound from L^{p_t} to L^{q_t} is what we are trying to prove, so we cannot take r = q_t. Recall instead that our hypotheses give constant bounds on T from L^{p_0} to L^{q_0} and from L^{p_1} to L^{q_1}, which we can make use of. So apply Chebyshev with either r = q_0 (let's use this, assuming q_0 < \infty) or r = q_1, and bound \left\|Tf_n - Tf\right\|_{q_0} by M_0 \left\|f_n - f\right\|_{p_0}, which goes to zero by dominated convergence provided f \in L^{p_0}. In general f need not lie in L^{p_0} or L^{p_1}, but it can be split into f\chi_{\{|f| > 1\}} and f\chi_{\{|f| \leq 1\}}, which lie in L^{\min(p_0, p_1)} and L^{\max(p_0, p_1)} respectively, and the same argument applied to each piece with the corresponding bound.

Convergence in measure

Let f, f_n (n \in \mathbb{N}) : X \to \mathbb{R} be measurable functions on a measure space (X, \Sigma, \mu). We say that f_n converges to f globally in measure if for every \epsilon > 0,

\displaystyle\lim_{n \to \infty} \mu(\{x \in X : |f_n(x) - f(x)| \geq \epsilon\}) = 0.

To see that this implies the existence of a subsequence that converges pointwise almost everywhere, choose n_k increasing such that \mu(E_k) < 2^{-k}, where E_k = \{x \in X : |f_{n_k}(x) - f(x)| \geq \frac{1}{k}\}. (We invoke the definition of limit here.) Since \sum_k \mu(E_k) < \infty, the Borel-Cantelli lemma says that almost every x lies in only finitely many of the E_k. For such x we have |f_{n_k}(x) - f(x)| < \frac{1}{k} for all sufficiently large k, and hence f_{n_k}(x) \to f(x) as \frac{1}{k} \to 0.
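One way to make the almost-everywhere claim precise is the following Borel-Cantelli style estimate. It is a sketch under the assumption that the subsequence is chosen with \mu(E_k) < 2^{-k}, where E_k = \{x \in X : |f_{n_k}(x) - f(x)| \geq \frac{1}{k}\}; the summable bound 2^{-k} is a choice made here for illustration.

```latex
\begin{align*}
\mu\Bigl(\bigcup_{k \ge K} E_k\Bigr) &\le \sum_{k \ge K} \mu(E_k) < \sum_{k \ge K} 2^{-k} = 2^{1-K}, \\
\mu\Bigl(\bigcap_{K \ge 1} \bigcup_{k \ge K} E_k\Bigr) &\le \inf_{K} 2^{1-K} = 0 .
\end{align*}
```

The set in the second line consists of exactly those x lying in infinitely many E_k. Outside it, x avoids E_k for all large k, so |f_{n_k}(x) - f(x)| < \frac{1}{k} eventually and f_{n_k}(x) \to f(x).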

This naturally extends to every subsequence's having a further subsequence that converges pointwise almost everywhere (a subsequence of a sequence converging in measure also converges in measure, to the same limit). The converse holds when \mu(X) < \infty. To prove it, suppose for contradiction that f_n does not converge to f in measure. Then there are \epsilon > 0 and \delta > 0 such that \mu(\{x \in X : |f_n(x) - f(x)| \geq \epsilon\}) \geq \delta for infinitely many n, which gives a subsequence f_{n_k}. By hypothesis, it has a further subsequence that converges to f almost everywhere, and on a finite measure space almost everywhere convergence implies convergence in measure (by Egorov's theorem, say). So along that further subsequence the measures of these sets tend to zero, contradicting the lower bound \delta. Without finiteness the converse can fail: f_n = \chi_{[n, n+1]} on \mathbb{R} converges to 0 pointwise but not globally in measure.