Riesz-Thorin interpolation theorem

I had, a while ago, the great pleasure of going through the proof of the Riesz-Thorin interpolation theorem. I believe I understand the general strategy of the proof, though for sure, I glossed over some details. It is my hope that in writing this, I can fill in the holes for myself at the more microscopic level.

Let us begin with a statement of the theorem.

Riesz-Thorin Interpolation Theorem. Suppose that (X,\mathcal{M}, \mu) and (Y, \mathcal{N}, \nu) are measure spaces and p_0, p_1, q_0, q_1 \in [1, \infty]. If q_0 = q_1 = \infty, suppose also that \nu is semifinite. For 0 < t < 1, define p_t and q_t by

\frac{1}{p_t} = \frac{1-t}{p_0} + \frac{t}{p_1}, \qquad  \frac{1}{q_t} = \frac{1-t}{q_0} + \frac{t}{q_1}.

If T is a linear map from L^{p_0}(\mu) + L^{p_1}(\mu) into L^{q_0}(\nu) + L^{q_1}(\nu) such that \left\|Tf\right\|_{q_0} \leq M_0 \left\|f\right\|_{p_0} for f \in L^{p_0}(\mu) and \left\|Tf\right\|_{q_1} \leq M_1 \left\|f\right\|_{p_1} for f \in L^{p_1}(\mu), then \left\|Tf\right\|_{q_t} \leq M_0^{1-t}M_1^t \left\|f\right\|_{p_t} for f \in L^{p_t}(\mu), 0 < t < 1.

We begin by noticing that in the special case where p = p_0 = p_1,

\left\|Tf\right\|_{q_t} \leq \left\|Tf\right\|_{q_0}^{1-t} \left\|Tf\right\|_{q_1}^t \leq M_0^{1-t}M_1^t \left\|f\right\|_p,

wherein the first inequality is a consequence of Hölder’s inequality. Thus we may assume that p_0 \neq p_1 and in particular that p_t < \infty.
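For reference, the Hölder step can be spelled out as follows. This is only a sketch for the case q_0, q_1 < \infty; the letter u is my own shorthand for Tf and the exponents r, r' are introduced here for illustration.

```latex
% Sketch of \|u\|_{q_t} \le \|u\|_{q_0}^{1-t} \|u\|_{q_1}^{t}, with u = Tf (shorthand assumed here).
% The exponents r = q_0/((1-t)q_t) and r' = q_1/(t q_t) are conjugate, because
% 1/r + 1/r' = q_t \big( (1-t)/q_0 + t/q_1 \big) = 1 by the definition of q_t.
\[
\begin{aligned}
\int |u|^{q_t}\, d\nu
  &= \int |u|^{(1-t)q_t}\, |u|^{t q_t}\, d\nu \\
  &\le \Big( \int |u|^{q_0}\, d\nu \Big)^{(1-t)q_t/q_0}
       \Big( \int |u|^{q_1}\, d\nu \Big)^{t q_t/q_1} \\
  &= \|u\|_{q_0}^{(1-t)q_t}\, \|u\|_{q_1}^{t q_t}.
\end{aligned}
\]
% Taking q_t-th roots gives the claimed inequality.
```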

Observe that the space of all simple functions on X that vanish outside sets of finite measure is dense in L^p(\mu) for p < \infty, and the analogous statement holds for Y. To show this, take any f \in L^p(\mu) and a sequence of simple functions f_n with |f_n| \leq |f| that converges to f pointwise. Each f_n then lies in L^p(\mu), so it can be non-zero only on a set of finite measure, and since |f_n - f|^p \leq 2^p|f|^p the dominated convergence theorem gives \left\|f_n - f\right\|_p \to 0. Denote the respective spaces of such simple functions by \Sigma_X and \Sigma_Y.

To show that \left\|Tf\right\|_{q_t} \leq M_0^{1-t}M_1^t \left\|f\right\|_{p_t} for all f \in \Sigma_X, we use the fact that

\left\|Tf\right\|_{q_t} = \sup \left\{\left|\int (Tf)g d\nu \right| : g \in \Sigma_Y, \left\|g\right\|_{q_t'} = 1\right\},

where q_t' is the conjugate exponent to q_t. We can rescale f such that \left\|f\right\|_{p_t} = 1.

From this it suffices to show that across all f \in \Sigma_X, g \in \Sigma_Y with \left\|f\right\|_{p_t} = 1 and \left\|g\right\|_{q_t'} = 1, |\int (Tf)g d\nu| \leq M_0^{1-t}M_1^t.

For this, we use the three lines lemma, whose conclusion has the same bound M_0^{1-t}M_1^t on its right-hand side.

Three Lines Lemma. Let \phi be a bounded continuous function on the strip 0 \leq \mathrm{Re} z \leq 1 that is holomorphic on the interior of the strip. If |\phi(z)| \leq M_0 for \mathrm{Re} z = 0 and |\phi(z)| \leq M_1 for \mathrm{Re} z = 1, then |\phi(z)| \leq M_0^{1-t} M_1^t for \mathrm{Re} z = t, 0 < t < 1.

This is proven by applying the maximum modulus principle to \phi_{\epsilon}(z) = \phi(z)M_0^{z-1} M_1^{-z} \exp(\epsilon z(z-1)) for \epsilon > 0. The factor \exp(\epsilon z(z-1)) serves to ensure that |\phi_{\epsilon}(z)| \to 0 as |\mathrm{Im} z| \to \infty for any \epsilon > 0.
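The estimates behind this can be sketched as follows, assuming M_0, M_1 > 0. The names x, y for the real and imaginary parts of z and C for the supremum of |\phi| on the strip are my own notation, not part of the lemma's statement.

```latex
% Write z = x + iy with 0 \le x \le 1, and let C = \sup |\phi| over the strip (assumed notation).
\[
\operatorname{Re}\, z(z-1) = x(x-1) - y^2 \le -y^2,
\qquad
|M_0^{z-1} M_1^{-z}| = M_0^{x-1} M_1^{-x}.
\]
% Hence the modified function decays in y and is at most 1 on both boundary lines:
\[
|\phi_\epsilon(x+iy)| \le C\, M_0^{x-1} M_1^{-x}\, e^{-\epsilon y^2},
\qquad
|\phi_\epsilon(iy)| \le 1,
\qquad
|\phi_\epsilon(1+iy)| \le 1.
\]
% For R large, |\phi_\epsilon| \le 1 on the whole boundary of the rectangle
% \{0 \le x \le 1,\ |y| \le R\}, so the maximum modulus principle gives
% |\phi_\epsilon| \le 1 inside. Letting \epsilon \to 0 at a fixed z with x = t:
\[
|\phi(z)|\, M_0^{t-1} M_1^{-t} \le 1
\quad\Longrightarrow\quad
|\phi(z)| \le M_0^{1-t} M_1^{t}.
\]
```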

We now construct a family f_z, depending holomorphically on z in the strip, such that f_t = f, where t \in (0, 1) is the value corresponding to the interpolated p_t. To do this, we can express for convenience f = \sum_1^m |c_j|e^{i\theta_j} \chi_{E_j} and g = \sum_1^n |d_k|e^{i\psi_k} \chi_{F_k}, where the c_j’s and d_k’s are nonzero and the E_j’s and F_k’s are disjoint in X and Y respectively, and raise each |c_j| to the power \alpha(z) / \alpha(t) for some function \alpha with \alpha(t) > 0. With this, we have

f_z = \displaystyle\sum_1^m |c_j|^{\alpha(z)/\alpha(t)}e^{i\theta_j}\chi_{E_j}.

Needless to say, we can do similarly for g, with \beta(t) < 1,

g_z = \displaystyle\sum_1^n |d_k|^{(1-\beta(z))/(1-\beta(t))}e^{i\psi_k}\chi_{F_k}.

Together these turn the left-hand side of the inequality we wish to prove into the complex function

\phi(z) = \int (Tf_z)g_z d\nu.
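To see why \phi qualifies for the three lines lemma, it may help to expand it using the linearity of T. The coefficients A_{jk} are my own shorthand, and the sketch assumes \alpha and \beta are affine functions of z, which is the form they are given further down.

```latex
% Expansion of \phi by linearity of T; A_{jk} is shorthand introduced here.
\[
\phi(z) = \sum_{j=1}^{m} \sum_{k=1}^{n}
  A_{jk}\, |c_j|^{\alpha(z)/\alpha(t)}\, |d_k|^{(1-\beta(z))/(1-\beta(t))},
\qquad
A_{jk} = e^{i(\theta_j + \psi_k)} \int (T\chi_{E_j})\, \chi_{F_k}\, d\nu .
\]
% Each summand is a constant times an exponential of an affine function of z,
% so \phi is entire. Since |a^{w}| = a^{\operatorname{Re} w} for a > 0 and
% \operatorname{Re} z stays in [0,1], \phi is bounded on the strip.
% At z = t both exponents equal 1, so
\[
\phi(t) = \int (Tf)\, g\, d\nu .
\]
```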

To use the three lines lemma, we need the bound on the line \mathrm{Re} z = 0, that is,

|\phi(is)| \leq \left\|Tf_{is}\right\|_{q_0}\left\|g_{is}\right\|_{q_0'} \leq M_0 \left\|f_{is}\right\|_{p_0}\left\|g_{is}\right\|_{q_0'} \leq M_0 \left\|f\right\|_{p_t}\left\|g\right\|_{q_t'} = M_0.

It is not hard to arrange that \left\|f_{is}\right\|_{p_0} = 1 = \left\|g_{is}\right\|_{q_0'}. A sufficient condition is that |f_{is}| = |f|^{p_t/p_0} and |g_{is}| = |g|^{q_t'/q_0'}, so that the integrands defining the norms are |f|^{p_t} and |g|^{q_t'}. With \alpha(t) = 1/p_t and 1 - \beta(t) = 1/q_t', this amounts to \mathrm{Re} \alpha(is) = 1 / p_0 and \mathrm{Re} (1-\beta(is)) = 1 / q_0'. Similarly, on the line \mathrm{Re} z = 1 we find that \mathrm{Re} \alpha(1+is) = 1 / p_1 and \mathrm{Re} (1-\beta(1+is)) = 1 / q_1'. From this, we can solve that

\alpha(z) = (1-z)p_0^{-1} + zp_1^{-1}, \qquad \beta(z) = (1-z)q_0^{-1} + zq_1^{-1}.
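As a sanity check, here is the norm computation on the line \mathrm{Re} z = 0 for f, sketched for the case p_0 < \infty. It uses only the two conditions \mathrm{Re} \alpha(is) = 1/p_0 and \alpha(t) = 1/p_t; the computation for g and for the line \mathrm{Re} z = 1 is analogous.

```latex
% Modulus of f_{is}: only the real part of the exponent survives.
\[
|f_{is}| = \sum_{j=1}^{m} |c_j|^{\operatorname{Re}\alpha(is)/\alpha(t)}\, \chi_{E_j}
         = \sum_{j=1}^{m} |c_j|^{p_t/p_0}\, \chi_{E_j}.
\]
% The E_j are disjoint, so the p_0-norm can be computed term by term:
\[
\|f_{is}\|_{p_0}^{p_0}
  = \sum_{j=1}^{m} |c_j|^{p_t}\, \mu(E_j)
  = \|f\|_{p_t}^{p_t}
  = 1 .
\]
% If p_0 = \infty, then |f_{is}| = 1 on each E_j and the sup norm is 1 directly.
```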

With these functions inducing a \phi(z) that satisfies the hypotheses of the three lines lemma, our interpolation theorem is shown for such simple functions, from which we extend our result to all f \in L^{p_t}(\mu).

To extend this to all of L^p (writing p = p_t and q = q_t), it suffices that Tf_n \to Tf a.e. for some sequence of simple functions f_n \in \Sigma_X with |f_n| \leq |f| and f_n \to f pointwise. Why? With this, we can invoke Fatou’s lemma (and also that \left\|f_n\right\|_p \to \left\|f\right\|_p by the dominated convergence theorem) to obtain the desired result, which is

\left\|Tf\right\|_q \leq \liminf \left\|Tf_n\right\|_q \leq \liminf M_0^{1-t} M_1^t\left\|f_n\right\|_p \leq M_0^{1-t} M_1^t \left\|f\right\|_p.

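Recall that convergence in measure is a means to derive a subsequence that converges a.e. The hypotheses only control T in the p_0 and p_1 norms, and f need not lie in L^{p_0} or L^{p_1}, so we split it. Assume without loss of generality that p_0 < p_1, let E = \{x : |f(x)| > 1\}, and write f = g + h with g = f\chi_E and h = f - g, and likewise f_n = g_n + h_n with g_n = f_n\chi_E. Since p_0 < p_t < p_1, we have g \in L^{p_0}(\mu) and h \in L^{p_1}(\mu), and it is enough to show that Tg_n \to Tg and Th_n \to Th in measure, since a.e. convergent subsequences can then be extracted one after the other. By Chebyshev’s inequality, when q_0 < \infty we have

\nu(\{|Tg_n - Tg| > \epsilon\}) \leq \frac{\left\|Tg_n - Tg\right\|_{q_0}^{q_0}}{\epsilon^{q_0}} \leq \frac{M_0^{q_0}\left\|g_n - g\right\|_{p_0}^{q_0}}{\epsilon^{q_0}}.

The right-hand side goes to zero because \left\|g_n - g\right\|_{p_0} \to 0 by the dominated convergence theorem (pointwise convergence alone would not be enough; we use |g_n - g| \leq 2|g|). The same argument with q_1, M_1 and p_1 handles Th_n - Th when p_1 < \infty; when p_1 = \infty, the f_n should in addition be chosen to converge uniformly on the set where |f| \leq 1, which the standard construction of simple approximations allows. If q_0 or q_1 is infinite, no Chebyshev step is needed for that piece, since convergence in L^\infty is already uniform convergence a.e.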

Hilbert basis theorem

I remember learning this theorem in early 2015, but I could not remember its proof at all. Today, I relearned it. It employs a beautiful induction argument to transfer the Noetherianness (in the form of finite generation) from R to R[x] via the leading coefficient.

Hilbert Basis Theorem. If R is a Noetherian ring, then so is R[x].

Proof: Take some ideal J in R[x]; we may assume J \neq 0. For each n \geq 0, let I_n be the set of leading coefficients of the degree n polynomials in J, together with 0. Each I_n is an ideal of R, and multiplying by x shows I_n \subset I_{n+1}, so we get an ascending chain, which has to become constant eventually, say at k. Let m be the smallest degree of a non-zero polynomial in J (we may take k \geq m), and for m \leq n \leq k take a finite set A_n \subset J of degree n polynomials whose leading coefficients generate I_n; this is possible because R is Noetherian. With this we can, for any polynomial p \in J of degree n \geq m, construct a finite combination of elements of A = \displaystyle\cup_{n=m}^k A_n that agrees with p in degree and leading coefficient, so that subtracting it leaves a polynomial in J of lower degree. This naturally lends itself to induction on the degree, with m as the base case: for p of degree m, the difference has degree below m and lies in J, so it is the zero polynomial. Now assume, as the inductive hypothesis, that A generates all polynomials in J of degree at most n. Let p \in J have degree n+1. If n+1 \leq k, we can cancel out the leading coefficient using an R-linear combination of A_{n+1}, and then use the inductive hypothesis on the difference. If n+1 > k, the leading coefficient of p lies in I_{n+1} = I_n, so by our inductive hypothesis we can generate with A a degree n polynomial with the same leading coefficient (and thereby a degree n+1 one, multiplying by x), and from that apply our inductive hypothesis again, this time on the difference. Hence A is a finite generating set for J.
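The cancellation step can be written out explicitly. This is a sketch of a slightly more direct variant of the reduction, not necessarily the exact bookkeeping used above: the symbols I_n (the ideal of leading coefficients in degree n), q_i, a_i and r_i are names I am introducing for illustration.

```latex
% Let p \in J have degree n with leading coefficient a, and put n' = \min(n, k).
% Then a \in I_n = I_{n'}. Let q_1, \dots, q_s \in A_{n'} have leading
% coefficients a_1, \dots, a_s generating I_{n'}, and write a = \sum_i r_i a_i
% with r_i \in R. (All of these names are assumptions made for this sketch.)
\[
p(x) = a x^{n} + (\text{lower order terms}),
\qquad
a = \sum_{i=1}^{s} r_i a_i .
\]
% Shifting the combination up to degree n cancels the leading term:
\[
\deg\Big( p - x^{\,n-n'} \sum_{i=1}^{s} r_i\, q_i \Big) < n ,
\]
% and the difference still lies in J, so induction on the degree applies to it.
```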



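Later I learned that this came from an article just published by Time Magazine, and I thought: this is Time Magazine, a prestigious, best-selling American magazine. There I read that China has, surprisingly, already established its first overseas military base in Djibouti, on the northeast coast of Africa. But what left the deepest impression on me was the article’s final paragraph: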

The China striding into that spotlight is not guaranteed to win the future. In this fragmenting world, no one government will have the international influence required to continue to set the political and economic rules that govern the global system. But if you had to bet on one country that is best positioned today to extend its influence with partners and rivals alike, you wouldn’t be wise to back the U.S. The smart money would probably be on China.




Math vs engineering

I am currently a full time software engineer. I don’t really like the work and I mostly find it draining though I guess I’m not bad at it, though I’m definitely not great. Much of it is process and understanding of requirements and the specific codebase (that includes the tools it uses), which is more often than not not fun at all though I find it more tolerable now. It pays well but is low status, as Michael O Church loves to say. The work is rather lowbrow by STEM standards. I was thinking that it loads not very highly on g (at least line of business engineering) but rather highly on conscientiousness and ability to grind. The people who excel at it are those who can do that type of work for long hours and not feel tired, and often ones who have the genes to sleep 5 hours a day and still be fine. It’s not a very attractive or sexy ability, but it is a very useful and respectable one. One of my colleagues spent 4 years working on FPGAs just to design one chip and he said after that experience, he’s not gonna do anything related to chip design again. I know that chip design is much more technically involved, much higher barrier to entry, and is actually the hardest to replicate part of computing. Anybody can build a website but only a few places have the expertise and infrastructure to make a good CPU. The latter requires a sophisticated industrial process, the fabrication part, which involves much advanced applied physics, none of which I know. I’ve heard that because fabs are a physical constraint which run in cycles, it is imperative to meet deadlines, which means you need the types who can pull all-nighters, who can toil day in day out in the lab on very detail oriented work (that’s often grindy, not artsy or beautiful like math is) with little room for error. It also pays less than software engineering, for obvious economic reasons. On this note, I recall knowledgeable adults telling me not to major in EE because there are few jobs in it now. Electronics is design once, mass produce. So many of them have been outsourced.

Engineering is hard hard work. Not intellectually hard (though there is that aspect of it too in some of it), but grindily hard. Plumbing is inevitable, and you have to deal with some dirty complexity. You need a very high level of stamina and of some form of pain tolerance that I don’t regard myself as very high in, though I’ve improved substantially. It’s not a coincidence that engineering is what makes the big bucks, for individuals (somewhat) and for economies (or execs in them). Rich countries are the ones who can sell high end engineering products like cars and CPUs.

Mathematics, theoretical science, on the other hand, is much more about abstraction of the form that requires a higher level of consciousness. Math and theoretical physics are far more g-loaded than engineering is and attract smarter people, a different breed of personality, those with a more intellectual upper class vibe that I see largely absent in software engineering. These are used in engineering, but in it, they are merely tools with the focus being on design and on practical application, with cost as a major consideration. It is like how in physics, there is much mathematics used, but because math is just a tool for it, physicists can be sloppy with their math. Pure theoretical science is much more deep and far less collective and organizationally complex, with a pronounced culture of reverence for individual genius and brilliance. There is also an emphasis on beauty and on some pure elevation of the human spirit in this type of pure thought.

I myself am by nature much more in the theoretical category though I am for now in the practical one, pressured into it by economic circumstances, which I am looking to leave. I will say though that I have derived some satisfaction and confidence from having some practical skills and from having done some things which others find directly useful, as well as having endured some pain so I know what that feels like. In the unlikely case that I actually make it as a mathematician, I can say that unlike most of my colleagues I didn’t spend my entire life in the ivory tower and actually suffered a bit in the real world. I can thereby be more down-to-earth, as opposed to the intellectual snob that I am. I will say though that I do genuinely respect those who are stimulated by engineering enough to do it 24-7 even in their spare time. I don’t think I will ever be able to experience that by my very makeup. However, I do at least suspect that I am capable of experiencing to some extent a higher world that most of those guys fail to, which should bring me some consolation.

Galois theory

I’ve been quite exhausted lately with work and other annoying life things. So sadly, I haven’t been able to think about math much, let alone write about it. However, this morning on the public transit I was able to indulge a bit by reviewing in my head some essentials behind Galois theory, in particular how its fundamental theorem is proved.

The first part of it states that there is an inclusion-reversing correspondence between the intermediate fields and the subgroups of the full Galois group, and moreover that the degree of an intermediate field over the base field is equal to the index of the corresponding subgroup. This can be easily proved using the primitive element theorem, which I will state and prove.

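Primitive element theorem: Let F be an infinite field and let \alpha, \beta be algebraic over F, with \beta separable (which is automatic in characteristic zero). Then F(\alpha)(\beta), the field obtained by adjoining \alpha and \beta to F, can be represented as F(\gamma) for some single element \gamma. This extends inductively: any finite separable extension of F can be obtained by adjoining a single primitive element. (Finite fields need a separate argument, via the cyclicity of the multiplicative group.)

Proof: Let \gamma = \alpha + c\beta for some c \in F. We will show that there is such a c for which \beta is contained in F(\gamma); then \alpha = \gamma - c\beta is in F(\gamma) as well, so F(\gamma) = F(\alpha)(\beta). Let f, g be the minimal polynomials over F of \alpha and \beta respectively. Let h(x) = f(\gamma - cx), which has coefficients in F(\gamma) and has \beta as a root. The minimal polynomial of \beta over F(\gamma) must divide both h and g. Suppose it has degree at least 2. Since g has no repeated roots, there is then some common root \beta' \neq \beta of g and h, which induces \alpha' = \gamma - c\beta' that is a root of f. With \gamma = \alpha + c\beta = \alpha' + c\beta', we get c = (\alpha' - \alpha)/(\beta - \beta'). As f and g have finitely many roots, there is only a finite number of c such that \beta is not in F(\gamma), and since F is infinite we can pick c outside this set. QED.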
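A standard concrete instance, added here as an illustration rather than taken from the argument above: over F = \mathbb{Q} with \alpha = \sqrt{2}, \beta = \sqrt{3} and c = 1, the element \gamma = \sqrt{2} + \sqrt{3} is primitive, and both generators can be recovered from it explicitly.

```latex
% Powers of \gamma = \sqrt{2} + \sqrt{3}:
\[
\gamma^2 = 5 + 2\sqrt{6},
\qquad
\gamma^3 = (\sqrt{2} + \sqrt{3})(5 + 2\sqrt{6}) = 11\sqrt{2} + 9\sqrt{3}.
\]
% Solving the linear system given by \gamma and \gamma^3:
\[
\sqrt{2} = \frac{\gamma^3 - 9\gamma}{2},
\qquad
\sqrt{3} = \frac{11\gamma - \gamma^3}{2},
\]
% so \mathbb{Q}(\sqrt{2})(\sqrt{3}) = \mathbb{Q}(\gamma).
```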

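The degree of a field extension equals the degree of the minimal polynomial of its primitive element. For an intermediate field F(\gamma) with corresponding subgroup H, an automorphism in the full Galois group can map \gamma to any one of the roots of its minimal polynomial over F, and two automorphisms send \gamma to the same root exactly when they lie in the same coset of H. The roots thus correspond to the cosets of H, so the index of H equals the degree of F(\gamma) over F.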

The second major part of this fundamental theorem states that a subgroup is normal exactly when the corresponding intermediate field is a normal extension of the base field. To see one direction, remember that if a field extension is normal, a map that preserves multiplication and addition (and fixes the base field) cannot take an element of the extended field outside it, as that would imply that its minimal polynomial has a root outside the extended field, thereby violating normality. Any g in the full Galois group thus maps a normal intermediate field (the one fixed by the subgroup H we’re trying to prove is normal) into itself. Thus for all h \in H, g^{-1}hg also fixes the intermediate field, meaning it’s in H.


Convergence in measure

Let f, f_n (n \in \mathbb{N}) : X \to \mathbb{R} be measurable functions on a measure space (X, \Sigma, \mu). We say f_n converges to f globally in measure if for every \epsilon > 0,

\displaystyle\lim_{n \to \infty} \mu(\{x \in X : |f_n(x) - f(x)| \geq \epsilon\}) = 0.

To see that this means the existence of a subsequence with pointwise convergence almost everywhere, let n_k be increasing and such that \mu(E_k) < 2^{-k}, where E_k = \{x \in X : |f_{n_k}(x) - f(x)| \geq \frac{1}{k}\}. (We invoke the definition of limit here, with \epsilon = \frac{1}{k}.) The set of x that lie in infinitely many E_k is contained in \cup_{j \geq k} E_j for every k, so its measure is at most \sum_{j \geq k} 2^{-j} = 2^{1-k}, and hence is zero. Every other x satisfies |f_{n_k}(x) - f(x)| < \frac{1}{k} for all sufficiently large k, so f_{n_k}(x) \to f(x). Note that the summable bound 2^{-k} matters here: a bound of \frac{1}{k} on \mu(E_k) would not be enough for this argument.
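That passing to a subsequence is genuinely necessary can be seen from the standard "typewriter" sequence on [0,1] with Lebesgue measure. It is added here as an illustration, and the indexing n = 2^k + j is just one common convention.

```latex
% Typewriter sequence on [0,1]: for n = 2^k + j with 0 \le j < 2^k, set
\[
f_n = \chi_{[\, j 2^{-k},\ (j+1) 2^{-k} \,]} .
\]
% Convergence to 0 in measure: for 0 < \epsilon \le 1,
\[
\mu(\{x : |f_n(x)| \ge \epsilon\}) = 2^{-k} \to 0
\quad\text{as } n \to \infty .
\]
% No pointwise convergence anywhere: every x \in [0,1] lies in at least one of
% the intervals of each generation k, so f_n(x) = 1 for infinitely many n,
% while f_n(x) = 0 for infinitely many n as well.
% A subsequence that does converge a.e.:
\[
f_{2^k} = \chi_{[0,\, 2^{-k}]} \to 0 \quad\text{for every } x \ne 0 .
\]
```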

This naturally extends to every subsequence’s having a further subsequence with pointwise convergence almost everywhere (a subsequence of a sequence converging in measure also converges in measure, to the same limit). The converse holds when \mu(X) < \infty. To prove it, suppose by contradiction that f_n does not converge to f in measure. Then there are \epsilon > 0 and \delta > 0 and a subsequence n_k such that \mu(\{x \in X : |f_{n_k}(x) - f(x)| \geq \epsilon\}) \geq \delta for all k. By hypothesis, some further subsequence converges to f almost everywhere. On a finite measure space, almost everywhere convergence implies convergence in measure (apply the dominated convergence theorem to the indicator functions of the sets above, which are dominated by 1 and go to zero a.e.), which contradicts the lower bound \delta. The finiteness cannot be dropped: on \mathbb{R} with Lebesgue measure, \chi_{[n, n+1]} \to 0 pointwise but not in measure.


An observation on conjugate subgroups

Let H and H' be conjugate subgroups of a finite group G, that is, for some g \in G, g^{-1}Hg = H'. Equivalently, HgH' = gH', which means that under the action of H on G/H' by left multiplication, the coset gH' has stabilizer subgroup H, all of the acting group. Conversely, such a fully stabilized coset gives g^{-1}Hg \subset H', with equality when H and H' have the same order. Now suppose only that H is a p-group and that the index of H' in G is not divisible by p. Then such a fully stabilized coset must exist by the following lemma.

If H is a p-group that acts on a finite set \Omega, then |\Omega| \equiv |\Omega_0|\;(\mathrm{mod\;} p), where \Omega_0 is the subset of \Omega of elements fully stabilized by H.

Its proof rests on the orbit-stabilizer theorem: every orbit with more than one element has size a power of p greater than 1, hence a multiple of p, so these orbits vanish modulo p.
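Written out, the counting goes as follows. The orbit representatives \omega_1, \dots, \omega_r and the exponents a_i are names introduced here for the sketch.

```latex
% Decompose \Omega into H-orbits; \omega_1, \dots, \omega_r represent the orbits
% with more than one element, and H_{\omega_i} denotes the stabilizer of \omega_i.
\[
|\Omega| = |\Omega_0| + \sum_{i=1}^{r} [H : H_{\omega_i}],
\qquad
[H : H_{\omega_i}] = p^{a_i} \ \text{with } a_i \ge 1 .
\]
% Every term in the sum is divisible by p, hence
\[
|\Omega| \equiv |\Omega_0| \pmod{p} .
\]
% With \Omega = G/H' and [G : H'] not divisible by p, this forces
% |\Omega_0| \not\equiv 0 \pmod{p}, so \Omega_0 is non-empty.
```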

This is the natural origin of the second Sylow theorem.