## Innate mathematical ability

This morning I had the great pleasure of reading an article on LessWrong on innate ability by Jonah Sinick. Jonah has been one of my greatest influences and inspirations, and I have interacted with him substantially. He is unusual in one of the best ways possible. I would not be surprised if he goes on to do something extraordinary.

When I catch up with Jonah, I like to talk with him about math, mathematicians, and IQ, which happens to be what that article of his on LessWrong is about. 😉 That article resonates with me deeply because I had experiences similar to his. I hypothesize that I was also twice exceptional, albeit in different ways, with the effects compounded by my unusual background, none of which the mediocrities within the American public school system are good at dealing with effectively.

This writing of Jonah’s has brought forth reflections in my own mind with regard to mathematical ability, development, and style. I’ll say that as a little kid under 6, I was very good at arithmetic and even engaged in it obsessively. However, by age 8, after two years of adjusting to life in America, having started off not knowing a word of English, I had forgotten most of that. I was known to be good at math among the normal students; of course, that doesn’t mean much. In grade school, I was not terribly interested in math or anything academic; I was more interested in playing and watching sports, particularly basketball and baseball.

I didn’t have any mathematical enrichment outside of school other than this silly after school math olympiad program. Nonetheless, I managed to test into two year accelerated math once I reached junior high, not that it means anything. In junior high, we were doing this stupid “core math” with graphing calculators and “experiments.” I didn’t realize that I was actually a joke at math until I failed miserably at the state mathcounts contest, having not prepared for it, unlike all those other tiger mommed Asian kids, who to me seemed way beyond me at that time. It only occurred to me that I might have some real talent for math when I made the AIME in 10th grade, taking the AMCs for the first time, being one of four in my high school of about 2000 to do so. I thought it was fun solving some of those math contest problems, which were more g-loaded, with an emphasis on the pattern recognition side.

It was after that I started to read up on the history of mathematics and mathematicians. I taught myself some calculus and was fascinated by it, not that I understood it very well. But I could easily sense that this was much more significant than many of those contrived contest problems, and soon, I began to lose interest in the contest stuff. It was also after that that I learned about proving things, which the American public school math doesn’t teach. I finally realized what mathematics is really about.

Like Jonah, I had some difficulties with careless errors and mental organization. I don’t think my raw intellectual horsepower was very high back in high school, but fortunately, it has improved substantially since then, so that it is for the most part no longer the major impediment.

I took calculus officially in 11th grade, and it was a breeze for me. I could easily compute the areas and volumes and such, but the entire time I felt quite dissatisfied, because I could not actually understand that stuff at a rigorous, theoretical level as I pored over our textbook, which went up to vector calculus, during lecture. The lectures were rather inane, as one would expect given the mismatch between the cognitive threshold of the material and the distribution of ability among the students. I knew from reading online of the rich world of math far beyond what we were covering, most of which I was not intellectually mature enough to access at that time. However, I vividly remember that during the summer after 11th grade, while attending a math summer program, I was able to comfortably write out the epsilon-delta definition of a limit with an understanding of why it was reasonably defined that way. Still, I would say I was quite weak in terms of both my mathematical maturity and overall intellectual ability. There were too many things I wasn’t aware of, including the g factor, that I easily would have been aware of had I been higher in verbal ability, which would have enabled me to read, absorb, and internalize information much more rapidly and broadly. In contrast, Jonah had discovered independently, or so he says, the lack of free will at the age of 7!

I made some incremental advances in my math knowledge from reading and thinking outside of school the next year. As for contest math, I almost made the USAMO. Though I had improved, I was still not terribly quick and careful with solving contest style problems and doing computations. I think close to graduation, I also solved some Putnam problems.

Only in undergrad did I learn real math more seriously, but even there, nothing too advanced. US undergrad is a joke, and I also was one, just to a lesser extent than most of my “peers.” Almost certainly, Jonah, based on what he’s told me, had gained much deeper and broader knowledge at the same stage, from reading the works of giants like Euler and Riemann.

I’ve noticed how there are a lot of Chinese(-American) kids really into those high school math contests, and they now also dominate USAMO and Putnam (though be careful: in the latter, you’ve also got some Chinese internationals drawn from the elite of China). I will say that at the lower levels, many of those kids have some pretty low taste and lack the ability to think outside the system that would enable them to discover the existence of real math, as opposed to this artificial math game that they enjoy playing or are pressured into playing for college. Though those contests have a high pattern recognition component to them, there is not really much depth or substantial math knowledge. It is also my belief, with reference to Jonah’s article, that math contests are mostly M loaded while real math is more V loaded. So this behavior is consistent with the lopsidedness in favor of M and perhaps also short term working memory of Chinese students. It has also been Jonah’s belief that controlling for g, these contests select for low taste and value judgement, and I surely identify with that perspective. So maybe college admissions are somewhat fair to assess an Asian penalty?

Of the thesis of Jonah’s article, a representative figure is Terry Tao. There, Jonah also pointed out that Tao’s research in math is more concrete and problem solving oriented by pure math standards, in line with what appears to be the same lopsided (modulo the absolute level, as Terry is a far far outlier) cognitive profile of his based on testing at age 9 and 10. Again, people enjoy what they are best at, and though, Terry Tao is almost certainly at least +3 sigma at verbal, he is far more rare, at least +5 sigma, a real übermensch, in the (in some sense dual) pattern recognition component, which means he leans towards the areas of math more loaded on the latter. I have heard the saying that even other Fields medalists are intimidated by Terry Tao. The breadth and volume and technical power of his work is almost unrivaled and otherworldly. The media makes it seem like Terry is a league above even the other Fields medalists. However, Jonah seems to believe that the deepest and most leading of mathematicians are the ones who are more theory builders, who create through leaps of insight and synthesis new fields and directions that keep mathematicians busy for decades, and even centuries. That would be say Grothendieck or SS Chern, and an ability that is more loaded on verbal ability, crudely speaking. Again, I have felt the same. This might explain why the advantage of Chinese students is not anywhere near as pronounced in math research as in contests, and why some people say that generally speaking, the Chinese mathematicians are more problem solving and technical than theoretical, more analysis than algebra. Likewise, we can predict the opposite for Jews who are skewed in favor of verbal. A corollary of this would be that the Jews produce the deepest thinkers, adjusted somewhat for population, which is almost certainly the case, if you look at the giants of mathematics and theoretical physics.

I’ll conclude with the following remark. I used to revere somewhat those who placed very highly on those contests, until I realized that many of them are actually somewhat weak in terms of deep understanding and thinking at a more theoretical level. Yes, I have met MOSPers who got destroyed by real math and who are not very intellectually versatile, with glaring weaknesses; I was quite surprised initially that even I seemed to be smarter if not a lot than some of them. Once upon a time, I couldn’t understand those who appeared very strong at real math (and often also science and/or engineering and/or humanities) who struggled with more concrete math and/or contest-style problem solving, like Jonah, who has written on LessWrong of his difficulties with accuracy on the trivial math SAT. I’ve met this other guy, who I thought was an idiot for being unable to perform simple computations, who is leagues beyond me in the most abstract of math, who writes prolifically about partially V-loaded areas of math like model theory. Now, the more metacognitive me has awakened to the reality that I may never by deficit of my neurobiology be able to fathom and experience what they’re capable of. After all, there are plenty I am almost certain are and are essentially doomed to be very delusional by nature relative to me, and since I’m at the far tail but not quite so much, there are bound to be people who view me the same. I can only hope that I can become more like them through some combination of exposure and organic neurobiological growth, but I as a realist will not deem that very likely.

## More on Asian stereotypes

I just stumbled upon this wonderful essay by Gwydion Madawc Williams on why the Ming voyages led by Zheng He (郑和) led to nothing. The quote of it particularly memorable to me was this:

The separation of craft and education as represented by China’s illiterate shipwrights was indeed a genuine weakness in the Chinese system.  Christian Europe always remembered that St Peter had been a fisherman and St Paul a tent-maker, and it was quite acceptable for learned people to also be involved in manufacturing.  The weakness of Confucianism was not so much that it rated agriculture and craft above merchant trade, but that it insisted on the educated being a learned caste distanced from all of these matters.

Again, it’s the Asian stereotype of being a study hard grind lacking in practical, hands-on skills and “well-roundedness” and “social skills” and all that, which admissions officers use to justify denying Asian applicants. I’ll say that from what I know, which is still very limited, Confucianism was very much like that. The quote that epitomized this was: 劳心者治人，劳力者治于人, which translates roughly to “the worker of the mind governs, the physical worker is governed.” The whole imperial examination system essentially created an upper class of bookworms for whom any form of hands-on labor was beneath them. To be a true 君子, a gentleman, you were supposed to study the classics and write poetry and engage in all that Confucian bull shit. I myself don’t have a very high opinion of Confucianism. It’s too conservative for me, with all the emphasis on ritual and filial piety. It discouraged any form of innovation outside the system, outside what was already there, which is partly why China could not make the giant leaps in science that the West did. I’ve read some of the Analects of Confucius and know some of the quotes, and I don’t think Confucius was a deep philosopher at all; there is little actual substance in what he said. On the other hand, Mo Tzu was a much further reaching, more scientific, and surprisingly modern thinker, and had China followed his path instead of banishing his school of thought into obscurity, the world would be completely different now, with China likely having made many more leaps of progress than it actually did. I’ll say that the West was able to escape the shackles of Christianity, but China could not by itself escape those of Confucianism, until its dire situation, which reached its nadir in 1900, forced it to.

Apparently, the elite college admissions officers aren’t terribly good at filtering out the real Asian grinds either, as I know one who went to Princeton, who I found ridiculous. He said that all he did in college was study, and even though he majored in math, he hardly knew any. Like, he didn’t know what a topological space is. When I went ice skating with him and some others, he was near the edge the whole time, and he characterized my skating backwards (not well at all) as “scary.” I told him I’m not very athletic and wasn’t even any good, unlike the girl he was dating at that time, who could do spins among other fancy “figure skating” things she was trying out. I did show him the video taken of this 360 somersault I did off a 15-foot cliff in Hawaii, into the water, which was the first time I had done anything like that. He was like: “that’s so scary.” I honestly didn’t know what to say. To justify himself, he was like: “Chinese parents only want their kids to study.” I told him that in China, there are some very athletic people who attend special sports schools. On that, he was like: “but those aren’t normal people.” I also remember when we went camping once, everybody else got drunk, so I got to drive that kid’s BMW back. He had told us that his father does business in Beijing, which might explain why he drives that kind of car. He came to the US at age 4. His Chinese is absolutely awful though, and he doesn’t realize it. He will of course say: “I already know enough. Some people can’t even speak it.” How should I put it: he is not just a bookworm, but a bookworm who can’t even do books well, and even an idiot like this still got admitted to Princeton. I’ve talked with one of my very smart Asian friends about this, and he was like: “but he’s socially normal, unlike us.” And more recently: “Maybe they do accept Asian grinds, just not the ones with bad social skills.”

From what I’ve seen, there are plenty of super conformist Asian grinds like him, but there are also many who aren’t, who are actually smart and interesting, like myself (or at least I hope). I think what he said about Chinese parents is somewhat true actually; after all, I saw many growing up. They do see academics as a way to get ahead more so than others, largely because in China, to get out of your rural village and/or not be stuck with a working class job, you had to do sufficiently well on the gaokao to get into a good major at a good university. It’s funny that I’ve actually seen a ton of ignorant, narrow-minded, and risk-averse uncool tiger Chinese parents. And I have also seen some extremely impressive ones, not just academically. There is again quite a wide range and variety.

There is a phenomenon I’ve witnessed, which is that if a person is extremely strong at X and merely above average at Y, then that person will seem weak at Y, even compared to another person about as good at Y but less lopsided. It seems a natural human cognitive bias to think this way. This is in fact applied rather perversely to Asians in stereotyping. For example, Asian students are perceived as weak at language and humanities because they are generally stronger at STEM. We all know that in fact math IQ and verbal IQ (which we can use crudely as proxies for STEM ability and humanities ability respectively) are highly correlated, which makes it highly unlikely that a STEM star is actually legitimately weak at humanities. He might not be interested in reading novels and such but that’s rather different. There is also that humanities is more cultural exposure loaded with a much higher subjective element to it, with much less of a uniform metric. It actually seems to me based on personal experience that is by no means representative that in terms of precise use of language and the learning of foreign languages, mathematicians and theoretical physicists are at or near the top in terms of ability. On this, I will give an opposing perspective that I identify with somewhat, which is that even if you’re very strong at Y, having an X that you are significantly more talented at is a weakness for Y, because engaging in Y deprives the joy derived from engaging in the X, which often leads to loss of interest over time. 
Maybe this is why employers shy from hiring people who they deem “overqualified?” On this, I have thought of how possibly the lopsided cognitive profile in East Asians (with what is likely at least 2/3 SD differential between math/visuo-spatial and verbal, normalizing on white European scores) predisposed the thinking of the elite (assuming that lopsidedness is preserved at the far tail) as well as the development of that society at large in certain ways, some of which may have been not the most conducive for, say, the development of theoretical science. This is of course very speculative, and I would actually hypothesize that the far tail cognitive elite among East Asians is more balanced in terms of the math/visuo-spatial and verbal split, given the great extent to which the imperial examination system, which tested almost exclusively literary things, selected for V at the tail instead of for M.

On the aforementioned bias, I’ll give another illustrative example. I once said to this friend of mine, a math PhD student, not Asian, how there’s the impression that people who are weaker academically tend to be better at certain practical things, like starting restaurants and businesses. We sure all know there are plenty who weren’t good at school but were very shrewd and successful at business, at practical things. That guy responded with reference to Berkson’s paradox. He said something like: “That’s because you are unlikely to see those who are bad at both. They tend to be in prison or in the lower classes.” I could only agree.

I’ll conclude with another more dramatic example. I used to, when I knew nothing about the subject, think that people who were really good at math were weirdos and socially awkward. For one, there was this kid in my high school who was way better than me at math at the time, who was incredibly autistic. Also, the summer after 10th grade, I saw A Beautiful Mind, which depicts the mathematician as mentally ill. Now I would bet the incidence of schizophrenia among the mathematically gifted is lower than it is in the whole population. It just happens that certain combinations of extreme traits are vastly more noticeable or exposed by the media to the public (a mathematician or physicist may think of this as weighting those with such combinations with a delta function, or something along that direction at least). I wasn’t quite aware of that at that time though. Only later, after meeting more math people, did I realize that math people are not actually that socially out of it in general, far from it, at least once they’re past a certain age, by which they will have had the chance to interact with more people like them and form their own peer group.

It is my hope that people can be more cognizant of these biases described in this blog post.

## Hahn-Banach theorem

I’m pleased to say that I find the derivation of the Hahn-Banach theorem pretty straightforward by now. Let me first state it, for the real case.

Hahn-Banach theorem: Let $V$ be a real vector space and let $p: V \to \mathbb{R}$ be sublinear. If $f : U \to \mathbb{R}$ is a linear functional on the subspace $U \subset V$ with $f(x) \leq p(x)$ for $x \in U$, then there exists a linear extension $g$ of $f$ to all of $V$, that is, $g(x) = f(x)$ for $x \in U$, such that $g(x) \leq p(x)$ for all $x \in V$.

To show this, start by taking any $x_0 \in V \setminus U$. We wish to assign some value $\alpha$ to $x_0$, so that the extension sends $y \pm \lambda x_0$ to $f(y) \pm \lambda\alpha$, in a way that keeps $p$ as the dominating function on the vector space $U + \mathbb{R}x_0$. For this to happen, applying the linearity of $f$ and the domination constraint, we can derive

$\frac{f(y) - p(y - \lambda x_0)}{\lambda} \leq \alpha \leq \frac{p(y+\lambda x_0) - f(y)}{\lambda}, \quad y \in U, \lambda > 0$.

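Replacing $y$ by $\lambda y$ and using the positive homogeneity of $p$, this reduces to the case $\lambda = 1$, so a valid $\alpha$ exists exactly when

$\sup_{y \in U} \left[f(y) - p(y-x_0)\right] \leq \inf_{y \in U} \left[p(y+x_0) - f(y)\right]$.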

This follows from

$f(y_1) + f(y_2) = f(y_1 + y_2) \leq p(y_1 + y_2) \leq p(y_1 - x_0) +p(y_2 + x_0), \quad y_1, y_2 \in U$.
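As a sketch of why this finishes the one-dimensional step (the symbol $\tilde{f}$ for the extension to $U + \mathbb{R}x_0$ is notation introduced here for illustration): rearranging the inequality above gives

$f(y_1) - p(y_1 - x_0) \leq p(y_2 + x_0) - f(y_2), \quad y_1, y_2 \in U$,

so the supremum of the left side over $y_1$ is at most the infimum of the right side over $y_2$, and any $\alpha$ between the two works. Setting $\tilde{f}(y + \mu x_0) = f(y) + \mu\alpha$ for $y \in U$, $\mu \in \mathbb{R}$, the positive homogeneity of $p$ then gives, for $\lambda > 0$,

$\tilde{f}(y + \lambda x_0) = \lambda\left(f(y/\lambda) + \alpha\right) \leq \lambda\, p(y/\lambda + x_0) = p(y + \lambda x_0)$,

$\tilde{f}(y - \lambda x_0) = \lambda\left(f(y/\lambda) - \alpha\right) \leq \lambda\, p(y/\lambda - x_0) = p(y - \lambda x_0)$.

The case $\mu = 0$ is just the original hypothesis $f \leq p$ on $U$.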

Now take the set of all linear functionals that extend $f$, are defined on some subspace of $V$ containing $U$, and are dominated by $p$ there. Denote an element of it as $(h, W)$, with $W$ the domain of $h$. We introduce a partial order wherein $(h, W) \leq (h', W')$ iff $W \subset W'$ and $h(x) = h'(x)$ for $x \in W$. We can apply Zorn’s lemma to this, as we can take the union to derive an upper bound for any chain. Any maximal element is necessarily of the form $(g, V)$, since if its domain were not the entire vector space, we could by the above construct a strictly larger element.

## Riesz-Thorin interpolation theorem

I had, a while ago, the great pleasure of going through the proof of the Riesz-Thorin interpolation theorem. I believe I understand the general strategy of the proof, though for sure, I glossed over some details. It is my hope that in writing this, I can fill in the holes for myself at the more microscopic level.

Let us begin with a statement of the theorem.

Riesz-Thorin Interpolation Theorem. Suppose that $(X,\mathcal{M}, \mu)$ and $(Y, \mathcal{N}, \nu)$ are measure spaces and $p_0, p_1, q_0, q_1 \in [1, \infty]$. If $q_0 = q_1 = \infty$, suppose also that $\nu$ is semifinite. For $0 < t < 1$, define $p_t$ and $q_t$ by

$\frac{1}{p_t} = \frac{1-t}{p_0} + \frac{t}{p_1}, \qquad \frac{1}{q_t} = \frac{1-t}{q_0} + \frac{t}{q_1}$.

If $T$ is a linear map from $L^{p_0}(\mu) + L^{p_1}(\mu)$ into $L^{q_0}(\nu) + L^{q_1}(\nu)$ such that $\left\|Tf\right\|_{q_0} \leq M_0 \left\|f\right\|_{p_0}$ for $f \in L^{p_0}(\mu)$ and $\left\|Tf\right\|_{q_1} \leq M_1 \left\|f\right\|_{p_1}$ for $f \in L^{p_1}(\mu)$, then $\left\|Tf\right\|_{q_t} \leq M_0^{1-t}M_1^t \left\|f\right\|_{p_t}$ for $f \in L^{p_t}(\mu)$, $0 < t < 1$.

We begin by noticing that in the special case where $p = p_0 = p_1$,

$\left\|Tf\right\|_{q_t} \leq \left\|Tf\right\|_{q_0}^{1-t} \left\|Tf\right\|_{q_1}^t \leq M_0^{1-t}M_1^t \left\|f\right\|_p$,

wherein the first inequality is a consequence of Hölder’s inequality. Thus we may assume that $p_0 \neq p_1$ and in particular that $p_t < \infty$.
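For completeness, here is a sketch of that Hölder step, assuming $q_0, q_1 < \infty$ and writing $h$ for $Tf$ (the letter $h$ is just shorthand introduced here). The exponents $\frac{q_0}{(1-t)q_t}$ and $\frac{q_1}{tq_t}$ are conjugate, since their reciprocals sum to $q_t\left(\frac{1-t}{q_0} + \frac{t}{q_1}\right) = 1$, so

$\int |h|^{q_t} d\nu = \int |h|^{(1-t)q_t}|h|^{tq_t} d\nu \leq \left(\int |h|^{q_0} d\nu\right)^{(1-t)q_t/q_0}\left(\int |h|^{q_1} d\nu\right)^{tq_t/q_1} = \left\|h\right\|_{q_0}^{(1-t)q_t}\left\|h\right\|_{q_1}^{tq_t}$.

Taking $q_t$-th roots gives $\left\|h\right\|_{q_t} \leq \left\|h\right\|_{q_0}^{1-t}\left\|h\right\|_{q_1}^{t}$. When one of the exponents is infinite, the same bound follows by pulling the $L^\infty$ norm out of the integral.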

Observe that the space of all simple functions on $X$ that vanish outside sets of finite measure is dense in $L^p(\mu)$ for $p < \infty$, and the analogous statement holds for $Y$. To show this, take any $f \in L^p(\mu)$ and a sequence of simple $f_n$ with $|f_n| \leq |f|$ that converges to $f$ almost everywhere. Then each $f_n \in L^p(\mu)$, from which it follows that each is nonzero only on a set of finite measure, and $\left\|f_n - f\right\|_p \to 0$ by the dominated convergence theorem. Denote the respective spaces of such simple functions by $\Sigma_X$ and $\Sigma_Y$.

To show that $\left\|Tf\right\|_{q_t} \leq M_0^{1-t}M_1^t \left\|f\right\|_{p_t}$ for all $f \in \Sigma_X$, we use the fact that

$\left\|Tf\right\|_{q_t} = \sup \left\{\left|\int (Tf)g d\nu \right| : g \in \Sigma_Y, \left\|g\right\|_{q_t'} = 1\right\}$,

where $q_t'$ is the conjugate exponent to $q_t$. We can rescale $f$ such that $\left\|f\right\|_{p_t} = 1$.

From this it suffices to show that across all $f \in \Sigma_X, g \in \Sigma_Y$ with $\left\|f\right\|_{p_t} = 1$ and $\left\|g\right\|_{q_t'} = 1$, $|\int (Tf)g d\nu| \leq M_0^{1-t}M_1^t$.

For this, we use the three lines lemma, the inequality of which has the same value on its RHS.

Three Lines Lemma. Let $\phi$ be a bounded continuous function on the strip $0 \leq \mathrm{Re} z \leq 1$ that is holomorphic on the interior of the strip. If $|\phi(z)| \leq M_0$ for $\mathrm{Re} z = 0$ and $|\phi(z)| \leq M_1$ for $\mathrm{Re} z = 1$, then $|\phi(z)| \leq M_0^{1-t} M_1^t$ for $\mathrm{Re} z = t$, $0 < t < 1$.

This is proven via application of the maximum modulus principle to $\phi_{\epsilon}(z) = \phi(z)M_0^{z-1} M_1^{-z} \exp(\epsilon z(z-1))$ for $\epsilon > 0$. The factor $\exp(\epsilon z(z-1))$ serves to ensure that $|\phi_{\epsilon}(z)| \to 0$ as $|\mathrm{Im} z| \to \infty$ for any $\epsilon > 0$.
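A sketch of the computation behind this, assuming $M_0, M_1 > 0$ and writing $z = x + iy$ with $0 \leq x \leq 1$ (the names $x, y$ are introduced here for illustration): since $\mathrm{Re}\, z(z-1) = x(x-1) - y^2 \leq -y^2$ on the strip,

$|\exp(\epsilon z(z-1))| = e^{\epsilon(x(x-1) - y^2)} \leq e^{-\epsilon y^2}$,

which, with $\phi$ bounded, forces $|\phi_\epsilon(z)| \to 0$ as $|y| \to \infty$. On the two boundary lines,

$|\phi_\epsilon(is)| \leq M_0 \cdot M_0^{-1} \cdot e^{-\epsilon s^2} \leq 1, \qquad |\phi_\epsilon(1+is)| \leq M_1 \cdot M_1^{-1} \cdot e^{-\epsilon s^2} \leq 1$.

The maximum modulus principle on a sufficiently tall rectangle then gives $|\phi_\epsilon| \leq 1$ on the whole strip, and letting $\epsilon \to 0$ yields $|\phi(z)| M_0^{t-1} M_1^{-t} \leq 1$ for $\mathrm{Re}\, z = t$, which is the claimed bound.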

We construct a family $f_z$, depending holomorphically on $z$, such that $f_t = f$, where $t \in (0, 1)$ is the value corresponding to the interpolated $p_t$. To do this, we can express for convenience $f = \sum_1^m |c_j|e^{i\theta_j} \chi_{E_j}$ and $g = \sum_1^n |d_k|e^{i\psi_k} \chi_{F_k}$, where the $c_j$’s and $d_k$’s are nonzero and the $E_j$’s and $F_k$’s are disjoint in $X$ and $Y$ respectively, and raise each $|c_j|$ to the power $\alpha(z) / \alpha(t)$ for some function $\alpha$ with $\alpha(t) > 0$. With this, we have

$f_z = \displaystyle\sum_1^m |c_j|^{\alpha(z)/\alpha(t)}e^{i\theta_j}\chi_{E_j}$.

Needless to say, we can do similarly for $g$, with $\beta(t) < 1$,

$g_z = \displaystyle\sum_1^n |d_k|^{(1-\beta(z))/(1-\beta(t))}e^{i\psi_k}\chi_{F_k}$.

Together these turn the LHS of the inequality we desire to prove to a complex function that is

$\phi(z) = \int (Tf_z)g_z d\nu$.

To use the three lines lemma, we must satisfy

$|\phi(is)| \leq \left\|Tf_{is}\right\|_{q_0}\left\|g_{is}\right\|_{q_0'} \leq M_0 \left\|f_{is}\right\|_{p_0}\left\|g_{is}\right\|_{q_0'} \leq M_0 \left\|f\right\|_{p_t}\left\|g\right\|_{q_t'} = M_0$.

It is not hard to make it such that $\left\|f_{is}\right\|_{p_0} = 1 = \left\|g_{is}\right\|_{q_0'}$. A sufficient condition is that $|f_{is}| = |f|^{p_t/p_0}$ and $|g_{is}| = |g|^{q_t'/q_0'}$, so that the integrands associated with the norms equal $|f|^{p_t}$ and $|g|^{q_t'}$ respectively. With $\alpha(t) = 1/p_t$ and $1 - \beta(t) = 1/q_t'$, this equates to $\mathrm{Re} \alpha(is) = 1 / p_0$ and $\mathrm{Re} (1-\beta(is)) = 1 / q_0'$. Similarly, on the other boundary line we need $\mathrm{Re} \alpha(1+is) = 1 / p_1$ and $\mathrm{Re} (1-\beta(1+is)) = 1 / q_1'$. From this, we can solve that

$\alpha(z) = (1-z)p_0^{-1} + zp_1^{-1}, \qquad \beta(z) = (1-z)q_0^{-1} + zq_1^{-1}$.
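As a quick check, a sketch assuming $p_0 < \infty$: the affine choice $\alpha(z) = (1-z)p_0^{-1} + zp_1^{-1}$ satisfies

$\alpha(t) = \frac{1-t}{p_0} + \frac{t}{p_1} = \frac{1}{p_t}, \qquad \mathrm{Re}\, \alpha(is) = \frac{1}{p_0}, \qquad \mathrm{Re}\, \alpha(1+is) = \frac{1}{p_1}$,

so that on the line $\mathrm{Re}\, z = 0$, using the disjointness of the $E_j$’s,

$\left\|f_{is}\right\|_{p_0}^{p_0} = \displaystyle\sum_1^m |c_j|^{p_0 \mathrm{Re}\, \alpha(is)/\alpha(t)} \mu(E_j) = \displaystyle\sum_1^m |c_j|^{p_t} \mu(E_j) = \left\|f\right\|_{p_t}^{p_t} = 1$.

The same computation with $1 - \beta(is) = 1 - q_0^{-1} - is(q_1^{-1} - q_0^{-1})$, whose real part is $1/q_0'$, handles $g_{is}$, and the line $\mathrm{Re}\, z = 1$ is symmetric.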

With these functions inducing a $\phi(z)$ that satisfies the hypotheses of the three lines lemma, our interpolation theorem is shown for such simple functions, from which we extend the result to all $f \in L^{p_t}(\mu)$.

To extend this to all of $L^{p_t}$, it suffices to find a sequence of simple functions $f_n \in \Sigma_X$ with $|f_n| \leq |f|$ and $f_n \to f$ pointwise such that $Tf_n \to Tf$ a.e. Why? With this, we can invoke Fatou’s lemma (and also that $\left\|f_n\right\|_p \to \left\|f\right\|_p$ by the dominated convergence theorem) to obtain the desired result, which is, writing $p = p_t$ and $q = q_t$,

$\left\|Tf\right\|_q \leq \liminf \left\|Tf_n\right\|_q \leq \liminf M_0^{1-t} M_1^t\left\|f_n\right\|_p = M_0^{1-t} M_1^t \left\|f\right\|_p$.

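Recall that convergence in measure yields a subsequence that converges a.e. So it is enough to show that $\displaystyle\lim_{n \to \infty} \nu(\left|Tf_n - Tf\right| > \epsilon) = 0$ for all $\epsilon > 0$. This can be done by upper bounding with something that goes to zero. By Chebyshev’s inequality, for $h \in L^q(\nu)$ with $q < \infty$ we have

$\nu(\left|h\right| > \epsilon) \leq \frac{\left\|h\right\|_q^q}{\epsilon^q}$.

However, our hypotheses give constant upper bounds on $T$ only in the $p_0$ and $p_1$ norms, and $f \in L^{p_t}$ need not lie in $L^{p_0}$ or $L^{p_1}$, so we split $f$ into two pieces. Assume without loss of generality that $p_0 < p_1$, let $E = \{x : |f(x)| > 1\}$, and set $u = f\chi_E$, $v = f - u$, and likewise $u_n = f_n\chi_E$, $v_n = f_n - u_n$. Then $u \in L^{p_0}$ and $v \in L^{p_1}$, since $|u|^{p_0} \leq |f|^{p_t}$ and, for $p_1 < \infty$, $|v|^{p_1} \leq |f|^{p_t}$. By dominated convergence, $\left\|u_n - u\right\|_{p_0} \to 0$ and $\left\|v_n - v\right\|_{p_1} \to 0$ (if $p_1 = \infty$, choose the $f_n$ to converge uniformly where $f$ is bounded). So apply Chebyshev with $q_0$ to $Tu_n - Tu$ and with $q_1$ to $Tv_n - Tv$, and bound the resulting norms by $M_0\left\|u_n - u\right\|_{p_0}$ and $M_1\left\|v_n - v\right\|_{p_1}$, which go to zero; if an exponent is infinite, norm convergence in $L^\infty$ already gives a.e. convergence. Passing to a subsequence, $Tf_n = Tu_n + Tv_n \to Tu + Tv = Tf$ a.e.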

## Hilbert basis theorem

I remember learning this theorem early 2015, but I could not remember its proof at all. Today, I relearned it. It employed a beautiful induction argument to transfer the Noetherianness (in the form of finite generation) from $R$ to $R[x]$ via the leading coefficient.

Hilbert Basis Theorem. If $R$ is a Noetherian ring, then so is $R[x]$.

Proof: Take some nonzero ideal $J$ in $R[x]$. For each degree $n$, let $I_n \subset R$ consist of zero together with the leading coefficients of the degree $n$ polynomials in $J$. Each $I_n$ is an ideal of $R$, and multiplying by $x$ shows that $I_n \subset I_{n+1}$, so we get an ascending chain that has to become constant eventually, say at $k$. Let $m$ be the smallest degree of a nonzero polynomial in $J$, and for $m \leq n \leq k$ take finite sets $A_n \subset J$ of degree $n$ polynomials whose leading coefficients generate $I_n$. With this we can, for any polynomial $p \in J$, construct a finite combination within $A = \displaystyle\cup_{n=m}^k A_n$ that agrees with $p$ in its leading term, so that subtraction reduces $p$ to a lower degree. This naturally lends itself to induction on the degree, with $m$ as the base case: any polynomial in $J$ of degree lower than $m$ is the zero polynomial. Now assume, as the inductive hypothesis, that $A$ generates all polynomials in $J$ of degree at most $n$, and take $p \in J$ of degree $n+1$. If $n+1 \leq k$, we can cancel the leading term using an $R$-combination of $A_{n+1}$, and then use the inductive hypothesis on the difference. If $n+1 > k$, then $I_{n+1} = I_n$, so there is a degree $n$ polynomial in $J$ with the same leading coefficient as $p$, which by the inductive hypothesis is generated by $A$; multiplying it by $x$ gives a degree $n+1$ polynomial with the same leading term, and we apply the inductive hypothesis again, this time to the difference. Hence $A$ is a finite generating set for $J$.
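To see the argument in action, here is a small worked example; the ring $R = \mathbb{Z}$, the ideal $J = (2, x) \subset \mathbb{Z}[x]$, and the polynomial $p$ are chosen purely for illustration. Writing $I_n$ for the ideal of leading coefficients of the degree $n$ elements of $J$ (together with zero), $J$ consists of the polynomials with even constant term, so

$I_0 = 2\mathbb{Z}, \qquad I_n = \mathbb{Z} \text{ for } n \geq 1$,

and the chain becomes constant at $k = 1$, with $m = 0$. We may take $A_0 = \{2\}$ and $A_1 = \{x\}$. For $p = 3x^2 + 5x + 4 \in J$, peeling off the leading term one degree at a time gives

$p - 3x \cdot x = 5x + 4, \qquad (5x + 4) - 5 \cdot x = 4, \qquad 4 - 2 \cdot 2 = 0$,

so $p = (3x + 5) \cdot x + 2 \cdot 2$ is a combination of elements of $A = \{2, x\}$. The first step is the case $n + 1 > k$ (multiply a lower degree element by $x$), and the last two are the case $n + 1 \leq k$.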

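## 四海翻腾云水怒，五洲震荡风雷激 (The four seas seethe, clouds and waters raging; the five continents shake, wind and thunder roaring)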

The China striding into that spotlight is not guaranteed to win the future. In this fragmenting world, no one government will have the international influence required to continue to set the political and economic rules that govern the global system. But if you had to bet on one country that is best positioned today to extend its influence with partners and rivals alike, you wouldn’t be wise to back the U.S. The smart money would probably be on China.

## Math vs engineering

I am currently a full time software engineer. I don’t really like the work and I mostly find it draining, though I guess I’m not bad at it, even if I’m definitely not great. Much of it is process and understanding of requirements and the specific codebase (that includes the tools it uses), which is more often than not not fun at all, though I find it more tolerable now. It pays well but is low status, as Michael O Church loves to say. The work is rather lowbrow by STEM standards. I was thinking that it loads not very highly on g (at least line of business engineering) but rather highly on conscientiousness and ability to grind. The people who excel at it are those who can do that type of work for long hours and not feel tired, and often ones who have the genes to sleep 5 hours a day and still be fine. It’s not a very attractive or sexy ability, but it is a very useful and respectable one. One of my colleagues spent 4 years working on FPGAs just to design one chip, and he said that after that experience, he’s not gonna do anything related to chip design again. I know that chip design is much more technically involved, has a much higher barrier to entry, and is actually the hardest part of computing to replicate. Anybody can build a website but only a few places have the expertise and infrastructure to make a good CPU. The latter requires a sophisticated industrial process, the fabrication part, which involves much advanced applied physics, none of which I know. I’ve heard that because fabs are a physical constraint and run in cycles, it is imperative to meet deadlines, which means you need the types who can pull all-nighters, who can toil day in day out in the lab on very detail oriented work (that’s often grindy, not artsy or beautiful like math is) with little room for error. It also pays less than software engineering, for obvious economic reasons. On this note, I recall knowledgeable adults telling me not to major in EE because there are few jobs in it now. Electronics is design once, mass produce, so many of those jobs have been outsourced.

Engineering is hard hard work. Not intellectually hard (though there is that aspect of it too in some of it), but grindily hard. Plumbing is inevitable, and you have to deal with some dirty complexity. You need a very high level of stamina and of some form of pain tolerance that I don’t regard myself as very high in, though I’ve improved substantially. It’s not a coincidence that engineering is what makes the big bucks, for individuals (somewhat) and for economies (or execs in them). Rich countries are the ones who can sell high end engineering products like cars and CPUs.

Mathematics, and theoretical science generally, on the other hand, is much more about abstraction of the form that requires a higher level of consciousness. Math and theoretical physics are far more g-loaded than engineering is and attract smarter people, a different breed of personality, those with a more intellectual upper class vibe that I see largely absent in software engineering. These subjects are used in engineering, but there they are merely tools, with the focus being on design and on practical application, and with cost as a major consideration. It is like how in physics there is much mathematics used, but because math is just a tool for it, physicists can be sloppy with their math. Pure theoretical science is much deeper and far less collective and organizationally complex, with a pronounced culture of reverence for individual genius and brilliance. There is also an emphasis on beauty and on some pure elevation of the human spirit in this type of pure thought.

I myself am by nature much more in the theoretical category though I am for now in the practical one, pressured into it by economic circumstances, which I am looking to leave. I will say though that I have derived some satisfaction and confidence from having some practical skills and from having done some things which others find directly useful, as well as having endured some pain so I know what that feels like. In the unlikely case that I actually make it as a mathematician, I can say that unlike most of my colleagues I didn’t spend my entire life in the ivory tower and actually suffered a bit in the real world. I can thereby be more down-to-earth, as opposed to the intellectual snob that I am. I will say though that I do genuinely respect those who are stimulated by engineering enough to do it 24-7 even in their spare time. I don’t think I will ever be able to experience that by my very makeup. However, I do at least suspect that I am capable of experiencing to some extent a higher world that most of those guys fail to, which should bring me some consolation.

## Galois theory

I’ve been quite exhausted lately with work and other annoying life things, so sadly I haven’t been able to think about math much, let alone write about it. However, this morning on public transit I was able to indulge a bit by reviewing in my head some essentials of Galois theory, in particular how its fundamental theorem is proved.

The first part of it states that there is an inclusion-reversing correspondence between the intermediate fields (each the fixed field of a subgroup) and the subgroups of the full Galois group, and moreover that the degree of each intermediate field over the base field equals the index of the corresponding subgroup. This can be proved easily using the primitive element theorem, which I will state and prove.

Primitive element theorem: Let $F$ be an infinite field and let $\alpha, \beta$ be algebraic and separable over $F$ (separability is automatic in characteristic $0$). Then $F(\alpha)(\beta)$, the field obtained by adjoining $\alpha$ and $\beta$ to $F$, can be written as $F(\gamma)$ for a single element $\gamma$. By induction, any finite separable extension of $F$ is generated by a single primitive element.

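Proof: Let $\gamma = \alpha + c\beta$ for some $c \in F$. We will show that $c$ can be chosen so that $\beta$ is contained in $F(\gamma)$; then $\alpha = \gamma - c\beta$ is in $F(\gamma)$ as well, so $F(\gamma) = F(\alpha)(\beta)$. Let $f, g$ be the minimal polynomials over $F$ of $\alpha$ and $\beta$ respectively, and let $h(x) = f(\gamma - cx)$, which has coefficients in $F(\gamma)$ and satisfies $h(\beta) = f(\alpha) = 0$. The minimal polynomial of $\beta$ over $F(\gamma)$ must therefore divide both $h$ and $g$. Suppose it has degree at least $2$. Since $g$ has no repeated roots (separability), there is then some root $\beta' \neq \beta$ of it, and $\alpha' = \gamma - c\beta'$ is a root of $f$. From $\gamma = \alpha + c\beta = \alpha' + c\beta'$ we get $c = (\alpha' - \alpha)/(\beta - \beta')$. There are only finitely many pairs of roots $(\alpha', \beta')$, hence only finitely many $c$ for which $\beta$ is not in $F(\gamma)$, so as long as $F$ is infinite we can pick $c$ outside this set. QED.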
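As a concrete check of the argument, take $F = \mathbb{Q}$, $\alpha = \sqrt{2}$, $\beta = \sqrt{3}$, and $c = 1$, so that $\gamma = \sqrt{2} + \sqrt{3}$. The Python sketch below is only an illustration, not part of the proof. It does exact arithmetic in $\mathbb{Q}(\sqrt{2}, \sqrt{3})$ using the basis $1, \sqrt{2}, \sqrt{3}, \sqrt{6}$ (the helper names `add`, `scale`, `mul`, and `power` are my own) and confirms that both $\sqrt{2}$ and $\sqrt{3}$ are polynomials in $\gamma$.

```python
from fractions import Fraction

# An element a + b*sqrt(2) + c*sqrt(3) + d*sqrt(6) of Q(sqrt2, sqrt3)
# is stored as the tuple (a, b, c, d) of rational coefficients.

def add(u, v):
    return tuple(x + y for x, y in zip(u, v))

def scale(k, u):
    return tuple(Fraction(k) * x for x in u)

def mul(u, v):
    a1, b1, c1, d1 = u
    a2, b2, c2, d2 = v
    return (
        a1*a2 + 2*b1*b2 + 3*c1*c2 + 6*d1*d2,   # rational part
        a1*b2 + b1*a2 + 3*(c1*d2 + d1*c2),     # sqrt(3)*sqrt(6) = 3*sqrt(2)
        a1*c2 + c1*a2 + 2*(b1*d2 + d1*b2),     # sqrt(2)*sqrt(6) = 2*sqrt(3)
        a1*d2 + d1*a2 + b1*c2 + c1*b2,         # sqrt(2)*sqrt(3) = sqrt(6)
    )

def power(u, n):
    result = ONE
    for _ in range(n):
        result = mul(result, u)
    return result

ZERO = (Fraction(0),) * 4
ONE = (Fraction(1), Fraction(0), Fraction(0), Fraction(0))
SQRT2 = (Fraction(0), Fraction(1), Fraction(0), Fraction(0))
SQRT3 = (Fraction(0), Fraction(0), Fraction(1), Fraction(0))

gamma = add(SQRT2, SQRT3)          # gamma = alpha + c*beta with c = 1
g3 = power(gamma, 3)               # equals 11*sqrt(2) + 9*sqrt(3)

# Both generators are recovered as polynomials in gamma.
sqrt2_from_gamma = scale(Fraction(1, 2), add(g3, scale(-9, gamma)))
sqrt3_from_gamma = scale(Fraction(1, 2), add(scale(11, gamma), scale(-1, g3)))

# gamma is a root of x^4 - 10x^2 + 1, a quartic, matching the degree 4
# of Q(sqrt2, sqrt3) over Q.
quartic_at_gamma = add(add(power(gamma, 4), scale(-10, power(gamma, 2))), ONE)

print(sqrt2_from_gamma == SQRT2, sqrt3_from_gamma == SQRT3, quartic_at_gamma == ZERO)
```

All three comparisons come out true: $\sqrt{2} = (\gamma^3 - 9\gamma)/2$ and $\sqrt{3} = (11\gamma - \gamma^3)/2$, so $\mathbb{Q}(\gamma)$ already contains both generators.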

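The degree of a field extension equals the degree of the minimal polynomial of its primitive element. Each automorphism in the full Galois group maps that primitive element to one of the roots of its minimal polynomial, every root arises this way, and two automorphisms send it to the same root exactly when they lie in the same coset of the subgroup fixing it. The number of cosets therefore equals the degree.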

The second major part of this fundamental theorem states that a subgroup is normal exactly when the corresponding fixed field is a normal extension of the base field. To see one direction, remember that if a field extension is normal, a map that preserves multiplication and addition and fixes the base field cannot take an element of the extended field outside it, as that would imply that its minimal polynomial has a root outside the extended field, thereby violating normality. Thus any $g$ in the full Galois group maps the extended field (which is fixed by the subgroup $H$ we’re trying to prove is normal) into itself. So for all $h \in H$ and every $x$ in the extended field, $g(x)$ is again in that field and $h(g(x)) = g(x)$, hence $g^{-1}hg(x) = x$. That is, $g^{-1}hg$ also fixes the extended field, meaning it’s in $H$.
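To see both halves of the theorem on a small case, one can take the splitting field of $x^3 - 2$ over $\mathbb{Q}$, whose Galois group is the full symmetric group $S_3$ permuting the three roots (a standard fact assumed here, not derived above). The Python sketch below, with helper names of my own choosing, lists the subgroups of $S_3$ with their orders, their indices, and whether each is normal.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(gens, identity):
    """Close a set of generators under composition (fine for tiny groups)."""
    elems = {identity}
    frontier = [identity]
    while frontier:
        x = frontier.pop()
        for g in gens:
            y = compose(g, x)
            if y not in elems:
                elems.add(y)
                frontier.append(y)
    return frozenset(elems)

def is_normal(H, G):
    # H is normal when g^{-1} h g stays in H for every g in G and h in H.
    return all(compose(inverse(g), compose(h, g)) in H for g in G for h in H)

G = list(permutations(range(3)))       # S_3 acting on the root labels 0, 1, 2
e = (0, 1, 2)

# Every subgroup of S_3 is generated by one element, except S_3 itself.
subgroups = {generated_subgroup([g], e) for g in G} | {frozenset(G)}

summary = sorted(
    (len(H), len(G) // len(H), is_normal(H, G)) for H in subgroups
)
for order, index, normal in summary:
    print(f"order {order}, index {index}, normal: {normal}")
```

The order 3 subgroup has index 2 and is normal, matching the normal quadratic subfield generated by a primitive cube root of unity. Each of the three order 2 subgroups has index 3 and is not normal, matching the three cubic subfields generated by a single root of $x^3 - 2$, none of which is normal over $\mathbb{Q}$.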

## Convergence in measure

Let $f, f_n \; (n \in \mathbb{N}) : X \to \mathbb{R}$ be measurable functions on a measure space $(X, \Sigma, \mu)$. We say $f_n$ converges to $f$ globally in measure if for every $\epsilon > 0$,

$\displaystyle\lim_{n \to \infty} \mu(\{x \in X : |f_n(x) - f(x)| \geq \epsilon\}) = 0$.

To see that this implies the existence of a subsequence converging pointwise almost everywhere, choose $n_k$ increasing such that $\mu(E_k) < 2^{-k}$, where $E_k = \{x \in X : |f_{n_k}(x) - f(x)| \geq \frac{1}{k}\}$. (We invoke the definition of limit here, with $\epsilon = \frac{1}{k}$.) A point $x$ at which $f_{n_k}(x)$ fails to converge to $f(x)$ must lie in $E_k$ for infinitely many $k$, hence in $\bigcup_{k \geq K} E_k$ for every $K$. That union has measure at most $\sum_{k \geq K} 2^{-k} = 2^{1-K}$, which tends to $0$, so the set of such points has measure zero. The summable bound $2^{-k}$ matters here: a bound of $\frac{1}{k}$ alone would not be enough, since $\sum_k \frac{1}{k}$ diverges.
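A standard example (not taken from the argument above) that separates the two notions is the "typewriter" sequence on $[0, 1)$ with Lebesgue measure: writing $n = 2^k + j$ with $0 \leq j < 2^k$, let $f_n$ be the indicator of $[j/2^k, (j+1)/2^k)$. It converges to $0$ in measure, yet at every point $f_n(x) = 1$ for infinitely many $n$, while the subsequence $n_k = 2^k$ converges to $0$ at every $x > 0$. The Python sketch below checks this at one sample point with exact rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def block(n):
    """Write n = 2**k + j with 0 <= j < 2**k and return (k, j)."""
    k = n.bit_length() - 1
    return k, n - (1 << k)

def f(n, x):
    """Indicator of [j / 2**k, (j + 1) / 2**k) evaluated at x, for n >= 1."""
    k, j = block(n)
    return 1 if Fraction(j, 2**k) <= x < Fraction(j + 1, 2**k) else 0

def measure_of_bad_set(n):
    """Lebesgue measure of {x : |f_n(x) - 0| >= eps} for any 0 < eps <= 1."""
    k, _ = block(n)
    return Fraction(1, 2**k)

x = Fraction(1, 3)
N = 2**10

# The bad sets shrink, so f_n -> 0 in measure.
measures = [measure_of_bad_set(n) for n in (1, 2, 4, 8, 16, N)]

# Yet x is hit exactly once in every dyadic block, so f_n(x) does not converge.
hits = [n for n in range(1, N) if f(n, x) == 1]

# Along n_k = 2**k the interval is [0, 2**-k), which eventually excludes x.
subsequence_values = [f(2**k, x) for k in range(10)]

print(measures[-1], len(hits), subsequence_values)
```

For $x = 1/3$ there is one hit in each of the ten dyadic blocks below $2^{10}$, while along $n_k = 2^k$ the values are $1, 1, 0, 0, \ldots$ and stay at $0$ from $k = 2$ on.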

This naturally extends to every subsequence’s having a further subsequence that converges pointwise almost everywhere, since a subsequence of a sequence converging in measure converges in measure to the same limit. The converse holds when $\mu(X) < \infty$. Suppose, for contradiction, that $f_n$ does not converge to $f$ in measure. Then there are $\epsilon > 0$ and $\delta > 0$ such that $\mu(\{x \in X : |f_n(x) - f(x)| \geq \epsilon\}) \geq \delta$ for infinitely many $n$, and these $n$ form a subsequence. By hypothesis it has a further subsequence $f_{m_j}$ converging to $f$ almost everywhere. The sets $B_J = \bigcup_{j \geq J} \{x \in X : |f_{m_j}(x) - f(x)| \geq \epsilon\}$ decrease to a set of measure zero, so by finiteness of $\mu$ their measures tend to $0$, contradicting the lower bound $\delta$. (Finiteness is needed: on $\mathbb{R}$ with Lebesgue measure, the indicators of $[n, n+1]$ converge to $0$ everywhere but not in measure.)

## An observation on conjugate subgroups

Let $H$ and $H'$ be conjugate subgroups of $G$, that is, for some $g \in G$, $g^{-1}Hg = H'$. Then $HgH' = gH'$, which means the coset $gH'$ is an element of $G/H'$ whose stabilizer under the action of $H$ on $G/H'$ by left multiplication is all of $H$. Conversely, such a fully stabilized coset gives $g^{-1}Hg \subseteq H'$, with equality when $H$ and $H'$ are finite of the same order. Suppose now that $H$ is a $p$-group and that the index of $H'$ in $G$ is not divisible by $p$. Then such a fully stabilized coset must exist by the following lemma.

If $H$ is a $p$-group that acts on a finite set $\Omega$, then $|\Omega| \equiv |\Omega_0| \pmod{p}$, where $\Omega_0$ is the subset of $\Omega$ of elements fully stabilized by $H$.

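Its proof rests on the orbit-stabilizer theorem: every orbit that is not a single fixed point has size a power of $p$ greater than $1$, hence divisible by $p$, so these orbits vanish modulo $p$ and only the fixed points remain.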

This is the natural origin of the second Sylow theorem.
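As a small sanity check of the lemma and of the fixed coset picture, the Python sketch below works in $G = S_4$ with $p = 3$. The particular subgroups are my own choice for illustration: $H'$ is generated by the 3-cycle on $\{0, 1, 2\}$ and $H$ by the 3-cycle on $\{1, 2, 3\}$. It lets $H$ act on $\Omega = G/H'$ by left multiplication, counts the fixed cosets, and uses one of them to produce a conjugating element.

```python
from itertools import permutations

def compose(p, q):
    """Return the permutation 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def cyclic_subgroup(g):
    """Subgroup generated by a single permutation g."""
    identity = tuple(range(len(g)))
    elems, x = {identity}, g
    while x != identity:
        elems.add(x)
        x = compose(g, x)
    return frozenset(elems)

p = 3
G = list(permutations(range(4)))            # S_4, of order 24 = 2**3 * 3
H_prime = cyclic_subgroup((1, 2, 0, 3))     # Sylow 3-subgroup moving 0, 1, 2
H = cyclic_subgroup((0, 2, 3, 1))           # another 3-subgroup, moving 1, 2, 3

# Omega = G/H': each left coset gH' stored as a frozenset of permutations.
cosets = {frozenset(compose(g, k) for k in H_prime) for g in G}

# H acts on Omega by left multiplication; Omega_0 is the set of fixed cosets.
def act(h, coset):
    return frozenset(compose(h, x) for x in coset)

fixed = [c for c in cosets if all(act(h, c) == c for h in H)]

# Any representative g of a fixed coset conjugates H into H'.
g = next(iter(fixed[0]))
conjugate = frozenset(compose(inverse(g), compose(h, g)) for h in H)

print(len(cosets) % p, len(fixed) % p, conjugate == H_prime)
```

Here $|\Omega| = 8$ and $|\Omega_0| = 2$, which agree modulo $3$, and any representative $g$ of a fixed coset satisfies $g^{-1}Hg = H'$.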