Cayley-Hamilton theorem and Nakayama’s lemma

The Cayley-Hamilton theorem states that every n \times n matrix A over a commutative ring satisfies its own characteristic equation. That is, with I_n the n \times n identity matrix, the characteristic polynomial of A

p(\lambda) = \det (\lambda I_n - A)

is such that p(A) = 0. In an earlier post, I mentioned that for any matrix A, A\,\mathrm{adj}(A) = (\det A) I_n, a fact that is not hard to see from the computation of determinants via minors, which is in fact much of what brings the adjugate into existence in the first place. This identity can be used to prove the Cayley-Hamilton theorem.

So we have

(\lambda I_n - A)\mathrm{adj}(\lambda I_n - A) = p(\lambda)I_n,

where p is the characteristic polynomial of A. The adjugate above is a matrix whose entries are polynomials in \lambda of degree at most n-1; collecting powers of \lambda, we can represent it in the form \displaystyle\sum_{i=0}^{n-1}\lambda^i B_i, where the B_i are matrices over the ring (in fact polynomials in A).

We have

\displaystyle {\begin{aligned}p(\lambda)I_{n} &= (\lambda I_n - A)\displaystyle\sum_{i=0}^{n-1}\lambda^i B_i \\ &= \displaystyle\sum_{i=0}^{n-1}\lambda^{i+1}B_{i}-\sum _{i=0}^{n-1}\lambda^{i}AB_{i} \\ &= \lambda^{n}B_{n-1}+\sum _{i=1}^{n-1}\lambda^{i}(B_{i-1}-AB_{i})-AB_{0}.\end{aligned}}

Writing p(\lambda) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0 and equating coefficients gives us

B_{n-1} = I_n, \qquad B_{i-1} - AB_i = c_i I_n \;\;(1 \leq i \leq n-1), \qquad -AB_0 = c_0I_n.

With this, we have

A^n + c_{n-1}A^{n-1} + \cdots + c_1A + c_0I_n = A^nB_{n-1} + \displaystyle\sum_{i=1}^{n-1} (A^iB_{i-1} - A^{i+1}B_i) - AB_0 = 0,

with the RHS telescoping to 0. This proves p(A) = 0.
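As a quick sanity check of the theorem (my addition, not part of the proof), one can verify p(A) = 0 symbolically for a concrete integer matrix, evaluating p(A) by Horner's scheme on matrices:

```python
from sympy import Matrix, symbols, eye, zeros

lam = symbols('lam')
A = Matrix([[1, 2], [3, 4]])
p = A.charpoly(lam)           # characteristic polynomial det(lam*I - A)
coeffs = p.all_coeffs()       # [1, c_{n-1}, ..., c_0], highest power first

# evaluate p(A) via Horner's scheme, substituting the matrix for lam
n = A.rows
result = zeros(n, n)
for c in coeffs:
    result = result * A + c * eye(n)

assert result == zeros(n, n)  # Cayley-Hamilton: p(A) = 0
```

Here the characteristic polynomial is \lambda^2 - 5\lambda - 2, and A^2 - 5A - 2I_2 indeed vanishes.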

There is a generalized version of this for modules over a ring, which goes as follows.

Cayley-Hamilton theorem (for modules)  Let A be a commutative ring with unity, M a finitely generated A-module, I an ideal of A, and \phi an endomorphism of M with \phi M \subset IM. Then \phi satisfies an equation of the form \phi^n + c_{n-1}\phi^{n-1} + \cdots + c_1\phi + c_0 = 0 with the c_i \in I.

Proof: It’s mostly the same. Let \{m_i\}_{i=1}^n \subset M be a generating set. Then for every i, \phi(m_i) \in IM, so \phi(m_i) = \displaystyle\sum_{j=1}^n a_{ij}m_j with the a_{ij} in I. Running the determinant argument above on the matrix (a_{ij}), with \phi in place of \lambda, the coefficients of the resulting polynomial are sums of products of the a_{ij} and hence, by the closure properties of ideals, stay in I.     ▢

From this follows easily a statement of Nakayama’s lemma, ubiquitous in commutative algebra.

Nakayama’s lemma  Let I be an ideal in R, and M a finitely-generated module over R. If IM = M, then there exists an r \in R with r \equiv 1 \pmod{I}, such that rM = 0.

Proof: With reference to the Cayley-Hamilton theorem, take \phi = I_M, the identity map on M (noting that I_M M = M = IM), and define the polynomial p as above. Setting r = 1 + c_{n-1} + c_{n-2} + \cdots + c_0, we get

rI_M = p(I_M) = (1 + c_{n-1} + c_{n-2} + \cdots + c_0)I_M = 0.

Since the c_i are coefficients residing in I, we have r \equiv 1 \pmod{I}, and rI_M = 0 says that r gives the zero map on M, that is, rM = 0.     ▢
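A toy instance of the lemma (my example, not from the post): take M = \mathbb{Z}/6\mathbb{Z} as a \mathbb{Z}-module and I = 5\mathbb{Z}. Then IM = M, and r = 6 \equiv 1 \pmod 5 annihilates M:

```python
# M = Z/6Z as a Z-module, I = 5Z: check IM = M, then exhibit r ≡ 1 (mod 5)
# with rM = 0, as Nakayama's lemma guarantees
M = set(range(6))
IM = {(5 * m) % 6 for m in M}           # already all of M, so IM = M
assert IM == M                          # hypothesis of the lemma

r = 6                                   # r ≡ 1 (mod 5)
assert r % 5 == 1
assert all((r * m) % 6 == 0 for m in M) # rM = 0
```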

Jordan normal form

The Jordan normal form theorem states that every square matrix over an algebraically closed field (such as \mathbb{C}) is similar to a Jordan normal form matrix, one of the form

J = \begin{bmatrix}J_1 & \; & \; \\ \; & \ddots & \; \\\; & \; & J_p \\ \end{bmatrix}

where each of the J_i is square of the form

J_i = \begin{bmatrix}\lambda_i & 1 & \; & \; \\ \; & \lambda_i & \ddots & \; \\ \; & \; & \ddots & 1 \\ \; & \; & \; & \lambda_i \\ \end{bmatrix}.

This is constructed via generalized eigenvectors. One can observe that each block corresponds to an invariant subspace, and that the generalized eigenvectors of a matrix organize into chains, each of which spans its own invariant subspace.

We let A be any linear transformation from V to V, where V is a finite-dimensional vector space.

It is common knowledge that Ker(A - \lambda I) is the mechanism used to solve for eigenvectors. Let us first observe that if v \in Ker(A - \lambda I), then also Av \in Ker(A - \lambda I), since A commutes with A - \lambda I, and that the same argument extends to Ker(A - \lambda I)^t for any natural number t. This gives us a way to identify larger invariant subspaces.

Let W_i = Ker(A - \lambda I)^i. W_i \subset W_{i+1} is obvious, and since V is finite-dimensional there is a smallest t at which W_t = W_{t+1}. Afterwards, the W_i must all be equal: if some w lay in W_{i+1} but not in W_i for some i > t, then iterating A - \lambda I against w would produce an intermediary vector in W_{t+1} but not in W_t, contradicting W_t = W_{t+1}.

Next, we observe that Ker(A - \lambda I)^t \cap Im(A - \lambda I)^t = \{0\}. Suppose not, with some nonzero v = (A - \lambda I)^t w in the intersection. Then (A - \lambda I)^{2t}w = (A - \lambda I)^t v = 0, so w \in W_{2t} = W_t, which forces v = (A - \lambda I)^t w = 0, a contradiction.

The rank-nullity theorem says that what remains after pulling out Ker(A - \lambda I)^t for some eigenvalue \lambda is Im(A - \lambda I)^t, so that together with the trivial intersection, V = Ker(A - \lambda I)^t \oplus Im(A - \lambda I)^t. The image is invariant under A, so we can run the same algorithm on it for another eigenvalue. So this is resolved by induction.

The result is that if A has distinct eigenvalues \lambda_1, \lambda_2, \ldots, \lambda_k, there are a_1, a_2, \ldots, a_k such that the domain of A decomposes as

\mathrm{Ker}(A - \lambda_1 I)^{a_1} \oplus \mathrm{Ker}(A - \lambda_2 I)^{a_2} \oplus \cdots \oplus \mathrm{Ker}(A - \lambda_k I)^{a_k}.

Now does Ker(A - \lambda I)^t necessarily correspond to an irreducible invariant subspace? No: it may split further into several Jordan blocks, which is the difference between algebraic and geometric multiplicity.

Now we will show, as the second part of the proof, to be invoked on the components of the direct sum decomposition from the preceding first part, that if T is nilpotent, meaning that T^n = 0 for some n, then there are v_1, \ldots, v_k and a_1, \ldots, a_k such that \{v_1, Tv_1, \ldots, T^{a_1-1}v_1, v_2, \ldots, v_k, Tv_k, \ldots, T^{a_k-1}v_k\} is a basis (linearly independent by definition) for the domain of T, with \sum_i a_i = \dim V and \max(a_1, \ldots, a_k) = n. (Note that here n is taken smallest with T^n = 0.)

Since T is nilpotent, its only eigenvalue is 0, and its eigenspace is \mathrm{Ker}\,T. Take u_1, \ldots, u_m to be a basis of this eigenspace. Now take preimages under T successively: at each iteration, for each current vector that has a nonempty preimage, pick a preimage vector with its kernel component projected out. Under the nilpotence assumption this process terminates, at the (n-1)th iteration, and the accumulated vectors are exhaustive with respect to the vector space; they form chains of the format specified above. It remains to show that they are linearly independent. With the eigenspace basis as the base case, take as inductive hypothesis that the vectors accumulated before the kth iteration are linearly independent; we show the set remains so after adding the ones obtained in the kth round of preimages. First, the added vectors are linearly independent among themselves: if a nontrivial linear combination of them were zero, applying T to it would violate the inductive hypothesis. Second, no nontrivial linear combination of the newly added vectors can equal a linear combination of the rest: assume otherwise, and apply T just enough times (k times) for the new side to vanish; the other side then becomes a nontrivial linear combination with respect to our designated basis of the eigenspace, which cannot vanish. This concludes our construction.

Essentially we have chains of vectors, each terminating when no further element is found by the preimage operation. Applying this with T = A - \lambda I, we see that each element u of a chain satisfies Au = \lambda u + v, where v is the previous element of the chain (v = 0 signifying that u is at the front, a genuine, non-generalized eigenvector). The ones along the superdiagonal of a Jordan block correspond to the coefficient 1 of v above.
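The chain structure can be seen concretely with sympy's jordan_form (a small check I added, assuming sympy's convention A = PJP^{-1}). Here A has the single eigenvalue 2 with a one-dimensional eigenspace, so a chain of length 2 appears as one 2 \times 2 Jordan block:

```python
from sympy import Matrix, eye, zeros

A = Matrix([[1, 1],
            [-1, 3]])               # char poly (lam - 2)^2, eigenspace dim 1
P, J = A.jordan_form()              # sympy convention: A = P * J * P^{-1}
assert J == Matrix([[2, 1], [0, 2]])
assert A == P * J * P.inv()

# the nilpotent part kills the space in 2 steps: W_1 is proper in W_2 = V
N = A - 2 * eye(2)
assert N != zeros(2, 2) and N**2 == zeros(2, 2)
```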



Hilbert basis theorem

I remember learning this theorem in early 2015, but I could not remember its proof at all. Today, I relearned it. It employs a beautiful induction argument to transfer Noetherianness (in the form of finite generation) from R to R[x] via the leading coefficient.

Hilbert Basis Theorem  If R is a Noetherian ring, then so is R[x].

Proof: Take an ideal J in R[x]. Notice that for each degree n, the leading coefficients of the degree-n elements of J (together with 0) form an ideal I_n of R, and multiplication by x shows that the I_n form an ascending chain, which, R being Noetherian, has to become constant eventually, say at degree k. For m \leq n \leq k, where m is the smallest degree of a nonzero element of J, take a finite set A_n \subset J of degree-n polynomials whose leading coefficients generate I_n. With A = \displaystyle\cup_{n=m}^k A_n, we can for any polynomial p \in J construct a finite combination within A that matches p leading-coefficient-wise, so that subtraction reduces p to lower degree. This naturally lends itself to induction on degree, with m as the base case: any element of J of degree lower than m is the zero polynomial. Now assume, as the inductive hypothesis, that A generates all elements of J of degree at most n, and take p \in J of degree n+1. If n+1 \leq k, we can cancel out the leading coefficient of p using A_{n+1}, and then apply the inductive hypothesis to the difference. If n+1 > k, then since I_{n+1} = I_k, we can generate from A_k a degree-k polynomial with the same leading coefficient (and thereby a degree-(n+1) one, multiplying by x^{n+1-k}), and apply the inductive hypothesis to the difference.     ▢

Galois theory

I’ve been quite exhausted lately with work and other annoying life things. So sadly, I haven’t been able to think about math much, let alone write about it. However, this morning on the public transit I was able to indulge a bit by reviewing in my head some essentials behind Galois theory, in particular how its fundamental theorem is proved.

The first part of it states that there is an inclusion-reversing correspondence between the fixed fields and the subgroups of the full Galois group, and moreover that the degree of the field extension is equal to the index of the corresponding subgroup. This can be readily proved using the primitive element theorem, which I will state and prove.

Primitive element theorem: Let F be an infinite field, and let \alpha, \beta be algebraic over F with \beta separable. Then F(\alpha)(\beta), the field obtained by adjoining the elements \alpha, \beta to F, can be represented as F(\gamma) for some single element \gamma. This extends inductively to show that any finite separable extension can be represented by adjoining some primitive element.

Proof: Let \gamma = \alpha + c\beta for some c \in F. We will show that c can be chosen so that \beta is contained in F(\gamma). Let f, g be the minimal polynomials of \alpha and \beta respectively, and let h(x) = f(\gamma - cx), a polynomial over F(\gamma) with h(\beta) = f(\alpha) = 0. The minimal polynomial of \beta over F(\gamma) must divide both h and g. Suppose it has degree at least 2. Then h and g share some root \beta' \neq \beta, which induces \alpha' = \gamma - c\beta', another root of f. From \gamma = \alpha + c\beta = \alpha' + c\beta' we get c = (\alpha - \alpha')/(\beta' - \beta), so each pair of roots (\alpha', \beta') excludes only one value of c; hence there is only a finite number of c such that \beta is not in F(\gamma), and F being infinite, a good c exists. For such c, \beta \in F(\gamma), and then \alpha = \gamma - c\beta \in F(\gamma) as well. QED.
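A concrete instance of the theorem, checked with sympy (my illustration, taking c = 1): \gamma = \sqrt{2} + \sqrt{3} generates \mathbb{Q}(\sqrt{2}, \sqrt{3}), since its minimal polynomial has degree 4.

```python
from sympy import sqrt, minimal_polynomial, symbols, simplify

x = symbols('x')
gamma = sqrt(2) + sqrt(3)              # gamma = alpha + c*beta with c = 1
p = minimal_polynomial(gamma, x)
assert p == x**4 - 10*x**2 + 1         # degree 4 = [Q(sqrt2, sqrt3) : Q]

# beta (and hence alpha) indeed lies in Q(gamma): sqrt(2) = (gamma^3 - 9*gamma)/2
assert simplify((gamma**3 - 9*gamma) / 2 - sqrt(2)) == 0
```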

The degree of a field extension corresponds to the degree of the minimal polynomial of its primitive element. An automorphism can map that primitive element to any one of the roots of the minimal polynomial, thereby determining the same number of cosets.

The second major part of the fundamental theorem states that normality subgroup-wise is equivalent to normality of the corresponding extension field-wise. To see this, remember that if a field extension is normal, a map that preserves multiplication and addition cannot take an element of the extended field outside it, as that would give its minimal polynomial a root outside the extended field, thereby violating normality. Any g in the full Galois group thus, in a normal extension, escapes not the extended field (which is precisely what is fixed by the subgroup H we’re trying to prove is normal). Thus for all h \in H, g^{-1}hg also fixes the extended field, meaning it’s in H.


An observation on conjugate subgroups

Let H and H' be conjugate subgroups of G, that is, g^{-1}Hg = H' for some g \in G. Equivalently, HgH' = gH', which means that under the action of H on G/H', the coset gH' has stabilizer subgroup all of H, the full group of the group action. Now suppose H is a p-group whose index in G is not divisible by p. Then such a fully stabilized coset must exist, by the following lemma.

If H is a p-group that acts on \Omega, then |\Omega| \equiv |\Omega_0| \pmod{p}, where \Omega_0 is the subset of \Omega of elements fully stabilized by H.

Its proof rests on the orbit-stabilizer theorem: every orbit has size a power of p, so the orbits of size greater than one contribute multiples of p to |\Omega| and vanish mod p.
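A minimal illustration of the lemma (my example, not from the text): let the cyclic group of order p act on length-p strings by rotation. The fully fixed strings are exactly the constant ones, so the lemma yields a^p \equiv a \pmod p, Fermat's little theorem.

```python
from itertools import product

p, a = 5, 3                                # p prime, alphabet of size a
omega = list(product(range(a), repeat=p))  # |Omega| = a^p

def rotate(s):
    return s[1:] + s[:1]                   # generator of the Z/p rotation action

# a string is fixed by the whole cyclic group iff fixed by the generator
omega_0 = [s for s in omega if rotate(s) == s]

assert len(omega_0) == a                   # only the constant strings
assert len(omega) % p == len(omega_0) % p  # |Omega| ≡ |Omega_0| (mod p)
```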

This is the natural origin of the second Sylow theorem.

Math Sunday

I had a chill day thinking about math today without any pressure whatsoever. First I figured out, calculating inductively, that the order of GL_n(\mathbb{F}_p) is (p^n - 1)(p^n - p)(p^n - p^2)\cdots (p^n - p^{n-1}). One calculates the number of linearly independent k-tuples of column vectors: having chosen k independent columns, exactly p^k vectors (their span) cannot be appended if linear independence is to be preserved, leaving p^n - p^k choices for the next column. A Sylow p-subgroup of this group is the group of upper triangular matrices with ones on the diagonal, which has the order p^{n(n-1)/2} that we want.
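The counting argument is easy to brute-force check for n = 2 (a verification I added): a 2 \times 2 matrix over \mathbb{F}_p is invertible iff ad - bc \not\equiv 0 \pmod p.

```python
from itertools import product

def gl_order(n, p):
    # (p^n - 1)(p^n - p) ... (p^n - p^(n-1))
    result = 1
    for k in range(n):
        result *= p**n - p**k
    return result

p = 3
count = sum(1 for a, b, c, d in product(range(p), repeat=4)
            if (a*d - b*c) % p != 0)
assert count == gl_order(2, p)   # (p^2 - 1)(p^2 - p) = 48 for p = 3

# the unipotent upper triangular matrices carry the full power of p:
# here p^(n(n-1)/2) = 3, and indeed 3 | 48 but 9 does not
assert gl_order(2, p) % 3 == 0 and gl_order(2, p) % 9 != 0
```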

I also find the proof of the first Sylow theorem, and the inspiration behind it, much easier to understand now. I had always remembered that the Sylow p-subgroup we are looking for can be realized as the stabilizer subgroup of some p^k-element subset of the group, where p^k divides the order of the group. By the pigeonhole principle, such a stabilizer can have no more than p^k elements. The part of the proof that kept boggling my mind was the reverse inequality via orbits. It turns out that it can be viewed in a way that makes its logic feel much more natural than it did before, when, like many a proof not understood, it seemed to spring out of the blue.

Letting p^r be the largest power of p dividing n = |G|, we wish to show that the order of some orbit is divisible by p no more than r - k times. For that it suffices to show that the sum of the orders of the orbits, \binom{n}{p^k}, is divisible by p no more than that many times. Showing this is very mechanical. Write n = p^k m, so that

\binom{n}{p^k} = m\displaystyle\prod_{j = 1}^{p^k-1} \frac{p^k m - j}{p^k - j},

and divide each factor of the product, in both numerator and denominator, by p raised to the number of times p divides j (both p^k m - j and p^k - j are divisible by p exactly as often as j is, since j < p^k). With this, neither the numerator nor the denominator of the product is a multiple of p, which means the number of times p divides the sum of the orders of the orbits is the number of times it divides m, which is r - k.
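The divisibility claim is easy to spot-check (my verification): with n = p^k m, the multiplicity of p in \binom{n}{p^k} equals the multiplicity of p in m.

```python
from math import comb

def vp(x, p):
    # multiplicity of the prime p in the positive integer x
    e = 0
    while x % p == 0:
        x //= p
        e += 1
    return e

p = 3
for k in range(1, 4):
    for m in range(1, 30):
        n = p**k * m
        assert vp(comb(n, p**k), p) == vp(m, p)
```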

Following this, Brian Bi told me about a problem he was stuck on, starred in Artin, which means it was considered by the author to be difficult. To my great surprise, I managed to solve it in under half an hour. The problem is:

Let H be a proper subgroup of a finite group G. Prove that the conjugate subgroups of H don’t cover G.

For this, I remembered the relation |G| = |N(H)|\,|\mathrm{Cl}(H)|, where |\mathrm{Cl}(H)| denotes the number of conjugates of H, a special case of the orbit-stabilizer theorem, conjugation being a group action after all. Since |N(H)| \geq |H|, there are at most |G|/|H| conjugate subgroups, and since they all share the identity, their union has at most \frac{|G|}{|H|}(|H| - 1) + 1 < |G| elements.
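The smallest interesting case (an illustration I added): in G = S_3, the conjugates of the order-2 subgroup H cover only 4 of the 6 elements.

```python
from itertools import permutations

def compose(f, g):                 # (f ∘ g)(i) = f[g[i]]
    return tuple(f[g[i]] for i in range(len(g)))

def inverse(f):
    inv = [0] * len(f)
    for i, v in enumerate(f):
        inv[v] = i
    return tuple(inv)

G = list(permutations(range(3)))   # S_3 as tuples
H = [(0, 1, 2), (1, 0, 2)]         # identity and the transposition (0 1)

union = {compose(g, compose(h, inverse(g))) for g in G for h in H}
assert len(union) == 4             # identity plus the three transpositions
assert len(union) < len(G)         # the conjugates of H do not cover G
```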

I remember Jonah Sinick once saying that finite group theory is one of the most g-loaded parts of math. I’m not sure what his rationale is for that exactly. I’ll say that I have a taste for finite group theory though I can’t say I’m a freak at it, unlike Aschbacher, but I guess I’m not bad at it either. Sure, it requires some form of pattern recognition and abstraction visualization that is not so loaded on the prior knowledge front. Brian Bi keeps telling me about how hard finite group theory is, relative to the continuous version of group theory, the Lie groups, which I know next to nothing about at present.

Oleg Olegovich, who told me today that he had proved “some generalization of something to semi-simple groups,” but needs a bit more to earn the label of Permanent Head Damage, suggested upon my asking him what he considers as good mathematics that I look into Arnold’s classic on classical mechanics, which was first to come to mind on his response of “stuff that is geometric and springs out of classical mechanics.” I found a PDF of it online and browsed through it but did not feel it was that tasteful, perhaps because I’ve been a bit immersed lately in the number theoretic and abstract algebraic side of math that intersects not with physics, though I had before an inclination towards more physicsy math. I thought of possibly learning PDEs and some physics as a byproduct of it, but I’m also worried about lack of focus. Maybe eventually I can do that casually without having to try too hard as I have done lately for number theory. At any rate, I have not the right combination of brainpower and interest sufficient for that in my current state of mind.

Speaking of partial differential equations, I think of the many outstanding scholars of Zhejiang ancestry in this field, the most archetypal being Gu Chaohao. I recall that the son of a professor of noncommutative algebraic geometry at the University of Washington, also of Zhejiang ancestry, once mentioned that when they returned to China, Gu Chaohao of Fudan, like his father a Fudan man, had passed away, and half-jokingly added: “They say Gu Chaohao was elected academician because he had once been in the underground Party.” I remember seeing that C. N. Yang held Gu Chaohao in extremely high regard, largely because, prompted by Yang’s visit to Fudan in the seventies, Gu solved a series of mathematical problems related to Yang-Mills theory. Besides him, there are Fang-Hua Lin and Gui-Qiang Chen, both very famous professors of this kind of mathematics, and both from Zhejiang. We all know the Zhejiangnese are the Jews of China; just yesterday Brian Bi was saying “there are four times more Zhejiangnese than Jews.” Sadly I am not from Zhejiang, so my hopes of becoming a mathematician are probably slim. :(



Last night, on Facebook, I discussed with that Jewish-American IMO gold medalist the question of how Jews and Chinese compare at the highest levels of intellect, and recalled two Americans I know, raised in the Christian tradition and trained in the theoretical sciences, who held the view, which I then found hard to believe, that East Asians are intellectually stronger than Jews. How to put it: although over the past fifty years the Japanese and Chinese have made many great contributions to theoretical science and hold many faculty posts at good American research universities, and although the Chinese of my generation have performed brilliantly in competitions, I still feel that at the very summit of science there are more Jews, and that Jews have a certain far-sightedness that enables disruptive leaps, with the Soviet Jewish mathematical masters as the typical example. At the same time, this person, a top student whose field of study and research is combinatorics, reminded me of the Jewish dominance in theoretical computer science and Hungarian-style combinatorics. He said the smartest person in the world is an Asian named Terry Tao, but that the top hundred Jews collectively are stronger than the top hundred Asians collectively. To that I asked: do you understand any of Tao’s work, such that you can be sure he is the smartest person in the world? He replied: I have read the proof of the Green-Tao theorem. I had little to say, only that that is still fairly frontier material, and told him I was working through an introduction to number theory written by Hua Luogeng, introductory yet containing some number theory I now consider quite deep, such as some of Selberg’s work. Tao is a god, but a friend of mine also said: I sometimes wonder whether Tao has ever felt uneasy about not being as smart as von Neumann; and math is so hard that even Tao nearly failed his PhD qualifying exam. Moreover, von Neumann commanded several foreign languages, could interpret impromptu without hesitation, and had a photographic memory; and I have seen some Chinese online express distaste that Tao, who regards himself as “primarily an Australian,” identifies not at all with Chinese culture and knows no Chinese. The Jewish math PhD I mentioned earlier on this blog, who has studied material as deep as the work of Goro Shimura, also finds Tao a bit overrated, feeling his work is not as profound or original as, say, Shiing-Shen Chern’s, and noting that Tao has yet to create a new field. On the Jewish-Asian comparison, I also think of environmental factors, where the Chinese are at a disadvantage: for economic reasons, and because of unfamiliar names and culture, the older generation of Chinese were still struggling for their own and their country’s survival, without the resources to invest in scientific research. Perhaps even now discrimination against the Chinese, even in the theoretical sciences, remains quite serious; although theoretical science has few of the collective, promotional, and political factors of, say, biology or software development, people are all biased, and that includes prize committees, as with what I have heard of the Nobel committee’s undervaluing of Soviet scientists’ work. In my generation, the Chinese have far surpassed the Jews in those competitions that are completely fair and free of subjective factors, which are the best tests of the top end of pure intellect. I sometimes think: what the Chinese lack most now is not scientific and technical talent, but talent for resisting discrimination and fighting for a voice. In foreigners’ eyes the Chinese often carry a stereotype of passivity; there is some truth to it, but much of it is also manufactured by less-than-objective media. On top of that, the Chinese are a minority in America, with language and cultural barriers besides, yet another Asian penalty taken for granted.

Mathematically, I got a whiff of the fact that over the field \mathbb{F}_p, the product of all irreducible monic polynomials of degree dividing n equals the remarkably clean x^{p^n} - x. This polynomial is easily seen to be squarefree, by the standard argument that it shares no common factor with its derivative (which is -1). Meanwhile, take any irreducible monic polynomial \phi of degree d with d \mid n; then \mathbb{F}_p[x]/(\phi) is a field with p^d elements, so all of its elements are roots of x^{p^d} - x (x itself being an element of this field). From this, substituting any polynomial (including x) into x^{p^d} - x gives zero mod \phi, that is, a multiple of \phi. Since d \mid n implies x^{p^d} - x \mid x^{p^n} - x, we get \phi \mid x^{p^n} - x. It is not hard to show that \mathrm{gcd}(x^{p^n} - x, x^{p^d} - x) = x^{p^{\mathrm{gcd}(n, d)}} - x. If d \nmid n, an irreducible polynomial of degree d dividing x^{p^n} - x must also divide x^{p^{\mathrm{gcd}(n, d)}} - x, which can be shown by induction to be impossible when \mathrm{gcd}(n, d) < d. Hence x^{p^n} - x has no irreducible factor of degree not dividing n. QED.
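The identity discussed above, that the product over \mathbb{F}_p of the monic irreducibles of degree dividing n is x^{p^n} - x, can be checked with sympy for small p, n (my verification):

```python
from sympy import Poly, symbols

x = symbols('x')
p, n = 2, 4
f = Poly(x**(p**n) - x, x, modulus=p)

factors = f.factor_list()[1]              # list of (irreducible Poly, exponent)
degrees = [g.degree() for g, e in factors]

assert all(e == 1 for _, e in factors)    # squarefree
assert all(n % d == 0 for d in degrees)   # every factor degree divides n
assert sum(degrees) == p**n               # degrees sum to deg(x^{p^n} - x)
assert sorted(set(degrees)) == [1, 2, 4]  # every divisor of n appears
```

For p = 2, n = 4 this recovers x^{16} - x = x(x+1)(x^2+x+1) times the three irreducible quartics over \mathbb{F}_2.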


Composition series

My friend, after some time in industry, is back in school, currently taking graduate algebra. I was looking at one of his homework assignments today, and in particular I thought about and worked out one of the problems, which is to prove the uniqueness part of the Jordan-Hölder theorem. Formally, if G is a finite group and

1 = N_0 \trianglelefteq N_1 \trianglelefteq \cdots \trianglelefteq N_r = G and 1 = N_0' \trianglelefteq N_1' \trianglelefteq \cdots \trianglelefteq N_s' = G

are composition series of G, then r = s and there exists \sigma \in S_r with isomorphisms N_{i+1} / N_i \cong N_{\sigma(i)+1}' / N_{\sigma(i)}'.

Suppose WLOG that s \geq r, and take as a base case s = 2. Then clearly r = s, and if N_1 \neq N_1', then N_1 \cap N_1' = 1, being a proper normal subgroup of the simple group N_1. Moreover N_1 N_1' = G must hold, as it is normal in G and properly contains the maximal normal subgroup N_1. Now, remember there is a theorem which states that if H, K are normal subgroups of G = HK with H \cap K = 1, then G \cong H \times K (this follows from (hkh^{-1})k^{-1} = h(kh^{-1}k^{-1}), which shows the commutator to be the identity, lying as it does in both H and K). Thus G/N_1 \cong N_1' and G/N_1' \cong N_1, so the two series have the same factors, swapped.

For the inductive step, take H = N_{r-1} \cap N_{s-1}'. If N_{r-1} \neq N_{s-1}', then N_{r-1}N_{s-1}' = G, being normal in G and properly containing the maximal normal subgroup N_{r-1}, and by the second isomorphism theorem, N_{r-1} / H \cong G / N_{s-1}' and N_{s-1}' / H \cong G / N_{r-1}. Take any composition series for H and extend it through N_{r-1} to another composition series for G. This shows, on application of the inductive hypothesis to N_{r-1}, that r = s. One can do the same for N_{s-1}'. With both our composition series linked to two intermediary ones that differ only between G and the common H, with the two factors in between swapped, our induction proof completes.

Automorphisms of quaternion group

I learned this morning from Brian Bi that the automorphism group of the quaternion group is in fact S_4. Why? The quaternion group is generated by any two of i,j,k, all of which have order 4, and \pm i, \pm j, \pm k correspond to the six faces of a cube. Remember that the orientation-preserving symmetries of the cube form S_4, the objects permuted being the space diagonals. Now what do the space diagonals correspond to? The triplet bases (i,j,k), (-i,j,-k), (j,i,-k), (-j,i,k), which correspond to four different corners of the cube, no two of which are joined by a space diagonal. We send our generators i,j to two of \pm i, \pm j, \pm k; there are 6\cdot 4 = 24 choices. By the same logic there are 24 triplets (x,y,z) of quaternions such that xy = z. We define an equivalence relation with (x,y,z) \sim (-x,-y,z) and (x,y,z) \sim (y,z,x) \sim (z,x,y), which is such that if two triplets are in the same equivalence class, then their images under any automorphism lie in a common class as well, and no two classes are mapped to the same class. Combined, this shows that every automorphism induces a bijection on the equivalence classes.
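That |\mathrm{Aut}(Q_8)| = 24 can be confirmed by brute force (a check I wrote, representing elements as sign-unit pairs):

```python
from itertools import permutations

# unit quaternion products: UNIT[(u, v)] = (sign, w) meaning u*v = sign*w
UNIT = {('1', '1'): (1, '1'), ('1', 'i'): (1, 'i'), ('1', 'j'): (1, 'j'),
        ('1', 'k'): (1, 'k'), ('i', '1'): (1, 'i'), ('j', '1'): (1, 'j'),
        ('k', '1'): (1, 'k'), ('i', 'i'): (-1, '1'), ('j', 'j'): (-1, '1'),
        ('k', 'k'): (-1, '1'), ('i', 'j'): (1, 'k'), ('j', 'k'): (1, 'i'),
        ('k', 'i'): (1, 'j'), ('j', 'i'): (-1, 'k'), ('k', 'j'): (-1, 'i'),
        ('i', 'k'): (-1, 'j')}

def mul(x, y):
    s, u = UNIT[(x[1], y[1])]
    return (x[0] * y[0] * s, u)

Q8 = [(s, u) for s in (1, -1) for u in '1ijk']

# count the bijections Q8 -> Q8 that preserve multiplication
count = 0
for perm in permutations(Q8):
    f = dict(zip(Q8, perm))
    if all(f[mul(x, y)] == mul(f[x], f[y]) for x in Q8 for y in Q8):
        count += 1

assert count == 24   # Aut(Q8) ≅ S_4, which has order 24
```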


At the end of last week, I made another trip to the Bay Area, for an interview and also for fun. I benefited greatly from it: I got a pretty good job, and was also inspired anew to learn mathematics. A Soviet-Jewish sort-of friend of mine, not much of a scholar and somewhat crazy, nonetheless suggested I chat with an American IMO gold medalist currently studying math at MIT. I had already been introduced to this person, through a “well-known prodigy middleman,” but we had exchanged nothing of substance, so I figured this stellar figure was too busy to talk with a mediocrity like me. To my surprise, the introducer said he communicated with him frequently, and added me to their two-person Facebook group, later also adding a Jewish woman, a biology PhD student who does not believe in IQ and is anti-Steve Hsu. The night before the interview, I stayed at a hotel in San Francisco near the Contemporary Jewish Museum; curious, I went to have a look: the upper floor was closed and there was not much to see downstairs. Much of what we discussed in the group related to race, culture, IQ, and academic fields. I remember once joking to the math whiz that American IMO team members, if not Asian, must be Jewish, and he, seemingly missing my joking tone, replied that his year had one or two non-Jewish whites. He also emphasized that he is very much American, not the sort who has spent time in Israel. In high school he chose Chinese as his foreign language, with 童话 (Fairy Tale) a Chinese song he loved to listen to. Academically, he gives the impression of a focused, single-minded pure-math undergraduate, quite certain he will take the academic path. His specific mathematical interests and leanings are combinatorics, theoretical computer science, and analytic number theory. That evening I asked him which proofs of the law of quadratic reciprocity he knew; he answered with one using Zolotarev’s lemma, and sent a link. At the time I only knew the Gauss sum proof, and even its details I no longer remembered clearly. I spent an hour reading that proof closely; once I grasped it, it felt breathtakingly beautiful, the tools invoked being extremely simple. It had never occurred to me that this theorem of the queen of number theory could be viewed as, expressed as, a product of parities of permutations; after all, the Legendre symbol gives 1 or -1, just like sgn. How marvelous! In retrospect it is transparent that the Legendre symbol essentially assigns a parity to the elements of the cyclic abelian group (\mathbb{Z}/{p\mathbb{Z}})^{\times}: the quadratic residues and the even permutations are each index-2 subgroups of their respective parent groups. This reminded me of how Gauss’s proof of the constructibility of the regular 17-gon also invokes index-2 subgroups, which by Galois theory correspond to field extensions of degree 2. A couple of days later, I learned Gauss’s lemma, namely \left(\frac{a}{p}\right) = (-1)^n, where n is the number of elements of \{a, 2a, \ldots, \frac{p-1}{2}a\} (taken mod p) exceeding p/2. The idea of the proof is direct: negate those elements of \{a, 2a, \ldots, \frac{p-1}{2}a\} exceeding p/2, and together with the remaining elements they reassemble into \{1, 2, \ldots, \frac{p-1}{2}\}. Eisenstein’s proof of quadratic reciprocity, whose existence I had known of but never understood, uses a lemma similar to Gauss’s; this time, once I caught the idea and the proof strategy, it became clear and transparent. I previously could not appreciate the beauty of quadratic reciprocity; I remember finding the theorem bizarre and baffling, having failed to grasp its beautifully symmetric structure. I recall that Eisenstein’s proof draws a p by q lattice, exhibiting the exponent of -1 as the number of lattice points in the lower-left quadrant of the grid, the origin of (p-1)(q-1)/4.
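Gauss's lemma, stated above, is easy to test numerically against Euler's criterion (my check, not from the original discussion):

```python
def legendre(a, p):
    # Euler's criterion: a^((p-1)/2) mod p, mapped into {0, 1, -1}
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r

def gauss_lemma(a, p):
    # n = number of elements of {a, 2a, ..., ((p-1)/2)a} mod p exceeding p/2
    n = sum(1 for k in range(1, (p - 1) // 2 + 1) if (k * a) % p > p // 2)
    return (-1) ** n

for p in (3, 5, 7, 11, 13, 17):
    for a in range(1, p):
        assert legendre(a, p) == gauss_lemma(a, p)
```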

Number theory is a relatively lively branch of pure mathematics, the opposite of measure theory, which I now regard as formalistic, tedious, and dry. I recall that when I could not solve the exercises while learning measure theory, I began to seriously doubt my mathematical ability and lost some interest in math. Number theory, from a certain angle, has had the opposite effect. Yesterday, I went again through the proof of Bertrand’s postulate due to Erdős, and again gained a new understanding of it, or at least it feels that way. This time it is very clear to me that it works by tightly bounding \binom{2n}{n}, whose prime factorization is easily related to prime exponents, so that the assumption of no primes between n and 2n leads to an inequality that fails. The key observation in the proof is that primes between 2n/3 and n do not divide \binom{2n}{n}, which greatly reduces the upper bound.
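Both Bertrand's postulate for small n and the key observation about primes in (2n/3, n] can be checked directly (my verification, not part of the Erdős proof):

```python
from math import comb

def is_prime(k):
    # trial division, sufficient for small k
    if k < 2:
        return False
    d = 2
    while d * d <= k:
        if k % d == 0:
            return False
        d += 1
    return True

# Bertrand's postulate: for n > 1 there is a prime p with n < p < 2n
for n in range(2, 1000):
    assert any(is_prime(p) for p in range(n + 1, 2 * n))

# key observation: a prime p with 2n/3 < p <= n does not divide C(2n, n),
# since then floor(2n/p) = 2 and floor(n/p) = 1, so p appears 2 - 2 = 0 times
for n in range(3, 200):
    for p in range(2 * n // 3 + 1, n + 1):
        if is_prime(p):
            assert comb(2 * n, n) % p != 0
```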