An unpacking of Hurwitz’s theorem in complex analysis

Let’s first state it.

Theorem (Hurwitz’s theorem). Suppose \{f_k(z)\} is a sequence of analytic functions on a domain D that converges normally on D to f(z), and suppose that f(z) has a zero of order N at z_0. Then for every sufficiently small \rho > 0 and all sufficiently large k, f_k(z) has exactly N zeros in the disk \{|z - z_0| < \rho\}, counting multiplicity, and these zeros converge to z_0 as k \to \infty.

As a refresher, normal convergence on D is uniform convergence on every closed disk contained in D. We know that the argument principle comes in handy for counting zeros within a domain. That means

For \rho sufficiently small, the number of zeros of f_k in |z - z_0| < \rho goes to the number of zeros of f inside the same circle, provided that

\frac{1}{2\pi i}\int_{|z - z_0| = \rho} \frac{f'_k(z)}{f_k(z)}dz \longrightarrow \frac{1}{2\pi i}\int_{|z - z_0| = \rho} \frac{f'(z)}{f(z)}dz.

Showing that boils down to a few technicalities. First of all, let \rho > 0 be sufficiently small that the closed disk \{|z - z_0| \leq \rho\} is contained in D, with f(z) \neq 0 everywhere in it except at z_0. Then |f| has a positive minimum on the circle |z - z_0| = \rho, and since f_k(z) converges to f(z) uniformly on the closed disk, f_k(z) is not zero on the circle, the contour integrated over, for sufficiently large k. Further, normal convergence of f_k to f gives normal convergence of f'_k to f' (by the Cauchy estimates), so f'_k / f_k \to f' / f uniformly on the circle, which is exactly the condition under which the limit can be passed through the integral. Both integrals are integers, so for large k the left one equals the right one, which is N. Since this holds for every sufficiently small \rho, the zeros of f_k(z) must accumulate at z_0.
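The zero count above can be checked numerically. What follows is only a sketch under my own assumptions: the sequence f_k(z) = z^2 + 1/k (converging normally to z^2, which has a zero of order N = 2 at z_0 = 0), the radius 0.5, and the helper name count_zeros are all illustrative choices, not anything from Gamelin. The contour integral is approximated by averaging (z - z_0) f'(z) / f(z) over equally spaced points on the circle.

```python
import cmath

def count_zeros(f, df, z0, rho, samples=2000):
    """Approximate (1 / 2 pi i) * integral of f'/f over |z - z0| = rho.

    With z = z0 + rho * e^{i theta}, the integral becomes the average of
    (z - z0) * f'(z) / f(z) over theta, estimated here on a uniform grid.
    """
    total = 0j
    for j in range(samples):
        w = rho * cmath.exp(2j * cmath.pi * j / samples)
        z = z0 + w
        total += w * df(z) / f(z)
    return round((total / samples).real)

def make_fk(k):
    """Hypothetical example sequence: f_k(z) = z^2 + 1/k and its derivative."""
    return (lambda z: z * z + 1 / k), (lambda z: 2 * z)

for k in (1, 100, 10000):
    f_k, df_k = make_fk(k)
    print(k, count_zeros(f_k, df_k, 0, 0.5))
```

For k = 1 the zeros \pm i sit outside the disk, so the count is 0; once k is large enough that \pm i / \sqrt{k} fall inside the circle, the count settles at 2, matching the order of the zero of the limit.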

Principal values (of integrals)

I’ve been looking through my Gamelin’s Complex Analysis quite a bit lately. I’ve solved some exercises, which I’ve written up in private. I was just going over the section on principal values, which had a very neat calculation. I’ll give a sketch of that one here.

Take an integral \int_a^b f(x)dx such that at some x_0 \in (a,b) the integrand has a singularity, such as \int_{-1}^1 \frac{1}{x}dx. The principal value of that is defined as

PV \int_a^b f(x)dx = \lim_{\epsilon \to 0}\left(\int_a^{x_0 - \epsilon} + \int_{x_0 + \epsilon}^b\right)f(x)dx.

The example the book presented was

PV\int_{-\infty}^{\infty} \frac{1}{x^3 - 1}dx = -\frac{\pi}{\sqrt{3}}.

Its calculation invokes both the residue theorem and the fractional residue theorem. Our integrand, complexly viewed, has a singularity in the upper half plane at e^{2\pi i / 3}, with residue \frac{1}{3z^2}|_{z = e^{2\pi i / 3}} = \frac{e^{2\pi i / 3}}{3}, which one can arrive at with the so-called Rule 4 in the book, or, more from first principles, l’Hopital’s rule. That is the only residue to calculate if we integrate around an arbitrarily large half-disk in the upper half plane. However, with our pole at 1 on the real axis, we must indent the contour there. The integral along the large arc vanishes as the radius grows, since the integrand decays like 1/|z|^3. The integral along the infinitesimal arc spawned by the indentation can be calculated by the fractional residue theorem, with angle -\pi, the minus accounting for the clockwise direction. This time the residue is at 1, with \frac{1}{3z^2}|_{z = 1} = \frac{1}{3}. So that integral, in the limit \epsilon \to 0, is -\frac{\pi}{3}i. Taking 2\pi i times the first residue, which is -\frac{\pi}{\sqrt{3}} - \frac{\pi}{3}i, and subtracting the indentation integral, which is extra with respect to the principal value we wish to calculate, yields -\frac{\pi}{\sqrt{3}} for the desired answer.
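The bookkeeping in that calculation is easy to check with complex arithmetic. This is just a sketch of the arithmetic, not a numerical integration; the variable names are mine.

```python
import cmath
import math

def residue(z):
    """Residue of 1 / (z^3 - 1) at a simple pole z, i.e. 1 / (3 z^2)."""
    return 1 / (3 * z * z)

upper_pole = cmath.exp(2j * math.pi / 3)

# Residue theorem: integral around the whole indented half-disk contour.
closed_contour = 2j * math.pi * residue(upper_pole)

# Fractional residue theorem: clockwise half-circle (angle -pi) around z = 1.
indentation = -1j * math.pi * residue(1)

# The large arc contributes nothing in the limit, so what is left over
# along the real axis is the principal value.
principal_value = closed_contour - indentation

print(principal_value, -math.pi / math.sqrt(3))
```

The imaginary parts cancel, leaving a real number that agrees with -\pi / \sqrt{3} up to rounding.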

Let’s generalize. Complex analysis provides the machinery to compute integrals not to be integrated easily by real means, or something like that. Canonical is having the value on an arc go to naught as the arc becomes arbitrarily large, and equating the integral with a constant times the sum of the residues inside. We’ve done that here. Well, it turns out that if the integral has an integrand that explodes somewhere on the domain of integration, we can make a dent there, and minus out the integral along its corresponding arc.

A possible switch in focus from math to natural science

I find myself becoming more keen on natural reality over the last week or so, though my time has still been mostly concentrated on mathematics. It is possible that I am actually more suited to natural science than to mathematics, who knows. To get a finer estimate of that, I’m going to go learn some natural science; what else would I do.

I want to first talk about my experience with science in college, high school, middle school, and perhaps even earlier. In elementary school, in sixth grade, we had this science fair. My partner and I chose to do ours on wastewater treatment plants. There were some people who did solar power and even one who did population growth the previous year (social science is science I guess). I learned absolutely nothing from that; it’s like, how many kids at that age can actually learn science that isn’t bull shit?

In 7th grade, we had for our science course life science. It was mostly taking notes on various types of life, from fish to reptiles to plants. That’s when I first learned of the Linnaeus classification system. We didn’t do experiments really. Tests were mostly regurgitation of notes. There was nothing quantitative.

In 8th grade, it was earth science. The teacher was so dumb that in math class, this kid was like: how can you like Mrs.    ? She’s as dumb as a rock! To that, the math teacher, who I later realized was a complete moron who didn’t even know what math was, was not terribly accepting, I’ll put it that way. We studied volcanos and earthquakes, watched documentaries on those types of things, and played around a bit with Bunsen burners and random equipment typical of a chemistry laboratory, the names of which I know not. For names, I guess use this as a reference? I didn’t like that class and didn’t do well in it at all. My ADHD or what not was particularly severe in it.

In 9th grade, it was “physical science.” We did some problems in Newtonian mechanics, very simple ones, that’s when I first learned of Newtons, Joules, work, energy, those types of things. I actually found that pretty interesting. There was this project for making an elastic powered, or rubber band powered car to be more explicit. Really, there wasn’t much point in that other than as a way to pass time for kids. Having worked as a software engineer, I can guess that there are very systematic ways of designing and building that stuff. Of course, us kids just tinkered around in a way wherein we didn’t know at all what we were doing. I do remember there was a time when we were playing with this thing, called a crucible I think, that we were not supposed to touch, as doing so would smear black onto our hands, which I nonetheless still did, receiving, consequently, reprimand from the teacher.

In 10th grade, it was chemistry and then biology for the second half of the year. This was now at my grades 10-12 high school. The class was rumored to be impossibly hard, and the teacher was said to be a very demanding guy. There was, unlike the year previous, basically zero hands on. The chemistry part was very quantitative, I remember stoichiometry was a big part. There was nothing really hard about it; the students were simply too dumb to even perform very mechanical calculations. Kids would say: “it’s a lot of math.” At that time, I didn’t know the difference between math and science, and the other kids knew even less. Math is founded on the axiomatic system pioneered by the Greeks, about proving things in an a priori way, while science is about modeling natural phenomena and testing those models. Math in science is just a tool and not the focus. I recall we started off learning about uncertainties in measurement. There’s really nothing especially hard about that stuff, with a very systematic way of going about it, but the atmosphere and the way it was lectured about made it seem like such a grand thing to us. The second half the year as I said was biology. I wasn’t terribly engaged in that. I didn’t like the memorization involved. I liked math more. I made the AIME that year, taking the AMCs for the first time, and was one of four kids out of almost 2000 in our high school to do so, so that brought me to conclude that maybe I actually had some talent for math and science. I knew that physics was the hardest and brainiest of the sciences, with all that fancy math and Einstein, so I was rather keen to learn that. I checked out some physics books from the library I think, and the first thing I learned about was if I remember correctly centripetal acceleration, which confused me quite a lot at that time.

11th and 12th grade was physics with the same teacher. The class was rather dumbed down; it had to, especially on the math end, problem solving wise, since this is an American high school after all. There was quite a focus on phenomena, as opposed to formalism. I didn’t really like that much. I was more comfortable with formalism, with math being my relative forte at that time. We did some experiments, but I wasn’t good at them at all. I remember on the first day, when looking at some uniformly dense rod vertically situated, it occurred neither to me nor my partner to record its position at its center of mass. I didn’t really understand what I was doing throughout the whole time. The other kids, most of them, were worse. There were some who were confused about the difference between energy and power, the latter of which is the derivative of the former of course, after two years of it! I remember the whole time many kids would go: wow! physics! That kind of perspective, later understood by me, makes it almost impossible for one to really learn it. With just about everything, there is a right way of going about it. Discover it (mentally, with the aid of books, lectures, various resources) and you’ll do great. Be in awe of it, and you’ll never get it. The former is in line with the philosophy that you should focus solely on what is true, objectively, and not imagine anything that doesn’t aid in your convergence to the truth, and reminds me of the quote of Einstein that one should make something as simple as it can be but no simpler. Simplicity is gold in science and just about everything. Ability to recognize the redundant and superfluous and to generalize is the essence of intellectual ability, or to put it in more extreme terms, genius. The culture in American high school is the antithesis of that. 
Kids are always talking about how hard things, especially math loaded subjects, are, when they’re making it hard for themselves by imagining in their minds what is complete bogus from a scientific point of view. To digress, this holds as well for subjects like history. Focus solely on what the facts are and the truth they bear out. Don’t let political biases and personal wants and wishes interfere in any way. This is, to my remembrance, advice the intellectual Bertrand Russell gave to posterity nearing his death. American history classes are particularly awful at this. American teaching of history is very much founded on ignorance and American exceptionalism and a misportrayal of cultures or political systems it, or more like, its blood sucking elite, regards as evil for the simple reason that they are seen as threatening towards their interests. Math and science under the American public school system was pretty dismal. History (or social studies, as they call it) was perhaps more so, in a way more laughable and contemptible.

I hardly took science in college, being a math and CS major. I did take two quarters of physics and it was awful. Talking with some actual physics PhD students and physics PhDs gave me a more accurate idea of what physics really was, though I was still pretty clueless. It was evident to me at that time that physics, and probably also chemistry, was far more demanding in terms of cognitive ability as many of the CS majors, who could write code not badly, struggled with even very simple physics. Being in college, I had a closer look at the world of real science, of scientists, in America, which is very foreign. It dawned on me that science, as exciting as it sounds, is in America done mostly by underpaid ubermensch immigrant men, who are of a completely different breed both intellectually and culturally from most of the people I had encountered at that time. Yes, by then I had found my way to this essay by Greenspun. I’ll leave its interpretation up to the reader. 😉

You can probably guess that I think American science education is a complete joke, which is the truth. I felt like I only began to really learn things once I got out of the American school system, although for sure, the transition between high school and college in terms of content and depth and rapidity of learning was quite substantial. However, the transition from undergrad to out into the bigger world, where I could consider myself psychologically as more in the ranks of everyone, regardless of age or national origin, than in the ranks of clueless American undergrads at a mediocre program, was probably just as substantial in the same respect, albeit in a very different way.

Now let’s context switch to some actual science (that’s not pure math or artificial in any way).


A capacitor is charged by taking some negative charge off the positive plate and transferring it to the negative plate. This obviously requires work. If the final voltage is \Delta V, then the average voltage during the charging process is half of that, since the voltage grows in proportion to the charge transferred. With the change in potential energy \Delta U_E as charge times average voltage (remember, voltage is potential energy per unit charge), we can write \Delta U_E = \frac{1}{2}Q\Delta V.

How to maintain charge separation? Insert an insulator (or dielectric) between the plates. Curiously, a dielectric always increases the capacitance (Q / \Delta V) of a capacitor. The charge on the plates electrically polarizes the medium, which induces an electric field in the reverse direction, in addition to the one produced by the plates alone. The negative charges in the dielectric lean towards the positive plate, and the same holds if you permute negative and positive. So if the plates by themselves give rise to \mathbf{E}, the addition of the dielectric gives rise to some \mathbf{E_i} in the opposite direction. Call \kappa the coefficient of the reduction in the magnitude of the electric field, with

E_{\mathrm{with\;dielectric}} = E_{\mathrm{without\;dielectric}} - E_i = \frac{E} {\kappa}.

We put that coefficient in the denominator so that

C_{\mathrm{with\;dielectric}} = \kappa C_{\mathrm{without\;dielectric}}.

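To be more explicit, \kappa is defined so that it multiplies the capacitance: at fixed charge Q, the voltage \Delta V across the plates is proportional to the field, so dividing the field by \kappa divides \Delta V by \kappa and multiplies C = Q / \Delta V by \kappa. That is reasonable since capacitance is what is more central to the current context. This \kappa is called the dielectric constant; it varies from material to material, under the constraint that it is always greater than 1.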
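To make the relations above concrete, here is a small sketch with made-up numbers (the 2 microfarad capacitance, 10 microcoulomb charge, and \kappa = 4 are purely illustrative, as are the function names). It treats an isolated capacitor, so the charge stays fixed while the dielectric goes in.

```python
def stored_energy(charge, voltage):
    """Energy stored in a capacitor: U = (1/2) * Q * delta_V, in joules."""
    return 0.5 * charge * voltage

def insert_dielectric_isolated(charge, capacitance, kappa):
    """Isolated capacitor: Q stays fixed, C is multiplied by kappa.

    Returns the new capacitance and the new voltage Q / (kappa * C).
    """
    new_capacitance = kappa * capacitance
    return new_capacitance, charge / new_capacitance

# Illustrative values only.
capacitance = 2e-6   # farads
charge = 1e-5        # coulombs
kappa = 4.0          # dielectric constant, must exceed 1

voltage_before = charge / capacitance
new_capacitance, voltage_after = insert_dielectric_isolated(charge, capacitance, kappa)

print(voltage_before, voltage_after)
print(stored_energy(charge, voltage_before), stored_energy(charge, voltage_after))
```

With the charge held fixed, both the voltage and the stored energy drop by the factor \kappa.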

Now one might ask whether the capacitor is charging when the dielectric is inserted. If it isn’t, the voltage across it will experience a sudden decrease, with the stored charge constant. If it is, the voltage will experience the same drop, but the charge on the plates will keep going up, and the voltage with it, with Q = C\Delta V now holding for the increased C. Needless to say, on taking the derivative, the linear relation is preserved with the same coefficient: dQ/dt = C \, d(\Delta V)/dt.

The presence of a dielectric presents a potential problem, namely that if the voltage is too high, the electrons in the dielectric material can be ripped out of their atoms and propelled towards the positive plate. Obviously, this discharges the capacitor, as negative and positive meet to neutralize. It is said that this typically burns a hole through the dielectric. This phenomenon is called dielectric breakdown.

Programming types

Programming, the intense hacker side of it, attracts a certain breed of person. In short, I would put it as that it attracts those who are higher in autism than in g, though of course one needs to be reasonably high in both, especially the verbal side of g, as its activity is largely one of reading (of logs and documentation) and writing (of code (and its supporting documentation), the quality of which has good variable names as a major component). I do feel at times that programmers, even elite ones, are lacking in scientific taste. Many of them are mathematically null. They thrive on and even love the detailed minutiae involved in the work, such as encodings (like UTF, ASCII, that type of thing), the ins and outs of Unix, and arcane facts of various languages. I had to encounter in my work today parsing of CSV files, and it turned out that the CSV reader was not reading under the correct encoding. I ended up diffing my output with the output generated via a means more or less guaranteed to work to aid such’s diagnosis. I’m not bad at this type of thing any longer, having trained myself or more like grown to be able to patiently resolve such problems in a systematic, foolproof fashion.

Does that mean I enjoy this type of thing? No, not at all, though I find it tolerable, more or less. Too autistic for me. It does not have the depth that mathematics has. It has not the beauty of poetry or of music. It has not the wittiness of words or the expressiveness of (human) language. Nor does it have the significance on the world that politics has. There are more meaningful things to be doing than programming, though needless to say there is much demand for it as the world now runs on computer programs, which are written mostly by politically incompetent and often socially awkward people who answer to morons with MBAs.

I’ve come to notice that programmers tend to be very narrow. They only know programming. There are of course exceptions. Mathematicians and to a greater extent physicists are more broad, and more deep. That narrowness makes programmers very boring to talk with. The people who are more well rounded who are in programming are often, from my observation, in it for the easy money, which is of course paltry relative to what the parasites of our society suck in, but nonetheless a very good sum by the standards of ordinary folk.

There is of course another world of programming, that of the incompetents, who often know only Java and barely know any computer science even. They’re far from the functional programmers who I work with. This industry is so in need of grunt labor that those people manage to find their way into six figure salaries. Yes, this includes places like Google and Facebook. There are Google engineers who don’t know what the difference between stack memory and heap memory is and who think C++ pointers are scary, who make 200k a year or almost. I won’t talk more about them. Waste of breath.

A result of Cantorian pathologies

We are asked to find a function on [0,1] such that f(0) = f(1) = 0 that has positive derivative almost everywhere on that domain. It occurred to me to use the Cantor set, which is obtained by repeatedly partitioning each remaining interval into thirds and removing the interior of the middle one. So first (1/3, 2/3), then (1/3^2, 2/3^2) and (7/3^2, 8/3^2), and so on. Each time we remove one-third of what remains, and summing the geometric series 1/3 + 2/3^2 + 4/3^3 + \cdots yields a measure of 1 for all that is removed. Another Cantorian construct arrived at from the Cantor set is the Cantor function, or Cantor staircase, called such as it resembles a staircase: write x in base three, truncate after the first digit 1 if there is one, replace every digit 2 by 1, and read the result in base two. That, it turns out, is exactly what we need. It is a continuous, non-decreasing function with derivative zero almost everywhere, being constant on each removed interval; all of its increase is concentrated on the Cantor set, which has measure zero. It is that singular behavior that facilitates the going from 0 to 1 along an interval with zero derivative almost everywhere. A combination of that with a function with positive derivative delivers us what we want. That would be x minus the Cantor function. This has derivative 1 almost everywhere. However, we need to keep its range inside [0,1], which its outputs on [0,1/2] do not entirely satisfy (at x = 1/3 it equals -1/6). We are done though if we prove it to be non-negative on the other half, [1/2, 1], because then we can stretch that half horizontally by a factor of two. Between 1/2 and 2/3 such is the case, needless to say, since the Cantor function is 1/2 there. Past 2/3, the first digit is a 2 in base three that becomes a 1 in base two, contributing 2/3 - 1/2 = 1/6 in favor of x. At later places the contribution can go the other way. A digit 2 at place n \geq 2 becomes a 1 in base two, which favors the Cantor function by 1/2^n - 2/3^n. A digit 1 at place n, which can occur at most once since everything after it is dropped, favors the Cantor function by at most 1/2^n - 1/3^n.
In the former case, the largest possible total is 1/2^2 + 1/2^3 + \cdots = 1/2 minus 2/3^2 + 2/3^3 + \cdots = 1/3, which is 1/6. In the latter case, 1/2^n - 1/3^n is exceeded by 1/2^{n-1} - 1/3^{n-1}, the total from digits 2 at every place from the nth on, so it is covered by the former bound. Thus, the gain of 1/6 for x at the first place is at least the maximum total gain for the Cantor function at the later places, which completes our proof of non-negativity for x \geq 1/2.
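As a sanity check on the non-negativity claim, here is a sketch that evaluates the Cantor function from ternary digits with exact rational arithmetic and tests x minus the Cantor function on a grid in [1/2, 1]. The function names, the digit depth of 40, and the grid are my own choices; a finite digit depth can only underestimate the Cantor function, by at most 2^{-40}, so this is a check and not a proof.

```python
from fractions import Fraction

def cantor(x, depth=40):
    """Cantor function at x in [0, 1], from the first `depth` ternary digits.

    Digits 2 become binary digits 1; the first digit 1 becomes a final
    binary digit 1 and everything after it is dropped.
    """
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    value = Fraction(0)
    scale = Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            return value + scale
        value += scale * (digit // 2)
        scale /= 2
    return value

def f(x):
    """x minus the Cantor function."""
    return Fraction(x) - cantor(x)

def g(x):
    """The right half of f, stretched horizontally onto [0, 1]."""
    return f((1 + Fraction(x)) / 2)

grid = [Fraction(k, 1000) for k in range(500, 1001)]
print(min(f(x) for x in grid), f(Fraction(1, 3)), g(0), g(1))
```

On this grid the minimum over [1/2, 1] is 0, attained at the endpoints, while f(1/3) = -1/6 shows why the left half had to be discarded.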

Galois group of x^10+x^5+1

This was a problem from an old qualifying exam, that I solved today, with a few pointers. First of all, is it reducible? It actually is. Note that x^{15} - 1 = (x^5-1)(x^{10}+x^5+1) = (x^3-1)(x^{12} + x^9 + x^6 + x^3 + 1). 1 + x + x^2, as a prime element of \mathbb{Q}[x] that divides x^3 - 1 but not x^5-1, must divide the polynomial whose Galois group we are looking for. The other factor is the 15th cyclotomic polynomial, of degree 8, whose Galois group is the multiplicative group (\mathbb{Z}/15\mathbb{Z})^\times, which has 8 elements. Seeing that it has 3 elements of order 2 and 4 elements of order 4 and is abelian, it must be C_2 \times C_4. The roots of 1 + x + x^2 are the primitive cube roots of unity, which are fifth powers of primitive 15th roots of unity, so they already lie in the splitting field of the other factor and contribute nothing new. Thus, the answer is C_2 \times C_4.
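The element orders quoted above are quick to verify by brute force. A minimal sketch, with names of my own choosing:

```python
from collections import Counter
from math import gcd

def multiplicative_order(a, n):
    """Smallest positive e with a^e = 1 (mod n); assumes gcd(a, n) == 1."""
    order, power = 1, a % n
    while power != 1:
        power = power * a % n
        order += 1
    return order

# Units modulo 15, i.e. the group the Galois group is identified with.
units = [a for a in range(1, 15) if gcd(a, 15) == 1]
order_counts = Counter(multiplicative_order(a, 15) for a in units)

print(units)
print(dict(order_counts))
```

An abelian group of order 8 with no element of order 8 and some element of order 4 can only be C_2 \times C_4, which is what the counts show.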

On the adjugate

I learned that the adjugate is the transpose of the matrix of minors with the appropriate sign, which, as we all know, alternates along rows and columns, one signed minor for each element of the matrix on which the adjugate is taken. The matrix, multiplied with its adjugate, in fact yields the determinant of that matrix, times the identity of course, to matrix it. Note that the diagonal elements of the result are exactly what one gets from applying the minors algorithm for calculating the determinant along each row. The other terms vanish. There are n(n-1) of them, where n is the number of rows (and columns) of the (square) matrix. They are, for each column of the adjugate and each column of it not equal to the current column, the sum of each entry in the column times the minor (with sign applied) determined by the removal of the other selected column (constant throughout the sum) and the row of the current entry. In the permutation expansion of this summation, each element has a (unique) sister element, with the sisterhood relation symmetric, determined by taking the entry of the adjugate matrix in the same column as the non minor element to which the permutation belongs and retrieving in the permutation expansion of the element times minor for that element the permutation, the product representing which contains the exact same terms of the matrix. Note that the shift in position of the swapped element in the minor matrix is one less than that in the adjugate matrix. Thus, the signs of the permutations cancel. From this, we arrive at the conclusion that the entire sum of entry times corresponding minor across the column is zero. Equivalently, each such sum is the minors expansion of the determinant of a matrix in which one row has been overwritten by a copy of another, which is zero.

A corollary of this is that \mathrm{adj}(\mathbf{AB}) = \mathrm{adj}(\mathbf{B})\mathrm{adj}(\mathbf{A}).
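Both the product formula and the corollary can be spot-checked on integer matrices. This is a sketch with helper names and example matrices of my own; it uses the plain minors expansion, so it is only sensible for small sizes.

```python
def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Determinant by expansion in minors along the first row."""
    if len(m) == 0:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def adjugate(m):
    """Transpose of the matrix of signed minors."""
    n = len(m)
    return [[(-1) ** (i + j) * det(minor(m, j, i)) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def scaled_identity(c, n):
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

# Arbitrary example matrices.
A = [[2, 0, 1], [1, 3, -1], [0, 5, 4]]
B = [[1, 2, 0], [0, 1, 1], [3, 0, 2]]

print(matmul(A, adjugate(A)) == scaled_identity(det(A), 3))
print(adjugate(matmul(A, B)) == matmul(adjugate(B), adjugate(A)))
```

Note the reversed order in the second identity, the same reversal as for inverses and transposes.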

More math

Last night, I learned, once more, the definition of absolute continuity. Formally, a function f : X \to Y, with X an interval of the reals, is absolutely continuous if for any \epsilon > 0 there is a \delta > 0 such that for any finite collection of pairwise disjoint intervals (x_k, y_k), \sum |x_k - y_k| < \delta implies \sum |f(x_k) - f(y_k)| < \epsilon. It is stronger than uniform continuity, which is the special case of a single interval. I saw that it implies almost everywhere differentiability and is intimately related to the Radon-Nikodym derivative. A canonical example of a function uniformly continuous but not absolutely continuous, as I learned afterwards last night, is the Cantor function, this wacky function still to be understood by myself.

I have no textbook on this or on anything measure theoretic, and though I could learn it from reading online, I thought I might as well buy a hard copy of Rudin that I can scribble over to assist my learning of this core material, as I do with the math textbooks I own. Then, it occurred to me to consult my math PhD student friend Oleg Olegovich on this, which I did through Skype this morning.

He explained very articulately absolute continuity as a statement on bounded variation. It’s like you take any set of measure less than \delta and the total variation of that function on that set is no more than \epsilon. It is a guarantee of a stronger degree of tightness of the function than uniform continuity. Uniform continuity itself is already violated by functions such as x^2 on the reals, for which the \delta required for a given \epsilon shrinks indefinitely as one goes to infinity.

Our conversation then drifted to some lighter topics, lasting in aggregate almost 2 hours. We talked jokingly about IQ and cultures and politics and national and ethnic stereotypes. In the end, he told me that введите общение meant “input message”, in the imperative, and gave me a helping hand with the plural genitive declension, specifically for “советские коммунистические песни” (“Soviet communist songs”). Earlier this week, he asked me how to go about learning Chinese, for which I gave no good answer. I did, on this occasion, tell him that with all the assistance he’s provided me with my Russian learning, I could do reciprocally for Chinese, and then the two of us would become like Москва-Пекин (“Moscow-Beijing”), the lullaby of which I sang to him for laughs.

Back to math, he gave me the problem of proving that for any finite group G of order n, a subgroup H of index p, the smallest prime divisor of n, is normal. The proof is quite tricky. Note that the action of G on G / H by left multiplication induces a map \rho : G \to S_p, the kernel of which we call N. The image’s order, as that of a subgroup of S_p, must divide p!, and, as the image is isomorphic to the quotient group G / N, must divide n. Here is where the smallest prime divisor hypothesis is used. Every prime dividing n is at least p, while p! contains p only once and otherwise only smaller primes, so the greatest common divisor of n and p! can only be p or 1, and hence [G:N] is p or 1. N \leq H, as everything in N must take the coset H to itself, which only holds for elements of H. By that, [G:N] \geq [G:H] = p, which rules out 1 and means N = H. The desired result thus follows, since N, as a kernel, is normal in G.

Later on, I looked at some random linear algebra problems, such as proving that an invertible matrix A is normal iff A^*A^{-1} is unitary, and that the spectrum of A^* is the complex conjugate of the spectrum of A, which can be shown via examination of A^* - \lambda I. Following that, I stumbled across some text involving minors of matrices, which reminded me of the definition of determinant, the most formal one of which is \sum_{\sigma \in S_n}\mathrm{sgn}(\sigma)\prod_{i=1}^{n}a_{i,\sigma_{i}}. In school though we learn its computation via minors with alternating signs as one goes along. Well, why not relate the two formulas.

In this computation, we are partitioning based on the element that 1, or any specific element of [n] = \{1, 2, \ldots, n\} with a corresponding row in the matrix, maps to. How is the sign determined for each? Why does it alternate? Well, with the mapping for 1 already determined in each case, it remains to determine the mapping for the remainder, 2 through n. There are (n-1)! of them, from \{2, 3, \ldots, n\} to [n] \setminus \{\sigma_1\}. Write \sigma_1 = i + 1. If we were to treat 1 through i as shifted up by one, so as to make it a self map on \{2, 3, \ldots, n\}, then each entry in the sum of the determinant of the minor would have its sign as the parity of the number of transpositions of consecutive elements (which generate the symmetric group) composing it. Following that, we’d need to shift \{2, 3, \ldots, i+1\} back down, with 1 sent to i+1, the presentation of which, in generator decomposition, would be (i\ i+1)(i-1\ i) \ldots (1\ 2), which has sign (-1)^i, where i is one less than the column we’re at, thereby explaining why we alternate, starting with positive.
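The agreement of the two formulas can also be seen by computing both on an example. A sketch, with function names and the test matrix chosen by me: one function sums over all permutations with their signs, the other expands in minors along the first row with alternating signs.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation given as a tuple, via its inversion count."""
    inversions = sum(1 for a in range(len(perm))
                     for b in range(a + 1, len(perm)) if perm[a] > perm[b])
    return -1 if inversions % 2 else 1

def det_permutations(m):
    """Sum over sigma in S_n of sgn(sigma) * prod_i a_{i, sigma_i}."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for row in range(n):
            term *= m[row][perm[row]]
        total += term
    return total

def det_minors(m):
    """Expansion along the first row, signs alternating from positive."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col in range(len(m)):
        sub = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * m[0][col] * det_minors(sub)
    return total

example = [[2, 0, 1, 3], [1, 3, -1, 0], [0, 5, 4, 1], [2, 2, 0, -1]]
print(det_permutations(example), det_minors(example))
```

The factor (-1) ** col in the minors version is the sign of the shifting cycle discussed above, with col counted from zero, so it is one less than the column number.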



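This reminds me of how, back in elementary and middle school, American classmates several times argued over “is Taiwan part of China,” generally concluding in the end that it is not. Needless to say, whatever they thought could not change the facts, so such arguments were pointless, especially given their ignorance, and mine at the time, of the concrete, objective facts of history. History classes in elementary and middle school are basically garbage everywhere, especially in America, because the teachers’ level is generally not very high and often quite poor; in America, for example, many history teachers self-righteously impose their own subjective biases on students. Kids all think very simply, measuring everything in terms of good and bad, and so did I. Now that I have grown up, I know that good and bad, righteous and evil, do not exist objectively, but winners and losers do; there is no need to hide it, and Chiang Kai-shek’s Kuomintang, defeated and driven to Taiwan, was a loser in the extreme.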







How not woven the fabric of the universe
Spliced with craft
Comes together as one
Wide and broad with unparalleled mystery
Nature loves geometry
Fiber bundles describe four forces
Long unsolved problems
Euclid Gauss Riemann Cartan Chern




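How to explain it? I think it is the natural result of great scientists all being, in the language used in the discussions of g on Steve Hsu’s blog, very high in V, combined with the old Chinese tradition of the imperial examinations.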