## A special case of Vandermonde’s identity

Waking up this morning, I was somehow reminded of this combinatorial identity that appeared on an exam in a “math problem solving” class I took, which I didn’t actually solve during the test because back then I was an idiot. It was

$\displaystyle\binom{2n}{n} = \displaystyle\sum_{k=0}^n \binom{n}{k}^2$.

Basically, it’s observing that $\binom{n}{k}^2 = \binom{n}{k}\binom{n}{n-k}$ and then seeing that we have an instance of Vandermonde’s identity. The square is basically a form of obfuscation.

This stuff feels so obvious to me now yet it wasn’t back then. To make this entirely self-contained, I will prove Vandermonde’s identity as well for this specific case.

Let’s say we have $n$ men and $n$ women. How many ways are there to select $n$ people among those $2n$?

Easy. $\displaystyle\binom{2n}{n}$.

Or, for every $k$ from $0$ to $n$, we can take the number of ways to pick $k$ men ($\binom{n}{k}$) and multiply it by the number of ways to pick the $k$ women to exclude (also $\binom{n}{k}$), so that a total of $n$ people are picked: $k$ men and $n-k$ women. Summing over $k$ gives the right-hand side of the identity.
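As a quick sanity check on the double counting above, here is a minimal Python sketch that compares both sides of the identity for small $n$. The function names `sum_of_squares` and `central_binomial` are made up for illustration; only `math.comb` from the standard library is used.

```python
from math import comb

def sum_of_squares(n: int) -> int:
    """Right-hand side: sum over k of C(n, k)^2 (k men, k women excluded)."""
    return sum(comb(n, k) ** 2 for k in range(n + 1))

def central_binomial(n: int) -> int:
    """Left-hand side: ways to choose n people out of 2n."""
    return comb(2 * n, n)

# Compare the two counts for a few small n.
for n in range(8):
    assert sum_of_squares(n) == central_binomial(n)
```

This is of course not a proof, just a check that the counting argument was set up correctly.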

There really is very little to this once you see how it works. But I guess the thought process behind this is unnatural for (most) humans to develop. It might take a while before one can visualize it accurately and clearly. Again, it would have been somewhat mind-boggling to the high school me.

Certainly this loads much more on math IQ than on verbal IQ, and testing has indicated I’m somewhat stronger at the former.

Math seems to have a strong “either you get it or you can’t” component, which makes it more exclusive. It seems a much higher percentage of people can do software engineering than this kind of math. I’ve seen plenty of good, reliable, hardworking software engineers who simply do not have the brain to grok this type of math. Those people tend to go to state schools whereas the ones who are good at computers and good at math are more likely to go to, say, MIT, but they may well eventually end up doing more or less the same type of software work. Plenty of MIT students who did well in math contests don’t go into quant finance but end up at a big software/internet company or Dropbox, and even though they seem much sexier (in terms of how they appear on paper overall, their being much smarter), at work, what they do is not much different. Experience has told me that software engineering is much more about being able to grind computer stuff (not minding it too much or even enjoying it, and in that regard, I’m not very high) than about IQ.

## The RSA public-key encryption algorithm

### The RSA algorithm

$a^{\phi(n)} \equiv 1 \pmod{n}$

Here $a$ is coprime to $n$, and $\phi$ is the totient function, $\phi(n) = |\{1 \leq k \leq n : \gcd(k, n) = 1\}|$
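A minimal Python sketch of the totient definition and the congruence above, assuming the usual hypothesis $\gcd(a, n) = 1$ (Euler's theorem). The helper names `totient` and `euler_holds` are illustrative only, and the brute-force count is meant for small $n$.

```python
from math import gcd

def totient(n: int) -> int:
    """Count 1 <= k <= n with gcd(k, n) = 1, straight from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def euler_holds(a: int, n: int) -> bool:
    """Check a^phi(n) = 1 (mod n); only expected to hold when gcd(a, n) = 1."""
    return pow(a, totient(n), n) == 1 % n

# Check the congruence for every unit modulo a few small n.
for n in range(2, 30):
    for a in range(1, n):
        if gcd(a, n) == 1:
            assert euler_holds(a, n)
```

RSA relies on exactly this congruence with $n = pq$ a product of two primes, where $\phi(n) = (p-1)(q-1)$.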

## For some reason, I suddenly remembered the non-measurable Vitali set from measure theory

$\mu^*(E) = \inf\{\displaystyle\sum_{k=1}^{\infty} |I_k| : (I_k)_{k \in \mathbb{N}} \text{ open intervals}, \displaystyle\bigcup_{k=1}^{\infty} I_k \supset E\}$

$\displaystyle \mu ^{*}(A)=\mu ^{*}(A\cap E)+\mu ^{*}(A\cap E^{c})$

• Measure is translation invariant, that is, $\mu(S) = \mu(x+S), \forall x \in \mathbb{R}$
• If the $\{A_k\}_{k \in \mathbb{N}}$ are pairwise disjoint, then $\mu(\bigcup_{k=1}^{\infty} A_k) = \sum_{k=1}^{\infty} \mu(A_k)$
• $S \subset T$ implies $\mu(S) \leq \mu(T)$

## Big Picard theorem

I’ve been asked to prove the Big Picard theorem, assuming the fundamental normality test. Assuming the latter, it is a very short proof, and I could half-ass with that. I don’t like writing up stuff that I don’t actually understand for the sake of doing so. There’s little point, and if I’m going to actually write up a proof of it, I’ll do so for real, which means that I go over the fundamental normality test in its entirety.

First some preliminaries.

Theorem 2.28 (Riemann mapping theorem). Let $\Omega \subset \mathbb{C}$ be simply-connected and $\Omega \neq \mathbb{C}$. Then there exists a conformal homeomorphism $f : \Omega \to \mathbb{D}$ onto the unit disk $\mathbb{D}$.

Theorem 2.30. Suppose $\Omega$ is bounded, simply-connected, and regular. Then any conformal homeomorphism as in Theorem 2.28 extends to a homeomorphism $\bar{\Omega} \to \bar{\mathbb{D}}$.

Schwarz reflection principle. Suppose that $f$ is an analytic function which is defined in the upper half-disk $\{|z|^2 < 1, \text{Im } z > 0\}$. Further suppose that $f$ extends to a continuous function on the real axis, and takes on real values on the real axis. Then $f$ can be extended to an analytic function on the whole disk by the formula

$f(\bar{z}) = \overline{f(z)}$

and the values for $z$ reflected across the real axis are the reflections of $f(z)$ across the real axis.

We begin by presenting the standard “geometric” procedure by which the covering map $\pi : \mathbb{D} \to \mathbb{C} \setminus \{p_1, p_2\}$ may be obtained. Here $p_1, p_2$ are distinct points. This then leads naturally to the “little” and “big” Picard theorems, which are fundamental results of classical function theory.

The construction takes place in the Poincaré disk. In the above figure, we have a circle $C_2$ reflected about $C_1$, where the configuration is such that $C_2$ intersects $C_1$ perpendicularly. The intersection points must be fixed, and the reflection must preserve the orthogonality. Moreover, reflection preserves the geodesic nature, and under the hyperbolic metric, geodesics are generalized circles. From this, we can deduce that $C_2$ goes to itself, with its two arcs relative to $C_1$ interchanged.

To construct the map, we start with a triangle $\Delta_0$ inside the unit circle consisting of circular arcs that intersect the unit circle at right angles. Reflect $\Delta_0$ across each of its sides and one gets three more triangles with circular arcs intersecting the unit circle at right angles.

The above figure shows how the unit disk is partitioned by triangles as a result of iterating these reflections indefinitely. To obtain the sought-after covering map, we start from the Riemann mapping theorem, which gives us a conformal isomorphism $f : \Delta_0 \to \mathbb{H}$, the upper half-plane. This map extends as a homeomorphism to the boundary by Theorem 2.30. Thus, the three circular arcs of $\Delta_0$ get mapped to the intervals $[-\infty, 0], [0,1], [1,\infty]$, respectively. By the Schwarz reflection principle, the map $f$ extends analytically to the region obtained by reflecting $\Delta_0$ across each of its sides as just explained above. Two observations complete the picture. First, the vertices of the triangles, which are the points sent to $0, 1, \infty$, lie on the unit circle rather than in the open disk, so these three values are omitted. Second, the complex conjugation in the Schwarz reflection principle sends the upper half-plane to the lower half-plane, so the reflected triangles are mapped onto the lower half-plane. This way, we obtain a conformal map onto $\mathbb{C} \setminus \{0, 1\}$ defined on the entire unit disk that is a local isomorphism and a covering map.

Theorem 4.18. Every entire function which omits two values is constant.

Proof. Indeed, if $f$ is such a function, we may assume that it takes its values in $\mathbb{C} \setminus \{0, 1\}$. But then we can lift $f$ to the universal cover of $\mathbb{C} \setminus \{0, 1\}$ to obtain an entire function $F$ into $\mathbb{D}$. By Liouville’s theorem, $F$ is constant.     ▢

Theorem 4.19 (Fundamental normality test). Any family of functions $\mathcal{F}$ in $\mathcal{H}(\Omega)$ which omits the same two distinct values in $\mathbb{C}$ is a normal family.

Theorem 4.20. If $f$ has an isolated essential singularity at $z_0$, then in any small neighborhood of $z_0$ the function $f$ attains every complex value infinitely often, with one possible exception.

Proof. Suppose without loss of generality that $z_0 = 0$ and define $f_n(z) = f(2^{-n}z)$ for an integer $n \geq 1$. We take $n$ so large that $f_n$ is analytic on $0 < |z| < 2$, which is possible by making the neighborhood about $z_0$ sufficiently small. Suppose for contradiction that $f$ omits some two values in a punctured neighborhood of $z_0$. Then every function in this family, defined via $f$, omits the same two values. Thus, by the fundamental normality test, some subsequence $f_{n_k}(z) \to F(z)$ uniformly on $1/2 \leq |z| \leq 1$, where either $F$ is analytic (by Weierstrass’s theorem) or $F \equiv \infty$. In the former case, the $f_{n_k}$ are uniformly bounded on $|z| = 1$, so $f$ is uniformly bounded on the circles $|z| = 2^{-n_k}$, and by the maximum principle $f$ is bounded near $z = 0$, which means the singularity is removable. In the latter case, convergence to $\infty$ implies that $z = 0$ is a pole, contradicting that $f$ has an essential singularity there.     ▢

References

• Schlag, W., A Course in Complex Analysis and Riemann Surfaces, American Mathematical Society, 2014, pp. 70-72,81,160-164.

## On grad school, science, academia, and also a problem on Riemann surfaces

I like mathematics a ton and I am not bad at it. In fact, I am probably better than many math graduate students at math, though surely, they will have more knowledge than I do in some respects, or maybe not even that, because frankly, the American undergrad math major curriculum is often rather pathetic, maybe largely because the students kind of suck. In some sense, you have to be pretty clueless to be majoring in just pure math if you’re not a real outlier at it, enough to have a chance at a serious academic career. Of course, math professors won’t say this. So we now have an excess of people who really shouldn’t be in science (because they very much lack the technical power or at least a reasonable scientific taste/discernment, or more often both) adding noise to the job market. On this, Katz in his infamous Don’t Become a Scientist piece writes:

If you are in a position of leadership in science then you should try to persuade the funding agencies to train fewer Ph.D.s. The glut of scientists is entirely the consequence of funding policies (almost all graduate education is paid for by federal grants). The funding agencies are bemoaning the scarcity of young people interested in science when they themselves caused this scarcity by destroying science as a career. They could reverse this situation by matching the number trained to the demand, but they refuse to do so, or even to discuss the problem seriously (for many years the NSF propagated a dishonest prediction of a coming shortage of scientists, and most funding agencies still act as if this were true). The result is that the best young people, who should go into science, sensibly refuse to do so, and the graduate schools are filled with weak American students and with foreigners lured by the American student visa.

I’ll transition now to a problem that I’ve been asked to solve. Its statement is the following:

Let $f$ be holomorphic on a simply-connected Riemann surface $M$, and assume that $f$ never vanishes. Then there exists $F$ holomorphic on $M$ such that $f = e^F$. Show that harmonic functions on $M$ have conjugate harmonic functions.

Every $p_0 \in M$ has an open connected neighborhood $U$, namely the connected component containing $p_0$ of $\{p : |f(p) - f(p_0)| < |f(p_0)|\}$, on which $f$ takes values in a disk that avoids $0$. Let $\{U_{\alpha}\}$ be the system consisting of these neighborhoods, and $(\log f)_{\alpha}$ a continuous branch of the logarithm of $f$ in $U_{\alpha}$. From this arises a family $F_{\alpha} = \{(\log f)_{\alpha} + 2n\pi i, n \in \mathbb{Z}\}$.

In Schlag, there is the following lemma.

Lemma 5.5. Suppose $M$ is a simply-connected Riemann surface and

$\{D_{\alpha} \subset M : \alpha \in A\}$

is a collection of domains (connected, open). Assume further that these sets form an open cover $M = \bigcup_{\alpha \in A} D_{\alpha}$ such that for each $\alpha \in A$ there is a family $F_{\alpha}$ of analytic functions $f : D_{\alpha} \to N$, where $N$ is some other Riemann surface, with the following properties: if $f \in F_{\alpha}$ and $p \in D_{\alpha} \cap D_{\beta}$, then there is some $g \in F_{\beta}$ so that $f = g$ near $p$. Then given $\gamma \in A$ and some $f \in F_{\gamma}$ there exists an analytic function $\psi_{\gamma} : M \to N$ so that $\psi_{\gamma} = f$ on $D_{\gamma}$.

Take $D_{\alpha} = U_{\alpha}$ with the families of analytic functions $F_{\alpha}$ as given above. Near any $p \in D_{\alpha} \cap D_{\beta}$, two branches of the logarithm of $f$ differ by an integer multiple of $2\pi i$, so for every $n_{\alpha}$ there is some $n_{\beta}$ with $(\log f)_{\alpha} + 2n_{\alpha}\pi i = (\log f)_{\beta} + 2n_{\beta}\pi i$ near $p$, which means the hypothesis of Lemma 5.5 is satisfied by the above families.

I’ll present the proof of the above lemma here, to consolidate my own understanding, and also because it is essential to the construction of a global holomorphic function matching some function in each family. The lemma is stated in generality, of course, whereas the problem we are trying to solve is a specific case.

Proof. Let

$\mathcal{U} = \{(p, f) | p \in D_{\alpha}, f \in F_{\alpha}, \alpha \in A\} / \sim$

where $(p, f) \sim (q, g)$ iff $p = q$ and $f = g$ in a neighborhood of $p$. Let $[p, f]$ denote the equivalence class of $(p, f)$. As usual, $\pi([p, f]) = p$. For each $f \in F_{\alpha}$, let

$D'_{\alpha, f} = \{[p, f] | p \in D_{\alpha}\}$.

Clearly, $\pi : D_{\alpha, f}' \to D_{\alpha}$ is bijective. We define a topology on $\mathcal{U}$ as follows: $\Omega \subset D_{\alpha, f}'$ is open iff $\pi(\Omega) \subset D_{\alpha}$ is open for each $\alpha, f \in F_{\alpha}$. This does indeed define open sets in $\mathcal{U}$: since $\pi(D'_{\alpha, f} \cap D'_{\beta, g})$ is the union of connected components of $D_{\alpha} \cap D_{\beta}$ by the uniqueness theorem (if it is not empty), it is open in $M$ as needed. With this topology, $\mathcal{U}$ is a Hausdorff space since $M$ is Hausdorff (we use this if the base points differ) and because of the uniqueness theorem (which we use if the base points coincide). Note that by construction, we have made the fibers indexed by the functions in $F_{\alpha}$ discrete in the topology of $\mathcal{U}$.

The main point is now to realize that if $\widetilde{M}$ is a connected component of $\mathcal{U}$, then $\pi : \widetilde{M} \to M$ is onto and in fact is a covering map. Let us check that it is onto. First, we claim that $\pi(\widetilde{M}) \subset M$ is open. Thus, let $[p, f] \in \widetilde{M}$ and pick $D_{\alpha}$ with $p \in D_{\alpha}$ and $f \in F_{\alpha}$. Clearly, $D'_{\alpha, f} \cap \widetilde{M} \neq \emptyset$ and since $D_{\alpha}$, and thus also $D'_{\alpha, f}$, is open and connected, the connected component $\widetilde{M}$ has to contain $D'_{\alpha, f}$ entirely. Therefore, $D_{\alpha} \subset \pi(\widetilde{M})$ as claimed.

Next, we need to check that $M \setminus \pi(\widetilde{M})$ is open. Let $p \in M \setminus \pi(\widetilde{M})$ and pick $D_{\beta}$ so that $p \in D_{\beta}$. If $D_{\beta} \cap \pi(\widetilde{M}) = \emptyset$, then we are done. Otherwise, let $q \in D_{\beta} \cap \pi(\widetilde{M})$ and pick $D_{\alpha}$ containing $q$ and some $f \in F_{\alpha}$ with $D'_{\alpha, f} \subset \widetilde{M}$ (using the same “nonempty intersection implies containment” argument as above). But now we can find $g \in F_{\beta}$ with the property that $f = g$ on a component of $D_{\alpha} \cap D_{\beta}$. As before, this implies that $\widetilde{M}$ would have to contain $D'_{\beta, g}$ which is a contradiction.

To see that $\pi : \widetilde{M} \to M$ is a covering map, one verifies that

$\pi^{-1}(D_{\alpha}) = \bigcup_{f \in F_{\alpha}} D'_{\alpha, f}$.

The sets on the right-hand side are disjoint and in fact they are connected components of $\pi^{-1}(D_{\alpha})$.

Since $M$ is simply-connected, $\widetilde{M}$ is homeomorphic to $M$ (proof given in the appendix). We thus infer the existence of a globally defined analytic function which agrees with some $f \in F_{\alpha}$ on each $D_{\alpha}$. By picking the connected component that contains any given $D_{\alpha, f}'$ one can fix the “sheet” locally on a given $D_{\alpha}$.     ▢

By this, we can construct an analytic $F$ such that for all $\alpha$,

$F_{|U_{\alpha}} = (\log f)_{\alpha} + n_{\alpha} \cdot 2\pi i, \qquad n_{\alpha} \in \mathbb{Z}$,

from which follows $e^F = f$.

For the existence of harmonic conjugates, we do similarly. Take a connected open cover of $M$, $\{U_{\alpha}\}$, where each $U_{\alpha}$ is conformally equivalent to the unit disc, and let $v_{\alpha}$ be a harmonic conjugate of $u$ in $U_{\alpha}$ (which exists uniquely up to a constant on the unit disc). Let $F_{\alpha} = \{v_{\alpha} + c, \quad c \in \mathbb{R}\}$. Then by the same lemma, there exists $v$ such that for all $\alpha$,

$v_{|U_{\alpha}} = v_{\alpha} + c_{\alpha}, \quad \text{some } c_{\alpha} \in \mathbb{R}$

that is harmonic and conjugate to $u$, since it is a harmonic conjugate of $u$ on every element of the cover, again with the choice of the $c_{\alpha}$s ensuring that the local pieces match on intersections of cover elements.

## Elliptic functions

I am writing this as a way to go through in detail the section on elliptic functions in Schlag’s book.

Proposition 4.14.  Let $\Lambda = \{m\omega_1 + n\omega_2 | m,n \in \mathbb{Z}\}$ and set $\Lambda^* = \Lambda \setminus \{0\}$. For any integer $n \geq 3$, the series

$f(z) = \displaystyle\sum_{w \in \Lambda} (z+w)^{-n} \qquad (4.16)$

defines a function $f \in \mathcal{M}(M)$ with $deg(f) = n$. Furthermore, the Weierstrass function

$\wp(z) = \frac{1}{z^2} + \displaystyle\sum_{w \in \Lambda^*} [(z+w)^{-2} - w^{-2}] ,\qquad (4.17)$

is an even elliptic function of degree two with $\Lambda$ as its group of periods. The poles of $\wp$ are precisely the points in $\Lambda$ and they are all of order $2$.

Proof.  It suffices to prove that $f(z) = \displaystyle\sum_{w \in \Lambda} (z+w)^{-n}$ converges absolutely and uniformly on every compact set $K \subset \mathbb{C} \setminus \Lambda$. Periodicity allows us to restrict to the closure of any fundamental region. There exists $C > 0$ such that for all $x,y \in \mathbb{R}$,

$C^{-1}(|x|+|y|) \leq |x\omega_1 + y\omega_2|$.

Hence, when $z \in \{x\omega_1 + y\omega_2 | 0 \leq x, y \leq 1\}$, then

$|z + (k_1\omega_1 + k_2\omega_2)| \geq C^{-1}(|k_1| + |k_2|) - |z| \geq (2C)^{-1}(|k_1| + |k_2|)$

provided $|k_1| + |k_2|$ is sufficiently large. In

$\displaystyle\sum_{|k_1|+|k_2|>0} (|k_1| + |k_2|)^{-n}$,

there are $O(m)$ index pairs with $|k_1| + |k_2| = m$, so the sum is comparable to $\sum_{m} m^{1-n}$, which converges when $n > 2$; this, with the above bound, means $f \in \mathcal{H}(\mathbb{C} \setminus \Lambda)$. Periodicity implies $f \in \mathcal{M}(M)$. Moreover, the degree of (4.16) is determined by noting that inside a fundamental region the series has a unique pole of order $n$.
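The counting step in the convergence argument can be checked by brute force. The sketch below (function names `shell_size` and `partial_sum` are made up for illustration) counts the integer pairs with $|k_1| + |k_2| = m$, which turns out to be exactly $4m$ for $m \geq 1$, and then forms partial sums of the comparison series $\sum_m 4m \cdot m^{-n}$.

```python
def shell_size(m: int) -> int:
    """Number of integer pairs (k1, k2) with |k1| + |k2| == m, by brute force."""
    return sum(
        1
        for k1 in range(-m, m + 1)
        for k2 in range(-m, m + 1)
        if abs(k1) + abs(k2) == m
    )

def partial_sum(n: int, M: int) -> float:
    """Partial sum over shells 1..M of (|k1| + |k2|)^(-n)."""
    return sum(shell_size(m) * m ** (-n) for m in range(1, M + 1))

# Each shell has 4m points, so the shell sizes grow linearly in m.
assert all(shell_size(m) == 4 * m for m in range(1, 20))

# n = 3 gives 4 * sum 1/m^2, which stays bounded; n = 2 gives the harmonic series.
print(partial_sum(3, 50), partial_sum(2, 50))
```

For $n = 3$ the partial sums stay below $4\zeta(2) = 2\pi^2/3$, while for $n = 2$ they grow like $4\log M$, which is why the subtraction of $w^{-2}$ is needed in the definition of $\wp$.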

For the second part, we note that when $|w| > 2|z|$,

$\left|(z+w)^{-2} - w^{-2}\right| \leq \frac{|z||z+2w|}{|w|^2|z+w|^2} \leq \frac{C|z|}{|w|^3}$,

which means the series defining $\wp$, which is clearly even, converges absolutely and uniformly on compact subsets of $\mathbb{C} \setminus \Lambda$. For the periodicity of $\wp$, note that $\wp'$ is periodic relative to the same lattice $\Lambda$. Thus, for every $w \in \Lambda$,

$\wp(z+w) - \wp(z) = C(w) \quad \forall z \in \mathbb{C}$

with some constant $C(w)$, which has to be zero by

$\wp(\omega_1/2) - \wp(-\omega_1/2) = 0$.

Another way to go about it is to define $\sigma$ such that

$\zeta(z) = \frac{d \log \sigma(z)}{dz} = \frac{1}{z} + \displaystyle\sum_{\omega \in \Lambda^*} \left[\frac{1}{z-\omega} + \frac{1}{\omega} + \frac{z}{\omega^2}\right]$,

so that $\wp = -\zeta'$, from which by periodicity, we have

$\zeta(z+\omega) - \zeta(z) = C(\omega)$.

From this we can solve that

$\sigma(z+\omega_j) = -\sigma(z)e^{\eta_j(z+\omega_j/2)}, \qquad (4.20)$

where the $\eta_j$s are constants for $j = 1,2$.

Lemma 4.15.  With $\wp$ as before, one has

$(\wp'(z))^2 = 4(\wp(z) - e_1)(\wp(z) - e_2)(\wp(z) - e_3) \qquad (4.21)$

where $e_1 = \wp(\omega_1/2), e_2 = \wp(\omega_2/2)$, and $e_3 = \wp((\omega_1+\omega_2)/2)$ are pairwise distinct. Furthermore, one has $e_1 + e_2 + e_3 = 0$ so that (4.21) can be written in the form

$(\wp'(z))^2 = 4(\wp(z))^3 - g_2\wp(z) - g_3 \qquad (4.22)$

with constants $g_2 = -4(e_1e_2 + e_1e_3 + e_2e_3)$ and $g_3 = 4e_1e_2e_3$.

View the torus as

$S = \{x\omega_1 + y\omega_2 | -1/2 \leq x,y \leq 1/2\}$.

$\wp'(z)$ is odd and has a pole of order $3$ at $z = 0$ but no other poles in $S$, which means $\wp'(z)$ has degree $3$.

Oddness with periodicity applied at $\omega_1/2$ and $\omega_2/2$ yields that

$\frac{1}{2}\omega_1, \quad \frac{1}{2}\omega_2, \quad \frac{1}{2}(\omega_1+\omega_2)$

are the three zeros of $\wp'$, each simple, and thus also the unique points where $\wp$ has valency $2$ apart from $z = 0$. The $e_j$ are distinct, because if not $\wp$ would assume such a value four times, impossible when the degree is $2$.

Denoting the RHS of (4.21) by $F(z)$, we have that

$\frac{(\wp'(z))^2}{F(z)} \in \mathcal{H}(M)$

with the zeros cancelled out, and thus equal to a constant.

At $z = 0$, the highest pole of $(\wp'(z))^2$ is one of order $3\cdot 2 = 6$ with coefficient $(-2)^2 = 4$. In $F(z)$, we have essentially a cubic in $\wp(z)$ with leading coefficient $4$, and $\wp(z)$ has pole of order $2$ with coefficient $1$. In taking the limit towards zero, we only need to consider the $4\wp(z)^3$ term, which has the highest order pole, which is also of order $6$ with coefficient $4$. That means our constant function is $1$.

The final statement follows by observing from the Laurent series around zero, which is, from the geometric series of $\left(\frac{1/w}{1+(z/w)}\right)^2 - \frac{1}{w^2}$

$\wp(z) = \frac{1}{z^2} + \displaystyle\sum_{k = 1}^{\infty} (k+1)(-1)^{k}z^k\displaystyle\sum_{w \in \Lambda^*} \frac{1}{w^{k+2}}$.

Because $\wp$ is even, the odd coefficients must vanish. So we have

$\wp(z) = \frac{1}{z^2} + \displaystyle\sum_{k=1}^{\infty} (2k+1)z^{2k} \displaystyle\sum_{w \in \Lambda^*} \frac{1}{w^{2k+2}}$.

For now, let

$G_k = \displaystyle\sum_{w \in \Lambda^*} \frac{1}{w^k}$.

\begin{aligned}\wp(z) & = & \frac{1}{z^2} + 3G_4z^2 + 5G_6z^4 + \cdots, \\ \wp'(z) & = & \frac{-2}{z^3} + 6G_4z + 20G_6z^3 + \cdots, \\ (\wp(z))^3 & = & \frac{1}{z^6} + 9\frac{G_4}{z^2} + \cdots, \\ (\wp'(z))^2 & = & \frac{4}{z^6} - \frac{24G_4}{z^2} + \cdots. \end{aligned}

What we want is to find the $g_2$ such that $(\wp'(z))^2 - 4(\wp(z))^3 + g_2\wp(z)$ becomes analytic and thus constant, and to do that we must cancel all the poles at $0$. The $z^{-6}$ coefficient tells us to multiply $(\wp(z))^3$ by $4$. After that, we have from the $z^{-2}$ coefficient that $-24G_4 - 9\cdot 4 G_4 + g_2 = 0$, which means

$g_2 = 60G_4 = -4(e_1e_2 + e_1e_3 + e_2e_3)$.
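The coefficient bookkeeping above can be replayed mechanically. Below is a small Python sketch that multiplies the truncated Laurent series with exact rational arithmetic. The helpers `mul` and `add_scaled` and the stand-in values for $G_4$ and $G_6$ are my own choices for illustration, not something from Schlag; since the coefficient identities are polynomial in $G_4, G_6$, arbitrary values should serve as a check.

```python
from collections import defaultdict
from fractions import Fraction

def mul(a, b):
    """Multiply two truncated Laurent series stored as {exponent: coefficient}."""
    out = defaultdict(Fraction)
    for i, x in a.items():
        for j, y in b.items():
            out[i + j] += x * y
    return dict(out)

def add_scaled(a, b, scale):
    """Return the series a + scale * b."""
    out = defaultdict(Fraction)
    out.update(a)
    for j, y in b.items():
        out[j] += scale * y
    return dict(out)

# Arbitrary rational stand-ins for the lattice sums G_4 and G_6 (assumption).
G4, G6 = Fraction(7, 3), Fraction(-5, 11)

wp = {-2: Fraction(1), 2: 3 * G4, 4: 5 * G6}   # wp(z), truncated at O(z^6)
dwp = {k - 1: k * c for k, c in wp.items()}     # termwise derivative wp'(z)

# (wp')^2 - 4 wp^3 + g_2 wp with g_2 = 60 G_4
combo = add_scaled(mul(dwp, dwp), mul(wp, mul(wp, wp)), -4)
combo = add_scaled(combo, wp, 60 * G4)

# Truncation only affects powers z^2 and higher, so exponents <= 0 are exact.
low_order = {k: c for k, c in combo.items() if k <= 0 and c != 0}
print(low_order)
```

All negative powers cancel and the surviving constant term is $-140G_6$, which matches the standard relation $g_3 = 140G_6$ (a fact not derived in the text above, but consistent with (4.22)).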

Proposition 4.16.  Every $f \in \mathcal{M}(M)$ is a rational function of $\wp$ and $\wp'$. If $f$ is even, then it is a rational function of $\wp$ alone.

Proof.  Suppose that $f$ is non-constant and even. Then for all but finitely many values of $w \in \mathbb{C}_{\infty}$, the equation $f(z) - w = 0$ has only simple zeros (since there are only finitely many zeros of $f'$). Pick two such $w \in \mathbb{C}$ and denote them by $c,d$. Moreover, we can ensure that the zeros of $f - c$ and $f - d$ are distinct from the branch points of $\wp$. Thus, since $f$ is even and with $2n = deg(f)$, one has:

\begin{aligned}\{z \in M : f(z) - c = 0\} & = \{a_j, -a_j\}_{j=1}^n, \\ \{z \in M : f(z) - d = 0\} & = \{b_j, -b_j\}_{j=1}^n. \end{aligned}

The elliptic functions

$g(z) = \frac{f(z) - c}{f(z) - d}$

and

$h(z) = \displaystyle\prod_{j=1}^n \frac{\wp(z) - \wp(a_j)}{\wp(z) - \wp(b_j)}$

have the same zeros and poles which are all simple. It follows that $g = \alpha h$ for some $\alpha \neq 0$. Solving this relation for $f$ yields the desired conclusion.

If $f$ is odd, then $f/\wp'$ is even so $f = \wp'R(\wp)$ where $R$ is rational. Finally, if $f$ is any elliptic function, then

$f(z) = \frac{1}{2}(f(z) + f(-z)) + \frac{1}{2}(f(z) - f(-z))$

is a decomposition into even/odd elliptic functions whence

$f(z) = R_1(\wp) + \wp'R_2(\wp)$

with rational $R_1, R_2$ as claimed.     ▢

We conclude with the following question: given disjoint finite sets of distinct points $\{z_j\}$ and $\{\zeta_k\}$ in $M$ as well as positive integers $n_j$ for $z_j$ and $\nu_k$ for $\zeta_k$, respectively, is there an elliptic function with precisely these zeros and poles and of the given orders? In the case of $\mathbb{C}_\infty$, the answer is yes iff

$\sum_{j} n_j = \sum_{k} \nu_k, \qquad (4.23)$

since the degree must be constant throughout.

For the tori, we first observe that by the residue theorem one has

$\frac{1}{2\pi i}\oint_{\partial P} z\frac{f'(z)}{f(z)}dz = \sum_j n_jz_j - \sum_k \nu_k \zeta_k. \qquad (4.25)$

where $\partial P$ is the boundary of a fundamental region $P$ such that no zero or pole lies on the boundary. Second, comparing parallel sides of the fundamental region and using the periodicity shows that the left-hand side of (4.25) is of the form $n_1\omega_1 + n_2\omega_2$ with $n_1, n_2 \in \mathbb{Z}$ and thus equals $0$ modulo $\Lambda$; this is the condition $\sum_j n_jz_j - \sum_k \nu_k\zeta_k \equiv 0 \pmod{\Lambda}$ referred to as (4.24). (This follows from the fact that $\int_{\gamma} \frac{f'(z)}{f(z)}dz$ is the difference of logarithms of the same value, which regardless of branch must be an integer multiple of $2\pi i$.)

Now consider the edges in $\partial P$ given by $\gamma_1(t) = \{t\omega_1 | 0 \leq t \leq 1\}$ and $\gamma_2(t) = \{\omega_2 + t\omega_1 | 0 \leq t \leq 1\}$ respectively. By $\omega_2$-periodicity of $\frac{f'(z)}{f(z)}$ we infer that

$\int_{\gamma_1} z\frac{f'(z)}{f(z)}dz + \int_{\gamma_2} z \frac{f'(z)}{f(z)}dz = -\omega_2\int_{\gamma_1}d \log f(z)$.

The branch of logarithm here is irrelevant, since the arbitrary constant is differentiated away. By periodicity applied to the difference in this integral,

$\omega_2 \frac{1}{2\pi i} \int_{\gamma_1} d\log f(z) \in \omega_2 \mathbb{Z}$.

The other edge pair gives an element of $\omega_1 \mathbb{Z}$, whence (4.24).

Theorem 4.17.  Suppose (4.23) and (4.24) hold. Then there exists an elliptic function which has precisely these zeros and poles with the given orders. This function is unique up to a nonzero complex multiplicative constant.

Proof.  Listing the points $z_j$ and $\zeta_k$ expanded out with their respective multiplicities, we obtain sequences $z_j'$ and $\zeta_k'$ of the same length, say $n$. Shifting the $z_j'$s and $\zeta_k'$s by lattice elements if needed, one has

$\sum_{j=1}^n z_j' = \sum_{k=1}^n \zeta_k'$.

Take

$f(z) = \displaystyle\prod_{j=1}^n \frac{\sigma(z - z_j')}{\sigma(z - \zeta_j')}$

using the $\sigma$ in (4.20). Then

\begin{aligned} \frac{f(z+\omega_i)}{f(z)} & = & \displaystyle\prod_{j=1}^n \frac{\sigma(z-z_j' + \omega_i)}{\sigma(z - z_j')}\cdot \frac{\sigma(z - \zeta_j')}{\sigma(z - \zeta_j' + \omega_i)} \\ & = & \displaystyle\prod_{j=1}^n e^{\eta_i\left[(z - z_j' + \omega_i/2) - (z - \zeta_j' + \omega_i/2)\right]} \\ & = & e^{\eta_i\cdot 0} \\ & = & 1, \end{aligned}

which shows periodicity.    ▢

Finally, we observe how we can solve (4.22) by integrating

$\frac{d\wp(z)}{\sqrt{4(\wp(z))^3 - g_2\wp(z) - g_3}} = dz$

where we choose some branch of the root, which yields

$z - z_0 = \int_{\wp(z_0)}^{\wp(z)} \frac{d\zeta}{\sqrt{4\zeta^3 - g_2\zeta - g_3}}. \qquad (4.30)$

In other words, the Weierstrass function $\wp$ is the inverse of an elliptic integral. The integration path in (4.30) needs to be chosen to avoid the zeros and poles of $\wp'$, and the branch of the root is determined by $\wp'$.

Analogously, $\int_{w_0}^w \frac{d\zeta}{\sqrt{1 - \zeta^2}} = z - z_0$ is satisfied by $w = \sin z$ with similar restriction on the path and the choice of branch. This case though is a periodic function with one period, whereas in (4.30), there are two periods.
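The one-period analogue is easy to check numerically. The sketch below takes $w_0 = 0$, $z_0 = 0$ and the positive branch of the root (my choice of normalization), and integrates with a hand-rolled composite Simpson rule; `simpson` and `arcsine_integral` are illustrative names, not library functions.

```python
import math

def simpson(g, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; `steps` must be even."""
    h = (b - a) / steps
    total = g(a) + g(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(a + i * h)
    return total * h / 3

def arcsine_integral(w):
    """Integral of d(zeta) / sqrt(1 - zeta^2) from 0 to w, for |w| < 1."""
    return simpson(lambda s: 1.0 / math.sqrt(1.0 - s * s), 0.0, w)

# With w = sin z the integral should return z (for z in (-pi/2, pi/2)).
z = 0.7
assert abs(arcsine_integral(math.sin(z)) - z) < 1e-9
```

The restriction to $|z| < \pi/2$ here plays the role of the path and branch restrictions mentioned above: past the zeros of $\cos z$, the sign of the root has to switch.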

References

• Schlag, W., A Course in Complex Analysis and Riemann Surfaces, American Mathematical Society, 2014, pp. 153-157.

## Vector fields, flows, and the Lie derivative

Let $M$ be a smooth real manifold. A smooth vector field $V$ on $M$ can be considered as a function from $C^{\infty}(M)$ to $C^{\infty}(M)$. At every point $p \in M$, the vector field (which implicitly associates a tangent vector to every point) takes a function $f : M \to \mathbb{R}$ to some real value, which one can think of as the directional derivative of $f$ along the tangent vector at $p$. Moreover, this value varies smoothly with $p$.

Along any vector field, if we start at any point, we can trace a path along the vector field. Imagine a vector field in water based on a velocity that does not change with time. Take a point particle at a point at any time and we can deterministically predict its path both forward in time and backward in time. We call this path an integral curve, and it is easy to see that the integral curves partition the manifold: lying on the same integral curve is an equivalence relation.

On a manifold $M$, at a point with chart $(U, \varphi)$, under vector field $V$, we would have

$\frac{\mathrm{d}x^{\mu}(t)}{\mathrm{d}t} = V^{\mu}(x(t)), \qquad (1)$

where $x^{\mu}(t)$ is the $\mu$th component of $\varphi(x(t))$ and $V = V^{\mu}\partial / \partial x^{\mu}$. This is an ODE which is guaranteed to have a unique solution at least locally, and we assume for now that the parameter $t$ can be maximally extended.

If we attach the initial condition that at $t = 0$, the integral curve is at $x_0$, and denote the coordinate by $\sigma^{\mu}(t, x_0)$, then (1) becomes

$\frac{\mathrm{d}\sigma^{\mu}(t, x_0)}{\mathrm{d}t} = V^{\mu}(\sigma(t, x_0)), \qquad \sigma^{\mu}(0, x_0) = x_0^{\mu}$.

Here, $\sigma : \mathbb{R} \times M \to M$ is called a flow generated by $V$, which necessarily satisfies

$\sigma(t, \sigma(s, x_0)) = \sigma(t+s, x_0)$

for any $s, t \in \mathbb{R}$.

Within this is the structure of a one-parameter family where

(i) $\sigma_{s+t} = \sigma_s \circ \sigma_t$ or $\sigma_{s+t}(x_0) = \sigma_s(\sigma_t(x_0))$.
(ii) $\sigma_0$ is the identity map.
(iii) $\sigma_{-t} = (\sigma_t)^{-1}$.
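To make the one-parameter family concrete, here is a minimal Python sketch for one specific example of my choosing: the rotation field $V = -y\,\partial/\partial x + x\,\partial/\partial y$ on $\mathbb{R}^2$, whose flow is rotation by angle $t$. The names `flow` and `close` are illustrative.

```python
import math

def flow(t, p):
    """Flow sigma_t of V = -y d/dx + x d/dy: rotation of p by angle t."""
    x, y = p
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

def close(p, q, tol=1e-12):
    """Componentwise comparison up to floating-point error."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))

p0, s, t = (1.0, 2.0), 0.3, 1.1

# (i) sigma_{s+t} = sigma_s o sigma_t
assert close(flow(s + t, p0), flow(s, flow(t, p0)))
# (ii) sigma_0 is the identity map
assert close(flow(0.0, p0), p0)
# (iii) sigma_{-t} is the inverse of sigma_t
assert close(flow(-t, flow(t, p0)), p0)
```

Differentiating `flow(t, p)` in $t$ gives $(-y(t), x(t))$, so this is indeed a solution of (1) for that particular $V$.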

We now ask how a smooth vector field $W$ changes along a smooth vector field $V$. If our manifold were simply $\mathbb{R}^n$ (with a single identity chart, globally), we would have at any point $p$ some direction along $V$, and on an infinitesimal change along that direction, $W$ would change as well. In this case, it is easy to represent tangent vectors with indexed coordinates. Naively, we could take the displacement in $W$, divide by the amount of displacement along $V$ and take the limit. However, we have not defined addition of tangent vectors on different tangent spaces. To do so, we would need some meaningful correspondence between values on different tangent spaces. Why can we not simply do vector addition? Recall that tangent space elements are defined in terms of how they act on smooth functions from $M$ to $\mathbb{R}$ instead of directly. It is only because they are linear in themselves with respect to any given such function that we can use vectors to represent them.

We resolve this in a more general fashion by defining the induced map on tangent spaces $T_pM$ and $T_{f(p)}N$ for smooth $f : M \to N$ between manifolds. Recall that an element of a tangent space is a map $D : C^{\infty}(M) \to \mathbb{R}$ (that also satisfies the Leibniz property: $D(fg) = Df \cdot g + f \cdot Dg$). If $g \in C^{\infty}(N)$, then $g \circ f \in C^{\infty}(M)$. We define the induced map

$\Phi_{f, p} : T_p M \to T_{f(p)} N$

in the following manner. If $D \in T_p(M)$, then $\Phi_{f, p}(D) = D'$, where $D'[g] = D[g \circ f]$.

We notice how we can apply this on $\sigma_t : M \to M$ in our construction of the Lie derivative $\mathcal{L}_V W$ of a vector field $W$ with respect to vector field $V$. Since the flow is along $V$,

$\sigma_{-t}^{\mu}(p) = x^{\mu}(p) - tV^{\mu}(p) + O(t^2). \qquad (2)$

We take the induced map of $\sigma_{-t}$ at the point $\sigma_t(p)$,

$\Phi_{\sigma_{-t}, \sigma_t(p)} : T_{\sigma_t(p)} M \to T_p M$.

If $\Phi_{\sigma_{-t}, \sigma_t(p)}(W) = W'$, then by definition,

$W'[f](p) = W[f \circ \sigma_{-t}](\sigma_t(p))$.

That means

$\mathcal{L}_V W[f](p) = \left(\displaystyle\lim_{t \to 0}\frac{W'(p) - W(p)}{t}\right)[f] = \displaystyle\lim_{t \to 0}\frac{W'[f](p) - W[f](p)}{t}. \qquad (3)$

Using that by the chain rule,

$\frac{\partial}{\partial x^{\nu}}(f \circ \sigma_{-t})(\sigma_t(p)) = \frac{\partial \sigma_{-t}^{\mu}}{\partial x^{\nu}}(\sigma_t(p)) \frac{\partial f}{\partial x^{\mu}}(p)$,

we arrive at

\begin{aligned} W'[f](p) & = W^{\nu}(\sigma_t(p)) \frac{\partial}{\partial x^{\nu}}[f \circ \sigma_{-t}](\sigma_t(p)) \\ & = W^{\nu}(\sigma_t(p)) \frac{\partial \sigma_{-t}^{\mu}}{\partial x^{\nu}}(\sigma_t(p))\frac{\partial f}{\partial x^{\mu}}(p). \qquad (4) \end{aligned}

Using the power series of $\sigma_t(p)$ at $p$, we get

$W^{\nu}(\sigma_t(p)) = W^{\nu}(p) + tV^{\rho}(p) \frac{\partial W^{\nu}}{\partial x^{\rho}}(p) + O(t^2). \qquad (5)$

Moreover, by (2),

$\frac{\partial}{\partial x^{\nu}} \sigma_{-t}^{\mu}(\sigma_t(p)) = \delta_{\nu}^{\mu} - t \frac{\partial V^{\mu}}{\partial x^{\nu}}(p) + O(t^2). \qquad (6)$

Substituting (5) and (6) into (4) yields

\begin{aligned} W'[f](p) & = \left(W^{\nu}(p) + tV^{\rho}(p) \frac{\partial W^{\nu}}{\partial x^{\rho}}(p) + O(t^2)\right)\left(\delta_{\nu}^{\mu} - t \frac{\partial V^{\mu}}{\partial x^{\nu}}(p) + O(t^2)\right)\frac{\partial f}{\partial x^\mu}(p) \\ & = \left(W^{\mu}(p) + t\left(V^{\rho}(p) \frac{\partial W^{\mu}}{\partial x^{\rho}}(p) - W^{\nu}(p) \frac{\partial V^{\mu}}{\partial x^{\nu}}(p)\right) + O(t^2)\right)\frac{\partial f}{\partial x^\mu}(p) \\ & = \left(W^{\mu}(p) + t\left(V^{\nu}(p) \frac{\partial W^{\mu}}{\partial x^{\nu}}(p) - W^{\nu}(p) \frac{\partial V^{\mu}}{\partial x^{\nu}}(p)\right) + O(t^2)\right)\frac{\partial f}{\partial x^\mu}(p). \qquad (7) \end{aligned}

There is a constant term, a first order term, and an $O(t^2)$. In (3), the constant term is subtracted out, and the $O(t^2)$ contributes nothing to the limit. This means that the Lie derivative is equal to the first order term, with

$(\mathcal{L}_V W)^{\mu}(p) = V^{\nu}(p) \frac{\partial W^{\mu}}{\partial x^{\nu}}(p) - W^{\nu}(p) \frac{\partial V^{\mu}}{\partial x^{\nu}}(p). \qquad (8)$

Notice how in (4), there is $\frac{\partial f}{\partial x^{\mu}}$ that we have omitted in (8). This is because we are using $\partial/\partial x^\mu$ as the basis of the tangent vector that is applied onto $f \in C^{\infty}(M)$.

What we have in (8) is the $\mu$th component of the Lie bracket $[V,W]$, where

$[V,W]^{\mu} = V^{\nu} \frac{\partial W^{\mu}}{\partial x^{\nu}} - W^{\nu} \frac{\partial V^{\mu}}{\partial x^{\nu}}. \qquad (9)$
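
Formula (9) is easy to check numerically. The following is a minimal sketch, not part of the derivation: it assumes vector fields on $\mathbb{R}^n$ are given as plain Python functions returning component lists, and it approximates the partial derivatives by central differences. The names `partial`, `lie_bracket`, `rotation` and `translate_x` are made up for this illustration.

```python
from typing import Callable, List, Sequence

Field = Callable[[Sequence[float]], List[float]]


def partial(field: Field, p: Sequence[float], nu: int, h: float = 1e-5) -> List[float]:
    """Central-difference estimate of d(field^mu)/dx^nu at p, for every mu."""
    plus, minus = list(p), list(p)
    plus[nu] += h
    minus[nu] -= h
    return [(a - b) / (2 * h) for a, b in zip(field(plus), field(minus))]


def lie_bracket(V: Field, W: Field, p: Sequence[float]) -> List[float]:
    """Components [V, W]^mu = V^nu d_nu W^mu - W^nu d_nu V^mu at p, as in (9)."""
    n = len(p)
    Vp, Wp = V(p), W(p)
    result = [0.0] * n
    for nu in range(n):
        dW = partial(W, p, nu)  # d_nu W^mu for all mu
        dV = partial(V, p, nu)  # d_nu V^mu for all mu
        for mu in range(n):
            result[mu] += Vp[nu] * dW[mu] - Wp[nu] * dV[mu]
    return result


def rotation(p: Sequence[float]) -> List[float]:
    """V = -y d/dx + x d/dy, the generator of rotations of the plane."""
    x, y = p
    return [-y, x]


def translate_x(p: Sequence[float]) -> List[float]:
    """W = d/dx, a constant vector field."""
    return [1.0, 0.0]


# By hand: V^nu d_nu W = 0 and W^nu d_nu V = d_x(-y, x) = (0, 1),
# so [V, W] = -(0, 1), up to rounding in the finite differences.
bracket = lie_bracket(rotation, translate_x, [0.3, -1.2])
```

Since both fields here are at most linear, the central differences are exact up to floating point error, which makes this a convenient test case. For more complicated fields the step `h` would need some care.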

## Sheaves of holomorphic functions

I can sense vaguely that the sheaf is a central definition in the (superficially) horrendously abstract language of modern mathematics. There really does seem to be quite a distance between, crudely speaking, pre-1950 math and post-1950 math in the mainstream in terms of the level of abstraction typically employed. It is my hope that I will eventually accustom myself to the latter instead of viewing it as a very much alien language. It is difficult though, and there are in fact definitions which take me quite a while to grasp (by this, I mean being able to visualize them so clearly that I feel like I won’t ever forget them), which is expected given how long it has taken historically to condense to certain definitions that are golden in hindsight. In the hope of a step forward in my goal to understand sheaves, I’ll write up the associated definitions in this post.

Definition 1 (Presheaf). Let $(X, \mathcal{T})$ be a topological space. A presheaf of vector spaces on $X$ is a family $\mathcal{F} = \{\mathcal{F}(U)\}_{U \in \mathcal{T}}$ of vector spaces and a collection of associated linear maps, called restriction maps,

$\rho = \{\rho_V^U : \mathcal{F}(U) \to \mathcal{F}(V) | V,U \in \mathcal{T} \text{ and } V \subset U\}$

such that

$\rho_U^U = \text{id}_{\mathcal{F}(U)} \text{ for all } U \in \mathcal{T}$

$\rho_W^V \circ \rho_V^U = \rho_W^U \text{ for all } U,V,W \in \mathcal{T} \text{ such that } W \subseteq V \subseteq U$.

Given $U,V \in \mathcal{T}$ such that $V \subseteq U$ and $f \in \mathcal{F}(U)$ one often writes $f|_V$ rather than $\rho_V^U(f)$.

Definition 2 (Sheaf). Let $\mathcal{F}$ be a presheaf on a topological space $X$. We call $\mathcal{F}$ a sheaf on $X$ if for all open sets $U \subseteq X$ and collections of open sets $\{U_i \subseteq U\}_{i \in I}$ such that $\cup_{i \in I} U_i = U$, $\mathcal{F}(U)$ satisfies the following properties:

1. For $f, g \in \mathcal{F}(U)$ such that $f|_{U_i} = g|_{U_i}$ for all $i \in I$, we have $f = g$.    (2.1)
2. For all collections $\{f_i \in \mathcal{F}(U_i)\}_{i \in I}$ such that $f_i |_{U_i \cap U_j} = f_j |_{U_i \cap U_j}$ for all $i, j \in I$, there exists $f \in \mathcal{F}(U)$ such that $f |_{U_i} = f_i$ for all $i \in I$.    (2.2)

In more concrete terms, it is not difficult to see that (2.1) is a statement of power series about a point with radius of convergence covering $U$, and that (2.2) is a statement of analytic continuation.
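
To make the two axioms tangible, here is a toy model, and only a toy: instead of holomorphic functions it uses arbitrary functions on finite sets, stored as dictionaries whose key set plays the role of the open set. The names `restrict` and `glue` are hypothetical helpers for this sketch. Restriction is dictionary restriction, and gluing succeeds exactly when the local pieces agree on overlaps, as in (2.2).

```python
from typing import Dict, Hashable, Iterable, Optional

# A "section" over a finite set U is a dict whose keys are the points of U.
Section = Dict[Hashable, complex]


def restrict(section: Section, V: Iterable[Hashable]) -> Section:
    """Restriction map rho_V^U: keep only the values at points of V (V inside U)."""
    return {x: section[x] for x in V}


def glue(sections: Iterable[Section]) -> Optional[Section]:
    """Glue local sections as in (2.2); return None if two of them disagree on an overlap."""
    glued: Section = {}
    for s in sections:
        for x, value in s.items():
            if x in glued and glued[x] != value:
                return None  # compatibility on U_i ∩ U_j fails
            glued[x] = value
    return glued


f1 = {1: 10, 2: 20}      # section over U_1 = {1, 2}
f2 = {2: 20, 3: 30}      # section over U_2 = {2, 3}, agrees with f1 at the point 2
f3 = {2: 99, 3: 30}      # section over U_2 that disagrees with f1 at the point 2

f = glue([f1, f2])       # a section over U_1 ∪ U_2 restricting to f1 and f2
bad = glue([f1, f3])     # no gluing exists
```

In this model (2.1) holds automatically, since a dictionary is determined by its values at each point. The real content of Proposition 4 below is that the glued function is again holomorphic, which the toy model does not capture.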

Definition 3 (Sheaf of holomorphic functions $\mathcal{O}$). Let $X$ be a Riemann surface. The presheaf $\mathcal{O}$ of holomorphic functions on $X$ is made up of complex vector spaces of holomorphic functions. For all open sets $U \subseteq X$, $\mathcal{O}(U)$ is the vector space of holomorphic functions on $U$. The restrictions are the usual restrictions of functions.

Proposition 4  If $X$ is a Riemann surface, then $\mathcal{O}$ is a sheaf on $X$.

Proof. As $\mathcal{O}$ is a presheaf, it suffices to show properties (2.1) and (2.2). (2.1) follows directly from the definition of restriction of a function: if two functions agree on every set in the cover of $U$, they agree on all of $U$.

For (2.2), take some collection $\{f_i \in \mathcal{O}(U_i)\}_{i \in I}$ such that $f_i |_{U_i \cap U_j} = f_j |_{U_i \cap U_j}$ for all $i, j \in I$. For $x \in U$, define $f(x) = f_i(x)$, where $i \in I$ is such that $x \in U_i$. When $x \in U_i \cap U_j$, we have $f_i(x) = f_j(x)$ because $f_i |_{U_i \cap U_j} = f_j |_{U_i \cap U_j}$ by the assumption on the $f_i$. Therefore, $f$ is well-defined. Given any $x \in U$, there exists some $U_i$ in the cover containing $x$, and on this neighborhood $f = f_i$ is holomorphic. From this it follows that $f$ is holomorphic, which means $f \in \mathcal{O}(U)$.     ▢

Definition 5 (Direct limit of algebraic objects). Let $\langle I, \leq \rangle$ be a directed set. Let $\{A_i : i \in I\}$ be a family of objects indexed by $I$ and $f_{ij}: A_i \rightarrow A_j$ be a homomorphism for all $i \leq j$ with the following properties:

1. $f_{ii}$ is the identity of $A_i$, and
2. $f_{ik} = f_{jk} \circ f_{ij}$ for all $i \leq j \leq k$.

Then the pair $\langle A_i, f_{ij} \rangle$ is called a direct system over $I$.

The direct limit of the direct system $\langle A_i, f_{ij} \rangle$ is denoted by $\varinjlim A_i$ and is defined as follows. Its underlying set is the disjoint union of the $A_i$s modulo a certain equivalence relation $\sim$:

$\varinjlim A_i = \bigsqcup_i A_i \bigg / \sim$.

Here, if $x_i \in A_i$ and $x_j \in A_j$, then $x_i \sim x_j$ iff there is some $k \in I$ with $i \leq k, j \leq k$ such that $f_{ik}(x_i) = f_{jk}(x_j)$.

More concretely, using the sheaf of holomorphic functions on a Riemann surface, we see that here, the indices correspond to open sets with $i \leq j$ meaning $U \supset V$, and $f_{ij} : A_i \to A_j$ is the restriction $\rho_V^U : \mathcal{F}(U) \to \mathcal{F}(V)$. Two holomorphic functions defined on $U$ and $V$, represented by $x_i$ and $x_j$, are considered equivalent iff they agree when restricted to some open $W \subset V \cap U$.

Fix a point $x \in X$ and require that the open sets under consideration be neighborhoods of it. The direct limit in this case is called the stalk of $\mathcal{F}$ at $x$, denoted $\mathcal{F}_x$. For each neighborhood $U$ of $x$, the canonical morphism $\mathcal{F}(U) \to \mathcal{F}_x$ associates to a section $s$ of $\mathcal{F}$ over $U$ an element $s_x$ of the stalk $\mathcal{F}_x$ called the germ of $s$ at $x$.
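
For a concrete instance, assume we work in a local coordinate $z$ centered at $x$, so that $x$ corresponds to $z = 0$ (the coordinate is notation introduced here). Two holomorphic functions agree on some neighborhood of $0$ iff their power series at $0$ coincide, so a germ of the sheaf of holomorphic functions at $x$ is the same thing as a convergent power series:

$\mathcal{O}_x \cong \left\{\displaystyle\sum_{n \geq 0} a_n z^n : \text{the series has positive radius of convergence}\right\}$.

For example, on the plane, $\frac{1}{1-z}$ on $\mathbb{C} \setminus \{1\}$ and $\sum_{n \geq 0} z^n$ on $|z| < 1$ are sections over different open sets, but they agree on $|z| < 1$ and therefore define the same germ at $0$.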

Dually, there is the inverse limit, which in our concrete context is the more abstract language for an analytic continuation.

Definition 6 (Inverse limit of algebraic objects). Let $\langle I, \leq \rangle$ be a directed set. Let $\{A_i : i \in I\}$ be a family of objects indexed by $I$ and $f_{ij}: A_j \rightarrow A_i$ be a homomorphism for all $i \leq j$ with the following properties:

1. $f_{ii}$ is the identity of $A_i$, and
2. $f_{ik} = f_{ij} \circ f_{jk}$ for all $i \leq j \leq k$.

Then the pair $((A_i)_{i \in I}, (f_{ij})_{i \leq j \in I})$ is an inverse system of groups and morphisms over $I$, and the morphisms $f_{ij}$ are called the transition morphisms of the system.

We define the inverse limit of the inverse system $((A_i)_{i \in I}, (f_{ij})_{i \leq j \in I})$ as a particular subgroup of the direct product of the $A_i$s:

$A = \displaystyle\varprojlim_{i \in I} A_i = \left\{\left.\vec{a} \in \prod_{i \in I} A_i\; \right|\;a_i = f_{ij}(a_j) \text{ for all } i \leq j \text{ in } I\right\}$.

What we have essentially are families of holomorphic functions over open sets, and we glue them together via a direct product indexed by open sets under the restriction that there must be agreement in values at places where the open sets overlap. This gives us the space of holomorphic functions over the union of the open sets, which is of course a subring of the direct product, closed under both addition and multiplication. We have here again the common theme of patching up local pieces to create a global structure.

## Construction of Riemann surfaces as quotients

There is a theorem in Chapter 4, Section 5 of Schlag’s complex analysis text. I went through it a month ago but only half understood it. It is my hope that passing through it again, this time with a writeup, will finally shed light, now that I have studied in detail some typical examples of such Riemann surfaces. Chief among these are tori, which arise from quotienting the complex plane by lattices and whose conformal equivalence classes can be represented by the fundamental region of the modular group, as well as the surfaces arising from Fuchsian groups.

In the text, the theorem is stated as follows.

Theorem 4.12.  Let $\Omega \subset \mathbb{C}_{\infty}$ and $G < \mathrm{Aut}(\mathbb{C}_{\infty})$ with the property that

• $g(\Omega) \subset \Omega$ for all $g \in G$,
• for all $g \in G, g \neq \mathrm{id}$, all fixed points of $g$ in $\mathbb{C}_{\infty}$ lie outside of $\Omega$,
• for all $K \subset \Omega$ compact, the cardinality of $\{g \in G | g(K) \cap K \neq \emptyset\}$ is finite.

Under these assumptions, the natural projection $\pi : \Omega \to \Omega / G$ is a covering map which turns $\Omega/G$ canonically into a Riemann surface.
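
As a quick check of the hypotheses on a standard example (not taken from the text's proof), take $\Omega = \mathbb{C}$ and let $G$ be the group of translations by a lattice. The symbols $\omega_1, \omega_2 \in \mathbb{C}$, assumed linearly independent over $\mathbb{R}$, are introduced here for illustration:

$G = \{z \mapsto z + m\omega_1 + n\omega_2 \mid m, n \in \mathbb{Z}\}$.

Each translation maps $\mathbb{C}$ to itself. A nontrivial translation fixes only $\infty$, which lies outside $\Omega$. Finally, if $K \subset \mathbb{C}$ is compact and $g(K) \cap K \neq \emptyset$ for the translation $g$ by $\lambda = m\omega_1 + n\omega_2$, then $|\lambda| \leq \mathrm{diam}(K)$, which holds for only finitely many lattice points. So all three conditions are satisfied, and the theorem gives the torus $\mathbb{C}/G$ its Riemann surface structure.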

The properties essentially say that we have a Fuchsian group $G$ acting on $\Omega \subset \mathbb{C}_{\infty}$ without fixed points, excepting the identity. To show that the quotient space is a Riemann surface, we need to construct charts. For this, notice that without fixed points, and using the finiteness condition, there is for all $z \in \Omega$ a small pre-compact open neighborhood of $z$, denoted by $K_z \subset \Omega$, so that

$g(\overline{K_z}) \cap \overline{K_z} = \emptyset \qquad \forall g \in G, g \neq \mathrm{id}$.

So no two points of $K_z$ lie in the same orbit, which means the projection $\pi$ restricted to $K_z$ is injective, and therefore we can use the inverses of $\pi : K_z \to \pi(K_z)$ as charts. The $g$s, as Möbius transformations, are open maps which take the $K_z$s to open sets. In other words, $\pi^{-1}(\pi(K_z)) = \bigcup_{g \in G} g^{-1}(K_z)$ with pairwise disjoint open sets $g^{-1}(K_z)$. From this, the $\pi(K_z)$s are open sets in the quotient topology. In this scheme, the $g$s are the transition maps.

Finally, we verify that this topology is Hausdorff. Suppose $\pi(z_1) \neq \pi(z_2)$ and define for all $n \geq 1$,

$A_n = \left\{z \in \Omega | |z-z_1| < \frac{r}{n}\right\} \subset \Omega$

$B_n = \left\{z \in \Omega | |z-z_2| < \frac{r}{n}\right\} \subset \Omega$

where $r > 0$ is sufficiently small. Define $K = \overline{A_1} \cup \overline{B_1}$ and suppose that $\pi(A_n) \cap \pi(B_n) \neq \emptyset$ for all $n \geq 1$. Then for some $a_n \in A_n$ and $g_n \in G$ we have

$g_n(a_n) \in B_n \qquad \forall n \geq 1$.

Since $g_n(K) \cap K \neq \emptyset$ for every $n$, the third hypothesis leaves only finitely many possibilities for $g_n$, and one of them, say $g$, therefore occurs infinitely often. Passing to the limit $n \to \infty$ along that subsequence, we have $g(z_1) = z_2$, that is, $\pi(z_1) = \pi(z_2)$, a contradiction.

## Variants of the Schwarz lemma

Take some holomorphic self-map $f$ of the unit disk $\mathbb{D}$. If $f(0) = 0$, then $g(z) = f(z) / z$ has a removable singularity at $0$. On $|z| = r < 1$, $|g(z)| \leq 1 / r$, and applying the maximum principle and letting $r \to 1$, we derive $|g(z)| \leq 1$, that is, $|f(z)| \leq |z|$, everywhere. In particular, if $|f(z)| = |z|$ for some $z \neq 0$, then $|g|$ attains its maximum in the interior, so constancy by the maximum principle tells us that $f(z) = \lambda z$, where $|\lambda| = 1$. With the removable singularity removed, $g$ has $g(0) = f'(0)$, so $|f'(0)| \leq 1$ and, again by the maximum principle, $|f'(0)| = 1$ means $g$ is a constant of modulus $1$. Moreover, if $f$ is not an automorphism, we cannot have $|f(z)| = |z|$ for any $z \neq 0$, and in that case $|f'(0)| < 1$.
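
The statement can be sanity-checked numerically on a concrete self-map. The sketch below is only an illustration under assumptions of my choosing: it takes $f(z) = z \cdot \frac{z - a}{1 - \bar{a} z}$ with $a = 0.5 + 0.25i$, which fixes $0$, maps $\mathbb{D}$ to itself, and is not an automorphism. The names `A`, `blaschke_factor`, `f` and `sample_disk` are hypothetical.

```python
import cmath

A = 0.5 + 0.25j  # a point of the unit disk, chosen arbitrarily for this sketch


def blaschke_factor(z: complex, a: complex = A) -> complex:
    """The disk automorphism z -> (z - a) / (1 - conj(a) z)."""
    return (z - a) / (1 - a.conjugate() * z)


def f(z: complex) -> complex:
    """A holomorphic self-map of the disk with f(0) = 0 that is not an automorphism."""
    return z * blaschke_factor(z)


def sample_disk(n_r: int = 20, n_theta: int = 36) -> list:
    """Points r * exp(i theta) on a polar grid with 0 < r < 1."""
    points = []
    for i in range(1, n_r + 1):
        r = i / (n_r + 1)
        for k in range(n_theta):
            points.append(cmath.rect(r, 2 * cmath.pi * k / n_theta))
    return points


# |f(z)| / |z| = |g(z)| should stay strictly below 1 on the samples.
worst_ratio = max(abs(f(z)) / abs(z) for z in sample_disk())

# g(0) = f'(0); approximate it by the quotient f(h) / h for a tiny h.
h = 1e-8
derivative_at_zero = abs(f(h) / h)  # should be close to |a|, which is less than 1
```

Here $g$ is exactly the Blaschke factor, so $|g(z)| < 1$ inside the disk and $|f'(0)| = |a|$, consistent with the strict inequality for maps that are not automorphisms.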