Arzela-Ascoli theorem and delta epsilons

I always like to think of understanding of the delta epsilon definition of limit as somewhat of an ideal dividing line on the cognitive hierarchy, between actually smart and pseudo smart. I still remember vividly struggling to grok that back in high school when I first saw it junior year, though summer after, it made sense, as for why it was reasonable to define it that way. That such was only established in the 19th century goes to show how unnatural such abstract precise definitions are for the human brain (more reason to use cognitive genomics to enhance it 😉 ). At that time, I would not have imagined easily that this limit definition could be generalized further, discarding the deltas and epsilons, which presumes and restricts to real numbers, as it already felt abstract enough. Don’t even get me started on topological spaces, nets, filters, and ultrafilters; my understanding of them is still cursory at best, but someday I will fully internalize them.

Fortunately, I have improved significantly since then, both in terms of experience and in terms of my biological intelligence, that last night, I was able to reconstruct in my head the proof of the Arzela-Ascoli theorem, which also had been once upon a time mind-boggling. Remarkably, that proof, viewed appropriately, comes naturally out of just a few simple, critical observations.

The statement of the theorem is as follows.

Arzela-Ascoli theorem Let F be a family of functions from I = [a,b] to \mathbb{R} that are uniformly bounded and equicontinuous. Then there is a sequence of f_n of elements in F that converges uniformly in I.

The rationals are dense in the reals and make an excellent starting point. Uniform boundedness enables us employ Bolzano-Weierstrass to construct a sequence of functions convergent at any rational in I. With a countable number of such rationals, we can iteratively filter this sequence to construct one that converges at every rational in I. Essentially, we have an enumeration of the rationals in I and a chain of subsequences based on that, where in the nth subsequence is convergent at the first n rationals in the enumeration, and Bolzano-Weierstrass is applied onto the results of the application of the functions of that subsequence on the n+1th rational to yield another subsequence. Take, diagonally, the nth function in the nth subsequence, to derive our desired uniformly convergent sequence, which we call \{f_i\}.

To show uniform convergence, it suffices to show uniform Cauchyness, namely that for any \epsilon > 0, there is an N such that n,m > N implies |f_m(x) - f_n(x)| < \epsilon for all x \in I. By compactness, open neighborhoods of all rationals of I, as an open cover, has a finite subcover. Each element of the subcover comes from some rational of I and across that finite subset of I we can for any \epsilon > 0 take the max of all the Ns for convergence. This means that so long as our neighborhoods are sufficiently small, we can for any point x \in I have some x' that is the point of focus of the neighborhood of our finite subcover containing x and thereby connect f_m(x) to f_m(x') by equicontinuity and use our maximum N (over finite elements) to connect that to f_n(x') and use equicontinuity again to connect that to f_n(x). Thus, triangle inequality over three of \epsilon/3 suffices.

More explicitly, equicontinuity-wise, we have for every x some open neighborhood of U_x such that s,t \in U_x implies that |f(s) - f(t)| < \epsilon.