# Implicit function theorem and its multivariate generalization

The implicit function theorem for a single output variable can be stated as follows:

Single equation implicit function theorem. Let $F(\mathbf{x}, y)$ be a function of class $C^1$ on some neighborhood of a point $(\mathbf{a}, b) \in \mathbb{R}^{n+1}$. Suppose that $F(\mathbf{a}, b) = 0$ and $\partial_y F(\mathbf{a}, b) \neq 0$. Then there exist positive numbers $r_0, r_1$ such that the following conclusions are valid.

a. For each $\mathbf{x}$ in the ball $|\mathbf{x} - \mathbf{a}| < r_0$ there is a unique $y$ such that $|y - b| < r_1$ and $F(\mathbf{x}, y) = 0$. We denote this $y$ by $f(\mathbf{x})$; in particular, $f(\mathbf{a}) = b$.

b. The function $f$ thus defined for $|\mathbf{x} - \mathbf{a}| < r_0$ is of class $C^1$, and its partial derivatives are given by $\partial_j f(\mathbf{x}) = -\frac{\partial_j F(\mathbf{x}, f(\mathbf{x}))}{\partial_y F(\mathbf{x}, f(\mathbf{x}))}$.

Proof. For part (a), assume without loss of generality positive $\partial_y F(\mathbf{a}, b)$. By continuity of that partial derivative, we have that in some neighborhood of $(\mathbf{a}, b)$ it is positive and thus for some $r_1 > 0, r_0 > 0$ there exists $f$ such that $|\mathbf{x} - \mathbf{a}| < r_0$ implies that there exists a unique $y$ (by intermediate value theorem along with positivity of $\partial_y F$) such that $|y - b| < r_1$ with $F(\mathbf{x}, y) = 0$, which defines some function $y = f(\mathbf{x})$.To show that $f$ has partial derivatives, we must first show that it is continuous. To do so, we can let $r_1$ be our $\epsilon$ and use the same process to arrive at our $\delta$, which corresponds to $r_0$.

For part (b), to show that its partial derivatives exist and are equal to what we desire, we perturb $\mathbf{x}$ with an $\mathbf{h}$ that we let WLOG be $\mathbf{h} = (h, 0, \ldots, 0)$.

Then with $k = f(\mathbf{x}+\mathbf{h}) - f(\mathbf{x})$, we have $F(\mathbf{x} + \mathbf{h}, y+k) = F(\mathbf{x}, y) = 0$. From the mean value theorem, we can arrive at $0 = h\partial_1F(\mathbf{x}+t\mathbf{h}, y + tk) + k\partial_y F(\mathbf{x}+t\mathbf{h}, y+tk)$

for some $t \in (0,1)$. Rearranging and taking $h \to 0$ gives us $\partial_j f(\mathbf{x}) = -\frac{\partial_j F(\mathbf{x}, y)}{\partial_y F(\mathbf{x}, y)}$.

The following can be generalized to multiple variables, with $k$ implicit functions and $k$ constraints.     ▢

Implicit function theorem for systems of equations. Let $\mathbf{F}(\mathbf{x}, \mathbf{y})$ be an $\mathbb{R}^k$ valued functions of class $C^1$ on some neighborhood of a point $\mathbf{F}(\mathbf{a}, \mathbf{b}) \in \mathbb{R}^{n+k}$ and let $B_{ij} = (\partial F_i / \partial y_j)(\mathbf{a}, \mathbf{b})$. Suppose that $\mathbf{F}(\mathbf{x}, \mathbf{y}) = \mathbf{0}$ and $\det B \neq 0$. Then there exist positive numbers $r_0, r_1$ such that the following conclusions are valid.

a. For each $\mathbf{x}$ in the ball $|\mathbf{x} - \mathbf{a}| < r_0$ there is a unique $\mathbf{y}$ such that $|\mathbf{y} - \mathbf{b}| < r_1$ and $\mathbf{F}(\mathbf{x}, \mathbf{y}) = 0$. We denote this $\mathbf{y}$ by $\mathbf{f}(\mathbf{x})$; in particular, $\mathbf{f}(\mathbf{a}) = \mathbf{b}$.

b. The function $\mathbf{f}$ thus defined for $|\mathbf{x} - \mathbf{a}| < r_0$ is of class $C^1$, and its partial derivatives $\partial_j \mathbf{f}$ can be computed by differentiating the equations $\mathbf{F}(\mathbf{x}, \mathbf{f}(\mathbf{x})) = \mathbf{0}$ with respect to $x_j$ and solving the resulting linear system of equations for $\partial_j f_1, \ldots, \partial_j f_k$.

Proof: For this we will be using Cramer’s rule, which is that one can solve a linear system $Ax = y$ (provided of course that $A$ is non-singular) by taking matrix obtained from substituting the $k$th column of $A$ with $y$ and letting $x_k$ be the determinant of that matrix divided by the determinant of $A$.

From this, we are somewhat hinted that induction is in order. If $B$ is invertible, then one of its $k-1 \times k-1$ submatrices is invertible. Assume WLOG that such applies to the one determined by $B^{kk}$. With this in mind, we can via our inductive hypothesis have $F_1(\mathbf{x}, \mathbf{y}) = F_2(\mathbf{x}, \mathbf{y}) = \cdots = F_{k-1}(\mathbf{x}, \mathbf{y}) = 0$

determine $y_j = g_j(\mathbf{x}, y_k)$ for $j = 1,2,\ldots,k-1$. Here we are making $y_k$ an independent variable and we can totally do that because we are inducting on the number of outputs (and also constraints). Substituting this into the $F_k$ constraint, this reduces to the single variable case, with $G(\mathbf{x}, y_k) = F_k(\mathbf{x}, \mathbf{g}(\mathbf{x}, y_k), y_k) = 0$.

It suffices now to show via our $\det B \neq 0$ hypothesis that $\frac{\partial G}{\partial y_k} \neq 0$. Routine application of the chain rule gives $\frac{\partial G}{\partial y_k} = \displaystyle\sum_{j=1}^{k-1} \frac{\partial F_k}{\partial y_j} \frac{\partial g_j}{\partial y_k} + \frac{\partial F_k}{\partial y_k} = \displaystyle\sum_{j=1}^{k-1} B^{kj} \frac{\partial g_j}{\partial y_k} + B^{kk}. \ \ \ \ (1)$

The $\frac{\partial g_j}{\partial y_k}$s are the solution to the following linear system: $\begin{pmatrix} \frac{\partial F_1}{\partial y_1} & \dots & \frac{\partial F_1}{\partial y_{k-1}} \\ \; & \ddots \; \\ \frac{\partial F_{k-1}}{\partial y_1} & \dots & \frac{\partial F_{k-1}}{\partial y_{k-1}} \end{pmatrix} \begin{pmatrix} \frac{\partial g_1}{\partial y_k} \\ \vdots \\ \frac{\partial g_{k-1}}{\partial y_k} \end{pmatrix} = \begin{pmatrix} \frac{-\partial F_1}{\partial y_k} \\ \vdots \\ \frac{-\partial F_{k-1}}{\partial y_k} \end{pmatrix}$.

Let $M^{ij}$ denote the $k-1 \times k-1$ submatrix induced by $B_{ij}$. We see then that in the replacement for Cramer’s rule, we arrive at what is $M^{kj}$ but with the last column swapped to the left $k-j-1$ times such that it lands in the $j$th column and also with a negative sign, which means $\frac{\partial g_j}{\partial y_k}(\mathbf{a}, b_k) = (-1)^{k-j} \frac{\det M^{jk}}{\det M^{kk}}$.

Now, we substitute this into $(1)$ to get \begin{aligned}\frac{\partial G}{\partial y_k}(\mathbf{a}, b_k) &= \displaystyle_{j=1}^{k-1} (-1)^{k-j}B_{kj}\frac{\det M^{kj}}{\det M^{kk}} + B_kk \\ &= \frac{\sum_{j=1}^k (-1)^{j+k} B_{kj}\det M^{kj}}{\det M^{kk}} \\ &= \frac{\det B}{\det M^{kk}} \\ &\neq 0. \end{aligned}

Finally, we apply the implicit function theorem for one variable for the $y_k$ that remains.     ▢

References

• Gerald B. Folland, Advanced Calculus, Prentice Hall, Upper Saddle River, NJ, 2002, pp. 114–116, 420–422.