## Newton's method for finding the zeros of a function

Let's imagine you have a function $y=f(x)$ and you want to find the zeros. That means find the value for $x$ such that $f(x)=0$. Newton's method is a very good way to do this providing that the function $f(x)$ is "well behaved" (more on what that means below). But first we should ask: why would we want to do this?

To answer that, imagine you have some kind of complex function of the variable $x$ that you want to solve, for example: $$7x^3 - 11\ln x + 12x = 0\nonumber$$ That function is certainly not solvable analytically. So you can solve it numerically by guessing a value for $x$, seeing what the left side comes to, and adjusting so that it converges to zero. Newton knew calculus, so what he did was to put this problem into another form and solve it that way, using calculus. So let's rewrite the function as $y=f(x)$: $$y = 7x^3 - 11\ln x + 12x\nonumber$$ and plot it for $0.05\le x\le 1.05$ as in the plot below.

$x_{min}$    $x_{max}$
$y_{min}$    $y_{max}$
Precision:
You can see that the function crosses the $x$ ($y=0$) at around $0.5$. Voila, you have solved the equation $7x^3 - 11\ln x + 12x = 0$ for $x\sim 0.5$. Of course the value of $0.5$ is just a guess by eye, in fact it looks like the line crosses at a poing slightly greater than $0.5$, so you could then blow up the plot, say plot it between $0.4$ and $0.6$, and keep iterating the blowup until you are ok with the uncertainty. Just change the values in the text windows and hit return and it will update. Note that as you get closer to the right value you will have to increase the precision.

Or, you could use Newton's method, using calculus ideas, and do it algorithmically. Here's how. The plot below shows the function $(x-a)^2 + (y-b)^2 = r^2$ where the circle is centered at $(a, b)$ and has radius r$$, and all are constants. The center of the circle is not shown (it's off the page), and the angle \theta goes between 160^\circ and 90^\circ. We want to find where it crosses the x axis, which means finding the zeros of y = b +\sqrt{r^2 - (x-a)^2} (remember, b is negative so we want the positive square root). So how do we find the value for the point x_0 that gives us y=0, where the blue curve crosses the x axis? The first thing you do is pick a value for x that you think might be close to x_0. Call this point x_1, and let's say it's the blue open circle in the drawing below. Since we know what y=f(x) is, then we know the point on the curve at x_1 is y(x_1). We also know the slope m of the straight line tangent to the curve at (x_1,y(x_1)), which we get from differentiating, so$$\begin{align} m(x_1,y(x_1)) &= \frac{dy}{dx}\big|_{x_1}\nonumber\\ &= -\frac{x-a}{\sqrt{(x-a)^2 - r^2}}\big|_{x_1}\nonumber\\ &=-\frac{x_1-a}{\sqrt{(x_1-a)^2 - r^2}}\nonumber \end{align}$$The straight line hits the x axis at a point (x_2,0), so by the definition of the slope (rise over the run), we have$$m = \frac{y(x_1)}{x_1-x_2}\nonumber$$Solving for x_2 gives$$x_2 = x_1 - \frac{y(x_1)}{m} = x_1 - \frac{y(x_1)}{y'(x_1)}\nonumber$$This is the heart of Newton's method for approximating zeros of function. Another way to see it, imagine we expanded y=f(x) around the point x_1 using a Taylor expansion, and just kept the first 2 terms:$$y(x)\big|_{x_1} \sim f(x_1) + f'(x_1)(x-x_1)\nonumber$$Now, if we evaluate that function at x_2, and propose that f(x_2) is a zero of the function, then we would have$$0 = f(x_1) + f'(x_1)(x_1-x_2)\nonumber$$Solving for x_2 gives$$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}\label{en}$$Click the blue circlec at x_1. It will find the point f(x_1), calculate the slope there using the derivative, draw it as a straight line to the x axis (y=0), and will show you the new point, x_2. Then, click on x_2, and it will find the next point x_3, and so on. You will see that each time you iterate, it gets closer and closer to the real zero, which will be at the analytic value given below. You can see how fast this converges.  Analytic solution: Iteration 1 Estimated zero is at Distance from analytic is #### Convergence So how fast does Newton's method actually converge, and can we quantify that? First we have to understand we mean by convergence is this: let's say that the point x=a is a zero of the function, and we want to find the value of a. Each successive x_n will get, hopefully, closer and closer to a. Let's define \epsilon_n\equiv a-x_n. So by convergence we want to know how \epsilon_{n+1} is related to \epsilon_n. To see that, let's keep the next term in f''(x) in the Taylor expansion and expand around the root a by an amount x_n, where x_n is the nth try in the iterative method here:$$y(a)\big|_{x_n} \sim f(x_n) + f'(x_n)(a-x_n) + \half f''(x_n)(a-x_n)^2\label{en2}$$Dividing by f'(x_n) and rearranging we get$$\frac{f(x_n)}{f'(x_n)} + (a-x_n) = -\frac{f''(x_n)}{2f'(x_n)}(a-x_n)^2\nonumber$$Now we can take equation \ref{en}, let x_1 be x_n and x_2 be x_{n+1} and write it as$$\frac{f(x_n)}{f'(x_n)} = x_n - x_{n+1} \nonumber$$and substitute it into equation \ref{en2}. This gives us$$(x_n-x_{n+1})-(a-x_n) = -\frac{f''(x_n)}{2f'(x_n)}(a-x_n)^2\nonumber$$Collecting terms (notice that x_n drops out of the left hand side) and using the defintion for \epsilon_n gives:$$\epsilon_{n+1} = -\frac{f''(x_n)}{2f'(x_n)}\epsilon_n^2\label{econ}$$This shows that each successive \epsilon_{n+1} converges quadratically with how cloes the previous \epsilon_n got. Note that if your function is not analytic, you can still use Newton's method for finding the zeros by simply calculating the derivative f'(x_n) numerically. #### Conditions The conditions for Newton's approximation to work are • The first "guess", x_1 should be "close enough". This will apply to functions that have more than 1 place where the derivative is zero. It also means that if your guess is too far from the final answer, depending on the function you won't converge. • If the first guess is at a place where the derivative is already zero, then the method doesn't work since at that point you will be dividing by zero in the estimate formula above. • The derivatives have to exist and be continuous for convergence. If the 1st derivative at the root happens to be zero, then it might still converge but it won't be quadratically • 2nd derivatives have to exist, and just like 1st derivatives at the root, if the 2nd derivative there happens to be zero, it also won't converge quadratically #### Minimization Newton's method can be used for finding minimizations of a function, that is, find x_n so that f(x_n) is a minimum. This is easy to do, because we know from calculus that the minimum (or maximum, and we call the minimum or maximum an extrenum) of a function f(x) occurs where the derivative is zero. So to build on Newton's method for finding the zeros to find the minimum, we can go back to the Taylor expansion in equation \ref{eqn}, and pick a value for x_n (starting at n=1 of course) that we think might be close to the extremum x_m, and expand f(x) around x_n:$$f(x) \sim f(x_n) + f'(x_n)(x-x_n) + \half f''(x_n)(x-x_n)^2\nonumber$$and take the derivative (and remember, f'(x_n) is a constant and not a function of x)$$f'(x) = f'(x_n) + f''(x_n)(x-x_n)\nonumber$$If we evaluate this function at the extremum x=x_m where the derivative is zero, we get$$x_m = x_n - \frac{f'(x_n)}{f''(x_n)}\nonumber$$And just as we did above, the left hand side is usually not x_m (unless the initial guess for x_1 was damn close to being correct) but will then be used as the next guess, x_{n+1}, which gives us the iterative formula for finding the minimum (or maximum) of a function as$$x_{n+1} = x_n - \frac{f'(x_n)}{f''(x_n)}\label{eqmin} In the figure below, the small solid black dot shows where the true minimum is. You can click on the open circle and successive interations. The blue line is a polynomial $f(x)$ up to $x^5$, the light dashed yellow line is the slope $f'(x)$. It also converges pretty fast. But you have to be "close to" the true answer. You can drag the initial $x_1$ along the $x$ axis and see for yourself what "close to" means - it means that the slope of the 1st derivative (the 2nd derivative) has to point in the direction of the true minimum. And of course you can't start where the second derivative is 0 (where the slope of the 1st derivative is 0) because then the slope won't intersect the $x$ axis at all.

 x y Function minimum Iteration 1 Distance from $x_{min}$