Halley's root finding code analysis

Related to Generating fractals with Halley’s Method. The Halley’s Method note has an overview of the math.

The full code is at Halley’s which-root code.

An attempt to look at what’s going on with the math: Generating Fractals by Finding Roots With Halley’s Method.

Generate fractals by checking for convergence on known roots

Starting with the (poorly named) halleys_roots_fractal function.

Create a grid of complex numbers

def halleys_roots_fractal(
    f,
    df,
    ddf,
    roots,
    width=800,
    height=800,
    xmin=-2.0,
    xmax=2.0,
    ymin=-2.0,
    ymax=2.0,
    max_iter=50,
):
    # create a grid of complex numbers
    x = np.linspace(xmin, xmax, width)
    y = np.linspace(ymin, ymax, height)
    X, Y = np.meshgrid(x, y)
    Z = X + 1j * Y
    # ...

A simplified demonstration of np.meshgrid:

In [28]: x = np.array([0, 1, 2, 3, 4])

In [29]: y = np.array([0, 1, 2, 3, 4])

In [30]: X, Y = np.meshgrid(x, y)

In [31]: X
Out[31]:
array([[0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4],
       [0, 1, 2, 3, 4]])

In [32]: Y
Out[32]:
array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4]])

Starting with the 1D arrays, x and y, meshgrid creates 2 2D arrays where every possible combination of the x and y coordinates exists. meshgrid returns a tuple: (X, Y). When it’s called with X, Y = np.meshgrid(x, y), the tuple is being unpacked into the X and Y variables.

In the fractal function code, the ranges defined by x and y are used to generate a grid (2 2D arrays). The Z array is a 2D matrix of complex numbers (in the rectangular form) where the real part is set from X and the imaginary part is scaled by Y.

Create matrices to store the output

The function is going to run Halley’s Method against every point in the Z matrix and record what root the point converges on, and how many iterations it takes to converge.

    # ...
    # record the index of the root that a point converges on
    # the index used is the index from the `roots` function argument
    root_map = np.zeros((height, width), dtype=int)
    # record how many iterations it takes for a point to converge
    iter_map = np.zeros((height, width), dtype=int)
    # ...

Iterate through the grid of complex numbers

    # ...
    for i in range(height):
        for j in range(width):
            z = Z[i, j]  # the complex number
      # ...

Call Halley’s Method on a point

            # ...
            for iteration in range(max_iter):
                fz = f(z)
                dfz = df(z)
                ddfz = ddf(z)

                if abs(fz) < 1e-6:  # considered to be converged if magnitude < 1e-6
                    distances = [abs(z - root) for root in roots]
                    root_idx = np.argmin(distances)
                    root_map[i, j] = root_idx
                    iter_map[i, j] = iteration
                    break

                denominator = 2 * dfz**2 - fz * ddfz
                # avoid divide by zero errors; hasn't been called yet when testing
                if abs(denominator) < 1e-15:
                    print("Denominator approaching zero; breaking from loop")
                    break
                z = z - (2 * fz * dfz) / denominator
                # ...

fz = f(z): f is the function that roots are being found for. E.g.:

# f(z) = z^3 - 1
def f(z):
    return z**3 - 1

df is the function that calculates the first order derivative. E.g.:

def df(z):
    return 3 * z**2

ddf is the function that calculates the second order derivative. E.g.:

def ddf(z):
    return 6 * z

Refer to Halley’s Method for details about how Halley’s Method works (note that I don’t provide a proof of it there). What’s significant for this implementation is:

                if abs(fz) < 1e-6:  # considered to be converged if magnitude < 1e-6
                    distances = [abs(z - root) for root in roots]
                    root_idx = np.argmin(distances)
                    root_map[i, j] = root_idx
                    iter_map[i, j] = iteration
                    break

When abs is called on a complex number, it returns the magnitude of the number — the $r$ part (radius) of the exponential form of the number. When the radius is less than 1e-6, the function considers it to be converged.

It then finds the root (from the roots array argument) that the value of z that the fz function was called with that’s closest to:

distances = [abs(z - root) for root in roots]
root_idx = np.argmin(distances)

The root index for the current point is saved to root_map[i, j]. The number of iterations of Halley’s Method that it took to converge to the root is saved to iter_map[i, j].

That’s basically it. The rest of the function just handles the case of a point that’s failed to converge, and returns the values that are needed to render the root_map and iter_map as images:

            # ...
            else:  # for...else is legitimate Python; runs if a break statement isn't hit in the for loop
                # didn't converge
                root_map[i, j] = -1
                iter_map[i, j] = max_iter

    return root_map, iter_map, x, y

Supplying the function arguments

def halleys_roots_fractal(
    f,
    df,
    ddf,
    roots,
    width=800,
    height=800,
    xmin=-2.0,
    xmax=2.0,
    ymin=-2.0,
    ymax=2.0,
    max_iter=50,
):

f, df, and ddf are functions. E.g.:

# f(z) = z^3 - 1
# the function to find the roots for
def f(z):
    return z**3 - 1

# the first order derivative
# e.g.: if f(x) = x^5; f'(x) = 5x^(5-1)
def df(z):
    return 3 * z**2

# the second order derivative
# e.g.: if f(x) = 3x^2; f'(x) = 3*2x^(2-1)
def ddf(z):
    return 6 * z

Finding the roots of function f

The need to find the roots is the annoying part of this implementation. Annoying in the sense that it makes it difficult to just quickly test a bunch of functions. The roots for f(z) = z^3 - 1 are:

roots = [1, np.exp(2j * np.pi / 3), np.exp(4j * np.pi / 3)]

Go to the Roots of f(x) = x^3 -1 section for more details, or have a look at these links:

Setting the size and coordinates of the complex plane

The xmin, xmax, ymin, and ymax arguments set the size and position of the complex plane that Halley’s Method is run against. E.g., xmin=-1000, xmax=1000, ymin=-1000, ymax=1000 will generate a plane with the real (x) axis in the range (-1000, 1000) and the imaginary (y) axis in the range (-1000, 1000). The center of the plane will be the point (0, 0).

The height and width arguments set the actual number of points that will exist in the range set by the xmin/xmax, ymin/ymax arguments.

Calling the function and rendering images with PyPlot

# f(z) = z^3 - 1
def f(z):
    return z**3 - 1


def df(z):
    return 3 * z**2


def ddf(z):
    return 6 * z


roots = [1, np.exp(2j * np.pi / 3), np.exp(4j * np.pi / 3)]


root_map, iter_map, xs, ys = halleys_roots_fractal(
    af,
    df,
    ddf,
    alt_roots,
    xmin=-1000,
    xmax=1000,
    ymin=-1000,
    ymax=1000,
)

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 6))

# map which roots are selected for the complex points
im1 = ax1.imshow(
    root_map, extent=(xs[0], xs[-1], ys[0], ys[-1]), cmap="viridis", origin="lower"
)
ax1.set_title("Basins of attraction")
ax1.set_xlabel("Real")
ax1.set_ylabel("Imaginary")
cbar1 = plt.colorbar(im1)  # could use cbar1.set_label("some label") to set a label

# map how many iterations it took each point to converge on its root
im2 = ax2.imshow(
    iter_map, extent=(xs[0], xs[-1], ys[0], ys[-1]), cmap="viridis", origin="lower"
)
ax2.set_title("Convergence speed")
ax2.set_xlabel("Real")
ax2.set_ylabel("Imaginary")
cbar2 = plt.colorbar(im2)

plt.tight_layout()
plt.show()

The full source code is at Halley’s Which Root Code

Roots of f(x) = x^3 - 1

Starting with the function f(x) = x^3 - 1. The roots will be x^3 = 1

First guess: There’s a real root of 1: 1 * 1 * 1 = 1, that is also the complex number 1 + 0i.

Systematic approach:

Express 1 in the exponential form $r\cdot e^{i\theta}$

  • r (magnitude): 1
  • \theta (angle from the positive real axis): $0\pi$

So 1 can be written in the exponential form as $1e^{(i\cdot0\pi)}$, or more simply as $e^{(i\cdot0)}$

But the trick is that $0\pi$ is the same as $2\pi$, because $e^{(i\theta)}$ has a period of $2\pi$. So:

  • $1 = e^{(i\cdot0)}$
  • $1 = e^{(i\cdot2\pi)}$
  • $1 = e^{(i\cdot4\pi)}$

In general, $1 = e^{(i\cdot\pi k)}$ for any integer $k$.

What actually gets solved for is:

$$ x^3 = e^{(i\cdot2\pi k)} $$$$ x = e^{(i\cdot2\pi k/3)} $$

By convention, use $k = 0, 1, 2,\dots,n-1$ to generate $n$ distinct roots. For the cubic roots, $k=3$.

In the code below, the 1 root comes from k=0; e^0 = 1; the np.exp(2j * np.pi/3 comes from k=1 (the 2 in 2j comes from 1*2); in the last root, the 4 in 4j comes from k=2; 2*2=4:

roots = [1, np.exp(2j * np.pi / 3), np.exp(4j * np.pi / 3)]

Roots of f(x) = x^3 - 0.1

See Solving polynomial equations.

roots = [0.1**(1.0/3), 0.1**(1.0/3) * np.exp(2j * np.pi / 3), 0.1**(1.0/3) * np.exp(2j * np.pi / 3)]
Tags: