# Chapter 2

## Steepest Descent

Although adjustment of each coefficient individually works well when using orthogonal functions,  there could be some difficulty with numerical precision in cases when the error measure is dominated by errors in coefficients other than the one that is being adjusted. Additionally, the contribution to the error of as yet unadjusted coefficients can make convergence quite slow.

When not employing orthogonal functions for approximation and when simply optimizing coefficients to minimize cost, maximize efficiency, and the like, convergence can be impossible when adjusting one coefficient at a time.

 Imagine a mountain valley in the Canadian Rockies that is surrounded by mountains.  You are well up the mountain slope, the ground is very rough and snow covered.  Snow is falling heavily and you can only just see your feet.  You want to reach your canoe on the shallow lake in the low point of the valley as soon as possible.  What do you do? You could put one foot forward and if it felt lower in that direction, take further steps.  If it felt higher then you might turn around and step in the opposite direction. You could repeat the foregoing until neither forward nor reverse steps felt lower.  Then you might make a 90-degree turn and repeat the process.  Hopefully you will have reached your objective when  neither forward nor backward steps lower your position.  Are you at the water's edge or in the water?  If you are neither, then your 90-degree turn could have been less than precise. You might consider repeating the entire process, albeit you could die from exposure before reaching your objective. (You might have been spiraling around the valley following a very small downward slope.) As an alternative, consider moving one foot forward and then to the back and noting which of these lowered your position the most and by how much. Then turn 90 degrees and repeat the foregoing.  Now take a step in a direction that lies between the two best directions and favours, proportionately, the direction that lowered your position the most, the direction of steepest decent. It will be intuitive that the method of steepest decent offers the best chance of reaching the lake before dying of exposure. But there are other problems. Too large an initial step could take you into the next valley or across the valley to the opposite slope.  Steps that are too small can fail to guide you out of small depressions, called local minima.

Additionally, when adjusting only one coefficient at a time, and in situations where the effects of coefficient adjustments are not fully independent, it can be impossible to reach the objective.  In such cases adjusting all coefficients at once using steepest decent will often succeed.

The efficacy of steepest decent coefficient adjustment will now be compared with the one at a time process that was used in the previous treatment of approximating a semicircle with sine functions.

First determine the amount that the squared error changes for a small change in each of the 17 coefficients. These amounts indicate the sensitivity of the error to adjustment of the coefficient.  Then adjust all coefficients in proportion to the negative of their sensitivities, negative because we wish to reduce error rather than increase it.  Continue these adjustments until the reduction in the error that results is less than a predetermined amount, in our case 0.00000001.

Recall that the error that resulted when using the one at a time adjustments was 0.008464.  Also, some coefficients required 81 steps of adjustment.  The fewest steps required for a coefficient were 54. Employing steepest decent, with the aid of a macro (shown later), only 8 steps were required. The results for the initial and first two of the eight steps are shown in the graphs that follow. The final error was again 0.008464.

The sensitivities for the17 coefficients, for each step, as found by the macro are seen next.

There is no separate need to adjust the step sizes since they are maintained in proportion to the continually reducing sensitivities.

Next view the increments made to the coefficients in each step and their final values.

The final values consist of the initial values plus the increments calculated in each step.

The Macro follows.

The spreadsheet Excel is a registered trademark of Microsoft Corporation. Excel 2000 used herein is Copyright by Microsoft Corporation. Excel contains a tool called Solver that is used for iterative optimization.  Solver and the foregoing macro found exactly the same coefficients for the sinusoidal approximation to the semicircle.

Other uses for iterative optimization include solving sets of linear equations, locating roots of polynomials, solving sets of differential equations and more.

Use Solver to solve a few linear equations of your own choosing. (Hint! Assign initial values to the unknown coefficients. Next, let Solver adjust the assigned values until the error between the results given by the equations and the known values becomes small.)

It is said that one's understanding of mathematical methods is greatly advanced by doing.  Try to make some improvements to the given macro.  It is possible.

### Next

The spreadsheet is used to find a close approximation to e, the base of Naperian logarithms.  An application of e is then explored.

 Top Previous Topic Next Topic Topics