Thursday, February 28, 2008

Feb 29 2008

MAIN POINTS
The reading first defines "inconsistent" systems of equations, which have no solution. These are probably extremely common in scientific computation, because real-world data (and data represented inside computers) rarely give coefficients for which more than two lines pass through the same point; a minuscule error in a coefficient is enough to throw the intersection off. The reading writes a system in vector form, then reminds us that we can calculate the shortest distance from a point to a plane using orthogonality, which gives a dot product of 0. This is used to derive an equation for finding the least-squares solution of any system.
Section 4.1.2 generalizes this equation for the specific purpose of fitting a curve to data.
The solution follows the same steps. Fortunately, the text gives a MATLAB command to calculate least squares.
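The MATLAB one-liner is in the book, but to make sure I follow the normal-equations idea (solve A^T A c = A^T y), here is a quick sketch in Python/numpy with made-up data; both the hand-built normal equations and the library's least-squares routine should give the same line.

    # Fit the line y = c1 + c2*x to four invented points by least squares.
    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([1.1, 1.9, 3.2, 3.8])           # made-up data

    A = np.column_stack([np.ones_like(x), x])    # each row is [1, x_i]

    # Normal equations: (A^T A) c = A^T y
    c_normal = np.linalg.solve(A.T @ A, A.T @ y)

    # Same answer from numpy's built-in least-squares routine
    c_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

    print(c_normal)
    print(c_lstsq)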

The second half of the reading takes up an example we've touched on in class: periodic temperature. This leads to a large number of data points being punched into the formula. Then it covers exponential growth, going on to Moore's law and the concentration of chemicals in the bloodstream... a hearty dose of examples.
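To see how the periodic model drops into the same machinery, here is a rough sketch (the monthly temperatures below are invented, not the book's data): the columns of A are just 1, cos, and sin evaluated at the sample times, and everything after that is the same least-squares solve as before.

    # Fit T(t) = c1 + c2*cos(2*pi*t) + c3*sin(2*pi*t) to twelve monthly values.
    import numpy as np

    t = np.arange(12) / 12.0                      # time in years
    T = np.array([-3, -1, 4, 10, 16, 21, 24, 23, 18, 11, 5, 0], dtype=float)

    A = np.column_stack([np.ones_like(t),
                         np.cos(2 * np.pi * t),
                         np.sin(2 * np.pi * t)])
    c, *_ = np.linalg.lstsq(A, T, rcond=None)
    print("mean, cos, sin coefficients:", c)

For the exponential examples (Moore's law, drug concentration), I think the same trick works after taking logarithms, since log(y) = log(c) + k*t is linear in the unknowns.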

CHALLENGES
I see that there is equation 4.6, but I don't understand what the operation A^T is doing: it turns A on its side, but what are the implications of that? Also, I don't get how they solve for the final least-squares solution with Gaussian elimination (at the top of page 197). Because of this, I still don't really get the process of boiling this large system down to a least-squares solution, especially when dealing with curves that aren't lines. I would like to see an example calculation before we try this in MATLAB!
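In the meantime, here is a toy calculation of the same shape that I tried (three made-up points, not the book's page-197 example):

    # Fit y = a + b*x to the invented points (0, 1), (1, 2), (2, 2).
    # The design matrix A has rows [1, x_i]; the right-hand side is y.
    # The normal equations A^T A [a, b]^T = A^T y come out to
    #     3a + 3b = 5
    #     3a + 5b = 6
    # Gaussian elimination (row 2 minus row 1) gives 2b = 1, so b = 1/2,
    # and back-substitution gives a = (5 - 3/2)/3 = 7/6.
    import numpy as np

    A = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
    y = np.array([1.0, 2.0, 2.0])
    print(np.linalg.solve(A.T @ A, A.T @ y))      # approx [1.1667, 0.5]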

REFLECTIONS
The fact that models such as sines and exponential functions can work inside this matrix model for best fit is interesting. I've used simple equations for regression lines in high school biology and chemistry, and generated them automatically in Excel for college chemistry, without ever understanding the reasoning behind them. This could be very useful in robotics, for estimating things like movement and vision where only a limited number of discrete data points is available. The text also points out that it is useful for compression, and I'm curious about what kinds of compression use these ideas.

Tuesday, February 19, 2008

Feb 20 2008

MAIN POINTS
The text defines special arithmetic operations that act within the limits of floating-point numbers under the IEEE standard. Tables 1.2 and 1.3 show the difference between applying the operations to numbers that are far apart and to numbers that are very close; when the two are close, as in Table 1.3, the relative error is much higher. Example 5 stresses how important the formulation of the problem is: simply posing it as an addition rather than a subtraction can have a huge effect on the number of significant digits. The final example uses nesting as a way to minimize error.
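Two quick experiments of my own (made-up numbers, not the book's tables) that show both effects:

    import math

    # 1) Cancellation: subtracting nearly equal numbers loses significant digits.
    #    Both lines below compute the small root of x^2 - 1e8*x + 1 = 0.
    a, b, c = 1.0, -1e8, 1.0
    d = math.sqrt(b * b - 4 * a * c)
    x_bad = (-b - d) / (2 * a)       # subtracts two nearly equal numbers
    x_good = (2 * c) / (-b + d)      # reformulated as an addition
    print(x_bad, x_good)             # x_bad has lost most of its digits

    # 2) Nesting (Horner's form): the same polynomial with fewer operations,
    #    which is the route the reading takes to smaller rounding error.
    x = 3.7
    p_plain = 2*x**3 - 3*x**2 + 0.5*x + 1
    p_nested = ((2*x - 3)*x + 0.5)*x + 1
    print(p_plain, p_nested)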

CHALLENGES
I'm assuming it's a typo when they say on p.25 that x^3=x^3*x. I feel like it would be challenging to keep track of errors as they propagate through lengthy calculations, if we weren't given an exact value to compare to. It would also be challenging to keep track of all of this while implementing it in software.

REFLECTION
The book truncates numbers to a certain number of base-10 digits. However, isn't this inconsistent with how the numbers are represented within the computer? They would have to be truncated to a certain number of base-2 digits. This probably affects the measure of relative error very slightly.
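A two-line check of this in Python:

    # 0.1 has no exact base-2 representation, so the stored value is already
    # slightly off before any base-10 chopping like in the book's examples.
    print(f"{0.1:.20f}")          # 0.10000000000000000555
    print((0.1 + 0.2) == 0.3)     # False, for the same reason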

It's interesting that rounding produces a slightly greater error than truncation in Example 6.

Sunday, February 17, 2008

Feb 18 2008

MAIN POINTS
The text emphasizes that our ideal notion of numbers cannot be realized inside computers, because numbers must be represented within a finite number of bits. It then fleshes out the IEEE standard for representing floating-point numbers: a sign bit, a certain number of exponent (characteristic) bits, and a certain number of mantissa bits, with the total length depending on which standardized format is used. The mantissa is normalized to maximize precision (bits are shifted as far left as possible), and the exponent is stored in biased form, so the true exponent is recovered by subtracting a fixed number from the stored characteristic. Underflow and overflow are also discussed, along with the fact that two zeroes are possible. Rounding and chopping (truncating) are covered; rounding has the disadvantage that a carry may force digits in other fields to change. Relative and absolute errors are defined again, the same way they were in earlier readings, and the notion of significant digits is defined.
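To connect the fields to actual bits, here is a small Python sketch (the helper ieee_fields is my own, not something from the text); it pulls apart the double-precision layout of 1 sign bit, 11 exponent bits with a bias of 1023, and 52 mantissa bits.

    import struct

    def ieee_fields(x):
        bits = int.from_bytes(struct.pack(">d", x), "big")
        sign = bits >> 63
        characteristic = (bits >> 52) & 0x7FF         # stored exponent field
        mantissa = bits & ((1 << 52) - 1)
        return sign, characteristic - 1023, mantissa  # subtract the bias

    print(ieee_fields(1.0))     # (0, 0, 0)
    print(ieee_fields(-12.5))   # (1, 3, ...) since 12.5 = 1.5625 * 2^3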

CHALLENGES
I've never seen significant digits defined this way— I only used them in chem, bio and physics. I've recently covered the IEEE floating point form in COMP240, but I could see how this brief introduction could confuse people.

REFLECTION
It would be interesting to see how we would propagate errors, as truncated numbers are used in further calculations. Also, what happens in computers when we want to minimize errors, and the IEEE form isn't good enough? Could the user define a special data type using many more bits? Or create a special array?
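One partial answer from outside the reading: many languages ship a software arbitrary-precision type, trading speed for extra digits, so you don't have to build your own bit array. A quick sketch with Python's decimal module:

    from decimal import Decimal, getcontext

    getcontext().prec = 50              # ask for 50 significant digits
    print(Decimal(1) / Decimal(7))      # 0.142857... carried to 50 digits
    print(Decimal(2).sqrt())            # sqrt(2) to 50 digits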

Tuesday, February 12, 2008

Feb 13 2008

MAIN POINTS
The first section defines the rate of convergence, particularly focusing on linear and quadratic convergence. This is defined by a limit as n goes to infinity: the ratio of successive errors, |p_{n+1} - p| / |p_n - p|^alpha, tends to a constant lambda. A table demonstrates the dramatic difference between linear convergence and quadratic convergence; linear convergence doesn't even seem to be much of an improvement over the bisection method, for instance. I'm assuming that the lambda term is a goal of accuracy decided upon beforehand.
The second section defines zero multiplicity as, roughly, the degree of the polynomial behavior around the zero; a zero of multiplicity one is 'simple.' Thm 2.10 says f has a simple zero at p if and only if f(p) = 0 and f'(p) != 0. This is extended to zeroes of any multiplicity by making further claims about higher and higher derivatives. Examples are given that show how to deal with zeroes of various multiplicities.
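To tie the two halves of the reading together, here is a small experiment of my own (the functions and starting points are not the book's examples): Newton's method on a simple zero converges quadratically, but on a zero of multiplicity two it only converges linearly.

    import math

    def newton_iterates(f, fprime, p, steps):
        out = []
        for _ in range(steps):
            p = p - f(p) / fprime(p)
            out.append(p)
        return out

    simple = newton_iterates(lambda x: x * x - 2, lambda x: 2 * x, 1.0, 6)  # simple zero at sqrt(2)
    double = newton_iterates(lambda x: x * x,     lambda x: 2 * x, 1.0, 6)  # multiplicity-2 zero at 0

    for a, b in zip(simple, double):
        print(f"{abs(a - math.sqrt(2)):.2e}   {abs(b):.2e}")
    # Left column: the error roughly squares at each step (quadratic).
    # Right column: the error only halves at each step (linear, lambda = 1/2).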

CHALLENGES
In the first section, the relationship between the lambda term and the alpha term confuses me: are they determined by the equation in question, or by the person doing the calculation? In the second part, I don't really understand the significance of zeroes of various multiplicities. Does this directly relate to how the method converges, and if so, how?

REFLECTION
The quantification of convergence sheds more light on when Newton's method is inefficient, which was not obvious before. This seems to be furthered by the investigation into zeros of various multiplicity which affect the type of convergence— leading to a clear classification of the behavior of the method.

Sunday, February 10, 2008

MAIN POINTS
The first section of the reading derives the Newton method by creating a Taylor polynomial where f(p) = 0. If we assume that |p - p0| is very small, then only the initial Taylor terms are needed; the polynomial is set equal to 0 and then solved for p. (Because p = x, this is essentially solving for x.) After explaining the process as one of tangent lines (which I didn't quite get), the text describes the algorithm for using the Newton method, including a discussion of tolerance. It also mentions that the Newton method fails when f'(x) = 0; Ex. 1 shows an instance of this, because -sin(0) = 0.
The second part of the reading describes the Secant method, which iterates through different secants until the answer is within the tolerance. This is described as slightly slower than the Newton method.
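Minimal sketches of both methods as I understand them (the test function cos(x) - x is a standard choice; the starting values and the bare-bones tolerance handling are my own simplifications):

    import math

    def newton(f, fprime, p0, tol=1e-10, max_iter=50):
        for _ in range(max_iter):
            p1 = p0 - f(p0) / fprime(p0)   # tangent-line step; breaks if f'(p0) = 0
            if abs(p1 - p0) < tol:
                return p1
            p0 = p1
        raise RuntimeError("did not converge")

    def secant(f, p0, p1, tol=1e-10, max_iter=50):
        for _ in range(max_iter):
            # Same idea, but the slope is a secant through the last two
            # iterates instead of the derivative f'.
            p2 = p1 - f(p1) * (p1 - p0) / (f(p1) - f(p0))
            if abs(p2 - p1) < tol:
                return p2
            p0, p1 = p1, p2
        raise RuntimeError("did not converge")

    f = lambda x: math.cos(x) - x
    print(newton(f, lambda x: -math.sin(x) - 1, 1.0))
    print(secant(f, 0.0, 1.0))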

CHALLENGES
I've used Newton's method before, so it wasn't quite a challenge. However, the diagram on p. 64 was a little unclear about how Newton's method actually works graphically. The secant method /was/ confusing to me, because I don't think the actual description of how it works was included in the reading assignment. This is another place where matching the illustration to the formula would be helpful for understanding the concept itself (rather than just the equation).

REFLECTION
I learned the Newton method in high school, where we learned how to use the graphing calculator to make it go quickly. This was my first exposure to numerical methods! This seems like one of the optimal methods scientifically, if you are given explicit formulas; however, it is not guaranteed to produce results. I'm wondering how a computer might be able to tell whether it is producing results. Plus, in order to carry it out, the computer would have to be able to estimate derivatives, and what if it used the Newton method to do that, ending up in an infinite cycle of using derivatives to estimate derivatives...

Tuesday, February 5, 2008

Feb 6 2008

MAIN POINTS
Table 2.1 makes two weaknesses of the bisection method apparent: it's not guaranteed that every iteration will produce a better answer, and it can take many iterations to produce an acceptably accurate answer (convergence is slow). Theorem 2.1 gives a formula bounding the approximation error, derived by noting that the window shrinks by powers of two. The text also notes that the actual error can be much smaller than the estimated error. Example 2 shows how to start with the desired accuracy and use logarithms to calculate the maximum number of iterations needed to attain that accuracy.
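That Example-2 style calculation fits in a couple of lines (the interval and tolerance below are my own picks): from the bound (b - a)/2^N <= tol, taking logarithms gives N >= log2((b - a)/tol).

    import math

    a, b, tol = 1.0, 2.0, 1e-3
    N = math.ceil(math.log2((b - a) / tol))
    print(N)    # 10 iterations are enough to guarantee an error below 10^-3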

CHALLENGES
The very end of the reading confused me. I do not completely understand why they change the equation the way they do to factor in computer rounding errors, or how the use of the signum function prevents over/underflow errors. I understood the theorem, but it would be nice to work through an example.
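My best guess at what that last part is doing, written out as code (the function and interval are my own choices): compute the midpoint as a + (b - a)/2 rather than (a + b)/2, and compare signs with a signum function instead of multiplying f(a)*f(p), because the product of two very small (or very large) values can underflow (or overflow) even when the signs themselves are perfectly well defined.

    import math

    sign = lambda v: math.copysign(1.0, v)

    def bisect_careful(f, a, b, tol=1e-10):
        fa = f(a)
        while (b - a) / 2 > tol:
            p = a + (b - a) / 2              # midpoint formula that rounds more safely
            fp = f(p)
            if sign(fa) * sign(fp) < 0:      # sign test instead of fa * fp < 0
                b = p
            else:
                a, fa = p, fp
        return a + (b - a) / 2

    print(bisect_careful(lambda x: x**3 + 4*x**2 - 10, 1.0, 2.0))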

REFLECTION
There's very little material for reflection here. I suppose that the ending part about considering computer rounding errors is important, particularly scientifically. It would be something that is easy to forget— possibly with bad consequences.

Sunday, February 3, 2008

Feb 4 2008

MAIN POINTS
The first couple of pages are again a restating of calculus theorems: the first is Rolle's Theorem, and the second is the Intermediate Value Theorem. The intermediate value theorem is sort of an obvious result, but apparently the proof is difficult (and not shown here); I was less clear on Rolle's theorem. The IVT discussion points ahead by saying that an efficient way to find the intermediate solution will be given in Chapter 2, which is where the next assignment is. In the second part, there is an example with a differential equation (which I didn't really get), and then the text discusses the Bisection Method, which seems to be a divide-and-conquer method of root-finding and is presented as an algorithm. The section also discusses divergence and trying to find optimal bounds.
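A bare-bones sketch of the idea (the function x^2 - 3 and the interval are my own choices), printing the bracket at each step so the halving is visible:

    def bisect(f, a, b, steps=10):
        assert f(a) * f(b) < 0, "need a sign change on [a, b] to start"
        for _ in range(steps):
            p = (a + b) / 2
            if f(a) * f(p) < 0:
                b = p                # the root is in the left half
            else:
                a = p                # the root is in the right half
            print(a, b)
        return (a + b) / 2

    bisect(lambda x: x * x - 3, 1.0, 2.0)   # brackets shrink toward sqrt(3)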

CHALLENGES
I was not clear on how they constructed the statement of Rolle's theorem, or how the theorem is useful. I would also like to go over the concept of a differential equation, because, though I've seen them before, I've forgotten how they work. As for the bisection method, it would be good to see an example where it converges and an example where it diverges.

REFLECTIONS
It's interesting that the Babylonians used methods like this without computers (and it would blow my mind to use a base-60 number system). I just realized that I used this bisection method in my computer science class in high school to calculate square roots. It was an improvement on the previous method we had used, which was to take fixed steps until we crossed over the value, and then decrease the step value and change directions.