Thursday, April 24, 2008

April 25 2008

MAIN POINTS
Sarah defined chaos as behavior that is aperiodic and unpredictable over the long term, yet deterministic (i.e., based on differential equations) and very sensitive to initial conditions. In 3-space, for instance, a starting point of (0.99, 1, 1) will have very different long-term behavior than (1, 1, 1), with the difference between the two trajectories growing exponentially as t increases. We saw this in our last lab. The Lorenz equations were derived to model weather systems, which helps explain why only short-range weather forecasts are accurate: small errors in measuring the initial condition quickly snowball out of control.

Sarah used this sensitivity in composing music. She mapped a piano piece to a pitch sequence and assigned that pitch sequence onto an existing strange attractor (I'm assuming it was the attractor starting at (1,1,1) with h=0.01, for instance). Then she chose a very similar starting point and created the new piece by approximating the differential equation at each discrete step t using the Runge-Kutta numerical method. The new piece started out like the old piece but quickly deviated, though it maintained some of the same tonal qualities. Due to the limitations of the model, the rhythm stayed the same, in discrete 8th notes or whatever they were. It sounded great, and the talk was packed with music people.
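To convince myself about the sensitivity, I threw together a quick Python sketch (the parameters sigma=10, rho=28, beta=8/3 are the standard ones; h=0.01 and the two starting points are my guesses from the lab, not necessarily what Sarah used) that integrates the Lorenz equations with classical RK4 from (1,1,1) and (0.99,1,1) and prints how far apart the two trajectories drift:

# Integrate the Lorenz equations with classical RK4 from two nearby starting
# points and watch the trajectories separate. Parameters sigma=10, rho=28,
# beta=8/3 are the usual ones; h=0.01 is my assumption from the lab.

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0/3.0):
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(f, state, h):
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + h * k for s, k in zip(state, k3)))
    return tuple(s + (h / 6.0) * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

h = 0.01
u, v = (1.0, 1.0, 1.0), (0.99, 1.0, 1.0)
for step in range(1, 2501):
    u = rk4_step(lorenz, u, h)
    v = rk4_step(lorenz, v, h)
    if step % 500 == 0:
        separation = sum((p - q) ** 2 for p, q in zip(u, v)) ** 0.5
        print("t = %5.2f   separation = %8.4f" % (step * h, separation))

On my machine the separation is already comparable to the size of the attractor itself well before the end of the run, even though the starting points differ by only 0.01.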

CHALLENGES
I didn't quite see how she was able to map the pitch sequences onto the Lorenz attractor, and then shift the notes based on the new attractor... She must have chosen the mapping carefully to keep the new piece in a similar key with similar chords, because I would think the key would otherwise start to jump all over the place. I'd like to know how many pieces she created before she came across the one she settled on.

REFLECTIONS
The project raises some questions— when a composer creates variations on a theme, how are these variations different? What is the creative process there? Can chaos be used to model this creative process? This project sidestepped probabilistic methods entirely and instead used the deterministic behavior built into the Lorenz equations.

Thursday, April 10, 2008

April 11 2008

MAIN POINTS
The text defines truncation error, a measure of the difference between the approximation and the exact solution of the differential equation. For Euler's method this error is shown to behave like O(h), where h is the step size.
The text then extends Euler's method, which is based on the n=1 Taylor polynomial, to any value of n. This creates the "Taylor method of order n," which is just a substitution of the nth Taylor polynomial (sans error term) into Euler's method. It involves calculating the first n-1 derivatives of f, which the book is nice enough to carry out for the 2nd and 4th order Taylor methods. The error of the 4th order method is impressively low.
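To see the difference in error for myself, here's a small Python sketch (the test problem y' = y, y(0) = 1 with exact solution e^t is my own choice, not the book's) comparing Euler, which is the order-1 Taylor method, against the order-2 Taylor method, where you have to work out the derivative of f by hand first:

import math

# Test problem (my own choice): y' = y, y(0) = 1, exact solution y(t) = e^t.
# For this f(t, y) = y, the total derivative f' = df/dt + f*df/dy is also y,
# which is the hand computation the order-2 Taylor method needs.

def euler(h, n):
    w = 1.0
    for _ in range(n):
        w += h * w                      # w + h*f(t, w)
    return w

def taylor2(h, n):
    w = 1.0
    for _ in range(n):
        w += h * w + (h * h / 2.0) * w  # extra (h^2/2)*f' term
    return w

exact = math.exp(1.0)
for h in (0.1, 0.05, 0.025):
    n = int(round(1.0 / h))             # march from t = 0 to t = 1
    print("h=%.3f  Euler error=%.2e  Taylor-2 error=%.2e"
          % (h, abs(euler(h, n) - exact), abs(taylor2(h, n) - exact)))

Halving h roughly halves the Euler error but cuts the Taylor-2 error by about a factor of four, which matches the O(h) versus O(h^2) claims.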

CHALLENGES
How is this truncation error different from errors we've studied before? How can it measure the difference between the exact solution and the approximation if we don't know the exact solution? What is the significance of the final theorem— why does the error seem to grow larger as the order of the method increases (the O(h^n) term)?

REFLECTION
Do you ever use this method when you're solving differential equations with a computer, or do you use the Runge-Kutta method from the next chapter? I'd like to see some real uses of differential equations, because I've only seen somewhat trivial ones. Are there any demos?

If you're doing anything like this with your lab over the summer, I'd like to check it out some time— I'll be staying here doing an internship (in scientific comp!) at the U.

Tuesday, April 8, 2008

April 9 2008

MAIN POINTS
A differential equation describes the change of some variable with respect to another. Because they are often difficult or impossible to solve exactly, they are a good candidate for numerical methods. This chapter aims to find approximations at certain points, rather than the entire solution. The text defines the "Lipschitz condition," whose significance I don't immediately see. The further theorems dealing with convex sets and the Lipschitz condition confuse me.
Euler's Method is said to be seldom used in practice. It gives an approximation to a well-posed initial value problem. The method is derived through Taylor's theorem and is a fairly simple algorithm built around a recurrence. The error is shown to grow as we move away from the initial value (typical of things that use a Taylor style of thinking). A lot of this reading was lost on me because it states theorems without first showing why they are important, and sometimes without sufficiently defining terms.
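To make the recurrence concrete, here's a minimal Python sketch of Euler's method as I understand it (w_{i+1} = w_i + h*f(t_i, w_i)); the test problem y' = y - t^2 + 1 with y(0) = 0.5 is just one I picked because it has a closed-form solution to compare against:

import math

# Euler's method: w_{i+1} = w_i + h*f(t_i, w_i).
# Test problem (my own pick): y' = y - t^2 + 1, y(0) = 0.5,
# which has exact solution y(t) = (t + 1)^2 - 0.5*e^t.

def f(t, y):
    return y - t**2 + 1

def exact(t):
    return (t + 1)**2 - 0.5 * math.exp(t)

h, t, w = 0.2, 0.0, 0.5
for i in range(10):                     # march from t = 0 to t = 2
    w = w + h * f(t, w)
    t = t + h
    print("t=%.1f  w=%.6f  exact=%.6f  error=%.6f"
          % (t, w, exact(t), abs(exact(t) - w)))

The error column does creep up as t gets farther from the initial value, like the reading says.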

CHALLENGES
The concept of "convex" works for me, but I have a hard time imagining what the "set" looks like. Pages 251-253 really confuse me. I still don't have a very good grasp of how a differential equation works. I will definitely need to go over a lot of these things in class, and I predict other people in the class will feel the same. Euler's method seems fairly simple, though.

REFLECTION
I've never done anything with differential equations. I'm not even really sure what we'd be approximating by using Euler's method— could we be approximating the location of a pendulum, for instance? In Euler's method we again see Taylor polynomials used to derive the method. Most of what we've used in the class has been either polynomial interpolation or Taylor-based... are there any other schools of thought in approximating functions like this?

Thursday, April 3, 2008

April 4 2008

MAIN POINTS
Simply using one instance of Newton-Cotes over a large domain is impractical and full of error. Dividing the domain into sections and doing a Newton-Cotes integration on each piece is much more accurate. The approximation is thus a sum of however many polynomial integrals come from splitting up the domain, and the error term is the sum of the individual error terms. The composite midpoint rule is described by Thm 4.6 as having a fairly clean error term; the midpoint rule is a common rule in calculus classes. The rule is run on a sample function, and the text practices with some Maple code.
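Here's a small Python sketch of how I picture the composite idea: chop [a, b] into n subintervals, run the midpoint rule on each, and sum the pieces (the test integral of sin(x) over [0, pi], which is exactly 2, is my own choice):

import math

# Composite midpoint rule: split [a, b] into n subintervals, apply the
# midpoint rule on each piece, and sum the pieces.
# Test integral (my choice): integral of sin(x) from 0 to pi, exactly 2.

def composite_midpoint(f, a, b, n):
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

for n in (4, 8, 16, 32):
    approx = composite_midpoint(math.sin, 0.0, math.pi, n)
    print("n=%2d  approx=%.8f  error=%.2e" % (n, approx, abs(2.0 - approx)))

The error drops by roughly a factor of four every time n doubles, which fits an O(h^2) error term.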

CHALLENGES
Why do they show the Composite Midpoint Rule after the Composite Simpson's rule? I'd think the Midpoint rule would be the simpler one... everything else seems to follow directly from the Newton-Cotes formulas (error terms etc).

REFLECTION
I'd like to try this with discrete data points that have no explicit formula. There would then be a limit to the number of divisions you can make, and no error bound, because you can't be sure of the derivative at any given point.
This seems similar to what Excel might use to draw continuous graphs out of discrete points— but the graph in Fig 4.7 has abrupt corners (instantaneous changes in derivative), whereas Excel produces smooth curves.

Sunday, March 30, 2008

March 31 2008

MAIN POINTS
The text describes how we can estimate the definite integral of a function through "numerical quadrature," which is basically performed by selecting distinct points within the desired interval, evaluating the function at them, finding the Lagrange interpolating polynomial through those points, and then integrating the resulting polynomial. The error term for this integral looks nearly impossible to estimate, because it sits inside both a product notation and an integral.
The text describes the trapezoidal rule next, which is taught in high school and just creates neighboring trapezoids out of function values. Simpson's rule integrates the second-degree Lagrange polynomial through three function values. Its error term includes a fourth derivative, so knowing the fourth-derivative behavior of the function can give an estimate of how exact the answer will be.
Finally, the text makes a generalization of this class of Newton-Cotes' methods. I don't really understand how this generalization works, and how the n=... values of these methods are built up.
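To keep the basic (non-composite) versions straight in my head, here's a Python sketch of the trapezoidal rule and Simpson's rule on a single interval; the test integrand 1/(1+x^2) on [0, 1], which integrates to pi/4, is my own choice:

import math

# Closed Newton-Cotes on a single interval:
#   trapezoidal rule (n=1): (h/2) * [f(a) + f(b)],            h = b - a
#   Simpson's rule   (n=2): (h/3) * [f(a) + 4 f(m) + f(b)],   h = (b - a)/2
# Test integrand (my choice): f(x) = 1/(1 + x^2) on [0, 1], exact value pi/4.

def trapezoid(f, a, b):
    return (b - a) / 2.0 * (f(a) + f(b))

def simpson(f, a, b):
    h = (b - a) / 2.0
    return h / 3.0 * (f(a) + 4.0 * f(a + h) + f(b))

f = lambda x: 1.0 / (1.0 + x * x)
exact = math.pi / 4.0
t = trapezoid(f, 0.0, 1.0)
s = simpson(f, 0.0, 1.0)
print("trapezoid: %.6f   error %.2e" % (t, abs(exact - t)))
print("simpson:   %.6f   error %.2e" % (s, abs(exact - s)))

Simpson's rule does noticeably better than the trapezoid here, as the error terms suggest it should.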

CHALLENGES
As written above, how are Newton-Cotes formulas built up? What's the difference between open and closed? Is it basically the Lagrange interpolating polynomial of n points being integrated? Is the basic integration model with the rectangles inside the function the model for n=0?

REFLECTION
This is more interesting than the normal approximation we learned in HS calc. As n increases, the amount of calculation seems to increase. The error term may or may not drop; it seems to depend on h and the behavior of the higher derivatives of f(x).

Sunday, March 9, 2008

March 10 2008

MAIN POINTS
The reading begins with Thm 3.3, which defines the error function in much the same way that it was defined for Taylor polynomials. Xi(x) lies within the bounds (a,b). The proof takes up a page and uses Rolle's Theorem. I don't quite understand Example 2. Example 3 makes more sense— they're comparing the error when using higher and higher degree polynomials, using steps they've already explained. The text also explains that one often has to calculate successively higher-degree polynomials to find the one that reaches the required accuracy.
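To work through my own confusion about Example 2, here's a little Python check (the table spacing and evaluation point are my own stand-ins, not necessarily the book's): interpolate e^x linearly between two table points and compare the actual error with the theorem's bound, which for degree 1 is max|f''(xi)| * |(x-x0)(x-x1)| / 2:

import math

# Degree-1 Lagrange interpolation of e^x between two "table" points,
# compared against the error bound from the theorem:
#   |f(x) - P(x)| <= max|f''(xi)| * |(x - x0)(x - x1)| / 2
# The table spacing (0.1) and evaluation point are my own stand-ins.

x0, x1 = 0.5, 0.6
y0, y1 = math.exp(x0), math.exp(x1)

def P(x):   # linear Lagrange polynomial through (x0, y0) and (x1, y1)
    return y0 * (x - x1) / (x0 - x1) + y1 * (x - x0) / (x1 - x0)

x = 0.54
actual_error = abs(math.exp(x) - P(x))
bound = math.exp(x1) * abs((x - x0) * (x - x1)) / 2.0   # f'' = e^x, maximized at x1
print("P(%.2f) = %.6f   e^%.2f = %.6f" % (x, P(x), x, math.exp(x)))
print("actual error = %.2e   bound = %.2e" % (actual_error, bound))

The actual error comes in just under the bound, which is what the theorem promises.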

CHALLENGES
I don't understand what they're doing in Example 2... so they're taking a table of values of e^x, and interpolating between them... but using a Lagrange polynomial of degree 1— does this mean they select two points from the function and make a polynomial out of them? What is the Bessel function?

REFLECTION
Lagrange polynomials strike me as being useful in the computational domain because computer data consists of discrete data points (in sound, pictures, video), and any time we want to treat these as continuous data we have to make a transformation. The error terms here seem sort of difficult to calculate exactly, so that could be a problem in more open-ended applications of Lagrange polynomials. However, in A/V applications, it's probably possible to map out the domain of the problem well enough to predict what sort of errors could be encountered.

Thursday, March 6, 2008

March 7 2008

MAIN POINTS
We discussed in class how Taylor polynomials aren't suitable for interpolation. Thus, interpolating polynomials are used instead. Finding a function passing through (x0,y0) and (x1,y1) is like solving for a function such that f(x0)=y0 and f(x1)=y1. The text defines L0 and L1 and then builds P(x) out of these two functions; they are formulated in a way that makes P(x0)=y0 and P(x1)=y1. This is generalized to any number of data points on p.105. Thm 3.2 says that this approximating polynomial exists with degree at most n, and defines it based on the form given on p.105. Example 1 shows this method making much better predictions than Taylor polynomials did, and then shows the interp MAPLE command.
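Here's a quick Python sketch of how I understand the general construction: each L_k is a product that is 1 at its own node and 0 at every other node, and P(x) is the weighted sum of them (the sample data for f(x) = 1/x is my own pick, not the book's):

# Lagrange interpolation: P(x) = sum_k y_k * L_k(x), where
#   L_k(x) = prod_{j != k} (x - x_j) / (x_k - x_j).
# The numerator makes L_k zero at every other node; the denominator is the
# numerator evaluated at x_k, so L_k(x_k) comes out to exactly 1.
# Sample data for f(x) = 1/x is my own pick.

def lagrange(xs, ys, x):
    total = 0.0
    for k, xk in enumerate(xs):
        Lk = 1.0
        for j, xj in enumerate(xs):
            if j != k:
                Lk *= (x - xj) / (xk - xj)
        total += ys[k] * Lk
    return total

xs = [2.0, 2.75, 4.0]
ys = [1.0 / x for x in xs]
print("P(3) = %.6f   f(3) = %.6f" % (lagrange(xs, ys, 3.0), 1.0 / 3.0))

Writing that out, I think I can answer part of my own challenge below: the long denominator is just the numerator evaluated at x_k, which scales each L_k back to 1 at its own node.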

CHALLENGES
I understand why the numerator in the L_{n,k} function is the way it is (to define the zeroes), but why is there also this very long denominator?

Also, I was expecting to see gigantic matrices like we saw in class... how is what they're doing here different?

REFLECTIONS
Lagrange sure has a lot of things named after him. This method seems more convenient than Taylor polynomials in a lot of ways— but I'm curious about situations in which this might be suboptimal.

Thursday, February 28, 2008

Feb 29 2008

MAIN POINTS
The reading first defines "inconsistent" systems of equations, which have no solution. These are probably extremely common in scientific computation, because with real-world data (and data represented inside computers) as coefficients you would rarely get more than two lines to intersect exactly; a minuscule error in a coefficient throws it all off. The reading writes a system in vector form, then reminds us that we can calculate the shortest distance from a point to a plane by using the concept of orthogonality, which gives a dot product of 0. This is used to create an equation for finding the least-squares solution of any system.
Section 4.1.2 generalizes this equation for the specific purpose of finding a curve to fit data.
The solution follows the same steps. Fortunately, the text gives a MATLAB command to calculate least squares.
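Since I asked for an example calculation below: here's a little Python sketch of the normal-equations idea, (A^T A)c = A^T y, for fitting a straight line to a few made-up data points (I solve the resulting 2x2 system with Cramer's rule instead of Gaussian elimination to keep it short):

# Least-squares line fit y ~ c0 + c1*x via the normal equations
#   (A^T A) c = A^T y,  where row i of A is [1, x_i].
# The data points are made up; the 2x2 system is solved with Cramer's rule.

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 1.9, 3.2, 3.8, 5.1]          # roughly y = 1 + x with noise

n = len(xs)
# Entries of A^T A and A^T y for the model [1, x]:
S1, Sx, Sxx = float(n), sum(xs), sum(x * x for x in xs)
Sy, Sxy = sum(ys), sum(x * y for x, y in zip(xs, ys))

det = S1 * Sxx - Sx * Sx
c0 = (Sy * Sxx - Sx * Sxy) / det
c1 = (S1 * Sxy - Sx * Sy) / det
print("least-squares line: y = %.4f + %.4f x" % (c0, c1))

Presumably the MATLAB command from the text does something equivalent (and more careful) under the hood.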

The second half of the reading takes an example we've touched upon in class — periodic temperature. This leads to a large number of data points being punched into the formula. Then it covers exponential growth, going on to Moore's law and the concentration of chemicals in the bloodstream... a hearty dose of examples.

CHALLENGES
I see equation 4.6, but I don't understand what the operation A^T is doing— it turns A on its side, but what are the implications of that? Also, I don't get how they solve for the final least-squares solution with Gaussian elimination (at the top of page 197). Because of this, I still don't really get the process of turning this large system into a least-squares fit, especially when dealing with curves that aren't lines. I would like to see an example calculation before we try this in MATLAB!

REFLECTIONS
The fact that models such as sines and exponential functions can work inside this matrix model for best fit is interesting. I've used simple equations for regression lines in high school biology and chemistry, and generated them automatically in Excel for college chemistry, without ever understanding the reasoning behind them. This could be very useful in robotics— estimating things like movement and vision where only a limited number of discrete data points is available. The text also points out how it is useful for compression, and I'm curious about what kinds of compression use these ideas.

Tuesday, February 19, 2008

Feb 20 2008

MAIN POINTS
The text defines special arithmetic operations that act within the limits of floating-point numbers under the IEEE standard. Tables 1.2 and 1.3 show the difference between using the operations on numbers that are far apart and on numbers that are very close. When the two are close, as in Table 1.3, the relative error is much higher. Example 5 stresses how important the formulation of the problem is, because this can affect the number of significant digits; just formulating the problem as an addition rather than a subtraction can have huge effects. The final example uses nesting as a way to minimize error.
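A quick Python demonstration of the "reformulate to avoid subtracting nearly equal numbers" idea (this example is mine, not the book's): computing (1 - cos x)/x^2 for tiny x directly versus using the algebraically identical form sin^2(x) / (x^2 (1 + cos x)); both should come out near 0.5:

import math

# For small x, 1 - cos(x) subtracts two nearly equal numbers, and the
# relative error blows up. Rewriting with sin^2(x) = (1 - cos x)(1 + cos x)
# avoids the subtraction entirely. Both forms should equal ~0.5 for tiny x.

for x in (1e-4, 1e-6, 1e-8):
    naive = (1.0 - math.cos(x)) / x**2
    stable = math.sin(x)**2 / (x**2 * (1.0 + math.cos(x)))
    print("x=%.0e   naive=%.15f   reformulated=%.15f" % (x, naive, stable))

By x = 1e-8 the naive version has lost every significant digit (it just returns 0), while the reformulated one is still fine.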

CHALLENGES
I'm assuming it's a typo when they say on p.25 that x^3=x^3*x. I feel like it would be challenging to keep track of errors as they propagate through lengthy calculations, if we weren't given an exact value to compare to. It would also be challenging to keep track of all of this while implementing it in software.

REFLECTION
The book truncates numbers to a certain number of base-10 digits. However, isn't this inconsistent with how the numbers are represented within the computer? They would have to be truncated to a certain number of base-2 digits. This probably affects the measure of relative error very slightly.

It's interesting that rounding produces a slightly greater error than truncation in Example 6.

Sunday, February 17, 2008

Feb 18 2008

MAIN POINTS
The text emphasizes that our ideal notion of numbers cannot occur inside computers, because they must be represented within a finite number of bits. It then fleshes out the IEEE standard for representing floating-point numbers in computers. A number is made up of a certain number of mantissa and exponent (characteristic) bits, plus a sign bit; the total length depends on which standardized format is used. The mantissa is normalized to maximize precision (bits are shifted as far left as possible), and the true exponent is recovered by subtracting a fixed bias. Underflow and overflow are also discussed, along with the fact that two zeroes are possible. Rounding and chopping (truncating) are discussed; rounding has the disadvantage that many digits might have to be changed when a carry propagates. Relative and absolute errors are defined again, the same way they were in earlier readings, and the notion of significant digits is defined.
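Out of curiosity I poked at the bits of a 64-bit double in Python (1 sign bit, 11 exponent bits stored with a bias of 1023, 52 mantissa bits with the leading 1 implicit; that is the standard double layout, though the reading may be working with a different word size):

import struct

# Unpack a 64-bit IEEE double: 1 sign bit, 11 exponent (characteristic)
# bits stored with a bias of 1023, and 52 mantissa bits with the leading
# 1 implicit (that's the "normalized" part).

def show_bits(x):
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    mantissa = bits & ((1 << 52) - 1)
    print("%-8g sign=%d  exponent=%4d (unbiased %5d)  mantissa=%013x"
          % (x, sign, exponent, exponent - 1023, mantissa))

for x in (1.0, 0.1, -2.5, 0.0, -0.0):   # the two zeroes from the reading
    show_bits(x)

The last two lines show the two zeroes the reading mentions: same exponent and mantissa, different sign bit.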

CHALLENGES
I've never seen significant digits defined this way— I only used them in chem, bio and physics. I've recently covered the IEEE floating point form in COMP240, but I could see how this brief introduction could confuse people.

REFLECTION
It would be interesting to see how errors propagate as truncated numbers are used in further calculations. Also, what happens in computers when we want to minimize errors and the IEEE form isn't good enough? Could the user define a special data type using many more bits? Or create a special array?

Tuesday, February 12, 2008

Feb 13 2008

MAIN POINTS
The first section defines the rate of convergence, particularly focusing on linear and quadratic convergence. This is defined by the limit, as n goes to infinity, of a ratio built from successive error terms. A table demonstrates the dramatic difference between linear convergence and quadratic convergence; linear convergence doesn't seem to be much of an improvement on the bisection method, for instance. I'm assuming that the lambda term is a goal of accuracy decided upon beforehand.
The second section defines the multiplicity of a zero as... sort of the degree of the polynomial behavior around the zero. A zero of multiplicity one is 'simple.' Thm 2.10 says there is a simple zero at p if and only if f(p)=0 and f'(p) != 0. This is extended to zeroes of any multiplicity by making further claims about higher derivatives. Examples are given that show how to deal with zeroes of various multiplicities.
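To reproduce the flavor of that table, here's a tiny Python sketch of a linearly convergent sequence of errors next to a quadratically convergent one (the defining relations e_{n+1} = 0.5*e_n, i.e. alpha = 1 and lambda = 0.5, and e_{n+1} = e_n^2, i.e. alpha = 2 and lambda = 1, plus the starting error of 0.5, are my own choices):

# Linearly vs quadratically convergent error sequences (both going to 0),
# mimicking the book's table: the linear one satisfies e_{n+1} = 0.5*e_n
# (alpha = 1, lambda = 0.5), the quadratic one e_{n+1} = e_n^2
# (alpha = 2, lambda = 1). Starting error 0.5 is arbitrary.

lin, quad = 0.5, 0.5
print(" n        linear          quadratic")
for n in range(1, 8):
    lin, quad = 0.5 * lin, quad * quad
    print("%2d   %.10e   %.10e" % (n, lin, quad))

Seven steps in, the linear errors are still around 4e-3 while the quadratic ones are down near 10^-39.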

CHALLENGES
In the first section, the relationship between the lambda term and the alpha term confuses me— are they determined by the equation in question, or by the person doing the calculation? In the second part, I don't really understand the significance of zeroes of various multiplicities— does this directly relate to how the method converges, and if so, how?

REFLECTION
The quantification of convergence sheds more light on when Newton's method is inefficient, which was not obvious before. This seems to be furthered by the investigation into zeros of various multiplicities, which affect the type of convergence— leading to a clearer classification of the behavior of the method.

Sunday, February 10, 2008

MAIN POINTS
The first section of the reading derives Newton's method by creating a Taylor polynomial where f(p) = 0. If the assumption is made that |p-p0| is very small, then only the initial Taylor terms are needed— the polynomial is set equal to 0 and then solved for p. (Because p=x, this is essentially solving for x.) After explaining the process as one of tangents (which I didn't quite get), the text describes the algorithm for using Newton's method, including a discussion of tolerance. It also mentions that Newton's method fails when f'(x)=0. Ex.1 shows an instance of this, because -sin(0)=0.
The second part of the reading describes the Secant method, which iterates through different secants until the answer is within the tolerance. This is described as slightly slower than the Newton method.
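Here's a Python sketch of both methods side by side (the test equation cos(x) = x is my own pick, not necessarily the book's), iterating until the change between iterates is under a tolerance; the secant method only ever evaluates f, never f', which I guess is its whole selling point:

import math

# Newton's method needs f'(x); the secant method replaces f'(p_n) with the
# slope through the last two iterates, so it only ever evaluates f.
# Test equation (my own pick): cos(x) = x, i.e. f(x) = cos(x) - x.

f = lambda x: math.cos(x) - x
fprime = lambda x: -math.sin(x) - 1.0
TOL = 1e-10

p = 0.5                                    # Newton
for i in range(1, 20):
    p_new = p - f(p) / fprime(p)
    if abs(p_new - p) < TOL:
        break
    p = p_new
print("Newton:  root ~ %.10f after %d iterations" % (p_new, i))

p0, p1 = 0.0, 1.0                          # secant
for i in range(1, 40):
    p2 = p1 - f(p1) * (p1 - p0) / (f(p1) - f(p0))
    if abs(p2 - p1) < TOL:
        break
    p0, p1 = p1, p2
print("Secant:  root ~ %.10f after %d iterations" % (p2, i))

On this example Newton takes a few fewer iterations, which matches the "slightly slower" comment.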

CHALLENGES
I've used Newton's method before, so it wasn't much of a challenge. However, the diagram on p. 64 was a little unclear about how Newton's method actually works graphically. The secant method /was/ confusing to me, because I don't think the actual description of how it works was included in the reading assignment. This is another place where tying the illustration to the formula would be helpful for understanding the concept itself (rather than just the equation).

REFLECTION
I learned the Newton method in high school, where we learned how to use the graphing calculator to make it go quickly. This was my first exposure to numerical methods! This seems like one of the better methods scientifically, if you are given explicit formulas; however, it is not guaranteed to produce results. I'm wondering how a computer might be able to tell whether it is producing results. Plus, in order to carry it out, the computer would have to be able to estimate derivatives— and what if it used the Newton method to do that, and ended up in an infinite cycle of using derivatives to estimate derivatives...

Tuesday, February 5, 2008

Feb 6 2008

MAIN POINTS
Table 2.1 makes two weaknesses of the bisection method apparent— it's not guaranteed that every iteration will produce a better answer, and it can take many iterations to produce an acceptably accurate answer (convergence is slow). Theorem 2.1 gives a formula bounding the approximation error, derived by noting that the window shrinks by powers of two. The text also notes that the actual error can be much smaller than the estimated error. Example 2 shows how to start with the desired accuracy and use logarithms to calculate the maximum number of iterations needed to attain that accuracy.
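The logarithm trick from Example 2, redone in Python with my own numbers: the bound |p_n - p| <= (b - a)/2^n means you need (b - a)/2^n <= TOL, i.e. n >= log2((b - a)/TOL):

import math

# Bisection error bound: |p_n - p| <= (b - a) / 2^n.
# Setting (b - a)/2^n <= TOL and solving with logarithms gives
# n >= log2((b - a)/TOL). The interval and tolerance are my own numbers.

a, b, TOL = 1.0, 2.0, 1e-5
n = math.ceil(math.log2((b - a) / TOL))
print("need at least %d iterations for error below %g" % (n, TOL))

For [1, 2] and a tolerance of 10^-5 that works out to 17 iterations, since 2^17 = 131072 is the first power of two past 10^5.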

CHALLENGES
The very end of the reading confused me. I do not completely understand why they change the equation like this to factor in computer rounding errors. I also do not understand the use of the signum function— and how they prevent over/underflow errors. I understood the theorem, but it would be nice to work through an example.

REFLECTION
There's very little material for reflection here. I suppose that the ending part about considering computer rounding errors is important, particularly scientifically. It would be something that is easy to forget— possibly with bad consequences.

Sunday, February 3, 2008

Feb 4 2008

MAIN POINTS
The first couple of pages are again a restating of calculus theorems. The first is Rolle's Theorem, and the second is the Intermediate Value Theorem. The IVT is sort of an obvious result, but apparently the proof is difficult (and not shown here). I was less clear on Rolle's theorem. The text follows the IVT by saying that an efficient way to find the intermediate point will be given in chapter 2, which is where the next assignment is. In the second part, there is an example with a differential equation (which I didn't really get), and then the text discusses the Bisection Method. This seems to be a divide-and-conquer method of root finding, presented as an algorithm. The section also discusses divergence, and trying to find optimal bounds.
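A minimal Python sketch of the bisection algorithm as I understand it: keep whichever half of [a, b] still has a sign change, which is where the IVT comes in (the cubic and the interval [1, 2] are my own example, chosen because f changes sign there):

# Divide-and-conquer root finding: keep the half of [a, b] where f changes
# sign. Requires f(a) and f(b) to have opposite signs (that's the IVT part).
# The cubic and the interval here are my own example.

def bisect(f, a, b, tol=1e-8, max_iter=100):
    fa = f(a)
    for _ in range(max_iter):
        m = (a + b) / 2.0
        fm = f(m)
        if fm == 0.0 or (b - a) / 2.0 < tol:
            return m
        if fa * fm < 0:         # root is in the left half
            b = m
        else:                   # root is in the right half
            a, fa = m, fm
    return (a + b) / 2.0

f = lambda x: x**3 + 4.0 * x**2 - 10.0
print("root ~ %.8f" % bisect(f, 1.0, 2.0))

This is a case where it converges; to make it fail you would have to hand it an interval with no sign change to chase.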

CHALLENGES
I was not clear on how they constructed the statement of Rolle's theorem, and how this theorem is useful. I would also like to go over the concept of a differential equation, because, though I've seen them before, I've forgotten how they work. As for the bisection method, it would be good to see an example where it converges and an example where it diverges.

REFLECTIONS
It's interesting that the Babylonians used methods like this without computers (and it would blow my mind to use a base-60 number system). I just realized that I used this bisection method in my computer science class in high school to calculate square roots. It was an improvement on the previous method we had used, which was to take fixed steps until we crossed over the value, and then decrease the step value and change directions.

Tuesday, January 29, 2008

Wed 1/30 Taylor Series

Main Points
This chapter is simply a list of calculus theorems with examples of these theorems being used. The 4 pages given state Taylor's theorem, and then show how to solve for low-order Taylor polynomials for f(x)=cos x, approximating cos(0.01), and then approximating a simple definite integral of cos x. Finding the polynomials is a simple matter of substituting into the formula. Approximating cos(0.01) is another matter of substitution, and the truncation error is calculated by using the knowledge that Xi(x) lies in the interval [-1,1] (trivially), and then in the interval [0,0.1] (using Taylor's theorem). The integral approximation is performed by simply integrating the polynomial produced by simplifying the third-order Taylor polynomial for f(x)=cos x. The error is then calculated by integrating the Taylor polynomial remainder term over the same bounds.
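Here's the worked example redone as a quick Python sanity check (my own arithmetic; the third-degree Taylor polynomial of cos x about x0 = 0 is 1 - x^2/2, since the x^3 term vanishes):

import math

# Third-degree Taylor polynomial of cos(x) about x0 = 0: P3(x) = 1 - x^2/2
# (the x^3 term vanishes). Use it to approximate cos(0.01) and the integral
# of cos(x) from 0 to 0.1, then compare with the true values.

P3 = lambda x: 1.0 - x**2 / 2.0

print("cos(0.01): P3 = %.12f   true = %.12f" % (P3(0.01), math.cos(0.01)))

# Integrating P3 term by term: integral from 0 to b is b - b^3/6.
b = 0.1
approx_int = b - b**3 / 6.0
true_int = math.sin(b)          # exact integral of cos from 0 to 0.1
print("integral:  P3 = %.12f   true = %.12f   error = %.2e"
      % (approx_int, true_int, abs(true_int - approx_int)))

The integral error comes out around 8e-8, which matches integrating the remainder term: the bound x^4/24 integrates to b^5/120, about 8.3e-8 for b = 0.1.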

Challenges
I was confused about why the example first uses [-1,1] as the error bound for Xi(x), even though we know by Taylor's theorem that Xi(x) is between 0 and 0.01. I did not find this reading challenging, but I feel like the logic tying Taylor's theorem to the actual approximation of the function could be difficult, because the theorem does not explicitly state that it is for approximation with known error bounds. Also, the text did not really point out how the error grows as x gets farther from x0.

Reflections
I thought it was really elegant how the Taylor polynomial can be integrated to approximate the integral of the actual function, and even cooler how you can extend this to a calculation of the error bound. I see this theorem as playing an important part in numerical approximation of functions and integrals, although I am uncertain how we would deal with finding all of the derivatives using a computer.

Sunday, January 27, 2008

Post One

Name: Casey B

Year: Junior

Major: Comp Sci

Minor: Math

Math Classes Taken: Discrete Math, Multivariable Calculus, Theory of Computation, Topics in Graph Theory

Weakest Part: Number theory, Probability & Statistics

Strongest Part: Computer Science, Discrete + Graph Theory, Calculus (but not diff eq)

Why I'm taking the class: I think it will be helpful after college, and may use it towards a math major.

What I want out of it: becoming comfortable with implementing math in programs, becoming comfortable with linear alg, diff eq, and statistics.

Interests: experimental electronic music, graph theory, networks, artificial intelligence, local music scene

Worst teacher: Unstructured assignments, class was just a reiteration of the reading, but was even easier. Spent too much time on simple topics. Didn't ask for feedback from the class, didn't ask the class questions.

Best teacher: well-structured class with weekly assignments, always willing to help, gave very challenging but not overly long problem sets. Enthusiastic.