and it turned out it was a huge deal.

So, if you choose your alpha too big, it causes the error value you're calculating at each iteration to get really big. The algorithm looks at the error and goes, "I've got to go negative," and it goes way negative. Then it sees that it's way negative and goes, "I've got to go way positive again."

So, it goes even more positive. Then, on the next iteration, it goes even more negative, and it runs away. Eventually, since Python stores real numbers as double-precision floating point, it overflows double precision, and I was scratching my head wondering what was going on.
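The runaway behavior described above can be reproduced with a minimal sketch. This is not the lecture's actual code; it is a toy one-parameter least-squares fit (data and step count chosen here for illustration) with alpha deliberately set too large, so the updates overshoot and grow:

```python
import numpy as np

# Toy data for a one-feature linear fit: y = 2 * x, so the ideal theta is 2.0
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

def gradient(theta):
    # Gradient of the mean squared error (1/n) * sum((theta*x - y)^2) w.r.t. theta
    return 2.0 * np.mean((theta * x - y) * x)

theta = 0.0
alpha = 0.5          # deliberately too large
history = []
for _ in range(6):
    theta -= alpha * gradient(theta)
    history.append(theta)

# theta flips sign and grows in magnitude each step instead of settling near 2.0
print(history)
```

Each update overshoots the minimum, so theta alternates sign with a larger magnitude every iteration; with enough iterations it would overflow a double, which is the Python error the lecture describes.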

I have a graph of that coming up, so we won't dwell on this, but I did code this up myself.

So, I picked some out-of-range values for theta, and I learned this through trial and error. I decided that if my thetas got as big as five times ten to the twentieth, I was headed in the wrong direction, and I would stop the simulation and stop the program.
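That divergence check can be sketched as a simple guard. The names here are illustrative, not the lecture's actual code; only the bound of five times ten to the twentieth comes from the lecture:

```python
# Empirically chosen bound from the lecture: if any theta gets this big,
# gradient descent is running away and we should stop.
DIVERGENCE_BOUND = 5e20

def diverged(thetas):
    """Return True if any parameter has blown past the divergence bound."""
    return any(abs(t) > DIVERGENCE_BOUND for t in thetas)

# Inside the training loop, something like:
#   if diverged(thetas):
#       break   # alpha is probably too large
```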

I also wanted to make sure that as we were iterating, as we were doing our gradient descent and coming down, we were looking at how much the thetas changed from one iteration to the next. So, as long as the change was over the threshold value of two, I kept going. When the change from one theta update to the next became less than two, then I said, "Okay, I've converged."
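The convergence test can be sketched the same way. Again, the function and names are hypothetical; the threshold of two is the value from the lecture:

```python
# Threshold from the lecture: once no theta moves by more than this
# between successive updates, declare convergence.
THRESHOLD = 2.0

def converged(old_thetas, new_thetas, threshold=THRESHOLD):
    """Return True if every parameter changed by less than the threshold."""
    return all(abs(new - old) < threshold
               for old, new in zip(old_thetas, new_thetas))
```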

Okay, and you can tune these numbers. Now, look at what I had to set alpha to in order to keep it from running away. It was a very, very small number. So, you can look at that code and see that's what it does.

So, this is what I saw happening. Theta was set to zero, then it went negative, then it went positive. Then, on the next iteration, it went more negative, then more positive, more negative, more positive, and it kept going back and forth. Eventually it overflowed double-precision floating point and I got an error message from Python. What's going on here? It took me a little while to figure that out, and I was playing with these hyperparameters in order to get it to work.

What we wanted it to look like is this: theta starts out here, set equal to zero, and as the iterations go, you see the theta values come up and start to asymptotically approach the right value. This is where that threshold value of two that I used comes in. As long as the change from one update to the next was more than two, I kept going. Once the delta became less than two, I stopped it, said that was enough iterations, and used those theta values. Then I gave the algorithm data it had never seen before and looked at the results of the predictions.
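Putting the pieces together, the well-behaved run can be sketched like this: a small alpha on the same toy data as before, stopping once the per-step change becomes tiny. The data, alpha, and stopping tolerance here are illustrative choices, not the lecture's actual values:

```python
import numpy as np

# Same toy data: y = 2 * x, so theta should settle near 2.0
x = np.array([1.0, 2.0, 3.0, 4.0])
y = 2.0 * x

theta = 0.0
alpha = 0.01          # small enough that updates shrink each step
prev = theta
for _ in range(1000):
    theta -= alpha * 2.0 * np.mean((theta * x - y) * x)
    if abs(theta - prev) < 1e-6:   # change per update has become negligible
        break
    prev = theta

print(round(theta, 3))  # approaches 2.0 asymptotically from below
```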

Problems that can occur,