Articles, Blog

14 – Let and Efficiency

December 24, 2019


. In this segment, I wanna demonstrate why
it’s important to avoid doing recursive computations repeatedly and I wanna show
you that let expressions are the natural way to avoid that problem. So, to do this,
I’m gonna use a small example here. So let me show you what this code is doing. it’s,
it’s a bad example. It’s not gonna work well, so I’ve called it bad max. And the
idea is to take a list of integers, and return the maximum number, in the list.
Now, that doesn’t make any sense if the list is empty. So, in that case, I really
ought to do something like raise an exception, or in the next segment, I’ll
show you another option for how to deal with that. But since that’s not my point
here if the list is empty, I’ll just return to zero. Now this is terrible
style, but it’s not my point and, its not what’s bad about this function that I’m
emphasizing here, okay? So ignore that case. Otherwise, if the element, if the
list has just one element, so the tail is null, then return that one element, the
head of the one element list. That’s the maximum of one element list. Otherwise, if
the head is greater than the max of the rest of the list, return the head.
Otherwise, turn, return the max of the tail of the list. So this actually
ignoring the empty list case does work correctly, okay, and yet it’s still a
horrible algorithm. And to show you that I’m gonna use a couple of helper functions
I wrote, count up and count down. So here they are in the file. We don’t need to go
through them they just basically include all the numbers from the to inclusive
either from smaller to larger, or from larger to smaller. So just know that when
I come here and test out bad max here. So, here I am in my REPL, and let’s get
everything loaded up here. I also already have it fixed, so you’re gonna see a good
max here as well, that also has type int list arrow int. So we’re still working on
bad max. So suppose I do bad max countdown. let’s say from 30 to one. Okay?
So it turns out the maximum of that is 30, and sure enough, if I said 30 to two, it
would still be 30. And of course, this is a perfectly good function, I could make a
30,000 element list, and ask the max of it. Computers are fast these days. I get
30,000, no trouble. Okay? Now, what if I tried counting up instead? So, you know,
if I wanna count up from one to ten, that’s fine. The max is ten. One to
twenty, it’s fine. I did test this out ahead of time. Watch this. 25, you see a
little delay there, just a little bit of one. How about 28? A little bit more of
one. if you tried 30,000 you would be here for a very, very long time. In fact even
at 30, it seems to have quite a delay. In fact, I don’t even want to wait for it to
finish. Although, we’ll finish here in a few seconds. So what’s going on here?
Let’s look back at our code. It’s that count up-case that isn’t going well. So
lets think about what happens when the bigger number’s are near the end of the
list. So we come in here, we have like a 30 element list. It’s not empty, so we
don’t do list then zero. It’s not almost empty, so we don’t do this. So what we end
up doing, is computing what’s the maximum of a 29 element list. Well that’s no
problem. That’ll eventually come back and say 29. and then, we’ll say oh, it’s,
sorry, it’s 30 because we have two through 30. So then when we come back and we have
that 30, we’ll end up computing bad max again. So, naively, you might think oh, so
the count up version is going to twice as slow as the count down version. Well, no
that’s not the case. Because in that recursive call, where we have a 29 element
list, we’re going to end up calling bad max twice, on 28 element lists. But if we
call it one twice on a 29 element list and each of those calls it twice on the
twenty-eighth element list and each of those call it twice on the 27 element list
you can see this is actually doubling at each level so its not twice as many calls
its actually exponentially more calls and that is a huge difference. So let me try
to make this point a little better with some slides. here is our code here And
don’t worry a bout the calls to null and head and tail. It turns out those all do
just a little bit of work. They take the beginning of the list, and they either
just say whether it’s empty or not, or return the first element or just return a
reference to the rest of the list. So those are all fast. It’s those recursive
calls that bad max, we have to be careful about. And here’s what it looks like. In
the countdown case, bad max of 50 49 48 calls bad max of 49 48 which calls bad max
of 48 and so on. And if we had a 50 element list, we would end up doing about
50 calls. But in the count upcase. We end up, way if we start with our list making
two calls. Right, once for in the, between the if and the then, and once in the else.
And each of those recalls to the next two calls and each of those recalls the
next two calls. So it looks like one, two, four, eight. If you ever double the number
one 50 times, you’d get an enormous number, which is why even for 30 we saw a
noticeable pause before we finished. I also want to point out that counting up
here is not just the only case that has this problem. If you just have the biggest
number, more then 50 elements from the beginning of your list, it’s gonna take
forever to compute. So if you had a 3,000 element list, yes count down is very fast.
But even if those numbers were randomly shuffled, chances are the maximum element
is not going to be close enough to the beginning of the list for it to compute
efficiently. Okay so it looks like ML has a problem, right? It looks like recursion
is terrible, we should have used a four loop or something. Of course that’s not
how I feel, what we just need to do is avoid doing these repeated computations.
And we know how to avoid doing repeated computation. What we can do is do that
computation and store the result in the variable. And this is the last use I’m
showing in this section of the course of led expressions, and why they’re so
important. So all I want to do is avoid computing bad max twice by remembering its
answer. In a variable. So that’s what good max does. So I really shouldn’t call it
good max because it’s still doing the empty list case completely wrong. But
ignoring that, for nonempty lists it does the right thing. So if the tail of the
list is empty, then just return the head of the list. In the else case, compute the
maximum of the rest of the list and remember it. Here I’ve just stored a
variable tail ans, just to remember what it is. And now, let’s use that. They say
if the head of the list is greater than tail ants, then return head. Otherwise
don’t call good max again, just return tail ends and this will work absolutely
fine. So if I just try good max, oh look my answer for 30 did finish. Good max I
get 30 right away and I can even count up to 3,000, it will work just fine.
Countdown still works correctly as well. It didn’t actually need this, because in
the countdown case, well, it’s taking a minute here I think just because I’ve
created such a large list. We’ll look into that later and I’ll fix the code up for
you. Let’s go back to the slides. Oh, did I oh I know what’s going on here. I did
figure it out. By the way, control C control C to break an infinite loop. When
you call countdown, you’re supposed to put the larger number first. if you do it the
other way, countdown itself is going into an infinite loop because of how I wrote
it. Good backs never got called. There we go all fixed. Okay back to the slides
always fun to mess up on the fly. Alright. So, why is that happening? Why is bad max
taking so much longer? Well, you can just work out the arithmetic. Suppose that the
amount of work bad max does, ignoring the recursion, is say, takes one ten millionth
of a second. I have no idea if that’s anywhere close to right. But one ten
millionth of a second seems pretty fast. Right? Well, then in the case where we
made 50 recursive calls, it’s going to finish in 50 one millionth, ten millionths
of a second. Right. Or half a millionth of a second, faster than humans can tell. But
if we do it in the count up order. We have to multiply tha t one in ten million, by
two to the 50th. And if you work out the arithmatic, that answer for a 50-element
list, is going to take about three and a half years, and for a 55-element list it
would take over a century. So I always like to emphasize that it’s not about
computers getting faster, if you don’t write you code in an efficient way. And in
a language that encourages recursive functions, you need to avoid doing
unnecessary recursion that leads to this exponential blowup where you go one, two,
four, eight, sixteen. Okay. Once we’ve fixed our code like you see here, then
everything works great, because we only ever call good max once on the whole list,
then the tail of the list, then the tail of that list, and so on, thanks to our let
expression. So in both the count up case and the count down case and in fact any
other case, any order of elements in any list, we will just have 50 calls so if we
guess that it takes one ten millionth of a second to do one of these calls, it’s
gonna take 50 ten millionths of a second or one millionth of a second to do all of
them. Okay, so, that is how to use lead expressions combined with recursion to
avoid pathological cases where you’re extremely inefficient, and that concludes
our discussion of lead expressions for

You Might Also Like

1 Comment

  • Reply enditend February 4, 2015 at 1:39 pm

    I still can't understand how Let works , the recusive function first return the very end of the list right? in the "in" expression you compared hd xz with that end of the list , what was it the hd xz? confused..

  • Leave a Reply