Discrete Structures for Computing
Notes 5
------------------------------------------------------------------------
Chapter 3: The Fundamentals: Algorithms, the Integers, and Matrices
------------------------------------------------------------------------
(( For now, skip the number theory and matrix material. ))
An algorithm is a finite sequence of well-defined steps that solve
a problem. Algorithms are used to solve problems such as
* finding the largest among a set of numbers
* multiplying two integers
etc.
In computer science and computer engineering, we are particularly
concerned with how much computing resources are needed to solve a
problem: i.e., how long an algorithm takes and how much memory it
uses. This is usually called the COMPUTATIONAL COMPLEXITY.
We will study the notational tools used to measure the computational
complexity, the big-O and big-Theta notations.
------------------------------------------------------------------------
3.1 Algorithms
------------------------------------------------------------------------
When given a problem to be solved computationally, we first have
to construct a model that translates the problem into a mathematical
context. After we've done that, we can often use existing results to
help us figure out how to solve the problem. It is often the case
that two seemingly unrelated real world problems actually have the
same basic underlying structure. A trivial example would be:
sorting a set of social security numbers and sorting a set of
temperature measurements -- they are both examples of sorting a set
of elements from a totally ordered set.
Once we have a mathematical model of the problem, we need to come up
with an ALGORITHM: a finite set of precise instructions that solve
the problem.
In this course we will describe algorithms using PSEUDOCODE,
which is intermediate between English prose and some programming language.
The advantages of pseudocode are
* not tied to any particular programming language, so more general
* avoids some of the tedious details that could clutter the description
of the algorithm and thus the intuitive understanding of it
WARNING: Be careful, though, when using pseudocode to make sure that
every step of the pseudocode is well-defined (i.e., can be translated
easily into a real programming language).
Algorithm to FIND MAXIMUM element of sequence a_1, a_2, ..., a_n:
max := a_1
for i := 2 to n
if max < a_i then max := a_i
return max
Desirable properties of this pseudocode description of a maximum-finding
algorithm:
* has well-defined input (a_1,...,a_n) and output (the return statement)
* each step is well-defined and can clearly be translated into an
actual programming language
* termination: eventually returns an answer
* correctness: informally we can see that max is always updated to be
the largest among all the input elements that have been examined
so far
* generality: works for any sequence of elements that can be compared
with < (not just a particular list of numbers)
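The pseudocode above translates almost line for line into a real
programming language. Here is one way to write it in Python (note that
Python lists are 0-indexed, so a[0] plays the role of a_1):

```python
def find_max(a):
    """Return the largest element of a nonempty sequence a."""
    max_val = a[0]           # max := a_1
    for x in a[1:]:          # for i := 2 to n
        if max_val < x:      # if max < a_i then max := a_i
            max_val = x
    return max_val

print(find_max([3, 1, 4, 1, 5, 9, 2, 6]))  # → 9
```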
Another important problem is SEARCHING:
input: a list of elements, a_1 to a_n, and a particular element x.
(for now, assume all the elements are distinct)
output: if x appears in the list, then return the index in the list
where x appears, otherwise return 0
Linear search algorithm:
-----------------------
i := 1
while (i <= n and x != a_i)
i := i+1
endwhile
if i <= n then return i else return 0
Drawback of linear search is that it can be relatively slow: if x is
not in the list, or is toward the end of the list, we have to check
all (or most) of the elements in the list.
Informally speaking, we can see that the running time is proportional
to the number of elements in the list.
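One way to render the linear search pseudocode in Python (returning
1-based indices to match the pseudocode's convention):

```python
def linear_search(a, x):
    """Return the 1-based index of x in list a, or 0 if x does not appear."""
    for i, elem in enumerate(a, start=1):  # i runs 1, 2, ..., n
        if elem == x:
            return i
    return 0                               # x not found

print(linear_search([4, 8, 15, 16], 15))   # → 3
print(linear_search([4, 8, 15, 16], 23))   # → 0
```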
Sometimes the list has an additional property, which is helpful: the
elements are listed in increasing order. In this case we can use
binary search, which can be faster:
Binary search example:
----------------------
search for 19 in the list
1 2 3 5 6 7 8 10 12 13 15 16 18 19 20 22
First, compare 19 with the element roughly in the middle of the list:
1 2 3 5 6 7 8 [10] 12 13 15 16 18 19 20 22
Since 19 > 10, we can "discard" the first half of the list.
Now repeat the process by looking at the element roughly in the middle
of the remaining list:
12 13 15 [16] 18 19 20 22
Since 19 > 16, we can "discard" the first half of this sublist.
Now repeat the process by looking at the element roughly in the
middle of the remaining list:
18 [19] 20 22
Since 19 = 19, we have found 19 in the list, which is at index 14.
So return 14.
Binary search algorithm:
-----------------------
inputs are integer x and array A of integers (indexed from 1 to n)
i := 1 // left endpoint of search interval
j := n // right endpoint of search interval
while i < j
m := floor((i+j)/2) // approx. middle of search interval
if x > A[m] then
i := m+1 // increase left endpoint of search interval
else
j := m // decrease right endpoint of search interval
endwhile
if x = A[i] then return i else return 0
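The same algorithm in Python, run on the example list from above (the
m - 1 adjustments translate the pseudocode's 1-based indexing to
Python's 0-based lists):

```python
def binary_search(a, x):
    """Return the 1-based index of x in sorted list a, or 0 if absent."""
    i, j = 1, len(a)                  # endpoints of the search interval
    while i < j:
        m = (i + j) // 2              # floor((i+j)/2)
        if x > a[m - 1]:
            i = m + 1                 # increase left endpoint
        else:
            j = m                     # decrease right endpoint
    return i if a and a[i - 1] == x else 0

lst = [1, 2, 3, 5, 6, 7, 8, 10, 12, 13, 15, 16, 18, 19, 20, 22]
print(binary_search(lst, 19))  # → 14
```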
------
Informally speaking, we can see that the running time is proportional
to the LOGARITHM of the number of elements in the list, which is a
huge improvement over linear search.
Yet another basic algorithm: SORTING elements of a list.
We've already seen how it can be very useful to have a sorted list:
reduces search time exponentially.
Many different sorting algorithms have been developed, with different
pros and cons and requirements for the input and output.
INSERTION SORT:
Idea is to increase the interval of the array that is in sorted order:
[ ... ] [ .........]
sorted not yet sorted
* take the next element x from the not yet sorted part
* find the correct location for x in the sorted part
* shift down to make room for x in the sorted part
sorted part unsorted part
[ ... ] [ ... ] [x .........]
^
x goes here
Insertion sort example:
----------------------
<< do example on 3, 2, 4, 1, 5 >>
Insertion sort algorithm:
------------------------
input: array A[1..n] of real numbers, n > 1
for j := 2 to n do // j is the index of the start of the unsorted part
m := A[j] // A[j] holds the next number to insert in the sorted part
i := 1
while m > A[i] // use linear search to find where m should go
i := i+1
endwhile
// now i holds the index where m should go
for k := j downto i+1 // shift to make room to insert m
A[k] := A[k-1]
endfor
A[i] := m
endfor
return A
---------------
<< can improve on this version by just doing the downward shift
and checking to see when to stop >>
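As a sanity check, the version of insertion sort given above (linear
search for the insertion point, then a shift) can be run in Python on
the example sequence 3, 2, 4, 1, 5:

```python
def insertion_sort(a):
    """Sort list a in place; mirrors the pseudocode above (0-indexed)."""
    for j in range(1, len(a)):        # j = start of the unsorted part
        m = a[j]                      # next element to insert
        i = 0
        while m > a[i]:               # linear search for m's position
            i += 1
        for k in range(j, i, -1):     # shift right to make room for m
            a[k] = a[k - 1]
        a[i] = m
    return a

print(insertion_sort([3, 2, 4, 1, 5]))  # → [1, 2, 3, 4, 5]
```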
<< skipping material on greedy algorithms and the halting problem >>
Some homework problems for Sec 3.1:
-----------------------------------
#3: Devise an algorithm that finds the sum of all the integers in a list.
Suppose input is A[1..n].
sum := 0
for i := 1 to n do
sum := sum + A[i]
endfor
return sum
#5: Devise an algorithm that takes as input a list of n integers
in nondecreasing order and outputs the list of all values that
occur more than once.
Idea: loop through list. It'll be something like this:
1,2,2,5,6,6,6,8,8,8,8,10,11,20,20
We want to output
2,6,8,20
Suppose input is A[1..n].
for i := 2 to n do
if A[i] = A[i-1] then // repeat
        if (i = 2) or              // first occurrence of this repeated value
           (A[i] != A[i-2])
output A[i]
endif
endif
endfor
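The same idea in Python (0-indexed, so the "first repeat" guard checks
i = 1 instead of i = 2), run on the example list:

```python
def repeated_values(a):
    """Given a nondecreasing list a, return each value that occurs
    more than once, listed exactly once."""
    out = []
    for i in range(1, len(a)):
        if a[i] == a[i - 1]:                  # a repeat
            if i == 1 or a[i] != a[i - 2]:    # first repeat of this value
                out.append(a[i])
    return out

print(repeated_values([1, 2, 2, 5, 6, 6, 6, 8, 8, 8, 8, 10, 11, 20, 20]))
# → [2, 6, 8, 20]
```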
#9: algorithm to determine if a string is a palindrome.
Idea is to compare first and last character for equality,
then 2nd and 2nd-to-last, then 3rd and 3rd-to-last.
Notice issue whether string length is odd or even: if odd, then
middle character doesn't have (or need) a match.
Suppose input is S[1..n].
for i := 1 to floor(n/2) do
if S[i] != S[n-i+1]
return false
endif
endfor
return true
What is going on with the floor(n/2)? If n is even, then floor(n/2)
equals n/2. E.g., if n = 8, then we check S[1] against S[8],
S[2] against S[7], S[3] against S[6], and S[4] against S[5].
If n is odd, then floor(n/2) is n/2 - 1/2.
E.g., if n = 9, then we check S[1] against S[9],
S[2] against S[8], S[3] against S[7], and S[4] against S[6]; no
need to check S[5].
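The palindrome check in Python (0-indexed, so S[i] vs S[n-i+1] becomes
s[i] vs s[n-1-i], with i running from 0 to floor(n/2) - 1):

```python
def is_palindrome(s):
    """Compare each character with its mirror image, as above."""
    n = len(s)
    for i in range(n // 2):            # i = 0, ..., floor(n/2) - 1
        if s[i] != s[n - 1 - i]:       # mismatch: not a palindrome
            return False
    return True                        # middle char (odd n) needs no match

print(is_palindrome("racecar"))  # → True
print(is_palindrome("abca"))     # → False
```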
------------------------------------------------------------------------
3.2 Growth of Functions
------------------------------------------------------------------------
We would like to be able to get a better handle on estimating how
long it takes for algorithms to run.
Some issues:
* How to describe the running time?
As function of the input size.
With the examples above, use the array length.
* Different inputs, even of the same length, might take different
amounts of time.
- best case
- average case (how do we define this?)
- worst case (done most commonly)
* Running time depends on
- programming language
- compiler
- hardware
- level of multiprogramming
Abstract away from these details using ASYMPTOTIC ANALYSIS.
What is asymptotic analysis?
Informally, measure time as a function of input size and
* ignore multiplicative constants
* ignore lower order terms.
Focus on the behavior as the input size (n) goes to infinity.
Motivation/Justification: We are interested in the behavior of the
algorithm in the limit, as the input grows larger and larger.
As the input increases, the effect of lower order terms and even
multiplicative constants becomes negligible.
Also, it gives a common way to compare different algorithms,
independent of how they might be implemented and executed.
We use BIG-O NOTATION as a way to do the asymptotic analysis.
This notation is a mathematically rigorous way to "ignore multiplicative
constants and lower order terms".
DEFINITION: f(x) IS BIG-O OF g(x), written f(x) = O(g(x)), if
there are positive constants C and k such that
f(x) <= C*g(x) for all x > k
<< simplifying def'n by dropping the absolute value signs; let's just
deal with positive functions. C NEEDS TO BE POSITIVE. >>
Example: f(x) = x^2 + 2x + 1 = O(x^2).
Why? Informally, we look at the highest term in the polynomial and
drop the rest. How can we support this formally? We need to find
C and k such that x^2 + 2x + 1 <= C*x^2 for all x > k. There are
an infinite number of C's and k's that will work - we just have to
find one pair.
Intuition is that x^2 grows faster than 2x + 1, so eventually we should
be able to use some extra x^2's to cover the 2x + 1.
I.e., x^2 + 2x + 1 <= x^2 + some additional x^2 terms.
Let's try an extra three x^2 terms, i.e., C = 4.
x^2 + 2x + 1 <= 4*x^2.
Note that when x = 1, the LHS = 4 and the RHS = 4.
The difference between the LHS and the RHS keeps increasing,
so we can set k = 1. I.e.,
x^2 + 2x + 1 <= 4x^2 for all x > 1.
To verify:
Since x > 1, x - 1 > 0 and 3x + 1 > 0.
Thus (3x+1)(x-1) > 0
Thus 3x^2 + x - 3x - 1 = 3x^2 - 2x - 1 > 0
Thus -3x^2 + 2x + 1 < 0
Thus x^2 + 2x + 1 < 4x^2.
<< draw Fig 1 on p. 181 >>
Notice that C = 3 and k = 2 also work, i.e.,
x^2 + 2x + 1 <= 3x^2 for all x > 2.
Why? Since x > 2, we have that x^2 + 2x + 1 < x^2 + x*x + x*x = 3*x^2.
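These inequalities are easy to check numerically. A finite range can't
establish "for all x", so this is a sanity check rather than a proof,
but it exercises both witness pairs:

```python
# Witness pair C = 4, k = 1: x^2 + 2x + 1 <= 4x^2 for all x > 1.
for x in range(2, 10_000):
    assert x**2 + 2*x + 1 <= 4 * x**2

# Witness pair C = 3, k = 2: x^2 + 2x + 1 <= 3x^2 for all x > 2.
for x in range(3, 10_000):
    assert x**2 + 2*x + 1 <= 3 * x**2
```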
-----
Show that 7x^2 = O(x^2).
Set C = 7. Then 7x^2 <= 7*x^2 for all x, so we can set k = 1.
----
Show that n^2 is NOT O(n). This is a little trickier.
What is wrong with this argument?
* If we set C = 3 and k = 2, we have n^2 <= 3*n for all n > 2.
* But this is not true because when n = 4, we would have 4*4 <= 3*4,
which is false.
We just found one pair of C and k that did not work. We have not
yet proved that *no* pair of C and k can work.
Here is how to do that: Suppose in contradiction there is a C and k
that work. I.e., n^2 <= C*n for all n > k.
Dividing both sides by n gives n <= C for all n > k.
But C is a fixed constant and eventually there is a value of n that is
larger than C. So n^2 is not O(n).
THEOREM: Let f(x) = a_n*x^n + ... + a_1*x + a_0. Then f(x) is O(x^n).
PROOF:
f(x) <= |a_n|x^n + |a_{n-1}|x^{n-1} + ... + |a_1|*x + |a_0|
by rules of absolute value (since some of the coefficients might
be negative)
= x^n(|a_n| + |a_{n-1}|/x + ... + |a_1|/x^{n-1} + |a_0|/x^n)
pulling x^n out in front
<= x^n(|a_n| +|a_{n-1}| + ... + |a_1| + |a_0|)
as long as we have x > 1
So we can use k = 1 and C = |a_n| +|a_{n-1}| + ... + |a_1| + |a_0|.
QED
Example: Let's estimate the sum of the first n positive integers:
1 + 2 + 3 + ... + n <= n + n + n + ... + n = n^2 = O(n^2)
Example: Let's estimate the factorial function n!:
n! = 1*2*3*...*(n-1)*n <= n*n*n*...*n*n = n^n = O(n^n)
Example: Let's estimate the logarithm of the factorial function log(n!):
log(n!) <= log(n^n) = n log n = O(n log n).
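A quick numerical check of these estimates over a finite range (again a
sanity check, not a proof):

```python
import math

# Check n! <= n^n and log(n!) <= n log n for small n.
for n in range(2, 100):
    assert math.factorial(n) <= n**n
    assert math.log(math.factorial(n)) <= n * math.log(n)
```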
IMPORTANT FUNCTIONS AND THEIR RATES OF GROWTH:
1, log n, n, n log n, n^2, 2^n, n!
<< draw Fig 3 on p. 187 >>
Combining functions and their big-o estimations:
THEOREM: If f1(x) = O(g1(x)) and f2(x) = O(g2(x)), then
(f1+f2)(x) = O(max(g1(x),g2(x))).
PROOF:
* f1(x) = O(g1(x)) means there exist C1 and k1 s.t.
f1(x) <= C1*g1(x) for all x > k1
* f2(x) = O(g2(x)) means there exist C2 and k2 s.t.
f2(x) <= C2*g2(x) for all x > k2.
* So f1(x) + f2(x) <= C1*g1(x) + C2*g2(x) for all x > max(k1,k2)
* Notice that g1(x) <= max(g1(x),g2(x))
and g2(x) <= max(g1(x),g2(x))
* So f1(x) + f2(x) <= (C1 + C2)*max(g1(x),g2(x)) for all x > max(k1,k2).
* I.e., set C = C1 + C2 and k = max(k1,k2).
QED
COROLLARY: If f1(x) and f2(x) are both O(g(x)), then (f1+f2)(x) is O(g(x)).
THEOREM: If f1(x) = O(g1(x)) and f2(x) = O(g2(x)), then
(f1*f2)(x) = O(g1(x)*g2(x)).
PROOF:
* f1(x) = O(g1(x)) means there exist C1 and k1 s.t.
f1(x) <= C1*g1(x) for all x > k1
* f2(x) = O(g2(x)) means there exist C2 and k2 s.t.
f2(x) <= C2*g2(x) for all x > k2
* Then (f1*f2)(x) = f1(x)*f2(x)
<= C1*g1(x)*C2*g2(x) for all x > max(k1,k2)
                  = (C1*C2)*g1(x)*g2(x)
* So (f1*f2)(x) = O(g1(x)*g2(x)) if we let C = C1*C2 and k = max(k1,k2).
QED
Example: What is the big-o estimate for
f(n) = 3nlog(n!) + (n^2 + 3)log n?
Handwavy way (drop lower-order terms and multiplicative constants):
Earlier example showed that log(n!) is O(n log n).
Thus f(n) = O(3n*n*log n + (n^2 + 3)log n)
= O(n^2 log n).
Rigorous math to back up the handwavy way:
* f(n) = f1(n)*f2(n) + f3(n)*f4(n), where
f1(n) = 3n
f2(n) = log(n!)
f3(n) = n^2 + 3
f4(n) = log n
* f1(n) = 3n is O(n) (simple argument, C = 3 and k = 1)
* f2(n) = log (n!) is O(n log n) (previous example)
* f3(n) = n^2 + 3 is O(n^2)
(simple argument: n^2 + 3 <= n^2 + n^2 for n > 2,
so use C = 2 and k = 2)
* Then use "product theorem" above to get that
f(n) is O(n^2 log n + n^2 log n)
* Then use "sum theorem" above to get that
f(n) is O(n^2 log n).
----
Example: What is the big-o estimate for
f(x) = (x+1)log(x^2+1)+3x^2?
Handwavy way: f(x) is O(x*log x^2 + x^2) = O(x^2).
Rigorous math to back it up:
* f(x) = f1(x)*f2(x) + f3(x) where
f1(x) = x+1
f2(x) = log(x^2+1)
f3(x) = 3x^2
* f1(x) is O(x) (simple arg: x + 1 < x + x for x > 1, so use C = 2 and k = 1)
* f2(x) is O(log x):
x^2 + 1 < x^2 + x^2 for x > 1,
so log(x^2 + 1) < log(2*x^2) = log 2 + 2*log x < 3*log x for x > 2.
So use C = 3 and k = 2.
* f3(x) is O(x^2) (use C = 3, k = 1).
* Thus f(x) is O(x*log x + x^2), which by the sum and product theorems
is O(x^2), since x*log x < x*x.
----
IMPORTANT POINT: big-o gives us UPPER BOUNDS on functions.
I.e., the value of function f is *at most* the value of function g,
for large enough inputs (and ignoring multiplicative constants and
lower order terms).
But it is possible that we are over-estimating the behavior of f
by comparing it to g. For instance, n is O(n^100).
What if we want to know a LOWER BOUND on a function: I.e., we would
like to know that it is *at least* so big? We use a similar notation,
called BIG-OMEGA.
DEFINITION: f(x) is Omega(g(x)) if there exist positive constants C
and k such that f(x) >= C*g(x) for all x > k.
Example: f(x)= 8x^3 + 5x^2 + 7 is Omega(x^3).
Why? 8x^3 + 5x^2 + 7 >= x^3 for all x > 0, so we can use C = 1 and k = 0.
FACT: f(x) is Omega(g(x)) if and only if g(x) is O(f(x)).
So we can pin down the behavior of a function f as lying between
an upper bound, big-o, and a lower bound, big-omega.
For instance, suppose f(x) = x^5. Then f(x) = O(x^7) and f(x) = Omega(x^3).
We can tighten the net even more, though, and show that f(x) = O(x^5)
and f(x) = Omega(x^5). Notice that the upper bound and lower bound
functions are the same! In some sense we have now captured the
(asymptotic) behavior of the function. We say we have a "tight bound" on f.
DEFINITION: If f(x) is O(g(x)) and f(x) is Omega(g(x)), then
we say that f(x) is Theta(g(x)).
Example: Earlier we showed that the sum of the first n positive integers
is O(n^2). Can we show that it is also Omega(n^2)? If so, we will have
a tight bound.
This argument uses a common trick: note that about half the numbers
in the sum are at least n/2. E.g., the sum 1 + 2 + 3 + ... + 7 + 8
has at least four summands that are >= 4 (4, 5, 6, 7, and 8).
So the sum must be at least (n/2)*(n/2) = n^2/4, which in our handwavy
way is Omega(n^2).
To proceed more carefully:
Let m = ceil(n/2).
1 + 2 + 3 + ... + (n-1) + n >= m + (m+1) + (m+2) + ... + (n-1) + n
>= m + m + m + ... + m + m
= (n - (m-1))*m since we dropped the first m-1 terms
>= m*m since m = ceil(n/2)
>= (n/2)(n/2) since m = ceil(n/2) >= n/2
= (n^2)/4
Now notice that n^2/4 is Omega(n^2), by letting C = 1/4 and k = 1.
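We can sanity-check the lower bound numerically (C = 1/4 works for
every n here):

```python
# 1 + 2 + ... + n = n(n+1)/2 should be at least n^2/4.
for n in range(1, 1000):
    s = n * (n + 1) // 2
    assert s >= n * n / 4
```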
THEOREM: If f(x) = a_n*x^n + a_{n-1}*x^{n-1} + ... + a_1*x + a_0,
then f(x) is Theta(x^n).
IMPORTANT NOTATION:
O(1) means constant. I.e., there exists some C such that f(x) <= C
for all sufficiently large x.
Some homework problems for Sec 3.2:
-----------------------------------
#1: Which of these functions is O(x)?
a) f(x) = 10. Yes: 10 <= 1*x for x > 10, so let C = 1 and k = 10.
b) f(x) = 3x + 7. Yes: 3x + 7 <= 3x + x = 4x for x > 7,
so let C = 4 and k = 7.
c) f(x) = x^2 + x + 1. No: We can show f(x) is Omega(x^2).
d) f(x) = 5 log x. Yes: 5 log x < 5 x for x > 1, so let C = 5 and k = 1.
e) f(x) = floor(x). Yes: floor(x) <= x for x > 1, so let C = 1 and k = 1.
f) f(x) = ceil(x/2). Yes: ceil(x/2) <= x for x > 2, so let C = 1 and k = 2.
#19: Give as good a big-o estimate as possible for each function:
a) f(n) = (n^2 + 8)(n + 1) = n^3 + n^2 + 8n + 8 = O(n^3) by polynomial thm.
b) f(n) = (n log n + n^2)(n^3 + 2) = O(n^2 * n^3) = O(n^5)
c) f(n) = (n! + 2^n)(n^3 + log(n^2 + 1))
* n! is O(n^n), which is asymptotically larger than 2^n, so
the first factor is O(n^n).
* log(n^2 + 1) <= log(n^2 + n^2) = log(2*n^2) = log 2 + 2 log n
= 1 + 2 log n = O(log n), so, since n^3 is asymptotically
larger than log n, the second factor is O(n^3).
* Thus f(n) is O(n^n * n^3) = O(n^{n+3}).
#23: For each function below, determine whether it is Omega(x^2):
a) f(x) = 17x + 11. No. We can show that it is O(x).
b) f(x) = x^2 + 1000. Yes: x^2 + 1000 > x^2, so let C = 1 and k = 1.
c) f(x) = x log x. No: for any constant C > 0, (x log x)/x^2 = (log x)/x
   eventually drops below C, so no C and k can make x log x >= C*x^2
   for all x > k.
d) f(x) = x^4/2. Yes: x^4/2 > x^2 for all x > 2, so let C = 1 and k = 2.
e) f(x) = 2^x. Yes: 2^x > x^2 for all x > 2, so let C = 1 and k = 2.
f) f(x) = floor(x)*ceil(x). Yes: floor(x)*ceil(x) > (x-1)*x = x^2 - x
and x^2 - x > x^2 - x^2/2 = x^2/2 for all x > 2, so let C = 1/2 and k = 2.
------------------------------------------------------------------------
3.3 Complexity of Algorithms
------------------------------------------------------------------------
Now we tie together the last two topics, (1) algorithms and (2)
asymptotic analysis using big-oh notation, in order to analyze
how efficient algorithms are, as input sizes get large.
The efficiency of an algorithm is called its COMPUTATIONAL
COMPLEXITY. Focus is on
* time complexity (how long the program takes to run) and
* space complexity (how much memory space the program uses).
Actually for this class, just focus on time.
We will measure time complexity as the number of "basic operations"
that are executed. Recall that we don't want to measure wall clock
time because of impact of compiler, hardware, multiprogramming level, etc.
We will make the simplifying assumption that all operations take
the same amount of time. This is reasonable since the time of
different operations will only vary by a constant factor (e.g.,
one might be 4 times as slow as another), but we are not worrying
about constant factors.
Let's revisit the maximum finding algorithm:
1. max := A[1]
2. for i := 2 to n
3. if max < A[i] then max := A[i]
4. return max
* line 1: 1 op (assignment)
* line 2: for loop condition: i is incremented and tested n times;
last time is when i = n+1 and we fall out of the for loop.
So each time we have 3 ops (one increment, one assignment to i,
and one comparison against n)
* line 3: there are n-1 iterations of the body of the for loop,
each iteration of the body of the for loop involves at most
2 ops (one comparison and one assignment)
* line 4: 1 op (return)
Total number of operations is at most 1 + n*3 + (n-1)*2 + 1 = 5n = O(n).
NOTE: Textbook only counts number of comparisons, so slightly different
in the details.
-------------------------
Let's revisit the linear search algorithm:
1. i := 1
2. while (i <= n and x != A[i])
3. i := i+1
4. endwhile
5. if i <= n then return i else return 0
* line 1: 1 op
* line 2: while loop condition: testing requires 2 comparisons
and an "and"; maximum number of tests is n+1 (when x
is not in the list A)
* line 3: body of while loop is executed at most n times (when
x is not in the list, or x is the last element in the list);
each iteration is 2 ops (addition and assignment)
* line 4: not executable
* line 5: 2 ops (one comparison and one return)
Total number of ops is at most 1 + 3(n+1) + 2*n + 2 = 5n + 6 = O(n).
-------------------------
Let's revisit the binary search algorithm:
1. i := 1 // left endpoint of search interval
2. j := n // right endpoint of search interval
3. while i < j
4. m := floor((i+j)/2) // approx. middle of search interval
5. if x > A[m] then
6. i := m+1 // increase left endpoint of search interval
7. else
8. j := m // decrease right endpoint of search interval
9. endwhile
10. if x = A[i] then return i else return 0
* line 1: 1 op
* line 2: 2 ops
* line 3: each test of the while loop condition takes 1 op;
how many times is the while loop executed? See below.
* line 4: approx 4 ops per iteration (add, divide, floor, assign)
* line 5: 1 op per iteration
* line 6: 2 ops per iteration
* line 7: not executable
* line 8: 1 op per iteration
* line 9: not executable
* line 10: 2 ops
So the total number of ops is
 1 + 2 + 1*(t+1) + (4 + 1 + 2 + 1)*t + 2 = 9t + 6,
where t is the number of iterations of the while loop.
Calculate t:
Suppose n is a power of 2, say n = 2^k.
So k = log_2 n.
At each iteration, either i is increased or j is decreased, and the
length of the search interval, j - i + 1, is cut exactly in half:
   initially the length is n = 2^k
   after iteration 1 the length is 2^{k-1}
   after iteration 2 the length is 2^{k-2}
   etc.
The loop continues until i = j, i.e., until the length is 2^0 = 1,
which happens after k iterations. So t = k = log_2 n.
Thus the number of operations is 9 log n + 6 = O(log n).
*** This is much faster than O(n) from linear search! ***
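To see the logarithmic behavior concretely, the following Python snippet
counts the while-loop iterations; when n is a power of 2 the loop runs
exactly log_2 n times, whether or not x is in the list:

```python
import math

def iterations(n, x):
    """Count while-loop iterations of binary search for x in [1..n]."""
    a = list(range(1, n + 1))
    i, j, count = 1, n, 0
    while i < j:
        count += 1
        m = (i + j) // 2
        if x > a[m - 1]:
            i = m + 1
        else:
            j = m
    return count

for k in range(1, 12):
    n = 2 ** k
    for x in (0, 1, n, n + 1):            # present and absent keys
        assert iterations(n, x) == k      # exactly log2(n) iterations
```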
-------------------------
Let's revisit the insertion sort algorithm.
Version given in book works like this:
for each element A[j] in the list A, starting with A[2] and going to A[n]
do linear search starting at A[1] to find where the current element goes
in A[1..j]
let i be the place where it goes
shift the elements in A[i..j] one place to the right to make room
The drawback of this algorithm is that we do some unnecessary work.
We can combine the searching and shifting. The new algorithm reduces
the running time in some situations and is also easier to analyze.
So here is the improved algorithm:
1. for j := 2 to n do
2. m := A[j]
3. i := j - 1
4. while i > 0 and A[i] > m do
5. A[i+1] := A[i] // shift
6. i := i - 1
7. endwhile
8. A[i+1] := m
* line 1: for loop: test is executed n times, each test is 2 ops;
number of iterations of body is n-1
* line 2: 1 op per iteration of for loop
* line 3: 1 op per iteration of for loop
* line 4: while loop: Each test of condition involves 3 ops.
See below for number of iterations.
* line 5: 2 ops per iteration of while (one addition, one assignment)
* line 6: 2 ops per iteration of while (one subtraction, one assignment)
* line 7: not executable
* line 8: 2 ops per iteration of for loop
Notice that the while loop can be executed a different number of
times in different iterations of the outer for loop. Let
t_2 be the number of iterations of the while loop when j = 2 (index
of for loop),
t_3 be the number of iterations of the while loop when j = 3,
etc.
Now to calculate the t_j's: Worst case is when we have to do the maximum
amount of searching and shifting in each iteration of the for loop.
This occurs when the input array A is in reverse sorted order:
Ex: 6 5 4 3 2 1
When j = 2, shift one array element, A[1]. So t_2 = 1.
When j = 3, shift two array elements, A[2] and A[1]. So t_3 = 2.
When j = 4, shift three array elements, A[3], A[2], and A[1]. So t_4 = 3.
...
When j = n, shift n-1 array elements, A[n-1] through A[1]. So t_n = n-1.
So in general, t_j = j - 1.
Total number of ops is:
n*2 + (n-1)(1 + 1 + 2) + Sum_{j=2}^n ((t_j+1)*3 + t_j*(2+2))
= 9n - 7 + 7*Sum_{j=2}^n (j-1)
= (7/2)*n^2 + (11/2)*n - 7
= O(n^2).
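We can check the worst-case analysis empirically: on a reverse-sorted
input, the total number of while-loop iterations (i.e., shifts) should
be Sum_{j=2}^n (j-1) = n(n-1)/2:

```python
def shift_count(a):
    """Run the improved insertion sort, counting while-loop iterations."""
    a = list(a)
    count = 0
    for j in range(1, len(a)):
        m, i = a[j], j - 1
        while i >= 0 and a[i] > m:    # search and shift combined
            a[i + 1] = a[i]
            i -= 1
            count += 1
        a[i + 1] = m
    return count

n = 50
worst = list(range(n, 0, -1))                    # reverse-sorted input
assert shift_count(worst) == n * (n - 1) // 2    # matches the analysis
```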
---------------
Terminology and some examples:
CONSTANT COMPLEXITY : O(1) (or Theta(1))
running time is independent of input size.
Ex: return the first element of a list
LOGARITHMIC : O(log n)
Ex: binary search
LINEAR : O(n)
Ex: linear search
QUADRATIC : O(n^2)
Ex: insertion sort
POLYNOMIAL : O(n^b) for some constant b
Ex: all of the above examples
EXPONENTIAL : O(b^n) for some constant b
Ex: generate all sets in the power set of some input set
(remember there are 2^n sets in the power set)
------------
To get a feeling for how fast these functions grow with respect
to each other, look at Table 2 on page 198:
* Assume each bit operation can be done in 10^{-9} sec
(i.e., 10^9 = 1,000,000,000, a billion, bit ops per second)
* Consider problem sizes n = 10, 100, 1000, etc.
* See how long it would take an algorithm whose running time
is f(n) bit operations, for
f(n) = log n, n, n log n, n^2, 2^n, and n!
<< copy table >>
Some example exercises:
#7: Algorithm to evaluate a polynomial
a_n*x^n + a_{n-1}*x^{n-1} + ... + a_1*x + a_0 when x = c:
1. power := 1
2. y := a_0
3. for i := 1 to n do
4. power := power*c
5. y := y + a_i * power
6. endfor
7. return y
For example, if the polynomial is 3x^2 + x + 1 and x = 2,
the algorithm does this:
n = 2, a_0 = 1, a_1 = 1, a_2 = 3
power := 1
y := 1
for loop iteration with i = 1:
power := 1*2 = 2
y := 1 + 1*2 = 3
for loop iteration with i = 2:
power := 2*2 = 4
y := 3 + 3*4 = 15
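Here is the algorithm in Python; coefficients are passed lowest-order
first, so [1, 1, 3] encodes 3x^2 + x + 1, and the worked example above
gives 15:

```python
def eval_poly(coeffs, c):
    """Evaluate a_0 + a_1*c + ... + a_n*c^n, coeffs = [a_0, ..., a_n]."""
    power = 1                       # power := 1
    y = coeffs[0]                   # y := a_0
    for a_i in coeffs[1:]:          # for i := 1 to n
        power = power * c           # power := power * c
        y = y + a_i * power         # y := y + a_i * power
    return y

print(eval_poly([1, 1, 3], 2))  # → 15
```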
How many multiplications and additions are done to evaluate a polynomial
of degree n at x = c? (Don't count the additions used to increment the
loop variable.)
For loop body is executed n times. In each iteration two multiplications
and one addition are done. So we have 2n multiplications and n additions.
------
#27: How do the worst-case running times for linear search and binary search
change when the input size doubles (increases from n to 2n)?
linear search: Have to search through an additional n list elements,
   so the worst-case time roughly doubles; it is still O(n).
binary search: Have to do one more iteration of the loop, so the
   worst-case time increases by only a constant; it is still O(log n).