A. Efficiency of Algorithms
B. Run-time Efficiency
C. A Formal Definition of Big-O Analysis
D. Determine the Big-O of an Algorithm
E. Best-case, Average-case, and Worst-case Run-time
Consider the following very inefficient search algorithm to locate an element s in an array a[0..n-1].
found = false
k = 0
while not found and k < n - 1
{
    k = k + 1
    for (i = 0; i < k; i++)
        if (a[i] == s)
            found = true
}
There are many measures of efficiency, but two common measures are space (amount of memory used) and time. If we are interested in space efficiency, we can compare the space used by each of several algorithms that solve the same problem. If one algorithm requires less space than another, then in some sense it is better. For example, consider the following two algorithms for reversing the order of the elements in an array a[0..n-1].
//Reverse 1:
for (i = 0; i < (n/2); i++)    // go through half the elements
{
    save = a[i];               // swap 1st with nth, 2nd with n-1st, etc.
    a[i] = a[n - i - 1];
    a[n - i - 1] = save;
}

//Reverse 2:
// Copy the 1st array to a 2nd array, but in reverse order.
j = n - 1;                     // j runs backward through b[0..n-1]
for (i = 0; i < n; i++)
{
    b[j] = a[i];
    j = j - 1;
}
// Copy b[0..n-1] into a[0..n-1].
for (i = 0; i < n; i++)
    a[i] = b[i];
In this lab we will not measure the space efficiency of an algorithm; instead we will measure the efficiency of an algorithm by estimating its run-time. Run-time means the time it takes the CPU to execute an implementation of the algorithm. Run-time is different from the wall-clock time required to run your program, since there will be start-up time, time-sharing time, I/O time, and so on. The C++ compiler translates a C++ instruction into a group of machine language instructions (probably about 10 machine instructions for each C++ instruction). Each computer has a "speed," often measured in GHz (gigahertz), which gives the machine's speed in billions of machine cycles per second. High-powered graphics workstations may run faster than 3 GHz; PCs often run between 1 and 3 GHz. Thus wall-clock time would depend on what computer was being used. It can also depend on which compiler is used (one compiler may produce a more efficient set of machine instructions), on the time of day, and so on. Thus wall-clock time is not the best measure of "time" efficiency.
There are many ways to measure "run-time" besides "CPU time of execution." We could, for instance, count the number of machine instructions that are executed when the algorithm runs and expect that another algorithm that requires more machine instructions would be less efficient. This approach is more appealing as it is machine (speed) independent. That is, the efficiency of the algorithm is not affected by the speed of the machine on which you execute it. But who wants to count machine instructions?
Relatively speaking we could just as well count the C++ instructions
or the pseudo-code steps that are executed as the algorithm runs.
The number of C++ instructions should give us a good measure of the number
of machine instructions. This is how we will measure the run-time
efficiency of an algorithm. Usually the "number of steps" depends
on the number n of inputs to the algorithm. For instance,
if you are searching an array, it will usually take fewer steps if the size
n of the array is smaller.
A notation we use to give an approximation to the run-time efficiency of an algorithm is called Big-O notation (The O is for order of magnitude of operations or space at run-time. Some books use Big-Oh.) First we try to understand the notation. We say a function f(n) (f(n) is the number of steps executed for an algorithm with n inputs) is Big-O of g(n) and write
f(n) = O(g(n)) (which is read "f of n is Big-O of g of n" or "f of n has order g of n") if there exist a positive real constant C and a nonnegative integer m such that

| f(n) | ≤ C * | g(n) | for all n > m.

This means that the number of steps for executing an algorithm on an input of size n is less than or equal to some positive constant C times g(n) for all but finitely many values of n. For example, 10x + 5 = O(x^2). It is also true that 10x + 5 = O(x) (hint: let C be 20; then 10x + 5 ≤ 20x for all values of x ≥ 1/2). On the other hand, x^3 ≠ O(x^2). Think about why.
The Big-O of a function is a relative measure of how fast the function
grows with respect to n. To apply Big-O to an algorithm we let f(n)
= the number of steps (or instructions) that are executed when the algorithm
runs on n inputs. There are certain Big-Oh values that occur
frequently in the analysis of algorithms. Here is a partial list
in increasing "size" and the approximate value of |f(n)| for a few values
of n. Notice the change in orders of magnitude.
n         | O(1) | O(log2 n) | O(n)      | O(n log2 n) | O(n^2) | O(2^n)      | O(n!)
10        | C    | 3         | 10        | 30          | 10^2   | 10^3        | 4*10^6
100       | C    | 7         | 100       | 700         | 10^4   | 10^30       |
1000      | C    | 10        | 1000      | 10^4        | 10^6   | 10^300      |
1,000,000 | C    | 20        | 1,000,000 | 2*10^7      | 10^12  | 10^300,000  |
The last column could not be completed because the values were too big to calculate!
O(1) means the number of steps to execute the algorithm is some constant and does not depend on n, the size of the input. This is called constant time.
O(log2 n) means the number of steps is bounded by log2 n. Such an algorithm is very efficient. This is called logarithmic time.
O(n) means the time is directly proportional to the size of the input. This is called linear time.
O(n^2) is quadratic time.
O(n^3) is cubic time.
O(2^n) is exponential time.
If an algorithm is O(n^m) for a constant m, it is called a polynomial-time
algorithm. A polynomial-time algorithm is generally considered
an efficient algorithm (for small values of m).
Algorithms with exponential time are generally
too inefficient or costly to run on a computer.
Consider the following algorithm to calculate the sum of the n elements in an integer array a[0..n-1].
Consider the question: How many steps are executed when this algorithm runs?
The answer is: 2*n + 3
Why? Line 1 and line 4 execute one time each. Line 3 executes n times since it lies inside the for loop. Line 2 causes the counter i to be changed and tested n + 1 times (one extra for the final test when i = n). Thus the total is 1 + (n + 1) + n + 1 = 2*n + 3.
The polynomial 2*n + 3 will be dominated by the 1st term as n (the number of elements in the array) becomes very large. In determining the Big-O category we ignore constants such as 2 and 3. Therefore the algorithm above is order n which is written as follows:
f(n) = O(n)
In other words the run-time of this algorithm increases at roughly the rate of the size of the inputs (array size) n.
Now consider an algorithm for finding the largest element in a square two-dimensional array a[0..n-1][0..n-1].
Line 1 and line 5 execute one time each. For each execution of line 2 (each time row changes), line 3 executes n+1 times (col is changed and tested n+1 times) and line 4 executes n times. Since the block of statements containing lines 3 and 4 executes n times (row = 0 to n-1), we see the number of steps executed in this algorithm is 1 + (n + 1) + n*(n + 1) + n*n + 1 = 2*n^2 + 2*n + 3.
The n^2 term will dominate the polynomial; therefore the Big-O of the algorithm is f(n) = O(n^2).
As a third example, consider the algorithm below that calculates the sum of the powers of 2 that are less than n.
Now we must determine K. From the algorithm, we see that K is the largest integer such that 2^K < n. In other words, K should be such that 2^K is approximately equal to n. Suppose, for example, that n = 2^K; then K = log2 n, so we may say K is approximately log2 n. Thus the number of steps executed by this algorithm is approximately (that's all Big-O's are anyway)

f(n) = 4 + 3 * K = 4 + 3 * log2 n = O(log2 n)
In general if the number of steps needed to carry out an algorithm can be expressed as follows:
f(n) = a_k * n^k + a_(k-1) * n^(k-1) + ... + a_1 * n + a_0

then f(n) is a polynomial in n with degree k and f(n) = O(n^k). To obtain the order of a polynomial function we use the term with the highest degree and disregard any constants and terms with lower degrees.
To determine the order of an algorithm, look at the loops. If the algorithm contains one loop of the form

for (i = 0; i < n; i++)

then the loop will cause the statements in the loop to be executed n times. Thus the number of steps is approximately n * the number of instructions in the loop. Therefore if the loop causes the most steps to be executed in the algorithm, then the algorithm is O(n). Nested loops such as

for (i = 0; i < n; i++)
    for (j = 0; j < n; j++)

are O(n x n) or O(n^2).
Some algorithms don't execute a fixed number of steps even for a fixed value of n. For example, a search algorithm may stop after one step if it finds the value it is searching for on the first comparison. On the other hand, it may search through all elements of the array and still not find the value being sought. In such cases we usually calculate a best-case (the fewest steps executed for an input of size n), average-case (the average number of steps executed over all inputs of size n), and worst-case (the greatest number of steps executed for an input of size n) run-time for the algorithm.
For a sequential search of an array a[0..n-1] with n elements:
best-case: element searched for appears in 1st position
1 comparison and the item is found so O(1)
average-case: element searched for near middle of array
n/2 comparisons are required so O(n)
worst-case: element searched for in last array element
n iterations of search loop so O(n)