Analysis of Algorithms

ANALYSIS OF ALGORITHMS

Sorts

a) Insertion Sort

We can represent the list on n integers (or reals) as

A0, A1, A2,···, An-1

In Insertion Sort, we first sort the first 2 elements, A0, A1 (we will assume that A0 is the smallest element of the list, i.e., A0 = -¥), we then compare the third element A2 with A1, and, if A2 is larger o equal to A1, then we stop; otherwise, we swap A2 with A1, and we compare A2 with A0. Since A0 is the smallest possible element, we know that the algorithm will stop.

In general, in the ith iteration the first i elements will be sorted, and Ai will be located (inserted) in the correct position, by comparing it with the elements of the sorted sub-list (from right to left), and every time Ai is smaller than an element of this sub-list, let's say Ar, we swap Ai with Ar , and continue to compare it with the remaining elements. If Ai is larger or equal to an element Ar, we then stop, since the sub list was already sorted and we know that Ai is also larger or equal to Ar, Ar-1, …., A0.

ith iteration:

A0, A1, A2,···, Ai-1 Ai

Example:

Consider the following list of integers

-¥ , 4, 1, 3, 2, 5, 6

First iteration:

|- ¥ | 4

4 is compared with -¥, and we stop because 4 is larger o equal to -¥.

Second Iteration:

| -¥, 4 | 1

1 is compared with 4, and since it is less, we swap it with 4.

-¥, 1, 4,

we then compare 1 with ¥, and we stop.

Third Iteration:

| -¥, 1, 4| 3

We compare 3 with 4, and since is less, we swap them

- ¥, 1, 3, 4

Next we compare 3 with 1, and we stop.

Fourth Iteration:

| -¥, 1, 3, 4| 2

We compare 2 with 4, and swap them. Next we compare 2 with 3, and swap them. Finally, we compare 2 with 1, and we stop:

-¥, 1, 2, 3, 4

Fifth Iteration:

| -¥, 1, 2, 3, 4| 5

5 is compared with 4 and we stop

-¥, 1, 2, 3, 4, 5

Sixth Iteration:

|-¥, 1, 2, 3, 4, 5| 6

We compare 6 with 5 and stop.

-¥, 1, 2, 3, 4, 5, 6

A3: void Insertion (int A[ ], int n) // in reality the elements to be sorted are indexed from

// index 1 to index n

{

int i,j, temp;

A[0]=-32768; //smallest possible integer using 2 bytes integer representation

for (i=1; i<=n, i++) {

j=i;

(1) while ( A[j] < A[j-1]) { // swap

temp=A[j];

A[j]=A[j-1];

A[j-1]=temp;

j--;

}

Worst-Case scenario.

The basic operation in this case is the comparison of two elements in the array, thus in algorithm A3, operation (1) is the only operation that counts, in other words, is the number of times we execute the while statement.

The worst-case scenario is when the list is in decreasing order, thus in this case when we consider the ith iteration, Ai is smaller that the first ith of the sorted sub-list, and then Ai will inserted in the position right after A0, thus to insert Ai , we make i comparisons (or equivalently we execute operation (1) i times).

Since there are n iterations in the algorithm to be executed, we have

Average case scenario.

We will assume that all the elements of the list to be sorted are distinct.

For the ith iteration, since all the elements are distinct, we can see this problem as the problem of inserting Ai, in the first ith urns.

A0 || U1 ||A1 || U2 ||A2 ···, Ai-1 || Ui || Ai

For example if Ai is larger than Ai-1, we will make one comparison and Ai

will be inserted in urn Ui. Similarly if Ai is less Ai-1, but larger than Ai-2 , Ai will be inserted in urn Ui-1, and the algorithm will make 2 comparisons, etc., etc.

Let p(Uj) and t(Uj) represent the probability and the number of comparisons for Ai to fall

in urn Uj.

Since Ai has the same probability of falling in any of the urns:

p(U1) = p(U2) =p(U3) = p(U4)….. p(Ui)= 1/i, and

t(Ui)=1, t(Ui-1)=2, …..t(U2)=i-1, t(U1)=i, thus the average case of inserting Ai is

thus in the average to insert the ith element will take i/2 comparisons.

But we are inserting n elements, so the average case is

So the "order" of the average case is the same as the worst-case, thus no big deal here!.

b) Divide and Conquer

Merge Sort

Suppose that we want to merge (sort) an array containing two sorted halves (i.e., A[low:mid], and A[mid+1,high].

|Alow, ···, Amid|, |Amid+1,···, Ahigh |

We want to merge the two halves into an (auxiliary)array B[low:high]. To obtain a sorted list, we can used two index-pointers, l, that points to A[low], and h, that points to A[mid+1] (i.e., l=low, and h=mid+1). At each iteration, we compare A[l] with A[h], and we move the smallest of the two, to the next available position of the B array. Either l or h will be increased by 1, depending upon which of A[l] or A[h] was the smallest (if A[l]=A[h], the we move either one to B, increasing the corresponding pointer).

Example:

A[ ] = 1 5 6 7 8 11 2 3 5 7 9 11 B[ ]=

^ ^

l h

A[ ] = 1 5 6 7 8 11 2 3 5 7 9 11 B[ ]=1

^ ^

l h

A[ ] = 1 5 6 7 8 11 2 3 5 7 9 11 B[ ]= 1 2

^ ^

l h

A[ ] = 1 5 6 7 8 11 2 3 5 7 9 11 B[ ]= 1 2 3

^ ^

l h

A[ ] = 1 5 6 7 8 11 2 3 5 7 9 11 B[ ]= 1 2 3 5

^ ^

l h

and so on.

The upper limit for the l pointer is mid, while the upper limit for the h pointer is high.

If one of the pointer reaches its limit, the remainder elements of the other half, will be pushed into the B array.

If we consider the comparison as the basic operation (comparison of A[l] with A[h]), we realize that every time we make one comparison, exactly one element is pushed into the auxiliary array B. Thus the maximum number of comparison is high - low, or equivalently, the total number of elements in the array A, minus 1.

Thus, the worst-case of Merge, given a list of n elements is n-1 (we will assume n to simplify).

A4: void Merge (int A[ ], int low, int mid, int high) // we assume that B is a global // variable

{

int l=low, i=low, h=mid+1, k; // i is the index corresponding to array B

while ((l <=mid) & (h <=high)) { // exactly one of the pointers will reach its limit

if (A[l] <=A[h]) {

B[i]=A[l]; // push A[l] to B

l++; //increment l

}

else { // we must A[h] to B

B[i]=A[h];

h++;

}

i++;

} //end while

// now one of the pointer has reached its limit so we push the remaining

// elements into B.

if (l > mid) { // we pushed the remaining elements starting at A[h]

for (k=h: k <=high; k++) {

B[i]=A[k];

i++;

} // end for

else for (k=l; k <= mid; k++) // we push remaining elements starting at A[l]

B[i]=A[k];

i++;

} // end else

// Next we move B[low:high] to A[low:high]

for (k=low; k <=high; k++) {

A[k]=B[k];

} // enf for

} // end algorithm

We are now ready to sort using Mergesort

A12: void Mergesort (int low, int high)

{

if (low < high) { // if the sub-list has more than one element

int mid=(low + high)/2;

(1) Mergesort (low, mid); // we call Mergesort for the first half

(2) Mergesort (mid+1, high); we call Mergesort for the second half

// at this point the two halves are sorted

(3) Merge (A, low, mid, high);

}

} // end algorithm

Example: suppose that A[] = 31,28,17,65,35,42,89,25,45,52

31 / 28 / 17 / 65 / 35 / 42 / 89 / 25 / 45 / 52
17 / 25 / 28 / 31 / 35 / 42 / 45 / 52 / 65 / 89
31 / 28 / 17 / 65 / 35
17 / 28 / 31 / 35 / 65
42 / 89 / 25 / 45 / 52
25 / 42 / 45 / 52 / 89
45 / 52
45 / 52
65 / 35
35 / 65
31 / 28 / 17
17 / 28 / 31
42 / 89 / 25
25 / 42 / 89
31 / 28
28 / 31
35
35
65
65
42 / 89
42 / 89
52
52
45
45
17
17
25
25
42
42
89
89
31
31
28
28

Worst-case scenario

By looking at A4, we realize that the worst-case of processing n elements is equivalent to processing the two halves (lines (1) and (2)), plus to merge the two already sorted halves (of complexity n).

Thus

1) W(n)=W(n/2)+W(n/2)+n,

and when n=1

W(1)=0, since if n=1, A4 returns automatically.

Problem: Solve recurrence 1) in terms of n (Homework 3.2 of Lecture 2)

Heap Sort

This sort uses a data structure called Max-heap (or Priority queue), which is a binary tree with certain properties. The natural way to represent a heap is by using an array as the main data-structure.

Let l represent the level, and d the depth

Properties of a Max-heap binary tree:

1) There are 2l-1 nodes at level l, 1 £ l £ d-1 (i.e., all the levels are full except for perhaps the last level). At level d-1, the leaves are to the right of all internal nodes. The rightmost internal node in level d-1, if it has 1 child, the child must be a left child, and the remaining internal nodes at this level, must have two children (left and right).

2) The key (label) of any node is greater or equal to the keys its children (if any).

Example 1 - Max-heap

In a Max-heap the largest element is on the root.

Max-Heap has two main functions: Insert( ) and Delete( ).

To Insert we “insert” an element at the last available position of the array, say i. This element will stay on this position if its key is less or equal to the parent-node’s key. If the element is in position i (of the array), the index of the parent-node is i/2, thus we compare the key of this element with the key of his parent-node. If the key of the element is less than the key of its parent, then we swap them, and now the element is in position i/2. Thus the element will possibly move-up until it reaches position 1 of the array, corresponding to the root of the binary tree. If then we insert the element in the last position, the maximum number of possible operations (swaps) to place this element in the right location is log2 i.

To Delete we “delete” the element at the root (the one with the largest key in the array and place it in of an auxiliary array). We then delete the last element of the array (decrement an element at the last available position of the array), say i, and decrement the size of the array by one while placing this element at the root (position 1 of the array), in this way we keep the heap structure. If the element is at position i (of the array), the indexes of its children are 2i and 2i + 1, respectively. This element will stay at this position if its left and right children have keys with lesser values; otherwise we swap this element with the child with the largest key. Since we originally placed this element at the root, the maximum number of possible operations (swaps) to place this element at the right location is 2.log2 i.

To sort a list of n numbers A[ ], we insert each element of this array into the Max-heap using the Insert function of Max-heap. We then delete the elements of the Max-heap, one by one into an auxiliary array B[ ], making sure that the deleted elements are placed in descending order.

template <class T>

class MinHeap {

public:

MinHeap(int MSize) {

MaxSize = MSize;

heap = new T[MaxSize + 1];

Size = 0;

}

~MinHeap() { delete[] heap; }

MinHeap<T& Insert(T& x);

MinHeap<T& Delete(T& x);

int Size;

private:

int MaxSize;

T *heap;

};

template <class T>

MinHeap<T& MinHeap<T>::Insert(T& x)

{

if (Size == MaxSize)

throw NoMem();

else

{

int i = ++Size;

while (i != 1 & x < heap[i / 2])

{

heap[i] = heap[i / 2];

i /= 2;

}

heap[i] = x;

return *this;

}

template <class T>

MinHeap<T& MinHeap<T>::Delete(T& x)

{

if (Size == 0) throw OutOfBounds();

x = heap[1]; //root has the smallest key

T y = heap[Size--]; //last element

int vacant = 1;

int child = 2; //make child = left child

while (child <= Size)

{

if (child < Size & heap[child] > heap[child + 1]) ++child;

//right child < left child

if (y <= heap[child]) break;

heap[vacant] = heap[child]; //move smaller child

vacant = child; //new vacant

child = child * 2; // new child of vacant

}

heap[vacant] = y;

return *this;

}