Consider the following three examples. What do they all have in common?
Heat milk, marshmallows and chocolate in 3-quart saucepan over low heat, stir constantly, until chocolate and marshmallows are melted and blended. Refrigerate about 20 minutes, stirring occasionally until mixture mounds slightly when dropped from a spoon.
Beat whipping cream in chilled small bowl with electric mixer on high speed until soft peaks form. Fold chocolate mixture into whipped cream. Pour into pie shell. Refrigerate uncovered about 8 hours or until set. Garnish with milk chocolate curls and whipped cream.
From the Quik Mart, you should follow Saddle road for four miles until you reach a stoplight. Then make a left-hand turn at the stop light. Now you will be on Hollow Street. Continue driving on Hollow Street for one mile. You should drive past four blocks until you reach the post office. Once you are at the post office, turn right onto Jackson road. Then stay on Jackson for about 10 miles. Eventually you will pass the Happy Meadow farm on your right. Just after Happy Meadow, you should turn left onto Brickland drive. My house is the first house on your left.
(1) Place the oil pan underneath the oil plug of your car.
(2) Unscrew the oil plug.
(3) Drain oil.
(4) Replace the oil plug.
(5) Remove the oil cap from the engine.
(6) Pour in 4 quarts of oil.
(7) Replace the oil cap.
Each of these examples is an algorithm, a set of instructions for solving a problem. Once we have created an algorithm, we no longer need to think about the principles on which the algorithm is based. For example, once you have the directions to John’s house, you do not need to look at a map to decide where to make the next turn. The intelligence needed to find the correct route is contained in the algorithm. All you have to do is following the directions. This means that algorithms are a way of capturing intelligence and sharing it with others. Once you have encoded the necessary intelligence to solve a problem in an algorithm, many people can use your algorithm without needing to become experts in a particular field.
(4-1) Algorithms are especially important to computers because computers are really general purpose machines for solving problems. But in order for a computer to be useful, we must give it a problem to solve and a technique for solving the problem. Through the use of algorithms, we can make computers “intelligent” by programming them with various algorithms to solve problems. Because of their speed and accuracy, computers are well-suited for solving tedious problems such as searching for a name in a large telephone directory or adding a long column of numbers. However, the usefulness of computers as problem solving machines is limited because the solutions to some problems cannot be stated in an algorithm.
Much of the study of computer science is dedicated to discovering efficient algorithms and representing them so that they can be understood by computers. During our study of algorithms, we will discuss what defines an algorithm, how to represent algorithms, and what makes algorithms efficient.
In the introduction, we gave an informal definition of an algorithm as “a set of instructions for solving a problem” and we illustrated this definition with a recipe, directions to a friend’s house, and instructions for changing the oil in a car engine. You also created your own algorithm for putting letters and numbers in order. While these simple algorithms are fine for us, they are much too ambiguous for a computer. In order for an algorithm to be applicable to a computer, it must have certain characteristics. We will specify these characteristics in our formal definition of an algorithm.
An algorithm is a well-ordered collection of unambiguous and effectively computable operations that when executed produces a result and halts in a finite amount of time .
With this definition, we can identify five important characteristics of algorithms.
(1) Algorithms are well-ordered.
(2) Algorithms have unambiguous operations.
(3) Algorithms have effectively computable operations.
(4) Algorithms produce a result.
(5) Algorithms halt in a finite amount of time.
These characteristics need a little more explanation, so we will look at each one in detail.
(1) Algorithms are well-ordered.
Since an algorithm is a collection of operations or instructions, we must know the correct order in which to execute the instructions. If the order is unclear, we may perform the wrong instruction or we may be uncertain which instruction should be performed next. This characteristic is especially important for computers. A computer can only execute an algorithm if it knows the exact order of steps to perform.
(2) Algorithms have unambiguous operations.
Each operation in an algorithm must be sufficiently clear so that it does not need to be simplified. Given a list of numbers, you can easily order them from largest to smallest with the simple instruction “Sort these numbers.” A computer, however, needs more detail to sort numbers. It must be told to search for the smallest number, how to find the smallest number, how to compare numbers together, etc. The operation “Sort these numbers” is ambiguous to a computer because the computer has no basic operations for sorting. Basic operations used for writing algorithms are known as primitive operations or primitives. When an algorithm is written in computer primitives, then the algorithm is unambiguous and the computer can execute it.
(3) Algorithms have effectively computable operations.
Each operation in an algorithm must be doable, that is, the operation must be something that is possible to do. Suppose you were given an algorithm for planting a garden where the first step instructed you to remove all large stones from the soil, this instruction may not be doable if there is a four ton rock buried just below ground level. For computers, many mathematical operations such as division by zero or finding the square root of a negative number are also impossible. These operations are not effectively computable so they cannot be used in writing algorithms.
(4) Algorithms produce a result.
In our simple definition of an algorithm, we stated that an algorithm is a set of instructions for solving a problem. Unless an algorithm produces some results, we can never be certain whether our solution is correct. Have you ever given a command to a computer and discovered that nothing changed? What was your response? You probably thought that the computer was malfunctioning because your command did not produce any type of result. Without some visible change, you have no way of determining the effect of your command. The same is true with algorithms. Only algorithms which produce results can be verified as either right or wrong.
(5) Algorithms halt in a finite amount of time.
Algorithms should be composed of a finite number of operations and they should complete their execution in a finite amount of time. Suppose we wanted to write an algorithm to print all the integers greater than 1. Our steps might look something like this:
Print the number 2.
Print the number 3.
Print the number 4.
…
While our algorithm seems to be pretty clear, we have two problems. First, the algorithm must have an infinite number of steps because there are an infinite number of integers greater than one. Second, the algorithm will run forever trying to count to infinity . These problems violate our definition that an algorithm must halt in a finite amount of time. Every algorithm must reach some operation that tells it to stop.
When writing algorithms, we have several choices of how we will specify the operations in our algorithm. One option is to write the algorithm using plain English. An example of this approach is the directions to John’s house given in the introduction lesson. Although plain English may seem like a good way to write an algorithm, it has some problems that make it a poor choice. First, plain English is too wordy. When we write in plain English, we must include many words that contribute to correct grammar or style but do nothing to help communicate the algorithm. Second, plain English is too ambiguous. Often an English sentence can be interpreted in many different ways. Remember that our definition of an algorithm requires that each operation be unambiguous.
Another option for writing algorithms is using programming languages. These languages are collections of primitives (basic operations) that a computer understands. While programming languages avoid the problems of being wordy and ambiguous, they have some other disadvantages that make them undesirable for writing algorithms. Consider the following lines of code from the programming language C++ in Fig.4-1.
Fig.4-1 Writing algorithms using programming languages
This algorithm sums the numbers from 1 to 10 and displays the answer on the computer screen. However, without some special knowledge of the C++ programming language, it would be difficult for you to know what this algorithm does. Using a programming language to specify algorithms means learning special syntax and symbols that are not part of standard english. For example, in the code above, it is not very obvious what the symbol “++” or the symbol“<<” does. When we write algorithms, we would rather not worry about the details of a particular programming language.
What we would really like to do is combine the familiarity of plain English with the structure and order of programming languages. A good compromise is structured English. This approach uses English to write operations, but groups operations by indenting and numbering lines. An example of this approach is the directions for changing motor oil in the introduction lesson. Each operation in the algorithm is written on a separate line so they are easily distinguished from each other. We can easily see the advantage of this organization by comparing the structured English algorithm with the plain English algorithm comparing the structured English algorithm with the plain English algorithm in Fig.4-2.
Fig.4-2 Comparison of structured English algorithm with the plain English algorithm
The sorting algorithms share two basic operations in common. These operations are the comparison operation and the swap operation. We will look at each one in more detail before we examine our sorting algorithms.
1) The Comparison Operation
The comparison operation is simply a way of determining which item in a list should come first. If we are sorting a list of numbers from smallest to largest, the comparison operation tells us to place the number with the least value first. If we are sorting a list of letters alphabetically, the comparison operation tells us to place “a” before “b”, “b” before “c”, and so on. We will see that a sorting algorithm must usually perform many comparisons in order to correctly sort a list.
2) The Swap Operation
The swap operation is one way we move items as we are sorting. By swapping small items with large ones, we can place all the items in the correct order. When we use computers for sorting, the swap operation can be a little tricky because of the way computers copy data from one memory location to another. In Fig.4-3, we illustrate the process of swap operation.
Let’s take one more look at the swap algorithm we wrote:
(1) Copy the contents of cell A to temp.
(2) Copy the contents of cell B to cell A.
(3) Copy the contents of temp to cell B.
Notice that this operation requires three copies. It is important to remember that the swap operation is really a combination of copy operations. In our next lesson, we will learn a sorting algorithm called the Simple Sort that uses just the copy operation rather than the swap operation.
Fig.4-3 the process of swap operation
One basic algorithm for sorting is the Simple Sort. We can illustrate this algorithm using a group of unsorted playing cards. The Simple Sort works by selecting the smallest card in the unsorted hand and moving this card to a second hand. Once all the cards have been removed from the unsorted hand, the second hand contains the cards in sorted order.
We can use the same idea as in our Simple Card Sort to write a Simple Sort that can be used by a computer. Let’s see what this algorithm looks like and how it can be used to sort numbers in a computer. The algorithm for the Simple Sort is given in the box below.
1. Get a list of unsorted numbers
2. Repeat steps 3 through 6 until the unsorted list is empty
3. Compare the unsorted numbers
4.Select the smallest unsorted number
5. Move this number to the sorted list
6. Store a maximum value in the place of the smallest number
7. Stop
The steps below illustrate how the Simple Sort algorithm works on a computer.
(1) First, we give the computer a list of unsorted numbers. These numbers are stored in a group of contiguous memory cells called an array. Each memory cell of the array holds a single number.
(2) As the computer sorts these numbers, it will repeatedly compare them to find the smallest number. This is similar to the comparisons we made when sorting our hand of cards. Each time we compared two cards and kept the smaller of the two. Then we compared this card to the remaining cards until we found a smaller one or checked all the cards. The computer uses the same process only with numbers rather than cards. Once the smallest number is found, the computer will copy this number to a new array of memory cells and replace the old number with a special number called MAX. MAX is the largest number a single memory cell can hold. None of the remaining numbers can be larger than MAX, so this number is a good choice for marking memory cells that have already been sorted.
(3) Next, the computer begins searching for the smallest number in the unsorted list. Although it is easy for us to scan the numbers and select the 2 as smallest, the computer must compare all the memory cells in the unsorted array to be certain which number is smallest. This means the computer must perform six comparisons: (7<8), (7>5), (5>2), (2<4), (2<6), and finally (2<3). Once the comparisons are done, the computer copies 2 to the sorted array and replaces the original 2 with MAX.
(4) Now the computer begins searching for the smallest number again. Six more comparisons are required to determine that 3 is smallest: (7<8), (7>5), (5<MAX), (5>4), (4<6), and finally (4>3). Now we can see the importance of replacing 2 with MAX in our previous step. If we had not made this change, then 2 would have been selected as the smallest number again. After copying 3 to the sorted array, the computer also replaces the original with MAX.
(5) With six more comparisons, the computer selects 4 as the smallest number, copies it to the sorted array, and replaces the original with MAX.
(6) The numbers 5, 6, 7, and 8 are also selected in turn by six comparisons, a copy, and a replacement of the original. Once all the memory cells in the unsorted array have been considered, the sorted array contains our original numbers in sorted order.
Now let’s look at how the Insertion Sort algorithm would work inside a computer. Below is our modified algorithm for sorting a list of numbers.
1. Get a list of unsorted numbers
2. Set a marker for the sorted section after the first number in the list
3. Repeat steps 4 through 6 until the unsorted section is empty
4. Select the first unsorted number
5. Swap this number to the left until it arrives at the correct sorted position
6. Advance the marker to the right one position
7. Stop
This time the steps of our modified algorithm are almost identical to the steps in our card sorting algorithm. We have simply substituted numbers for playing cards and a list of numbers for a hand of cards. The steps below illustrate how the Insertion Sort algorithm works on a computer.
(1) First, we give the computer a list of unsorted numbers and store them in an array of memory cells.
(2) To begin the sort, the computer divides the sorted and unsorted sections of the list by placing a marker after the first number. To sort the numbers, it will repeatedly compare the first unsorted number with the numbers in the sorted section. If the unsorted number is smaller than its sorted neighbor, the computer will swap them.
(3) The first number in the unsorted section is 8, so the computer compares it with the number to the left. Since 8 is greater than 7, these numbers do not need to swapped and the computer simply advances the marker one position. Notice that only one comparison was needed to sort the 8.
(4) Now the first number in the unsorted section is 5. 5 is less than 8, so the computer swaps these numbers. 5 is also less than 7, so the computer swaps these numbers as well.
(5) Now 5 is in the correct order, so the computer advances the marker one position. This time two comparisons and two swaps were needed to sort the number.
(6) Now the first number in the unsorted section is 2. 2 is less than 8, 7, and 5, so after three comparisons and three swaps, 2 arrives at the correct sorted position, and the computer advances the sort marker.
(7) Now the first number in the unsorted section is 4. 4 is less than 8, 7, and 5 but it is not less than 2. This time the computer performs four comparisons and three swaps to put the 4 in the correct order. Only three swaps were needed since the 2 and the 4 did not need to be switched. After these comparisons and swaps, the computer advances the sort marker.
(8) Now 6 is the first number in the unsorted section. After three comparisons and two swaps, the computer places the 6 in the correct position between 5 and 7. Notice that the computer did not need to compare the 6 with the 2 or the 4 since it already knows these numbers are less than 5. Once the computer finds a number in the sorted section less than 6, it knows it has found the correct position for 6 and it can advance the sort marker.
(9) The final unsorted number is 3. To find the correct position for 3, the computer must compare it with every number in the unsorted section. However, only five swaps are required since the first number (2) is less than 3. After moving 3 to the correct position and advancing the sort marker, the Insertion Sort is complete since the unsorted section is empty.
Now that you have a general idea of how the Selection Sort works, let’s see how the computer would perform this sort with numbers. Below is our modified algorithm for sorting a list of numbers.
1. Get a list of unsorted numbers
2. Set a marker for the unsorted section at the front of the list
3. Repeat steps 4 - 6 until one number remains in the unsorted section
4. Compare all unsorted numbers in order to select the smallest one
5. Swap this number with the first number in the unsorted section
6. Advance the marker to the right one position
7. Stop
(1) First, we give the computer a list of unsorted numbers and store them in an array of memory cells.
(2) To begin the sort, the computer divides the sorted and unsorted sections of the list by placing a marker before the first number. To sort the numbers, the computer will repeatedly search the unsorted section for the smallest number, swap this number with the first number in the unsorted section, and update the sort marker.
(3) To find the smallest number in the unsorted section, the computer must make six comparisons: (7<8), (7>5), (5>2), (2<4), (2<6), and (2>3). After these comparisons, the computer knows that 2 is the smallest number, so it swaps this number with 7, the first number in the unsorted section, and advances the sort marker.
(4) Now five more comparisons are needed to find the smallest number in the unsorted section: (8>5), (5<7), (5>4), (4<6), and (4>3). After these comparisons, the computer swaps 3, the smallest number in the unsorted section, with 8, the first number in the unsorted section, and advances the sort marker.
(5) This time four comparisons are needed to determine that 4 is the smallest number in the unsorted section: (5<7), (5>4), (4<6), and (4<8). After these comparisons, the computer swaps 4 with 5 and then advances the sort marker.
(6) After three more comparisons, the computer identifies 5 as the smallest unsorted number: (7>5), (5<6), and (5<8). Then the computer swaps 5 with 7 and advances the sort marker.
(7) This time only two comparisons are needed to determine that 6 is the smallest number: (7>6) and (6<8). After these two comparisons, the computer swaps 6 with 7 and then advances the sort marker.
(8) Now we only need a single comparison to find the right position for 7: (7<8). Since 7 is the smallest number and it is also the first number in the unsorted section, the computer does not need to swap this number. It only needs to advance the sort marker. Now there is only one number in the unsorted section, so the list of numbers is sorted and the Selection Sort algorithm is complete.
For any given problem, it is quite possible that there is more than one algorithm that represents a correct solution. A good example of this is the problem of sorting. Dozens of different algorithms have been written to solve this problem. Given such a wide range of solutions, how can we determine which algorithm is the best one to use? To do this, we must analyze our algorithms in such a way that we can gauge the efficiency of the algorithm. Once we have calculated the efficiency of an algorithm, we can compare our measurements and select the best solution.
Let’s analyze the three sorting algorithms in this section and determine which one was the most efficient solution for sorting the list of seven numbers in our examples. To do this we need a way to measure the efficiency of our algorithms. We can actually measure the efficiency in two different ways: space efficiency and time efficiency . An algorithm that is space-efficient uses the least amount of computer memory to solve the problem of sorting. An algorithm that is time-efficient uses the least amount of time to solve the problem of sorting. Since most of the sorting operations are comparisons, copies, and swaps, we can count these operations and use our results as a measure of time efficiency. Tab.4-1 below summarizes our measures of time and space efficiency in sorting.
Tab.4-1 Types of efficiency measures
Let’s begin our analysis by determining which sort was the most space-efficient. We discussed in our previous lesson that space efficiency can be measured by calculating the number of memory cells a particular sort requires. For simplicity, we will assume that a memory cell can hold one number. This means that a sort with five numbers would require at least five memory cells just to store the numbers in the computer. The total number of memory cells needed for sorting will depend on how many additional cells the algorithm requires to order the numbers.
Of the three sorts, the Insertion Sort and the Selection Sort are most space-efficient. These sorts order the items within the list using the swap operation rather than copying items to a new list. Since the Simple Sort copies every item to a new list, it requires twice as much computer memory. Remember that the swap operation requires a temporary memory cell.
Now let’s determine which sort was the most time-efficient. To do this, we will count the number of operations each sort performed. Most of the operations that are done by the computer during sorting fall into two groups: copying numbers or comparing numbers. The algorithm which requires the least copying and comparing is the one that will execute the fastest.
For the Insertion Sort and the Selection Sort, it will be easier to count the number of swaps that are done rather than the number of copies. Remember that the swap operation requires three copies. We can find the total number of copies that the algorithms perform by counting the number of swaps and multiplying by three. The Simple Sort does not use the swap operation, so you can count the number of copies directly.
Now that you have completed your calculations, let’s summarize the results in Tab.4-2 and Tab.4-3. We already know that the Insertion Sort and the Selection Sort were the most space-efficient, but we have yet to determine which sort is the most time-efficient. We will see that this answer is a little more difficult to determine.
Tab.4-2 Space efficiency
Tab.4-3 Time efficiency
Notice that the Simple Sort required the least amount of copies. We would expect this to be true since it does not swap numbers while sorting. Instead the numbers are copied to a new list in the computer. This is a common tradeoff between time and space. Although the Simple Sort loses space efficiency by using two lists, it gains time efficiency because fewer copies are required. Of course this does not mean that it is always best to use the Simple Sort to gain more speed. If we are trying to sort a list of 5 million names the Simple Sort would use too much space in the computer’s memory. It would be much better to swap items within the list rather than create two lists.
For number of comparisons, the Selection Sort and Insertion Sort were nearly the same. The Simple Sort, however, required twice as many comparisons. We can see the reason for this difference by thinking about how the algorithms work. Each algorithm repeatedly searches for the smallest number and then places this number in the correct position. For the Insertion Sort and the Selection Sort, each iteration of this process reduces the unsorted section by one number. During the next search, the computer does not need to make as many comparisons to find the smallest number. The Simple Sort, however, replaces sorted numbers with a marker called MAX. Each time the computer searches for the smallest number, it must compare all seven memory cells. This approach is much less efficient.
Given the particular set of seven numbers we sorted, the Selection Sort was the most time-efficient. However, it is important to understand that this may not be true for every set of seven numbers. Consider the following example.
If we use the Insertion Sort on these numbers only 8 comparisons and 1 swap would be needed to sort them. However, if we use the Selection Sort, 21 comparisons and 1 swap would be needed. In this case, the Insertion sort is more efficient.