Toolkits in First Year Computer Science:
A Pedagogical Imperative
Richard Rasala
College of Computer Science
Northeastern University
Boston MA 02115
Abstract
Traditional first year computer science courses teach the principles of computing using the basic features of some chosen programming language such as C, C++, Java, Ada, Scheme, or Eiffel. Abstraction and encapsulation focus on entities such as algorithms, functions, data structures, classes, objects, and closures that can be built directly on top of the raw language. If a facility such as windows and graphics is not directly available in the language then it is not used. This means that student exercises tend to look inward at computer science issues rather than outward to the exciting applications that show the breadth and power of computing.
The fundamental thesis of this article is that teaching students in the framework of powerful toolkits is essential to maintain student interest and is pedagogically important precisely because toolkits are a rich source of examples that illustrate the principles of computation. We hope to convince computer science faculty that the use of toolkits is imperative in a modern first year curriculum.
We will first discuss in general why toolkits are important. We will spice this discussion with some simple illustrations and with references to the use of toolkits by faculty at other institutions.
We will then describe the toolkits we have developed at Northeastern University and explain both what they do and why they are pedagogically valuable. We will see how toolkits enable students to do more interesting and effective work and how principles of design and algorithms can be demonstrated by the key components of the toolkits.
We will conclude with some general remarks and explain why the arguments made against toolkits do not have sufficient weight to change our conclusions. We will also give the web site address where our toolkits are available.
1 Why Toolkits?
There are three general arguments that support the view that the use of toolkits is imperative in first year computer science. The first argument is necessity, that is, without toolkits there are certain important computing activities that students would simply not be able to do. The second argument is economy, that is, without toolkits students are forced to repeat the same basic programming tasks over and over. The third argument is pedagogy, that is, without toolkits most examples are toy examples that do not show how the concepts play out and interact on a large scale.
1.1 Why Toolkits are Necessary
We believe that students must be able to generate graphical output. Many years ago, when the computer systems we used could generate only text, we discovered that students were unable to examine textual output carefully enough to recognize subtle bugs. Often students would hand in programs with errors and simply claim that they hadn’t noticed. When we were able to support graphics, we saw that bugs became magnified when presented graphically and that students could no longer ignore their existence.
Graphics are also important because many students are attracted to the discipline by the beautiful images that can be drawn on the computer screen. They do not want to study a 1970s computer science that is merely text based.
In most languages, graphics are not built in, so the only practical way to provide graphics is via toolkits. The graphics tools must be simple yet powerful: easy to use from the first or second week of the course, yet able to support the variety of activities that will occur throughout the year.
It is especially important that ugly system specific details be hidden so that these details do not distract from the computer science principles and give the impression that computer science = technical mess. In fact, the use of toolkits should eventually give the opposite message, namely, that computer science is all about designing layers of abstractions and structures so that the technical mess is isolated and controlled and so that most of the time we can think in terms of the problem to be solved and not in terms of the machine.
In order to provide graphics, it is of course necessary to open and manage graphics windows. This demands other toolkits which hide even lower level system issues. In the opposite direction, graphics leads to animation which leads to issues of scaling images, of timing (delays and pauses), and of user control by the mouse rather than the keyboard. Running experiments leads to the desire to plot data. All of these activities occur often enough that toolkit support is justified.
1.2 Why Toolkits Promote Economy
A certain amount of repetition is educationally valuable, but if students are forced to repeat the same basic steps throughout the year, the work eventually becomes boring and takes time away from the new material that you as faculty want them to focus on. We believe in the use of toolkits to make the basic steps in programming more concise and in the use of frameworks so that students can work on the part of the program where the new concepts are central.
A fundamental principle of toolkit design is that one action in the mind should, if possible, be implemented with one line of code. So, for example, the entire effort behind building a graphics window, initializing its data structures, and making it visible is encapsulated in a single constructor call in our system toolkit:
GraphicsWindow G;
We also have commands that permit the console window and multiple graphics windows to be tiled precisely and easily.
In terms of the graphics calls that draw within a window, we also believe in the utmost brevity. For a simple widget, we insist on one call to tell the widget its parameters and one call to tell it to draw itself. For example, a LineTo widget (which draws a line from the current position to a new location) is used as follows:
LineToWidget L;
L.Set(250, 175).Draw();
Using the trick of returning the object by reference, we can combine the two actions (set and draw) into a single line of code that supports conceptual unity.
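To make the technique concrete, the following is a minimal sketch of how returning the object by reference allows the two calls to be chained. It is an illustration only and not the toolkit source: the class name LineToSketch, its members, and the Log function are hypothetical, and the class records a text description instead of drawing in a real graphics window.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a toolkit widget (not the real LineToWidget).
// It records what it would draw instead of calling a graphics library.
class LineToSketch {
public:
    // Store the endpoint and return *this by reference so that
    // a further member call can be chained onto the same object.
    LineToSketch& Set(int x, int y) {
        x_ = x;
        y_ = y;
        return *this;
    }

    // A real widget would render here; this sketch only logs the request.
    LineToSketch& Draw() {
        std::ostringstream out;
        out << "line to (" << x_ << ", " << y_ << ")";
        log_ = out.str();
        return *this;
    }

    const std::string& Log() const { return log_; }

private:
    int x_ = 0;
    int y_ = 0;
    std::string log_;
};

// Self-check: set and draw are combined into a single line of code.
void ChainingDemo() {
    LineToSketch L;
    L.Set(250, 175).Draw();
    assert(L.Log() == "line to (250, 175)");
}
```

Because Set returns a reference rather than a copy, Draw acts on the original object L and not on a temporary.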
The area of text input-output is a domain that begs for abstraction and encapsulation. If you glance at the sample code in many first year textbooks, you will often see that 50% of the code is taken up with low level input-output. This code teaches nothing about the computer science topic at hand and is a very poor model for industrial practice. No serious software company uses the low level input-output library calls directly because such calls provide no error checking. For example, a faculty colleague who once worked at AT&T (the birthplace of the C programming language) recalled that programmers were forbidden to use scanf. If the inventors of a concept don’t use it then why do we teach it to students?
We believe that the correct approach to input-output is to consider how the input-output will integrate into the control flow of the program, ask what steps can be encapsulated as a single unit, and decide what work must be done to ensure robustness and error recovery. Then all of these elements may be encapsulated into tools that can be called easily and with confidence.
In general, if a sequence of actions is done over and over then it should be abstracted and encapsulated. Later in this paper, we will make this principle concrete when we discuss in detail several of the toolkits we have built to enhance the first year curriculum.
1.3 Why Toolkits Support Pedagogy
In typical textbooks, the best examples of quality code are usually less than a page in length. Such examples focus on the essence of an algorithm, data structure, or object. When larger examples appear in textbooks, the code is more problematic since the main program is often ad hoc and the input-output is both verbose and fragile.
A fundamental advantage of using quality toolkits is that you have a much larger body of well written code from which to draw examples. Moreover, if students have used the toolkits over a series of exercises, they will already understand the toolkit interfaces and will be eager to learn how the tools themselves are built.
For example, if you are introducing the concept of a class, you can utilize simple classes representing a point, a rectangle, or a color. Later, if you are discussing dynamic array allocation, you can present a template array class that safely handles construction, destruction, assignment, index checking, and automatic growth. Even later, when you are discussing inheritance, you can use the hierarchy of graphics widget classes as a motivating example. In general, using toolkits as examples permits you to present serious functions and classes that do real work and allows you to explore the deeper issues in abstraction, layering, and information hiding.
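As an illustration of the kind of array class mentioned above, the following sketch shows one way that construction, destruction, assignment, index checking, and automatic growth might fit together. It is written under our own assumptions and is not the toolkit code: the name SafeArraySketch and all member names are hypothetical, the element type is assumed to be default constructible, and the growth policy (doubling) is simply a common choice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>

// Hypothetical teaching sketch of a growable, index-checked array.
template <typename T>
class SafeArraySketch {
public:
    SafeArraySketch() : data_(nullptr), size_(0), capacity_(0) {}

    // Copy construction: start empty, then append each element.
    SafeArraySketch(const SafeArraySketch& other) : SafeArraySketch() {
        for (std::size_t i = 0; i < other.size_; ++i) Append(other.data_[i]);
    }

    // Assignment by copy-and-swap: the parameter is already a copy.
    SafeArraySketch& operator=(SafeArraySketch other) {
        std::swap(data_, other.data_);
        std::swap(size_, other.size_);
        std::swap(capacity_, other.capacity_);
        return *this;
    }

    ~SafeArraySketch() { delete[] data_; }

    std::size_t Size() const { return size_; }

    // Index checking: an invalid index throws instead of corrupting memory.
    T& operator[](std::size_t i) { Check(i); return data_[i]; }
    const T& operator[](std::size_t i) const { Check(i); return data_[i]; }

    // Automatic growth: the value is taken by copy so that appending an
    // element of the same array stays valid when the storage moves.
    void Append(T value) {
        if (size_ == capacity_) Grow();
        data_[size_++] = value;
    }

private:
    void Check(std::size_t i) const {
        if (i >= size_) throw std::out_of_range("SafeArraySketch index");
    }

    void Grow() {
        std::size_t bigger = capacity_ ? 2 * capacity_ : 4;
        T* fresh = new T[bigger];
        try {
            std::copy(data_, data_ + size_, fresh);
        } catch (...) {
            delete[] fresh;  // do not leak if copying an element fails
            throw;
        }
        delete[] data_;
        data_ = fresh;
        capacity_ = bigger;
    }

    T* data_;
    std::size_t size_;
    std::size_t capacity_;
};

// Self-check: growth past the initial capacity and independent copies.
void SafeArrayDemo() {
    SafeArraySketch<int> a;
    for (int i = 0; i < 10; ++i) a.Append(i * i);
    SafeArraySketch<int> b = a;
    b[0] = 7;
    assert(a.Size() == 10 && a[9] == 81 && a[0] == 0 && b[0] == 7);
}
```

A class of this size is small enough to read in one sitting yet raises each of the issues listed above, which is what makes it useful as a classroom example.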
Of course, to use toolkits as pedagogical examples, they must be written using the highest quality design standards. Every name, every function, every object, every layer, and every comment must be thought through carefully not only for its direct impact on the toolkit performance but also for its pedagogical impact on the students who will read the code.
It is through the examples that you as faculty present to your students that your own philosophy is manifested. If you believe in objects then you will use them. If you believe in small functions that dispatch work to other functions then your code base will exemplify that design style. If you believe in algorithms that are both fast and robust then your code will be as efficient as it can be without sacrificing error checking.
Finally, the use of toolkits supports a fundamental principle of software engineering, namely: Do not accept the raw machine or the raw language as it is. Instead, use software to build the “machine” you would like to work with.
1.4 Notable Toolkits at Other Universities
Our work on toolkits and laboratories has drawn inspiration from efforts at a number of other universities. We would like to mention these before discussing our own approach.
The group of Eric Roberts, Nick Parlante, and others at Stanford [9,14,22,23,24,25] has produced a number of beautiful exercises that are rich in the use of graphics and other tools. It is also noteworthy that Roberts developed a special compiler for C that produces much better error messages for students.
Lynn Stein of MIT [26] has explored multi-threaded programs based on the thread toolkits of Java. She believes that this approach more accurately represents how modern programs are developed than the linear single-threaded model.
Owen Astrachan of Duke [1,2,3] has developed a number of basic tools for C++ which have been used not only at the college level but also in the high school AP courses.
James Cohoon and Jack Davidson of Virginia have created a multi-platform toolkit package called EzWindows to support the teaching of C++ using their textbook [7].
Cay Horstmann of San Jose State [9] and Ursula Wolz of The College of New Jersey [27,28] have independently worked on toolkits to provide simple input-output in Java.
John Stasko of Georgia Tech [6] and Thomas Naps of Lawrence [11,12,13] have developed toolkits for the creation of algorithm animations.
2 The Core Tools
The Core Tools at Northeastern University encompass the following areas: input-output, window creation, graphics, transformations, plots, automatic animation, file handling, and selected data structures illustrating dynamic allocation, traversal, and the use of function objects. Currently the tools are written in C++ for PCs and the Macintosh, and the tools relevant for Java are under active development. In this section, we will explore some of the tools with a focus on design issues and pedagogical implications.
2.1 The Input-Output Tools
The IOTools package focuses on robust keyboard input although it also provides some tools for formatted output. The design of the package is such that the techniques can be easily adapted to input for user-defined classes and for input from dialog boxes rather than from the console data stream. Since this package requires no graphics, it works in C++ on PCs running Windows or Linux and on Macs. It has also recently been successfully ported to Java.
The original motivation for the design of IOTools was to provide robust error checked input so that students could make typographical errors and recover gracefully. We also wanted to compress the statements required to do IO so that the IO clutter typical of most textbook programs could be reduced as much as possible. Later, as we understood our own abstractions better, we realized that input could be integrated with control flow so that developing clean user interfaces could be made quite easy.
Let’s give a simple example of how IOTools is used. The sample problem is the traditional one of summing a number of values entered by the user.
int x;
int sum = 0;
cout << "Enter terms to sum:" << endl;
while (ReadingInt("Term:", x))
    sum += x;
cout << "Sum = " << sum << endl;
The function ReadingInt prints the prompt string and then reads a line of input to obtain a valid number for x. If the input is OK then ReadingInt returns true so the loop body is entered and x is added to sum. If there is an input error then an explanatory message is printed which indicates the exact point of the mistake and the user is prompted to input again. If the user wishes to end the loop, she simply hits return on an empty line and ReadingInt calmly returns false to terminate.
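To show how input and control flow can be combined in this way, the following is a minimal sketch of a function that behaves as described. It is not the IOTools implementation: the name ReadingIntSketch, the extra stream parameters (added so the function can be exercised without a keyboard), and the wording of the error messages are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical sketch in the spirit of ReadingInt. Returns true with x set
// when a valid integer is read, and false on an empty line or end of input.
bool ReadingIntSketch(const std::string& prompt, int& x,
                      std::istream& in = std::cin,
                      std::ostream& out = std::cout) {
    std::string line;
    while (true) {
        out << prompt << ' ';
        if (!std::getline(in, line)) return false;      // end of input

        // Ignore surrounding blanks; a blank line ends the loop.
        std::size_t first = line.find_first_not_of(" \t\r");
        if (first == std::string::npos) return false;
        std::size_t last = line.find_last_not_of(" \t\r");
        std::string text = line.substr(first, last - first + 1);

        std::size_t used = 0;   // characters consumed by the conversion
        try {
            int value = std::stoi(text, &used);
            if (used == text.size()) {
                x = value;
                return true;
            }
        } catch (const std::invalid_argument&) {
            used = 0;           // nothing at the start looked like a number
        } catch (const std::out_of_range&) {
            out << "Number is too large for an int; please try again.\n";
            continue;
        }
        // Report the 1-based column of the first offending character.
        out << "Unexpected input at column " << (first + used + 1)
            << "; please try again.\n";
    }
}

// Self-check: the same summing loop as above, driven by canned input.
void ReadingDemo() {
    std::istringstream in("12\nabc\n 7x\n30\n\n99\n");
    std::ostringstream out;
    int x = 0;
    int sum = 0;
    while (ReadingIntSketch("Term:", x, in, out))
        sum += x;
    assert(sum == 42);  // 12 + 30; the two bad lines are rejected
}
```

The caller's loop contains no error handling at all: bad lines are reported and retried inside the function, and the empty line is what ends the loop.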