
Chapter 27:

Behavioral Research and Data Collection via the Internet

Michael H. Birnbaum

California State University, Fullerton

Ulf-Dietrich Reips

Universität Zürich

Date: 3/01/04

Contact regarding this paper should be sent to Michael H. Birnbaum:

Prof. Michael H. Birnbaum,

Department of Psychology, CSUF H-830M,

P.O. Box 6846,

Fullerton, CA 92834-6846

Email address:

Phone: 714-278-2102

Fax: 714-278-7134

Address for Reips:

Ulf-Dietrich Reips

Department of Psychology

Universität Zürich

Rämistrasse 62

CH-8001 Zürich

Switzerland

* This work was supported by National Science Foundation Grants SBR-9410572, SES 99-86436, and BCS-0129453 to the first author.


In the last decade it has become possible to collect data from participants who are tested via the WWW rather than in the lab. Although this mode of research has some inherent limitations due to lack of control and observation of conditions, it also has a number of advantages over lab research. Many of the potential advantages have been well described in a number of publications (Birnbaum, 1999; 2000a; 2000b; 2001a; 2001b; 2004a; 2004b; Krantz & Dalal, 2000; Reips, 1997; 1999; 2000; 2001a; 2001b; Reips & Bosnjak, 2001; Schmidt, 1997). Some of the chief advantages are that (1) one can test large numbers of participants very quickly; (2) one can recruit large heterogeneous samples and people with rare characteristics; and (3) the method is more cost-effective in time, space, and labor than lab research.

This chapter will provide a handbook-style introduction to the major features of the new approach and illustrate the most important techniques in this area of research.

Overview of Web-Based Research

The process of on-line research can be described as follows: Web pages containing surveys and experiments are made available to participants via the Internet. These Web pages are hosted (stored) on any server connected to the WWW. People anywhere in the world can access the study and submit their data, which are processed and stored in a file on a secure server. The server that “hosts” or delivers the study to the participant and the server that codes and saves the data are often the same computer, but they can be different.

The Web researcher carefully plans the study following guidelines and avoiding pitfalls (Birnbaum, 2001a; Reips, 2002a, 2002b) and creates Web pages and other files containing text, pictures, graphics, sounds, or other media for the study. He or she will upload these files to the host server, and program the data server to accept, code, and save the data. The researcher tests the system for delivering the experiment and for collecting, coding, and saving the data. The Web researcher must ensure that the process is working properly, recruit people to participate in the study, and finally retrieve and analyze the data. Although this process may sound difficult, once a researcher has mastered the prerequisite skills, it can be far more efficient than traditional lab methods.

Basic Information on the Web and Web Pages

The basic unit of content on the World Wide Web (WWW) is the Web page. Web pages are files containing special codes that format and display text, graphics, pictures, sounds, computer programs, and other media. The most widely used system of Web coding is HyperText Markup Language (HTML). Files that contain HTML are simple text files with “tags” that instruct the browser how to display the information and how to link to other files on the Web.

The most basic method of requesting Web files is called HyperText Transfer Protocol (HTTP), and the programs that request documents and display them are called “(Web) browsers,” which include programs such as Safari, Opera, Netscape Navigator, Mozilla, and Internet Explorer, among others (for an overview, see http://browsers.evolt.org/). The Uniform Resource Locator (URL) address is the system for describing where in the WWW a document is located. A person types a URL into the address field of the browser and presses a button. The browser program then acts as a “client,” requesting the file from the Web “server,” which stores and “serves” files on request. The information is sent along the Web, perhaps half-way around the world, from the server to the client, which displays the formatted information in the browser’s “window.”

Psychological Research On the Web

To get an overview of the kinds of psychological studies that are currently in progress on the Web, visit the following links:

http://genpsylab-wexlist.unizh.ch/

http://psych.hanover.edu/research/exponnet.html

http://www.psychologie.unizh.ch/sowi/Ulf/Lab/WebExpPsyLab.html

http://psych.fullerton.edu/mbirnbaum/decisions/thanks.htm

http://psych.fullerton.edu/mbirnbaum/archive.htm

The number of studies conducted via the WWW appears to have grown exponentially since 1995, when psychologists began to take advantage of the new standard for HTML that allowed for convenient data collection (Musch & Reips, 2000). Internet-based research has become a new topic in psychology. The basics of authoring research studies will be described in detail in the next sections.

1. Constructing Web Studies for the Internet

There are many computer programs that allow one to create Web pages without knowing HTML. These programs include Adobe GoLive, Macromedia Contribute, Macromedia Dreamweaver, and Microsoft FrontPage (not recommended), among others. In addition, programs intended for other purposes, such as Microsoft Word, PowerPoint, and Excel, allow one to save documents as Web pages. Although these programs can be useful on occasion, those doing Web research really need to understand and be able to compose basic HTML. While learning HTML, it is best to avoid these authoring programs. If you already know how to use these programs, you can study HTML by using them only in source code mode, which displays the HTML, rather than the “what you see is what you get” Web page.

How to Make a Basic Web Page in HTML

An HTML document is simply a text file containing special codes or “tags” that control the appearance of text, insert media such as pictures, graphics, or sound, and define hyperlinks to other materials on the Web. One can edit and save HTML files with text editors such as NotePad for Windows, TextEdit for Mac, or BBEdit.

When learning how to compose HTML, it is best to use a text editor and not a word processor, such as Microsoft Word. To create a minimal Web page, one can type the “bare bones” example in Table 1 into NotePad, for example, and save the file as MyPage.htm, where MyPage would be any filename you choose, and the extension, .htm, indicates to the browser that it is an HTML file. Do not use a space in a filename to be served on the Web. Although such files may work fine on your own computer, they will not function properly on the Web. The HTML “tags” are shown in Table 1 in capital letters and in bold type to distinguish them from the text that is actually displayed and material that can be altered by the Web author. HTML is not case-sensitive (upper or lower case letters make no difference to HTML) and HTML files contain unformatted text. (Be aware that names of referenced files are case-sensitive: The file MyPage.htm cannot be linked as Mypage.htm or mypage.htm. In addition, programming languages such as JavaScript, Java, and others are also case-sensitive.)

The text editor, NotePad, adds an extension of “.txt” unless the user specifies to save as “Files of All types” in the Save dialog box. In addition, the Open dialog box displays only files ending in .txt unless one chooses Files of All Types. Students who do not remember these two facts can become quite frustrated when working in NotePad.

There are two other ways that beginning students get stuck at this stage. An HTML file can also have the extension of “.html” instead of “.htm”. This bit of freedom traps some students, who name half of their files with extensions of .htm and half with “.html,” forget which is which, and spend hours trying to find their files or understand why their links don’t work. One student used an extension of “.htm1” instead of .html. It took hours to finally discover that the student had substituted a “1” (“one”) for the “l” (letter, L). An important lesson from human factors is to learn from experience and adopt procedures that prevent oneself and one’s students from becoming confused. One solution is to insist that students consistently use only the three-letter extension. A second solution is to use applications for constructing Web experiments that automatically set correct and consistent extensions, for example WEXTOR (Reips & Neuhaus, 2002).

Another sticking point is that when modifying a file in the text editor and then viewing the result in the browser, the student must save the file and then reload (or “refresh”) the file. When they forget to save and reload the file, students think that the changes in HTML they just made had no effect. It helps to remind students that when this happens, they should go back and save the file and reload it in the browser. It also helps to require that all students work from files saved in the same location. That helps prevent the confusion that arises when students create files with the same file names in different folders: They modify one and load the other, and then wonder why the change does not show up.

Insert Table 1 about here.

The examples in this chapter (including Tables 1-5, together with additional material) are linked from the following URL:

http://psych.fullerton.edu/mbirnbaum/handbook/

From this URL, you can load the examples from this chapter in your browser to see how they function. As with any file on the Web, you can examine the HTML by selecting to view the “source” from the “View” Menu of the browser. You can also save the source files and modify them to explore how HTML works.

Most HTML “tags” or commands have an opening and a closing “tag,” contained inside angle brackets to set them apart from the text to be displayed. For example, a page of HTML begins with <html> and ends with </html>. The basic Web page has two parts, the “head”, which contains the title, among other things, and the “body”, which contains the text and other material to be displayed inside the browser’s window. The tags for the “head” are <head> and </head> and the “body” is the material placed between <body> and </body>. The text placed between <title> and </title> is the page’s title, which is displayed at the top of the browser’s window. Figure 1 shows how this basic Web page appears in a browser window.
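As an illustration of the structure just described, a bare-bones page might look like the following sketch. It is not a reproduction of Table 1; the title and the body text are placeholders chosen for illustration only.

```html
<HTML>
<HEAD>
<!-- The title is displayed at the top of the browser's window -->
<TITLE>My First Web Page</TITLE>
</HEAD>
<BODY>
<!-- Material in the body is displayed inside the browser's window -->
This text is displayed in the browser.
</BODY>
</HTML>
```

Typed into a text editor, saved with the extension .htm, and opened in a browser, a file like this should show the single line of body text, with the title appearing at the top of the window.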

Insert Figure 1 about here.

1.1 Using Hyperlinks

Table 2 lists a simple file that illustrates the use of hyperlinks. The basic command for a text hyperlink is <A HREF=URL> Click here to link</A>. The URL can be an “absolute” address, as in the following link to Google:

<A HREF="http://www.google.com">Click here for Google</A>

This link is called “absolute” because it would be the same from any file on the Web.

The URL can also be a relative link, relative to the directory, or folder, in which the current file resides. To link to another file in the same folder (directory), one simply specifies that HREF="filename.htm". To link to a file in a folder within the current folder (a “child” directory of the current directory), one specifies

<A HREF="foldername/filename.htm">Click here</A>

To link to a file in the “parent” directory, one uses the following

<A HREF="../filename.htm">Click here to link</A>

Note that the name of the parent directory is not needed, because there is only one parent directory to any file. To link to a file in a grandparent folder, one uses HREF=../../filename.htm, and so on.
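The relative forms described above can be collected in a single sketch. The names filename.htm and foldername are hypothetical placeholders, so these links will resolve only if files with those names exist in the corresponding folders.

```html
<HTML>
<HEAD>
<TITLE>Relative Links (illustration)</TITLE>
</HEAD>
<BODY>
<!-- Same folder as the current file; <BR> starts a new line -->
<A HREF="filename.htm">A file in the same folder</A><BR>
<!-- Child folder inside the current folder -->
<A HREF="foldername/filename.htm">A file in a child folder</A><BR>
<!-- Parent folder: one level up -->
<A HREF="../filename.htm">A file in the parent folder</A><BR>
<!-- Grandparent folder: two levels up -->
<A HREF="../../filename.htm">A file in the grandparent folder</A>
</BODY>
</HTML>
```

Each additional ../ moves one level higher in the directory tree before the browser looks for the named file.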

In the example of Table 2, there are two equivalent ways to link to Birnbaum’s home page. One uses an absolute link and the other a relative link. One reason to use relative links within your Web site is that should you move your Web site to a new server (after moving from one university to another, for example), all of the relative links will continue to work. Otherwise, you would have to change all of those absolute links to their new addresses whenever you move your site.

The “anchor” tag is used to define a spot within a document that can be linked from within the same file or from another file. In the example, within the main page of links, bare_links.htm, there is an anchor tag, <A NAME="end">. As used here, the anchor tag is an exception to the usual rule that HTML tags have an opening and closing tag; it is written as a single, stand-alone tag that defines a spot in a file. (Strictly speaking, the HTML specification calls for a closing </A>, which can follow immediately, but browsers accept the stand-alone form.) Linking to this spot from within that same file would use the relative link, <A HREF="#end">click here</A>. However, to link to that spot from another file, one uses the link as follows, <A HREF="bare_links.htm#end">click here</A>.
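A minimal sketch of a named anchor and a link to it within the same file follows. The name end comes from the example above; the surrounding text is a placeholder. This sketch closes the anchor with </A>, as the HTML 4 specification asks, although browsers also accept the stand-alone form.

```html
<HTML>
<HEAD>
<TITLE>Anchor (illustration)</TITLE>
</HEAD>
<BODY>
<!-- Link to a named spot later in this same file -->
<A HREF="#end">Jump to the end of this page</A>
(Imagine enough text here to fill more than one screen.)
<!-- The named anchor marks the spot that the link above points to -->
<A NAME="end"></A>
This is the end of the page.
</BODY>
</HTML>
```

From another file in the same folder, the same spot would be reached by placing the file name before the # sign, as in HREF="bare_links.htm#end".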

Two additional HTML “tags” appear in this example. The line return or “break” tag, <BR>, like the anchor tag, is a stand-alone tag. The heading tag creates different sized headings, depending on the number following the H, where H1 is the largest size and H6 is the smallest.
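The following sketch shows headings and line breaks together. The heading text and body text are placeholders chosen for illustration.

```html
<HTML>
<HEAD>
<TITLE>Headings and Line Breaks (illustration)</TITLE>
</HEAD>
<BODY>
<!-- Headings use an opening and a closing tag; lower numbers are larger -->
<H1>The largest heading</H1>
<H2>A somewhat smaller heading</H2>
<H3>A still smaller heading</H3>
<!-- The break tag stands alone and forces a new line -->
First line of text<BR>
Second line of text, which begins on a new line
</BODY>
</HTML>
```

Without the <BR> tag, the browser would run the two lines of text together, because line returns typed in the HTML file are treated as ordinary spaces.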

Insert Table 2 about here.

1.2 Inserting Images in HTML

Table 3 illustrates how to insert images within a Web page. The two most important types of image files are the Joint Photographic Experts Group (JPEG) format, a compression format that works best for photographs, and the Graphics Interchange Format (GIF), which works best for graphics such as line drawings and charts. Consider the image tag in the example,