Critical Thinking Assessment Test (CAT) Scoring Session

SPC EPI District Office, Room 102

April 20, 2012

8:30 am – 4:00 pm

Attendees:

Janice Thiel, Director, Quality Enhancement Plan
Maryann Todd, Faculty, Natural Sciences
Ashley Hendrickson, Baccalaureate Assessment & Accreditation Coordinator
Anne Marie Keyes, Faculty, Social Sciences
Maggie Tymms, Associate Assessment Director
Joe Smith, Public Safety Institute Coordinator
Tina March, Faculty, Communications
Rosanne Beck, Public Safety Institute Program Development
Diane Reese, Faculty, Communications
Yvonne Williams, Public Safety Institute Curriculum Designer
Abagail Mills, Faculty, Communications
Wendy Shellhorn, Faculty, College of Health Sciences
Daiva Kennedy, Faculty, Mathematics
Teresa Gaskill, Faculty, Natural Sciences
Sharon Williams, Faculty, Communications
Keith Kelly, Faculty, Computer and Information Technology

Overview

“The CAT instrument is a unique tool designed to assess and promote the improvement of critical thinking and real-world problem solving skills. The instrument is the product of extensive development, testing, and refinement with a broad range of institutions, faculty, and students across the country. The National Science Foundation has provided support for many of these activities. The CAT Instrument is designed to assess a broad range of skills that faculty across the country feel are important components of critical thinking and real world problem solving. The test was designed to be interesting and engaging for students. All of the questions are derived from real world situations. Most of the questions require short answer essay responses and a detailed scoring guide helps insure good scoring reliability.” (Tennessee Tech University, Critical Thinking Assessment Test Overview).

In collaboration with Tennessee Technological University and with support from the National Science Foundation, St. Petersburg College (SPC) received a grant to administer the Critical Thinking Assessment Test (CAT) instrument to a representative sample of approximately one hundred students enrolled in the College during 2008. SPC conducted a second administration in 2009, administering the CAT instrument to sixty-six students enrolled in randomly selected classes.

Beginning in 2010, SPC standardized the process by identifying one general education discipline in which to conduct future CAT administrations. In spring 2010, three College Algebra sections and three Elementary Statistics sections were randomly selected, and the CAT instrument was administered to a representative sample of students enrolled in the six courses. The same process was followed in spring 2011 and 2012, resulting in the administration of the CAT assessment in six randomly selected courses each year. The number of CAT assessments scored in spring 2012, by course, is listed in Table 1.
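
The random selection step is straightforward to reproduce. Below is a minimal sketch, assuming a simple random draw of three sections per discipline from each discipline's course schedule; the section pools and identifiers are hypothetical, as the report does not describe the actual sampling mechanism.

    # A minimal sketch of the section-selection step; section pools are hypothetical.
    import random

    college_algebra_pool = [f"MAC 1105-{n}" for n in range(1, 21)]   # hypothetical 20 sections
    elem_statistics_pool = [f"STA 2023-{n}" for n in range(1, 16)]   # hypothetical 15 sections

    # Draw three sections from each discipline without replacement.
    selected = random.sample(college_algebra_pool, 3) + random.sample(elem_statistics_pool, 3)
    print(selected)  # the six sections in which the CAT will be administered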

Table 1

Distribution of Students by Course

Course / Discipline / Scored CAT Assessments
MAC 1105 (1961) / College Algebra / 16
MAC 1105 (493) / College Algebra / 8
MAC 1105 (4577) / College Algebra / 14
STA 2023 (867) / Elementary Statistics / 15
STA 2023 (1060) / Elementary Statistics / 17
STA 2023 (874) / Elementary Statistics / 8
ATE 2653L (0101)* / Veterinary Technology / 22

*This section number was created and assigned because two sections of the course administered the CAT and are, for all practical purposes, being treated as one course.

Spring 2012 Administration

The CAT Scoring Session was held on April 20, 2012 at the District Office located on the EPI Campus of St. Petersburg College. A total of one hundred CAT assessments were scored. Scored assessments included seventy-eight of the one hundred and sixteen eligible assessments administered in math courses, and twenty-two of the thirty-two eligible assessments administered in the Veterinary Technology course. The remaining forty-eight assessments may be scored during a future workshop as time allows. Copies of the CAT Scoring Session agenda and non-disclosure consent form are located in Appendix A and B, respectively.

SPC faculty members were invited to participate in the CAT scoring session during various meetings held between January and April. The majority of participants (10) arrived by 8:30 a.m., with several scorers arriving between 8:40 a.m. and 8:55 a.m. At 8:40 a.m., the Lead Facilitator, Ashley Hendrickson, welcomed everyone, introduced the facilitators, and asked for individual introductions. A brief history was given of SPC’s use of the CAT and its importance in measuring the institution’s Quality Enhancement Plan. The agenda for the day was reviewed, and the workshop began with an overview of the CAT.

The overview was given via the first five minutes of the promotional NSF/CAT video (taken from the CAT Resources CD that is provided at CAT train-the-trainer workshops), while scorers finished the continental breakfast that was provided. The Non-Disclosure Form was then passed around to scorers for their signatures, and the training module was shown to the group, starting with an illustration of how and when to pass tests to the next scorer. The facilitator then reviewed additional directions.

Following the CAT Overview and some group discussion, the CAT scoring session began. During the CAT scoring for each question item, the procedure listed below was followed, beginning with test item number one.

  1. The CAT Training Module, presented on a projection screen, provided the criterion and scoring rubric for a specific test item.
  2. Next, a sample test item was presented on the screen, and the presenter on the training module discussed and scored various responses based on the scoring rubric given for the specific item.
  3. After completing the module for the question, scorers read aloud one or two example responses to the test item from the tests provided, and the responses were discussed and scored based on the scoring rubric.
  4. Lastly, scorers reviewed the responses provided for the specific item on each assessment, and scored them based on the scoring rubric.
  5. Scorers who encountered a response that did not clearly fit the rubric discussed the response with the group for clarification.
  6. Each scorer then passed the scored assessment to the person on their right, and the same test item was scored by a second scorer.
  7. In the event that two scores differed, the assessment was provided to a third scorer and a third score was recorded (a sketch of this adjudication logic follows the list).
  8. When all scoring for the specific test item on all assessments was completed, the assessments were collected, reviewed to ensure that all items had been scored accurately, and were redistributed randomly.[1]
  9. Finally, steps 1 through 8 were repeated for each test item until all assessments were completely scored.
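
The double-scoring logic in steps 6 and 7 can be summarized in a few lines of code. The sketch below is illustrative only: the report does not state how a final score is chosen when scorers disagree, so the resolution rule here (record the score two of the three scorers agree on, else the median) is an assumption.

    # Hedged sketch of steps 6-7: two independent scores per item, with a
    # third scorer adjudicating disagreements. The resolution rule is assumed.
    def record_item_score(first, second, third_scorer):
        """Return the recorded score for one test item on one assessment."""
        if first == second:
            return first                      # step 6: the two scorers agree
        third = third_scorer()                # step 7: obtain a third, independent score
        if third in (first, second):
            return third                      # assumed rule: majority of the three scores
        return sorted((first, second, third))[1]  # assumed fallback: median of all three

    # Example: the first two scorers disagree (2 vs. 3); the third scores 3.
    print(record_item_score(2, 3, third_scorer=lambda: 3))  # -> 3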

A fifteen-minute break was offered in the morning, and a buffet lunch was provided during a one-hour break close to noon. Lunch was followed by a brief discussion with a subset of scorers regarding the morning scoring sessions. The day came to a close at approximately 4:45 p.m., 45 minutes past the scheduled end time.

The 100 graded assessments were returned to Tennessee Tech University, together with the required scoring materials.

Results

The results of the one hundred scored assessments show a mean score of 14.2, with a possible range of scores from 0 to 38. This administration resulted in a maximum score of 29 and a standard deviation of 4.7. There were 44 males and 56 females, varying in age from 16 to 50. At the time of the administration, students had earned between 0 and 180 credits and were enrolled in one of seven different course sections (see Table 1). The assessments were aggregated by gender, age, number of credits earned, course, and grade point average (GPA).
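
The aggregation behind Tables 2 through 6 is a standard group-by summary. A minimal sketch using pandas follows; the file name and column names are hypothetical, since the report does not describe the underlying data set.

    # Hypothetical reproduction of the summaries in Tables 2-6 with pandas.
    import pandas as pd

    df = pd.read_csv("cat_scores_2012.csv")  # hypothetical file: one row per scored assessment

    # Bin ages and credits into the categories used in Tables 3 and 5.
    df["age_group"] = pd.cut(df["age"], bins=[15, 25, 44, 120],
                             labels=["16 to 25", "26 to 44", "Over 44"])
    df["credit_group"] = pd.cut(df["credits"], bins=[-1, 30, 60, 120, 999],
                                labels=["0-30", "31-60", "61-120", "121 or more"])

    # One summary per grouping variable, matching the report's table columns.
    for key in ["gender", "age_group", "credit_group", "course"]:
        summary = df.groupby(key)["score"].agg(["count", "mean", "std", "min", "max"])
        print(summary.round(1))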

The mean score of female students was slightly higher than that of male students, as seen in Table 2. This finding is consistent with results from previous years’ CAT administrations, although the sample sizes, minimum scores, and standard deviations for male and female students were closer in 2012 than in the previous three years.

Table 2

CAT Score by Gender

CAT Score 2012 by Gender
Gender / Total / Mean / Standard Deviation / Minimum / Maximum
Male / 44 / 13.5 / 4.9 / 6.0 / 29.0
Female / 56 / 14.7 / 4.4 / 7.0 / 29.0

Students were divided into three age categories, based on standard college student age ranges: ‘16 to 25’, ‘26 to 44’, and ‘Over 44’. Differences were noted in the mean score of students based on age, as seen in Table 3. Consistent with previous years’ results, the mean for the ‘26 to 44’ age group exceeded the overall mean of 14.2, while the average scores for the youngest and oldest age groups were slightly lower.

Table 3

CAT Score by Age

CAT Score 2012 by Age
Age Range / Total / Mean / Standard Deviation / Minimum / Maximum
16 to 25 / 80 / 13.9 / 4.4 / 6.0 / 29.0
26 to 44 / 19 / 15.5 / 5.6 / 7.0 / 29.0
Over 44 / 1 / 14.0 / n/a / 14.0 / 14.0

The scores from the CAT were also compared by GPA. The twenty-two students who had achieved a GPA of less than 2.5 had the lowest mean score, 11.7. As might be expected, the twenty-five students with a GPA between 3.5 and 4.0 had the highest mean score, 15.9, surpassing the overall mean of 14.2. One student had no GPA available (no credits completed as of spring 2012), making the total number of students in this comparison ninety-nine (99).

Table 4

CAT Scores by GPA

CAT Score 2012 by GPA
GPA / Total / Mean / Standard Deviation / Minimum / Maximum
Less than 2.50 / 22 / 11.7 / 4.5 / 6.0 / 21.0
Between 2.50 and 3.00 / 25 / 14.0 / 5.0 / 8.0 / 29.0
Between 3.01 and 3.50 / 27 / 14.8 / 3.6 / 7.0 / 21.3
Between 3.51 and 4.00 / 25 / 15.9 / 5.0 / 6.0 / 29.0

Figure 1. Illustration of the increase in CAT scores with GPA

Students were also divided into categories based on the number of credits earned. While an attempt was made to keep the groups close in size, the categories were based on the typical number of credits for each class standing (freshman, sophomore, junior, senior). All seven sections assessed in spring 2012 were lower division courses (courses that are part of a general education or associate’s degree curriculum, which typically requires approximately 60 hours); however, 33 students had earned 61 or more credit hours, and 29 students reported their class standing as either junior or senior.

The groups were: 0-30 credits earned, which made up 40% of the students; 31-60 credits earned, 27%; 61-120 credits earned, 22%; and 121 or more credits, 11%. As credit hours earned increase, the mean and minimum scores generally increase as well, as seen in Table 5. A difference in mean scores is observed between the lower division (0-60) and upper division (61-120) levels.

Table 5

CAT Score by Credits

CAT Score 2012 by Credits Earned
Credits / Total / Mean / Standard Deviation / Minimum / Maximum
0-30 / 40 / 13.0 / 4.2 / 6.0 / 23.0
31-60 / 27 / 13.5 / 5.0 / 6.0 / 29.0
61-120 / 22 / 16.2 / 5.3 / 7.0 / 29.0
121 or more / 11 / 16.0 / 2.4 / 12.0 / 20.0

Student scores were also aggregated by the course section in which students were enrolled for the administration of the CAT. Students enrolled in the ATE 2653L course attained a higher mean score than students enrolled in the other two course types. The mean score for this group was 16.1, with a minimum of 9 and a maximum of 29, as shown in Table 6.

Table 6

CAT Scores by Course Section

CAT Score 2012 by Course Section
Course / Total / Mean / Standard Deviation / Minimum / Maximum
ATE 2653L: Animal Nursing & Medicine Lab / 22 / 16.1 / 4.3 / 9.0 / 29.0
MAC 1105: College Algebra (Section 493) / 8 / 14.6 / 5.6 / 6.0 / 23.0

MAC 1105: College Algebra (Section 1961) / 16 / 13.8 / 6.0 / 6.0 / 29.0
MAC 1105: College Algebra (Section 4577) / 14 / 13.4 / 4.7 / 7.0 / 23.0
College Algebra Total / 38 / 13.8 / 5.3 / 6.0 / 29.0
STA 2023: Elem Statistics
(Section 867) / 15 / 12.6 / 3.4 / 7.0 / 18.0
STA 2023: Elem Statistics (Section 874) / 8 / 14.0 / 2.9 / 12.0 / 21.0
STA 2023: Elem Statistics (Section 1060) / 17 / 13.9 / 4.8 / 6.0 / 23.0
Elem Statistics Total / 40 / 13.4 / 3.9 / 6.0 / 23.0

As illustrated in Table 6, the Veterinary Technology course had the highest mean score of the three course types. This finding was explored further by looking at the number of credit hours earned in each course type. All twenty-two students enrolled in the Vet Tech course, ATE 2653L, had earned at least enough credits to be considered a typical college senior (90-120), with nine students having earned 121 or more credits, as shown in Table 7, below (a code sketch reproducing this cross-tabulation follows the table). This finding is in line with the increase in mean scores from the 31-60 credit hour group to the 61-120 hour group, as shown in Table 5, above.

Table 7

Credit Hours earned by Course Type

Number of Earned Credit Hours by Course Type
Credits Earned / Total / ATE 2653L: Veterinary Technology / MAC 1105: College Algebra / STA 2023: Elem Statistics
0-30 / 40 / 0 / 25 / 15
31-60 / 27 / 0 / 10 / 17
61-120 / 22 / 13 / 3 / 6
121 or more / 11 / 9 / 0 / 2
Total / 100 / 22 / 38 / 40
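
Table 7 is a straightforward cross-tabulation. A self-contained sketch follows, using a few illustrative rows rather than the actual data:

    # Counts of students by credit group and course type (cf. Table 7),
    # built from a small inline sample; the rows here are illustrative only.
    import pandas as pd

    sample = pd.DataFrame({
        "credit_group": ["0-30", "0-30", "31-60", "61-120", "121 or more"],
        "course": ["MAC 1105", "STA 2023", "STA 2023", "ATE 2653L", "ATE 2653L"],
    })
    counts = pd.crosstab(sample["credit_group"], sample["course"],
                         margins=True, margins_name="Total")
    print(counts)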

National Comparisons

As of 2011, Tennessee Tech has made national data available for comparisons with other institutions administering the assessment. The Lower Division National Norms include approximately 3,900 students from 30 institutions, including private institutions, large public Research 1 institutions, small regional schools, and others. The Lower Division norms are compared to all SPC students, to each course type, and to each section. Table 8 illustrates SPC results by question, compared to the national norms.

Table 8

Comparison of All SPC Students and Lower Division National Norms

SPC students achieved mean scores higher than the national means on all fifteen question items. Ten of the fifteen mean scores are significantly higher than the national average (p < .05), although SPC’s overall mean score (14.2) is only slightly higher than the national average (13.7).
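
The report does not state which statistical test produced the p < .05 result. A common choice for comparing a sample mean to a published norm is a one-sample t-test; the sketch below assumes that approach, with hypothetical per-student scores for a single one-point item.

    # Hedged sketch: one-sample t-test of an SPC item mean against the national norm.
    from scipy import stats

    spc_item_scores = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]  # hypothetical scores on a 1-point item
    national_item_mean = 0.58                          # hypothetical published norm

    t_stat, p_two_sided = stats.ttest_1samp(spc_item_scores, popmean=national_item_mean)
    # One-sided check that SPC's mean exceeds the norm at the .05 level.
    significant = (t_stat > 0) and (p_two_sided / 2 < 0.05)
    print(f"t = {t_stat:.2f}, one-sided p = {p_two_sided / 2:.3f}, significant: {significant}")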

Comparisons by Attainable Points

Outlined below in Tables 9 and 10 are the items where SPC scored the highest and the lowest, based on the “Average % of Attainable Points.” The “Average % of Attainable Points” expresses the mean score as a percentage of the total number of points attainable for the item. For instance, Question 10 has a 4-point maximum; the mean score of 3.15 is 79% of the 4 points.
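
For reference, the arithmetic reduces to a one-line calculation (the names here are illustrative):

    # Mean score expressed as a percentage of an item's attainable points.
    def pct_attainable(mean_score, max_points):
        return 100.0 * mean_score / max_points

    print(round(pct_attainable(3.15, 4)))  # Question 10 -> 79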

Table 9

Four Highest-Scoring Question Items

Question item / Points Available / Average % of Attainable Points / SPC Mean Score / National Mean Score
Q12. Use basic mathematical skills to help solve a real-world problem. (n=95) / 1 / 88% / 0.88 / 0.75
Q10. Separate relevant from irrelevant information when solving a real-world problem. (n=97) / 4 / 79% / 3.15 / 3.01
Q1. Summarize the pattern of results in a graph without making inappropriate inferences. (n=100) / 1 / 62% / 0.62 / 0.58
Q5. Evaluate whether spurious information strongly supports a hypothesis. (n=100) / 1 / 62% / 0.62 / 0.52

Table 10

Four Lowest-Scoring Question Items

Question item / Points Available / Average % of Attainable Points / SPC Mean Score / National Mean Score
Q2. Evaluate how strongly correlational-type data supports a hypothesis. (n=100) / 3 / 31% / 0.94 / 0.69
Q3. Provide alternative explanations for a pattern of results that has many possible causes. (n=100) / 3 / 31% / 0.93 / 0.67
Q13. Identify suitable solutions for a real-world problem using relevant information. (n=97) / 3 / 33% / 1.00 / 0.75
Q14. Identify and explain the best solution for a real-world problem using relevant information. (n=100) / 5 / 37% / 1.87 / 1.65

Note: Data in Tables 9 and 10 are displayed using two decimal places due to the proximity of the values.

Conclusion

The second major requirement in meeting the accreditation standards of the Southern Association of Colleges and Schools (SACS) is a quality enhancement plan (QEP). The QEP addresses a significant issue related to student learning, is faculty-driven, and involves broad-based participation. Critical thinking has been the QEP focus at SPC. One of several measures that will assist the institution in assessing SPC’s ability to carry out the QEP, and that could have positive implications as an indicator for the college, is the finding related to credit hours. There are some indications of a relationship between the number of credits earned and a student’s score on the CAT.

These results suggest an increase in critical thinking skills for students who have completed 61 or more credits of coursework. Although the number of SPC students in this category is limited (n=33), further review of these variables by course type also indicated that a higher number of credits is related to a higher total score. All students in the Vet Tech course, which had the highest mean score across the seven sections, had completed 90 or more credit hours.

Despite the limitations in the data collection, the overall CAT administration and scoring processes were highly beneficial to St. Petersburg College. The faculty who received the training and had the opportunity to use the scoring rubric now have transferable skills they can apply with their students. The administrators and faculty who conducted the training are able to continue providing professional development to faculty. The continued use of quantifiable instruments to evaluate the implementation of the critical thinking initiative is another example of SPC’s Institutional Effectiveness model for continuous improvement at the college.

Reference:

Tennessee Tech University. Critical Thinking Assessment Test Overview. Retrieved on July 17, 2008.

Appendix A: CAT Scoring Session Agenda

CAT Scoring Workshop

Friday, April 20, 2012

8:30 a.m. - 4:00 p.m.

District Office – DO-102

Agenda

Time / Activity
8:30 / Morning Snacks and Introduction to the CAT
9:00 / CAT Scoring Session (Questions 1 and 2)
10:30 / Break
10:45 / CAT Scoring Session (Questions 3 and 4)
12:15 / Lunch, Discussion, Review
1:15 / CAT Scoring Session (Questions 5 – 9)
2:30 / Break
2:45 / CAT Scoring Session (Questions 10 – 15)
4:00 / End CAT Scoring Session

Appendix B: Non-Disclosure Agreement

Appendix C: Data Anomalies


[1] Note: Due to various circumstances, including an inexperienced group of scorers and less time scheduled for the workshop than needed, the facilitators did not have sufficient time during the afternoon sessions to thoroughly review every assessment to ensure scoring accuracy. As a result, Tennessee Tech identified data anomalies, which are included in Appendix C.