Editor’s Note: This is an extension of an earlier study by the same authors published in this Journal. It compares the results of proctored and non-proctored tests, a question with special relevance for large distance learning programs. One of our editors believes that these concerns could also be addressed through alternative forms of evaluation such as portfolios and actual performance.

Leveling the Playing Field for Online Testing

Patricia Royal, Paul Bell

USA

Abstract

The purpose of this research study was to follow up an original study that tried to determine whether a relationship exists between test performance and test delivery method, particularly for students taking proctored versus un-proctored online exams. The follow-up study replicated the previous research using the same students, instructor, and textbooks. The class was the second semester of a year-long course sequence in applied sciences for undergraduate students. In the previous study, students were randomly divided into two groups. One group completed web-based exams proctored, while the other group completed web-based exams un-proctored. In the current study, students who had previously taken exams un-proctored were now proctored, while those who had previously taken proctored exams were now un-proctored. After three exams, scores were compared to assess whether mean test scores differed between the two groups. Although the difference in scores was not statistically significant, the mean test scores for the un-proctored group were higher than those of the proctored group on all three exams. This pattern replicated what had been found in the first study. Therefore, the results of both studies suggest that student achievement on on-line exams may be influenced by whether the test taker is supervised (proctored) or unsupervised (un-proctored). The implications of these findings for the design of on-line course assessments are discussed.

Keywords: proctored versus un-proctored testing, test delivery, web-based testing, supervised testing, unsupervised testing, asynchronous web-based learning, online testing, test delivery methods, learning or online learning, online assessment, on campus learning, distance learning.

Purpose

Among undergraduate students in a first semester web-based applied medical course, Royal and Bell (2008) found that un-proctored test takers consistently scored higher than proctored test takers. The purpose of this follow-up study is to determine whether this difference can be replicated in the second semester sequence of the course.

Introduction

The availability of on-line or web-based distance education courses has led to a surge of degree-seeking students who learn and have their learning assessed in the virtual classroom known as cyberspace. This situation has placed colleges and universities in the position of assuming that students are being honest when taking exams. Furthermore, the issue of honesty does not apply only to distance education students. Many faculty teaching both campus and distance courses employ computer-aided exams for their students. The use of computer-aided exams provides the student with instant feedback and flexibility, while relieving faculty of the arduous task of administering and grading the exams (Turner, 2005; Warren & Holloman, 2005; Wellman & Marcinkiewicz, 2004; Greenberg, 1998). However, while using computer-aided exams, faculty must also face potential problems such as student accessibility, learning styles, limited computer skills, student motivation, and of course, academic integrity (Turner, 2005; Summers, Waigandt and Whittaker, 2005; Lorenzetti, 2006). The problem is that, as with online learning, computer-aided testing confers a certain amount of autonomy and independence on the student. Moreover, in such an environment the test taker is assumed to adhere to principles of academic integrity. Unfortunately, such an assumption may not always be valid and may be naïve.

Royal and Bell’s study (2008) was similar in design to Wellman and Marcinkiewicz’s study (2004), which looked at the impact of proctored versus un-proctored quizzes upon student learning. While Wellman and Marcinkiewicz defined student learning as a change in pre- versus post-test scores, Royal and Bell compared proctored and un-proctored students’ test scores. The students in the study were enrolled in an Applied Medical Science course at East Carolina University and were seeking degrees in Health Information or Health Services Management. The cohort included both campus and distance education students, who were randomly divided into two groups. Group 1 became the un-proctored group while Group 2 became the proctored group, and each group kept that status for all exam sessions. Both groups had the same instructor, the same textbook, and access to the same PowerPoint presentations as well as lecture content that was recorded on campus via Mediasite technology. Both groups were given 4 multiple choice exams, which were taken through the WebCT interface. Before taking the exams, the students were told they could not use textbooks or notes, or talk with other students, when taking the exam. Once the exam was accessed, students in either group had to complete it within the same amount of time. However, the un-proctored students had a window of availability during which they could access the exam, while the proctored students, who were supervised by either a faculty member or a proxy from the local community college, had to take their exams at particular scheduled times. Except for this difference in test-taking supervision, no other differences existed between the two groups. Results from the study indicated that for every exam, the un-proctored students outscored the proctored students. Of the 4 exams, only two showed results that were statistically significant. However, the overall pattern showed that proctored students consistently scored lower than un-proctored students. In addition to comparing the grades, Royal and Bell calculated students’ grade point averages (GPA) and compared the ages of both groups to further characterize the relationship between exam scores and test delivery methods. There was no significant difference between the groups in either variable.

Follow Up Study

To further test the hypothesis that un-proctored students score higher on exams, a follow-up study using the same students, instructor, textbooks and instruction method was conducted. The study was completed in the spring semester of 2008 using undergraduate students enrolled in the second semester sequence of a year-long undergraduate applied medical science course. Students participating in the study had signed a consent form at the beginning of the first semester, which advised potential participants that the study included both the fall and spring semesters.

Methods

Participants: Participants were undergraduate students enrolled in an Applied Medical Science II course at East Carolina University. The first study originally enrolled 71 students, of whom 66 remained at the end of the first semester. Three of those 66 students failed the first semester, leaving 63 students who were enrolled in the course and had signed proper consent documentation. The follow-up study therefore began with 63 students.

Course: The Applied Medical Science II course is the second part of a required course for students seeking a degree in either Health Information Management or Health Services Management. The students had been admitted to the program prior to the first course taken during the fall semester, and all students had the same prerequisites. Although there is only one course, it was divided into 3 sections due to the large number of students. One section is considered distance education, while the other two sections are counted as campus courses and are taught on two different days to provide adequate space in the classroom. Students were not required to attend class because the instructor used video recordings for the lectures, which were placed on WebCT. All students, whether proctored or un-proctored, used computer-aided testing.

Procedure: In the first study, students were randomly divided into two groups, one un-proctored and the other proctored. In the current follow-up study, the group that had been proctored in the first study was now un-proctored, and the group that had been un-proctored in the first study was now proctored. The students were told whether they would be proctored about 10 days prior to the first exam. The proctored students were given a specific day (or days) and time to take the exam, or were assigned to be proctored at a local community college. A couple of students were unable to come to the university or attend a community college, either because of proctor fees charged by the college or because the travel distance was too great. These students were responsible for finding their own proctors and had to email the instructor ahead of time describing their specific test-taking arrangements. After the exam was completed, the proctors emailed the researchers confirming that the students had taken the exam under proctored conditions. The un-proctored students were allowed to take their exam at their convenience, but within a specified time frame. All students received the same instructions regarding the use of textbooks and notes, and all students took their exams through the WebCT interface. All students were given 3 multiple choice exams during the semester, with the same time allocation for each. The second semester followed the same protocol as the first semester.
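The assignment and crossover described in the procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the student identifiers, the seed, and the function names `assign_groups` and `cross_over` are hypothetical, and the paper does not say how the original random division was actually carried out.

```python
import random


def assign_groups(student_ids, seed=None):
    """Randomly split a roster into un-proctored and proctored groups.

    Hypothetical helper: the paper only states that students were
    "randomly divided into two groups".
    """
    rng = random.Random(seed)
    shuffled = list(student_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"un-proctored": shuffled[:half], "proctored": shuffled[half:]}


def cross_over(groups):
    """Swap the two conditions for the follow-up semester."""
    return {
        "un-proctored": list(groups["proctored"]),
        "proctored": list(groups["un-proctored"]),
    }


# Hypothetical roster of 63 anonymized student IDs (S01 .. S63).
roster = [f"S{i:02d}" for i in range(1, 64)]

first_semester = assign_groups(roster, seed=2007)  # seed is arbitrary
second_semester = cross_over(first_semester)

print(len(second_semester["un-proctored"]), len(second_semester["proctored"]))
```

Because each student experiences both conditions across the two semesters, a difference that follows the condition rather than the group is harder to attribute to differences in ability between the groups.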

Results

This current research was a follow-up to a study conducted in fall of 2007 which was designed to determine if a relationship exists between method of test delivery and student performance.

Test Results

Exam 1: There were 32 un-proctored students and 31 proctored students who took exam 1. There was one student who was supposed to be proctored, but took the exam unsupervised.

Exam 2: There were 31 un-proctored students and 32 proctored students who took exam 2. The student who was un-proctored in exam 1 took the second exam with a proctor as originally scheduled.

Exam 3: There were 31 un-proctored students who took exam 3 and 32 students who took the exam with a proxy.

Exam Scores:

Exam 1: The mean score for the un-proctored students was 87.0, while the mean for the proctored students was 83.7.

Exam 2: The mean score for the un-proctored students was 88.3, while the mean for the proctored students was 86.5.

Exam 3: The mean score for the un-proctored students was 95.4, while the mean for the proctored students was 89.3 (see Table 1).

Table 1

Exam Scores

Summary Statistics for Scores on Exam 1

               N    Minimum   Maximum   Mean   SD     Variance
Un-proctored   32   45.00     112.00    87.0   16.1   261.1
Proctored      31   54.00     112.00    83.7   17.2   298.6

Summary Statistics for Scores on Exam 2

               N    Minimum   Maximum   Mean   SD     Variance
Un-proctored   31   62.00     104.00    88.3    9.4    89.7
Proctored      32   66.00     103.00    86.5   10.2   104.3

Summary Statistics for Scores on Exam 3

               N    Minimum   Maximum   Mean   SD     Variance
Un-proctored   31   60.00     109.00    95.4   11.0   121.8
Proctored      32   37.00     113.00    89.3   17.9   322.8

Relationship between exam scores: To establish whether the difference between the two groups’ mean test scores was statistically significant, a t-test assuming equal variances was computed (see Table 2).

Table 2

T-Test Analysis of Exam Scores

          UP (M)   P (M)   Diff   T      Probability
Exam 1    87.09    83.70   3.39   .803   .425
Exam 2    88.38    86.59   1.79   .722   .473
Exam 3    95.41    89.37   6.04   1.60   .113
Total     90.29    86.55   3.74   1.22   .289
________________________________________________________________________

Note: UP (M) = mean for un-proctored students; P (M) = mean for proctored students; Diff = UP (M) minus P (M).

*Correlation is significant at the 0.05 level.
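For readers who want to check the per-exam t-tests from the published summary statistics alone, the following Python sketch computes an equal-variance (pooled) two-sample t statistic from each group's N, mean, and variance. It is not the authors' SPSS procedure; the function names are illustrative, the p-value is obtained by simple numerical integration of the Student t density rather than a statistics library, and the inputs are the rounded values printed in Tables 1 and 2, so small discrepancies from the reported figures are expected.

```python
import math


def pooled_t_test(n1, mean1, var1, n2, mean2, var2):
    """Equal-variance two-sample t statistic and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = total * h / 3
    return max(0.0, 1.0 - 2.0 * central_area)


# (label, N, mean, variance) for un-proctored then proctored students.
# N and variance come from Table 1, means from Table 2.
exams = [
    ("Exam 1", (32, 87.09, 261.1), (31, 83.70, 298.6)),
    ("Exam 2", (31, 88.38, 89.7), (32, 86.59, 104.3)),
    ("Exam 3", (31, 95.41, 121.8), (32, 89.37, 322.8)),
]

for label, unproctored, proctored in exams:
    t, df = pooled_t_test(*unproctored, *proctored)
    print(f"{label}: t = {t:.3f}, df = {df}, p = {two_sided_p(t, df):.3f}")
```

Run on these inputs, the t statistics come out close to the values reported in Table 2 (about 0.80, 0.72, and 1.60), which suggests the published summary statistics and test results are mutually consistent for the three individual exams. The "Total" row cannot be checked this way because the paper does not report its N or variance.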

Discussion

Summary

Purpose

The purpose of this follow-up study was to determine whether a relationship exists between test performance and method of test delivery among undergraduate students in a medical science course offered at East Carolina University.

Methodology

The sampling frame was the same students involved in the previous study conducted in fall 2007. All students were enrolled in the medical science course. The students who were un-proctored (Group 1) in the first study became the proctored students (Group 2) for the follow-up, while the proctored students (Group 2) in the first study became the un-proctored students (Group 1) in the follow-up study. As in the first study, faculty version 15.0 of the Statistical Package for the Social Sciences (SPSS) was used for statistical analyses. Frequencies and summary statistics were computed for exam scores for both student groups, and the mean and standard deviation were computed for each group’s scores. To determine the relationship between the two variables (method of test delivery and test performance), a t-test analysis was computed.

Discussion

Student Sample

The students were the same students who took part in the fall 2007 research study. They had all been admitted to the same program and had the same required prerequisites. There was no significant difference in GPA or age between the two groups. All students received the same study materials and access to lectures via Mediasite recordings.

Exam Scores

As in the previous study, the un-proctored students outscored the proctored students on every exam. Although the analysis of the scores does not indicate a statistically significant difference between the groups, the pattern was the same in both studies.

Conclusions

In the previous study conducted in fall 2007, the results, whether statistically significant or not, indicated that for every exam the un-proctored students scored higher than the proctored students. What explains these results: differences in the knowledge level or preparedness of the students in each group, or differences in the test-taking conditions between the two groups? The way to zero in on the reason was to repeat the study, replicating the same methods with the same subjects but switching group assignments so that those who had originally been proctored were now un-proctored and vice versa. If the mean test grades per group repeated the pattern found in the first study, that is, if the un-proctored group still had a higher mean than the proctored group, then we could state with a fair amount of confidence that, for the sample in this study, the test-taking condition affected test outcome. Again, the students who were supervised scored lower than the unsupervised group. The results were consistent with the original hypothesis that un-proctored students score higher than proctored students, although the differences were not statistically significant.

The explanation for these results may lie in the honesty or dishonesty of the students. When students are allowed to take exams via computer-aided testing, it is logical to imagine that certain students will use notes, textbooks, or other students to assist them. Even in this study, the researchers heard comments from some of the proctored students indicating that it was unfair to them because “everyone knows that students use books when taking exams”. These comments were made by several students either directly to the researchers or via email. It was almost as if some of the students were not concerned about the instructor/researchers knowing that cheating did occur. In addition, some of the students who withdrew from the first study indicated that they wanted to withdraw because they felt it was unfair that they were unable to use their notes.

Recommendations for Future Research

The findings suggest that the difference in mean test scores between the two cohorts of test takers may have been due to the different test-taking conditions they experienced. Un-proctored test takers may have had higher mean test scores than proctored test takers because they were unmonitored while they took their tests. As a result, unlike the proctored cohort, the un-proctored test takers could potentially use resources such as notes, texts, and the internet while taking their exams. Therefore, the difference in test achievement may be attributable to the proctoring condition. Future research should test this hypothesis by eliminating the differences in test-taking conditions for the two groups. If, after replicating the previous study methodology and standardizing test-taking conditions for both groups, their mean exam scores are similar, then one can conclude that the previous difference in mean exam scores was more than likely due to the proctored status of the test takers. It would therefore be useful to conduct a future study in which both groups of students (on-campus face-to-face and online distance learners) take their tests under equivalent conditions. Specifically, both groups would be proctored or monitored: the on-campus group would take their exams as before, on campus with a proctor, and distance students would be proctored using technology (web cams and software) that can block web surfing and/or the use of notes and texts during the exams. The bottom line is to eliminate the test-taking condition as a factor that affects student test performance; if it cannot be eliminated, then online testing is not an accurate measure of student learning.

References

Greenberg, R. (1998). Online testing. Techniques: Making Education & Career Connections, 73(3), 26-29.

Lorenzetti, J. (2006). Proctoring assessment: Benefits and challenges. Distance Education Report, 10(8), 5-6.

Royal, P., & Bell, P. (2008). The relationship between performance levels and test delivery methods. International Journal of Instructional Technology & Distance Learning, 5(7).

Summers, J., Waigandt, A., & Whittaker, T. (2005). A comparison of student achievement and satisfaction in an online versus a traditional face-to-face statistics class. Innovative Higher Education, 29(3), 233-250.

Turner, C. (2005). A new honesty for a new game: Distinguishing cheating from learning in a web-based testing environment. Journal of Political Science Education, 1, 163-174.

Warren, L., & Holloman, Jr., H. (2005). On-line instruction: Are the outcomes the same? Journal of Instructional Psychology, 32(2), 148-151.

Wellman, G., & Marcinkiewicz, H. (2004). Online learning and time-on-task: Impact of proctored vs. un-proctored testing. Journal of Asynchronous Learning Networks, 8(4).

About the Authors

Patricia Royal, Ed.D., is an assistant professor in the Health Services and Information Management Department in the College of Allied Health at East Carolina University in Greenville, NC. She holds a master’s degree in social work and a doctoral degree in higher education, and completed postgraduate work in public health. Email: royalp@ecu.edu

Paul D. Bell, PhD, RHIA, CTR, is Associate Professor of Health Services and Information Management in the College of Allied Health Sciences at East Carolina University in Greenville, NC. Email: bellp@ecu.edu

Address for both Drs. Royal and Bell:
Health Services and Information Management, Suite 4340, Health Sciences Drive
College of Allied Health, East Carolina University
Greenville, NC 27858
