Format Effect on Testing

In 2004, as part of my internship with the American Institutes for Research (under Dr. Christine Paulsen), I researched how test format affected students' application of active reading strategies when taking standardized admissions tests online. A comprehensive literature review (Part 1) was followed by a usability test that ultimately disproved the hypothesis I had developed. The test report (Part 2) presents the results and possible reasons behind the findings. Part 3 contains the appendices with test questions and format samples.

Part 1: A review of the literature was performed documenting active reading strategies as they apply to paper-based test taking (“PBT”); a contrast was then drawn between PBT reading strategies and those supported by the Gen 1 computer-based testing (“CBT”) format being piloted in 2004. Based on this research, I concluded that the CBT format of the time might not be conducive to student performance and could fail to accurately predict academic success. Recommendations were made for introducing PBT-based interface features to optimize CBT conditions.

Part 2: A usability test was conducted to test the hypothesis developed in Part 1. Learning objectives were as follows:

  1. Observe how each test format affected the implementation of reading strategies and which features were most successful at facilitating reading.
  2. Assess test performance based on test format. Quantitative metrics include number of correct answers (performance) and time to complete the tests.
  3. Evaluate participants’ format preferences based on post-test interviews and ratings.
  4. Solicit participants’ reactions to usability and tool usefulness in CBT based on interviews and ratings.

RESULTS: The test yielded both quantitative and qualitative results, including data on performance and time to complete the tests, as well as retrospective ratings and comments on participants' experiences, preferences, and perceptions of different aspects of test taking. The quantitative measures revealed no significant difference between the two formats. The qualitative measures of preference and test-taking behavior were also fairly consistent across the formats, pointing to a high degree of equivalence in the test-taking experience.

Part 3: Appendices containing test artifacts and format samples.