Vreeland, T. (2003). Commentary on Joanna Bull and James Dalziel, Assessing Question Banks, Chapter 14 of: Reusing Online Resources: A Sustainable Approach to eLearning, (Ed.) Allison Littlejohn. Kogan Page, London. ISBN 0749439491. [www.reusing.info]. Journal of Interactive Media in Education, 2003 (1) Special Issue on Reusing Online Resources. ISSN:1365-893X [www-jime.open.ac.uk/2003/1/].


Chapter 14: Assessing Question Banks


Joanna Bull and James Dalziel



Commentary by Tom Vreeland

OpenVES
tsv@openves.org

Summary

In Chapter 14, Joanna Bull and James Dalziel provide a comprehensive treatment of the issues surrounding the use of Question Banks and Computer Assisted Assessment, illustrated with a number of excellent examples of implementations. In their review of the technologies employed in Computer Assisted Assessment the authors include Computer Adaptive Testing and data generation. The authors reveal significant issues involving the impact of Intellectual Property rights on computer assisted assessment and make important suggestions for strategies to overcome these obstacles.

Main Review

In Chapter 14, Joanna Bull and James Dalziel provide a comprehensive treatment of the issues surrounding the use of Question Banks and Computer Assisted Assessment, illustrated with a number of excellent examples of implementations.

It is worth noting that, although computer based assessment systems have been in use for thirty or forty years, their growth and utilization have been limited by closed proprietary architectures and a lack of interoperability. One of the reasons the IMS Question and Test Interoperability specifications are so important is that they provide an XML model for open interoperability. We are on the threshold of explosive growth in the use of assessment tools in primary and secondary education, and the growing integration of assessment tools in educational environments portends the transformation of the current multi-billion dollar assessment market.

In their review of the technologies employed in Computer Assisted Assessment the authors include Computer Adaptive Testing and data generation. One of the most significant new technologies being implemented in assessment is the use of Bayesian models of text classification for scoring open response items and student essays. The developers of these naïve Bayesian systems, which use either a multivariate Bernoulli model or a multinomial model, claim performance that matches or improves upon that of human scorers. Among the highest profile uses of Latent Semantic Analysis and related technologies are Pearson's work with the US military and schools, and the Educational Testing Service (ETS) with its e-rater scoring tools; ETS uses the technology widely to score essays in the Graduate Record Examination (GRE).
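
To make the approach concrete, the sketch below is my own simplification of a multinomial naïve Bayes scorer (it is not Pearson's or ETS's system): word counts from previously human-scored responses are used to assign a new open response to a score band.

    # Minimal multinomial naive Bayes scorer: assigns an open response to a
    # score band using word counts from human-scored training responses.
    # A simplified sketch only; commercial systems use far richer features.
    import math
    from collections import Counter, defaultdict

    def train(scored_responses):
        """scored_responses: list of (text, score_band) pairs."""
        word_counts = defaultdict(Counter)   # per-band word frequencies
        band_counts = Counter()              # number of responses per band
        vocabulary = set()
        for text, band in scored_responses:
            words = text.lower().split()
            word_counts[band].update(words)
            band_counts[band] += 1
            vocabulary.update(words)
        return word_counts, band_counts, vocabulary

    def score(text, word_counts, band_counts, vocabulary):
        """Return the most probable score band for a new response."""
        total = sum(band_counts.values())
        best_band, best_logp = None, float("-inf")
        for band in band_counts:
            # log prior + sum of log likelihoods with Laplace smoothing
            logp = math.log(band_counts[band] / total)
            band_total = sum(word_counts[band].values())
            for word in text.lower().split():
                count = word_counts[band][word] + 1
                logp += math.log(count / (band_total + len(vocabulary)))
            if logp > best_logp:
                best_band, best_logp = band, logp
        return best_band

    model = train([("the cell divides by mitosis", "high"),
                   ("cells just split", "low")])
    print(score("mitosis divides the cell", *model))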

Although the chapter is titled "Assessing Question Banks", it would be helpful if the authors' taxonomy were broader than the traditional multiple-choice questions that are useful for assessing learning expectations in the cognitive domain. It would be particularly helpful if the taxonomy included checklists, rubrics, portfolio assessments and other "authentic" assessments. These assessments are important because they make it possible to evaluate higher-order thinking skills in the cognitive domain, as well as learning standards in the psychomotor and affective domains.

In reading the chapter it was not always clear to me what the authors were saying with regard to validity, item analysis, difficulty and related concepts. One distinction that should be made before discussing assessment metrics is that between criterion-referenced and norm-referenced test items, their differences and their uses. Greater clarity on the item analysis issues could be achieved by describing separately the measures and item analysis methodologies used in criterion-referenced testing and those used in norm-referenced testing.
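
As an illustration of that distinction (my own sketch, not drawn from the chapter), the statistics typically reported differ: a norm-referenced analysis emphasises item difficulty and a discrimination index contrasting high- and low-scoring examinees, while a criterion-referenced analysis compares performance before and after instruction, or between masters and non-masters.

    # Illustrative item-analysis statistics (a sketch, not the authors' method).
    # responses: list of 1/0 correctness values for one item, one per examinee.

    def difficulty(responses):
        """Norm-referenced difficulty index: proportion answering correctly."""
        return sum(responses) / len(responses)

    def discrimination(responses, total_scores, fraction=0.27):
        """Upper-lower discrimination: difficulty in the top-scoring group
        minus difficulty in the bottom-scoring group (classic 27% split)."""
        n = max(1, int(len(responses) * fraction))
        ranked = sorted(zip(total_scores, responses), key=lambda pair: pair[0])
        lower = [r for _, r in ranked[:n]]
        upper = [r for _, r in ranked[-n:]]
        return difficulty(upper) - difficulty(lower)

    def mastery_difference(post_instruction, pre_instruction):
        """Criterion-referenced sensitivity index: proportion correct after
        instruction minus proportion correct before instruction."""
        return difficulty(post_instruction) - difficulty(pre_instruction)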

Other elements I think deserve mention are parameterized testing and pattern-based assessments. These assessments rely upon an assessment algorithm or pattern for evaluating a particular skill or piece of knowledge, and the pattern makes it possible to generate hundreds or thousands of unique items with little effort. This is one of the most important computer-based question types; it is widely used in primary and secondary schools and in higher education, but it is not supported by the IMS QTI specification.
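
A parameterized item can be as simple as a template plus a random number generator. The sketch below is hypothetical and is not tied to any particular product or to QTI; it generates arbitrarily many unique items from a single pattern.

    # Hypothetical parameterized ("pattern-based") item generator: one template
    # plus randomized parameters yields as many unique items as needed.
    import random

    def make_item(rng):
        # Pattern: distance = speed * time, solved for distance.
        speed = rng.randint(30, 90)       # km/h
        hours = rng.randint(2, 8)
        stem = (f"A train travels at {speed} km/h for {hours} hours. "
                f"How far does it travel?")
        answer = speed * hours
        return {"stem": stem, "answer": answer}

    rng = random.Random(42)               # seed for reproducible forms
    items = [make_item(rng) for _ in range(1000)]   # a thousand unique items
    print(items[0]["stem"], "->", items[0]["answer"], "km")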

In addition to the authors' formative and summative categories of assessment, it is useful to have a third category for pre-tests and for the new genre of online assessments used by teachers and professors just before or during a class period as part of the methodology called Just-in-Time Teaching.

As more and more summative high-stakes testing occurs in primary and secondary schools, higher education and business education, the issues of subgroup bias and of demonstrating validity across populations become more visible. The American Association for the Advancement of Science, Achieve.org and the CEO Forum have all reported fundamental and significant problems with statewide school assessment systems in the United States. One of the most significant challenges they identified in assessments, and in computer assisted assessments, is the problem of aligning and correlating assessment items with the learning standards they purport to measure.

The IMS Question and Test Interoperability (QTI) specifications provide the details necessary for the open exchange of items and assessments, for building item banks, and for managing and administering items. The QTI information model describes items, sections and assessments as the information objects in assessment systems. Sections may contain items, or a group of items that share a common element, as in the case of a reading passage and the set of items that test comprehension of the passage. Assessments may contain items or sections. These structural concepts are important to consider in planning the potential reuse of test items.
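
The containment hierarchy the specification describes can be pictured with a few plain data structures. The sketch below is my own shorthand for the QTI information model, not the specification's XML binding.

    # Shorthand for the QTI information model's containment hierarchy:
    # assessments contain sections and/or items; sections group items that
    # share a common element (e.g., one reading passage). Not the XML binding.
    from dataclasses import dataclass, field
    from typing import List, Union

    @dataclass
    class Item:
        ident: str
        stem: str

    @dataclass
    class Section:
        ident: str
        shared_material: str            # e.g., a reading passage
        items: List[Item] = field(default_factory=list)

    @dataclass
    class Assessment:
        ident: str
        parts: List[Union[Item, Section]] = field(default_factory=list)

    passage = Section("S1", "Reading passage about tides",
                      [Item("Q1", "What causes tides?"),
                       Item("Q2", "How often do spring tides occur?")])
    exam = Assessment("A1", [passage, Item("Q3", "Define 'neap tide'.")])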

The principal value of computer adaptive tests is their ability to "walk" the network of knowledge, skills and abilities being measured, including prerequisites and both sequential and parallel paths. Well-designed computer adaptive tests can reveal gaps in student understanding, skills and abilities so that targeted remediation is possible. In some cases the number of items on which a student is actually tested can be greater than in a conventional test, and the diagnostic information gathered on the student is significantly richer.
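
A rough sketch of such a walk follows; it is my own simplification rather than a full item-response-theory adaptive algorithm. Items are attached to skills, the skills form a prerequisite graph, and a failed item stops descent along that branch so the gap can be reported.

    # Simplified adaptive "walk" over a prerequisite graph of skills (a sketch,
    # not a full item-response-theory adaptive algorithm). Each skill is tested
    # only after its prerequisites are passed; failures are reported as gaps.
    prerequisites = {                    # skill -> skills it depends on
        "fractions": [],
        "decimals": ["fractions"],
        "percentages": ["fractions", "decimals"],
    }

    def adaptive_walk(prerequisites, administer_item):
        """administer_item(skill) -> True if the student answers correctly."""
        passed, gaps = set(), []
        remaining = dict(prerequisites)
        progress = True
        # Keep visiting skills whose prerequisites have all been passed.
        while remaining and progress:
            progress = False
            for skill, prereqs in list(remaining.items()):
                if all(p in passed for p in prereqs):
                    del remaining[skill]
                    progress = True
                    if administer_item(skill):
                        passed.add(skill)
                    else:
                        gaps.append(skill)   # targeted remediation needed here
        return passed, gaps

    # Example: a student who has mastered fractions but not decimals.
    passed, gaps = adaptive_walk(prerequisites,
                                 lambda skill: skill == "fractions")
    print("passed:", passed, "gaps:", gaps)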

Another important technical issue related to the quality and design of multiple-choice assessments is the way in which distractors (answer choices other than the correct answer or answers) are used to help identify patterns of student misconceptions.
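
A simple distractor analysis, sketched below under my own assumptions about the data layout, tallies how often each option is chosen and flags distractors that attract a large share of responses, since these may signal a common misconception.

    # Sketch of a distractor analysis: count how often each option is chosen
    # and flag distractors that attract a large share of responses, which may
    # signal a shared misconception. Data layout is assumed, not QTI-defined.
    from collections import Counter

    def distractor_report(responses, correct_option, threshold=0.25):
        """responses: list of chosen option labels, e.g. ['A', 'C', 'B', ...]."""
        counts = Counter(responses)
        total = len(responses)
        report = []
        for option, count in sorted(counts.items()):
            share = count / total
            flag = option != correct_option and share >= threshold
            report.append((option, count, round(share, 2),
                           "possible misconception" if flag else ""))
        return report

    for row in distractor_report(list("BBCABBDBBC"), correct_option="A"):
        print(row)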

The authors reveal important issues involving the impact of Intellectual Property rights on computer assisted assessment and make important suggestions for strategies to overcome these obstacles.

