Earlier this year Ofqual announced its intention to work with exam boards to “support the use of innovative practice and technology”. Part of this involves exploring the potential role of adaptive testing – computer-based tests that adjust question difficulty based on how the student has answered previous questions.
In particular, Ofqual is considering whether adaptive testing could offer an alternative to the current system of “tiering” – whereby for certain GCSE subjects such as maths, sciences and languages, pupils can sit a “foundation” paper or a more demanding “higher” paper. Under tiering, pupils sit different exams and are limited in the range of grades they can achieve.
An extensive item bank, covering a wide range of content for different ability levels, is an essential prerequisite. Most adaptive tests are built on item response theory, which uses a statistical model to estimate a numerical value for each individual’s level of proficiency in a subject. Because these estimates sit on a common scale, they can be used to compare scores for students who have taken very different assessments.
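To make the idea concrete, here is a minimal sketch of the two-parameter logistic (2PL) model, one of the most widely used item response theory models. It gives the probability that a student with proficiency theta answers an item correctly, based on the item’s difficulty and discrimination. The parameter values below are purely illustrative and are not drawn from any exam board’s item bank.

```python
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """Two-parameter logistic (2PL) IRT model: probability that a student
    with proficiency `theta` answers an item correctly, where `b` is the
    item's difficulty and `a` its discrimination (illustrative values)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# A moderately able student (theta = 0.5) facing an easy item (b = -1.0)
# and a hard item (b = 1.5), both with discrimination a = 1.2:
print(round(p_correct(0.5, 1.2, -1.0), 2))  # 0.86: likely to succeed
print(round(p_correct(0.5, 1.2, 1.5), 2))   # 0.23: likely to struggle
```

Because proficiency and item difficulty are expressed on the same scale, two students can be compared even if they answered entirely different sets of questions.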
How does it work in practice?
An adaptive assessment begins with a random selection of a few mid-difficulty items (questions). The student’s responses to these allow an initial estimate of his or her proficiency. Subsequent items are then chosen to be more or less difficult based on this estimate: if an item is answered correctly, the next question is more difficult; if it is answered incorrectly, the next one is less challenging. The computer continuously updates its estimate of the student’s proficiency until the process is stopped.
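This select–answer–update loop can be sketched in a few lines of code. In the simplified illustration below, the next item is the unused one whose difficulty is closest to the current proficiency estimate, and the estimate is nudged up or down after each response with a shrinking step size. Real adaptive engines use maximum-likelihood or Bayesian estimation and information-based item selection, so treat the update rule, step sizes and stopping rule here as assumptions made for clarity.

```python
import math
import random

def run_adaptive_test(answer, item_bank, n_items=10, start=0.0, step=1.0):
    """Toy adaptive-testing loop (a sketch, not a production engine).

    `answer(difficulty)` returns True/False for the student's response;
    `item_bank` is a list of item difficulties on the same scale as theta.
    """
    theta = start                  # current proficiency estimate
    remaining = list(item_bank)
    for i in range(n_items):
        # Select the unused item whose difficulty best matches the estimate.
        item = min(remaining, key=lambda b: abs(b - theta))
        remaining.remove(item)
        # Raise the estimate after a correct answer, lower it otherwise;
        # the shrinking step lets the estimate settle over time.
        theta += step / (i + 1) if answer(item) else -step / (i + 1)
    return theta

# Simulate a student whose true proficiency is 0.8: the chance of a correct
# answer falls as item difficulty rises above that level.
random.seed(1)

def student(b):
    return random.random() < 1.0 / (1.0 + math.exp(-(0.8 - b)))

bank = [d / 4 for d in range(-12, 13)]   # difficulties from -3.0 to 3.0
print(round(run_adaptive_test(student, bank), 2))  # estimate (true value 0.8)
```

In practice the loop would stop not after a fixed number of items but once the uncertainty in the proficiency estimate falls below a chosen threshold.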
The first generation of computerised adaptive tests, starting in the 1980s, used only multiple-choice questions. However, advances in psychometric techniques and more powerful computers have enabled the use of other question types. Adaptive assessment can now also be used in some modern foreign language assessments, such as tests of listening and reading.
Adaptive tests already exist in the UK. The Scottish National Standardised Assessments use adaptive testing to provide teachers with diagnostic information and policymakers with national data on student progression. In Wales, statutory personalised assessments in reading and numeracy use adaptive tests throughout Years 2 to 9.
While the benefits of computerised adaptive tests include a more personalised experience for students and more flexible assessment, they are not suitable for all subjects and cannot assess all types of skill. Setting essays or seeking longer narrative responses within adaptive tests presents designers with a greater technical challenge. More research is required into how best to extract assessment information from more complex responses containing diagrams, data plots, tables or performance tasks, which usually require human intervention to score.
The benefits of adaptive testing
- Students can be presented with a shorter, bespoke sequence of exam questions.
- Individuals can work at their own pace.
- The exam schedule can be more flexible because there is no single fixed paper, and the requirement for secrecy of question papers is relaxed.
- Feedback for students can be provided during the assessment and immediately afterwards.
The challenges to overcome
- Any ambiguity about test difficulty could undermine public confidence and would require a clear explanation of how adaptive assessment works.
- There are unanswered questions around how best to create large pretested item banks, how to select and sequence questions, and when to stop testing.
- Demand for new assessment items will be high.
- Fairness must be ensured among students who are presented with different assessments.
- Student experience: conventional question papers allow a student to review the whole paper, to skip more difficult items initially and to return to these later, whereas computer-based adaptive testing may require exam questions to be answered in the order in which they are presented.
Questions for policymakers
Rolling out computerised adaptive testing in England, particularly for high-stakes qualifications such as GCSEs and A-levels, requires careful thought. Policymakers considering how adaptive assessment could be realised in England may wish to address the following questions.
- How would trust be maintained in any ‘black box’ algorithm used to control assessments?
- Will all centres have the technological infrastructure to support computer adaptive assessment?
- How will the cost of developing computer adaptive assessments compare with current paper-based fixed exams?
- How will computer adaptive assessment data be used to maintain an academic standard and inform policymaking?
- How will teachers, head teachers, and administrators be trained to use and understand computer adaptive assessment?