Writing valid science exam questions - avoiding cognitive overload

Roger Kennett
6 days ago

We want our exam questions to be instruments; instruments that validly probe student understanding and authentically reflect the degree of development of their scientific schemas. This is a bigger challenge than we might first realise, but let's keep this aspiration as our first priority.

How can science help us write better science exams?

One of the best evidence-validated, and sometimes counterintuitive, theories of learning is Cognitive Load Theory (CLT). There are many aspects to CLT, such as the "expertise reversal effect", but most relevant to exam writing is the limitation of human working memory. Research indicates that our long-term memory is almost limitless (even if not always delivering back what we want when we want it!).

On the other hand, working memory is seriously limited. Our working memory is the operating RAM that we need to process new ideas and observations and adjust our existing schemas in light of them – this process is the essence of learning. However, humans of all ages are limited to only a handful of items in working memory: Miller's classic estimate was 7 (±2), and more recent work by Cowan puts it closer to 4. If information is new, it takes up a full slot for each novel item, but one slot can instead reference a developed schema in our long-term memory. This is why, as we develop from novice to expert, we are increasingly able to process and assimilate new ideas and information in the domain of our expertise.
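To make the slot idea concrete, here is a toy sketch in Python. It is only an illustration of the reasoning above, not a model from the CLT literature: the function name, the item names and the schemas are all made up for the example. Each novel item costs one slot, while any number of items covered by a developed schema share a single slot.

```python
# Toy model only: the item names and schemas below are illustrative assumptions.

def slots_needed(items, schemas):
    """Count working-memory slots for a set of stimulus items.

    items:   iterable of item names presented to the student
    schemas: dict mapping a schema name to the set of items it already covers
    Each schema that covers at least one presented item costs one slot in total;
    every item not covered by any schema costs one slot of its own.
    """
    items = set(items)
    schema_slots = 0
    covered = set()
    for members in schemas.values():
        hit = items & members
        if hit:
            schema_slots += 1
            covered |= hit
    return schema_slots + len(items - covered)


stimulus = ["force", "mass", "acceleration",
            "friction", "incline angle", "unfamiliar context"]

novice = {}  # no developed schemas yet
expert = {
    "Newton's second law": {"force", "mass", "acceleration"},
    "inclined planes": {"friction", "incline angle"},
}

print(slots_needed(stimulus, novice))  # 6
print(slots_needed(stimulus, expert))  # 3
```

The same six-item stimulus costs the novice twice as many slots as the expert, which is the novice-to-expert shift described above.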

There is so much more to be said about CLT, but the final important point is that once a human reaches cognitive overload, they tap out. It is not that we can throttle back and keep performing at our maximum: at overload our brain shifts into a form of cognitive self-protection. Attention fragments, anxiety rises, we begin to omit or mis-process information, and error rates increase. At this point, additional input is not just ineffective; it can be counterproductive because it adds further load. I think we would call that a positive feedback loop towards underperformance. Not what we want!

The problem with many science exam questions

I frequently see a stimulus that exhausts the working memory capacity before the student has a chance to earn any marks.

It is a question. A multi-part, elegant question in a novel context. Providing a novel context is an effective and common way to more deeply probe a student's schema, the agility of their understanding. While it has some easy-entry opening parts, such a question is often structured like this:

The huge cognitive load of the opening stimulus imposes a massive "entry price" on students. Many of your typical students WOULD be able to answer part (a), but they have reached cognitive overload before they get the chance. We then scratch our heads because we know that Roger should have been able to answer part (a) – sure, he was never going to be able to get to part (c), but what happened?

At this point, we no longer have a valid question (measuring what we intended to measure). Part (a) was intended to measure x, but is instead measuring something else.

It is like charging a $1000 entry fee to an auction where the first item is a $10 vase!

CLT explains just what went wrong.

How do we avoid this?

Reduce extraneous load

Intrinsic load is the cognitive demand that is an essential part of what you are testing. Science is hard, and we do not have to dumb it down. Extraneous load, on the other hand, is everything clogging up those limited working-memory slots that is not relevant to what you are asking students to do. Perhaps it is beautiful "story-telling" that paints a rich, novel context, but how much of that is absolutely necessary? Start with a sharp set of pruning shears and cut out all that extraneous fluff.

Think of every word or phrase as a lodger in a share-house. If they are not contributing, they need to GO!

Can you make it clearer with a diagram instead of text? (in science, probably yes).

However, also ask the hard questions about your diagram / image. Does it add to the question? Does it make things clearer than text? OR, is it just decorative? Decorative = extraneous. Either improve it or dump it.

Redundancy or split-attention effect

This is caused by repeating the same information in two different ways. Perhaps you explain details in the text and then repeat that same information in the diagram. Intuitively that seems a good thing to do, right? It gives students a second way to "get" that critical information.

Unfortunately the reality is that we have now cost our student an extra working memory slot and pushed them closer to cognitive overload.

Decide which channel you will use to communicate each item, make it clear, and do it once.

Stage your entry ticket

This is a highly effective strategy. Examine your part (a) and then put before that ONLY the information relevant to answering that part.

Then, drip feed in that rich and deeper context as the students progress through your question.

Sure, only the most capable students will be able to access part (d) but that is what we intended, isn't it?
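As a rough illustration of the staging idea, the sketch below compares how much stimulus a student has been handed before each part of a front-loaded question versus a staged one. It is a toy model under stated assumptions: the function name and the item counts per part are hypothetical, and it simply counts items rather than measuring real cognitive load.

```python
# Toy model: the number of stimulus items per part is a made-up assumption.

def load_before_each_part(new_items_per_part, staged):
    """Return how many stimulus items have been presented before each part.

    new_items_per_part: list where entry i is the number of stimulus items
                        that part i needs for the first time
    staged: if True, items are introduced just before the part that needs them;
            if False, everything is placed in the opening stimulus
    """
    total = sum(new_items_per_part)
    loads = []
    running = 0
    for n in new_items_per_part:
        running += n
        loads.append(running if staged else total)
    return loads


# Hypothetical four-part question: parts (a) to (d) need 2, 1, 2 and 3 new items.
items = [2, 1, 2, 3]

print(load_before_each_part(items, staged=False))  # [8, 8, 8, 8]
print(load_before_each_part(items, staged=True))   # [2, 3, 5, 8]
```

In the front-loaded version the student faces all eight items before earning a single mark; in the staged version part (a) costs only two. The count for part (d) is the same either way, but by then earlier items may already have been worked through, which this simple count does not capture.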

If we structure like this, and we have kept the cognitive load purely to the intrinsic load, we are much more likely to have a valid measurement instrument. Here is what that can look like:

I have been writing exams for a few decades, but I still feel that with every iteration I am learning more about creating rich and authentic instruments. So much (including AI) is elevating "the test" as the main assessment tool, with all the issues surrounding that, much of which I can't change. BTW, if you are interested in working through how to do authentic assessment (not tests) in the Age of AI, this might be of interest to you.

At the same time, I think I owe it to my students to make each tool as valid and authentic as I can. I want to honour their work learning the course by not tripping them up with hidden hurdles.

I am a long way from perfect, but if you would like to see some Science HSC Trial Exams and Year 11 Course exams that aspire to be authentic instruments informed by the science of learning, then you might appreciate these exams which are available right now. https://www.learningforge.com.au/exams

Evidence

An overview of the main claims in this article and links to evidence

Sweller’s Cognitive Load Theory (1988 onward)
Working memory limits (Miller; Cowan)
Instructional design effects (Mayer, Kalyuga, Paas)
NSW CESE synthesis (excellent practitioner-facing source)
and pretty much anything by Andrew Martin UNSW

Bibliography

Claim	Evidence
1. Working memory is limited; long‑term memory is effectively vast	Centre for Education Statistics and Evaluation. (2017). Cognitive load theory: Research that teachers really need to understand. NSW Department of Education.
2. Working memory processes new information; long‑term memory stores schemas	Paas, F., Renkl, A., & Sweller, J. (2003). Cognitive load theory and instructional design: Recent developments. Educational Psychologist, 38(1), 1–4.
3. Working memory capacity ≈ 7 ± 2 items (closer to 4 in modern research)	Miller, G. A. (1956). The magical number seven, plus or minus two. Psychological Review, 63(2), 81–97.
4. Expertise allows chunking, increasing effective processing capacity	Cowan, N. (2010). The magical mystery four: How is working memory capacity limited, and why? Current Directions in Psychological Science, 19(1), 51–57.
5. Cognitive overload reduces performance (attention fragmentation, errors)	Yeh, Y. Y., & Wickens, C. D. (1988). Dissociation of performance and subjective measures of workload. Human Factors, 30(1), 111–120.
6. Instruction must respect working memory limits to be effective	Kirschner, P. A., Sweller, J., & Clark, R. E. (2006). Why minimal guidance during instruction does not work. Educational Psychologist, 41(2), 75–86.
7. Intrinsic vs extraneous cognitive load distinction	Sweller, J. (1988). Cognitive load during problem solving. Cognitive Science, 12(2), 257–285.
8. Reducing extraneous load improves learning	Sweller, J., van Merriënboer, J. J. G., & Paas, F. (1998). Cognitive architecture and instructional design. Educational Psychology Review, 10(3), 251–296.
9. Redundancy effect (duplicated information increases load)	Kalyuga, S. (2011). The redundancy effect in cognitive load theory. In J. P. Mestre & B. H. Ross (Eds.), The psychology of learning and motivation (Vol. 55, pp. 1–29). Academic Press.
10. Split attention from multiple sources increases cognitive load	Chandler, P., & Sweller, J. (1991). Cognitive load theory and the format of instruction. Cognition and Instruction, 8(4), 293–332.
11. Removing irrelevant information improves learning (coherence principle)	Mayer, R. E. (2009). Multimedia learning (2nd ed.). Cambridge University Press.
12. Expertise reversal effect	Kalyuga, S., Ayres, P., Chandler, P., & Sweller, J. (2003). The expertise reversal effect. Educational Psychologist, 38(1), 23–31.
13. Schema development is central to learning	Sweller, J. (1988). Cognitive load during problem solving. Cognitive Science, 12(2), 257–285.
14. Cognitive overload prevents or impairs learning	Centre for Education Statistics and Evaluation. (2017). Cognitive load theory: Research that teachers really need to understand. NSW Department of Education.