The impact of a pilot comparative assessment to inform in-term teaching
The Maths, English, Science and Humanities Assessment (MESHA) Victorian pilot for Year 8 Maths was designed to test the hypothesis that a shared assessment across similar schools would enable more precise identification of learning gaps and misconceptions than isolated, school-based assessment alone.
Conducted across 7 Victorian secondary schools in Term 3, 2025, the pilot provides early evidence that a comparative, misconception-linked assessment can strengthen formative decision making for both teachers and school leaders.
Context and design
While summative measures, such as NAPLAN or Year 12 VCE and HSC exam results, provide benchmarking, they are either too infrequent, too delayed, or insufficiently granular to inform timely instructional responses. As a result, many instructional decisions rely on internally generated data, which often lacks external reference points.
The MESHA pilot sought to address this gap through a 40-minute common assessment administered to 615 Year 8 students across 33 classes in 7 high schools. All pilot classes completed the assessment within a specified 2-week assessment window.
Participating pilot schools achieved 2025 Year 12 results placing them in the top 150 Victorian schools. This was a deliberate pilot design choice to allow schools to see a results comparison with other like-schools.
The assessment consisted of 32 multiple-choice questions aligned to the Victorian Curriculum 2.0 and Australian Curriculum 9.0 dot points, with each distractor (incorrect option) intentionally mapped to a specific misconception. The 16 questions in Part A were completed without a calculator, whereas the 16 questions in Part B were calculator-based. The assessment also included a variety of procedural, worded and abstract questions.

The assessment design reflects established principles of formative assessment (Wiliam, 2017) and explicit instruction (Rosenshine, 2012), emphasising the role of evidence in guiding teaching action.
Results were returned to schools within 10 days, enabling use within the same term.
Key Findings
Comparative data gives meaning to performance
A key finding of the pilot is that raw scores alone are insufficient for determining comparative performance. For example, two classes achieving almost identical average scores (62.5 and 63.6%) on two different questions were shown to have markedly different relative performance when compared to the broader cohort. The results from the two classes are shown in the table below.

Class A is 26% below the cohort of schools. This might have a simple explanation: the specific topic was covered a while ago or the topic hadn’t been taught yet. In any case, the teaching team could check the question and decide if it’s something they expected. If it’s a surprise gap in student knowledge, they can plan how to address it.
Class B is 22% above the cohort of like-schools, meaning students correctly answered a question a lot of other students in similar schools struggled with. Again, this might easily be explained (a known high achieving class, or students have just revised this topic) or a surprise.
This reinforces the importance of external reference points. Comparative data allowed schools to determine not just how students performed, but how they performed relative to peers at similar schools, shifting interpretation from absolute to contextualised performance.
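The comparison described above comes down to subtracting the cohort average from the class average on each question. The following is a minimal Python sketch, assuming the reported gaps are percentage-point differences (the source does not say whether they are absolute or relative). The cohort averages are the values implied by that assumption, not figures reported by the pilot, and `gap_to_cohort` is a hypothetical helper name.

```python
def gap_to_cohort(class_pct: float, cohort_pct: float) -> float:
    """Return the class average minus the cohort average, in percentage points."""
    return round(class_pct - cohort_pct, 1)


# Class averages are from the pilot; cohort averages are ASSUMED values,
# back-calculated on the assumption that the gaps are percentage points.
results = {
    "Class A": {"class_pct": 62.5, "cohort_pct": 88.5},
    "Class B": {"class_pct": 63.6, "cohort_pct": 41.6},
}

gaps = {name: gap_to_cohort(**scores) for name, scores in results.items()}

for name, gap in gaps.items():
    direction = "above" if gap > 0 else "below"
    print(f"{name}: {abs(gap)} points {direction} the cohort")
```

Two near-identical raw scores produce gaps of opposite sign, which is the point of the comparison: the raw score only becomes interpretable once the cohort figure sits beside it.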
Misconceptions drive error patterns
Analysis of 19,165 student responses revealed that errors were not randomly distributed. Instead, a small number of misconceptions accounted for a disproportionate share of incorrect responses. Approximately 20% of distractors contributed to 50% of total errors, consistent with a Pareto-like distribution.

This suggests that student errors are often rooted in shared, systematic misunderstandings rather than isolated mistakes. In the pilot, the 3 most common misconceptions were related to the multiplicative inverse: students did not simplify (258 of 615 students), added the mark-down percentage back to the original price (222 of 615 students) and did not apply compound growth (200 of 615 students).
For teachers, this provides a more actionable diagnostic signal: addressing a small number of high-frequency misconceptions has the potential to yield significant gains.
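One way to check for a Pareto-like pattern in a school's own results is to rank distractors by how often they were chosen and measure the share of errors captured by the top 20%. The sketch below uses made-up counts, not pilot data, and `top_share` is a hypothetical helper.

```python
def top_share(error_counts, top_fraction=0.2):
    """Share of all errors attributable to the most-chosen distractors."""
    ranked = sorted(error_counts, reverse=True)
    # Always include at least one distractor in the "top" group.
    n_top = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:n_top]) / sum(ranked)


# Hypothetical counts of how often each of 10 distractors was selected.
counts = [120, 80, 40, 35, 30, 25, 25, 20, 15, 10]
share = top_share(counts)
print(f"Top 20% of distractors account for {share:.0%} of errors")
```

A share well above the 20% that an even spread would give suggests a few misconceptions are doing most of the damage.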
Strong relationships between misconceptions and overall performance
The pilot found a strong correlation (R² = 0.948) between the prevalence of misconceptions in a class and overall assessment performance. Classes with a higher proportion of misconception-driven errors tended to have lower overall scores.
This reinforces the instructional importance of identifying and addressing misconceptions directly. It also positions misconception analysis as a key lever for improving outcomes, beyond general practice or content coverage.
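For readers who want to compute this kind of figure on their own class data, R² for a simple linear relationship is the square of the Pearson correlation coefficient. The following is a minimal sketch with invented class-level values (not pilot data). `r_squared` is a hypothetical helper, and the pilot's actual method may differ.

```python
from math import sqrt


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / sqrt(sxx * syy)
    return r * r


# Hypothetical classes: % of responses matching a mapped misconception
# versus the class mean score (%).
misconception_rate = [10, 15, 20, 25, 30]
mean_score = [82, 76, 71, 64, 60]

print(f"R^2 = {r_squared(misconception_rate, mean_score):.3f}")
```

A value close to 1 means class scores fall almost on a straight line when plotted against misconception prevalence. It describes association only and does not by itself show that misconceptions cause the lower scores.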
Assessment design reveals latent gaps
Several findings highlighted how misconceptions persist across year levels. For instance, a relatively simple proportional reasoning question (aligned to earlier curriculum levels) was among the easiest overall, yet still revealed significant gaps in some classes.
Similarly, a symmetry question, nominally a Year 7 concept, proved to be the most difficult item on the test.
These patterns are consistent with Siemon et al. (2019), whose research indicates that unaddressed misconceptions compound over time. Without explicit identification and intervention, gaps are carried forward into later learning.

Instructional implications
The case studies from the pilot illustrate how schools used the data to inform practice:
- Curriculum sequencing and retrieval practice: One school identified that lower performance in an accelerated class was linked to earlier instruction in the year without sufficient spaced retrieval. This prompted a shift toward embedding retrieval practice across the curriculum.
- The use of scaffolds (e.g. formula sheets): Another school identified over-reliance on formula sheets in higher-performing classes, leading to poorer performance when supports were removed. This led to a faculty-wide decision to reduce scaffold dependency.
- Targeted lesson design: Misconception data enabled teachers to plan instruction with greater precision, for example by designing tasks specifically addressing median calculation in even-numbered datasets, rather than reteaching entire topics.
Across these examples, a consistent pattern emerges: the combination of comparative data and misconception analysis enables more focused, evidence-informed instructional responses.
Conclusion and next steps
The MESHA Year 8 pilot provides promising early evidence that comparative, misconception-linked assessment can enhance formative practice in secondary mathematics. By combining external comparison with diagnostic insight, the approach addresses key limitations of traditional school-based assessment.
The current findings demonstrate improved diagnostic clarity and instructional response, but longitudinal evidence will be critical in validating the approach.
As the pilot expands to additional cohorts and subject areas, it represents a potentially valuable addition to the assessment landscape, particularly in the middle years, where timely, actionable data remains limited.
References
Rosenshine, B. (2012) Principles of Instruction: Research-Based Strategies That All Teachers Should Know. American Educator, 36(1), 12-19.
Siemon, D., Barkatsas, T. and Seah, R. (Eds.) (2019) Researching and Using Progressions (Trajectories) in Mathematics Education. Brill, Netherlands.
Wiliam, D. (2017) Embedded Formative Assessment. 2nd ed. Solution Tree Press, Indiana.
The Science Inquiry Skills in the Victorian Curriculum 2.0
- Questioning and predicting
- Planning and conducting
- Processing, modelling and analysing
- Evaluating
- Communicating
In the earlier curriculum, inquiry skills were structured as a parallel strand to content and were often interpreted as a context for learning rather than the primary means through which understanding was demonstrated.
By contrast, Version 2.0 integrates inquiry more closely with content and foregrounds how students use evidence, data and models to construct and justify scientific explanations. This is evident in the achievement standards, which increasingly require students to analyse, evaluate and justify claims using multiple sources of evidence, particularly in Years 9 and 10.
Laying skills foundations from Years 7-10
While the five Science Inquiry Skills were present in the previous Victorian Curriculum, Version 2.0 makes their role more explicit and positions them more centrally within learning and assessment.
The increased emphasis on science skills in the new Victorian Curriculum supports stronger foundations for academic success in senior science, including Year 12 examinations.
When skills such as analysing data, evaluating evidence and communicating scientifically are developed deliberately across Years 7–10, students enter VCE better prepared for the kinds of questions that reward application, reasoning and justification, not just recall.
Embedding these skills early means students build confidence with evidence-based thinking over time, rather than encountering it for the first time in Year 12. This approach aligns teaching and assessment with what VCE science values, without reducing learning to exam preparation.
Just as importantly, working with real-world data, models and contemporary contexts helps students relate science to the information they encounter through media and everyday life. Developing the ability to question and critically evaluate scientific claims supports engagement, scientific literacy and confidence in tackling unfamiliar problems.
The table below compares the Year 7 achievement standards in the Victorian Curriculum Version 1.0 with those in Version 2.0, illustrating a shift from a focus on knowledge acquisition to an emphasis on interpretation, application and the societal impact of scientific knowledge.
While Version 1.0 emphasises understanding and applying scientific concepts, Version 2.0 positions students as critical interpreters of scientific knowledge who evaluate evidence, consider ethical and societal implications, and communicate for impact. This represents a shift in both cognitive demand and the purpose of science education.
Do whales have belly buttons?
By using science skills as the entry point to learning, lessons are designed to increase both engagement and depth of understanding.
Take a familiar Year 7 topic like classification. Rather than beginning with a presentation of animal groups and their defining features, learning can begin with a simple, curiosity-driven question: Do whales have belly buttons? This invites students to question, predict and justify their thinking from the outset.
From there, comparisons unfold naturally - whales and fish, fish and sharks, whales and hippos - with each step requiring students to use evidence, identify patterns and refine classifications.
By leading with curiosity and comparison, reasoning, evidence use and scientific communication become the engine of learning. This gives students a reason to care about the content and supports learning that is more memorable, meaningful and aligned with the skills emphasised in the Victorian Curriculum.
Why science skills are essential for VCE success
Across Biology, Chemistry, Physics and Psychology, recent VCE examinations consistently emphasise students’ ability to use scientific skills rather than rely on content recall alone. High-value questions commonly require students to analyse data, interpret models, justify conclusions and respond to unfamiliar contexts.
While recall remains part of each exam, it plays a limited role in distinguishing student performance. Recent exam papers increasingly present data or stimulus material from the outset, requiring sustained analysis and reasoning throughout questions.
As a result, VCE science exams reinforce the same core skills year after year, making their deliberate development essential rather than optional.
Science skills exam question examples
Biology – Evaluating evidence
In Section B, Question 1e of the 2024 VCE Biology examination, student performance dropped sharply when evaluation, rather than explanation, became the focus.
Although most students possessed the necessary knowledge of protein synthesis, the task required them to critically examine the scientific model itself by identifying a limitation and proposing a way it could be addressed.

According to the 2024 VCE Biology examination report, this question was answered poorly, with an average score of 0.7 out of 2 and more than 50% of students scoring zero. The report notes that many students focused on weaknesses of the biological process, rather than evaluating the model as a simplified representation used to organise and explain complex phenomena. High-scoring responses demonstrated an ability to judge the model’s explanatory power and constraints.
The challenge here was not content difficulty. It stemmed from the cognitive demand of the task. Students were required to:
- Differentiate between a process and its model
- Critically assess what the model includes and omits
- Explain the significance of those omissions
- Propose a reasoned improvement
The challenge for students was not knowledge, but evaluation as a scientific skill.
Chemistry – Processing, modelling and analysing data
Section B, Question 8b.iv of the 2024 VCE Chemistry examination required students to demonstrate data processing and interpretation, rather than chemical recall. The question asked students to determine the resolution of laboratory equipment, an electronic balance and a burette, based on how measurements were recorded, requiring them to infer precision from evidence rather than read it directly.

The 2024 Chemistry examination report shows this question was answered extremely poorly, with an average score of 0.2 out of 2 and 86% of students scoring zero. While many students correctly identified the balance resolution, most were unable to determine the burette resolution from its graduation spacing, despite this expectation being clearly outlined in the Study Design. High-scoring responses depended on applying conventions about measurement uncertainty and units, not recalling definitions.
As with Biology (and the other subject examples), the difficulty did not lie in unfamiliar content. Students struggled because they had to interpret data, apply procedural rules, and use evidence to justify an answer, revealing data analysis as another critical science skill that is commonly underdeveloped.
Physics – Processing, modelling and analysing data
In Section B, Question 16b of the 2024 VCE Physics examination, students were asked to construct and interpret a line of best fit from experimental data. Rather than recalling formulas or definitions, students needed to model data appropriately by considering all plotted points and representing the underlying trend accurately.
The 2024 VCE Physics examination report shows this question was answered poorly, with a large proportion (53%) of students losing marks because they ruled their line of best fit through the first and last data points only, ignoring the remaining data. This error is explicitly noted by assessors as persistent and conceptually incorrect, and has appeared repeatedly in previous years.
As with the other subjects, the difficulty was not conceptual Physics content. Students struggled because they were required to:
- Treat data as evidence, not decoration
- Model a relationship, rather than connect two extreme points
- Apply conventions of scientific graphing, including trend representation
- Demonstrate understanding visually, not just numerically
This question highlights data modelling as a distinct science skill: one that is assumed by the curriculum, frequently assessed, but still poorly mastered by many students.
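The difference between a fitted trend and a line ruled through the two end points can be shown numerically. The sketch below uses invented data points rather than the exam's. It compares an ordinary least-squares fit, a formal stand-in for the by-eye line of best fit students draw in the exam, with the end-point line. Function names are illustrative only.

```python
def least_squares_line(xs, ys):
    """Slope and intercept of the ordinary least-squares line through all points."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def endpoint_line(xs, ys):
    """Slope and intercept of the line through the first and last points only."""
    slope = (ys[-1] - ys[0]) / (xs[-1] - xs[0])
    return slope, ys[0] - slope * xs[0]


# Hypothetical measurements with some scatter.
xs = [0, 1, 2, 3]
ys = [0, 2, 1, 3]

fit = least_squares_line(xs, ys)
ends = endpoint_line(xs, ys)
print("least squares (slope, intercept):", fit)
print("end points    (slope, intercept):", ends)
```

The two lines have different slopes because the end-point line ignores the middle readings entirely, which is the error the examination report describes.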
Psychology – Evaluating evidence
Question 9 in Section B of the 2024 VCE Psychology examination, an extended-response question, illustrates how strongly VCAA assesses evaluation as a science skill, rather than content recall alone. Although most students demonstrated sound knowledge of CBT, synaptic plasticity and amygdala function, only 1% achieved full marks because high-level responses required students to evaluate the model’s applicability.

The 2024 VCE Psychology assessment report shows that a tiny 1% of the state scored full marks on this question, and the average mark was 4.4. The report makes clear that students who explained mechanisms well but failed to weigh strengths, limitations and implications were capped at mid-range scores. Success depended on constructing an evidence-based judgement, balancing pros and cons and reaching a justified conclusion, highlighting evaluation literacy as a critical, and commonly underdeveloped, scientific skill.
This reveals that it was not a “hard content” question. It was hard because students had to:
- Transfer a model from social anxiety to specific phobia
- Evaluate applicability, not just describe mechanism
- Balance:
  - why CBT could work (amygdala activity, synaptic plasticity, extinction learning)
  - why it might not fully explain outcomes (heterogeneity of phobias, other brain regions, behavioural vs neural change)
- Conclude, using evidence, not opinion
Unfortunately, most students stopped at the second step. Evaluation therefore operated as a bottleneck skill: rarely practised, lightly distributed across the exam, but the differentiator for high-level performance.
Common VCE science exam pitfalls
We analysed the 10 most common pitfalls on VCE exams, and how to help students avoid them.
How Edrolo gives you a skills-explicit curriculum
VCE sciences
All our VCE science courses for Biology, Chemistry, Physics and Psychology include skills videos that unpack the key science skills and their application in the specific science subject. For Edrolo Daily Practice and Daily Plus subscribers, there are also additional key science skill questions to accompany these videos that help students put the skills into practice.
The questions in all courses are written by experienced VCE teachers and mapped backwards from exam success. The questions model the specific content and skills that students will need to be able to demonstrate on the exams, well before they tackle the real thing.
VCE Biology 2025 exam question and Edrolo question


VCE Psychology 2025 exam question and Edrolo question
Both of these questions:
- Require students to 'explain' how or why Alzheimer's disease impacts retrieval of autobiographical events
- Use the 'explain' task word in a short-answer question
They vary slightly in that Edrolo's question requires application to a hypothetical scenario whereas VCAA's does not. This is still good practice, as such application is common in other exam questions.


Years 7-10 Science
Edrolo Years 7-10 Science for the Victorian Curriculum 2.0 is the only resource in the market for Years 7-10 that explicitly teaches skills progressively across the junior years.
The key science skills units and questions scaffold students’ familiarity and confidence in skills, and expose them to the types of questions and problems they’ll encounter in senior sciences. Skills are then embedded across topics with activities for students to put these into practice. Plus, the new Like a scientist video series with 7NEWS meteorologist Jane Bunn further helps students put skills into context using real world phenomena.