The methodology underpinning the analysis of trends in performance in international studies of education is complex. To ensure the comparability of PISA results across different assessment years, a number of conditions must be met.
In particular, successive assessments of the same subject must include a sufficient number of common assessment items, and these items must retain their measurement properties over time so that results can be reported on a common scale. The set of items included must adequately cover the different aspects of the framework for each domain.
Furthermore, the sample of students in different assessment cycles must be similarly representative of the target population; only results from samples that meet the strict standards set by PISA can be compared over time. Although many countries and economies have taken part in successive PISA assessments, not all of them can compare all of their PISA results over time.
Comparisons over time can be affected by changes in assessment conditions or in the methods used to estimate students’ performance on the PISA scale. In particular, from 2015 onward, PISA introduced computer-based testing as the main form of assessment. It also adopted a more flexible model for scaling response data, and treated items that were left unanswered at the end of test forms as if they were not part of the test, rather than as incorrectly answered. (Such items were considered incorrect in previous assessments for the purpose of estimating students’ position on the PISA scale.) Instead of re-estimating past results based on new methods, PISA incorporates the uncertainty associated with these changes when computing the statistical significance of trend estimates (see the section on “link errors” below).
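As a concrete sketch of the link-error approach described above: when the difference between two assessment years is tested for significance, the link error is combined in quadrature with the sampling errors of the two estimates. The function below is an illustrative implementation of that general idea, not PISA's actual computation; the score points, standard errors, and link-error value are invented for the example.

```python
import math

def trend_significant(mean_t1, se_t1, mean_t2, se_t2, link_error, z=1.96):
    """Return (difference, standard error, significant?) for a change in
    mean score between two assessment years.

    The link error captures the extra uncertainty introduced by reporting
    two cycles on a common scale; it is combined in quadrature with the
    sampling errors of the two (independent) samples.
    """
    diff = mean_t2 - mean_t1
    se_diff = math.sqrt(se_t1**2 + se_t2**2 + link_error**2)
    return diff, se_diff, abs(diff) > z * se_diff

# Illustrative numbers only: a 10-point decline between two cycles.
diff, se, sig = trend_significant(490, 2.5, 480, 2.8, link_error=3.5)
```

With these illustrative values, the 10-point decline would be statistically significant if only the two sampling errors were considered, but adding the link error widens the combined standard error enough that the change is no longer significant at the 5% level, which is exactly why the link error matters for trend reporting.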
Changes in enrolment rates do not affect the representative nature of the PISA sample with regard to its target population (15-year-olds enrolled in Grade 7 or above); nevertheless, such changes may affect the interpretation of trends.
Finally, comparisons of assessment results across years that correspond to different assessment frameworks may also reflect the shifting emphasis of the test. For example, differences between PISA 2018 (and earlier) and PISA 2022 results in mathematics reflect not only whether students have become better at mastering the common assessment items used for linking the assessments (which reflect the earlier assessment framework), but also students’ relative performance, compared with students in other countries, on aspects of proficiency that are emphasised in the most recent assessment framework.