The methodology underpinning the analysis of performance trends in international studies of education is complex. To ensure the comparability of PISA results across different assessment years, a number of conditions must be met.
In particular, successive assessments of the same subject must include a sufficient number of common assessment items, and these items must retain their measurement properties over time so that results can be reported on a common scale. The set of items included must adequately cover the different aspects of the framework.
Furthermore, the sample of students in different assessment cycles must be similarly representative of the target population; only results from samples that meet the strict standards set by PISA can be compared over time. As a result, even some countries and economies that took part in successive PISA assessments cannot compare all of their PISA results over time.
Comparisons over time can be affected by changes in assessment conditions or in the methods used to estimate students’ performance on the PISA scale. With each cycle, PISA aims to measure the knowledge and skills that are required to participate fully in society and the economy. This includes making sure that the assessment instruments are aligned with new developments in assessment techniques and with the latest understanding of the cognitive processes underlying proficiency in each domain.
A major change that took place between the 2012 and 2015 assessments of all domains, including financial literacy, was the use of computers instead of pencil and paper to deliver the assessment. The PISA 2015 field trial examined the equivalence of reading, mathematics and science items between paper-based and computer-based assessments; items that passed the equivalence test were used to link results across modes and assessment cycles. However, given the small number of countries/economies that participated in the optional financial literacy assessment, a different procedure was used to link the 2012 and 2015 financial literacy assessments. The PISA 2015 field trial included a mode-effect study in which students were randomly assigned to take the assessment in either a paper-based or a computer-based form. The financial literacy scales for 2012 and 2015 were linked by using all available data (from the 2012 main study, the 2015 field trial and the 2015 main study) and exploiting the equivalence of these two randomly assigned samples. This method provided a consistent and robust link, but it did not identify which individual items were directly comparable across modes. The PISA 2015 Technical Report (OECD, 2017[1]) provides more details about the scaling of the financial literacy assessment and the mode-effect study conducted in the context of the PISA 2015 field trial. As the PISA 2015, 2018 and 2022 assessments were all delivered on computers, no mode effects confound comparisons of results between these years.
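The logic of this linking approach can be summarised as follows, as an illustrative sketch rather than the official scaling specification. Because students in the 2015 field trial were randomly assigned to a mode of delivery, the paper-based and computer-based groups can be assumed to share the same underlying proficiency distribution:
\[
\theta^{\text{paper}} \sim g(\theta), \qquad \theta^{\text{computer}} \sim g(\theta),
\]
where \(\theta\) denotes student proficiency and \(g\) its population distribution (notation introduced here for illustration only). Calibrating all available response data jointly under this constraint places the paper-based 2012 results and the computer-based 2015 results on a common scale: any systematic difference between the two randomly assigned groups reflects the mode of delivery rather than differences in proficiency, and can therefore be absorbed into the scaling.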
A major difference in sampling design and in the scheduling of the assessment distinguishes the 2015 cycle from the other three (2012, 2018 and 2022); this difference was specific to financial literacy and did not affect the assessment of the other domains. Students assessed in financial literacy in 2012, 2018 and 2022 were tested in financial literacy – and, in addition, in mathematics and reading – at the same time as other students sat the core assessment. By contrast, students assessed in financial literacy in 2015 sat the financial literacy assessment in a separate session, after having been tested in mathematics, reading and science. In most participating countries/economies, and in a large majority of sampled schools, the financial literacy session took place on the afternoon of the same day as the core PISA tests. However, in Brazil, students in about one in three schools sat the financial literacy assessment on a different day from the core PISA tests, as did students in about four out of five schools in Italy. Genuine financial literacy trends might therefore be confounded by this change in scheduling, especially in countries/economies where most students sat the financial literacy assessment in the afternoon, as those students might have been tired after a long day of testing.
This report thus presents changes in performance between 2012 and 2022, where the major difference in implementation was the mode of delivery, and between 2015 and 2022, where the major difference in implementation was the scheduling of the assessment. It also presents changes in performance between 2018 and 2022, where there was no difference in either delivery or scheduling.
From 2015 onward, PISA also adopted a more flexible model for scaling response data, and treated items that were left unanswered at the end of test forms as if they were not part of the test, rather than as incorrectly answered. In previous assessments, such items were considered incorrect for the purpose of estimating students’ position on the PISA scale. Instead of re-estimating past results under the new methods, PISA incorporates the uncertainty associated with these changes when computing the statistical significance of trend estimates (see the section on “link errors” below).
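Concretely, the uncertainty associated with linking scales across cycles is quantified by a link error, which is added to the sampling variances when testing whether an observed change in mean performance is significant. As an illustrative sketch of the general form of this adjustment (the link errors themselves are discussed below):
\[
\sigma\!\left(\hat{\mu}_{2022}-\hat{\mu}_{2012}\right)
 = \sqrt{\sigma^{2}_{2022} + \sigma^{2}_{2012} + \sigma^{2}_{\text{link}(2012,\,2022)}},
\]
where \(\hat{\mu}_{2012}\) and \(\hat{\mu}_{2022}\) are the estimated mean scores, \(\sigma_{2012}\) and \(\sigma_{2022}\) are their standard errors, and \(\sigma_{\text{link}(2012,\,2022)}\) is the link error between the two scales. A change is reported as statistically significant only if it is large relative to this combined standard error.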
Changes in enrolment rates do not affect the representative nature of the PISA sample with regard to its target population (15-year-olds enrolled in Grade 7 or above); nevertheless, such changes may affect the interpretation of trends.