Digital technologies and data – including Artificial Intelligence (AI) – hold the potential to automate and thus improve the efficiency and effectiveness of regulatory, supervisory and enforcement activities. These functions have become increasingly complex, given the substantial increase in data of regulatory relevance to be processed in recent years, along with the growth of digital market forces posing new challenges. Market regulators and public enforcement authorities have turned to supervisory technology (SupTech) tools and solutions as a means to improve their surveillance, analytical and enforcement capabilities, which can in turn have important benefits for financial stability, market integrity and consumer welfare. This chapter takes stock of the most common uses of SupTech by regulatory, supervisory and enforcement authorities to date, identifies its associated benefits, risks and challenges, and outlines considerations for devising adequate SupTech strategies.
OECD Business and Finance Outlook 2021
5. The use of SupTech to enhance market supervision and integrity
Abstract
5.1. Introduction
Digital technologies and data are transforming the ways in which people, firms, and governments live, interact, work and produce at an accelerating rate (OECD, 2019[1]). This chapter considers the implications of this transformation for the supervisory and enforcement practices of market regulators and public enforcement authorities, which can be rendered more efficient through the use of supervisory technology (SupTech), including a growing potential for the use of AI.
SupTech usually refers to the use of digital tools and solutions – including hardware and software – by public sector regulators and supervisors to carry out their responsibilities (FSB, 2017[2]; BCBS, 2018[3]). While some variations exist1 as to what falls under the umbrella of SupTech, the term has mainly been used to refer to supervisory practices involving financial institutions and securities markets (World Bank, 2018[4]; di Castri et al., 2019[5]). However, recognising the potential of digital technologies and data to automate and thus improve the efficiency and effectiveness of supervisory and enforcement processes, this chapter considers the relevance of SupTech applications and concepts for a wider range of institutions with regulatory and enforcement responsibilities for private sector conduct – including not only securities and financial regulators, but also competition authorities and anti-corruption authorities – whose functions entail protecting investors and consumers, ensuring that markets are fair, efficient and transparent, and reducing systemic risk. By improving surveillance, analytical and enforcement capabilities of authorities, SupTech can have important benefits for financial stability, market integrity and consumer welfare (FSB, 2020[6]).
Beyond enhancing the overall capacity and efficiency of supervisory oversight at a general level, SupTech applications may be particularly relevant to better detect insider trading, market manipulation and misconduct, as well as to better determine compliance with and enforce regulatory requirements that are principle-based or comprise judgment-based rules, such as corporate disclosure requirements. In particular, the increased availability of data on the outcomes arising from different policy interventions that were previously imperfectly observable – or only observable at significant costs – enables improved monitoring and supervision, and more effective enforcement of policies (OECD, 2019[7]).
By extension, SupTech solutions also have the potential to alleviate the regulatory burden on regulated entities, which have themselves turned to regulatory technology (RegTech) tools to improve compliance outcomes against regulatory requirements and enhance risk management capabilities. Such solutions hold the potential to reduce costs related to regulatory reporting, data collection and risk management (ESMA, 2019[8]).
According to IBM estimates (2018[9]), poor data quality costs the United States economy around USD 3.1 trillion a year, and one in three US business leaders do not trust the information they use to make decisions. Research also suggests that while financial authorities have access to a growing wealth of data to guide their decisions and actions, they tend to lack the infrastructure or skills to make use of this data, with increasing amounts of data often simply translating into more manual data processing and leading to “analysis paralysis” down the line (R²A, 2019[10]). As data continues to increase in volume, velocity, variety and complexity, it is essential that both regulators and market participants develop systems to appropriately process, monitor and analyse datasets of regulatory relevance.
This chapter takes stock of the most common uses of SupTech by supervisory and enforcement authorities to date, identifies the main benefits, risks and challenges associated with its adoption, and outlines considerations for devising adequate SupTech strategies. It draws upon insights from reports prepared by international bodies2 and other surveys, reviews of cases and research3 which appear to reflect an emerging consensus around some of the benefits and challenges related to the application of SupTech for market supervision. As part of the broader SupTech framework, the use of AI and its further potential is also addressed, illustrating both SupTech and AI use for market oversight, with a particular focus on the review of cases related to the enforcement of securities, competition and anti-bribery and corruption laws and regulations.
5.2. Drivers and typology of SupTech developments
Demand and supply drivers have simultaneously spurred the development and application of SupTech tools and methods by supervisory and enforcement authorities across policy areas (ESMA, 2019[8]; FSB, 2020[6]). While supply drivers permeate all three policy areas considered in this chapter, and include the availability of new analytical methods and tools at lower costs that allow large datasets to be collected, stored and analysed more efficiently, demand drivers are specific to each policy area and context. However, all converge on the need for authorities to adopt tools to process the increasing volume and availability of data being produced with respect to both traditional and digital markets.
In particular, financial and securities regulators have turned to digital tools to enhance their supervisory capability and efficiency in the aftermath of the 2008 global financial crisis, which led to an increasing complexity and volume of regulations, in turn leading to a substantial increase in regulatory data to process (FSB, 2020[6]). Competition agencies face similar challenges, with firms using digital technologies to engage in anticompetitive conduct (as discussed in Chapter 4), requiring consideration of a growing volume and complexity of data on market conduct. Because the right infrastructure and expertise are needed to enforce competition laws in digital markets – especially when faced with well-resourced merging parties and defendants – competition authorities have recognised the importance of building up their capabilities.
Public enforcement authorities involved in the fight against corruption and foreign bribery have similar but also specific drivers to adopt SupTech tools. For instance, as non-trial resolutions in foreign bribery cases involving companies are becoming increasingly available in several jurisdictions, with many more considering their introduction, self-reporting and cooperation with authorities are encouraged4 (OECD, 2019[11]). As these types of multi-jurisdictional cases involve large-scale investigations, companies are increasingly deploying AI tools in their cooperation efforts with authorities5. Therefore, authorities need to be able to understand how AI tools operate, how the information that is being provided to them through this process is selected and, most importantly, be able to analyse this volume of data in order to effectively exercise their enforcement functions.
Overall, as the volume and frequency of both structured and unstructured data being produced increases substantially, so does the need for architectures or systems that are able to collect, store, analyse and visualise these new forms of data6. For instance, in addition to regulatory returns from regulated entities, authorities leverage open source information (e.g. social media posts) to enhance their insights. According to a recent survey undertaken by the FSB (2020[6]), while regulatory, statistical, and market structured data make up the majority of data types collected from reporting institutions (respectively 45%, 22% and 12%), unstructured data amount to around one-fifth of the data collected by authorities (14% of regulatory unstructured data, 4% of statistical unstructured data, and 3% of market unstructured data). While unstructured data may offer useful insights, it is often collected in a format that makes it difficult to process and analyse.
The greater availability of “big data” itself stems from the increasing volume, frequency and granularity of reporting requirements, combined with the growth of the digital economy. Characterised by the “4 Vs” (volume, variety, velocity, validity), big data can pose data governance challenges for authorities, which have turned to technologies enabling sophisticated data processing techniques and generating advanced analytics. Despite the wide range of supervisory technologies available, their distinct features make their respective applications most relevant in specific areas of the data lifecycle. For instance, machine learning (ML) and natural language processing (NLP) are mostly applied by authorities for data analysis, processing and validation, while cloud computing is most often used for data storage, and blockchain is considered to offer potential for data collection (FSB, 2020[6]).
SupTech applications evolve along with technological innovations. To date, SupTech initiatives may – allowing for a certain degree of simplification – be classified as belonging to four successive technological layers or “generations”, which respectively generate descriptive, diagnostic, predictive and prescriptive analytics (Figure 5.1) (di Castri et al., 2019[5]). While the first generation covers primarily manual data management workflows, the second involves the digitisation7 of certain paper-based processes in the data pipeline. These early generations of data architecture support mostly descriptive and diagnostic analytics (i.e. describing what happened and diagnosing why it happened). In a continuum, the third generation covers big data architecture, and the fourth involves AI as its main attribute – both enabling predictive and prescriptive analytics, in addition to enhanced descriptive and diagnostic analytics (i.e. predicting what will happen and prescribing anticipatory action).
As authorities’ use of predictive and prescriptive analytics has emerged only recently, these applications are still at the experimental or development stages, but are gaining momentum. By fully automating data processing and optimising data storage and computation through tools such as application programming interfaces (APIs) and robotic process automation (RPA), big data architectures8 can process larger datasets with greater computing power – in turn generating advanced insights such as predictive analytics. As AI-enabled solutions require large volumes of data and significant computing power in order to generate valid and actionable results, they are usually built upon pre-existing big data architectures. This fourth generation is characterised by machine-driven data management and analysis – which may involve natural language processing and machine learning to collect unstructured and disparate data, as well as recommendation engines suggesting courses of action. Chatbots may also be leveraged to perform tasks such as responding to and resolving complaints (di Castri et al., 2019[5]).
According to a recent survey from the Financial Stability Board (FSB) undertaken among FSB members (2020[6]), the first and second generations of SupTech initiatives encompass the majority of technologies used by supervisory authorities, with 49% of surveyed authorities using data analysis functions for descriptive outputs and 32% for diagnostic outputs. Only a minority of respondents report using technologies comprised in the third predictive category (11%), and the fourth prescriptive category (8%). Echoing these findings, a recent report from FinCoNet (2020[12]) based on survey responses from 21 market conduct and financial consumer protection authorities similarly demonstrates that while some SupTech tools currently deployed in this arena are used to make predictions, the majority are designed to collect or analyse data or automate workflows.
While third-generation data collection solutions and fourth-generation data analytics potentially yield the most value for authorities by enabling forward-looking supervision9 and greater storage and mobility capacity, technologies comprised within earlier SupTech generations can still generate sufficient information and substantial efficiency gains to be beneficial as well – especially with regard to enforcement processes10 (di Castri et al., 2019[5]; Dias and Staschen, 2017[13]).
5.3. The benefits of SupTech
As regulatory, supervisory and enforcement authorities all rely on data, internal procedures and working tools, as well as human and other resources, they all face common challenges – albeit to varying degrees – related to low data quality and time-consuming manual procedures (Dias and Staschen, 2017[13]). SupTech applications can help authorities address these challenges by enhancing their capability, efficiency and effectiveness in terms of data collection and analysis – in particular by enabling the automation of routine tasks, the development of new analytical techniques, and the provision of better insights. By using tools to analyse increasing volumes of both structured and unstructured data of supervisory and enforcement relevance, authorities can shift their focus away from labour-intensive tasks to activities requiring human judgement and expertise, allowing them to better allocate human resources and reduce costs over time. SupTech applications can be developed in-house, by external vendors, or a combination of both.
Overall, SupTech tools in the areas of corporate governance, competition and anti-corruption are most commonly applied by supervisory and enforcement authorities to i) enhance their detection capabilities, and ii) increase the efficiency of enforcement actions. While these two purposes are not mutually exclusive and should be envisaged as intertwined, the first focuses on adoption of tools by authorities that enable the detection of new forms of market manipulation and anti-competitive conduct that analog tools cannot detect, while the second focuses on efficiency gains enabled by digital technologies in pre-existing enforcement processes. SupTech tools can also help authorities improve their data collection and management capabilities, which can in turn improve data quality – itself a pre-requisite for enhanced data analysis.
5.3.1. Improving detection capabilities
Evidence suggests that securities regulators, competition authorities and law enforcement agencies involved in combatting corruption are increasingly using SupTech tools to better detect, respectively, i) insider trading and other types of misconduct (such as money laundering, terrorist financing, mis-selling and fraud), ii) anti-competitive behaviour, and iii) foreign bribery and corruption allegations. SupTech applications – including AI tools – are particularly relevant for these purposes, as conduct supervision relies on the analysis of large amounts of granular, time-sensitive and unstructured data from disparate sources. In addition, as digital technologies enable new forms of money laundering, terrorist financing, mis-selling, fraud and anti-competitive behaviour to arise, new tools are required to detect and tackle them.
Use cases by financial and securities regulators: better detecting market manipulation and insider trading
According to a recent FSB survey (2020[6]), SupTech applications have gained most momentum in recent years for misconduct analysis, with the largest increase in the number of reported use cases by authorities since 2016. Evidence suggests that authorities use advanced analytics such as machine learning, natural language processing, text mining and network analysis to enhance their capacities – especially with regard to detecting networks of related transactions, identifying anomalies and unusual behaviours, and drawing insights from extensive amounts of structured and unstructured data (Coelho, De Simoni and Prenio, 2019[14]).
For example, Mexico’s National Banking and Securities Commission (CNBV) has developed a prototype for an NLP application to detect what a suspicious Anti-Money Laundering/Combating the Financing of Terrorism network is ‘talking about’, thus facilitating the detection of unusual transactions, relationships, and network events to identify potential money laundering issues that cannot be identified by people. The rationale for developing such a prototype is that the rise of digital financial products and services poses new challenges for Mexico’s financial authorities: traditional methods and models of capturing and analysing regulatory data are ill-suited to cope with the surfeit of data being generated by new platforms, products, and customers (CNBV/R2A, 2018[15]).
The Central Bank of Brazil (BCB) also launched a SupTech - Natural Language Processing Applications for Supervision project (SupTech-NLP) in 2020, with the aim of incorporating into supervision processes AI applications for document processing based on NLP techniques. Within SupTech-NLP, BCB’s conduct supervision department developed a prototype for a robot that downloads data from financial consumer complaints’ websites and categorizes them through machine learning. Access to official consumer complaints’ databases is currently being discussed with consumer protection authorities from the Ministry of Justice.
Authorities can also leverage big data architectures to perform real-time market surveillance. Securities regulators have started to leverage these technologies to transform large datasets into usable patterns for detecting potential insider trading and market manipulation. However, designing and implementing tools focused on certain aspects of market surveillance can be complex due to the large volume and variety of data required (i.e. regulatory and market data and intelligence). As new technologies become available, they may facilitate the development and deployment of such tools (FSB, 2020[6]). Nevertheless, some authorities have already successfully deployed these solutions.
For instance, the Australian Securities and Investments Commission (ASIC) developed a Market Analysis and Intelligence (MAI) platform, which collects real-time data feeds from all Australian primary (ASX) and secondary (Chi-X) capital markets for equity and equity derivatives products and transactions (Box 5.1). Likewise, in the European Union (EU), the German Federal Financial Supervisory Authority (BaFin) is setting up an integrated automated alarm and market monitoring system (ALMA) for analysing potential market abuse cases, including insider trading and market manipulation (BaFin, 2017[16]). In North America, the Canadian Securities Administrators (CSA) is developing its Market Analysis Platform (MAP) to collect post-trade data from exchanges, alternative trading systems (ATSs) and dealers/brokers in order to facilitate enforcement investigation of potential insider trading and market abuse cases (CSA, 2018[17]).
Box 5.1. Market Analysis and Intelligence (MAI) platform by the Australian Securities and Investments Commission (ASIC)
The Australian Securities and Investments Commission (ASIC) has developed a Market Analysis and Intelligence (MAI) platform, which collects real-time data feeds from all Australian primary (ASX) and secondary (Chi-X) capital markets for equity and equity derivatives products and transactions. In particular, the MAI platform has a real-time alert monitor that detects and identifies abnormalities in order and trade messages in traded securities. It also contains standard reports to allow analysts to drill down and analyse market data to identify trading accounts of interest that may be undertaking market misconduct such as insider trading and market manipulation. Overall, the standard dashboards within MAI include Real-Time Alert Monitor, Market Summary, Market Manipulation and Insider Trading Reports and the Market Replay, which allow for real-time or historical review of the market for a particular security. The MAI platform was preceded by the SMARTS market intelligence system.
ASIC has recently upgraded MAI from a non-cloud, Flex system to a cloud-based, HTML5 system, and has the latest version of its current vendor’s platform which includes enhanced functionality to ingest, analyse and visualise data. ASIC intends to leverage the enhanced functionality of the upgraded SMARTS market intelligence system to increase its surveillance capabilities of the Fixed Income Clearing Corporation markets and further utilise information received from the Australian Tax Office. This work is being undertaken in-house and is experimental/in-development. This capability was developed on the upgraded MAI system’s sandbox environment called Kx Analyst. Datasets that have been ingested include OTC Trade Repository Data, Bond Clearing information from Austraclear and Global Legal Entity Identifier data.
The Kx Analyst environment uses proprietary KDB+ technology and interfaces with various open source languages such as Python and R, providing ASIC analysts with a single data science environment. ASIC currently receives trading account information and related relationship information, including spousal, residential and business address information, from the Australian Tax Office. From this information, ASIC has created a data set of an anonymised map of linked trading accounts. This data set will be ingested into Kx Analyst and will be linked to MAI trading data to create different analytics to improve ASIC’s market surveillance capability of identifying market misconduct.
Source: ASIC.
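As an illustration of how a map of linked trading accounts such as the one described in Box 5.1 might be assembled, the sketch below groups accounts into clusters using a union-find structure. It is a minimal example under stated assumptions: the account identifiers, the `links` list and the function name `linked_account_groups` are hypothetical, and the source does not describe how ASIC actually builds or queries its data set in Kx Analyst.

```python
from collections import defaultdict


def linked_account_groups(links):
    """Group accounts into clusters connected by at least one relationship link."""
    parent = {}

    def find(account):
        # Return the representative of the cluster containing this account.
        parent.setdefault(account, account)
        while parent[account] != account:
            parent[account] = parent[parent[account]]  # path halving
            account = parent[account]
        return account

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two clusters

    groups = defaultdict(set)
    for account in list(parent):
        groups[find(account)].add(account)
    return sorted((sorted(g) for g in groups.values()), key=lambda g: g[0])


# Hypothetical anonymised links (e.g. a shared address or a spousal relationship).
links = [("A1", "A2"), ("A2", "A3"), ("B1", "B2")]
groups = linked_account_groups(links)
print(groups)
```

Once accounts are clustered in this way, trading activity could in principle be aggregated per cluster rather than per account, which is one way of reading the box's reference to linking the map to MAI trading data.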
Use cases by agencies involved in combatting corruption: better detecting criminal allegations and fraud
Law enforcement agencies become aware of corruption and foreign bribery allegations through many different sources – including through media articles, embassies, international cooperation, self-reporting, financial intelligence units (FIUs), tax authorities, and whistleblowers (OECD, 2017[18]). Some of these sources depend on the processing of large amounts of diverse sets of data to detect suspicious transactions or patterns that could lead to an investigation. Against this backdrop, survey results suggest that the use of AI tools by law enforcement agencies can stand as a catalyst to better identify such transactions and patterns, in turn leading to greater detection rates.
In particular, the use of AI tools appears to be most relevant for financial intelligence units (FIUs) to better detect criminal allegations. This is because anti-money laundering regulations generally impose multiple reporting obligations on financial institutions and Designated Non-Financial Businesses and Professions (DNFBPs), and FIUs also usually use information from other sources to prepare financial intelligence reports. As such, many countries appear to have already adopted – or are aiming to adopt – AI tools to sort, connect, and prioritise data in suspicious transaction reports. For instance, the German FIU reports that one of the main benefits of using AI tools would be to facilitate the identification of potentially relevant suspicious transaction reports connected to serious crime without the need to exhaustively describe all characteristic attributes of various typologies in the form of mathematical rules. Instead, these rules would be automatically derived from labeled training data.
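The idea of deriving rules from labeled training data, rather than writing them by hand, can be illustrated with a deliberately simple sketch. It learns a single threshold rule (a one-feature "decision stump") from a handful of hypothetical labeled reports. The feature, the data and the function name `learn_threshold` are assumptions made for illustration only and do not represent the German FIU's or any other authority's actual method.

```python
def learn_threshold(examples):
    """Pick the amount threshold that best separates labeled reports.

    examples: list of (amount, is_suspicious) pairs.
    Returns (threshold, accuracy) for the rule "flag if amount >= threshold".
    """
    candidates = sorted({amount for amount, _ in examples})
    best_threshold, best_correct = None, -1
    for threshold in candidates:
        # Count how many examples the candidate rule classifies correctly.
        correct = sum((amount >= threshold) == label for amount, label in examples)
        if correct > best_correct:
            best_threshold, best_correct = threshold, correct
    return best_threshold, best_correct / len(examples)


# Hypothetical labeled training data: (transaction amount, flagged by an analyst).
training = [
    (120, False), (300, False), (950, False),
    (9_800, True), (12_500, True), (15_000, True),
]
threshold, accuracy = learn_threshold(training)
print(threshold, accuracy)
```

Real systems would combine many features and more expressive models, but the principle is the same: the rule is inferred from examples that analysts have already labeled, not specified in advance.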
Governmental auditing authorities also appear to be using AI tools to detect irregularities in public procurement and to screen corruption reports. For instance, Brazil’s Comptroller General reported the use of AI tools to sort and triage corruption reports from ombudsman platforms and to decide which cases merit further investigation (FARO System). Brazil and another member of the OECD Working Group on Bribery (WGB) also reported using AI tools to help raise red flags that allow authorities to intervene in tainted public procurement procedures before the awarding of the contract.
Use cases by competition authorities: better detecting cartels and other types of anti-competitive practices
While reactive methods of detecting anti-competitive conduct, such as leniency regimes and complaints, continue to be effective methods of detection, proactive detection tools are particularly important in the digital world, where business practices evolve quickly and firms use new technologies to implement anticompetitive conduct. Increasing data availability regarding traditional markets can also facilitate the use of these tools. Overall, predictive SupTech tools can be used by competition authorities to better detect atypical signs or suspicious behaviours in the market, and in turn help determine enforcement priorities and initiate in-depth investigations. Difficulties may arise, however, in ensuring that predictive models are applicable when working across different sectors and markets where the data and the related challenges are distinct.
Cartel screening
Market screens are economic tools that can assist competition authorities in their investigations. As described in Chapter 4, these can include structural screens, which may identify markets where authorities may wish to pay particular attention given certain characteristics that might make collusion more likely (including product homogeneity and oligopolistic market structures). Authorities are also exploring the use of behavioural screens by looking for patterns of unusual or unexplained behaviour, and identifying “structural breaks” in market data that could show the implementation of a cartel agreement or the adaptation of the cartel to market changes. The use of such screens is supported by the OECD 2019 Recommendation of the Council concerning Effective Action against Hard Core Cartels, which recommends the use of “pro-active cartel detection tools such as analysis of public procurement data, to trigger and support cartel investigations” in implementing an effective cartel detection system (OECD, 2019[19]). In the fight against collusion, digital cartel screening tools are becoming increasingly important, especially for behavioural screens, which tend to be data and resource intensive. Such screens have been most notably used to detect bid-rigging cartels, which represent a significant share of cartel enforcement in many jurisdictions.11
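To make the notion of a "structural break" more concrete, the following sketch looks for the point in a price series where a single level shift best explains the data, by minimising the squared error within the two resulting segments. It is a minimal illustration, assuming a hypothetical price series and a single break; the function name `find_structural_break` is invented for this example, and authorities' actual screens would typically rely on more robust econometric tests.

```python
from statistics import mean


def find_structural_break(prices, min_segment=3):
    """Return (index, shift) for the split that minimises within-segment squared error."""
    if len(prices) < 2 * min_segment:
        raise ValueError("series too short for the requested segment length")

    def sse(segment):
        # Sum of squared deviations from the segment mean.
        m = mean(segment)
        return sum((p - m) ** 2 for p in segment)

    best_index, best_cost = None, float("inf")
    for i in range(min_segment, len(prices) - min_segment + 1):
        cost = sse(prices[:i]) + sse(prices[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    shift = mean(prices[best_index:]) - mean(prices[:best_index])
    return best_index, shift


# Hypothetical monthly average prices with a level shift starting at index 6.
prices = [100, 101, 99, 100, 102, 98, 115, 116, 114, 117, 115, 116]
break_index, shift = find_structural_break(prices)
print(break_index, shift)
```

A detected shift is only a signal for further review: cost shocks or demand changes can produce the same pattern, which is why such screens are used to prioritise cases and not to establish an infringement.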
Screening methods can include statistical and econometric techniques, network analysis and machine learning methods – and as such can particularly benefit from advanced data analysis solutions.12 Some of these techniques can be carried out in a supervised or unsupervised manner (OECD, 2020, p. 3[20]). For instance, the Spanish competition authority makes use of data mining techniques “such as applying statistical, econometric and machine learning to try to detect patterns of behaviour that evidence the existence of anticompetitive agreements”. Where data are limited, techniques such as web scraping or text mining can locate data (OECD, 2020, pp. 2-3[20]).
Recognising the potential of digital screens in their enforcement activities, several competition agencies have invested significant resources in the development of market screening tools based on algorithms that help to identify possible signs of collusion, such as suspicious patterns or pricing. Some jurisdictions have developed specific screening programmes that use data from electronic government procurement databases to monitor bids and bidding patterns to identify collusive bidding. For example, the Brazilian competition agency has a data analytics and screening project (Project Cérebro), and the Korean Fair Trade Commission (KFTC) has developed a Bid Rigging Indicator Analysis System (BRIAS), which provides for the automatic review of procurement data (Box 5.2).13 Other jurisdictions such as Spain and Canada are currently developing similar screening tools (OECD, 2020, p. 6[21]).
Box 5.2. Examples of Digital Cartel Screens
Brazil’s Cérebro (Brain) Project
Brazil’s competition authority, the Administrative Council for Economic Defense (CADE) has developed a screening project called Cérebro (the “Brain”). Cérebro is a platform that allows the integration of large public procurement databases by applying data mining tools and economic filters capable of identifying and measuring the probability of cartels occurring in public bids.
Cérebro’s data mining tools allow for the automation of the analyses formerly conducted by investigators and case handlers. The objective is both the identification of evidence of cartels in public bids, such as suspicious, implausible facts or behavioural patterns, and the provision of relevant information for the investigation of the cases. The economic filters in the platform are based on specialist literature and econometrics. They seek to provide generalised evidence of the existence of cartels based on data related to prices, costs, profit margins, market share, etc. Starting from firms’ behaviour as described in academic articles, CADE derived mathematical models to serve as statistical tests for general use, in a kind of reverse engineering process.
Since 2014, CADE has initiated some investigations thanks to the Cérebro tool. The tool continues to evolve. The project team is exploring possibilities of using machine-learning algorithms to preselect digital evidence more likely to contain information relevant for the investigation (OECD, 2019[22]). Currently, CADE has three ongoing investigations based on findings obtained using the Cérebro platform, and is about to start formal proceedings supported by findings from its first investigation’s use of screening techniques.
Korea’s Bid Rigging Indicator Analysis System (BRIAS)
In 2006, the Korean Fair Trade Commission (KFTC) developed the Bid Rigging Indicator Analysis System (“BRIAS”) to help detect bid rigging. BRIAS is an automatic quantitative analysis IT system that analyses large amounts of online public procurement data and, based on indicators incorporated in it, quantifies the likelihood of bid rigging.
BRIAS collects online public procurement data concerning large-scale contracts awarded by central and local administrations within 30 days of the contract award. Then, the system analyses the data and generates scores on the likelihood of bid rigging by assessing factors like tender method, number of bidders, number of successful bids, number of failed bids, bid prices above the estimated price, and price of winning bidder. Each of these factors is assigned a weighted value and all values are then added up. For instance, higher rates of successful bids and lower numbers of participating companies are indicative of a possibility of collusion. All bids are also screened according to search criteria like the name of the winning candidate, or bids with similar scores.
Source: (OECD, 2020, p. 21[23])
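The weighted scoring described for BRIAS can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the indicator names, the weights, the cut-off values, and the reading of the “rate of successful bids” as the ratio of the winning price to the estimated price. The source does not disclose the KFTC’s actual indicators or weights.

```python
from dataclasses import dataclass

# Hypothetical weights, chosen for illustration only (not the KFTC's values).
WEIGHTS = {
    "few_bidders": 0.4,
    "high_win_ratio": 0.4,
    "bids_above_estimate": 0.2,
}


@dataclass
class Tender:
    tender_id: str
    num_bidders: int          # assumed to be at least 1
    winning_price: float
    estimated_price: float
    bids_above_estimate: int  # number of bids priced above the estimate


def indicator_values(t: Tender) -> dict:
    """Map raw tender data to indicator values in [0, 1] (illustrative rules)."""
    return {
        # Fewer participating companies -> indicator switched on.
        "few_bidders": 1.0 if t.num_bidders <= 3 else 0.0,
        # Winning price very close to the estimated price -> switched on.
        "high_win_ratio": 1.0 if t.winning_price / t.estimated_price >= 0.95 else 0.0,
        # Share of bids submitted above the estimated price.
        "bids_above_estimate": t.bids_above_estimate / t.num_bidders,
    }


def collusion_score(t: Tender) -> float:
    """Weighted sum of the indicator values; higher means more suspicious."""
    values = indicator_values(t)
    return sum(WEIGHTS[name] * values[name] for name in WEIGHTS)


tenders = [
    Tender("A-001", num_bidders=2, winning_price=98.0,
           estimated_price=100.0, bids_above_estimate=1),
    Tender("A-002", num_bidders=8, winning_price=80.0,
           estimated_price=100.0, bids_above_estimate=0),
]

# Rank tenders so that the most suspicious ones are reviewed first.
ranked = sorted(tenders, key=collusion_score, reverse=True)
```

A score of this kind only prioritises tenders for human review; it is not evidence of collusion in itself.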
Colombia’s Sherlock Project
Colombia’s competition authority, the Superintendence of Industry and Commerce (SIC), has launched a screening project (Project Sherlock) that seeks to support the SIC’s investigators in the identification of signs or patterns that suggest potential anticompetitive behaviours in the data available from public procurement processes.
The first stage of the project consisted of developing a tool that could facilitate the access of investigators to public procurement data available online. The tool collects and organises public procurement data and provides simple descriptive analysis to the investigators in a user-friendly manner. In this first stage of the project, the investigators are still tasked with identifying suspicious, implausible facts or behavioural patterns based on the data presented by the tool. The second stage of the project involves the automation of the above-mentioned tasks, in which the tool would automatically identify red flags in the procurement process, in addition to providing simple descriptive analysis of the data.
Source: SIC
Adapting techniques to investigate harm facilitated by algorithms
Some competition authorities have also adapted their techniques to investigate harm facilitated by algorithms. In a recent paper, the UK CMA outlines inter alia techniques that could be used without access to firms’ data and those that could be used once an investigation has been launched or from available information disclosed as part of remedies14. Without access to firms’ data and algorithms, the analysis authorities can conduct will depend on the level of transparency. Where an algorithm’s outputs are transparent, such as when an algorithm sets the price offered to consumers on a website, techniques such as mystery shopping can be used to better understand the operation of the algorithm. Crawling and scraping can help increase transparency by extracting data, and reverse engineering methods, including the use of APIs, can help locate outputs in more complex cases. It is not always necessary to have access to the code to identify the harm. An authority could conduct an analysis where the input data used by the algorithm is available, or as mentioned above, where the algorithm’s outputs are transparent. When competition authorities “have access to the code”, the UK CMA describes three possible methods: dynamic analysis, static analysis and a manual code review. The first method involves “automated testing through execution of the code” and is considered to be the most effective (OECD, 2019, p. 40[22]).
Price monitoring tools
Digital tools can also be used to monitor firms’ pricing strategies and to detect anticompetitive practices such as resale price maintenance (RPM). While algorithms can facilitate anticompetitive conduct, they can also be a powerful detection tool for competition authorities. For example, the UK CMA’s DaTA unit has developed an in-house price-monitoring tool, which was used to make it easier for the case teams to detect resale price maintenance in the musical instruments sector (OECD, 2014[24]). These types of price monitoring tools can also be useful for investigation teams in determining whether the anticompetitive conduct is more widespread than the targets of the investigation. There are challenges, however, in using such tools to distinguish RPM from other normal market behaviour, as signals identified by price monitoring software are not necessarily linked to an RPM strategy (see Chapter 4). Colombia’s SIC has also developed a price-monitoring tool under its project “Sabueso” that collects data on products sold on-line in order to help its investigators discover suspicious pricing behaviour in e-commerce. The tool relies on machine learning to identify the same product in different on-line stores sold under different names and descriptions (OECD, 2020[25]).
5.3.2. Improving efficiency in enforcement actions
AI tools can significantly increase efficiency in enforcement actions, as investigations and prosecutions demand extensive time and resources. In particular, authorities are often required to devote extensive human resources to cope with increasingly complex cases, often in an environment of scarce resources. As the average duration of a foreign bribery case is 7.3 years, the OECD Working Group on Bribery (OECD WGB) has recommended in 10 out of its 15 Phase 4 evaluation reports published so far15 that its member countries increase the resources allocated to law enforcement agencies fighting foreign bribery (OECD, 2014[24]). Competition authorities face similar challenges, as their budgets have decreased in real terms by approximately 5% on average between 2015 and 2018 (OECD, 2019[22]). Likewise, securities regulators also have resource limitations that constrain their ability to supervise and enforce corporate governance standards, as many securities regulators are less well funded than banking regulators (OECD, 2014[24]).
Against this background, AI tools can be useful to review large-scale evidence by ensuring that submissions comply with format and structure requirements, and analysing evidence using machine learning techniques such as NLP. Overall, AI tools are particularly well suited to standardise procedures and repetitive tasks involving large amounts of data. In the case of competition authorities and given the tight timelines for investigations, it can however be difficult to design sophisticated applications that are tailored to individual cases.
Use cases by securities regulators: better determining compliance with disclosure requirements and guiding enforcement actions
As many authorities continue to rely on heavily manual processes, challenges remain as to how to make effective use of unstructured or qualitative data, such as information comprised within disclosure materials or annual reports. SupTech tools can be leveraged by authorities that must undertake complex, qualitative analyses to determine compliance with legislation or regulation that is often principle-based or comprises judgment-based rules (World Bank, 2018[4]). AI tools – including machine learning and natural language processing – are particularly relevant in that respect.
For instance, the Malaysian Securities Commission (SC Malaysia) uses artificial intelligence (AI) to monitor the adoption of corporate governance best practices and quality of disclosures by listed companies on the Malaysia Stock Exchange (Bursa Malaysia). Since 2017, listed companies are required to report on their adoption of the Malaysian Code on Corporate Governance using a prescribed template for corporate governance reports. This template is designed to facilitate data extraction, evaluation and analysis by the AI system, which considers inter alia the type of information disclosed, depth of explanation, and in relation to departures, the strength of alternative practices. The use of AI has enabled SC Malaysia to annually report data and observations in relation to the adoption of the Malaysian Code on Corporate Governance and the quality of disclosures, including year-on-year progress, in SC Malaysia’s Corporate Governance Monitor report. The data also supports evidence-based regulatory measures to improve corporate governance practices or address areas of concern – including practices with low score for disclosure (SC Malaysia, 2020[26]).
AI tools can also be used to guide authorities’ enforcement actions related to suspicious trading activities that may constitute market manipulation. For instance, the Monetary Authority of Singapore (MAS) has deployed an augmented intelligence system called “Apollo” that automates the computation of key metrics used in the analysis of suspicious trading activities, and assesses the likelihood that certain types of market manipulation have occurred. As a “Robo-Expert”, it seeks to predict the likelihood of positive prosecution outcomes for new cases by understanding how experts analyse market misconduct cases. MAS built and trained Apollo using expert reports and the trading data from cases that they had successfully prosecuted in the past. Several benefits have resulted from its implementation. Automated trade analysis reduces the need for manual computation, helps to identify fraudulent transactions with higher market impact and provides greater insight into market trading behaviours. In particular, it allows for the testing of various case scenarios to fine-tune investigation strategies for individual cases, thus also helping with case prioritisation and guiding decisions on the appropriate courses of enforcement actions (MAS, 2019[27]).
Use cases by agencies involved in combatting corruption and foreign bribery: better resolving cases
The majority of the respondents to the OECD WGB survey mentioned efficiency as part of the benefits of using AI tools. Corruption and foreign bribery investigations often require the analysis of data from several sources, including companies’ books and records, third-party sources and government authorities including tax and corporate registry information and financial intelligence, among others. AI tools can allow investigators of ongoing cases to detect patterns in a timely and effective manner and to extract better evidence from different sets of data, in turn increasing the efficiency of enforcement actions.
In particular, as the language in foreign bribery tends to be very obscure – including code words and colloquialisms to hide the discussions around the transactions – machine learning tools can be used to find more material that is relevant to investigations with those words, faster than traditional keyword searches. Image-based classification models can also allow authorities to retrieve pictures of documents and hand-written notes faster from seized devices. In addition, information retrieval and e-discovery algorithms such as email threading, near duplication and graphing technologies can also be used by authorities to better review and understand the evidence collected.
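To make the idea of near-duplicate detection in e-discovery more concrete, the following is a minimal sketch based on word shingles and Jaccard similarity. It is one common textbook approach, not the method of any particular agency or commercial tool; the function names, the shingle length, the threshold and the sample messages are all hypothetical.

```python
def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)}
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def near_duplicates(docs: dict, threshold: float = 0.5) -> list:
    """Return pairs of document ids whose shingle sets overlap above threshold."""
    ids = sorted(docs)
    sets = {doc_id: shingles(docs[doc_id]) for doc_id in ids}
    pairs = []
    for i, first in enumerate(ids):
        for second in ids[i + 1:]:
            if jaccard(sets[first], sets[second]) >= threshold:
                pairs.append((first, second))
    return pairs


# Invented sample messages, for illustration only.
docs = {
    "mail_1": "please arrange the consulting fee before the tender closes on friday",
    "mail_2": "please arrange the consulting fee before the tender closes on monday",
    "mail_3": "the quarterly report is attached for your review",
}
pairs = near_duplicates(docs)
```

Grouping near-identical messages in this way lets a reviewer read one representative of each group rather than every copy. The pairwise comparison shown here would not scale to millions of documents; production tools typically rely on hashing techniques instead.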
In practice, many law enforcement agencies appear to already be using advanced analytical tools – including AI tools – to solve corruption and foreign bribery cases. For instance, the Australian Federal Police reported the use of text-based AI tools to analyse data seized during an investigation and identify language potentially indicating bribery transactions (Box 5.3). In Lithuania, the Special Investigation Service has not yet adopted AI, but reports that it has used big data analytics to aggregate data from different public registries and information systems in order to reveal inconsistencies in public procurement relevant to an ongoing investigation. In Costa Rica, the Judicial Investigation Body reports that the use of AI tools to date has reduced the time of investigations and increased trustworthiness of the evidence obtained from data analysis.
The UK Serious Fraud Office was the first to use AI in a criminal case in the United Kingdom to assist with the removal of documents subject to legal professional privilege. In particular, scanning as many as 600 000 documents a day, AI reduced the pool of legally privileged material needing to be reviewed by independent counsel by 80%. Beyond saving resources by reducing the timeline of the review process from two years to a few months, the use of AI also resulted in a more accurate and consistent review of the evidence.16
Box 5.3. Australian Federal Police’s Use of AI tools to increase the efficiency in enforcement actions
Operation T
In 2012, Operation T was initially conducted using a traditional investigative methodology and was later benchmarked using an AI classifier. The data received was approximately 10 TB and the use of keywords originally found 900 000 documents which would have taken approximately 687 working days for one reviewer to analyse, while also potentially missing a significant amount of the key language being used. It took investigators several years to understand the terminology being used for the key persons of interest, the bribe and how it transpired, as obscure terminology was used. After using an interactive review process with AI – including seven rounds of document review equalling approximately 5 600 documents over the span of two weeks – investigators started to see patterns in both the language and the communications in the material found, which allowed them to piece together the transaction much faster.
Source: Australian Federal Police.
Use cases by competition authorities: facilitating evidence review in cartel investigations and enhancing the monitoring of remedies
As competition authorities have access to a greater volume of data in digital form – which is also harder to destroy – investigations that use digital search are more likely to discover relevant evidence (OECD, 2018[28]). While competition authorities are increasingly using advanced digital tools and techniques in collecting, preserving and analysing digital evidence, the use of digital forensics in cartel investigations in particular allows competition authorities to collect and analyse data in a more efficient way.
Given the large amounts of data that can result from digital searches, the use of forensic search software enables better search strategies through sophisticated search methods. In particular, forensic search software such as EnCase and Nuix can enable more sophisticated keyword searching, for example, by identifying misspelled versions of keywords and producing results based on self-learning algorithms. In addition to basic keyword searches, these types of software also allow “concept” searching, which can make it easier for the authorities to find relevant evidence (OECD, 2020, p. 9[23]). Spain has noted some of the advantages of software platforms, such as Nuix, explaining that it “enables analysis of multiple databases and offers a high-speed indexing engine. This software allows the use of various clustering algorithms and other machine learning techniques. Additionally, it offers the option of social network analysis, which can improve information filtering” (OECD, 2020, p. 3[21]).
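The idea of a keyword search that also catches misspelled variants can be sketched with Python’s standard `difflib` module. This is only an illustration of approximate string matching in general; it does not reflect how EnCase, Nuix or any other forensic product works internally, and the keyword list, similarity cut-off and sample sentence are hypothetical.

```python
import difflib
import re

# Hypothetical search terms, for illustration only.
KEYWORDS = ["tender", "rebate", "allocation"]


def fuzzy_hits(text: str, keywords=KEYWORDS, cutoff: float = 0.8) -> dict:
    """Map each keyword to the tokens in text that closely resemble it."""
    # Lower-case the text and keep each distinct alphabetic token once.
    tokens = sorted(set(re.findall(r"[a-z]+", text.lower())))
    hits = {}
    for keyword in keywords:
        # get_close_matches returns tokens whose similarity ratio to the
        # keyword is at least `cutoff` (1.0 would mean an exact match).
        matches = difflib.get_close_matches(keyword, tokens, n=10, cutoff=cutoff)
        if matches:
            hits[keyword] = sorted(matches)
    return hits


sample = "We agreed the alocation before the tendr was published."
hits = fuzzy_hits(sample)
```

Lowering `cutoff` finds more distant variants at the cost of more false positives, which mirrors the trade-off investigators face when widening a search strategy.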
In addition to the collection of files and documents, forensic examination of how the device in question has been used is also important (OECD, 2018[28]). For example, agencies in the United States have noted that metadata can reveal “when files have been accessed and modified, internet search history, attachment of USB storage devices, and other traces of information that indicate how an individual used the device” (OECD, 2020, p. 7[21]). Such forensic information can “be useful to show knowledge or intent, to corroborate witness statements, and to counteract defendants’ claims that they had no knowledge or control over particular documents or shared network spaces” (OECD, 2020, p. 7[21]). Additionally, authorities can obtain necessary information to carry out additional tasks, for example to decrypt encrypted data.
Alongside cartel investigations, competition authorities must undertake large-scale evidence review in other areas. The UK CMA has built its own data science platform, which it uses in its various functions, to sort and analyse large amounts of data. The tool applies natural language processing techniques, and has been used in both merger review17, and market studies (OECD, 2019[22]).18 For example, the tool was used by the UK CMA to analyse 3-4 billion search events seen by Google and Bing (over a one-week sample period) for the purposes of its market study on Online platforms and digital advertising (OECD, 2019, p. 93[22]).
Overall, competition authorities have noted the efficiency of these new digitised procedures. They allow authorities to search through high volumes of data in a swift manner and with a high degree of accuracy. The Portuguese Competition Authority (Autoridade da Concorrência), for example, compared its old (analog) and new (digitised) models in cartel investigations, noting that in 2013, under the old model, it seized 5 million documents and used 2 000 documents to prove infringements, while in 2017, under the new model, it seized 40 000 relevant documents and recorded a low percentage of irrelevant data. Under the old model, the authority noted a “long and very difficult data review process” while under the new model, thanks to a preliminary onsite assessment, the data review was much quicker (4 000 documents per week thanks to the use of forensic software). Consequently, in 2017, the statement of objections was issued within 12 months, while in 2013, it took around three years (Autoridade da Concorrência, 2018[29]).
AI also offers potential to automate competition authorities’ monitoring of remedies, although these efforts are nascent. For instance, following the UK CMA’s market investigation into the payday lending market, the UK CMA published an order to address the identified market features that may prevent, restrict or distort competition (UK CMA, 2015[30]). The order set out publication requirements on those supplying payday loans (i.e. information to be supplied, timeframe for publication, duty to display a hyperlink to a UK FCA-authorised payday loan price comparison website) (UK CMA, 2015, pp. 7-11[30]). The UK CMA has been able to automate some of its monitoring, using its in-house tool to monitor parties’ websites and determine compliance with some remedies, such as presentation of information requirements.
5.3.3. Improving data collection
Use cases by financial and securities regulators: improving regulatory reporting
SupTech tools are mainly used by financial and securities regulators to improve regulatory reporting. As regulatory reporting has become increasingly complex, authorities face challenges related to collecting delayed and poor quality reporting data – which can in turn impact their ability to supervise (FCA, 2020[31]; European Commission, 2020[32]; European Commission, 2018[33]). Some reports suggest that regulatory reporting has also become increasingly time-consuming and expensive for regulated entities. In a 2018 report, the European Commission estimated most firms’ regulatory reporting costs at around 1% of total operating costs.19 However, industry feedback suggests that the total burden on regulated entities is likely even higher, as the cost of building or amending reports tends to be higher than ongoing running costs (European Commission, 2018[33]).
With the aim of improving data collection, some financial authorities have piloted the adoption of both “push” and “pull” technologies in recent years. While the former refers to pre-defined data being delivered from the regulated entity to the regulator, the latter enables the authority to draw data from the regulated entity as required. Some authorities have also developed APIs to allow regulated entities to submit data – thus lowering reporting costs and enabling better communication between both parties (FSB, 2020[6]).
Taking these efforts one step further, some authorities have begun exploring how to translate rules into a machine-readable format, in order to automate regulatory reporting and further facilitate compliance (World Bank, 2018[4]; Dias and Staschen, 2017[13]; European Commission, 2020[32]). This entails digitising reporting instructions and converting them into code to make them machine executable20 (FCA, 2020[31]; Mohun and Roberts, 2020[34]; European Commission, 2020[32]). However, it is worth noting that while digitising regulatory reporting rules might entail additional benefits such as regulatory simplification, it is currently being hindered by the absence of common standards (FSB, 2020[6]; European Commission, 2020[32]).21 To address this challenge, the European Commission will develop a strategy on supervisory data in 2021, to help ensure that “(i) supervisory reporting requirements (including definitions, formats, and processes) are unambiguous, aligned, harmonised and suitable for automated reporting, (ii) full use is made of available international standards and identifiers including the Legal Entity Identifier, and (iii) supervisory data is reported in machine-readable electronic formats and is easy to combine and process” (European Commission, 2020[32]).
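What it might mean for a reporting instruction to be “machine executable” can be illustrated with a minimal sketch in which rules are expressed as data and checked automatically. The field names, the rule format and the sample reports are hypothetical and do not correspond to any actual reporting framework. The only external fact relied on is that a Legal Entity Identifier is a 20-character code of 18 alphanumeric characters followed by two check digits; the sketch checks that format only and does not verify the check digits.

```python
import re

# Hypothetical reporting rules expressed as data rather than prose.
RULES = [
    {"field": "lei", "required": True, "pattern": r"[A-Z0-9]{18}[0-9]{2}"},
    {"field": "reporting_date", "required": True, "pattern": r"\d{4}-\d{2}-\d{2}"},
    {"field": "total_assets", "required": True, "min": 0},
    {"field": "comment", "required": False},
]


def validate(report: dict, rules=RULES) -> list:
    """Return a list of readable rule violations (empty if compliant)."""
    errors = []
    for rule in rules:
        name = rule["field"]
        value = report.get(name)
        if value is None:
            if rule["required"]:
                errors.append(f"{name}: missing required field")
            continue
        # Format check: the whole value must match the pattern.
        if "pattern" in rule and not re.fullmatch(rule["pattern"], str(value)):
            errors.append(f"{name}: does not match expected format")
        # Range check for numeric fields.
        if "min" in rule and value < rule["min"]:
            errors.append(f"{name}: below minimum of {rule['min']}")
    return errors


# Dummy identifier with the right shape; not a real LEI.
good = {"lei": "ABCDEFGHIJKLMNOPQR12", "reporting_date": "2021-03-31",
        "total_assets": 1250.0}
bad = {"lei": "not-an-lei", "total_assets": -5}

good_errors = validate(good)
bad_errors = validate(bad)
```

Because the rules live in a data structure rather than in the checking code, a change in reporting requirements would mean updating `RULES` alone. This also suggests why shared definitions and formats matter for this approach.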
Use cases by competition and anti-corruption authorities: improving the collection of evidence during unannounced inspections
Law enforcement authorities usually have powers to conduct unannounced inspections or “dawn raids” at business and private premises in order to access and obtain necessary documents and information, for example with respect to proving cartel conduct or in relation to corruption or foreign bribery investigations (OECD, 2019[19]). During dawn raids, digital evidence is collected either through the physical seizure of data carriers (i.e. computers, smartphones, USBs) or by searching the data carriers and servers on site. During an onsite inspection, a competition authority may copy or make forensic images of the digital data. The techniques used depend on the availability and form of data. Forensic IT tools may be used to collect digital evidence, and some competition authorities use live forensics to capture data that cannot be obtained once the device is turned off (OECD, 2020, p. 6[23]). The ability to analyse data offsite has become more important during the COVID-19 pandemic.
5.3.4. Improving data management
The three main tasks within data management include validation, consolidation and visualisation – each referring to specific target points in the data management cycle. Validation refers to the quality control checks of completeness, correctness and consistency of data against reporting rules, whereas consolidation relates to the aggregation of data from multiple sources and in varying formats, and visualisation involves the presentation of information in a legible manner (di Castri et al., 2019[5]). A wide range of SupTech tools can be leveraged to improve data management – and in particular cloud computing, which allows for greater and more flexible storage, mobility capacity and computing power (Broeders and Prenio, 2018[35]).
For instance, Mexico’s CNBV is currently implementing the second phase of a project involving cloud computing to process large amounts of anti-money laundering (AML) compliance data, thus allowing for a greater and more flexible storage, mobility capacity and computing power to support AML supervision of all supervised financial institutions. The platform will also enable the development of both basic and advanced, prospective analytics to strengthen monitoring activities and better identify atypical patterns.
5.4. Challenges and risks of SupTech
Adopting SupTech solutions also comes with challenges and risks, including those that commonly arise upon large technology platform and software transitions, as well as risks that are transversal in nature due to the digital environment itself. The main issues and constraints principally revolve around data quality, resourcing, and skills. Other practical and legal challenges can also arise upon the integration of SupTech tools into legacy systems. Case studies reviewed for this chapter also identified insufficient communication across all stakeholders involved as a potential hindrance to the effective implementation of SupTech solutions. Technical issues and risks stemming from the digital nature of SupTech solutions also need to be accounted for, including risks related to: cyber and data security; third party dependencies; data localisation (potentially resulting in cross-border issues), as well as poor-quality algorithms or data, and opacity in the design and outputs of SupTech tools (i.e. a “black box issue” potentially entailing reputational risks).
While most of these challenges and risks arise across the three policy areas considered, some are also specific to certain authorities, their particular functions and remit.
5.4.1. Data quality, standardisation and completeness
SupTech applications rely on machine-readable data – i.e. in a format that can be processed by computer programmes. As such, quality, standardisation and completeness of data are key requirements and can pose major challenges, especially upon leveraging unstructured data collected from non-traditional sources of information (e.g. open source or social media). For instance, SC Malaysia mentions that getting the buy-in from listed companies to disclose the information and data in a structured manner was a key enabler to using AI, which required effort by listed companies to change their reporting format.
Providing sufficient amounts of quality data to build machine learning applications can also be an issue. For instance, in relation to its Project Apollo, Singapore’s MAS reported the scarcity of training data – particularly expert reports associated with prosecution outcomes – as a main challenge. Having a sufficient volume of such data is a key requirement to continually improve the accuracy and robustness of the algorithms, and to validate Apollo’s models and methodologies in order for its results to be admissible for use in a court of law.
Likewise, several law enforcement agencies involved in combatting corruption also identify data quality and standardisation as primary challenges for the effective use of AI, in particular for the detection and enforcement of corruption and foreign bribery offences. As such, it is important to ensure that information provided by companies to law enforcement authorities (either voluntarily through self-reporting and cooperation or under some form of compulsion) is in a format that authorities’ systems can read. Standardisation is often obtained through protocols or guidance from the authorities themselves or by using industry standard protocols.
5.4.2. Legal and procedural challenges
The use of SupTech tools and AI has raised a range of legal and procedural challenges for supervisory and law enforcement authorities, which in some cases may require amendments to existing legal frameworks to facilitate their more effective use for enforcement purposes. For example, Switzerland noted that its legal framework does not currently allow for the use of AI technology in a generalised manner. However, it is undertaking pilot projects using anonymised data to assess the added value of this technology and is at a preliminary stage of reviewing the legal basis for its use.
Another challenge could be the acceptance by the courts of the use of AI technology, particularly in criminal cases. In civil cases, the use of AI (predictive coding) has already been accepted by courts in various countries as a legitimate means for document discovery in court proceedings but the position is less clear for contested criminal cases, where defendants may wish to challenge the use of the technology, its accuracy and reliability.
Due process rights of companies as a legal challenge
The use of digital technologies for enforcement purposes has led to the identification of several legal challenges across jurisdictions, including the respect of due process rights of the companies that are subject to the authorities’ investigations (OECD, 2019[22]). While digital technologies allow authorities to collect a large amount of data from businesses during dawn raids – including any information that is stored in digital forms – the respect of due process rights of the investigated companies requires that such broad powers of investigation aided by digital technologies be exercised within the limit of proportionality. As such, the scope of data collected from businesses needs to be proportionate to the purposes of the authorities’ investigations. For example, while digital technologies allow seizing entire hard-drives or servers for examination of the documents contained therein, there is a legal risk that this goes beyond the scope of the investigation, since the seized data may include personal information or information that is irrelevant to the investigation. When such a risk materialises, it could significantly delay the investigation and negatively affect the procedural efficiency of enforcement authorities.
A related challenge in criminal cases involves obligations in some OECD countries to make available all relevant information to defendants, particularly that which is exculpatory in nature. Where evidence has been located using AI across large data sets, it could be imperative for defendants to have access to the same data sets and possibly the AI technology itself, particularly where they could not afford this themselves (equality of arms issue). This situation may be less acute for large companies that may well already have lawyers equipped with this technology. Nevertheless, these due process issues raise challenges in ensuring that digital technologies (e.g. search software and algorithms) are used within the limits prescribed by the relevant legal frameworks.
Data location as a legal challenge
Data location stands as an additional legal challenge for supervisory, competition and law enforcement authorities using digital technologies, and relying on digital evidence, to carry out their investigations. In certain circumstances, the storage space containing the information relevant to the investigation may be located in another jurisdiction. In such cases, enforcement authorities may find themselves unable to extend their investigative powers to the data located abroad.
In the field of competition law enforcement, the International Competition Network has identified two types of approaches with differing implications (OECD, 2019[22]). The first one, called the “access approach”, allows for greater enforcement capabilities regardless of location by permitting the competition authority to search and seize any piece of information which is accessible and can be used or controlled from the premises of the investigated company. Under this approach, the location of the storage is irrelevant. Under the second approach, called the “location approach”, if the data is not stored at the premises of the investigated company, it would be impossible to have access to that data, unless its location is covered by the authority’s order or a judge’s warrant.
Box 5.4. Access to digital evidence located outside the United States
In the United States, Title II of the Electronic Communications Privacy Act (“ECPA”) governs how and when any U.S. law enforcement agency can obtain access to stored digital communications, such as email or phone records, during the course of a criminal investigation.
A recent amendment to ECPA requires providers operating within the U.S. to produce evidence regardless of whether the company stores the evidence in the U.S. This amendment was, in part, a result of a case involving Microsoft’s refusal to comply with a search warrant because the data was stored on an overseas server.
In this case, Microsoft challenged a search warrant issued under ECPA by a U.S. magistrate judge. Although it is a U.S.-based company, Microsoft contended that the emails were stored on a server in Ireland and were thus not subject to the jurisdiction of U.S. courts. The challenge was ultimately appealed to the Supreme Court, but the case was vacated when the above-mentioned amendment was enacted by the U.S. Congress.
Source: (OECD, 2019[22]).
In some jurisdictions, the access approach has been recognised by the law. For example, the United States recently amended Title II of the Electronic Communications Privacy Act to allow law enforcement agencies to access information for enforcement purposes regardless of the location where the data is stored (Box 5.4). In the case of the EU, Article 6(1)(b) of Directive 2019/1, which empowers the competition authorities of the EU Member States to be more effective enforcers, provides for the power “to examine the books and other records related to the business irrespective of the medium on which they are stored, and to have the right to access any information which is accessible to the entity subject to the inspection”. In other jurisdictions, it may still be controversial whether the access approach is followed.22
5.4.3. Algorithmic models and human oversight
In relation to its NLP application, Mexico’s CNBV reported that a major challenge was putting in place good communication channels between data scientists, NLP algorithm analysts and business units, so that they can combine their expertise, obtain better recommendations and continuously improve the NLP algorithms. This is linked to wider risks with regard to algorithms and their use by authorities. While algorithms can fail by detecting false positives/negatives rather than meaningful signals, there is also a risk of incorporating human biases in algorithmic models, as well as the risk of not being able to explain the outcomes of machine learning (i.e. a black-box issue that may impede accountability) – all of which are exacerbated when authorities lack adequate skills and expertise. On the other hand, supervisors must also deal with the countervailing concern that if they are too transparent about the models used, regulated entities may be able to more easily game the system to avoid detection (Dias and Staschen, 2017[13]; Broeders and Prenio, 2018[35]; di Castri et al., 2019[5]).
In considering such challenges, SC Malaysia has highlighted the importance of ensuring that data scientists have a general understanding of corporate governance principles, practices and disclosures. A basic understanding of corporate governance concepts is critical to ensure that the data scientists are able to formulate the logic that will be applied by the AI in analysing the adoption of corporate governance practices and the quality of disclosures. As such, building AI capability requires not just more data but also better data: in this case, insightful and reliable corporate governance disclosures. In the developmental stage, a set of good disclosures by listed companies in Malaysia and other markets was selected and used to build the base of the AI.
Importantly, human intervention is required to identify and validate these disclosures in order to feed the development of the AI. Therefore, in order to yield benefits, SupTech tools require skilled human oversight – as technology should not be leveraged to substitute, but rather to complement and support human judgment. This has crucial financial stability implications, as tools built upon historical data associated with past instances of instability may not remain valid for predicting future crises (FSB, 2020[6]).
In addition, from a corporate governance enforcement perspective, as final decisions on whether to pursue enforcement actions are still necessarily taken by humans and based on human judgements, appeals mechanisms also provide a potential lever for considering and addressing potential biases that may be introduced through algorithmic or AI-based supervisory mechanisms. For instance, Germany’s BaFin reports that defining the patterns and types of anomalies ALMA should look for represents a challenge, as the assessment of which incidents ALMA should identify as abuse is based on experience and should therefore be verifiable by analysts.
This resonates with a challenge identified by competition authorities in relation to projects seeking to automate the monitoring of remedies (highlighted, for example, in the earlier mentioned description of the UK CMA's automated monitoring of remedies in the payday lending market). In particular, cooperation between case teams and digital specialists with respect to remedies is paramount, so that the possibility of automated monitoring is considered during the design of remedies.
Likewise, for those currently employed by law enforcement or anti-corruption agencies, comprehensive training is often required to enable a full understanding and acceptance of AI capabilities and results. Where investigative authorities are using this technology, prosecutorial authorities and courts will be required to understand and accept its use as well. Before adopting this technology, law enforcement authorities will need to consult with their investigators and prosecutors to ensure a smooth uptake. Importantly, human oversight will clearly still be required to ensure that AI technology complements and supports existing investigative techniques.
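The concern raised in this section about false positives and false negatives, and the point that algorithmic alerts should be verifiable by analysts, can be made concrete with a small calculation. The following Python sketch is illustrative only: the `AlertReview` record, the `alert_quality` function and the sample data are hypothetical and are not drawn from any authority's system. It assumes that a sample of cases has been both scored by a model and independently reviewed by an analyst.

```python
from dataclasses import dataclass


@dataclass
class AlertReview:
    """One reviewed case: the model's flag and the analyst's conclusion."""
    flagged_by_model: bool
    confirmed_by_analyst: bool


def alert_quality(reviews):
    """Compare model flags with analyst findings for a reviewed sample."""
    tp = sum(r.flagged_by_model and r.confirmed_by_analyst for r in reviews)
    fp = sum(r.flagged_by_model and not r.confirmed_by_analyst for r in reviews)
    fn = sum(not r.flagged_by_model and r.confirmed_by_analyst for r in reviews)
    # Precision: share of model alerts that the analyst confirmed.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: share of analyst-confirmed cases that the model flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "true_positives": tp,
        "false_positives": fp,
        "false_negatives": fn,
        "precision": precision,
        "recall": recall,
    }


# Hypothetical review sample, invented purely for illustration.
sample = [
    AlertReview(True, True),
    AlertReview(True, True),
    AlertReview(True, False),   # false positive: flagged, not confirmed
    AlertReview(False, True),   # false negative: missed by the model
    AlertReview(False, False),
]
result = alert_quality(sample)
print(result)
```

Tracking such figures over time is one possible way for analysts to check whether a tool continues to produce meaningful signals. It does not by itself address bias or explainability.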
5.4.4. Third-party dependencies, digital security and privacy concerns
Increased dependencies on third parties can pose a risk – especially with regard to cloud service providers. Although cloud-based services hold the potential to foster information sharing between authorities – in turn improving regulatory co-operation – “public cloud” solutions raise operational, governance and oversight considerations. Such considerations have particular relevance in a cross-border context, where authorities may be unable to assess whether legal and regulatory obligations around the delivery of a service are being met.23 Further, interoperability limitations could create lock-in effects and over-reliance on specific platforms and providers (FSB, 2019[36]; FSB, 2017[2]). As such, implementing vetting and auditing processes may be required as a means to ensure adequate safeguards. In addition, greater reliance on outsourced data storage may also increase cyber-vulnerabilities for authorities, which may in turn magnify financial stability risks. At present, most authorities store most of their data in-house for security reasons, and their use of cloud storage is reportedly limited to non-core activities (FSB, 2020[6]; FSB, 2019[37]).
While digital security vulnerabilities can be heightened by the increased granularity of data and increased data-sharing between government agencies and across public-private partnerships, these developments can also generate concerns over individual privacy (OECD, 2019[7]). In particular, concerns are raised that the absence of common principles for trusted government access to personal data may lead to undue restrictions on data flows resulting in detrimental economic impacts (OECD, 2020[38]). As such, the processing of data by third parties in the context of public-private partnerships should be transparent, and comply with data management practices supporting the ethical use of data in the public sector (OECD, 2021[39]).
Conversely, it should be noted that concerns around compliance with data protection regulations and standards (e.g. the EU General Data Protection Regulation, also known as “GDPR”) also arise when contemplating certain SupTech tools. For instance, as distributed ledger technology offers transparency and immutability, this could create challenges in meeting GDPR standards around the ability to anonymise and erase personal data and around storage limitations (Denis and Blume, 2021[40]).
5.4.5. Legacy systems
Legacy systems, along with data formats that are not compatible with SupTech, can also impede SupTech adoption. Implementing changes to such systems may require significant organisational changes at the same time to support their effective implementation.
For example, Germany’s BaFin reports the setup of the technical infrastructure behind ALMA as a major challenge, as it requires integrating different databases, AI methods, a visualisation for the supervisors, a feedback mechanism and a consistent data flow through all the stages. Additionally, in order to work with large quantities of data, hardware needs to be updated continually in order to guarantee high performance. These obstacles mean that valuable product increments might be difficult to deliver even over several sprints, which might leave stakeholders dissatisfied over a longer period. This challenge also includes the need for a cultural change in the organisation to enable the whole team to work in an agile framework.
In the case of Mexico, CNBV reports that the main obstacle to the implementation of its cloud computing project is the variety of technological infrastructure amongst the Mexican financial institutions. In the same vein, challenges can also arise upon the integration of SupTech tools into existing processes and procedures. For instance, Australia’s ASIC reports that the rewrite of frameworks and dashboards may slightly alter legacy procedures.
5.4.6. Financial and human resources, procurement rules, and barriers to change
Other challenges may be encountered when developing, deploying and maintaining SupTech solutions – including authorities’ lack of adequate skills such as with respect to technology, software and hardware expertise, along with budget constraints, rigid procurement rules and obsolete regulatory frameworks. Resistance to change and organisational silos may also hinder the development of SupTech projects.
Regarding budget constraints, a common challenge identified is the cost of implementing AI systems, even though the benefits of doing so are clearly articulated. The cost of the software and user licences, along with the costs of any hardware upgrades often required, are of particular concern. As many enforcement authorities are facing budget restrictions, making efficient use of their limited resources when designing their use of digital technologies is important. This requires adequate planning and design of the most cost-effective use of digital technologies. In practical terms, this translates, for example, into carefully managing the number of software licences and data provider subscriptions to balance analytical capacity with costs (OECD, 2019[22]).
Beyond planning, there are a range of resource challenges that competition authorities have identified when building up their SupTech capacity. First, government policy restrictions may affect their approach. For example, competition authorities may be prohibited from placing data and processing functions on the public cloud, which requires them to invest in onsite capacity that can be relatively more expensive and time-consuming to establish. Second, there may be a lack of tools and products available that are designed for competition authority purposes, which may require them to develop their own such tools, although open source software may help in this process. Third, smaller competition authorities may face challenges given that there is a minimum efficient scale for some SupTech applications, meaning that the associated costs may risk occupying a relatively larger share of their budget. Co-operation and resource sharing among authorities in different jurisdictions may help alleviate this.
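The point about minimum efficient scale can be illustrated with simple arithmetic. The Python sketch below uses entirely hypothetical figures (the tool cost, the budgets and the number of co-operating authorities are invented for illustration) and assumes an equal split of costs among co-operating authorities, which is only one of many possible sharing arrangements.

```python
def cost_share(fixed_cost, budget):
    """Share of an annual budget absorbed by a fixed tooling cost."""
    if budget <= 0:
        raise ValueError("budget must be positive")
    return fixed_cost / budget


# All figures are hypothetical and chosen only to illustrate the arithmetic.
FIXED_TOOL_COST = 500_000      # same tool cost regardless of authority size
LARGE_BUDGET = 50_000_000
SMALL_BUDGET = 5_000_000

large_share = cost_share(FIXED_TOOL_COST, LARGE_BUDGET)
small_share = cost_share(FIXED_TOOL_COST, SMALL_BUDGET)

# Assumed arrangement: four small authorities split the tool cost equally.
N_SHARING = 4
pooled_small_share = cost_share(FIXED_TOOL_COST / N_SHARING, SMALL_BUDGET)

print(f"large authority:          {large_share:.1%}")
print(f"small authority (alone):  {small_share:.1%}")
print(f"small authority (pooled): {pooled_small_share:.1%}")
```

Under these assumed figures, the same tool absorbs ten times as large a share of the smaller authority's budget, and pooling narrows that gap.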
Authorities’ procurement rules may also render the design and implementation of technology solutions difficult, as evidence suggests that supervisors’ procurement offices are often unfamiliar with these new technologies, and conversely, service providers are often unfamiliar with procurement processes and requirements (di Castri et al., 2019[5]).
Further, the integration of SupTech expertise and tools may give rise to certain additional organisational challenges. For example, in the case of competition authorities, it may be difficult to fit data science processes within the compressed timelines of enforcement cases, meaning that more sophisticated tools may need to be pre-prepared (which can be difficult given variations across markets), or focused on advanced screening methods. Digital teams may also face cultural challenges within an authority, such as resistance to changing ways of working, or to incorporating SupTech analysis at each stage of a case (including information requests and remedy design). Cultural challenges have also been identified by other law enforcement authorities in their efforts to apply SupTech tools. For example, different institutions involved in enforcement processes may have different levels of data-driven culture and familiarity with SupTech or AI applications. In the absence of sufficient training to understand how AI-driven analyses and conclusions are reached, there may be a lack of trust in relying upon their findings.
5.5. Considerations for devising adequate SupTech strategies
Recognising the potential of SupTech to transform data processes – in turn improving the timeliness and quality of decisions and actions – the use of SupTech tools by supervisory and public enforcement authorities has been gaining momentum in recent years. According to a recent FSB survey (2020[6]), the use of SupTech strategies has grown significantly since 2016, with a vast majority of surveyed financial authorities having a SupTech or innovation or data strategy in place. In addition, several competition authorities have reinforced their digital capabilities in order to take advantage of digital tools, and some competition authorities have created separate forensic IT and strategic data analysis units.24
SupTech strategies are hereby defined as seeking to develop tools to support authorities’ functions, whereas innovation/data strategies refer to institution-wide digital transformation/data-driven innovation (DT&DI) programmes that encompass the development of SupTech tools. They are not necessarily pursued in isolation (FSB, 2020[6]). SupTech applications can either be initiated by management, or originate as research questions. Evidence also suggests that SupTech applications can be explored through the use of accelerators, tech sprints, and innovation labs, regardless of whether an authority has an explicit SupTech strategy (Broeders and Prenio, 2018[35]; di Castri et al., 2019[5]).
5.5.1. Leadership, budget and skills
Overall, it is important that SupTech strategies be devised in consideration of authorities’ needs, regulatory frameworks and technological capacities. Although there is no “one-size-fits-all” approach, authorities have identified several important considerations underpinning successful SupTech strategies, ranging from the design to the implementation stage, and covering leadership, budget and skills concerns.
A well-defined SupTech strategy requires effective leadership – such as through established Chief Data Officers (CDOs) – and management buy-in, as well as early engagement with end-users (i.e. ‘front-line’ supervisors), which helps to overcome resistance to change. Evidence also suggests that adopting ‘fast fails’ approaches can enable authorities to quickly evaluate which applications merit further progress, and which ones are not fit for purpose (FSB, 2017[2]). Securing sufficient budget is also paramount for developing SupTech projects, along with adequate procurement systems.
Having technologically skilled professionals in place with the right data expertise better enables the implementation of a flexible SupTech platform, and the adoption of a data-driven culture by organisations as a whole (Bank of England, 2019[41]; FCA, 2020[42]). Several authorities have implemented a strategy for attracting and retaining adequate skills and talent – such as through employee engagement frameworks, or by offering online or other training programmes to existing staff to enhance their skills. Knowledge-based transfers between departments are also observed. In order to attain a skilled SupTech workforce, some financial services authorities have started tailoring their recruitment strategies to focus on candidates’ data analysis skills (FSB, 2020[6]).
It should be noted that a “late mover” advantage applies to authorities that have recently initiated – or are considering initiating – the development of their data infrastructure. Indeed, integrating advanced analytics tools into a data architecture designed from scratch might prove an easier task than building new tools upon legacy systems (Coelho, De Simoni and Prenio, 2019[14]).
5.5.2. Collaboration between authorities, regulated entities and technology service providers within and across jurisdictions
While data analysis applications are developed to facilitate internal workflows, data collection tools require some involvement from market participants. For the latter category, it is important to consult with regulated entities going forward in order to ensure that solutions adopted on both ends are aligned and compatible. As some supervisors have piloted and adopted SupTech frameworks on an ad-hoc and unco-ordinated basis, this can in turn create negative externalities for regulated entities. In particular, according to one report reviewing the experience of select firms, a lack of common standards – along with differing levels of technological progress within authorities – could lead to inconsistencies in SupTech approaches across jurisdictions (European Commission, 2020[32]).
A recent study found that, in terms of automated reporting for instance, certain firms with subsidiaries in more than one jurisdiction are currently unable to implement the same reporting solution for all subsidiary companies, due to cross-country variations in supervisory expectations and technological capacities (European Commission, 2020[32]). Co-ordination between authorities and regulated entities in their respective efforts to adopt innovative technologies is important for aligning their systems where appropriate and in line with their domestic regulatory remit, in order to mitigate potential challenges and adverse effects down the line, as well as to allow both parties to reap maximum benefits from their use (Bank of England, 2020[43]). An important caveat is that SupTech might induce market participants to adjust their behaviour accordingly. A recent study finds that authorities’ adoption of SupTech solutions has a feedback effect on companies’ corporate disclosure decisions, implying that companies adjust their filings when they anticipate that such disclosure will be processed by machines (Cao et al., 2020[44]). Other evidence suggests that market participants may seek to gain sufficient knowledge of SupTech applications to game the technology to their benefit (di Castri et al., 2019[5]).
Going forward, co-ordination and collaboration between authorities, regulated entities and technology service providers within and across jurisdictions is crucial to: 1) ensure the compatibility of innovative systems adopted by regulators and regulated entities; 2) foster peer learning with regard to the successes and failures of SupTech uses; and 3) consider the possibility of devising common standards and taxonomies for relevant regulatory areas in order to ensure the scalability and interoperability of SupTech tools, especially with regard to reporting solutions. By convening and fostering exchanges among a wide range of stakeholders, international organisations and standard-setting bodies can play an important role in that respect.
References
[29] Autoridade da Concorrência (2018), BOS1: Unannounced Inspections in the Digital Age, AdC dawn raids: A new (Digital) model, https://www.oecd.org/competition/globalforum/investigative-powers-in-practice.htm.
[16] BaFin (2017), BaFin’s 2017 Annual Report, https://www.bafin.de/EN/PublikationenDaten/Jahresbericht/Jahresbericht2017/jahresbericht_node_en.html.
[43] Bank of England (2020), Transforming data collection from the UK financial sector, https://www.bankofengland.co.uk/paper/2020/transforming-data-collection-from-the-uk-financial-sector.
[41] Bank of England (2019), The Future of Finance Report, https://www.bankofengland.co.uk/-/media/boe/files/report/2019/future-of-finance-report.pdf.
[3] BCBS (2018), Sound Practices: implications of fintech developments for banks and bank supervisors, https://www.bis.org/bcbs/publ/d431.htm.
[35] Broeders, D. and J. Prenio (2018), Innovative Technology in Financial Supervision (Suptech)-the Experience of Early Users, https://www.bis.org/fsi/publ/insights9.htm.
[44] Cao, S. et al. (2020), How to Talk When a Machine is Listening: Corporate Disclosure in the Age of AI, https://dx.doi.org/10.2139/ssrn.3683802.
[51] Casalini, F. and J. López González (2019), “Trade and Cross-Border Data Flows”, OECD Trade Policy Papers 220, http://dx.doi.org/10.1787/b2023a47-en.
[15] CNBV/R2A (2018), An AML SupTech Solution for the Mexican National Banking and Securities Commission (CNBV): R2A Project Retrospective and Lessons Learned, http://dx.doi.org/10.2139/ssrn.3592564.
[14] Coelho, R., M. De Simoni and J. Prenio (2019), “Suptech applications for anti-money laundering”, Vol. FSI Insights on policy implementation, no 18, August, https://www.bis.org/fsi/publ/insights18.htm.
[17] CSA (2018), Canadian securities regulators announce agreement with Kx to deliver advanced post-trade analysis, https://www.securities-administrators.ca/news/canadian-securities-regulators-announce-agreement-with-kx-to-deliver-advanced-post-trade-analysis/.
[40] Denis, E. and D. Blume (2021), Using digital technologies to strengthen shareholder participations, Going Digital Toolkit Note, No. 9, https://goingdigital.oecd.org/data/notes/No9_ToolkitNote_ShareholdersTech.pdf.
[5] di Castri, S. et al. (2019), The Suptech Generations, https://www.bis.org/fsi/publ/insights19.htm.
[13] Dias, D. and S. Staschen (2017), Data Collection by Supervisors of Digital Financial Services, https://www.cgap.org/sites/default/files/researches/documents/Working-Paper-Data-Collection-by-Supervisors-of-DFS-Dec-2017.pdf.
[45] ECB (2021), The ESCB’s long-term approach to banks’ data reporting, https://www.ecb.europa.eu/stats/ecb_statistics/co-operation_and_standards/reporting/html/index.en.html.
[8] ESMA (2019), Report on Trends, Risks and Vulnerabilities, https://www.esma.europa.eu/sites/default/files/library/esma50-report_on_trends_risks_and_vulnerabilities_no1_2019.pdf.
[32] European Commission (2020), Digital Finance Strategy for the EU, https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:52020DC0591.
[33] European Commission (2018), Summary Report of the Public Consultation on the Fitness Check on Supervisory Reporting, https://ec.europa.eu/info/sites/info/files/2017-supervisory-reporting-requirements-summary-report_en.pdf.
[42] FCA (2020), Data Strategy, https://www.fca.org.uk/publications/corporate-documents/data-strategy.
[31] FCA (2020), Digital Regulatory Reporting: Phase 2 Viability Assessment, https://www.fca.org.uk/publication/discussion/digital-regulatory-reporting-pilot-phase-2-viability-assessment.pdf.
[12] FinCoNet (2020), SupTech Tools for Market Conduct Supervisors, http://www.finconet.org/FinCoNet-Report-SupTech-Tools_Final.pdf.
[6] FSB (2020), The Use of Supervisory and Regulatory Technology by Authorities and Regulated Institutions, https://www.fsb.org/wp-content/uploads/P091020.pdf.
[37] FSB (2019), FinTech and market structure in financial services, https://www.fsb.org/wp-content/uploads/P140219.pdf.
[36] FSB (2019), Third-party dependencies in cloud services: Considerations on financial stability implications, https://www.fsb.org/2019/12/third-party-dependencies-in-cloud-services-considerations-on-financial-stability-implications/.
[2] FSB (2017), Artificial Intelligence and Machine Learning in Financial Services: Market Developments and Financial Stability Implications, https://www.fsb.org/wp-content/uploads/P011117.pdf.
[50] Hodges, C. (2019), “Collective Redress: The Need for New Technologies”, Journal of Consumer Policy 42, pp. 59-90, https://link.springer.com/article/10.1007/s10603-018-9388-x#citeas.
[9] IBM (2018), The Four V’s of Big Data, http://www.ibmbigdatahub.com/infographic/four-vs-big-data.
[53] IDC (2012), Digital Universe Study: Big Data, Bigger Digital Shadows and Biggest Growth in the Far East.
[48] Jones, A. (2020), Concurrentialiste: Journal of Antitrust Law, https://leconcurrentialiste.com/jones-bid-rigging/.
[27] MAS (2019), Enforcement report 2017-2018, https://www.mas.gov.sg/-/media/MAS/News-and-Publications/Monographs-and-Information-Papers/MAS-Enforcement-Report.pdf.
[34] Mohun, J. and A. Roberts (2020), “Cracking the code: Rulemaking for humans and machines”, OECD Working Papers on Public Governance, No. 42, OECD Publishing, Paris, https://dx.doi.org/10.1787/3afe6ba5-en.
[39] OECD (2021), Good Practice Principles for Data Ethics in the Public Sector, https://www.oecd.org/gov/digital-government/good-practice-principles-for-data-ethics-in-the-public-sector.htm.
[38] OECD (2020), Government access to personal data held by the private sector: Statement by the OECD Committee on Digital Economy Policy, https://www.oecd.org/sti/ieconomy/trusted-government-access-personal-data-private-sector.htm.
[25] OECD (2020), Latin American and Caribbean Competition Forum - Digital Evidence Gathering in Cartel Investigations, https://www.oecd.org/competition/latinamerica/.
[20] OECD (2020), Latin American and Caribbean Competition Forum - Session I: Digital Evidence Gathering In Cartel Investigations - Contribution from Spain.
[23] OECD (2020), Latin American and Caribbean Competition Forum - Session I: Digital Evidence Gathering In Cartel Investigations - Issues Note.
[21] OECD (2020), Latin American and Caribbean Competition Forum on Digital Evidence Gathering in Cartel Investigations- Contribution from UNCTAD.
[47] OECD (2020), Using market studies to tackle emerging competition issues, http://www.oecd.org/daf/competition/using-market-studies-to-tackle-emerging-competition-issues-2020.pdf.
[1] OECD (2019), Going Digital: Shaping Policies, Improving Lives, OECD Publishing, Paris, https://doi.org/10.1787/9789264312012-en.
[19] OECD (2019), Recommendation of the Council concerning Effective Action against Hard Core Cartels, https://legalinstruments.oecd.org/en/instruments/OECD-LEGAL-0452.
[11] OECD (2019), Resolving Foreign Bribery Cases with Non-Trial Resolutions: Settlements and Non-Trial Agreements by Parties to the Anti-Bribery Convention, http://www.oecd.org/corruption/Resolving-Foreign-Bribery-Cases-with-Non-Trial-Resolutions.htm.
[22] OECD (2019), The Path to Becoming a Data-Driven Public Sector, OECD Publishing, Paris, https://dx.doi.org/10.1787/059814a7-en.
[7] OECD (2019), “Using digital technologies to improve the design and enforcement of public policies”, OECD Digital Economy Papers, No. 274, OECD Publishing, Paris, https://dx.doi.org/10.1787/99b9ba70-en.
[28] OECD (2018), Investigative powers in practice - Break-out session 1: Unannounced inspections in the digital age - Issues Note by the Secretariat.
[46] OECD (2018), Market Study Guide for Competition Authorities, https://www.oecd.org/daf/competition/market-studies-guide-for-competition-authorities.htm.
[18] OECD (2017), The Detection of Foreign Bribery, http://www.oecd.org/corruption/the-detection-of-foreign-bribery.htm.
[52] OECD (2015), Data-Driven Innovation: Big Data for Growth and Well-Being, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264229358-en.
[24] OECD (2014), OECD Foreign Bribery Report: An Analysis of the Crime of Bribery of Foreign Public Officials, OECD Publishing, Paris, https://dx.doi.org/10.1787/9789264226616-en.
[54] OECD (2013), Supervision and Enforcement in Corporate Governance, OECD Publishing, http://dx.doi.org/10.1787/9789264203334-en.
[49] Porter, R. and J. Zona (1993), “Detection of Bid Rigging in Procurement Auctions”, Journal of Political Economy, Vol. 101/3, pp. 518-538, https://www.jstor.org/stable/2138774?seq=1#metadata_info_tab_contents.
[10] R²A (2019), The State of RegTech: The Rising Demand for “Superpowers”, https://bfaglobal.com/r2a/insights/the-state-of-regtech-the-rising-demand-for-superpowers/.
[26] SC Malaysia (2020), Corporate Governance Monitor 2020, https://www.sc.com.my/api/documentms/download.ashx?id=ff69ce0d-a35e-44d4-996a-c591529c56c7.
[30] UK CMA (2015), Payday Lending Market Investigation Order, https://assets.publishing.service.gov.uk/media/55cc691e40f0b6137400001f/Payday_Lending_Market_Investigation_Order_2015.pdf.
[4] World Bank (2018), From Spreadsheets to Suptech : Technology Solutions for Market Conduct Supervision, https://openknowledge.worldbank.org/handle/10986/29952.
Notes
← 1. SupTech is defined by Dias and Staschen (2017[13]) as “technological solutions focused on improving the processes and effectiveness of financial supervision and regulation”, and by the World Bank (2018[4]) as “the use of technology to facilitate and enhance supervisory processes from the perspective of supervisory authorities”. di Castri et al. (2019[5]) define SupTech as “the use of innovative technology by financial authorities to support their work”, restricting “innovative technology” to big data and artificial intelligence (AI) tools, and “financial authorities” to supervisory and non-supervisory authorities but excluding authorities in charge of monetary and macroeconomic policies.
← 2. Including the Financial Stability Board, World Bank, International Organization of Securities Commissions, European Securities and Markets Authority, FinCoNet, etc.
← 3. In March 2021, the OECD Anti-Corruption Division carried out a survey of members of the OECD WGB containing six open-ended questions covering the purposes, benefits, challenges, cases, and plans for the future on the use of AI tools in the fight against corruption and foreign bribery. Sixteen WGB countries responded to the survey. Among the 16 respondents, supervisory and law enforcement authorities in nine countries reported already using AI tools to detect allegations and/or enforce anti-corruption laws and regulations. Five other countries are considering the adoption of AI tools to fight corruption. Two countries reported not having plans yet.
← 4. In the recent Airbus SE settlement, the largest non-trial resolution of a foreign bribery case to date and involving three WGB jurisdictions, the French Parquet National Financier granted a 50% reduction in the penalty imposed due to the cooperation and internal investigation conducted by Airbus SE.
← 5. It was reported in the Airbus SE case that the company made more than 30 million documents available for review by the authorities.
← 6. Structured data are data based on a predefined data model (i.e. an abstract representation of “real world” objects and phenomena). Such models can be explicit, as in the case of a structured query language (SQL) database, where the data model is reflected in the structure of the database’s tables. The data model can also be implicit, as in the case of semi-structured data (e.g. structured web content), where the underlying model can be made explicit at relatively low cost. In contrast, unstructured data are data that have no predefined data model and where such a model cannot be cost-effectively extracted. Typical examples include text-heavy data sets such as text documents, emails, social media posts as well as multimedia content such as videos, images and audio streams. A study by IDC (2012[53]) estimates that not even 5% of the “digital universe” is tagged, and thus can be considered structured or semi-structured data. However, the difference between structured, semi-structured, and unstructured data is becoming less important, since with rising computing capacities, data analytics are increasingly able to automatically extract some structures embedded in unstructured data, including multimedia content (OECD, 2015[52]).
← 7. Digitisation refers to the conversion of analogue data and processes into a machine-readable format (OECD, 2019[1]).
← 8. Big data architectures require two key design features: i) internal coherence of each of their layers so they can all process the speed, size and complexity of big data, and ii) built-in quality assurance and security procedures to ensure the validity and integrity of the data from the point of collection to the point of consumption by end users, thus enabling seamless end-to-end data flow without lags or size constraints (di Castri et al., 2019[5]).
← 9. Digital technologies can allow policy makers to be more pro-active and reactive in tracking and responding to fast-changing phenomena, whether they be risks or opportunities. At the same time, advanced analytics can help to “predict” responses to policy interventions in a more robust manner than was the case previously (OECD, 2019[7]).
← 10. In particular, analysis of past supervision data, now made far more efficient and effective through the use of machine learning techniques, has been used by many regulators in many regulatory fields to improve risk-based targeting of supervision. Likewise, while research has shown how much regulators can benefit from using consumer complaints more effectively (Hodges, 2019[50]), most regulators still lag behind in this respect. SupTech offers opportunities to aggregate and analyse consumer complaints more easily, and subsequently to use them to target supervision.
← 11. Jones (2020[48]) noted that in 2017, almost half of the KFTC’s sanctioned cartels were bid-rigging cases.
← 12. It should be noted that the use of statistical and econometric techniques to detect anti-competitive behaviours is not new. For instance, a 1993 paper (Porter and Zona, 1993[49]) proposed econometric test procedures designed to detect the presence of bid rigging in procurement auctions.
← 13. The UNCTAD contribution to the OECD’s 2020 Latin American and Caribbean Competition Forum on Digital Evidence Gathering in Cartel Investigations noted the KFTC’s successful detection of bid rigging in a metro construction project worth USD 5 billion and CADE’s successful detection of bid rigging in the supply of cardiac pacemakers (OECD, 2020, p. 6[21]).
← 14. As part of remedies, firms may be required to disclose data or algorithms to allow authorities to monitor their activities. For instance, following the investigation of the retail banking market in the United Kingdom, the UK CMA imposed a series of remedies, including the requirement on banks to release and make available certain data (e.g. product and service information and customer transaction data) through open APIs (OECD, 2019[22]).
← 15. The Phase 4 monitoring process was launched at the OECD Anti-Bribery Ministerial Meeting held in Paris on 18 March 2016. All the reports are available at: https://www.oecd.org/daf/anti-bribery/countryreportsontheimplementationoftheoecdanti-briberyconvention.htm
← 16. See speech by Camilla de Silva, SFO Joint Head of Bribery and Corruption, speaking at the Herbert Smith Freehills Corporate Crime Conference 2018 (https://www.sfo.gov.uk/2018/06/21/corporate-criminal-liability-ai-and-dpas/).
← 17. For example, in its investigation of the acquisition of Monsanto by Bayer, the European Commission had to examine over 2.7 million internal documents submitted by Monsanto and Bayer (European Commission, 2018[33]).
← 18. Market studies allow competition authorities to assess whether competition in a market or sector is working effectively and to identify measures to address any issues detected (OECD, 2018[46]). They are a useful ex-ante tool that can help competition authorities understand a market, resulting in more effective enforcement, and can be especially useful in addressing emerging competition issues where enforcement action is limited (OECD, 2020[47]).
← 19. Several reasons can explain the increasing costs for supplying regulatory reports, including the challenge for firms to populate reports with the correct data; the spread of instructions across different pieces of interlinking regulation; unclear wording of rules; and firms subjected to multiple regulatory regimes having to submit differing reports containing similar underlying data (European Commission, 2018[33]; FCA, 2020[31]).
← 20. The European Commission is aiming to ensure that key parts of EU regulation are accessible to natural language processing, are machine readable and executable, and more broadly facilitate the design and implementation of reporting requirements. It will also encourage the use of modern IT tools for information sharing among national and EU authorities. As a first step in the domain of machine readable and executable reporting, the Commission has launched a pilot project for a limited set of reporting requirements (European Commission, 2020[32]). The digitisation of reporting instructions was also explored by the UK Financial Conduct Authority (UK FCA) and the Bank of England (BoE) during a TechSprint in late 2016, during which it was found that a small set of reporting instructions could be converted into machine-executable code, in turn enabling machines to use this code to automatically find and return regulatory reporting directly from a simulated version of a company’s systems. Since then, work has progressed into a first and second phase involving the UK FCA, BoE and regulated banks (FSB, 2020[6]; FCA, 2020[31]; FCA, 2020[42]).
← 21. In Europe, some industry attempts to improve and standardise the reporting process have already been made through initiatives like the Banks’ Integrated Reporting Dictionary (BIRD), the Integrated Reporting Framework (IReF) and the European Banking Authority’s Data Point Model (DPM) (ECB, 2021[45]).
← 22. See, for instance, Canada, where “Bureau investigators have downloaded data stored outside Canada in the course of searches of computer systems located in Canada, although there continues to be some controversy as to the precise limits of the authority granted by a warrant authorising a search of computer systems in a cross-border context.”, https://www.lexology.com/gtdt/tool/workareas/report/617528c4-0e23-4678-a460-9333ed458dc0.
← 23. In particular regarding compliance with different conditions on cross-border data transfers involving personal information (Casalini and López González, 2019[51]).
← 24. For instance, in 2018, the UK Competition and Markets Authority (UK CMA) launched a Data, Technology and Analytics (DaTA) unit, which according to the UK CMA is the largest team of data and technology experts in any competition or consumer agency worldwide (OECD, 2019[22]). The unit includes team members with data engineering, data science, and data and technology market intelligence expertise. It aims to provide the UK CMA with technical capacity for working with data and using algorithms (OECD, 2019[22]). The French Competition Authority has also established a Digital Economy Unit, which will be responsible for, among other things, developing new digital investigation tools, based in particular on algorithmic technology, big data and artificial intelligence (OECD, 2019[22]). The Spanish Competition Authority’s Economic Intelligence Unit is made up of a group of experts in mathematics, statistics, and computer science, as well as economists and lawyers, and uses algorithms and big data analysis techniques to carry out its investigations (OECD, 2019[22]). Competition agencies from other jurisdictions, such as Canada and the EU, have noted their plans to establish a specialist team that will facilitate the use of AI in their investigations (OECD, 2019[22]).