Artificial intelligence (AI) algorithms have the potential to revolutionize healthcare, particularly in diagnostic imaging. However, AI-based software must be robustly evaluated and tested before implementation in order to reduce risk to patients and health systems, establish trust, and facilitate wide adoption. Regulatory bodies such as the FDA, the International Medical Device Regulators Forum (IMDRF), and the European Union have proposed frameworks for ensuring the safety and effectiveness of AI-based software as a medical device (SaMD).
The IMDRF defines SaMD as software that is intended for medical purposes such as diagnosis, prevention, and treatment, and it recommends four risk categories for SaMD applications based on two factors: the severity of the healthcare condition and the significance of the information the software provides to healthcare decision-making. Additionally, the IMDRF has outlined quality management system principles and standards for clinical evaluation and investigation to ensure the safety, effectiveness, and performance of SaMD.
The European Union also requires manufacturers to prepare and follow a postmarket follow-up plan and to compile a clinical evaluation report outlining the technology and intended use of the medical device, as well as any claims made about its safety and effectiveness.
While these frameworks provide a solid regulatory foundation and address many key aspects of ensuring the safety, effectiveness, and performance of SaMD applications, a number of gaps remain that are likely to limit adoption of these algorithms in practice. To improve the development and evaluation of diagnostic AI algorithms, additional strategies should be considered.
One strategy is to incorporate appropriate evaluation and improvement methods into distinct phases of development, analogous to the phases that have been applied to pharmaceuticals and proposed for software applications. Algorithms should be thoroughly tested and refined before being deployed in the clinical environment, as is expected of other medical devices.
Another strategy is to establish conditions that set up a race to the top for consistently excellent algorithm performance at each installed site. This can be achieved by introducing regulatory frameworks that focus on continuous improvement and real-world performance data. The FDA's Software Precertification (Pre-Cert) pilot program, a voluntary pathway for SaMD manufacturers that have demonstrated a robust culture of quality and organizational excellence and are committed to monitoring real-world performance, is a step in the right direction.
In summary, the adoption of AI-based diagnostic algorithms has the potential to greatly improve healthcare, but it is crucial that appropriate measures are taken to ensure their safety, effectiveness, and performance. By incorporating appropriate evaluation and improvement methods, thoroughly testing and refining algorithms, and establishing conditions for continuous improvement, we can pave the way for the successful implementation of AI in the clinical environment.
Be sure to check out this publication for more information:
Larson DB, Harvey H, Rubin DL, Irani N, Tse JR, Langlotz CP (2021) Regulatory frameworks for development and evaluation of artificial intelligence-based diagnostic imaging algorithms: summary and recommendations. J Am Coll Radiol 18:413–424.