Review Article

Intersection of artificial intelligence and medicine: tort liability in the technological age

Kyle T. Jorstad

Case Western Reserve University School of Law, Cleveland, OH, USA

Correspondence to: Kyle T. Jorstad. 11457 Mayfield Rd., Apt. 660, Cleveland, OH 44106, USA. Email: kjorstad85@gmail.com.

Abstract: This note presents an analysis of the medico-legal and bioethical risks posed by the incorporation of artificial intelligence (AI) and machine learning into clinical radiology practice, with specific focus on the field of mammography. The analysis presents an overview of the current medical malpractice framework relative to mammography; examines the fitness of current legal frameworks for apportioning liability in cases of injury resulting from errors by machine learning tools; evaluates various options for addressing the malpractice model’s gaps as AI is incorporated into clinical patient care; and provides means by which the healthcare industry may minimize short-term liability for machine learning error while ensuring that neither the public nor the regulatory framework is unnecessarily biased against the use of AI in medicine.

Keywords: Artificial intelligence (AI); deep learning; radiology; mammography; malpractice liability; machine error


Received: 11 September 2020; Accepted: 15 December 2020; Published: 30 December 2020.

doi: 10.21037/jmai-20-57


“By far, the greatest danger of Artificial Intelligence is that people conclude too early that they understand it.” —Eliezer Yudkowsky (1).


Introduction—artificial intelligence (AI) in medicine

The healthcare community is nervously awaiting the entrance of AI into clinical practice. This nervousness is entirely justified. AI currently exists outside the bounds of our healthcare industry, and the judiciary is presently unequipped for the medical malpractice (med-mal) claims waiting in the wings.

AI lacks many of the fundamental characteristics upon which the medical community is built. AI has taken no part in the historical development of the practice of medicine. AI also lacks the decades of meticulously peer-reviewed and thoroughly tested scientific studies in which medical procedures and drugs find their roots. In addition, AI defies the human, in-person interaction components of health care that, until now, have been largely taken for granted. Health care as we know it evolved from the small family doctor who was well known and trusted; AI has no means of developing the same personal relationship with patients. Finally, the legal and regulatory framework governing the medical profession lacks precedent for determining what happens if a medical AI tool causes harm to a patient. In short, the med-mal system does not provide hospitals, physicians, manufacturers, or patients with the tools to predict when or where liability will be assigned when AI causes injury. Implementation of AI tools will therefore require the medical and legal communities to overcome both internal resistance by professionals and indeterminate liability to injured patients.

Application of AI to medical diagnosing is a legal question of first impression, meaning the courts lack precedent for guiding decisions (2). No other modern, established application of AI poses quite the same novel legal questions as machine diagnosing. Autopilot in airplanes can be instantly overridden by onboard pilots in the event of malfunctions. Remote surgery is not entirely automated, but is controlled by a physician from a separate facility. AI in financial monitoring or resumé filtering involves lower standards of care than those required of the medical profession, implicates lesser property interests, and is subject to drastically less regulation. Self-driving cars face regulatory complications similar to those of AI in health care, including a general lack of cohesive law governing implementation (3). For the time being, the lack of comparable AI applications leaves regulation principally in the hands of manufacturers and providers of the end service, who are left to self-regulate.

Risk aside, the medicolegal uncertainties surrounding AI should not dissuade efforts towards the technology’s integration into clinical practice. AI is playing an increasingly prominent role in our society, with “86% of provider organizations, technology vendors, and life science companies using some form of AI” (4). AI applications for image recognition are particularly relevant to the field of radiology, and could revolutionize medical approaches to preventative care.

Incorporating AI diagnostic tools into mammography has the potential to significantly increase early diagnoses and reduce mortality rates in the fight against breast cancer. Radiologists review approximately 37 million mammograms in the U.S. annually, at an estimated cost of eight billion dollars (5). On average, less than 5 in 1,000 patients test positive for cancer (6), and between 5% and 20% require some type of follow-up, whether that be a re-read by another mammographer, a second scan, ultrasound, physical exam, or biopsy (7). “Radiologists still miss between 10% to 30% of cancers, while 80% of women recalled for additional views have normal outcomes, with 40% of biopsied lesions being benign” (8). Despite the admirable and progressive work of medical professionals to provide quality care, there is always room for improvement.

Significant evidence suggests that AI can fill much of the gap between human performance and perfection. This note will present some of the relevant studies and results substantiating AI potential, though this note is not an analysis of existing AI diagnostic programs and their statistical accuracy. Rather, this note stipulates AI diagnostic accuracy equivalent to or better than that of physicians, and instead attacks the subsequent question: what are the ramifications of effectively removing a human radiologist from the diagnostic process? This note will further generally assume that mammogram administration, the recording and consideration of patients’ previous health and familial histories, and other factors relevant to diagnosing are thoroughly and properly conducted, placing focus solely on liability resulting from machine error. Although many of the medicolegal considerations discussed herein apply equally to AI implementation for both diagnosis and treatment plans, the explicit focus of this note is on diagnostic functions. There is no silver bullet for successfully integrating AI into complex health care, but there are ways to minimize the associated risks and ensure quality of care. To that end, this note examines the expanding role played by AI in medical diagnosing and the resultant impact on the American legal system as it pertains to med-mal. Part I provides an overview of how AI functions and the development of mammography. Part II examines the current med-mal system, with particular consideration of the standard of care and its implementation in mammography and radiology. Part III discusses the fitness of the current med-mal system to address potential negative outcomes following physician reliance on AI-based diagnostic tools. Finally, Part IV posits and evaluates options available to various healthcare entities for addressing the malpractice model’s shortcomings as AI penetrates clinical patient care.


Innovation and history—an understanding of AI and breast cancer

Despite growing public awareness of the increasing role AI plays in our modern technological infrastructure, significant confusion remains as to what exactly AI is. Most current technologies, including the majority of programming embedded within medical devices, are dependent on original programming. A program will follow its programming and will neither deviate from nor independently alter the base code that governs its actions. In other words, a computer is only as much as its maker allows it to be. Conversely, new forms of technological innovation utilize constructs of AI, machine learning, and deep learning to increase efficiency and reduce the amount of human labor required for a wide variety of functions (9). Many forms of modern AI are even capable of solving problems or reaching conclusions that “its programmer never anticipated or even considered” (10). This section explains the functional distinctions between AI, machine learning, and deep learning, and explores how these concepts interact.

The development of AI

The Oxford Reference defines AI as “the theory and development of computer systems able to perform tasks that normally require human intelligence, such as visual perception, speech recognition, decision-making, and translation between languages” (11). Simply put, AI is the attempt to mimic human intelligence in machine form, allowing the machine to solve problems using a set of stipulated rules with which the machine is provided. The term “machine” is herein used synonymously with the particular AI program utilized. A program as simple as a short set of “if … then” rules within accounting software might be considered AI.
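To make the distinction concrete, the following Python sketch illustrates the kind of fixed “if … then” logic described above. The accounting scenario, the flag_transaction() function, and its thresholds are hypothetical, invented solely for illustration.

```python
# Hypothetical rule-based "AI": a fixed set of if/then rules written by a
# programmer. The program never deviates from, or improves upon, these rules.
def flag_transaction(amount, category):
    """Return True if an accounting entry should be flagged for review."""
    if amount > 10_000:                # rule 1: unusually large transaction
        return True
    if category == "miscellaneous":    # rule 2: vague categorization
        return True
    return False

print(flag_transaction(12_500, "travel"))       # True
print(flag_transaction(80, "miscellaneous"))    # True
print(flag_transaction(80, "office supplies"))  # False
```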

Machine learning is a sub-type of AI distinct for its ability to learn from data sets, enabling the machine to reach conclusions or predictions without being explicitly programmed for specified responses (12). The real value lies in the machine’s ability to modify and improve its own algorithms without programmer intervention, allowing the machine to essentially bypass any human preconceptions that the programmer built into the original base code. The machine is composed of several “layers”: the first layer is the input, which receives raw data; the last layer is the output, which presents the machine’s conclusion based on that data; and between them sit one or more middle layers that perform the analysis connecting input to output (13). Reliable machine learning algorithms require large quantities of training data from which the machine learns, enabling it to make predictions about future data. The more training data the machine is provided with, the more reliably the machine will react in practice1 (14).
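The structure described above can be sketched in a few lines of Python with NumPy. The sketch below is a toy illustration on synthetic data, not a diagnostic model: the point is that the weights connecting input to output are adjusted from training examples rather than hand-coded by the programmer.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))                             # input layer: 200 examples, 3 features
y = (X @ np.array([1.5, -2.0, 0.5]) > 0).astype(float)    # synthetic training labels

w = np.zeros(3)                               # learned connections between input and output
for _ in range(500):                          # training loop: adjust weights from the data
    pred = 1 / (1 + np.exp(-(X @ w)))         # output layer: predicted probability
    w -= 0.1 * X.T @ (pred - y) / len(y)      # update derived from data, not hand-written rules

new_case = np.array([0.4, -1.0, 0.2])         # unseen example
print(1 / (1 + np.exp(-(new_case @ w))))      # the trained machine's prediction
```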

A simple example of a machine learning application might be related to traffic lights. Say a city experiences extreme traffic congestion at certain intersections during certain times of day. The city would like to develop a system that controls traffic lights to minimize congestion and maximize street usage. Rather than hire staff to sit at intersections all day and night monitoring traffic patterns, an AI program could be applied to the problem. Initially, the program is given specific instructions for operating each traffic light. The machine is also able to track the number of cars passing through each intersection, during each hour, and what direction the cars are travelling. Over time the machine will learn that certain stretches of road experience more westbound traffic during the morning hours, and will alter lights to give westbound cars more green lights in the morning. Automating the process further allows the machine to track traffic patterns as they change with time, seasons, or infrastructure changes. If a road closure increases traffic volume along an adjacent roadway, the machine will be able to independently alter traffic light duration, rather than relying on city planners to notice problems and make changes. Although use in traffic functions is a simple example, it demonstrates the potential for AI to save time and money on labor-intensive tasks, freeing up resources. Applications of AI to mammography hold the same potential for increased efficiency and decreased costs.
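As a purely hypothetical sketch of that traffic-light scenario, the controller below logs observed counts per hour and direction and then adjusts green-light time itself, without a planner re-coding the schedule; the class name, thresholds, and counts are invented for illustration.

```python
from collections import defaultdict

class AdaptiveSignal:
    """Toy signal controller that reallocates green time from observed demand."""

    def __init__(self, base_green=30):
        self.base_green = base_green
        self.counts = defaultdict(int)          # (hour, direction) -> cars observed

    def observe(self, hour, direction, cars):
        self.counts[(hour, direction)] += cars  # data gathered automatically at the intersection

    def green_seconds(self, hour, direction):
        east = self.counts[(hour, "eastbound")]
        west = self.counts[(hour, "westbound")]
        total = east + west
        if total == 0:
            return self.base_green              # no data yet: fall back to the fixed schedule
        share = self.counts[(hour, direction)] / total
        return round(2 * self.base_green * share)   # split the cycle by observed demand

signal = AdaptiveSignal()
signal.observe(8, "westbound", 400)             # heavy westbound commute at 8 a.m.
signal.observe(8, "eastbound", 100)
print(signal.green_seconds(8, "westbound"))     # longer green for the busier direction
```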

Deep learning is a sub-type of machine learning that requires significantly greater computing power and produces more reliable results. Where machine learning generally utilizes only one or two middle layers, deep learning features numerous additional middle layers. Deep learning mimics the human brain, in that each middle layer acts as a set of connections between neurons—the more layers, the more intricate the machine’s reasoning (15). Think of deep learning as a massive flow chart. Each conclusion the machine arrives at leads to a separate question; eventually, the machine will arrive at one final answer from among a potentially infinite number of competing outcomes. Similar to the human brain, the machine may not use every reasoning step it is capable of in making a particular decision; it uses only those layers necessary, depending on what reasoning steps the machine takes along the way. The more layers the machine travels through in answering a particular question, the “deeper” the machine’s reasoning.
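In code, assuming the TensorFlow/Keras library, a “deeper” model differs from the simple sketch above chiefly in the number of hidden layers stacked between input and output. The layer counts and sizes below are arbitrary illustrations, not a validated diagnostic architecture.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(30,)),                     # input layer: 30 numeric features
    keras.layers.Dense(64, activation="relu"),    # hidden layer 1
    keras.layers.Dense(64, activation="relu"),    # hidden layer 2
    keras.layers.Dense(32, activation="relu"),    # hidden layer 3: "deeper" reasoning
    keras.layers.Dense(1, activation="sigmoid"),  # output layer: a single probability
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.summary()   # prints the stacked layers; the middle ones are the "hidden" layers
```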

Unfortunately, the more layers the machine has, the more difficulty humans have in retracing and following the logic of the machine’s conclusion, resulting in what is referred to as the “black box” problem: as a program becomes more autonomous, its algorithms become less intelligible to users, even the original programmers (4). As a result, these middle layers are considered “hidden”—not because of uncertainty regarding their presence, but because there is no reliable way to track the machine’s progress through these layers of reasoning as data is processed. Tracking layers in AI is similar to trying to track a human’s reasoning processes by observing neurons in the brain. It is relatively simple to show which neurons fire and in which order when a person is given a simple reasoning problem. It is entirely more difficult to explain how each fired neuron alters an individual’s overall thought process (10). At least for now, the black box problem makes it nigh impossible for radiologists or programmers to rationalize the outcomes of a deep learning program (14). Nevertheless, the programming applied in complex AI has proven extremely useful and accurate in a wide variety of applications.

A type of algorithm known as a Convolutional Neural Network (CNN) implements the deep learning model, but “with the explicit assumption that the inputs are images” (15). “The ability for CNNs to learn complex spatial relationships and subtle and intricate pixel-based patterns makes them a perfect tool for learning from control and ‘teaching’ style inputs of radiologic images” (16). CNNs have been implemented in various contexts of the digital world. Pinterest developed a program dubbed PinSage to map images uploaded to its servers, classify them based on content type, and draw connections based on user interests. The result is a comprehensive search feature that combines search terms and previously clicked images to provide users with a personalized content feed (17). Google applies deep learning to image analytics and enhancement, allowing the system to “fill in or restore missing details in images, simply by learning from what’s already there in the image, as well as what it’s learned from other, similar images” (18). Image-based AI is increasingly common and, as AI programming capabilities advance, increasingly complex.
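The general shape of a small CNN can be sketched as follows, again assuming TensorFlow/Keras; the input size, filter counts, and output interpretation are hypothetical placeholders rather than the architecture of PinSage, Google’s tools, or any mammography program discussed in this note.

```python
from tensorflow import keras

cnn = keras.Sequential([
    keras.Input(shape=(256, 256, 1)),                            # grayscale image input
    keras.layers.Conv2D(16, kernel_size=3, activation="relu"),   # learn local pixel patterns
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(32, kernel_size=3, activation="relu"),   # learn larger spatial relationships
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),                 # e.g., probability the image is suspicious
])
cnn.compile(optimizer="adam", loss="binary_crossentropy",
            metrics=[keras.metrics.AUC()])
```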

In summary, AI, machine learning, and deep learning are interrelated, but not synonymous. At a base level, AI is human-programmed software designed to perform simple tasks that would otherwise require human skill. The next step up is machine learning, or programs that are able to modify themselves when provided with new data. Finally, deep learning models are subsets of machine learning that dramatically increase the machine’s accuracy, thanks to numerous additional layers of reasoning and larger datasets. Successful AI medical diagnostic systems of the caliber stipulated by this note must utilize all three of these concepts to deliver the degree of accuracy necessary for quality treatment.

Historical review of breast cancer and mammography

During the nineteenth century, diagnosis and treatment of breast cancer were more a matter of physician preference than of a unified scientific approach (19). Mammography fundamentally altered medicine’s approach by providing what would ultimately prove to be a universally accepted method of preventative care (19). The roots of mammography stem from as early as 1913, when surgeon Albert Salomon first demonstrated the potential for radiography to distinguish carcinomas from healthy tissue (20). Seventeen years later, radiologist Stafford Warren tested a method of stereoscopic mammography on patients immediately prior to mastectomies. While preoperative clinical diagnoses for the patients were relatively uncertain, Warren found his early radiographic technique to be “often very definite and most frequently correct”, with interpretive errors made in only eight of 119 cases (20).

In addition to the discovery of the strong correlation between radiographically visible microcalcifications (tiny calcium deposits within the breast) and the occurrence of breast cancer, the 1950s saw the first implementation of analog (X-ray) film in mammography. As interest in the potential for early breast cancer diagnosis grew, organizations such as the American College of Radiology and the Cancer Control Program of the U.S. Public Health Service hosted conferences to evaluate mammography potential, and established training centers across the country. In 1963, the first needle localization was performed, allowing for improved biopsy accuracy and less invasive, “smarter” surgical removal of lesions (21). The first dedicated screen-film mammography system designed specifically for breast imaging was produced in 1973 in response to growing demand for screening.

Mammography experienced another significant advancement in the early 2000s when digital mammography became possible (22,23). Although both film and digital methods utilize X-rays, digital techniques reduce the radiation dose received by the patient (24). More importantly, digital imaging opened the door to long-term improvements in diagnostic care by permitting the aggregation of mammography results into training datasets for use in machine learning algorithms. Digital methods also enable radiologists to enlarge and adjust images during evaluation, and paved the way for the first computer programs specifically designed to assist physicians with image diagnosis.

Computer programs have been applied to diagnosis in radiology since the 1960s as a method to assist in the detection of subtle signs of cancer or other abnormalities (25,26). Computer-aided detection (CADe) was first approved for clinical use in 1998 by the U.S. Food and Drug Administration (25), and became more widely implemented following its approval for reimbursement by the Centers for Medicare & Medicaid Services (CMS) in 2002 (25,27). CADe was intended as an assistive program that would flag abnormalities for review by a radiologist. Early iterations focused on a system of supervised learning in which radiologists prepare a training set of radiologic images and use data labels to identify the sections of the images that indicate abnormalities (25). The learning is considered supervised because radiologists specifically pair each type of abnormality with its corresponding diagnosis, leaving no room for the machine to independently identify indicators of either image abnormalities or diagnoses. Programmers then “train” the machine by feeding the paired images and resultant diagnoses into the machine. New images not included in the original training dataset are then used to test the diagnostic accuracy of the program. The machine searches the test image for any of the image abnormalities identified by the radiologists in the training dataset images (25). If the machine locates any matching traits, the machine flags the relevant portion of the image and notes any probable corresponding diagnosis for review by the radiologist.
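A toy version of that supervised workflow, assuming the scikit-learn library, might look like the following. The numeric “region features” and labels are fabricated placeholders standing in for radiologist-annotated image regions; no real mammography data are involved.

```python
from sklearn.ensemble import RandomForestClassifier

# Each row stands in for features computed from a radiologist-labeled image
# region (e.g., brightness or shape descriptors); the label says whether the
# radiologist marked that region as an abnormality.
region_features = [
    [0.91, 0.12, 0.80],   # labeled abnormal (e.g., microcalcification cluster)
    [0.15, 0.40, 0.10],   # labeled normal tissue
    [0.88, 0.20, 0.75],
    [0.10, 0.35, 0.05],
]
labels = [1, 0, 1, 0]      # 1 = abnormal, 0 = normal, supplied by radiologists

model = RandomForestClassifier(random_state=0).fit(region_features, labels)

new_region = [[0.86, 0.18, 0.70]]   # a region from a mammogram outside the training set
if model.predict(new_region)[0] == 1:
    print("Flag region for radiologist review")   # CADe flags; the human still decides
```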

CADe is premised upon a “double reading” system (meaning two radiologists review each mammogram) in which the program brings any detected abnormalities to the attention of the radiologist, who then evaluates abnormality significance to determine whether further clinical steps are necessary (26). The double reading system is central to the radiologic standard of care in Europe, where it has been widely recognized as decreasing unnecessary recall and biopsy rates while increasing cancer detection (25). As of 2016, some form of CADe was implemented in 92% of all screening mammograms conducted in the U.S. (25).

Professionals give CADe mixed reviews (8). One study reports that CADe increases detection of small invasive cancers by 164% and detects cancer an average of 5.3 years earlier (26). Although CADe increases the amount of radiologist time required to evaluate a mammogram by an estimated 19%, many argue the benefits render the increase negligible (28). Other studies show no significant difference in radiologist performance when CADe is used (29,30). Another study found that out of 4,191 case reviews by radiologists utilizing CADe software as a double reader, the radiologists altered their diagnosis in only 100 cases (2.4%); the study “found no correlation between (the radiologist’s) mammography interpretation experience and the addition of CADe leading to an improvement in sensitivity” (30). Some suggest that CADe may actually increase error likelihood (31). Across the board, however, findings point to the true weakness of CADe: it is only as reliable as the radiologist(s) reviewing the program’s findings. “Despite important advances in mammography technology, the sensitivity of (digital mammography) is still below optimal levels and varies between readers” (8). CADe’s dependence on pre-programmed parameters for detecting abnormalities further weakens it as a diagnostic tool.

CADe’s inherent constraints have encouraged exploration of CADx (computer-aided diagnosis, a form of machine learning) using unsupervised learning whereby the machine is fed a series of images with their corresponding diagnoses. The images in the training data set do not, however, contain any data labels telling the program what traits of the image data indicate abnormalities. Instead, the machine independently identifies abnormality image traits by aggregating the images from particular diagnoses and searching for correlations. For example, a machine might be fed 1,000 mammogram images, 500 of which reveal microcalcifications. Unlike in supervised learning/CADe, programmers do not identify the calcified areas within the breast image. Instead, the machine is left to process the images for commonalities. In theory, the machine becomes “trained” to identify microcalcifications, allowing it to perform similar diagnoses on actual patients2.
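By way of contrast with the supervised sketch above, the following illustration, assuming TensorFlow/Keras and using randomly generated stand-in data, pairs whole images with an image-level diagnosis only; no one tells the network where in the image the finding is, so it must discover the distinguishing pixel patterns itself.

```python
import numpy as np
from tensorflow import keras

rng = np.random.default_rng(1)
images = rng.random((100, 64, 64, 1)).astype("float32")   # 100 toy "mammograms" (random pixels)
diagnoses = rng.integers(0, 2, size=100)                  # image-level labels only, no region marks

model = keras.Sequential([
    keras.Input(shape=(64, 64, 1)),
    keras.layers.Conv2D(8, 3, activation="relu"),         # patterns learned, not hand-labeled
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(images, diagnoses, epochs=2, verbose=0)         # learns from image/diagnosis pairs alone
```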

Newer methods of mammography are unlikely to revolutionize diagnostic care in the realm of breast cancer, reinforcing radiologists’ hopes for AI. Newer methods of breast analysis have entered clinical testing, one such recent entry being tomosynthesis. Whereas mammography creates an image of a single 2D slice of breast tissue, tomosynthesis creates what is essentially an interactive 3D model of the breast, increasing the radiologist’s ability to identify potential abnormalities (32). Early studies suggest tomosynthesis may reduce recall rates by 15–20% (31). Despite potential to be more accurate than mammography, tomosynthesis also approximately doubles reading time without eliminating potential for errors (30). Other new mammography techniques currently under testing face similar barriers3. Although alternatives to mammography may marginally increase diagnostic accuracy beyond current standards, every alternative builds upon current digital diagnostic standards. If programmers account for differences in imaging methods (such as mammogram vs. tomosynthesis, slice thickness, image resolution, etc.), then AI should be capable of evaluating radiologic image outputs from any number of diagnostic methods, rendering performance distinctions between breast analysis methods marginal when used in conjunction with machine diagnosing (33).

Applying AI to mammography

Although the potential of deep learning techniques in medical diagnosing garnered interest in the 1960s around the same time as CADe, early implementation attempts failed for a variety of reasons. Advanced image-processing methods were not developed until decades later. Even had they been available, computers lacked the capabilities necessary for processing the thousands of images essential to true deep learning techniques. As noted earlier, only recently have large quantities of digital mammograms become available for use in data sets. Now that computer systems, programming, and data availability are crossing the threshold from possibility into reality, the questions surrounding AI diagnosing are shifting toward the ramifications posed by clinical use.

Not all researchers investigating AI’s application to mammography approach the endeavor with identical conceptions of clinical implementation. One potential application is triage diagnosing, where AI targets elimination of easily spotted negatives from the pool of mammograms (34). The machine would assume 60–70% of the workload, clearing out the easy cases and allowing radiologists to focus efforts on more complex, time-consuming mammograms. One advantage of triage diagnosing is that it presumably reduces the risk of misdiagnoses, since the more difficult cases are handled by radiologists. Note that this method assumes the machine is capable of distinguishing cases by difficulty level.
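Purely as a sketch, a triage rule of this kind could reduce to thresholding the machine’s output probability; the cutoff values below are illustrative and carry no clinical validity.

```python
def triage(probability, low=0.02, high=0.40):
    """Hypothetical routing rule based on a model's estimated probability of malignancy."""
    if probability < low:
        return "auto-clear as negative"        # the "easy" negatives the machine absorbs
    if probability > high:
        return "priority radiologist review"   # strongly suspicious findings
    return "routine radiologist review"        # ambiguous cases remain with humans

for p in (0.005, 0.15, 0.62):
    print(p, "->", triage(p))
```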

Alternatively, many researchers envision a “complete” AI diagnosing system in which AI evaluates every mammogram, completely eliminating the approximately 37 million mammograms performed annually in the U.S. from radiologists’ workload (5). Although complete diagnosing may be the end result of a viable machine diagnosing system, any AI implementation in the near future would more likely be on a triage basis until efficacy is demonstrated. AI may also have more immediate applications to automated breast density assessments, which have a significant impact on mammogram accuracy (8).

It should be noted that although an AI might be capable of arriving at a medical conclusion, from a legal perspective an AI diagnostic tool cannot make a diagnosis. The medical profession is highly regulated, partially through state licensure boards. If a physician does not receive a license to practice by a state licensing board, then that physician cannot legally practice medicine anywhere within the jurisdiction. An unlicensed individual or business could not provide AI mammogram services to the public, and as of yet a machine cannot independently become licensed. As a result, there will always be a licensed individual or facility behind the AI services being provided, even if the AI is held out as the diagnosing entity. One solution used in China is to have medical AI programs pass state medical exams and thereby obtain “medical licenses” (35). AI medical licensing allows the program to be “disbarred” after a certain number of errors, without being associated with a licensed human. Regardless, references to an AI diagnosis contained within this note are made from a purely medical perspective, and presume AI use under a valid medical license.

Despite the admirable goals behind AI research, inconclusive evidence regarding efficacy may cause hesitancy. “If the sensitivity for detection of lesions by computer would be lower than the average sensitivity of physicians, it would be difficult to justify the use of automated computer diagnosis” (26). Doubts have prompted exploration of alternatives to correct mammogram misdiagnosing, one of which is increased readership; rather than having each mammogram read by one radiologist, have it evaluated by two. Double-reader solutions are unlikely to prove viable in the U.S. for two reasons, the first being the significant increase in hours dedicated to evaluating mammograms. More significantly, some evidence suggests that increasing readership may not decrease misdiagnoses. A study pitting human against machine concluded: “An estimated 16–31% of detectable cancers are missed when screening mammograms are reviewed by a single radiologist. With a second reader, three to eleven additional cancers are found per 10,000 women screened. This is why the focus has been on CAD, rather than additional radiologist viewings, for increased sensitivity to detectable cancers” (28).

Growing evidence suggests that AI diagnosing may be approaching the level of diagnostic accuracy necessary for practice integration. Google’s endeavors into AI present one such example. Researchers from Google Health recently undertook the creation of an AI model using a training dataset of mammograms from the U.K. and U.S. Upon testing, the machine reduced false positive rates by 1.2% in the U.K. and almost 5.6% in the U.S., and reduced false negatives by 2.7% and 9.4% respectively4 (36).

Even more importantly, the team overcame the obstacle of data transferability. A consistent concern regarding AI has been whether a machine trained on mammograms from one geographic locale would maintain diagnostic accuracy when applied to another locale. The Google team tested this theory by retraining the machine using strictly U.K. mammograms, and then testing it on U.S. mammograms. The machine achieved “3.5% less false positives and 8.1% less false negatives than the doctors” (36). Transferability is important because an indirect benefit of AI diagnosing is increased preventative care in underserved or overburdened communities (37). The risk lies in applying the machine’s algorithms to populations it has not been trained on, because populations may vary in the symptoms and predispositions to certain afflictions. For example, sickle cell anemia is more common among individuals of African American, Mediterranean, and Central American descent than any other ethnicity (38). Transferability becomes especially crucial for implementing AI diagnostic tools in underserved regions of the world where limited or nonexistent mammography records preclude machine training on local data5.
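A transferability check of the kind described can be sketched, under stated assumptions, as training on data from one population and scoring on another. The example below uses scikit-learn and synthetic data in which the shift between the two “sites” is simulated with a simple offset; it is not a reproduction of the cited study.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(2)
X_site_a = rng.normal(size=(500, 5))              # stand-in for "U.K." training data
y_site_a = (X_site_a[:, 0] > 0).astype(int)
X_site_b = rng.normal(loc=0.3, size=(500, 5))     # stand-in for a shifted "U.S." population
y_site_b = (X_site_b[:, 0] > 0.3).astype(int)

model = LogisticRegression().fit(X_site_a, y_site_a)   # trained on one locale only
auc = roc_auc_score(y_site_b, model.predict_proba(X_site_b)[:, 1])
print("AUC on the unseen population:", round(auc, 3))  # measures how well performance transfers
```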

Another important consideration is the “locking” of AI learning functions after the AI has been sufficiently trained. Functionally, locking the program means that once programmers determine the machine has a sufficient amount of training data for a particular function, the machine’s learning capacity is frozen so that the machine’s diagnosing parameters do not continue to alter (39). Locking might prove necessary to obtain FDA approval for market usage because a product needs to perform consistently at the same level during market use as it did during trial testing.
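In software terms, locking can be as simple as freezing a trained model’s weights so that use in the field cannot change them. The Keras sketch below is illustrative only, and the tiny architecture is a placeholder rather than any device actually submitted for approval.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.Input(shape=(10,)),
    keras.layers.Dense(8, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),
])

for layer in model.layers:
    layer.trainable = False    # freeze the weights: no further learning from new patient data

model.compile(optimizer="adam", loss="binary_crossentropy")
# The locked model still predicts, but updating it would require unlocking,
# retraining on new data, and revalidation before redeployment.
```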

AI locking would be counterproductive and defeat the policy rationales behind AI. AI’s promise is not only to excel beyond human capability in diagnosing disease, but to continue improving accuracy as its operations process more data. Requiring repeated locking would eliminate or at least delay this adaptive advantage. To maintain constantly improving diagnostic systems, manufacturers would be required to lock a program, obtain additional radiologic images to supplement the machine’s existing training dataset, obtain a new FDA pre-market approval, and then reissue the new AI to providers. Each of these steps entails a lengthy and highly regulated process that justifies placing AI machines outside traditional FDA rules. Instead, AI-specific guidelines should be developed to ensure oversight while allowing AI to grow as a medical tool.

The FDA recognizes the need for reimagined regulatory oversight of AI tools that embraces AI potential while ensuring patient safety. One approach explored by the FDA is an assessment of the “culture of quality and organizational excellence of a particular company” so as to obtain “reasonable assurance of the high quality of their software development, testing, and performance monitoring of their products” (39). In theory, reassurance regarding the quality of an organization and their product would leave the FDA more inclined to relax pre-market approval standards and approve AI tools without a locking requirement6. With the FDA playing an active role in adjusting the regulatory framework to account for expanding technological capabilities, it is possible that a new framework may be rolled out to specifically accommodate AI.


Med-mal in the context of mammography

This section focuses on conveying the current legal framework and standards for med-mal adjudication in the U.S. Med-mal is a specific area of negligence tort law that serves to compensate individuals harmed by medical practitioners. Negligence is defined as “conduct which falls below the standard established by law for the protection of others against unreasonable risk of harm” (40). The U.S. differs from many other countries in that the standards for med-mal are largely determined by individual states. These standards were originally created principally by court precedent, but within the last 30 years have been increasingly established by state legislation (41). While historically the split between federal and state legislative action resulted in varying standards across jurisdictions for med-mal claims, today a relatively uniform standard exists.

Rationales of the tort system

A tort is “a civil wrong, other than breach of contract, for which a remedy may be obtained, usually in the form of damages” (42). The tort system is a collection of laws enabling injured persons to seek remedies from the party that caused the injury. Judgments against defendants usually result in compensatory damages (financial compensation intended to repair any damage done to the injured party), and in rare circumstances punitive damages (financial compensation punishing the defendant for outrageous conduct and intended to deter repetition of the same conduct by any member of society) (43). Med-mal claims historically arose under assault and battery, two of the common wrongs encompassed by tort law, but today are adjudicated under negligence principles.

Tort law relies heavily on the reasonable person standard to determine whether the defendant’s conduct was negligent. The reasonable person is a legal fiction created to represent the average individual, and does not reflect the preferences or characteristics of the specific defendant. Additionally, the reasonable person standard is not a simple formula; it is a flexible, fact-intensive evaluation used by courts to determine the “kind and degree of care, which prudent and cautious men would use, such as is required by the exigency of the case, and such as is necessary to guard against probable danger” (44). The purpose of the reasonable person standard is to ensure that each individual within society is held to the same standard of care. Oliver Wendell Holmes explains “the standards of the law are standards of general application … when men live in society, a certain average of conduct, a sacrifice of individual peculiarities going beyond a certain point, is necessary to the general welfare” (45). Thus, the law will hold every radiologist to the standard of the average radiologist, but will not punish the average radiologist for performing below the standard of the highest-caliber radiologists.

Society’s system of legal requirements and policies is also built around incentivizing individuals to act responsibly. To that end, society imposes a system of fault liability to enforce accountability for wrongdoing in cases of med-mal. When a physician’s failure to exercise due care directly leads to a patient’s injury, the physician is responsible for making the plaintiff whole again. Fault liability requires a plaintiff to prove that the defendant’s injury-causing conduct was either intentional or negligent (43). Fault liability is premised on agency, control, and foreseeability; the law will impose liability on the responsible party if the defendant acted of his own volition, and the resultant injury to the plaintiff could have been anticipated by a reasonable person (4). Although the system is partially retroactive in nature, the law is also preemptive in the sense that by punishing one physician for a particular error, the court seeks to discourage that same behavior in all physicians. By punishing individuals for actions over which they have control, the med-mal system encourages higher quality medical care and fairly apportions liability to any wrongdoers. If the defendant was justified in his action, or has an affirmative defense such as self-defense, then a plaintiff may either be unable to recover, or will receive a reduced recovery. Similarly, a finding of contributory negligence in which the plaintiff contributed to the injury may reduce or void a recovery depending on the extent of the plaintiff’s negligence.

Conversely, strict liability allows for the imposition of liability on a party without a showing of fault, such as intent, knowledge, recklessness, or negligence (4). Instead, plaintiffs need only demonstrate that the tort occurred, and that the defendant caused the injury. Situations deemed inherently dangerous typically involve strict liability, such as consumer product liability, keeping wild animals, and ultrahazardous activities such as explosives demolition (46). The differences in bargaining power and knowledge between the injured party and the defendant in such cases are given greater weight than in other situations, such as med-mal claims. Inequalities of information and bargaining power, coupled with increased potential for severe injury, merit the heightened strict liability burden on defendants when injury does occur.

Legal standards in a med-mal claim

For a plaintiff to prevail on a med-mal claim, a claimant must prove four conditions: duty, breach, causation, and loss. Three of these conditions are relatively simple, and are discussed only briefly here. Duty requires the plaintiff to prove that the particular medical provider named as a defendant had a legal obligation to provide care to the injured patient. Despite a few caveats, duty is generally presumed when a physician assumes a patient’s care. Causation requires demonstration of a direct connection between the defendant’s alleged misconduct and the patient’s injury. Claims involving injuries not immediately caused by a provider but nonetheless reasonably traceable to the provider’s actions are also often allowed under a theory of proximate causation. Loss, or compensatory damages, are calculated at the conclusion of a med-mal claim and are generally easy to establish. Losses include a combination of medical bills, lost earnings and earning capacity, pain and suffering, and other types of damages resulting from the injury. Punitive damages are exceedingly rare in med-mal recoveries and are reserved for only the most egregious cases of physician misconduct (47). Note that sub-standard care by a provider does not always equate with med-mal. “A competent physician is not liable per se for a mere error of judgment, mistaken diagnosis or the occurrence of an undesirable result” (48). For a patient to have an actionable med-mal claim, the provider must not only have deviated from the accepted standard of care, but that deviation must also have caused injury to the patient.

For the purposes of analyzing liability in connection with AI services, the primary element of a med-mal claim is breach—a violation by the provider of a legal duty to adhere to a professional standard of care. The standard of care is a set of guidelines specifying the appropriate or required treatment methods for a given condition based on medical research and professional practice. As the standard of care became accepted as a metric for determining provider liability, many jurisdictions adopted the “locality rule” and compared physicians against other “similarly situated professionals in their community” (41). In theory, the locality standard protected physicians in rural and underdeveloped communities against mandated conformity with standards practiced at urban centers, where access to modern facilities and the latest medical research was more readily available.

An 1880 ruling by the Supreme Judicial Court of Massachusetts in Small v. Howard is often credited with the first use of the locality rule (49). Critics argue that the subsequent standardization of training and licensing rendered the locality rule moot. In order to protect the medical profession at large against “claims based on failure to achieve contracted outcomes”, groups such as the American Medical Association played a significant role in the development of standards for the evaluation of medical care (50). Organizations such as the Accreditation Council for Graduate Medical Education, the American Board of Medical Specialties, and state medical boards enforce a degree of uniformity within the medical community. As of 2014, only five states are believed to still implement some form of the locality rule, with the remainder now adhering to a national standard evaluating what a reasonable physician would do under like circumstances without a geographic consideration (41). Additionally, “hindsight cannot form the basis for evaluating the conduct and judgment of the treating physicians at the time their professional judgment was exercised” (51). The increasing digital connectivity of society and the ease with which professionals may stay up to date with medical advances further weaken rationales behind the locality rule (41).

The distinction between the national and locality standards plays a crucial role in setting jurisdictional standards of care. A locality standard state lagging behind medical developments or resisting practice changes may have standards of care that differ from national standards, yet would be considered “standard” within that jurisdiction (52). Consider the following hypothetical:

In 2020, double-reading of every mammogram becomes widely accepted among the states as necessary for reliable early detection of breast cancer in women. Mammographers in State X continue to generally maintain a single-reader system for analyzing mammograms. Plaintiff P receives a mammogram in State X, which is read and diagnosed as negative by mammographer M. One month later Plaintiff P is diagnosed with breast cancer, and brings a civil suit in State X against mammographer M for failure to meet the double-reader standard of care. State X operates under the locality rule.

Plaintiff P will be unable to show deviation from the standard of care because general practice within the geographic region of State X is a single-reader system. Although the hypothetical overstates the likelihood of medical practitioners collectively lagging behind medical advances, it emphasizes the potential for physicians practicing substandard or different levels of care to set local practice standards (52). “Local practice patterns are no longer a consideration with respect to the skill, learning, and clinical competence of the physician” (52). If AI diagnosing proves demonstrably superior to standard preventative care, it will present an especially compelling argument for the complete elimination of the locality rule.

Trying to track legal precedent only further complicates the jobs of attorneys tasked with advising clients on the standards applicable to AI tools. Legal precedent can be either binding or nonbinding (53). A prior case is legally binding if it is factually similar to the case at bar, i.e., under current consideration, and if the prior decision was issued by the same court or a superior court within the hierarchy in which the current case is being heard7. All other precedent is non-binding, but may still be considered persuasive to the extent that courts are influenced by the legal arguments and conclusions made by other jurisdictions (53). The tiered nature of the U.S. judiciary makes predicting outcomes difficult, because the same case could be decided differently depending on the court in which the case is heard.

Liability in a med-mal claim

Hospitals and provider networks are often named as defendants in malpractice lawsuits in addition to individual physicians. Hospital and physician networks may be held liable for the malpractice of their employees on a theory called “respondeat superior”, literally meaning “let the master answer”. Because hospitals are better situated than patients to ensure that the standard of care is being met by the physicians employed to provide treatment, hospitals may be held liable for negligent acts when performed within the physician’s scope of employment. Hospitals may also be directly sued for malpractice, without necessarily involving the physician (54). Additionally, hospitals and other network providers have a responsibility to use reasonable care in hiring, training, and supervising employees, as well as for maintaining adequate facilities.

In many states, the more common way of holding hospitals responsible for physician negligence is on a theory of agency by estoppel. Hospitals are increasingly employing physicians as independent contractors for medical services, rather than directly employing physicians on the hospital staff. Because employers have no right to control the performance of labor or services by independent contractors, hospitals technically are not liable for negligence performed by independent contractor physicians working on hospital premises (55). Courts recognize, however, that a patient arriving at a hospital for treatment may be unaware that the treating physician has no direct employment relationship with the hospital. Agency by estoppel allows a patient to recover from a hospital on the basis that the patient was given the impression that the treating physician was employed by, and acting on behalf of, the facility in which the services were performed. If the hospital “holds itself out to the public as a provider of medical services and … the patient looks to the hospital, as opposed to the individual practitioner, to provide competent medical care”, then the hospital may be liable under a theory of agency by estoppel (55).

Medical device manufacturers may also be exposed to liability if a product defect causes or contributes to a patient’s injury. Patients injured by devices deemed “unreasonably dangerous by virtue of a physical flaw, a design defect, or a failure of the manufacturer to warn of the danger or instruct on the proper use of the product as to which the average consumer would not be aware” may recover directly from the manufacturer (56). A product is defectively designed “if the foreseeable risks of harm posed by the drug or medical device are sufficiently great in relation to its foreseeable therapeutic benefits” such that reasonable providers would not prescribe it to “any class of patients” (57). Warnings or instructions are inadequate if they fail to reasonably disclose risks “to prescribing and other health-care providers who are in a position to reduce the risks of harm” (57).

Not all medical products are vulnerable to suit, however; products or devices with pre-market approval by the FDA that comply with reporting and manufacturing standards receive limited immunity from liability for personal injury (58). “Device manufacturers who received FDA approval after extensive review want to avoid repeating the review process in the courts” (59). The administrative liability system reflects a governmental desire to avoid legal disincentives for exploring novel medical treatment methods, balanced against necessary justice for wrongfully injured patients. FDA approvals protect against certain claims regarding device performance, but will not shield manufacturers against claims establishing violations of other federal standards or regulations; misrepresentations during the FDA approval process may void limited immunity (60). Although the FDA has approved CADe as a second-reader system, it is unclear whether a solo AI diagnosis tool might obtain similar approvals.

Medical devices are further protected from liability through what is known as the “learned intermediary doctrine”, which states that a trained healthcare professional, and not the product, makes ultimate care decisions for the patient. Hardware or construction defects remain the responsibility of the manufacturer, but liability for the end application of the product lies with the physician. It is therefore the provider’s duty to inform the patient of all risks associated with the product. Failure to do so generally leaves the provider, not the manufacturer, open to liability for any resultant injury.

Determining the standard of care

Establishing the standard of care usually requires testimony of at least one, often more, expert witnesses. Experts are individuals with specialized knowledge who can testify as to whether the provider met the relevant standard of care in their treatment of the patient. Experts testify as to the appropriate standard of care “through reference to a published standard, discussion of the described course of treatment with practitioners outside the District at seminars or conventions, or through presentation of relevant data” (61). Although experts will often disagree, the mere fact that the plaintiff’s expert may use a different approach is not considered a deviation from the recognized standard of medical care. Nor is the standard violated because the expert disagrees with a defendant as to what is the best or better approach in treating a patient. Medicine is an inexact science, and generally qualified physicians may differ as to what constitutes a preferable course of treatment. Such differences due to preference do not amount to malpractice (62).

Despite views among the AMA and medical professionals of juries as being “incompetent, antidoctor, (and) irresponsible in awarding damages to patients … several decades of systematic empirical research yields little support for these claims” (47). That said, negotiation of med-mal claims is based at least partly around lawyers’ notions of probable jury awards, leaving a significant portion of claim resolutions and final settlement amounts dependent on public perception (47). Unfortunately, public knowledge regarding breast cancer and mammography is lacking. Common misperceptions include beliefs that annual mammograms will always diagnose or prevent breast cancer, that only those with family histories need mammograms, and that any delay in diagnosis decreases survival odds (63). One study found “half of women favoured financial compensation for missed cancers even if the cancer was missed solely because of the failure rate of the test” (64). Another concluded, “Women overestimate their probability of dying of breast cancer by more than 20-fold and the value of screening mammography in reducing that risk by 100-fold” (65). Social emphasis paid to breast cancer without corresponding education encourages misinformation, entrenching high damage awards resulting from juror misperceptions of mammography (47).

Various efforts have been made to protect practitioners against frivolous lawsuits, one of the most prominent being statutes of limitations restricting plaintiffs’ ability to recover once a specified period of time has elapsed following the allegedly negligent treatment. The statute of limitations period varies by state but generally falls between 1 and 3 years, and is strictly enforced. Some jurisdictions delay the start of the limitations period until the plaintiff becomes aware of the malpractice, in an effort to encourage prompt filing of claims without unjustly preventing compensation to injured patients. Contributory negligence by the plaintiff, such as failure to follow physician instructions or neglect of follow-up procedures, may bar recovery in part or in full when the plaintiff’s neglect is also a proximate cause of the injury (66). Statutes of limitation are the subject of significant criticism by those who consider them overly protective of the medical industry.

Many organizations promulgate Clinical Practice Guidelines (CPGs), “statements that include recommendations intended to optimize patient care” through uniform standards based on systematic review of medical literature and research (67). CPGs may be created or adopted by health insurance providers, health maintenance organizations, professional medical societies, state governments, and hospital networks. CPGs are not universally admissible in court proceedings because standards often conflict with one another and quickly become outdated through new medical research. Guidelines are also often biased depending on the promulgating agency. Groups without fiduciary obligations to patients, such as private health insurance and pharmaceutical companies, might use CPGs to favor corporate interests at the expense of the patient (41). Other CPGs expressly disclaim any use in determining standards of care (68).

Regardless of the rule implemented by a given jurisdiction, various health care professionals are held to a heightened standard of care. These individuals are considered specialists by virtue of their more extensive training in particularized fields. The American Board of Medical Specialties currently recognizes and certifies physicians in 24 distinct specialties, including neurology, anesthesiology, cardiology, ophthalmology, and radiology (69).

The standard of care in mammography, a subspecialty of radiology, is guided both by state common and statutory law and by federal law. Any facility performing mammograms is required to meet the standards of the Mammography Quality Standards Act, passed by Congress in 1992. The Act includes minimum qualification requirements for practitioners, certification guidelines for acceptable radiologic equipment and mammography facilities, and policies for communication and storage of mammography results (70).

The standard of care in mammography is less defined than in other areas of medicine due to the fact that mammograms often appear to be inconclusive, with one in five cancers missed and frequent false positives (71). No court rulings have emerged as controlling for malpractice cases involving mammogram misdiagnoses, and consideration of negligence tends to turn on testimony by expert witnesses appealing to juror perceptions. Accordingly, delays in diagnosis have become one of the leading considerations in U.S. breast cancer-related med-mal claims. Such diagnosis-related delays encompass both greater turnaround time between a patient having a mammogram and the radiologist providing analysis results, and delayed treatment due to failure to identify cancer in the mammogram. One study analyzed 370 breast cancer med-mal cases from 2005 to 2015, and found delay in diagnosis to be the “most common reason for alleged negligence, cited in 79% of cases” (72). Whether a plaintiff suffered increased harm due to delay is almost universally a question of fact submitted for determination by the jury.

Other leading factors included deviation from the standard of care (60%); improper test and imaging interpretation (39%); delayed treatment (28%); failure to order a biopsy (23%); wrongful death (21%); failure to refer to a surgeon (13%); and lack of informed consent (8%)8 (72). Radiology and mammography are frequently among the specialties and procedures most often involved in med-mal claims, resulting in defensive medicine and fewer radiologists willing to evaluate mammography results. Waivers of liability, unnecessary supplemental care and follow-up procedures, excessive referrals, reduced or refused care, and avoidance behaviors are all evidence of defensive medicine driven by practitioners’ fear of malpractice actions (73).


AI and the modern malpractice framework

The current medicolegal framework is incapable of justly assigning liability in cases of injury resulting from AI misdiagnoses. As discussed earlier, modern liability standards are founded on principles of agency, control, and foreseeability: a party capable of predicting and preventing an avoidable harm is responsible for compensating an injured party to whatever degree necessary to remedy the harm done (4). AI diagnosing presents a range of complications that will prove difficult to resolve through current notions of liability. If the justice system’s aim is to assign liability only to parties with insight into and control over the negative outcome, then it is difficult to justify apportionment of liability to AI users who lack control over the ultimate “black box” diagnosis. Questions of liability similarly cloud the malpractice formula. Distribution of liability amongst numerous handlers when no single party provides the diagnosis complicates court analysis. More importantly, it is difficult to measure violations of the standard of care in the context of untested AI tools. Existing legal theories reflect some consideration of these issues, but not to the degree required to provide courts with a clear set of standards for evaluating claims and apportioning liability.

The black box dilemma

The lack of AI law and legal precedent within the current med-mal system would render any assignments of liability for AI error irresponsible and uninformed. The foremost obstacle to AI’s integration into existing notions of liability is that AI’s current nature as a black box precludes post hoc rationalization of its diagnoses. For AI to be incorporated into the existing medicolegal structure, recovery must be based on traditional tort principles. Yet the professed advantage of AI, and indeed one of the fundamental drivers behind AI’s implementation in diagnostic radiology, is AI’s freedom from human presuppositions. Concepts of agency, control, and foreseeability collapse when attempting to apportion liability for decisions made by a black box device. “The more autonomy machines achieve, the more tenuous becomes the strategy of attributing and distributing legal responsibility for their behavior to human beings” (74).

Yavar Bathaee succinctly explains the implications of analyzing AI liability under the tort system’s intent-based approach: “the implications of this inability to understand the decision-making process of AI are profound for intent and causation tests, which rely on evidence of human behavior to satisfy them. These tests rely on the ability to find facts as to what is foreseeable, what is causally related, what is planned or expected, and even what a person is thinking or knows … If an AI program is a black box, it will make predictions and decisions as humans do, but without being able to communicate its reasons for doing so … This also means that little can be inferred about the intent or conduct of the humans that created or deployed the AI, since even they may not be able to foresee what solutions the AI will reach or what decisions it will make”9 (10).

Application of the tort system to AI would not only disserve the justice system, but might have reverberating effects on how society views liability for injury without demonstrable intent or causality at play.

A more thorough consideration of liability informs understanding of the black box dilemma. Every medical procedure involves numerous potential actors: hospitals or clinics providing facilities and medical professionals; physicians performing the procedure; support staff assisting physicians; manufacturers supplying medical equipment; and administrators employed by hospitals, insurance providers, and other associated entities. In instances of malpractice, the sheer number of individuals and entities involved in a patient’s medical care may leave plaintiffs uncertain who to sue.

Plaintiffs have increasingly turned to “shotgun” suits as a method for overcoming uncertainty10. In a shotgun suit, attorneys name any and all of the potentially liable parties mentioned above as defendants to the med-mal action. Filing a complaint allows plaintiffs to compel production of specified records through subpoena power, and shotgun suits minimize the risk that a plaintiff will discover negligent defendants after the statute of limitations date has passed. Not every party sued in a shotgun suit is liable; many are dismissed during discovery and before settlement is even discussed. Shotgun suits might balance the playing field for plaintiffs fighting against filing deadlines and uncooperative physicians, but they also result in unnecessary legal expenses for the non-liable parties. In an ideal world, plaintiffs save time and legal fees by suing only the responsible parties. Given the admittedly limited-to-nonexistent control physicians, hospitals, or even AI manufacturers exert over the machine’s diagnosing, it may be unreasonable to hold them liable when an error surfaces.

Physician liability for AI error

Despite physicians’ stereotypically frontline role in med-mal suits, assigning liability for AI errors to individual physicians would be irrational. Standard med-mal claims name individual physicians as defendants, generally targeting doctors who had personal involvement in the patient’s treatment. According to current mammography procedures, the radiologist who read the patient’s mammogram would reasonably expect to be named as a defendant when the patient has a negative outcome. Conversely, AI presents no figure clearly deserving of blame. The technician who administered the mammogram? The staffer who uploaded the image to the machine? The physician or family doctor who conveys the results to the patient? Keeping in mind that our analysis assumes perfection of test administration and consideration of all relevant clinical and familial factors, none of these parties truly play any role in the misdiagnosis itself.

Patients generally obtain mammograms and other forms of preventative care at their primary care provider’s direction, but the actual screening is performed by clinicians, who conduct a physical exam, obtain a family history, and then administer the mammogram. Clinical technicians, nurses, and administrators in clinical imaging centers similarly play little to no role in diagnosing, and, short of forwarding the image results to the radiologist or conveying diagnoses to patients, will not interact with the results. Radiographers administering the mammogram likely have no say in what method is used by radiologists to evaluate the mammogram, and may not even be aware if AI plays any role in the process. Normally, mammograms are sent to radiologists for diagnosis and are returned to referring clinicians or the primary care provider. If AI diagnosing is implemented, there may never be a radiologist connected to the actual mammogram, leaving only primary care providers or clinicians involved in the test.

It should be noted that if AI is implemented as a double-reader where both a machine and a radiologist evaluate an image, the radiologist’s liability will not be decreased by virtue of the machine’s involvement. Any image reviewed by the radiologist will be accompanied by traditional standard of care expectations. Using AI as a second reader, even where AI is presumed to be more accurate and reliable than a human radiologist, will not eliminate expectations the med-mal system places upon the radiologist.

Hospital liability for AI error

Hospitals should be prepared to retain a portion of liability for negative outcomes resulting from AI misdiagnoses. In theory, a network utilizing AI diagnosing might be held liable for failure to exercise due care in selecting AI “employees”. Hospitals would assert that they cannot be expected to demonstrate due care to screen AI when, practically, there is no way for the hospital to evaluate the machine’s methods. Knowledge of the machine’s reliability will often fall to representations by the manufacturer, and perhaps a sample set of tests confirmed by network staff. Functionally, however, hospitals still have the capacity to test AI efficacy prior to relying solely on AI for mammography reading, such as by implementing AI as a double-reader during a trial period. It is likely hospitals will retain some degree of responsibility for appropriate screening of AI diagnostic tools.

There are, however, strong policy rationales discouraging applications of strict liability to healthcare providers for AI error. Strict liability is justified on the basis that actors are engaged in inherently dangerous conduct, and are aware of the liability, legal framework, and standards surrounding their actions. Yet health care providers lack a cohesive set of standards governing AI use in medical image diagnosing. Furthermore, strict liability may only further deter innovation by small market participants. Fortune 500 corporations such as Amazon, Microsoft, Google, and IBM already “account for 40% of open AI positions” in the job market (75). Strict liability for AI injuries “would favor established and well-capitalized participants in the field and erect significant barriers to entry and innovation” (10).

Hospitals are the foremost area in which new medical technology is tested, and will likely be the venue for the first widespread AI implementation. Each of the top five hospitals in the 2019–2020 Best Hospitals Honor Roll (76) is currently exploring potential medical AI applications, ranging from automated immunotherapy and molecular sequencing at the Mayo Clinic Center to advanced patient management systems at Johns Hopkins Hospital (77). In 2016, Massachusetts General Hospital announced a partnership with NVIDIA for the on-site installation of a DGX-1 supercomputer, a deep learning machine valued at more than $129,000 (78). The program will be trained on the 10 billion medical images contained in the hospital database for applications in radiology and pathology (79). In the same year, the Cleveland Clinic Foundation announced a partnership with Microsoft to integrate Cortana into the Foundation’s eHospital system to “utilize predictive and advanced analytics to identify potential at-risk patients under ICU care” (80). CCF uses the eHospital program to expedite records access and remotely monitor and communicate with ICU patients across the main campus and three regional hospitals (81). Blanket liability would discourage research and development efforts like these, and render significant portions of current medical AI projects irrelevant (10).

Practically, the high likelihood of hospital liability leaves hospitals with two choices. The safe and likely popular option will be to retain large radiologist staffs and avoid AI until the surrounding medicolegal issues are resolved. Many smaller hospitals lack the financial resources to thoroughly test new technology and simultaneously cover resultant liability. A few hospitals will instead select the risky but potentially lucrative route of implementing AI. These hospitals will do so with the knowledge that lawsuits are virtually guaranteed, and the lack of precedent would leave hospitals at the mercy of the court and public opinion. From a policy perspective, assigning blanket liability to hospitals may stifle research and development in a field that arguably holds an important role in the future of medicine.

Manufacturer liability for AI error

Manufacturers will be one of the first groups plaintiffs target when AI-related injuries arise because plaintiffs may find it easier to apportion liability to the manufacturer responsible for programming and distributing the AI system. At least as AI diagnosing stands currently, manufacturers are in the best position to understand why the machine arrives at certain conclusions. As compared against other defendants in a med-mal lawsuit, programmer/manufacturers are likely the most capable of analyzing and correcting causes of diagnostic errors. Even the small subset of programmers specializing in AI deep learning algorithmic models who are qualified to testify in court as expert witnesses would be hard pressed to rationalize a diagnosis, let alone in a way understandable to the average judge or juror:

The only possible description of such a model’s decision-making is a mathematical one, but for lawyers, judges, juries, and regulators, an expert may be required to describe the model mathematically, and in many cases, even an expert is unlikely to be able to describe (mathematically or otherwise) how the model is making decisions or predictions, let alone translate that description for a regulator or fact finder (10).

Even were doctors and hospital personnel versed in the complicated programming involved in deep learning AI algorithms, manufacturers will almost certainly try to protect AI against tampering as a trade secret through product security, copyright and intellectual property protections, and nondisclosure agreements (82,83).

Despite the logic of holding manufacturers accountable for their products, there are various procedural safeguards (or loopholes, depending on perspective) that normally would shield manufacturers from lawsuits. Even if liability is assigned to manufacturers by the courts or through legislative action, it would not be under a med-mal framework. Lawsuits against medical device manufacturers are not traditionally med-mal claims, and instead are brought under a theory of defective device design focused on whether or not the product was reasonably safe. As discussed earlier, defective design claims are subject to an entirely different legal and regulatory framework, and manufacturers with FDA approval for medical AI machines may be partially or entirely shielded from liability.

AI’s nature as a software product further complicates holding AI manufacturers liable for patient injuries caused by inaccurate or erroneous diagnoses. Courts have readily attributed liability to manufacturers where the injury was directly related to a physical component of the device (84). Medical software, on the other hand, is considered “technology that helps healthcare providers make decisions by providing them with information or analysis” (85). Based on the learned intermediary doctrine, the medical software distinction places medical decisions firmly in the court of healthcare professionals utilizing the product. The learned intermediary doctrine, however, would be difficult to rationalize in the context of AI diagnosing where a machine, rather than a human individual, produces a diagnostic conclusion. Neither FDA approvals nor the learned intermediary doctrine clearly apply to diagnostic AI products, leaving manufacturers facing the same uncertain liability exposure confronting service providers. Increased products liability may be the logical direction courts turn when human physicians play decreased roles in product usage and patient diagnosing.

Legal process for an AI-related med-mal claim

Revolutionary technology such as AI defies the standard of care measure around which our medicolegal system is built. It is a given that machine diagnosing tools will not be 100% accurate—there are going to be mistakes. Despite relatively low percentages of injured patients bringing lawsuits (only an estimated 1 in 25 patients with a viable claim brings suit), malpractice claims are inevitable from among the misdiagnosed patients (47). Even assuming 100% AI accuracy, there will still remain a number of baseless suits that must traverse the legal process (86). Under the existing med-mal framework, defendants will be burdened with demonstrating that the machine met the standard of care, despite the alleged misdiagnosis.

The difficulty that will be faced by AI manufacturers and medical providers, and which has also impeded many other revolutionary medical practices, is the time lapse between the viability of a new practice and the practice becoming so widely accepted that it becomes the standard of care. Some estimates place the average lapse between new research and widespread professional acceptance at seventeen years (87). Sometimes the acceptance period can be significantly longer, as discussed above in the history of mammography. Until AI diagnosing is widely accepted by the radiological community, AI’s first users will bear heightened liability for errors.

The lack of an expert witness equivalent for AI further highlights the inadequacies of the legal process for handling machine misdiagnoses. The black box problem precludes anyone, including original programmers, from testifying with certainty as to the machine’s rationale. Though several alternatives to expert testimony exist, each would represent a fundamental departure from traditional notions of the expert witness.

One option is to make a case-based example using the machine’s diagnostic history (88). In any med-mal claim alleging an AI misdiagnosis, the plaintiff will need to demonstrate the presence of indicators in the patient’s breast images such that the breast cancer should have been diagnosed. To do so, the plaintiff will presumably summon expert witnesses to identify what sections of the breast images, in the professional’s opinion, indicate some form of abnormality that merits either a positive diagnosis or clinical follow-up. Assuming the machine is pre-programmed with the capability to search its databank of breast images from previously diagnosed patients for traits similar to those identified by the plaintiff’s experts, the machine should be capable of presenting past cases similar to the plaintiff’s. Under the “nearest neighbor” method, the AI simply applies its image comparison capabilities to the input features of previous cases to identify similarities (88). The machine’s accuracy in any similar cases may be determined by comparing the machine’s diagnoses against the patients’ outcomes. Comparing the machine’s gross accuracy across all previous similar cases yields a measure of whether the machine’s diagnosis of the plaintiff was reasonable relative to similarly situated patients. Evidence that the machine’s reasoning was correct in most analogous cases may substantiate that the standard of care was met, replacing the role of expert witnesses for the defense.
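For illustration only, the following sketch shows one way a “nearest neighbor” retrieval over a databank of previously diagnosed cases might be structured. The interfaces, feature representation, and names (e.g., CaseRecord, nearest_neighbors) are hypothetical and do not describe any particular vendor’s system.

```python
# Hypothetical sketch of the "nearest neighbor" evidentiary method:
# retrieve previously diagnosed cases whose image features resemble the
# plaintiff's, then report how often the machine's call matched the
# eventual patient outcome in those analogous cases.

from dataclasses import dataclass
from typing import List
import numpy as np

@dataclass
class CaseRecord:
    features: np.ndarray   # feature vector extracted from the prior mammogram
    machine_call: str      # "positive" or "negative" as diagnosed by the AI
    outcome: str           # ground-truth outcome established by follow-up

def nearest_neighbors(query: np.ndarray, databank: List[CaseRecord], k: int = 25) -> List[CaseRecord]:
    """Return the k prior cases whose features are closest to the query image."""
    distances = [np.linalg.norm(query - case.features) for case in databank]
    ranked = np.argsort(distances)[:k]
    return [databank[i] for i in ranked]

def accuracy_among_analogues(query: np.ndarray, databank: List[CaseRecord], k: int = 25) -> float:
    """Fraction of analogous prior cases in which the machine's call matched the outcome."""
    neighbors = nearest_neighbors(query, databank, k)
    correct = sum(1 for c in neighbors if c.machine_call == c.outcome)
    return correct / len(neighbors) if neighbors else float("nan")
```

Under this sketch, a high accuracy figure among the retrieved analogues would support the argument that the machine’s handling of the plaintiff’s images was reasonable; a low figure would cut the other way.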

The problem with this option is that the machine would, in theory, be playing the role of both defendant and expert witness. Rules of evidence at the state level vary by jurisdiction, but the Federal Rules of Evidence do not contain any formal prohibitions on a defendant serving as his own expert as long as the individual demonstrates sufficient “knowledge, skill, experience, training, or education” within the given field (89). Practically, however, doing so would present problems for the defense. The defendant is clearly biased, and the jury would certainly account for potential conflicts of interest. Rules of evidence might also obstruct admission of nearest neighbor testimony, depending on the jurisdiction. The nearest neighbor method additionally fails to overcome juror doubts of machine efficacy, as compared against human professionals. In the end, the nearest neighbor method might prove useful as supporting evidence and play a role in evaluating diagnostic efficacy, but may be less than ideal as a defendant’s substitute for expert witnesses.

Alternatively, the standard of care could be met by running the mammogram through several other AI diagnosing programs to demonstrate correlating results. AI “cross-testing” would require market presence of several distinct AI-based diagnostic tools with unique base learning datasets, thereby allowing each AI to reach its conclusions independently. If litigation arises in the first few years of AI implementation before various manufacturers have entered the medical arena, it is unlikely cross-testing will be an option for the first courts tasked with setting AI accountability standards since few alternative AI tools may be available. Using machines as experts further highlights the challenges of technological capabilities surpassing human physicians, a dilemma taken very seriously by medical professionals (90). Transitioning to a machine-based standard of care might be one step further than the courts, and perhaps even the medical community, are prepared to go (91).
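As a rough sketch of what cross-testing might involve, assuming several independently trained models that expose a common predict interface (all names here are hypothetical), a defendant could report the level of concordance with the original machine’s call on the disputed image:

```python
# Hypothetical cross-testing sketch: submit the disputed mammogram to several
# independently trained diagnostic models and report how many concur with the
# defendant machine's original call.

def cross_test(image, defendant_call: str, other_models: list) -> float:
    """Fraction of independent models whose call matches the defendant machine's."""
    calls = [model.predict(image) for model in other_models]   # e.g., "positive" / "negative"
    if not calls:
        return float("nan")
    return sum(1 for c in calls if c == defendant_call) / len(calls)
```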

A third option is to disregard the black box problem entirely and have AI programmers take the stand as expert witnesses to opine on the machine’s “rational” process. Although testimony by programmers might seem the most reasonable option, it should not satisfy the burden of proof required of expert testimony. Programmers, unable to claim any degree of certitude in their professional opinions, are unlikely to prove useful to judges or jurors who lack a sophisticated understanding of AI algorithmic functions. Another option is to have several radiologists evaluate the machine’s findings and either confirm or deny the results. Of course, AI diagnosing is justified on the basis that it transcends the image-reading skills of human radiologists. Evaluating AI diagnoses according to a human standard defeats the purpose of the expert witness and nullifies any benefits AI delivers to the medical profession.

Jurisdictions practicing the locality rule present a more localized obstacle to implementation of AI diagnostic methods. Until AI diagnostic tools become an obligatory component of preventative treatment, providers will retain the discretion to forego AI implementation. Providers will face a choice: either implement AI diagnosing, or wait for legal challenges to be resolved. If a majority of providers in a geographic area opt to “wait and see” what becomes of AI, the handful of providers utilizing AI services will be forced into the “minority locale”. Under the locality rule, providers in these minority locales might face heightened liability solely because the providers in their town or state are not uniformly in support of AI. On the other hand, under a national standard experts may testify as to whether AI meets a broad national standard of care, protecting these early adopters against heightened liability.


Looking to the future—addressing the malpractice model’s gaps

Although AI implementation in health care is still in its early phases, the AMA has anticipated the complexities of incorporating technology into clinical practice. In 2018, the AMA issued a new policy committing the AMA to:

  • Leverage its ongoing engagement in digital health and other priority areas for improving patient outcomes and physicians’ professional satisfaction to help set priorities for health care AI.
  • Identify opportunities to integrate the perspective of practicing physicians into the development, design, validation, and implementation of health care AI.
  • Promote development of thoughtfully designed, high-quality, clinically validated health care AI that:
    • Is designed and evaluated in keeping with best practices in user-centered design, particularly for physicians and other members of the health care team;
    • Is transparent;
    • Conforms to leading standards for reproducibility;
    • Identifies and takes steps to address bias and avoids introducing or exacerbating health care disparities including when testing or deploying new AI tools on vulnerable populations; and
    • Safeguards patients’ and other individuals’ privacy interests and preserves the security and integrity of personal information.
  • Encourage education for patients, physicians, medical students, other health care professionals, and health administrators to promote greater understanding of the promise and limitations of health care AI.
  • Explore the legal implications of health care AI, such as issues of liability or intellectual property, and advocate for appropriate professional and governmental oversight for safe, effective, and equitable use of and access to health care AI (92).

Although AMA guidelines do not mandate particular policy actions regarding AI use in clinical care, they do inform considerations of the varied obligations of the medical community.

Machine diagnosing has the potential to provide significant health benefits, but will require proactive efforts by providers to address legal complications. The primary focus of every healthcare provider should not be on actions in court after a med-mal claim has been initiated, but should instead be on preventing patient injury. Preventative action serves the dual purpose of minimizing patient injury, and insulating providers against potential liability. The fact that no comparable legal cases or standards exist to guide evaluation of AI in health care only reinforces the importance of minimizing liability for providers.

Recognizing the varied commitments of the healthcare community and the lack of substantive court precedent, the following recommendations seek to guide AI implementation in health care in a manner which (I) upholds physicians’ obligations to patients; (II) minimizes provider liability; and (III) encourages the development and improvement of the medical profession in conformity with ethical obligations articulated by AMA policies.

Educational programs for clinicians and the public

A primary focus should be aggressive education campaigns providing information on how AI diagnosing works, its performance relative to radiologists, and the advantages its widespread application offers society at large, along with increased education regarding breast cancer and the actual role mammography plays in early detection. Many individuals may hold negative perceptions of AI applications within the health field, fueled by a variety of sources—fear or misunderstanding of technology; distrust of machines due to pop-culture’s portrayals of AI; concern regarding quality of care; and even accusations that AI applications only serve physicians’ pocketbooks at patients’ expense. Even if AI misdiagnoses become less common than radiologist misdiagnoses, computer-driven injuries will draw more attention and public backlash (93). “Misunderstandings about what AI is and is not could fuel opposition to technologies with the potential to benefit everyone” (93). Educational programs will help reassure current and future patients that providers are meeting the standard of care, and that AI has the potential to not only meet, but to surpass, the current standard of care in particular applications.

Educational programs have the added benefit of increasing general knowledge regarding the function of diagnostic tools, which could play a crucial role in correcting the flawed perceptions held by the “misinformed” juror (64). Barratt et al. suggest that the “enthusiastic way in which mammographic screening has been promoted, often with limited acknowledgement of the potential for both false negative and false positive results” has contributed to a common perception among women that mammography should identify cancer 100% of the time (64). The misinformation phenomenon is common around the world, and reflects a critical need for accurate information to inform patient decisions regarding screening (64).

Educational programs should not be limited to the public, but should also be provided to the medical community, especially radiologists. For AI to meet the standard of care (assuming the traditional understanding of the “standard of care” would be applied to AI machines in future litigation), it needs to be accepted by the radiologic community. AI may be hard-pressed to obtain mass assent, especially in the face of physicians’ fears for their employability. Assurances that AI will initially be applied only to weed out the easy negatives would go a long way towards smoothing over fears regarding AI’s implications for the radiologic profession.

Aggressive research and testing of AI tools should accompany the programs. Data collection becomes especially relevant once providers and manufacturers begin to seek approval by CMS, private health plans, and the medical community for market use. Educational programs will also assist the medical community in confronting the various ethical complications that will arise as AI is increasingly incorporated into daily practice (94). Educational programs should be carefully presented to avoid the perception of coercion or forced conformity. Radiologists will be more likely to react positively and with open minds if AI is presented as a viable advancement worth considering, rather than an obligatory step forward.

Training programs for radiologists and clinicians will also become necessary as AI tools are integrated into clinical practice. The basics of AI algorithmic functioning, appropriate applications of AI, and ethical implications of AI use will all become important points of consideration in training (33). At a minimum, clinicians will need to be well-versed in machine functioning so as to fulfill disclosure obligations about the nature of the test and ways in which patient data may be used in future data sets (33). Educational programs are unlikely to reach every individual, and clinicians will be responsible for informing any uneducated patients.

It should be noted that as personal health data grows to govern ever larger portions of treatment decisions made for or by individuals, health professionals have a correspondingly more difficult time staying informed. For example, radiologists already may face difficulty staying abreast of the various experimental mammography techniques discussed earlier. Expecting radiologists to additionally become fluent in AI functional processes may prove a significant burden on medical professionals whose time would be better dedicated to treating patients and improving standards of care. Fiske and colleagues express concern that health professionals no longer have the time or training necessary for guiding patients through the modern medical “data jungle” (95). Fiske proposes health information counselors (HICs) as a solution. HICs would be trained in data analysis and analytic skills, be knowledgeable about health management and insurance systems, and be familiar enough with clinical medicine to advise patients on the role of personalized data in prevention, diagnosis, and treatment (95). Specifically, HICs might be well-placed to educate patients on how AI diagnostic systems work and the implications for personalized treatment plans. In short, the HIC could function as a personal intermediary between the patient and the health system, potentially alleviating some of the pressure placed on medical professionals as AI inexorably entrenches itself within the medical field.

There are various obstacles, however, to a position akin to the HIC envisioned by Fiske filling the role of “patient educator”. First, while HICs may very well prove a beneficial addition to medicine, it is not clear that such a role is an essential predicate to the introduction of AI systems in health care. It is certainly true that an HIC might prove an invaluable resource as an intermediary between patients and medical professionals, but that does not necessarily eliminate the radiologists’ obligations to be educated on AI. As a result of the heightened liability accompanying the first introductions of AI into medical care, radiologists will certainly be incentivized to thoroughly educate themselves on AI efficacy and functional performance before implementation. Yet even if AI becomes a generally accepted standard of care, physicians will still be obligated to obtain informed consent from patients, which necessarily involves the ability to explain the systems to be used in treatment. Although shifting this educational burden from medical professionals to some form of HIC may prove advantageous, it may also necessitate a restructuring of the obligations placed upon physicians for obtaining informed consent from patients. Additional complications might include whether private or public insurance would cover HIC expenses; whether HICs operate independently of specific medical service providers, or are provided in-house to patients; and whether HICs will replace physicians as the “family doctor” face of medicine the patient interacts with. Introduction of HICs to health care may prove valuable both in AI-related education and in health management at large, but implicates broader structural and policy changes than this analysis is capable of addressing.

Providers can also engage in lobbying efforts to have state legislatures implement guidelines governing AI’s role in medical diagnosing. Potential regulations include standards for use of an AI program, such as a mandatory statistical accuracy threshold; public registration systems disclosing to patients which providers utilize AI diagnosing and how it is implemented; disclosure of how personally identifiable information is stored and whether patient mammograms will be used in future AI training datasets; and rules on whether mammogram results may be communicated by automated phone or text message, or must be communicated in person. Given the prominent position breast cancer holds in public awareness, governments may even be incentivized to invest public research funds and grants into the growth of AI diagnostic tools. Public support for the program will certainly assist in advancing policy objectives.

Obtaining efficient regulation of AI is easier said than done. More than sixteen separate federal agencies regulate areas of the economy involving AI use (93). Regulation is often reactionary, especially in the field of technology where legislation lags behind technological advances. The Standing Committee of the One Hundred Year Study of AI notes that “research and deployment have been slowed by outdated regulations and incentive structures,” slowing exploration of the applications AI holds for health care (93). Further, the Committee concludes “inappropriate regulatory activity would be a tragic mistake. Poorly informed regulation that stifles innovation, or relocates it to other jurisdictions, would be counterproductive” (93). By keeping the relevant agencies and legislating bodies apprised of research and development progress involving AI and medicine, ideally the healthcare industry will be able to guide the eventual regulatory system to be more informed and effective. Consistent efforts to inform these parties may further delay mandatory regulatory structures, to the extent that agencies and legislators are reassured that providers and manufacturers are aware of the risks and are collectively working to address them.

Special attention in education campaigns should be directed towards jurisdictions practicing the locality standard for determining the standard of care. As discussed, the locality standard presents a unique risk to practitioners in areas less willing to adopt new techniques. Physicians interested in AI will be forced into the choice of either incorporating new medical techniques into everyday clinical practice in the face of heightened liability, or conforming to local constructions of the standard of care by foregoing revolutionary medical techniques. Although it is too early to speculate what level of acceptance the radiology community will demonstrate towards AI diagnosing of the scale discussed herein, aggressive work by the AMA and professional radiology associations will be essential for ensuring that individual clinicians do not face adverse consequences as a result of professional resistance.

Minimizing legal risk to healthcare providers

An attempt to fit AI diagnostic tools into the current medical liability scheme would contort the med-mal system. Without creating unique standards designed for governing AI, manufacturers, providers, and lawyers will lack a framework to evaluate liability for future AI innovation. In the worst case scenario, courts either label AI diagnosing as an unacceptable healthcare methodology, or place such significant restrictions on AI’s applications that functionally AI becomes more trouble to implement than it is worth. A judicial rule banning machine diagnosing would leave state or federal legislation the only path for AI implementation. Lobbying for legislation forces providers into the role of obtaining approval prior to implementation, and states are just as likely to ban medical AI as they are to embrace it. Extensive judicial restrictions would likewise curtail the medical community’s ability to introduce AI at a self-determined pace, and might similarly require corrective legislative action to bypass. Gradually introducing AI tools prior to judicial rules or legislation gives providers the advantage of time to sway public opinion and demonstrate AI efficacy, rather than retroactively fighting against misinformed presuppositions.

Either way, it is in the interest of AI manufacturers and users to prevent an AI misdiagnosis case from reaching the judiciary before AI has become the standard of care and before states have the chance to proactively pass legislation governing its use. The reality is that the first implementers of AI diagnostic tools will face tremendous liability, but not because the patient’s injuries are incalculable. Courts have been calculating damages for decades, although whether malpractice awards are reasonable is a separate discussion. The real danger lies in a med-mal claim involving an AI misdiagnosis arriving in court before adequate standards have been crafted for evaluating it. Courts will readily impose standards on the medical profession, even when self-governance rules are in place: “In most cases reasonable prudence is in fact common prudence; but strictly it is never its measure … Courts must in the end say what is required; there are precautions so imperative that even their universal disregard will not excuse their omission” (96).

Courts have historically avoided providing standards to govern questions of first impression unless absolutely necessary (97). The Supreme Court in particular routinely “adheres to the principle of deciding constitutional questions only in the context of the particular case before the Court” (97). Instead, the courts favor attempting to fit novel cases into familiar standards, largely because modern jurisprudence is precedent-based and informed by past cases (98).

Providers should introduce AI diagnostic tools gradually, both to acclimate the public to their use and to ensure the tools function as intended. One option is temporarily introducing AI diagnosing in the role of a second reader, similar to current CADe software. Physicians would run every mammogram through the machine, compare the two readings, and only diagnose the patient negative for abnormalities if both radiologist and machine are in agreement. Short-term double reading at the machine’s inception represents only marginal increases in cost and readership time, and provides hospitals with an opportunity to determine whether the program meets the standard of care while retaining standard liability exposure. If detailed follow-up records are diligently maintained on patients “seen” by the machine (including any subsequent diagnoses made by either machine or radiologists), the hospital in theory has a baseline for evaluating whether the machine does in fact perform at a level comparable with human radiologists11. Ongoing, periodic audits of AI performance will also assist in demonstrating the hospital is using reasonable care to ensure that AI diagnostic tools are functioning optimally.
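A minimal sketch of the double-reading workflow described above, assuming simple string labels for each reader’s call (all function and label names are hypothetical), might look like the following; any disagreement is escalated to further human review rather than resolved automatically.

```python
# Minimal sketch of AI as a second reader during a trial period: a patient is
# cleared as negative only when the radiologist and the machine agree, and every
# case is logged so the hospital can later audit machine performance against
# follow-up outcomes. All names here are hypothetical.

def double_read(radiologist_call: str, machine_call: str) -> str:
    """Return the workflow disposition for one mammogram."""
    if radiologist_call == "negative" and machine_call == "negative":
        return "cleared_negative"
    if radiologist_call == machine_call == "positive":
        return "recall_for_workup"
    return "escalate_to_second_radiologist"   # any disagreement gets human review

audit_log = []

def read_and_log(study_id: str, radiologist_call: str, machine_call: str) -> str:
    """Record each double-read result so later outcomes can be compared against it."""
    disposition = double_read(radiologist_call, machine_call)
    audit_log.append({
        "study_id": study_id,
        "radiologist": radiologist_call,
        "machine": machine_call,
        "disposition": disposition,
    })
    return disposition
```

The accompanying audit log is what would later allow the hospital to compare machine calls against follow-up outcomes when evaluating whether the tool performs at the level of its human readers.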

Detailed records will also assist hospitals and other providers in meeting any FDA post-market safety monitoring requirements. The FDA employs various systems and agencies tasked with manufacturer inspections and surveillance of negative outcomes and reporting problems (99). Past FDA reluctance to approve uses of innovative diagnostic programs has been partially due to “an unclear understanding of the cost/benefit tradeoffs of these systems” (93). The FDA may be more amenable to an AI-specific regulatory scheme if providers demonstrate diligent efforts at self-regulation. Voluntary participation in phase IV post-market clinical trials and monitoring may be another step for gaining FDA support (93).

Alternatively, providers could make AI diagnosing optional to patients willing to try it, potentially accompanying the service with a waiver of liability shielding the provider from suit (also known as an exculpatory agreement). Providers might incentivize patients by offering free or discounted mammography services for a period of time if patients opt-in to an AI diagnosis program. Many women presenting for mammograms, especially those suspecting breast cancer, experience significant stress during the period of waiting for results (100). Guaranteeing faster turnaround periods for diagnoses would motivate patients to participate (100). Given that the image review process is digital, mammograms feasibly may be taken in any part of the country, digitally run through a single in-network AI machine, and the results sent back to the clinic and presented to the patient, all within the same visit (which would also expedite any necessary follow-up procedures). Growing digital connectivity is already a trend among healthcare providers, patients, and insurance providers increasingly sharing digital information and patient medical records (101). IBM notes “automation of hospital administrative processes, such as patient registration, admission, and discharge is relatively widespread” and has a multifactorial impact, from decreasing patient wait times for seeing specialists to decreased necessity for follow-up consultations due to rapid availability of medical records (101).

Acceptability of waivers of liability varies greatly by jurisdiction (102). A strong policy argument in favor of regulating waivers is the lack of bargaining power between providers and recipients of a service, especially in a healthcare context where a patient cannot safely decline a particular treatment even if the waiver terms are unfavorable. To that end, many states such as California engage in lengthy analysis to identify waivers of liability that are contrary to public policy (103). Some states disfavor exculpatory clauses and invalidate any waivers that are overbroad or issued on a take-it-or-leave-it basis in which the individual waiving liability is given no opportunity to bargain for alternatives (104). Other states such as Ohio are more lenient and analyze the agreement to determine whether more likely than not “an ordinarily prudent and knowledgeable individual would have understood the provision as a release from liability for negligence” (105).

Providers should additionally be cautious of making AI pilot program services available for free or at discounted rates to anyone who opts in. For consent to be informed, the individual must be acting autonomously and free of coercion or undue influence (106). Financial opt-in incentives create a real risk of inadvertently taking advantage of financially disadvantaged segments of the community by introducing coercion and undue influence. For poorer individuals, especially the uninsured, financial incentives may overcome any actual consideration of risk involved in AI diagnosing. The copay for a mammogram for insured patients is generally between $10 and $35, and uninsured patients will pay on average $102 (107). Even for insured patients, the opt-in nature may call into question whether valid, informed consent was obtained. The safest route would be to offer information on AI diagnosing, and let patients decide to opt in without any form of financial incentive (108).

At a minimum, manufacturers should pursue the “nearest neighbor” method for corroborating machine diagnoses. It will be unknown until the first AI case is tried whether courts will accept the nearest neighbor method as a substitute for expert testimony, or even whether the method is advisable. Changes to the AI program may be deemed inadmissible for evidentiary or testimonial purposes after a suit is filed, preventing manufacturers from writing additional programs specifically for trial purposes. Therefore, manufacturers should invest in nearest neighbor-type features prior to market implementation; waiting until the first lawsuits are brought may prove too late to write new features into the program. Even if courts ultimately reject the nearest neighbor method as a substitute for expert testimony, the application will be useful as secondary evidence of machine efficacy. Manufacturers might also invest time into pursuing a method for rationalizing machine diagnoses. Although minimizing or eliminating the black box problem will be difficult, the ability of the machine to provide at least some form of justification for its output will help minimize black box problems associated with meeting the standard of care.

Some organizations recommend founding AI outputs on “possibilities instead of probabilities,” and suggest that requiring AI systems to provide a list of potential diagnoses along with corresponding probabilities may increase “demonstrability of the results” (33). Although a probabilities feature seems only to marginally increase the transparency of the AI’s logic, such features would also be worth pursuing by manufacturers. At the very least, a list of every diagnosis the AI considered and the order in which the diagnoses were eliminated may demonstrate an informed reasoning process by the machine.
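By way of illustration only, a “possibilities” style of output might resemble the following ranked list of candidate findings with associated probabilities; the labels, score values, and function names are invented for the example.

```python
# Illustrative only: presenting the machine's output as a ranked list of
# candidate findings with probabilities, rather than a single opaque call.

def ranked_findings(scores: dict, threshold: float = 0.01) -> list:
    """Sort candidate findings by probability, dropping negligible ones."""
    kept = [(label, p) for label, p in scores.items() if p >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical output from a single study:
example_scores = {
    "no abnormality": 0.62,
    "benign calcification": 0.24,
    "suspicious microcalcification cluster": 0.11,
    "architectural distortion": 0.03,
}

for label, p in ranked_findings(example_scores):
    print(f"{label}: {p:.0%}")
```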

Compensating plaintiffs following AI error

Due to the heightened liability, financial investments, and time commitment implicated, it is likely that large hospital networks will bear the burden of AI’s introduction to medical diagnosing. Solo practitioners and smaller clinics simply do not have the resources available to take on the task of purchasing AI services from manufacturers or self-insuring against malpractice claims. Hospitals are also more likely than independent practices to have the digital infrastructure in place to facilitate the rapid processing of mammograms taken at remote clinical sites through one central location. Access to vast databases of digital radiologic images uniquely situates hospitals to train proprietary AI machines without reliance on out-of-network medical records. Hospitals also are better positioned than small practices to bear financial costs of medical errors after faulty AI diagnosing.

Financial liability will likely be most simply managed by express agreement between hospitals and AI manufacturers. There are several alternatives for compensating victims of AI misdiagnoses. Note that the following approaches assume that AI diagnosing will not be covered under health providers’ current malpractice insurance policies. Although a technical loophole or legal interpretation might leave the terms of an insurance agreement vague enough to argue that AI malpractice is covered, more likely than not coverage is nonexistent. Healthcare providers should carefully review their coverage policies and plan accordingly12. Malpractice coverage considers every procedure implemented by a provider, and generally excludes all non-specified activities. Even if malpractice insurance agreements are currently vague, the introduction of AI diagnosing into clinical practice will likely prompt insurance providers to decline coverage for such activities. The widespread use of breast cancer screening renders AI diagnosing a significant liability insurance providers would only reluctantly accept with a corresponding increase in premiums, and even then only with demonstrations of efficacy.

One option for plaintiff compensation is a form of common enterprise liability, where both the provider and the manufacturer agree to share in the cost of medical errors. Alternatively, private medical insurance or self-insurance could be obtained by either party as a way of compensating injured patients. Or, the machine could be assigned legal personhood and be insured as a separate entity (109). Any of these options would naturally impact the price of the AI service. If, for example, the manufacturer assumes all liability for misdiagnoses, then the hospital would pay a higher premium for the service. Depending on the sticker price associated with an AI diagnosis, failure to obtain approval for reimbursement by either CMS or private health plans might also pose a roadblock to implementation due to cost.

Alternatively, hospitals with funds and image databases could invest in developing proprietary AI systems, eliminating service fees and potentially allowing the hospital to fill the role of both provider and manufacturer. The hospital could then provide the AI service to other provider networks for a fee, and negotiate liability coverage with subscribing entities. Holding the dual roles of provider and manufacturer may assign some of the manufacturer liability discussed earlier to the hospital. That said, outside manufacturers will almost certainly still play at least a minimal role by providing the physical machine. Manufacturers will also likely be involved in the training of the machine using hospital medical records, since most hospitals will not have a dedicated staff of AI programmers.

Another option is the creation of a fund allocated specifically towards compensating injured victims. Guaranteed compensation for injured patients, potentially through mandatory binding arbitration for calculating damages, might be an additional way of encouraging voluntary program participation. Presumably a compensation guarantee will also ensure adequate attention to the program by the provider, and will soothe any concerns or reservations of state and federal legislators. More importantly, having a dedicated compensation fund expedites settlement, ensures funds availability to cover damages, and minimizes legal time and expenses, allowing providers to avoid a court battle until they are prepared to do so on their terms. A risk of dedicated settlement funds is that plaintiffs’ attorneys may catch on to the guaranteed payments and become overly litigious with the knowledge that providers will go out of their way to avoid a court battle. Guaranteed compensation funds also do not remove hospital liability without waivers of liability; depending on the nature of the opt-in agreement, patients might decide the hospital did not pay enough for their injury, and sue in the hopes of a larger recovery through settlement or trial.

Accommodations for medical AI tools will also be necessary at the agency level, specifically the FDA. Although the current one-time pre-market approval system functions well for medications and medical equipment that rarely change, AI’s success as a medical tool is premised on its ability to constantly change and improve. As discussed earlier, requiring AI locking for market usage would be counterproductive and run contrary to policy rationales advocating for AI as a diagnostic tool. Both the public and the medical community would benefit from FDA guidelines uniquely crafted to address the fluid nature of AI deep learning machines. One potential solution would be to create a certification test that verifies the AI meets a certain level of statistical diagnostic accuracy. An additional requirement might be that the AI consistently maintain performance levels as the base learning dataset grows. In other words, every time the AI retakes the certification test, it must perform at least as well as it did on previous tests (within a stipulated margin for error). By requiring regular tests, manufacturers could avoid lengthy processes of re-training and re-certification of machines using updated training datasets, while reassuring the FDA that the machine continuously performs at or above the requisite level of medical accuracy.
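One way such a periodic certification gate might be operationalized, as a sketch under the assumptions just stated (a fixed certification test set, an accuracy floor, and a stipulated non-regression margin, with all threshold values hypothetical), is shown below.

```python
# Sketch of a periodic re-certification gate for a continuously learning model:
# the updated model must clear an absolute accuracy floor and must not fall more
# than a stipulated margin below its previously certified score. The thresholds
# and interface are hypothetical, not regulatory values.

ACCURACY_FLOOR = 0.90      # minimum acceptable accuracy on the certification set
REGRESSION_MARGIN = 0.01   # tolerated drop relative to the last certified score

def recertify(current_accuracy: float, last_certified_accuracy: float) -> bool:
    """Return True if the updated model may remain in clinical use."""
    meets_floor = current_accuracy >= ACCURACY_FLOOR
    no_regression = current_accuracy >= last_certified_accuracy - REGRESSION_MARGIN
    return meets_floor and no_regression
```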

FDA accommodations might also insulate manufacturers from liability, removing uncertainty and increasing investment into medical AI tools. Manufacturer protections also serve hospitals by providing greater notice of where liability will be assigned in cases of misdiagnoses. Without liability concerns, manufacturer service charges paid by hospitals for use of AI programs will also likely decrease, saving hospitals money. Although the liability would be passed on to the provider, hospitals will also be able to choose how to allocate funds to account for inevitable malpractice lawsuits. Additionally, hospitals will be saved the financial, reputational, and time expenses of attempting to join manufacturers to lawsuits where injured patients sue the provider. Freedom of choice among self-insurance methods may be a valuable option for providers seeking to maintain control over liability and business expenditures.

Special concerns regarding AI implementation

Providers should give special attention to informed consent obligations. Failure to disclose to patients that their diagnoses will be completed by a machine rather than a physician presents unique legal challenges. Because the patient is consenting to diagnosis by a machine, rather than diagnosis based on the specific dataset the machine’s reasoning will be informed by, the ever-evolving nature of AI systems should not prove an obstacle to adequate informed consent. Informed consent will remain complicated, however, because patients will desire differing levels of understanding of how AI functions. Most patients are likely to simply take AI for granted, but providers will have to carefully craft informed consent requirements to guide clinicians.

Patients have a right to second opinions (110). If a patient is unable to go out-of-network and wants a second opinion on a mammogram, will the patient be able to have a radiologist evaluate her test? Availability of second opinions may not be an issue if only 60–70% of mammograms are being eliminated by AI, but if testing is ever 100% AI-managed, then alternatives need to be arranged for providing in-network second opinions by a human radiologist. In-network alternatives are similarly important for patients who might opt-out of AI medical care for personal or religious reasons. Similar considerations must be given to physicians who decline to administer AI diagnosing for moral, ethical, or otherwise personal reasons (111).

Manufacturers and providers should also carefully analyze their obligations under the Health Insurance Portability and Accountability Act (HIPAA) regarding use of patient mammograms in future AI training datasets. Specifically, covered entities and business associates must (I) ensure the confidentiality and integrity of all electronic protected health information; (II) protect against reasonably anticipated hazards to information security; and (III) protect against reasonably anticipated unauthorized uses or disclosures of information (112). Personal data must be thoroughly deidentified before use in research and development, and information must be secured and released only for proper uses. Failure to abide by patient privacy rules would be one of the easiest ways for courts or government to shut down AI endeavors. At a minimum, informed consent should include disclosure that mammograms may be stripped of personalized information and retained for future use in AI training.
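As an illustrative sketch only, stripping direct identifiers from study metadata before a mammogram is retained for training might look like the following; the field names are hypothetical, and a real pipeline would need to satisfy the full HIPAA de-identification standards rather than this abbreviated list.

```python
# Illustrative sketch: remove direct identifiers from study metadata before the
# image is retained for future AI training. Field names are hypothetical, and a
# production pipeline would need to address the complete set of HIPAA identifiers.

DIRECT_IDENTIFIERS = {
    "patient_name", "mrn", "date_of_birth", "address",
    "phone", "email", "insurance_id",
}

def deidentify(metadata: dict) -> dict:
    """Return a copy of the study metadata with direct identifiers removed."""
    return {k: v for k, v in metadata.items() if k not in DIRECT_IDENTIFIERS}
```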

It should be noted that an adverse law or court ruling in one or a few jurisdictions will not be lethal to a nationwide program for medical AI implementation. Again, AI diagnosing is an issue of first impression that courts will be forced to decide without binding precedent. The first cases may result in outcomes detrimental to providers, such as wholesale assignment of liability to radiologists or hospitals. State rulings will bind only lower courts within that state to the resulting rule regarding AI diagnosing. Although federal courts and other state jurisdictions may cite or follow nonbinding precedent, they remain free to determine alternative rules for addressing AI’s role in medical diagnosing. Federal courts will apply state law to resolve civil claims. Similarly, while a state might legislatively assign liability to providers for machine error, such legislation is nonbinding in other jurisdictions. One or two negative outcomes will not condemn the entire AI program across all jurisdictions.


Conclusions

Any new medical advance comes with some inherent degree of heightened liability, and AI diagnosing is no exception. The inadequacies of current medicolegal standards pose unique challenges to the incorporation of AI into clinical practice. The courts will eventually face either the creation of new standards governing machine learning, or the relegation of AI to a much more difficult integration process within the medical community. In the interim, efforts for minimization of risk to both patient and practitioner will lie with manufacturers and health networks. Through proactive work on state legislation and self-governance standards, AI implementers might have a chance at preempting adverse court rulings. Slow and thoroughly considered steps will prove key to the medical community’s ability to maintain control over the incorporation of AI into medical services. Although the lack of legal precedent for the integration of AI into the historically personal field of health care poses unique challenges, ever-expanding knowledge of technology’s capabilities provides encouragement for the future.


Acknowledgments

The author thanks Anne Lederman Flamm, JD for her dedicated comments and editing assistance in the drafting of this note. The author additionally thanks Dr. Scott Flamm for helpful guidance and discussion in the direction of this note’s focus. This note was provided to the Cleveland Clinic Foundation through Drs. Scott Flamm and Alice Rim in an unpublished format.

Funding: None.


Footnote

Conflicts of Interest: The author has completed the ICMJE uniform disclosure form (available at http://dx.doi.org/10.21037/jmai-20-57). The author has no conflict of interest to declare. This note was undertaken on behalf of the Cleveland Clinic Foundation at the request of Drs. Alice Rim and Scott Flamm, who approached the Case Western Reserve University School of Law requesting an analysis of the medico-legal and bioethical risks posed by the incorporation of AI into clinical radiology practice. The author received no financial compensation for the writing of this note or the opinions contained herein, but did receive academic credit from the Case Western Reserve University School of Law.

Ethical Statement: The author is accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

Disclaimer: The author is neither a certified medical practitioner nor a licensed attorney at law. No opinions or facts contained within this note should be taken to constitute legal advice.

Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.

1Note: diagnostic accuracy is also dependent on the machine’s training dataset being representative of the population the machine is being asked to address. Some populations may have higher occurrence rates for particular diseases, or may manifest conditions differently. If the machine is not appropriately trained for the target population, it may prove unreliable in application.

2In this example a machine would only be trained to diagnose an image as either positive or negative for microcalcifications. Reliable machine learning would require a significantly larger and more diverse population of training images to be capable of diagnosing the variety of abnormalities typical of mammograms.

3Other alternatives include galactography, scintimammography, thermography, ultrasound, and MRI.

4Although not explicitly considered by the study, the differing results between the U.K. and U.S. are potentially explained by differences in screening protocol: standard U.K. mammography screening involves a double-reader evaluation, with disputes potentially referred to a third reader, whereas standard U.S. protocol typically involves interpretation by a single radiologist, suggesting possible U.K. procedural superiority in mammography evaluation. National screening programs implemented by the European Union and greater access to patient data provide additional advantages over the U.S. model. See (36).

5Because most large-scale AI diagnosing studies in radiology are relatively recent, none seem to focus as of yet on applications of the same machine to underserved regions of the world that currently have limited to nonexistent availability of preventative care. The author is unaware of any study specifically addressing the question of data transferability between industrialized nations and underserved regions.

6FDA considerations currently do not represent guidance or proposed or final regulatory expectations; they are aimed solely at soliciting feedback from the public regarding future regulatory action.

7State supreme courts set binding precedent for all lower courts of that state, but their decisions are not binding on other states. The federal judiciary is divided into three tiers: district courts, courts of appeals (divided into geographic circuits), and the Supreme Court. The Supreme Court sets binding precedent for all federal courts and, in some circumstances, for state courts as well. District courts follow the precedent of the Supreme Court and of the circuit in which the district court sits. Circuit courts follow Supreme Court precedent, but need not follow the precedent of other circuit courts. See (53).

8A plaintiff in a med-mal case may allege multiple reasons for a negative outcome; thus, one med-mal case may involve claims of delay in diagnosis, delayed treatment, and wrongful death. The referenced study considered all such allegations in the 370 cases evaluated, so the factor percentages cited do not total 100%.

9Some academics such as Yavar Bathaee ultimately conclude not only that post hoc rationalizations of AI decision-making are nigh impossible, but that AI intent tests can only rarely be adequately satisfied for assigning liability. See (10).

10“Shotgun” suits take their name from birdshot, a type of shotgun ammunition used by bird hunters. Rather than attempting to hit one bird with a single bullet, the hunter fires a spray of small pellets that strikes anything moving, potentially downing multiple birds with a single shot.

11It is in the provider’s interest that patients diagnosed by an AI during a test period return to the same provider for all subsequent follow-ups, ensuring complete medical histories remain on file. Alternatively, when patients opt in, the hospital may request that the patient grant it access to future medical records pertaining to similar diagnostic care.

12Insurance coverage is dependent on the contractual agreement between the parties and subject to significant variability, rendering these considerations largely speculative. Providers should refer to their particular insurance contracts rather than make assumptions based on the analysis contained in this note.


References

  1. Yudkowsky E. Artificial intelligence as a positive and negative factor in global risk. Machine Intelligence Research Institute, 1 (2008). Available online: https://intelligence.org/files/AIPosNegFactor.pdf
  2. Harned Z, Lungren M, Rajpurkar P. Machine vision, medical AI, and malpractice. Harvard Journal of Law & Technology Digest 2019. [cited 2020 Sep 8]. Available online: https://jolt.law.harvard.edu/digest/machine-vision-medical-ai-and-malpractice
  3. Kolodny L, Schoolov K. Self-driving cars were supposed to be here already — here’s why they aren’t and when they should arrive. CNBC, 2019 Nov 30 [cited 2020 Sep 8]. Available online: https://www.cnbc.com/2019/11/30/self-driving-cars-were-supposed-to-be-here-already-heres-whats-next.html
  4. Sullivan HR, Schweikart S. Are current tort liability doctrines adequate for addressing injury caused by AI? AMA J Ethics 2019;21:E160-6. [Crossref] [PubMed]
  5. Mammography Quality Standards Act national statistics [Internet]. MQSA Insights; 2020 Sep 1 [cited 2020 Sep 8]. Available online: https://www.fda.gov/radiation-emitting-products/mqsa-insights/mqsa-national-statistics.
  6. Lehman CD, Arao R, Sprague B, et al. National performance benchmarks for modern screening digital mammography: update from the breast cancer surveillance consortium. Radiology 2017;283:49-58. [Crossref] [PubMed]
  7. Hayward JH, Ray K, Wisner D, et al. Improving screening mammography outcomes through comparison with multiple prior mammograms. AJR Am J Roentgenol 2016;207:918-24. [Crossref] [PubMed]
  8. Ekpo EU, Alakhras M, Brennan P. Errors in Mammography Cannot be Solved Through Technology Alone. Asian Pac J Cancer Prev 2018;19:291-301. [PubMed]
  9. Erickson BJ, Korfiatis P, Akkus Z, et al. Machine learning for medical imaging. RadioGraphics 2017;37:505-15. [Crossref] [PubMed]
  10. Bathaee Y. The artificial intelligence black box and the failure of intent and causation. Harv J Law Technol 2018;31:890-938.
  11. The Oxford Dictionary of Phrase and Fable. Artificial Intelligence. 2nd edition. Oxford University Press, 2005. [cited 2020 Sep 8]. Available online: https://www.oxfordreference.com/view/10.1093/oi/authority.20110803095426960
  12. Rudie JD, Rauschecker A, Bryan R, et al. Emerging applications of artificial intelligence in neuro-oncology. Radiology 2019;290:607-18. [Crossref] [PubMed]
  13. Choy G, Khalilzadeh O, Michalski M, et al. Current applications and future impact of machine learning in radiology. Radiology 2018;288:318-28. [Crossref] [PubMed]
  14. London AJ. Artificial intelligence and black-box medical decisions: accuracy versus explainability. Hastings Cent Rep 2019;49:15-21. [Crossref] [PubMed]
  15. Soffer S, Ben-Cohen A, Shimon O, et al. Convolutional neural networks for radiologic images: a radiologist’s guide. Radiology 2019;290:590-606. [Crossref] [PubMed]
  16. Kyono T, Gilbert F, Schaar M. Improving workflow efficiency for mammography using machine learning. J Am Coll Radiol 2020;17:56-63. [Crossref] [PubMed]
  17. Castellanos S. Pinterest harnesses AI for visual-based shopping. The Wall Street Journal 2019 Sep 19 [cited 2020 Sep 8]. Available online: https://www.wsj.com/articles/pinterest-harnesses-ai-for-visual-based-shopping-11568925943
  18. Marr B. Google: using deep learning AI to drive success. Bernard Marr & Co. [cited 2020 Sep 8]. Available online: https://www.bernardmarr.com/default.asp?contentID=1275
  19. Freeman MD, Gopman J, Salzberg C. The evolution of mastectomy surgical technique: from mutilation to medicine. Gland Surg 2018;7:308-15. [Crossref] [PubMed]
  20. Gold RH, Bassett LW, Widoff BE. Highlights from the history of mammography. Radiographics 1990;10:1111-31. [Crossref] [PubMed]
  21. Hopper KD. Percutaneous, radiographically guided biopsy: a history. Radiology 1995;196:329-33. [Crossref] [PubMed]
  22. van Ravesteyn NT, van Lier L, Schechter CB, et al. Transition from film to digital mammography: impact for breast cancer screening through the national breast and cervical cancer early detection program. Am J Prev Med 2015;48:535-42. [Crossref] [PubMed]
  23. Zeeshan M, Salam B, Khalid Q, et al. Diagnostic accuracy of digital mammography in the detection of breast cancer. Cureus 2018;10:e2448. [Crossref] [PubMed]
  24. Joe B. Advances in breast imaging: mammography and much more. UCSF Department of Radiology & Biomedical Imaging, 2015 July 7 [cited 2020 Sep 8]. Available online: https://radiology.ucsf.edu/blog/advances-breast-imaging-evolution-history-mammography
  25. Gao Y, Geras K, Lewin A, et al. New frontiers: an update on computer-aided diagnosis for breast imaging in the age of artificial intelligence. AJR Am J Roentgenol 2019;212:300-7. [Crossref] [PubMed]
  26. Doi K. Computer-aided diagnosis in medical imaging: historical review, current status and future potential. Comput Med Imaging Graph 2007;31:198-211. [Crossref] [PubMed]
  27. CMS: Medicare Intermediary Manual, 42 C.F.R. § 3660.20, revision issued Oct. 25, 2002.
  28. Tchou PM, Haygood TM, Atkinson EN, et al. Interpretation time of computer-aided detection at screening mammography. Radiology 2010;257:40-6. [Crossref] [PubMed]
  29. Cole EB, Zhang Z, Marques HS, et al. Impact of computer-aided detection systems on radiologist accuracy with digital mammography. AJR Am J Roentgenol 2014;203:909-16. [Crossref] [PubMed]
  30. Geras KJ, Mann RM, Moy L. Artificial Intelligence for Mammography and Digital Breast Tomosynthesis: Current Concepts and Future Perspectives. Radiology 2019;293:246-59. [Crossref] [PubMed]
  31. Lehman CD, Wellman RD, Buist DS, et al. Diagnostic Accuracy of Digital Screening Mammography With and Without Computer-Aided Detection. JAMA Intern Med 2015;175:1828-37. [Crossref] [PubMed]
  32. Niklason LT, Christian BT, Niklason LE, et al. Digital tomosynthesis in breast imaging. Radiology 1997;205:399-406. [Crossref] [PubMed]
  33. SFR-IA Group. CERF; French Radiology Community. Artificial intelligence and medical imaging 2018: French Radiology Community white paper. Diagn Interv Imaging 2018;99:727-42. [Crossref] [PubMed]
  34. Keane PA, Topol EJ. With an eye to AI and autonomous diagnosis. NPJ Digit Med 2018;1:40. [Crossref] [PubMed]
  35. Galeon D. For the first time, a robot passed a medical licensing exam. Artificial Intelligence, 2017 Nov 20 [cited 2020 Sep 8]. Available online: https://futurism.com/first-time-robot-passed-medical-licensing-exam
  36. McKinney SM, Sieniek M, Godbole V, et al. International evaluation of an AI system for breast cancer screening. Nature 2020;577:89-94. [Crossref] [PubMed]
  37. Rosenkrantz AB, Hughes DR, Duszak R Jr. The U.S. Radiologist Workforce: An Analysis of Temporal and Geographic Variation by Using Large National Datasets. Radiology 2016;279:175-84. [Crossref] [PubMed]
  38. Centers for Disease Control and Prevention. Data & statistics on sickle cell disease. 2019 Oct 29 [cited 2020 Sep 8]. Available online: https://www.cdc.gov/ncbddd/sicklecell/data.html
  39. U.S. Food and Drug Administration. Proposed regulatory framework for modifications to artificial intelligence/machine learning (AI/ML)-based software as a medical device (SaMD) [Analysis available in brief on the Internet]. 2020 Jan 28 [cited 2020 Sep 8]. Available online: https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-software-medical-device
  40. Restatement (Second) of Torts § 282. American Law Institute, 1965.
  41. Cooke BK, Worsham E, Reisfield GM. The Elusive Standard of Care. J Am Acad Psychiatry Law 2017;45:358-64. [PubMed]
  42. Garner BA, editor. Black’s Law Dictionary. 10th ed. St. Paul: Thomson Reuters, 2014:1717.
  43. Coleman J, Hershovitz S, Mendlow G. Theories of the common law of torts. Stanford Encyclopedia of Philosophy, 2015 Dec 17 [cited 2020 Sep 8]. Available online: https://plato.stanford.edu/entries/tort-theories/
  44. Brown v. Kendall, 60 Mass. 292, 296. Mass. 1850.
  45. Holmes OW Jr. The Common Law. Boston: Little, Brown and Company, 1881:108.
  46. Legal Match. Tipton S. What is a strict liability tort? 2018 July 6 [cited 2020 Sep 10]. Available online: https://www.legalmatch.com/law-library/article/what-is-a-strict-liability-tort.html
  47. Vidmar N. Juries and medical malpractice claims: empirical facts versus myths. Clin Orthop Relat Res 2009;467:367-75. [Crossref] [PubMed]
  48. Hall v. Hilburn, 466 So. 2d 856, 866. Miss. 1985.
  49. Small v. Howard, 128 Mass. 131. Mass. 1880.
  50. Mohr JC. American medical malpractice litigation in historical perspective. JAMA 2000;283:1731-7. [Crossref] [PubMed]
  51. Johnston v. St. Francis Medical Center, 799 So. 2d 671, 680. La. Ct. App. 2001.
  52. Lewis MH, Gohagan JK, Merenstein DJ. The locality rule and the physician's dilemma: local medical practices vs the national standard of care. JAMA 2007;297:2633-7. [Crossref] [PubMed]
  53. Walker J. The role of precedent in the United States: how do precedents lose their binding effect? Stanford Law School: China Guiding Cases Project, Feb. 29, 2016. [cited 2020 Sep 10]. Available online: http://cgc.law.stanford.edu/commentaries/15-John-Walker
  54. Darling v. Charleston Community Memorial Hospital, 50 Ill. App. 2d 253, 313-4. Ill. App. Ct. 1964.
  55. Clark v. Southview Hosp. & Family Health Ctr., 68 Ohio St. 3d 435, 438. Ohio, 1994.
  56. American Jurisprudence. 2d Products Liability § 541, 2002.
  57. American Jurisprudence. 3d Drugs and Medical Devices § 6(c), 1998.
  58. Riegel v. Medtronic, Inc., 552 U.S. 312, 319-20, 2008.
  59. Bailey R, Schleiter KE. Testing Manufacturer Liability in FDA-Approved Device Malfunction. Virtual Mentor 2010;12:800-3. [PubMed]
  60. Hughes v. Boston Scientific Corp., 631 F.3d 762, 5th Cir. 2011.
  61. Rhodes v. United States, 967 F. Supp. 2d 246, 299. D.D.C. 2013.
  62. McCourt v Abernathy, 457 S.E.2d 603. S.C. 1995.
  63. Berlin L. Malpractice and breast cancer: perceptions vs. reality. AJR Am J Roentgenol 2009;192:334-6. [Crossref] [PubMed]
  64. Barratt A, Cockburn J, Furnival C, et al. Perceived sensitivity of mammographic screening: women’s views on test accuracy and financial compensation for missed cancers. J Epidemiol Community Health 1999;53:716-20. [Crossref] [PubMed]
  65. Black WC, Nease RF Jr, Tosteson AN. Perceptions of breast cancer risk and screening effectiveness in women younger than 50 years of age. J Natl Cancer Inst 1995;87:720-31. [Crossref] [PubMed]
  66. Sawlani v. Mills, 830 N.E.2d 932, 941. Ind. Ct. App. 2005.
  67. American Academy of Family Physicians. c2020. Clinical practice guideline manual. 2017 Dec [cited 2020 Sep 10]. Available online: https://www.aafp.org/family-physician/patient-care/clinical-recommendations/cpg-manual.html
  68. Ursano RJ, Bell C, Eth S, et al. Practice guideline for the treatment of patients with acute stress disorder and posttraumatic stress disorder. Am J Psychiatry 2004;161:3-31. [PubMed]
  69. American Board of Medical Specialties. Frequently Asked Questions. [cited 2020 Sep 10]. Available online: https://www.abms.org/about-abms/faqs/
  70. Mammography Quality Standards Act, 67 Fed. Reg. 5446 § 900.12. April 28, 1999 [cited 2020 Sep 10]. Available online: https://www.fda.gov/radiation-emitting-products/regulations-mqsa/mammography-quality-standards-act-regulations
  71. The American Cancer Society. Limitations of mammograms; 2019 Oct 3 [cited 2020 Sep 10]. Available online: https://www.cancer.org/cancer/breast-cancer/screening-tests-and-early-detection/mammograms/limitations-of-mammograms.html
  72. Le Lee MV, Konstantinoff K, Gegios A, et al. Breast cancer malpractice litigation: A 10-year analysis and update in trends. Clin Imaging 2020;60:26-32. [Crossref] [PubMed]
  73. Studdert DM, Mello MM, Sage WM, et al. Defensive medicine among high-risk specialist physicians in a volatile malpractice environment. JAMA 2005;293:2609-17. [Crossref] [PubMed]
  74. Chinen MA. The co-evolution of autonomous machines and legal responsibility. Virginia Journal of Law & Technology 2016;20:338-93.
  75. Jones S. Automation jobs will put 10,000 humans to work, study says. FORTUNE 2017 May 1 [cited 2020 Sep 10]. Available online: https://www.yahoo.com/news/automation-jobs-put-10-000-185757764.html
  76. Harder B. 2019-20 best hospitals honor roll and medical specialties rankings. U.S. News, 2019 July 29 [cited 2020 Sep 10]. Available online: https://health.usnews.com/health-care/best-hospitals/articles/best-hospitals-honor-roll-and-overview
  77. Sennaar K. How America’s five top hospitals are using machine learning today. EMERJ: AI Research and Advisory Company, 2020 Mar 24 [cited 2020 Sep 10]. Available online: https://emerj.com/ai-sector-overviews/top-5-hospitals-using-machine-learning/
  78. Woyke E. The pint-sized supercomputer that companies are scrambling to get. MIT Technology Review, 2016 Dec 14 [cited 2020 Sep 10]. Available online: https://www.technologyreview.com/2016/12/14/155407/the-pint-sized-supercomputer-that-companies-are-scrambling-to-get/
  79. Brown K. NVIDIA, Massachusetts General Hospital use artificial intelligence to advance radiology, pathology, genomics. NVIDIA Newsroom, 2016 April 5 [cited 2020 Sep 10]. Available online: https://nvidianews.nvidia.com/news/nvidia-massachusetts-general-hospital-use-artificial-intelligence-to-advance-radiology-pathology-genomics
  80. Gauher S, Uz F. Cleveland Clinic to identify at-risk patients in ICU using Cortana intelligence. Microsoft: Docs, 2016 Sep 26 [cited 2020 Sep 10]. Available online: https://docs.microsoft.com/en-us/archive/blogs/machinelearning/cleveland-clinic-to-identify-at-risk-patients-in-icu-using-cortana-intelligence-suite
  81. eHospital program enhances care in medical ICUs. Cleveland Clinic, 2015 Dec 11 [cited 2020 Sep 10]. Available online: https://consultqd.clevelandclinic.org/ehospital-program-enhances-care-medical-icus/
  82. Jeffires A, Tait E. Protecting artificial intelligence IP: patents, trade secrets, or copyrights? Jones Day, 2018 Jan [cited 2020 Sep 10]. Available online: https://www.jonesday.com/en/insights/2018/01/protecting-artificial-intelligence-ip-patents-trad
  83. Davies CR. An Evolutionary Step in Intellectual Property Rights – Artificial Intelligence and Intellectual Property. Computer Law And Security Report 2011;27:601-19. [Crossref]
  84. Greenman v. Yuba Power Products, Inc., 59 Cal. 2d 57, 63-64. Cal. 1963.
  85. Price W. Artificial intelligence in health care: applications and legal implications. SSRN [Internet]. 2017 [cited 2020 Sep 10]. Available online: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3078704
  86. Golann D. Dropped medical malpractice claims: their surprising frequency, apparent causes, and potential remedies. Health Aff (Millwood) 2011;30:1343-50. [Crossref] [PubMed]
  87. Morris ZS, Wooding S, Grant J. The answer is 17 years, what is the question: understanding time lags in translational research. J R Soc Med 2011;104:510-20. [Crossref] [PubMed]
  88. Caruana R, Kangarloo H, Dionisio J, et al. Case-based explanation of non-case-based learning methods. Proc AMIA Symp 1999;212-5. [PubMed]
  89. Fed. R. Evid. 702. Available online: https://www.law.cornell.edu/rules/fre/rule_702
  90. Drouin O, Freeman S. Health care needs AI. It also needs the human touch. STAT Health Tech, 2020 Jan 22 [cited 2020 Sep 10]. Available online: https://www.statnews.com/2020/01/22/health-care-needs-ai-it-also-needs-human-touch/
  91. Ahuja AS. The impact of artificial intelligence in medicine on the future role of the physician. PeerJ 2019;7:e7702. [Crossref] [PubMed]
  92. AMA: PolicyFinder. Augmented intelligence in health care H-480.940. 2018 [cited 2020 Sep 10]. Available online: https://policysearch.ama-assn.org/policyfinder/detail/augmented%20intelligence?uri=%2FAMADoc%2FHOD.xml-H-480.940.xml
  93. Stanford University. One hundred year study on artificial intelligence (AI100). Aug 1, 2016:52. [cited 2020 Sep 10]. Available online: https://ai100.stanford.edu/
  94. Rigby MJ. Ethical dimensions of using artificial intelligence in health care. AMA J Ethics 2019;21:E121-4. [Crossref]
  95. Fiske A, Buyx A, Prainsack B. Health information counselors: a new profession for the age of big data. Acad Med 2019;94:37-41. [Crossref] [PubMed]
  96. The T.J. Hooper, 60 F.2d 737, 740 (2d Cir. 1932), cert. denied, 287 U.S. 662. 1932.
  97. Sweatt v. Painter, 339 U.S. 629. 1950.
  98. Lamond G. Precedent and analogy in legal reasoning. Stanford Encyclopedia of Philosophy, 2006 June 20 [cited 2020 Sep 10]. Available online: https://plato.stanford.edu/entries/legal-reas-prec/
  99. U.S. Food & Drug Administration. Step 5: FDA post-market drug safety monitoring. 2018 Jan 4 [cited 2020 Sep 10]. Available online: https://www.fda.gov/patients/drug-development-process/step-5-fda-post-market-drug-safety-monitoring
  100. Barton MB, Morley DS, Moore S, et al. Decreasing women's anxieties after abnormal mammograms: a controlled trial. J Natl Cancer Inst 2004;96:529-38. [Crossref] [PubMed]
  101. IBM Global Business Services. The digital hospital evolution: creating a framework for the healthcare system of the future. 2013:10 [cited 2020 Sep 10]. Available online: https://www.himss.eu/sites/himsseu/files/education/whitepapers/IBM%20Digital%20Hospital%20Evolution%20GBW03203-USEN-00.pdf
  102. Matthiesen, Wickert & Lehrer, S.C. Exculpatory agreements and liability waivers in all 50 states. Aug 21, 2019:22 [cited 2020 Sep 10]. Available online: https://www.mwl-law.com/wp-content/uploads/2018/05/EXCULPATORY-AGREEMENTS-AND-LIABILITY-WAIVERS-CHART-00214377x9EBBF.pdf
  103. Tunkl v. Regents of the University of California, 60 Cal.2d 92. Cal. 1963.
  104. Atkins v. Swimwest Family Fitness Center, 691 N.W.2d 334. Wis. 2005.
  105. Hall v. Woodland Lake Leisure Resort Club, 1998 Ohio App. LEXIS 4898 at 15.
  106. Resnik DB. Bioethical issues in providing financial incentives to research participants. Medicoleg Bioeth 2015;5:35-41. [Crossref] [PubMed]
  107. City Hospital at White Rock. What to Expect to Pay for a Mammogram. [cited 2020 Sep 10]. Available online: https://cityhospital.co/cost-of-a-mammogram/
  108. McClellan FM, White AA 3rd, Jimenez RL, et al. Do poor people sue doctors more frequently? Confronting unconscious bias and the role of cultural competency. Clin Orthop Relat Res 2012;470:1393-7. [Crossref] [PubMed]
  109. Hart R. Who’s to blame when a machine botches your surgery? Quartz, 2018 Sep 10 [cited 2020 Sep 10]. Available online: https://qz.com/1367206/whos-to-blame-when-a-machine-botches-your-surgery/
  110. AMA Principles of Medical Ethics § 1.1.3 Patient Rights (g).
  111. AMA Principles of Medical Ethics § 1.1.7 Physician Exercise of Conscience.
  112. 45 C.F.R. § 164.306 (2013).
doi: 10.21037/jmai-20-57
Cite this article as: Jorstad KT. Intersection of artificial intelligence and medicine: tort liability in the technological age. J Med Artif Intell 2020;3:17.
