Artificial intelligence (AI) is a general term used to describe machines and computers performing tasks usually requiring human intelligence (1). There has been interest in the application of AI in medicine since the 1950s (2). Since then, computer scientists and medical researchers have continued to investigate the use of AI in almost every field of medicine (2). Proponents of medical AI argue that it can help the clinician in all aspects of medicine, from formulating a diagnosis to making therapeutic decisions and predicting patient outcomes (3,4).
An extensive body of literature has become available regarding the use of AI in medicine, with considerable variation in the quality of articles, as well as in the study methodologies and AI techniques used. This can lead to difficulty in identifying articles of significance, particularly for clinicians with limited technical knowledge in AI. A bibliometric analysis of the most highly cited publications relating to medical AI may provide a better understanding of the progress made to date in this field, as well as identify areas for future research efforts (5-7). Citation frequency is a method of bibliometric analysis that involves examining the publications that have been most cited by other researchers. Citation count serves as an indicator of the influence and quality of a scientific publication (8).
A number of bibliometric analyses from various fields of medicine have been published (9-13). While international interest and academic output relating to medical AI have risen significantly in recent years, a bibliometric analysis of the citation classics in this field has not, to the best of our knowledge, been performed. As such, this study aimed to identify and examine the characteristics of the top 100 most cited articles relating to the application of AI in medicine.
A retrospective bibliometric analysis of the 100 most cited peer-reviewed journal articles related to the use of AI in medicine was performed in April 2019. Articles from the MEDLINE® (U.S. National Library of Medicine, Bethesda, USA) database were identified using the Web of Science (Clarivate Analytics, Philadelphia, USA) citation indexing service (14,15). The MEDLINE® database indexes more than 5,000 journals, comprising more than 25 million references in medicine and life sciences published since 1950. MEDLINE® uses Medical Subject Headings (MeSH), which impose uniformity and consistency on the indexing of biomedical literature.
The following search strategy was used: [“artificial intelligence” OR “machine learning” OR “deep learning” OR “natural language processing” OR “support vector machine” OR “naïve bayes” OR “bayesian learning” OR “artificial neural network” OR “random forest” OR “machine intelligence” OR “k-nearest neighbor” OR “decision tree learning” OR “data mining” OR “fuzzy” OR “computational intelligence” OR “computer reasoning”] AND [“medicine” OR “medical” OR “surgery” OR “surgical” OR “healthcare”]. Articles were identified if they included these search terms in their title, abstract or MeSH terms. There were no restrictions on language or year of publication. A total of 16,025 articles were returned from this search. The top 100 articles ranked by citation count were identified and downloaded to a local database.
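As an illustration only, the boolean structure of this search strategy can be assembled programmatically. The sketch below builds a generic query string from the two term groups listed above; the function names `or_group` and `build_query` are hypothetical, and the output is not Web of Science field-tag syntax, which would need to be added separately.

```python
# Term lists copied from the search strategy described in the text.
AI_TERMS = [
    "artificial intelligence", "machine learning", "deep learning",
    "natural language processing", "support vector machine", "naïve bayes",
    "bayesian learning", "artificial neural network", "random forest",
    "machine intelligence", "k-nearest neighbor", "decision tree learning",
    "data mining", "fuzzy", "computational intelligence", "computer reasoning",
]
MEDICAL_TERMS = ["medicine", "medical", "surgery", "surgical", "healthcare"]


def or_group(terms):
    """Quote each term and join the group with OR inside square brackets."""
    return "[" + " OR ".join(f'"{term}"' for term in terms) + "]"


def build_query(ai_terms=AI_TERMS, medical_terms=MEDICAL_TERMS):
    """Combine the two OR groups with a single AND (hypothetical helper)."""
    return f"{or_group(ai_terms)} AND {or_group(medical_terms)}"


query = build_query()
print(query)
```

Keeping the terms in lists makes it straightforward to report exactly which terms were searched and to rerun the search with an amended list.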
Analysis of individual articles was performed by three reviewers (S Sreedharan, M Mian, RA Robertson) in order to extract relevant information relating to year of publication, authorship, journal title and impact factor (IF), institution, country of origin, article type, keywords and field of medicine. Journal IFs were obtained from the SCImago Journal Rank (Elsevier, Amsterdam, Netherlands) issued in 2017, the most recent year of available data at the time of search. Institution and country of origin were based on the corresponding author’s affiliations. Article type was dichotomised into original research article or review article. The original articles were further dichotomised into clinical or non-clinical, where clinical papers were defined as those where the primary research question was clinical and the study involved human participants or data. Keywords were recorded for each article by screening its title, abstract and MeSH terms for the search terms used. Articles that could not be assigned to a specific field of medicine were grouped in a “general” category. The “medical image analysis” category included any papers relating to the analysis of non-radiological medical images, such as histopathological images or clinical photographs. The average number of citations per year for each article was also included as an adjunctive measure of overall article impact. It was calculated by dividing each article’s total number of citations by the number of years since that article had been published.
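The citations-per-year measure can be sketched as follows. This is a minimal illustration that assumes "years since publication" means the search year (2019) minus the publication year; the text does not spell out the exact convention, although this reading is consistent with the figure of 164 citations per year reported for the 2010 article with 1,475 citations. The function name and the three example records are hypothetical.

```python
def citations_per_year(total_citations, publication_year, search_year=2019):
    """Average annual citations: total citations / years since publication.

    Assumes years since publication = search_year - publication_year.
    """
    years_since_publication = search_year - publication_year
    if years_since_publication <= 0:
        raise ValueError("publication year must precede the search year")
    return total_citations / years_since_publication


# Hypothetical records: (title, total citations, publication year).
articles = [
    ("Article A", 900, 1999),
    ("Article B", 600, 2015),
    ("Article C", 300, 2009),
]

# Rank once by raw citation count and once by citations per year.
by_total = sorted(articles, key=lambda a: a[1], reverse=True)
by_rate = sorted(
    articles, key=lambda a: citations_per_year(a[1], a[2]), reverse=True
)

print([title for title, _, _ in by_total])
print([title for title, _, _ in by_rate])
```

In this toy data the older Article A leads on total citations, while the more recent Article B leads on citations per year. This shows why the adjunctive measure can reorder a citation ranking.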
The top 100 articles relating to the use of AI in medicine were ranked according to citation count (Table 1). The median [IQR] number of citations was 238 [205–347]. The median [IQR] number of citations per year was 21 [16–41]. “The American College of Rheumatology preliminary diagnostic criteria for fibromyalgia and measurement of symptom severity”, published in Arthritis Care & Research in 2010, was ranked first with 1,475 citations. When ranked according to citations per year (Table 2), it became the fourth highest ranked article with an average of 164 citations per year.
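Summary statistics of the median [IQR] form can be computed with Python's standard library, as in the sketch below. The citation counts are hypothetical, not the study's data, and the "inclusive" quartile convention is an assumption because the text does not state which method was used; other conventions can give slightly different quartiles.

```python
import statistics


def median_iqr(values):
    """Return (median, first quartile, third quartile) for a list of counts.

    Uses the "inclusive" quartile method, which is an assumption here.
    """
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q2, q1, q3


# Hypothetical citation counts for illustration only.
citation_counts = [100, 200, 300, 400, 500]

median, q1, q3 = median_iqr(citation_counts)
print(f"{median:.0f} [{q1:.0f}–{q3:.0f}]")
```

For these five values the sketch reports a median of 300 with quartiles of 200 and 400.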
Only 15 articles were published prior to 2000. Thirty-five articles were published within the last decade, and the majority of articles were published between 2000 and 2010 (Table 3). The oldest article on the list was “Towards the simulation of clinical cognition. Taking a present illness by computer”, published in The American Journal of Medicine in 1976. This original research article presented a novel computer program developed to take a clinical history from a patient presenting with oedema in order to determine the most likely underlying illness.
A total of 55 journals contributed articles to the top 100 list, with 15 contributing two or more articles. With eight articles each, Artificial Intelligence in Medicine (IF: 3.62) and Medical Image Analysis (IF: 6.50) contributed the most articles, followed by Journal of the American Medical Informatics Association (IF: 3.97) and Journal of Biomedical Informatics (IF: 7.22) with seven articles each. The journals with the highest IFs were New England Journal of Medicine (IF: 42.18) and Nature Reviews Genetics (IF: 38.94), each contributing one article.
There were 16 institutions contributing two or more articles each to the top 100 list. Harvard Medical School was the biggest contributor with five articles, followed by Vanderbilt University and the National Library of Medicine. The United States of America was the leading country of origin with 55 articles, followed by the United Kingdom and Canada. A further nine countries contributed two or more articles each (Figure 1).
The top 100 articles list comprised 60 original research articles and 40 review articles. The 60 original research articles included 11 clinical studies (Figure 2). Of note, 8 of the 11 clinical papers were published in the last decade, and 6 of the 11 clinical studies were related to oncology. The most frequently represented field in the top 100 list was medical informatics, followed by radiology, oncology and non-radiological medical image analysis (Figure 3). Papers relating to medical image analysis had the highest average citations per year at 69.03, followed by radiology (43.15) and genetics (42.16). The 11 clinical papers were from oncology, ophthalmology, internal medicine, critical care and radiology.
“Artificial intelligence”, “natural language processing”, “machine learning”, “data mining” and “artificial neural network” were the most frequently used keywords among the top 100 list (Table 4). Eighteen of the 20 articles that included the keyword “natural language processing” were from the field of medical informatics.
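The keyword screening described in the methods, in which each article's title, abstract and MeSH terms are checked for the search terms, could be approximated as shown below. This is a sketch under stated assumptions: the record layout, the `screen_keywords` helper and the three example records are hypothetical, and matching is done by simple case-insensitive substring search, which may over-match short terms.

```python
from collections import Counter

# A subset of the study's search terms, used here as the keyword list.
KEYWORDS = [
    "artificial intelligence", "natural language processing",
    "machine learning", "data mining", "artificial neural network",
]


def screen_keywords(record, keywords=KEYWORDS):
    """Return the keywords found in a record's title, abstract or MeSH terms."""
    text = " ".join([record["title"], record["abstract"], *record["mesh"]])
    text = text.lower()
    # Each keyword is reported at most once per record.
    return [keyword for keyword in keywords if keyword in text]


# Hypothetical records for illustration only.
records = [
    {"title": "Machine learning in clinical text",
     "abstract": "A natural language processing pipeline for discharge summaries.",
     "mesh": ["Natural Language Processing", "Artificial Intelligence"]},
    {"title": "Data mining of hospital records",
     "abstract": "We review data mining methods.",
     "mesh": ["Data Mining"]},
    {"title": "Neural models for triage",
     "abstract": "An artificial neural network is compared with clinicians.",
     "mesh": ["Artificial Intelligence"]},
]

# Count the number of records in which each keyword appears.
keyword_counts = Counter(kw for r in records for kw in screen_keywords(r))
print(keyword_counts.most_common())
```

A match of this kind only shows that a term appears in the indexed text, not that the technique was actually employed in the study.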
There has been a recent surge of interest in the use of AI in medicine. However, many of the top 100 articles identified in this study were of low-level evidence and limited clinical significance, comprising review articles and commentaries that provide only a general overview of medical AI. Original research articles investigating the use of AI in clinical populations were lacking, with only 11 of the top 100 articles identified as clinical studies. The lack of highly cited clinical studies demonstrates the need for more studies investigating the integration of AI into clinical medicine. High-level evidence, including randomised controlled trials and meta-analyses, is required for clinicians to gain confidence in the capabilities of AI. Improved collaboration between clinicians and computer scientists, or streamlined pathways for dual qualification in health and computer science, may be feasible strategies to facilitate improved medical AI research.
Medical informatics contributed the most articles to the top 100 list. With the implementation of electronic medical records (EMR), medical informatics is becoming increasingly important. In order to reach their full potential, EMRs require automated applications to manage the huge amounts of clinical information available in them. Within the field of medical informatics, “natural language processing (NLP)” was the most frequently used keyword. NLP began in the 1950s and represents the intersection of AI and linguistics (16). Clinical information is often not coded or structured but recorded in natural language text that may not be readily accessible by informaticians. NLP may be able to overcome this problem by extracting text-based information and structuring it into useful data (17).
Radiology was the leading field of clinical medicine represented in the citation classics, contributing just over one fifth of the top 100 articles. While medical informatics had the highest number of articles, the average number of annual citations per article in radiology was more than double that of medical informatics, demonstrating the substantial interest in the use of AI in radiology. Radiology is a unique field of medicine in that it encompasses many of the common applications of AI. AI techniques have been trialled across various aspects of radiology, from helping clinicians determine the most appropriate imaging procedure, to image interpretation and computer-assisted diagnosis, and finally results reporting and the extraction of information from radiologist reports (18).
It is worth noting that there is an overlap in the approaches used in applying AI to radiological and non-radiological medical images. Non-radiological medical image analysis was the fourth leading field in our study. This category refers to histopathological slides and clinical photographs, such as endoscopic images or images of dermatological lesions (19). This category was separated from radiology in order to distinguish solely radiological studies from those involving wider medical image analysis. As with radiology, the application of AI in medical image analysis relies on the ability to collect and use large datasets in order to train AI systems in pattern recognition. It is likely that both fields will continue to be a major focus of medical AI in the future (19).
The majority of the clinical studies in the top 100 list were oncological papers. This is likely due to a number of factors. First, cancer is among the leading causes of mortality in developed countries (20), driving scientific interest in innovation in this field. Second, AI has the potential to improve outcomes in oncology because of the variety of cancer types and presentations, and the risk of patients being asymptomatic until late and severe stages of disease. Lastly, oncology relies on a range of data-rich modalities such as genomics and metabolomics, which enable the generation of large clinical datasets useful in the building and validation of AI models (21). In contrast, AI may be harder to implement in fields with fewer objective investigations and data available, such as psychiatry. Interestingly, cardiovascular disease did not feature in the top 100 list despite being the leading cause of mortality globally (20). This highlights a potential mismatch between disease burden and AI research efforts.
There are several limitations of our study that should be considered. A source of bias in the use of citation analysis is that older articles are more likely to be cited, independent of article quality. Further, total citation count does not provide information about the temporal profile of citations for each paper. We addressed this by including an average annual citation number as another indicator of article impact and contemporary influence (13). The use of journal IFs from 2017 only does not account for changes in journal IFs over time or for the journal IF at the time of article publication. Another limitation is that the search terms used in our study may have excluded some relevant or highly cited articles (12). We attempted to mitigate this risk by including both general search terms, such as “artificial intelligence” or “machine learning”, and detailed search terms for specific AI techniques, such as “random forest” or “artificial neural network”. It should be noted that applying these search terms only to titles, abstracts and MeSH keywords meant that some articles may not have employed the mentioned AI technique, but only discussed it. However, this still highlights the AI techniques most frequently discussed across the highly cited literature in the field of medical AI.
This study provides a comprehensive overview of the top 100 most cited articles relating to the use of AI in medicine over the past 70 years. It highlights that the current citation classics are largely in the non-clinical, experimental phase and have yet to progress to the clinical, integration phase of medical AI. While medical informatics and radiology were heavily featured in the citation classics, oncology was the leading field with clinical integration of AI. There is an apparent mismatch between disease burden and AI research efforts, with a lack of representation of cardiovascular medicine in the top 100 list despite cardiovascular disease being the leading cause of mortality globally. These results offer important insights into current research trends in medical AI and could help direct future research in this highly active and exciting field.
Conflicts of Interest: The authors have no conflicts of interest to declare.
Ethical Statement: The authors are accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.
1. Hamet P, Tremblay J. Artificial intelligence in medicine. Metabolism 2017;69S:S36-S40.
2. Ramesh A, Kambhampati C, Monson JR, et al. Artificial intelligence in medicine. Ann R Coll Surg Engl 2004;86:334.
3. Schwartz WB, Patil RS, Szolovits P. Artificial intelligence in medicine. N Engl J Med 1987;316:685-8.
4. Szolovits P, Patil RS, Schwartz WB. Artificial intelligence in medical diagnosis. Ann Intern Med 1988;108:80-7.
5. Choudhri AF, Siddiqui A, Khan NR, et al. Understanding bibliometric parameters and analysis. Radiographics 2015;35:736-46.
6. Abramo G, D’Angelo CA. Evaluating research: from informed peer review to bibliometrics. Scientometrics 2011;87:499-514.
7. Goodall A. The place of citations in today's academy. International Higher Education 2015. doi: https://doi.org/
8. Godin B. On the origins of bibliometrics. Scientometrics 2006;68:109-33.
9. Kelly J, Glynn R, O’Briain D, et al. The 100 classic papers of orthopaedic surgery: a bibliometric analysis. J Bone Joint Surg Br 2010;92:1338-43.
10. Yoon DY, Yun EJ, Ku YJ, et al. Citation classics in radiology journals: the 100 top-cited articles, 1945–2012. AJR Am J Roentgenol 2013;201:471-81.
11. Brandt JS, Downing AC, Howard DL, et al. Citation classics in obstetrics and gynecology: the 100 most frequently cited journal articles in the last 50 years. Am J Obstet Gynecol 2010;203:355.e1-7.
12. Mohammed MF, Chahal T, Gong B, et al. Trends in CT colonography: bibliometric analysis of the 100 most-cited articles. Br J Radiol 2017;90:20160755.
13. Maingard J, Phan K, Ren Y, et al. The 100 most cited articles in the endovascular management of intracranial aneurysms. J Neurointerv Surg 2018;10:859-68.
14. Kulkarni AV, Aziz B, Shams I, et al. Comparisons of Citations in Web of Science, Scopus, and Google Scholar for Articles Published in General Medical Journals. JAMA 2009;302:1092-6.
15. Falagas ME, Pitsouni EI, Malietzis GA, et al. Comparison of PubMed, Scopus, Web of Science, and Google Scholar: strengths and weaknesses. FASEB J 2008;22:338-42.
16. Nadkarni PM, Ohno-Machado L, Chapman WW. Natural language processing: an introduction. J Am Med Inform Assoc 2011;18:544-51.
17. Friedman C, Hripcsak G. Natural language processing and its future in medicine. Acad Med 1999;74:890-5.
18. Kahn CE Jr. Artificial intelligence in radiology: decision support systems. Radiographics 1994;14:849-61.
19. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal 2017;42:60-88.
20. World Health Organization. Global health estimates 2016: disease burden by cause, age, sex, by country and by region, 2000–2016. Geneva: World Health Organization, 2018.
21. Ramaswamy S. Translating cancer genomics into clinical oncology. N Engl J Med 2004;350:1814.
Cite this article as: Sreedharan S, Mian M, Robertson RA, Yang N. The top 100 most cited articles in medical artificial intelligence: a bibliometric analysis. J Med Artif Intell 2020;3:3.