Risk Management Tools & Resources


Let’s Chat: Examining Top Risks Associated With Generative Artificial Intelligence Chatbots

Laura M. Cascella, MA, CPHRM


Generative artificial intelligence (AI) chatbots have been making waves in various industries, particularly since the release of OpenAI’s GPT-4 in the spring of 2023. The hype and excitement around these powerful technologies are not without merit. Chatbots offer a compelling case for how AI can revolutionize our world and change how we approach everyday tasks and challenges.

However, with the advancement of large language models (LLMs) — i.e., the technology on which chatbots are built — comes a host of concerns. Perhaps in no other industry more than healthcare do these concerns require ongoing attention and vigilance. With patients’ lives and well-being at stake, understanding the potential benefits and limitations of AI chatbots is critical. Unfortunately, because the technology is in its infancy, many gray areas exist, best practices are not established, and questions about safety and liability remain unanswered.

This article focuses on two broad areas of risk — content accuracy/reliability and privacy/security concerns — that have been the focus of recent research and commentary related to AI chatbots.

Content Accuracy/Reliability

The accuracy and reliability of AI chatbots continue to represent a significant concern when discussing the role of this technology in healthcare — particularly in patient care. Experts suggest that much may depend on how a chatbot user poses a question or prompt, whether the request has a factual answer, and how the AI system is trained.

A 2023 article from The New England Journal of Medicine explains that current chatbots are “sensitive to the form and choice of wording.”1 Thus, depending on how a prompt is phrased, the results might vary. Similarly, if questions or prompts are posed that have no definitive answers, the technology’s responses might fluctuate.

Although chatbots tend to be very accurate when responding to fact-based queries, they are not foolproof, and errors can occur. These errors — known as hallucinations — might not always be obvious, but they could have serious consequences in relation to patient care and treatment. Further, hallucinations “are often stated by the chatbot in such a convincing manner that the person making the query may be convinced of its veracity.”2

These issues call attention to the need for healthcare providers to carefully consider the quality of their prompts and to review and verify the accuracy of chatbot responses. Unfortunately, the need to do so offsets some of the time savings and convenience that make this technology so attractive. Further, it might be tempting to skip these steps in the busy healthcare environment or when a healthcare professional finds that the technology generally is reliable (leading to automation bias).

Even when healthcare professionals do attempt to determine the accuracy or reliability of information, they may run into issues understanding how a chatbot generates its responses. Many AI technologies have opaque algorithms, making it difficult or impossible to determine how they produce results. An article in JAMA Health Forum also notes that most LLMs approach training data in a nondiscerning fashion. These systems “are trained on indiscriminate assemblages of web text with little regard to how sources vary in reliability. They treat articles published in The New England Journal of Medicine and Reddit discussions as equally authoritative.”3

Further complicating matters is the issue of bias in training data, which is a top concern with AI. Bias can occur for various reasons; for example, the data used to train AI applications — as well as the rules used to build algorithms — might be biased. Additionally, bias might occur because of a mismatch between the data or environment in which an AI program or tool was trained and the real-world settings in which it is applied.

Without a clear understanding of these issues and how chatbots work, when they are most reliable, and when they are prone to hallucinations, healthcare providers might struggle with how to incorporate AI into clinical practice and how to balance information from chatbots with their own clinical judgment.


Privacy/Security Concerns
In addition to concerns about chatbot accuracy, another top concern with this technology is inappropriate disclosure of information. Like various other technology predecessors, AI is vulnerable to privacy and security risks. In healthcare, these risks can cascade into federal and state legal violations, erosion of patient trust, and financial and reputational damage for healthcare organizations.

Chatbots represent a unique privacy risk for various reasons. First, healthcare providers can interact directly with chatbots to input information — possibly proprietary or protected health information (PHI) — and generate quick responses. Unlike other types of AI applications in healthcare, chatbots are easily accessible online and don’t require organizations or providers to engage a vendor to gain access to the technology. As a result, no business associate relationship is formed and no business associate agreement (BAA) is in place to establish or enforce privacy and security provisions.4

Second, unlike traditional search engines that generate a list of resources in response to user prompts, chatbots have the ability to engage in more human-like discourse. As a result, they can prompt users to enter additional information and “engage users in a dialogue, earning their trust and nudging them to reveal more data.”5 This “luring behavior” can increase the risk that healthcare professionals might inadvertently disclose PHI or other sensitive data.

Third, when using chatbots, providers might not be aware that they are transmitting data to the company that owns the technology. Unless that company has signed a BAA, disclosing any information that is considered PHI is a HIPAA violation.6 Thus, much like the effort providers will need to spend verifying the accuracy of chatbot responses, they also will need to make certain to de-identify PHI. “Patient names, including nicknames, references to geographic information smaller than a state, and admission and discharge dates, to cite a few examples, must be scrubbed before transcripts can be fed into the chat tool.”7
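To illustrate the kind of scrubbing described above, the sketch below replaces a known patient name and a few pattern-matched identifiers with placeholders before text leaves the organization. This is a minimal, hypothetical example: the patterns, placeholder labels, and `scrub` function are illustrative assumptions, not a validated rule set. Real de-identification under HIPAA covers many more identifier categories and typically requires dedicated tooling and expert review.

```python
import re

# Illustrative identifier patterns, keyed by placeholder label.
# These regexes are assumptions for demonstration only and are
# far from a complete HIPAA de-identification rule set.
PATTERNS = {
    "[DATE]": re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),   # e.g., 03/14/2024
    "[ZIP]": re.compile(r"\b\d{5}(?:-\d{4})?\b"),           # U.S. ZIP codes
    "[PHONE]": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),  # phone numbers
}

def scrub(text: str, known_names: list[str]) -> str:
    """Replace known patient names and pattern-matched identifiers
    with placeholders before text is sent to an external chat tool."""
    # Remove explicitly known names first (including any nicknames supplied).
    for name in known_names:
        text = re.sub(re.escape(name), "[NAME]", text, flags=re.IGNORECASE)
    # Then apply the generic identifier patterns.
    for placeholder, pattern in PATTERNS.items():
        text = pattern.sub(placeholder, text)
    return text

note = "Jane Doe, admitted 03/14/2024, lives in ZIP 46240; call 555-123-4567."
print(scrub(note, known_names=["Jane Doe"]))
# [NAME], admitted [DATE], lives in ZIP [ZIP]; call [PHONE].
```

Even with such tooling in place, the limitation discussed next still applies: pattern-based scrubbing cannot guarantee that a record is truly de-identified.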

Even when providers are confident that information has been de-identified, they must consider that today’s evolving AI technology can re-identify information in ways not previously possible. An article in the AMA Journal of Ethics notes that “Existing practices of notifying patients and obtaining consent for data use are not adequate, nor are strategies for de-identifying data effective in the context of large, complex data sets when machine learning algorithms can re-identify a record from as few as 3 data points.”8

Finally, concerns abound about the use of chatbots for nefarious purposes, such as aiding in cyberattacks. For example, unlike less sophisticated technology, AI can produce higher quality and more nuanced phishing emails that better appeal to a range of human emotions. Detecting these fraudulent emails is more difficult, which increases the risk of security breaches. Chatbots also can create pathways for people with limited technology know-how to develop malware, as well as opportunities for tech-savvy cybercriminals to further enhance their strategies.9

Both now and in the future, protecting patient privacy and securing electronic data will remain top priorities and challenges in healthcare. Because chatbots and other AI technologies are evolving rapidly, healthcare organizations and providers should frequently assess risks, maintain awareness of ongoing threats, and remain attentive to emerging best practices.

Learn More

For more information about risks associated with AI in healthcare, see the following MedPro resources:


1 Lee, P., Bubeck, S., & Petro, J. (2023). Benefits, limits, and risks of GPT-4 as an AI chatbot for medicine. The New England Journal of Medicine, 388(13), 1233–1239. doi: https://doi.org/10.1056/NEJMsr2214184

2 Ibid.

3 Mello, M. M., & Guha, N. (2023). ChatGPT and physicians' malpractice risk. JAMA Health Forum, 4(5), e231938. doi: https://doi.org/10.1001/jamahealthforum.2023.1938

4 Kanter, G. P., & Packel, E. A. (2023). Health care privacy risks of AI chatbots. JAMA, 330(4), 311–312. doi: https://doi.org/10.1001/jama.2023.9618

5 Marks, M., & Haupt, C. E. (2023). AI chatbots, health privacy, and challenges to HIPAA compliance. JAMA, 330(4), 309–310. doi: https://doi.org/10.1001/jama.2023.9458

6 Kanter & Packel, Health care privacy risks of AI chatbots.

7 Ibid.

8 Crigger, E., & Khoury, C. (2019, February). Making policy on augmented intelligence in health care. AMA Journal of Ethics, 21(2), E188–191. doi: https://doi.org/10.1001/amajethics.2019.188

9 U.S. Department of Health and Human Services Office of Information Security. (2023, June 13). Artificial intelligence, cybersecurity and the health sector. Retrieved from www.hhs.gov/sites/default/files/ai-cybersecurity-health-sector-tlpclear.pdf
