What is it about?

We implemented a secure, on-premises AI assistant integrated with our hospital’s electronic health record system. Over five months, more than 1,000 clinicians used it primarily to summarize patient information, find specific data in notes, and draft documentation; these documentation-related tasks accounted for the majority of activity. Usage was sustained: more than half of the clinicians used the tool at least weekly, generating 14,910 multi-turn conversations. A notable challenge was monitoring real-world use, as voluntary feedback dropped after the pilot phase, making it harder to evaluate users' experience. Overall, the results show that large language model assistants can be integrated into clinical workflows and adopted at scale when strong governance and privacy safeguards are in place.

Why is it important?

This study provides the first large-scale evidence of how chatbots are actually used by clinicians in everyday clinical practice. Rather than relying on these tools for complex medical decision-making, clinicians primarily use them to reduce time spent searching through patient records and writing documentation. This challenges common assumptions in current AI development and public discourse, which often emphasize diagnostic or decision-support use cases. Our findings highlight that the greatest near-term value of clinical chatbots may lie in easing administrative burden, which is a major contributor to clinician workload and burnout. We also demonstrate that such systems can be developed and deployed entirely within a hospital’s secure infrastructure, showing that large-scale clinical adoption is possible without sending sensitive patient data outside institutional control.

Perspectives

Developing such a system and putting it into production was challenging, but it was a necessary step toward informing model developers and evaluation efforts about the challenges and real-world usage of these models. This study challenges the common belief that AI development should focus on complex diagnostic tasks and suggests that these models have more potential as data-processing assistants. As of today, few automated benchmarks or datasets exist to develop such models. I strongly believe that filling this gap with new datasets and evaluations would help ensure future models are aligned with real-world use cases.

Maxime Griot
Université catholique de Louvain

Read the Original

This page is a summary of: Implementation of large language models in electronic health records, PLOS Digital Health, December 2025, PLOS. DOI: 10.1371/journal.pdig.0001141.
