Generative AI Practitioners in Healthcare Prioritize Industry-Specific and Task-Specific Models as Budgets Surge 300%, New John Snow Labs Survey Finds


John Snow Labs, the AI for healthcare company, announced the findings of the inaugural Generative AI in Healthcare Survey. Conducted by Gradient Flow, the research explores the trends, tools, and behaviors around generative artificial intelligence (GenAI) use among healthcare and life sciences practitioners. Findings showed a significant increase in GenAI budgets across the board, with one-fifth of all technical leaders reporting budget growth of more than 300%, reflecting strong advocacy and investment.

The survey highlights several priorities unique to healthcare practitioners. Chief among them is a strong preference for healthcare-specific models when evaluating large language models (LLMs). Having models tuned specifically for healthcare (4.03 mean response) ranked higher in importance than reproducibility (3.91), legal and reputational risk (3.89), explainability and transparency (3.83), and cost (3.8). Accuracy is the top priority when evaluating LLMs, and lack of accuracy is considered the top risk in GenAI projects.

Another key finding is a strong preference for small, task-specific language models. These targeted models are optimized for specific use cases, unlike general-purpose LLMs. Survey results reflected this, with 36% of respondents using healthcare-specific task-specific language models. Open-source LLMs (24%) and open-source task-specific models (21%) follow behind. Proprietary LLMs are less commonly used, whether through a SaaS API (18%) or on-premise (7%).

On how models are tested and improved, the survey highlights one practice that addresses both the accuracy and compliance concerns of the healthcare industry: human-in-the-loop workflows. This was by far the most common step taken to test and improve LLMs (55%), followed by supervised fine-tuning (32%) and interpretability tools and techniques (25%). A human-in-the-loop approach enables data scientists and domain experts to collaborate on training, testing, and fine-tuning models to their exact needs, improving them over time with feedback.

“Healthcare practitioners are already investing heavily in GenAI, but while budgets may not be a top concern, it’s clear that accuracy, privacy, and healthcare domain expertise are all critical,” said David Talby, CTO, John Snow Labs. “The survey results shine the light on the importance of healthcare-specific, task-specific language models, along with human-in-the-loop workflows as important techniques to enable the accurate, compliant, and responsible use of the technology.”

Finally, the survey explores how much work remains in applying responsible AI principles to healthcare GenAI projects. Lack of accuracy (3.78) and legal and reputational risk (3.62) were reported as the most concerning roadblocks. Worse, a majority of GenAI projects have not yet been tested for any of the LLM requirements cited. Among those that have, fairness (32%), explainability (27%), private data leakage (27%), hallucinations (26%), and bias (26%) ranked as the most commonly tested. In other words, no aspect of responsible AI is being tested by more than a third of organizations.

An upcoming webinar at 2pm ET on April 30, with Drs. Ben Lorica of Gradient Flow and David Talby of John Snow Labs, will provide additional details and analysis of the survey results and the current state of GenAI in healthcare.