Hypercube — Tackling AI Hallucinations in Medical Summaries

The Hypercube system combines medical knowledge bases with symbolic reasoning and NLP to detect errors.

Mendel AI and the University of Massachusetts Amherst (UMass Amherst) have published key findings on addressing the critical issue of hallucinations in AI-generated medical summaries using a pioneering artificial intelligence (AI) framework called “Hypercube.” Launched in 2023, Hypercube arrived with the promise of improving patient engagement and screening for clinical trials.

The Mendel and UMass Amherst study, titled “Faithfulness Hallucination Detection in Healthcare AI,” tackles the challenge posed by large language models (LLMs) like GPT-4o and Llama-3, which, despite their impressive capabilities, are prone to generating inaccurate or misleading information — a phenomenon known as AI hallucinations. The study evaluated how often these LLMs deviated from the context of their instructions, which at times led to outputs that contradicted the provided data.


Key findings in the study showed that GPT-4o, while generating longer summaries, was more prone to hallucinations due to its complex reasoning, whereas Llama-3 produced fewer hallucinations but with lower-quality summaries. The research underscored the necessity of accuracy in AI models to prevent misdiagnoses and inappropriate treatments.

To combat these challenges, the research team developed the Hypercube system, leveraging medical knowledge bases, symbolic reasoning and natural language processing (NLP) to detect hallucinations. Hypercube offers a cost-effective solution by providing a comprehensive representation of patient documents, facilitating initial detection before expert review.
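
To make that pipeline concrete, below is a minimal, purely illustrative sketch of the general pattern described above: build a structured view of the source patient record, score each summary claim against it, and flag weakly supported claims for expert review. The function names (extract_facts, flag_unsupported_claims), the word-overlap heuristic and the 0.6 threshold are assumptions for illustration only; Mendel's actual system relies on medical knowledge bases and symbolic reasoning rather than raw token overlap.

```python
# Illustrative sketch only -- not Mendel's actual Hypercube implementation.
# Idea: represent the source patient document, then flag summary claims that
# lack support in it, so only flagged claims need expert review.
import re


def extract_facts(document: str) -> set[str]:
    """Crude stand-in for a knowledge-base-backed document representation:
    collect lower-cased content words from the source record."""
    return set(re.findall(r"[a-z][a-z\-]+", document.lower()))


def flag_unsupported_claims(summary: str, document: str, min_support: float = 0.6):
    """Split the summary into sentence-level claims and flag those whose
    words are mostly absent from the source document."""
    facts = extract_facts(document)
    flagged = []
    for claim in re.split(r"(?<=[.!?])\s+", summary.strip()):
        words = set(re.findall(r"[a-z][a-z\-]+", claim.lower()))
        if not words:
            continue
        support = len(words & facts) / len(words)
        if support < min_support:
            flagged.append((claim, round(support, 2)))
    return flagged  # candidates for human expert review


if __name__ == "__main__":
    record = ("Patient reports a persistent dry cough for two weeks. "
              "No fever. Prescribed an inhaler.")
    summary = ("Patient has a dry cough for two weeks. "
               "Patient was started on antibiotics for pneumonia.")
    for claim, score in flag_unsupported_claims(summary, record):
        print(f"LOW SUPPORT ({score}): {claim}")
```

In this toy example, the second summary sentence is flagged because the source record never mentions antibiotics or pneumonia, mirroring the kind of unsupported claim a faithfulness-detection pass is meant to surface before a clinician reviews the summary.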

A framework that systematically detects and categorizes AI hallucinations significantly improves the trustworthiness of AI in clinical settings. The Mendel AI and UMass collaboration is a step forward in ensuring the reliability and safety of AI applications in healthcare. Looking ahead, the team plans to refine their detection framework and further develop automated systems like Hypercube.

Stanford University’s Institute for Human-Centered Artificial Intelligence (HAI) recently published an article about a Nature study in which AI-generated summaries often outperformed those written by human experts. The study highlighted the potential of AI to alleviate the documentation burden on doctors, allowing them to spend more time with patients and reducing the risk of errors.

Late last year, researchers at Penn State College of Information Sciences and Technology published a paper on their own innovative framework to manage unfaithfulness. The Faithfulness for Medical Summarization (FaMeSumm) framework was used to fine-tune smaller mainstream language models. These fine-tuned models were found to outperform the larger GPT-3 when summarizing doctor-patient records in the study.
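
For readers curious what such fine-tuning looks like in practice, the sketch below shows only the generic starting point: fine-tuning a small, mainstream seq2seq model (here t5-small as a stand-in) on dialogue-summary pairs with the Hugging Face transformers and datasets libraries. The toy data and hyperparameters are illustrative assumptions; FaMeSumm's own faithfulness-oriented training objectives and medical-knowledge components are not reproduced here.

```python
# Hypothetical sketch: generic fine-tuning of a small seq2seq model for
# medical dialogue summarization. This is NOT the FaMeSumm training recipe;
# it omits the paper's faithfulness-oriented objectives.
from datasets import Dataset
from transformers import (AutoModelForSeq2SeqLM, AutoTokenizer,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "t5-small"  # stand-in for the "smaller mainstream" models in the study
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Toy placeholder pair; real work would use doctor-patient dialogue corpora.
data = Dataset.from_dict({
    "dialogue": ["Doctor: How is the cough? Patient: Worse at night, and I have a mild fever."],
    "summary": ["Patient reports a nighttime cough accompanied by a mild fever."],
})


def preprocess(batch):
    # Tokenize the source dialogue and the target summary.
    model_inputs = tokenizer(batch["dialogue"], truncation=True, max_length=512)
    labels = tokenizer(text_target=batch["summary"], truncation=True, max_length=128)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs


tokenized = data.map(preprocess, batched=True, remove_columns=data.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="med-summ-sketch",
    per_device_train_batch_size=2,
    num_train_epochs=1,
    predict_with_generate=True,
)
trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    tokenizer=tokenizer,
)
trainer.train()
```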

With generative AI touted to augment clinical productivity across clinical workflows, clinician-patient interactions, patient care and administrative efficiency, refining LLMs is becoming increasingly crucial.