Health-care AI is here. We don’t know if it actually helps patients.

I don’t need to tell you that AI is everywhere.

Or that it is being used, increasingly, in hospitals. Doctors are using AI to help them with note-taking. AI-based tools are trawling through patient records, flagging people who may require certain support or treatments. They are also used to interpret medical exam results and X-rays.

A growing number of studies suggest that many of these tools can deliver accurate results. But there’s a bigger question here: Does using them actually translate into better health outcomes for patients?

We don’t yet have a good answer.

That’s what Jenna Wiens, a computer scientist at the University of Michigan, and Anna Goldenberg of the University of Toronto, argue in a paper published in the journal Nature Medicine this week.

Wiens tells me she has spent years investigating how AI might benefit health care. For the first decade of her career she tried to pitch the technology to clinicians. Over the last few years, she says, it’s as though “a switch flipped.” Health-care providers not only appear much more interested in the promise of these technologies, they have also begun rapidly deploying them.

The problem is that many providers aren’t rigorously assessing how well they actually work.

Take “ambient AI” tools, for example. Also known as AI scribes, they “listen” to conversations between doctors and patients, then transcribe and summarize them. Multiple tools are available, and they are already being widely adopted by health-care providers.

A few months ago, a staffer at a major New York medical center who develops AI tools for doctors told me that, anecdotally, medics are “overjoyed” by the technology—it allows them to focus all their attention on their patients during appointments, and it saves them from a lot of time-consuming paperwork. Early studies support these anecdotes and suggest that the tools can reduce clinician burnout.

That’s all well and good. But what about patient health outcomes? “[Researchers] have evaluated provider or clinician and patient satisfaction, but not really how these tools are affecting clinical decision-making,” says Wiens. “We just don’t know.”

The same holds true for other AI-based technologies used in health-care settings. Some are used to predict patients’ health trajectories, others to recommend treatments. They are designed to make health care more effective and efficient.

But even a tool that is “accurate” won’t necessarily improve health outcomes. AI might speed up the interpretation of a chest X-ray, for example. But how much will a doctor rely on its analysis? How will that tool affect the way a doctor interacts with patients or recommends treatment? And ultimately: What will this mean for those patients?

The answers to those questions might vary between hospitals or departments and could depend on clinical workflows, says Wiens. They might also differ between doctors at various stages of their careers.

Take the AI scribes, as another example. Some research on AI use in education suggests that such tools can impact the way people cognitively process information. Could they affect the way a doctor processes a patient’s information? Will the tools affect the way medical students think about patient data in a way that impacts care? These questions need to be explored, says Wiens. “We like things that save us time, but we have to think about the unintended consequences of this,” she says.

In a study published in January 2025, Paige Nong at the University of Minnesota and her colleagues found that around 65% of US hospitals used AI-assisted predictive tools. Only two-thirds of those hospitals evaluated the tools’ accuracy. Even fewer assessed them for bias.

The number of hospitals using these tools has probably increased since then, says Wiens. Those hospitals, or entities other than the companies developing the tools, need to evaluate how much they help in specific settings. There’s a possibility that they could leave patients worse off, although it’s more likely that AI tools just aren’t as beneficial as health-care providers might assume they are, says Wiens.

“I do believe in the potential of AI to really improve clinical care,” says Wiens, who stresses that she doesn’t want to stop the adoption of AI tools in health care. She just wants more information about how they are affecting people. “I have to believe that in the future it’s not all AI or no AI,” she says. “It’s somewhere in between.”

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter.

Psychedelics get a boost from the White House

President Trump recently signed an executive order that aims to increase access to psychedelic drug treatments. He was joined at the signing by podcaster Joe Rogan, who said he’d messaged the president about research on the psychedelic ibogaine.

In this week’s STATus Report, host Alex Hogan chats with STAT Washington correspondent Daniel Payne about what the executive order does and doesn’t do. Hogan also looks at why ibogaine, and psychedelic drugs more broadly, are increasingly being taken seriously for stubbornly hard-to-treat conditions like addiction, depression, and PTSD.

Growing use of guest editors has turned some journals into a ‘playground of bad science’

Should academic journals begin to second-guess guest editors?

That question gained new urgency last week when the British Medical Journal’s publishing group retracted nearly all of a guest-edited special edition of the Journal of Medical Genetics dedicated to cancer immunotherapies. In the retraction note, the journal writes that the retraction was, in part, because of “compromised peer review in almost all articles.” The notice garnered attention for its scope, but also because it exemplified larger concerns that research integrity advocates have with guest-edited editions, which are also called special issues in some journals.


Ensemble-based working memory updating and its computational rules.

Psychological Review, Vol 133(3), Apr 2026, 515-533; doi:10.1037/rev0000569

Manipulation plays a critical role in working memory, wherein understanding how items are represented during manipulation is a fundamental question. Previous studies on manipulation have primarily assumed independent representations by default (independent hypothesis). Here, we propose the ensemble hypothesis to challenge this conventional notion, suggesting that items are represented as ensembles undergoing updating during manipulation. To test these hypotheses, we focused on working memory updating in accordance with new information by conducting three delayed-estimation tasks under addition, removal, and replacement scenarios (Study 1). A critical manipulation involved systematically manipulating the mean orientation of all memory stimuli, either increasing (clockwise) or decreasing (counterclockwise) after the updating process. Following the independent hypothesis, memory errors would be similar under both conditions. Conversely, considering the biasing effect of the ensemble on individual representations, the ensemble hypothesis predicts that memories of individual items would be updated, aligning with the ensemble’s change direction. Namely, memory errors would be more positive in the increase-mean condition compared to the decrease-mean condition. Our results supported the ensemble hypothesis. Furthermore, to investigate the mechanisms underlying ensemble computations in updating scenarios, we conducted three ensemble tasks (Study 2) with similar designs to Study 1 and developed a computational model to quantify the contributions of each memory item. The results consistently demonstrated that addition involved complete updating, while removal led to incomplete updating. Across these three research parts, we propose that items are represented as dynamic ensembles during working memory updating processes. Furthermore, we elucidate the computational principles underlying ensembles throughout this process. 
(PsycInfo Database Record (c) 2026 APA, all rights reserved)

Adaptive computation as a new mechanism of dynamic human attention.

Psychological Review, Vol 133(3), Apr 2026, 534-559; doi:10.1037/rev0000572

A key role for attention is to continually focus visual processing to satisfy our goals. How does this work in computational terms? Here we introduce adaptive computation—a new computational mechanism of human attention that bridges the momentary application of perceptual computations with their impact on decision outcomes. Adaptive computation is a dynamic algorithm that rations perceptual computations across objects on-the-fly, enabled by a novel and general formulation of task relevance. We evaluate adaptive computation in a case study of multiple object tracking (MOT)—a paradigmatic example of selection as a dynamic process, where observers track a set of target objects moving amid visually identical distractors. Adaptive computation explains the attentional dynamics of object selection with unprecedented depth. It not only recapitulates several classic features of MOT (e.g., trial-level tracking accuracy and localization error of targets), but also captures properties that have not previously been measured or modeled—including both the subsecond patterns of attentional deployment between objects, and the resulting sense of subjective effort. Critically, this approach captures such data within a framework that is in-principle domain-general, and, unlike past models, without using any MOT-specific heuristic components. Beyond this case study, we also look to the future, discussing how adaptive computation may apply more generally, providing a new type of mechanistic model for the dynamic operation of many forms of visual attention. (PsycInfo Database Record (c) 2026 APA, all rights reserved)

Understanding Collaborative CT and MRI Utilization Through Network Analysis: Retrospective Study Using Administrative Claims Data

Background: Japan has one of the highest densities of computed tomography (CT) and magnetic resonance imaging (MRI) scanners globally, yet efficient resource allocation remains a challenge amid demographic shifts and regional health care disparities. Objective: This study aimed to develop an analytic framework using network analysis techniques to understand the collaborative use of CT and MRI devices across health care facilities in a Japanese prefecture. Methods: A retrospective observational study was conducted using outpatient receipt data from Japan’s National Health Insurance and the Late-Stage Elderly Medical System, covering fiscal years 2016 to 2019. Network analysis techniques were used to identify patterns of shared use among medical institutions. Network graphs with community detection were developed to visualize collaborative relationships, and density and reciprocity metrics were calculated to assess interinstitutional cooperation. Results: CT examinations increased from 287,782 (2016) to 307,029 (2019), while MRI examinations increased from 107,876 to 115,929 over the same period. Collaborative examinations also increased for both modalities. Network density remained relatively stable (CT: 3.10-3.50×10⁻³; MRI: 3.20-3.70×10⁻³), while reciprocity decreased (CT: 9.74×10⁻² to 7.79×10⁻²; MRI: 2.82×10⁻² to 1.56×10⁻²). Community detection analysis showed differences in the distribution of medical institutions across clusters over time. Conclusions: Network analysis revealed structural changes in collaborative CT and MRI use patterns, including declining reciprocity, which suggests a shift toward more unidirectional referral patterns.
This analytic framework provides a method for health care planners to assess interinstitutional collaboration and inform resource allocation strategies for shared diagnostic equipment.
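
For readers unfamiliar with the two network metrics named in the abstract, the sketch below shows one common way to compute them for a directed referral network. The abstract does not say which definitions or software the authors used, so the formulas here are assumptions: density as the number of directed links divided by the number of possible directed links, and reciprocity as the share of links that are matched by a link in the opposite direction. The institution names and referral pairs are made-up toy data, not values from the study.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (referring institution, imaging institution)


def build_edges(referrals: Iterable[Edge]) -> Set[Edge]:
    """Collapse repeated referrals into unique directed links, dropping self-loops."""
    return {(src, dst) for src, dst in referrals if src != dst}


def directed_density(n_nodes: int, edges: Set[Edge]) -> float:
    """Observed directed links divided by the n * (n - 1) possible links (assumed definition)."""
    if n_nodes < 2:
        return 0.0
    return len(edges) / (n_nodes * (n_nodes - 1))


def reciprocity(edges: Set[Edge]) -> float:
    """Share of directed links whose reverse link also exists (assumed definition)."""
    if not edges:
        return 0.0
    mutual = sum(1 for (u, v) in edges if (v, u) in edges)
    return mutual / len(edges)


# Hypothetical toy data: four referrals among five institutions.
referrals = [
    ("clinic_a", "hospital_x"),
    ("clinic_b", "hospital_x"),
    ("hospital_x", "clinic_a"),  # the only link that is returned in kind
    ("clinic_c", "hospital_y"),
]
edges = build_edges(referrals)
nodes = {name for edge in edges for name in edge}

density = directed_density(len(nodes), edges)
recip = reciprocity(edges)
print(f"density={density:.2f}, reciprocity={recip:.2f}")  # density=0.20, reciprocity=0.50
```

Under these definitions, a falling reciprocity with a roughly constant density means the number of links is holding steady while fewer of them run in both directions, which is the pattern the abstract describes as a shift toward unidirectional referrals.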

Building a Science-Driven Business: How National Institutes of Health Funding Enabled an Evidence-Based Approach to Maternal Mental Health Innovation

The digital mental health (DMH) industry has grown drastically over the last decade; yet, many DMH products have failed to demonstrate meaningful clinical outcomes, in large part due to lack of scientific evidence. This viewpoint paper highlights an example of how early-stage DMH companies can prioritize science as a strategic advantage. We discuss Moment for Parents, an artificial intelligence–driven maternal mental health app built entirely with support from the National Institutes of Health (NIH) Small Business Innovation Research (SBIR) program. We illustrate the advantages and challenges of building a science-backed product with federal funding. Benefits include credible evidence generation, independence in product development, and enhanced market differentiation. We also discuss the challenges of navigating the SBIR ecosystem, including grant writing and administrative demands, and aligning business objectives with federal research priorities. By showcasing both the promise and complexity of SBIR funding, this viewpoint paper offers actionable insights for founders and chief executive officers who aim to prioritize science in the DMH space.

Development of a Child Articulation Screening Test Within Digital Therapeutics: Delphi Study

Background: Speech sound disorders are common in children and are associated with an increased risk of academic reading difficulties. The COVID-19 pandemic further highlighted the need for remote and digitalized assessment tools. In South Korea, standardized instruments such as the Urimal Test of Articulation and Phonation and Assessment of Phonology and Articulation for children are widely used but have limitations, including reliance on face-to-face evaluation, and the absence of automated scoring. Objective: This study aimed to develop and establish the content validity of an articulation assessment tool that can overcome these limitations and be integrated into digital therapeutics (DTx). Methods: A 3-round modified Delphi survey was conducted between July and September 2025 with 92% (23/25) of the invited experts, including 52.2% (12/23) physiatrists and 47.8% (11/23) speech-language pathologists, with a mean professional experience of 10.69 (SD 5.09) years. All participants (23/23, 100%) completed all rounds. Panelists evaluated the appropriateness of word lists, phonological environments, and scoring criteria. Quantitative analyses, including calculations of content validity ratio (CVR), content validity index (CVI), and median and IQR, were performed. Consensus thresholds were set at a CVR of ≥0.39, a CVI of ≥0.78, a median of ≥3.5, and an IQR of ≤1.0. Items were retained only when all 4 criteria were satisfied. While formal qualitative analysis was not performed, the research team internally reviewed and synthesized core keywords and themes from the experts’ open-ended responses to guide the refinement of items. Results: These findings were summarized into four key areas: (1) modernization of word stimuli, (2) expansion of phonological coverage, (3) refinement of scoring criteria to reduce ambiguity, and (4) enhancement of result interpretability through visualization. 
In round 2, a revised 35-word list was evaluated across 25 items, of which 20 (80%) met all consensus criteria. In total, 20% (5/25) of the items failed to meet at least one threshold, including phonological environment adequacy (CVR=0.48; CVI=0.74), scoring redundancy (CVR=0.13; CVI=0.57), usefulness of proportion of whole-word correctness or percentage of word proximity (CVR=0.39; CVI=0.70), contribution of mean phonological length (CVR=0.22; CVI=0.61), and usefulness of feature-based indexes (CVR=0.30; CVI=0.65; IQR 2). Items that reached consensus showed CVR values of 0.57 to 0.91, CVI values of 0.78 to 0.96, a median score of 4, and IQR values of 0 to 1. In round 3, all remaining items achieved consensus. Conclusions: This Delphi study developed a novel articulation assessment tool with robust content validity. This tool includes updated word stimuli, diverse analysis indexes, and visualization features, thereby enhancing its clinical utility and suitability for integration into artificial intelligence–based DTx. By standardizing and digitalizing articulation assessments, this tool has the potential to support personalized and accessible interventions for children with speech sound disorders.
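
The four-part retention rule described in the methods can be sketched in a few lines. The abstract does not give the formulas, so this example assumes the widely used Lawshe form of the content validity ratio, CVR = (n_agree - N/2) / (N/2), and takes the item-level CVI as the proportion of panelists who agree. The rating scale, the `agree_min` cutoff that counts as agreement, and the quartile method used for the IQR are all hypothetical choices made for illustration; the thresholds are the ones stated in the abstract.

```python
from statistics import median, quantiles
from typing import Dict, List, Union


def content_validity_ratio(n_agree: int, n_panel: int) -> float:
    """Lawshe-style CVR: (n_agree - N/2) / (N/2). Assumed formula."""
    half = n_panel / 2
    return (n_agree - half) / half


def content_validity_index(n_agree: int, n_panel: int) -> float:
    """Item-level CVI as the proportion of panelists who agree. Assumed formula."""
    return n_agree / n_panel


def evaluate_item(ratings: List[int], agree_min: int = 4) -> Dict[str, Union[float, bool]]:
    """Apply the four consensus thresholds quoted in the abstract to one item.

    agree_min is a hypothetical cutoff: ratings at or above it count as agreement.
    """
    n_panel = len(ratings)
    n_agree = sum(1 for r in ratings if r >= agree_min)
    # Quartiles via the "inclusive" method; the study's IQR method is not stated.
    q1, _, q3 = quantiles(ratings, n=4, method="inclusive")
    cvr = content_validity_ratio(n_agree, n_panel)
    cvi = content_validity_index(n_agree, n_panel)
    med = median(ratings)
    iqr = q3 - q1
    retained = cvr >= 0.39 and cvi >= 0.78 and med >= 3.5 and iqr <= 1.0
    return {"cvr": cvr, "cvi": cvi, "median": med, "iqr": iqr, "retained": retained}


# Hypothetical ratings from a 23-member panel.
borderline = evaluate_item([4] * 17 + [3] * 6)  # 17 of 23 agree
strong = evaluate_item([4] * 20 + [3] * 3)      # 20 of 23 agree

print(round(borderline["cvr"], 2), round(borderline["cvi"], 2), borderline["retained"])
print(round(strong["cvr"], 2), round(strong["cvi"], 2), strong["retained"])
```

With 17 of 23 panelists agreeing, these assumed formulas give a CVR of about 0.48 and a CVI of about 0.74, the same pair reported for the phonological environment adequacy item. That item clears the CVR threshold but not the CVI threshold, which illustrates why the rule requires all four criteria to hold at once.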

Exploring the Cultural Adaptation of an Ongoing Evidence-Based Intervention for Chinese and Korean American Dementia Caregivers: Descriptive Study

Background: The aging and caregiving population is becoming increasingly diverse in the United States, leading to a growing need for culturally adapted interventions to address the unique needs of underrepresented groups, such as Asian Americans. However, interventions targeting Asian Americans and exploring cultural adaptation strategies remain limited in dementia caregiving research. Objective: This study aimed to describe the cultural adaptation process of an evidence-based intervention for Chinese and Korean American dementia caregivers, called the New York University Caregiver Intervention–Enhanced Support. Methods: We conducted a deductive content analysis and categorized our adaptation strategies into 5 elements: content, context, relationship fidelity and core elements, engagement, and cultural competence. Timing and types of responses to each adaptation strategy were also observed. Two authors conducted the initial analysis, and additional team members finalized the synthesis through discussion. The Template for Intervention Description and Replication (TIDieR) checklist was used to guide the methodological rigor. Results: Twenty-four major adaptations were identified and categorized. For content, we translated materials, used culturally relevant terms, incorporated ethnic-specific surveys and resources, created social media support groups on platforms widely used by the targeted population, and extended the time allocated to complete the 6 counseling sessions. Context adaptation included expanding the range of individuals eligible for family counseling sessions to include fictive kin, using online and social media apps for communication, cultural matching and training of staff, and partnerships with relevant community organizations. 
Relationship fidelity and core elements involved consulting with community experts, conducting focus group interviews with caregivers, having regular meetings with the developer of the original intervention and an experienced New York University Caregiver Intervention–Enhanced Support clinician as well as experts in Chinese and Korean culture, and continuing regular counseling supervision. To enhance engagement, we provided clear explanations of the study procedure, which emphasized the benefits in participants’ native languages and matched participants with social workers who shared the same cultural backgrounds. We also used a step-by-step contact approach and prolonged communication, explained staff roles to build rapport, and offered participant compensation. Finally, cultural competence was reflected in tailoring counseling techniques with respect for cultural beliefs, the use of euphemistic language for taboo subjects, and culturally appropriate refreshments to show respect and build interpersonal relationships. Conclusions: We systematically adjusted a counseling-based intervention, an approach less familiar among Asian Americans, to fit the cultural characteristics of the target population. A contribution of this study is using an integrated, theory-driven approach that combines 2 cultural adaptation frameworks while also capturing real-time adaptations informed by external feedback and self-reflection. This work provides a practical model for adapting evidence-based interventions to serve Chinese and Korean American dementia caregivers and may inform future adaptations for other East Asian populations. Trial Registration: ClinicalTrial.gov NCT05461495; https://clinicaltrials.gov/study/NCT05461495

Expected Competencies and Personal Attributes of Digital Health Navigators to Support Digital Mental Health Care: Focus Group and Interview Study With Patients and Health Care Professionals

Background: Digital mental health apps (DMHAs), and in particular digital therapeutics (DTx), offer promising opportunities to support mental health care. However, their effective use in outpatient settings in Germany remains limited. To overcome this gap, the role of digital health navigators (DHNs) has been introduced. DHNs are trained individuals who support patients and health care professionals in selecting, using, and integrating DMHAs into care. Despite increasing interest in this role, there is limited evidence on the competencies, knowledge, and personal attributes required for DHNs to work effectively in mental health settings. Objective: The study aims to explore the expected competencies, knowledge areas, and personal attributes that DHNs need to effectively support the implementation and use of DTx in outpatient mental health care. Methods: As part of the prestudy of the Digital Navigators for Acceptance and Competence Development with Mental Health Apps (DigiNavi) study, a qualitative study was conducted involving 35 participants (7 general practitioners, 8 patients in general practice, 11 outpatient psychiatrists/psychologists, and 9 patients in psychiatric outpatient clinics) from different general practices and psychiatric outpatient clinics in Germany. A total of 17 semistructured interviews and 4 focus groups were conducted to explore expectations of DHNs. Data were analyzed using qualitative content analysis. Results: Participants emphasized that DHNs should combine strong interpersonal skills (empathy, patience, and sensitive communication) with technical and basic clinical competencies. Most favored DHNs as integrated clinical team members (eg, medical assistants), citing their existing patient relationships, but noted time and training constraints. Key expectations included the ability to support patients with DTx use, adapt communication to individual needs, and convey data privacy information clearly. 
Foundational knowledge of mental health conditions and sensitivity to crises were considered important for identifying warning signs and escalating concerns. While DHNs were seen as essential intermediaries between patients, health care professionals, and DTx, participants highlighted the necessity for clearly defined roles, structured training, and realistic expectations to prevent role overload and enable sustainable implementation in outpatient mental health care. Conclusions: DHNs require a specialized skill set that bridges clinical understanding, digital expertise, and interpersonal competence. Our results lay the groundwork for developing training curricula and implementation strategies that align with real-world expectations for the DHN role. Defining these core competencies is essential for supporting the sustainable and effective integration of DMHAs into mental health care. Trial Registration: German Clinical Trials Register DRKS00034327; https://drks.de/search/en/trial/DRKS00034327 and ClinicalTrials.gov NCT06575582; https://clinicaltrials.gov/study/NCT06575582 International Registered Report Identifier (IRRID): RR2-10.2196/67655