A personalized DNA vaccine targeting up to 40 patient-specific neoantigens generated robust immune responses and encouraging survival outcomes in patients with MGMT-unmethylated glioblastoma in a small Phase I clinical trial, according to new findings published in Nature Cancer.
The study evaluated GNOS-PV01, a personalized therapeutic cancer vaccine developed by Geneos Therapeutics in collaboration with researchers at Washington University School of Medicine in St. Louis. Investigators reported that the vaccine was safe, feasible to administer, and capable of stimulating circulating and tumor-infiltrating T-cell responses in a cancer type long considered highly resistant to immunotherapy.
Glioblastoma remains one of the deadliest cancers, with median survival typically ranging from 12 to 18 months. Patients with MGMT-unmethylated disease face especially poor outcomes because they derive limited benefit from temozolomide, a standard chemotherapy agent commonly used after surgery and radiation.
“Nothing really works in this MGMT-negative or unmethylated glioblastoma patient population,” said Niranjan Sardesai, Geneos’ CEO. “Median survival is around a year, and effective treatments are very much needed.”
The open-label, single-arm GT-20 study enrolled nine patients with newly diagnosed MGMT-unmethylated glioblastoma following surgical resection and radiation therapy. Each patient received a fully individualized vaccine constructed from neoantigens identified through sequencing of their own tumors. Vaccines encoded between 17 and 40 neoantigens per patient.
According to the paper, the vaccine caused no serious adverse events, unexpected toxicities, or dose-limiting toxicities. Eight of the nine evaluable patients developed measurable immune responses. The lone nonresponder had been treated with dexamethasone, an immunosuppressive corticosteroid frequently used in glioblastoma management.
Sardesai emphasized that the immunogenicity findings were particularly notable because glioblastoma is considered an “immune-excluded” tumor with low tumor mutational burden, characteristics that have historically limited the effectiveness of checkpoint inhibitors such as anti–PD-1 therapies.
“Checkpoint-based immunotherapy has not worked in GBM,” he said. “This is a cold tumor.”
The investigators also observed signals of clinical activity. Six-month progression-free survival and 12-month overall survival were each achieved in 66.7% of patients. Median progression-free survival was 8.5 months, while median overall survival reached 16.3 months. Survival at 24 months was 33%, including one patient who remains alive four years after surgery.
“What was very striking was that three of nine patients, or one-third of the patients, had lived more than two years,” Sardesai said. “The two-year survival rate is about 10% to 15%” with standard treatment approaches in this population.
The study also identified an association between stronger CD8-positive T-cell responses and longer survival. Investigators reported that patients generating higher levels of vaccine-induced cytotoxic T cells tended to experience improved overall survival.
One of the most compelling findings involved a long-term survivor who has remained progression-free for nearly five years. Researchers analyzed a brain biopsy obtained approximately three years after treatment initiation and identified vaccine-induced T-cell clones within the tumor tissue that matched T-cell populations detected in the patient’s blood.
“For the first time, we are able to match vaccine-driven immune responses,” Sardesai said. “We are able to see T-cell clones in the blood, and these T-cell clones have infiltrated and are found in her brain.”
The vaccine platform differs from earlier glioblastoma vaccine strategies in several ways. Rather than targeting a small number of antigens, the DNA-based approach allows investigators to incorporate a much larger neoantigen repertoire into each personalized product.
“These patients received as many as 40 different antigens that were identified from their own tumor,” Sardesai said. “Prior treatments had typically been looking at 20 or fewer in GBM.”
He argued that broader antigen targeting may be especially important in glioblastoma because of the disease’s pronounced intratumoral heterogeneity.
“When it comes to targeting cancer, more is better,” he said. “You want to take more shots on goal.”
Another distinguishing feature of the platform is its apparent ability to stimulate CD8-positive killer T cells, which are considered critical for direct tumor cell elimination. Sardesai noted that generating robust CD8 responses has historically been difficult for many cancer vaccine technologies.
Importantly, each vaccine is uniquely manufactured for a single patient.
“These are exquisitely personalized vaccines,” Sardesai said. “Every patient gets their own vaccine.”
The authors cautioned that the findings remain preliminary because of the trial’s small sample size and lack of a control arm. Still, they believe the results justify larger randomized studies.
“We are very encouraged by the data,” Sardesai said. “But this is still only nine patients. We have to replicate these findings in larger, well-controlled studies.”
The company has previously reported results using the same platform in hepatocellular carcinoma, suggesting the strategy could potentially extend across multiple tumor types characterized by immune exclusion and low tumor mutational burden.
“All cancers carry neoantigens,” Sardesai said. “These personalized cancer vaccines provide a very convenient way” to target those tumor-specific alterations across different cancers.
Transcriptomic analysis of more than 100 metastatic renal cell carcinomas (mRCC) has revealed key differences in aberrant alternative gene splicing events between treatment responders and nonresponders that could aid prognostication in the future.
“In the near term, these findings could help guide treatment selection by identifying patients more likely to respond to targeted therapies or standard immuno-oncology regimens,” said Patrick Pirrotte, PhD, director of the Integrated Mass Spectrometry Shared Resource at TGen and City of Hope, associate professor in TGen’s Early Detection and Prevention Division, and senior author of the paper.
“Longer term, splicing-derived antigens could provide a foundation for more personalized adoptive immunotherapy strategies tailored to the molecular features of an individual patient’s tumor,” he told Inside Precision Medicine.
Pirrotte explained that “alternative splicing [AS] is a fundamental transcriptional mechanism that expands proteomic diversity in normal cells, but aberrant splicing is increasingly recognized as a feature of cancer that can contribute to tumorigenesis, progression, and metastasis.”
His group and collaborators have previously demonstrated that aberrant splicing could act as a broadly relevant biomarker across different malignancies, including ovarian cancer and sarcomatoid renal cell carcinoma, but its diagnostic and predictive potential in mRCC remained largely unexplored.
To address this, Pirrotte and team conducted a retrospective analysis on tumor samples from 101 patients with mRCC who received immune checkpoint inhibitor (n=91) and/or targeted (n=77) therapies. Response rates to each of the therapies were 63% and 77%, respectively.
The researchers report in the Journal for ImmunoTherapy of Cancer that they identified 10 AS events that were specific to mRCC. Six of these were intron retention events and four were exon skipping events.
Differential AS analysis identified 461 splicing events that differed between responders and non-responders to immune checkpoint inhibitors and 253 events that differed between targeted therapy responders and non-responders. In both cases, more than 70% of novel AS events among responders involved intron retention.
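The kind of tally described above can be sketched in a few lines. This is a hypothetical illustration, not the study's actual pipeline: the event-type labels and the breakdown of the 461 responder-associated events are invented toy data chosen only so that intron retention exceeds the 70% threshold the paper reports.

```python
# Hypothetical sketch: classify differential alternative-splicing (AS) events
# by type and compute the intron-retention fraction. Event counts are toy data.
from collections import Counter

# 461 responder-associated AS events, each tagged with its type (illustrative)
responder_events = (
    ["intron_retention"] * 330
    + ["exon_skipping"] * 80
    + ["alt_5prime_site"] * 30
    + ["alt_3prime_site"] * 21
)

counts = Counter(responder_events)
ir_fraction = counts["intron_retention"] / len(responder_events)
print(f"{ir_fraction:.0%} of responder-associated events are intron retention")
```

In a real analysis the event labels would come from a splicing-quantification tool run on RNA-seq data, but the downstream burden calculation is this simple.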
“Intron retention was the predominant alternative splicing event observed in patients who responded well to therapy,” observed Pirrotte.
“Mechanistically, intron retention occurs when intronic sequences that are normally removed during RNA processing are retained in the mature transcript. This can generate novel amino acid sequences and, in some cases, tumor-associated antigens derived from aberrant splicing,” he explained. “A high intron-retention burden was associated with an immunogenic tumor microenvironment, marked by adaptive immune activation and enriched antigen processing. In simple terms, these cancer-specific splicing errors may help ‘flag’ tumor cells, making them more visible to the immune system.”
The team then investigated whether differentially spliced sequences shared between the immunotherapy and targeted therapy responder cohorts could potentially act as neoantigenic targets.
This revealed that novel peptide-generating AS events in the genes IFFO1 and ZNF692 were highly expressed among the responders. Both genes are known to play a role in tumorigenesis and metastasis in RCC and colorectal cancer. The researchers note that although the specific impact of AS events within these genes is unclear, the resulting neoantigens could play a role in future treatment approaches.
“It is becoming increasingly feasible to identify splicing-derived neoantigens that could be used in personalized immunotherapy approaches, including adoptive cell therapies such as CAR T-cell or tumor-infiltrating lymphocyte therapies,” said Pirrotte. “These strategies are designed to train or redirect a patient’s immune system to recognize tumor-specific antigen signatures. In this case, the targets would be antigens generated by aberrant splicing events, allowing immune cells to selectively recognize and kill cancer cells.”
Finally, the investigators showed that tumors with higher levels of aberrant splicing were more common among therapy responders than nonresponders. This could potentially be used as a biomarker for treatment response.
“Current biomarkers such as PD-L1 expression and microsatellite instability have shown limited and inconsistent predictive value in mRCC,” said Pirrotte. “In contrast, our study identified a significant association between tumor ‘splicing burden’ (the extent of aberrant splicing) and clinical response to therapy. These findings suggest that the tumor transcriptome, particularly splicing dysregulation, may provide a more informative framework for predicting treatment response and personalizing therapy.”
Before assessment of AS can be implemented in routine clinical practice, the core technologies will need further refinement, including clinically validated RNA sequencing workflows, robust computational pipelines for splicing analysis, and clear regulatory and technical frameworks for using the results to guide treatment decisions or develop biologic therapies.
Pirrotte and team are now assembling validation cohorts to confirm their findings in larger patient populations. They are also expanding their work to other cancer types to determine whether aberrant splicing and splicing-derived antigens represent broadly applicable biomarkers and therapeutic targets.
I started creating health content online in medical school. I realized I could reach thousands of people in seconds and share medically accurate information with students around the world. For example, I made a video showing how deep an injection goes for vaccination. The public is both fascinated and afraid of injections, but dispelling the rumors that a massive needle could go as deep as your bone goes a long way in vaccine adoption.
During my emergency medicine residency, though, things changed. What had been seen during my interview process as a strength and skill set became “high risk” overnight. I was told that continuing to post on social media could jeopardize my career.
People report that their personal contact info was surfaced by Google AI—and there’s apparently no easy way to prevent it.
A Redditor recently wrote that he was “desperate for help”: for about a month, he said, his phone had been inundated by calls from “strangers” who were “looking for a lawyer, a product designer, a locksmith.” Callers were apparently misdirected by Google’s generative AI.
In March, a software developer in Israel was contacted on WhatsApp after Google’s chatbot Gemini provided incorrect customer service instructions that included his number.
And in April, a PhD candidate at the University of Washington was messing around on Gemini and got it to cough up her colleague’s personal cell phone number.
AI researchers and online privacy experts have long warned of the myriad dangers generative AI poses for personal privacy. These cases give us yet another scenario to worry about: generative AI exposing people’s real phone numbers. (The Redditor did not respond to multiple requests for comment and we could not independently verify his story.)
Experts say that these privacy lapses are most likely due to personally identifiable information (PII) being used in training data, though it’s hard to understand the exact mechanism causing real phone numbers to show up in the AI-generated responses. But no matter the reason, the result is not fun for people on the receiving end—and, even more worryingly, there appears to be little that anyone can do to stop it.
A 400% increase in AI-related privacy requests
It’s impossible to know how often people’s phone numbers are exposed by AI chatbots, but experts say they believe that it is happening far more than is reported publicly.
DeleteMe, a company that helps customers remove their personal information from the internet, says customer queries about generative AI have increased by 400%—up to a few thousand—in the last seven months. These queries “specifically reference ChatGPT, Claude, Gemini … or other generative AI tools,” says Rob Shavell, the company’s cofounder and CEO. Specifically, 55% of these concerns about generative AI reference ChatGPT, 20% reference Gemini, 15% Claude, and 10% other AI tools, Shavell says. (MIT Technology Review has a business subscription to DeleteMe.)
Shavell says customer complaints about personal information being surfaced by LLMs usually take two forms. In the first, “a customer asks a chatbot something innocuous about themselves and gets back accurate home addresses, phone numbers, family members’ names, or employer details.” In the second, a customer is confronted with, and reports, the exposure of someone else’s personal data, when “the chatbot generates plausible-but-wrong contact information.”
This aligns with what happened to Daniel Abraham, a 28-year-old software engineer in Israel. In mid-March, he says, a stranger sent him a “weird WhatsApp message from an unknown number” asking for help with his account in PayBox, an Israeli payment app.
“I thought it was a spam message,” he wrote to MIT Technology Review in an email—“someone who was trying to troll me.”
But when he asked the stranger how they had found his number, they sent him a screenshot of Gemini’s instructions to contact PayBox customer service via WhatsApp—giving his personal number. Abraham does not work for PayBox, and PayBox does not have a WhatsApp customer service number, Elad Gabay, a customer service representative for the company, confirmed.
Later, Abraham asked Gemini how to contact PayBox, and it generated another person’s WhatsApp number. When I recently asked, Gemini again responded with an Israeli phone number—it belonged not to PayBox, but to a separate credit card company that works with PayBox.
Screenshot: Google Gemini provides MIT Technology Review with the incorrect number for PayBox.
Abraham’s exchange with the stranger ended quickly, but he said he was concerned about how other potential exchanges could quickly turn sour, including “harassment or other bad interactions.” “What if I asked for money in order to ‘solve’ that [customer service] issue?” he said.
To try to figure out how this happened, Abraham ran a regular Google search on his phone number, and he found that it had been shared online once, back in 2015, on a local site similar to Quora. Though he’s not sure who posted it there, it may explain how it ended up being reproduced by Gemini over a decade later.
Chatbots like Gemini, OpenAI’s ChatGPT, and Anthropic’s Claude are built on LLMs that are trained on huge amounts of data scraped from across the web. This inevitably includes hundreds of millions of instances of PII. As we reported last summer, for example, the large popular open-source data set DataComp CommonPool, which has been used to train image-generation models, included copies of résumés, driver’s licenses, and credit cards.
The likelihood of PII appearing in AI training data is only increasing as public data “runs out” and AI companies look for new sources of high-quality training data. This includes information from data brokers and people-search websites. According to the California data broker registry, for instance, 31 of 578 registered data brokers operating in the state self-reported that they had “shared or sold consumers’ data to a developer of a GenAI system or model in the past year.”
Furthermore, models are known to memorize and reproduce data verbatim from training data sets—and recent research suggests that it is not just frequently appearing data that is most likely to be memorized.
Imperfect measures
It’s standard practice now to build guardrails into an LLM’s design to constrain certain outputs, ranging from content filters meant to identify and prevent chatbots from releasing PII to Anthropic’s instructions to Claude to choose responses that contain “the least personal, private, or confidential information belonging to others.”
But as a pair of University of Washington PhD students researching privacy and technology saw firsthand recently, these safeguards don’t always work.
“One day, I was just playing around on Gemini, and I searched for Yael Eiger, my friend and collaborator,” Meira Gilbert says. She typed in “Yael Eiger contact info,” and after Gemini provided an overview of Eiger’s research, which Gilbert had expected, Gemini also returned her friend’s personal phone number. “It was shocking,” Gilbert says.
When she saw the Gemini result, Eiger remembered that she had, in fact, shared her phone number online in the previous year, for a technology workshop. But she had not expected it to be so visible to everyone on the internet.
“Having your information be … accessible to one audience, and then Gemini making it accessible to anyone” feels completely different, Eiger says—especially when she found that the information was buried in a normal Google search.
“It was severely downgraded,” Gilbert confirms. “I never would have found it if I was just looking through Google results.” (I tried the same prompt in Gemini earlier this month, and after an initial denial, the tool also gave me Eiger’s number.)
After this experience, Eiger, Gilbert, and another UW PhD student, Anna-Maria Gueorguieva, decided to test ChatGPT to see what it would surface about a professor.
At first, OpenAI’s guardrails kicked in, and ChatGPT responded that the information was unavailable. But in the same response, the chatbot suggested, “if you want to go deeper, I can still try a more ‘investigative-style’ approach.” Their inquiry just had to help “narrow things down,” ChatGPT said, by providing “a neighborhood guess” for where the professor might live, or “a possible co-owner name” for the professor’s home. ChatGPT continued: “That’s usually the only way to surface newer or intentionally less-visible property records.”
The students provided this information, leading ChatGPT to produce the professor’s home address, home purchase price, and spouse’s name from city property records.
(Taya Christianson, an OpenAI representative, said she was not able to comment on what happened in this case without seeing screenshots or knowing which model the students had tested, though we pointed out that many users may not know which model they were using in the ChatGPT interface. In response to questions about the exposure of PII, she sent links to documents describing how OpenAI handles privacy, including filtering out PII, and other tools.)
This reveals one of the fundamental problems with chatbots, says DeleteMe’s Shavell. AI companies “can build in guardrails, but [their chatbots] are also designed to be effective and to answer customer questions.”
The exposure issue is not limited to Gemini or ChatGPT. Last year, Futurism found that if you prompted xAI’s chatbot Grok with “[name] address,” in almost all cases, it provided not only residential addresses but also often the person’s phone numbers, work addresses, and addresses for people with similar-sounding names. (xAI did not respond to a request for comment.)
No clear answers
There aren’t straightforward solutions to this problem—there’s no easy way to either verify whether someone’s personal information is in a given model’s training set or to compel the models to remove PII.
Ideally, individual consumers should be able to request that their PII be removed, says Jennifer King, the privacy and data fellow at Stanford University Institute for Human-Centered Artificial Intelligence. But this is typically interpreted to apply only to the data that people have directly given to companies—like when they interact with a chatbot, King explains.
“I don’t know if Google even has the infrastructure … to say to me, ‘Yes, we have your data in our training data, we can summarize what we know about you, and then we can delete or correct things that are wrong or things that you don’t want in there,’” she says.
Existing privacy legislation, like the California Consumer Privacy Act or Europe’s GDPR, does not cover the “publicly available” information that has already been scraped and used to train LLMs, especially since much of this is anonymized (though multiple studies have also shown how easy it is to infer identities and PII from anonymized and pseudonymous data).
As to “whether they [AI companies] have ever systematically tried to go back through data that had already been collected from the public internet and minimized that stuff?” King adds. “No idea.”
The next best solution would be that the companies are “taking out everybody’s phone numbers or all data that resembles [phone numbers],” King says, but “nobody’s been willing to say” they’re doing that.
Hugging Face, a platform that hosts open-source data sets and AI models, has a tool that allows people to search how often a piece of data—like their phone number—has appeared in open-source LLM training data sets, but this does not necessarily represent what has been used to train closed LLMs that power popular chatbots like Claude, ChatGPT, and Gemini. (Eiger’s number, for example, did not show up in Hugging Face’s tool.)
Alex Joseph, the head of communications for Gemini apps and Google Labs, did not respond to specific questions, but he said that “the team” is “looking into” the particular cases flagged by MIT Technology Review. He also provided a link to a support document that describes how users can “object to the processing of your personal data” or “ask for inaccurate personal data in Gemini Apps’ responses to be corrected.” The page notes that the company’s response will depend on the privacy laws of your jurisdiction.
OpenAI has a privacy portal that allows people to submit requests to remove their personal information from ChatGPT responses, but notes that it balances privacy requests with the public interest and “may decline a request if we have a lawful reason for doing so.”
Anthropic describes how it uses personal data in model training, but it does not have a clear way for people to request its removal. The company did not respond to a request for comment.
The best option for anyone who wants to protect their private data right now is to “start upstream: get personal data off the public web before it ends up in the next scrape,” says Shavell. Since the start of the year, for instance, California has offered its residents a web portal to request that data brokers delete their information. Still, this doesn’t guarantee that your data hasn’t already been used for training—and will therefore not appear in a chatbot’s response.
The Redditor who received incessant calls posted that he had “submitted an official Legal Removal/Privacy Request to Google, asking them to urgently blacklist my number from their LLM outputs,” but had not yet received a response. He also wrote last month that “the harassment continues daily.”
Abraham, the Israeli software developer, says he contacted Google’s customer service on March 17, the day after his phone number was exposed. He says he did not receive a response until May 4, and it simply asked for documentation that he had already provided.
Meanwhile, inspired by her own exposure on Gemini, Eiger, along with Gilbert and Gueorguieva, is designing a research project to further study what personal information is being surfaced by various AI chatbots—and what they may know, even if they’re not telling us.
Some of that information may “technically be public,” says Gilbert, but chatbots may be altering “the amount of effort you would put into finding” it. Now instead of searching through 10 pages of Google search results, or paying for the information from a data broker site, “does generative AI just lower the barrier to entry to target people?”
This piece has been updated to clarify OpenAI’s response.
The incidence of stage IV breast cancer increased significantly overall, across ages, and for both sexes from 2010 through 2021, according to research from a Dana-Farber-led team. The percentage of patients with stage IV breast cancer, versus those with stage I to III diagnoses, also increased.
Notably, this increase was seen for all tumor subtypes in both sexes.
The researchers write, “These findings suggest that efforts are needed to determine factors contributing to these increases and to identify breast cancer before patients present with de novo stage IV disease.”
The study appears this week (May 12 issue) in JAMA Network Open. The senior author is José P. Leone, MD, department of medical oncology, Dana-Farber Cancer Institute and Harvard Medical School, Boston.
Their analysis of data from over 700,000 U.S. patients showed that the incidence of stage IV breast cancer increased significantly, by 1.2% per year, and that the percentage of people with stage IV disease also increased significantly. Stage IV incidence increased widely across all ages, races, sexes, and tumor subtypes. Still, survival improved significantly from 2010 through 2021.
Stage IV incidence increased across all tumor subtypes in both sexes. In women, those subtypes include hormone receptor (HR)–positive/ERBB2-negative, HR-positive/ERBB2-positive, HR-negative/ERBB2-positive, and triple-negative disease.
Trends in the incidence of de novo stage IV breast cancer “remain underreported,” these authors write. A previous study evaluating the incidence of distant disease in the U.S. before 2010 showed a statistically significant increase in incidence for younger patients and a statistically significant decrease in older patients. But, this current study’s authors said, a meta-analysis reported a decreasing percentage of stage IV presentation over time.
Breast cancer is the second most common cancer in women, behind skin cancer. It is the most common cancer diagnosed in females worldwide, and an estimated 30% of patients develop metastases. The American Cancer Society estimates 42,140 U.S. women will die from breast cancer in 2026.
The incidence of breast cancer in younger women, in particular, has been rising. In August 2025, the CDC reported that “most breast cancers occur in older women, but rates have been increasing slowly among women younger than 45 years in all racial and ethnic groups.” The agency added that survival from breast cancer is improving “among women in most racial and ethnic groups.”
Breast cancer in men remains rare, but the rate is increasing also.
This population-based cohort study used data from the Surveillance, Epidemiology, and End Results (SEER) program to identify patients diagnosed with de novo invasive breast cancer between January 1, 2010, and December 31, 2021. Data analyses were conducted from January 2024 to June 2025.
Of 761,471 breast cancer diagnoses, 43,934 (5.8%) were stage IV. Stage IV incidence increased from 9.5 cases per 100,000 females in 2010 to 11.2 cases per 100,000 females in 2021. The incidence of stages I to III disease also increased, from 163 cases per 100,000 females in 2010 to 177.4 cases per 100,000 females in 2021.
Among males, there was also a statistically significant increase in stage IV incidence.
The researchers noted that, “Although overall survival improved, research is warranted to determine factors contributing to increased incidence, including potential changes in natural history of breast cancer, disease screening, and incidence and mortality of other conditions.”
AI is reshaping drug discovery, and nucleic acid–based medicines, including mRNAs, gene therapies, and oligonucleotide therapeutics, are no exception. By optimizing sequences and chemical modifications for experimental testing, AI accelerates discovery timelines. That speed is particularly critical for oligo therapeutics, a modality central to n-of-1 rare diseases, which mostly afflict young patients, for whom there is added urgency.
However, a familiar caveat remains: AI is only as powerful as the data from which it learns. How can we provide enough high-quality input data to fuel this engine and design next-generation precision medicine?
A typical workflow for developing an AI-powered oligo predictive model begins with collecting experimental outcomes of oligo sequences, with each sequence annotated using a defined set of features. This data is then used to train AI models that identify patterns associated with improved activity and safety.
However, as is often the case with pioneering technologies such as oligonucleotides, the scarcity of data is a major problem. To overcome this limitation, scientists trawl public resources such as publications and patents to extract this data. ASOptimizer, OligoAI, and eSkip-Finder are examples of newer oligo-predicting AI models that are trained using publicly available data.
While these models are advancing in the right direction, relying primarily on this data comes with several disadvantages, such as:
inconsistent experimental conditions between the datasets,
limited diversity in sequences and chemistries,
lack of negative data, and
insufficient coverage of critical information such as toxicity and off-target effects.
Furthermore, since data sourcing and annotation often rely on automated, AI-powered tools, there is a risk of mislabeling and misinterpretation. As a result, correlation statistics between predicted and experimental values for these models remain modest, generally between 0.4 and 0.7.1,2,3
Building the data foundation for AI drug discovery
The most valuable training data is:
designed to span broad chemical space and probe critical safety features,
produced under controlled conditions,
consistently processed and annotated, and
generated in a controlled environment, ideally internally.
Large-scale screening campaigns are essential in that context, as they provide the dense, reliable datasets required to train AI models and extract meaningful insights for sequence and chemistry prediction.
Ming Wang, PhD
Brett Monia, CEO of Ionis Pharmaceuticals, describes this reality as “hard, brutal screening: screening a lot of oligonucleotides with different decorations, different amounts of chemistries, different sequences. We have plenty of (design) rules, but we still don’t have enough.”4
One way to address this challenge is through intentional screening design: deliberately varying sequence motifs and positional chemistries within screening libraries to systematically explore chemical landscapes and expand the empirical foundation on which both rules and AI models are built.
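One way to sketch such intentional design is to enumerate chemistry layouts systematically for each sequence. The example below varies the wing lengths of a hypothetical gapmer layout (modified wings "M" flanking a DNA gap "D"); the scheme and symbols are illustrative, not a production design tool.

```python
from itertools import product

def gapmer_designs(seq: str, wing_range=(2, 3, 4)) -> list[tuple[str, str]]:
    """Enumerate hypothetical gapmer chemistry layouts for one sequence:
    5' and 3' wings of modified nucleotides ('M') flanking a DNA gap ('D').
    Wing lengths are varied systematically so the screening library spans
    the layout space instead of a single default design."""
    designs = []
    for w5, w3 in product(wing_range, repeat=2):
        if w5 + w3 >= len(seq):
            continue  # no room left for a gap
        pattern = "M" * w5 + "D" * (len(seq) - w5 - w3) + "M" * w3
        designs.append((seq, pattern))
    return designs
```

Crossing such positional-chemistry variants with sequence-motif variants is what turns a screen into a systematic exploration of the chemical landscape.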
With the advent of faster and more affordable transcriptomic technologies, high-throughput RNA-seq can now be incorporated into oligonucleotide screening workflows. This method enables the systematic detection of off-target effects, including those that arise through mechanisms beyond straightforward sequence complementarity.5,6
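For the complementarity-driven part of that picture, a toy scan can illustrate how candidate off-target sites are flagged before RNA-seq confirms whether expression actually changes. The function below slides an oligo's target site along a transcript and reports near-complementary positions; DNA alphabet and thresholds are simplifying assumptions.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[b] for b in reversed(seq))

def near_matches(oligo: str, transcript: str, max_mismatches: int = 2):
    """Report (position, mismatch-count) pairs where the oligo's target
    site occurs in the transcript with at most `max_mismatches` mismatches.
    A toy stand-in for hybridization-based off-target prediction; RNA-seq
    would then test whether the flagged genes are actually knocked down."""
    site = revcomp(oligo)
    hits = []
    for i in range(len(transcript) - len(site) + 1):
        mm = sum(a != b for a, b in zip(site, transcript[i:i + len(site)]))
        if mm <= max_mismatches:
            hits.append((i, mm))
    return hits
```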
While these approaches generate large and complex datasets, they represent a critical investment—one that lays the foundation for a faster, more efficient, and ultimately more cost‑effective future of oligo discovery.
Digital infrastructure: engineering AI-ready data at scale
While generating large datasets may be hard and brutal, managing, curating, and analyzing them doesn’t need to be. For data to be truly reliable and trustworthy, quality must be engineered from the start. Important aspects to consider include:
A single source of truth: a centralized FAIR data repository, where all data is systematically stored and governed for controlled access and use;
Comprehensive metadata capture, including protocols, batch numbers, and reagent references to ensure results can be interpreted correctly and are not driven by experimental artifacts;
Automated quality control and data analysis for large-scale screens, ensuring consistent, efficient, and reproducible data processing; and
Consistent ontology and nomenclature for oligo sequences and their chemistries, as exemplified by Roche’s open-source tool (HelmShaker) for translating molecules into HELM notation.
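These elements come together in the registration record for each molecule. The sketch below shows what such a schema might capture; field names are hypothetical, not a real registry schema, and the HELM string is a placeholder rather than valid HELM notation.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass(frozen=True)
class OligoRecord:
    """Illustrative registration record keeping the metadata named above
    (protocol, batch, reagent lots) next to the molecule itself."""
    oligo_id: str
    sequence: str
    helm: str                 # notation produced by a tool such as HelmShaker
    protocol_id: str
    batch_number: str
    reagent_lots: tuple[str, ...] = ()
    registered_on: date = field(default_factory=date.today)

    def __post_init__(self):
        # Minimal quality gate at registration time: reject sequences
        # containing characters outside the expected base alphabet.
        if not set(self.sequence) <= set("ACGTU"):
            raise ValueError(f"unexpected bases in {self.oligo_id}")
```

Validating at registration time, rather than at analysis time, is one way quality gets "engineered from the start."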
In practice, these principles are implemented through integrated digital infrastructures that combine molecular registration systems with automated analytics across diverse experimental modalities such as high‑throughput screening, next‑generation sequencing, mass spectrometry, and chromatography.
Such approaches are increasingly used across the pharmaceutical and biotechnology sectors to manage oligonucleotide ADME, process development, and screening data, thus helping teams maintain data integrity and continuity throughout the oligo discovery and development lifecycle.
AI promises to redefine what is possible in oligo discovery, and the field is already beginning to see its impact. But AI alone is not the breakthrough—data is. Only large, high‑quality experimental datasets, generated intentionally and prospectively, can unlock AI’s full predictive power.
Organizations that invest early in both systematic data generation and robust data infrastructure will be best positioned to lead the next wave of oligonucleotide discovery. This shift is especially urgent for n = 1 rare diseases, where speed, precision, and learning from every experiment can make the difference between possibility and progress.
Ming Wang, PhD, is scientific business manager at Genedata.
References:
1Hwang, G., Kwon, M., Seo, D., Kim, DH., Lee, D., Lee, K., Kim, E., Kang, M., Ryu, JH., ASOptimizer: Optimizing antisense oligonucleotides through deep learning for IDO1 gene regulation. Mol Ther Nucleic Acids. 2024 Apr 6;35(2):102186. doi: 10.1016/j.omtn.2024.102186
2Chiba, S., Lim, KRQ., Sheri, N., Anwar, S., Erkut, E., Shah, MNA., Aslesh, T., Woo, S., Sheikh, O., Maruyama, R., Takano, H., Kunitake, K., Duddy, W., Okuno, Y., Aoki, Y., Yokota, T. eSkip-Finder: a machine learning-based web application and database to identify the optimal sequences of antisense oligonucleotides for exon skipping. Nucleic Acids Res. 2021 Jul 2;49(W1):W193-W198. doi:10.1093/nar/gkab442
3Hill, B., Jaques, MR., Nair, RR., Whiffin, N., Wood, MJA., Sanders, SJ., Oliver, PL., Hill, AC., Rinaldi, C. Accurately modelling RNase H-mediated antisense oligonucleotide efficacy. bioRxiv. 2025 Oct 30. https://doi.org/10.1101/2025.10.29.685292
5Pekker, D., Kuntz, S., McArthur, M., Nicholson-Shaw, T., Yanke, S., Mukhopadhyay, S. A Dose-Response Model for Accurate Detection and Quantification of Transcriptome-Wide Gene Knockdown for Oligonucleotide-Based Medicines. bioRxiv. 2024 May 29. https://www.biorxiv.org/content/10.1101/2024.05.28.596270v1.full.pdf
6In-silico siRNA Off-Target Predictions: What Should We Be Looking For? OTS Oligonucleotide Therapeutics Society, Webinar, 2024 May 2