Cancers of unknown primary (CUP) create major challenges for oncologists. These metastatic tumors have no identifiable primary site, so doctors often rely on empiric chemotherapy, which leads to poorer patient outcomes. Emerging research on DNA methylation-based tissue-of-origin analysis offers hope for more accurate identification of primary sites.
Researchers now offer a practical new tool. A machine learning model presented at AACR 2026 uses DNA methylation to predict the tissue of origin with a compact panel of markers.
Research Led by Marco A. De Velasco, PhD
Marco A. De Velasco, PhD (Faculty Member, Department of Genome Biology, Kindai University, Japan) led the project. His team studied over 7,400 tumors from 21 cancer types in the TCGA dataset. They reduced the feature set from ~450,000 CpG sites to about 1,000.
The smaller panel makes the test more practical while keeping accuracy high.
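To make the idea of shrinking a feature set concrete, here is a minimal sketch. The article does not describe how the team chose its CpG sites, so ranking by variance across samples is an assumption used purely for illustration, not the authors' method. The function name `select_top_variance_cpgs`, the CpG IDs, and the beta values are all hypothetical.

```python
from statistics import pvariance


def select_top_variance_cpgs(beta_by_cpg, k):
    """Return the k CpG IDs whose beta values vary most across samples.

    Variance ranking is an illustrative stand-in; the published selection
    procedure is not described in the article.
    """
    ranked = sorted(
        beta_by_cpg,
        key=lambda cpg: pvariance(beta_by_cpg[cpg]),
        reverse=True,
    )
    return ranked[:k]


# Toy data: hypothetical CpG IDs mapped to beta values (0..1) in four samples.
toy_betas = {
    "cg_a": [0.10, 0.90, 0.15, 0.85],  # differs strongly between samples
    "cg_b": [0.50, 0.52, 0.49, 0.51],  # nearly constant, little information
    "cg_c": [0.20, 0.60, 0.25, 0.65],  # moderate variation
}

selected = select_top_variance_cpgs(toy_betas, k=2)
print(selected)
```

In a real pipeline the same ranking step would run over hundreds of thousands of sites and keep roughly a thousand, with the choice of ranking criterion being the main design decision.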
“Our goal here was to develop a classifier that can predict tissue of origin using a focused set of CpG sites, rather than hundreds of thousands,” said Marco A. De Velasco, PhD. “We want to improve practicality while maintaining strong performance.”
DNA Methylation for Tissue of Origin Prediction in CUP
DNA methylation produces stable, tissue-specific signatures that remain reliable even in metastatic disease. The selected markers preserved clear cancer-type clusters in unsupervised analysis. They also showed the expected inverse relationship between methylation and gene expression. These findings indicate that the model reflects true tumor biology.
Performance Results
The classifier reached an F1 score of ~0.945 on the test set. Validation cohorts delivered nearly 0.9 F1 after grouping similar subtypes. Most errors involved biologically related cancers, such as colorectal or gynecological tumors. Performance stayed stable regardless of sample size, heterogeneity, or purity.
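The effect of grouping similar subtypes on F1 can be illustrated with a small sketch. The article does not say which F1 variant was reported, so macro-averaged F1 is an assumption here. The labels are toy data, and the `GROUPS` mapping (merging the TCGA codes COAD and READ into one colorectal class) is a hypothetical example of grouping, not the study's actual scheme.

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical grouping of biologically related classes.
GROUPS = {"COAD": "colorectal", "READ": "colorectal"}


def group(labels):
    """Map each label to its group, leaving ungrouped labels unchanged."""
    return [GROUPS.get(label, label) for label in labels]


# Toy labels: one rectal tumor is misclassified as colon.
y_true = ["COAD", "READ", "READ", "LUAD", "LUAD"]
y_pred = ["COAD", "COAD", "READ", "LUAD", "LUAD"]

raw = macro_f1(y_true, y_pred)
grouped = macro_f1(group(y_true), group(y_pred))
print(round(raw, 3), round(grouped, 3))
```

In this toy case the only error is a confusion between two related classes, so merging them removes the error entirely. That is the same mechanism by which grouping raises the reported validation score.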
“Focusing on a reduced set of CpGs is intended to reduce cost, reduce time, and enhance processing,” noted Marco A. De Velasco, PhD. “This should be a much shorter, cheaper process than current Infinium arrays.”
Clinical Role and Next Steps
Marco A. De Velasco, PhD explained the tool’s value: “This is not meant to replace identifying an oncogenic driver but to complement it… We are also identifying phenotypes that we can further interrogate for possible druggable targets.”
Future plans include trials in real CUP patients and adaptation for circulating tumor DNA. These steps aim to deliver faster, less invasive testing that supports personalized care.
Q&A for Busy Oncologists
Q1: How accurate is this epigenetic classifier for CUP? A: It achieved ~0.945 F1 on test data and ~0.9 F1 in validation. Results improve when similar subtypes are grouped.
Q2: Does it replace NGS driver testing? A: No. It works alongside NGS to identify origin and phenotypes in driver-negative cases and guide site-specific therapy.
Q3: What makes the reduced panel more practical? A: Only ~1,000 markers cut cost, turnaround time, and tissue needs while preserving accuracy and biology insights.
Q4: When could this reach the clinic? A: Prospective CUP trials are in preparation, with testing likely starting within the next year. Liquid biopsy work is already underway.


