Results - Health (2026) - ELSA Benchmarks Platform

method: Two-Stage P-PGM + Beta-CVAE Pipeline for Differentially Private Synthetic RNA-seq Generation (2026-05-04)

Authors: Ahmet Yiğit Doğan, Canberk Balcı, Eren Kotar, Ömer Coşkun

Affiliation: Boğaziçi University

Description: The CAMDA 2026 ELSA Health Privacy Challenge asks teams to generate synthetic bulk RNA-seq data that is both useful for downstream classification and resistant to membership inference attacks. On the 2025 leaderboard, two methods stood out: P-PGM, which provides a formal differential privacy guarantee and achieves 82.9% BRCA subtype classification accuracy, and CVAE with NMF, which learns richer biological structure but lacks any formal privacy guarantee. Our hypothesis was that combining these two approaches in a sequential pipeline would improve both utility and privacy simultaneously, with Stage 1 providing the provable privacy backbone and Stage 2 recovering the gene co-expression structure that P-PGM loses through its independence assumption.

In Stage 1, we implement a private probabilistic graphical model (P-PGM) that computes Laplace-noised marginals of the real training data under a total privacy budget of epsilon = 10, split across per-gene global means, per-gene variances, class-conditional means, and class counts. This gives a formal (epsilon=10, delta=0)-DP guarantee via basic composition, meaning any single patient's removal would change the output distribution by at most a factor of e^10. The noisy marginals are used to generate a coarse synthetic dataset where each gene is sampled independently given the class label. In Stage 2, a conditional variational autoencoder (beta-CVAE) is trained on a 50/50 mix of real data and Stage 1 synthetic data, with KL annealing over the first 20% of training to prevent posterior collapse. This CVAE learns a compressed, class-conditional latent space and generates new samples by decoding random draws from the prior, so no individual encoder posterior is ever used at generation time. After Stage 2, we apply a per-gene linear variance calibration that rescales each synthetic gene to exactly match the real training data's mean and standard deviation, correcting a systematic overestimation artifact inherent to the CVAE decoder architecture.
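The Laplace-noised marginals and budget split of Stage 1 can be sketched as follows. This is a minimal illustration, not the authors' implementation: the equal split of epsilon across the four statistics, the clipping bounds `lo` and `hi`, the replace-one notion of neighbouring datasets, and the function names `split_budget` and `laplace_noisy_means` are all assumptions made for the example.

```python
import numpy as np

def split_budget(epsilon_total, n_queries):
    """Equal split of the total budget (an assumption; the source does not
    state the split). Under basic composition the parts sum to the total."""
    return [epsilon_total / n_queries] * n_queries

def laplace_noisy_means(X, epsilon, lo, hi, rng):
    """Release per-gene means with Laplace noise calibrated to L1 sensitivity.

    Assumes values are clipped to [lo, hi] and that neighbouring datasets
    differ by replacing one row, so the sample count n is treated as public.
    """
    n, d = X.shape
    Xc = np.clip(X, lo, hi)
    # Replacing one row moves each of the d means by at most (hi - lo) / n,
    # so the L1 sensitivity of the whole mean vector is d * (hi - lo) / n.
    sensitivity = d * (hi - lo) / n
    scale = sensitivity / epsilon
    return Xc.mean(axis=0) + rng.laplace(0.0, scale, size=d)

# Usage: four statistics (global means, variances, class-conditional means,
# class counts) sharing a total budget of epsilon = 10.
eps_means, eps_vars, eps_class_means, eps_counts = split_budget(10.0, 4)
```

Note that the noise scale grows linearly with the number of genes when the whole mean vector is released under a single budget share, which is one reason the split of epsilon matters in practice.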

Our initial claim was that Stage 2 would improve classification utility on top of Stage 1. In practice, we found that the primary evaluation metric (logistic regression accuracy) rewards class-mean separation rather than co-expression recovery, which is exactly what P-PGM's 1-way and 2-way marginals already capture well. The variance calibration turned out to be the single most impactful post-processing step, guaranteeing r_mean = 1.000 and r_std = 1.000 by construction. Final results are 69.3% accuracy and 67.8% AUPR on TCGA-BRCA, and 89.2% accuracy and 77.5% AUPR on TCGA-COMBINED, with mean MIA AUC of 0.514 and 0.503 respectively across all six attacks, both close to the 0.5 chance level and consistent with a strong formal DP guarantee. While the BRCA utility falls short of the 2025 top entries, the COMBINED results are strong, although no competing benchmark exists for comparison, and the MIA resistance is among the best that can be expected from any differentially private method.
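The per-gene linear calibration can be sketched as a standardise-then-rescale step. This is a minimal sketch under assumptions: the function name `calibrate_to_real` and the small `eps` guard against zero-variance genes are illustrative and not taken from the submission.

```python
import numpy as np

def calibrate_to_real(synth, real, eps=1e-12):
    """Rescale each synthetic gene so its mean and standard deviation
    equal those of the corresponding real gene."""
    mu_s, sd_s = synth.mean(axis=0), synth.std(axis=0)
    mu_r, sd_r = real.mean(axis=0), real.std(axis=0)
    # Standardise each synthetic column, then map it onto the real moments.
    # eps only guards against division by zero for constant columns.
    return (synth - mu_s) / np.maximum(sd_s, eps) * sd_r + mu_r
```

Because the transform is affine per gene, the calibrated per-gene means and standard deviations coincide with the real ones, which is why agreement statistics computed on those two vectors are perfect by construction.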

method: Synthetic RNA-seq data generation with a foundation model: TabPFN version 2.5 (2026-05-06)

Authors: Deborah Boyenval

Affiliation: University of Helsinki

Description: We use TabPFN v2.5, a tabular foundation model developed by PriorLabs. TabPFN is a transformer-based model pretrained on synthetic tabular tasks and designed primarily for supervised classification and regression through in-context learning, without task-specific fine-tuning. We explore a non-standard use case by applying the experimental unsupervised data-generation extension, TabPFNUnsupervisedModel from tabpfn-extensions, as a synthetic data generator for bulk RNA-seq profiles.

Direct application of the unsupervised generative extension to the full 978-gene expression matrix was numerically unstable in our preliminary experiments. In our computational setting, we observed empirical operational stability limits of approximately 200 features for TCGA-BRCA and 150 features for TCGA-COMBINED. These values should be interpreted as implementation- and data-dependent empirical limits rather than theoretical limits of TabPFN.

To avoid unstable high-dimensional generative calls, our submitted generator, tabpfn_bloc_strat, uses a block-wise generation strategy. The 978 gene-expression features are partitioned into fixed feature blocks of at most [B] genes, preserving the original feature order. For each block, a separate TabPFNUnsupervisedModel is fitted on the corresponding block of real training data, and synthetic features are sampled independently block by block. The generated blocks are then concatenated in the original gene order to obtain a full 978-gene synthetic expression matrix.

This block-wise strategy captures within-block dependencies but does not explicitly model cross-block gene correlations, so it is an approximation rather than a full joint model over all 978 genes. We therefore expect some fidelity loss in exchange for stable generation at the full challenge dimensionality.
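The block-wise strategy described above can be sketched independently of the generator used inside each block. In the sketch below, `generate_blockwise` and the `sampler` callback are hypothetical names, and `gaussian_block_sampler` is only a stand-in for the per-block generative model; it does not call TabPFN or tabpfn-extensions.

```python
import numpy as np

def gaussian_block_sampler(block, n_samples, rng):
    """Stand-in generator (an assumption): fit a multivariate normal to one
    feature block and sample from it."""
    mean = block.mean(axis=0)
    cov = np.atleast_2d(np.cov(block, rowvar=False))
    return rng.multivariate_normal(mean, cov, size=n_samples)

def generate_blockwise(X, block_size, n_samples, sampler, rng):
    """Split features into consecutive blocks, sample each block
    independently, and concatenate in the original feature order."""
    n_features = X.shape[1]
    blocks = []
    for start in range(0, n_features, block_size):
        block = X[:, start:start + block_size]  # last block may be smaller
        blocks.append(sampler(block, n_samples, rng))
    return np.hstack(blocks)

# Usage with toy data: 7 features split into blocks of at most 3.
rng = np.random.default_rng(0)
X_real = rng.normal(size=(50, 7))
X_synth = generate_blockwise(X_real, 3, 50, gaussian_block_sampler, rng)
```

Since every block is sampled on its own, any correlation between genes in different blocks is absent from the output regardless of the per-block model, which is the approximation the text refers to.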

After generating synthetic expression profiles, we train a TabPFN classifier on the real training profiles and their labels. The classifier is then applied to synthetic profiles, and synthetic labels are sampled probabilistically from the predicted class probabilities. Thus, label assignment is performed post hoc through a discriminative model rather than through a class-conditional generator.
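The probabilistic label assignment can be sketched as inverse-CDF sampling from each row of a predicted probability matrix. This is an illustrative sketch: `sample_labels` is a hypothetical helper, and `proba` is assumed to be an array of shape (n_samples, n_classes) such as a classifier's predicted class probabilities.

```python
import numpy as np

def sample_labels(proba, classes, rng):
    """Draw one label per row, with probability given by that row of proba."""
    proba = np.asarray(proba, dtype=float)
    proba = proba / proba.sum(axis=1, keepdims=True)  # guard against drift
    cdf = np.cumsum(proba, axis=1)
    u = rng.random((proba.shape[0], 1))               # uniform in [0, 1)
    # Index of the first class whose cumulative probability exceeds u.
    idx = (u >= cdf).sum(axis=1)
    idx = np.minimum(idx, proba.shape[1] - 1)         # floating-point guard
    return np.asarray(classes)[idx]
```

Sampling rather than taking the most probable class keeps ambiguous synthetic profiles from all collapsing onto the majority prediction, at the cost of some label noise.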

For each split, the number of synthetic samples is set equal to the number of real training samples. We used default TabPFN generation settings where possible and did not perform systematic hyperparameter optimization.

TabPFN-2.5: Advancing the State of the Art in Tabular Foundation Models. Léo Grinsztajn, Klemens Flöge, Oscar Key, Felix Birkel, Philipp Jund, Brendan Roof, Benjamin Jäger, Dominik Safaric, Simone Alessi, Adrian Hayler, Mihir Manium, Rosen Yu, Felix Jablonski, Shi Bin Hoo, Anurag Garg, Jake Robertson, Magnus Bühler, Vladyslav Moroshan, Lennart Purucker, Clara Cornu, Lilly Charlotte Wehrhahn, Alessandro Bonetto, Bernhard Schölkopf, Sauraj Gambhir, Noah Hollmann, Frank Hutter.

Accurate predictions on small data with a tabular foundation model. Noah Hollmann, Samuel Müller, Lennart Purucker, Arjun Krishnakumar, Max Körfer, Shi Bin Hoo, Robin Tibor Schirrmeister, Frank Hutter.

Source code

method: Synthetic RNA-seq Data Generation with MIAV and TabSDS (2026-05-06)

Authors: Daniil Filienko, Elias Chaibub Neto, Sikha Pentyala, Jineta Banerjee, Luca Foschini, Martine De Cock

Affiliation: University of Washington Tacoma, Sage Bionetworks

Description: We adapt TabSDS [1] and MIAV-TabPFN [2], with our implementation available at https://github.com/Filienko/CAMDA_ppml_huskies_blue.

TabSDS [1] is a non-parametric, model-free procedure that reconstructs dependency structure through rank-based transformations and controlled perturbations of the original data. Following an empirical evaluation, we set n_c = 20 with sampling proportion prop = 0.5.

MIAV-TabPFN [2] augments TabPFN's in-context learning with auxiliary variables. These variables are generated by sampling independent noise and then rank-matching it to the empirical distributions of the real features. We attach the results generated with noise=0.
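One plausible reading of the rank-matching step is sketched below: independent noise is drawn per feature, and the real feature's values are then rearranged to follow the rank order of that noise. The function name `rank_matched_auxiliary` and the choice of Gaussian noise are assumptions; the construction in [2] may differ in its details.

```python
import numpy as np

def rank_matched_auxiliary(X, rng):
    """Build auxiliary variables whose columns have exactly the same
    empirical distribution as the real columns, ordered by independent noise."""
    n, d = X.shape
    noise = rng.normal(size=(n, d))
    # Double argsort gives the rank (0..n-1) of each noise entry in its column.
    ranks = noise.argsort(axis=0).argsort(axis=0)
    X_sorted = np.sort(X, axis=0)
    # The entry with the k-th smallest noise receives the k-th smallest real value.
    return np.take_along_axis(X_sorted, ranks, axis=0)
```

Each auxiliary column is a permutation of the matching real column, so its marginal distribution is preserved exactly while its ordering carries no information about individual records beyond the values themselves.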

References

[1] Chaibub Neto, E. (2025). TabSDS: A Lightweight, Fully Non-Parametric, and Model-Free Approach for Generating Synthetic Tabular Data. ICML 2025.

[2] Chaibub Neto, E. (2026). Using Maximal Information Auxiliary Variables to Improve Synthetic Data Generation Based on TabPFN Foundation Models. ICLR 2026.

Ranking Table


		Utility					Fidelity				Biological Plausibility				Privacy
Date	Method	Accuracy (real)	Accuracy (synthetic)	AUPR (real)	AUPR (synthetic)	Number of overlapping important features	MMD score (train)	KL mean (train)	KL mean (test)	Discriminative score	Mean Co-expression Preservation (r>0.3)	Mean Co-expression #TP edges (r>0.3)	Diff. Expr. Mean TPR @ FPR <=0.05 (up-reg.)	Diff. Expr. Mean TPR @ FPR <= 0.05 (down-reg)	Distance to the closest (real)	Distance to the closest (synthetic)	GAN-leaks MIA AUC	GAN-leaks MIA TPR@FPR=0.01	GAN-leaks MIA TPR@FPR=0.1	Confidence LR MIA AUC	Confidence LR MIA TPR@FPR=0.01	Confidence LR MIA TPR@FPR=0.1
2026-05-04	Two-Stage P-PGM + Beta-CVAE Pipeline for Differentially Private Synthetic RNA-seq Generation	87.05%	74.38%	87.38%	68.74%	13	0.1503	1.1812	1.3532	45.86%	7.71%	36578	0.1607	0.3769	24.0302	27.0632	50.31%	1.35%	9.99%	51.10%	1.54%	11.02%
2026-05-06	Synthetic RNA-seq data generation with a foundation model: TabPFN version 2.5	85.86%	74.47%	85.53%	69.45%	8.8	0.0306	0.1326	0.1950	68.30%	85.37%	3797.4	0.4135	0.3518	24.0409	45.1829	50.75%	1.91%	10.93%	51.75%	1.22%	10.54%
2026-05-06	Synthetic RNA-seq Data Generation with MIAV and TabSDS	87.33%	78.69%	87.52%	73.36%	14.4	-0.0006	0.1678	0.3054	92.50%	98.06%	34500.8	0.8512	0.8610	24.0243	4.7288	100.00%	100.00%	100.00%	55.62%	0.94%	10.49%
2026-01-14	Multivariate Normal (MVN) baseline	86.41%	82.09%	85.77%	83.42%	20.6	0.0166	0.1419	0.2179	54.36%	90.36%	28910.6	0.8715	0.8446	24.0532	28.3770	52.79%	2.34%	11.92%	55.34%	0.99%	10.26%
