Foundation models in computer vision have demonstrated exceptional performance in zero-shot and few-shot tasks by extracting multi-purpose features from large-scale datasets through self-supervised pre-training methods. However, these models often overlook the severe corruption of cryogenic electron microscopy (cryo-EM) images by high levels of noise. We introduce DRACO, a Denoising-Reconstruction Autoencoder for CryO-EM, inspired by the Noise2Noise (N2N) approach. By processing cryo-EM movies into odd and even images and treating them as independent noisy observations, we apply a denoising-reconstruction hybrid training scheme: we mask both images to create denoising and reconstruction tasks. Because dataset quality is essential for DRACO's pre-training, we build a high-quality, diverse dataset from an uncurated public database, comprising over 270,000 movies or micrographs. After pre-training, DRACO naturally serves as a generalizable cryo-EM image denoiser and a foundation model for various cryo-EM downstream tasks. DRACO demonstrates the best performance in denoising, micrograph curation, and particle picking tasks compared to state-of-the-art baselines. We will release the code, pre-trained models, and the curated dataset to stimulate further research.
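The odd/even split and the masked hybrid objective described above can be illustrated with a minimal NumPy sketch. It is not DRACO's implementation. The function names, the summing of frames, the patch size, the mask ratio, the zero-filling of masked patches, the symmetric odd-to-even and even-to-odd pairing, and the mean-squared-error loss are all assumptions made for illustration, and `model` is a hypothetical stand-in for the actual network.

```python
import numpy as np


def split_odd_even(movie):
    """Sum even- and odd-indexed frames of a (frames, H, W) movie.

    Assumption: frames are already motion-corrected and are simply summed.
    """
    movie = np.asarray(movie, dtype=np.float64)
    even = movie[0::2].sum(axis=0)
    odd = movie[1::2].sum(axis=0)
    return odd, even


def random_patch_mask(shape, patch, mask_ratio, rng):
    """Return a boolean (H, W) mask; True marks pixels inside masked patches."""
    h, w = shape
    if h % patch or w % patch:
        raise ValueError("image size must be a multiple of the patch size")
    gh, gw = h // patch, w // patch
    n_masked = int(round(mask_ratio * gh * gw))
    flat = np.zeros(gh * gw, dtype=bool)
    flat[rng.choice(gh * gw, size=n_masked, replace=False)] = True
    grid = flat.reshape(gh, gw)
    # Expand each grid cell to a patch x patch block of pixels.
    return np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)


def hybrid_loss(model, odd, even, mask):
    """Return (denoising_loss, reconstruction_loss) averaged over both directions.

    Visible patches: the prediction is compared with the *other* noisy
    observation (a Noise2Noise-style denoising term).
    Masked patches: the prediction must fill in content the model never saw
    (a reconstruction term).
    """
    if mask.all() or not mask.any():
        raise ValueError("mask must contain both masked and visible pixels")
    denoise, recon = 0.0, 0.0
    for inp, target in ((odd, even), (even, odd)):
        masked_inp = np.where(mask, 0.0, inp)  # assumption: masked pixels are zeroed
        err = (model(masked_inp) - target) ** 2
        denoise += err[~mask].mean()
        recon += err[mask].mean()
    return denoise / 2, recon / 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    movie = rng.normal(size=(8, 64, 64))        # synthetic stand-in for a movie
    odd, even = split_odd_even(movie)
    mask = random_patch_mask(odd.shape, patch=16, mask_ratio=0.75, rng=rng)
    identity = lambda x: x                      # placeholder for a real network
    print(hybrid_loss(identity, odd, even, mask))
```

Because the odd and even images share the same underlying signal but carry independent noise, using one as the target for the other avoids the need for a clean reference image, which is the core idea borrowed from Noise2Noise.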
We visualize the denoising results of DRACO and state-of-the-art baselines. Our results show the most significant SNR improvement without loss of particle structure details. In contrast, Low-pass severely blurs the particles, MAE introduces severe patch-wise artifacts, and Topaz shows either minor SNR improvements or blurred results.
We show the picking results of DRACO and baselines on test datasets ranging from small transport proteins to huge ribosomes. Blue, red, and yellow circles denote true positives, false positives, and false negatives, respectively.
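As an illustration of how picks can be sorted into true positives, false positives, and false negatives, the sketch below matches predicted particle centers to annotated centers. The matching rule (greedy nearest-neighbor matching within a fixed pixel radius, each annotation used at most once) and the function name `count_picks` are assumptions for illustration; the criterion actually used for the figure is not stated here.

```python
import numpy as np


def count_picks(pred, truth, radius):
    """Return (tp, fp, fn) for predicted vs. annotated particle centers.

    pred, truth: sequences of (x, y) coordinates in pixels.
    radius: maximum distance at which a pick counts as a match (assumed rule).
    """
    pred = np.asarray(pred, dtype=float).reshape(-1, 2)
    truth = np.asarray(truth, dtype=float).reshape(-1, 2)
    matched = np.zeros(len(truth), dtype=bool)
    tp = 0
    for p in pred:
        if len(truth) == 0:
            break
        dist = np.linalg.norm(truth - p, axis=1)
        dist[matched] = np.inf          # each annotation can be matched once
        j = int(np.argmin(dist))
        if dist[j] <= radius:
            matched[j] = True
            tp += 1
    fp = len(pred) - tp                 # picks with no annotation nearby
    fn = len(truth) - tp                # annotations that were never picked
    return tp, fp, fn


if __name__ == "__main__":
    picks = [(10, 10), (50, 50), (90, 90)]
    annotations = [(11, 10), (52, 49), (200, 200)]
    tp, fp, fn = count_picks(picks, annotations, radius=5)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    print(tp, fp, fn, precision, recall)
```

Greedy matching depends on the order of the picks; an optimal one-to-one assignment would be a stricter alternative when particles are densely packed.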
Denoising cryo-ET HIV tilt series with DRACO. Figures a and b show the HIV tilt series before and after DRACO's denoising process. Using IMOD, we reconstruct 3D volumes of HIV from both the original and denoised series and show slices of them in Figures c and d. Note the horizontal stripes in these images, which are artifacts of the missing wedge issue in cryo-ET. Figure e shows the slice from Figure c after denoising by DRACO.
@inproceedings{shen2024draco,
title={Draco: Denoising Reconstruction Autoencoder for CryO-EM},
author={Shen, Yingjun and Dai, Haizhao and Chen, Qihe and Zeng, Yan and Zhang, Jiakai and Pei, Yuan and Yu, Jingyi},
booktitle={Proceedings of the 38th International Conference on Neural Information Processing Systems},
year={2024}
}