Understanding How PXRD Conditioning Influences Accuracy and Uncertainty in Generative Crystal Structure Prediction
Frederik L. Johansen a b, Adam F. Sapnik b, Ulrik Friis-Jensen a b, Erik B. Dam a, Rocío Mercado c, Raghavendra Selvan a, Kirsten M. Ø. Jensen b
a Department of Computer Science, University of Copenhagen, Denmark
b Department of Chemistry & Nano-Science Center, University of Copenhagen, Denmark
c Department of Computer Science & Engineering, Chalmers University of Technology, Sweden
Proceedings of MATSUS Spring 2026 Conference (MATSUSSpring26)
C1 Structural Foundations of Nanomaterials Properties
Barcelona, Spain, 2026 March 23rd - 27th
Organizers: Nadine Schrenker and Stefano Toso
Oral, Frederik L. Johansen, presentation 213
Publication date: 15th December 2025

Characterizing the atomic structure of functional materials is essential for advancing technologies in energy storage, catalysis, and electronics. Powder X-ray diffraction (PXRD) remains a central tool for this task, providing the average periodic structure even for nanomaterials where finite size and surface effects are present. Yet identifying a suitable crystallographic model directly from PXRD is difficult, as instrumental resolution limits and sample characteristics broaden or distort peaks, creating overlap and ambiguity. In parallel, generative and machine learning approaches to crystal structure prediction have progressed rapidly, but most operate on high-level descriptors such as composition or symmetry and do not incorporate diffraction data directly [1,2,3]. This motivates a central question: how does PXRD conditioning shape the accuracy and uncertainty of generative crystal structure predictions?

To investigate this, we introduce deCIFer [4], a PXRD-conditioned autoregressive transformer that generates full CIF structures from an encoded diffraction profile, with optional inputs such as composition or space group. deCIFer is trained on a large set of inorganic structures paired with simulated PXRD patterns and therefore provides a representative case study for PXRD-guided generative modelling.

We assess robustness through a unified evaluation combining synthetic perturbation experiments and real experimental PXRD. Synthetic tests apply physically motivated distortions to the input pattern, including unit-cell scaling, peak shifts, peak asymmetry, Scherrer broadening, additive noise, and background variation. Across these conditions, we observe that PXRD conditioning improves structure prediction when diffraction features are sufficiently informative, producing tight sets of plausible structural candidates. As distortions increase, deCIFer transitions to broader structural distributions that reflect the diminishing information content of the PXRD, and when the signal becomes uninformative, the predictions revert toward statistically favored structures. Lastly, we evaluate the model on experimental PXRD from well-characterized bulk and nanomaterial samples. In these cases, deCIFer provides reasonable structural candidates that capture the main diffraction features and can serve as starting models for downstream analysis. These results highlight that PXRD conditioning can improve the robustness of crystal structure prediction and support practical workflows in materials characterization.
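
To make the synthetic perturbations concrete, the following is a minimal sketch of how several of the listed distortions (unit-cell scaling, peak shifts, Scherrer broadening, additive noise, and background variation) could be applied to a simulated pattern. It is not the deCIFer implementation: the function names (`scale_unit_cell`, `scherrer_fwhm`, `perturb_pattern`), the Gaussian peak shape, the Cu Kα wavelength, the linearly decaying background, and the linear addition of instrumental and size broadening are all assumptions made for illustration. Peak asymmetry is omitted for brevity.

```python
import numpy as np

CU_K_ALPHA_NM = 0.15406  # assumed wavelength (Cu K-alpha), not stated in the abstract


def scale_unit_cell(two_theta_deg, scale):
    """Shift peak positions for an isotropic lattice scaling d -> scale * d.

    Follows from Bragg's law: sin(theta') = sin(theta) / scale.
    """
    theta = np.radians(np.asarray(two_theta_deg, dtype=float)) / 2.0
    sin_new = np.clip(np.sin(theta) / scale, -1.0, 1.0)
    return 2.0 * np.degrees(np.arcsin(sin_new))


def scherrer_fwhm(two_theta_deg, size_nm, wavelength_nm=CU_K_ALPHA_NM, k=0.9):
    """Scherrer size broadening (FWHM in degrees 2-theta): K * lambda / (L * cos(theta))."""
    theta = np.radians(np.asarray(two_theta_deg, dtype=float)) / 2.0
    return np.degrees(k * wavelength_nm / (size_nm * np.cos(theta)))


def perturb_pattern(two_theta, positions, intensities, *, cell_scale=1.0,
                    shift_deg=0.0, size_nm=None, base_fwhm=0.05,
                    noise_std=0.0, background=0.0, rng=None):
    """Build a Gaussian-peak pattern and apply the requested distortions."""
    rng = np.random.default_rng() if rng is None else rng
    two_theta = np.asarray(two_theta, dtype=float)
    intensities = np.asarray(intensities, dtype=float)

    # Unit-cell scaling followed by a rigid peak shift (e.g. a zero-point offset).
    pos = scale_unit_cell(positions, cell_scale) + shift_deg

    # Per-peak width: instrumental width plus optional Scherrer size broadening.
    # Adding the widths linearly is a simplification assumed for this sketch.
    fwhm = np.full_like(pos, base_fwhm)
    if size_nm is not None:
        fwhm = fwhm + scherrer_fwhm(pos, size_nm)
    sigma = fwhm / (2.0 * np.sqrt(2.0 * np.log(2.0)))

    diffs = two_theta[:, None] - pos[None, :]
    pattern = (intensities[None, :] * np.exp(-0.5 * (diffs / sigma[None, :]) ** 2)).sum(axis=1)

    # Linearly decaying background and additive Gaussian noise.
    pattern = pattern + background * np.linspace(1.0, 0.0, two_theta.size)
    pattern = pattern + rng.normal(0.0, noise_std, size=two_theta.size)

    # Clip negative counts and normalise to unit maximum.
    pattern = np.clip(pattern, 0.0, None)
    peak = pattern.max()
    return pattern / peak if peak > 0 else pattern


if __name__ == "__main__":
    grid = np.linspace(10.0, 80.0, 3501)
    distorted = perturb_pattern(grid, [30.0, 50.0], [1.0, 0.5], cell_scale=1.02,
                                size_nm=5.0, noise_std=0.02, background=0.1,
                                rng=np.random.default_rng(0))
    print(distorted.shape)
```

Sweeping a single parameter, such as `size_nm` or `noise_std`, while holding the others fixed gives a family of increasingly degraded inputs of the kind that a robustness evaluation like the one described above would feed to the model.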

© FUNDACIO DE LA COMUNITAT VALENCIANA SCITO