Themed collection: Data-driven discovery in the chemical sciences

22 items
Open Access Paper

Discovery of highly anisotropic dielectric crystals with equivariant graph neural networks

We adopt the latest approaches in equivariant graph neural networks to develop a model that can predict the full dielectric tensor of crystals, discovering crystals with almost isotropic connectivity but highly anisotropic dielectric tensors.

Paper

Accurate and reliable thermochemistry by data analysis of complex thermochemical networks using Active Thermochemical Tables: the case of glycine thermochemistry

Active Thermochemical Tables (ATcT) are employed to resolve existing inconsistencies surrounding the thermochemistry of glycine and produce accurate enthalpies of formation for this system.

Open Access Paper

Leveraging natural language processing to curate the tmCAT, tmPHOTO, tmBIO, and tmSCO datasets of functional transition metal complexes

Leveraging natural language processing models including transformers, we curate four distinct datasets: tmCAT for catalysis, tmPHOTO for photophysical activity, tmBIO for biological relevance, and tmSCO for magnetism.

Open Access Paper

Predictive crystallography at scale: mapping, validating, and learning from 1000 crystal energy landscapes

We demonstrate the reliability and scalability of computational crystal structure prediction (CSP) methods for small, rigid organic molecules by performing in-depth CSP investigations for over 1000 such compounds.

Open Access Paper

Integration of generative machine learning with the heuristic crystal structure prediction code FUSE

We integrate generative machine learning with heuristic crystal structure prediction in FUSE. The combined approach outperforms either component alone, accelerating the pace at which we will be able to predict and discover new compounds.

Open Access Paper

Beyond theory-driven discovery: introducing hot random search and datum-derived structures

Long high-temperature anneals driven by Ephemeral Data-Derived Potentials (EDDPs) and combined with AIRSS, termed hot-AIRSS, enable the exploration of low-energy configurations of complex materials.

Paper

Optical materials discovery and design with federated databases and machine learning

New hypothetical compounds are reported in a collection of online databases. By combining active learning with density-functional theory calculations, this work screens through such databases for materials with optical applications.

Open Access Paper

Knowledge distillation of neural network potential for molecular crystals

Knowledge distillation improves the neural network potential for organic molecular crystals.

Open Access Paper

Mapping inorganic crystal chemical space

We enumerate binary, ternary, and quaternary element and species combinations and present a two-dimensional representation of inorganic crystal chemical space, labelled according to whether the combinations pass standard chemical filters and whether they appear in known databases.

Open Access Accepted Manuscript - Paper

How to do impactful research in artificial intelligence for chemistry and materials science

Open Access Accepted Manuscript - Paper

Making the InChI FAIR and sustainable while moving to Inorganics

Open Access Accepted Manuscript - Paper

Prediction rigidities for data-driven chemistry

Accepted Manuscript - Paper

Specialising and Analysing Instruction-Tuned and Byte-Level Language Models for Organic Reaction Prediction

Open Access Accepted Manuscript - Paper

A critical reflection on attempts to machine-learn materials synthesis insights from text-mined literature recipes

Open Access Accepted Manuscript - Paper

Data-efficient fine-tuning of foundational models for first-principles quality sublimation enthalpies

Accepted Manuscript - Paper

Re-evaluating Retrosynthesis Algorithms with Syntheseus

Open Access Accepted Manuscript - Paper

Sequence determinants of protein phase separation and recognition by protein phase-separated condensates through molecular dynamics and active learning

Open Access Accepted Manuscript - Paper

Modelling ligand exchange in metal complexes with machine learning potentials

Open Access Accepted Manuscript - Paper

Web-BO: Towards increased accessibility of Bayesian optimisation (BO) for chemistry

Open Access Accepted Manuscript - Paper

Embedding human knowledge in material screening pipeline as filters to identify novel synthesizable inorganic materials

Open Access Accepted Manuscript - Paper

How big is Big Data?

Open Access Accepted Manuscript - Paper

Are we fitting data or noise? Analysing the predictive power of commonly used datasets in drug-, materials-, and molecular-discovery


About this collection

We are delighted to share with you a selection of the papers associated with a Faraday Discussion on Data-driven discovery in the chemical sciences. More information about the event can be found at http://rsc.li/data-fd2024. Additional articles will be added to the collection as they are published. The final versions of all the articles presented, together with a record of the discussions, will be published after the event.

The Discussion will involve four central themes, each focused on a different aspect of chemical "discovery" and each aiming to promote the exchange of ideas between the molecular and materials communities: discovering chemical structure; discovering structure–property correlations; discovering synthesis targets; and discovering trends in big data.

On behalf of the Scientific Committee, we hope that you will join us and participate in this exciting event, and that you enjoy these articles and the record of the discussion.
