Machine learning massively speeds up scouring of periodic table for stable structures

An algorithm has been created that can rapidly scan hypothetical crystal structures containing any natural element and find those likely to be stable. The program, which was trained by machine learning on a dataset of 140,000 materials, makes predictions almost as accurately as quantum chemistry simulations but far more efficiently. The researchers searched 31 million potential structures and identified 1.8 million predicted to be stable. Such searches could uncover superior materials for multiple applications.

The Schrödinger equation cannot be solved exactly for chemical systems more complex than the hydrogen atom. Chemists therefore often rely on ab initio simulations of the electron density, such as density functional theory (DFT), to calculate the properties of multi-electron systems. These can be remarkably accurate, but for large polyatomic systems such as crystals they become extremely computationally costly. ‘As a rule of thumb, DFT can be done for any material with less than 1000 atoms in the unit cell,’ says Shyue Ping Ong at the University of California, San Diego, ‘but it scales badly and you cannot do DFT for millions of materials easily, even if they are simple materials.’

Interatomic potentials simplify this by considering only the shape of the potential energy surface of atoms in different environments. To learn the shape of this surface, researchers have turned to machine learning algorithms. ‘Say you’re interested in carbon,’ says Ong. ‘So, maybe you perform DFT calculations on different structures of carbon to get the energies and forces. You then use that data to train a machine-learning potential for just carbon… But that potential is not going to work for silicon or silicon carbide. You have to repeat the whole training process for every single chemistry you work with.’

Mind over matter

To overcome this limitation, Ong and his project scientist Chi Chen used graph neural networks, a machine learning approach that combines a graph representation with algorithms loosely modelled on the brain. They started from DFT calculations of the energies and interatomic forces in 140,000 known and hypothetical inorganic structures in the Materials Project – an open source database started in 2011. These acted as training data for a graph neural network algorithm called M3GNet, which taught itself to infer how a given atom of a particular element was likely to behave in different configurations. The result is the M3GNet ‘universal interatomic potential’ governing the interactions between 89 different elements.
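
As a rough illustration of the graph representation that underpins this approach, the sketch below builds an atom-and-bond graph for a crystal using pymatgen, the library behind the Materials Project. The 5 Å cutoff radius and the MgO example are illustrative assumptions, not details from the paper, and the real M3GNet model additionally learns three-body interactions and trained features.

```python
# Illustrative sketch (not the authors' code): the graph representation
# behind graph-network potentials. Atoms become nodes; pairs of atoms
# within a cutoff radius become edges. The 5 A cutoff is an assumption.
from pymatgen.core import Lattice, Structure

# A rock-salt MgO cell as a stand-in for any of the 89 supported elements.
structure = Structure.from_spacegroup(
    "Fm-3m", Lattice.cubic(4.3), ["Mg", "O"], [[0, 0, 0], [0.5, 0.5, 0.5]]
)

cutoff = 5.0  # angstroms
nodes = [site.specie.symbol for site in structure]  # atom identities
edges = []                                          # neighbour pairs
for i, site in enumerate(structure):
    for neighbour in structure.get_neighbors(site, r=cutoff):
        edges.append((i, neighbour.index, neighbour.nn_distance))

print(f"{len(nodes)} nodes, {len(edges)} edges")
# A graph neural network passes messages along these edges so that a
# single model can predict energies and forces for any element mix.
```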

The researchers took arbitrary starting structures for 5283 crystals. They first used M3GNet to find approximate ground states, which were then relaxed again by DFT to find more accurate ones. Relaxation of the arbitrary structures by M3GNet reduced the crystals’ energies by at least 10 times as much as the subsequent DFT relaxation, showing that M3GNet had found each system’s ground state almost as well as DFT but at much lower computational cost. The structure of K57Se34, for example, which had originally required 15 hours on a 32-core processor when calculated by DFT, could be calculated by M3GNet in 22 seconds on a single-core laptop. ‘Right now, thousands or tens of thousands of atoms are doable,’ Ong says. ‘We are implementing an improvement to the code that uses GPUs which will make systems with more than 100,000 atoms more doable.’

Moreover, Chi Chen, now at Microsoft Quantum, says the two-stage relaxation offers a further advantage. ‘For many structures, the DFT calculation gets stuck in local minima. M3GNet relaxation followed by DFT relaxation gets lower energies compared to doing DFT relaxation alone,’ he explains.
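
A minimal sketch of this two-stage workflow is shown below, assuming the published m3gnet Python package’s Relaxer interface; the input filename is hypothetical, and the follow-up DFT relaxation would be run with an external code, so it is only indicated by a comment.

```python
# Sketch of the two-stage relaxation described above, assuming the
# m3gnet package's Relaxer API (pip install m3gnet). Not the authors'
# production pipeline.
from pymatgen.core import Structure
from m3gnet.models import Relaxer

structure = Structure.from_file("candidate.cif")  # hypothetical input file

relaxer = Relaxer()  # loads the default pre-trained M3GNet potential
result = relaxer.relax(structure)

m3gnet_ground_state = result["final_structure"]
m3gnet_energy = float(result["trajectory"].energies[-1])
print(f"M3GNet-relaxed energy: {m3gnet_energy:.3f} eV")

# Stage two: feed m3gnet_ground_state into a DFT code (e.g. VASP via
# pymatgen's input sets) for a final, more accurate relaxation.
m3gnet_ground_state.to(filename="POSCAR")
```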

Charting material space

The researchers then used M3GNet to search 31 million hypothetical crystal structures with 50 or fewer atoms in the unit cell, finding that about 1.8 million were potentially stable. They analysed 2000 of these using DFT, which agreed with the prediction of stability in 1578 cases. They also used M3GNet to predict the materials’ elasticities and phonon dispersion curves, which are important for determining properties such as dynamic stability and thermal conductivity; in both cases they found good agreement with DFT.
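
In the same spirit, a screening loop might look like the hedged sketch below. The input folder, the shortlist size and the use of raw energy per atom are all simplifying assumptions; the paper itself judges stability by the energy above the convex hull of competing phases.

```python
# Hypothetical high-throughput screening loop: relax every candidate with
# M3GNet and shortlist the lowest-energy structures for DFT validation.
# A real stability call would compare each energy against the convex hull
# of competing phases; raw eV/atom is used here only for illustration.
from pathlib import Path
from pymatgen.core import Structure
from m3gnet.models import Relaxer

relaxer = Relaxer()
ranked = []
for cif in Path("candidates").glob("*.cif"):  # hypothetical input folder
    structure = Structure.from_file(cif)
    result = relaxer.relax(structure)
    e_per_atom = float(result["trajectory"].energies[-1]) / len(structure)
    ranked.append((e_per_atom, cif.name))

# Pass the most stable-looking candidates on to expensive DFT checks.
for energy, name in sorted(ranked)[:100]:
    print(f"{name}: {energy:.3f} eV/atom")
```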

The researchers now hope the algorithm can find useful new materials. ‘There is no synthesis in this work,’ says Ong. ‘DFT is the benchmark. But it is my hope that some of these predictions will eventually be experimentally verified. For example, I work on lithium-ion batteries. One of the properties that you’re usually interested in is lithium-ion conductivity. Our universal potential can work with any combination of elements in the periodic table and so can be used to simulate any potential lithium-ion battery material.’

‘It’s very impressive work,’ says Pablo Piaggi of Princeton University in New Jersey. ‘Typically deep learning in molecular simulations is used to study a specific substance and a very small number of chemical elements. In this paper they go in completely the opposite direction… My impression in general was that the work has tremendous potential to predict new compounds, which possibly may outperform known materials for different applications.’