How can machine learning help guide scientific exploration? In many cases, a straightforward application of existing methods cannot handle the small data sizes and other limitations of experimental data. In this talk, I will discuss some recent work on developing machine learning approaches for inorganic and soft-matter problems. First, I will discuss predicting the composition of amine-templated metal oxide (ATMO) materials, a diverse class of 3700+ extracted structures spanning 70 elements. Suppose you start with a given compound and then change the organic template amine: how would the composition of the crystalline product change? This is especially challenging because there are typically very few (or no) examples of matched pairs known in the literature. To address this, we developed an Augmented CycleGAN model that is trained on unpaired data and can "translate" the known composition of a compound directed by amine A into a distribution of reasonable chemical compositions that could result when directed by amine B.[1] Second, I will discuss predicting the miscibility of aqueous two-phase systems (also known as aqueous biphasic systems), which find applications in environmentally friendly separations and extractions. Solution-phase interactions are a challenge for direct simulation. Instead of using pre-computed features, we show how this problem can be treated as a graph-regularized logistic matrix factorization problem to "learn" relevant features directly from the experimental observations. Surprisingly, even with just a few hundred examples, these learned representations can outperform supervised machine learning models trained on physicochemical features.[2] Finally, I will discuss a recent perspective on the limitations of existing machine learning and AI methods for the discovery of "exceptional" materials, which are by their nature outside of the training distribution.
After briefly summarizing case studies of the limits of existing machine learning approaches for the discovery of high-Tc superconductors and superhard structural materials, I will describe six research directions that can address these limitations.[3]

[1] Q. Ai, A. J. Norquist, J. Schrier, "Predicting compositional changes of organic-inorganic hybrid materials with Augmented CycleGAN," Digital Discovery 1, 255-265 (2022) doi:10.1039/D1DD00044F
[2] D. Behnoudfar, C. M. Simon, J. Schrier, "Data-driven imputation of miscibility of aqueous solutions via graph-regularized logistic matrix factorization," J. Phys. Chem. B 127, 7964-7973 (2023) doi:10.1021/acs.jpcb.3c03789
[3] J. Schrier, A. J. Norquist, T. Buonassisi, J. Brgoch, "In Pursuit of the Exceptional: Research Directions for Machine Learning in Chemical and Materials Science," J. Am. Chem. Soc. 145, 21699-21716 (2023) doi:10.1021/jacs.3c04783

Bio: Joshua Schrier is a physical chemist interested in using computers to accelerate the discovery of new materials through a combination of physics-based simulations, cheminformatics, machine learning, and automated experimentation. He is the Kim B. and Stephen E. Bepler Professor of Chemistry at Fordham University in New York City. Prior to joining Fordham in 2018, he was on the faculty at Haverford College and was a Luis W. Alvarez computational sciences postdoctoral fellow at Lawrence Berkeley National Laboratory. As a faculty member, he has received awards including the Dreyfus Teacher-Scholar, U.S. Department of Energy Visiting Faculty, and Fulbright Scholar awards.

Host: Ping Yang (T-1)