OSU release – CORVALLIS, Ore. – Nanosized cages may play a big role in reducing energy consumption in science and industry, and machine-learning research at Oregon State University aims to accelerate the deployment of these remarkable molecules.

The porous organic cage molecules being studied at OSU are able to selectively capture gas molecules, potentially enabling huge energy savings in the myriad gas separations conducted in the chemical sector.

“These porous molecular solids are like sponges that soak up gases discriminately,” said Cory Simon, assistant professor of chemical engineering and corresponding author of a study published in ACS Central Science.

Together, the separation and purification of chemical mixtures account for more than 10 percent of the world’s energy consumption.

Porous cage molecules have nanosized cavities intrinsic to their structure, and gas molecules are attracted to and trapped within these cavities via adsorption.

“But each cage adsorbs certain gases more readily than others, and this property potentially makes the cages useful for separating gas mixtures more energy-efficiently,” Simon said.

However, thousands of these cage molecules could be synthesized, and making even one of them and testing its properties takes months in the lab. Industry, meanwhile, requires hundreds of different chemical separations. Hence the need for a computational approach to sort through the possibilities and find the best molecule for the job at hand.

Simon exploited the idea that the shape of any given cavity is responsible for which gas molecules it most readily attracts.

Simon and students Arni Sturluson, Melanie Huynh and Arthur York employed an “unsupervised” machine-learning method to categorize and group together cage molecules based on their cavity shapes and, thus, their adsorption properties. “Unsupervised” means the computer learned the shape/property relationships on its own; it wasn’t given any labels to instruct it.
“Just show the data to the algorithm, and it automatically finds patterns – structure – in the data,” Simon said.

The researchers used a training dataset of 74 experimentally synthesized porous organic cage molecules. Each molecule was computationally scanned, producing a 3D “porosity” image of it, similar to an image generated by a CT scan.

“On the basis of these 3D images, we took inspiration from a facial recognition algorithm, eigenfaces, to group together cages with similarly shaped cavities,” he said. “Using the singular value decomposition, we encoded the 3D images of the cages into lower-dimensional vectors.”

Simon explains the process using the analogy of people’s faces.

“Imagine you were forced to map everyone’s face onto a point in a two-dimensional scatter plot while preserving as much information as you can about the faces,” he said. “So each face is described by just two numbers, and similar-looking faces are grouped close by in the scatter plot. Essentially, the singular value decomposition performed this encoding, but for porous cage molecules.”

The research demonstrated that the learned encoding captures the salient features of the cavities of porous cages and can predict properties of the cages that relate to cavity shape.

“Our methods could be applied to learn latent representations of cavities within other classes of porous materials and of shapes of molecules in general,” Simon said.
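
For readers curious how an eigenfaces-style encoding works in practice, the following is a minimal Python sketch, not the researchers' actual code. It flattens each 3D image into a vector, subtracts the average image, and uses the singular value decomposition to map every image to two numbers. The function names (`encode_cages`, `synthetic_cavity`), the grid size and the spherical "cavities" are all illustrative assumptions; the real study used computed porosity images of 74 synthesized cages, which are not reproduced here.

```python
import numpy as np


def synthetic_cavity(radius, n=16):
    """Hypothetical stand-in for a 3D porosity image: a spherical void
    of the given radius on an n x n x n grid (1.0 = open space)."""
    axis = np.linspace(-1.0, 1.0, n)
    x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
    return (x**2 + y**2 + z**2 < radius**2).astype(float)


def encode_cages(images, n_components=2):
    """Eigenfaces-style encoding of 3D images via the SVD.

    images: array of shape (n_cages, nx, ny, nz).
    Returns the low-dimensional codes, the leading basis images
    (flattened), the mean image (flattened) and the singular values.
    """
    n_cages = images.shape[0]
    # Flatten each 3D image into one long row vector.
    X = images.reshape(n_cages, -1).astype(float)
    # Center the data so the SVD captures variation around the mean.
    mean_image = X.mean(axis=0)
    X_centered = X - mean_image
    # Thin SVD: X_centered = U @ diag(s) @ Vt, with s sorted descending.
    U, s, Vt = np.linalg.svd(X_centered, full_matrices=False)
    # Coordinates of each cage along the leading singular directions.
    codes = U[:, :n_components] * s[:n_components]
    basis_images = Vt[:n_components]
    return codes, basis_images, mean_image, s


# Synthetic data only: 74 spherical cavities of increasing radius.
radii = np.linspace(0.2, 0.9, 74)
images = np.stack([synthetic_cavity(r) for r in radii])

codes, basis_images, mean_image, s = encode_cages(images, n_components=2)
print(codes.shape)  # one (x, y) point per cage for a 2D scatter plot
```

Each row of `codes` is the pair of numbers that would place one cage on a two-dimensional scatter plot, so cages with similar cavity shapes should land near one another. Grouping those points into clusters would be a separate step, which this sketch leaves out.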