Molecular Simulations, Epidemiology Highlights at ARC23

The Center for Research Computing hosted leading researchers in fields from molecular dynamics to public health at the Center’s 2023 Advancing Research through Computing symposium. The symposium’s theme was COVID-19 Wins and Lessons, and speakers explored the ways that advanced computing was employed in research into the myriad problems raised by the pandemic.

Following the theme, the symposium opened with two back-to-back speakers who worked on one of the great “wins.” Keynote speaker Rommie Amaro, distinguished professor of theoretical and computational chemistry at UC San Diego, and Pitt chemistry professor Lillian Chong both contributed to one of the most successful and eye-opening discoveries about how the SARS-CoV-2 virus functions. A team led by Amaro developed a molecular dynamics simulation of the virus in atomic detail to show the function of the spike protein as the virus infects a human cell.

Amaro described the simulation technique as a “computational microscope” that complements physical experimentation. “Simulation combines the molecular data in a structural view and fills in the missing pieces that experimentation cannot. Experimentalists can make the environment more frozen and manipulate molecules the way that the simulation has guided them.”

Amaro’s team explored the movement of SARS-CoV-2’s spike protein to understand how it gains access to the human cell. In a first-of-its-kind feat, the team built an AI-based workflow to more efficiently simulate the spike within the SARS-CoV-2 viral envelope, a system of 305 million atoms, in the most comprehensive simulation of the virus performed to date.

Vital to the effort was the WESTPA software package developed by Chong and collaborators, which created unprecedented views of systems of thousands of atoms acting over longer timescales than previously possible with even the most powerful supercomputers.

“We were shooting for the moon,” Chong said. “The head piece of the spike protein itself is one million atoms, and it opens on the timescale of seconds, way beyond the timescales we could capture before. Standard simulations would have taken years. We were able to focus on ‘rare events’ – the lucky transitions like the spike protein opening.”

The team’s study, published in Nature Chemistry, advanced understanding of how the opening of the spike protein, one of the structures protruding from the round surface of the virus, enables infection of the host cell. One glycan, among the sugar molecules covering the spike protein, acts as a lever, pushing the spike receptor from a “down” to an “up” position. Collaborators from two experimental labs validated the results of the simulations, which can be viewed at the UC San Diego site or The New York Times. The team’s simulations included the virus suspended in an airborne drop of water to model respiratory transmission.

In recognition of this work, the team won the 2020 ACM Gordon Bell Special Prize for HPC-Based COVID-19 Research, widely known as the Nobel Prize of Supercomputing.

After the wins came the lessons. Two speakers – Pitt’s Harry Hochheiser, associate professor of biomedical informatics and director of the Models of Infectious Disease Agent Study (MIDAS) Coordination Center, and Roni Rosenfeld, professor and head of CMU’s Machine Learning Department – both approached the dilemmas of uncertainty in forecasting the course of the pandemic, drawing on their experience of helping to guide public policy at the local, state, and federal levels.

Two large sets of problematic variables complicated any kind of forecast: data and human behavior.

“The impact of the policy comes from public communication of the research that is coming out,” explained Hochheiser. “Computing is not enough.”

Rosenfeld took a philosophical approach to the question of why some forecasts work and some do not. The Delphi group he leads at CMU develops epidemiological forecasting, with a long-term vision of making it as universally accepted and useful as weather forecasting. Like Hochheiser’s MIDAS group at Pitt, the Delphi group worked to provide useful real-time information to the CDC, state and local public health agencies, healthcare providers, civic leaders, data journalists and the public.

“In predicting the weather, past and present are not debatable,” Rosenfeld explained. “We can study how the current state turns into the future state. But not in epidemiology. Uncertainty is also in the present, and uncertainty is in the past.”

Ambiguous data from the past and present contributes to the uncertainty. From area to area and institution to institution, hospitals, government and other health agencies do not define data points like hospitalizations, deaths and test positivity by the same metrics – meaning that there is not only disagreement about what will happen, but disagreement about what already has happened.

“People have vastly different reactions,” Rosenfeld said. “There was a failure to focus on the behavior of people.”

Rosenfeld advocated building health systems and information technology based on the assumption that another emergency is inevitable.

“In an emergency, the only systems likely to prove useful are those already in operation,” he explained. “Problems in the real world change faster than technology, and we must be able to ask questions overnight.”