Using simulation-based techniques, scientists can ask how their ideas, actions, and designs will interact with the physical world. Yet this power is not without costs. Cutting edge simulations can often take months of supercomputer time. Surrogate models and machine learning are promising alternatives for accelerating these workflows, but the data hunger of machine learning has limited their impact to data-rich domains. Over the last few years, researchers have sought to side-step this data dependence through the use of foundation models— large models pretrained on large amounts of data which can accelerate the learning process by transferring knowledge from similar inputs, but this is not without its own challenges.






