A team of MIT researchers has created a ‘ChatGPT for spreadsheets’ that helps solve difficult engineering challenges more rapidly. According to the researchers, the approach could help engineers tackle extremely complex design problems, from power grid optimisation to vehicle design.
Many engineering challenges come down to the same headache – too many knobs to turn and too few chances to test them. Whether tuning a power grid or designing a safer vehicle, each evaluation can be costly, and there may be hundreds of variables that could matter.
Consider car safety design. Engineers must integrate thousands of parts, and many design choices can affect how a vehicle performs in a collision. Classic optimisation tools can struggle when searching so many options for the best combination.
The new approach rethinks how a classic method, known as Bayesian optimisation, can be used to solve problems with hundreds of variables. In tests on realistic engineering-style benchmarks, such as power-system optimisation, the approach found top solutions 10–100 times faster than widely used methods.
The technique leverages a foundation model trained on tabular data that automatically identifies the variables that matter most for improving performance, repeating the process to home in on better and better solutions. Foundation models are huge artificial intelligence systems trained on vast, general datasets, which allows them to adapt to different applications.
The researchers’ tabular foundation model doesn’t need to be constantly retrained as it works toward a solution, increasing the efficiency of the optimisation process. The technique also delivers greater speedups for more complicated problems, so it could be especially useful in demanding applications such as materials development or drug discovery.
‘Modern AI and machine-learning models can fundamentally change the way engineers and scientists create complex systems,’ said Rosen Yu, a graduate student in computational science and engineering. ‘We came up with one algorithm that can not only solve high-dimensional problems, but is also reusable so it can be applied to many problems without the need to start everything from scratch.’
When scientists seek to solve a multifaceted problem but have expensive methods to evaluate success, such as crash testing a car to know how good each design is, they often use a tried-and-true method called Bayesian optimisation. This iterative method finds the best configuration for a complicated system by building a surrogate model that helps estimate what to explore next while considering the uncertainty of its predictions.
But the surrogate model must be retrained after each iteration, which can quickly become computationally intractable when the space of potential solutions is very large. In addition, scientists need to build a new model from scratch any time they want to tackle a different scenario.
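The loop described above can be sketched in a few lines of Python. This is a minimal illustration of classic Bayesian optimisation, not the researchers' code: it assumes a one-dimensional problem, a small Gaussian-process surrogate with a fixed kernel, and an upper-confidence-bound rule for choosing the next test. The function `expensive_evaluation` and every parameter value are hypothetical stand-ins. Note that the surrogate is refitted from scratch on each pass, which is the cost the article refers to.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_gp(xs, ys, noise=1e-4):
    """Fit a zero-mean Gaussian process; return a function x -> (mean, std)."""
    K = [[rbf(a, b) + (noise if i == j else 0.0)
          for j, b in enumerate(xs)] for i, a in enumerate(xs)]
    alpha = solve(K, ys)  # K^-1 y, reused for every prediction

    def predict(x):
        k = [rbf(a, x) for a in xs]
        v = solve(K, k)
        mean = sum(ki * ai for ki, ai in zip(k, alpha))
        var = max(1.0 - sum(ki * vi for ki, vi in zip(k, v)), 0.0)
        return mean, math.sqrt(var)

    return predict


def expensive_evaluation(x):
    """Hypothetical stand-in for a costly test; its maximum is at x = 0.7."""
    return -(x - 0.7) ** 2


def bayes_opt(n_iter=10, beta=2.0):
    xs = [0.0, 0.5, 1.0]  # small fixed initial design (an assumption)
    ys = [expensive_evaluation(x) for x in xs]
    candidates = [i / 100 for i in range(101)]
    for _ in range(n_iter):
        predict = fit_gp(xs, ys)  # surrogate rebuilt from all data each round

        def ucb(x):
            mean, std = predict(x)
            return mean + beta * std  # prediction plus an uncertainty bonus

        x_next = max(candidates, key=ucb)
        xs.append(x_next)
        ys.append(expensive_evaluation(x_next))
    best = max(range(len(xs)), key=lambda i: ys[i])
    return xs[best], ys[best], list(zip(xs, ys))


best_x, best_y, history = bayes_opt()
print(f"best x = {best_x:.2f}, score = {best_y:.4f}, evaluations = {len(history)}")
```

Even in this toy setting, every round repeats the full fit on a growing dataset. With hundreds of variables and a far larger candidate space, that repeated fitting is what becomes expensive.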
To address both shortcomings, the MIT researchers utilised a generative AI system known as a tabular foundation model as the surrogate model inside a Bayesian optimisation algorithm. ‘A tabular foundation model is like a ChatGPT for spreadsheets. The input and output of these models are tabular data, which in the engineering domain is much more common to see and use than language,’ Yu said.
Just like large language models such as ChatGPT, Claude and Gemini, the model has been pre-trained on an enormous amount of tabular data. This makes it well equipped to tackle a range of prediction problems. In addition, the model can be deployed as is, without the need for any retraining.
To make their system more accurate and efficient for optimisation, the researchers employed a trick that enables the model to identify features of the design space that will have the biggest impact on the solution. ‘A car might have 300 design criteria, but not all of them are the main driver of the best design if you are trying to increase some safety parameters. Our algorithm can smartly select the most critical features to focus on,’ Yu said.
It does this by using a tabular foundation model to estimate which variables (or combinations of variables) most influence the outcome. It then focuses the search on those high-impact variables instead of wasting time exploring everything equally. For instance, if the size of the front crumple zone significantly increased and the car’s safety rating improved, that feature likely played a role in the enhancement.
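The idea of screening for high-impact variables and then searching only along them can be illustrated with a short Python sketch. It is not the researchers' method: where they use a tabular foundation model to estimate influence, this example swaps in a plain correlation score as a simple stand-in. The objective `safety_score`, the helper names, and the choice of eight variables with two that matter are all hypothetical.

```python
import random


def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = sum((a - mu) ** 2 for a in u) ** 0.5
    sv = sum((b - mv) ** 2 for b in v) ** 0.5
    return cov / (su * sv) if su > 0 and sv > 0 else 0.0


def rank_variables(designs, scores):
    """Return variable indices ordered from most to least influential,
    using |correlation with the score| as a crude influence estimate."""
    dims = len(designs[0])
    strength = [abs(pearson([d[j] for d in designs], scores))
                for j in range(dims)]
    return sorted(range(dims), key=lambda j: strength[j], reverse=True)


def focused_candidates(base, active, rng, n=20, step=0.1):
    """Propose new designs that perturb only the active variables of `base`,
    keeping every value inside [0, 1]."""
    out = []
    for _ in range(n):
        cand = list(base)
        for j in active:
            cand[j] = min(1.0, max(0.0, cand[j] + rng.uniform(-step, step)))
        out.append(cand)
    return out


def safety_score(design):
    """Hypothetical objective: only variables 0 and 3 affect the result."""
    return 3.0 * design[0] - 2.0 * design[3]


rng = random.Random(1)
designs = [[rng.random() for _ in range(8)] for _ in range(200)]
scores = [safety_score(d) for d in designs]

active = rank_variables(designs, scores)[:2]  # keep the two strongest variables
best = max(designs, key=safety_score)
candidates = focused_candidates(best, active, rng)
print("variables selected for the focused search:", sorted(active))
```

A correlation score only captures linear, one-variable-at-a-time effects, so it would miss the combinations of variables the article mentions. It is used here only to show how a ranking step narrows the search.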
One of their biggest challenges was finding the best tabular foundation model for this task, Yu said. Then they had to connect it with a Bayesian optimisation algorithm in such a way that it could identify the most prominent design features.
‘Finding the most prominent dimension is a well-known problem in math and computer science, but coming up with a way that leveraged the properties of a tabular foundation model was a real challenge,’ Yu said.
With the algorithmic framework in place, the researchers tested their method by comparing it to five state-of-the-art optimisation algorithms. On 60 benchmark problems, including realistic situations such as power grid design and car crash testing, their method consistently found the best solution between 10 and 100 times faster than the other algorithms. ‘When an optimisation problem gets more and more dimensions, our algorithm really shines,’ Yu added.
However, their method did not outperform the baselines on every problem; robotic path planning was one exception. This likely indicates that the scenario wasn’t well defined in the model’s training data, Yu said.
In the future, the researchers want to study methods that could boost the performance of tabular foundation models. They also want to apply their technique to problems with thousands or even millions of dimensions, such as the design of a naval ship.


