AI program plays the long game to solve decades-old math problems


A game of chess requires its players to think several moves ahead, a skill that computer programs have mastered over the years. In 1997, IBM's Deep Blue supercomputer famously beat then world chess champion Garry Kasparov. Later, in 2017, an artificial intelligence (AI) program developed by Google DeepMind, called AlphaZero, triumphed over the best computerized chess engines of the time after training itself to play the game in a matter of hours.

More recently, some mathematicians have begun to actively pursue the question of whether AI programs can also help in cracking some of the world's toughest math problems. But whereas an average game of chess lasts about 30 to 40 moves, these research-level math problems require solutions that take a million or more steps, or moves.

In a paper appearing on the arXiv preprint server, a team led by Caltech’s Sergei Gukov, the John D. MacArthur Professor of Theoretical Physics and Mathematics, describes developing a new type of machine-learning algorithm that can solve math problems requiring extremely long sequences of steps. The team used their new algorithm to solve families of problems related to an overarching decades-old math problem called the Andrews–Curtis conjecture. In essence, the algorithm can think farther ahead than even advanced programs like AlphaZero.

“Our program aims to find long sequences of steps that are rare and hard to find,” says study first author Ali Shehper, a postdoctoral scholar at Rutgers University who will soon join Caltech as a research scientist. “It’s like trying to find your way through a maze the size of Earth. These are very long paths that you have to test out, and there’s only one path that works.”

The use of AI to solve math problems has become increasingly popular. Google DeepMind’s AlphaProof performed at the level of a silver medalist in the 2024 International Mathematical Olympiad, a high-school level math competition. And OpenAI’s o3 program recently reasoned its way through benchmark problems in math, science, and computer programming.

The Caltech-led mathematicians are focusing not on routine problems but rather on the toughest in their field. In the new study, they used AI to solve two families of problems within the Andrews–Curtis conjecture, a group theory problem first proposed 60 years ago.

While they did not solve the main conjecture itself, they disproved families of problems, referred to as potential counterexamples, which had remained open for roughly 25 years; they also made significant progress on another family of counterexamples that has been open for 44 years. Counterexamples are basically mathematical cases that would disprove an original conjecture. If the counterexamples themselves are disproved, then the original conjecture may still be true.

“Ruling out some of the counterexamples gives us confidence in the validity of the original conjecture and helps build our intuition about the main problem. It gives us new ways to think about it,” Shehper says.

Gukov says that navigating these math problems is like “getting from A to B” via convoluted routes that require thousands, millions, or even billions of steps. He compares the problems to solving an incredibly complex Rubik’s Cube.

“Can you take this scrambled, complicated Rubik’s Cube and get it back to its original state? You have to test out these very long sequences of moves, and you won’t know if you are on the right path until the very end,” says Gukov, who is also the director of Caltech’s new Richard N. Merkin Center for Pure and Applied Mathematics.
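In group-theoretic terms, the individual "cube moves" are the Andrews–Curtis moves: concatenating, inverting, or conjugating the relators of a group presentation. The sketch below is illustrative only (the integer encoding and helper names are ours, not the paper's); it shows how small each move is, which is why trivializing a presentation can demand astronomically long sequences of them.

```python
# Illustrative sketch (not the authors' code) of the Andrews-Curtis search
# space: relators are words in free-group generators, encoded as tuples of
# ints (1 = x, -1 = x^-1, 2 = y, -2 = y^-1, ...).

def reduce_word(word):
    """Freely reduce a word by cancelling adjacent inverse pairs."""
    out = []
    for g in word:
        if out and out[-1] == -g:
            out.pop()              # e.g. x followed by x^-1 cancels
        else:
            out.append(g)
    return tuple(out)

def concatenate(relators, i, j):
    """AC move 1: replace r_i with r_i * r_j."""
    r = list(relators)
    r[i] = reduce_word(r[i] + r[j])
    return r

def invert(relators, i):
    """AC move 2: replace r_i with its inverse."""
    r = list(relators)
    r[i] = tuple(-g for g in reversed(r[i]))
    return r

def conjugate(relators, i, g):
    """AC move 3: replace r_i with g * r_i * g^-1 for a generator g."""
    r = list(relators)
    r[i] = reduce_word((g,) + r[i] + (-g,))
    return r

# A toy two-relator presentation; the goal of "trivialization" is to turn
# every relator into a single generator.
rels = [(1, 2, -1, -2), (2,)]   # relators: x y x^-1 y^-1 and y
rels = concatenate(rels, 0, 1)  # r_0 becomes x y x^-1
rels = conjugate(rels, 0, -1)   # conjugating by x^-1 leaves just y
print(rels)                     # [(2,), (2,)]
```

Each move changes a presentation only slightly, so a trivialization is a path through an enormous graph of presentations, which is exactly the maze the program must search.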



The maximum increase in the length of a presentation relative to its initial length along the AC trivialization path. The increase is plotted as a function of the initial length of the presentation on the left and as a function of n on the right. © arXiv (2024). DOI: 10.48550/arxiv.2408.15332

The team's AI program learned to come up with long sequences of unexpected moves, which the researchers term "super moves," or outliers. This contrasts with how AI programs like ChatGPT operate.

“If you ask ChatGPT to write a letter, it will come up with something typical. It’s unlikely to come up with anything unique and highly original. It’s a good parrot,” Gukov says. “Our program is good at coming up with outliers.”


To train their AI program, the researchers used a machine-learning technique known as reinforcement learning. First, the team showed the AI easy problems to solve, and then progressively gave it harder and harder problems.

“It tries various moves and gets rewarded for solving the problems,” Shehper explains. “We encourage the program to do more of the same while still keeping some level of curiosity. In the end, it develops new strategies that are better than what humans can do. That’s the magic of reinforcement learning.”
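The training scheme described above, rewarding solved problems, preserving some exploratory "curiosity," and ramping up difficulty, can be sketched with curriculum-style reinforcement learning. The toy environment, hyperparameters, and tabular Q-learning below are our own stand-ins for illustration, not the team's actual deep-RL system.

```python
# Curriculum-style reinforcement learning sketch: reward the agent for
# solving a problem, keep some exploration ("curiosity") via epsilon-greedy
# action choice, and raise the difficulty once easier problems are mastered.
# Toy task: walk from state 0 to `goal` on a half-line using moves +1/-1.
import random

def train(q, goal, episodes=200, eps=0.1, alpha=0.5, gamma=0.9):
    """Tabular Q-learning; q maps (state, action) -> estimated value."""
    for _ in range(episodes):
        s = 0
        for _ in range(4 * goal):                 # step budget per episode
            if random.random() < eps:             # curiosity: try a random move
                a = random.choice((1, -1))
            else:                                 # exploit what was learned
                a = max((1, -1), key=lambda act: q.get((s, act), 0.0))
            s2 = max(0, s + a)
            r = 1.0 if s2 == goal else -0.01      # rewarded only for solving
            best = max(q.get((s2, b), 0.0) for b in (1, -1))
            old = q.get((s, a), 0.0)
            q[(s, a)] = old + alpha * (r + gamma * best - old)
            s = s2
            if s == goal:
                break

random.seed(0)
q = {}
for goal in (2, 4, 8):   # curriculum: easy problems first, then harder ones
    train(q, goal)
```

After training, the greedy policy walks straight to the hardest goal; the same "reward solutions, stay curious, escalate difficulty" loop underlies the team's far larger search for trivialization paths.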

At present, AI programs are typically not very good at predicting outlying, rare events that have dramatic consequences, such as financial market crashes. The team’s new algorithm cannot make predictions like this either, but it may contain the seeds of what would be required to make intelligent predictions of this nature. “Basically, our program knows how to learn to learn,” Gukov says. “It’s thinking outside the box.”

The team’s new algorithm has already made a big splash in the math community.

“We made a lot of improvements in an area of math that was decades old,” Gukov says. “Progress had been relatively slow, but now it’s hustling and bustling.”

In fact, three new mathematicians have joined the team—Lucas Fagan and Zhenghan Wang of UC Santa Barbara, and Yang Qiu of Nankai University in Tianjin, China—and the group has posted another preprint paper that reports solving even more families of potential counterexamples belonging to the Andrews–Curtis conjecture.

Rather than scaling up its AI models, the team has focused on finding clever new tricks and strategies that do not require large amounts of computing power.

“We try to demonstrate good performance on small-scale computers, easily accessible to a small academic collaboration, so that any of our colleagues around the globe can easily reproduce these results,” say the researchers.

More information:
Ali Shehper et al, What makes math problems hard for reinforcement learning: a case study, arXiv (2024). DOI: 10.48550/arxiv.2408.15332

Journal information:
arXiv

Provided by
California Institute of Technology

Citation:
AI program plays the long game to solve decades-old math problems (2025, February 12)
