Super Mario, scientist
What video game speedrunning can teach us about automating scientific discovery
Super Mario 64 was one of my favourite games as a child. I still remember racing the penguin down the icy slide in ‘Cool, Cool, Mountain’, or plucking the star from the tail of a giant eel in ‘Jolly Roger Bay’. These whimsical challenges contribute to the game’s enduring appeal: nearly three decades on from its release, a community of ‘speedrunners’ are still actively competing to see who can collect all the game’s 120 stars[1] the fastest.
Speedrunners contest world records either by executing the fastest known route better than anyone else or by discovering and exploiting newer, faster routes. In September, the world record 120-star run stood at 1 hour, 37 minutes, and 35 seconds, compared to the 20 or so hours it takes your average player. In his world record run, the speedrunner Weegee plays with sustained machine-like precision; it’s hard to believe that anyone could shave off the extra 36 seconds needed to break through the 1:37 barrier. By now, the standard 120-star route is so thoroughly optimised that the time advantage from most new routes is measured in fractions of a second. It seemed there was little left to discover in this 27-year-old game.
That is, until September, when a new ‘holy grail’ trick was discovered that cuts about a minute off the 120-star run time. What I found fascinating about this discovery is not the specifics of the trick per se, but the process by which it was found. Even if you’re not interested in video games, or Super Mario 64 speedrunning, if you bear with me I think you’ll find the story interesting too.
One of the longest sections of the 120-star run occurs in a level called ‘Rainbow Ride’, in which Mario is supposed to ride a slow magic carpet along a rainbow track to get a star atop ‘The big house in the sky’. The big house in question (more like a castle) is shown below, and you can browse a view of the whole level here if you’re so inclined.
This magic carpet ride takes about 2 minutes, painfully slow for a speedrun[2]. For decades, the speedrunning community thought that it might be possible to skip the carpet ride and jump straight to the top of the house. This would be the so-called ‘carpetless’ strategy. While there had been tantalisingly close attempts at getting up in the past, nobody had been able to find a way up that was actually feasible for a human to reliably execute. At least, not until the programmer Krithalith released a video of a jump technique he discovered and named ‘Orthogonal Jones’.
Orthogonal Jones involves a precise sequence of jumps starting from the window on the left side of the house up onto the slanted green roof. Krithalith didn’t find this sequence himself; after all, no human had found a way up in decades of attempts. Orthogonal Jones was found by an algorithm called ‘scattershot’.
Scattershot is a sort of empirical machine gun; it uses brute force search with random inputs to find the shortest path to collect a star from a given starting point. By simulating the potential paths of thousands, or millions, of Marios, scattershot can discover new optimal routes through levels[3].
This isn’t a video of scattershot or Super Mario 64, but it’s a nice illustration of the principle.
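To make the ‘empirical machine gun’ idea concrete, here is a minimal sketch of brute-force random search. It is not scattershot’s actual code: the grid world, the four-key input set, and the names `replay` and `random_search` are all assumptions of mine, standing in for a real game emulator.

```python
import random

# Toy stand-in for a game: the player is a point on a grid, and each frame
# takes one of four inputs. All names here are hypothetical.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def replay(start, inputs):
    """Apply a sequence of inputs to `start` and return the final position."""
    x, y = start
    for key in inputs:
        dx, dy = MOVES[key]
        x, y = x + dx, y + dy
    return (x, y)


def random_search(start, goal, max_len, attempts, seed=0):
    """Fire off random rollouts and keep the shortest one that hits `goal`."""
    rng = random.Random(seed)
    best = None
    for _attempt in range(attempts):
        pos, inputs = start, []
        for _frame in range(max_len):
            key = rng.choice("UDLR")
            inputs.append(key)
            pos = replay(pos, [key])
            if pos == goal:
                # Goal reached: end this rollout and compare with the best so far.
                if best is None or len(inputs) < len(best):
                    best = inputs
                break
    return best


route = random_search(start=(0, 0), goal=(2, 1), max_len=8, attempts=5000)
print(route)
```

With a goal only three moves away, a few thousand random rollouts are plenty. The interesting question is what happens as the required sequence gets longer.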
However, random exploration by itself is unlikely to converge on the best solution. The problem with random search methods is that the number of possible input sequences grows exponentially with their length, so the processing power and time required to explore the options quickly becomes prohibitive; random search is inefficient for long input sequences. Krithalith got around this by incorporating learning into the algorithm, saving shorter optimal move sequences that can be used as staging points for random exploration[4]. These short sequences can then be chained together to find the best overall long sequence.
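The staging-point idea can be sketched in a few lines. This is my own toy reconstruction under stated assumptions, not Krithalith’s implementation: the bounded grid, the `staged_search` name, and the rule of picking a staging point uniformly at random are all hypothetical choices. The search remembers the shortest known input sequence to every position it has reached, then restarts short random bursts from those saved positions.

```python
import random

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def replay(start, inputs):
    """Apply a sequence of inputs to `start` and return the final position."""
    x, y = start
    for key in inputs:
        dx, dy = MOVES[key]
        x, y = x + dx, y + dy
    return (x, y)


def staged_search(start, goal, width, height, burst_len, iterations, seed=0):
    """Random search that restarts from saved staging points (hypothetical sketch)."""
    rng = random.Random(seed)
    # Maps each reached position to the shortest input sequence known to reach it.
    best_route = {start: []}
    for _iteration in range(iterations):
        # Pick a saved staging point and resume from its stored route.
        pos = rng.choice(list(best_route))
        route = list(best_route[pos])
        for _frame in range(burst_len):
            key = rng.choice("UDLR")
            dx, dy = MOVES[key]
            nxt = (pos[0] + dx, pos[1] + dy)
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue  # walked into a wall; this input is discarded
            pos = nxt
            route.append(key)
            # Save the route if the position is new or we got there faster.
            if pos not in best_route or len(route) < len(best_route[pos]):
                best_route[pos] = list(route)
    return best_route.get(goal)


long_route = staged_search(
    start=(0, 0), goal=(7, 2), width=8, height=3, burst_len=4, iterations=20000
)
print(long_route)
```

No single burst here is longer than four inputs, yet the goal is nine moves from the start. The long route only exists because short saved routes get chained together, which is the point of the staging approach.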
As Krithalith developed scattershot, he benchmarked it against established routes and was able to shave off a few fractions of a second here and there. But in September 2023, scattershot was ready to take on carpetless. As the YouTuber Karl Jobst recounts in his video about the discovery:
“In September of 2023 Krithalith would test scattershot on Rainbow Ride and see if he could solve carpetless. He placed Mario on the windowsill and left scattershot running. Within an hour it had already found its way to the roof. An hour doesn't seem like a long time, but [scattershot is] so fast that 1 hour of scattershot playing Super Mario 64 is equivalent to over 100 years of real time.”
It took scattershot under an hour to discover what no human was able to discover in decades.
Even though Super Mario 64 is a constructed system, built by humans, we don’t fully understand what is possible within the system. The ‘rules’ of Mario’s world are known and programmed, but there are emergent interactions between these rules that mean some behaviour is unpredictable from first principles. After 27 years of exploration, we are still discovering new things that are feasible in the system.
The story of carpetless demonstrates that brute empiricism may be the only feasible route to discovering the possibilities present in even simple systems. Many particulars of a system may be compatible with the same underlying principles, and finding the particulars that actually exist requires someone (or something) to go and probe them.
This is a relevant lesson for science in general. In science, as opposed to a video game, we don’t know the underlying rules from the outset: natural systems are governed by ‘hidden rules.’ There are two complementary ways of knowing in science that help uncover these hidden rules. The first is building an intuitive or theoretical model of the shape of a system. Recalling the joke about the spherical cow, these models are useful approximations. They may smooth over certain details, but they provide a general sense of how a system works, where its boundaries lie, and what may be feasible within its constraints. The other way of knowing is trial-and-error (empirical) testing. This engages with the spiky reality to probe systems experimentally and find where truth diverges from smooth models.
Intuitive or theoretical models are needed to understand the general principles that describe a system, and empirical testing is needed to validate theories and find places where theory diverges from reality. This entails a kind of trade-off between theoretical and empirical understanding; making models of systems is a type of information compression, but empirical testing is a sort of anti-compression that brings details back in.
In the case of carpetless, the speedrunning community had built up enough of an intuitive understanding of the world of Super Mario 64 to hypothesise that it was possible to reach the roof of the house by jumping from the windowsill; they just weren’t sure how to achieve it concretely. They did know roughly where to point the empirical machine gun, however, and this combination of theory and empiricism led to the discovery of Orthogonal Jones.
This is analogous to what happens in science, where ‘scientific taste’ guides experimental directions. The sort of ‘tool-assisted’ route finding that was employed to solve carpetless may also be a useful case study for those looking to automate aspects of scientific research (e.g. Future House). A productive general strategy for scientific automation could be to combine tools for rapid, semi-random exploration with people or models able to direct that exploration to useful areas.
Of course, reality is not as easy to probe as a video game like Super Mario 64. It lacks the ability to save and resume from a save point, the ability to emulate the whole system in full and speed it up, and clear optimisable goals. But what if we found ways to make reality more video-game-like?
At first, it seems there’s no obvious analogy to ‘collecting a star’ in the scientific process. However, if we consider that the goal of science is to build ever better theoretical models of systems, then there is a potential way to identify functions that can be optimised. Systems can take on different states, and the set of possible states and state transitions is defined by underlying rules. Science is the process of finding those rules, but only the states and the paths between them are visible to the scientist. We can say that a scientist understands a system when they know all the possible states and how to perturb the system to move between them. With this framing, the analogy to finding an optimal star-collecting route is finding the shortest sequence of perturbations that takes a system from some specified state A to a desired state B. Having a perfect model of the system is equivalent to having a function that spits out the shortest path between any two states.
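As a rough illustration of this framing, the sketch below treats a system as a graph whose nodes are states and whose edges are perturbations, then uses breadth-first search to find the shortest sequence of perturbations from one state to another. The state labels, perturbation names, and the `shortest_perturbation_path` function are hypothetical, and it assumes the transition table is already fully known, which is exactly what a real scientist lacks.

```python
from collections import deque


def shortest_perturbation_path(transitions, state_a, state_b):
    """Breadth-first search over states.

    `transitions` maps each state to a dict of {perturbation: resulting_state}.
    Returns the shortest list of perturbations from state_a to state_b,
    or None if state_b cannot be reached.
    """
    queue = deque([state_a])
    came_from = {state_a: None}  # state -> (previous_state, perturbation)
    while queue:
        state = queue.popleft()
        if state == state_b:
            break
        for perturbation, nxt in transitions.get(state, {}).items():
            if nxt not in came_from:
                came_from[nxt] = (state, perturbation)
                queue.append(nxt)
    if state_b not in came_from:
        return None
    # Walk backwards from the target to recover the perturbation sequence.
    path = []
    state = state_b
    while came_from[state] is not None:
        state, perturbation = came_from[state]
        path.append(perturbation)
    path.reverse()
    return path


# A made-up four-state system with made-up perturbations.
toy_system = {
    "A": {"heat": "C", "stir": "D"},
    "C": {"cool": "B"},
    "D": {"heat": "C", "stir": "A"},
    "B": {},
}

print(shortest_perturbation_path(toy_system, "A", "B"))  # ['heat', 'cool']
```

In a real system the transition table is the thing being discovered, so a search like this would have to be interleaved with experiments that fill the table in.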
In practice, this means that a way to approach automating science could be to collect voluminous statistics on system states (deeply characterising both healthy and unhealthy model systems, say) and then use scattershot-like guided randomness to find interventions that lead to the shortest path between those states. This approach creates its own function that can then be optimised. If you’re trying to understand a disease, finding the quickest path to reversing that disease and restoring a healthy state tells you a great deal about which factors are most important in the disease process.
Video game-like environments are also potentially valuable for testing scientific automation systems. Some natural processes proceed too slowly to use as a test bed for scientific automation processes. Instead, we could create simulated versions of well-understood natural phenomena that can be run much faster than real time and loaded with defined states. If scientific automation tools prove capable of uncovering the hidden rules of these simulations, then they could be unleashed on real-world problems. This would greatly speed up the design-test-refine cycle compared to only testing scientific automation techniques ‘in the wild.’
In Super Mario 64, at least, Krithalith’s discovery is already leading to concrete progress. In October, the speedrunner Karin used Orthogonal Jones to set a new world record 120-star time, the first to break the 1 hour 37 minute barrier.
[1] For the uninitiated, a star is a collectable item that looks like a 5-pointed yellow star with eyes. Each level in Super Mario 64 has 7 stars to collect, and collecting each star presents its own unique challenge.
[3] By representing each state the game could be in, and all possible inputs given the state, finding the best route to a star becomes a graph traversal problem.