Teaching Series #9 - Monte Carlo Simulation

"...I would rather be vaguely right than precisely wrong..."

At its heart, Monte Carlo simulation is a computational method that relies on repeated random sampling to obtain numerical results. Instead of trying to solve a problem analytically by devising complex equations to model it, Monte Carlo uses randomness to simulate the underlying process many times. Think of each run of the simulation as a single experiment. By observing the outcomes of thousands or millions of experiments, we get a picture of the probabilities and distributions of possible results.

The method was developed during WWII by scientists working on the Manhattan Project, who needed to simulate random neutron diffusion in fissionable material 🙂

Well, we will need to work with and imagine something way simpler 🙂 Imagine you want to calculate the area of your garden pool, but you don’t have any measuring tools at your disposal. What you can do is draw a perfect square around your pool, one whose area you can approximate by pacing it out in steps. After that, you can start throwing small stones at random spots inside the square (imagine your parents give you permission to do so 🙂 ). After enough throws, you can approximate the area of the pool: it is the area of the square multiplied by the ratio of the number of stones that landed in the pool to the total number of stones thrown.
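
Before we move on, here is a minimal Python sketch of the stone-throwing idea. The circular pool, its 3 m radius, and the 10 m square are all assumptions I am making purely for illustration, so that we can check the estimate against a known answer:

```python
import math
import random

# A sketch of the stone-throwing idea. We pretend the pool is a circle of
# radius 3 m centred in a 10 m x 10 m square -- the radius is only "known"
# here so we can compare the estimate with the true area.
SQUARE_SIDE = 10.0   # paced-out side length of the square, in metres
POOL_RADIUS = 3.0    # hypothetical pool radius (unknown in real life)
N_STONES = 1_000_000

inside = 0
for _ in range(N_STONES):
    # Throw a stone: a uniformly random point somewhere in the square.
    x = random.uniform(-SQUARE_SIDE / 2, SQUARE_SIDE / 2)
    y = random.uniform(-SQUARE_SIDE / 2, SQUARE_SIDE / 2)
    # Did it land in the pool? (distance from the centre <= radius)
    if x * x + y * y <= POOL_RADIUS ** 2:
        inside += 1

pool_estimate = SQUARE_SIDE ** 2 * inside / N_STONES
print(f"Estimated pool area: {pool_estimate:.2f} m^2")
print(f"True pool area:      {math.pi * POOL_RADIUS ** 2:.2f} m^2")
```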

Why is the Monte Carlo simulation important and powerful?

Besides being versatile, meaning you can apply it to many different problems and domains, its real power is that it provides probabilistic insights. Instead of a single answer, it gives you a range of possible outcomes and their probabilities. That makes it extremely valuable for understanding the risk and uncertainty of the actions you take.

How Monte Carlo Simulation Works (the basics)

The specific implementation can vary greatly depending on the problem, but almost all Monte Carlo simulations follow the same set of fundamental steps (a small code sketch pulling them together follows the list):

  1. Define the system/problem: First, we identify what we are trying to model or what question we are trying to answer. This involves identifying the variables, their possible ranges, and the relationships between them. For our pool example, the system was the square and the pool within it, and the question was "What is the pool area?".

  2. Generate random inputs (sampling): For each variable that has uncertainty, we generate a random value within its defined range. For example, if a variable can be somewhere between 1 and 10, we randomly pick a number in that range. In our pool example, this is like randomly throwing the stones.

  3. Perform deterministic calculations: Using the random inputs we just generated, we run a single "trial" or "simulation." We plug these random values into our model or equations and calculate an outcome. This part is "deterministic" because for a given set of inputs, the output is always the same. In the pool example, this was simply observing whether a stone landed in the pool or not for each throw.

  4. Aggregate results and analyse: We repeat steps 2 and 3 many, many times (thousands, millions, even billions!). Each repetition gives us a single possible outcome. Once we have a large collection of outcomes, we can analyse them statistically. This might involve calculating averages, ranges, probabilities, or distributions. In our pool example, we aggregated by counting how many stones landed inside the pool out of all the throws, forming our ratio.
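
Putting the four steps together, here is a minimal sketch on a made-up problem (the two tasks and their 1-to-10-day uniform durations are purely illustrative): how long does a job made of two uncertain tasks take?

```python
import random
import statistics

# Step 1 (define the problem): two tasks whose durations are each
# uncertain, somewhere between 1 and 10 days. How long is the whole job?
N_TRIALS = 100_000
outcomes = []

for _ in range(N_TRIALS):
    # Step 2: generate random inputs for each uncertain variable.
    task_a = random.uniform(1, 10)
    task_b = random.uniform(1, 10)
    # Step 3: deterministic calculation -- same inputs, same output.
    outcomes.append(task_a + task_b)

# Step 4: aggregate the outcomes and analyse them statistically.
outcomes.sort()
print(f"Average total:     {statistics.mean(outcomes):.2f} days")
print(f"90% finish within: {outcomes[int(0.9 * N_TRIALS)]:.2f} days")
```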

The Role of Random Numbers

When we say "generate random inputs," what we will actually do in the code later is ask the computer for a number that has no predictable pattern. For example, if we want to simulate a dice roll, we need a way to get a number between 1 and 6, where each number has an equal chance of appearing.

However, here is one fun fact: computers are inherently deterministic machines. This means that if we give them the same input, they will always produce the same output. So you may ask, how do they generate "random" numbers?

The answer is, they don't generate truly random numbers! Instead, they generate pseudo-random numbers. These are numbers generated by an algorithm (a set of instructions) that produces a sequence of numbers that appear random. If you know the starting point (called the "seed") of the algorithm, you can reproduce the entire sequence. Think of it like a very, very long, pre-calculated list of numbers that looks completely messy, but if you knew the starting point, you could write down the whole list.
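
We can watch this happen with Python's standard random module. Fixing the seed pins down the starting point, so the whole "random" sequence of dice rolls is reproduced exactly (a tiny sketch):

```python
import random

# Seeding the generator fixes the "starting point", so the whole
# "random" sequence is reproducible -- the hallmark of pseudo-randomness.
random.seed(42)
first_run = [random.randint(1, 6) for _ in range(5)]   # five dice rolls

random.seed(42)  # same seed -> the exact same sequence again
second_run = [random.randint(1, 6) for _ in range(5)]

print(first_run)
print(second_run)
print(first_run == second_run)  # True: deterministic under the hood
```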

For most Monte Carlo simulations, pseudo-random numbers are perfectly sufficient, as long as the sequence is long enough and the numbers are evenly distributed (meaning no obvious patterns or biases). The goal is to simulate true randomness closely enough that our statistical analysis is valid.

How are Monte Carlo methods used in Reinforcement Learning?

We learned in our previous lessons that an RL agent wants to learn a policy, which is essentially a strategy that tells it which action to take in every possible state so as to maximise its total future reward.

Monte Carlo methods in RL learn value functions, estimates of how good a state or a state-action pair is, by averaging returns over many complete episodes. An episode is a single run in which the agent moves through states, takes actions, and collects rewards from start to finish. The total reward received over that episode (called the "return") is then used to update the estimate of how good each state or state-action pair encountered in that episode was.

Returning to our pool analogy: instead of estimating the pool area, the agent is estimating the "value" of a state or action. Each "stone throw" is a complete episode. By observing the "outcome" (total reward) of many episodes, the agent can average these outcomes to get a more accurate estimate of the value.

The key idea to take with you is that Monte Carlo methods only learn after an entire episode is completed; they don't try to update their understanding mid-episode. This makes them particularly useful for episodic tasks, like games, where there is a clear beginning and end. All of this serves the ultimate aim of finding an optimal policy.
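
To make this concrete, here is a minimal first-visit Monte Carlo value-estimation sketch. The environment is a hypothetical toy I am introducing only for illustration: a five-state random walk where reaching the right end pays a reward of 1. Notice that values are only updated once an episode has finished:

```python
import random
from collections import defaultdict

# Toy episodic environment: states 0..4, start in the middle, step left
# or right at random. Reaching state 4 pays reward 1 and ends the
# episode; reaching state 0 ends it with reward 0.

def run_episode():
    """Play one full episode; return the visited states and final reward."""
    state, visited = 2, [2]
    while state not in (0, 4):
        state += random.choice((-1, 1))
        visited.append(state)
    return visited, (1.0 if state == 4 else 0.0)

returns = defaultdict(list)  # state -> list of observed returns
for _ in range(10_000):
    visited, reward = run_episode()
    # Only after the episode ends do we credit the states we saw.
    for state in set(visited):          # "first visit": each state once
        returns[state].append(reward)   # undiscounted return = final reward

value = {s: sum(g) / len(g) for s, g in sorted(returns.items())}
print(value)  # roughly {0: 0.0, 1: 0.25, 2: 0.5, 3: 0.75, 4: 1.0}
```

It is the pool logic all over again: many complete "throws" (episodes), then an average of their outcomes.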

Advantages and Limitations of Monte Carlo Simulation

Advantages of Monte Carlo simulations:

  1. Handles Complex Systems with Ease: This is arguably its biggest strength. If a system has many interacting variables, non-linear relationships, or stochastic (random) elements, analytical solutions can become impossible. Monte Carlo can simulate these complexities by simply running many trials. Think back to our simple pool example: no complex math needed, just random throws!

  2. Provides Probabilistic Results: Unlike deterministic models that give a single "point estimate," Monte Carlo simulations provide a distribution of possible outcomes. This means you don't just get an average; you get to see the range of possibilities, the likelihood of different outcomes, and even worst-case or best-case scenarios. This is incredibly valuable for risk assessment and decision-making under uncertainty.

  3. Relatively Easy to Implement for Certain Problems: The core logic of generating random numbers and running simulations is conceptually straightforward. While complex problems still require careful modelling, the fundamental mechanics are often simpler to code.

  4. Flexible and Adaptable: You can easily incorporate new variables or change assumptions in your model without having to rebuild the entire mathematical framework from scratch.

Monte Carlo simulation also comes with limitations and challenges. Knowing these helps us decide when it might not be the best tool and what difficulties to expect.

  1. Computational Cost: This is often the biggest challenge. To get accurate results, Monte Carlo simulations often require a very large number of trials (millions or even billions). Each trial can involve complex calculations, so the total computational time can be enormous. Take our pool example again: throwing just a few stones gives a rough, often imprecise estimate, while throwing a million gives a much better one but takes far more time!

  2. "Curse of Dimensionality": This challenge becomes more pronounced as the number of random variables (or "dimensions") in our simulation increases. The number of samples needed to adequately explore the entire space of possibilities can grow exponentially, making the simulation computationally infeasible.

  3. Slow Convergence: While Monte Carlo methods do converge to the true answer, their rate of convergence is often relatively slow, typically proportional to the inverse square root of the number of samples. This means if you want to double the accuracy, you need four times as many samples (see the sketch after this list)!

  4. Difficulty with Rare Events: If we are trying to simulate a very rare event (e.g., a specific system failure with extremely low probability), Monte Carlo can be inefficient. We might need to run an astronomically large number of simulations before we even observe the event once, let alone enough times to get a reliable probability. More advanced techniques exist for these kinds of problems, which we will cover later.
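
The slow convergence from point 3 is easy to observe numerically. In this small sketch we estimate π by sampling points in the unit square (a classic textbook exercise, not from our pool story): on average, quadrupling the number of samples roughly halves the error, though individual runs fluctuate:

```python
import math
import random

# Estimate pi by sampling points in the unit square and counting how
# many fall inside the quarter circle of radius 1. The error shrinks
# roughly like 1/sqrt(N): quadrupling N halves it, on average.
for n in (1_000, 4_000, 16_000, 64_000, 256_000):
    hits = sum(
        1
        for _ in range(n)
        if random.random() ** 2 + random.random() ** 2 <= 1.0
    )
    estimate = 4.0 * hits / n
    print(f"N = {n:>7}: estimate = {estimate:.4f}, "
          f"error = {abs(estimate - math.pi):.4f}")
```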

That’s all for now; that is a lot to wrap your mind around, so let’s digest it 🙂 In the next session, we will look at where Monte Carlo simulation really shines, and we will use practical examples to explain Monte Carlo Tree Search.

Sincerely, MO