Teaching Series #1 - Evolutionary Algorithms

"...In the chain of life all links are equally valuable because all prove equally necessary..."

Evolutionary Algorithms have gone by many names (at one point they were even called AI), but in essence, they are a class of optimisation algorithms that mimic biological evolutionary processes like natural selection, mutation, and crossover to find optimal solutions to complex problems. Think of it like a survival-of-the-fittest competition, but for candidate solutions to a problem!

What are evolutionary algorithms, and where do they come from?

Imagine trying to solve a super complex puzzle, like finding the best design for an airplane wing, or figuring out the perfect recipe for a new drug or simply a new cookie you would like to bake. These problems often have so many possible solutions that you couldn't possibly check them all one by one. This is where evolutionary algorithms come in!

They're a type of metaheuristic optimisation algorithm (scary terms, but don't worry, we will break them down!) that is inspired by the process of natural selection and evolution. Think about how species adapt and change over millions of years to become better suited to their environment. The "fittest" individuals survive and pass on their traits, while less fit ones die out.

In the world of evolutionary algorithms, we apply this "survival of the fittest" idea to potential solutions to a problem. We start with a bunch of random solutions, let them "compete," and then allow the "better" solutions to "reproduce" and create new, hopefully even better, solutions. It's like speeding up evolution in a computer to find great answers!

For example, if you wanted to design a perfectly aerodynamic car, instead of engineers trying every single possible shape, an evolutionary algorithm could generate thousands of random car shapes, test how "fit" they are (how aerodynamic they are), and then combine the best features of the best designs over many "generations" until it finds an amazing solution.

Key components:

  • Population: In an evolutionary algorithm, we don't just work with one potential solution. We work with a whole group of solutions at the same time. This group is called the "population." Think of it like a diverse collection of different species in a natural ecosystem, all trying to survive.

    • Analogy: If you're trying to find the best cookie recipe, your initial "population" might be 10 different random cookie recipes you found online – some good, some bad, some just okay.

  • Individuals (or Chromosomes/Genotypes): Each single potential solution within that population is called an "individual." In the context of the algorithm, these individuals are often represented as a set of parameters or a coded structure, much like how DNA forms the "blueprint" of a living organism. These coded representations are sometimes called "chromosomes" or "genotypes."

    • Analogy: Each of those 10 cookie recipes from our population would be an "individual." The specific ingredients and quantities in each recipe (e.g., 2 cups flour, 1 cup sugar, 1 egg) would be its "chromosome" or "genotype."

# Think of the following example:
Dog_A = {"size": "big", "color": "brown", "speed": "fast"}
Dog_B = {"size": "small", "color": "white", "speed": "medium"}
Dog_C = {"size": "medium", "color": "black", "speed": "slow"}
# ...and so on for the rest of the population.

# Each dog is an "individual," and its features (size, color, speed)
# represent its "chromosome" or "genotype." Together, the dogs make up
# the "population."

Third crucial component: the Fitness Function!

In natural evolution, "fitness" is all about an organism's ability to survive and reproduce in its environment. In evolutionary algorithms, we need a way to measure how "good" each solution in our population is at solving the problem. This is where the fitness function comes in.

The fitness function is like a judge or a grading system. It takes an individual solution as input and outputs a single numerical value that represents how well that solution performs. The higher the fitness value, the "fitter" or "better" the solution is.

  • Why is it important? Without a fitness function, the algorithm wouldn't know which solutions are good and which are bad. It's the compass that guides the evolutionary process towards better and better solutions.

  • Analogy: Going back to our cookie recipe example, the fitness function would be your family's taste test (they serve as a panel of judges!). You bake each of the 10 recipes, taste them, and give each one a score based on how delicious it is. The higher the score, the "fitter" the cookie recipe. For the airplane wing design, the fitness function might be a simulation that calculates how much lift and drag each wing design generates.

So, for our dog population, the fitness function could measure how fast each dog runs or how well it jumps. If the dogs were competing in a dogs' Olympics disc-catching event, the dog that runs faster and jumps better would get a higher fitness score.
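As a toy illustration, here is what such a fitness function might look like in Python for our dog example. The scoring tables are invented purely for illustration (we simply assume that faster, smaller dogs catch discs better):

```python
# Hypothetical scoring tables for disc-catching ability (made up for illustration).
SPEED_SCORE = {"fast": 3, "medium": 2, "slow": 1}
SIZE_SCORE = {"small": 3, "medium": 2, "big": 1}  # assume smaller dogs jump and turn better

def fitness(dog):
    """Higher score = better disc-catching dog (toy scoring)."""
    return SPEED_SCORE[dog["speed"]] + SIZE_SCORE[dog["size"]]

dog_a = {"size": "big", "color": "brown", "speed": "fast"}
dog_b = {"size": "small", "color": "white", "speed": "medium"}

print(fitness(dog_a))  # 3 (fast) + 1 (big)   = 4
print(fitness(dog_b))  # 2 (medium) + 3 (small) = 5
```

Notice that the fitness function reduces each individual to a single number, so any two individuals can be compared directly.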

Quiz Question: If an evolutionary algorithm is trying to find the shortest route for a delivery truck, what would be a good measure for its fitness function?

A) The number of turns the truck makes.
B) The total distance of the route.
C) The amount of fuel consumed.
D) The type of roads used.

The total distance of the route (B) would be the most direct and effective measure for the fitness function. The goal is to minimise the distance, so a lower distance value would indicate a higher fitness.
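As a small sketch, here is how that fitness function could look in Python, assuming each stop is an (x, y) coordinate. Since the goal is to minimise distance, we invert it so that shorter routes receive higher fitness:

```python
import math

def route_distance(route):
    """Total length of a closed route visiting (x, y) stops in order."""
    total = 0.0
    for i in range(len(route)):
        x1, y1 = route[i]
        x2, y2 = route[(i + 1) % len(route)]  # wrap back to the starting point
        total += math.hypot(x2 - x1, y2 - y1)
    return total

def route_fitness(route):
    # Shorter routes should score HIGHER, so we invert the distance.
    return 1.0 / route_distance(route)

square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(route_distance(square))  # 4.0 (perimeter of the unit square)
```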

Selection

Now that we have our population of individuals and a way to measure their fitness, the next step is Selection. This is where the "survival of the fittest" really comes into play, as we choose which individuals get to pass on their "genes" to the next generation.

Selection is the process of choosing individuals from the current population to become "parents" for the next generation. The idea is simple: individuals with higher fitness values (the "fitter" ones) have a better chance of being selected. This ensures that the good traits are more likely to be passed on.

There are several ways to implement selection, but two common ones are:

  1. Roulette Wheel Selection: Imagine a roulette wheel where each slice of the wheel represents an individual in the population. The size of an individual's slice is proportional to its fitness. So, fitter individuals get a larger slice. When the wheel spins, the individual it lands on is selected. This means fitter individuals have a higher probability of being chosen, but even less fit ones still have a small chance, which helps maintain diversity.

    • Visual Analogy: Think of a game show where the prize wheel has sections. The contestants who did better in the earlier rounds get larger sections on the wheel for the final prize, increasing their chances of winning.

  2. Tournament Selection: In this method, a small group of individuals is randomly chosen from the population (this is the "tournament"). These individuals then "compete" against each other (where competing is often simply which one has a higher fitness score), and the one with the highest fitness in that small group is selected to be a parent. This process is repeated until enough parents are chosen.

    • Analogy: Like a mini-tournament where a few athletes compete, and only the winner gets to advance to the next round.

Both methods aim to favour fitter individuals while still allowing some chance for less fit ones to contribute, which can prevent the algorithm from getting stuck on a local optimum (a good but not the best solution).
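Here is a minimal Python sketch of both selection methods, using a toy population of named individuals with made-up fitness scores:

```python
import random

def roulette_wheel_select(individuals, fitnesses):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = random.uniform(0, total)  # where the "wheel" stops
    running = 0.0
    for individual, fit in zip(individuals, fitnesses):
        running += fit  # each individual's "slice" is as wide as its fitness
        if running >= pick:
            return individual
    return individuals[-1]  # safety net against floating-point rounding

def tournament_select(individuals, fitnesses, k=3):
    """Pick k random contestants; the fittest of them wins."""
    contestants = random.sample(range(len(individuals)), k)
    winner = max(contestants, key=lambda i: fitnesses[i])
    return individuals[winner]

names = ["A", "B", "C", "D"]
scores = [1.0, 2.0, 3.0, 10.0]  # "D" is by far the fittest

picks = [roulette_wheel_select(names, scores) for _ in range(1000)]
print(picks.count("D"), picks.count("A"))  # "D" is picked far more often than "A"
```

Note that "A" still has a small but non-zero chance of being picked by either method, which is exactly the diversity-preserving property described above.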

It’s time for reproduction

Now that we've selected our "parent" individuals (the fittest ones), it's time for them to reproduce and create new "offspring" for the next generation. This brings us to Crossover (also known as recombination).

Crossover is the evolutionary equivalent of two parents combining their genetic material to create a new individual. In evolutionary algorithms, it's the process where parts of two selected parent solutions are combined to create one or more new "child" solutions. This is how the algorithm explores new combinations of traits and potentially finds even better solutions than the parents.

Think of it like shuffling two decks of cards, but only taking half of each deck to form a new, mixed deck.

Here's a simple visual example using two parent "chromosomes" represented as sequences of values (like genetic code):

Let's say we have two parent solutions for a problem, represented as a series of numbers (their "genes"):

  • Parent 1: [1, 2, 3, 4, 5, 6]

  • Parent 2: [A, B, C, D, E, F]

Now, we choose a "crossover point." Let's say we pick the point after the third position:

Parent 1: [1, 2, 3 | 4, 5, 6]
Parent 2: [A, B, C | D, E, F]

With one-point crossover, we swap the parts after the crossover point:

  • Offspring 1: Takes the first part of Parent 1 and the second part of Parent 2: [1, 2, 3, D, E, F]

  • Offspring 2: Takes the first part of Parent 2 and the second part of Parent 1: [A, B, C, 4, 5, 6]

As you can see, the two offspring are new combinations of their parents' traits! This allows the algorithm to mix and match good characteristics from different successful solutions.
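The one-point crossover above can be sketched in a few lines of Python:

```python
import random

def one_point_crossover(parent1, parent2, point=None):
    """Swap everything after the crossover point between two parents."""
    if point is None:
        point = random.randint(1, len(parent1) - 1)  # never cut at the very ends
    child1 = parent1[:point] + parent2[point:]
    child2 = parent2[:point] + parent1[point:]
    return child1, child2

parent_1 = [1, 2, 3, 4, 5, 6]
parent_2 = ["A", "B", "C", "D", "E", "F"]

# Use the same crossover point as in the example above (after position 3).
child_1, child_2 = one_point_crossover(parent_1, parent_2, point=3)
print(child_1)  # [1, 2, 3, 'D', 'E', 'F']
print(child_2)  # ['A', 'B', 'C', 4, 5, 6]
```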

What if a crucial piece of the perfect solution isn't present in any of our current parent solutions?

This is where our final key component, Mutation, comes into play.

Mutation is the evolutionary process of random, small changes occurring in the genetic material. In evolutionary algorithms, it involves introducing small, random alterations to the "genes" (parameters or structure) of an individual solution.

  • Why is it important? While crossover efficiently combines existing good traits, mutation is vital for introducing entirely new information or characteristics into the population that might not have been present before. Without mutation, the algorithm might get "stuck" with a limited set of solutions and never find truly novel or optimal ones. It helps prevent premature convergence to a suboptimal solution.

  • Analogy: Imagine our cookie recipe again. Crossover lets you combine the best aspects of two delicious recipes. But what if the secret ingredient for the perfect cookie (say, a pinch of sea salt) wasn't in any of your initial recipes? Mutation would be like accidentally dropping a tiny bit of sea salt into one of your new batches, leading to a surprisingly better result! It's a random 'oops' that turns into a 'wow'!

Let's go back to our offspring from the crossover example:

  • Offspring 1: [1, 2, 3, D, E, F]

Now, we apply a small chance of mutation to each "gene." Let's say, by chance, the D mutates into a Z:

  • Mutated Offspring 1: [1, 2, 3, Z, E, F]

This small, random change could be just what's needed to unlock a significantly better solution, or it could make it worse – it's a random process! However, because of selection, the good mutations will be kept and propagated, while the bad ones will likely die out.
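A minimal Python sketch of mutation, matching the example above. The pool of replacement letters is an assumption made purely for illustration:

```python
import random

# Hypothetical pool of replacement "genes" for mutated positions.
GENE_POOL = list("ABCDEFGHIJKLMNOPQRSTUVWXYZ")

def mutate(individual, rate=0.1):
    """Replace each gene with a random one from the pool, with probability `rate`."""
    mutated = []
    for gene in individual:
        if random.random() < rate:
            mutated.append(random.choice(GENE_POOL))  # the random "oops"
        else:
            mutated.append(gene)  # gene survives unchanged
    return mutated

offspring_1 = [1, 2, 3, "D", "E", "F"]
print(mutate(offspring_1, rate=0.2))  # e.g. [1, 2, 3, 'Z', 'E', 'F']; varies every run
```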

That is really all there is to the core components!

Evolutionary Cycle

An evolutionary algorithm follows a repeating cycle, generation after generation, to find optimal solutions. Think of it as a continuous loop of improvement!

Here are the typical steps in the cycle:

  1. Initialization:

    • We start by creating an initial population of individuals. These are often generated randomly, like throwing a bunch of darts at a dartboard to see where they land – some will be close to the centre, most won't! This ensures a diverse set of starting points.

  2. Evaluation:

    • Next, every individual in the current population is evaluated using the fitness function. We calculate how "good" each solution is at solving our problem and assign it a fitness score.

  3. Selection:

    • Based on their fitness scores, individuals are selected to become "parents" for the next generation. As we discussed, fitter individuals have a higher chance of being chosen.

  4. Reproduction (Crossover & Mutation):

    • The selected parents then undergo crossover to combine their "genetic material" and create new "offspring."

    • These new offspring then experience mutation, where small random changes might occur in their "genes." This step introduces novelty and helps explore new areas of the solution space.

  5. Replacement (or Next Generation):

    • The newly created offspring (often along with some of the best individuals from the previous generation) form the new population. The old population is typically discarded.

This entire cycle (Evaluation, Selection, Reproduction, Replacement) then repeats over many generations. Each time, the population ideally becomes "fitter" as better solutions are found, combined, and slightly altered. The process continues until a satisfactory solution is found, or a predefined number of generations has passed, or there's no significant improvement in fitness.

It's a continuous loop, much like how natural evolution is a continuous process of adaptation!
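Putting all the pieces together, here is a compact sketch of the full cycle in Python, applied to a classic toy problem: evolving a bit string with as many 1s as possible (often called OneMax). All parameter values below are arbitrary choices for illustration:

```python
import random

random.seed(42)  # fixed seed so the run is reproducible

GENOME_LEN = 20   # genes (bits) per individual
POP_SIZE = 30     # individuals per generation
GENERATIONS = 50
MUT_RATE = 0.02   # per-gene flip probability

def fitness(genome):
    return sum(genome)  # OneMax: count the 1s (maximum is GENOME_LEN)

def tournament(pop, k=3):
    # 3. Selection: k random contestants, the fittest wins.
    return max(random.sample(pop, k), key=fitness)

def crossover(p1, p2):
    # 4a. One-point crossover.
    point = random.randint(1, GENOME_LEN - 1)
    return p1[:point] + p2[point:]

def mutate(genome):
    # 4b. Mutation: flip each bit with a small probability.
    return [1 - g if random.random() < MUT_RATE else g for g in genome]

# 1. Initialization: a population of random bit strings.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for generation in range(GENERATIONS):
    # 2. Evaluation happens inside tournament() via fitness().
    offspring = [mutate(crossover(tournament(population), tournament(population)))
                 for _ in range(POP_SIZE - 1)]
    # 5. Replacement: keep the single best parent (elitism) plus the offspring.
    population = [max(population, key=fitness)] + offspring

best = max(population, key=fitness)
print(fitness(best))  # should be at or near the maximum of 20
```

Keeping a copy of the best individual each generation (elitism) guarantees that the best fitness found so far never decreases, which is a common practical refinement of the basic cycle.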

Different types of evolutionary algorithms (for more curious and ambitious students)

The umbrella term "Evolutionary Algorithms" actually covers several specific algorithms. The most well-known are:

  1. Genetic Algorithms (GAs):

    • What they are: These are the most common type and what we've largely been discussing so far. They typically represent individuals as binary strings (sequences of 0s and 1s, a bit like DNA), though other representations are also used.

    • Focus: They emphasise crossover (recombination) as the primary search operator, with mutation playing a secondary, but still crucial, role.

    • Applications: Widely used for optimisation, scheduling, engineering design, and more.

  2. Evolutionary Strategies (ES):

    • What they are: These algorithms typically represent individuals as real-valued vectors (lists of numbers, not only 1 or 0).

    • Focus: They place a stronger emphasis on mutation as the main search operator, often with self-adaptive mutation rates (meaning the algorithm learns the best mutation rate as it runs!). Crossover is less prominent or sometimes even absent.

    • Applications: Very effective for continuous optimisation problems, like tuning parameters in a control system or optimising design shapes.

  3. Genetic Programming (GP):

    • What they are: This is a super interesting one! Instead of evolving simple strings of numbers, Genetic Programming evolves computer programs or trees that can solve a problem. Imagine the algorithm writing its own software!

    • Focus: The "individuals" are actual programs, and the crossover and mutation operations manipulate the structure of these programs (e.g., swapping sub-trees of code).

    • Applications: Used for tasks like symbolic regression (finding mathematical formulas), automatic program generation, and robotics control. It's like having a computer invent a new algorithm for you!

While these have their differences, remember they all share the fundamental evolutionary principles we've discussed: population-based search, fitness evaluation, selection, and the use of variation operators (crossover and mutation). They're just different "flavours" of the same evolutionary idea!

Advantages and disadvantages of using evolutionary algorithms (EAs)

Like any powerful tool, they have their strengths and weaknesses.

Advantages:

  1. Robustness (Handles Complex Problems): EAs are incredibly good at finding solutions in very large, complex search spaces where traditional optimisation methods might fail (such as calculus-based and gradient-based techniques, which we will cover in the future). They don't need much information about the problem's mathematical structure (like gradients), which makes them suitable for "black-box" problems where you only know the inputs and outputs.

    • Imagine: Trying to find the highest point on a mountain range in the dark. Traditional methods might only find the peak of the hill you're on. EAs are like sending out many explorers who randomly wander, some find higher ground, and then those areas are further explored – eventually, someone finds the highest peak!

  2. Global Optimisation Capability: Because they work with a population of solutions and use mutation, EAs are less likely to get stuck in "local optima" (a good solution that isn't the absolute best overall). They can explore different parts of the solution space simultaneously.

  3. No Derivative Information Needed: Many traditional optimisation algorithms require you to know the derivatives of the function you are optimising. EAs don't need this, which makes them very flexible for problems where such information is hard or impossible to get.

  4. Parallelizable: Since individuals in a population are evaluated independently, many parts of an EA can be run in parallel, speeding up the process on modern computers.

Disadvantages:

  1. Computational Cost: EAs can be computationally expensive, especially for complex problems with large populations and many generations. Every individual in every generation needs to be evaluated, and that can take a lot of processing power.

    • Think: Our cookie algorithm is fast, but imagine evaluating the aerodynamic performance of thousands of complex airplane designs!

  2. No Guarantee of Global Optimum: While EAs are good at avoiding local optima, they don't guarantee finding the absolute best possible solution (the "global optimum") in finite time. They find a "good enough" or "very good" solution within the given constraints.

  3. Parameter Tuning: The performance of an EA heavily depends on choosing the right parameters (e.g., population size, mutation rate, crossover rate). Finding the optimal values for these parameters often requires trial and error.

  4. Problem Representation: Designing an effective "chromosome" (representation of a solution) and a suitable "fitness function" can sometimes be challenging and is critical to the algorithm's success.

So, EAs are powerful tools, especially for tricky problems, but they come with their own set of considerations.

If you would like to explore more in depth and get your hands on the code itself (it is the best way of learning, I agree), please visit the backpropagate.me repository.

That’s all for this lecture. Happy learning!

Sincerely, MO