Photo by 贝莉儿 DANIST on Unsplash

How to Build a Genetic Algorithm Basic Introduction

Alb Bolush
6 min readMar 8, 2021

--

We will discuss shortly and by javascript example what is a genetic algorithm and how to build one in a few steps. It will be a basic introduction, just to understand principles, as you will see afterward it can be used in various situations with small adjustments for the particular task.

Genetic algorithms are commonly used for optimization and search problems. Where otherwise the solution algorithm would lead to simple iteration cycles that in some cases can grow exponentially or just have a bad performance.

You can see the algorithm that we will build in action https://albertbol.github.io/nn-and-ga-example/ just click on the “Genetic algorithm phrase guess” tab, generate population with default settings and click “Guess phrase” to see the visualization:

Algorithm steps

It's always easier to understand by example. Let's take a simple problem, we want to write a random phrase and let the genetic algorithm guess it.

To solve this issue we need to:
1. Create the starting population (random phrases);
2. Check how good they are by using fitness level (compare to guess phrase);
3. Check if the problem is already solved, maybe we already generated the right phrase and we can stop the algorithm, if not we continue with crossover (shuffling letters) steps;
4. Pick randomly 2 parents (phrases) with possibly the highest fitness level, to create a new child (phrase) by exchanging their genes (letters) with each other and repeat this step for each phrase in the population;
5. With those 4 steps we could already find the phrase, but what if in our starting population we don't have enough characters lets say we are missing the letter “A” in this case it does not matter how much we will shuffle and exchange letters from different phrases we still won't find the guessed phrase. That's why there is a mutation, that will be applied to each child and with a defined probability it will change the letter in a random position to a random one. In this case, we will add something new to each population and increase our chances to find the phrase we guessed;

Shortly that was all about it, that's the main idea of the genetic algorithm and it can be applied to a different set of problems, your population can be anything that you can present as genes, usually, they are represented in binary as strings of 0s and 1s, but other encodings are also possible. For example, your population will be cars, their genes will be a combination of different directions south, north, east, west, and fitness level will be a distance to finish line. Everything that you can present as a solution (genes) and write a function that will see how good it's performing (fitness level) you can use a genetic algorithm to solve it.

Time to code

Of course, when you start to code as always it’s escalating pretty quickly to details, as mentioned above we will be focusing primarily on basic grasp and will not be trying to achieve the best possible result for this particular problem, its more about introduction where you can dive more deeply later in the parts that are most interesting to you and your tasks.

Create the starting population

The size of the population is chosen by the developer and has no golden rule it's all about finding the best one for specific needs.

Check how good they are by using fitness level

Now we need to iterate through each phrase in the population and count fitness level. We will simply compare and count matching letters, then we will divide by the total length of the word to get a percentual number.

As you can see we have Math.pow function used, later in crossover we always want to pick phrases with the highest fitness level, but also if we will just use fitnessRate the closer we get to matching phrase, the slighter difference between fitness levels in the population we get and chances are big that we will be picking not the best phrase over and over again, which will lead to bad population regeneration. We can present it as a logarithmic progression, but what we can do instead is increasing fitness level exponentially the closer it gets to the matched phrase in this case chances are much bigger to pick the best phrase, and the algorithm will be much quicker generating closer phrase.

Crossover cycle

First, we will need to normalize our pow fitness rate (convert numbers to floating points), we will be using percents, to have a pow fitness rate in a range from 0 to 1.

Now we can exchange genes:

Let's take a closer look at DNA changes function. Our mission is to regenerate the population with better phrases. So we are iterating through and selecting 2 parents, then we apply crossover to create a new child (phrase), afterward we apply possible mutation and push the phrase back to array.

Selection:

There are many strategies how to pick 2 parents for a crossover, main problems are performance and quality of fitness rate picked, when we have giant population arrays we don't want to compare and search each time best ones with iterations and also we want to keep the random effect, to avoid crossover patterns with exactly same parents all the time.

We are generating a random number from 0 to 1 and then subtracting our normalized pow fitness rate until a random number will be less than 0. Higher powRate has much more chances to get picked and also we are rarely iterating deep enough because every index is decreasing random number.

Now when we have parents picked, time for crossover:

We are picking a random crossover midpoint that will divide 2 parents and then join them into a new phrase. As in the example, the image below has a midpoint index 1 and we apply it to the first parent phrase bee, so we take b from the first parent and the rest from another one leading to a new combination bat.

Mutation:

We are going through each letter in a newly spawned phrase, generating a random number and checking if it's smaller than our mutation rate to replace a character with a random one. In this case, it's a 1% chance for mutation, same as population size there is no strict rate number, and it's specific to your task usually between 0.1–0.01.

Congratulations 🎉 🥳

We went through the main genetic algorithm logic, now we just need to repeat the process until we will find the phrase with a fitness level equal to 1. The final connected code you can see on github gist and play around with the working solution https://albertbol.github.io/nn-and-ga-example/.

Summary

It was a lot to cover, but hopefully, it was easy enough to grasp the basic idea behind the genetic algorithm. I want to thank and recommend watching “The Coding Train” youtube channel “Genetic Algorithm” playlist, to dive a little bit deeper into the subject.

Also, feel free to check out my article for color prediction neural network algorithm “How to build an artificial neural network basic algorithm” (https://medium.com/codex/how-to-build-a-genetic-algorithm-basic-introduction-c6a7cd503499)

Thanks for reading.

--

--