Note: the website (ai-contest.com) that the post links to seems to have been hijacked by a gambling site.
For the use-cases where Genetic Programming was popular, I would recommend looking at Bayesian Optimization (bayesopt) as an alternative today (I know I keep recommending the area - but I hope I do when it is relevant :-)). This is mostly because IMHO it has a principled foundation that has been productively developed further in the past few years. Here's a good book on the topic [1], and I've a tutorial as well [2]. Interestingly, one of the books I encountered when reading up on Genetic Algorithms years ago was by Melanie Mitchell [3]!
Bayesopt, Genetic Programming, or any other search algorithm that can operate over non-differentiable objective functions is very useful in practice. For example, when performing model selection in the space of hyperparameters for a model that is not differentiable, such as a traditional Decision Tree [4]. Or in exotic use cases like molecule discovery [5].
You can try out bayesopt using the botorch or hyperopt libraries (a quick hyperopt sketch follows the references below). The latter implements only a specific bayesopt algorithm, which was/is popular but seems to have been bettered of late [4].
[1] https://bayesoptbook.com/
[2] Part 1 https://blog.quipu-strands.com/bayesopt_1_key_ideas_GPs
[3] Found a free copy online https://www.boente.eti.br/fuzzy/ebook-fuzzy-mitchell.pdf
[4] "... Analysis of the Black-Box Optimization Challenge 2020" https://proceedings.mlr.press/v133/turner21a.html
[5] ChemBO is an example but there are others https://proceedings.mlr.press/v108/korovina20a.html
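To make the hyperparameter example above concrete, here is a minimal sketch of tuning a (non-differentiable) decision tree with hyperopt; the dataset, search ranges, and evaluation budget are arbitrary choices for illustration, not anything from the original comment.

    from hyperopt import fmin, tpe, hp
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import cross_val_score
    from sklearn.tree import DecisionTreeClassifier

    X, y = load_breast_cancer(return_X_y=True)

    # Search space for the (non-differentiable) decision tree.
    space = {
        "max_depth": hp.quniform("max_depth", 2, 12, 1),
        "min_samples_leaf": hp.quniform("min_samples_leaf", 1, 20, 1),
    }

    def objective(params):
        model = DecisionTreeClassifier(
            max_depth=int(params["max_depth"]),
            min_samples_leaf=int(params["min_samples_leaf"]),
            random_state=0,
        )
        # hyperopt minimizes, so return negative cross-validated accuracy.
        return -cross_val_score(model, X, y, cv=5).mean()

    best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
    print(best)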
No, you're confusing GA with GP.
You're right! My bad. Thank you for pointing it out! Leaving my comment up for info on bayesopt.
Nice writeup; being short, it is more approachable than John Koza’s classic book on GP. That said, if GP looks useful to you, eventually read Koza’s book, or at least experiment with his Common Lisp code.
Also don’t confuse Genetic Algorithms (GA) with GP.
> Also don’t confuse Genetic Algorithms (GA) with GP.
I just lump it all under "Evolutionary Computing"...
Naive question: what are the most suitable problems for Genetic Programming to solve, given that machine learning, especially deep learning, is all the rage now? Or do we have models that integrate genetic programming into deep learning?
Real-time and energy-constrained problems would be the top use cases in my mind given the current state of affairs. Beyond this, problems that are non-differentiable or where we desire strong symbolic interpretation. I think these techniques also tend to extrapolate more robustly than DNNs. Technically, any problem where algorithmic shortcuts like the chain rule of calculus break down.
I've been working on GP architectures that use short linear program tapes as part of their genome. Executing these programs can be done very quickly. Assuming the program tape is <=10kb in length and you are constraining to ~a megacycle in the interpreter, you can execute these programs 100k~1m times per second on a chunky workstation. The problems you can solve in this time & space are non-trivial. Once you evolve the desired program, you can write the interpreter for it in any language on any platform within half an hour.
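For a sense of what such a tape might look like, here is a minimal sketch of a linear-tape interpreter with a cycle budget; the instruction set, register count, and encoding are my own guesses, not the commenter's actual design.

    # A toy register-machine interpreter for short linear program tapes.
    def run(tape, inputs, max_cycles=1_000_000):
        """tape: list of (op, dst, a, b) tuples over a small register file."""
        regs = [0.0] * 8
        regs[:len(inputs)] = inputs
        pc = cycles = 0
        while pc < len(tape) and cycles < max_cycles:
            op, dst, a, b = tape[pc]
            if op == "add":
                regs[dst] = regs[a] + regs[b]
            elif op == "sub":
                regs[dst] = regs[a] - regs[b]
            elif op == "mul":
                regs[dst] = regs[a] * regs[b]
            elif op == "jlt":  # conditional jump: dst is a tape index here
                if regs[a] < regs[b]:
                    pc = dst % len(tape)
                    cycles += 1
                    continue
            pc += 1
            cycles += 1
        return regs[0]  # convention: register 0 holds the output

    # Example: a two-instruction tape computing (x + y) * x.
    tape = [("add", 2, 0, 1), ("mul", 0, 2, 0)]
    print(run(tape, [3.0, 4.0]))  # -> 21.0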
Training is definitely going to be expensive but it can fit in one powerful machine. You don't need to keep a terabyte of data around in VRAM to make reasonable rates of progress. Paging out parts of the population to disk or S3 buckets is feasible with these techniques. It subdivides very nicely into however many islands you can afford to run.
Inference is essentially free and could run on the processor in your toaster oven.
These approaches can make CUDA & friends look like a Rube Goldberg machine by comparison. Being able to explain how the entire thing works to a peer in a single afternoon is perhaps another significant feature of these designs.
At the end of the day, there is a pretty good reason these techniques aren't popular: there aren't any quadratic scaling laws that constrain the architecture of GP, only the much scarier exponential ones. I contend that the search space of all linear programs is viable, but I still don't have any proof that this is the case after chasing the rabbit for about a year. My central argument is that there are many high-quality programs out there; we just need to find one of them. I think the arguments against this tend to unfairly cast the problem as searching for a single needle in a haystack of 10^10000.
What kinds of problems are you trying to solve that can be solved by short program tapes but aren't obvious to a human?
Parametric programming problems are problems whose structure (most often sparsity structure) is known.
Most quadratic and linear programming solvers assume a fully generic problem structure. You give them any matrix and they'll spit out an answer. This means that internally they must also use generic matrix operations.
If you know the full structure of the optimization problem and how it changes with respect to changes in parameters, you could in principle compile a unique solver that is fully specialized to the parameterized problem. If your problem is particularly simple, you could even enumerate all possible answers in a lookup table.
https://en.m.wikipedia.org/wiki/Parametric_programming
Now here is the problem: While it is not that difficult to find the sparsity pattern of the optimization problem and choose the appropriate sparse matrix library based on the problem, actually producing a unique program that does nothing but solve your parametric optimization problem is a pipe dream. Most approaches involve a lot of human ingenuity to find exploitable structures in specific problems that become useless the moment you change the problem even a little bit.
So here is my suggestion: figure out how to use genetic programming to produce fast quadratic programming solvers for specific problems.
Note that you don't necessarily need a perfect solution since you can always run a bunch of QP steps on an initial guess to nudge it towards convergence. The goal would be to reduce the number of these steps to as close to zero as possible.
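One hedged way to frame the fitness here: score a candidate warm start (or specialized solver) by how many generic refinement steps it still needs before convergence. The sketch below counts projected-gradient steps on a random box-constrained QP; the problem, step rule, and tolerance are just illustrative assumptions.

    import numpy as np

    def qp_steps(Q, c, lb, ub, x0, tol=1e-6, max_steps=10_000):
        """Minimize 0.5 x'Qx + c'x subject to lb <= x <= ub via projected gradient."""
        alpha = 1.0 / np.linalg.eigvalsh(Q).max()  # step size from the largest eigenvalue
        x = np.clip(x0, lb, ub)
        for k in range(max_steps):
            x_new = np.clip(x - alpha * (Q @ x + c), lb, ub)
            if np.linalg.norm(x_new - x) < tol:
                return k
            x = x_new
        return max_steps

    # Fitness of an evolved "initial guess" program: fewer refinement steps is better.
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    Q = A @ A.T + np.eye(5)  # positive definite
    c = rng.standard_normal(5)
    lb, ub = -np.ones(5), np.ones(5)
    print(qp_steps(Q, c, lb, ub, x0=np.zeros(5)))  # baseline: naive warm start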
Note that genetic programming is a specific subset of genetic algorithms that focuses on searching for "programs" (hence the name), typically encoded as a tree structure similar to an AST, though other representations exist. In theory, you could use GP for almost any situation where you'd want to synthesize a mathematical expression or piece of code for which you have a criterion to optimize (i.e. "fitness"). In practice, since GP is basically just semi-randomly searching a huge combinatorial space, it tends to work best on low-dimensional problems, ideally with a fitness function that is cheap to evaluate. It can work nicely for finding nonlinear symbolic formulas for regression, for example. But there have also been other cool results over the years; Hod Lipson, for instance, has published several in robotics.
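As a rough illustration of the symbolic-regression case, here is a self-contained toy tree-GP sketch (subtree mutation only, truncation selection, crossover omitted for brevity); the primitives, target function, and population settings are arbitrary.

    import random, operator

    # Primitive set: binary ops plus terminals (the variable x and random constants).
    OPS = [(operator.add, '+'), (operator.sub, '-'), (operator.mul, '*')]

    def random_tree(depth=3):
        """Grow a random expression tree: nested tuples (op, left, right) or a terminal."""
        if depth == 0 or random.random() < 0.3:
            return 'x' if random.random() < 0.5 else random.uniform(-1, 1)
        op = random.choice(OPS)
        return (op, random_tree(depth - 1), random_tree(depth - 1))

    def evaluate(tree, x):
        if tree == 'x':
            return x
        if isinstance(tree, float):
            return tree
        (fn, _), left, right = tree
        return fn(evaluate(left, x), evaluate(right, x))

    def mutate(tree, depth=2):
        """Replace a random subtree with a freshly grown one."""
        if not isinstance(tree, tuple) or random.random() < 0.2:
            return random_tree(depth)
        op, l, r = tree
        if random.random() < 0.5:
            return (op, mutate(l, depth), r)
        return (op, l, mutate(r, depth))

    def fitness(tree, target=lambda x: x * x + x + 1.0):
        """Mean squared error against a target function on sample points."""
        pts = [i / 10.0 for i in range(-10, 11)]
        return sum((evaluate(tree, x) - target(x)) ** 2 for x in pts) / len(pts)

    # Evolutionary loop: truncation selection plus subtree mutation.
    pop = [random_tree() for _ in range(200)]
    for gen in range(50):
        pop = sorted(pop, key=fitness)[:50]               # keep the best 50
        pop += [mutate(random.choice(pop)) for _ in range(150)]
    best = min(pop, key=fitness)
    print('best MSE:', fitness(best))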
Until a few years ago, the popular deep learning methods like CNNs weren't great at that kind of thing, but LLMs definitely changed that - it's safe to say that by drawing on huge amounts of data LLMs are much better at most practical programming tasks. They're not necessarily a complete replacement though, there's definitely space for hybrid approaches, eg https://arxiv.org/abs/2401.07102 or https://www.nature.com/articles/s41586-023-06924-6.
Genetic Programming and the wider field of Evolutionary Computation and Genetic Algorithms are really good at some kinds of optimization.
If the problem fits into a tree form, then Genetic Programming is your best friend.
If the problem can be encoded as a string of 0s and 1s, then Genetic Algorithms are your best friend.
Likewise, if your problem can be encoded using floating point numbers, then Evolutionary Strategies are your best friend (a tiny sketch appears at the end of this comment).
Now... I could ruffle some feathers by saying that Genetic Algorithms and Evolutionary Strategies aren't really all that different. The same algorithms could be used for both. But historically they arose at roughly the same time in different places on earth: Illinois vs. Germany.
Back to GP. The cool thing about GP is that when the solution is evolved you have THE solution to the problem. No futzing about with how or what the results mean.
A big problem in the past is that GP doesn't scale well like neural networks. It is not embarrassingly parallel. It has been very limited by the types of hardware/architectures available.
But GP is a great field of exploration!
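To make the floating-point case concrete, here is a tiny (mu, lambda)-style Evolutionary Strategy on a toy sphere function; the dimensionality, population sizes, and step size are arbitrary choices for illustration.

    import random

    DIM, MU, LAM, SIGMA = 5, 10, 40, 0.3

    def sphere(x):
        """Minimise the sum of squares; the optimum is the all-zeros vector."""
        return sum(v * v for v in x)

    parents = [[random.uniform(-5, 5) for _ in range(DIM)] for _ in range(MU)]
    for gen in range(200):
        # Each offspring is a Gaussian perturbation of a random parent.
        offspring = [[v + random.gauss(0, SIGMA) for v in random.choice(parents)]
                     for _ in range(LAM)]
        parents = sorted(offspring, key=sphere)[:MU]  # comma selection
    print(min(sphere(p) for p in parents))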
Fitness evaluation can be massively parallel and scales easily.
Yes!
I suppose with Genetic Programming, given an appropriate set of descriptive symbols, it is relatively easy to understand the final result and intuit whether there is any over-fitting involved. On the other hand, machine learning results are typically a black box; the weights involved do not easily lend themselves to understanding the nuances of the solution.
Exactly, though without some sort of parsimony pressure you can end up with rambling programs to interpret.
Genetic algorithms to fit car shapes: https://rednuht.org/genetic_cars_2/
Areas where the problem space is continuous but the solution model isn't differentiable.
An example would be a neural network over a discrete field.
Back at university I played around with boolean neural networks and this was one way of training them directly.
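A toy version of that idea, for illustration only: a tiny network with weights restricted to {-1, 0, +1} (so no gradients exist) trained by a simple evolutionary loop. The task, encoding, and settings are invented here.

    import random

    # Training data: the 3-input majority function.
    DATA = [([a, b, c], 1 if a + b + c >= 2 else 0)
            for a in (0, 1) for b in (0, 1) for c in (0, 1)]

    def predict(weights, inputs):
        bias, ws = weights[0], weights[1:]
        return 1 if bias + sum(w * x for w, x in zip(ws, inputs)) > 0 else 0

    def fitness(weights):
        return sum(predict(weights, x) == y for x, y in DATA)

    def mutate(weights):
        w = list(weights)
        w[random.randrange(len(w))] = random.choice([-1, 0, 1])
        return w

    pop = [[random.choice([-1, 0, 1]) for _ in range(4)] for _ in range(30)]
    for _ in range(100):
        pop = sorted(pop, key=fitness, reverse=True)[:10]       # keep the fittest
        pop += [mutate(random.choice(pop)) for _ in range(20)]  # refill by mutation
    print(fitness(max(pop, key=fitness)), "of", len(DATA), "correct")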
Definitely not an expert here, but the main difference is that in machine learning the goal is to find/optimize an algorithm to fit a given dataset, whereas in genetic programming the goal is to find/optimize a dataset to fit a given algorithm. There is a lot of overlap though.
EDIT: > what are the most suitable problems that Genetic Programming is to solve
Given the above, genetic algorithms are really suitable when bounded execution time is part of your problem domain. You perform n iterations of the genetic algorithm, and even if the result does not fully satisfy the fitness criterion, you still get a result that scores better on the fitness function than a wild guess.
> in genetic programming the goal is to find/optimize dataset to fit given algorithm
No. Possibly you're confused between GAs and GP, a common confusion. In GP, the goal is to find an algorithm - in a real programming language, not as weights - to optimise an objective function. Often that is specified by input-output pairs, similar to supervised learning.
Correct.
I've used GP successfully in a very particular circuit path covering problem.
GA (not GP) is commonly used for hyperparameter choice in deep learning.
Big combinatorial problems still use genetic algorithms. Specifically I know it's still used for logistics routing problems that are too big for solvers like Gurobi.
Deep learning on graphs is unfortunately still a little underwhelming.
The ai-contest.com links go to a strange, irrelevant, and suspicious website.
Scammers, spammers, online casinos, porn sites: all of them often take over expired domains to host their trash and collect stray traffic.
Hah, this is about the ~2010 Google ai contest “planet wars”. I participated in the 2011 iteration, ‘ants’, which was a fun but humbling experience. Anyone else try that? I recall the final winner’s solution was far less elegant than I expected after the code was released, but man was it effective.
I was recently working on simple demos for Genetic Algorithms that only use basic programming techniques.
My first one just tries to make a 64-bit bitstring that is all 1's. I implemented random selection, roulette wheel, and tournament selection. Random is there as empirical proof that GAs are smarter than chance. Tournament is doing a great job.
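For anyone who wants to recreate that first demo, here is a rough sketch of the all-1s bitstring GA with tournament selection; the population size, mutation rate, and crossover scheme are my own guesses, not necessarily what was used above.

    import random

    GENOME_LEN = 64

    def fitness(bits):
        """Count of 1s; 64 is a perfect genome."""
        return sum(bits)

    def tournament(pop, k=3):
        """Pick the fittest of k randomly chosen individuals."""
        return max(random.sample(pop, k), key=fitness)

    def crossover(a, b):
        """Single-point crossover."""
        point = random.randint(1, GENOME_LEN - 1)
        return a[:point] + b[point:]

    def mutate(bits, rate=1.0 / GENOME_LEN):
        return [1 - b if random.random() < rate else b for b in bits]

    pop = [[random.randint(0, 1) for _ in range(GENOME_LEN)] for _ in range(100)]
    for gen in range(200):
        best = max(pop, key=fitness)
        if fitness(best) == GENOME_LEN:
            break
        pop = [mutate(crossover(tournament(pop), tournament(pop))) for _ in range(100)]
    print(gen, fitness(max(pop, key=fitness)))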
Next idea is student loans. For maybe 15 loans with different rates, what payment plan knocks them all out the fastest? I'm fixing the payment size, with a payment plan being a list of numbers representing which loan to apply the payment to (or 0 for no payment). Each step is (a) apply the payment, (b) advance all loans with interest, and (c) see if the total is zero. Fitness will be interesting given I need to optimize for minimal principal and payments.
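And a quick sketch of how the loan-payoff fitness could be simulated, following steps (a)-(c) above; the loan data, fixed payment size, and penalty weighting are placeholders.

    def simulate(plan, balances, rates, payment=500.0):
        """plan: list of loan indices (0 = skip a month); returns (months, total paid)."""
        balances = list(balances)
        paid = 0.0
        for month, target in enumerate(plan, start=1):
            if target > 0 and balances[target - 1] > 0:  # (a) apply the payment
                pay = min(payment, balances[target - 1])
                balances[target - 1] -= pay
                paid += pay
            balances = [b * (1 + r / 12) for b, r in zip(balances, rates)]  # (b) accrue interest
            if sum(balances) < 1e-6:                     # (c) all loans cleared?
                return month, paid
        return len(plan), paid + sum(balances)           # penalize any remaining principal

    def fitness(plan, balances, rates):
        """Fewer months and less money paid are both better (arbitrary weighting)."""
        months, paid = simulate(plan, balances, rates)
        return months + paid / 1000.0

    balances = [2000.0, 5000.0, 1200.0]
    rates = [0.06, 0.045, 0.07]
    plan = [2, 2, 1, 1, 3, 3] * 10                       # a hand-written 60-month plan
    print(simulate(plan, balances, rates), fitness(plan, balances, rates))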
I found some other easy examples people did online that beginners might want to recreate. They include trying to get from point A to B with bananas at random places (avoid falling), best combo of items in your backpack for travel, and ants finding the best path to get the most food in a 2D map. I'll try to dig up the links later if someone wants them. Otherwise, those descriptions alone might be a starting point.
I just hope the baby is fine!
I thought it was a real baby.
Nice work! Thanks for posting.
This is from 2011, FYI.
Last I checked, genetic programming wasn’t promising, and I’m a little surprised to see people paying attention to it here.
OTOH that was similar to what people were saying about hidden layers, so YMMV