genetic programming, Pyevolve, Python

Genetic Programming and Flex layouts

To show how flexible the Genetic Programming support in Pyevolve can be, I’ve put together a simple example using Adobe Flex and Pyevolve. The example just shows how to evolve Flex layouts; I haven’t implemented a real fitness function, so it simply creates a random Flex layout in MXML. Here is the Pyevolve code of the example:

import random
from pyevolve import *

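# Each GP function wraps its two children in the matching MXML layout tag
def gp_hbox(x, y):
   return "<mx:HBox>%s %s</mx:HBox>" % (x,y)

def gp_vbox(x, y):
   return "<mx:VBox>%s %s</mx:VBox>" % (x,y)

def gp_panel(x, y):
   return "<mx:Panel>%s %s</mx:Panel>" % (x,y)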

def eval_func(chromosome):
   # No real fitness function yet: the tree is evaluated once to build
   # the MXML string and a random score is returned
   code_comp = chromosome.getCompiledCode()
   evaluated = eval(code_comp)
   return random.randint(1,100)

def main_run():
   genome = GTree.GTreeGP()
   genome.setParams(max_depth=5, method="ramped")
   genome.evaluator += eval_func

   ga = GSimpleGA.GSimpleGA(genome)

   button     = repr("<mx:Button label='Button'/>")
   label      = repr("<mx:Label text='Label'/>")
   text_input = repr("<mx:TextInput width='50'/>")

   ga.setParams(gp_terminals       = [button, label, text_input],
                gp_function_prefix = "gp")
   ga.setMinimax(Consts.minimaxType["minimize"])
   ga.evolve(freq_stats=5)
   print ga.bestIndividual()

if __name__ == "__main__":
   main_run()

As you can see, I’ve created the layout tags (HBox, VBox and Panel) as GP functions and the Button, Label and TextInput components as GP terminals. The result is quite funny: it’s just a random layout, but you can use your imagination to create some nice and interesting fitness functions.
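To make the idea more concrete, here is a minimal sketch in plain Python (no Pyevolve needed) of what one evaluated individual could look like. It assumes that each GP function wraps its two children in the matching MXML tag. The names wrap_application, is_well_formed and button_count_fitness are hypothetical helpers invented for this illustration, and the fitness shown is only one possible idea, not something from the original example.

```python
import xml.etree.ElementTree as ET

# Namespace used by Flex 3 MXML documents
MX_NAMESPACE = "http://www.adobe.com/2006/mxml"

def gp_hbox(x, y):
    return "<mx:HBox>%s %s</mx:HBox>" % (x, y)

def gp_vbox(x, y):
    return "<mx:VBox>%s %s</mx:VBox>" % (x, y)

def gp_panel(x, y):
    return "<mx:Panel>%s %s</mx:Panel>" % (x, y)

BUTTON = "<mx:Button label='Button'/>"
LABEL = "<mx:Label text='Label'/>"
TEXT_INPUT = "<mx:TextInput width='50'/>"

def wrap_application(body):
    # Hypothetical helper: put the evolved layout inside an Application tag
    return '<mx:Application xmlns:mx="%s">%s</mx:Application>' % (MX_NAMESPACE, body)

def is_well_formed(mxml):
    # Hypothetical helper: only checks that the string parses as XML
    try:
        ET.fromstring(mxml)
        return True
    except ET.ParseError:
        return False

def button_count_fitness(mxml, wanted_buttons=2):
    # Hypothetical fitness (lower is better): distance from a wanted button count
    return abs(mxml.count("<mx:Button") - wanted_buttons)

# One hand-written "individual": a tree of GP functions over the terminals
layout = gp_panel(gp_hbox(BUTTON, LABEL), gp_vbox(TEXT_INPUT, BUTTON))
document = wrap_application(layout)

print(document)
print(button_count_fitness(layout))
```

A fitness like this could replace the random score in eval_func, since it is already a value to be minimized.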

Here is the SWF generated from a random individual of the population:

I hope you enjoyed it =)

News

FISL 10 – “Forum Internacional de Software Livre”


I’m very proud of my city today, because it is here (in Porto Alegre, RS, Brazil) that FISL 2009 (the 10th edition) is happening! FISL is a conference that has attracted more than 7,000 attendees as I write this post. All these people (developers, companies, etc.) are here to talk about and focus on “free software”. The conference features distinguished speakers like Richard Stallman (FSF), Peter Sunde (from The Pirate Bay), Jon “maddog” Hall (Linux International) and many other great speakers, and Python people too =) Even the president of Brazil will be present tomorrow (Friday)!!!

Unfortunately, almost all the sessions are in Portuguese, but this conference is a “must-go” for every developer! See some photos of the event here.

You can find more information about FISL here:

The main site of the event

A Javalobby article about the first day

The real-time transmissions and some videos of conference

The session list of the event

News

Benford’s Law and the Iran’s election

This post is just to point to some interesting analyses that use Benford’s Law to check for anomalies in Iran’s election. The first is from Walter Mebane, an expert in electoral fraud; his article “Note on the presidential election in Iran, June 2009” is available here. The second is a paper by Boudewijn F. Roukema of the Toruń Centre for Astronomy, titled “Benford’s Law anomalies in the 2009 Iranian presidential election”, and it’s available here.

I’ve written some other posts about Benford’s Law too, like here, here and here.

genetic programming, Pyevolve, Python

Genetic Programming meets Python

I’m proud to announce that the next versions of Pyevolve will have Genetic Programming support. After some time fighting with these evil syntax trees, I think I have a very easy and flexible implementation of GP in Python. I was tired of seeing people give up while trying to learn how to implement a simple GP using the hermetic libraries for C/C++ and Java (unfortunately I’m a Java web developer hehe).

The implementation is still being tested and optimized, but it’s working nicely. Here are some details about it:

The implementation is written in pure Python, so we still get many benefits from that, but unfortunately we lose some performance.

The GP core is very flexible, because it compiles the GP trees into Python bytecode to speed up the execution of the function. So you can use even Python objects as terminals, or any valid Python expression. Any Python function can be used too, and you can use all the power of Python to create those functions, which will be automatically detected by the framework using the name prefix =)

As you can see in the source code below, you don’t need to bind variables when calling the syntax tree of the individual; you simply use the “getCompiledCode” method, which returns the compiled Python code ready to be executed.
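To illustrate the general idea of compiling once and evaluating many times, here is a small sketch that uses only Python’s built-in compile() and eval(). It does not show Pyevolve’s internals: the expression string is a hand-written stand-in for what a GP tree might represent, and the evaluate helper is a hypothetical name used just for this illustration.

```python
import math

def gp_add(a, b): return a + b
def gp_mul(a, b): return a * b
def gp_sqrt(a):   return math.sqrt(abs(a))

# Functions made visible to the compiled expression
GP_FUNCTIONS = {"gp_add": gp_add, "gp_mul": gp_mul, "gp_sqrt": gp_sqrt}

# Hand-written stand-in for the expression a GP tree could represent
expression = "gp_sqrt(gp_add(gp_mul(a, a), gp_mul(b, b)))"

# Compile the expression a single time into a code object
code_comp = compile(expression, "<gp-tree>", "eval")

def evaluate(code, a, b):
    # Hypothetical helper: the terminals "a" and "b" are supplied as local
    # names, so the compiled code can be reused for every pair of values
    return eval(code, GP_FUNCTIONS, {"a": a, "b": b})

for a in range(0, 3):
    for b in range(0, 3):
        print(a, b, evaluate(code_comp, a, b))
```

The example further down relies on the same mechanism, except that it calls eval without explicit namespaces, so the names “a” and “b” are picked up from the scope of the loop.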

Here is a source-code example:

from pyevolve import *
import math

error_accum = Util.ErrorAccumulator()

# These are the functions used by the GP core,
# Pyevolve will automatically detect them
# and their number of arguments
def gp_add(a, b): return a+b
def gp_sub(a, b): return a-b
def gp_mul(a, b): return a*b
def gp_sqrt(a):   return math.sqrt(abs(a))

def eval_func(chromosome):
   global error_accum
   error_accum.reset()
   code_comp = chromosome.getCompiledCode()

   for a in xrange(0, 5):
      for b in xrange(0, 5):
         # The eval will execute a pre-compiled syntax tree
         # as a Python expression, and will automatically use
         # the "a" and "b" variables (the terminals defined)
         evaluated     = eval(code_comp)
         target        = math.sqrt((a*a)+(b*b))
         error_accum += (target, evaluated)
   return error_accum.getRMSE()

def main_run():
   genome = GTree.GTreeGP()
   genome.setParams(max_depth=5, method="ramped")
   genome.evaluator.set(eval_func)

   ga = GSimpleGA.GSimpleGA(genome)
   # This method will catch and use every function that
   # begins with "gp", but you can also add them manually.
   # The terminals are Python variables, you can use the
   # ephemeral random consts too, using ephemeral:random.randint(0,2)
   # for example.
   ga.setParams(gp_terminals       = ['a', 'b'],
                gp_function_prefix = "gp")
   # You can even use a function call as terminal, like "func()"
   # and Pyevolve will use the result of the call as terminal
   ga.setMinimax(Consts.minimaxType["minimize"])
   ga.setGenerations(1000)
   ga.setMutationRate(0.08)
   ga.setCrossoverRate(1.0)
   ga.setPopulationSize(2000)
   ga.evolve(freq_stats=5)

   print ga.bestIndividual()

if __name__ == "__main__":
   main_run()

I’m very happy with it and I’m still testing the possibilities of this GP implementation in Python.

And of course, everything in Pyevolve can be visualized any time you want:

[Images: ramped_small, ramped_big]

The visualization is very flexible too: if you use Python decorators to set how the functions are graphically represented, you can get many interesting visualization patterns. If I change the function “gp_add” to:

@GTree.gpdec(representation="+", color="red")
def gp_add(a, b): return a+b

We’ll get the following visualization:

[Image: full]

I hope you enjoyed it. I’m currently fixing some bugs, implementing new features, writing docs and preparing the next release of Pyevolve, which will still take some time =)

Genetic Algorithms, Time Waste

The Darwin’s cake experiment

Suppose that you are the owner of a famous bakery, and you have a recipe for a really delicious cake which is well known and desired by many of your clients. This is where the Darwin’s cake experiment comes in.

Suppose that you also have nearly 1,000 clients (you are very famous hehe) to whom you can send new cakes made with different amounts of each ingredient, and that these clients will tell you how much they liked each new recipe with a rating between 1 and 10, so you can find out which taste is the most popular.

So I was thinking: this is an optimization problem. Your problem is to find the almost “perfect” amounts of each ingredient for the taste of most of your clients. If we use a Genetic Algorithm to solve this optimization problem, we can imagine something like this:

Create, let’s say, 1,000 cakes (the individuals) with random amounts of ingredients and send them to the clients for evaluation (the fitness function), then collect the ratings returned by your clients (the fitness). Now you can create a new generation of cake recipes by applying the genetic operators to the first generation based on the clients’ ratings, and so on.
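Just for fun, the loop described above could be sketched in plain Python like this. Everything here is an assumption made for the illustration: the ingredient list, the hidden “ideal” recipe that stands in for the real clients, the rating formula and the choice of operators (tournament selection, uniform crossover, Gaussian mutation and elitism). A real experiment would replace client_rating with actual people eating actual cakes.

```python
import random

# Hypothetical ingredients, with the maximum amount allowed for each one
INGREDIENTS = ["flour", "sugar", "butter", "eggs"]
MAX_AMOUNT = [500.0, 300.0, 300.0, 6.0]
# Hidden "ideal" recipe, used only to simulate the taste of the clients
IDEAL = [250.0, 120.0, 100.0, 3.0]

def random_recipe(rng):
    return [rng.uniform(0.0, limit) for limit in MAX_AMOUNT]

def client_rating(recipe):
    # Simulated rating between 1 and 10: the closer to IDEAL, the higher
    error = sum(abs(r - i) / m for r, i, m in zip(recipe, IDEAL, MAX_AMOUNT))
    return 1.0 + 9.0 * (1.0 - error / len(recipe))

def tournament(population, ratings, rng, size=3):
    contenders = rng.sample(range(len(population)), size)
    return population[max(contenders, key=lambda idx: ratings[idx])]

def crossover(mom, dad, rng):
    # Uniform crossover: each amount comes from one of the two parents
    return [rng.choice(pair) for pair in zip(mom, dad)]

def mutate(recipe, rng, rate=0.1):
    child = []
    for amount, limit in zip(recipe, MAX_AMOUNT):
        if rng.random() < rate:
            amount += rng.gauss(0.0, 0.05 * limit)
        child.append(min(max(amount, 0.0), limit))  # keep inside [0, limit]
    return child

def evolve(pop_size=100, generations=30, seed=42):
    rng = random.Random(seed)
    population = [random_recipe(rng) for _ in range(pop_size)]
    history = []  # best rating of each generation
    for _ in range(generations):
        ratings = [client_rating(recipe) for recipe in population]
        history.append(max(ratings))
        # Elitism: the best cake of this generation survives unchanged
        children = [population[ratings.index(max(ratings))]]
        while len(children) < pop_size:
            mom = tournament(population, ratings, rng)
            dad = tournament(population, ratings, rng)
            children.append(mutate(crossover(mom, dad, rng), rng))
        population = children
    ratings = [client_rating(recipe) for recipe in population]
    best = population[ratings.index(max(ratings))]
    return best, max(ratings), history

best, rating, history = evolve()
print(dict(zip(INGREDIENTS, best)), rating)
```

Because the best cake is always copied to the next generation, the best rating can never get worse from one generation to the next.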

This is just a joke, but if a big company decides to make it real, I think it’ll be very funny, and they will create the first computer-generated cake!

I was also wondering whether something like this could be done for chemical products, where experiments can be run in an automated way; this is a very interesting research field for robotics and AI =)

News, Python, Science

Prime Numbers and the Benford’s Law

Today I read a news article on Physorg.com about a new pattern found in the prime numbers. The article talks about the discovery by Bartolo Luque and Lucas Lacasa:

In a recent study, Bartolo Luque and Lucas Lacasa of the Universidad Politécnica de Madrid in Spain have discovered a new pattern in primes that has surprisingly gone unnoticed until now. They found that the distribution of the leading digit in the prime number sequence can be described by a generalization of Benford’s law.

I was very surprised that nobody had noticed this before, and after reading the original paper (if you are interested, read it) describing the newly discovered pattern, I was very impressed and impatient to see it in practice!

The new pattern discovered is based on the so-called GBL (Generalized Benford’s Law), which you can see in the paper at the Eq 3.1:

$$P(d) = \frac{1}{10^{1-\alpha} - 1}\left[(d+1)^{1-\alpha} - d^{1-\alpha}\right]$$

Here P(d) is the probability of d appearing as the leading digit, and alpha is the exponent of the original power law distribution (for alpha = 1, the GBL reduces to Benford’s law).

The authors say that for a given integer interval [1, N] there exists a particular value alpha(N) for which the GBL fits the first digit distribution of the primes in that interval with extremely good accuracy. They give the functional relation between alpha and N in Eq 3.2:

$$\alpha(N) = \frac{1}{\ln N - a}$$

Where a = 1.10 ± 0.05 for large values of N. They also cite a GBL extension, but I’ll use just these formulae to plot our distributions.

So I have implemented these formulae into the simple pybenford module as follows:

import math

def gbl(alpha, digit):
   return 1/(10**(1-alpha)-1)*((digit + 1)**(1-alpha)-digit**(1-alpha))

def calc_alpha(n, a=1.10):
   return 1/(math.log(n)-a)

def gbenford_law(alpha):
   return [gbl(alpha, digit)*100.0 for digit in xrange(1,10)]
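As a quick sanity check of these formulae, the following self-contained sketch recomputes them with the standard library only. It is not part of pybenford: the benford function is added here just for comparison, and the checks rely on two facts that follow from the formula itself, namely that the nine probabilities sum to 1 and that the GBL approaches the classic Benford’s law when alpha gets close to 1.

```python
import math

def gbl(alpha, digit):
    # Generalized Benford's Law: probability of "digit" as the leading digit
    exponent = 1.0 - alpha
    norm = 10.0 ** exponent - 1.0
    return ((digit + 1) ** exponent - digit ** exponent) / norm

def calc_alpha(n, a=1.10):
    # alpha(N) = 1 / (ln N - a)
    return 1.0 / (math.log(n) - a)

def benford(digit):
    # Classic Benford's law, added here only for comparison
    return math.log10(1.0 + 1.0 / digit)

alpha = calc_alpha(10 ** 8)
distribution = [gbl(alpha, digit) for digit in range(1, 10)]

print("alpha(10^8) = %.4f" % alpha)
for digit, probability in enumerate(distribution, start=1):
    print("P(%d) = %.2f%%" % (digit, probability * 100.0))
print("sum = %.6f" % sum(distribution))
```

For N = 10^8 the value of alpha is small, so the distribution is much flatter than the classic Benford’s law, which is what we expect for the primes in that interval.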

Because we are working with an infinite integer sequence, we must always pick the interval [1, N] with N = 10^D (see the Natural Density section of the paper for more information).

The next step is to create a list of the prime numbers in the interval for D = 8, that is, [1, 10^8]. For this step I used the Sieve utility (see more information) to create a file with the prime numbers in that interval, using the following command:

sieve2310.exe -s 1 -e 100000000  >>sieve_n8.txt

The sieve is very fast: this creates the file “sieve_n8.txt”, of nearly 66MB, and it took just 8 seconds for me on an Intel Core 2 Duo 2GHz.

Now we are ready to use Python and pybenford to read the prime numbers, calculate the leading digit frequencies and plot the result! Here is the code I wrote:

import pybenford

sieve_file = open("sieve_n8.txt", "r")
prime_list = [int(prime) for prime in sieve_file]
sieve_file.close()

alpha              = pybenford.calc_alpha(10**8)
benford_law        = pybenford.gbenford_law(alpha)
prime_distribution = pybenford.calc_firstdigit(prime_list)
pybenford.plot_comparative(prime_distribution, benford_law, "Prime Numbers")
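If you don’t have the sieve output file or pybenford at hand, the counting step can be tried on a smaller interval with the sketch below. It is only an approximation of the idea under stated assumptions: primes_up_to is a simple Sieve of Eratosthenes written for this illustration (not the Sieve utility used above), first_digit_counts and first_digit_percentages are hypothetical stand-ins for what calc_firstdigit does, and the interval is [1, 10^4] so that it runs instantly.

```python
def primes_up_to(limit):
    # Simple Sieve of Eratosthenes, good enough for small limits
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"  # 0 and 1 are not prime
    for n in range(2, int(limit ** 0.5) + 1):
        if sieve[n]:
            # Cross out every multiple of n, starting at n*n
            sieve[n * n::n] = bytearray(len(range(n * n, limit + 1, n)))
    return [n for n, is_prime in enumerate(sieve) if is_prime]

def first_digit_counts(numbers):
    # counts[0] holds the numbers starting with 1, counts[8] those with 9
    counts = [0] * 9
    for number in numbers:
        counts[int(str(number)[0]) - 1] += 1
    return counts

def first_digit_percentages(numbers):
    counts = first_digit_counts(numbers)
    total = float(sum(counts))
    return [100.0 * count / total for count in counts]

prime_list = primes_up_to(10 ** 4)
for digit, percent in enumerate(first_digit_percentages(prime_list), start=1):
    print("%d: %.2f%%" % (digit, percent))
```

The percentages can then be compared with the values returned by gbenford_law for alpha = calc_alpha(10**4).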

And voilà, here is the output plot, showing the extremely good accuracy claimed by the paper’s authors:

[Figure: The plotting of the distributions]

If you are interested in Benford’s law, there are some other posts about it here and here.

I hope you liked this =)

UPDATE 10/05: Mike Loukides did a great job generalizing this to other bases; thank you for sharing your experiment, Mike.

UPDATE 08/08 (lol): There are many more comments about this post on Reddit, see here.

Genetic Algorithms, News, Science

Evolving autopilots could boost space slingshots

From the NewScientist article:

COULD space probes use genetic algorithms as autopilots to help them navigate the complexities of the solar system?

Deep-space missions such as NASA’s veteran Voyager probes often rely on gravity assists. They use a planet’s gravitational field as a slingshot, which allows them to visit other celestial bodies without using up too much fuel. But programming a probe with its trajectory years ahead of time can be a problem, says Ian Carnelli of the European Space Agency in Noordwijk, the Netherlands.

Missed launch windows, unexpected winds and misbehaving rockets mean that probes hardly ever leave Earth in the planned position or velocity, and radiation pressure from solar flares can perturb the craft’s course in deep space. If the probe is out of position when it starts a gravity-assisted manoeuvre, the slingshot will be inefficient.

In the Journal of Guidance, Control and Dynamics (DOI: 10.2514/1.32633), Carnelli and colleagues Bernd Dachwald and Massimiliano Vasile suggest that a probe could navigate for itself using a genetic algorithm (GA).

(…)

Carnelli likens this to hundreds of virtual pilots flying simulated spacecraft, with the GA disposing of those that waste fuel or steer a slow course, while “breeding” the best ones together, a process akin to natural selection. “After hundreds of generations of the GA you obtain a ‘pilot’ that is an extremely good performer – able to fly the assist trajectory that uses the least propellant while reaching the next target planet faster,” he says. Carnelli has run successful simulations of GA-enabled missions to Mercury via Venus, and Pluto via Jupiter.

(…)

Read the full article.
