News: Algorithm searches for models that best explain experimental data

From the original news article from PhysOrg:

An evolutionary computation approach developed by Franklin University’s Esmail Bonakdarian, Ph.D., was used to analyze data from two classical economics experiments. As can be seen in this figure, optimization of the search over subsets of the maximum model proceeds initially at a quick rate and then slowly continues to improve over time until it converges. The top curve (red) shows the optimum value found so far, while the lower, jagged line (green) shows the current average fitness value for the population in each generation.

(…)

Regression analysis has been the traditional tool for finding and establishing statistically significant relationships in research projects, such as for the economics examples Bonakdarian chose. As long as the number of independent variables is relatively small, or the experimenter has a fairly clear idea of the possible underlying relationship, it is feasible to derive the best model using standard software packages and methodologies.

However, Bonakdarian cautioned that if the number of independent variables is large, and there is no intuitive sense about the possible relationship between these variables and the dependent variable, “the experimenter may have to go on an automated ‘fishing expedition’ to discover the important and relevant independent variables.”

You can see the original research paper here.

Invite: PyCon US 2011 – Genetic Programming in Python

If you are going to PyCon US 2011, I would like to invite you to the talk “Genetic Programming in Python“, the talk will be given by Eric Floehr on March 12th 1:20 p.m. – 2:05 p.m.

Here is the abstract:

Did you know you can create and evolve programs that find solutions to problems? This talk walks through how to use Genetic Algorithms and Genetic Programming as tools to discover solutions to hard problems, when to use GA/GP, setting up the GA/GP environment, and interpreting the results. Using pyevolve, we’ll walk through a real-world implementation creating a GP that predicts the weather.

(…)

Genetic Algorithms (GA) and Genetic Programming (GP) are methods used to search for and optimize solutions in large solution spaces. GA/GP use concepts borrowed from natural evolution, such as mutation, cross-over, selection, population, and fitness to generate solutions to problems. If done well, these solutions will become better as the GA/GP runs.

GA/GP has been used in problem domains as diverse as scheduling, database index optimization, circuit board layout, mirror and lens design, game strategies, and robotic walking and swimming. They can also be a lot of fun, and have been used to evolve aesthetically pleasing artwork, melodies, and approximating pictures or paintings using polygons.

GA/GP is fun to play with because often-times an unexpected solution will be created that will give new insight or knowledge. It might also present a novel solution to a problem, one that a human may never generate. Solutions may also be inscrutable, and determining why a solution works is interesting in itself.

Health and Genetic Algorithms

From R&D Mag – Developing a potential life-saving mathematical tool -:

Math and medicine are coming together to help people who have suffered an abdominal aortic aneurysm, which with 15,000 is the 13th-leading cause of death in the United States.

At the heart of the effort are genetic algorithms written by Oak Ridge National Laboratory researchers that allow physicians to more efficiently assess and organize the often vast amounts of information contained in patient reports. Ultimately, with this tool—a sophisticated way to quickly extract key phrases—doctors will be able to characterize features and findings in reports and provide better patient care.

(…)

This work builds on previous studies involving genetic algorithms developed for mammography. That system allows doctors to quickly identify trends specific to an individual patient and match images and text to a database of known cancerous and pre-cancerous conditions.

Darwin on the track

From The Economist article:

WHILE watching the finale of the Formula One grand-prix season on television last weekend, your correspondent could not help thinking how Darwinian motor racing has become. Each year, the FIA, the international motor sport’s governing body, sets new design rules in a bid to slow the cars down, so as to increase the amount of overtaking during a race—and thereby make the event more interesting to spectators and television viewers alike. The aim, of course, is to keep the admission and television fees rolling in. Over the course of a season, Formula One racing attracts a bigger audience around the world than any other sport.

Meanwhile, at the Hall of Justice!

UPDATE 05/10: there is an article in the Physorg too.

Sometimes we face new applications for EC, but for this I was not expecting, from Eurekalert:

WASHINGTON, Oct. 5 — Criminals are having a harder time hiding their faces, thanks to new software that helps witnesses recreate and recognize suspects using principles borrowed from the fields of optics and genetics.

(…)

His software generates its own faces that progressively evolve to match the witness’ memories. The witness starts with a general description such as “I remember a young white male with dark hair.” Nine different computer-generated faces that roughly fit the description are generated, and the witness identifies the best and worst matches. The software uses the best fit as a template to automatically generate nine new faces with slightly tweaked features, based on what it learned from the rejected faces.

“Over a number of generations, the computer can learn what face you’re looking for,” says Solomon.

Genetic Programming meets Python

I’m proud to announce that the new versions of Pyevolve will have Genetic Programming support; after some time fighting with these evil syntax trees, I think I have a very easy and flexible implementation of GP in Python. I was tired to see people giving up and trying to learn how to implement a simple GP using the hermetic libraries for C/C++ and Java (unfortunatelly I’m a Java web developer hehe).

The implementation is still under some tests and optimization, but it’s working nice, here is some details about it:

The implementation has been done in pure Python, so we still have many bonus from this, but unfortunatelly we lost some performance.

The GP core is very very flexible, because it compiles the GP Trees in Python bytecodes to speed the execution of the function. So, you can use even Python objects as terminals, or any possible Python expression. Any Python function can be used too, and you can use all power of Python to create those functions, which will be automatic detected by the framework using the name prefix =)

As you can see in the source-code, you don’t need to bind variables when calling the syntax tree of the individual, you simple use the “getCompiledCode” method which returns the Python compiled function ready to be executed.

Here is a source-code example:

from pyevolve import *
import math

error_accum = Util.ErrorAccumulator()

# This is the functions used by the GP core,
# Pyevolve will automatically detect them
# and the they number of arguments
def gp_add(a, b): return a+b
def gp_sub(a, b): return a-b
def gp_mul(a, b): return a*b
def gp_sqrt(a):   return math.sqrt(abs(a))

def eval_func(chromosome):
global error_accum
error_accum.reset()
code_comp = chromosome.getCompiledCode()

for a in xrange(0, 5):
for b in xrange(0, 5):
# The eval will execute a pre-compiled syntax tree
# as a Python expression, and will automatically use
# the "a" and "b" variables (the terminals defined)
evaluated     = eval(code_comp)
target        = math.sqrt((a*a)+(b*b))
error_accum += (target, evaluated)
return error_accum.getRMSE()

def main_run():
genome = GTree.GTreeGP()
genome.setParams(max_depth=5, method="ramped")
genome.evaluator.set(eval_func)

ga = GSimpleGA.GSimpleGA(genome)
# This method will catch and use every function that
# begins with "gp", but you can also add them manually.
# The terminals are Python variables, you can use the
# ephemeral random consts too, using ephemeral:random.randint(0,2)
# for example.
ga.setParams(gp_terminals       = ['a', 'b'],
gp_function_prefix = "gp")
# You can even use a function call as terminal, like "func()"
# and Pyevolve will use the result of the call as terminal
ga.setMinimax(Consts.minimaxType["minimize"])
ga.setGenerations(1000)
ga.setMutationRate(0.08)
ga.setCrossoverRate(1.0)
ga.setPopulationSize(2000)
ga.evolve(freq_stats=5)

print ga.bestIndividual()

if __name__ == "__main__":
main_run()

I’m very happy and testing the possibilities of this GP implementation in Python.

And of course, everything in Pyevolve can be visualized any time you want (click to enlarge):

The visualization is very flexible too, if you use Python decorators to set how functions will be graphical represented, you can have many interesting visualization patterns. If I change the function “gp_add” to:

@GTree.gpdec(representation="+", color="red")
def gp_add(a, b): return a+b

We’ll got the follow visualization (click to enlarge):

I hope you enjoyed it, I’m currently fixing some bugs, implementing new features, docs and preparing the next release of Pyevolve, which will take some time yet =)

Digital Archeology Reveals Dinosaur Details using Genetic Algorithms

From the article of LiveScience.com:

The pick and shovel can go only so far in digging up details about dinosaurs. Now supercomputers are revealing knowledge about their anatomy otherwise lost to history.

(…)

For example, if the muscles connected to the thigh bone of a Tyrannosaurus rex were short, that would suggest it was angled vertically as in humans. However, if they were very long, it could have been angled horizontally as in birds.

Initial attempts to randomly decipher which pattern of muscle activation works best result almost always in the animal falling on its face, explained computer paleontologist Peter Falkingham at the University of Manchester. But the scientists employ “genetic algorithms,” or computer programs that can alter themselves and evolve, and so run pattern after pattern until they get improvements.

Eventually, they evolve a pattern of muscle activation with a stable gait and the dinosaur can walk, run, chase or graze, Falkingham said. Assuming natural selection evolves the best possible solution as well, the modeled animal should move similar to its now extinct counterpart. Indeed, they have achieved similar top speeds and gaits with computer versions of humans, emus and ostriches as in reality.