Johnny said: Last weekend I wrote a small program that mutates a given gene into a target sequence, given parameters such as genome size, population size, generation time, mutation rate, and selection coefficient. Basically, it randomly inserts the gene into a genome of whatever size you specify. Then it starts to mutate the genome. Each mutation has a 70% chance of being a point mutation, a horrendously small chance of landing in the actual gene (this depends on the genome size you specify), and an even smaller chance of hitting the right base. If the mutation happens to be a point mutation at the right base, it then has a 33% chance of mutating towards the target sequence (i.e. a 1 / 3 chance of being the correct base change), and a (2 * selection coefficient) chance of being integrated into the genome. If it is actually incorporated, the program then runs a quick calculation to determine the number of years it would take for the change to spread through the entire population, given the population size, the selection coefficient, and the generation time. After it calculates that number of years, it begins the process all over again. This is repeated until the input gene matches the target gene. The output is the # of years / generations / total mutations / etc.
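A minimal sketch of the loop described above, for readers who want to try it. This is not Johnny's actual program: the function name `simulate`, its parameter names, and the returned dictionary are hypothetical. The post does not give the "quick calculation" for the spread time, so the sketch assumes the common approximation of (2 / s) * ln(2N) generations, and it sums only that spread time rather than guessing how waiting time between mutations is converted into years.

```python
import math
import random


def simulate(gene, target, genome_size, pop_size, gen_time_years,
             sel_coeff, seed=None):
    """Mutate `gene` towards `target`; all names here are illustrative."""
    if len(gene) != len(target):
        raise ValueError("gene and target must have the same length")
    if sel_coeff <= 0:
        raise ValueError("sel_coeff must be positive")

    rng = random.Random(seed)
    gene = list(gene)
    target = list(target)
    # Random position of the gene inside a genome of `genome_size` bases.
    start = rng.randrange(genome_size - len(gene) + 1)

    # ASSUMPTION: spread time of (2 / s) * ln(2N) generations. The original
    # post does not say which formula its "quick calculation" uses.
    spread_years = (2.0 / sel_coeff) * math.log(2 * pop_size) * gen_time_years

    mutations = fixations = 0
    years = 0.0
    while gene != target:
        mutations += 1
        if rng.random() >= 0.70:           # 70% chance of a point mutation
            continue
        pos = rng.randrange(genome_size) - start
        if not 0 <= pos < len(gene):       # mutation missed the gene
            continue
        if gene[pos] == target[pos]:       # base already matches the target
            continue
        if rng.random() >= 1.0 / 3.0:      # 1/3 chance of the correct base
            continue
        if rng.random() >= 2 * sel_coeff:  # 2s chance of being integrated
            continue
        gene[pos] = target[pos]
        fixations += 1
        years += spread_years

    return {"gene": "".join(gene), "mutations": mutations,
            "fixations": fixations, "years": years}


if __name__ == "__main__":
    result = simulate("TTTCTTCTGTTCAAGAACATCTCCTTG",
                      "TTTATTCTGTCCAAGAACATCTCCTTA",
                      genome_size=1000, pop_size=15000,
                      gen_time_years=1.0, sel_coeff=0.05, seed=1)
    print(result)
```

The genome size in the usage example is kept small so the loop finishes quickly; a realistic genome size makes the per-mutation hit chance far smaller and the run correspondingly longer.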
Then I wrote another program which takes the same parameters and simply does a statistical analysis -- no actual mutating of the genome. The predicted evolution rates and the actual evolution rates (given the above program) correlate very nicely.
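The statistical version can be approximated in a few lines. The sketch below is an assumption about what such an analysis might compute, not the original code: it multiplies the per-mutation probabilities listed in the first paragraph and sums the expected waiting time for each remaining mismatch. The name `expected_mutations` and its parameters are hypothetical.

```python
def expected_mutations(n_mismatches, genome_size, sel_coeff, p_point=0.70):
    """Expected total mutations needed to fix `n_mismatches` bases.

    Assumes each mutation independently succeeds at one given mismatched
    site with probability p_point * (1 / genome_size) * (1 / 3) * 2s.
    """
    per_site = (p_point * (1.0 / genome_size) * (1.0 / 3.0)
                * min(1.0, 2 * sel_coeff))
    # With k mismatches left, any of the k sites counts as a success, so the
    # expected wait for the next accepted change is 1 / (k * per_site).
    return sum(1.0 / (k * per_site) for k in range(1, n_mismatches + 1))


if __name__ == "__main__":
    # Three mismatches, as in the example sequence further down.
    print(expected_mutations(3, genome_size=1000, sel_coeff=0.05))
```

Averaging the mutation counts from many simulated runs and comparing them with this figure is one way to check that the two approaches agree.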
The problem with running such simulations, as Bob b touched on, is that we must apply some selection criterion. And since we're not really using life-forms, reproductive success isn't really a viable option. Dawkins chose to use "anything towards the goal" as his selection criterion. If we were trying to apply his program as a real-life analogy, the assumption made is not that evolution works towards a specified product, but rather that evolution works as an algorithm towards a goal. That goal is increased reproductive fitness. Thus, in Dawkins' program and in mine, the underlying assumption is that each mutation is a positive mutation (i.e. it must benefit the organism, or at least not harm the organism). Consider the following simulation of the evolution of a gene sequence.
TTT CTT CTG TTC AAG AAC ATC TCC TTG
TTT CTT CTG TCC AAG AAC ATC TCC TTG
TTT ATT CTG TCC AAG AAC ATC TCC TTG
TTT ATT CTG TCC AAG AAC ATC TCC TTA
Final sequence:
TTT ATT CTG TCC AAG AAC ATC TCC TTA
The assumption at each of those intermediate sequences is that it is working towards a goal. In my simulation, that goal is the target sequence. In real life, that goal is increased reproductive fitness. So while the program may spit out a number in the form of years (that sequence took 547 years to evolve in a population of 15000 organisms with 100k genes, each replicating once a year with a mutation rate of one per replication), there is also the assumption that each intermediate stage is beneficial. In real life, we are not guaranteed that. So a real simulation is actually very difficult to accomplish, because you have to program a defined set of instructions or guidelines by which to compare each intermediate form.
I don't know why I just ranted on about all that. Just wanted to tell you about my program and the difficulties of modeling natural selection.
Interesting, so what you're saying is that Bob's challenge should give us 547 letter changes to make a meaningful sentence?