Saturday, March 11, 2006

project finally begun, first update.

In this post I will give an overview of the broad plan (more or less again), followed by the prioritization that I have planned for my experiments. I will then provide the details of the first experimental set-up, followed by the first update on the experiments that began 4 days ago.

The long-term plan to answer the questions posted previously is to evolve and thoroughly analyze dynamical system agents in several different experimental set-ups: two different tasks, (i) reversible and (ii) irreversible learning on a continuum; and two experimental set-ups, (a) a disembodied/non-situated one and (b) an embodied/situated one. Reversible learning further subdivides into classical conditioning and operant conditioning.

Altogether this could add up to 6 different experimental scenarios. However, they will not all be tackled at once. I am prioritising the set-ups in terms of relative simplicity and in terms of the main goal of my research. Although eventually I would like to understand all of these in terms of dynamical systems, I will give priority to the abstract scenarios over the embodied/situated ones because my main interest is in the analysis of the dynamics underlying learning and memory.

Furthermore, I will prioritize these abstract set-ups in terms of relative simplicity. The first set-up to be tackled concerns irreversible learning. Second, I will tackle the reversible learning scenarios.

The first experimental set-up involves evolving agents that can learn once during their lifetime. The task requires that the agent ‘memorise’ the presentation of a feature in the environment and be able to make a decision concerning this feature after a time delay. There are a number of examples of irreversible forms of learning in animals; one which is particularly interesting is parental imprinting in birds as studied by Konrad Lorenz.

The tasks are similar to the imprinting scenarios that I have played with until now but they extend them in very important directions: [1] So far the learning on a continuum has only been evolved and tested on a few successive presentations of test individuals (i.e. one or two). One important direction is to extend this imprinting-like learning over many successive presentations of test individuals. This will address questions regarding the persistence of memory.

The agent is a fully connected CTRNN. There is one sensory signal that is fed to all nodes in the CTRNN via a set of weights that are evolved along with the rest of the CTRNN parameters. The feature that the agent has to remember is a signal with a value in [1, 2], provided for a fixed length of time (10 units of time). At the beginning of a trial, a random delay is introduced ([10, 20] ut). The first individual is then presented and this should be interpreted as the ‘parent’ individual in the imprinting metaphor of the task. A random delay is introduced again. A second signal is then produced. This can be of the same value as the first or a different one. This is interpreted as a ‘test’ individual. The agent has one output node. The output of this node is interpreted as ‘this is my parent’ when it is 1 and ‘this is not my parent’ when it is 0. The agent is evaluated after the test individual is presented, and a successful agent must produce the correct output for any number of individuals presented after the ‘parent’.
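To make the set-up concrete, here is a minimal Python sketch of the agent and of one trial's input signal. It is not the code used for the experiments (those tools are in C and Matlab). It uses the standard CTRNN equation, tau_i dy_i/dt = -y_i + sum_j w_ji sigma(y_j + theta_j) + s_i I(t), integrated with forward Euler. The class and function names, the parameter ranges, the step size, the zero signal during delays, the 50/50 chance of re-presenting the parent and the choice of the last node as the output node are all assumptions made for illustration.

```python
import math
import random


def sigmoid(x):
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class CTRNN:
    """Fully connected CTRNN with one sensory input fed to every node."""

    def __init__(self, n, rng):
        self.n = n
        # Random parameters stand in for evolved ones; ranges are assumptions.
        self.tau = [rng.uniform(1.0, 10.0) for _ in range(n)]
        self.bias = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        # w[j][i] is the weight of the connection from node j to node i.
        self.w = [[rng.uniform(-5.0, 5.0) for _ in range(n)] for _ in range(n)]
        self.sensor_w = [rng.uniform(-5.0, 5.0) for _ in range(n)]
        self.y = [0.0] * n

    def step(self, signal, dt):
        """Advance the state by one Euler step and return the output node's firing rate."""
        out = [sigmoid(self.y[j] + self.bias[j]) for j in range(self.n)]
        new_y = []
        for i in range(self.n):
            net = sum(self.w[j][i] * out[j] for j in range(self.n))
            net += self.sensor_w[i] * signal
            new_y.append(self.y[i] + dt * (-self.y[i] + net) / self.tau[i])
        self.y = new_y
        # Assumption: the last node is the output node.
        return sigmoid(self.y[-1] + self.bias[-1])


def make_trial(rng, n_tests, dt=0.1):
    """Build the sampled input signal for one trial.

    Returns (signal, checks), where checks holds one (sample_index, is_parent)
    pair per test individual; sample_index is where that presentation ends.
    """
    signal, checks = [], []

    def add(value, duration):
        signal.extend([value] * int(round(duration / dt)))

    parent = rng.uniform(1.0, 2.0)
    add(0.0, rng.uniform(10.0, 20.0))   # initial random delay
    add(parent, 10.0)                   # 'parent' shown for 10 time units
    for _ in range(n_tests):
        add(0.0, rng.uniform(10.0, 20.0))
        is_parent = rng.random() < 0.5  # assumption: same value half the time
        add(parent if is_parent else rng.uniform(1.0, 2.0), 10.0)
        checks.append((len(signal), is_parent))
    return signal, checks
```

A trial would then be run by calling `step` once per sample of `signal` and comparing the output at each index in `checks` with the expected answer (close to 1 for the parent, close to 0 otherwise).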

The learning is irreversible in this case because the agent cannot learn a new parent at an arbitrary point during ‘its life’, only at the beginning. At the same time, this learning is interesting because the agent has to hold on to its ‘memory’ of the first individual for the longest time possible.

So, I finally started on Tuesday (07.03.06). I have further subdivided the first set-up into 4 stages: [i] 2-possible-parents irreversible learning with only one evaluation, [ii] 2-possible-parents irreversible learning with several successive evaluations, [iii] possible parents on a continuum irreversible learning with only one evaluation, and [iv] possible parents on a continuum with several successive evaluations.

On the same day I started, I was able to evolve 2- and 3-node CTRNNs for stage [i]. I have not had the chance to analyse these circuits or their evolutionary dynamics, for two reasons: first, because all of it has evolved so quickly, and second, because the plan is to wait until the full task (i.e. [iv]) has been successfully evolved before stopping to analyse.

Using an incremental approach I was also able to evolve 3-node CTRNNs for stage [ii]. The incremental approach was very simple (and loosely inspired by Phattanasri's work): the parent individual is always presented first, and after a random delay the first test individual is presented. Once one agent in the population achieves a 95% fitness score, the fitness trial changes to include an extra test individual after another random delay. And so on, until an agent can discriminate up to 5 test individuals one after another.
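The incremental rule can be sketched in a few lines of Python. This is only an illustration of the rule as described above, not the actual evolutionary code; the function name `update_n_tests` and its parameters are hypothetical, and fitness is assumed to be normalised to [0, 1].

```python
def update_n_tests(n_tests, fitnesses, threshold=0.95, max_tests=5):
    """Return the number of test individuals to use in the next generation.

    One extra test individual is added as soon as any agent in the
    population reaches the fitness threshold, up to max_tests.
    """
    if n_tests < max_tests and max(fitnesses) >= threshold:
        return n_tests + 1
    return n_tests
```

Called once per generation with the population's fitness scores, this keeps the task fixed until the best agent reaches 95% and then lengthens the trial by one test individual, stopping at 5.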

In the time that I get between evolving, I am building up some tools (in C and Matlab) to help me visualise the performance of the agents and analyse both the evolutionary dynamics and the CTRNN dynamics.

Now I've set up the evolutionary runs for the continuum and successive case [iv] at once. There are two parallel incremental approaches at work in these runs. The first is the same as the one already described: going from discriminating the first test individual correctly to identifying successive individuals, until the agent generalises. The second incremental approach concerns the shift from 2-possible-parents to a continuum of possibilities. This approach is a bit more subtle and I will describe it later on.

There are a rather large number of questions to be answered in this first scenario (including questions in i, ii, iii and iv). These have begun to come up as I evolve the simpler milestones. I will write about the questions in the next update.

