This post was written for the "actionpotential" blog, as part of their #NPGsfn11 guest blog series. The version over there shows some editing; this one is the original as I wrote it.

"Like riding a bike" is what we say when we want to tell people that some skills are never forgotten. In this way, skills are like habits: they stick and are hard, if not impossible, to shake. Therefore, one of the models used in neuroscience to mimic the process of skill learning is habit formation. Animals (mostly rats or mice) are trained in a specific task until it becomes so automatized or stereotyped that the behavior becomes difficult to change. It is precisely because of these properties, stereotypy and a lack of behavioral control, that habit formation is also the experiment of choice to model drug addiction. Drug addicts are thought to have developed a drug-taking habit that has become so automatic and rigid that they cannot help but execute it (especially when faced with drug-associated cues). Thus, skills and habits are motor memories that can last a lifetime.

Habit formation in animals is usually brought about by overtraining them. For instance, in one poster in the first session of this year's Society for Neuroscience conference, Smith and Graybiel trained rats in a T-maze: on a given auditory cue, the animals had to go either left or right for a reward. Before a habit is formed, i.e., in the early phase of the experiment, the behavior is still flexible (termed 'goal-directed'). This is tested by devaluing the reward the animals get for choosing the correct arm in the T-maze. For instance, if turning right on tone A is rewarded with water, and turning left on tone B is rewarded with food, the animals may receive as much water as they want before a test in the T-maze, but still be hungry. The consequence of this devaluation procedure is that the animals will make more 'mistakes' when the 'water' cue (tone A) is sounded - they are not thirsty any more, so why should they heed tone A? This sensitivity to devaluation is abolished by habit formation: after overtraining, devaluation no longer leads to any reduction in turning right on tone A. The behavior has become stereotyped, automatic, insensitive to devaluation.

The neurobiology of these processes is complicated: two brain areas are known to prevent habit formation when lesioned, the dorsolateral striatum (DLS) and the prefrontal infralimbic cortex (IL). In order to learn more about the role of these two brain regions, Smith and Graybiel recorded from neurons in these regions during habit formation. What they found was intriguing: both regions showed the emergence of a slowly stabilizing response pattern in the course of the training, but the pattern evolved much more quickly in the DLS than in the IL. Thus, the authors were able to follow the different stages of the training using recordings in these two regions, allowing them to assign earlier and later roles to each area. This is important as the timing of these stages is a critical aspect of the whole process: establish a habit too early and it can become maladaptive. Any athlete can tell you that it's much harder to unlearn the wrong move than to learn the right one from scratch.

It is precisely this temporal control over habit formation with which another poster in this session (actually just across the aisle) was concerned. Schreiweis et al. used a similar T-maze design to train wild-type and genetically modified mice. The GM mice had their FoxP2 gene replaced by a humanized version. While the authors did not use a devaluation test, their experiments nevertheless suggested that the timing of habit formation was shifted. The experiments took advantage of so-called allocentric and egocentric learning strategies, leading to declarative or procedural memory, respectively. This means they used experiments which either trained the animals to use their own self-motion (egocentric) in order to pick the right arm of the T-maze (e.g., 'turn left'), or trained the animals to use external cues (allocentric) to learn where to go (e.g., 'go for the star, not the square'). In this framework, the allocentric strategy leads to a memory that can be declared (star), whereas the egocentric strategy leads to a memory which is best described as a procedure (turning left). The experiments of Schreiweis et al. seem to suggest that FoxP2 manipulation leads to a shift in the balance between these two processes, in this case towards the egocentric strategy. Coincidentally (or not), language acquisition, the process that FOXP2 is famous for being involved in, is a very prominent case of egocentric/procedural/habit learning.

These two posters are highly relevant for our own research, which is mainly concerned with invertebrates. As explained in two previous posts, the sort of organization one can abstract from these and previous findings can be found in the fruit fly Drosophila as well. In Drosophila, too, the function of the FoxP gene (there is only one) distinguishes between self and non-self (egocentric and allocentric, in the maze experiments; we have just submitted this work), and we can shift the time point at which habit formation occurs by switching off a region in the fly brain called the mushroom bodies (Brembs, Curr. Biol. 2009). Thus, it seems there is an ancient organization of behavioral control, present in the last common ancestor of invertebrates and vertebrates, the Urbilaterian, in which learning about external cues interacts with learning about the behavior of the animal itself, such that behavioral flexibility is only given up after sufficient training. Before we had sufficient evidence to see how strong these parallels between our work and the vertebrate experiments really were, we started to call the two processes world- and self-learning, respectively.

If these parallels keep developing as they currently are, then the mechanisms underlying the forms of motor learning which can be subsumed under self-learning are as ancient as the 'Kandelian' world-learning mechanisms and have interacted with these mechanisms for more than 500 million years to allow the animals to control how stereotypic (or efficient) or how flexible their behaviors ought to be in their given environment.

But of course, at this point, it could all be confirmation bias on my part, so there are quite a few more experiments to be done before this hypothesis can enter the textbooks as fact. Today, at least, the two poster presenters seemed to like the idea.
Posted on Sunday 13 November 2011 - 20:02:29