Tuesday, May 24, 2011

implicit multithreading problems when using matlab on clusters

CPU resource limits will now be enforced on Tensor. CPU usage is monitored over the lifetime of each job, and if the average CPU load exceeds the requested ncpus value by 50% (that is, rises above 1.5 times the requested ncpus), the job is automatically killed.

For example, if a job requests ncpus=1 but actually uses eight cores, the job will end prematurely with a warning similar to the following:

PBS: job killed: ncpus 7.37 exceeded limit 1 (sum)

It is particularly easy to consume too many CPUs when using MATLAB because all versions of MATLAB since 2008a have multithreading enabled by default. Consequently, you may not be aware that your MATLAB job is using more than one CPU. Please consult the MATLAB documentation for further information on implicit multithreading.
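
If your code does not benefit from multithreading, one simple fix (a sketch; check the MATLAB documentation for the exact options) is to force single-threaded execution when launching MATLAB:

matlab -singleCompThread -nodisplay -r "my_script; exit"

Here my_script is a placeholder for your own entry point. For a compiled standalone application, the same runtime option can be baked in at compile time:

mcc -m main.m -R -singleCompThread

Calling maxNumCompThreads(1) at the top of your code has a similar effect.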

Wednesday, May 18, 2011

how to compile matlab code into standalone applications

Sometimes our MATLAB simulations require so much computation that we would have to wait hundreds of days for the results. That seems unendurable, but is giving up the experiment really the only option? No, there is a way out: a cluster, following the same idea as MapReduce. First, we decompose the problem into subproblems; then we solve these subproblems on different processors in a cluster of computers; finally, we combine the results of the subproblems into one final solution to the original large-scale problem. A minimal sketch of the split step follows.
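
As a concrete illustration (hypothetical: the argument convention is an assumption for this sketch), suppose the standalone executable main that we build below accepts a slice index and a slice count; each invocation then works on its own share of the data on a different processor:

./main 1 10     # process slice 1 of 10
./main 2 10     # process slice 2 of 10, on another node
./main 10 10    # process slice 10 of 10

Afterwards, one more run (or a short MATLAB script) merges the per-slice result files into the final answer.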

To use MATLAB on the cluster, the first problem we have to solve is compiling the MATLAB code into a standalone application. Running as a compiled executable not only cuts out the overhead of launching a full MATLAB session for every job but, more importantly, is not limited by the restriction on the maximum number of MATLAB licenses we can use at the same time.

To compile the MATLAB code, we first have to configure the environment. The command is
module add mcc
This command sets the library path for MATLAB standalone executables. Without this setting, you may get an error saying that some shared library is missing. Next, we call

mcc -m -v main.m -a sourcecode/

which not only compiles your main MATLAB file but also packages the code in the sourcecode folder and its subfolders into the application.
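
For reference (assuming a typical mcc setup), the compilation leaves two files of interest in the working directory: the standalone executable main and a generated wrapper script run_main.sh that sets the runtime library paths before launching it. With module add mcc already providing those paths, the binary can be run directly; otherwise the wrapper takes the MATLAB runtime root as its first argument:

./main arg1 arg2
./run_main.sh /path/to/MCR arg1 arg2

Here arg1, arg2, and /path/to/MCR are placeholders.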

Now we can talk about how to submit the executable to the cluster; a minimal job script sketch follows.
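
The following is a sketch only, assuming a PBS-style scheduler like the one that produced the warning quoted in the previous post; resource syntax, queue names, and module names vary from site to site:

#!/bin/bash
#PBS -N matlab_job
#PBS -l ncpus=1
#PBS -l walltime=24:00:00
cd $PBS_O_WORKDIR      # run from the directory the job was submitted from
module add mcc         # provide the runtime library paths for the binary
./main 1 10            # hypothetical arguments: process slice 1 of 10

Saved as, say, job.pbs, the script is submitted with qsub job.pbs; submitting it once per slice runs the subproblems in parallel.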

Saturday, May 14, 2011

a project summary transferred from the nyu Glimcher lab's website

Dopamine and Reinforcement Learning



Psychological and microeconomic theories of choice suggest that humans and animals must assign values to actions and objects in the world. These values can then be used to select the appropriate action or goal for a particular circumstance. Two major strands of research suggest that reinforcement learning is a mechanism that humans and animals use to learn these values. Classic behavioral studies of reinforcement learning in free-choice environments have used the concurrent variable interval schedule introduced by Herrnstein in the late 1960s. In our lab we have extended this behavioral work and are now developing a replacement behavioral task better suited for neuroeconomic research.

Recent evidence has linked computational models of reinforcement learning (e.g. Sutton & Barto, 1998), originally derived from the psychological models of Bush & Mosteller and Rescorla & Wagner, to the midbrain dopamine system. In particular, electrophysiological studies suggest that dopamine neurons in the substantia nigra pars compacta (SNc) and ventral tegmental area (VTA) encode a reward prediction error (RPE) signal, the difference between experienced and anticipated reward.
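
In the standard delta-rule formulation (a textbook version, not spelled out in the original summary), the RPE on trial $t$ is the gap between the reward received and the current value estimate, and this error drives the value update:

$\delta_t = r_t - V_t, \qquad V_{t+1} = V_t + \alpha\,\delta_t$

where $r_t$ is the experienced reward, $V_t$ the anticipated reward (the learned value), and $\alpha$ a learning rate.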

Hannah Bayer's thesis work in the Glimcher lab extended this research to show that dopamine neuron activity quantitatively encodes the predicted RPE signal (Bayer & Glimcher, 2004). Other labs have extended this research to show that the BOLD response in the striatum (a dopamine target area), as measured by fMRI in humans, reflects an RPE signal.

Further evidence for the encoding of values in the striatum through reinforcement learning comes from electrophysiological recordings, including the thesis work of Brian Lau in the Glimcher lab. Brian demonstrated that both 'offer values' and 'chosen values' are represented in the striatum. The time course of this neural encoding is compatible with possible roles in choice selection and in the generation of RPE signals.

Reinforcement learning in monkeys with stimulation (Schafer).
Previous work (Schultz) indicates that SNc dopamine neurons encode an RPE when animals receive (or miss) a reward. We are extending this to actual decision tasks modeled after Herrnstein's matching law (Herrnstein, 1961), where the animal chooses between two targets with different reward contingencies. We find that under choice conditions, dopamine firing rates are well predicted by reinforcement learning models. Our current project causally tests the hypothesis that dopamine neurons are, in fact, encoding an RPE signal used in reinforcement learning. By actively stimulating dopamine neurons with pulses of current at the appropriate time during our choice task, we should cause the animal's predicted value of an option to increase, and the animal's behavior should change to reflect this.

Bandit task in humans (DeWitt, Dean).
The classic choice task developed by Herrnstein to study the 'Matching Law' has critical flaws when extended to the dynamic environments faced by humans and animals. We have developed a novel dynamic choice task, based on the n-armed bandit problem widely studied in economics and computer science, that overcomes these flaws. Importantly, we know the optimal strategy for our task on a choice-by-choice basis, and this strategy is classic reinforcement learning! Our new task allows us to measure the efficiency of reinforcement learning and to determine whether humans and animals correctly trade off the effect of noise against underlying changes in the environment, as predicted by Bayesian theory.

Reinforcement learning in Parkinson's disease (Rutledge).
Parkinson's disease is characterized by a loss of dopamine neurons in the SNc and is associated with tremor, rigidity, and akinesia. The effect of this degeneration on reinforcement learning is unclear. To characterize human reinforcement learning we developed a task, adapted from our monkey choice task, in which subjects fish for crabs to earn money. By testing patients with Parkinson's disease both on and off dopaminergic medication, we find that reinforcement learning is modulated as predicted by theory. This project is a collaboration with Mark Gluck (Rutgers-Newark).

Methods for imaging dopamine areas in humans (DeWitt, Rutledge).
We are developing novel techniques to measure BOLD signals in dopamine projection and target areas in humans using Functional Magnetic Resonance Imaging (fMRI). We use the BOLD signal to provide an indirect measure of dopamine neural activity in humans. Unfortunately, current fMRI techniques make it difficult to accurately measure the midbrain dopamine areas and the orbito-frontal cortex (a major dopamine target area implicated in reinforcement learning). To better describe dopamine activity in choice tasks, we are developing imaging protocols to overcome measurement problems and functional and anatomical localizers to accurately and reliably find the dopamine areas. Our anatomical localizer uses an appropriate pulse sequence to image iron that accumulates in the dopamine areas as a byproduct of dopamine synthesis. Our functional localizer uses a classical conditioning task with primary rewards (juice) to identify dopamine areas. We have also developed a new method of image reconstruction using field map estimates to correct for signal dropout in the orbito-frontal cortex caused by magnetic field inhomogeneities near the air-filled sinuses. This project is a collaboration with Souheil Inati (Center for Brain Imaging, NYU).

An axiomatic model of dopamine function (Dean, Rutledge).
Although widely accepted, the dopamine RPE model has never been properly tested. We have developed a formal economic model which provides us with a number of testable axioms. We are collecting fMRI data using a task in which subjects choose between lotteries and observe the outcomes to win and lose real money. As expected, dopamine area activity is correlated with the predicted RPE signal. We are now testing whether dopamine area activity satisfies our economic axioms. This project is a collaboration with Mark Dean and Andrew Caplin (Economics, NYU).