cbm manual

Payam Piray

June 18, 2019

This document is the manual of the cbm (computational and behavioral modeling) toolbox. cbm provides tools for hierarchical Bayesian inference (HBI). See the corresponding manuscript (here) for more information about the HBI and its comparison with other methods. You can download cbm here. In this manual, I present two well-known examples and explain how cbm tools should be used to perform Bayesian model comparison, parameter estimation and inference at the group level. To follow this manual, you need to be familiar with matlab syntax (cbm will be published soon in other languages, particularly python).

1 Example 1: bandit task

I assume that the current directory contains this manual and the "codes" directory, as well as the "example_RL_task" and "example_2step_task" directories. The codes directory contains all cbm matlab functions. To use cbm tools, you need to define your model space and code your own models; you can then use cbm to make inference. For this example, model code and data have been stored in the example_RL_task directory.

Suppose you have 40 subjects' choice data in a 2-armed bandit task, in which subjects chose between two actions and received binary outcomes. All data have been stored in a mat-file called all_data.mat in a cell format, in which each cell contains the choice data and outcomes for one subject. First, enter the example_RL_task directory and then load those data:

In [1]:
% enter the example_RL_task folder,
% which contains data and models for this example
% (assuming that you are in the directory of the manual)
cd(fullfile('example_RL_task'));

In [2]:
% load data
fdata = load('all_data.mat');
data  = fdata.data;
% data for each subject
subj1 = data{1}; % subject 1
subj2 = data{2}; % subject 2
% and so on

For example, subj1 contains the information of subject 1: subj1.actions are all actions and subj1.outcome are the corresponding outcomes.

Also suppose you have 2 candidate models: an RL model, which has a learning rate (alpha) and a decision noise (beta) parameter, and a dual-alpha RL model with two separate learning rates for positive and negative prediction errors (alpha1 and alpha2, respectively) and a decision noise (beta) parameter. See the matlab functions model_RL and model_dualRL as examples.

It is important to remember that cbm does not care how your model works! It assumes that your models take parameters and data (of one subject) as input and produce the log-probability of the data given the parameters (i.e. the log-likelihood function). Tools in cbm only require that the input and output of models follow a specific format:

loglik = model(parameters,subj)

For example, for model_RL, we have:

loglik = model_RL(parameters,subj1)

Here parameters is a vector whose size depends on the number of parameters of the model (for model_RL, its size is 2). subj1 is the structure containing the data of subject 1 as indicated above. loglik is the log-likelihood of the subj1 data given parameters, as computed by model_RL.

We now use cbm tools to fit models to data. For this example, we use the two models implemented in model_RL and model_dualRL together with the data stored in all_data.mat. The data stored in all_data is a synthetic dataset of 40 subjects: the data of the first 10 subjects were generated by model_RL and the data of the next 30 subjects were generated by model_dualRL. Note, however, that cbm is not meant to provide a collection of different computational models.
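To make the required interface concrete, here is a minimal sketch of a model function skeleton. It only illustrates the input/output format; the field names (subj.actions, subj.outcome) follow the data structure described above, while the coding of actions as 1 or 2 and the simple delta-rule/softmax body are stand-in assumptions, not the actual code of model_RL:

function loglik = model_skeleton(parameters, subj)
% minimal sketch of the model interface required by cbm
nd_alpha = parameters(1);          % normally-distributed parameter
alpha    = 1/(1+exp(-nd_alpha));   % transformed to (0,1)
nd_beta  = parameters(2);
beta     = exp(nd_beta);           % transformed to be positive

actions  = subj.actions;           % assumed coded as 1 or 2 per trial
outcomes = subj.outcome;           % binary outcomes

q = zeros(1,2);                    % action values
loglik = 0;
for t = 1:length(actions)
    p1 = 1/(1+exp(-beta*(q(1)-q(2)))); % softmax probability of action 1
    a  = actions(t);
    if a==1, p = p1; else, p = 1-p1; end
    loglik = loglik + log(p);          % accumulate log-likelihood
    q(a) = q(a) + alpha*(outcomes(t)-q(a)); % delta-rule update
end
end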
You should code the models for the questions of your interest yourself. The tools in cbm fit your models to your data and compare the models given the data. First, make sure that cbm is added to your matlab path and then load the data:

In [3]:
addpath(fullfile('..','codes'));
% load data from all_data.mat file
fdata = load('all_data.mat');
data  = fdata.data;

Before using cbm tools, it is always good to check whether the format of the models is compatible with cbm. To do that, create a random parameters vector and call the models:

In [4]:
% data of subject 1
subj1 = data{1};
parameters = randn(1,2);
F1 = model_RL(parameters,subj1)
parameters = randn(1,3);
F2 = model_dualRL(parameters,subj1)

F1 = -200.9599
F2 = -57.2434

Note that because parameters were randomly drawn, running the same code again generates different values for F1 and F2. Also note that parameters are drawn from a normal distribution. In theory, RL models require constrained parameters (e.g. alpha between 0 and 1). In model_RL and model_dualRL, transformations are applied to the normally-distributed parameters to meet their theoretical bounds (see model_RL and model_dualRL). I'll explain later a bit more about transforming (normally-distributed) parameters in your models.

We checked the models with some random parameters. It is also good to check them with more extreme parameter values:

In [5]:
parameters = [-10 10];
F1 = model_RL(parameters,subj1)
parameters = [-10 10 10];
F2 = model_dualRL(parameters,subj1)

F1 = -654.4651
F2 = -169.2246

Again, F1 and F2 should be real negative scalars.

After making sure that model_RL and model_dualRL work fine, we now use cbm tools to fit the models to data. First, we run cbm_lap, which fits every model to each subject's data separately (i.e. in a non-hierarchical fashion). cbm_lap employs the Laplace approximation, which needs a normal prior for every parameter. We set zero as the prior mean. We also assume that the prior variance for all parameters is 6.25. This variance is large enough to cover a wide range of parameters without excessive penalty (see the supplementary materials of the reference article for more details on how this variance is calculated).

In [6]:
v = 6.25;
prior_RL     = struct('mean',zeros(2,1),'variance',v); % note dimension of 'mean'
prior_dualRL = struct('mean',zeros(3,1),'variance',v); % note dimension of 'mean'

We also need to specify a file-address for saving the output of each model:

In [7]:
fname_RL     = 'lap_RL.mat';
fname_dualRL = 'lap_dualRL.mat';

Now we run cbm_lap for each model. Note that model_RL and model_dualRL are both in the current directory.

In [8]:
cbm_lap(data, @model_RL, prior_RL, fname_RL);
% Running this command prints a report on your matlab output
% (e.g. on the command window)
cbm_lap
18-Jun-2019 22:04:28
======================================================================
Number of samples: 40
Number of parameters: 2
Number of initializations: 14
----------------------------------------------------------------------
Subject: 01
Subject: 02
...
Subject: 40
done :]

Also run cbm_lap for model_dualRL:

In [9]:
cbm_lap(data, @model_dualRL, prior_dualRL, fname_dualRL);
% Running this command prints a report on your matlab output
% (e.g. on the command window)

cbm_lap
18-Jun-2019 22:04:35
======================================================================
Number of samples: 40
Number of parameters: 3
Number of initializations: 21
----------------------------------------------------------------------
Subject: 01
Subject: 02
...
Subject: 40
done :]

Let's take a look at the file saved by cbm_lap:

In [10]:
fname = load('lap_RL.mat');
cbm = fname.cbm;
% look at fitted parameters
cbm.output.parameters

ans =
   -0.9099   -2.2959
   -1.8879   -2.9416
    0.1626   -2.3983
   -2.2385   -1.1035
   -0.1508   -3.1319
   ...
   (40 rows, one per subject)

Note that these values are normally-distributed parameters (I'll explain later how to obtain bounded parameters). The order of the parameters depends on how they have been coded in the corresponding model. In model_RL, the first parameter is alpha and the second one is beta. Therefore, here the first column corresponds to alpha and the second one to beta.

Now we can do hierarchical Bayesian inference using cbm_hbi. cbm_hbi needs 4 inputs. The good news is that you already have all of them!
In [11]:
% 1st input: data for all subjects
fdata = load('all_data.mat');
data = fdata.data;
% 2nd input: a cell input containing function handles to models
models = {@model_RL, @model_dualRL};
% note that by handle, I mean @ before the name of the function
% 3rd input: another cell input containing file-addresses of files saved by cbm_lap
fcbm_maps = {'lap_RL.mat','lap_dualRL.mat'};
% note that they correspond to models (so pay attention to the order)
% 4th input: a file-address for saving the output
fname_hbi = 'hbi_RL_dualRL.mat';

Now, we are ready to run cbm_hbi:

In [12]:
cbm_hbi(data,models,fcbm_maps,fname_hbi);
% Running this command prints a report on your matlab output
% (e.g. on the command window)

cbm_hbi_hbi
Running hierarchical bayesian inference (HBI)...
18-Jun-2019 22:04:43
HBI has been initialized according to
lap_RL.mat [for model 1]
lap_dualRL.mat [for model 2]
Number of samples: 40
Number of models: 2
======================================================================
Iteration 01
Iteration 02
model frequencies (percent)   model 1: 50.5 | model 2: 49.5
dL: 6.62    dm: 49.48   dx: 0.22
Iteration 03
model frequencies (percent)   model 1: 48.6 | model 2: 51.4
dL: 2.14    dm: 1.91    dx: 0.12
Iteration 04
model frequencies (percent)   model 1: 46.3 | model 2: 53.7
dL: 3.09    dm: 2.30    dx: 0.13
Iteration 05
model frequencies (percent)   model 1: 43.3 | model 2: 56.7
dL: 4.28    dm: 3.04    dx: 0.16
Iteration 06
model frequencies (percent)   model 1: 39.4 | model 2: 60.6
dL: 4.91    dm: 3.84    dx: 0.18
...
(iterations 07-18 omitted; the model frequencies settle around 29% and 71% while dL, dm and dx shrink)
Iteration 19
model frequencies (percent)   model 1: 29.1 | model 2: 70.9
dL: 0.06    dm: 0.00    dx: 0.01
Converged :]

Running cbm_hbi writes a report on your standard output (often the screen). On every iteration, cbm_hbi reports the model frequency, which is the estimate of how many individual datasets (i.e. subjects) are explained by each model (in percent). Furthermore, on every iteration, three metrics show the changes relative to the previous iteration: dL is the change in the log-likelihood of all data given the model space (more specifically, a variational approximation of the log-likelihood); dm is the (percentage of) change in the model frequencies; dx indicates the change in the (normalized values of) parameters. Although any of these measures or a combination of them could be used as a stopping criterion, cbm_hbi uses dx as its stopping criterion.
By default, cbm_hbi stops when dx < 0.01. Let's now take a look at the saved file:

In [13]:
fname_hbi = load('hbi_RL_dualRL.mat');
cbm = fname_hbi.cbm;
cbm.output

ans =
  struct with fields:

                     parameters: {2x1 cell}
                 responsibility: [40x2 double]
                     group_mean: {[-1.8854 -0.0098]  [1.0678 -0.2944 1.1510]}
    group_hierarchical_errorbar: {[0.2954 0.0409]  [0.1183 0.1038 0.0912]}
                model_frequency: [0.2913 0.7087]
                exceedance_prob: [0.0041 0.9959]
      protected_exceedance_prob: [NaN NaN]

Almost all useful outputs are stored in cbm.output, which we explain here. First, let's look at model_frequency, which is the HBI estimate of how much each model is expressed across the group:

In [14]:
model_frequency = cbm.output.model_frequency

model_frequency = 0.2913    0.7087

Note that model_frequency is normalized here (so it sums to 1 across all models). Also note that the order depends on the order of the models fed to the HBI as input. Therefore, in this example, the HBI estimated that about 29% of all subjects are explained by model_RL and about 71% by model_dualRL.

Now let's take a look at the estimated group means stored in cbm.output.group_mean.

In [15]:
group_mean_RL     = cbm.output.group_mean{1} % group mean for parameters of model_RL
group_mean_dualRL = cbm.output.group_mean{2} % group mean for parameters of model_dualRL

group_mean_RL     = -1.8854   -0.0098
group_mean_dualRL =  1.0678   -0.2944    1.1510

Note that these are normally distributed parameters. That is why the group learning rate (which is typically constrained to be between zero and one) is negative here: HBI (and the other tools in cbm) assumes that parameters are normally distributed. Therefore, if you have a model with some constraints on its parameters (e.g. an RL model), you should transform the normally distributed parameters inside your model function. To make this point clear, take a look at the model_RL function. On the first two lines of this function, you see this code:

nd_alpha = parameters(1);       % normally-distributed alpha
alpha = 1/(1+exp(-nd_alpha));   % alpha (transformed to be between zero and one)

Here, nd_alpha is the normally-distributed parameter passed to the model (for example, by cbm tools). Before using it, the model transforms it to alpha, which is bounded between 0 and 1 and serves as the effective learning rate. To do this, a sigmoid function has been used:

sigmoid(x) = 1/(1 + e^(-x))

which is illustrated here:

In [16]:
x = -5:.1:5;
y = 1./(1+exp(-x));
plot(x,y);
title('y = sigmoid(x)');
xlabel('x');
ylabel('y');

As you see, the sigmoid is a monotonically increasing function, which maps its input (x) to an output (y) between 0 and 1. Therefore, if you want to obtain the parameters of your model in their theoretical range (e.g. a learning rate between 0 and 1), you should apply the same transformation (e.g. the sigmoid) to the normally distributed parameter (e.g. to group_mean_RL(1)). The second parameter of model_RL is the decision noise parameter, which is theoretically constrained to be positive. For this one, an exponential transformation is applied to the second parameter passed to model_RL:

nd_beta = parameters(2);
beta = exp(nd_beta);
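Putting this together, you can recover the group-level parameters of model_RL in their theoretical ranges by applying the same transformations to the group means. This is a minimal sketch, assuming the group_mean_RL variable obtained above (the variable names alpha_group and beta_group are only illustrative):

alpha_group = 1/(1+exp(-group_mean_RL(1)))   % group learning rate, in (0,1)
beta_group  = exp(group_mean_RL(2))          % group decision noise, positive

With the values above, this gives a group learning rate of about 0.13 and a group decision noise of about 0.99.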
The HBI also quantifies the (hierarchical) errorbars of the group_mean parameters, which are saved in cbm.output.group_hierarchical_errorbar:

In [17]:
group_errorbar_RL     = cbm.output.group_hierarchical_errorbar{1};
group_errorbar_dualRL = cbm.output.group_hierarchical_errorbar{2};

You can use the group_mean and group_hierarchical_errorbar values to plot group parameters, or use cbm_hbi_plot to plot the main outputs of the HBI.

In [18]:
% 1st input is the file-address of the file saved by cbm_hbi
fname_hbi = 'hbi_RL_dualRL.mat';
% 2nd input: a cell input containing model names
model_names = {'RL', 'Dual RL'};
% note that they correspond to models (so pay attention to the order)
% 3rd input: another cell input containing parameter names of the winning model
param_names = {'\alpha^+','\alpha^-','\beta'};
% note that '\alpha^+' is in the latex format, which renders a Greek alpha
% 4th input: another cell input containing the transformation function associated with each parameter
transform = {'sigmoid','sigmoid','exp'};
% note that if you use a less common transformation function, you should pass its function handle
cbm_hbi_plot(fname_hbi, model_names, param_names, transform)
% this function creates a model comparison plot (exceedance probability and model frequency)
% and a plot of the transformed parameters of the most frequent model.

plotting the group parameters of the most frequent model
There is no protected exceedance probability as cbm_hbi_null has not been executed
Plotting exceedance probability instead...

Similar to a t-test, you can use the hierarchical errorbars to make an inference about a parameter at the population level. We explain that feature in the next example.

The values of the individual parameters are saved in cbm.output.parameters:

In [19]:
parameters_RL     = cbm.output.parameters{1};
parameters_dualRL = cbm.output.parameters{2};

You can also look at the estimated responsibility, i.e. the probability that each model generated each individual dataset. Across models, responsibilities sum to 1 for each subject.

In [20]:
responsibility = cbm.output.responsibility

responsibility =
    0.9755    0.0245
    0.9992    0.0008
    0.9993    0.0007
    0.9996    0.0004
    0.9990    0.0010
    ...
    (40 rows, one per subject; in rows 11 to 40 the first column is near 0 and the second near 1)

The first and second columns indicate the responsibility of model_RL and model_dualRL in generating the corresponding subject's data, respectively. Look at the estimated responsibility of model_dualRL:

In [21]:
plot(responsibility(:,2));
ylim([-.1 1.1])

As you see, the estimated responsibility of model_dualRL for the first 10 subjects is almost zero. These results make sense because this is a synthetic dataset: the data of the first 10 subjects were actually generated using model_RL, and the next 30 individual datasets were generated using model_dualRL.
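As a quick sanity check of these outputs (a sketch, assuming the responsibility matrix obtained above), you can verify that responsibilities sum to 1 for each subject; averaging them across subjects should also approximately recover the model frequencies:

sum(responsibility,2)'   % a row of ones: responsibilities sum to 1 per subject
mean(responsibility,1)   % approximately [0.29 0.71], close to model_frequency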
Let's also take a look at the exceedance probability, a metric typically used for model selection.

In [22]:
xp = cbm.output.exceedance_prob

xp = 0.0041    0.9959

The exceedance probability indicates the probability that each model is the most likely model across the group. A more useful metric is the protected exceedance probability, which also takes into account the null hypothesis that no model in the model space is most likely across the population (i.e. that any difference between model frequencies is due to chance).

In [23]:
pxp = cbm.output.protected_exceedance_prob

pxp = NaN   NaN

As you see, this currently contains only NaN values. This is because computing protected exceedance probabilities requires re-running the HBI under the (prior) null hypothesis. This is how you can do it:

In [24]:
fdata = load('all_data.mat');
data = fdata.data;
fname_hbi = 'hbi_RL_dualRL';
% 1st input is data,
% 2nd input is the file-address of the file saved by cbm_hbi
cbm_hbi_null(data,fname_hbi);
% Running this command prints a report on your matlab output
% (e.g. on the command window)

cbm_hbi_hbi
18-Jun-2019 22:05:01
Running hierarchical bayesian inference (HBI) - null mode...
HBI has been initialized according to
lap_RL.mat [for model 1]
lap_dualRL.mat [for model 2]
Number of samples: 40
Number of models: 2
======================================================================
Iteration 01
Iteration 02
dL: 5.07    dx: 0.15
Iteration 03
dL: 1.14    dx: 0.05
Iteration 04
dL: -0.08   dx: 0.02
Iteration 05
dL: 0.06    dx: 0.01
Iteration 06
dL: 0.08    dx: 0.01
Converged :]

cbm_hbi_null saves a new file, 'hbi_RL_dualRL_null.mat', and it also updates the cbm struct in hbi_RL_dualRL.mat. Load hbi_RL_dualRL.mat again and look at the protected exceedance probability:

In [25]:
fname_hbi = load('hbi_RL_dualRL.mat');
cbm = fname_hbi.cbm;
pxp = cbm.output.protected_exceedance_prob

pxp = 0.0041    0.9959

Note that here the values of xp and pxp are not really different (their difference is very small). In many datasets, however, their difference can be quite substantial.

2 Example 2: Two-step Markov decision task

We now use cbm tools for computational modeling in the two-step Markov decision task introduced by Daw et al. (2011). This task is a well-known paradigm for distinguishing two behavioral modes, model-based and model-free learning. Daw et al. proposed three reinforcement learning accounts, a model-based, a model-free and a hybrid account (which nests the other two and combines their estimates according to a weight parameter), to disentangle the contribution of these two behavioral modes to choices. For this example, we use an empirical dataset reported in Piray et al. (2016) (20 subjects for this example), which is stored in the example_2step_task directory. The hybrid, model-based and model-free algorithms contain 7, 4 and 6 parameters, respectively. Please see Daw et al. for a formal description of the models. As in the first example, each model follows the required interface:

loglik = model(parameters,subj);

Now enter the example_2step_task directory and load the data.

In [26]:
% assuming that the current directory is example_RL_task
cd(fullfile('..','example_2step_task'));
% load data from all_data.mat file
fdata = load('all_data.mat');
data = fdata.data;
subj1 = data{1};
subj1

subj1 =
  struct with fields:

        choice1: [201x1 double]
        transit: [201x1 double]
         state2: [201x1 double]
        choice2: [201x1 double]
        outcome: [201x1 double]
    description: {5x1 cell}

See subj1.description for a description of the information stored for each subject. Now, you should call cbm_lap with each model separately.
In [27]:
prior_mb = struct('mean',zeros(4,1),'variance',6.25);
fname_mb = 'lap_mb.mat';
cbm_lap(data, @model_mb, prior_mb, fname_mb);
% Running this command prints a report on your matlab output
% (e.g. on the command window)

If you have many subjects or a model with many parameters, cbm_lap can take a long time to fit all of them. If you have access to cluster computing, however, you can run cbm_lap in parallel across subjects. For example, here we fit model_mf to the data of only subject 1:

In [28]:
% create a directory for individual output files:
mkdir('lap_subjects');
% 1st input: data
% now the input data should be the data of subject 1
data_subj = data(1);
% 2nd input: function handle of the model (i.e. @model_mf)
% 3rd input: a prior struct. The size of mean should
% be equal to the number of parameters
prior_mf = struct('mean',zeros(6,1),'variance',6.25);
% 4th input: output file
% note that here the output is associated with subject 1
% we save all output files in the lap_subjects directory
fname_mf_subj = fullfile('lap_subjects','lap_mf_1.mat');
cbm_lap(data_subj, @model_mf, prior_mf, fname_mf_subj);

Warning: Directory already exists.
cbm_lap
18-Jun-2019 22:05:04
======================================================================
Number of samples: 1
Number of parameters: 6
Number of initializations: 42
----------------------------------------------------------------------
Subject: 01
done :]

After all jobs have finished, you should call cbm_lap_aggregate to aggregate the individual files:

In [29]:
% first make a list of lap_mf_* files:
fname_subjs = cell(20,1);
for n=1:length(fname_subjs)
    fname_subjs{n} = fullfile('lap_subjects',['lap_mf_' num2str(n) '.mat']);
end
fname_subjs

fname_subjs =
  20x1 cell array
    {'lap_subjects/lap_mf_1.mat' }
    {'lap_subjects/lap_mf_2.mat' }
    ...
    {'lap_subjects/lap_mf_20.mat'}

Now specify the final output file-address and call cbm_lap_aggregate:

In [30]:
fname_mf = 'lap_mf.mat';
cbm_lap_aggregate(fname_subjs,fname_mf);
% Running this command prints a report on your matlab output
% (e.g. on the command window)

Aggregation is done over 20 subjects :]

You see that lap_mf.mat is saved by cbm_lap_aggregate. Similarly, you can fit model_hybrid to data using cbm_lap, as in the sketch below; I did that and saved lap_hybrid.mat.
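The call for the hybrid model is analogous. This is a minimal sketch, assuming the same prior variance as above and the 7 parameters of model_hybrid (the variable names prior_hybrid and fname_hybrid are only illustrative):

prior_hybrid = struct('mean',zeros(7,1),'variance',6.25); % 7 parameters in the hybrid model
fname_hybrid = 'lap_hybrid.mat';
cbm_lap(data, @model_hybrid, prior_hybrid, fname_hybrid);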
Now that we have fitted the models to data using cbm_lap, we can run cbm_hbi. Note that you can configure the algorithm using optional inputs:

In [31]:
% 1st input: data for all subjects
fdata = load('all_data.mat');
data = fdata.data;
% 2nd input: a cell input containing function handles to models
models = {@model_hybrid, @model_mb, @model_mf};
% note that by handle, I mean @ before the name of the function
% 3rd input: another cell input containing file-addresses of files saved by cbm_lap
fcbm_maps = {'lap_hybrid.mat','lap_mb.mat','lap_mf.mat'};
% note that they correspond to models (so pay attention to the order)
% 4th input: a file-address for saving the output
fname_hbi = 'hbi_2step.mat';
cbm_hbi(data,models,fcbm_maps,fname_hbi);
% Running this command prints a report on your matlab output
% (e.g. on the command window)

Now, we look at the hbi_2step.mat file saved by cbm_hbi:

In [32]:
fname_hbi = load('hbi_2step.mat');
cbm = fname_hbi.cbm;

First, take a look at the model frequencies:

In [33]:
cbm.output.model_frequency

ans = 0.6531    0.3469    0.0000

As you see, the hybrid model takes about 65% of the responsibility. Now let's take a look at the exceedance probability:

In [34]:
cbm.output.exceedance_prob

ans = 0.9102    0.0897    0.0000

Note that for computing the protected exceedance probability, the HBI should be re-run under the (prior) null hypothesis using cbm_hbi_null. Next, we look at the parameters of the hybrid model:

In [35]:
cd(fullfile('..','example_2step_task'))
% 1st input is the file-address of the file saved by cbm_hbi
fname_hbi = 'hbi_2step.mat';
% 2nd input: a cell input containing model names
model_names = {'Hybrid', 'MB', 'MF'};
% note that they correspond to models (so pay attention to the order)
% 3rd input: another cell input containing parameter names of the winning model
param_names = {'\alpha_1','\alpha_2','\lambda','\beta_1','w','p','\beta_2'};
% 4th input: another cell input containing the transformation function associated with each parameter
transform = {'sigmoid','sigmoid','sigmoid','exp','sigmoid','none','exp'};
% note that no transformation is applied to parameter p (i.e. perseveration) in the hybrid model
cbm_hbi_plot(fname_hbi, model_names, param_names, transform)
% this function creates a model comparison plot (exceedance probability and model frequency)
% and a plot of the transformed parameters of the most frequent model.

plotting the group parameters of the most frequent model
There is no protected exceedance probability as cbm_hbi_null has not been executed
Plotting exceedance probability instead...

A critical parameter of the hybrid model is the weight parameter, which indicates the degree to which choices are influenced by model-based versus model-free values. Since the weight parameter is also constrained to be between 0 (i.e. pure model-free) and 1 (i.e. pure model-based), the corresponding normally distributed parameter is transformed in model_hybrid using the sigmoid function.

Similar to a t-test, you can use the hierarchical errorbars to make an inference about a parameter at the population level. For example, suppose you are interested in testing whether subjects show significantly more model-based than model-free behavior. In terms of the hybrid model, this can be examined by testing whether the (transformed) weight parameter is significantly different from 0.5 (which indicates an equal contribution of model-based and model-free values). Since the sigmoid function transforms 0 to 0.5, we should test whether the normally distributed weight parameter is significantly different from 0.
cbm_hbi_ttest performs this inference according to a Student's t-distribution:

In [36]:
% 1st input: the fitted cbm by cbm_hbi
fname_hbi = 'hbi_2step';
% 2nd input: the index of the model of interest in the cbm file
k = 1; % as the hybrid is the first model fed to cbm_hbi
% 3rd input: the test will be done against this value (i.e. the mean under the null hypothesis)
m = 0; % here the weight parameter should be tested against m=0
% 4th input: the index of the parameter of interest
i = 5; % here the weight parameter is the 5th parameter of the hybrid model
[p,stats] = cbm_hbi_ttest(cbm,k,m,i)

p = 0.1667
stats =
  struct with fields:
    tstat: -1.4585
     pval: 0.1667
       df: 14.0626

We see that the p-value is not smaller than 0.05, so there is no significant evidence that the weight is different from 0. In other words, there is no significant evidence that subjects are more influenced by model-based than by model-free values.

As another example, let's see whether the perseveration parameter is significantly different from 0 (parameter p in the above plot). This parameter indicates whether subjects repeat their choices (or avoid doing so if p<0) regardless of the estimated values. This parameter has not been transformed, so the test should also be against m=0.

In [37]:
% 1st input: the fitted cbm by cbm_hbi
fname_hbi = 'hbi_2step';
% 2nd input: the index of the model of interest in the cbm file
k = 1; % as the hybrid is the first model fed to cbm_hbi
% 3rd input: the test will be done against this value (i.e. the mean under the null hypothesis)
m = 0; % here the perseveration parameter should be tested against m=0
% 4th input: the index of the parameter of interest
d = 6; % here the perseveration parameter is the 6th parameter of the hybrid model
[p,stats] = cbm_hbi_ttest(cbm,k,m,d)

p = 6.1737e-07
stats =
  struct with fields:
    tstat: 8.5369
     pval: 6.1737e-07
       df: 14.0626

Therefore, the perseveration parameter is significantly larger than zero (p<0.001).

In this example, the model-free model does not take any responsibility. In this situation, the HBI assigns prior values to the individual parameters (with a large variance). This prior value is 0 before transformation. Therefore, for parameters bounded in the unit range (e.g. a learning rate), this prior value (after transformation) is 0.5, for a decision noise parameter it is 1, and for the perseveration parameter it is 0.

In [38]:
% 1st input is the file-address of the file saved by cbm_hbi
fname_hbi = 'hbi_2step.mat';
% 2nd input: a cell input containing model names
model_names = {'Hybrid', 'MB', 'MF'};
% note that they correspond to models (so pay attention to the order)
% 3rd input: another cell input containing parameter names of the model to be plotted
param_names = {'\alpha_1','\alpha_2','\lambda','\beta_1','p','\beta_2'};
% 4th input: another cell input containing the transformation function associated with each parameter
transform = {'sigmoid','sigmoid','sigmoid','exp','none','exp'};
% note that no transformation is applied to parameter p (i.e. perseveration)
% the 5th input selects the model whose parameters are plotted (here 3, the model-free model)
cbm_hbi_plot(fname_hbi, model_names, param_names, transform, 3)
% this function creates a model comparison plot (exceedance probability and model frequency)
% and a plot of the transformed parameters of the selected model.

There is no protected exceedance probability as cbm_hbi_null has not been executed
Plotting exceedance probability instead...

2.1 Reference

If you use cbm, please cite this paper:

Piray P, Dezfouli A, Heskes T, Frank MJ, Daw ND. Hierarchical Bayesian inference for concurrent model fitting and comparison for group studies.
PLoS Computational Biology, 2019. https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007043

For a more formal description of the HBI algorithm, please see this paper.