
1 Introduction to the model

The Interference Measurement Model (IMM) is a measurement model for continuous reproduction tasks in the domain of visual working memory. The model was introduced by Oberauer et al. (2017). The aim of the IMM is to capture response behavior in continuous reproduction tasks, including the occurrence of swap errors to other items encoded in visual working memory.

The IMM assumes that during retrieval, to-be-remembered items or features (e.g., colors, orientations, or shapes) are associated with the context they appeared in (e.g., their spatial location). These associations can vary continuously in strength and represent bindings between the contents and contexts of to-be-remembered information (see Figure 1.1). At retrieval, different sources of activation contribute to the activation of the to-be-retrieved contents. Background noise (b) uniformly activates all possible responses, for example all 360 colors that participants can choose from in a color wheel experiment. Cue-independent activation (a) equally activates all features that were encoded into visual working memory. Cue-dependent activation (c) activates the features that are associated with the current retrieval cue (e.g., the spatial location cued to be retrieved). Additionally, the IMM assumes that cue-dependent activation follows a generalization gradient (s), so that contexts similar to the cued context also receive activation.


Figure 1.1: Illustration of the IMM from Oberauer et al. (2017)

The activation for each potential feature \(x\) that could be retrieved is the sum of the weighted activation from all three activation sources, given a retrieval cue \(L\) at the location \(\theta\):

\[ A(x|L_\theta) = b \times A_b(x) + a \times A_a(x) + c \times A_c(x|L_\theta) \]

The background activation (\(A_b\)) is independent of all encoded features and is therefore modeled as a uniform distribution over the circular feature space. This is implemented as a von Mises (vM) distribution centered on 0 with a precision \(\kappa = 0\):

\[ A_b(x) = vM(x,0,0) \]

The cue-independent activation (\(A_a\)) is modeled as the sum of von Mises distributions centered on the feature values \(x_i\) of all \(n\) encoded items \(i\), with the precision of memory \(\kappa\):

\[ A_a(x) = \sum^n_{i = 1} vM(x,x_i,\kappa) \]

The cue-dependent activation (\(A_c\)) is likewise modeled as the sum of von Mises distributions centered on the feature values \(x_i\) of all \(n\) encoded items \(i\), with the precision of memory \(\kappa\). These distributions are weighted by the spatial distance \(D\) between the context \(L_i\) an item was associated with and the cued context \(L_\theta\). This distance is scaled by the generalization gradient \(s\), which captures the specificity of bindings to the cue dimension:

\[ A_c(x|L_\theta) = \sum^n_{i = 1} e^{-s \cdot D(L_i,L_\theta)} \times vM(x,x_i,\kappa) \]
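
To make the three activation sources concrete, here is a minimal toy sketch of these equations in R. It is purely illustrative and not the bmm implementation; the function names (dvm, imm_activation) and the example item values are made up.

# toy sketch of the IMM activation equations (illustrative, not the bmm implementation)
dvm <- function(x, mu, kappa) {
  # von Mises density; with kappa = 0 this reduces to the uniform density 1 / (2 * pi)
  exp(kappa * cos(x - mu)) / (2 * pi * besselI(kappa, nu = 0))
}

imm_activation <- function(x, items, distances, a, b, c, s, kappa) {
  A_b <- dvm(x, 0, 0)                                      # background noise
  A_a <- sum(dvm(x, items, kappa))                         # cue-independent activation
  A_c <- sum(exp(-s * distances) * dvm(x, items, kappa))   # cue-dependent activation
  b * A_b + a * A_a + c * A_c
}

# example: a target at 0 rad and two non-targets; distances are to the cued location
imm_activation(x = 0.1, items = c(0, 1.5, -2.0), distances = c(0, 0.8, 2.5),
               a = 0.5, b = 0, c = 2, s = 2, kappa = 10)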

The probability of choosing each response \(\hat{x}\) then results from normalizing the activation over all \(N\) possible responses. In the original publication of the IMM this was done using Luce's choice rule:

\[ P(\hat{x}|L_\theta) = \frac{A(\hat{x}|L_\theta)}{\sum^N_{j=1}A(j|L_\theta)} \]

In the bmm package, we decided to implement an alternative normalization, the softmax:

\[ P(\hat{x}|L_\theta) = \frac{e^{A(\hat{x}|L_\theta)}}{\sum^N_{j=1}e^{A(j|L_\theta)}} \]

A comparison between these normalization functions in the context of activation-based models of working memory can be found in the appendix of Oberauer and Lewandowsky (2019). Additionally, a more recent factorial comparison of measurement models for visual working memory (Oberauer 2023) indicated that the softmax normalization generally captures the observed data better than Luce's choice rule in the context of continuous reproduction tasks.
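
As a small illustration of the difference, here is how the two normalization rules map the same hypothetical activation values onto response probabilities (toy numbers, not bmm output):

# toy comparison of Luce's choice rule and the softmax on the same activations
activations <- c(2.0, 1.2, 0.3, 0.3)                  # hypothetical activations of four responses

luce    <- activations / sum(activations)             # Luce's choice rule
softmax <- exp(activations) / sum(exp(activations))   # softmax normalization

round(rbind(luce, softmax), 3)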

In sum, the IMM assumes that responses in continuous reproduction tasks are the result of cue-based retrieval and cue-independent activation of all encoded features, corrupted by background noise.

2 Parametrization in the bmm package

For identification, one of the weighting parameters has to be fixed. In the original publication the strength of cue-dependent activation \(c\) was fixed to one. The default setup of brms, however, currently only allows fixing the strength of the background noise \(b\) to zero. Therefore, in all implementations of the IMM in the bmm package, the strengths of cue-dependent and cue-independent activation, \(c\) and \(a\), can be estimated and predicted by independent variables.

Apart from that, both the precision of memory representations \(\kappa\) and the generalization gradient \(s\) are parameterized the same way as in the original publication. Please note that the scaling of the generalization gradient \(s\) depends on the scaling of the distance \(D\) between the target location and the locations of the non-targets. In previous studies estimating the IMM (Oberauer et al. 2017) these distances were scaled in radians, as all items were placed on an imaginary circle around the center of the screen. Other studies might instead position the color patches randomly inside a frame of a certain width and height and thus use Euclidean distances. Likewise, changing the radius of the imaginary circle that color patches are placed on will change the absolute distances between items. All of this affects the absolute size of the generalization gradient \(s\). Thus, differences in the generalization gradient \(s\) between studies should not be interpreted strongly, especially if the studies used different distance measures or placed the items differently. For now, we recommend that only differences in the generalization gradient \(s\) between conditions of a single experiment be taken as robust results.
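
For example, if item locations are coded as angles, the radian distances used by Oberauer et al. (2017) can be computed with a small helper like the one below (the function name circ_dist is ours, not part of bmm):

# angular distance in radians between two locations on a circle (wrapped to [0, pi])
circ_dist <- function(theta1, theta2) {
  abs(atan2(sin(theta1 - theta2), cos(theta1 - theta2)))
}

# items at 170 and -170 degrees are close on the circle, not far apart
circ_dist(170 * pi / 180, -170 * pi / 180)  # ~0.35 rad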

Additionally, because we use the softmax normalization for translating activation into probabilities, the estimates for the strength of cue-dependent and cue-independent activation, \(c\) and \(a\), have to be interpreted relative to the strength of the baseline activation \(b\), which is fixed to zero. Thus, it is possible for \(c\) and \(a\) to become negative. This does not reflect an absolute negative activation but rather an activation that is smaller than the baseline activation.
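
To see why one weight has to be fixed under the softmax, note that adding a constant to the activation of every response (which is what changing \(b\) does, since the background activation is flat over responses) leaves the response probabilities unchanged. A minimal numeric check, with made-up activation values:

softmax <- function(x) exp(x) / sum(exp(x))

A <- c(1.8, 0.6, 0.2, 0.2)   # hypothetical activations of four responses
round(softmax(A), 3)
round(softmax(A + 5), 3)     # shifting all activations by a constant changes nothing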

3 Fitting the model with bmm

You should start by loading the bmm package:
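
library(bmm)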

3.1 Generating simulated data

Should you already have a data set you want to fit, you can skip this section. Alternatively, you can use data provided with the package (see data(package='bmm')) or generate data using the random generation function provided in the bmm package.

# set seed for reproducibility
set.seed(123)

# specify generating parameters
Cs <- c(4,4,2,2)
As <- c(0.5,1,0.5,0.5)
Ss <- c(10,10,5,5)
kappas <- c(15,10,15,10)
nTrials <- 1000
set_size <- 5

simData <- data.frame()
for (i in 1:length(Cs)) {
  # generate different non-target locations for each condition
  item_location <- c(0, runif(set_size - 1, -pi,pi))
  
  # generate different distances for each condition
  item_distance <- c(0, runif(set_size - 1, min = 0.1, max = pi))
  
  # simulate data for each condition
  genData <- rimm(n = nTrials,
                  mu = item_location,
                  dist = item_distance,
                  c = Cs[i], a = As[i],
                  b = 0, s = Ss[i], kappa = kappas[i])
  
  condData <- data.frame(
    resp_error = genData,
    trialID = 1:nTrials,
    cond = i,
    color_item1 = 0,
    dist_item1 = 0
  )
  
  init_colnames <- colnames(condData)
  
  for (j in 1:(set_size - 1)) {
    condData <- cbind(condData,item_location[j + 1])
    condData <- cbind(condData,item_distance[j + 1])
  }
  
  colnames(condData) <- c(init_colnames,
                          paste0(rep(c("color_item","dist_item"),times = set_size - 1), 
                                 rep(2:(set_size),each = 2)))
  
  simData <- rbind(simData,condData)
}

# convert condition variable to a factor

simData$cond <- as.factor(simData$cond)
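
Before fitting, it can be useful to take a quick look at the simulated data to confirm it has the expected structure:

# inspect the first rows of the simulated data
head(simData)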

3.2 Estimating the model with bmm

To estimate the IMM we first need to specify a formula. We specify the formula using the bmmformula function (or its short form bmf). In this formula, we directly estimate all parameters for each of the four conditions. Unlike for brmsformula, we do not provide the dependent variable in this formula:

model_formula <- bmf(
  c ~ 0 + cond,
  a ~ 0 + cond,
  s ~ 0 + cond,
  kappa ~ 0 + cond
)

Then, we can specify the model that we want to estimate. This includes specifying the name of the variable containing the dependent variable, resp_error in our simulated data set. Additionally, we need to provide the names of the variables containing the non-target feature values (nt_features), the names of the variables coding the spatial distances of the non-targets to the target (nt_distances), and the set_size used in our data. The set_size can either be a fixed integer, if there is only one set_size in your data, or the name of the variable coding the set_size in your data:

model <- imm(resp_error = "resp_error",
             nt_features = paste0("color_item",2:5),
             set_size = set_size,
             nt_distances = paste0("dist_item",2:5),
             version = "full")

In the above example we specified all column names for the non-targets explicitly via paste0('color_item',2:5). Alternatively, you can use regular expressions to match the non-target feature and distance columns in the data set by setting regex = TRUE:

model <- imm(resp_error = "resp_error",
             nt_features = "color_item[2-5]",
             set_size = set_size,
             nt_distances = "dist_item[2-5]",
             regex = TRUE)

By default the imm() function implements the full IMM. You can specify a reduced model by setting the version argument to "abc" or "bsc" (see ?imm).
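
For example, a reduced model without the generalization gradient could be specified along these lines (a sketch; see ?imm for the exact arguments each version requires):

model_abc <- imm(resp_error = "resp_error",
                 nt_features = "color_item[2-5]",
                 set_size = set_size,
                 regex = TRUE,
                 version = "abc")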

Finally, we can fit the model by passing all the relevant arguments to the bmm() function:

fit <- bmm(
  formula = model_formula,
  data = simData,
  model = model,
  cores = 4,
  backend = "cmdstanr"
)

Running this model takes about 2 to 5 minutes (depending on the speed of your computer). Here we load an already saved fit object, but on your machine you will have to wait until the model finishes sampling. Both brms and cmdstanr typically print out some information on the sampling progress.

fit <- bmm:::bmmfit_imm_vignette

Using this fit object we can have a quick look at the summary of the fitted model:

summary(fit)
  Model: imm(resp_error = "resp_error",
             nt_features = c("color_item2", "color_item3", "color_item4", "color_item5"),
             nt_distances = c("dist_item2", "dist_item3", "dist_item4", "dist_item5"),
             set_size = 5, version = "full") 
  Links: mu1 = tan_half; kappa = log; a = identity; c = identity; s = log 
Formula: mu1 = 0
         kappa ~ 0 + cond
         a ~ 0 + cond
         c ~ 0 + cond
         s ~ 0 + cond 
   Data: simData (Number of observations: 4000)
  Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
         total post-warmup draws = 4000

Regression Coefficients:
            Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
kappa_cond1     2.63      0.05     2.54     2.73 1.00     6083     2959
kappa_cond2     2.32      0.05     2.23     2.42 1.00     7319     2917
kappa_cond3     2.69      0.06     2.57     2.80 1.00     5563     2675
kappa_cond4     2.37      0.06     2.26     2.49 1.00     5249     3407
a_cond1         1.56      0.58     0.53     2.76 1.00     5249     3194
a_cond2         0.78      0.67    -0.44     2.18 1.00     4622     2935
a_cond3         0.86      0.39     0.17     1.65 1.00     5170     2595
a_cond4         0.74      0.36     0.11     1.48 1.00     4076     2881
c_cond1         3.67      0.16     3.39     4.02 1.00     3570     1866
c_cond2         4.40      0.26     3.97     5.02 1.00     3055     2170
c_cond3         1.91      0.08     1.77     2.07 1.00     5476     2970
c_cond4         2.05      0.11     1.86     2.28 1.00     3991     2363
s_cond1         0.94      0.61    -0.06     2.32 1.00     2337     1944
s_cond2         1.78      0.47     1.02     2.82 1.00     2803     2751
s_cond3         1.79      0.45     1.13     2.88 1.00     3945     2300
s_cond4         1.20      0.54     0.36     2.44 1.00     3145     2427

Constant Parameters:
                  Value
mu1_Intercept      0.00

Draws were sampled using sampling(NUTS). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).

The first thing you might notice is that, below the formula that was passed to the bmm() function, bmm has added a number of additional specifications needed to implement the IMM. You do not have to check these. But if you are interested in customizing and exploring different assumptions imposed by the IMM, you could start by taking this formula and adapting it accordingly.

Next, we can have a look at the estimated parameters. The first thing we should check is whether the sampling converged; this is indicated by all Rhat values being close to one. If you want to inspect the sampling in more detail, you can use the functionality implemented in brms to do this.
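
For example, since the fit object behaves like a brmsfit, the usual brms tools can be used for a quick check (a minimal sketch):

plot(fit)         # trace plots and marginal posterior densities
brms::rhat(fit)   # Rhat values for all parameters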

The parameter estimates for c and a are already on their native scale, but both s and kappa are estimated using a log link function, so we have to transform them back to the native scale.

fixedFX <- brms::fixef(fit)

# print posterior means for the c parameter
fixedFX[startsWith(rownames(fixedFX),"c_"),]
#>         Estimate  Est.Error     Q2.5    Q97.5
#> c_cond1 3.670572 0.16084115 3.394321 4.020778
#> c_cond2 4.401691 0.26487921 3.969000 5.023963
#> c_cond3 1.913522 0.07632643 1.768573 2.065081
#> c_cond4 2.050996 0.10616580 1.856427 2.276359

# print posterior means for the a parameter
fixedFX[startsWith(rownames(fixedFX),"a_"),]
#>          Estimate Est.Error       Q2.5    Q97.5
#> a_cond1 1.5602995 0.5754700  0.5318122 2.755671
#> a_cond2 0.7796907 0.6711383 -0.4447393 2.181122
#> a_cond3 0.8583318 0.3851788  0.1650280 1.650582
#> a_cond4 0.7404941 0.3554419  0.1097129 1.484371

# print posterior means for the s parameter
exp(fixedFX[grepl("s_",rownames(fixedFX)),])
#>         Estimate Est.Error      Q2.5    Q97.5
#> s_cond1 2.563880  1.834421 0.9425244 10.20855
#> s_cond2 5.932295  1.603699 2.7644132 16.79831
#> s_cond3 5.987427  1.567627 3.0849707 17.89819
#> s_cond4 3.317288  1.721499 1.4403183 11.47896

# print posterior means for the kappa parameter
exp(fixedFX[grepl("kappa_",rownames(fixedFX)),])
#>             Estimate Est.Error      Q2.5    Q97.5
#> kappa_cond1 13.94268  1.051448 12.670659 15.32498
#> kappa_cond2 10.21227  1.049288  9.281720 11.21525
#> kappa_cond3 14.68647  1.058859 13.110356 16.39623
#> kappa_cond4 10.74340  1.061760  9.548517 12.07860

These results indicate that all parameters, except for s, were well recovered. As already noted by Oberauer et al. (2017), a good recovery of the generalization gradient s requires a lot of data. Thus, you might consider opting for the simplified version of the IMM without the s parameter (version = "abc").

We can illustrate the recovery of the data generating parameters by plotting the full posterior distributions alongside the data generating parameters. For this we need to extract the posterior draws using the tidybayes package and include the data generating parameters into the plots of the posteriors.

library(tidybayes)
library(dplyr)
library(tidyr)
library(ggplot2)

# extract the posterior draws
draws <- tidybayes::tidy_draws(fit)
draws <- select(draws, starts_with("b_")) %>% select(-(1:3)) %>% 
  mutate_at(vars(starts_with("b_s")),exp) %>% 
  mutate_at(vars(starts_with("b_kappa")),exp)

# plot posteriors with the data generating parameters overlaid as red diamonds
as.data.frame(draws) %>% 
  gather(par, value) %>% 
  ggplot(aes(value, par)) +
  tidybayes::stat_halfeye(normalize = "groups") +
  geom_point(data = data.frame(par = colnames(draws),
                               value = c(kappas, As, Cs, Ss)),
             aes(value, par), color = "red",
             shape = "diamond", size = 2.5) +
  scale_x_continuous(limits = c(-1.5, 20))


colnames(draws)
#>  [1] "b_kappa_cond1" "b_kappa_cond2" "b_kappa_cond3" "b_kappa_cond4"
#>  [5] "b_a_cond1"     "b_a_cond2"     "b_a_cond3"     "b_a_cond4"    
#>  [9] "b_c_cond1"     "b_c_cond2"     "b_c_cond3"     "b_c_cond4"    
#> [13] "b_s_cond1"     "b_s_cond2"     "b_s_cond3"     "b_s_cond4"

References

Oberauer, Klaus. 2023. “Measurement Models for Visual Working Memory—a Factorial Model Comparison.” Psychological Review 130 (3): 841–52. https://doi.org/10.1037/rev0000328.
Oberauer, Klaus, and Stephan Lewandowsky. 2019. “Simple Measurement Models for Complex Working-Memory Tasks.” Psychological Review 126 (6): 880–932. https://doi.org/10.1037/rev0000159.
Oberauer, Klaus, Colin Stoneking, Dominik Wabersich, and Hsuan-Yu Lin. 2017. “Hierarchical Bayesian Measurement Models for Continuous Reproduction of Visual Features from Working Memory.” Journal of Vision 17 (5): 11. https://doi.org/10.1167/17.5.11.