Working memory allows us to retain visual information to guide upcoming behavior. In line with this future-oriented purpose of working memory, recent studies have shown that action planning occurs during encoding and retention of a single visual item, for which the upcoming action is certain. We asked whether and how this extends to multi-item visual working memory, when visual representations serve the potential future. Human participants performed a visual working-memory task with a memory-load manipulation (one/two/four items) and a delayed orientation-reproduction report (of one item). We measured EEG to track 15- to 25-Hz beta activity in electrodes contralateral to the required response hand, a canonical marker of action planning. We show an attenuation of beta activity, not only in Load 1 (with one certain future action) but also in Load 2 (with two potential future actions), compared with Load 4 (with low prospective-action certainty). Moreover, in Load 2, potential action planning occurs regardless of whether both visual items afford similar or dissimilar manual responses, and it predicts the speed of ensuing memory-guided behavior. This shows that potential action planning occurs during multi-item visual working memory and brings the perspective that working memory helps us prepare for the potential future.
Working memory allows us to hold onto visual information to prepare for and guide potential future action (van Ede, 2020; Nobre & Stokes, 2019; Rainer, Rao, & Miller, 1999; Baddeley, 1992; Fuster & Alexander, 1971). For example, when a football player breaks through the defense line, the player may look to see where the left- and right-wing attackers are located. This information is retained in memory as the player sprints toward the goal and prepares to potentially pass the ball to either attacker, depending on the development of the attack. In this example, visual information is retained in working memory in anticipation of multiple potential future actions. In this way, visual working memory allows for flexible behavior—being prepared for multiple potential future actions offers a way to deal with uncertainty in a dynamically unfolding environment (Cisek & Kalaska, 2010).
A vast body of research into visual working memory has provided a detailed understanding of the mechanisms of encoding and retention (e.g., Schneegans & Bays, 2017; Serences, 2016; D'Esposito & Postle, 2015; Luck & Vogel, 2013; Harrison & Tong, 2009). Ultimately, working memory serves as a bridge between perception and upcoming action. Therefore, it is also important to consider the role of potential action planning alongside encoding and retention in visual working memory (Heuer, Ohl, & Rolfs, 2020; Olivers & Roelfsema, 2020; van Ede, 2020; Myers, Stokes, & Nobre, 2017). Two recent EEG studies provide evidence that action planning and visual retention can co-occur during working memory (Boettcher, Gresch, Nobre, & van Ede, 2021; Schneider, Barth, & Wascher, 2017). At least, they show that this occurs during encoding and retention of a single visual item, for which the upcoming action can be fully predetermined in advance.
From the literature of motor planning research, the concept of parallel action planning proposes that we often plan multiple potential actions in parallel, even before selecting the relevant action for implementation (Gallivan, Barton, Chapman, Wolpert, & Flanagan, 2015; 2016; Grent-'t-Jong, Oostenveld, Jensen, Medendorp, & Praamstra, 2013; Cisek, 2007; Cisek & Kalaska, 2005). A recent study tentatively suggests that multiple potential actions may also be planned alongside visual working memory. When two visual items are retained in visual working memory, and one of the two items is probed for action, visual and motor representations are selected concurrently after the memory delay (van Ede, Chekroud, Stokes, & Nobre, 2019). However, it is yet to be demonstrated how the planning of multiple potential actions unfolds during the memory delay, alongside the encoding and retention of more than one visual item in working memory.
In this study, we used EEG to address whether and how multiple potential actions are planned alongside the encoding and retention of multiple visual items in working memory. We envisioned two hypothetical scenarios. In the one-or-none scenario (Figure 1C), action planning may occur alongside visual working memory only when we retain one visual item for which the required action is known. In this case, one would expect a relative attenuation of EEG-beta activity (a canonical marker of action planning; van Wijk, Daffertshofer, Roach, & Praamstra, 2009; Neuper, Wörtz, & Pfurtscheller, 2006; Mcfarland, Miner, Vaughan, & Wolpaw, 2000; Salmelin & Hari, 1994) during the memory delay only when we retain one visual item. Alternatively, in the graded scenario (Figure 1D), action planning may occur alongside visual working memory, even when we retain multiple visual items in anticipation of multiple potential actions. In this case, one would expect to observe an attenuation of beta activity that depends on the number of action possibilities, as has previously been reported in studies considering action planning without simultaneous item retention in visual working memory (Tzagarakis, West, & Pellizzer, 2015, 2021; Tzagarakis, Ince, Leuthold, & Pellizzer, 2010).
To preview our results, we show that: (i) action planning of multiple potential actions co-occurs with visual retention of multiple visual items; (ii) this effect occurs regardless of whether potential actions require a similar or dissimilar manual response; (iii) the degree to which actions are planned during the memory delay is predictive of the speed of memory-guided action afterward.
Twenty-five healthy human adults (mean age 25.32 years, SD = 4.27 years, sex of the participants is unknown, four left-handed) participated in the experiment. All participants had normal or corrected-to-normal vision. None of the participants were excluded from the analyses. The experiment was approved by the Central University Research Ethics Committee of the University of Oxford. Participants provided written informed consent before participating in the study. They received monetary compensation of £10 per hour after participation.
Experimental Design and Procedure
Participants performed a visual working-memory task with a delayed orientation-reproduction report (Figure 1A). A blocked memory-load manipulation was implemented in the task. To achieve this, each block of trials was preceded by an instruction display (block-wise precue) that indicated the relevant color(s) of that block (Figure 1B). Participants were asked to encode and retain only the bars of the instructed color(s). In Load 1, participants were required to retain the orientation of the one bar whose color matched that of the instruction cue; in Load 2, participants were required to retain the orientations of the two bars whose colors matched the instruction cue; in Load 4, participants were required to retain the orientations of all four colored bars. In all cases, four bars were presented on the screen at encoding, thereby controlling for bottom–up stimulus differences between the load conditions.
The encoding display was followed by a memory delay (1500 msec) in which only the fixation cross remained on the screen. After the delay, a response dial was presented on the screen (as in the work of van Ede, Niklaus, & Nobre, 2017). This dial consisted of a gray circle with two smaller circles (or handles), positioned opposite each other on the larger circle, that together represented an orientation. The color of the two handles indicated which bar orientation should be reproduced. The color was always chosen randomly from the relevant colors in that block.
The position of the two handles could be adjusted by moving the computer mouse in the direction of the orientation of the cued item. When participants moved the computer mouse in a certain direction, the response dial handles automatically turned to match this direction, regardless of how they were positioned before movement onset. This linked each memorized orientation to a specific manual action, independent of the dial's starting position. Participants confirmed their response by making a left-side mouse click. For consistency, the mouse was always controlled with the right hand (even in the few participants who preferred their left hand). The initial orientation of the response dial was randomly varied on a trial-by-trial basis.
Participants had an unlimited amount of time to initiate their response after the response dial had been presented on the screen. Once their response was initiated, participants had a maximum of 2500 msec to confirm their orientation-reproduction report with a left-side mouse click. A visualization of an hourglass was presented under the response dial to indicate the amount of time that had passed. After response confirmation, participants received feedback on their orientation-reproduction precision. If the absolute difference in orientation between the response and the target was smaller than 15°, the fixation cross turned green; if the absolute difference was larger than 15°, or if the response deadline had passed, the fixation cross turned red. The intertrial interval was randomly varied between 500 and 800 msec.
Participants were seated at a viewing distance of 90 cm from the computer screen. The bars had a diameter of 5.7° visual angle and were centered at a 5.7° visual angle distance from fixation. Every encoding display contained four colored oriented bars (colors given as red, green, blue [RGB] values; green: 0, 210, 63; blue: 0, 128, 255; orange: 255, 127, 39; purple: 238, 0, 238) that were presented on a gray background (RGB: 25, 25, 25) for 500 msec. The relevant colors indicated in the block-wise precue were randomly chosen. Color locations and bar orientations were both randomly chosen on a trial-by-trial basis.
Preceding the main task, participants performed one practice block of 20 trials for each memory-load condition (i.e., 60 trials in total). During the main task, participants performed two consecutive sessions with a 10- to 15-min break in between. Each session contained 10 blocks of 20 trials for each memory-load condition (i.e., 2 × 10 × 20 × 3 = 1200 trials in total).
Load conditions were always grouped into three consecutive subblocks of 20 trials each, with load conditions 1, 2, and 4 occurring in random order. After every 60 trials, participants were prompted to have a self-paced break. Participants 1 and 2 performed 12 blocks during each session (i.e., 1440 trials in total). After realizing that this number of trials took a considerable amount of time, we reduced the number of blocks per session from 12 to 10 from Participant 3 onward.
Behavioral Data Analyses
All behavioral analyses were performed in R (R Core Team, 2020). The variables of interest for the behavioral analyses were absolute error (in degrees) and decision time (in msec). Absolute error was defined as the absolute difference between the reported and the target orientation. Decision time was defined as the time between the onset of the response dial and the initiation of the mouse response (as in the work of van Ede et al., 2017, 2019).
Before turning to the main analyses, behavioral data were cleaned by removing outlier decision times. First, trials with decision times smaller than 200 msec or larger than 5000 msec were excluded from further analyses. Next, for each participant, trials were excluded with decision times larger than the mean plus 2.5 times the standard deviation. Means and standard errors of the variables of interest were calculated for each participant and memory load using the Rmisc package (Hope, 2013), and visualized using the ggplot2 package (Wickham, 2016). Two one-way repeated-measures ANOVAs were performed to statistically evaluate the effect of memory load on the variables of interest. For each memory-load comparison, and each variable of interest, post hoc comparisons were done using the Tukey honest significant difference test.
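The two-step exclusion of outlier decision times can be illustrated with a minimal Python sketch. This is not the R code used for the reported analyses: the function name, the inclusive bounds, the use of the sample standard deviation, and the choice to compute the mean and standard deviation after the first exclusion step are all assumptions made for illustration.

```python
from statistics import mean, stdev


def clean_decision_times(times_ms, lower_ms=200.0, upper_ms=5000.0, n_sd=2.5):
    """Return the decision times of one participant that survive both steps.

    Assumptions (not specified in the text): bounds are inclusive, the
    sample SD is used, and mean/SD are computed after step 1.
    """
    # Step 1: drop trials faster than 200 msec or slower than 5000 msec.
    kept = [t for t in times_ms if lower_ms <= t <= upper_ms]
    if len(kept) < 2:
        return kept  # an SD cannot be computed from fewer than two trials
    # Step 2: drop trials slower than the participant mean + 2.5 SD.
    cutoff = mean(kept) + n_sd * stdev(kept)
    return [t for t in kept if t <= cutoff]
```

Applied per participant, the function returns the retained decision times in their original trial order.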
EEG Acquisition and Analyses
EEG was measured using Synamps amplifiers and Neuroscan acquisition software (Compumedics Neuroscan), using the standard 10–10 System 64 electrode setup. The left mastoid was used as an active reference. The ground electrode was placed on the left upper arm. During acquisition, the data were low-pass filtered with a 250-Hz cutoff and sampled at 1000 Hz.
All EEG analyses were performed in MATLAB (2020b; The MathWorks, 2020) using the FieldTrip toolbox (Oostenveld, Fries, Maris, & Schoffelen, 2011; https://fieldtriptoolbox.org). After acquisition, data were rereferenced to an average of the left and right mastoids. Then, 50-Hz noise was filtered using a dft filter, and the data were down-sampled to 200 Hz. The data were epoched from −1000 to 3000 msec, relative to memory encoding onset. Independent component analysis (ICA) was used to correct for eye-movement artifacts. The appropriate ICA components used for artifact rejection were identified by correlating the time-courses of the ICA components with those of the measured horizontal and vertical EOG. After blink correction, the FieldTrip function ft_rejectvisual was used to visually assess which trials had exceptionally high variance, which were marked for rejection. Trials that had been marked as too fast or too slow (as described in the Behavioral Data Analyses section) were also rejected from further analyses. A surface Laplacian transform was applied to increase the spatial resolution of the central 15- to 25-Hz beta signal of interest (as in the work of van Ede et al., 2019).
Channel and Frequency-Band Selection
For all reported analyses, channel and frequency-band selections were predetermined. To investigate motor activation contralateral to the hand used for reporting (i.e., the right hand), we focused on EEG activity in channel C3 (i.e., a canonical EEG channel over the left motor cortex). To zoom in on the beta-band, we additionally extracted 15- to 25-Hz beta activity for all time course-based visualizations (although note that we always also statistically evaluated our data in the full time–frequency plane). These selections are in line with previous research (e.g., Boettcher et al., 2021; van Wijk et al., 2009; Neuper et al., 2006; Mcfarland et al., 2000; Salmelin & Hari, 1994) and were set a priori.
Time–frequency responses for the frequency-range from 3 to 40 Hz (in steps of 1 Hz) were obtained using the short-time Fourier transform. Data were Hanning-tapered with a sliding time window of 300 msec, progressing in steps of 50 msec. Time–frequency responses were contrasted for each memory-load comparison (Load 1 vs. 4; Load 1 vs. 2; Load 2 vs. 4), using a normalized subtraction to express load effects as a percentage change: ((a − b) / (a + b)) × 100. Time–frequency responses were averaged across participants in Channel C3. To focus on the delay period of interest, we considered all data in the time-window of −100 to 2500 msec (relative to memory encoding onset). For topographies, time–frequency responses were averaged for the predetermined beta frequency-band from 15 to 25 Hz, and visualized in three consecutive time-windows covering the full delay period: 500–1000 msec (i.e., the first 500 msec after encoding offset), 1000–1500 msec, 1500–2000 msec. To obtain beta time-courses, time–frequency responses were averaged over the 15- to 25-Hz band.
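The normalized subtraction and the subsequent averaging over the beta band can be written out as a short Python sketch. The function names and the list-of-lists layout (rows are frequencies, columns are time points) are hypothetical, and treating the band limits as inclusive is an assumption.

```python
def normalized_contrast(a, b):
    """Percentage change between two power values: ((a - b) / (a + b)) * 100."""
    return (a - b) / (a + b) * 100.0


def contrast_map(power_a, power_b):
    """Apply the contrast element-wise to two frequency-by-time power maps."""
    return [
        [normalized_contrast(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(power_a, power_b)
    ]


def band_average(tfr, freqs_hz, low_hz=15.0, high_hz=25.0):
    """Average the rows that fall inside the band to obtain one time course.

    Assumption: the band limits (15 and 25 Hz) are inclusive.
    """
    rows = [row for row, f in zip(tfr, freqs_hz) if low_hz <= f <= high_hz]
    return [sum(column) / len(rows) for column in zip(*rows)]
```

Because the difference is divided by the summed power, the contrast is bounded between -100% and +100%; negative values indicate lower power in condition a than in condition b, which is how a relative beta attenuation appears in this measure.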
Dependence on Orientation-Similarity
To assess whether the observed differences between Loads 2 and 4 depended on the item similarity in Load 2 trials, we also separately examined Load 2 trials in which the items were similar versus dissimilar. To this end, the absolute difference in orientation was calculated between the two relevant items in Load 2. Trials were marked as similar if this absolute difference was smaller than 45°; trials were marked as dissimilar if this absolute difference was larger than 45°. The previously described calculations of time–frequency responses, topographies, and time-courses were repeated for the following contrasts: Load 2-similar versus 4; Load 2-dissimilar versus 4; Load 2-similar versus 2-dissimilar.
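The similarity labeling can be sketched in Python as follows. The sketch assumes that bar orientations are axial (0° and 180° describe the same bar), so that the absolute difference ranges from 0° to 90°; this wrap-around is not stated explicitly in the text. The function names are hypothetical, and a difference of exactly 45° is left unlabeled because the text does not specify that case.

```python
def orientation_difference(a_deg, b_deg):
    """Absolute difference between two bar orientations, in degrees.

    Assumption: orientations are axial, so the result lies between 0 and 90.
    """
    diff = abs(a_deg - b_deg) % 180.0
    return min(diff, 180.0 - diff)


def label_similarity(a_deg, b_deg, threshold_deg=45.0):
    """Label a Load 2 trial from the orientations of its two relevant bars."""
    diff = orientation_difference(a_deg, b_deg)
    if diff < threshold_deg:
        return "similar"
    if diff > threshold_deg:
        return "dissimilar"
    return None  # exactly at the threshold: not specified in the text
```

Under the axial assumption, a 45° threshold divides the possible range of differences into two equally wide halves.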
Dependence on Decision Time
We also aimed to assess whether action planning—as reflected in EEG beta activity in C3—during the memory delay may have impacted decision times after the delay. To this end, trials in each memory load and participant were marked as fast or slow using a median split for decision times. We did this separately for each load condition and ran the previously described calculation of time–frequency responses, topographies, and time-courses for the following contrasts: Load 1 fast versus slow; Load 2 fast versus slow; Load 4 fast versus slow.
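A median split performed separately for each participant and memory load can be sketched as below. The trial layout (a list of dictionaries with the keys `participant`, `load`, and `decision_time_ms`) is hypothetical, and assigning trials that fall exactly on the median to the slow group is an assumption.

```python
from collections import defaultdict
from statistics import median


def median_split_labels(trials):
    """Return 'fast' or 'slow' for every trial, in the original trial order.

    The median is computed separately for each (participant, load) cell.
    Assumption: a decision time equal to the median is labeled 'slow'.
    """
    # Collect the trial indices that belong to each participant-load cell.
    cells = defaultdict(list)
    for index, trial in enumerate(trials):
        cells[(trial["participant"], trial["load"])].append(index)

    labels = [None] * len(trials)
    for indices in cells.values():
        cut = median([trials[i]["decision_time_ms"] for i in indices])
        for i in indices:
            is_fast = trials[i]["decision_time_ms"] < cut
            labels[i] = "fast" if is_fast else "slow"
    return labels
```

Splitting within each cell means that fast and slow trials are defined relative to a participant's own decision times in that load, so overall speed differences between loads or participants do not determine the labels.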
Cluster-based permutations (Maris & Oostenveld, 2007) were performed for the statistical evaluation of the above-described EEG contrasts for load, orientation-similarity, and behavior. This nonparametric approach (or Monte Carlo method) offers a solution for the multiple-comparisons problem in the statistical evaluation of EEG data, which, in our case, included a sizeable number of time–frequency comparisons. It does so by reducing the data to a single metric (e.g., the largest cluster of neighboring data points that exceed a certain threshold) and evaluating this (in the full data space under consideration) against a single randomly permuted empirical null distribution. Cluster-based permutations were performed on the time–frequency responses (considering clusters in time and frequency) and time-courses (considering clusters in time) of each above-described contrast, using 10,000 permutations, and an alpha level of 0.025.
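The logic of a cluster-based permutation test on a time-course can be illustrated with a simplified Python sketch. It is not the FieldTrip implementation used for the reported analyses. It assumes a within-participant contrast (one difference time-course per participant), random sign flipping as the permutation scheme, summed t-values as the cluster metric, and a cluster-forming threshold supplied by the caller; none of these details are specified in the text, and all function names are hypothetical.

```python
import math
import random


def t_values(diffs):
    """One-sample t-value against zero at every time point.

    diffs: one row per participant, one column per time point.
    """
    n = len(diffs)
    out = []
    for column in zip(*diffs):
        m = sum(column) / n
        var = sum((x - m) ** 2 for x in column) / (n - 1)
        # Zero variance gives no usable t-value; it is treated as 0 here.
        out.append(m / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def cluster_masses(tvals, threshold):
    """Summed t-values of contiguous same-sign runs with |t| above threshold."""
    masses, current, sign = [], 0.0, 0
    for t in tvals:
        s = (t > threshold) - (t < -threshold)  # +1, -1, or 0
        if s != 0 and s == sign:
            current += t  # extend the running cluster
        else:
            if sign != 0:
                masses.append(current)  # close the previous cluster
            current, sign = (t, s) if s != 0 else (0.0, 0)
    if sign != 0:
        masses.append(current)
    return masses


def cluster_permutation_test(diffs, threshold, n_permutations=10000, seed=0):
    """Return the observed cluster masses and their permutation p-values."""
    rng = random.Random(seed)
    observed = cluster_masses(t_values(diffs), threshold)
    null = []
    for _ in range(n_permutations):
        # Under the null hypothesis the sign of each participant's
        # difference is arbitrary, so it is flipped at random.
        flipped = []
        for row in diffs:
            sign = rng.choice((-1.0, 1.0))
            flipped.append([x * sign for x in row])
        masses = cluster_masses(t_values(flipped), threshold)
        # Keep only the largest absolute cluster mass of this permutation.
        null.append(max((abs(m) for m in masses), default=0.0))
    p_values = [sum(v >= abs(m) for v in null) / n_permutations for m in observed]
    return observed, p_values
```

Each observed cluster is compared with the distribution of the largest cluster found in the permuted data, so a single test covers all time points at once. Extending the sketch to time–frequency maps would only require defining neighbors in both time and frequency when forming clusters.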
Working Memory Performance Improves as a Function of Item Certainty
Before turning to the main EEG results, we characterized the effect of memory load on task performance (i.e., absolute error and decision time). With an increase in memory load, the absolute difference between the target orientation and the reported orientation (i.e., absolute error) significantly increases (Figure 2A; F(2, 72) = 52.02, p < .001). Post hoc comparisons revealed a significantly lower absolute error in Load 1 compared with Load 2 and Load 4, and in Load 2 compared with Load 4 (all p < .001). Similarly, the time it takes to initiate the mouse response (i.e., decision time) also significantly increases with an increase in memory load (Figure 2B; F(2, 72) = 20.77, p < .001). Post hoc comparisons revealed significantly faster decision times in Load 1 compared with Load 2 and Load 4 (both p < .001), and in Load 2 compared with Load 4 (p = .011). These effects of memory load on absolute errors and decision times can further be appreciated by a visualization of their respective density plots (Figure 2A, B, right).
These results confirm the effectiveness of the memory-load manipulation: Although in each memory-load condition four bars were always presented on the screen at encoding, one, two, or four bars were selectively retained in visual working memory, as instructed by the block-wise precue. Moreover, these results show that, with lower memory loads, participants' orientation-reproduction reports are initiated faster and are more precise. With lower memory loads, there is a higher certainty about which item will be probed and, therefore, a higher certainty about the required action. Hence, faster response initiation with lower memory loads might at least partly be accompanied by an increase in action planning during the working-memory delay. Next, we will present neural evidence for this idea. Critically, we will show that this holds not only when comparing Load 1 with Loads 2 and 4, but also when comparing Load 2 with Load 4, even though, in both of these conditions, the prospective action is unknown during the memory delay.
Planning Multiple Potential Actions alongside Visual Working Memory
We now turn to our central question: whether and how multiple potential actions are planned alongside multi-item visual encoding and retention in working memory. To investigate this, we used EEG to track 15- to 25-Hz beta attenuation in electrodes contralateral to the response hand (i.e., C3), a canonical neural marker of action planning (e.g., Boettcher et al., 2021; van Wijk et al., 2009; Neuper et al., 2006; Mcfarland et al., 2000; Salmelin & Hari, 1994). To disentangle the one-or-none scenario (Figure 1C; i.e., action planning only occurs alongside visual working memory during retention of a single visual item for which the required computer mouse movement direction can be fully planned in advance) from the graded scenario (Figure 1D; i.e., action planning occurs alongside visual working memory, even during retention of multiple visual items in anticipation of multiple potential actions), we compared beta activity during the memory delay across each possible memory-load comparison: Load 1 versus Load 4; Load 1 versus Load 2; and Load 2 versus Load 4.
We observed a significant relative attenuation of beta activity in C3 for Load 1, compared with Load 4 (Figure 3A-i; time–frequency map cluster p < .0001). This relative beta attenuation in Load 1 showed a left-central topography (i.e., contralateral to the response hand; Figure 3A-ii). Moreover, in line with Boettcher et al. (2021), it showed a bimodal temporal profile (Figure 3A-iii; time-course cluster p < .0001). Similarly, we observed a significant relative attenuation of beta activity in C3 for Load 1 compared with Load 2 (Figure 3B-i; time–frequency map cluster p < .0001), with similar topographical (Figure 3B-ii) and temporal (Figure 3B-iii; time-course early cluster p < .0001, late cluster p = .0004) characteristics as in the comparison between Load 1 and Load 4. These data are consistent with the notion that, in Load 1, participants know with certainty at encoding which visual item they will need to report at the end of the working-memory delay. Accordingly, participants can plan the required action ahead of time, leading to a stronger action-planning signal in the EEG in Load 1 compared with Loads 2 and 4 (i.e., when more than one item can become relevant later).
The most critical finding emerged when we compared action planning in Load 2 versus Load 4. In both conditions, participants remained oblivious about which item would be probed for report at the end of the memory delay. Nevertheless, when directly comparing these conditions, we also observed a significant relative attenuation of beta activity in C3 for Load 2 compared with Load 4 (Figure 3C-i; time–frequency map; early cluster p = .0013, late cluster p = .0021). As before, this effect was characterized by a similar C3-centered topography (Figure 3C-ii) and bimodal temporal profile (Figure 3C-iii; time-course; early cluster p = .0011, late cluster p = .0014).
In accordance with previous research (Boettcher et al., 2021; Schneider et al., 2017), we confirm that action planning occurs alongside the retention of a single visual item in working memory when the required action is certain. The key novelty here is the emergence of this action-planning signature during the retention of more than a single visual item in working memory, when the to-be-implemented action remains uncertain throughout the memory delay. These results are in line with the graded scenario (Figure 1D) we previously envisioned: We observed an attenuation of beta activity that depends on the number of action possibilities (largest in Load 1, intermediate in Load 2, smallest in Load 4).
Similar and Dissimilar Potential Actions Are Planned alongside Visual Working Memory
During the retention of two oriented bars (i.e., in Load 2), the difference in orientation between those two bars varies between trials: The difference can be smaller (i.e., when both orientations are similar) or larger (i.e., when both orientations are dissimilar). Consequently, the potentially required actions can also be similar (i.e., when they both require the mouse to be moved in a similar direction) or dissimilar (i.e., when they each require the mouse to be moved in dissimilar directions).
Next, we aimed to rule out the possibility that the observed relative attenuation of beta activity for Load 2 (compared with Load 4) was driven primarily by trials with similar memorized orientations (as suggested in the work of Grent-'t-Jong, Oostenveld, Jensen, Medendorp, & Praamstra, 2014), which may have been associated with, or reduced to, a single action plan. To this end, we separated trials in Load 2 as follows: Trials were marked as similar when the absolute difference in orientation between the two bars was smaller than 45°; trials were marked as dissimilar when this difference was larger than 45°. Next, we compared beta attenuation in C3 during the memory delay for Load 2 compared with Load 4, while this time distinguishing between Load 2-similar and Load 2-dissimilar trials. For completeness, Load 2-similar and -dissimilar were also compared directly.
In line with the data presented in Figure 3C, we observed a significant relative attenuation of beta activity in C3 when comparing Load 2-similar to Load 4. This attenuation started soon after encoding (Figure 4A-i; time–frequency map early cluster p = .0021, late cluster p = .0021), had a left-central motor topography (Figure 4A-ii), and showed a bimodal temporal profile (Figure 4A-iii; time-course early cluster p = .0009, late cluster p = .012). Crucially, when we exclusively included trials from Load 2 that were marked as dissimilar (and thus, required distinct potential actions) in our comparison, we still observed a significant relative attenuation of beta activity in C3 for Load 2-dissimilar compared with Load 4 (Figure 4B-i; time–frequency map early cluster p = .011, late cluster p = .0005), with the same topographical (Figure 4B-ii) and temporal (Figure 4B-iii; time-course early cluster p = .0026, late cluster p = .0005) characteristics as previously described. Moreover, when directly comparing Load 2-similar and Load 2-dissimilar trials, we did not observe a significant attenuation of beta activity in C3 (Figure 4C).
These results show that the observed relative attenuation of beta activity in Load 2 compared with Load 4 was not merely driven by trials in Load 2 where both memorized item orientations were similar. Accordingly, these results indicate that multiple potential actions can be planned during visual working memory, even when two visual representations in working memory require distinct actions for reproduction.
Potential Action Planning alongside Visual Working Memory Allows for Faster Memory-guided Behavior
Finally, we investigated whether potential action planning during the memory delay might be beneficial for performance, specifically for the speed of action implementation after the working-memory delay. To examine this, we marked trials as fast or slow, based on the onset time of the orientation-reproduction report after the onset of the memory probe. To this end, we performed a median split separately for each memory-load condition and each participant. We reasoned that, if preparedness for potential future actions is beneficial for the speed at which one of these actions is later implemented, the degree of beta attenuation in C3 after encoding and during retention should be stronger in trials with faster decision times than in those with slower decision times. Moreover, this should only be the case in situations where action planning occurs alongside visual working memory.
For trials in Load 1, we observed a significant relative attenuation of beta activity in C3 for fast compared with slow trials, starting soon after encoding (Figure 5A-i; time–frequency map cluster p = .00029). This effect again showed a left-central topography (Figure 5A-ii) and a bimodal temporal profile (Figure 5A-iii; time-course early cluster p = .011, late cluster p = .0004). Critically, when performing the same median split analysis for Load 2, we again observed a significant relative attenuation of beta activity in C3 for fast compared with slow trials. This effect also emerged soon after encoding (Figure 5B-i; time–frequency map early cluster p < .0001, late cluster p < .0001), had a left-central topography (Figure 5B-ii), and a bimodal temporal profile (Figure 5B-iii; time-course early cluster p = .0003, late cluster p = .0002). In contrast, when comparing fast to slow trials in Load 4, we did not observe such a significant attenuation of beta activity during the memory delay (Figure 5C). Nonetheless, after the memory delay, beta activity still became significantly predictive of response-onset times (Figure 5C-i, iii; time–frequency map cluster p = .0025; time-course cluster p = .0031).
These results indicate that preparedness for multiple potential future actions is beneficial for the speed at which one of these actions is later implemented when we retain one or two visual items, but not (or at least to a nonsignificant extent) when we retain four visual items. This is consistent with the finding (Figure 3C) that there is more action planning during the memory delay in Load 2 (i.e., when action certainty is intermediate) than in Load 4 (i.e., when action certainty is low).
Although visual working memory allows us to retain information from the past, it inherently serves the future. It forms the bridge between perception and action, allowing us to use detailed visual representations from memory to guide potential future action (van Ede, 2020; Nobre & Stokes, 2019; Rainer et al., 1999; Baddeley, 1992; Fuster & Alexander, 1971). Critically, working memory often contains not one but multiple pieces of information that may become relevant for upcoming behavior (Ma, Husain, & Bays, 2014; Cowan, 2001; Luck & Vogel, 1997), and it may therefore serve not just the future, but the potential future. Accordingly, we asked whether and how multiple potential actions are planned during visual working memory, alongside the encoding and retention of multiple visual items. We show an attenuation of beta activity in central electrodes contralateral to the required response hand that depends on the number of action possibilities (strongest in Load 1, intermediate in Load 2, weakest in Load 4). In Load 2, this effect occurred regardless of whether both potential actions required a similar or dissimilar manual response. Moreover, the degree of beta attenuation during the memory delay (in Load 1 and Load 2) was predictive of the speed of the ensuing memory-guided action. These results are in line with the previously envisioned graded scenario, whereby action planning occurs alongside visual working memory, even when we retain more than one visual item in anticipation of multiple potential actions. This brings the concept of parallel action planning (Cisek, 2007; Cisek & Kalaska, 2010) to the domain of multi-item retention in visual working memory.
Previous research focusing on working memory of a single visual item already showed that action planning of its associated certain action occurs during the working-memory delay (Boettcher et al., 2021; Schneider et al., 2017). We directly build upon these findings, showing that, alongside the encoding and retention of two visual items, their associated potential actions are also planned during the memory delay. This occurs despite the uncertainty throughout the delay about which of the two items will be probed for action later. Moreover, the action planning signature we observed followed a similar bimodal activation pattern as recently reported for single-item action planning during visual working memory (Boettcher et al., 2021). Another recent study that focused on working memory of multiple visual items showed that, when either of two visual items in visual working memory is probed for report, visual and motor information are selected concurrently (van Ede et al., 2019). This study, which focused on neural activity after the memory delay, provided tentative evidence for the idea that parallel action planning may co-occur with multi-item visual retention. Here, we provide direct, complementary evidence for this interpretation by focusing on EEG activity in the delay-period itself.
The concept of parallel action planning has been around for more than a decade. An early study showed that when primates decide between two reaching actions toward different target locations, both actions are planned in parallel at first, and one of these actions is selected for implementation later (Cisek & Kalaska, 2005). This led to the proposal of the affordance competition hypothesis (Cisek, 2007), which holds that behavior arises from a competition between parallel representations of potential action affordances. Recent work has argued that potential action planning is also prevalent in humans when they plan and perform reaching actions toward multiple potential locations (Wong & Haith, 2017; Gallivan et al., 2015, 2016; Grent-'t-Jong et al., 2013; Stewart, Baugh, Gallivan, & Flanagan, 2013). Yet, so far, it has been considered predominantly in tasks without concurrent item retention in working memory, whereby visual objects guide our actions. We now provide evidence for the notion that multiple potential actions, which are guided by detailed visual item representations, are also planned alongside encoding and retention during visual working memory. At the same time, our data show that action planning is more pronounced when the relevant action is fully known in advance (i.e., in Load 1) than when there are multiple potential courses of action (i.e., in Loads 2 and 4). This is in line with earlier research, which showed that beta attenuation is inversely related to the number of action possibilities, being larger with higher action certainty, and vice versa (Tzagarakis et al., 2010, 2015, 2021).
We interpret our data in the Load 2 condition as reflecting the planning of multiple potential actions. However, one might alternatively argue that participants selectively plan a single action, even when they anticipate multiple potential actions. Indeed, inferring parallel action planning from trial-averaged data is notoriously difficult (Dekleva, Kording, & Miller, 2018). Nevertheless, three aspects of our data argue against this alternative interpretation. First, if participants planned only one action, the degree of action planning in Loads 1 and 2 should be comparable (i.e., in both cases one action is planned). In other words, we should observe no difference in beta activity during the memory delay between Load 1 and Load 2, contrary to our observations. Second, in half of the trials in Load 2, the action that is selectively planned should be the “incorrect” action, that is, the action associated with the visual item that is not probed later. This should be detrimental to decision times, as it would require a switch of plans in this half of the trials. Yet, we observed that larger beta attenuation in Load 2 during the memory delay predicts faster decision times later, suggesting that beta attenuation generally facilitated performance. Third, one might argue that multiple potential actions in Load 2 are “merged into one” whenever two visual items require a similar manual response. However, we observed an attenuation of beta activity in Load 2 (compared with Load 4) regardless of whether the two potential actions required a similar or dissimilar manual response. Together, these data support our parallel-planning interpretation by countering the possibility that one potential action is selectively planned during the memory delay, even when there are multiple potential courses of action.
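The logic of the first argument can be illustrated with a toy simulation. The sketch below is not the analysis reported in this article, and every quantity in it is hypothetical: attenuation is expressed in arbitrary units, a fully planned single action is assumed to yield a value of 1.0, and the graded parallel account is assumed (purely for illustration) to halve that value in Load 2. Under a selective single-action account, the trial-averaged attenuation in Load 2 should match Load 1; under a graded parallel account, it should fall below Load 1.

```python
import numpy as np


def simulate_mean_attenuation(n_trials=2000, parallel_scale=0.5, noise_sd=0.5, seed=0):
    """Return trial-averaged beta attenuation (arbitrary units) under each account.

    All parameter values are illustrative assumptions, not estimates from data.
    """
    rng = np.random.default_rng(seed)
    full_plan = 1.0  # hypothetical attenuation when a single action is fully planned

    def trial_average(mean_level):
        # Noisy single-trial attenuation values, averaged across trials.
        return float(np.mean(mean_level + rng.normal(0.0, noise_sd, n_trials)))

    return {
        # Load 1: one certain action, fully planned on every trial.
        "load1": trial_average(full_plan),
        # Load 2, selective account: one of the two actions is fully planned per trial.
        "load2_selective": trial_average(full_plan),
        # Load 2, graded parallel account: planning is scaled down (assumed factor).
        "load2_parallel": trial_average(parallel_scale * full_plan),
    }


if __name__ == "__main__":
    for account, value in simulate_mean_attenuation().items():
        print(f"{account}: {value:.2f}")
```

In this toy setting, only the graded parallel account produces a Load 1 versus Load 2 difference in the trial average, which is the qualitative pattern the first argument relies on. The simulation does not address the second and third arguments, which concern decision times and response similarity.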
We showed that multiple potential actions are planned during multi-item visual working memory, linking the study of multi-item visual working memory to the vast literature on motor planning. Although our work focused on demonstrating action planning during multi-item visual working memory, it leaves open the exact computational mechanisms of retention and subsequent selection of motor plans alongside multi-item visual working memory. In the motor-planning literature, parallel feedback loops for task and body states have been demonstrated in premotor and posterior parietal cortices, and have been argued to allow for the integration of the body within the task (e.g., Haar & Donchin, 2020; Shadmehr & Krakauer, 2008). Such computational theories of motor control provide a relevant avenue for future research on action planning alongside visual working memory. For example, do multiple parallel feedback loops operate for each of the multiple potential action plans that may be held available alongside multi-item visual working memory? And how may such loops integrate visual and motor information that is held in working memory concurrently?
Several lines of previous research have focused on bidirectional influences between visual working memory and action (for recent reviews, see the works of Heuer et al., 2020; Olivers & Roelfsema, 2020; van Ede, 2020; Myers et al., 2017). It has been shown, for example, that action planning can benefit visual working-memory performance: Memory performance is higher when the memorized locations of visual memoranda and (planned) actions are congruent, both for eye movements (Ohl & Rolfs, 2017, 2018, 2020; Hanning & Deubel, 2018; Hanning, Jonikaitis, Deubel, & Szinte, 2016) and for manual actions (Hanning & Deubel, 2018; Heuer & Schubö, 2017, 2018). We focused on the reverse direction and considered how retention during visual working memory may naturally recruit action planning. Moreover, we show that the degree of action planning during the working-memory delay is beneficial for the speed of memory-guided action afterward. At the same time, we found no relation between action planning and the precision of the memory-guided orientation-reproduction report. This suggests that action planning in our task did not necessarily influence the quality of visual working-memory representations. Instead, planning multiple potential actions may have occurred alongside visual working-memory retention, allowing both action plans and visual representations to be readily available for fast response-implementation after the memory delay.
Although visual working memory allows us to retain information that is no longer physically available to us, visual working memory is not merely a temporary storage mechanism. Instead, we often rely on representations in visual working memory to guide and plan potential future actions, even—or perhaps especially—under varying degrees of action certainty. This is useful in our everyday lives where we are often faced with multiple sources of visual information that we need to retain and that each afford distinct potential actions. Being prepared for more than one action scenario allows us to cope with action uncertainty in a dynamically unfolding world (Cisek & Kalaska, 2010) and, ultimately, allows us to act rapidly when working-memory contents become relevant for behavior. The findings presented here provide evidence that multiple potential actions can be planned alongside visual working memory. They also reinforce the idea that visual working memory is ultimately future-oriented.
This research was supported by a Newton International Fellowship from the Royal Society and the British Academy (NF140330), a Marie Skłodowska-Curie Fellowship from the European Commission (ACCESS2WM), and an ERC Starting Grant from the European Research Council (MEMTICIPATION, 850636), awarded to F. v. E. Data were collected while F. v. E. was a fellow in the Brain & Cognition lab of Anna C. (Kia) Nobre. The authors wish to thank Kia Nobre for her valuable exchanges throughout the lifetime of the project, Sammi Chekroud for his assistance during the collection of the data, and Baiwei Liu and Merlijn Breunesse for their valuable comments on the manuscript.
Reprint requests should be sent to Rose Nasrawi, Institute for Brain and Behavior Amsterdam, Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, 1081BT, Amsterdam, The Netherlands, or via e-mail: email@example.com or Freek van Ede, Institute for Brain and Behavior Amsterdam, Department of Experimental and Applied Psychology, Vrije Universiteit Amsterdam, 1081BT, Amsterdam, The Netherlands, or via e-mail: firstname.lastname@example.org.
Code and Data Availability
All code for the EEG and behavioral analyses is available at https://github.com/rosenasrawi/MultiplePotentialActionsVWM. The data can be made available upon reasonable request.
Rose Nasrawi: Conceptualization; Formal analysis; Investigation; Visualization; Writing—Original draft; Writing—Review & editing. Freek van Ede: Conceptualization; Data curation; Funding acquisition; Investigation; Supervision; Writing—Original draft; Writing—Review & editing.
Diversity in Citation Practices
Retrospective analysis of the citations in every article published in this journal from 2010 to 2021 reveals a persistent pattern of gender imbalance: Although the proportions of authorship teams (categorized by estimated gender identification of first author/last author) publishing in the Journal of Cognitive Neuroscience (JoCN) during this period were M(an)/M = .407, W(oman)/M = .32, M/W = .115, and W/W = .159, the comparable proportions for the articles that these authorship teams cited were M/M = .549, W/M = .257, M/W = .109, and W/W = .085 (Postle and Fulvio, JoCN, 34:1, pp. 1–3). Consequently, JoCN encourages all authors to consider gender balance explicitly when selecting which articles to cite and gives them the opportunity to report their article's gender citation balance.