Abstract
Evidence suggests that the CNS uses motor primitives to simplify movement control, but whether it actually stores primitives instead of computing solutions on the fly to satisfy task demands is a controversial and still-unanswered possibility. Also in contention is whether these primitives take the form of time-invariant muscle coactivations (“spatial” synergies) or time-varying muscle commands (“spatiotemporal” synergies). Here, we examined forelimb muscle patterns and motor cortical spiking data in rhesus macaques (Macaca mulatta) handling objects of variable shape and size. From these data, we extracted both spatiotemporal and spatial synergies using non-negative decomposition. Each spatiotemporal synergy represents a sequence of muscular or neural activations that appeared to recur frequently during the animals' behavior. Key features of the spatiotemporal synergies (including their dimensionality, timing, and amplitude modulation) were independently observed in the muscular and neural data. In addition, both at the muscular and neural levels, these spatiotemporal synergies could be readily reconstructed as sequential activations of spatial synergies (a subset of those extracted independently from the task data), suggestive of a hierarchical relationship between the two levels of synergies. The possibility that motor cortex may execute even complex skill using spatiotemporal synergies has novel implications for the design of neuroprosthetic devices, which could gain computational efficiency by adopting the discrete and low-dimensional control that these primitives imply.
SIGNIFICANCE STATEMENT We studied the motor cortical and forearm muscular activity of rhesus macaques (Macaca mulatta) as they reached, grasped, and carried objects of varied shape and size. We applied non-negative matrix factorization separately to the cortical and muscular data to reduce their dimensionality to a smaller set of time-varying “spatiotemporal” synergies. Each synergy represents a sequence of cortical or muscular activity that recurred frequently during the animals' behavior. Salient features of the synergies (including their dimensionality, timing, and amplitude modulation) were observed at both the cortical and muscular levels. The possibility that the brain may execute even complex behaviors using spatiotemporal synergies has implications for neuroprosthetic algorithm design, which could become more computationally efficient by adopting the discrete and low-dimensional control that they afford.
Introduction
The CNS may construct complex behaviors through more basic building blocks or “motor primitives” (Flash and Hochner, 2005; Bizzi et al., 2008; Giszter, 2015). Both the evidence and definition of these primitives remain tentative. In many studies, they have been defined as time-invariant, “spatial” synergies of simultaneous cocontraction across muscles in frogs (Tresch et al., 1999), cats (Ting and Macpherson, 2005; Yakovenko et al., 2011), rats (Kargo and Nitz, 2003), monkeys (Brochier et al., 2004; Overduin et al., 2012), and humans (Klein Breteler et al., 2007; Torres-Oviedo and Ting, 2007; d'Avella et al., 2008; Roh et al., 2012; Berger et al., 2013). An alternative “spatiotemporal” synergy model (d'Avella et al., 2003) may relate more closely to the classical idea of motor programs (Keele, 1968; Polit and Bizzi, 1979; Georgopoulos et al., 1983; Kargo and Nitz, 2003). A ccording to this scheme, synergies would encode a brief profile of time-varying activity in each of several muscles. Each spatiotemporal synergy can capture an asynchronous, but nonetheless invariant, pattern of intermuscular coordination. The synergies themselves may be recruited asynchronously by the CNS in constructing more complex behaviors, which seems to be needed to account for multiple-phase motions such as object reach, grasp, and carry movements (d'Avella et al., 2006; Overduin et al., 2008). Spatiotemporal synergies could substantially lessen the computational burden on the CNS during motor control because it could reuse synergies across different task conditions and need only specify when and how much to recruit each synergy within each individual movement segment (Rückert and d'Avella, 2013). These synergy-encoding schemes pose a much simpler control problem than commanding each muscle continuously throughout a movement (cf. spatiotemporal synergies) and independently of other muscles (cf. either spatiotemporal or spatial synergies).
Whether the CNS stores discrete motor primitives in the form of spatiotemporal synergies is an open question. Studies using microstimulation support the idea of simple, discrete movement encoding, although generally at the kinetic and kinematic level rather than the muscular level. For example, intraspinal microstimulation generates forces by which a limb is driven through one or more postures in frogs (Giszter et al., 1993) and mammals (Tresch and Bizzi, 1999; Lemay and Grill, 2004). Within the brain, intracortical microstimulation (ICMS) trains of sufficient duration within mammalian motor cortex (Graziano et al., 2002; Ramanathan et al., 2006; Griffin et al., 2011, 2014; Stepniewska et al., 2014) can elicit complex, multijoint, multiphase forelimb behaviors (e.g., a reach, grasp, and retraction). The convergent forces (Giszter et al., 1993) and invariant end points (Graziano et al., 2004) evoked by microstimulation tend to coincide with movements and postures seen in subjects' spontaneous behavior, suggesting that the evoked movements are not artifactual.
Here, we investigated whether populations of motor cortical neurons reflect recruitment of motor primitives in the form of spatiotemporal synergies. We studied a manual behavior performed by two rhesus macaques (Macaca mulatta) (Overduin et al., 2008). The subjects had to reach for an object presented in a well, grasp it, and then carry it to an opposing well. To elicit a variety of hand postures and forces, the objects included 25 spheres, cubes, and cylinders of different dimension. As we have reported previously, the EMG data recorded from forelimb muscles could be decomposed into a small number of spatiotemporal (Overduin et al., 2008) or spatial (Overduin et al., 2012) synergies. We performed these decompositions using iterative factorization algorithms suitable for non-negative data such as muscle activations (d'Avella et al., 2003). We now apply the same decomposition algorithms to simultaneously recorded cortical data from primary motor cortex (MI) and the dorsal (PMd) and ventral (PMv) premotor cortex. We hypothesized that, if spatiotemporal muscle synergies are represented in motor cortex, motor neuronal firing patterns should decompose into the same number of dimensions as observed for EMG-derived synergies and with the same pattern of modulation over object conditions. We also examined spatial synergies extracted from the same muscle and neural data to contrast spatiotemporal and spatial synergy-encoding schemes.
Materials and Methods
Subjects.
Cortical, muscular, and behavioral data were recorded from two rhesus monkeys: G1 (an 8-year-old, 5.9 kg female) and G2 (a 4-year-old, 6.5 kg male). Procedures all had the approval of the Massachusetts Institute of Technology Committee on Animal Care.
Behavior.
Subjects pressed a start button and then reached for, grasped, and carried objects between two wells. The 25 objects included 15 cylinders, of which 5 each spanned 1 of 3 dimensions (inner diameter, 0.6–3.2 cm; uniform diameter, 1.3–3.8 cm; height, 0.6–5.7 cm), 5 cubes of variable width (1.5–3.6 cm), and 5 spheres of variable diameter (range 1.6–3.6 cm). To study object-related rather than sparser position-related modulation of motor activity, we focused exclusively on one carry direction (as in earlier reports such as Overduin et al., 2008, 2014). Of the two carry directions, we have continued to focus on leftward-carry trials because these involved larger-amplitude reaching movements (the start button being located further from the object) and more opportunity to observe grasp-related activity during reaching. Head movements were constrained by a cranial post during recording. Behavioral event data collection and processing have been described previously (Overduin et al., 2008).
Sessions.
The recordings from monkey G1 comprised 7798 successful trials performed over 20 recording sessions spanning 45 d; those from G2 comprised 775 trials performed over 6 sessions spanning 6 d. Muscle recording was done in each of these sessions. Neural data were only collected in a subset of the sessions, but when collected, they were acquired at the same time as the muscle data. The muscle data thus include some trials (as well as entire, interspersed sessions) when neural data were not collected and the EMG decomposition results (see Figs. 1, 4) are based on trials recorded both with and without simultaneous neural data (Overduin et al., 2008). The ensemble-level neural activity was restricted to the subset of data (3466 trials over nine sessions from G1 and 257 trials over four sessions from G2) over which the ensemble units were recorded (while the animals were exposed to a common set of object conditions, as described under “Cortical ensembles” below).
Surgery.
Craniotomies occurred after (G1) or along with the first of (G2) the muscle implantation surgeries (described in Overduin et al., 2008). These sterile surgeries were performed under general anesthesia (10 mg/kg ketamine and 0.05 mg/kg atropine injected intramuscularly, followed by 5 mg/kg sodium pentobarbital intravenously in G1 or by inhalation of 1–2% isoflurane with 2 L of O2 in G2) and then administration of analgesics and systemic antibiotics. Using bone screws and cement, custom stainless steel wells (width: G1, 28 mm; G2, 20 mm) were secured around craniotomy sites centered over right motor cortex.
Muscle data.
EMG recordings were made via 15 (G1) or 19 (G2) electrodes chronically implanted in muscles of the left forelimb. Intrinsic hand muscles included the opponens digiti quinti manus (Op5), flexor digiti quinti brevis manus (F5B), opponens pollicis (OpP), adductor pollicis (AdP), and abductor pollicis brevis (AbPB). Wrist and extrinsic hand flexors included the flexor carpi ulnaris (FCU), radial and ulnar flexor digitorum profundus (FDPR and FDPU, respectively), flexor digitorum superficialis (FDS), and flexor carpi radialis (FCR). Wrist and extrinsic hand extensors included the extensor carpi ulnaris (ECU), extensor digiti quarti and quinti proprius (ED45), extensor digiti secundi and tertii proprius (ED23), extensor digitorum communis (EDC), extensor carpi radialis brevis (ECRB), and abductor pollicis longus (AbPL). Proximal muscles acting on the elbow and shoulder included the brachioradialis (BR), biceps brachii longus (Bic), radial and ulnar short heads of the triceps brachii (TriR and TriU, respectively), pectoralis major (Pec), and deltoideus (Del). EMG data were recorded simultaneously from all 15 (G1) or 19 (G2) muscles on a trial-by-trial basis (between button press and reward events). EMG data collection (including cross-talk analysis) has been described previously (Overduin et al., 2008).
Cortical areas.
Units were recorded from PMd, PMv, and MI as identified from MRI and sensorimotor mapping (somatosensory responses, ICMS-evoked movements, and thresholds). Unit somatosensory response fields were estimated at the beginning of sessions by touching the monkey's skin and passively moving its limbs. At the end of sessions, ICMS was applied through tungsten microelectrodes (FHC; 250-μm-diameter shaft tapered to a 3-μm-wide tip; 0.3–3-MΩ impedance). In each session, ≤10 electrodes were introduced into the brain via manual microdrives (depth resolution: 30 μm). These custom microdrives were mounted on a grid that was secured to the recording well and that constrained the interelectrode spacing to 1 mm. Stimulation parameters included 0.05 s train duration, 330 Hz pulse frequency, 10–150 μA current, and 2 × 0.2 ms pulse duration (cathodal leading).
Cortical units.
The electrodes used for ICMS were also used to acutely record extracellular voltages. We considered only those units lying within cortical regions convexly bounded by sites that demonstrated sensorimotor responses restricted to the left forelimb. Signals were preamplified (1× gain, by a head stage ∼5 cm from the electrodes) and then amplified (10,000×), band-pass filtered (600–6000 Hz, second-order filter with roll-off on both ends) and digitized (Neuralynx). Spikes (1.1 ms waveforms sampled at 30 kHz, with the threshold-crossing point at 0.26 ms) were stored to disk whenever electrode voltages exceeded a manual threshold. Single units were identified offline using MClust (MClust-3.4) and custom algorithms in MATLAB (The MathWorks) for this and the following analyses. Accepted units were found to have mean firing rates between 1 and 58 Hz, and 1% of interspike intervals (ISIs) were <1 ms. A total of 202 units were identified (168 of these in G1 and 34 in G2).
Cortical ensembles.
These consisted of the largest number of units (54 units in G1, 17 in G2) which were recorded over ≥4 trials (in any session) with each of a set of at least 2 objects from each of the 5 shape classes (13 objects for G1, 11 for G2). Ensembles included 11, 12, and 31 (G1) and 4, 3, and 10 (G2) units in areas PMd, PMv, and MI, respectively. We pooled data from units recorded on different days (Poliakov and Schieber, 1999; Morrow and Miller, 2003; Schieber and Rivlis, 2007) given that the same objects were presented on those days and that the animals' behavior was relatively stereotyped in handling these objects. The neural data matrix comprised the unit firing rate averages over the ≥4 trials in each of these 11 or 13 object conditions.
Cortical subensembles.
For single-trial comparisons of spatiotemporal muscle and neural synergies, we also defined “subensembles” consisting of the largest number of units that had been recorded simultaneously within the above 54- and 17-unit ensembles. For both monkeys, these subensembles included 6 units, spanning either 247 (G1) or 56 (G2) trials. For monkey G1, there were two such subensembles; we focused on the one having better average clustering metrics, including higher L-ratio and lower isolation distance (Schmitzer-Torbert et al., 2005), and fewer ISIs <1 ms, as well as higher spike rates.
EMG preprocessing.
The data in each muscle channel were integrated over short bins (G1: 9 ms, G2: 11 ms) and normalized to the maximum EMG over the 25 object conditions. As described previously (Overduin et al., 2008), 40 trials in each condition were selected, time aligned to the moment of object removal from the first well, cropped to a 100-bin period around this time (G1: [−0.35:+0.55] s, G2: [−0.5:+0.6] s), averaged over trials, and concatenated over conditions to give the muscle data matrix Dm (“m” indicating “muscle” data). The different integration and window times chosen for the two subjects were based on their unique movement latencies and were set to span the median reach and carry durations of each animal (Overduin et al., 2008).
Unit preprocessing.
Each unit's spikes were summed over the same time bins used for EMG data (G1: 9 ms, G2: 11 ms). All trials fully spanned by a unit were selected, time aligned to the moment of object removal, and cropped to the same time windows as for EMG data minus a fixed 50 ms delay (G1: [−0.4:+0.5] s, G2: [−0.55:+0.55] s). This delay maximized (to the nearest 10 ms and within a [−0.5:+0.5] s range) the number of significant correlations between mean unit firing rates and integrated EMG values over all (168 + 34) units and (15 + 19) EMG channels, pooled over animals. The delay is also consistent with values reported by others (Morrow and Miller, 2003; Schieber and Rivlis, 2007; Stark et al., 2007b). Firing rates were averaged over trials within each object condition, smoothed by convolution with a 50 ms Gaussian kernel, and concatenated over conditions to give the neural data matrix Dn (“n” indicating “neural” data).
Synergy extraction and fitting.
Decomposition algorithms applying non-negative matrix factorization were used to identify a set of both spatiotemporal (d'Avella et al., 2003) and spatial (Tresch et al., 1999; Lee and Seung, 1999) synergies underlying each monkey's muscle and neural patterns. The muscle data Dm were defined over e = 1, 2, …, E EMG channels and t time points. The matrix Dm was decomposed, first as a combination of i = 1, 2, …, Npmd spatiotemporal synergy matrices Pim (see Fig. 1A). (The “d” and “p” of Npmd denote muscle “data”-derived and number of “primitives,” respectively.) The spatiotemporal synergy duration parameter was fixed at 50 time bins (G1: 0.45 s, G2: 0.55 s). These a priori durations were based on several considerations: First, they equaled half of the trial window defined for each monkey, allowing a reasonably small number of spatiotemporal synergies (as few as two) to explain the muscle and neural data. (The reconstruction plots of Figs. 1C and 2C start at two spatiotemporal synergies for this reason.) Second, visual inspection of the data showed this duration to be sufficient to capture typical muscle bursts. Third, the submovement literature supports a spatiotemporal synergy duration of ∼0.5 s. In monkeys in particular, submovement durations of up to 0.35–0.5 s were evident in forelimb movements (Roitman et al., 2004; Fishbach et al., 2007). The same duration was chosen in several earlier reports across a range of species (d'Avella and Bizzi, 2005; d'Avella et al., 2006, 2011; Overduin et al., 2008). Each spatiotemporal synergy was recruited with an onset delay tim and scaled by a non-negative coefficient cim specific to each synergy and trial (see Fig. 1B) as follows: The activity Dm was also decomposed as a combination of j = 1 … Nsmd spatial synergy vectors vjmd (see Fig. 4A, black-to-red bars), each weighted by an amplitude history ajmd (t) as follows: (The “s” of Nsmd indicates “synchronous” spatial synergies.) From the concatenated set of spatiotemporal synergy matrices Pim, we also extracted spatial synergies vjmp (j = 1 … Nsmp), the motivation for which is described in the Results (see Fig. 4A, purple bars). (The “p” of Nsmp indicates derivation from motor “primitives” rather than data.) This allowed the following reconstruction: For a given Npmd (or Nsmd or Nsmp), the algorithms iteratively updated the structures Pim (or vjmd or vjmp) and coefficients cim and tim (or ajmd (t) or ajmp(t)) until the total reconstruction error R2 grew by <0.001 over 10 iterations. The algorithms were repeated five times for each extraction and the set of spatiotemporal or spatial synergies accounting for the most EMG variation was selected for analysis. Dimensionalities were chosen in a manner consistent with earlier work by applying a threshold of R2 = 80% in the case of Npmd (Overduin et al., 2008; see Fig. 1C) or 95% in the case of Nsmd and Nsmp (Overduin et al., 2012; see Fig. 4C). In the case of the spatiotemporal synergy extraction, the procedure was repeated a further 100 times after EMG electrode identity was first randomized over channels within each object condition. The resulting “scrambled” synergies defined baseline levels of R2 explained by chance (see Fig. 1C, gray lines). The entire set of spatiotemporal and spatial synergy extraction procedures described above was applied to the neural data Dn in the same fashion as for the muscle data Dm (see Fig. 2; the resulting structures and coefficients are superscripted “n” rather than “m.” For comparisons of spatiotemporal muscle synergies and neural synergies fit to the six-unit subensemble data (see Fig. 3), we simply reduced the matrices Pin to those six (of 54 or 17) rows corresponding to the subset of units spanned by the subensemble and then fit these units' data with the reduced Pin.
Spatiotemporal and spatial synergy comparisons.
Spatiotemporal synergy coefficient comparisons, either with object mass (see Figs. 1D, 2D) or across muscle and neural levels (see Fig. 3), involved Pearson correlations corrected for the number of synergies and evaluated at a p < 0.05 significance threshold. In regards to spatial synergy comparisons, “data-derived” spatial synergies (vjmd or vjnd), i.e., those extracted from the task data (either Dm or Dn), were compared and matched to “primitive-derived” spatial synergies (vjmp or vjnp), i.e., those extracted from the spatiotemporal synergies (Pim or Pin). For these comparisons, we used a greedy search procedure (Overduin et al., 2012). For all Nsmd × Nsmp (or Nsnd × Nsnp) possible pairs of data- versus primitive-derived synergies, we first computed dot products (e.g., 8 × 4 = 32 for G2). The pair of data- versus primitive-derived spatial synergies with the highest dot product was defined as the best-matching pair. The pair with the highest dot product among the remaining (NsmD −1) × (NsmP −1) pairs (e.g., 7 × 3 = 21 for G2) defined the second-best match. This process was repeated until all spatial synergies in one set had been paired (e.g., over min(8, 4) = 4 times for G2). We then used Monte Carlo simulation to assess the significance of each match. We repeated the greedy search algorithm 10,000 times for each monkey after first randomly shuffling EMG or neural unit channel identity each time. We then compared the highest (best-matching) dot product between actual data- and primitive-derived synergies with the distribution of highest dot products from the 10,000 comparisons of shuffled synergies. If the former value exceeded the 95th percentile of the distribution of latter values, we took the match as significant at p < 0.05. We then repeated this comparison for the second-best actual synergy pair versus the distribution of second-best shuffled pairs, etc.
Results
The two monkeys, “G1” and “G2,” performed the task over multiple days while EMG data were recorded from 15–19 electrodes chronically implanted in muscles of the shoulder, arm, and hand. Each of the spatiotemporal muscle synergies extracted from the EMG data (Fig. 1A) comprises a specific pattern of muscle activity that spanned a particular phase of the task: reach, grasp, and carry. There were broad similarities between the two animals in terms of the incidence of tonic, phasic, or biphasic muscle recruitment within these synergies (Overduin et al., 2008).
Figure 1B illustrates the reconstruction of average EMG activity in a sample trial condition by a combination of spatiotemporal synergies. Note that the reconstruction is achieved by only six scalar parameters: three scaling coefficients (c) and three onset time coefficients (t), which together specify a period in which the synergy is active. The times at which the synergies were active relative to the time when the object was removed from its origin well in this example are indicated by the dashed vertical lines in Figure 1, A and B. The curve of the reconstruction R2 (Fig. 1C) presents a relatively sharp bend at a particular number of synergies. The curve suggested a dimensionality of 3, at which most (81%) of the variation in the EMG data was explained and beyond which minimal further variation could be explained by additional synergies.
Were these spatiotemporal synergies actually embodied in the CNS, they could provide the animal with a mechanism to generate a rich variety of task-appropriate reaches, grasps, and carrying movements by merely specifying suitable timing and amplitude coefficients for each of the three synergies. That is, rather than having to control individual muscles at each point in time throughout a trial, the CNS could simply output the equivalent of six scalar commands (three timing and three amplitude instructions). Further, appropriate commands could be learned by experience and potentially generalized over task variables. Consistent with such a scheme, we observed various systematic relationships involving synergy coefficients and object properties (Overduin et al., 2008). Most prominently, the amplitude coefficient associated with the grasp-related synergy was strongly modulated by the mass of the object being grasped (Fig. 1D), as captured by linear correlations for both G1 (r = 0.86, p < 0.0001) and G2 (r = 0.90, p < 0.0001).
If motor cortex were truly relying on spatiotemporal synergies to simplify motor control, then the population-level activity should broadly reflect the low-dimensional patterns by which these primitives may be recruited. To investigate this, we pooled the firing rates of multiple motor cortical units (G1: 54, G2: 17) recorded over multiple days of performance with a common set of objects. We subjected the firing rate data to the same synergy extraction algorithm as used for the EMG data to isolate population-wide modes of variation in neural activity (Fig. 2). The data windows were the same as those defining the EMG data except they were shifted back in time by a 50 ms delay to account for transmission latency between cortex and muscles (Morrow and Miller, 2003; Schieber and Rivlis, 2007; Stark et al., 2007b).
As observed for the spatiotemporal muscle synergies, the structure of each neural synergy appeared complex, with units participating in multiple synergies (Fig. 2A). Two features of the spatiotemporal muscle synergies were particularly salient: the sharp bend in the reconstruction curve corresponding to three synergies (Fig. 1C) and the strong modulation of the second synergy's amplitude with object mass (Fig. 1D). Were either of these patterns evident in the neural data? Indeed, the neural synergies resembled the muscle synergies in these two salient features.
First, the dimensionality of the spatiotemporal neural and muscle synergies was similar. In fitting the spiking data of individual trial conditions (Fig. 2B), the data reconstruction curves of the neural synergies, like those of the muscle synergies, bent more sharply at three than at any other number (Fig. 2C) even if the actual R2 values at this point were lower (R2 = 70% for both G1 and G2) than the values for the muscle synergies.
Second, the spatiotemporal neural and muscle synergies were related in their most salient behavioral modulation. Neural onset times preceded those of the muscle synergies by 0.07 ± 0.06 s (range −0.01–0.16 s, averaged over synergies). (Averaged over object conditions, neural synergy onset times for G1 were −0.39 ± 0.01 s, −0.13 ± 0.02 s, and 0.05 ± 0.08 s; those for G2 were −0.51 ± 0.06 s, −0.35 ± 0.06 s, and −0.08 ± 0.03 s, relative to object removal. As reported in Overduin et al. (2008), muscle synergy onset times for G1 were −0.31 ± 0.11 s, −0.10 ± 0.07 s, and 0.04 ± 0.16 s; those for G2 were −0.45 ± 0.16 s, −0.19 ± 0.10 s, and 0.01 + 0.17 s.) For both sets of synergies, the second of the three (as ordered by mean time of onset) captured the neural and muscular activity around the time of object grasp (Figs. 1B, 2B). The activation coefficients for the second neural synergy demonstrated a positive, linear relationship with object mass (r = 0.86, p < 0.001; both G1 and G2; Fig. 2D), as was observed for muscle synergies (Fig. 1D).
Each of the spatiotemporal neural synergies can be pictured as a mosaic of activation that waxed and waned at multiple loci on the cortical surface. Figure 2E depicts the degree to which units at each of monkey G2's recording sites were activated, on average, during the time of peak activation of each neural synergy. Highlighted are the locations of MI units 10, 14, and 13, maximally activated in the first, second, and third neural synergy, respectively.
As might be expected from their covariations with object mass (Figs. 1D, 2D), the amplitude coefficients of the grasp-related spatiotemporal neural and muscle synergies were also significantly linearly correlated with each other (G1: r = 0.90, p < 0.0001; G2: r = 0.78, p < 0.01), as shown in Figure 3A. We investigated whether this was true even at the single-trial level (and not just at the level of synergy coefficients averaged across each object condition's trials). To do so, we identified the largest subensembles of units (from within G1's 54-unit and G2's 17-unit ensemble) that had all been recorded at the same time. For both subjects, these subensembles included six units. For monkey G2, these units are identified by plus symbols (+) in Figure 2B. We then fit the synergies to both the EMG and unit firing data from the 247 (G1) or 56 (G2) trials spanned by these subensembles after first reducing the neural synergies to the six of 54 (G1) or six of 17 (G2) channels spanned by the subensemble's units. Although the correlations are reduced using these much smaller subensembles, the amplitude coefficients of the grasp-related synergies remained significantly correlated between the muscle and neural levels (G1: r = 0.16, p < 0.05; G2: r = 0.32, p < 0.05), as shown in Figure 3B.
The above results concern spatiotemporal (time-varying) synergies. As we have previously reported for the EMG dataset considered here (Overduin et al., 2012), these data could also be decomposed into combinations of spatial (time-invariant) synergies using a related factorization algorithm (Tresch et al., 1999; Lee and Seung, 1999). In the case of spatial synergies, each synergy captures a pattern of muscles firing in synchrony. Figure 4, A and B, illustrates the spatial muscle synergies found for G2 and their reconstruction of the average EMG activity in the same sample trial condition as in Figure 1B. We found that eight (G2) or 10 (G1) synergies were sufficient to reconstruct ≥95% of the variability in the EMG data (Fig. 4C; Overduin et al., 2012). Such spatial synergies could provide the animal with a mechanism to control its muscles continuously (if not as efficiently as in the case of spatiotemporal synergies) by specifying a time course of task-specific amplitude coefficients for each of the synergies (cf. Figs. 4B, 1B).
Inspection of the spatiotemporal muscle synergies (Fig. 1A) suggested that they themselves, like the data from which they were derived, could be expressed as a combination of spatial synergies. Certain subgroups of muscles resembling the spatial synergies tended to coactivate in a manner unique to each spatiotemporal synergy. Consider G2's forearm flexors (FCR, FDS, FDPU, and FCU): their covariation, for example, in the second spatiotemporal synergy (Fig. 1A), is captured in the fourth spatial synergy (Fig. 4A). The spatiotemporal muscle synergies could be decomposed directly into just four (G2) or five (G1) spatial synergies, with R2 ≥ 95% of the variance in the spatiotemporal synergy matrices preserved (Fig. 4D; R2 = 96% for both monkeys). Reconstruction of monkey G2's spatiotemporal muscle synergies by four spatial synergies, for example, is illustrated in Figure 4E.
The qualitative results concerning spatial muscle synergies hold also for spatial neural synergies (data not shown). For this analysis, the same populations of motor cortical units (G1: 54, G2: 17) were decomposed into spatial, rather than spatiotemporal, synergies. We set the dimensionality of the spatial synergies derived from neural data to be equal to that selected for spatial muscle synergies (i.e., eight for G2, 10 for G1). At this dimensionality, spatial neural synergies were able to account for 95% of firing rate variability in G2's units (as for the muscle synergies in Fig. 4C), but only 80% of G1's greater number of units (data not shown). Spatial neural synergies derived from spatiotemporal synergies (and not directly from the data) could explain 94% of the spatiotemporal synergies' firing rate variability for G2 and 85% for G1.
More importantly, when comparing the spatial synergies derived from spatiotemporal synergies with those extracted independently from the data, we found that all 5/5 (G1) and 4/4 (G2) of the former could be uniquely matched to the latter using a greedy search procedure, with significantly above-chance (p < 0.05) dot products. This held both for muscle data (G1 dot products: 0.73, 0.83, 0.84, 0.86, 0.93; G2: 0.83, 0.85, 0.87, 0.92; Fig. 4A) and for neural data (G1: 0.60, 0.70, 0.71, 0.84, 0.96; G2: 0.82, 0.84, 0.85, 0.95; data not shown). These results suggest that, at both the neural and muscle levels, the spatiotemporal synergies may be composed of sequences of a subset of the spatial synergies available to the animal.
Discussion
The idea that motor primitives may be encoded within the CNS remains controversial (Tresch and Jarc, 2009). The encoding of spatiotemporal synergies in particular would be radically different than continuous control of instantaneous motor variables. In much of the motor control literature, cortical activity is assumed and shown to be related to continuously controlled quantities such as the position, velocity, or acceleration of extrinsic effectors or intrinsic joints, the forces underlying these kinematics, or the muscle contractions determining these dynamics (Georgopoulos et al., 1999; Carmena et al., 2003; Morrow and Miller, 2003; Schieber and Rivlis, 2007; Stark et al., 2007b; Hochberg et al., 2012; Collinger et al., 2013). The main novelty of the present results is our finding that, at the population level, motor cortical units in the primate brain also reflect the recruitment of discrete, spatiotemporal synergies.
Earlier studies of motor cortex during natural behavior are consistent with its role in controlling discrete movement primitives. Motor cortical neurons have been observed to have temporally complex and heterogeneous response patterns during primate forelimb movement, with no apparent stable encoding of movement parameters (Churchland et al., 2007). Overall, such neurons do not appear to be as well tuned to instantaneous motor parameters as they are to the final multijoint posture terminating a movement (Aflalo and Graziano, 2006) or to temporally extended trajectories (Hatsopoulos et al., 2007; Saleh et al., 2012). Our results extend these findings to the control of muscles. We show that population-level activity reflects the presence of spatiotemporal synergies—ones whose dimensionality, timing, and amplitude modulation correspond with synergies independently inferred from muscle recordings.
Although we could not show directly that spatiotemporal neural synergies caused muscle synergies, the ∼70 ms difference in onset times was at least consistent with a causal relationship between the central and peripheral expressions of the synergies. Moreover, our analysis of single-trial reconstructions (Fig. 3B) does suggest a relatively direct relationship between a spatiotemporal synergy observed in neural activity and then in muscle activity around the time of object grasp. Although beyond the scope of this report, other methods for comparing bases extracted from spatiotemporal data from different sources, such as bidirectional independent component averaged representation (Brown et al., 2012, 2013), could potentially be adapted for the muscular and neural time series data considered here.
This evidence for a population-level encoding of spatiotemporal synergies hardly rules out other encoding schemes. For example, our results are broadly consistent with evidence that object properties are encoded within forelimb areas of primate motor cortex. Prior studies have found individual units that specifically encode objects (Georgopoulos et al., 1999) or the unique digits or grasping movements that they may activate (Poliakov and Schieber, 1999). In other studies, subsets of cells in PMd (Stark et al., 2007a), PMv (Murata et al., 1997), and MI (Mason et al., 2002) have all been shown to be sensitive to object shape, size, and other parameters that can modulate primate prehension. Here, we did not examine individual units or cortical areas for evidence that they preferentially encode specific objects, object properties, or postures. Nevertheless, units controlling the recruitment of spatiotemporal synergies would also be expected to modulate their discharge according to kinematic variables such as object size.
Another encoding scheme to consider is the representation of time-invariant spatial synergies. In contrast to spatiotemporal synergies, many studies have focused on synergies defined as simultaneous recruitment of groups of muscles with invariant levels of relative activation. Spinal cord studies have provided much of the direct evidence for CNS control of spatial synergies. For example, low-dimensional EMG patterns can be evoked by chemical microstimulation in spinalized frogs (Saltiel et al., 2001). In a similar preparation, intermediate zone spinal neurons are less related to the EMG of individual muscles than to spatial-synergistic premotor drives (Hart and Giszter, 2010). In primates, the spinal cord has been suggested as a substrate for spatial synergies (Cheung et al., 2009) and its premotor interneurons facilitate multiple muscles, including intrinsic hand muscles (Takei and Seki, 2010).
Spatial synergies may also be encoded supraspinally in the brainstem (Roh et al., 2011) or in the cortex. Within motor cortex, firing rates of forelimb-related cells or ensembles can predict spatial synergy combinations in rodents (Kargo and Nitz, 2003) and EMG profiles in primates (Morrow and Miller, 2003; Schieber and Rivlis, 2007), in whom the strongest cell–EMG correlations have also been observed to group into several clusters that tend to resemble spatial synergies (Holdefer et al., 2002). These “muscle fields” may be relatively hard-wired (Kargo and Nitz, 2003), particularly in the case of corticospinal and corticomotoneuronal cells that project through the spinal cord to act on small groups of muscles (Fetz and Cheney, 1980; Bennett and Lemon, 1994; Rathelot and Strick, 2009). Recently, we showed that EMG activity evoked by ICMS in monkeys could be decomposed into spatial muscle synergies similar to those seen in natural behavior (Overduin et al., 2012; Santello et al., 2013). Consistent with a role of these spatial synergies in simplifying motor control, evoked EMG patterns tend to sum linearly when multiple points in motor cortex are stimulated together (Ethier et al., 2006).
Given their discrete versus continuous control schemes, these two conceptions of synergies, spatiotemporal and spatial, are too fundamentally disparate to allow a simple test to determine which provides a better explanation of the data. Instead, we suggest that they are compatible rather than opposed (Fig. 4). In particular, spatial synergies may be hard-wired and conscripted together in the form of higher-level, spatiotemporal synergies (Drew et al., 2008). The encoding of groups of synergistic muscles by subpopulations of motor cortical units (Yakovenko et al., 2011), together with the transient activation of these cells relative to the more sustained firing of spinal neurons (Shalit et al., 2012), suggest that the cortex could program spatiotemporal sequences of activity across these muscle groups. The large cortical territory related to the forelimb in primates may be specialized for solving how to control this high-dimensional musculoskeletal apparatus using learned combinations of lower-level primitives (Berniker et al., 2009). The spatial synergies uninvolved in reconstructing spatiotemporal synergies (i.e., the unmatched plots in Fig. 4A) may capture feedback mechanisms or other control processes not mediated by spatiotemporal synergies.
In addition to demonstrating that motor cortical activity reflects recruitment of spatiotemporal synergies, the present investigation is the first to our knowledge to have applied spatiotemporal (time-varying) synergy decomposition to single-unit data. Other dimensionality reduction techniques have been used previously, including locally linear embedding (Churchland et al., 2007) or modified factor analysis (Santhanam et al., 2009) including Gaussian-process factor analysis (Yu et al., 2009). However, the non-negative matrix factorization approach used in synergy decomposition may be better suited to neural (and EMG) data than the above techniques. This is both because unit firing rates (like muscular contractions) are intrinsically non-negative and because the spatiotemporal version of the algorithm in particular can isolate temporally extended primitives.
There are important implications for the development of neuroprosthetics in our finding that the CNS may build complex behaviors by combining spatiotemporal synergies (which themselves may be built from learned sequences of spatial synergies). Motor neuroprostheses may allow paralyzed patients to bypass damaged spinal circuits and control directly virtual or robotic prostheses or functional electrical stimulation (FES) implants. Most experiments to date have used subjects' motor cortical activity to continuously control computer cursor movements (Serruya et al., 2002; Taylor et al., 2002; Carmena et al., 2003; Ganguly and Carmena, 2009), robots (Carmena et al., 2003; Hochberg et al., 2012; Collinger et al., 2013) or FES systems (Moritz et al., 2008; Pohlmeyer et al., 2009) of up to four independent dimensions. Researchers hope that, as the stability and density of recording technology improve, subjects will be able to handle more such degrees of freedom. However, if the CNS is wired to control only a handful of spatiotemporal synergies, then neuroprosthetic performance could degrade rapidly as further variables are added. At the same time, if the brain is built to launch these spatiotemporal synergies only once per movement rather than continuously, this biological design feature could radically simplify control architectures for neuroprosthetic systems.
Footnotes
↵*J.M.C. and E.B. are co-senior authors.
This work was supported by a fellowship from the Dystonia Medical Research Foundation (S.A.O.), the National Institute of Neurological Disorders and Stroke–National Institutes of Health (Grant NS44393 to E.B.), and the National Science Foundation (Grant EFRI-1137267 to J.M.C.). We thank Margo Cantor and Charlotte Potak for assistance.
The authors declare no competing financial interests.
- Correspondence should be addressed to Simon A. Overduin, PhD, Department of Electrical Engineering and Computer Sciences, University of California, 185, Li Ka Shing Center, Berkeley, CA 94720. overduin{at}mit.edu