Matches in SemOpenAlex for { <https://semopenalex.org/work/W2039918072> ?p ?o ?g. }
Showing items 1 to 73 of 73 with 100 items per page.
- W2039918072 endingPage "809" @default.
- W2039918072 startingPage "806" @default.
- W2039918072 abstract "In a crowded environment, how do we hear a single talker while ignoring everyone else? In this issue of Neuron, Zion Golumbic et al. (2013) record from the surface of the human brain to show how speech tracking arises through multiple neural frequency channels, both within and beyond auditory cortex. Spoken language is a foundation of human society: billions of us use it every day, for most of our lives, to communicate nuanced information about our mental states. Unfortunately, background noise and other talkers often corrupt speech acoustics, especially when conversing in social environments such as a workplace, cafe, or sidewalk. But even in these cluttered scenes, we manage to segregate and selectively attend to just one talker. We may be dimly aware of other sounds, yet the attended voice becomes the only perceptually salient one, yielding full understanding. This remarkable ability, the so-called “cocktail party effect,” has been studied behaviorally for over half a century (Cherry, 1953), but only now are we beginning to understand how our brains accomplish it. 
One central challenge has been to identify the stages of neural processing where selectivity for an attended talker emerges. At early levels of the auditory system, we know that the processing of attended and unattended talkers will be conflated. Unlike visual objects on the retina, sounds from different talkers enter the ear mixed together, so the brain must tease them apart using cues such as spatial location and precise temporal coherence (Shamma et al., 2011). But eventually we track only one voice at a time, which then achieves perceptual dominance and favored access to further processing such as memory. The question is, where does this happen? Historically, much neuroscientific research into the “locus” of auditory attentional selection has used artificial, nonspeech stimuli in relatively uncluttered scenes. The approach has many advantages, including easily parameterized stimuli and well-characterized neural responses, and it demonstrates that attention can affect very early sensory activity (Woldorff et al., 1993). However, real environments present multiple simultaneous and conflicting cues, i.e., a high perceptual load, which will constrain and influence the locus of attentional selection (Lavie, 2005). Therefore, an emphasis on ecological validity, and on analytic methods to cope with the more complex neural responses, will be crucial to address how selectivity for one talker emerges above the background. 
Closely related to where selectivity arises is how neural activity dynamically parses different talkers and how attention modulates those representations. One prominent theory holds that speech perception relies on cortical activity entraining or phase-locking to quasirhythmic features of the acoustics, at multiple embedded time scales such as the syllable and phrase (Giraud and Poeppel, 2012). These low-frequency fluctuations (<8 Hz) in neural activity would parse the speech representation, imposing periods of relatively high excitability upon high-frequency spiking activity, which in turn would encode the speech and communicate it to higher levels of processing. Such coordination or nesting of neural activity across different frequency bands is therefore proposed to be a fundamental principle of neural representation, computation, and regional communication. In the context of a multitalker environment, the “selective entrainment” hypothesis holds that attention causes these low-frequency fluctuations to phase-align exclusively to the attended talker, so only the attended speech drives the higher-frequency spiking and downstream processing. Despite its theoretical import, however, the question remained whether low-frequency phase entrainment and high-frequency power modulations simultaneously track the attended talker in a cluttered scene. The paper in this issue of Neuron from Zion Golumbic et al. (2013) addresses these questions by recording from the cortical surface while subjects attend to one of two competing talkers. 
Subdural surface array recordings (electrocorticography, or ECoG) are presently a gold standard for acquiring human brain activity with high temporal and spatial resolution. They have some practical limitations, in that they cannot be used in healthy subjects (implantation typically precedes epilepsy surgery) and cannot sample the cortex fully or uniformly. But unlike noninvasive techniques, they provide good signal-to-noise across a large range of frequencies, including high-gamma (75–150 Hz) which has been shown to correlate with multiunit activity. In this study, subjects had large, coarsely-spaced electrode arrays (total ∼120 contacts, ∼1 cm spacing) implanted over lateral cortex while they performed a “cocktail party” comprehension task. On each trial, subjects were presented a brief (9–12 s) movie of two simultaneous talkers, side-by-side, each uttering an unrelated narrative. A cue indicated which talker the subject should attend. At the end of the trial subjects indicated whether a final word in the narrative was congruent (e.g., “The dog barks when he hears the…” doorbell [congruent], or table [incongruent]). Another block consisted of trials with each single talker alone and provided a reference for the multitalker situation. The task was therefore rather naturalistic, much like conversing with two people and attending to one at a time. Neural signals were analyzed across frequencies in three converging ways, each appropriate for measuring a different aspect of the speech response. First, intertrial coherence assessed the consistency of neural responses, both in phase (reliably precise timing) and power (reliable trial-to-trial amplitude). Consistency alone however does not specify what about the speech is represented. Therefore, responses from all electrodes were integrated as a population to create a reconstruction of the speech temporal envelope, correlated with brain activity. 
Finally, temporal response functions (TRFs) were derived to find the linear kernel or representative response for each electrode, frequency band, and talker. These TRFs are the most specific of the three measures because they can test whether attention merely decreases the relative amplitude of an ignored talker’s cortical representation or whether attention abolishes it, and with what time course. The data clearly show that both low-frequency phase (delta-theta, 1–7 Hz) and high gamma power (70–150 Hz) yield consistent trial-to-trial responses to speech. Other frequency bands do not, nor does low frequency power, adding weight to the argument that speech tracking is partly due to entrainment of endogenous rhythms (Schroeder and Lakatos, 2009). However, these effects are not equally distributed across cortical areas. The high-gamma tracking tends to be clustered in the superior temporal lobe and the low-frequency phase response is more widespread, including superior and anterior temporal regions and inferior parietal and frontal lobes. Across electrodes though, both the low-frequency phase and high-gamma power showed more consistent responses to the attended versus the ignored speech. Corroborating this observation, speech envelope acoustics could only be reconstructed from neural responses for the attended talker, not the unattended. Finally, the TRFs or best linear responses based on low-frequency phase or high gamma power allowed a direct comparison of attended and ignored speech tracking. Some electrodes, clustered mainly along the Sylvian fissure, displayed a relative gain of attended versus unattended speech (for both frequency ranges). Others, spatially more disperse, showed an essentially exclusive preference for the attended talker, i.e., no detectable tracking of the ignored. 
These more selective sites also increased in their selectivity for the attended talker over the course of the sentences. In other words, tracking an attended talker depends on low-frequency phase-locking as well as high-gamma modulation. Near auditory cortex this activity still represents the ignored talker, albeit less strongly than the attended. This entrainment becomes more exclusive in higher order cortical regions, perhaps reflecting the perceptual dominance of the target talker. The present paper synthesizes and advances several recent studies on selective attention to speech. Notably, Mesgarani and Chang (2012), also using ECoG in human patients, showed that high gamma activity in nonprimary auditory cortex tracked the detailed acoustic features of two simultaneous talkers. Attended speech was represented more powerfully than unattended speech, although unattended was still evident, and this selectivity grew over the course of a sentence. In their study, electrode arrays covered mainly the posterior superior temporal lobe, so they could not test how attentional selectivity emerges over large cortical areas (also they observed no anatomical patterns within the covered region). However the arrays had high spatial density (4 mm) which enabled the reconstruction of speech acoustics from neural responses not only in time but also in frequency. This granularity is difficult or impossible to achieve with less dense surface arrays or noninvasive methods. Also, one particularly elegant aspect of their paradigm was the close neural link to behavior, demonstrating how well activity in nonprimary auditory cortex relates to subjective perceptual outcome. Not only was the target talker encoded best when subjects performed successfully, but the data showed that many errors in comprehension seemed to follow from an early misallocation of attention to the wrong talker. 
While Mesgarani and Chang’s analysis was limited to high-gamma neural activity, other recent work using noninvasive methods provides evidence for speech tracking by lower frequencies. For instance Kerlin and colleagues (Kerlin et al., 2010), recording at the scalp with electroencephalography (EEG), demonstrated that 4–8 Hz fluctuations in the auditory speech response were modulated by attention. This low-frequency representation of the attended talker, likely following the amplitude modulations in speech acoustics, was boosted while that of an unattended talker was mildly suppressed. Unlike the high gamma studies, Kerlin et al. placed talkers in different perceived locations using virtual acoustic space. This lent additional realism to the task and allowed them to link early occipitoparietal alpha lateralization, presumed to reflect spatial attentional control, to later modulation of sensory processing. The results broadly mirror those of Ding and Simon (2012), who went on to show with magnetoencephalography (MEG) that the low-frequency auditory responses become biased toward attended speech first around ∼100 ms latency in nonprimary auditory cortex, rather than during the earlier ∼50 ms response presumed to arise in primary fields. Interestingly, Ding and Simon also independently varied the intensity of target or background talkers, demonstrating that in posterior auditory cortex talkers are represented as different objects, susceptible to object-specific attentional gain modulation. However, neither of these studies characterized the emergence of selectivity beyond superior temporal cortex or as a function of high-frequency neural activity. 
The present results by Zion Golumbic et al. (2013), by addressing both low-frequency and high gamma neural representations with large arrays on the cortical surface, therefore synthesize and extend evidence from numerous approaches within a coherent theoretical framework. Zion Golumbic’s study points the way toward a number of questions and challenges for future studies. Among the most pressing are to specify the functional relationships (if any) among different frequency ranges involved in representing or modulating speech, and among the cortical regions where they arise (Canolty and Knight, 2010; Giraud and Poeppel, 2012). For instance, though the present results generally support a “selective entrainment” hypothesis, they do not speak to one of its key tenets. Specifically, we do not yet know whether cross-frequency phase-power coupling observed for entraining clear speech or simpler rhythmic stimuli plays a perceptually consequential role in a cocktail party environment. Also unclear is the precise relation between these low-frequency phase or high gamma power effects and activity in other bands. In particular, alpha activity (8–12 Hz) represents one of the most prominent brain rhythms, is closely related to perception and attention, and may through coupling to higher frequencies reflect inhibition upon auditory processing (Jensen et al., 2012). 
And though the present paper sets the stage, much work remains to distinguish or harmonize selective entrainment with other models of attention including traditional gain modulation. It is important to note that hypotheses about gain and timing may be mutually compatible, especially in the context of cross-frequency phase-power coupling. Furthermore, for such a complex and inherently dynamic signal as speech, we have yet to determine how selective attention is deployed to precise moments in time (Nobre et al., 2007) and how that might differ from attention to space, pitch, or other speech features. Even framing the “cocktail party problem” raises broader questions about how we perceptually organize a noisy world. First, though attentional selection could act on fully established, competing representations of auditory objects or “streams” (Ding and Simon, 2012), it likely participates in their very formation (Shamma et al., 2011; Shinn-Cunningham, 2008). For speech, this may be the rule rather than the exception given the countless everyday circumstances rendering sensory cues ambiguous, including both acoustic degradations (Wild et al., 2012) and hearing loss. 
We also need to determine how attention interacts with other top-down or contextual influences such as linguistic knowledge or visual information and to what extent selective entrainment or another mechanism might serve as a final common path to exert these influences. Ultimately, given the inescapable ecological complexity of the task, all these questions will bear on understanding how we track speech in a realistic environment. The present paper brings us a large step closer to understanding this enormously important human ability." @default.
- W2039918072 created "2016-06-24" @default.
- W2039918072 creator A5047318844 @default.
- W2039918072 date "2013-03-01" @default.
- W2039918072 modified "2023-09-29" @default.
- W2039918072 title "Shaken, Not Stirred: Emergence of Neural Selectivity in a “Cocktail Party”" @default.
- W2039918072 cites W1978389345 @default.
- W2039918072 cites W1991139021 @default.
- W2039918072 cites W2022697906 @default.
- W2039918072 cites W2027250573 @default.
- W2039918072 cites W2044179284 @default.
- W2039918072 cites W2056486423 @default.
- W2039918072 cites W2065169231 @default.
- W2039918072 cites W2074905267 @default.
- W2039918072 cites W2082183045 @default.
- W2039918072 cites W2107365776 @default.
- W2039918072 cites W2118980133 @default.
- W2039918072 cites W2133780491 @default.
- W2039918072 cites W2138164020 @default.
- W2039918072 cites W2158866673 @default.
- W2039918072 cites W2160932740 @default.
- W2039918072 doi "https://doi.org/10.1016/j.neuron.2013.02.015" @default.
- W2039918072 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3643306" @default.
- W2039918072 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/23473312" @default.
- W2039918072 hasPublicationYear "2013" @default.
- W2039918072 type Work @default.
- W2039918072 sameAs 2039918072 @default.
- W2039918072 citedByCount "0" @default.
- W2039918072 crossrefType "journal-article" @default.
- W2039918072 hasAuthorship W2039918072A5047318844 @default.
- W2039918072 hasBestOaLocation W20399180721 @default.
- W2039918072 hasConcept C118792377 @default.
- W2039918072 hasConcept C15744967 @default.
- W2039918072 hasConcept C161790260 @default.
- W2039918072 hasConcept C169760540 @default.
- W2039918072 hasConcept C185592680 @default.
- W2039918072 hasConcept C38652104 @default.
- W2039918072 hasConcept C41008148 @default.
- W2039918072 hasConcept C46312422 @default.
- W2039918072 hasConcept C55493867 @default.
- W2039918072 hasConceptScore W2039918072C118792377 @default.
- W2039918072 hasConceptScore W2039918072C15744967 @default.
- W2039918072 hasConceptScore W2039918072C161790260 @default.
- W2039918072 hasConceptScore W2039918072C169760540 @default.
- W2039918072 hasConceptScore W2039918072C185592680 @default.
- W2039918072 hasConceptScore W2039918072C38652104 @default.
- W2039918072 hasConceptScore W2039918072C41008148 @default.
- W2039918072 hasConceptScore W2039918072C46312422 @default.
- W2039918072 hasConceptScore W2039918072C55493867 @default.
- W2039918072 hasIssue "5" @default.
- W2039918072 hasLocation W20399180721 @default.
- W2039918072 hasLocation W20399180722 @default.
- W2039918072 hasLocation W20399180723 @default.
- W2039918072 hasLocation W20399180724 @default.
- W2039918072 hasOpenAccess W2039918072 @default.
- W2039918072 hasPrimaryLocation W20399180721 @default.
- W2039918072 hasRelatedWork W1976571824 @default.
- W2039918072 hasRelatedWork W2067291703 @default.
- W2039918072 hasRelatedWork W2153750051 @default.
- W2039918072 hasRelatedWork W2160162460 @default.
- W2039918072 hasRelatedWork W2748952813 @default.
- W2039918072 hasRelatedWork W2899084033 @default.
- W2039918072 hasRelatedWork W2953331192 @default.
- W2039918072 hasRelatedWork W3005527239 @default.
- W2039918072 hasRelatedWork W3145582168 @default.
- W2039918072 hasRelatedWork W3200462162 @default.
- W2039918072 hasVolume "77" @default.
- W2039918072 isParatext "false" @default.
- W2039918072 isRetracted "false" @default.
- W2039918072 magId "2039918072" @default.
- W2039918072 workType "article" @default.