CONTEXT LEARNING FOR THREAT

Context Learning for Threat Detection

Akos Szekely, Suparna Rajaram, & Aprajita Mohanty

Stony Brook University

Author Note

Akos Szekely, Department of Psychology, Stony Brook University;

Suparna Rajaram, Department of Psychology, Stony Brook University;

Aprajita Mohanty, Department of Psychology, Stony Brook University.

This research was supported by funds from Stony Brook University. The authors thank Jie Wang for assistance with data collection, and Christian Luhmann for comments on an earlier draft.

Correspondence should be addressed to Aprajita Mohanty, Department of Psychology, Stony Brook University, Stony Brook, NY, 11794-2500. Contact:

Abstract

It is hypothesized that threatening stimuli are detected better due to their salience or physical properties. However, these stimuli are typically embedded in a rich context, motivating the question whether threat detection is facilitated via learning of contexts in which threat stimuli appear. To address this question, we presented threatening face targets in new or old spatial configurations consisting of schematic faces and found that detection of threatening targets was faster in old configurations. This indicates that individuals are able to learn regularities within visual contexts and use this contextual information to guide detection of threatening targets. Next, we presented threatening and non-threatening face targets embedded in new or old spatial configurations. Detection of threatening targets was facilitated in old configurations, and this effect was reversed for non-threatening targets. Present findings show that detection of threatening targets is driven not only by stimulus properties as theorized traditionally but also by learning of contexts in which threatening stimuli appear. Further, results show that context learning for threatening targets obstructs context learning for non-threatening targets. Overall, in addition to typically emphasized bottom-up factors, our findings highlight the importance of top-down factors such as context and learning in detection of salient, threatening stimuli.

Keywords: Context; Threat; Learning; Emotion; Detection

Context Learning for Threat Detection

Threatening stimuli are detected faster and more accurately than neutral stimuli (Williams, Mathews, & MacLeod, 1996; Mogg & Millar, 2000; Fox et al., 2000), and this perceptual advantage has traditionally been attributed to bottom-up factors such as distinctive physical characteristics or evolutionary salience of threatening stimuli (Vuilleumier, 2005; Vuilleumier & Driver, 2007; Lundqvist, Esteves, & Ohman, 1999; Lundqvist & Ohman, 2005; Ohman, Lundqvist, & Esteves, 2001). Neurally, it is hypothesized that threatening stimuli are processed via subcortical pathways to the amygdala, allowing an automatic response to threat even before cortical processing is complete (LeDoux, 1996; Öhman, 2002). However, recent research shows that detection of emotional stimuli is impacted by top-down factors such as goals (Hahn & Gronlund, 2007), attention (Sussman, Weinberg, Szekely, Proudfit, & Mohanty, 2016; Sussman et al., 2015; Pessoa, 2005; Pessoa & Adolphs, 2010; Mohanty et al., 2007), and expectations (Jin, Sussman, Szekely, Luhmann, & Mohanty, 2015). While this research shows that explicit information regarding threat enhances its subsequent detection, in the real world threatening stimuli do not necessarily occur in isolation or follow explicit cues predicting their occurrence.

Rather, threatening stimuli often occur embedded in a rich context of elements that predict their identity and location. It is adaptive for humans to learn regularities in contexts that predict threat and use this learning to guide their behavior even before arrival of the threatening stimulus. For example, after encountering a snake slithering out of a pile of rocks next to several trees near a swamp, a person will search for the snake the next time they encounter such a configuration of rocks near trees and a swamp. This may not be the case when they encounter another configuration comprising a pile of rocks by a concrete path in a manicured lawn. Despite the relevance of such context learning in everyday behavior, and the potential consequences of its impairment in clinical conditions such as post-traumatic stress disorder (Liberzon & Sripada, 2007; Maren, Phan, & Liberzon, 2013; Milad et al., 2009; Rougemont-Bücking et al., 2011), this function has not been empirically investigated. Some empirical studies examining the detection of threatening stimuli have presented these target stimuli within an array of non-threatening stimuli (Gilboa-Schechtman, Foa, & Amir, 1999; Hansen & Hansen, 1988; Ohman, Lundqvist, & Esteves, 2001; Pinkham, Griffin, Baron, Sasson, & Gur, 2010); however, to our knowledge, no studies have manipulated learning of the surrounding context, i.e., the array, and examined its impact on threatening target detection. Manipulation of surrounding context is important in threat detection because it allows examination of both bottom-up attentional capture by salient targets and top-down use of contextual information for detection.

In the present study we examined whether individuals are able to learn and use contextual information to guide detection of threatening stimuli in a top-down manner. There is considerable evidence indicating that knowledge regarding spatial context can create expectations about the location or identity of visual objects, helping us to rapidly and accurately detect them (Bar, 2004; Summerfield & de Lange, 2014; Summerfield & Egner, 2009). Contextual information aids detection of targets by constraining the range of possible objects that can be expected to occur within that context. Furthermore, visual contexts are not random. Rather, they tend to be highly regular in terms of location and timing of objects relative to each other. Humans can extract statistical regularities in their visual context and use this knowledge to guide detection of visual targets (Turk-Browne, Jungé, & Scholl, 2005; Turk-Browne, Scholl, Johnson, & Chun, 2010).

Robust memory of regularities within visual contexts has been shown to guide faster detection of embedded targets, as demonstrated by the contextual cueing effect (Chun & Jiang, 1998). In this effect, targets that are repeatedly encountered at an invariant position within the same distractor configuration or ‘context’ are detected faster than when they are presented in non-repeated, random distractor configurations. It is hypothesized that the learned distractor contexts serve as attentional “cues” that guide attention to the target for faster detection than do non-repeated distractor contexts (Chun, 2000). Furthermore, this facilitation was driven by implicit memory of visual context information because participants were not instructed to learn the displays and the learning advantage occurred even when participants, including amnesic individuals, remained unaware of context repetition (Chun & Phelps, 1999; Chun, 2000).

It is well-demonstrated using associative learning paradigms that humans learn contexts associated with threat better than non-threatening contexts (Grillon & Davis, 1997). However, it remains unknown whether threat detection itself involves the use of context. In other words, is detection of threatening stimuli automatic and immune to knowledge regarding the surrounding context? Or, is it possible to extract regularities from the global context (where the context itself is non-threatening but consistently contains a threatening target) and use this contextual information to guide detection of threatening targets? Furthermore, if such contextual learning occurs for threat detection, how does it compare with the learning of non-threatening targets? To examine these questions, we presented threatening (angry features) or non-threatening (neutral features) schematic face targets in arrays that consisted of non-threatening faces and were either repeated or new. If threatening faces are detected automatically or due to bottom-up factors such as their physical characteristics or salience, we would expect them to be detected faster irrespective of whether they are presented in old or new arrays. However, if participants are able to learn the context created by relatively non-threatening distractors and use this contextual knowledge to then detect threatening faces, we expect to see faster detection of threatening faces for old versus new arrays. This would indicate that attentional bias to threatening stimuli can be modified based on learning of the surrounding context and is not strictly automatic in nature.

In Experiment 1, we tested whether context learning can facilitate the detection of schematic threatening face stimuli and hypothesized faster detection of threatening faces for old versus new arrays. In Experiment 2 we tested whether contextual learning can facilitate detection of discriminable target stimuli that are not emotionally salient. In Experiment 3 we tested the hypothesis that detection of threatening targets in old vs. new arrays would be better than detection of non-threatening stimuli in old vs. new arrays. Experiment 4 sought to demonstrate that contextual cueing occurs for non-threatening stimuli as well. Finally, in Experiment 5 we tested the hypothesis that enhanced detection of threatening targets in old vs. new arrays would be even more augmented by providing additional time to process the configuration of the distractors in the arrays. This hypothesis is based on earlier studies in which “placeholders” predicting the future locations of array items were presented before the array itself, providing additional time to encode the locations of array items and augmenting the process of contextual cueing (Geyer, Shi, & Muller, 2010; Geyer, Zehetleitner, & Muller, 2010; Ogawa & Kumada, 2008; Ogawa & Watanabe, 2010). The “pre-cuing” in Experiment 5 allowed us to examine whether the same augmentation of contextual cueing also occurs for threatening stimuli and if it differs for non-threatening stimuli. If there is an augmentation for threatening stimuli, based on previous research it would best be attributed to additional learning of the configuration of the array, rather than bottom-up capture of attention by the threatening target.

General Method

In five experiments, we used a modification of the classic contextual cueing paradigm with the ultimate aim of examining whether learning of visuospatial context facilitates detection of threatening targets and how this compares to detection of non-threatening targets. In the classic contextual cueing task participants search for a rotated T target presented in visual arrays of rotated L distractors (Chun & Jiang, 1998; Chun & Phelps, 1999). In the present series of experiments participants searched for a schematic face drawn in a dotted line presented among schematic faces drawn in solid lines. The selection of dependent measures and data trimming (where applicable) in the present experiments followed the rationale reported in past research using similar paradigms (Chun & Jiang, 1998, 2003; Luhmann, 2011). The selection of experimental manipulations was driven by the theoretical hypotheses under investigation.

Apparatus and Stimuli

All stimuli were line drawings of an angry (threatening), a jumbled, or a non-threatening face. We adapted the line drawings of faces from previous research that used schematic threatening and non-threatening faces as effective emotional and neutral stimuli (Lundqvist et al., 1999; Lundqvist & Ohman, 2005).

The experiments were conducted on Dell computers using Psychopy software (Peirce, 2007, 2008). Face stimuli consisted of line drawings (Figure 1) where target faces were drawn with dotted lines while distractor stimuli were drawn with solid lines. Unrestrained viewing distance was approximately 50 cm. The visual array appeared within a grid of 8 x 6 positions that subtended approximately 23.91 x 19.81 degrees of visual angle, and was limited to a 13.6 inch x 10.2 inch (17 inches diagonally) section of the computer monitors (see Chun & Jiang, 1998). The background was set to a light gray of hexadecimal value #EDEDED, and the line drawings of faces were colored blue, green, red, and purple, with hue, saturation, and brightness objectively matched across colors (Heider, 1972). The size of each face was approximately 2.3 x 3.1 degrees of visual angle. In order to prevent collinearities with other stimuli, items were jittered from their positions in steps of 0.2 degrees of visual angle up to 0.8 degrees of visual angle in both the x and y dimensions. The combination of grid and stimulus size prevented any stimulus from appearing within 1 degree of visual angle of any other stimulus.

Facial outline, ear position and size, nose position and size, eyebrow length, and eye position were held constant. Eyes were ovals for non-threatening faces and were cut in half, with the top removed, for angry faces. Eyebrows were horizontal in non-threatening faces and tilted at approximately a 45-degree angle in threatening faces; the mouth was a straight line in non-threatening faces and, in angry faces, was curved by raising the central point closer to the nose and slightly increasing the length of the mouth. The interior of the faces was set to the background color. Dotted lines were subjectively chosen to appear difficult but differentiable on screen. The exact shape of the face and features was selected for effectiveness in conveying emotion based on ratings of valence (on a scale of 1-10, with 1 being most negative and 10 being most positive), arousal (on a scale of 1-10, with 1 being lowest arousal and 10 being highest arousal), and dominance (on a scale of 1-10, with 1 being least dominant and 10 being most dominant) (Aronoff, Woike, & Hyman, 1992; Schlosberg, 1952) for a separate group of 18 participants. Schematic threatening faces (Valence: M = 2.06, SD = 0.80; Arousal: M = 6.06, SD = 1.30; Dominance: M = 6.78, SD = 1.00) were rated significantly more unpleasant, t(34) = -11.87, SE = 0.27, p < 0.001, arousing, t(34) = 4.79, SE = 0.45, p < 0.001, and dominant, t(34) = 7.54, SE = 0.38, p < 0.001, than non-threatening faces (Valence: M = 5.28, SD = 0.83; Arousal: M = 3.89, SD = 1.41; Dominance: M = 3.89, SD = 1.28).

General Design & Procedure

Each trial began with a small fixation cross appearing in the middle of the screen for 500 msec, followed by an array of faces (Figure 1). Participants were asked to indicate the presence or absence of a face drawn with a dotted line in the array of faces drawn in solid lines by pressing the “.” key if the target was present, and the “/” key if it was absent. Hence, we asked participants to discriminate the presence versus absence of a target in dotted lines in the array. We chose this task to minimize the potential for “pop-out” effects for targets of interest (Geyer, Zehetleitner, & Muller, 2010) because our ultimate aim was to measure the emergence of contextual learning rather than stimulus-driven responses to threatening stimuli. Each trial ended after the participant’s response or after 6 sec. The visuospatial context consisted of an array of faces where the spatial configurations of the faces in the search display were manipulated. In the old condition, the array of faces repeatedly appeared in consistent locations across blocks of trials such that the visuospatial context predicted the location of face targets. In the new condition, the locations of the array of faces varied from trial to trial. The old configurations consisted of 12 randomly generated arrays of face stimuli, each occupying 12 of the 48 possible locations, and were repeated once per block. The new configurations consisted of a random arrangement of all other possible locations, of which 12 were randomly selected for each block; each new configuration was presented only once throughout the experiment. For old configurations, targets and nontargets were first placed randomly within the array, and then remained in the same set of positions after being jittered across blocks of the experiment.
To remove location probability effects, target stimuli appeared equally often in each of the 48 possible locations throughout the experiment; 12 locations were used in old configurations, and the other 36 were used in new configurations. All arrays were generated separately for each participant. Participants received old arrays after either one or two new arrays and new arrays after either one or two old arrays. Target quadrant and color were counterbalanced to ensure an even distribution of all possible combinations within each block and across all blocks.
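The display-generation constraints described above (12 fixed “old” configurations with unique target locations, the remaining 36 locations reserved for “new” targets, and sub-position jitter) can be sketched in code. The following is a minimal illustration under assumed parameters taken from the Apparatus section (8 x 6 grid, 12 items per array, jitter in 0.2-degree steps up to 0.8 degrees); the actual displays were generated in Psychopy, and names such as `make_old_configurations` are hypothetical.

```python
import random

GRID_W, GRID_H = 8, 6      # 8 x 6 grid = 48 possible cell positions
N_OLD = 12                 # number of repeated ("old") configurations
ITEMS_PER_ARRAY = 12       # faces per search array (one target + distractors)

def make_old_configurations(rng):
    """Reserve 12 unique target cells for old arrays and fix each array's
    distractor cells; the remaining 36 cells host targets of new arrays."""
    cells = [(x, y) for x in range(GRID_W) for y in range(GRID_H)]
    rng.shuffle(cells)
    target_cells = cells[:N_OLD]
    configs = []
    for t in target_cells:
        others = [c for c in cells if c != t]
        distractors = rng.sample(others, ITEMS_PER_ARRAY - 1)
        configs.append({"target": t, "distractors": distractors})
    return configs, cells[N_OLD:]

def jitter(cell, rng, step=0.2, max_j=0.8):
    """Offset a cell position in steps of 0.2 deg, up to +/-0.8 deg, in x and y."""
    n = round(max_j / step)
    offsets = [i * step for i in range(-n, n + 1)]
    return (cell[0] + rng.choice(offsets), cell[1] + rng.choice(offsets))
```

Whether the jitter for an old configuration is applied once and then frozen, or re-drawn per block around the fixed cells, is not fully specified in the text; the sketch leaves that choice to the caller.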

Before each main experiment began, participants performed a practice session of a separate set of 24 trials to familiarize themselves with the task and procedure. Since implicit memory of visual context is hypothesized to guide contextual learning, we confirmed whether this was the case by conducting an explicit recognition memory test. After performing the contextual cueing task, participants were given a recognition test, also on the computer. The procedure for the explicit recognition test was similar to prior contextual cueing studies (Chun & Jiang, 1998; Chun & Phelps, 1999) in that participants were asked to indicate whether they noticed repetition of arrays. Next, all participants were shown the 12 old displays and 12 of the 36 new displays, selected at random and presented in a random, interspersed order, and asked to indicate whether they recognized each array as having been repeated.

General Data Analyses

All RT data were trimmed by removing trial RTs greater than three standard deviations above the condition mean for all subjects (Chun & Jiang, 1998, 2003). To increase the power of the analyses, trials were grouped into four epochs, each epoch consisting of five blocks (Kunar et al., 2007; Chun & Jiang, 1998). Data analyses were performed using SPSS (Version 22.0). For the RT data, consistent with the literature on contextual cueing, we focus on the target-present trials only. We included target-absent trials in our task to enable participants to detect the target (a dotted face) in the array without having to directly respond to the critical attribute, namely target emotion. However, only target-present trials were of theoretical interest to test the hypotheses concerning target detection across old and new contexts (Chun & Jiang, 1998). Consistent with earlier studies (Geyer, Zehetleitner, & Muller, 2010), in all our experiments accuracy for target detection averaged above 78% correct. Accuracy (Table 1) either did not differ across conditions of interest (p > 0.40) or did not change the interpretation derived from the RT data. In all the experiments, the hypotheses focused on two learning-related effects: 1) context-independent learning manifesting as a facilitation of RTs for target detection across epochs, and 2) context-dependent learning demonstrated by faster detection of targets in old compared to new displays. These effects were examined via a 2 (configuration: old vs. new) x 4 (epoch) repeated-measures analysis of variance (rm-ANOVA) in Experiments 1, 2, and 4, and via a 2 (target emotion: threatening vs. non-threatening) x 2 (configuration: old vs. new) x 4 (epoch) rm-ANOVA in Experiments 3 and 5. Furthermore, since RTs during epochs 1 and 4 reflect the start and end result of learning, we also conducted rm-ANOVAs with only epochs 1 and 4, as well as epoch 4 only (Chun & Jiang, 1998).
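The two preprocessing steps stated above, trimming RTs more than three standard deviations above a condition mean and averaging consecutive blocks into epochs of five, might be coded as in the following sketch. This illustrates the stated rules only; it is not the SPSS pipeline actually used, and the function names are our own.

```python
import statistics

def trim_rts(rts):
    """Drop trials with RTs more than 3 SDs above the condition mean."""
    m = statistics.mean(rts)
    sd = statistics.stdev(rts)
    return [rt for rt in rts if rt <= m + 3 * sd]

def to_epochs(block_means, blocks_per_epoch=5):
    """Average consecutive block means into epochs (e.g., 20 blocks -> 4 epochs)."""
    return [statistics.mean(block_means[i:i + blocks_per_epoch])
            for i in range(0, len(block_means), blocks_per_epoch)]

# A 5000-ms outlier among 500-ms trials falls above the 3-SD cutoff and is removed;
# twenty block means collapse into four epoch means.
print(trim_rts([500] * 10 + [5000]))   # the 5000-ms trial is dropped
print(to_epochs(list(range(1, 21))))   # -> [3, 8, 13, 18]
```

Note that trimming is defined per condition, so `trim_rts` would be applied separately to each subject-by-condition cell before computing block means.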

Experiment 1

Experiment 1 was designed to test whether context learning can facilitate the detection of schematic threatening face stimuli. If threatening faces are detected automatically, they would ‘pop-out’ in an array of non-threatening faces and be detected faster regardless of whether they are presented in old or new arrays. However, if learning of surrounding context can facilitate detection of threat, old arrays will lead to faster detection of target schematic faces than new arrays, showing that threat detection is not necessarily automatic as it can be learned.