The goal of this study was to establish whether SP is modulated by the two critical media factors perspective and agency and to delineate the neural impact of these factors during video game play. We thus conducted an experiment in which the participants played the video game “Oblivion” in four different conditions (1PP-active, 3PP-active, 1PP-passive and 3PP-passive) while scalp EEG was recorded and subjective experience of SP was assessed. The most important findings may be summarised as follows: (1) Higher SP ratings were obtained in 1PP compared with 3PP and in the active compared with the passive condition. (2) The analysis of the scalp EEG data revealed more alpha band power during 3PP than during 1PP and more alpha band power in the passive than in the active condition. (3) For the theta band, there was more activity in the active than in the passive condition and no significant activity difference between the two perspectives. (4) The intracerebral activations inferred with sLORETA revealed less alpha band activity in the parietal cortex, the occipital cortex, and the limbic cortex for the 1PP compared to the 3PP condition. (5) Finally, there was stronger theta activity bilaterally in the frontal cortex including superior, middle and inferior areas in the active than in the passive condition. In the following, we will discuss these findings in the context of the literature on SP with specific emphasis on neuroscientific findings.
SP feeling in the context of perspective and agency
Operating the video game in the 1PP and active conditions was associated with the strongest subjective SP feelings, thus corroborating the results of previous studies [12, 13]. Nevertheless, our findings go beyond these previous studies because we also examined the interaction of these two factors. In doing so, we found higher SP ratings during the active conditions (1PP-active and 3PP-active) than during both passive conditions (1PP-passive and 3PP-passive), meaning that this general effect was independent of whether the subjects played in 1PP or 3PP. Thus, the factor agency exerts a stronger influence on SP than the factor perspective.
Early conceptualisations of SP have highlighted the specific role of perception in the generation of SP (the perception-oriented view). In this view, SP most strongly depends on the specific perception of spatial cues in VR. It has been argued that the person is concurrently “inattentive” to the spatial cues of the real physical surroundings [2]. Thus, a sensory-rich media stimulation evokes greater attentional processing preferentially via the visual dorsal stream and thus contributes to an enhanced perception of SP [1, 10, 18]. An alternative view of SP has been proposed by Sanchez-Vives and Slater (the action-oriented view) [4]. These authors highlight the role of controlled actions in the real environment or the VE as a constituent feature for experiencing SP. The sense of “being there” in a VE is thought to be based on the ability to “do there”. Thus, conscious action control and movement within the VE are important constituents of SP. This does not necessarily mean that real actions must be executed in a VE. A mental representation of an action can be automatically triggered by the incoming VE stimuli irrespective of whether the action is subsequently executed [10]. Both views, the perception-oriented and the action-oriented, receive support from our own study since we demonstrate that the experience of SP can be evoked by the perceptual perspective and by the experience of agency in the VE.
However, our experiment indicates that agency is more important for SP, given the larger influence of agency than of perspective on SP ratings. This finding is consistent with the action-oriented view in a gaming situation, although it may be limited to the specific characteristics of computer and video gaming and may not generalize to all VR situations. In a common video gaming situation like the one used in our experiment, the sensory modalities are not exposed to the kind of near-real-life stimuli that more sophisticated VR technologies can deliver. Sophisticated VR technologies are capable of addressing different sensory modalities and of sustaining a match between these different modalities (e.g. by using eye goggles and real-time tracking devices) [42]. In video or computer gaming, by contrast, the means to stimulate sensory modalities are technically limited to a TV or computer screen. We suggest, however, that in video gaming this limitation can be compensated for by an increase in the interactivity enabled or required by the mediated environment. Referring to the model of Wirth et al. [5], we postulate that the possibility to act and manipulate first “attracts” the attention of the player to the VE and, in a second step, increases the acceptance of the mediated ERF as the primary ERF. Consequently, it is possible that (especially in video gaming) the perception of agency is of higher importance than the perception of self-location.
Neurophysiology in the context of perspective and agency
With respect to the cortical activation pattern, we identified decreased alpha band activity in the active and 1PP conditions. Since the active and 1PP conditions evoke the strongest SP perception, this suggests that strong SP is associated with less alpha band activity. Based on studies that revealed strong negative relationships between hemodynamic responses and alpha band power [24–26], we take the alpha band power as an indicator of cortical deactivation in these areas. Thus, higher cortical activation (strong hemodynamic responses) is associated with less alpha band activity. In the context of our study, the weaker alpha band responses over frontal and parietal areas during the 1PP and active conditions indicate increased brain activation during these conditions. Since these conditions are also those with the strongest sense of SP, we assume that strong SP is associated with increased activation in the fronto-parietal network. For the theta band, we identified more theta band power (especially over the frontal midline positions) during the active conditions. In contrast to alpha band activity, theta band activity is not a direct indicator of neural activation or deactivation. However, theta band activity, especially over the frontal midline electrodes (FM-theta), is known to be involved in the control of complex cognitive operations such as working memory, memory processes, and the control of actions [27, 28]. Furthermore, our finding of increased FM-theta in active conditions is consistent with previous EEG-based video gaming studies in showing that operating actions in VEs is associated with increased FM-theta [29–31].
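The band-power contrasts discussed above can be illustrated with a minimal sketch. It is not the analysis pipeline used in this study: the signals are synthetic sine mixtures, the sampling rate, the band limits (theta 4–8 Hz, alpha 8–12 Hz) and the names `band_power` and `synthetic_eeg` are assumptions chosen for illustration, and a naive periodogram stands in for whatever spectral estimator would be used in practice.

```python
import cmath
import math


def band_power(signal, fs, f_lo, f_hi):
    """Mean periodogram power of `signal` between f_lo and f_hi (Hz, inclusive)."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    powers = []
    for k in range(n // 2 + 1):  # non-negative frequency bins only
        freq = k * fs / n
        if f_lo <= freq <= f_hi:
            # Naive DFT coefficient for bin k (fine for short illustrative signals).
            coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                        for t, x in enumerate(centered))
            powers.append(abs(coeff) ** 2 / n)
    if not powers:
        raise ValueError("no frequency bins fall inside the requested band")
    return sum(powers) / len(powers)


def synthetic_eeg(fs, seconds, theta_amp, alpha_amp):
    """Hypothetical signal: a 6 Hz (theta) plus a 10 Hz (alpha) sine wave."""
    n = int(fs * seconds)
    return [theta_amp * math.sin(2 * math.pi * 6.0 * t / fs)
            + alpha_amp * math.sin(2 * math.pi * 10.0 * t / fs)
            for t in range(n)]


FS = 128                  # assumed sampling rate in Hz
THETA_BAND = (4.0, 8.0)   # assumed band limits in Hz
ALPHA_BAND = (8.0, 12.0)  # assumed band limits in Hz

# Synthetic stand-ins for the two agency conditions (not recorded data):
# the "passive" signal is alpha-dominated, the "active" signal theta-dominated.
passive = synthetic_eeg(FS, 2.0, theta_amp=0.5, alpha_amp=2.0)
active = synthetic_eeg(FS, 2.0, theta_amp=2.0, alpha_amp=0.5)

alpha_passive = band_power(passive, FS, *ALPHA_BAND)
alpha_active = band_power(active, FS, *ALPHA_BAND)
theta_passive = band_power(passive, FS, *THETA_BAND)
theta_active = band_power(active, FS, *THETA_BAND)

print("alpha power, passive vs active:", alpha_passive, alpha_active)
print("theta power, passive vs active:", theta_passive, theta_active)
```

With these synthetic inputs, the alpha band power is larger for the "passive" signal and the theta band power is larger for the "active" signal, which mirrors the direction of the effects reported here. With real recordings, an averaged estimator such as Welch's method (for example `scipy.signal.welch`) would usually be preferred over a single periodogram.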
Intracerebral activations in the context of SP feeling
In the comparison between high SP conditions and low SP conditions, we found less alpha2 frequency band power in parietal brain regions of the left hemisphere (left superior and inferior parietal gyrus, left precuneus) in the high SP conditions. Additionally, we found a significant positive correlation between intracerebral alpha activity in the bilateral DLPFC and SP ratings of the self-location subscale. These findings are in accordance with previous findings of our group in which we demonstrated 1) that SP experience is associated with the involvement of a distributed fronto-parietal network including the dorsal visual stream as a prominent part, and 2) that the DLPFC is one of the major regulators within this network [6, 7]. Nevertheless, our results not only replicate previous findings but also expand current knowledge by showing that these neurophysiological mechanisms are not confined to the virtual roller coaster scenarios used by Baumgartner et al. [6, 7]. Since the roller coaster scenarios were non-interactive and arousing, while our experimental scenarios were interactive and non-arousing, we demonstrate that the core regions associated with SP are relevant in different virtual environments.
Intracerebral activations in the context of perspective and agency
Comparison of the intracerebral sources of alpha activity between 1PP and 3PP (separately for the active and the passive conditions) revealed a network of neural sources including the parietal cortex (right inferior parietal lobule), the occipital cortex (left lingual gyrus, left cuneus), and the limbic cortex (bilateral posterior cingulate, left cingulate cortex, left parahippocampal gyrus) with decreased alpha activity. Comparison of the active and passive conditions showed stronger theta activity in frontal brain areas during the active conditions (superior, middle, medial and inferior frontal gyrus in the left hemisphere, and superior and middle frontal gyrus in the right hemisphere). The fronto-parietal network found in the present study (and in previous studies of our group) is not exclusively associated with generating and controlling the experience of SP [see also [10]]. These regions of the parietal cortex are known to be involved in spatial processing and mental rotation, and, most importantly, a part of this network constitutes the dorsal visual stream, which is known to be involved in egocentric processing [43, 44]. These parts of the limbic cortex are known to play an important role in episodic memory, emotional stimuli processing, and action control [23, 35]. The identified frontal brain areas are known to be involved in various executive functions that exert regulatory control over emotions and actions, with the DLPFC being an important part [27, 45–51].
Among the various brain regions within this fronto-parietal network, the role of a parietal-premotor connection associated with sensory-motor integration is particularly interesting for the interpretation of our findings for perspective and agency. The posterior parietal cortex (PPC) is known to integrate information from different sensory modalities to form a coherent multimodal representation of space coded in a body-centered reference frame. This integration of multisensory cues includes mapping the position of objects in reference to one’s own body [52]. The premotor cortex (PMC), on the other hand, appears to define a corresponding motor space consisting of all potential motor actions within the surrounding space [53]. It has been suggested that potential target objects within the visual field elicit motor schemas for potential actions that map the position of these objects in the surrounding environment [54]. Put simply, the PPC directly maps the position of objects in reference to the body, whereas the PMC creates a motor space as an intermediate step first.
According to the model of Wirth et al. [5], an alternative egocentric reference frame (ERF) to the real-world ERF develops from spatial cues and action cues of the VE. This ERF contains information about the spatial properties of the immediate surroundings from a first-person perspective. The parietal-premotor connection might play a key role in this process. As mentioned in a previous review [10], we believe that the new egocentric view derived from VR is generated in the context of several transformation processes in the dorsal visual stream of the parietal cortex [see [43, 44]]. Regarding our findings for perspective and agency, we propose that, mediated by the PPC, spatial cues from the VE directly define the spatial properties within an egocentric view. On the other hand, mediated by the PMC, action cues in the VE are used to define a motor space which indirectly facilitates an egocentric view. Thus, we propose that both parietal and frontal regions are involved in the generation of the alternative ERF derived from the mediated environment.
The sense of SP is thought to emerge when the ERF from the mediated environment is selected as the primary ERF over the competing ERF of the actual environment (Wirth et al. [5]). We believe that whether or not a person chooses this new ERF as the primary ERF depends (among other things) upon the strength of the influence that media factors such as perspective and agency exert on this fronto-parietal network. More specifically, we propose that playing in 1PP, as well as actively controlling the actions of the video game avatar, increases the acceptance of the new ERF as the primary ERF mediated by (and observable as) increased activation in the fronto-parietal network.
Limitations
The present study has some limitations which should be addressed in future studies. The SP model of Wirth et al. [5], which lays the theoretical foundation for the interpretation of our neuroscientific findings, has not yet been thoroughly investigated empirically. It remains to be seen if this model will receive further support in future investigations.
The post-hoc questionnaires used in this study have disadvantages due to their administration after exposure; these disadvantages include recency effects, anchoring effects and inaccurate recall [55]. However, the alternative of a continuous on-line measurement of SP also has important disadvantages, such as disruption of the SP experience itself by drawing attention away from the VE [2]. Future studies may be able to use newly developed measurement procedures which seek to overcome these issues.
A limitation of our study design was the fact that we were not able to investigate the factors perspective and agency independently. In future studies, it may prove beneficial to manipulate the amount of sensory information and the possibilities to act in the VE independently in order to further elaborate how each influences SP and which of the two influences SP more strongly.