DOES THE DUBBING EFFECT APPLY TO VOICE-OVER? A CONCEPTUAL REPLICATION STUDY ON VISUAL ATTENTION AND IMMERSION – Gabriela Flis, Adam Sikorski and Agnieszka Szarkowska

The original study by Romero-Fresco (2016) was replicated by Di Giovanni and Romero-Fresco (2019) on a group of Italian and English viewers watching a fragment of The Grand Budapest Hotel (Wes Anderson 2014). While the original study focused on close-ups, Di Giovanni and Romero-Fresco (2019) examined a different language combination (English to Italian dubbing) and different types of shots in the film.”

As opposed to Spain, where the predominant audiovisual translation (AVT) mode is dubbing, and the UK, where the vast majority of audiovisual content is available in the original English version, Poland is generally considered a stronghold of voice-over (VO) (Gottlieb 1998). Casablanca has never been dubbed into Polish and only the voiced-over and subtitled versions exist.” [!!]

Voice-over translation is an audiovisual translation technique in which, unlike in dubbing, actor voices are recorded over the original audio track which can be heard in the background.”

In contrast to dubbing, where every attempt is made to synchronise the translation with the lip movements of the original actors (Chaume 2014), in voice-over there is no requirement for lip synchrony (Sepielak and Matamala 2014). Neither does the translation need to be of the same duration as the original – a requirement known as isochrony (Chaume 2014). In VO, the original soundtrack remains audible but its volume is lowered, and the translation tends to be shorter than the original, typically allowing viewers to hear the beginning and end of the original utterances. The translation is read by one voice talent, usually male.”

Assuming that a lack of synchrony between the characters’ lip movements and the translation may lead to viewers avoiding looking at the mouth, we wondered whether a similar effect may take place when watching Polish VO, where the lack of synchrony between the original utterance and its translation is part and parcel of this AVT mode. Have Polish viewers also developed similar strategies in their process of habituation to VO? Given the fact that all the translated utterances, whether pronounced by female or male actors, are read out by a single male voice talent, we thought that the viewers’ potential avoidance of looking at characters’ mouths may be particularly discernible in scenes with female characters speaking.”

Although film viewing may seem like a passive activity, when watching films viewers are, in fact, busy processing the sequences of images and sounds, understanding the action, and construing the narrative. From previous research we know that viewer gaze behaviour shows certain commonalities (Smith 2013).” “Attentional synchrony, or ‘the tendency for observers to be looking in the same place at the same time’ (Foulsham and Sanderson 2013: 926), is greater when sound is present than during moments of silence (ibid.: 939).”

One might think that perceptual quality would be extremely low in cases of, for example, science-fiction movies or animated short stories; however, this is not the case. In Hall’s experiment, participants who watched Jurassic Park (Steven Spielberg 1993) still perceived dinosaurs as real, even though they had become extinct millions of years ago, because they felt real in the context of the film.”

Even though it might sound similar to transportation, character identification is limited to particular characters depicted in a movie, whereas transportation ‘is a more general experience created by the narrative as a whole’ (Tal-Or and Cohen 2010: 404).”

Experiment 1 reports on the results of the eye-tracking study conducted on VO with Polish viewers, using the same 6-minute excerpt from Casablanca as Romero-Fresco (2016, 2020). We used a mixed study design with the area of the face (eyes/mouth) as an independent within-subject variable, and participants’ immersive tendency and English proficiency as factors. The dependent variables were the percentage of gaze distribution, immersion levels, comprehension and enjoyment. In Experiment 2, we compared our results with those obtained by Romero-Fresco (2016, 2020).”

To the best of our knowledge, no work on the dubbing effect in voiced-over films, and especially Polish VO, has been done before.”

In general, our sample consisted of young adults whose proficiency in English was relatively high, which may be important as they could understand the original English audio in the background of the Polish voiced-over version.”

Our percentages on eyes and mouth did not add up to 100%, as was the case in the original study, because we also took into account other areas on the screen where people looked, including the nose, hat, hair, background, etc.”

when the actors were speaking, participants looked at the eyes twice as much as at the mouth but, when the characters were not speaking, participants looked at the eyes four times more than at the mouth.”

We therefore compared the percentage of gaze distribution on Ilsa’s mouth with that on Rick’s mouth. Indeed, a statistically significant main effect of actors’ gender was found on gaze distribution on the mouth in dialogue scenes, F(1, 17) = 4.516, p = .049, partial η² = .21. Contrary to our predictions, however, viewers looked more at Ilsa’s mouth (M = 24.94, SD = 12.36) than Rick’s (M = 17.94, SD = 13.34). [I knew it!]
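As a quick consistency check on the reported effect size, partial eta squared can be recovered from an F statistic and its degrees of freedom. The sketch below is a minimal illustration, not the authors' analysis code. The function name is hypothetical, and it relies on the standard identity that partial eta squared equals F times the effect df, divided by that product plus the error df.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

# Values reported in the excerpt: F(1, 17) = 4.516
effect_size = partial_eta_squared(4.516, 1, 17)
print(round(effect_size, 2))  # 0.21
```

The result agrees with the reported value of .21, so the F value, degrees of freedom and effect size in the excerpt are mutually consistent.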

We were also interested in finding out whether gaze distribution was in any way related to the participant’s immersive tendency and English proficiency, but neither of these factors was found to be significant.”

The largest discrepancy between the declarative and the actual time spent was found in the case of Spanish participants looking at the mouth, which may show that the dubbing effect is largely unconscious.

It needs to be noted that asking people to report on a 1-5 scale the time they think they spent on eyes and mouth is problematic for a number of reasons, including the fact that while watching they were unaware of the nature of the experiment and were not focussed on their gaze behaviour and its distribution.”

In answer to our main research question, we found that when watching the voiced-over fragment of Casablanca, Polish viewers did not avoid looking at the characters’ mouths. Our participants spent – proportionally – about 60% of the time looking at the eyes and about 40% at the mouth in scenes with dialogue, while for the English this proportion was about 75% and 25% and for the Spanish 95% and 5%. This means that we did not find what could be potentially called ‘the voice-over effect’.”
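One plausible reading of "proportionally" here is that the eyes and mouth percentages were rescaled so that the two areas alone sum to 100%, which makes them comparable with a study that counted only those two areas. The Python sketch below illustrates that rescaling under this assumption. The helper name is hypothetical and the input values are illustrative only, not data from the study.

```python
def proportional_shares(eyes_pct, mouth_pct):
    """Rescale raw eyes/mouth gaze percentages so that the two sum to 100."""
    total = eyes_pct + mouth_pct
    if total <= 0:
        raise ValueError("eyes and mouth percentages must sum to a positive value")
    return 100 * eyes_pct / total, 100 * mouth_pct / total

# Illustrative values only: 50% of gaze on eyes, 30% on mouth, 20% elsewhere.
eyes_share, mouth_share = proportional_shares(50.0, 30.0)
print(f"eyes: {eyes_share:.1f}%, mouth: {mouth_share:.1f}%")
```

Gaze on other areas (nose, hair, background and so on) is dropped from the denominator, so the rescaled shares say nothing about how much total viewing time the face attracted.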

Interestingly, the percentage gaze distribution of Polish viewers was closer to that of the English viewers watching the original clip than to the Spanish group watching the dubbed version. Statistically, there were no differences in gaze distribution between Polish and English people in the sense that more time was spent looking at eyes in scenes with no dialogue than in dialogue scenes and, analogously, at mouth in dialogue scenes in comparison with those where the character remained silent. For the Spanish group, the trend was reversed. Such results make us wonder whether voice-over may in fact provide an experience more similar to the one we may have while watching a film originally recorded in our native language, an aspect that could be investigated in further studies.” [That is not the point; the point is that the original audio was present!]

In the presence of noise, where speech is less intelligible, the significance of visual speech information increases. If we consider VO as a sort of ‘noise’, making the perception of the original more difficult by the co-presence of the VO translation, then it may explain why Polish viewers focused so much on the mouth compared to the other two groups.”

Indeed, when directing films starring Ingrid Bergman, Alfred Hitchcock increased the use of close-ups ‘to concentrate expression in the micromovements’ of Bergman’s face. In the scene used in the study, Bergman is also framed in a close-up, placing her face and full mouth in a particularly prominent position, which may explain the larger focus on Ilsa’s face and mouth than on Rick’s.

It has been suggested that the reasons for the scarcity of replication in modern science include the negative perception of replication as research that is unoriginal and lacking in novelty; the unfavourable attitude of some editors and the consequent difficulty in publishing such studies; the potential hostility towards the original researchers and the fact that replications may be associated with controversy (Koole and Lakens 2012; Nosek et al. 2012; Coyne et al. 2016).”

In our study, direct replication was not possible since dubbing is rare in Poland and the clip used in the original study has never been dubbed into Polish. Furthermore, as we were operating within the institutional confines of our university lab, we had to work with a different eye tracker (SMI) than that used in the original study (Tobii).”

Given the departures from the original study, conceptual replications ‘do not constitute an unequivocal test of the validity of prior findings’ (Coyne et al. 2016: 245) and can be used ‘only to confirm […] the original result, not to disconfirm it’ (Nosek et al. 2012: 619). Therefore, the fact that the dubbing effect has not been found in the Polish context does not necessarily disconfirm its existence in a typical dubbing country such as Spain. Last but not least, as stated by Earp and Trafimow (2015: 9), ‘even carefully-designed replications, carried out in good faith by expert investigators, will never be conclusive on their own.’ What is needed is a series of replications, conducted independently of one another by different research teams and labs.

Replicating a study may be in some ways more challenging than conducting an original study from scratch. The replication team needs to make sure that they follow exactly the same protocol as the original team did. Yet, current reporting practices are sometimes insufficient for the replicating team to be able to follow the experimental protocol to the letter. This relates to, for instance, using identical areas of interest, identical pre-processing of eye-tracking data in terms of minimum and maximum fixation duration as cut-off points, or using exactly the same eye-tracking measures, such as fixation time or dwell time.”
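To make concrete why unreported pre-processing choices matter, the sketch below shows how minimum and maximum fixation-duration cut-offs change the total fixation time per area of interest. It is a minimal Python illustration: the cut-off values, function name and sample fixations are all hypothetical, since the excerpt does not state the thresholds used in either study.

```python
from collections import defaultdict

# Hypothetical cut-offs in milliseconds; the excerpt does not report the actual values.
MIN_FIXATION_MS = 80
MAX_FIXATION_MS = 1500

def fixation_time_by_aoi(fixations, min_ms=MIN_FIXATION_MS, max_ms=MAX_FIXATION_MS):
    """Sum fixation durations per area of interest, dropping fixations outside the cut-offs."""
    totals = defaultdict(float)
    for aoi, duration_ms in fixations:
        if min_ms <= duration_ms <= max_ms:
            totals[aoi] += duration_ms
    return dict(totals)

# Illustrative fixations as (AOI label, duration in ms); not data from the study.
sample = [("eyes", 250), ("mouth", 40), ("eyes", 300), ("mouth", 180), ("other", 2000)]

with_cutoffs = fixation_time_by_aoi(sample)
without_cutoffs = fixation_time_by_aoi(sample, min_ms=0, max_ms=float("inf"))
print(with_cutoffs)
print(without_cutoffs)
```

With these illustrative inputs, the cut-offs remove the 40 ms mouth fixation and the 2000 ms fixation elsewhere, so the same recording yields different mouth totals depending on a setting that often goes unreported.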

Our study has shown that the visual attention distribution of Polish participants was similar to that of English people watching the film in the original, which suggests that for viewers accustomed to VO, watching a voiced-over film may be an experience comparable with watching the original, at least in terms of visual attention distribution. This may come as a surprise, since VO is often considered ‘the worst possible method (which can) in no sense maintain or do justice to the quality of the original version’ (Dries 1995: 6).”