The data were collected in a study with human participants. Participants had to decide rapidly whether a character, appearing suddenly and very briefly on the left or right side of a poorly lit, realistic indoor scene, was wearing a cap or a helmet. In some experimental blocks, the stimulus display was preceded by a cue indicating whether the character would appear on the left or the right. The cue was delivered either as a spoken word alone (``left'' or ``right'') or by a synthetic face uttering the cue word with matching lip movements. Performance was measured in terms of both accuracy and reaction time.