Experience Mapping Problem – Case Study: SUPERHOT VR

For an introduction to the Experience Mapping Problem and the goal of hyperrealism in VR, see this post.

SUPERHOT VR succeeds in creating an intoxicating player power fantasy. Time moves at your command, allowing you to execute action moves worthy of a scene in The Matrix. It delivers this powerful fantasy while maintaining a highly abstract graphical style: the game is as far from “photoreal” as you can get. It is also very immersive, and people get wholly absorbed by the game:

(video source)

SUPERHOT quickly builds a simple and understandable mapping from life to game, using verbs like “grab” and “shoot”. To analyze how it achieves this rapid, simple readability, we need to characterize the way we unconsciously categorize the world around us.

Suppose I am looking at a shelf full of objects. If I am moving about the room normally, the shelf exists as a single thing to me, a symbol tagged “shelf”. Under normal circumstances, all my knowledge and perception about this collection of molecules is filed under “shelf”.

However, if I become interested in a particular object on the shelf, the symbol of “shelf” decomposes into a slew of new symbols. The object of interest is a symbol (say, “box”). I may also have symbols like “figurine” and “row of books” for the other things on the shelf. The “shelf” symbol still exists, but there is a Symbol Hierarchy now. “Box” and “figurine” have a relation to “shelf” (they are on it), but they also have their own properties.

If I am curious about the contents of the box, the “box” symbol decomposes into “box” and “contents”. (Note that I use the same word, “box”, for both the box-and-contents and the box-itself. This word→symbol overlap is a common source of confusion in any discourse, and we must stay wary of it.)

If I want to open the box to see what the contents are, the “box” symbol decomposes into “box” and “box lid”.

Note that Symbol Hierarchies tend to decompose based on the verbs in the context. If the context includes a verb like “open”, the box symbol decomposes into box and box lid. If the context has the verb “grab”, the shelf suddenly decomposes into grabbable objects (“figurine”, “book”, “box”). And if I discover that the figurine is glued to the shelf and therefore cannot be picked up, its symbol is somewhat absorbed by its parent: in the context of “grabbing” or “picking up”, the figurine becomes part of the “shelf” symbol again. Any relationship I have with the shelf (e.g. it is too heavy to pick up) now applies to the figurine too, and if I knock the shelf down, the figurine falls to the ground with it.
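The verb-driven decomposition described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a claim about how perception actually works: the `Symbol` class, the `affords` field, and the `decompose` function are hypothetical names invented for this sketch, and it assumes a child symbol splits off from its parent only when it supports the verb in play.

```python
from dataclasses import dataclass, field


@dataclass
class Symbol:
    """A node in a Symbol Hierarchy (hypothetical model for illustration)."""
    name: str
    # Verbs this thing supports on its own, e.g. {"grab"}.
    affords: set = field(default_factory=set)
    children: list = field(default_factory=list)


def decompose(symbol, verb):
    """Return the names of the symbols tracked when `verb` is in the context.

    A child becomes its own symbol only if it affords the verb;
    otherwise it stays absorbed in its parent.
    """
    parts = [symbol.name]
    for child in symbol.children:
        if verb in child.affords:
            parts.extend(decompose(child, verb))
    return parts


figurine = Symbol("figurine", affords={"grab"})
box = Symbol("box", affords={"grab", "open"},
             children=[Symbol("box lid", affords={"open"})])
books = Symbol("row of books", affords={"grab"})
shelf = Symbol("shelf", children=[box, figurine, books])

walking = decompose(shelf, "walk past")  # only the shelf itself
grabbing = decompose(shelf, "grab")      # shelf plus each grabbable object
opening = decompose(shelf, "open")       # shelf, box, and box lid

# Discovering that the figurine is glued down removes its "grab" affordance,
# so it is absorbed back into the shelf in the context of grabbing.
figurine.affords.discard("grab")
glued = decompose(shelf, "grab")
```

With no relevant verb, the whole hierarchy collapses to the single symbol “shelf”; changing the verb, or what an object turns out to afford, changes which symbols exist at all.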

Within this framework, it is obvious why SUPERHOT can deliver an effective, immersive fantasy. The first thing the game does is introduce a verb, “grab”, and teach that all black objects can be grabbed. Second, when you are holding an object that looks like a gun, you can “shoot” the gun. Third, there are red enemies. Red enemies die if a black thing touches them. Then there are a few other rules: time moves when you move, you die if a bullet touches you, and white objects are inert and will stop black objects.

The set of symbols and verbs is now defined for the game, and there is a 1-to-1 correspondence between sensory perception and symbol. A gun is always a “gun”. It never decomposes into “slide”, “magazine”, or “grip”. White objects are always “the background”. Even if a white object looks like a console with buttons and a telephone handset, the player will never even try to decompose the “background” symbol into “buttons” or “telephone”. There are no properties that belong to some white objects but not others; they are always immovable and impervious to bullets.

This very simple model means that the player rarely encounters a mismatch between expectation and outcome. If I throw a gun, it kills enemies, just like any other thrown black object. I can take cover behind a small flimsy chair, since it is white. I can kill an enemy with a thrown ashtray, since it is black. The player never encounters a mismatch because they never attempt to improperly decompose a Symbol Hierarchy. The game clearly indicates whether something is atomically a “gun”, a “black object”, an “enemy”, or “the background” and never requires you to further divide those atomic symbols.
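To make the point concrete, here is a minimal sketch of the rule set as described in this post, written as a lookup from appearance to atomic symbol. It is not the game's actual code: `Sym`, `classify`, `can_grab`, `can_shoot`, and `touch` are hypothetical names, and the "no rule stated" result is an assumption covering combinations the post does not describe.

```python
from enum import Enum


class Sym(Enum):
    """The four atomic symbols named in the text."""
    GUN = "gun"
    BLACK_OBJECT = "black object"
    ENEMY = "enemy"
    BACKGROUND = "background"


def classify(color, gun_shaped=False):
    """Map what the player sees to exactly one atomic symbol."""
    if color == "red":
        return Sym.ENEMY
    if color == "white":
        return Sym.BACKGROUND
    if color == "black":
        return Sym.GUN if gun_shaped else Sym.BLACK_OBJECT
    raise ValueError(f"no symbol for color {color!r}")


def can_grab(sym):
    # All black objects can be grabbed, guns included.
    return sym in (Sym.GUN, Sym.BLACK_OBJECT)


def can_shoot(sym):
    # Only something that looks like a gun can be shot.
    return sym is Sym.GUN


def touch(black_thing, target):
    """Outcome when a thrown or fired black thing touches a target."""
    if not can_grab(black_thing):
        raise ValueError("only black things are thrown or fired")
    if target is Sym.ENEMY:
        return "enemy dies"
    if target is Sym.BACKGROUND:
        return "black thing stops"
    return "no rule stated"  # assumption: the post gives no rule here


gun = classify("black", gun_shaped=True)
ashtray = classify("black")
chair = classify("white")
enemy = classify("red")

thrown_gun = touch(gun, enemy)          # a thrown gun kills like any black object
thrown_ashtray = touch(ashtray, enemy)  # so does an ashtray
shot_at_chair = touch(ashtray, chair)   # a flimsy white chair still gives cover
```

Every outcome depends only on the atomic symbol, never on finer detail such as how sturdy the chair looks, which is why the player's predictions keep matching what happens.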

The player is free to map additional narrative and emotional properties onto these symbols. “Enemies” become an attractor for all the qualities of antagonists from movies and TV, and the suggestive background environments help encourage this emotive transfer: airport, shopping mall, rooftop. These are contexts in which we have seen action scenes occur. Similarly, the guns and ninja stars are symbolic attractors and gain all the forbidden, sexy power that weapons are granted in other media contexts.

Finally, the game allows the player to perform kinesthetically pleasing sequences with these verbs and symbols. Sure, the enemies just burst into triangles when “killed”, and the guns are simplistic black blobs. Yes, “shooting an enemy” means tiny black blobs come out of your little black blob and hit the big red blob. But you just “punched” an “enemy”, “grabbed” his “gun”, and “shot” another “enemy”! The interaction of symbols carries all the weight the player has put behind the symbols, and the fact that “punching” is kinesthetically similar enough to a real punch (as are “ducking” and “aiming”) allows the player to map back from symbol to reality.

The important take-away is this: because the player quickly builds an accurate model of this virtual world and NEVER encounters a mismatch between expectation and outcome, they stay immersed. And because the objects behind the symbols enable generic symbol assignment (the enemies are not “storm troopers”, they are just archetypal “enemies”), they attract lots of affective qualities. An immersive virtual world filled with high-affect symbols is, well, hyper-real.