Real-time Strategy in VR

Real-time strategy is possibly the genre that’s been knocked hardest by the evolution of modern gaming. Interest in games like Starcraft has waned in favor of action games and MOBAs. There are two competing explanations for this: either the modern market isn’t big enough to support an RTS ecosystem, or the developers of real-time strategy games have failed to innovate and keep up with the times. In either case, it is undeniable that the RTS is a dead genre.

Or is it? Of course not! The spirit of the RTS is alive and well. The same things that hooked players back in the 90s still hook players today. Human psychology hasn’t changed. But to survive in the vast and confounding battlefield of modern gaming, the genre has had to twist, split, and adapt to fill sustainable niches. To the legions of fans forged in the heyday of the RTS, it seems as though the genre is dust and bones because the RTS is held, in the public mind, as a single monolithic conception.

The RTS suffers the same fate as Star Wars. A novel presentation in a fallow market captured the hearts and minds of many, for myriad reasons. Unfortunately, its fame necessitates its failure; the intersection of so many interests leaves nowhere to progress, creatively. Moving in any particular direction will cause some fans to lose interest. So in a sense, the RTS genre is dead, if you define the RTS genre by the specific mechanics found in Starcraft, Warcraft, and Dune. Such a precisely defined genre is a dead-man-walking.

To find the modern RTS, one must look at the aspects of play that create such a devoted fanbase. As I mentioned, these are diverse. Here is what turned up in a quick survey of internet threads where people discuss why they love real-time strategy:

  • Building power over time
  • Introduction of mechanics over time causing increasing complexity
  • Discovery / Exploration
  • Having to choose where to invest time and resources
  • Managing multiple tasks / dividing attention
  • Single-player stories
  • Challenging yourself
  • Directing troops
  • Base building
  • Devising strategies offline and then implementing them in-game
  • Improvising when a plan falls apart
  • Defeating an equally-matched human opponent
  • Analysing your own mistakes and better learning the specifics of the game to improve

There are some high-level generalizations about player experience to draw from these. I think the attractions of a traditional RTS can be reduced to the following:

  • Fantasy of Command / Warfare
  • Struggling to acquire and subsequently manage streams of information (in real time)
  • Efficiently growing a base, set of units, and pool of resources (in real time)
  • Overcoming opponents by building and manipulating a superior mental model of the game dynamics (in real time) [“dynamics” in the sense of the MDA framework]

Note that I have appended “(in real time)” because these can be fulfilled by a number of turn-based strategy and tactics games, just not with a time-constrained component. Indeed, what separates the RTS from other “games of command” seems to come down to giving commands at multiple locations at the right time. Many modern games of command involve a single unit, or do not have a relentless time component (i.e. not turn-based and no pause feature). For this reason, the “Fantasy of Command” can be safely put aside for the purposes of this essay. It is fulfilled by other games and is incidental, not essential, to the genre.

We expect that modern inheritors of the RTS mantle will continue to fulfill some subset of these general player experiences. Indeed, we see Offworld Trading Company fulfills most of these, but is weak when it comes to “Fantasy of Warfare” and “Struggle to acquire information”, since the game lacks any units, and there are few mechanics that allow players to learn more about their opponent than what their opponent knows about them. Clash Royale delivers a strong experience when it comes to “Overcoming opponents through superior mental models”, but lacks virtually any information-gathering or economic growth components.

Those are the elements of the high-level player experience, the aesthetics of play. When it comes to crucial mechanics, I found this analysis compelling. The analysis essentially pinpoints 11 features that are both necessary and sufficient for calling a game an RTS:

  • Players themselves are not in control of turn progression, and there are no direct interruptions of the progression of actions within the game.
  • Not being in direct control of the pacing of game events puts pressure on the player to make fast, accurate decisions based on limited information.
  • Poor decisions must be eliminated or mitigated in future attempts (that is, later in the course of a particular match or in future matches against the same or different opponents).
  • RTS games train players to quickly evaluate situations to determine the best future path forward.
  • Acquisition and expenditure of stores of value
  • Array of options with which to progress
  • The player must be asked to invest limited (though not necessarily scarce) resources into progressing and expanding, capitalizing on their past actions towards future goals
  • Players must be able to actually lose their investments
  • They must defend their own investments and use them as wisely as they are able.
  • Require the player to simultaneously manage multiple game pieces or elements.
  • Uncertainty about other players’ actions must be incredibly important

As a summary:

Multiple participants engage in competitive economics, managing limited resources to expand multiple game elements in order to gain an advantage and ultimately wrest control of one or more critical systems to attain a concrete victory.

While this is useful for determining what is and isn’t an RTS, or figuring out which elements of a specific RTS are core to its being and which are secondary embellishment, it doesn’t necessarily lend itself to the sort of abstract, blue-sky thinking that you need when pushing an existing, well-loved concept into uncharted waters. You could follow its prescriptions to the letter, and find yourself with a product that fails to fit its niche or entertain its users.

Real Time Strategy in VR

Plenty of developers have tried to make archetypal ports of existing genres into VR. Each one is effectively a bid for the statement “X is the Y of VR”: “Pavlov is the Counter-Strike of VR”, “Space Junkies is the Quake of VR”, “Sprint Vector is the racing game of VR”, “Beat Saber is the rhythm game of VR”.

A good rule of thumb here is to ask, “if I had the choice of playing this in VR or not in VR, would I rather play it outside of VR?” Then ask, “if I played this as a non-VR game, would I rather be playing something else?” The answer to the second question is almost always “yes”, so the answer to the first question had better be “no”.

Everybody wants a VR strategy game, and plenty have been made: Tactera, Base Blitz, Airmech Command, Brass Tactics, Skyworld, Landfall, Cosmic Trip, Final Assault. (I’ve played all of them, by the way.) Plenty of these come close to being an archetypal port, a true “RTS of VR.” Personally, I’m currently enjoying Final Assault.

But would I rather play these outside of VR? Absolutely. VR is uncomfortable, and these games don’t (for the most part) bring anything to the table that precludes playing them on a flat screen. No game has yet provided an RTS experience that is integral to VR.

Can we define an envelope for the “ideal” VR RTS? I will suggest two heuristics that help us towards a vision of this hypothetical game.

Heuristic 1

The perhaps less controversial heuristic is that the game shouldn’t be worse off for being in VR. By this, I mean that the game is not less player-friendly or less fun than the “pancake” version of the game you would get if you tried to port it back to traditional platforms. This applies to qualitative things like “fun”, but also concrete things like input level and game feedback. If the input scheme of the game is frictional, the player will immediately reject the game. The controls should grant the player new opportunities, not restrict their ability to act.

There is a corollary which follows logically from that first heuristic: the inputs to the game should not map easily to mouse and keyboard. For example, all of the aforementioned VR games, with the exception of Cosmic Trip, involve commanding troops to move around on a 2-axis battlefield. There may be height variation, but there aren’t even multiple levels. In addition, troops are commanded by selecting groups of units, then directing them to a point on the battlefield. This is exactly how pancake RTS games work, and it quickly becomes obvious that a mouse and keyboard are much better than a pair of VR controllers for this kind of work.

But this also applies at a higher level of abstraction. It may not seem problematic if my VR RTS involves giving commands to troops at ground level by making gestures. How could that be translated easily to a mouse and keyboard setup? Well, players tend to build the most abstract mental model possible when learning a game. Unless there is a significant amount of additional unique control afforded by this gesture-based command system, there will be no difference between it and a simple top-down point-and-click command system in the player’s mental model of the game. It would be difficult to build a gesture system that affords control so unique that it couldn’t be easily replaced by a pancake GUI using a few buttons, hotkeys, sliders, or mouse gestures. But trying to imagine such a nuanced gesture-command system might lead to some interesting RTS ideas!

Heuristic 2

The first-person, gesture-based RTS as a thought experiment points to an important distinction to be made between the hypothetical “ideal RTS” and existing archetypal ports of other genres. All successful archetypal ports are currently action games. At the most basic level, each game’s fun is grounded in a certain viscerality. Arizona Sunshine isn’t about learning the game’s sandbox; it’s about being in a horror situation, about reacting to surprises and danger under loads of stress and anxiety. Sprint Vector is a game of going fast and perfecting your execution of maneuvers. You beat out opponents by performing actions more precisely, not because you’ve built a more sophisticated mental model.

Conversely, three of our four identified core player experiences of the RTS genre are dependent on the fact that the player will be trying to understand the game as thoroughly as possible (i.e. build a mental model):

  • Struggling to acquire and subsequently manage streams of information (in real time)
  • Efficiently growing a base, set of units, and pool of resources (in real time)
  • Overcoming opponents by building and manipulating a superior mental model of the game dynamics (in real time)

A good mental model will help you filter, prioritize, and categorize the information you collect. It will help you manage your resources most effectively, and it will allow you to overcome your opponents. The core fun of an RTS is the feeling of having achieved victory by being a total genius. The core fun doesn’t come from ordering troops around on a battlefield.

This leads to a second heuristic: victory in the game should be determined by whoever builds a better mental model. It should not be determined by whoever masters the controls better, or acts faster. This heuristic also ensures that one of the most egregious VR issues is avoided: frustration at the controls. It can be stated with certainty that if the player is annoyed at the controls of a game, the designer has failed. Having the player fail because they are struggling with the controls is the worst possible outcome.

Constraints For An “Ideal” VR RTS

We can take it as a given that this hypothetical game is about achieving victory. Thus, it must contain mechanics that combine to produce interesting, nuanced dynamics, since the core gameplay is the way the player uses their mental model to understand the dynamics, and subsequently determine the best way to harness the mechanics to their end. Additionally:

  1. The controls of the game cannot be something which maps easily to flat gaming (mouse and keyboard, touchscreen, etc.), following from Heuristic 1.
  2. It is difficult to imagine a mechanic which would work well in a pancake game, yet whose corresponding controls do not map well to mouse and keyboard (or other flat input device).
  3. Therefore, any mechanic essential to the game must be something which could not easily exist in a pancake game.

This is a tall order. It calls for something few VR games have achieved, which is a core loop and core mechanics that cannot exist outside of VR. We need, essentially, the Beat Saber of the RTS genre.

When we envision “VR native” mechanics, we must start from first principles. What does VR have at its disposal? The ability to look in all directions; the ability to locate audio sources; a feeling of presence in a space; a sense of scale; room-scale movement (the ability to crouch, jump, and walk around a space); and tracked hand controllers with haptics.

Acquiring Information

We could use the perceptive aspects of VR (looking around, binaural sound spatialization, stereo vision, haptics) to feed into the “struggle to acquire and manage information” player experience.

But we need to be careful not to violate Heuristic 2. For example, in Brass Tactics, the fact that you can’t be everywhere at once is an important gameplay element. You might think this would force players to use their mental model to choose the best place to focus their attention, which would be a positive gameplay element. However, in reality, it means that the person who is better at moving across the table and orienting themselves using the game’s controls gains the advantage. The player who is better at using the controls is more effective.

In our ideal RTS, players must be able to quickly take in all the available information using an incredibly intuitive interface.

We should keep in mind the current limits of headsets. Angular resolution is very low compared to flat displays (each pixel covers a larger visual angle), and the optics focus at a single fixed distance. Empirically, visual acuity is reduced to between 20/32 and 20/42, depending on the anti-aliasing and supersampling settings the user employs [1]. Oculus says in its best practices document that it is most comfortable to view objects between 0.75 and 3.5 meters away, since the focal distance of the headset is about 2 meters.

Here is an image demonstrating a vergence mismatch between eyes.

Accordingly, we should place all important elements of the interface 2 meters away from the player (this minimizes the vergence-accommodation mismatch problem, illustrated in the image above). Any text (which there should be very little of, since people hate reading) needs to be big enough that someone with 20/42 vision could read it comfortably at a distance of 2 meters.
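
As a rough worked example of that sizing rule, the sketch below estimates the smallest letter height a viewer with 20/42 acuity can resolve at 2 meters. It relies on the standard Snellen convention that a 20/20 letter subtends 5 arcminutes. The function name and the comfort factor of 2 are my own assumptions for illustration, not figures from the Oculus guidelines.

```python
import math

ARCMIN_20_20 = 5.0  # a 20/20 Snellen letter subtends 5 arcminutes


def min_letter_height_m(snellen_denominator: float, distance_m: float) -> float:
    """Smallest resolvable letter height (meters) for 20/<denominator> vision."""
    arcmin = ARCMIN_20_20 * snellen_denominator / 20.0
    angle_rad = math.radians(arcmin / 60.0)
    return 2.0 * distance_m * math.tan(angle_rad / 2.0)


threshold = min_letter_height_m(42, 2.0)
comfortable = 2.0 * threshold  # assumed comfort margin, not a measured value
print(f"threshold: {threshold * 1000:.1f} mm, comfortable: {comfortable * 1000:.1f} mm")
```

The threshold comes out at roughly 6 mm, so interface text at 2 meters probably wants letters a centimeter tall or more.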

Managing Resources

As the player deals with resource management (let us generalize structures, units, and resource pools under the term “resources”), they will contend with VR’s input methods. At a low level, these are the position and rotation of their head and hands, and the buttons on their controllers. Head pose should not be used as an input, since that would interfere with the player’s information gathering, which leaves us with the controllers. Here, there are too many possibilities to enumerate.

Most of the existing VR RTS games use variations of the “laser pointer” input paradigm, which ultimately boils 6 degrees of freedom (DoF) down to 2 dimensions (specifying a point on a planar topology). Our ideal RTS would require input along at least 3 dimensions, making it harder to map to an accessible pancake control scheme and thereby increasing the likelihood of satisfying Heuristic 1.
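
To make the dimensionality loss concrete, here is a minimal sketch of a laser pointer aimed at a tabletop battlefield. It assumes the table is the horizontal plane y = 0 and that the controller supplies a position and a forward direction; the function name is hypothetical and not taken from any engine.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def laser_hit_on_table(origin: Vec3, direction: Vec3,
                       table_y: float = 0.0) -> Optional[Tuple[float, float]]:
    """Intersect a controller ray with the horizontal plane y = table_y.

    Returns the (x, z) hit point, or None if the ray never reaches the plane.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dy) < 1e-9:
        return None  # ray is parallel to the table
    t = (table_y - oy) / dy
    if t <= 0:
        return None  # the table is behind the controller
    return (ox + t * dx, oz + t * dz)


# Two different controller positions that produce the same command.
a = laser_hit_on_table((0.0, 1.0, 0.0), (1.0, -1.0, 0.0))
b = laser_hit_on_table((0.5, 0.5, 0.0), (1.0, -1.0, 0.0))
print(a, b)
```

Both poses resolve to the same point on the table, and any roll of the wrist around the ray is discarded as well. Whatever the hand was doing, the game only receives two numbers, which is exactly what a mouse delivers.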

Our input interface must also contend with Fitts’s law. Specifically, in VR players can either perform many sweeping actions quickly, or a few precise actions slowly. If the interface requires precise actions, and increasing action frequency increases player power, then players will become frustrated as they try to perform actions faster than is possible for them. However, even if the inputs involve broad movements, we would still like to avoid coupling input frequency and player power.
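
For reference, the trade-off can be sketched with the common Shannon formulation of the law, where predicted movement time grows with log2(distance / width + 1). The constants `a` and `b` are normally fitted from user studies; the values below are placeholders I made up, so only the relative comparison means anything.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Index of difficulty in bits (Shannon formulation)."""
    return math.log2(distance / width + 1.0)


def movement_time(distance: float, width: float,
                  a: float = 0.1, b: float = 0.15) -> float:
    """Predicted seconds to hit a target; a and b are placeholder constants."""
    return a + b * index_of_difficulty(distance, width)


# Same 0.6 m reach: a 2 cm button versus a 20 cm panel.
small_target = movement_time(0.6, 0.02)
large_target = movement_time(0.6, 0.20)
print(f"small: {small_target:.2f} s, large: {large_target:.2f} s")
```

Under these assumed constants the small button takes roughly twice as long to hit as the large panel, which is the "few precise actions slowly" end of the spectrum.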

(As an aside, games like Starcraft contain just such a coupling. The faster and more precise your actions, the better you will do against a slower opponent of equal strategic mastery. However, this is archaic. Other modern genres have captured this “twitch” aspect. The modern RTS may safely abandon it.)

If inputs fall on the slow-yet-precise side of Fitts’s law, they should not require fine wrist control. Many VR games use a laser pointer-style interaction, and anybody who has played these games knows that hovering over a small UI element and pressing a controller button is an exercise in futility. Besides frustrating the player, encouraging myopic focus on small interface elements means the player won’t be focusing on the larger environment around them, an experience that is one of the key selling points of VR.

Putting It Together

To recap:

  • We should avoid coupling input frequency and player power.
  • Inputs should not require fine wrist control or angular precision.
  • Inputs to the game should have 3 or more dimensions.
  • Interface elements should sit around 2 meters from the player, and not require much visual acuity to operate.
  • Players must be able to quickly absorb available information using an intuitive interface.
  • The mechanics must combine to produce interesting, nuanced dynamics.

Additional easy wins would include a focus on co-presence with other players, a component of play that involves scale (seeing big things or feeling big in VR is cool), and the option of playing the game sitting.

Side Note: Ergonomics

Many VR RTS games have poor ergonomics. Since their action takes place on a flat horizontal plane, you spend the majority of your time looking down. With the weight of the headset on the front of your head, this induces considerable neck strain over time.

Additionally, a lot of the games use some form of world-dragging to translate the player around the virtual game space. Smooth artificial locomotion causes people to stand still, which is biomechanically more tiring than moving around on your feet.

Our ideal game would have you looking straight ahead or slightly up during the majority of the game, and would take place within a fixed volume to encourage the player to move around physically.


Specific Ideation

This section is going to be a bit of a freeform brainstorm on how to drive the design of our hypothetical game.

The geometry should be topologically 3D and exist within the player’s real space volume. Perhaps the battlefield exists as a series of nodes with links between them. Nodes would provide both resources and 3D arenas for engagements, while links could be created and destroyed over time, and have different properties (slow/fast, dangerous, etc).
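
One way to picture this is as a small graph data structure. The sketch below is only an illustration of the idea: the class names, the fields (`resource_rate`, `travel_time`, `dangerous`), and the node names are all invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    name: str
    position: Tuple[float, float, float]  # meters, inside the play volume
    resource_rate: float = 0.0            # resources per second


@dataclass
class Link:
    travel_time: float       # seconds to traverse (slow or fast)
    dangerous: bool = False


@dataclass
class Battlefield:
    nodes: Dict[str, Node] = field(default_factory=dict)
    links: Dict[frozenset, Link] = field(default_factory=dict)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def connect(self, a: str, b: str, link: Link) -> None:
        if a not in self.nodes or b not in self.nodes:
            raise KeyError("both endpoints must exist")
        self.links[frozenset((a, b))] = link  # undirected link

    def disconnect(self, a: str, b: str) -> None:
        self.links.pop(frozenset((a, b)), None)

    def neighbors(self, name: str) -> List[str]:
        return sorted(other for pair in self.links if name in pair
                      for other in pair if other != name)


board = Battlefield()
for name, pos in [("alpha", (0.0, 1.2, 0.0)),
                  ("beta", (1.0, 1.8, 0.5)),
                  ("gamma", (-0.5, 0.9, 1.0))]:
    board.add_node(Node(name, pos, resource_rate=1.0))
board.connect("alpha", "beta", Link(travel_time=5.0))
board.connect("alpha", "gamma", Link(travel_time=12.0, dangerous=True))
print(board.neighbors("alpha"))   # both links present
board.disconnect("alpha", "beta")
print(board.neighbors("alpha"))   # after the link is destroyed
```

Because nodes carry 3D positions, the same structure could be laid out around the player at different heights rather than flattened onto a table.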

In order to decouple input frequency from player power, we can implement an action budget. This might be a power meter that slowly fills up, and allows manipulation of nodes and other resources on the board. Managing and budgeting your energy would be an important meta-game. You could burn energy to push an offensive, or burn energy to defend more effectively. Having low energy would make you vulnerable. Final Assault is an example of a game that does this well.
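
A minimal sketch of such a meter, with made-up numbers (capacity 100, regeneration 5 per second, 25 energy per command), shows how it caps the benefit of acting faster. This is my own toy model and is not meant to describe how Final Assault or any other game implements it.

```python
class ActionBudget:
    """Energy meter that refills over time and is spent on commands."""

    def __init__(self, capacity: float = 100.0, regen_per_sec: float = 5.0):
        self.capacity = capacity
        self.regen_per_sec = regen_per_sec
        self.energy = capacity

    def tick(self, dt: float) -> None:
        """Regenerate energy, never exceeding capacity."""
        self.energy = min(self.capacity, self.energy + self.regen_per_sec * dt)

    def try_spend(self, cost: float) -> bool:
        """Spend energy if affordable; otherwise the command is refused."""
        if cost > self.energy:
            return False
        self.energy -= cost
        return True


def commands_executed(attempt_interval: float, duration: float = 60.0,
                      cost: float = 25.0) -> int:
    """Count commands that succeed when one is attempted every interval."""
    budget = ActionBudget()
    executed = 0
    for _ in range(int(duration / attempt_interval)):
        budget.tick(attempt_interval)
        if budget.try_spend(cost):
            executed += 1
    return executed


spammer = commands_executed(attempt_interval=0.5)
deliberate = commands_executed(attempt_interval=4.0)
print(spammer, deliberate)
```

With these numbers, a player attempting a command every half second gets no more commands through in a minute than one acting every four seconds. The spammer simply empties the meter early and leaves themselves vulnerable.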

During direct engagements, when two players are both competing for the same location on the battlefield, we must make sure to provide significant choices to players and also prevent the faster player from automatically gaining the advantage.

The majority of an attacker’s effort would be in preparing an attack. Adding a delay between finishing preparations and the actual attack would allow defenders to be alerted and turn their attention to the engagement.

Micromanagement would be penalized: each command could cost energy from the action budget and take time to be delivered to the battlefield. Sending too many commands in succession could paralyze your units.
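
As a sketch of how the cost and delay might work, the hypothetical `CommandRelay` below doubles the energy cost of an order for every other order issued in the last few seconds, and holds each order for a fixed delay before it reaches the units. All names and numbers are assumptions, and the paralysis effect is left out.

```python
import heapq
from typing import List, Tuple


class CommandRelay:
    """Delivers orders after a delay; rapid-fire orders cost more energy."""

    def __init__(self, delay: float = 3.0, base_cost: float = 10.0,
                 window: float = 5.0):
        self.delay = delay
        self.base_cost = base_cost
        self.window = window
        self._recent: List[float] = []               # issue times in the window
        self._pending: List[Tuple[float, str]] = []  # (delivery_time, order)

    def issue(self, now: float, order: str) -> float:
        """Queue an order and return its energy cost."""
        self._recent = [t for t in self._recent if now - t < self.window]
        cost = self.base_cost * (2 ** len(self._recent))  # doubles per recent order
        self._recent.append(now)
        heapq.heappush(self._pending, (now + self.delay, order))
        return cost

    def deliver(self, now: float) -> List[str]:
        """Pop every order whose delivery time has arrived."""
        ready = []
        while self._pending and self._pending[0][0] <= now:
            ready.append(heapq.heappop(self._pending)[1])
        return ready


relay = CommandRelay()
costs = [relay.issue(t, f"order-{i}") for i, t in enumerate([0.0, 1.0, 2.0, 10.0])]
print(costs)  # three quick orders escalate; the fourth, after a pause, is cheap again
```

Pairing this with the action budget means a burst of corrections drains the meter quickly, while a single well-timed order stays cheap.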

Leveraging high-level inputs like unit formations and group composition could be effective. Having unit behavior change significantly based on group composition could allow deep strategic play and forward-thinking. For example, assigning a fighter escort to your bomber squadron could cause the bombers to focus on making headway towards their target rather than spend effort defending themselves.

This concept of purchase-then-attack could be expanded by allowing players to purchase reserve units at a discount; if you can properly judge how an engagement will play out, you can get proportionally more power onto the battlefield in the right place at the right time.

Resource income should be expandable, but not exponentially. We want to prevent “steamrolling”, and instead foster a tug-of-war dynamic. Spending time arranging your resource production in 3D space could yield some efficiencies — but ideally this requires a trained eye rather than a fast hand.
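
One simple way to get expandable but sub-exponential income is a diminishing-returns curve. The square-root shape and the `base_rate` value below are arbitrary choices for illustration; a linear or logarithmic curve would satisfy the same constraint.

```python
import math


def income_per_second(extractors: int, base_rate: float = 10.0) -> float:
    """Diminishing-returns income: each extra extractor adds less than the last."""
    return base_rate * math.sqrt(extractors)


for n in (1, 4, 9, 16):
    print(n, income_per_second(n))
```

Under this curve, quadrupling your extractors only doubles your income, so the player who is ahead cannot compound a lead into a steamroll as quickly, and the tug-of-war stays alive.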

In addition, adding unit exhaustion could prevent a well-designed push from completely overrunning the opponent without giving them opportunity to counter. Making defense easier than offense means you need to continually exercise superior play in order to win. Over time, this balance should become more unstable – this prevents a complete stalemate. After many minutes of play, a single advantageous play could cascade to victory.

Perceivable situational details should influence the outcome of encounters. Things like group composition, terrain, weather, flanking, formations, and unit morale, if properly exploited, can lead to one-sided engagements.

Game conditions should change to prevent a player from establishing a totally unassailable position. For example, a changing battlefield topology would open new flanking routes or render previously vital positions redundant. This would also require players to continually adjust their personal strategy, leading to more interesting matches.

We want to encourage strategic posturing. Placement of units should be a mind-game, to some extent. Direct engagements should resolve fairly simply and quickly, discouraging micro-management and feelings of helplessness as your forces lose. There could be several discrete points during an engagement where player commands are communicated to the units, creating a rock-paper-scissors guessing game. “Is the opponent going to change their formation command, or keep it the same?”

All together, this should lead to a “dance”, where the player shifts around the space, setting commands up that will execute at some time in the near future.


Anyways, I hope this has excited some ideas in your brain. It has certainly done so for me.
