Source Filmmaker: First Impressions

Meet the Pyro

As you may have heard, the Source Filmmaker was released two weeks ago at the conclusion of the Pyromania Update for Team Fortress 2. At first, everybody who wanted it had to submit a survey form that included basic hardware and software specs about their computer, including whether or not a microphone was attached. The idea was that a limited, graded release would give a taste of what the tool is like without flooding the Internet with videos. However, after three weeks of semi-open beta, the SFM team has gone public. You can download it here. Here are my first impressions of the tool (there is a TL;DR at the bottom).

The Source Filmmaker is a tool that allows “players” to set up scenes within any Source game, and then edit the resulting clips as if they were in a video editing program. This hybrid system bypasses a lot of the conventional paradigms of filmmaking: you can simultaneously modify how you want a shot to look AND change how the sequence is cut together. Scenes still have props, actors, lights, and cameras. However, if you decide while editing that you want a shot of the same scene from a different angle, you can create a new shot from a new angle in seconds.

This is definitely the direction that movies are headed as a medium. Computer graphics have reached a level of visual fidelity that allows filmmakers to create entirely new elements and mix them with live footage. For instance, Sky Captain (an awesome movie, by the way) was shot entirely on blue-screen in some guy’s basement. All the environments and non-human actors were computer generated. This allowed the maker to move the actors around as he pleased. If he didn’t like the direction they were facing or their position on-screen, he could simply move them around like any other 3D asset.

Sky Captain and the World of Tomorrow

So far I’ve used the Source Filmmaker for a little over one week, on and off (I made this). From what I hear, experts at the program can deftly make complex scenes in minutes. However, I have yet to figure out all the hotkeys and efficient methods, so it takes me a long time to even sketch out a rudimentary scene. My speed is hampered, in some part, by the strange choice of hotkeys; the lower left part of the keyboard seems to have shortcuts distributed at random. Yes, every program has a learning period in which shortcuts are committed to muscle memory. The SFM, though, for all its similarities to 3D programs, seems to have flipped the traditional hotkey set.

I digress, however. The primary aspect of SFM that impedes my work is the tool’s concept of time and animation. To illustrate, let me explain the structure of the program: each file is called a “session”: a self-contained clip. A single map is associated with each session. A session contains a strip of “film” which is composed of different shots.

Shots are independent scenes within the same map. Each shot has a scene camera and various elements that expand upon the base map set. Each shot also has an independent concept of time. You can move a shot “forwards” or “backwards” in time, which doesn’t move the clip in relation to other clips, but changes which segment of time the shot is showing within its universe. You can also change the time scale, which slows down or speeds up the clip.

If you move a shot to be before another shot, it will not change the shot, only the sequence in which the shots are displayed. This can be confusing and/or annoying. For instance, if you have a shot of someone talking, and you want a close-up or a different angle within that clip, there are two ways to do so. You could go into the motion editor and move the camera within the specific segment of time within the shot. The easier way, however, is to split the shot into three clips. The end clips remain the same, and inherit the elements from the single parent shot (which doesn’t exist anymore). In the middle clip, however, you change the camera to show a close-up angle. Both of these methods look the same, until you change your mind.

After you split a clip up into different shots, you can’t (to the best of my knowledge) add in a common element that spans all three shots, even though the elements that were there beforehand were inherited by all three. If you move a prop in one shot, the change doesn’t carry over to the others. This problem lends itself to a strange workflow, in which you set up the entire scene from one camera view, and only when you are satisfied do you split it up into different clips.

But how about the other method I mentioned? The motion editor allows you to select “portions of time” within a shot’s universe. You can make changes to objects and their properties, but the changes will only be visible within that time segment. For smooth transitions, it allows you to “partially” select time, and blend between two different settings. This feature can be extremely useful and powerful, but it is also a pain in the ass. While trying to hand-animate actors, I often find myself getting annoyed because I want to go back to the same time selection and add something in, or smooth over multiple curves. Since each entity stores its animation separately (each bone in an actor’s skeleton, for instance), I’ll change an animation but forget about a bone. The animation ends up completely screwed, and it’s easier to start over than to fix it.

Yes, a lot of this pain is due to my inexperience with the workflow. I’m sure I’ll get the hang of working with the strange animation system. But for any filmmaker or animator starting out, it will be quite a jump from the traditional keyframe methodology. In the Valve-made tutorials the guy talks about the graph editor, which seems to resemble a keyframed timeline. However, I have yet to glean success from its obtuse interface, and in any case the “bookmarking” system seems unnecessarily complex.

I want to cover one more thing before wrapping up. What can you put in a scene? Any model from any Source game can be added in and animated. There are also new high-res versions of the TF2 characters. Lights, particle systems, and cameras are also available. For each of these elements, you need to create an Animation Set, which defines how the properties of the element change over time. IK rigs can be added to some skeletons, and any property of any object in the session can be edited in real time via the Element Viewer.

Another huge aspect of the program is the ability to record gameplay. At any time, you can jump into the game and run around like you are playing. All the elements of the current shot are visible as seen by the scene camera. You can even run around while the sequence is playing. You can also capture your character’s motion in “takes”. This is great for generic running around that doesn’t need gestures or facial animations. If you need to change something, you can convert the take into an animation set, which can be edited.

On the note of character animation, lip syncing is extremely easy. Gone are the pains of the phoneme editor in Face Poser. You can pop in a sound clip, run auto-detect for phonemes, apply to a character, and then go in with the motion editor and manually change facial animation and mouth movements.

TL;DR: To summarize my feelings, anyone who admires the Meet the Team clips or the Left 4 Dead 2 intro trailer should definitely check out the Source Filmmaker. It’s free, and the current tutorials let you jump into making cool short clips; every clip looks really nice after rendering. The program does require a lot of memory and processing power, though, so you will be unable to work efficiently if your computer doesn’t get decent framerates in TF2.



Why is it that appearances play such a large part into a person’s perception of things?

When I was building a ray tracer (see here) a while back, there were a couple of stages. First, I had to get the actual ray tracing working, which was the math-intensive part of the project. However, that only yielded solid-color circles on the screen. Once I called the function a second time to trace from the bounce point to a light (giving the spheres shading), it became impressive to look at. The ratio of work to apparent sophistication seemed arbitrary.
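That second stage is surprisingly little code compared to the intersection math. Here is a minimal sketch (not my actual ray tracer, just an illustration with made-up function names) of the shading step described above: take the point where the primary ray hit a sphere, and scale the sphere’s color by the angle between the surface normal and the direction to the light.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def lambert_shade(hit_point, sphere_center, light_pos, base_color):
    # Surface normal at the hit point on a sphere: it points from the
    # center of the sphere through the hit point.
    normal = normalize(tuple(h - c for h, c in zip(hit_point, sphere_center)))
    # Direction from the hit point toward the light (the "bounce" ray).
    to_light = normalize(tuple(l - h for l, h in zip(light_pos, hit_point)))
    # Cosine of the angle between them; clamp at zero so surfaces
    # facing away from the light go black instead of negative.
    intensity = max(0.0, sum(n * t for n, t in zip(normal, to_light)))
    return tuple(intensity * c for c in base_color)

# Top of a unit sphere at the origin, lit from directly above:
# full brightness, so the red is unchanged.
print(lambert_shade((0, 1, 0), (0, 0, 0), (0, 5, 0), (255, 0, 0)))
# → (255.0, 0.0, 0.0)
```

Without this step every pixel on a sphere gets the same `base_color`, which is exactly the flat solid-color circles I described; with it, the color falls off toward the silhouette and the circles suddenly read as spheres.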

A 3D model becomes more admired when color is added. Even if the textures are solid colors, it still “looks better” (to the untrained eye) than a white/gray untextured model, regardless of the actual model quality. And yet, so much work can go into detailing a model or perfecting the topology, while it barely takes any time to cover areas of the model in different hues. Similarly, a model will get as much praise from casual observers if it is solid white as if it is textured with a checkerboard. However, the checkerboard shows that the model has been UV unwrapped (a huge chunk of work), while pure white does not.

A game mod can be absolutely brilliant in conception and be fueled by a constant stream of original ideas, but if it has a couple of pieces of concept art or a 3D model or two to begin with, people will be much more interested. Perhaps in this case it’s also a sort of promise that work will be done on it, so people have some justification in expecting pictures. Honestly though, I would rather have a mod or scratch-built game with a brilliant design document/manifesto that ensures the developers aren’t the brainless riffraff you see in the mainstream nowadays, than a few pieces of art.

An even more common example is websites. A website might be fully functional and the idea might be brilliant, but until it meets some of the unspoken standards set by most commercial websites, people will have limited interest. Conversely, a website might have no features, but if it is sleek with custom buttons and subtle gradients, people become enraptured by it. It’s almost as though superficial, insignificant design flourishes can replace content to a certain degree.

I guess you see that quite a bit in mainstream games these days. There is an industry standard that seems to have grown out of control. Distributors, and therefore in practice designers, seem to think that for a game to be enjoyable, or rather to sell, it must have top-of-the-line graphics. The only problem is that this standard was set by megalithic productions with ridiculous budgets (Crysis, Call of Duty, Halo 3). Now smaller studios are held to it too, studios that don’t have the budget to integrate state-of-the-art graphics and environment detailing AND to develop an original, fun game. Thus we get the constant, haggard procession of small, rushed games with mediocre graphics and mediocre gameplay that are sometimes not even playable at release due to show-stopping bugs.

This phenomenon of design enhancing content is not entirely bogus, but in most cases it shouldn’t affect a viewer’s opinion. For instance, if I make a blog post with a picture at the top, people are more likely to read the entire thing. Why is this? I have two theories. I outlined one a little bit above: a user expects a certain layout and a certain level of visual quality. If this superficial standard is fulfilled, the user is subconsciously more accepting of the content, even if it is of lower quality than another, less refined project. If the design is lacking, the user tends to disregard it as lower quality, and has a bias against the content before they even start consuming it. The other theory is that people are brainless. Most people don’t actually judge by substance; their minds tend to wander, and when they don’t want to focus on the stuff that makes them think, they look at the design. If something doesn’t have a good design to look at, they move away from the content.

But the question is, why do people value the visuals to begin with? This comes into play in more than just digital media. Everybody remembers sights more easily than any of the other four senses. Sure, you might smell a scent and recall a name or a memory of an event. But you can’t just recall the smell. It is like a hash map: you can get from the smell to the linked memories, but given only the data structure, you can’t reconstruct the keys that the hashes came from. Taste and touch are slightly easier. Sounds are almost as easy to remember as sights. So why are the senses given different priorities? I think the answer lies in two odd facts. First, for each level of the aforementioned priority, the size of the vocabulary for describing sensations varies directly with the ease of recollection. Second, it is easier to reconstruct visual memories than tastes or smells. Humans can easily communicate what something looked like by drawing in the dirt, using hand gestures, or drawing on paper. Tactile sensations can be simulated using the hand, and sounds can be similarly recreated using the mouth. But smells and taste are completely out of our control.
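The hash map analogy can be made concrete in a few lines of Python (hypothetical names, purely to illustrate the one-way property): if the table stores only a hash of each stimulus, presenting the stimulus again retrieves the linked memory, but nothing in the table lets you enumerate the original stimuli.

```python
import hashlib

class OneWayMemory:
    """A toy model of smell-triggered recall: lookups work only in
    one direction, because keys are stored hashed."""

    def __init__(self):
        self._table = {}

    @staticmethod
    def _key(stimulus):
        # Store only a digest of the stimulus, never the stimulus itself.
        return hashlib.sha256(stimulus.encode()).hexdigest()

    def link(self, stimulus, memory):
        self._table[self._key(stimulus)] = memory

    def recall(self, stimulus):
        # Recall works only when the same stimulus is presented again.
        return self._table.get(self._key(stimulus))

memories = OneWayMemory()
memories.link("fresh bread", "grandmother's kitchen")
print(memories.recall("fresh bread"))  # the linked memory comes back
# But iterating memories._table yields only opaque digests: there is
# no way to reconstruct "fresh bread" from the stored data.
```

This mirrors the asymmetry above: smelling the scent (supplying the key) retrieves the memory, but no amount of introspection on the stored memories recovers the smell itself.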

I’m sure there is a whole field based around categorizing scents and tastes and developing a detailed vocabulary. If anybody has experience with this, feel free to comment.
