Sunday, May 15, 2011

38 Studios

So once I decided not to start my own company, I started looking around for a company in the games industry that really put employees, and more importantly, employees' families, at the forefront of its priorities. I wanted a place I could raise kids, have a life, and make great games with my friends.

38 Studios stands head and shoulders above every other company in the industry in this respect. 38 Studios was founded by Curt Schilling, former Red Sox pitcher, and his passion is building teams. He knows that to create a great game you have to have great people, and treat them and their families well.

Curt will move heaven and earth for his family, and if you're on his team, you're family. Rarely have I met anyone as genuinely passionate and good hearted as he is. He has strong opinions, but he is also wise enough to trust his team and value their opinions and expertise.

When I came to visit I was very impressed by everyone I met. They are smart, nice, and honest with themselves and each other. They realize they have a tremendous task in front of them, creating a company, creating a great game, creating a world. But each time I've seen them approach a problem, I've seen them solve it in innovative and great ways. Each time I've seen them choose between the expedient path and what's right for the team, they've done what's right for the team. Not only does that hold true at the top levels, but that attitude, flexibility and self-empowerment are there in everyone I talk to.

38 Studios faces challenges, for sure. They are trying to do something that hasn't been done before. They are a privately owned company making an MMO, and trying to do it in a healthy sustainable way. They are growing quickly and trying to prepare themselves for the transition to a service organization while at the same time creating a quality content game. They are trying to find the right formula and the right people and they're trying to do it in a realistic time frame.

Can they do it? I don't know. But I am incredibly motivated to help them try, because if they can pull this off, this is my dream come true. A great company, great people, and a quality life for my family and me.

My first day starts tomorrow. Wish me luck! :)

Wednesday, April 6, 2011

Exploring the Galaxy

In 2008 I created Galaxy Gameworks as a small LLC to handle commercial licensing of Simple DirectMedia Layer on the iPhone. Since then I have been maintaining it at a minimal level in my spare time to make it possible for people to use SDL on iOS and other embedded platforms.

As soon as I made the concrete decision to leave Blizzard, my brain started exploding with ideas on how to expand Galaxy Gameworks into a full blown games middleware company. My vision was to build the company into three pillars each designed to help people make games:
  1. Software: Starting with SDL and expanding to meet important needs for small developers, based on my experience leading a development team at Blizzard Entertainment.
  2. Education: I have always been a big fan of student and indie development, and I know an awesome woman, Brandii Grace, who used to teach at Digipen and is on the board of the LA Independent Game Developer Association who could help start this.
  3. Consulting: This is a less developed idea, but there are lots of companies that need help at strategic points to help get their games shipped, and this would be a great service to complement software and education.
As soon as I left Blizzard, I blasted into overtime starting development on SDL and learning what I needed to build Galaxy Gameworks into a real business. I had lots of connections and people interested in what I was doing, and was really excited about the possibilities. I gave myself until the end of March to explore what is involved with creating a company and making my dream a reality, without committing myself.

It was a really amazing experience. I brought all the pieces together to start the company:

Company Vision
I have seen quite a few companies in my day, and I wanted to bring a company vision that incorporated the best of what I have seen, with an eye towards practicality and treating employees really well. I also wanted to have clear goals so that employees who are brought on and potentially given ownership have a good idea of the kind of company we are trying to make.

Virtual Office
I decided early on that I wanted to create a virtual office. This gives flexibility in hiring and reduces overhead, but means you have to work harder to find people who can be self-disciplined, stay motivated, and communicate well remotely. I put together a virtual office around Google services, and set up business phone and FAX. I also worked out the advantages and risks associated with distributed development, and talked to some companies that actually operate that way to find out how it works in practice.

Good branding and an easy to use website are really important. Some of the important elements are a showcase for your products, good documentation, clean presentation, and easy navigation.  I hired someone to create a logo (the process is described in another blog post) and create a new website for a commercial presentation.  The result isn't complete, but it's a good start.

Customer Service
I hired a friend of mine, Jen, to handle customer communication and start setting up hiring processes. She is amazing at organization and has customer service and HR experience. She is handling the mailing list and forum, organizing bug reports, and handling customer inquiries and licensing. She also did research on the tax implications of hiring contractors and full time employees in other states and potentially other countries.

I have been working very closely with my friend Sheena, who has great technical writing skills and is learning programming, to put together high quality API documentation (work in progress). We also put together plans for tutorials and use cases with examples.

Business Plan
I worked with my wife Lauren and an old friend Jeremy on the draft of a business plan. Lauren is awesome because she is a technical writer by trade and can take large amounts of technical information and organize it in an easy to understand manner. Jeremy is awesome because he's spent the last 15 years or so in software sales and business development, and is extremely innovative and works well with customers.

Between the three of us, we came up with a great business plan centered around three products. The business plan really helped us focus on our goals and business strategy. We thought about our strengths and weaknesses (SWOT analysis), our competitors and potential partners, the trends in the games industry, and our costs and potential sources of income.

We got great feedback on our plan, and a few themes came up repeatedly:
  • What problems are you solving?
  • Keep it short - what's your elevator pitch?
  • Can you show a fast return on investment?

In response to the feedback, we cut one of the products to stay focused and reduce costs and ended up with a revenue and expense chart that looks something like this:

Investors like to see an exponential chart, and a return in the 6-18 month timeframe. :)

I'm pretty confident I could find some investors at the $300,000 price point and create a small sustainable company creating middleware in the games industry.

At the end of the day though, I realized that even though I have all the pieces I need and lots of support from friends and customers, it was taking all my waking hours and would continue to do so. In a few years when the girls are older and becoming independent that might be okay, but my time with them is precious and I want to have as much of it as possible now.

So, I'm closing Galaxy Gameworks for now, and the quest continues...

Monday, April 4, 2011

How I left Blizzard

Blizzard doesn't tend to be very open about how and why people move around at the company, and I still have people asking me what happened, so for those interested I thought I'd share my story.

As far as I know, nothing I'm writing here is confidential, so please tell me if that's not the case.

To understand something of the story, I really should start at the beginning...

After the Wrath of the Lich King launch, I was looking to move on to something new. I loved Blizzard and I didn't want to leave the company, but I wanted to work on a smaller team, a smaller project, with more responsibility. Unexpectedly, a job opened up for technical lead on a small unannounced project that was getting underway. It was a small project, with a short timeframe, at least for Blizzard, and it was a project that I personally was very interested in. I applied, and at the beginning of 2009 I was the engineering lead on a new unannounced Blizzard game!

I was very excited, and spent the first few months researching engine technology, getting some of the infrastructure set up, and working with the producer on a detailed technical plan for the project.  After that, since the designers had a solid idea for the game and were making great progress in the prototype, I started work on proving out the gameplay systems and hiring for the most urgent engineering positions.

Come July 2009, the company sounded a call to arms to help get ready for the Starcraft II launch. There was much to do and each team loaned a person or two to help out. Since our team was so small, most of us went to help out, including our producer and UI artist. We kept a small crew of designers, an artist and our newly hired graphics engineer to keep the project alive during our tour of duty on

We kicked butt, took names, and Starcraft II launched smoothly a year later with 2.0.

During the course of that year, our producer was promoted to lead producer, and the designers made great strides in evolving the game design and the gameplay model. The graphics engineer had completed the core of the graphics engine and was roughing out the tools pipeline and content creation system.

When we returned from, I jumped right on getting our AI/gameplay engineer up to speed on the gameplay systems and started the herculean task of migrating code I had already finished over to the new gameplay model and tools systems that were developed while we were gone.

Three months later that work was being wrapped up, and I was called in to a meeting and told that there were issues on the team.  I had just worked out some issues with my gameplay engineer, but there were issues beyond that, which they needed to follow up on to give me concrete feedback.

Over the next week I implemented a great suggestion from them which improved my communication. Our part-time producer led the technology goal and milestone meetings that I had requested, which were very helpful. I worked on a clear division of labor and responsibility among the team, and I had one-on-one meetings with all of my engineers to see how things were going.  As far as I could tell morale was improved, things were going great with my gameplay engineer, and we had a clear plan for the future.

I was feeling great, and was told that even though things had improved, there were still unspecified issues that were being looked into.  The meeting was very uncomfortable as though something were being danced around, so I mentioned that if the leadership of the team thought it was best, I was willing to step down.  The meeting suddenly relaxed and I got nervous.

A few days later, I was asked to interview someone for technical leadership of the team!  I asked for the chance to talk to whoever was having problems, and was told that would be awkward.  A few days after that it was confirmed that they had already made the decision to have me step down, but that the team leadership was there to help me succeed however they could.

So after a little primal scream therapy, I sat down and really thought about what went right, what went wrong, and what really was important to me.

One of the factors that I think contributed was that we hired really experienced people in the industry, and I hadn't had experience building a small game from the ground up.  Instead of mandating how things should be done, or bringing a set of best practices, I tried to work collaboratively to help us develop the best tools and approaches for our unique needs.  I think this was a creative and good way for us to go given our talents, but I think it increased stress and was possibly interpreted as weakness on a team already feeling time pressure.

So, assuming that was a factor, and since I honestly did not know how other game companies work, I set out on a quest to tour the industry and learn as much as I could about how companies other than Blizzard operate.  I was interested in everything including technology, development processes, team composition, project planning, and business relationships.

I want to stop here and thank everyone who helped me in this quest.  Lots of companies opened their doors and under NDA showed me everything that I wanted to learn about.  Blizzard management too, went way above and beyond the call of duty and introduced me to people who could help me learn about the industry.  I learned that there are a ton of great companies out there, and lots of good ways to approach things.  Thank you all! <3 :)

Armed with a good perspective on the industry and how games are developed, I looked back over how we set up our project and came to the conclusion that it was entirely reasonable, from a tech perspective, and felt comforted.

With that awesome tour completed, I turned my attention back to my future.  My options were to stay on the team as an engineer, transition to another team at Blizzard, leave and start my own venture, or join another company in the industry.

I realized I didn't really want to stay on my team, because even though I love the team and I still think it's an awesome project, I didn't feel like I got any support in that role from management, and I never was told why I was asked to step down.  I don't feel like it was personal, and I think management really was trying to do the right thing for the team, but I still didn't want to stay in that environment.

I thought a lot about joining another team at Blizzard, or joining another company, but I realized that after launching Wrath of the Lich King, booting engineering on a new team, and shipping new on StarCraft II, all in less than two years, I was starting to feel burned out.  I also looked at myself and realized that sometime in the last ten years I had become a family man, and what I wanted to do more than anything else in the world was spend time with my family.

I took a deep breath and relaxed, feeling my shoulders unclench and my face break out in a smile, and knew I had made the right decision.

Friday, March 11, 2011

Life and Love

Every once in a while you see something that takes you completely out of your day to day routine and reminds you that this world is a wide and wonderful place.

Spider Robinson is one of my all time favorite authors, and he has been entertaining me and teaching me about love and laughter most of my life.  I finished the Callahan series (again) tonight, and was touched by the plight of Doc Webster, who passed on with a brain tumor in the final story.  I finally noticed in the back of the book contact information for the American Brain Tumor Association:

I got on the 'Net and found his website and found that his wife had passed on, in peace, almost a year ago and was able to piece together a little of what his world has been like since then.  He is a brave man with a wonderful family. I'm lucky that I haven't had to deal with the death of someone close to me yet, but this world is richly textured, and people deal with pain and loss and anger and joy and triumph every single day around the world.

If you haven't seen his work, you can check it out in traditional and audio form here:

To all of you who are dealing with and have dealt with loss, I wish you the best, and while I know nothing can ever replace those you've loved, there are still more out there who have loved you, love you now, and will bring you joy in the future.


SDL Job Opportunity!

Software Engineer - Focus: SDL

We're looking for a talented developer who would be interested in developing and maintaining SDL. 

  • Passionate about helping people make games
  • Able and motivated to learn quickly
  • Organized
  • Good at communicating, both with developers and with customers
  • Comfortable with game development concepts
  • Comfortable with software design and implementation concepts
  • Comfortable using and developing for multiple platforms
  • Fluent in C, with a familiarity with some other languages
We're bringing the next generation of the widely acclaimed Simple DirectMedia Layer to game developers everywhere, and we need talented and motivated engineers to hone the library to a fine polish, improving stability, performance, and usability. The position is initially for contracting work, with the possibility of moving to a full time position. 

Interested candidates should send a resume and cover letter to

Monday, March 7, 2011

GDC 2011

GDC was really great this year.  It was interesting watching the interplay between different companies and people, and the ebb and flow throughout the show.  I got some great feedback on the business plan, got to pal around the show with an old friend of mine from high school, and learned a lot about presentation.  (note to self: Comic Sans is a terrible font for professional presentation) :)

Aside from the great contacts and fun meeting people, the highlights of the show for me were the exciting things happening on the Indie scene.  Unity had a great booth overflowing with people the whole time, GameSpy announced their Indie technology program, and it seems like every major technology developer has free or extremely inexpensive licensing for independent developers.

Of course each year I really enjoy the Independent Games Festival finalists.  The booth was packed the whole show, so I didn't get to try everything, but there were a few that really stood out in my mind:

Solace -
Solace is a blend between top down arcade shooter action and the creative experience of emotion and sound.  I've never seen anything quite like it before, it's definitely worth checking out.

Helsing's Fire -
Helsing's Fire has a unique art style and interesting gameplay.  I haven't played all the way through it, but it uses the tried and true method of introducing new elements as the game progresses, so the gameplay is constantly evolving.

Oh, before I forget... I met the developers of Angry Birds!  Wooo! :)

I have lots of followup e-mails to send, so I'm heading out, but until next time... Cheers! :)

Friday, February 25, 2011

Business Phone and FAX

For business purposes I need a contact phone and FAX number, but I don't want to have the cost and equipment requirements of a separate phone line.  Given that I'm creating a virtual office, I figured there must be inexpensive and flexible ways of setting up a business phone system.

Google to the rescue, again!

Actually there are lots and lots of options available with different pricing schemes and features, but given that I'm using Google for business e-mail and document sharing, and Google Voice is free, it's a natural fit:

It's a little awkward to call from - it requires you to enter the number you want to call and which phone you want to call from and then it calls both and connects the two, but it's freakin' awesome as an incoming voice message system.  You can route your virtual number to any phone you want, and if it goes to voice-mail, Google will take a message, transcribe it to text, and send the message to you via SMS and e-mail.

So, now that voice is taken care of, I need a solution for FAX. Again, there are lots of options available, but given that I plan to use e-mail for most outbound communication and FAX only for receiving signed license documents, OneSuite turns out to be a good option:

You have to create an account and "charge" it with $10, but once you do that, you can activate a FAX number for unlimited incoming transmissions for $1/month.  If you want outgoing FAX capability, or want the incoming FAX messages to go to multiple e-mail addresses, you can upgrade to $2.95/month + 2.5c/page for outgoing transmissions.

Business phone and FAX solution implemented.  Total cost?  $1/month.  Pure win?  Priceless.

Problem Solving in PowerPC Assembly

When you're working with other people's code, you don't always have the source, and sometimes this means you have to dive into the assembly code to figure out what's going on.

If you're lucky, there will be a small function where there are well defined inputs and outputs and you can see what's going on...

In this case someone reported a bug where if they created two different streaming textures, the program would crash deep inside glTexSubImage2D() in a routine called gleCopy().  They helpfully sent a small test program, and sure enough, it crashed on my iMac.

So, not having any leads, I start stepping through the assembly code:

0x0424a3fc <gleCopy+88>:        mtctr   r4
0x0424a400 <gleCopy+92>:        rlwinm  r0,r9,2,0,29
0x0424a404 <gleCopy+96>:        addi    r9,r9,1
0x0424a408 <gleCopy+100>:       lwzx    r2,r11,r0
0x0424a40c <gleCopy+104>:       stwx    r2,r3,r0
0x0424a410 <gleCopy+108>:       bdnz+   0x424a400 <gleCopy+92>
0x0424a464 <gleCopy+192>:       add     r11,r11,r7
0x0424a46c <gleCopy+200>:       add     r3,r3,r8

This is a tight copy loop, where the loop count is loaded from register r4, and the data is copied by offset from pointer r11 and stored at that same offset in pointer r3. At the end of the loop, r11 is incremented by r7 and r3 is incremented by r8.

In C this might look like:
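while (rows_to_copy--) {
    /* inner loop: copy one row, one 32-bit word at a time */
    for (i = 0; i < count; ++i) {
        dst[i] = src[i];
    }
    /* advance each pointer by its own pitch */
    src += src_pitch;
    dst += dst_pitch;
}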

When I printed out the registers, I noticed an interesting thing. The source and destination pointers started out the same, but the source and destination pitches were completely different!

To understand why the source and destination pointers were the same, I looked at the code that creates streaming textures and saw I'm using an Apple extension to have OpenGL use application memory instead of internally allocated memory.  Since I use that same pointer when updating the streaming textures, it makes sense that if the system thought copying needed to be done, it would copy from the pointer I passed in, to the pointer I told it to use for data storage.

But why would it think it needed to do copying?  I looked on the original web page describing how to optimize Mac OS X texture upload and noticed that it had a note saying that only textures with a width of a multiple of 32 bytes would bypass the copying step.  I figured that it shouldn't crash if they didn't have an aligned width, but what the heck.  I resized the textures and as expected the program still crashed.

So what would cause the pitches to be different?  Well, in a normal texture upload, the pitch is controlled by the  GL_UNPACK_ROW_LENGTH attribute.  In a flash, I realized that's probably what the extension uses to determine the original pitch of the texture.  Sure enough, I was missing a call to set that attribute when creating the texture. Looking more closely at the values in assembly, the pitch that it was using for the destination was indeed the pitch value that was set for the first texture upload, and of course if pitch values don't match in a pixel copy operation, you have to copy them row by row, just like the assembly code above was doing.
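
To make the pitch mismatch concrete, here is a minimal sketch in C of a row-by-row copy between two buffers whose pitches differ. The helper name copy_rows and its parameters are made up for illustration; this is not Apple's code, just the same idea as the loop in the disassembly.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `rows` rows of `row_bytes` bytes each.
 * Pitches are in bytes and may differ between source and destination,
 * which is why the copy has to go one row at a time. */
static void copy_rows(unsigned char *dst, size_t dst_pitch,
                      const unsigned char *src, size_t src_pitch,
                      size_t row_bytes, size_t rows)
{
    while (rows--) {
        memcpy(dst, src, row_bytes);
        src += src_pitch;   /* step to the next source row */
        dst += dst_pitch;   /* step to the next destination row */
    }
}
```

If both pitches equal row_bytes, the whole thing collapses to a single memcpy. If one of the pitches is wrong for the buffer it describes, as in my bug, the pointers eventually walk past the end of the allocation.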

Adding a call to set GL_UNPACK_ROW_LENGTH in texture creation fixed the problems!

So, even though the bug was in my code, and a pretty obvious one in retrospect, it was really helpful to be able to look at the assembly and understand WHY my code was wrong.


Thursday, February 24, 2011

Evolution of a Logo

This past month I've been working with a friend of mine, Shawn Bellah, on rebranding Galaxy Gameworks in preparation for GDC.  This is a kind of big deal, because this is the birth of the company in the public eyes, and I want to make sure it's as fun and professional as possible.

One of the important pieces of this process is the logo.  The logo should be distinctive, simple, and say something awesome about your company.

We went through a very iterative process during the logo development.  We started with a bunch of brainstorming sketches that Shawn had done, where we looked at different ideas to use with the logo.  We liked the double-G play in the company name, and geeked out on the space theme.  We looked at lots of pictures of galaxies and nebulae, and tried a few different ideas like interlocking G's, where the top of each G was an arm of a spiral galaxy, another where the G was the center of a galaxy seen edge on, another where the G was made of welded steel plates surrounding a galactic center, etc.

But I kept coming back to the first sketch that he had done to warm up, which was a solar system with the planets aligned to form the cross-bar of the G.  I felt a connection there, and I wanted to go explore the planets, and above all it just looked interesting.

So he played around with that idea, creating a galaxy suggestive of the letter G with some larger stellar objects to create interest:

 We settled on one of the variants with a nice angle, the planets in alignment and a glow in the middle:

That looked nice, but wasn't suggestive enough of the G, and was a little too complex for our purposes.  So Shawn removed the planets and the tail, and thickened up the lines considerably:

This was simple and showed off the two G's nicely, but I didn't really like the bulge on the outer G, so I used my crude art skills to trim it to show what I was thinking, and Shawn came back with half a dozen variants on what I was showing him.

At that point, both Lauren and I had been staring at it for too long to have any reasonable fresh opinions, so I showed the options to some friends who were artists, and they unanimously picked one of them.

So, a little more polish and we have our final logo!

It's clean, simple, and has the double G with a galaxy motif.  Perfect!

Thursday, February 17, 2011

Ninja hacking on the iPhone

I'm tracking down a crash in SDL on the iPhone, and the path is not yet clear to me, but I thought some people would enjoy the view along the way.

The crash itself happens only on the real phone, not on the simulator, and it's a crash initializing an SDL_uikitopenglview, which is a view deriving from SDL_uikitview, which in turn derives from UIView.

The callstack for the crash looks like this:
_class_getMeta ()
_class_isInitialized ()
_class_initialize ()
objc_msgSend_uncached ()
UIKit_GL_CreateContext () at SDL_uikitopengles.m:146

Of course everything past my code is ARM assembly, which makes it a little tricky to debug.  Luckily Apple has published the source to their Objective C runtime, so I can disassemble the functions using gdb and follow along:

First, there are a couple of useful things to know if you're poking around at this level:

The ARM calling conventions are that registers r0 through r3 are for parameters passed into functions, and they correspond to parameters from the left to the right. The return value of the function is also passed back through r0.

The Xcode debugging window has a nice interface with the code right there along with the local variables and registers. On the far right is a button to bring up the gdb console where you can do some pretty advanced things.

gdb quick reference:
b <name> - set a breakpoint at the beginning of the named function
s - go to the next line of code, stepping into function calls
n - go to the next line of code, skipping over function calls
si - go to the next assembly instruction
fin - run until the function returns
c - continue running until the next breakpoint
p <var> - print the value of a variable or register (e.g. $r0, $r1, etc.)
x <address> - examine the memory at an address (gdb also shows the nearest symbol)
display <var> - print the value of a variable or register after each command 
list - list the code around the current execution

Most of these we don't need since the Xcode UI is pretty nice, but a really handy one is 'si', since that will let us step into the assembly and then use the UI to continue tracing the execution.

So first, I set a breakpoint at the line that crashes:
view = [[SDL_uikitopenglview alloc]

Then, I bring up the gdb console and use the 'si' command a few times until I get into assembly, just to see what things look like:

I'm curious what the first parameter to objc_msgSend() is, so I use 'x $r0' and it shows that it's "OBJC_CLASS_$_SDL_uikitopenglview", which is the Objective C class definition for my custom view.

Then I use the 'b' command to set a breakpoint in the _class_initialize() function, and bring up the code so I can follow along with the assembly.  When the breakpoint hits, I step into the first instruction in the function, a call to _class_getNonMetaClass(). I double check r0, and it's still my view class definition, but on return from the function, it's been set to 0!

The code that was executed is this:
static class_t *getNonMetaClass(class_t *cls)
{
    if (isMetaClass(cls)) {
        cls = NXMapGet(uninitializedClasses(), cls);
    }
    return cls;
}
which means that somehow the class for my view didn't get into the map of classes that my program has loaded.

I did a little googling and found that Apple has a set of APIs for managing and interacting with the Objective C classes, and so I wrote a function to print them out and look for anything with SDL in it:
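#include <stdio.h>
#include <stdlib.h>
#include <objc/runtime.h>
#include "SDL.h"

void print_classes(void)
{
    int i, numClasses;
    Class *classes;

    /* Ask for the class count first, then fetch the list itself */
    numClasses = objc_getClassList(NULL, 0);
    classes = malloc(sizeof(Class) * numClasses);
    numClasses = objc_getClassList(classes, numClasses);
    for (i = 0; i < numClasses; ++i) {
        const char *name = class_getName(classes[i]);
        if (SDL_strstr(name, "SDL_")) {
            printf("%s\n", name); // Yay, found it!
        }
    }
    free(classes);
}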
Sure enough, when I run it on the simulator I find the SDL view classes, and when I run it on the device they don't show up. If I use nm on the application binary in the app folder, I see the classes are there, in both the simulator and device binaries:
nm -m Happy | fgrep SDL_uikitopenglview
0008a287 (__TEXT,__text) non-external -[SDL_uikitopenglview context]
000cd06c (__DATA,__objc_data) external _OBJC_CLASS_$_SDL_uikitopenglview
000cd058 (__DATA,__objc_data) external _OBJC_METACLASS_$_SDL_uikitopenglview

So, at this point I know why the crash is happening, but I don't know why the classes aren't being loaded on the device, or how to fix it yet.

Update:  Eric Wing figured this out.  The problem is that the Objective C class definitions were in a static library and the linker wasn't bringing in all the code necessary to construct the classes.  The solution is to add -ObjC to the "Other Linker Flags" setting for your application.

Thanks Eric! :)

Saturday, February 12, 2011

MPEG acceleration with GLSL

Video decoding is something that people are always trying to find ways to accelerate.  Whether it's making HD video more HD or dynamically streaming video to textures in your game, we want it as fast and high quality as possible.

MPEG based codecs have basically two steps which are time consuming, the first is decoding each frame into a YUV colorspace image, and the second is converting the image from YUV to RGB.  There is lots of information available on the MPEG stream decoding and YUV colorspaces, but here I'm going to focus on the YUV to RGB conversion.

To understand how to accelerate this process, we need to understand a little about the YUV format and how the conversion is done.

YV12 images consist of 3 planes, one Y image sized WxH, and a U and V image, sized W/2 x H/2.  Put simply, the Y plane contains the luminance, which can be used alone for grayscale, and the U and V planes contain the blue and red color difference components, one value for each 2x2 block of output pixels.
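
Here's a minimal sketch in C of that layout, assuming a tightly packed frame with even width and height and no row padding. The struct and function names are made up for illustration. One thing worth knowing is that in YV12 the V plane is stored before the U plane.

```c
#include <stddef.h>

/* Hypothetical description of a tightly packed YV12 frame.
 * Assumes even width and height and no row padding. */
typedef struct {
    size_t y_size;    /* bytes in the Y plane */
    size_t c_size;    /* bytes in each chroma plane */
    size_t v_offset;  /* V plane comes right after Y in YV12 */
    size_t u_offset;  /* U plane comes after V */
} YV12Layout;

static YV12Layout yv12_layout(size_t w, size_t h)
{
    YV12Layout l;
    l.y_size = w * h;
    l.c_size = (w / 2) * (h / 2);
    l.v_offset = l.y_size;
    l.u_offset = l.y_size + l.c_size;
    return l;
}

/* Index within a chroma plane of the sample shared by pixel (x, y). */
static size_t yv12_chroma_index(size_t w, size_t x, size_t y)
{
    return (y / 2) * (w / 2) + (x / 2);
}
```

With this layout a frame takes w * h * 3 / 2 bytes in total, and all four pixels of a 2x2 block map to the same chroma index.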

The formula for converting from YUV to RGB is:
R = 1.164(Y - 16) + 1.596(V - 128)
G = 1.164(Y - 16) - 0.813(V - 128) - 0.391(U - 128)
B = 1.164(Y - 16)                  + 2.018(U - 128)
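As a reference point, here is a rough sketch of that formula applied to a single pixel in plain C. The function names are hypothetical, and the rounding and clamping are my assumptions: the formula itself can produce values outside 0..255, so a software converter has to clamp somewhere.

```c
#include <stdint.h>

/* Clamp a float to the 0..255 range and round to the nearest integer. */
uint8_t clamp_to_byte(float v)
{
    if (v < 0.0f) return 0;
    if (v > 255.0f) return 255;
    return (uint8_t)(v + 0.5f);
}

/* Convert one video-range YUV sample (Y in 16..235, U/V centered on 128)
 * to 8-bit RGB using the coefficients from the formula above. */
void yuv_to_rgb(uint8_t y, uint8_t u, uint8_t v,
                uint8_t *r, uint8_t *g, uint8_t *b)
{
    float luma = 1.164f * ((float)y - 16.0f);
    float cu = (float)u - 128.0f;
    float cv = (float)v - 128.0f;

    *r = clamp_to_byte(luma + 1.596f * cv);
    *g = clamp_to_byte(luma - 0.813f * cv - 0.391f * cu);
    *b = clamp_to_byte(luma + 2.018f * cu);
}
```

Doing this per pixel on the CPU means several multiplies, adds and clamps for every output pixel, which is the work the shader takes over.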

The basic idea for the shader is to create three different textures, one for each plane, and pull the Y, U, and V components from each texture and combine them using the above formula into the output RGB values.

The key to optimizing the shader is to recognize that GPU shader hardware is built for massively parallel operations, and that many of the common operations used in 3D math are optimized down to a single cycle.  The goal is then to reduce the number of operations as much as possible.

Looking at the formula, it can be broken down into an offset for each of the YUV components, and then a multiply and add operation on each of them, which conveniently is how a dot product is defined.  So, I simply create constants for each of the operations and put it all together!

varying vec2 tcoord;
uniform sampler2D tex0; // Y 
uniform sampler2D tex1; // U 
uniform sampler2D tex2; // V 

// YUV offset 
const vec3 offset = vec3(-0.0625, -0.5, -0.5);

// RGB coefficients 
const vec3 Rcoeff = vec3(1.164,  0.000,  1.596);
const vec3 Gcoeff = vec3(1.164, -0.391, -0.813);
const vec3 Bcoeff = vec3(1.164,  2.018,  0.000);

void main()
{
    vec2 uv;
    vec3 yuv, rgb;

    // Get the Y value
    yuv.x = texture2D(tex0, tcoord).r;

    // Get the U and V values.  A varying is read-only in a fragment
    // shader, so scale a local copy instead of tcoord itself.
    uv = tcoord * 0.5;
    yuv.y = texture2D(tex1, uv).r;
    yuv.z = texture2D(tex2, uv).r;

    // Do the color transform
    yuv += offset;
    rgb.r = dot(yuv, Rcoeff);
    rgb.g = dot(yuv, Gcoeff);
    rgb.b = dot(yuv, Bcoeff);

    // That was easy. :)
    gl_FragColor = vec4(rgb, 1.0);
}
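To see where the shader constants come from, here is a small CPU mirror of the same math in C. This is only a sketch for checking the numbers, with hypothetical function names. Texture samples arrive normalized to 0..1, so the offsets 16 and 128 become roughly 16/256 = 0.0625 and 128/256 = 0.5.

```c
/* Three-component dot product, the operation the shader leans on. */
float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* CPU mirror of the fragment shader: yuv_in holds samples normalized to
 * 0..1 (as texture2D returns them), rgb receives unclamped 0..1 values. */
void yuv_to_rgb_dot(const float yuv_in[3], float rgb[3])
{
    static const float offset[3] = { -0.0625f, -0.5f, -0.5f };
    static const float rcoeff[3] = { 1.164f,  0.000f,  1.596f };
    static const float gcoeff[3] = { 1.164f, -0.391f, -0.813f };
    static const float bcoeff[3] = { 1.164f,  2.018f,  0.000f };
    float yuv[3];
    int i;

    for (i = 0; i < 3; ++i) {
        yuv[i] = yuv_in[i] + offset[i];
    }
    rgb[0] = dot3(yuv, rcoeff);
    rgb[1] = dot3(yuv, gcoeff);
    rgb[2] = dot3(yuv, bcoeff);
}
```

Because an 8-bit sample is normalized by 255 rather than 256, the results can differ from the integer formula by about one color level, which is not visible for video.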

Now the test!

I used a 1024x1050 image, converted it to YV12 and then repeatedly updated a streaming texture and displayed it on the screen.  I ran this test on a Mac Pro running Mac OS X using both MMX optimized software color conversion and OpenGL GLSL color conversion.

The code is available here:

SDL_RENDER_DRIVER=software ./yuvspeedtest ~/bluemarble2k_big.bmp
Using software rendering
26.83 frames per second

SDL_RENDER_DRIVER=opengl ./yuvspeedtest ~/bluemarble2k_big.bmp
Using opengl rendering
1040.53 frames per second

Using hardware shader acceleration got almost a 39x speedup!

Nearly forty times?! That's right...FORTY! OMG!!!


Friday, February 11, 2011

Streaming textures with SDL 1.3

I was recently asked how to use streaming textures with SDL 1.3, and while it's very simple, I didn't actually find any documentation on how to do it, so here it is!

First, why would you use a streaming texture?

Static textures are designed for sprites and backgrounds and other images that don't change much.  You update them with pixels using SDL_UpdateTexture().

Streaming textures are designed for things that update frequently, every few seconds, or every frame.  You can also update them with SDL_UpdateTexture(), but for optimal performance you lock them, write the pixels, and then unlock them.

Conceptually they're very simple to use:
  1. Call SDL_CreateTexture() with the SDL_TEXTUREACCESS_STREAMING access type
  2. Call SDL_LockTexture() to get raw access to the pixels
  3. Do any pixel manipulation you want
  4. Call SDL_UnlockTexture()
  5. Use the texture in rendering normally.
You can specify any RGB/RGBA or YUV format you want and SDL or the hardware drivers will do the conversion for you on the back end if it's not supported. To get the best speed you'll probably want to create the texture in the first format listed in the renderer info, although at the time of this writing SDL_PIXELFORMAT_ARGB8888 is the optimal format for all renderers. 
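Step 3 is where most mistakes happen, so here is a minimal sketch of writing into a locked ARGB8888 buffer. The function name is hypothetical, and `pixels` and `pitch` stand in for the values the lock call gives you. The important detail is that rows are `pitch` bytes apart, which may be more than width * 4.

```c
#include <stddef.h>
#include <stdint.h>

/* Fill a locked ARGB8888 pixel buffer with a horizontal gray gradient.
 * pitch is the length of one row in bytes and may exceed width * 4. */
void fill_gradient_argb8888(void *pixels, int pitch, int width, int height)
{
    int x, y;

    for (y = 0; y < height; ++y) {
        /* Step by pitch bytes, not by width pixels, to reach the next row. */
        uint32_t *row = (uint32_t *)((uint8_t *)pixels + (size_t)y * (size_t)pitch);
        for (x = 0; x < width; ++x) {
            uint32_t level = (width > 1) ? (uint32_t)(x * 255 / (width - 1)) : 0;
            /* ARGB8888: alpha in the top byte, then red, green, blue. */
            row[x] = 0xFF000000u | (level << 16) | (level << 8) | level;
        }
    }
}
```

You would call something like this between the lock and unlock calls, and the padding at the end of each row is left untouched.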

You might also want to create a surface from the texture pixels if you're doing old style blitting using other SDL surfaces.  You can do this by creating a surface with no pixel data and then filling the pixel and pitch info in later:

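texture = SDL_CreateTexture(renderer,
                            SDL_PIXELFORMAT_ARGB8888,
                            SDL_TEXTUREACCESS_STREAMING,
                            width, height);

surface = SDL_CreateRGBSurfaceFrom(NULL,
                                   width, height,
                                   32, 0,
                                   0x00FF0000,
                                   0x0000FF00,
                                   0x000000FF,
                                   0xFF000000);

SDL_LockTexture(texture, NULL,
                &surface->pixels,
                &surface->pitch);
... draw to surface
SDL_UnlockTexture(texture);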

I put together a very simple example based on the running moose by Mike Gorchak:


Tuesday, February 8, 2011

Fun with shaders!

I just added GLSL shaders to the SDL OpenGL rendering implementation.

On my hardware this ends up being about a 200-400 FPS increase in testsprite2:

4259.55 frames per second

4552.88 frames per second

I also got a modest increase with testsprite, using the old SDL 1.2 API:

1329.16 frames per second

1354.20 frames per second

Woot! :)

I also noticed there's not a single example of using shaders with SDL that has full source code, so I added one:

Enjoy! :)

Saturday, February 5, 2011

Texture Streaming Performance

In my recent SDL 1.3 update I made it possible for the old SDL 1.2 API to be accelerated using texture streaming.

On my system on Mac OS X and Linux, this more than doubled performance!

Mac OS X
  • SDL 1.2 testsprite: 514.22 FPS
  • SDL 1.3 testsprite texture streaming with OpenGL: 1259.62 FPS
  • SDL 1.3 testsprite2 (hardware accelerated): 3865.16 FPS

Linux
  • SDL 1.2 testsprite: 495.48 FPS
  • SDL 1.3 testsprite texture streaming with OpenGL: 1244.55 FPS
  • SDL 1.3 testsprite2 (hardware accelerated): 2556.85 FPS

On my system the Windows performance got worse!

  • SDL 1.2 testsprite using GDI: 1030.71 FPS
  • SDL 1.3 testsprite using GDI: 1077.81 FPS
  • SDL 1.3 testsprite texture streaming with OpenGL: 623.08 FPS
  • SDL 1.3 testsprite texture streaming with Direct3D: 233.97 FPS
  • SDL 1.3 testsprite2 (hardware accelerated with OpenGL): 3027.26 FPS
  • SDL 1.3 testsprite2 (hardware accelerated with Direct3D): 4259.48 FPS

Clearly the Windows GDI drivers are heavily optimized for 2D performance, but why is the Direct3D streaming performance so poor?

Here's what I'm doing for OpenGL:
  Texture format:  GL_RGBA8, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV
  Texture update:

I'm not doing anything fancy with pixel buffer objects, I'm just making sure that my data is in the optimal format for processing by the OpenGL drivers.

Here's what I'm doing for Direct3D:
    Device Setup:
    pparams.BackBufferCount = 1;
    pparams.SwapEffect = D3DSWAPEFFECT_DISCARD;
    pparams.PresentationInterval = D3DPRESENT_INTERVAL_IMMEDIATE;

    Texture create:
    device->CreateTexture(width, height, 1, D3DUSAGE_DYNAMIC, D3DFMT_A8R8G8B8, D3DPOOL_DEFAULT, &texture, NULL);

    Texture update:
    texture->LockRect(0, &locked, NULL, D3DLOCK_DISCARD);
    device->DrawPrimitiveUP(D3DPT_TRIANGLEFAN, 2, vertices, sizeof(*vertices));


For those who are curious, the full code can be found here:

So... does anyone know how to improve Direct3D texture streaming performance?

Thursday, February 3, 2011

New SDL API changes

I've been quiet the last week working on a massive restructuring of the SDL rendering API.  The result is a simpler, easier to use, and easier to port system.

Whew!  Time to sleep! :)

Thursday, January 27, 2011

Pointers and qualifiers

I had forgotten what the rules for pointer qualifiers were, so for the ignorant and forgetful, here they are:

The stuff to the left of the * is what the pointer points at.
The stuff to the right of the * is the qualifiers on the pointer itself.

So... volatile char* volatile * const ptr
is... a pointer that can't be changed, that points to a volatile pointer that points to volatile chars
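Here is a small C sketch that exercises the rule. The function names are made up, and the lines that would be rejected by the compiler are left as comments so the block still builds.

```c
/* Reads the character reached through the declaration from the text.
 * Right to left: ptr is a const pointer, to a volatile pointer,
 * to a volatile char. */
char read_through(volatile char *volatile *const ptr)
{
    /* ptr = 0;  would not compile: ptr itself is const. */
    return **ptr; /* both loads are volatile accesses */
}

/* The same rule applied to const. Returns the final value of a. */
char const_examples(void)
{
    char a = 'a', b = 'b';
    const char *to_const = &a;  /* left of *: the char is const */
    char *const const_ptr = &a; /* right of *: the pointer is const */

    to_const = &b;    /* fine: the pointer itself may change */
    *const_ptr = 'c'; /* fine: the pointed-to char may change */
    /* *to_const = 'x';  would not compile: the target is const */
    /* const_ptr = &b;   would not compile: the pointer is const */
    (void)to_const;
    return a;
}
```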

Benchmark, benchmark, benchmark

I've been meaning to add a yield() function to the SDL API for a while.  I've also been mulling over different ideas to improve the spinlocks.

Here's the current spinlock (pseudo-code):
while (locked) delay(0)

My theory was that if the lock was contended, it's better to retry a few times before sleeping.  I also figured using the OS's thread yield function was a better way of giving up the current timeslice than sleeping for 0 time.

Here was my new spinlock:
if (!locked) return
while (true)
    for (i = 0; i < 4; ++i)
        if (!locked) return
    yield()

Astonishingly, my FIFO benchmark went from 3 seconds to 10 seconds!

Thinking about this a little, it makes sense... the CPU probably doesn't see the value change in as few cycles as the loop is, so all we're really doing is wasting time in the loop.  When there's high contention, as in my FIFO test, this just means the whole system is working harder.

So, I went back to my basic spinlock, but I used a yield instead of delay(0):
while (locked) yield()

Oddly enough, my FIFO benchmark wasn't any faster, in fact it was 4 seconds instead of the original 3 seconds.  Apparently on Mac OS X sched_yield() is a more expensive function than nanosleep().
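For anyone who wants to repeat the experiment, here is a minimal sketch of the yield-based variant using C11 atomics. This is not the SDL implementation: the `SpinLock` type and function names are hypothetical, and it assumes a POSIX system where `sched_yield()` is available.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag locked;
} SpinLock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try once; returns true if the lock was acquired. */
bool spin_trylock(SpinLock *lock)
{
    /* test_and_set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&lock->locked, memory_order_acquire);
}

/* Basic variant from the text: give up the timeslice while the lock is held. */
void spin_lock(SpinLock *lock)
{
    while (!spin_trylock(lock)) {
        sched_yield();
    }
}

void spin_unlock(SpinLock *lock)
{
    atomic_flag_clear_explicit(&lock->locked, memory_order_release);
}
```

Swapping the `sched_yield()` call for a zero-length sleep gives the other variant, so the two can be timed against each other on your own platform.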

My conclusion?
Always benchmark your changes, even if they're obviously an improvement, and don't be afraid to leave well enough alone. :)

Wednesday, January 26, 2011

Digitally Signing Documents

In the course of my business day, I frequently have to sign licensing or tax documents and e-mail them to people.

I've been looking for a paperless way to do this, and I've found a fairly easy way to do it using the GIMP and OpenOffice:

  1. Scan your actual signature and edit it in GIMP, converting the background to transparency using Colors -> Color to Alpha, and then save it as a PNG file.  You may need to rescale your signature to have it fit on documents, depending on the resolution of the original scan.
  2. Download and install the OpenOffice PDF import extension.
  3. Open your PDF document in OpenOffice, which should open in the Draw program.
  4. Select the Insert -> Picture -> From File... menu option and select your signature image.
  5. Move it into place on the document.
  6. Print the document and in the Print menu, select "Save as PDF"
  7. Send it via e-mail!
There's also a really good digital signing service at DocuSign, and we'll probably start using that once there's enough volume to make it worthwhile.

Is Lock-Free worth it?

I've spent a bunch of time the past few weeks learning about lock-free and wait-free algorithms for super fast multi-threaded data structure access.

I've had two motivations for this.  First, I wanted to vet the new SDL atomic API, and second I wanted to potentially increase the performance of the SDL event system.

In the course of learning about lockless programming, I've gained a healthy respect for people who are doing it correctly.  It's incredibly tricky to get right, and even reviewed and published papers can have subtle bugs in them.  There are many ways multi-threaded lockless code can fail, especially on the Xbox 360 where data ordering between CPUs is not guaranteed at all without expensive sync instructions.

All of this has made me wonder if it's a good idea to make a cross-platform atomic API available in the first place.

But that said, let's first see what the benefits might be...

I wrote a lock-free FIFO that modeled the behavior of the SDL event queue, and created a test that could run both lock-free and using a mutex, to simulate the way the current SDL event queue is handled.  I then added a watcher thread and single spinlock to the lock-free version to simulate a thread periodically coming in and having to do heavy duty manipulation of the queue.

The test took 4 writer threads and had each of them put 1 MILLION events on the queue, and then created 4 reader threads to pull them off and process them.  The queue was limited to a maximum of 256 events.  I defined wait states for writers if they tried to add an event and the queue was full, and defined wait states for readers if they tried to get an event and the queue was empty.
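For reference, the mutex side of a test like this can be sketched as a bounded ring buffer guarded by a single lock. This is not the actual test code: the `EventQueue` type and function names are hypothetical, events are plain ints, and it assumes POSIX threads. A failed push or pop is what the test would count as a wait.

```c
#include <pthread.h>
#include <stdbool.h>

#define QUEUE_CAPACITY 256 /* matches the 256-event limit in the test */

typedef struct {
    int events[QUEUE_CAPACITY];
    unsigned head;  /* index of the next event to read */
    unsigned count; /* number of events currently queued */
    pthread_mutex_t lock;
} EventQueue;

void queue_init(EventQueue *q)
{
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
}

/* Returns false when the queue is full; the writer counts that as a wait. */
bool queue_push(EventQueue *q, int event)
{
    bool ok = false;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAPACITY) {
        q->events[(q->head + q->count) % QUEUE_CAPACITY] = event;
        q->count++;
        ok = true;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Returns false when the queue is empty; the reader counts that as a wait. */
bool queue_pop(EventQueue *q, int *event)
{
    bool ok = false;
    pthread_mutex_lock(&q->lock);
    if (q->count > 0) {
        *event = q->events[q->head];
        q->head = (q->head + 1) % QUEUE_CAPACITY;
        q->count--;
        ok = true;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}
```

Every push and pop takes the same lock, so eight threads hammering on the queue spend most of their time contending for it. That contention is what the lock-free version avoids.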

I ran the test on a 4-core Mac Pro (with 8 hardware threads), on a mostly idle system.

The results were astonishing!

FIFO test---------------------------------------

Mode: Mutex
Finished in 37.097000 sec

Writer 0 wrote 1000000 events, had 0 waits
Writer 1 wrote 1000000 events, had 0 waits
Writer 2 wrote 1000000 events, had 0 waits
Writer 3 wrote 1000000 events, had 0 waits

Reader 0 read 999998 events, had 50 waits
Reader 1 read 1000003 events, had 23 waits
Reader 2 read 999998 events, had 13 waits
Reader 3 read 1000001 events, had 7 waits

FIFO test---------------------------------------

Mode: LockFree
Finished in 0.688000 sec

Writer 0 wrote 1000000 events, had 5308 waits
Writer 1 wrote 1000000 events, had 5263 waits
Writer 2 wrote 1000000 events, had 5348 waits
Writer 3 wrote 1000000 events, had 5314 waits

Reader 0 read 1023604 events, had 7838 waits
Reader 1 read 991976 events, had 7688 waits
Reader 2 read 981770 events, had 7700 waits
Reader 3 read 1002650 events, had 7893 waits

As you can see, the lock-free version was over 50 times faster than the mutex version!

On the other hand, the mutex protected queue was able to process an event in 9 microseconds, which is well within the design specs for the SDL event queue. :)

In conclusion, lock-free algorithms are a huge benefit for code that needs to be extremely high performance and scale well across many processors.  Applications of this might be fast transaction network servers, databases, or massively parallel data processing.


Oh, and if you're not afraid yet, think about what might happen if you're managing dynamically allocated objects with an atomic reference count...

Saturday, January 22, 2011

Weekends are important!

This past week was the first week in a three week sprint I'm on to get SDL in a releasable state.  Realistically I know there will be some bigger tasks that will still need to be done, but I feel like I'll have most of the important smaller tasks and bugs and polish taken care of.  However there's a huge amount of work to be done, and to hit this goal I've been working almost non-stop for the past week.

Ahhh, weekend!

The weekend is hugely important, and I'm planning to take my weekends as family and planning time.  They give me a chance to slow down, relax, smile, and get perspective.  If I'm going so fast that I don't have time to relax on the weekend, I'm doing something fundamentally wrong and need to rethink things.

Ahhhhh. :)

/goes to hug his wife...

Friday, January 21, 2011

Multiple Gmail accounts

I finished setting up Galaxy Gameworks with Google mail and apps services today, and was immediately frustrated by the inability to have my business e-mail and my personal e-mail open at the same time.

I did a little digging and found that Google has indeed made it possible to have Gmail and the core web apps open on two different accounts.  However, setting it up is a little tricky.

You can turn on this feature by going to Settings -> Accounts -> Google Account settings, and then editing the “Multiple sign-in” setting.  Once you have done this and refreshed your browser, you can click the downward arrow next to your account name in the upper right and log in with another Gmail account.

Voila! :)

Keeping the "Simple" in Simple DirectMedia Layer

SDL serves three types of video API users:
  1. People who just want a framebuffer and use SDL 1.2 blit and direct pixel access to do the job.
  2. People who just want an OpenGL context, and use OpenGL or OpenGL ES to do the job.
  3. People who want a hardware accelerated 2D API

For 1 and 2, the functionality was available in SDL 1.2 and the feature set is well understood.

For 3, this is a new area that SDL 1.3 is supporting, and it's really easy to lose the "Simple" in "Simple DirectMedia Layer"

So I think here is a good time to remember that the goal of the SDL rendering API is simply:

Hardware accelerate operations that were typically done with the SDL 1.2 API.

This is tricky, since people did lots of interesting things with direct framebuffer access, but let me break it down into a feature set:
  • copying images
  • filling rectangles
  • drawing single pixel lines
  • drawing single pixel points

SDL 1.2 provided colorkey, alpha channels, and per-surface alpha, as well as nearest pixel scaling.

Again, to break that down into a feature set:
  • blending for all operations
  • single vertex alpha for all operations
  • single vertex color for all operations
  • scaling for image copy operations

It's tempting to add functionality here, but that way lies madness...

Thursday, January 20, 2011

Organization is key!

As I'm starting to grow Galaxy Gameworks as a business, I'm finding that there are so many things to think about and keep track of that it's easy to feel overwhelmed.  I understand now why executives have personal assistants - they are essentially extra brains to handle all the myriad details involved in the daily business.

I don't have a personal assistant (my wife offered to help, but she's starting a business of her own and needs her own personal assistant!) but I do have a computer.  A computer is only as useful as you make it though.  You know the old saying, "Garbage In, Garbage Out".  I used to work for the government in Sacramento doing system admin and desktop support long ago, and I met many high level executives who had a computer and no idea how to use it.  Fortunately, I am computer savvy and can make my computer my b.. er, personal assistant. :)

Most of my communication is done through e-mail - licensing, bug reports, employee coordination, TODO notes to myself, social events, etc.  Up until now I haven't had time to organize my e-mail, so I just have a giant pile of messages where things easily get lost or left to sit forever.

This is bad.  Not only does it mean that important things are left untended to, but that I always have a low level of stress about not getting everything that I need to do done.  Worse, it means I don't even know what those things are, so I can't compartmentalize them and know that I'll get to them later.

So here are my goals:
  • Respond in a timely fashion to people
  • Know what needs to be done
  • Know when things need to be done
  • Make sure that everything has been addressed, either by trashing it or replying to it or prioritizing it on a TODO list.
  • Feel good because I'm organized and responsible. 

People are important.  They tend to view their importance to you directly in proportion to your responsiveness.  If you respond quickly, you look professional and caring.  You can develop good relationships and get help with what you're doing.  If you don't respond, you often look like a jerk and people will look elsewhere for friendship or business.

Organization is important.  There's always more to do than time to do it in, and the only way you can find out what the important things are and prioritize them is if you have a good understanding of what all needs to be done.  In my case, I can only guarantee that I've handled and organized everything if I keep my inbox clear.

So I know what I want to do, but how do I get there?

Gmail to the rescue!

Gmail has a really powerful message labeling feature.  I spent a good amount of time this morning creating a set of labels that are intuitive and really helpful.  With too many labels, I can't remember which label things go in or how I'm going to handle them.  With too few labels, things get clumped together and I don't get any organizational benefit.  Labels can have "/" in them and I use that to do logical groupings, which conveniently turn into folders on my iPhone mail program.  I also set up some message filtering rules to automatically add labels to certain types of messages, optionally bypassing my inbox entirely!

Gmail also has the ability to show and hide different labels in the sidebar, and I used that to keep categories of mail that bypass my inbox or that I need to deal with frequently available at a glance.

Gmail also has a "star" flag, which I use to mark messages for followup.  I can then click a link to bring up all my starred messages, and they show up in a list, along with their labels so I can quickly go through them and address them.

Finally, Gmail comes with a little task list.  I haven't used that much yet, but I'll give it a little spin and see if it's useful.

All of this organization takes some time up front, but the payoff is that I know what needs to be done, can do it more efficiently, and have peace of mind at the end of the day.

Suggestions welcome! :)

Wednesday, January 19, 2011

Learning is fun!

One of the things that's really fun about developing SDL is the incredible breadth of things that you learn in the process.
  • A couple weeks ago I went from knowing nothing about Android development to being very comfortable with activity life cycles and JNI.
  • Last week I dove into the dragon infested waters of atomic operations and lock free programming, and in the process was humbled and gained a healthy respect for people who do that for a living.  
  • This morning I knew that CSS was used to affect the look and feel of web pages.  By the end of the day I had practical lessons on how it works and affects the style and behavior of a site.
I wonder what I'm going to learn tomorrow .... ? :)

Making website feedback easy

The last few days I've been working with my wonderful documentation guru, Sheena, to streamline the process of creating and contributing to SDL documentation.  We needed a way for people who want to contribute to the documentation effort to share information and coordinate their efforts.  We also wanted to make it extremely easy for people using the documentation to suggest improvements or request clarification.  We're using a wiki, but have locked down permissions to avoid the plague of spammers that abound on the Internet today.

Creating a way for active contributors to coordinate was easy.  We're hosted on Dreamhost, and they make it painless to set up a new mailing list, so we redirected to a new mailing list for contributors.

Creating a way for casual readers to contribute and give feedback was more difficult, especially since I wanted to make giving feedback inviting but not require Sheena to add markup to each page.  I wanted to make it possible to give feedback without following a link to another page, so the person would have the context of the page they were on for giving great feedback.  I also wanted to make sure that it was obvious how to give feedback, but also didn't interfere with the page content.

With these requirements in mind, I set off in search of the perfect feedback system!  After a short search, I found something that looked pretty promising:

It was pretty much self contained, which was a plus in terms of integrating it into the wiki site, and did a good job at being accessible without keeping an entire form on screen.  However once we added it to one of the wiki pages, a few problems became immediately apparent.  First, the color and general style didn't match that of the documentation site.  That's not too bad, a little photoshop time and adjusting some of the CSS sizes and offsets, and we're done.  A bigger problem was that it was positioned at the lower right, where it continually obscured important information on the page.  We looked at a couple of ideas, but the one that really made sense was to put it in the upper right.  A little looking at the CSS and Javascript code, adjusting some offsets and flipping the arrow toggle state, and we're done!

... or so I thought.

It worked perfectly in my test file on Firefox, but when I opened it in Internet Explorer it was complete garbage.  Not only was it not anchored in the upper right, but elements were badly overlapping and the button didn't animate anything.  To make matters worse, when I moved the feedback form from my test document to the real wiki page and viewed it in Firefox, the sliding panel was neatly anchored already out, instead of tucked up out of sight.

Okay, one problem at a time... first, what's the difference between the test file and the wiki page?  After putting some tests together and scrutinizing the page source, I realized that the wiki page had a <!DOCTYPE> element while my test page was hand coded HTML with no doctype.  It turns out that Firefox is smart: if a page specifies a certain flavor of markup in its doctype, then Firefox renders it according to that standard.  If a page doesn't specify a doctype, then Firefox will enable quirks mode, which is essentially the combined experience of lots of web developers making pages work.  Now that I know that, it's just a matter of looking up the relevant section in the standard and fixing it, right?

... of course not. :)

It turns out that the technique I was using, negative margin values, was perfectly acceptable and compliant.  After playing with it a bunch, I could never get the panel to move regardless of what margin values I used.  I shelved the problem and went back to Internet Explorer to see what I could do there.

<montage scene of me completely deconstructing the HTML and CSS and rebuilding it a piece at a time>

So after playing with it a lot, I realized that using headers and spans mixed with div markers was completely hosing Internet Explorer.  I carefully rebuilt the form using only <div> and images and after a few trials, voila!  It was anchored correctly, the sliding panel was initially out of sight, and then popped down into the correct location when I clicked the arrow.  I quickly switched back to Firefox and Safari and they were happy as clams.

Now that I had a working feedback form, I had to figure out how to integrate it seamlessly into the MoinMoin wiki.  I poked through lots of python code for generating the sidebar and implementing themes and found that someone had thoughtfully created a way to add custom markup to the footer of each page.  I just set the variable "page_footer1" in to the feedback form markup, and the feedback form was now on each page of the wiki.

For the final touch, I wanted the feedback to go to the documentation mailing list and include which page the feedback was coming from.  A few quick edits of submit.php and we are off and running!

You can see the final result in action at, and the code is available for your perusal at  Feel free to use it however you like, and let me know if you find a way to improve it!



I'm working full time on Simple DirectMedia Layer now, and in the process I'm learning so much about lots of new topics.  I've been really impressed at how much good information I've found on people's blogs, so I thought I'd start one and both show my ignorance and help others who are following in my footsteps.

Cheers! :)