Why We Have File Size Problems

When working on any type of project there are always trade-offs to be considered. One that we ran into early on was art quality and quantity versus file size. When developing for the iPhone it is best to stay under 20 MB so that users don't have to be connected to Wi-Fi to purchase your game. When developing a sprite-based game it is easy to run up against that wall very quickly.

Steve hadn't been forced to deal with the painful, nitty-gritty realities of how computers work before, so when we bumped up against the size limit it led to some initial frustration. Steve yelled, "Why are there crazy 3d games on the iPhone, but we are running into trouble with simple 2d sprites?" as he threw excrement at me from his banana tree.

It was a fair question. Unfortunately, appreciating the answer takes a bit of technical knowledge. What follows is the email I sent to Steve explaining images (with some parts edited/redacted). Those familiar with the technical details will spot a few exaggerations, oversimplifications, inaccuracies, and bends in the space-time continuum. This isn't meant for you. If, however, you are working with someone on a project and they want to know why they can't have hundreds of 2048x2048 pixel 32-bit images, then send them this.


From: Chief Tacology Officer (I make them)

To: Co-Consumer of Tacos (He eats them)

Date: Back When We Started This Thing

Subject: Why We Have File Size Problems

I figured an explanation of the problem we have might help.


Images are basically a big map of what color is at what pixel. So if I had a 4x4 pixel image, the data for the image would look like this:

red red red red

red red red red

red red red red

red red red red

Since computers are dumb, and we have an insane number of colors (16777216), that actually looks more like:

16711680 16711680 16711680 16711680

16711680 16711680 16711680 16711680

16711680 16711680 16711680 16711680

16711680 16711680 16711680 16711680

16711680 is the numeric representation of the brightest red. The problem gets even worse since we have to take into account the fact that all of these colors come in various levels of transparency (bringing us to a whopping 4,294,967,296 colors):

4294901760 4294901760 4294901760 4294901760

4294901760 4294901760 4294901760 4294901760

4294901760 4294901760 4294901760 4294901760

4294901760 4294901760 4294901760 4294901760

This is for a 4x4 pixel image. Many of our images are several thousand pixels in each direction. I am pretty sure my email would explode if I tried to demonstrate.
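For the curious, here is a rough Python sketch of where those big numbers come from. It assumes each channel (alpha, red, green, blue) is a value from 0 to 255 and that they are packed with alpha in the top byte; that ordering is my assumption, chosen because it reproduces the numbers above. The function names are made up for illustration.

```python
def pack_rgb(red, green, blue):
    """Pack three 0-255 channels into one 24-bit number."""
    return (red << 16) | (green << 8) | blue


def pack_argb(alpha, red, green, blue):
    """Pack four 0-255 channels into one 32-bit number, alpha in the top byte."""
    return (alpha << 24) | (red << 16) | (green << 8) | blue


brightest_red = pack_rgb(255, 0, 0)
opaque_red = pack_argb(255, 255, 0, 0)

print(brightest_red)       # 16711680
print(opaque_red)          # 4294901760
print(256 ** 3, 256 ** 4)  # 16777216 4294967296 (how many colors fit)
```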

Now so far we are talking about numbers and have avoided how that relates to image size. Well each one of those colored pixels is 4 bytes. If we saved images in the simplest of ways, then we could get image size by multiplying height and width and then multiplying the product by 4.

4x4x4 = 64 bytes!!!!

2048x2048x4 = 16,777,216 bytes = 16,384 KB = 16 MB !! Holy Crap!
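The same arithmetic as a small Python sketch, in case you want to plug in other image sizes. The function name is hypothetical, and it assumes the naive "4 bytes for every pixel" storage described above.

```python
BYTES_PER_PIXEL = 4  # one byte each for alpha, red, green, blue


def raw_image_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of an image stored the naive way: every pixel written out in full."""
    return width * height * bytes_per_pixel


tiny = raw_image_bytes(4, 4)
big = raw_image_bytes(2048, 2048)

print(tiny)                     # 64
print(big)                      # 16777216
print(big // 1024)              # 16384 (KB)
print(big // (1024 * 1024))     # 16 (MB)
```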

There is good news, there is bad news, and there is some more semi-bad news.


Lots of smart people have spent years figuring out really intelligent ways to store the same amount of information in less space. If the same color appears in many many places in the same image there are ways to use less room than the naive way I first showed you. It is even better if there is a lot of the same color next to each other. A very simplified explanation follows:

4 x 4294901760

4 x 4294901760

4 x 4294901760

4 x 4294901760

Basically the above structure says, repeat this color for the next 4 pixels. It would use approximately 20 bytes instead of 64 bytes. If we express the bounds of the image first we can do even better.

4 x 4

16 x 4294901760

This would use around 7 bytes. FYI, this isn't even close to the actual method they use, but it demonstrates that there are ways to express the same image with less data. If you want to jump off the deep end, read about DEFLATE (ah that's the good stuff).
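The "repeat this color N times" idea is usually called run-length encoding. Here is a minimal Python sketch of it, not what real image formats do. The byte estimate at the end assumes 1 byte per repeat count, 4 bytes per color, and 1 byte for each of the two image dimensions; those sizes are my assumptions, picked to match the rough figures above.

```python
def run_length_encode(pixels):
    """Collapse a flat list of pixel values into [count, value] runs."""
    runs = []
    for pixel in pixels:
        if runs and runs[-1][1] == pixel:
            runs[-1][0] += 1          # same color as the previous pixel
        else:
            runs.append([1, pixel])   # a new color starts a new run
    return runs


def run_length_decode(runs):
    """Expand [count, value] runs back into the full list of pixels."""
    pixels = []
    for count, value in runs:
        pixels.extend([value] * count)
    return pixels


OPAQUE_RED = 4294901760
image = [OPAQUE_RED] * 16             # the 4x4 all-red image, flattened

runs = run_length_encode(image)
estimated_bytes = 2 + len(runs) * (1 + 4)

print(runs)             # [[16, 4294901760]]
print(estimated_bytes)  # 7
```

Decoding the runs gives back exactly the original 16 pixels, so nothing is lost; the image is just written down more cleverly.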


These methods work best with big blocks of solid color, but they start to fall apart on gradients or more complex textures. They still make the image much much smaller than the worst case scenario, but they aren't even close to the best case scenario. This is why the textured version is 10 times the size of the clean version.
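You can see the difference with Python's built-in zlib module, which implements DEFLATE. This sketch compresses a solid red 256x256 image and a same-sized block of random noise. The noise is a stand-in for a busy texture and is deliberately a worst case, so real art lands somewhere in between. The image size and byte order are arbitrary choices for the demo.

```python
import os
import zlib

WIDTH = HEIGHT = 256
BYTES_PER_PIXEL = 4

# Solid color: the same 4 bytes repeated for every pixel.
solid = bytes([255, 0, 0, 255]) * (WIDTH * HEIGHT)

# Random noise: every byte is unpredictable, so there is nothing to exploit.
noisy = os.urandom(WIDTH * HEIGHT * BYTES_PER_PIXEL)

solid_size = len(zlib.compress(solid))
noisy_size = len(zlib.compress(noisy))

print("raw size:        ", len(solid))
print("solid compressed:", solid_size)
print("noisy compressed:", noisy_size)
```

The solid image should shrink to a tiny fraction of its raw size, while the noise barely shrinks at all.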


No matter what tricks are used to store the files on disk, when the phone actually displays the images it cannot use the compressed version. This means that it reads the compressed version off of the iPhone's hard drive and as it loads it into memory it decompresses the image. The nice small imaginary 7-byte image that we managed to create returns to its 64-byte size in memory. That 2048x2048 image becomes 16MB in memory (RAM). Ouch!


This giant amount of memory usage is why we are forced to use fewer colors during gameplay. In some instances we have multiple 2048x2048 images moving around. That is a huge problem on devices with limited amounts of memory, aka iPhones.

The 4 bytes that each pixel takes up is because we want to use the full range of color. If we use a reduced color set then we can use 2 bytes per pixel. This effectively halves the amount of memory we use and prevents the iPhone from having a mental breakdown. Black and white graphics anyone?
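One common way to fit a color into 2 bytes is to keep only the top few bits of each channel. The Python sketch below uses a 5-bit red, 6-bit green, 5-bit blue layout as an example; that particular layout is an assumption for illustration, not necessarily the one our game uses, and it leaves no room for transparency (a format with transparency would have to split the 16 bits four ways instead).

```python
def to_rgb565(red, green, blue):
    """Squeeze three 0-255 channels into 16 bits: 5 red, 6 green, 5 blue."""
    return ((red >> 3) << 11) | ((green >> 2) << 5) | (blue >> 3)


def memory_megabytes(width, height, bytes_per_pixel):
    """Memory needed for an uncompressed image, in MB."""
    return width * height * bytes_per_pixel // (1024 * 1024)


red_16bit = to_rgb565(255, 0, 0)
full_color_mb = memory_megabytes(2048, 2048, 4)
reduced_color_mb = memory_megabytes(2048, 2048, 2)

print(red_16bit)         # 63488
print(full_color_mb)     # 16
print(reduced_color_mb)  # 8
```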


3D games don't store images of everything in the game. If they did, they most certainly would take up millions of terabytes. They would not only need an image of a tree, but an image of a tree from every angle that it could be viewed from, every lighting condition it could be under, every frame of animation of its branches blowing in the wind. The tree would take up more space than our entire game.

Instead, 3D games store a few small images and instructions on how to draw bigger images based on them:

drawTriangle <insert triangle position> 

paintWith <some shade of green>

textureWith <some file with a tree texture, some algorithm that does magic>

This looks more like code because it is (well sort of). 3D modeling programs can output a few different things:

Images - a single viewpoint of an object, follows the same rules of size as any other images

Video - a series of images

3d Data Files - a bunch of rules (code) that another program can read to draw the output of the modeling program

3d games read the 3d data file to draw the images themselves. This greatly increases the complexity of the code and the amount of CPU used to do the calculations, but it makes it possible to draw amazing things without using absurd amounts of disk space.


I don't think the above explanation does an adequate job of demonstrating just how much space you save by giving the program instructions to draw, rather than saving the images. So I think it is better to compare the size of the instructions (code) and images in REDACTED GAME REFERENCE to a full resolution video.

Code & Images < 12MB

34 Second Video = 72MB

Given that the video is only one particular path the game could have taken, and there are near enough to an infinite number of possible images that that one play could generate, I think it is fair to say that by programming rather than generating images for every possibility we saved several terabytes of space.


Nope, it sure isn't. We have sacrificed size and freedom for simplicity. It is easy for us to build a sprite-based game using tools that you are familiar with. This also allowed us to build an awesome game in less than the N+ years it takes to build a serious 3d title (and they have much larger teams). We have some limitations, but we will get to make something awesome in a short amount of time.


We are currently saving everything as images, which as I explained above takes up a lot of space. It would be awesome if we could save instructions for the iPhone to draw things smartly instead. Luckily for us, the clean look facilitates that (Editor's Note: What clean look? You have to wait and see). Some nice guys invented something called SVG (Scalable Vector Graphics). I believe Flash can export things to SVG (Editor's Note: Apparently I was wrong). SVG is not a list of pixels, it is a list of instructions.

There is no built-in way to make the iPhone follow the list of instructions inside of an SVG, but I could write code that will read an SVG and tell the iPhone what to do. It will take time, possibly a few weeks, maybe a month. If I do it, we would save an absolutely absurd amount of space. We could have REDACTED and add in REDACTED and eat millions of tacos!
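To give a feel for "a list of instructions", here is a hand-written SVG for a solid red 2048x2048 square, with a few lines of Python that read it back using the standard library's XML parser. The SVG is a toy made up for this example and the printout is only a stand-in for real drawing code; an actual reader would have to handle far more than one rectangle.

```python
import xml.etree.ElementTree as ET

# A complete (if boring) SVG: one instruction that says "draw a red rectangle".
svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" width="2048" height="2048">'
    '<rect width="2048" height="2048" fill="#ff0000"/>'
    '</svg>'
)

svg_bytes = len(svg.encode("utf-8"))
raw_bytes = 2048 * 2048 * 4  # the same picture stored pixel by pixel

print("SVG size:", svg_bytes, "bytes")
print("raw size:", raw_bytes, "bytes")

# Walk the instructions; a real renderer would draw instead of print.
instructions = []
for element in ET.fromstring(svg):
    shape = element.tag.split("}")[-1]  # strip the XML namespace prefix
    instructions.append((shape, dict(element.attrib)))
    print("draw", shape, "with", dict(element.attrib))
```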


Images take up lots of room. 3D uses magic. SVG might be our savior. (Editor's Note: Turns out reading SWF directly was.)

**** Almost everything in here is oversimplified to the point of not being 100% accurate (because there are a million if/thens and contingencies, etc, etc), but I would rather work on our game than write a technical document. ****


Editor's Notes:

It's been a while since I wrote the above, and in the interim I have found code that will help with reading SWF files directly (Steve has worked mostly inside of Flash for this project). It doesn't support everything we need, but it is open source, so I am updating it so that it does. There is a decent chunk of work ahead to bring the level of support up to where we want it, but it looks like we will be getting around this roadblock soon.