So you’re curious about the math involved in graphics programming. Well, someone is. People ask me about this stuff now and again. But a subject like “the math of graphics programming” is large enough to fill one of those great, big reference books that nobody likes to use because they’re so heavy and they take up so much space.
Years ago I had a devil of a time learning this for myself. Most tutorials are dense with Greek symbols and mercilessly heavy on jargon. It’s one of those situations where you have to already understand the subject before you can comprehend the lessons that are supposed to teach you the subject. So I offer my own explanation here for the curious. This won’t teach you everything you need to know, but it might help give you a frame of reference so the other lessons make sense. Or it might just be a fun way to blow five minutes. Whatever.
Naturally, making computer graphics starts with things that have x, y, and z coordinates. Even in 2D games. (Remember that 2D games are called this because the gameplay is 2D. You still have backgrounds and other elements that need to be drawn in the distance. You’ve got scenery behind the main character, the game pieces, the Pokemans, or whatever else you’ve got going on there in your little game.)
It’s a 2D game, but there’s still depth as far as the programmer is concerned. 
Observe programmers sometime and you’ll notice that there is no right way to do something, only a dozen wrong ones with varying drawbacks and annoyances. In 3D, one of those annoyances is in deciding what x, y, and z mean. X is the one variable we mostly agree on. X is for east / west positioning, although nobody can say for sure if positive x should go east and negative x should go west, or the other way around. It gets even trickier with y and z, since we can’t agree on which one is up and which one controls north / south movement, and which way they point.
These distinctions are not meaningless. Even in a game where the player doesn’t have a sense of which way is objectively “up”, the level designers had one in mind when they built the thing. Even if the player doesn’t think of the world in terms of up / down, left / right, the artists and designers did, if for no other reason than they needed a common language with which to discuss their work.
In Child of Eden, you don’t really have a sense of direction. But the designers did. 
If you’re writing a game, it’s very, very common to have some sort of variable to hold these three values in one handy package. Confusingly, this package is usually called a vector. (It’s been called a vector in every project I’ve worked on that had a name for it.) This is confusing because x, y, z might be used as a vector or as a position. But I suppose “vector” is a better name than “three numbers thing” or “ecks why zee” or even “Donald”.
So drawing a gameworld involves setting up some kind of space with some agreed-on orientation. Then you need to make 3D stuff move around in that space. For that we need a Transformation Matrix. It might sound like some technobabble made up by the Wachowski Brothers, but it’s a real thing. “Transformation” for changing a set of numbers into another set, and “matrix” because we use a grid of values to do it.
So we’ve got some geometry. We’ve created this cool 3D model of the player’s avatar. We’ve got all these 3D vertex positions that define the shape of this character. His head is a box, which has 8 corners. We draw a pair of triangles (in modern graphics, everything is always triangles – even rectangles are made from pairs of triangles) for each face of that cube. So, 12 triangles in all. The body is another cube, each of the arms is another, and the legs are two more. The Minecraft Guy is made from six cubes, which means 72 triangles, which means 48 total vertex positions. We can just send those positions off to be rendered and call it a day.
Except, players are notoriously impatient. As nice as your model is, sooner or later these brats are going to want to move that guy around. Great. Thanks players. So the player tries to walk forward, or maybe turn to one side, and suddenly we’ve got to do something to this 3D character to reflect that. For that, we use the Transformation Matrix, which looks like this:
1  0  0 
0  1  0 
0  0  1 
This grid of numbers allows us to transform the 48 positions from local space to world space, which is programming talk for “move that guy around”.
It works like this:
The first row is for calculating our x position. Multiply our current x times the value in the first box. Multiply y times the value in the second. Multiply z times the value in the third. Now add those three numbers together. That will be the new, translated x position.
Similarly, the second row is for calculating y. Multiply our current x times the value in the first box. Multiply y times the value in the second. Multiply z times the value in the third. Now add those three numbers together. That will be the new y position.
Finally, the last row is for calculating z. Multiply our current x times the value in the first box. Multiply y times the value in the second. Multiply z times the value in the third. Now add those three numbers together. That will be the new z position.
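Those three rows can be written out directly in code. Here is a minimal C++ sketch (the `Vec3` struct and the `transform` function are my own names for illustration, not from any particular library):

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// m[row][col]: row 0 produces the new x, row 1 the new y, row 2 the new z,
// exactly as described above: multiply across the row, then add the results.
Vec3 transform(const float m[3][3], Vec3 in) {
    Vec3 out;
    out.x = in.x * m[0][0] + in.y * m[0][1] + in.z * m[0][2];
    out.y = in.x * m[1][0] + in.y * m[1][1] + in.z * m[1][2];
    out.z = in.x * m[2][0] + in.y * m[2][1] + in.z * m[2][2];
    return out;
}
```

(Whether x marches across a row or down a column depends on whose convention you follow; this sketch follows the row-wise description above.)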
If you are very clever you will notice that I just did:
x = x * 1 + y * 0 + z * 0
And if you’re super-clever and have a bit of algebra, you might notice that the above operation does precisely nothing. The value of x will not change at all. Like the famous cheese shop sketch, I’ve just deliberately wasted your time. Likewise the second row will do a bunch of math and set the value of y to what it was when you started, and the third row will work out that z should be… z. Hm.
The matrix I gave above is called the “identity matrix”. When the grid is all zero except for a diagonal of one, then the output is going to be the same as the input. Don’t worry, the computer doesn’t mind the senseless work.
But if we change the matrix to this:
0  0  1 
0  1  0 
-1  0  0 
Now if we crank our values through this, x will end up with the value of z, and z will end up with the value of negative x. Assuming y is used for up / down, this will have the result of making your avatar turn 90 degrees to one side.
Crank all 48 vertex positions through this, and the Minecraft guy is now facing (say) east. (This is just an example, I don’t actually know how Notch sets up his coordinate system.)
Usually the matrix is not filled with nice, neat numbers like ones and zeros. See, you use sine and cosine to move these values around, so…
0.984808  0  0.173648 
0  1  0 
-0.173648  0  0.984808 
Just happens to be the matrix that would rotate our guy ten degrees to the right.
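If you want to build that matrix for any angle, it’s just cosines on the diagonal and sines off it. A sketch (assuming y is up; which sine gets the minus sign is a convention, and flipping it rotates the other way):

```cpp
#include <cassert>
#include <cmath>

// Fill m with a rotation of `degrees` around the up (y) axis.
void rotationY(float degrees, float m[3][3]) {
    const float radians = degrees * 3.14159265f / 180.0f;
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    const float r[3][3] = {
        {  c, 0, s },
        {  0, 1, 0 },
        { -s, 0, c },
    };
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            m[i][j] = r[i][j];
}
```

Call it with 10 and you get the 0.984808 / 0.173648 values shown above.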
Done? Not quite. See, he’s still standing at the exact center of the world, and this will eventually bore players. He needs to move as well as rotate. I know the matrix I’ve been working with is a 3×3 grid of numbers, but a proper matrix is 4×4. I just lied to you earlier in hopes that you would keep reading instead of getting scared and running off. Here is a proper identity matrix:
1  0  0  1 
0  1  0  1 
0  0  1  1 
0  0  0  1 
The yellow boxes (the last number in each of the first three rows) are used for scaling. So, after you go across that first row and figure out the value of x, you multiply the result by that last yellow box. Most of the time this box is simply set to one, and the result does not change. However, you could change this number to (say) ten, and suddenly the object being transformed will become ten times larger.
The green boxes (the first three numbers of the bottom row) are used for translation, which is fancy programmer talk for “moving”. (Translation is moving something from one place to another. Transformation is the process of moving, rotating and scaling, combined.) This is a bit odd, but once you’re done with x on the first row you add the first green box to the result.
The red box (the bottom-right corner) is not used. By convention, people put a one in that box, just because it makes it seem like the thing has purpose, but setting this value at all is the equivalent of vacuuming behind the couch where nobody will ever, ever see. You can put in zero, or last night’s lottery number, or whatever else strikes your fancy, because no matrix math uses it. I’m personally a fan of not setting it to any value in particular, which means it will be filled with random trash values.
And no, I don’t move the couch when I vacuum.
So to calculate x, you do:
x = x * 1 + y * 0 + z * 0
Then multiply the result by the yellow box. Then add the result to the first green box.
For y, you run the numbers on the second row, and add the second green box. z is the third row, added to the third green box.
For the curious, in C++ the code might look like this:
```cpp
GLvector glMatrixTransformPoint (GLmatrix m, GLvector in)
{
  GLvector out;

  out.x = M(m,0,3) * (M(m,0,0) * in.x + M(m,1,0) * in.y + M(m,2,0) * in.z) + M(m,3,0);
  out.y = M(m,1,3) * (M(m,0,1) * in.x + M(m,1,1) * in.y + M(m,2,1) * in.z) + M(m,3,1);
  out.z = M(m,2,3) * (M(m,0,2) * in.x + M(m,1,2) * in.y + M(m,2,2) * in.z) + M(m,3,2);
  return out;
}
```
Crank all of our 48 vertex positions through that sucker, and the Minecraft guy will be able to move around the world and rotate, just like your favorite videogame characters. In this way, our matrix can be thought of as a magic box. Put stuff in, and it comes out in a different location and facing a different direction. It’s a bit like the prism crates in the upcoming Portal 2:
Link (YouTube) 
We can then stop worrying about the math under the hood. Just rotate this matrix (by messing with the 3×3 grid of numbers) or translate it (by changing the values of the green boxes) and crank our vertex positions through it.
But Shamus, the arms and legs move on the Minecraft guy. How does your matrix thing do THAT?
…
You savage bastard. We’re 1,700 words into this explanation and now you’re going to ask a question like that? What kind of monster are you? Sigh. Okay. Basically, there’s another matrix for each of the moving parts. So you crank the arms through the full-body “avatar” matrix, then you run them through the “arm” matrix. One matrix can be nested in another.
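Mechanically, “nesting” just means running a vertex through more than one transform in sequence. A sketch (my own struct and function names; I’m using the 3×3 grid plus a separate translation rather than a full 4×4, and real engines differ on how they compose these):

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// A 3x3 rotation grid plus a translation, as described in the article.
struct Transform {
    float m[3][3];
    Vec3 t;
};

Vec3 apply(const Transform& tr, Vec3 in) {
    return {
        in.x * tr.m[0][0] + in.y * tr.m[0][1] + in.z * tr.m[0][2] + tr.t.x,
        in.x * tr.m[1][0] + in.y * tr.m[1][1] + in.z * tr.m[1][2] + tr.t.y,
        in.x * tr.m[2][0] + in.y * tr.m[2][1] + in.z * tr.m[2][2] + tr.t.z,
    };
}

// Crank an arm vertex through the whole-body transform, then the arm's own.
Vec3 applyNested(const Transform& avatar, const Transform& arm, Vec3 v) {
    return apply(arm, apply(avatar, v));
}
```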
Wrapping your head around matrix math is pretty much the cover charge to get into graphics programming. It’s a roundhouse kick to the left hemisphere of your brain, but if you can hang on long enough things will start to get easier.
Hooray for linear algebra!
I had extensive classes on that subject, and they neglected to tell me that you need it for 3D programming. Very clever indeed.
Transformation Matrix: Michael Bay and the Wachowski brothers, together at last!
That is the worst thing that could occur in the history of ever.
Ah, you must be speaking of the Vault-Tec Assisted Targeting System.
I would’ve paid SO much more attention to matrices in Hon. Algebra if they had said something about graphics programming!
Ouch, maths I haven’t used in over a decade. My brain hurts.
Matrices are fun! When are you going to calculate the inverse of the matrix?
You had me right up till the point where math started getting involved, Shamus. ;) Then I just skimmed to the end of the article. Math has always had the effect of shutting my brain down. I suspect that were it not for my love of D&D and the necessity that math plays in it, I would not possess any more mathematical knowledge past basic arithmetic.
But…D&D only requires basic arithmetic unless you’re min/maxing and trying to determine expected values for different options.
So, essentially, it only requires basic math if you’re doing it wrong. :P
Alternatively, it only requires basic arithmetic UNLESS you’re doing it wrong. ^.~ Two can play at this game!
Congratulations. You’re no longer playing an RPG, you’re doing a spreadsheet.
And yet the math he described was just basic arithmetic: x * 1 + y * 0 + z * 0
multiplication and addition. trivial stuff. of course, overall, he described something akin to linear algebra (except scaling/transform parts of the matrix and the unused box are graphics-specific extensions), but he described in plain English how it’s just a matter of multiplying and adding the right choice of numbers together.
the closest he came to “hard” math was that he said the word “sine” in reference to rotations which weren’t 90 degree turns, but nowhere else did he rely on anyone knowing or caring about that sentence.
“Another matrix for each of the moving parts”. Well, okay, but what about fluid-motion objects (say, a wacky waving arm inflatable tube man, or a flag flapping in wind) that don’t really have parts, but every vert has unique positioning (probably related by a function to two neighbours, though, with a few control vertices marked on the animated skeleton)? Do you really have a transformation matrix for every vert?
There are two ways of representing an animation:
1. Store the position of each vertex at each point of the animation
2. Store how each vertex moves between each point of the animation
Generally, storing deltas (the second method) is more space-efficient than storing the entire position again each frame – and a transformation matrix is a fairly efficient way of representing the difference between two points.
The alternative (which is often used for benchmarking, for instance) is to store a much simpler representation of how the object as a whole moves, and calculate the exact vertex positions on the fly. This is much more space-efficient, but means much more runtime processing.
I think in this case you would run it through a translation matrix to move it to your desired location and then run a different and completely unrelated mathematical function to determine the new positions of your vertices. Perhaps one that does a simple physics calculation taking into account the weight per vertex, pressure of the internal air at the given time and outside wind vector.
Edit: Or you can use efficient cheats like the first two methods described above – which, unless changing the way the arm-waving thing waves is crucial to the gameplay or you’re trying to sell your game based on physics, is basically what you should do.
If you have a skeleton, you might have a ‘joint’ defined on the skeleton, which can be evaluated and used to synthesize a transformation matrix every frame.
If you’ve got a fluid, you might have ‘particles’ which have a position & orientation and then have forces applied to them to update a new position & orientation every frame.
Animation is expensive.
For an awkward moment, I thought my father was commenting on this post despite the fact that I know he doesn’t read it.
I have tried though. He codes for a living, and for fun.
Fluid bodies have a lot of different approaches. Let’s set them aside for now because I don’t want to write another 2k words.
For the soft deformation – Regular avatars are a good example of this. If you did the minecraft style animation to (say) Commander Shepard it would look HORRIBLE. (I’ve done it.) The arm moves very mechanically, and the polygons around the joint stretch in strange ways.
I don’t know how they do it on these 10,000 polygon monsters, but on the 1k poly avatars of ten years ago, you would have these “weighting values” for each vertex. Like, this point is 90% upper arm and 10% torso. This one is 50% upper arm, 50% lower arm.
Then you calculate both possible positions for the verts. “Here is where this vert would be if it was 100% torso” and “here is where it would be if it was 100% arm”. Then you blend those two positions together according to the weighting.
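The blend itself is just a weighted average of the two candidate positions. A quick C++ sketch (the names are mine):

```cpp
#include <cassert>

struct Vec3 { float x, y, z; };

// w = 1 means "all arm", w = 0 means "all torso".
Vec3 blend(Vec3 torsoPos, Vec3 armPos, float w) {
    return {
        torsoPos.x * (1 - w) + armPos.x * w,
        torsoPos.y * (1 - w) + armPos.y * w,
        torsoPos.z * (1 - w) + armPos.z * w,
    };
}
```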
That’s still how they do it, but now they let the GPU do the heavy lifting of calculating and interpolating those 10 000 points over the thirty or so skeleton vertices.
And you paint them onto the model’s weight map with a paintbrush, which can actually get incredibly annoying if you’re trying to do mechanical things and you realise that your 3D program does not appear to actually support simply grabbing a bunch of vertices and saying “Attach these to that bone.” (Looking at you, Maya.)
You can do that in the component editor. It’s very clunky, but it works.
Yay Blender!
The nice thing about transformation matrices is that mathematically you can treat them like a regular number. So the transformation can be written like
P(new) = P(old)*M + T
With P being the point vector, M the Matrix and T the movement vector (T for translation)
Now you have one Matrix (M1) and movement vector (T1) for the upper arm, another one for the lower (M2, T2).
You can either compute both results and interpolate, or you can just interpolate the matrix and the movement vector themselves and apply those, which makes sense if you have more than one vertex with the same weight.
I’m not in computer graphics, but being an engineer who’s dealing with completely different stuff and knows transformation matrices from a slightly different angle:
In my field, we always put the scaling part inside the matrix, which saves you one operation. Just multiply the matrix lines or columns with the appropriate scaling value (If you do it to the lines, you’ll scale the already-moved-and-rotated object, if you do it on the columns you’ll scale first, then rotate and move – should be easier to keep track of and keeps the center of scaling in a constant location relative to the object).
So instead of Shamus’ (please excuse the crazy formatting, I can’t seem to make it work better)
 0.984808  0  0.17364        1
 0.000000  1  0.000000   *  10
 0.173648  0  0.984808       1
I would just use:
 0.984808   0  0.17364
 0.000000  10  0.000000
 0.173648   0  0.984808
(that’s a 10 degree rotation and making the object 10 times as high)
This way you save three multiplications per point but you need nine multiplications to join the scaling vector into the matrix. => only worth it if you have more than three points with the same transformation. => Most of the time.
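In code, baking the scale in is just multiplying each column of the matrix by the matching scale factor. A sketch of the column version described above (scale first, then rotate and move; the function name is mine):

```cpp
#include <cassert>

// Multiply the columns of a 3x3 matrix by per-axis scale factors,
// so the scaling happens before the rotation the matrix encodes.
void scaleColumns(float m[3][3], float sx, float sy, float sz) {
    for (int row = 0; row < 3; ++row) {
        m[row][0] *= sx;
        m[row][1] *= sy;
        m[row][2] *= sz;
    }
}
```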
Luckily, since graphics chips support hardware Transformation (the “T” in “T&L”, which the first Geforce chips introduced), you can just apply a transformation matrix with one command that not only looks like a simple multiplication but also executes as quickly, because the single components are handled in parallel. Modern graphic chips can handle not just one of these multiplications at a time but … lots.
This is very akin to what good old vector processors used to do, the ones used in supercomputers like the old Cray machines. This is why modern graphics cards are so popular with supercomputing people: in scientific computing you use vectors and matrices a lot, vector processors have almost died out, and “scalar” processors (read: probably the only ones you know about) are really slow at this (which is why the first GeForce chips were the biggest boost 3D graphics ever got). So people are happy to have something again which can do at least three-dimensional vector operations in hardware-accelerated mode.
Although doing something with 6 to 17 dimensions would make us much happier still …
Oh, and “vector” is the official mathematical term for a couple of numbers used to describe a point in space, or an arrow (doesn’t matter, it’s either a point, or the arrow starting at (0/0/0) pointing at said point). Works the same in 2D or in 17 dimensions, mathematically. Because maths does not care whether your brain melts or not.
Got it? :)
“Oh, and “vector” is the official mathematical term for a couple of numbers used to describe a point in space, or an arrow […]”
Not really. There is a difference in math between a point and a vector, the same way there is a difference in physics between a point and a vector (say the position of an object and the force being applied to it).
The problem is that the difference disappears when you use coordinates. If you talk to me about (1,1,2), I can’t tell if it’s a position or a vector.
A lot of progress was made in 20th century math by finding ways to abstract ourselves of the need for coordinates.
But a position is also a vector — it represents the movement you’d have to make from the origin to reach the desired location. Which is why you can get away with calling everything “vector”s. :)
…My brain hurts after reading this…
Well, I’m sure you know it, and probably thought about it while typing this up, Shamus. But Minecraft does use Donald oddly: x is east and z is north, while y is up.
I like these explanations, fills up a little time, and I feel accomplished for learning
To get more to the math side of things, all this is encompassed in the field of Linear Algebra. Here’s a list of topics on wiki. The basics are actually really easy, and can be picked up by anyone who did a year of high school math.
Shamus, you did a nice job of explaining it. I aced Linear Algebra, but never made the connection to graphics (although transformations were a topic).
For the record, the 4×4 matrices are really projective transformation (the very last category on the wiki page linked).
Hence the useless 1 at the last position, which does not come into play in calculations but is a source of comfort for mathematically inclined people who would hate to leave a hole in their matrices.
Maybe next time you can do collision parameters, gravity, and (if we are lucky) movement modifying items, such as a trampoline.
I like donald. Donald is much better than vector. We should all convert to donalds now.
Does this mean my middle name is now Vector? I’m not sure how I feel about that.
Now I wish we had another child so we could name him Vector. That would be an awesome name!
Obligatory “Despicable Me” reference, “Ohyeah!”
(Yes, that’s run together on purpose – see the movie.)
What’s our vector, Victor?
Have we got clearance, Clarence?
Roger.
Yes?
Roger, Roger!
Who? what?
And what about Turkish prisons? Have you ever been in a Turkish prison?
Looks like I picked the wrong week to quit sniffing glue.
Sorry, I had to do it.
Also, you may want to think this over:
http://en.wikipedia.org/wiki/Vector_(epidemiology)
I just want to tell you good luck. We’re all counting on you.
Or just call it a vector. I don’t really see what is confusing about calling a vector a vector.
A vector is supposed to be a magnitude and a direction though. Why is a collection of coordinates called a vector?
Unless you’d use the Pythagorean Theorem to calculate magnitude and direction from the coordinates or some such thing, which I guess would make sense. It still seems there should be a different name for it.
Those three numbers that define a point (let’s call it A), define a vector that has a beginning at the coordinate origin (0, 0, 0) and ends at A. All the math you did graphically on paper with vectors (like addition), you can do with their numerical representations (three numbers specifying where their end is).
The magnitude of such a vector is sqrt(x*x+y*y+z*z), and the direction (angle between an axis and it) can be calculated using trigonometric math.
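That magnitude formula in code, for the record (a trivial sketch):

```cpp
#include <cassert>
#include <cmath>

// Length of the vector from the origin to (x, y, z):
// the Pythagorean theorem extended to three dimensions.
float magnitude(float x, float y, float z) {
    return std::sqrt(x * x + y * y + z * z);
}
```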
A position is a magnitude and a direction from the origin. In more advanced math and physics, “everything” is a vector.
In fact, one of the things the “Local Space > World Space” transformation does is convert points from being relative to themselves (ie. 0,0 is at the center of the model, or some other handy place) to being relative to the world (0,0 is the world’s center).
So yeah, all positions are vectors, and have to be so ’cause they’re all relative to a completely arbitrary point which we decided to label as 0,0.
This is something that has always bugged me. Maybe I’m wrong, but as I understand it, in math a vector is just a set of numbers that represents something in a multidimensional space. [x, y, z] is a vector. [day, year] is a vector. [age, height, weight, armspan, bust size, waist size] is a vector. Heck, even [eye color, hair color, gender, political party, county of residence] is a vector with the right transformations. I don’t understand why physicists have to confuse everything with this “magnitude and direction” stuff, and conflate it with vectors.
Yes, velocity has a magnitude and direction. That’s because the dimensions in a velocity vector represent magnitude and direction. If you want to make things complicated to understand, you can say [x, y, z] is a magnitude and direction for position with relation to the origin. In fact, with enough mental acrobatics, you can justify saying every vector is a magnitude and direction, but at that point you’re taking so much effort to connect the concept of vectors to what they are, it becomes more confusing. It’s so much simpler to just say that a vector is a set of numbers that represent dimensions, instead of this magnitude and direction junk. And it confused the crap out of me until I figured that out.
EDIT: Note that this isn’t a rib against you, it just hits on one of my personal pet peeves.
I imagine a vector would have to be an element of a vector space to make sense, and the definition includes a scalar field, so you’d have to define quite a lot of stuff to multiply those discrete political parties with your scalars.
Edit for more clarity: With the position vectors this graphics article is talking about, the scalars are the real numbers. A vector multiplied with a scalar has to be a vector. But the field thing requires division between the scalars, so I think with discrete coordinate values it gets kind of complicated.
Except that in some parts of mathematics we have a habit of using the word “vector” when we really mean “tuple”, when there’s nothing in sight that even resembles a vector space. (Yes, it’d make much more sense to just say “tuple” instead, but people don’t always.)
In one class, a matrix was even defined as “a vector of vectors”… and I don’t think it’s actually possible to make any sense out of matrix operations if you take that literally in the vector-space sense…
Curses! Once again, I’m foiled by questionable terminology by other people.
But actually, m×n matrices, with certain kind of entries (field?), are a vector space. The multiplication, transpose and so on are just additional operations, and also between different vector spaces. Though the “of vectors” part does get murky. I guess matrices are tuples of column vectors as well as vectors of their own vector spaces, but “vector of vectors”?
Unless, of course, one happens to call tuples vectors anyway…
even vectors of vectors are no problem, actually. Or vectors of vectors of complex numbers (which itself sometimes are displayed as 2D vectors…)
For me, matrices are operators, but I’m not sure about the proper definitions, so they may or may not have some geometric interpretation
Tensors!
Eight, sir; seven sir;
Six, sir; five, sir;
Four, sir; three, sir;
Two, sir; one!
Tenser, said the Tensor.
Tenser, said the Tensor.
Tension, apprehension,
And dissension have begun.
It doesn’t actually matter.
Let’s say I have a small volume of air, moving. That has a density, three momentum components (which actually form a vector), and a kinetic energy.
These five together are commonly used as a vector in aerodynamics. They have no geometrical counterpart, but if you choose to define a five-dimensional space where these five quantities are the components, then in that space, this is a vector. Luckily no one ever has to draw this up, and no one has to imagine it, because it’s just a way of getting the maths done quicker so we can get back to what we like more. (most other things, that is)
Saying “tuple” would be correct from the point of view of (say) a Python programmer, but mathematically it is a vector because we’ll continue to use vector operations on it. And in the end the whole governing equations system can be written as
y = Ax + b
… where each letter represents at least a five-dimensional vector or a 5×5 matrix, with complicated elements each single one of which takes forever to figure out and so on, but you can not only write it on less paper, you can also compute it with much less code (and much faster), if you just choose to define some vector space that contains as components the values you are looking at.
I acknowledge that this is a long way from highschool mathematics, and that it took me years to get some sort of understanding for it. So it’s fine if you just shake your head in bewilderment or have stopped reading already.
Let me just say that a vector can be anything that consists of independent components, and there does not need to be any (let alone easy to understand) geometric analogy.
The magnitude/direction is just the result of transforming a vector in Euclidean space (x/y/z) to polar coordinates (like latitude/longitude/elevation). Mathematically, this must work with any vector, but it only really makes sense in actual 2D and 3D space, and maybe in 4D if you’re doing relativistic physics.
Density, momentum and kinetic energy work fine, as they’re continuous. My original point here was that discrete variables like one’s political party are trickier as components of a vector (space element), because you have to be able to multiply them with the scalars and their inverses while staying in the vector space. They can easily be elements of a tuple though, so tuple and vector are different, and “vector of vectors” isn’t obvious the way “tuple of vectors” is. Edit: Though “column vector of row vectors” would work.
Aside: the Fun in statistical mechanics is you get to deal with a phase space of not just 4 or 5, but 2dN dimensions, where d is the amount of dimensions in the space your N particles are moving in (d coordinates, d velocities per particle, N typically huge). For me at the moment, that Fun there is of the DF variety.
Well, to be pedantic, a vector space is usually defined over a field, something which can be multiplied, divided and added and subtracted. Vectors only need to be able to be added to each other.
If, for example, you can only multiply and add and subtract, not divide, you get something mathematicians call a *ring*. You can define a vector space over a ring, but this is usually called a *module*. As an example, consider as a ring the integers (either positive, zero or negative). You can add, subtract, and multiply them, but in general not divide them (you go out of the ring then). A module over the ring of the integers is for example the polynomials in one variable with integer coefficients (like x^2 + 1, or 5 x^3 + 3 x^2 + 7 x + 3). These form a module over the integers (and also form a ring).
You also have finite fields, for example, if you take the set {0,1} and define addition 0+1 = 1+0 = 1, 0+0 = 1+1 = 0, and multiplication 1·1 = 1, 0·0 = 1·0 = 0·1 = 0, you see that you can divide by every element except 0, and this gives you a field. This means you can define vector spaces over this field, and there certainly is no continuity (in the usual sense of the word) at work here.
Yes, a finite field is what I meant by “trickier”. ;)
Meaning you can’t use the usual scalar fields like reals with “discontinuous” vectors.
Abnaxis: This coordinate business is what is confusing the issue, and this is why physicists care so much about magnitude and direction, because there is a real difference between position and vectors that is not captured by thinking in terms of a single coordinate system.
But if you think about two coordinate systems, then the difference appears: a position might be say [0,1,0] in one system and [0,0,0] in another, so the magnitude of the triple is not physically meaningful since it is not conserved under coordinate changes. It does measure the distance from your point to the origin of the coordinate system, but since that origin is chosen purely arbitrarily, it is not really useful information.
On the other hand, a vector representing a force should have the same magnitude whatever coordinate system you consider.
I was wondering about that. I guess he means that a position isn’t a vector with a length and direction – but as positions have a one-to-one correspondence with the unique vector from the origin to that position (which has the same component values), the concepts are completely interchangeable, and you get to multiply them with matrices too.
Though I’d imagine Shamus would have run into position vectors in linear algebra, so maybe he has some other difference in mind.
“I don't really see what is confusing about calling a vector a vector.”
Nothing confusing about that. But in math, a vector is a point and a direction. {x,y,z} should be called a point, not a vector.
Ah, but in physics, we use a displacement vector with the tail at the origin to denote position (the tip of which is your point in three dimensional space).
Gah! Shamus, I can’t edit posts from my mobile. Like when I half finish something and post it anyway because I’m an idiot.
Anyway, back to how we do things in physics: We then break that vector into component vectors along the mutually perpendicular axes so that the length of each component is the position of the tip on that axis. Basically, a position might as well be a vector from the origin but without having to call pesky trigonometric functions that will only get you approximations anyway.
It is assumed that such a vector has its beginning at the origin (0,0,0).
A vector here actually refers to the column vector used in the matrix computation, which is different from, say, the Euclidean vector. The column vector represents an n-dimensional vector with a starting point at the origin and an end point described by the values in the column vector.
Different how?
http://www.youtube.com/watch?v=fVq4_HhBK8Y
I was hoping someone would finally link that so that I don’t have to.
Thanks, I’ve been looking for that one for a while now…
Or equally
http://www.youtube.com/watch?v=bsTuhzeSBE
Oh dangit. I shouldn’t have slept through precalc.
This, oddly enough, coincides with my attempt to make it through the latest version of the OpenGL Superbible — and then if I pull that off, various books on AI and physics simulation that I’ve acquired over the years.
You know, because going to college after 8 years out of school isn’t hard enough.
This is why I stick to frontend development. That way I can just call the “moveGuy(x,y,z)” function that someone who actually knows what they’re doing created instead of programming it myself.
There is great wisdom in your approach.
And we’re back around to the CryEngine3 post again, aren’t we? *grin*
I take this a step further and simply play games. That leaves all this confusing front AND backend development to experts. They can use magic for all I care.
That said, I do like reading about these concepts. I just don’t have any desire to implement them.
You are inverting the flow of money with that simplification, however (from a positive to a negative value per unit time).
Just don’t cross the streams.
What’s that in this context? Paying someone to give you money?
(I got the joke I just also got confused)
The funny thing is: By stating that you have just surpassed the majority of programmers. :)
Y’know, I’d argue with you here, but I’ve read some nasty stories about people trying to hire programmers and such and it’s just…DAMN man. I can’t believe some of the people out there that call themselves programmers.
It makes sense now how I wound up helping people in my Oracle class in College, despite the fact that I got a C in it and just HATED the whole thing. I used to joke with my roommate that I was the worst IT major ever, but damn, man.
There are some people who apparently believe that ‘programmer’ means ‘once saw a computer through a window’ :P
It gets better when your local university decides to teach people to program completely backwards — and with Java, using a custom set of easier classes to keep the students from a) learning to actually use the API or b) getting confused.
The textbook (which costs two to three times as much as a Sams and is less than half the size) starts with graphics, then goes back and introduces primitives, arithmetic, conditionals and loops. After mid-semester my instructor said something about introducing control loops, and I just looked at the guy next to me and asked how they were writing their programs if they didn’t know about for and while loops.
You know what part of the problem is? I think someone mentioned that there’s high demand for programming jobs and you’ll make a lot of money doing it, and then a whole ton of kids just went for C.S. instead of Business or Communications or something. Now we’re cluttered with people that don’t know what they’re doing.
It isn’t programming related, but it still amazed me one summer when I was working at a summer tech camp. We were setting up the computer stations and I was surrounded by C.S. majors. On one of the machines the disc drive wasn’t showing up, and no one else could figure out why. I stepped in, checked the computer’s properties to see the hardware, and when it was absent I figured to shut the computer off and open it up. Lo and behold, the disc drive was never actually connected. This had everyone else baffled for like twenty minutes, and I solved it in less than five.
I stood up and one of the C.S. majors said “Well it’s good we had an IT major here.” Note that the one class I had involving hardware was actually when I majored in Software Engineering, before I switched to IT, and my IT classes all focused on HTML and web design. It’s like, dude, I only have a +2 or something to my Knowledge (Computer Hardware) check. Why is it I could figure that out and actual C.S. majors can’t?
It’s a whole new world. Whenever I see people like Shamus get nostalgic about his younger days when he was just learning to program, it makes me sigh. Those days really are gone. My 12-year-old cousin likes to think himself a hacker, but he uses so many programs that do a bunch of this stuff for you. I finally went over to his house one day to help teach him the basics of web development, something he’s been trying to do for a while, and he didn’t even know what a div is (note: I still don’t know what div actually stands for, and my best guess is divider).
Trying to learn programming in the modern age is a totally different, and sometimes overwhelming, experience. Nonetheless, I wouldn’t be surprised if, in the future, there are enough APIs and frameworks to make it easy enough that they’ll start teaching it in grade school.
Oh, I know, I’ve run into some people with a B.S. in Computer Science. And I was more competent with both hardware and software when I was in middle school than they are now.
My father was the CIO of a company and hated hiring people because HR couldn’t find a real programmer in the sea of useless CS majors. That’s part of the reason why I’m not a CS/Math major anymore.
ccesarano: I guess you’d be amazed to find out that the number of CS majors has been dropping like crazy in the past few years. Apparently, the major is still too hard…
Lol, education almost isn’t worth it anymore. I’ve never been formally trained to program; I kind of have my own style to it: randomly throw some code together, then update it a million times, and hopefully I’ve made something intelligent. By the way, if I were a genius and had time to learn every detail of every subject, I would hate APIs, but it is nice to be able to just do what you want without the overhead of learning a bunch of new stuff and writing some code to make it happen. Just saying, if I could tell a computer “make me chocolate milk and cookies” I wouldn’t get upset if it did. I think it’s considered progress when you can take something that used to be hard and make it quicker and easier to do.
Nice article!
Shamus, don’t you think that the transformation matrix violates encapsulation in a horrible manner? I mean, I’m sure it’s an efficient way of storing those values, but the thing is no longer a matrix in the sense that you can do normal matrix/vector operations on it, and it is essentially three things in one. Wouldn’t it be better (from a programming-sanity standpoint) to just have three variables for the rotation matrix, translation vector and scale parameters?
Indeed, using the normal matrix multiplication with that last one doesn’t work. You’d need a fourth element in the position vector, for one.
You could make normal multiplication work with one matrix though. Position vectors could have 1, or unit distance, as element (4,1). The transformation matrix would have the rotation numbers multiplied row-wise by the scaling parameters, the translation distances as the first three elements of the fourth column, and 0, 0, 0, 1 in the last row. Though this probably would not make a difference compared to three variables.
Shamus simplified, but actually all computer graphics I know of handle vertices and transformations using 4×4 matrices and 4D vectors (the 4th coord is usually hidden though, since it’s useless).
Yes… but it needs to be in a different format than the one above for performing the transformation by (normal) matrix multiplication. The “identity matrix” above adds the 4th element of the vector to the first three instead of preserving everything, for example.
You’re right. A proper no-op transformation matrix to use with matrix multiplication is the regular old identity matrix. The one Shamus has actually translates points by (1,1,1) if you multiply it by the position vector (assuming the w coordinate is 1).
I think they might be using quaternions.
edit: of course, they are using projective geometry.
It’s funny, the math is rather simple for me(but I’m a professional mathematician), but the application to programming is a total mystery to me. Shamus, please continue these nice explanations!
Actually, you normally *do* do matrix maths with those transformation matrices. Or at least in OpenGL you do: http://www.songho.ca/opengl/gl_transform.html
EDIT: Or, more elaborate: http://glprogramming.com/red/chapter03.html
Crazy … I’d just multiply the scaling vector with the rotation matrix, then you have a nice linear system: ax+b (a being a 3×3 matrix, b the translation vector).
On the sites you link to, there’s a fourth “coordinate” called w in every coordinate vector. What is that supposed to mean? Behind the second of your links somewhere it says: “vertices always have four coordinates (x, y, z, w), though in most cases w is 1”, and that’s all I found (at least after a few minutes searching).
Hmm… if the last column of the matrix is actually the translation vector, then the w component of the vertex will scale the effect of translation … i.e. with w=0.5 that vertex will always only get 50% of the movement specified. What’s that good for?
Obviously, 3D graphics programming is at least using different definitions for some things.
Certainly ax+b is equivalent to cx where c is a 4×4 matrix when doing affine 3D transforms (such as scale, skew, rotate, translate), but homogeneous coordinates come in useful when doing perspective, which is *not* affine. The trick is to do an affine transform in *4D* space which skews vectors with greater z (further from the screen) into higher w. Then the perspective *divide* normalises the points back into homogeneous coordinates (w as 1): (x/w, y/w, z/w, 1). As you can see, objects that are further away get smaller, also known as perspective. Since the hardware has to multiply vectors by 4×4 matrices for this anyway, and 4×4 is equivalent to 3×3 + 3, the other transforms are also represented this way. This also means the other matrix tricks like inversion and concatenation work without changes.
That actually makes a lot of sense. And if your hardware is for 4×4 matrices anyway, it also makes sense to use it instead of 3×3+3, even if a scalar processor would take longer for the operation, because on specialized hardware it still runs faster.
It also explains why graphics programmers find it weird, because using 4 dimensions for 3D content is not self-explanatory.
I don’t quite understand, though, why I would need a w variable for perspective division, because I already have z which is growing with distance from the viewpoint, so I could just draw (x/z, y/z), or not?
*insert a few minutes looking for, finding and reading http://www.songho.ca/math/homogeneous/homogeneous.html*
aha! I think I grasp it! w allows you to go to infinity without using infinite values for coordinates! This is cool. So you’ll keep it at 1 for any part of the geometry that is at a reasonable distance, but you can set it to 0 in order to indicate that something is infinitely far away (and will ignore perspective), without needing a different approach for rendering backgrounds. And I guess it could also be used to circumvent problems of accuracy or size limitation of the regular coordinate variables, so you can have a size limit of say 64k for the variables but still have something at a distance of 640k if you just set w to 0.1. Should come in handy! Although geometry like that will not be translated correctly anymore, nor will the infinity sphere, because if it’s at infinity, shifting by five meters changes nothing.
Shamus, you need to talk more about these things. Which would turn this blog into a complete dork affair which no one will want to read. Except me, I guess. I like it. Although to be fair I would maybe not follow a blog about applications of vector mathematics … ok, just keep doing it like you are.
I love the fact that, somehow, in higher math, it can be correct to say 3×3+3 is equivalent to 4×4, where a grade-schooler wouldn’t be allowed to say that :P
Yes, I know you don’t mean equivalency as equality, but it’s still funny to me. Eh.
In this case “3×3” doesn’t mean multiplication but a 3 by 3 matrix
And the statement was that instead of multiplying a vector with a 3 by 3 matrix and adding another 3d vector you can multiply it with a 4 by 4 matrix and be done.
But I agree, this would not have occurred to me during school. It should have occurred to me during university but didn’t, either. Still, I did get it today … better late than never.
It is worth noticing that those two articles, while still useful in concept, are (at least partially) outdated.
Modern GL does not work like this anymore. As far as I recall, using GL matrix stacks is deprecated.
Shaders make it utterly more complicated. Who says there will be an MVP matrix to start with? Who says the transform will be carried out that way? Nobody mandates those features, and sometimes you cannot even know.
Who says there will be an explicit point position? Nobody guarantees that until entering rasterization!
Now, this is crazy.
So, what do you do then? The matrix method does seem efficient, and old software probably still does run, so has the transformation matrix become just a special case of vertex shader which can be replaced by anything else if you feel like it? I mean, in the end you need to have points and triangles on the screen.
The GL matrix stack *API* may be deprecated, along with all the other “fixed function” APIs, but that doesn’t mean either matrices or matrix stacks are, you just have to do the maths yourself (GLM is a great library for C++ for this).
Direct3D explicitly states it’s simulating the fixed function pipeline with a shader, when you run the debug version. I’d be surprised if recent OpenGL drivers didn’t do the same.
Knowing about matrix math is perhaps even more applicable now, since you use matrices a lot, in weird and wonderful ways, when you write shaders.
Actually, you CAN do normal matrix operations on it. For instance, to calculate normals for lighting, you need the transposed inverse matrix.
In fact, the transformation matrix holds the complete information about how a certain point is positioned in world/view/projection space (View = Where the point is relative to the camera, Projection = Where the point is relative to the screen; Translation = location, Rotation = facing, Scale = size), and most of the math used for fancy stuff will need all of it, so there’s no point in keeping them separate. (Also faster)
Of course, for sanity, you usually handle matrices as little black boxes and use functions like matrix = CreateRotationY(angle) * CreateTranslation(x, y, z) * CreateRotationX(angle).
(Rotating is different before and after translation, didja know?)
I’ve got a failed Linear Algebra course I need to catch up with so don’t remind me about this stuff. Whatever, time to get back to playing video games!
Pffft…. real men use their Transformation Matrices in R^42 space to evaluate a local variety. Okay, to be fair that’s the Hermitian Matrix, not the Transformation one, but my point still stands.
Math is fun.
Is my point that is.
I’m going to be honest: even though I do dabble in programming for several reasons from time to time and I’m an avid reader of your programming dissertations, Shamus, that ‘extended’ matrix for spatial translations made me do a double-take and shout ‘Heresy!’ to the air. Silly programmers and their impurity that seeps into such beautiful math.
Totally joking of course. :P
Shamus, nitpicking, but scaling is the diagonal: the scaling by (x,y,z) matrix is:
x 0 0 0
0 y 0 0
0 0 z 0
0 0 0 1
The w column is for translation (movement), move by (x,y,z) is:
0 0 0 x
0 0 0 y
0 0 0 z
0 0 0 1
@Drexer: If you want to feel better, it’s 4×4 because 3D perspective is implemented as a skew in the fourth dimension (called ‘w’) of 4D space. That’s right – 3D graphics uses 4D maths :).
BORING TECHNICAL DETAILS ABOUT PROJECTION WITH HIDDEN “OH THAT’S WHY THAT HAPPENS” DETAILS BELOW:
The last step to get from points in 4D “homogeneous space” to the 2D screen coordinates is to divide the x and y coordinates by w (the “perspective divide”). This happens after applying another matrix called the projection matrix; in 3D it uses a perspective matrix that puts the z coordinate in w.
For example, D3DXMatrixPerspectiveRH() uses:
2*zn/w   0        0              0
0        2*zn/h   0              0
0        0        zf/(zn-zf)    -1
0        0        zn*zf/(zn-zf)  0
where ‘w’ is the screen width, h is the screen height, zn is the near clipping plane: the closest renderable depth, and zf is the far clipping plane, the furthest renderable depth.
These clipping planes are why stuff disappears if it gets too close or far from the camera – it’s considered to be behind the screen or past infinity. Putting these clipping planes too far apart means that the possible values z can take are further apart, meaning surfaces that are too close together might round to the same Z value, and the graphics card doesn’t know which to render – giving you “Z-fighting”, where the triangles flicker back and forth as you move around.
Thanks for the explanation, should have read this earlier, and will read it again, it’s too late right now.
Just want to point out that at least in regular 3D coordinate transformation (with 3×3 matrices), scaling is only on the diagonal if you’re doing nothing else. In the general case, you need to multiply all elements in the first column with the x scaling factor and so on (that is if you scale before rotating)
Combining transformations is normally done by matrix multiplication, for example: “TranslationMatrix(position) * RotationMatrix(roll,pitch,yaw) * ScaleMatrix(scale)” is a scale, then a rotation, then a move (matrix multiplication order is the reverse of logical application in the normal sense). In fact, the ability to combine transforms like this is the reason matrices are used to implement transforms – combining movement, scale and rotation manually is possible but complex.
Actually, your translation matrix over there appears to move the position to (x, y, z, 1). For moving by (x,y,z), you need to have ones on the diagonal elements, to add the original values.
Doh! Yeah, you’re right :)
Ah, ok. But rotation times scaling should lead to exactly that: a rotation matrix where each column is multiplied by the respective scaling factor.
Also, I see now how translation works in a 4×4 matrix. If the fourth component of each input vector is just one, that will just give
x(new) = a1·x + b1·y + c1·z + d1
and so on.
Cool, something learned today!
Matrix multiplication is… complicated. Wikipedia explains it somewhat reasonably (under the title “Matrix product”):
http://en.wikipedia.org/wiki/Matrix_multiplication
Roughly, the value at (i, j) combines the i’th row vector from the left side with the j’th column vector from the right side (multiply them element by element and add up the results).
It’s only called multiplication because mathematicians like confusing people by having weird definitions: http://en.wikipedia.org/wiki/Ring_(mathematics)
yep. I didn’t have the rules in my head, but I just looked up your link, and multiplying the 3×3 rotation matrix by the transposed scaling vector (or 1×3 matrix) gives you exactly what I described above.
How? I mean, if you multiply v’ and M, where v’ is 1×3 and M is 3×3, you’ll get a 1×3 result. And the normal matrix product isn’t defined the other way round.
But the 4×4 scaling matrix Simon showed above works. You’ll notice that multiplying a 4×4 matrix with it (from the left) just scales the first three rows with their respective factors.
gaaah, I got it slightly wrong. If you multiply a 3×3 matrix with a (transposed!) vector (or 3×1 matrix), and the vector is on the right of the matrix, the result is a new 3×3 matrix which combines scaling and rotating, scaling being the first one to be applied. This is the same as multiplying with a diagonal scaling matrix.
I don’t disagree with Simon, his post just looked as if he thought I got it wrong, but my solution was just a roundabout way of putting it without mentioning “Matrix multiplication” because I didn’t remember how that went. I just figured you had to multiply each point vector coordinate with the appropriate scaling factor before doing the rest of the transform, and that’s achieved by multiplying each column in the matrix by that factor, which is the same as multiplying in the appropriate diagonal matrix. So all’s good now :)
A 3×3 matrix multiplied by a 3×1 matrix gives you a new 3×1 matrix. You need the 3×3 (diagonal) scaling matrix to preserve the dimensions.
Am I stupid? Ok wait, no one needs to answer that. To me, however,
it seems like we are multiplying by translation (green) numbers and adding scaling (yellow) numbers in that C code. (Yes, yes, I realize I’m being picky and no one used that code for real.)
So the stated line of “out.x = M(m,0,3) * (M(m,0,0) * in.x + M(m,1,0) * in.y + M(m,2,0) * in.z) + M(m,3,0);” should be “out.x = M(m,3,0) * (M(m,0,0) * in.x + M(m,1,0) * in.y + M(m,2,0) * in.z) + M(m,0,3);”
These insidious little bugs are what drive programmers crazy.
I’ve always found linear algebra to be the most “actually useful” flavour of maths.
I think I’ll have to give that award to calculus.
Elementary arithmetic would be a strong contender, methinks.
You’d be surprised how little of that I actually do.
I guess you’re only interested in purer mathematics then? Usually only using 0, 1, pi, e and so on? It’s more elegant, but I think basic arithmetic still wins in the “most actually useful” category. Even with calculators, people still need to know how a problem is translated into a calculation. And teaching the more advanced stuff would be troublesome without it… I hear set theory was tried as the first math taught here in Finland in the 1970s, but it didn’t end well.
I’ve heard about it as well and I can’t see any reason for it. Who thought that was a good idea?
No, I’m not interested in purer mathematics, not really interested in mathematics at all (I’m not not interested, I’m just not interested), and while of course arithmetic is important, it is for me something done at the end, once the actual thinking has been finished.
The movement is known as “New Math” in the USA. After the launch of Sputnik, the West got scared and decided to boost science education, including math. Since set theory is the theoretical basis for math (to a certain extent), the hope was to teach it to children, so they could learn more advanced concepts later on. However, set theory is actually pretty advanced, and children don’t have the mathematical maturity to handle it. No wonder it ended in failure.
The problem is Set Theory is not necessary and gets in the way of basic arithmetic. Spending a year learning how to “show your work” in solving 2+3 rather than a couple weeks learning (or memorizing) addition tables is a huge waste of time and stifles a child’s desire to learn. Math becomes boring and pointless rather than an exciting tool to learn new things.
This is why the only language my future children will learn is the language of math.
This reminds me of an engineering joke of old…
Friend: What’s up!!
Engineer: East cross north.
(If you’re an engineer and know the right hand rule, this may be hilarious. Otherwise, it may just be “nerd humor”. Your mileage may vary)
What’s new?
Frequency!
Bah, it doesn’t work when you misspell nu.
LOL –
The other one:
What’s new?
(hearing gnu) C over lambda.
What’s gnu? It’s a herd animal from Africa!
Nu gnu new…I KNEW something wasn’t quite right!
http://www.poppyfields.net/poppy/songs/gnu.html
GNU’s not Unix!
I laughed. Perhaps more than was warranted, but I laughed.
If you're writing a game, it's very, very common to have some sort of variable to hold these three values in one handy package. Confusingly, this package is usually called a vector.
And it only gets better when you find out there is a Standard Template Library container called vector. And it’s used to store any data whatsoever (in a very simplified nutshell).
And that you may then store your three-dimensional position vectors in a scalable array called a vector and jokingly assign it the variable name victor!
That isn’t too far from the truth.
It’s entirely possible you end up with a bunch of objects having a velocity Vector and a position Vector all stored in an std::vector, Victor!
We have tons of production code where vectors are stored in Vectors.
This shit will make xzibit throw his hands up in defeat!
Are you sure you don’t have Vectors stored in vectors instead? :P
It’s what happens when two different definitions of vectors clash. :D
I know, right? Arg!
And then that collides with all the vector structs that everyone has made for themselves. They could have called std::vector ANYTHING. WHY?!?!?
*shakes fist*
As an only-partially-a-programmer I can’t really understand the problem. Those things _are_ called vectors, and they have a whole subset of mathematics to themselves.
Also, as remarked before, a vector can contain anything. So it stands to reason to have a vector library for use with vector operations …
I guess this is something that a scientific programmer will be happy about, while you, for whom a vector is something entirely different, look on in bewilderment.
For me (with a more mathematical-technical background), these packages are not _confusingly_ but quite correctly named vector, because that’s what they are, and I’d be confused if they were called anything else …
Except std::vector is not a location in vector space, nor is it a tuple; std::vector is a resizable indexed collection of values of a type T. I’m not even sure what mathematical concept that is – a mapping from a domain of non-negative contiguous integers, maybe, ignoring its mutability.
Apparently, people from a math background like to call 1D arrays vectors and 2D arrays matrices.
Since your typical vector is [x, y, z, w, …], which would have been implemented as an array of size 4 in Ye Olde Days (which is when everything got its name), I guess the name stuck.
A compact, ordered collection of elements of fixed size is a “vector,” possibly implemented as an array.
A compact, ordered collection of elements of expandable size is a “list”, possibly implemented as a linked list; though also possibly implemented as an array with code to resize it if you need more entries.
While I haven’t used it personally, just going on what’s above, std::vector is apparently the second thing; which is bad naming, even for people with a math background.
As for the good old days, Lisp is older than all these things, designed by a math geek in 1958(!), and clearly named lists as lists. C++ is just being its usual design-by-committee self.
Yeah, the second. I believe the reasoning for not using ‘list’ was that lots of existing C++ libraries used that for linked lists :( (And C++ wants to always be clear about performance).
Java and .NET got this right, fortunately.
I’ve always preferred ‘tuple’ for your first definition, since it doesn’t imply vector operations are legal, nor continuous values.
I miss using LISP. Not too much call for it in my job :(.
The difference between a vector and a tuple (in my understanding) is that the tuple is a composite value (i.e. a value made from other values) and is therefore immutable (coords are tuples, for instance, being a composition of 3 reals to create a value representing a point in 3D space), while a vector is a container for values.
If I can recall correctly from back in the day the practical use differences were something like this in STL C++:
Array -> limited in size at initialization, accessed with index
Vector -> could grow beyond initialization, accessed with index
List -> could grow beyond initialization, each node only contained info for the next item, used an iterator
LinkedList -> could grow beyond initialization, contained info for the previous and next nodes only, used an iterator
I believe the index was just a shorthand baked-in iterator for those two cases; it still acts similar to the iterator of the Lists.
Anyway, more recent language permutations have ignored or created new meanings for these similar concepts. C#, for example, more or less replaced 1D arrays with Lists that are used in a practical sense like vectors.
Of course, this is all for Euclidean geometry and fixed-space rendering. You can do some seriously twisted stuff with the imaginary 3D spaces your computers conjure, like non-Euclidean geometry (which will make your eyes hurt) or space which is defined by how it connects to other space (there’s a fancy name for this, I forget what), which leads to things like 720-degree corners.
Ironically, the latter technique actually evolved first (I think), in that it appeared in a lot of early FPSes. I believe Portal does something similar; good luck working out how to render and translate vectors through non-Euclidean portal space (I do not envy the poor bastards who have to program some of these things).
You can listen to Valve’s commentary to hear how they fudge physics in a bubble around the opening of the portals. (I think this is what they said they did — don’t quote me; it was quite some time ago that I went through the Portal commentary.)
Portal-based rendering is actually super simple. Rendering engines need to “cull” – throw away parts of the scene that aren’t going to be seen – for speed. Portal rendering is a technique to do this by splitting your map up so everything is inside a “room”. When you should be able to see from one room into another (out of a window or door, for example), there is a “portal” that connects the two. When the renderer is drawing the map, it draws the room you are in, then the rooms on the other side of any portals you can see, which can show more portals, and so on. This allows heaps of tricks when the two rooms are not actually beside each other in the map – even mirrors are just a portal back into the same room but flipped.
I’m not sure if it preceded its main competitor BSP, which John Carmack popularized with Doom; the most famous portal engine was Duke3D’s Build, which came a lot later.
Actually, according to my lecturers there is a more efficient way of rendering portal space. Older engines do the whole ‘render everything in connected portals’ schtick and Valve probably do the same thing with Portal because the source engine is a dinosaur, but apparently you can work out what is where and what needs to be rendered much faster by calculating point locations based on an arbitrary origin (usually the player).
So yeah, take from that what you will.
Source is derived from GoldSource (HL1) which is derived from idTech 1 (Quake), and is therefore BSP based. The map file extension is even BSP.
And I’m not sure what your lecturer meant – you *have* to transform world space to camera space to render! That happens after the cull/display listing that the portal traversal does.
Oh, no, not that; I mean in working out what chunks of world are where in relation to other chunks of world. Apparently there's a couple of different ways to do it.
Ah right. I assume real portal renderers optimize the normal case, where the rooms are stored actually beside each other, so they can skip a matrix multiply, but I can't see that being a particularly big win; I'd guess testing if the portal is visible would be about 10 times more work. The only general solution I can think of (though I could be missing other ways, of course) is two matrices (one for each direction through) to concatenate onto the current world matrix as you render: essentially, how to move the light travelling through it to get to the other side. I've just noticed that should let you create a portal that scales: walk through and end up in the same room, at 1/10 the size :).
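Here's a rough sketch of that "one matrix per direction" idea in Python, using row vectors (p' = p * M). The matrix helpers and the specific portal (shift 100 units in x, shrink to 1/10 size) are made up for illustration:

```python
# Going through a portal just concatenates that portal's 4x4 transform
# onto the current world matrix; a scaling portal is nothing special.

def mat_mul(a, b):
    """4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def vec_mat(v, m):
    """Row vector times 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def translation(tx, ty, tz):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [tx, ty, tz, 1]]

def uniform_scale(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]

# A portal that moves you 100 units along x AND shrinks you to 1/10 size:
portal = mat_mul(uniform_scale(0.1), translation(100, 0, 0))
```

The matrix for the other direction through the portal would just be the inverse (scale by 10, translate back).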
So, today we’ve got identity matrices and transformation matrices. I assume that next time you’ll tell us about the Matrix of Leadership?
I immediately tried to think of ways to get my wife to read this… but alas, it is not to be. (Not. Even. Close.)
And really, with all these discussions about vectors, even mentioning direction and magnitude, how can I possibly be the first to mention Despicable Me??!? “Squidlauncher! Ohyeaaah!!”
On a vaguely related note, more posts like this (dense mathy stuff, relatively speaking) would be nice. Your other posts just get WAY too many comments these days. I miss our tightknit community of old (none of whom I actually know personally, of course). Do I need to make some kind of “get off my* lawn” reference now?
*technically Shamus’ lawn, I suppose…
The red square actually does technically have a use, called homogenization. It's used with homogenized vectors, though, which have a fourth value, which in my class was referred to as 'h'. (Yes, they were very inventive with the naming conventions. At least it wasn't 't'; we'd never have heard the end of "time is the fourth dimension" if they'd done that.)
The idea is that:
(x, y, z, 1)
Meant that this was homogenized to 1, and was technically in the same location (in 3D space, the same way that the z-value makes no difference in 2D space; more on that later) as:
(2x, 2y, 2z, 2)
Because to compare the two points, you have to homogenize the last value, and flatten it into the 3rd dimension.
Why this level of complexity?
Perspective-projection translation.
When you have those 8 vertices, for example, we need to figure out how they look in 2D space; rather, where they end up from your viewpoint if, say, you're down behind the xy-plane on the z-axis (so, again, negative/positive depending on your viewpoint, as you've mentioned). To do this, we need a matrix that removes the z-dimension (zeroing all the z-values, if I remember correctly; forgive me, I've forgotten the exact matrix values, but I think the far-right column, the one with the scalars, changes based on where you are looking from), but the multiplications mess with the homogenization value of the vector. Then you homogenize them all back to the same homogenization value (by convention, 1; achieved by [x/h, y/h, z/h, h/h]).
Once you calculate these values, you plot them on the xy-plane, and thus you can have 3D on a 2D screen. And that is why, by convention, that red box value is 1 unless required otherwise. Though you may not even need to homogenize the value yourself: if you hand off to OpenGL or such, it'll probably just assume these are all homogenized to 1 anyway. [/Pedantic]
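As a sketch, the homogenize step being described is just a divide-through by h (a minimal Python version, assuming the [x, y, z, h] layout used above):

```python
# Divide every component by h so the point lands back on the h == 1
# hyperplane; (2x, 2y, 2z, 2) and (x, y, z, 1) become identical.

def homogenize(v):
    x, y, z, h = v
    return [x / h, y / h, z / h, 1.0]
```
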
tl;dr: Thanks for letting me dork out here, Shamus. I had an entire class of stuff on this (we had to complete small geometric translations, rotations, and perspective projections, then plot them out on a graph, and finally connect all the vertices as appropriate; granted, no texture stuff to work with, but that would've been just killer during a 2-hour final exam), and this was a nice chance to contribute.
EDIT:
Right, so the idea is that, for a point represented by a vector:
(x, y, z, h) where h = 1
To see where it goes after projecting it as if we are looking from x = 2, y = 3, z = 4:
[1 0 0 0]
[0 1 0 0]
[0 0 1 1/4]
[2 3 0 0]
Note: This is assuming you consider your vectors arranged as shown above [(x, y, z, h)]. If you go vertical instead, prepare for fun formatting times.
This is even more fun for stereoscopic calculations!
Or if the user wants to, you know, rotate his view. This probably explains why cameras in games are so wonky, because on top of this, you need collision detection to make sure you can actually see something useful.
For the record, trying to do this by hand is much slower than the 2 fps from Railroad to Nowhere.
Games use 3 stages to convert points: the World matrix, the View matrix and the Projection matrix. The World matrix transforms points to world space (i.e. where they are in the world), the View transforms them to view space (i.e. where they are in the viewing frustum, the volume that represents the space the camera can see), and finally the Projection transforms them to projection space (i.e. where they are in the 2D space that is the screen).
And then you get helper methods like CreateLookAt(cameraPos, cameraTarget, cameraUp) to make the View matrix, a whole bunch to translate/rotate/scale the World and Create____Projection(width, height, farClip, nearClip) for the Projection (Isometric, Perspective, whatever).
Having the screen be upside-down is then as easy as remaking the View matrix with the CameraUp pointing down.
So: PointInScreen3D = Vertex4D * World4x4 * View4x4 * Projection4x4. (Technically, it's PointInScreen2D, but since people prefer not to draw points that get hidden behind stuff, depth is tracked till the end.)
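Here is a toy end-to-end run of that formula in Python (row-vector convention; the matrix values are invented for illustration, with the View left as identity and a bare-bones projection that drops z into w):

```python
def mat_mul(a, b):
    """4x4 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def vec_mat(v, m):
    """Row vector times 4x4 matrix."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

IDENTITY = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

world = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [5, 0, 0, 1]]   # model at x = 5
view = IDENTITY                                                     # camera at the origin
proj = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0.5], [0, 0, 0, 0]]  # toy perspective

def to_screen(vertex):
    wvp = mat_mul(mat_mul(world, view), proj)
    x, y, z, w = vec_mat(vertex, wvp)
    return [x / w, y / w, z / w, 1.0]   # the final homogenize / perspective divide
```

A model-space vertex at (-3, 4, 8) ends up at world x = 2, and the divide by w = 4 squeezes it toward the centre of the screen, which is the perspective effect.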
Wonky cameras are usually ’cause the dev can’t really guess all the possible combinations of target/camera/obstacles, so it’ll sometimes go where it has no escape. Or they were lazy. Or the game stuttered and the camera managed to get behind the wall without triggering collision. Or etc.
And then you have to worry about which combination of the WVP Matrix to apply to your data. Texture shadows, for instance, have some interesting elements to them.
Ah, didn’t know that about all that abstraction.
Still, I stand by the idea that the red box should be a one, then, unless you are writing projection / homogeneous-coordinate code. I've never seen reason not to do it.
I think Shamus might be using a different kind of matrix math to the one I use, ’cause when I worked it out according to my mental model (which I copied off Wikipedia), the red box must always be 1, being an affine transformation and all that. The yellow boxes are for translation, scaling is done on the major diagonal, rotation is sin/cos over the gray and green is unused.
The model I discussed is inherited from a graphics engine I used in the 90’s. I assumed this was a common way of doing things.
I sort of dislike doing scaling in the 3×3. You can’t just plug in your sin / cos values, but have to preserve the scaling. And clearing the scaling means normalizing the vectors. It’s not wrong to do it that way, but if you’re actually scaling stuff on a regular basis it seems like it would waste CPU cycles to do the scaling in the 3×3.
And yeah, I never used the red box because I never had to build a projection matrix by hand. I’ve always left that to the engine, or to OpenGL.
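For anyone curious what "clearing the scaling means normalizing the vectors" looks like in practice, here's a small Python sketch (the function names are mine, not from any engine):

```python
import math

# When scale lives in the 3x3, each basis row's length IS that axis's
# scale factor; stripping the scale back out means normalizing the rows.

def row_scales(m3):
    return [math.sqrt(sum(c * c for c in row)) for row in m3]

def clear_scale(m3):
    return [[c / s for c in row]
            for row, s in zip(m3, row_scales(m3))]
```

This is the extra per-frame work being described: if you never bake scale into the 3×3, you can plug sin/cos values straight in and skip all of it.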
I’m still not going to vacuum behind the sofa, though.
Which just goes to prove programmers suck at the whole “doing things the same for everyone” thing.
And go vacuum behind that sofa, before the rats start making nests. Most sofas are pretty light, you can lift’em with one hand while vacuuming with the other! :D
All the matrix transform methods I’ve seen build a fresh matrix, and expect you to concatenate them. It’s certainly easier to use.
I assume you threw away the w coordinate somehow? If that was random data going to the card, you should be drawing nonsense.
I must plead ignorance here. It’s been… erm. Eight years(?) since I saw that code. It’s all hazy now.
I’d guess you start with fresh matrices and vectors with all the values properly init’d, and then so long as you don’t explicitly change it it should take care of itself.
Fair enough.
I reserve the right to cringe if I see the sofa slanted at a different angle than the floor. :p
Once you work in 4×4 matrices though, it is hard to go back, yeah?
Cool post, Shamus. Did you know: the first derivative of a position vector with respect to time is velocity. The second derivative is acceleration. The third derivative is jerk. So if someone is calling you a third derivative, they are calling you a jerk, mathematically.
All together now: secant! tangent! cosine! sine! three point one four one five nine! Wheeee……!
Well, derivative humor is a bit of a jerk move at the best of times.
P.S. Yay Portal 2!
Thanks for the post, Shamus! I was looking into graphics programming recently myself, and everything I found mentioned matrices. And now I have (between the post and some of the comments) a basic understanding of what these matrices are and what they’re doing. Hurray!
Of course, like everything in life, it is more complex than a simple example can show. It seems obvious that X, Y, Z can define an axis and you can use an angle for rotation around it, making the full state of an object Position(X, Y, Z) and Orientation(X, Y, Z, A) (where A is the angle). The problem with this is that the orientation will suffer from something called "gimbal lock", which will prevent proper rotation in some situations.
Because of this, “quaternions” are commonly used for orientation (and it is a big topic: http://www.gamedev.net/page/resources/_/reference/programming/mathandphysics/quaternions/quaternionpowersr1095 )
The basic matrix math can take you a long way, but the first time you encounter gimbal lock you will think your code is buggy, when in fact the simple representation doesn’t encode enough information to avoid it in the *math* itself.
That’s a feature of the basic matrix math, not a bug. <_<
For those wondering, "gimbal lock" refers to when you lose an entire axis of rotation 'cause of how the rotations combine. The typical example is how, when you look straight up in an FPS and then try to look to the side, instead of doing an arc the view simply rotates.
To explain, assume you're standing flat on the floor, arms outstretched, looking ahead, and you have 3 axes: X, which is up-down rotation and goes along your arms; Y, which is left-right and goes from head to toe; and Z, which is view-rotation and goes straight ahead.
X-rotation is tilting your head up/down, Y-rotation is turning on the spot, Z-rotation is tilting your head to the side.
If you turn on the spot (Y-rotation), you'll look at the stuff to the left/right. Now look straight up or down (X-rotation). If you turn on the spot now (Y-rotation), instead of looking left/right as normal, you'll just rotate your view, as if you were tilting your head (Z-rotation). In other words, you lost your capability to rotate along Y 'cause it got locked into Z.
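You can watch this happen numerically. A hedged sketch in Python (using one common yaw-pitch-roll order, Ry·Rx·Rz; which order a given engine uses varies): with pitch at 90 degrees, two different yaw/roll pairs produce the exact same orientation matrix.

```python
import math

def rx(a):  # pitch: rotate about x
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def ry(a):  # yaw: rotate about y
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rz(a):  # roll: rotate about z
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def mul(a, b):
    """3x3 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def orientation(yaw, pitch, roll):
    return mul(ry(yaw), mul(rx(pitch), rz(roll)))

def close(a, b, eps=1e-9):
    return all(abs(a[i][j] - b[i][j]) < eps
               for i in range(3) for j in range(3))
```

With pitch at 90°, only (yaw minus roll) matters, so yaw 30° / roll 20° is indistinguishable from yaw 10° / roll 0°: a whole axis of control has vanished.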
“In 3D, one of those annoyances is in deciding what x, y, and z mean. X is the one variable we mostly agree on. X is for east / west positioning, although nobody can say for sure if positive x should go east and negative x should go west, or the other way around. It gets even trickier with y and z, since we can't agree on which one is up and which one controls north / south movement, and which way they point. ”
While I never delved under the hood of 3D animation, to me it's always been understood that 'y' is the constant, as it is always the 'height'. I've never known it to be viewed as anything else. And since I've only ever worked in Maya, they've always been known as 'vertices' to me, but that's just 'cause of the whole 'every 3D program uses its own language' nonsense.
A vertex is the information related to one corner of a triangle, information that includes a 3D vector for the position, but normally also one or more texture coordinates, a normal and some animation rigging information.
Funny you should say that.
In my world z is always the vertical axis, and x and y are the plane, with x being the direction something is moving in. Except if you have vehicle-fixed coordinates, where x also points in the direction the vehicle is made to move, z points to the "belly" and y to the side.
For me, it was:
+y

……/ z
…./
../
/
———— +x
+z goes towards the screen.
Right hand rule, right?
http://xkcd.com/199/
Though I think this explains my one issue with programming: despite the idea that things should build on each other and not conflict with conceptual models, EVERYONE who programs wants to do it their way or the highway. It's why HTML is still such a mess, why 3D systems will likely remain a mess, and why the uncanny valley is always going to be inescapable.
There will always be someone in the “perfect” simulation going “What do you mean, by convention, reality won’t let me do this this way?”
I guess for screen-related coordinates yours makes perfect sense (I'd have used x instead of z, but since everyone uses x and y for pixel coordinates, that'd be confusing). But for world coordinates I wondered why all the graphics guys want to use y as the vertical coordinate, and I guess I've found the reason in screen coordinates. If you (like I) had drawn coordinate systems on paper on a desk, not on a vertical screen, x and y would naturally be in the horizontal plane, but if 90% of your coordinate systems are on a screen, then I understand why x goes sideways and y is vertical.
But then, as long as everyone uses a right-hand system and properly declares the coordinate system, it's all the same.
“The Minecraft Guy is made from six cubes, which means means 72 triangles…”
Don’t think you meant to type means twice, though I may have been wrong. The post is very interesting, thanks for it.
Actually, a question – why do we use triangles over other shapes in 3D graphic rendering code? It’s never made sense to me…
There are some aspects of building upon foundations that make triangles practical in terms of rendering efficiently. But the main reason is that any polygon or polyhedron surface (squares, dodecahedra, cubes, and yes, other triangles) can be broken down into a set of triangles, and triangles are always planar (flat).
A surface being planar makes calculating things like clipping, lighting etc. much simpler, and if you break everything down to small enough triangles, people cannot see that the apparently smooth surface actually contains a lot of little sharp edges.
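The "break it into triangles" step itself is tiny. For a convex polygon it's just a fan from the first vertex (a minimal sketch; concave polygons need a fancier algorithm like ear clipping):

```python
# A convex polygon with n vertices fans out into n - 2 triangles, all
# sharing vertex 0; each triangle is guaranteed planar by construction.

def fan_triangulate(vertex_indices):
    v = vertex_indices
    return [(v[0], v[i], v[i + 1]) for i in range(1, len(v) - 1)]
```

A quad becomes two triangles, which is exactly why each six-faced Minecraft cube in the post costs twelve triangles.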
If you really want to break your brain, consider what happens if you (usually by accident) mix up your transformations between a right-handed and a left-handed coordinate system. What was the x-dimension again? I have made some funky-looking teapots that way in graphics classes.
Wait, wait, WAIT!
Have your programming posts always had a picture of a scene from the Matrix in them, or did you change it to that just for this post?
If the latter, then man it took me a long time to pick up on it.
If the former, then….whoa
Any chance you could do a similar post on quaternions? I never could grok those.
I don’t think anyone groks quaternions. Our puny human brains can’t really handle 4D constructs.
Quaternions aren't bad; you just have to stop thinking of them as vectors and think of them as a kind of big brother to the complex numbers (if you can't wrap your head around complex arithmetic, then you're hosed with quaternions).
Let's review complex numbers for a moment: complex numbers can be thought of as a sum of a real and an imaginary number, an imaginary number being a real number multiplied by a constant i which has the property that i² = -1. So let's see how we add and multiply two arbitrary complex numbers a+bi and c+di: (a+bi) + (c+di) = (a+c) + (b+d)i, and (a+bi)(c+di) = (ac - bd) + (ad + bc)i.
Quaternions are the same, except that instead of adding up a real and an imaginary number, you add up a real number, an imaginary number, a "jmaginary" number and a "kmaginary" number. These last two are just like imaginary numbers, except they're reals multiplied by the magic numbers j and k, respectively. They both square to -1 just like i, but when you multiply the three magic numbers i, j and k together, they yield different results depending on their order (they're "non-commutative").
Particularly, ij = k, but ji = -k; jk = i, but kj = -i; and ki = j, but ik = -j. This takes a little bit of memorising, but you can eventually do it with some practise. Better yet, you can let the computer do it. At any rate, this weird non-commutative property of these numbers looks a lot like rotation in 3D space!
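Those sign rules are all you need to multiply two arbitrary quaternions. A small Python sketch (quaternions stored as (w, x, y, z) tuples, i.e. w + xi + yj + zk; that layout is just one common convention):

```python
# Expand (aw + ax*i + ay*j + az*k)(bw + bx*i + by*j + bz*k) and collect
# terms using i*i = j*j = k*k = -1, ij = k, ji = -k, jk = i, kj = -i,
# ki = j, ik = -j.

def qmul(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw * bw - ax * bx - ay * by - az * bz,
            aw * bx + ax * bw + ay * bz - az * by,
            aw * by - ax * bz + ay * bw + az * bx,
            aw * bz + ax * by - ay * bx + az * bw)
```

Note that qmul(i, j) gives k while qmul(j, i) gives -k, so the non-commutativity falls straight out of the arithmetic.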
I’ll explain better how you actually represent rotations with quaternions later on, since it’s getting pretty late…
BWahaha! Thank you so much for “jmaginary” and “kmanginary”! :D
But damn, I was trying to avoid learning the logic behind Quaternions once I found out they were “very” complex numbers. Now I have to figure out how j and k are constructed :(.
All three of i, j and k are square roots of minus one. When you multiply a real number by any of these numbers, it rotates the real by ninety degrees, but in 4D space you have three orthogonal directions you can rotate 90° in; those three directions are the aforementioned i, j and k.
Sure, I can intuitively get how complex numbers can be extended (into what are amusingly known as "hypercomplex numbers"); it is simply defining an algebra over a field, and both of those are pretty simple concepts, so long as you don't try to use your puny animal brain to "visualize" things.
I can also be happy with all 3 of i, j and k being square roots of -1 without being equivalent, so long as you have more than 2 dimensions to let them "spread out" ([1] explains fairly well), but I've never really understood how you can construct i, in the same way I get Cauchy-sequence or Dedekind-cut constructions of the reals, for example. While I know the "just get all the axioms to work" method of doing maths is common (or was; I hear proof programs are fairly constructivist), I'm tainted by programming to expect there to be things under the hood, as it were.
[1] http://en.wikipedia.org/wiki/Quaternion#Square_roots_of_.E2.88.921
Since no one brought it up yet, I just have to post this link, which explains rotation matrices with much fewer words than I could ever manage:
http://xkcd.com/184/
It doesn't seem fair to say that "you have to already understand the subject before you can comprehend the lessons to teach you the subject". "The subject" is "matrix maths as applied to 3D graphics"; expecting someone to already understand the basics of matrix maths doesn't seem unreasonable.
Also, your 4×4 “identity matrix”? With the 4th column full of ones? That’s just wrong, man. I’m surprised that nobody else commented on that.
Okay. I’ll email the authors of the graphics engines I’ve used and let them know that some guy on the internet says their engines are wrong.
Maybe nobody commented because it’s not as wrong as you think.
Identity Matrix. I’m not saying that technique isn’t commonly used. I’m saying that calling it an identity matrix is wrong, from a mathematical point of view. The first matrix, with just the diagonal? That was an actual identity matrix. Given that a number of commenters seemed to be maths types rather than programmers, it is a little surprising that none of them mentioned it in almost two hundred comments.
Also, the technique as described does seem a little screwy, too. It looks like the yellow boxes should be translating the result, and the green ones should be doing nothing, assuming normal matrix multiplication. Obviously what games have been doing for the last 20 years works, though. Maybe you misremembered it (you mentioned above that you hadn't used the code in years before writing this post), maybe OpenGL is doing something non-standard under the hood, or maybe I'm just looking at it funny, but something doesn't seem right.