3D Matrix Transformations Guide: Translation, Rotation & Scaling
🧮 Understanding 3D Graphics Matrix Transformations
In 3D graphics development, game engines, or modeling software (like Autodesk Maya), “Matrices” are everywhere. Whether it's Translation, Scaling, or Rotation, these operations are actually implemented through matrix operations at the underlying level.
Why? — Because matrix operations are extremely efficient in computers, capable of representing and combining multiple transformations at once, much faster than processing each operation individually.
I. What is a Matrix?
A matrix is a rectangular array of numbers, for example:
[0.2 3 1.7 0.3]
[-0.6 2 0.1 1.2]
[-0.9 0 1 0.4]
[1.2 4 0.6 1 ]
Matrices can have different dimensions, such as 2×2, 3×3, or 4×4. A Vector is actually a one-dimensional matrix, for example [1, 12, 6]
.
In three-dimensional space, we usually use a 4×4 matrix to represent the position, rotation, and scale of an object.
II. Identity Matrix
When an object has not undergone any transformation (i.e., is at the origin (0,0,0)
, has no rotation, and is scaled to 1,1,1
), its matrix is:
[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
[0 0 0 1]
This is the Identity Matrix. It represents “nothing happening.” Any vector multiplied by the identity matrix remains unchanged.
III. Scaling
If we want to scale an object by a factor of 3 in the Y-axis, we can represent it as:
[1 0 0 0]
[0 3 0 0]
[0 0 1 0]
[0 0 0 1]
The values on the diagonal represent the scaling ratios for each axis (Sx, Sy, Sz).
IV. Translation
Translation is represented by the last row (or last column) of the matrix. For example, to move an object 6 units along the X-axis and -1 units along the Y-axis:
[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
[6 -1 0 1]
Thus, a point (0,0,0)
transformed by this matrix will move to (6,-1,0)
.
V. Rotation
Rotation is the most complex but also the most interesting part of matrices. Each three-dimensional object has X, Y, and Z rotation axes, and we usually need to calculate a rotation matrix for each axis separately.
For example, the matrix for rotating around the Z-axis by an angle θ is:
[cosθ -sinθ 0 0]
[sinθ cosθ 0 0]
[0 0 1 0]
[0 0 0 1]
If we rotate by 40°, then:
cos(40°) ≈ 0.76
sin(40°) ≈ 0.64
Substituting:
[0.76 -0.64 0 0]
[0.64 0.76 0 0]
[0 0 1 0]
[0 0 0 1]
VI. Combining Rotations: The Order Matters!
In actual 3D systems (like Maya, Unreal, Unity), the order of rotations is critical. We usually use rotation matrices obtained by multiplying the rotation matrices of the X, Y, and Z directions in sequence. Different multiplication orders yield different results—this is why the same angles may appear different in different engines.
For example:
RotateX → RotateY → RotateZ
and
RotateZ → RotateY → RotateX
can result in completely different poses.
A very good question 👍 — this is actually one of the core concepts of understanding 3D graphics mathematics.
Let's explain it systematically:
🧩 Why Use 4×4 Matrices in Three-Dimensional Space?
In 3D space, the transformations we think of most intuitively are:
- Translation
- Rotation
- Scaling
Rotation and scaling can be represented by a 3×3 matrix because they only change the relative position or size of points, without moving the origin. So why do we need an extra row and column, becoming 4×4?
3×3 Matrices Can Represent Rotation and Scaling, But Not Translation
Let's look at a change in point coordinates:
P = [x, y, z]
Suppose we only rotate or scale it:
P' = M3x3 × P
OK, no problem. But if we want to “translate” it, for example, move it 5 units to the right and 2 units up, mathematically:
P' = M3x3 × P + T
Here T
is the translation vector [tx, ty, tz]
. This means we need to do matrix multiplication + vector addition, which cannot be done with a single matrix.
Introducing “Homogeneous Coordinates”
To allow all transformations to be done with a single matrix multiplication, we introduce a fourth dimension w, transforming the three-dimensional coordinates [x, y, z]
into [x, y, z, 1]
.
This “1” is called the homogeneous coordinate.
Thus, our transformation matrix also becomes 4×4:
[x' y' z' 1] = [x y z 1] × M4x4
At this point, rotation, scaling, and translation can all be uniformly implemented through matrix multiplication.
Structure of a 4×4 Matrix
A typical 4×4 transformation matrix looks like this:
[ R11 R12 R13 0 ]
[ R21 R22 R23 0 ]
[ R31 R32 R33 0 ]
[ Tx Ty Tz 1 ]
Let's look at it in parts:
Part | Meaning | Description |
---|---|---|
Top-left 3×3 | Rotation + Scaling | Controls the orientation and size of the object |
Last column | Generally [0, 0, 0, 1] | Maintains homogeneous coordinate format |
Last row first three [Tx, Ty, Tz] | Translation vector | Controls the position of the object in space |
Bottom-right 1 | Homogeneous coordinate identifier | Ensures the matrix is invertible and can distinguish translation from rotation |
Example: A 4×4 Matrix for Translation + Rotation + Scaling
Suppose:
- Scale by a factor of 2
- Rotate 90° around the Z-axis
- Translate (5, 3, 2)
The corresponding matrix:
[ 0 -2 0 0 ]
[ 2 0 0 0 ]
[ 0 0 2 0 ]
[ 5 3 2 1 ]
Now, any point [x, y, z, 1]
multiplied by this matrix:
- Will first be rotated 90° (around the Z-axis)
- Then scaled by a factor of 2
- Finally moved to (5, 3, 2)
All these transformations are done in one step!
Why Use 4×4?
✅ Summary of Reasons:
- 3×3 can only represent linear transformations (rotation, scaling, shear)
- Translation is an affine transformation, requiring an extra “1” for unified calculation
- 4×4 matrix = unified expression of linear transformation + translation + homogeneous coordinates
- Can be conveniently concatenated with camera projection matrix and view matrix (all matrices are 4×4 for chained multiplication)
A Simple Summary
You can remember it like this:
Matrix Size | Use Case | What Can It Do |
---|---|---|
2×2 | 2D Rotation and Scaling | Plane transformation |
3×3 | 3D Rotation and Scaling | No translation |
4×4 | 3D Combined Transformation | Translation + Rotation + Scaling, all in one step |
Conclusion
Matrices make geometric transformations in the 3D world elegant and efficient. A 4×4 matrix can simultaneously contain:
- ✅ Object position (Translation)
- ✅ Rotation angle (Rotation)
- ✅ Scaling ratio (Scaling)
That is, 16 numbers can describe the complete state of an object in three-dimensional space.
Final Thoughts
Understanding matrices is not only the foundation for learning 3D programming, but also the key to understanding rendering, animation, physics engines, and other fields. Next time you see a “Transform Matrix” in Blender, Maya, or Unreal, I hope you can smile and say:
“Ah, I know what it’s doing.”