3D Matrix Transformations Guide: Translation, Rotation & Scaling

🧮 Understanding 3D Graphics Matrix Transformations

In 3D graphics development, game engines, and modeling software (like Autodesk Maya), matrices are everywhere. Translation, scaling, and rotation are all implemented as matrix operations under the hood.

Why? Because a single matrix can represent a whole chain of transformations. Several transformations are multiplied together into one matrix ahead of time, and every point is then transformed with one multiplication instead of one pass per operation.

I. What is a Matrix?

A matrix is a rectangular array of numbers, for example:

[0.2  3   1.7  0.3]
[-0.6 2   0.1  1.2]
[-0.9 0   1    0.4]
[1.2  4   0.6  1  ]

Matrices can have different dimensions, such as 2×2, 3×3, or 4×4. A vector can be treated as a matrix with a single row or a single column, for example [1, 12, 6].

In three-dimensional space, we usually use a 4×4 matrix to represent the position, rotation, and scale of an object.

II. Identity Matrix

When an object has not undergone any transformation (i.e., is at the origin (0,0,0), has no rotation, and is scaled to 1,1,1), its matrix is:

[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
[0 0 0 1]

This is the Identity Matrix. It represents “nothing happening.” Any vector multiplied by the identity matrix remains unchanged.
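To see “nothing happening” in code, here is a minimal Python sketch. The helper name vec_mat and the sample point are made up for illustration, and the point is written as a row vector with four components so that it matches the 4×4 matrix:

```python
def vec_mat(v, m):
    """Multiply a 4-component row vector v by a 4x4 matrix m (v x m)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

IDENTITY = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
]

point = [2.5, -1.0, 4.0, 1.0]  # arbitrary sample values
unchanged = vec_mat(point, IDENTITY)

print(unchanged == point)  # True
```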

III. Scaling

If we want to scale an object by a factor of 3 in the Y-axis, we can represent it as:

[1 0 0 0]
[0 3 0 0]
[0 0 1 0]
[0 0 0 1]

The values on the diagonal represent the scaling ratios for each axis (Sx, Sy, Sz).
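As a quick check of the Y-axis example, the sketch below builds the scaling matrix from (Sx, Sy, Sz) and applies it to a sample point. The function names scaling and vec_mat and the sample point are illustrative assumptions, not part of any particular engine:

```python
def vec_mat(v, m):
    """Multiply a 4-component row vector v by a 4x4 matrix m (v x m)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def scaling(sx, sy, sz):
    """Build a 4x4 scaling matrix with the factors on the diagonal."""
    return [
        [sx, 0, 0, 0],
        [0, sy, 0, 0],
        [0, 0, sz, 0],
        [0, 0, 0, 1],
    ]

scale_y3 = scaling(1, 3, 1)   # the matrix shown above
scaled = vec_mat([2, 5, -4, 1], scale_y3)

print(scaled)  # [2, 15, -4, 1]  (only Y is multiplied by 3)
```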

IV. Translation

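Translation is stored in the last row of the matrix when points are written as row vectors (the convention used in this guide), or in the last column when points are written as column vectors. For example, to move an object 6 units along the X-axis and -1 unit along the Y-axis: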

[1 0 0 0]
[0 1 0 0]
[0 0 1 0]
[6 -1 0 1]

Thus, a point (0,0,0) transformed by this matrix will move to (6,-1,0).
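The same move can be sketched in Python. This assumes points are row vectors of the form [x, y, z, 1] multiplied on the left of the matrix. The helper names translation and vec_mat are hypothetical:

```python
def vec_mat(v, m):
    """Multiply a 4-component row vector v by a 4x4 matrix m (v x m)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def translation(tx, ty, tz):
    """Build a 4x4 translation matrix with the offset in the last row."""
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [tx, ty, tz, 1],
    ]

move = translation(6, -1, 0)

moved_origin = vec_mat([0, 0, 0, 1], move)
moved_other = vec_mat([2, 2, 2, 1], move)

print(moved_origin)  # [6, -1, 0, 1]
print(moved_other)   # [8, 1, 2, 1]
```

Every point is shifted by the same offset, which is exactly what distinguishes translation from rotation and scaling.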

V. Rotation

Rotation is the most complex but also the most interesting part of matrices. Each three-dimensional object has X, Y, and Z rotation axes, and we usually need to calculate a rotation matrix for each axis separately.

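For example, with points written as row vectors (the convention used for translation above), the matrix for rotating counterclockwise around the Z-axis by an angle θ is:

[ cosθ  sinθ  0  0]
[-sinθ  cosθ  0  0]
[0      0     1  0]
[0      0     0  1]

With column vectors the matrix is transposed, so the two sinθ entries swap signs.

If we rotate by 40°, then:

cos(40°) ≈ 0.77
sin(40°) ≈ 0.64

Substituting:

[ 0.77  0.64  0  0]
[-0.64  0.77  0  0]
[0      0     1  0]
[0      0     0  1]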
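Here is a small Python sketch of a Z rotation. It assumes the row-vector convention ([x, y, z, 1] × M), under which a counterclockwise rotation carries +sinθ in the first row and -sinθ in the second. The names rotation_z and vec_mat are illustrative:

```python
import math

def vec_mat(v, m):
    """Multiply a 4-component row vector v by a 4x4 matrix m (v x m)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def rotation_z(degrees):
    """Counterclockwise rotation about Z for row vectors (assumed convention)."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    return [
        [c, s, 0.0, 0.0],
        [-s, c, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [0.0, 0.0, 0.0, 1.0],
    ]

rot40 = rotation_z(40)
rotated = vec_mat([1.0, 0.0, 0.0, 1.0], rot40)

print([round(x, 3) for x in rotated])  # [0.766, 0.643, 0.0, 1.0]
```

The point (1, 0, 0) on the X-axis swings toward the positive Y-axis, which is what a counterclockwise rotation about Z should do.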

VI. Combining Rotations: The Order Matters!

In actual 3D systems (like Maya, Unreal, Unity), the order of rotations is critical. The final rotation matrix is usually obtained by multiplying the X, Y, and Z rotation matrices in a fixed sequence. Matrix multiplication is not commutative, so different multiplication orders yield different results. This is one reason the same three angles can produce different orientations in different engines.

For example:

RotateX → RotateY → RotateZ

and

RotateZ → RotateY → RotateX

can result in completely different poses.
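A two-axis version is enough to show the effect. The sketch below rotates the point (0, 1, 0) by 90° about X and 90° about Z in both orders. It assumes row vectors, so the leftmost matrix in a product is applied first, and all function names are made up for illustration:

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices (a x b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def vec_mat(v, m):
    """Multiply a 4-component row vector v by a 4x4 matrix m (v x m)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def rotation_x(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]

def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

p = [0, 1, 0, 1]
rx, rz = rotation_x(90), rotation_z(90)

x_then_z = vec_mat(p, mat_mul(rx, rz))  # rotate about X first, then Z
z_then_x = vec_mat(p, mat_mul(rz, rx))  # rotate about Z first, then X

print([round(v, 6) for v in x_then_z[:3]])  # approximately (0, 0, 1)
print([round(v, 6) for v in z_then_x[:3]])  # approximately (-1, 0, 0)
```

Same point, same two angles, two different results: the only thing that changed is the multiplication order.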

🧩 Why Use 4×4 Matrices in Three-Dimensional Space?

In 3D space, the transformations we think of most intuitively are:

Translation
Rotation
Scaling

Rotation and scaling can be represented by a 3×3 matrix because they only change the relative position or size of points, without moving the origin. So why do we need an extra row and column, becoming 4×4?

3×3 Matrices Can Represent Rotation and Scaling, But Not Translation

Let's look at a change in point coordinates:

P = [x, y, z]

Suppose we only rotate or scale it:

P' = P × M3x3

OK, no problem. But if we want to “translate” it, for example, move it 5 units to the right and 2 units up, mathematically:

P' = P × M3x3 + T

Here T is the translation vector [tx, ty, tz]. This means we need a matrix multiplication plus a vector addition, which cannot be expressed as a single 3×3 matrix multiplication: a 3×3 matrix always maps the origin [0, 0, 0] to itself.
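The two-step nature of this formula is easy to see in a short Python sketch. It uses a uniform scale of 2 as the 3×3 part and assumes “right” means +X and “up” means +Y. The names vec_mat3 and transform are hypothetical:

```python
def vec_mat3(v, m):
    """Multiply a 3-component row vector v by a 3x3 matrix m (v x m)."""
    return [sum(v[k] * m[k][j] for k in range(3)) for j in range(3)]

SCALE_2 = [
    [2, 0, 0],
    [0, 2, 0],
    [0, 0, 2],
]
T = [5, 2, 0]  # assumed: 5 units along +X, 2 units along +Y

def transform(p):
    """Step 1: 3x3 multiplication. Step 2: separate vector addition."""
    linear = vec_mat3(p, SCALE_2)
    return [a + b for a, b in zip(linear, T)]

origin_after_3x3 = vec_mat3([0, 0, 0], SCALE_2)
moved = transform([1, 1, 1])

print(origin_after_3x3)  # [0, 0, 0]  (a 3x3 matrix cannot move the origin)
print(moved)             # [7, 4, 2]
```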

Introducing “Homogeneous Coordinates”

To allow all transformations to be done with a single matrix multiplication, we add a fourth component w, extending the three-dimensional coordinates [x, y, z] to [x, y, z, 1].

This extra component w (set to 1 for points) is called the homogeneous coordinate.

Thus, our transformation matrix also becomes 4×4:

[x' y' z' 1] = [x y z 1] × M4x4

At this point, rotation, scaling, and translation can all be uniformly implemented through matrix multiplication.
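As a rough illustration, the sketch below packs a 3×3 matrix and a translation vector into one 4×4 matrix and applies it with a single multiplication. The helper embed is a made-up name, and the values (uniform scale of 2, offset of 5 along X and 2 along Y) are sample assumptions:

```python
def vec_mat(v, m):
    """Multiply a 4-component row vector v by a 4x4 matrix m (v x m)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def embed(m3, t):
    """Put a 3x3 matrix in the top-left corner and a translation in the last row."""
    rows = [list(m3[i]) + [0] for i in range(3)]
    rows.append(list(t) + [1])
    return rows

M4 = embed([[2, 0, 0], [0, 2, 0], [0, 0, 2]], [5, 2, 0])

result = vec_mat([1, 1, 1, 1], M4)

print(M4[3])   # [5, 2, 0, 1]
print(result)  # [7, 4, 2, 1]
```

The trailing 1 in the point is what picks up the last row of the matrix, so the addition of the translation happens inside the multiplication itself.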

Structure of a 4×4 Matrix

A typical 4×4 transformation matrix looks like this:

[ R11  R12  R13  0 ]
[ R21  R22  R23  0 ]
[ R31  R32  R33  0 ]
[ Tx   Ty   Tz   1 ]

Let's look at it in parts:

Part	Meaning	Description
Top-left 3×3	Rotation + Scaling	Controls the orientation and size of the object
Last column	Generally `[0, 0, 0, 1]`	Maintains homogeneous coordinate format
Last row, first three entries `[Tx, Ty, Tz]`	Translation vector	Controls the position of the object in space
Bottom-right 1	Homogeneous coordinate	Keeps w = 1 after the multiplication, so a transformed point is still a point and further matrices can be chained

Example: A 4×4 Matrix for Translation + Rotation + Scaling

Suppose:

Scale by a factor of 2
Rotate 90° around the Z-axis
Translate (5, 3, 2)

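The corresponding matrix (row-vector convention, built as Scale × Rotate × Translate):

[  0   2   0   0 ]
[ -2   0   0   0 ]
[  0   0   2   0 ]
[  5   3   2   1 ]

Now, any point [x, y, z, 1] multiplied by this matrix:

Will first be scaled by a factor of 2
Then rotated 90° counterclockwise around the Z-axis
Finally moved by (5, 3, 2)

Because the scale is uniform, scaling and rotating could be swapped here without changing the result.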

All these transformations are done in one step!
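To double-check a combined matrix like this, it can be built from its three parts. The Python sketch below assumes row vectors, so the product Scale × Rotate × Translate applies the scale first and the translation last. All helper names are illustrative:

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices (a x b)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def vec_mat(v, m):
    """Multiply a 4-component row vector v by a 4x4 matrix m (v x m)."""
    return [sum(v[k] * m[k][j] for k in range(4)) for j in range(4)]

def scaling(s):
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]

def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, s, 0, 0], [-s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def translation(tx, ty, tz):
    return [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [tx, ty, tz, 1]]

combined = mat_mul(mat_mul(scaling(2), rotation_z(90)), translation(5, 3, 2))

# Round away floating-point noise; adding 0.0 turns -0.0 into 0.0.
rounded = [[round(x, 6) + 0.0 for x in row] for row in combined]
for row in rounded:
    print(row)

moved = [round(x, 6) + 0.0 for x in vec_mat([1, 0, 0, 1], combined)]
print(moved)  # [5.0, 5.0, 2.0, 1.0]
```

The point (1, 0, 0) is scaled to (2, 0, 0), rotated to (0, 2, 0), and then shifted to (5, 5, 2), all with one vector-matrix multiplication.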

Why Use 4×4?

✅ Summary of Reasons:

3×3 can only represent linear transformations (rotation, scaling, shear)
Translation is not linear because it moves the origin; it is an affine transformation and needs the extra “1” to be expressed as a multiplication
4×4 matrix = unified expression of linear transformation + translation + homogeneous coordinates
Can be conveniently concatenated with camera projection matrix and view matrix (all matrices are 4×4 for chained multiplication)

A Simple Summary

You can remember it like this:

Matrix Size	Use Case	What Can It Do
2×2	2D Rotation and Scaling	Plane transformation
3×3	3D Rotation and Scaling	No translation
4×4	3D Combined Transformation	Translation + Rotation + Scaling, all in one step

Conclusion

Matrices make geometric transformations in the 3D world elegant and efficient. A 4×4 matrix can simultaneously contain:

✅ Object position (Translation)
✅ Rotation angle (Rotation)
✅ Scaling ratio (Scaling)

That is, 16 numbers can describe the complete state of an object in three-dimensional space.
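Going the other way, the three pieces can be read back out of the 16 numbers. The sketch below is a simplified illustration that only holds under several assumptions: row-vector layout (translation in the last row), no shear, positive scale factors, and a rotation purely around the Z-axis. The sample matrix is a scale of 2, a 90° counterclockwise Z rotation, and a translation of (5, 3, 2):

```python
import math

M = [
    [0, 2, 0, 0],
    [-2, 0, 0, 0],
    [0, 0, 2, 0],
    [5, 3, 2, 1],
]

# Position: the first three entries of the last row.
position = M[3][:3]

# Scale: the length of each of the first three rows (their xyz part).
scale = [math.sqrt(sum(c * c for c in M[i][:3])) for i in range(3)]

# Z rotation angle: direction of the first row in the XY plane.
angle_z = math.degrees(math.atan2(M[0][1], M[0][0]))

print(position)        # [5, 3, 2]
print(scale)           # [2.0, 2.0, 2.0]
print(round(angle_z))  # 90
```

General rotations about several axes need a fuller decomposition than this; the sketch is only meant to show where each kind of information lives in the matrix.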

Final Thoughts

Understanding matrices is not only the foundation for learning 3D programming, but also the key to understanding rendering, animation, physics engines, and other fields. Next time you see a “Transform Matrix” in Blender, Maya, or Unreal, I hope you can smile and say:

“Ah, I know what it’s doing.”