Simultaneous Triangularization When Rank(AB-BA) = 1

by Pedro Alvarez

Hey guys! Ever stumbled upon a matrix problem that just makes you scratch your head? Well, today we're diving deep into a fascinating one in linear algebra. We're going to explore what happens when the rank of AB - BA equals 1, and how that tells us something really cool about matrices A and B: they can be simultaneously triangularized. Buckle up, because this is going to be a fun ride!

Understanding the Problem: When Rank(AB - BA) = 1

So, let’s break down the core of the problem. We're given two matrices, A and B, both of size n x n, with complex entries. The expression AB - BA is known as the commutator of A and B. It measures, in a way, how much the matrices fail to commute. If AB - BA = 0, then A and B commute perfectly, which simplifies things quite a bit. But what if AB - BA isn't zero? That’s where the rank comes in.

The rank of a matrix tells us the number of linearly independent rows (or columns) it has. A rank of 1 means that the matrix, in a sense, is highly “collapsed” – it has a lot of dependency between its rows and columns. In our case, rank(AB - BA) = 1 is a very specific condition. It tells us that the “non-commutativity” between A and B is constrained in a particular way. This constraint, surprisingly, leads to a powerful conclusion.

Our mission, should we choose to accept it (and we do!), is to prove that if rank(AB - BA) = 1, then A and B are simultaneously triangularizable. What does that mean? It means we can find a single invertible matrix P such that both P⁻ÂčAP and P⁻ÂčBP are upper triangular matrices. Upper triangular matrices are those where all the entries below the main diagonal are zero. Triangular matrices are much easier to work with in many situations, especially when finding eigenvalues and eigenvectors. Simultaneous triangularization is a strong property – among other things, it implies that A and B share a common eigenvector (the first column of P) and, in fact, a whole nested chain of common invariant subspaces.
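Before we go further, here's a quick numerical illustration of what that definition buys us. This is just a sketch in Python with numpy – the matrices A, B, and P below are hand-picked for illustration, not produced by any proof:

import numpy as np

# Illustrative matrices: A, B, P are chosen by hand so that conjugating
# by P gives upper triangular results.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
B = np.array([[1.0, 4.0],
              [0.0, 5.0]])
P = np.array([[1.0, 1.0],
              [0.0, 1.0]])

P_inv = np.linalg.inv(P)
TA = P_inv @ A @ P
TB = P_inv @ B @ P

def is_upper_triangular(M, tol=1e-10):
    # upper triangular = everything strictly below the diagonal vanishes
    return bool(np.all(np.abs(np.tril(M, k=-1)) < tol))

print(is_upper_triangular(TA), is_upper_triangular(TB))   # True True

The helper just tests the definition directly: all entries strictly below the main diagonal must be (numerically) zero.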

This problem is a generalization of a classic result: if AB = BA (i.e., A and B commute), then they are simultaneously triangularizable. Our condition, rank(AB - BA) = 1, is a relaxed version of commutativity. It turns out that even this weaker condition is enough to guarantee simultaneous triangularizability. That’s pretty neat, right?

The Induction Approach: A Step-by-Step Strategy

Now, how do we actually prove this thing? The suggested approach is to use mathematical induction on n, the size of the matrices. Induction is a powerful proof technique that works like a chain reaction. We first prove the statement for a base case (usually a small value of n), and then we show that if the statement is true for some n = k, it must also be true for n = k + 1. This “inductive step” is the heart of the proof. If we can pull this off, then the statement holds for all n greater than or equal to our base case.

Think of it like climbing a ladder. The base case is getting onto the first rung. The inductive step is showing that if you're on any rung, you can always climb to the next one. If you can do both, you can climb the whole ladder!

So, our strategy is to:

  1. Base Case: Prove the statement for a small value of n, like n = 1 or n = 2. This is usually the easiest part.
  2. Inductive Hypothesis: Assume the statement is true for some n = k. This is our “if” – we assume the result holds for smaller matrices.
  3. Inductive Step: Show that if the statement is true for n = k, it must also be true for n = k + 1. This is the crucial step where we use the inductive hypothesis to extend the result to larger matrices.

Let’s start by thinking about the base case. For n = 1, A and B are just complex numbers (1x1 matrices). In this case, AB - BA is always zero, so rank(AB - BA) = 0, never 1 – the hypothesis is never satisfied, which makes the statement vacuously true (and every 1x1 matrix is already triangular anyway). So the first interesting case is n = 2, and there we’ll see how the condition rank(AB - BA) = 1 starts to constrain the structure of A and B.

Diving into the Proof: The Base Case (n = 2)

Let's tackle the base case where n = 2. This means A and B are 2x2 matrices. We can write them in general form as:

A = | a b |
    | c d |

B = | e f |
    | g h |

Where a, b, c, d, e, f, g, h are complex numbers. Now, let’s compute AB - BA:

AB = | a b | | e f | = | ae + bg af + bh |
     | c d | | g h |   | ce + dg cf + dh |

BA = | e f | | a b | = | ea + fc eb + fd |
     | g h | | c d |   | ga + hc gb + hd |

AB - BA = | (ae + bg) - (ea + fc) (af + bh) - (eb + fd) |
          | (ce + dg) - (ga + hc) (cf + dh) - (gb + hd) |

So, AB - BA is a 2x2 matrix, and we're given that its rank is 1. This means that the rows (and columns) of AB - BA are linearly dependent but not all zero. In other words, one row is a scalar multiple of the other, with at least one row non-zero. This gives us some equations to work with.

Let’s denote the entries of AB - BA as follows:

AB - BA = | x y |
          | z w |

Where:

  • x = (ae + bg) - (ea + fc)
  • y = (af + bh) - (eb + fd)
  • z = (ce + dg) - (ga + hc)
  • w = (cf + dh) - (gb + hd)

Since rank(AB - BA) = 1, one row is a scalar multiple of the other: if the first row (x, y) is non-zero, there exists a scalar k such that z = kx and w = ky (and if instead the second row is the non-zero one, the roles swap). There's also a hidden constraint: every commutator has trace zero, because trace(AB) = trace(BA), so w = -x automatically. We need to use these relationships to show that A and B can be simultaneously triangularized. This is where things get a bit more algebraic, and we might need to consider different cases depending on which of the entries vanish.
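To see the rank condition in action, here's a tiny numpy sketch. The matrices are hand-picked purely for illustration – any pair whose commutator has rank 1 would do:

import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
B = np.array([[0.0, 1.0],
              [0.0, 0.0]])

C = A @ B - B @ A
print(C)                            # [[ 0. -1.]
                                    #  [ 0.  0.]]
print(np.linalg.matrix_rank(C))     # 1: one row is a multiple of the other
print(np.trace(C))                  # 0.0: the trace-zero constraint (w = -x)

Here the second row of C is 0 times the first, so the scalar k from above is simply 0.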

The key idea here is to find a common eigenvector for A and B. If we can find a vector v such that Av = λv and Bv = ÎŒv (where λ and ÎŒ are eigenvalues), then v is a common eigenvector. We can then use this eigenvector to build our triangularizing matrix P. This is a standard technique in linear algebra for proving simultaneous triangularizability.

To find a common eigenvector, we might need to solve a system of equations involving the entries of A, B, and the scalar k. This can be a bit messy, but the relationships z = kx and w = ky give us crucial constraints. We're essentially trying to show that these constraints force A and B to have a shared eigenvector, which then allows us to triangularize them simultaneously.
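Here's a hedged sketch of that search in numpy: walk through the eigenvectors of A and test whether B maps each one back into its own span. The matrices (reused from the earlier sketch) and the tolerance are illustrative:

import numpy as np

def common_eigenvector(A, B, tol=1e-9):
    # every complex square matrix has at least one eigenvector,
    # so this loop always has candidates to test
    _, vecs = np.linalg.eig(A)
    for i in range(vecs.shape[1]):
        v = vecs[:, i]
        # v is a common eigenvector iff Bv lies in span{v},
        # i.e. stacking v and Bv side by side still has rank 1
        if np.linalg.matrix_rank(np.column_stack([v, B @ v]), tol=tol) == 1:
            return v
    return None

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
B = np.array([[0.0, 1.0],
              [0.0, 0.0]])
print(common_eigenvector(A, B))   # e.g. [1. 0.]: Av = 1*v and Bv = 0*v

Note that a failed search here wouldn't disprove anything – the proof's job is precisely to show that the rank-1 condition forces such a vector to exist.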

We will continue the detailed calculations and reasoning in the next sections to fully establish the base case for n = 2. Stick with me, guys! We're making progress!

The Inductive Step: Scaling Up the Argument

Okay, so we've laid the groundwork by understanding the problem and diving into the base case for n = 2. Now comes the heart of the proof: the inductive step. This is where we assume the result holds for matrices of size k x k and then show it must also hold for matrices of size (k+1) x (k+1). This step can be a bit abstract, but it's the engine that drives the entire proof.

Inductive Hypothesis: Let's assume that if A' and B' are k x k matrices with complex entries such that rank(A'B' - B'A') = 1, then A' and B' are simultaneously triangularizable. This is our assumption – we're taking it as true for smaller matrices.

Inductive Goal: Our goal is to show that if A and B are (k+1) x (k+1) matrices with rank(AB - BA) = 1, then A and B are also simultaneously triangularizable. We need to use our inductive hypothesis somehow to bridge the gap between the k x k case and the (k+1) x (k+1) case.

The key idea here is to use the rank condition to find a common eigenvector for A and B. Just like in the base case, a common eigenvector is our ticket to simultaneous triangularization. We'll use this eigenvector to reduce the problem to a smaller size, where we can apply the inductive hypothesis.

Let's think about the properties of the matrix C = AB - BA. Since rank(C) = 1, we know that the image (or column space) of C has dimension 1. This means all the columns of C are scalar multiples of a single non-zero vector, let's call it u. In other words, for any vector x, Cx is a scalar multiple of u. This is a crucial piece of information.
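In code, extracting such a u is easy: for a rank-1 matrix, any non-zero column works. A small sketch, with C being the illustrative commutator from the earlier example:

import numpy as np

C = np.array([[0.0, -1.0],
              [0.0,  0.0]])   # rank-1 commutator from the earlier sketch

# take the column of largest norm as u (any non-zero column would do)
u = C[:, np.argmax(np.linalg.norm(C, axis=0))]
print(u)   # [-1.  0.]

# sanity check: adjoining u to C doesn't raise the rank, so im(C) = span{u}
print(np.linalg.matrix_rank(np.column_stack([C, u])))   # 1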

Now, consider the subspace spanned by u. What we're really hunting for is a subspace that is invariant under both A and B – one where applying either matrix to a vector inside it lands us back inside it. Why is this important? Because once we have a common invariant subspace, we can restrict our matrices to it (and pass to the quotient), and work with smaller matrices.

Let's take an eigenvector v of A (every matrix over complex numbers has at least one eigenvector). Let λ be the corresponding eigenvalue, so Av = λv. Now, consider C = AB - BA acting on v:

Cv = (AB - BA)v = ABv - BAv = A(Bv) - B(Av) = A(Bv) - B(λv) = A(Bv) - λBv

Since rank(C) = 1, we know that Cv must be a multiple of u. So, we have:

Cv = A(Bv) - λBv = αu

For some scalar α. This equation is a goldmine! It tells us something about the relationship between A(Bv) and Bv. If Bv is a multiple of v, then v is a common eigenvector of A and B, and we're in great shape. We can use this eigenvector to start building our triangularizing matrix P.
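Here's that identity checked numerically, reusing the illustrative A, B, C, u from the earlier sketches, with v = (0, 1) an eigenvector of A (eigenvalue λ = 2) that happens not to be a common eigenvector:

import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])
B = np.array([[0.0, 1.0],
              [0.0, 0.0]])
C = A @ B - B @ A
u = np.array([-1.0, 0.0])    # spans im(C), as extracted earlier
v = np.array([0.0, 1.0])     # Av = 2v
lam = 2.0

lhs = A @ (B @ v) - lam * (B @ v)
print(lhs)                         # [-1.  0.], i.e. 1.0 * u, so alpha = 1 here
print(np.allclose(lhs, C @ v))     # True: both sides are just Cv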

But what if Bv is not a multiple of v? This is where things get a bit trickier. In this case, we need to carefully analyze the subspace spanned by v and Bv. We'll show that this subspace is invariant under both A and B. This will allow us to restrict A and B to this smaller subspace, and we'll be one step closer to applying the inductive hypothesis.

We will continue unraveling this argument, looking at the different cases and showing how the rank condition forces the existence of a common eigenvector or an invariant subspace. This inductive step is where the real magic happens, guys! We're using the structure imposed by rank(AB - BA) = 1 to peel away layers of the problem, reducing it to a smaller and smaller scale until we can finally apply our inductive hypothesis.

Completing the Induction: Bringing It All Together

Alright, let's push forward and complete this proof by induction! We've set up the base case and made significant headway into the inductive step. Now we need to tie up the loose ends and show that the inductive step truly works in all scenarios.

Recall where we left off. We had the equation:

Cv = A(Bv) - λBv = αu

Where v is an eigenvector of A with eigenvalue λ, u spans the image of C = AB - BA, and α is a scalar. We identified two key scenarios:

  1. Bv is a multiple of v: In this case, v is a common eigenvector of both A and B. This is fantastic news! We can use this common eigenvector as the first vector in a basis, and after a change of basis, the first column of both A and B will be zero below the diagonal (see the sketch right after this list). This effectively reduces the problem to a smaller submatrix.
  2. Bv is not a multiple of v: This is the more challenging scenario. We need to carefully analyze the subspace spanned by v and Bv. Let's call this subspace W. We need to show that W is invariant under both A and B. This means that for any vector w in W, Aw and Bw are also in W.
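Here's the promised sketch of the change of basis in scenario 1. The matrices are illustrative and were chosen only so that v = (1, 1) is a common eigenvector – the point is the mechanics of building P, not the rank condition:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # A v = 3 v for v = (1, 1)
B = np.array([[0.0, 1.0],
              [1.0, 0.0]])   # B v = 1 v for the same v

v = np.array([1.0, 1.0])
P = np.column_stack([v, [0.0, 1.0]])   # extend v to a basis: P's first column is v
P_inv = np.linalg.inv(P)

print(P_inv @ A @ P)   # [[3. 1.], [0. 1.]] – first column zero below the diagonal
print(P_inv @ B @ P)   # [[1. 1.], [0. -1.]] – same structure for B

With the common eigenvector sitting in the first column of P, both conjugated matrices have a first column that is zero below the diagonal, and the remaining block is the smaller submatrix the recursion continues on.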

Let's focus on the second scenario. Both v and Bv lie in W by definition, since they span it, and any vector in W is a linear combination of the two. Invariance of W therefore comes down to where the images of these two spanning vectors land: Av = λv is already in W, so what we really need to show is that A(Bv) and B(Bv) can be written as linear combinations of v and Bv.

We already have some information about A(Bv) from the equation A(Bv) = λBv + αu. Now, we need to consider where u lives. Remember that u spans the image of C. This means that u can be written as Cx for some vector x. So, we have:

u = Cx = (AB - BA)x = ABx - BAx

This expression for u is crucial. It allows us to relate u back to the actions of A and B. Now, let's substitute this into our equation for A(Bv):

A(Bv) = λBv + α(ABx - BAx)

This looks a bit complicated, but we're getting closer. We need to show that A(Bv) is in W, which means it can be written as a linear combination of v and Bv. The term λBv is already in W, so we need to focus on the term α(ABx - BAx). This is where the rank(AB - BA) = 1 condition really shines.

Since rank(AB - BA) = 1, we know that the image of AB - BA is one-dimensional. This means that (AB - BA)x is a multiple of u for any vector x. In particular, it holds for the vector v. So, (AB - BA)v = γu for some scalar γ. This is a powerful constraint!
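The general argument still takes a bit more algebra, but here's a 3x3 numerical preview (hand-picked matrices, numpy assumed) of the invariance property we're driving at: W = span{v, Bv}, and a rank test confirms that A and B both map a spanning set of W back into W:

import numpy as np

A = np.diag([1.0, 2.0, 3.0])
B = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]])
assert np.linalg.matrix_rank(A @ B - B @ A) == 1   # rank-1 commutator

v = np.array([0.0, 1.0, 0.0])      # eigenvector of A with lambda = 2
W = np.column_stack([v, B @ v])    # basis of W; here Bv is not a multiple of v

def in_W(w):
    # w lies in W iff adjoining it to the basis doesn't raise the rank
    return np.linalg.matrix_rank(np.column_stack([W, w])) == np.linalg.matrix_rank(W)

print(all(in_W(w) for w in (A @ v, A @ (B @ v), B @ v, B @ (B @ v))))   # True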

We will use this constraint, along with some careful manipulations of the equations, to finally show that W is indeed invariant under both A and B. Once we've established this, we can restrict A and B to W, and we'll have smaller matrices to work with. This is where our inductive hypothesis comes into play. We can apply the inductive hypothesis to the restricted matrices (their commutator still has rank at most 1, and the rank-0 case is the classical commuting result), and then we'll be able to assemble the resulting triangular forms into a single change of basis that upper-triangularizes both A and B on the full space, completing the inductive step – and with it, the whole proof.