Fifth-degree equations and the impossibility proofs

The history of algebra, and the actors involved, are as interesting as the subject itself. Among many others, there is Paolo Ruffini, an accomplished doctor and amateur math tinkerer, ignored by most professional mathematicians of his time; Niels Abel, who fought poverty and lack of recognition, dying of tuberculosis at 26; and Évariste Galois, shot at 20 in a duel fought over women and politics.

The search for the solutions, or roots, of polynomial equations, like the second-degree equation of the form

x² + bx + c = 0

has occupied mathematicians since antiquity. The roots are the values of x that make the equation true.

Many innovative ideas and new mathematical tools have been spawned in the course of this search for roots. It starts with the choice of symbols and forms that best express the problem on paper. Then comes the search for the best metaphors to connect the equation to real-world objects, making it palpable and useful. Quadratic equations are naturally related to areas, cubic equations to volumes.

Remembering the school days

In the context of this text, the coefficients b, c, d... of the polynomial are always rational numbers. In school, many of us learn the equation in this form:

ax² + bx + c = 0

The coefficient a often shows up in school textbooks, in equations with all-integer coefficients. In more advanced materials it goes away, in exchange for allowing b and c to be rational non-integers.

Of course a polynomial with irrational or complex coefficients is possible, but it is not as interesting. In that case, both the coefficients and the roots live in the field of the complex numbers. On the other hand, polynomials with rational coefficients may have non-rational roots, so the field of the roots is larger than the field of the coefficients, and that's what makes them intriguing.

Thanks to the Fundamental Theorem of Algebra, we know that every polynomial equation of degree n has n complex roots, not necessarily distinct. So we can always express a polynomial as a product of binomials, or linear factors:

a(x − x₁)(x − x₂) = 0

In the form above, it is easy to see that, when x is equal to either x₁ or x₂, one of the binomials goes to zero, and the whole left side goes to zero. Therefore, x₁ and x₂ are indeed the roots.

The coefficient a is just a scale factor that does not change the values of the roots, and we can leave it out to have an even tighter expression:

(x − x₁)(x − x₂) = 0

Multiplying the binomials, we get the equation back in the canonical form:

x² − (x₁+x₂).x + (x₁.x₂) = 0

Here, functions of the roots have taken the places of b and c. It is clear now that there is a close relationship between the coefficients and the roots. Each coefficient is a simple formula on the roots. The big challenge, which cost mathematicians sleepless nights for millennia, is to find the roots given the coefficients.
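The easy direction (from roots to coefficients) can be sketched in a few lines of Python. This is only an illustration: the function name monic_from_roots and the sample roots 2 and 5 are our own choices, and exact fractions are used so that the coefficients stay rational.

```python
from fractions import Fraction

def monic_from_roots(roots):
    """Expand (x - r1)(x - r2)... into coefficients, highest degree first."""
    coeffs = [Fraction(1)]
    for r in roots:
        # Multiply the current polynomial by (x - r):
        shifted = coeffs + [Fraction(0)]                    # polynomial times x
        scaled = [Fraction(0)] + [-r * c for c in coeffs]   # polynomial times -r
        coeffs = [a + b for a, b in zip(shifted, scaled)]
    return coeffs

# Roots 2 and 5 give x^2 - 7x + 10: b = -(2 + 5), c = 2 * 5.
quadratic = monic_from_roots([Fraction(2), Fraction(5)])
print([str(c) for c in quadratic])
```

The same function works for any number of roots, so it can also be used to experiment with cubic and quartic equations.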

The number of roots depends on the degree of the equation: 2 for quadratic, 3 for cubic, and so on. The specific challenge is to find a closed formula for the roots. That is, a formula that delivers an exact result in a finite number of arithmetic steps.

Some roots may be complex; if so, they always come in conjugate pairs (as long as all coefficients are real). A corollary of this fact: an odd-degree equation always has at least one real root.

A complex (or negative) root may not make sense in a real-world application of the equation. For example, the ultra-simple equation below

x² = 4

may be used to determine the side of a square plot of land whose area is 4. The roots are the possible sides: x₁=2, which makes sense, and x₂=−2, which does not. But this is beside the point. We mathematicians find all the roots, and the surveyor uses them as they see fit.

In school we learn the Bhaskara formula to find the roots of a quadratic equation:

x₁, x₂ = (−b ± (b² − 4c)^1/2) / 2

This is a closed formula, and neat for day-to-day usage. There are closed formulas for cubic and quartic equations, but they are too intricate for most applications. For cubic equations and beyond, it's simply more practical to find the roots using numeric methods (trial-and-error, bisection, etc.).
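To give an idea of how little machinery a numeric method needs, here is a minimal bisection sketch. The function name bisect_root, the tolerance, and the example cubic x³ − 2x − 5 = 0 are our own choices for illustration; the method only assumes the function changes sign between the two starting points.

```python
def bisect_root(f, lo, hi, tol=1e-12):
    """Find a root of f in [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    if f(lo) * f(hi) > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # Keep the half of the interval where the sign change happens.
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

def example_cubic(x):
    return x**3 - 2*x - 5

# f(2) = -1 and f(3) = 16, so there is a real root between 2 and 3.
root = bisect_root(example_cubic, 2.0, 3.0)
print(root, example_cubic(root))
```

The result is an approximation, not an exact expression, which is precisely the trade-off discussed above.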

Another problem with the cubic formula is that it produces roots that are "exact" but expressed as sums of cubic roots, often in complex number notation. In some cases, these intricate expressions disguise a root that is actually rational or even an integer. The result that was supposed to be "final" needs further work to be simplified. Again, it would have been more effective to use numeric methods in the first place.
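A classic illustration of such a disguised root is the equation x³ + 6x − 20 = 0, whose cubic-formula root is (10 + 108^1/2)^1/3 + (10 − 108^1/2)^1/3, yet the value is simply 2. The sketch below checks this numerically. It assumes the "depressed" form x³ + px + q = 0 (no x² term), which is not the notation used elsewhere in this text, and it only handles the case where the number under the square root is not negative. The function names are ours.

```python
import math

def real_cbrt(v):
    """Real cubic root, also valid for negative numbers."""
    return math.copysign(abs(v) ** (1.0 / 3.0), v)

def cardano_real_root(p, q):
    """One real root of x^3 + p*x + q = 0, assuming q^2/4 + p^3/27 >= 0."""
    inner = q * q / 4 + p ** 3 / 27
    if inner < 0:
        raise ValueError("this sketch does not handle the three-real-roots case")
    s = math.sqrt(inner)
    return real_cbrt(-q / 2 + s) + real_cbrt(-q / 2 - s)

# x^3 + 6x - 20 = 0: the formula gives cbrt(10 + sqrt(108)) + cbrt(10 - sqrt(108)).
root = cardano_real_root(6, -20)
print(root)  # numerically indistinguishable from the integer 2
```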

But the simple existence, or non-existence, of a closed formula to calculate the roots is a question that has many consequences in advanced math. And, in any case, the history of answering this question is exciting by itself. The fun of travel may be the trip, not the destination.

For equations of fifth degree and above, there is no closed formula. They do have roots, but in general these cannot be calculated with a finite number of arithmetic steps and radicals (n-th roots). This is the statement of the Abel-Ruffini theorem, which closed a chapter written over many centuries.

The reasons why such a formula cannot exist opened the door to a new branch of mathematics: abstract algebra.

The first-degree polynomial equations, whose canonical form is:

ax + b = 0        solution: x = −b/a

have a single rational root, that is, a fraction. Note we employed the school textbook notation, keeping the coefficient a, to make the nature of the root clear.

As long as the coefficients belong to Q (the field of rationals), the root also belongs to Q.

Permutations and symmetries

Algebraic equations have interesting characteristics of symmetry. Let's look back into the quadratic equation:

x² + bx + c = 0

As already shown, we can factor this into 1st degree binomials:

(x − x₁)(x − x₂) = 0

We also know how to calculate the roots as a function of the coefficients:

x₁, x₂ = (−b ± (b² − 4c)^1/2) / 2

And we have seen how the coefficients can be expressed as functions of the roots:

−b = x₁ + x₂
c = x₁.x₂

Note the functions above, known as Viète formulas, are much simpler than the Bhaskara formula. More to the point, they are symmetric formulas: we can exchange the roots, and their results don't change. This holds for the coefficients of equations of any degree.

(In this story, Viète is a supporting actor, but also important. A lawyer by trade, he believed his mathematical discoveries were simple and obvious, already known by mathematicians. Due to this, he never bothered to properly publish them and establish the priority.)

With a little more effort, we calculate the Viète formulas for the third-degree equation coefficients. Again we find that the formulas are impervious to root permutations:

x³ + bx² + cx + d = 0

−b = x₁ + x₂ + x₃
c = x₁.x₂ + x₁.x₃ + x₂.x₃
−d = x₁.x₂.x₃
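The claim that these formulas are impervious to root permutations is easy to check by brute force. The sketch below feeds all 6 orderings of three sample roots into the cubic Viète formulas and collects the results in a set. The function name viete_cubic and the sample roots are illustrative choices of ours.

```python
from fractions import Fraction
from itertools import permutations

def viete_cubic(x1, x2, x3):
    """Coefficients (b, c, d) of x^3 + b x^2 + c x + d from its three roots."""
    b = -(x1 + x2 + x3)
    c = x1 * x2 + x1 * x3 + x2 * x3
    d = -(x1 * x2 * x3)
    return (b, c, d)

roots = (Fraction(1), Fraction(-2), Fraction(3, 2))

# Every ordering of the roots must give the very same coefficients.
results = {viete_cubic(*p) for p in permutations(roots)}
print(results)
```

The set ends up with a single element, because every one of the six permutations produces the same triple (b, c, d).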

We have a paradox here. Finding the coefficients given the roots is easy. The results are given by simple, symmetric, purely arithmetic formulas, no radicals in sight. On the other hand, the inverse problem (finding the roots given the coefficients) cannot employ symmetric formulas, since there are multiple roots to find. The final formula, if it exists, is messy, asymmetric, convoluted, and needs radicals that bring irrational numbers in.

Noting again: in the context of Abel-Ruffini and Galois theories, the coefficients of the polynomial must be rational. The roots may be irrational; but when they are recombined within the Viète formulas to recover the coefficients, the irrationality must "vanish".

Therefore, the roots are not free to take any irrational value. For example, we cannot make the roots of a quadratic equation equal to 2^1/2 and 3^1/2, because the coefficients would be −(2^1/2+3^1/2) and 6^1/2, both irrational. For the coefficients to be rational, the roots must somehow fit together. For example, the roots 2^1/2 and −2^1/2 produce the rational coefficients 0 and −2.

Perhaps you remember that little rule of algebraic equations: if there are complex roots, they must come in conjugate pairs. E.g. if there is a root 2+i, there must be another root 2−i. The reasoning is the same: to make sure the imaginary parts cancel out within the coefficients. (By the way, i, the square root of −1, is not a rational number either, so here it counts as one more irrationality.)

In the same fashion, other irrationalities present in roots have to come in conjugate groups. If one root is 2^1/2, another root must be −2^1/2, so the irrational part vanishes, within a sum (2^1/2+(−2^1/2) = 0), or within a product (2^1/2.(−2^1/2) = −2).

These conjugates may come in groups bigger than pairs. If the irrational number is a cubic root, we need a triplet to achieve the cancellation. Since the roots of a cubic equation may be cubic roots of square roots, there are two conjugations (pairs and triplets) that must work simultaneously.

In a nutshell, this is why the quintic equation has no formula for the roots. It is impossible to make so many degrees of irrationality (square, cubic, quartic and fifth roots) fit together, cancel each other out, and produce rational coefficients. Not if the roots are expressed using only arithmetic and radicals.

Another variation of this proof, by Kronecker, says: if it is possible to express the roots of an irreducible fifth-degree equation by radicals, there will be either 1 or 5 real roots. That is, if there are complex roots, there will be exactly two conjugate pairs of them. But there are plenty of quintic equations with 3 real roots in textbooks, which fall outside that condition.

Lagrange

This mathematician tried another approach: to find the pattern that allowed the known root formulas (from quadratic to quartic) to work in the first place. In this effort, he found the resolvent, an intermediate formula that "generates" the irrational radicals.

The core idea of the resolvent is to reduce the equation one degree at a time. For example, the resolvent of a quartic equation is a cubic equation, whose resolvent is quadratic, whose resolvent is trivial.

Putting it another way: in an equation with n roots, if we know n−1 roots, the last root is trivial to calculate, since the root values are interdependent. They are interdependent because their irrationalities must cancel out inside the Viète formulas for the coefficients to be rational.

Therefore, the "hard" part of solving e.g. a 4th degree equation is to find 3 roots. An equation with 3 roots is cubic, so the resolvent will be a cubic polynomial. Of course the 3 roots of this resolvent won't be the same 3 roots of the original equation. There will be a final manipulation of these roots to finally get the 4 roots of the target equation.

In the familiar quadratic equation, the resolvent is Δ^1/2. Lagrange also noted the resolvents could be expressed as rational functions of the roots, and they assumed different values as the roots were permuted. In the quadratic equation, Δ^1/2 is two-valued, and equal to (x₁−x₂) or (x₂−x₁).
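The relationship between Δ^1/2 and the difference of the roots can be verified with exact arithmetic. In the sketch below we build b and c from two sample rational roots via the Viète formulas, compute Δ = b² − 4c, and compare it with the square of the root difference. The function name and the sample roots are ours, chosen only for illustration.

```python
from fractions import Fraction

def discriminant_from_roots(x1, x2):
    """Discriminant b^2 - 4c of x^2 + b x + c, with b and c built from the roots."""
    b = -(x1 + x2)
    c = x1 * x2
    return b * b - 4 * c

x1, x2 = Fraction(7, 2), Fraction(-1, 3)
delta = discriminant_from_roots(x1, x2)

# Both orderings of the roots square to the same discriminant,
# so the square root of delta is either (x1 - x2) or (x2 - x1).
print(delta, (x1 - x2) ** 2, (x2 - x1) ** 2)
```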

That was the first hint that the study of permutations of the roots was the key to the problem.

Of course the resolvents as functions of the coefficients are more important from a practical point of view. It is useless to know that the square root of the discriminant is a quasi-symmetric function of the roots, if we don't know the roots.

But, from an abstract point of view, it is important to know the resolvent can be a rational function of the roots, and the permutations of these roots generate the different values we expect.

Lagrange hoped to extend the method of resolvents to the fifth degree, and when he couldn't, he suspected that such a formula could not be found. He didn't press on, since he didn't believe the resolvent method was the only possible approach.

Ruffini

Ruffini went forward, almost completing the proof of impossibility for quintic equations and above.

A key concept introduced by Ruffini is the "order" of a function, today known as tower of radicals. For example, a function of 0th-order f₀:

f₀(a, b, c, ...) = f₀(x₁, x₂, x₃...)

f₀ is, at the same time, an irrational function of the coefficients and a rational function of the roots, just like the resolvents. But, while Lagrange looked at the problem from top to bottom (find a resolvent easier than the original equation), Ruffini looked at the problem bottom up (reach the original equation by stacking resolvents).

f₀ may return n irrational values — but if any of them is raised to the n-th power, the result is always the same, rational number. Therefore, f₀ is the n-th root of some number.

If we consider f₀ as a function of the coefficients, we can't use permutations. The different values returned by f₀ are generated by multiplying one result by the n-th roots of unity (we will talk about them later). For example, in the quadratic formula, we multiply Δ^1/2 by +1 and −1, which are the square roots of 1.

If we consider f₀ as a function of the roots, the different values are generated by root permutations. So, there must be a relationship between the number of the roots, the number of permutations, and the number of possible results. For example, if there are 3 roots, there are 6 possible permutations, and it is not possible to have f₀ to return 4 different values, since 4 does not divide 6.
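The counting argument can be explored by brute force: permute the arguments of a function in every possible way and count how many distinct results come out. The sketch below does this for a few functions of 3 roots. The helper count_values, the sample roots and the sample functions are all illustrative choices of ours; with unlucky numeric roots two results could coincide by accident, which is why "generic" values are used.

```python
from fractions import Fraction
from itertools import permutations

def count_values(f, roots):
    """Number of distinct values f takes over all orderings of the roots."""
    return len({f(*p) for p in permutations(roots)})

roots = (Fraction(1), Fraction(2), Fraction(4))

examples = [
    ("x1 + x2 + x3",          lambda a, b, c: a + b + c),
    ("(x1-x2)(x1-x3)(x2-x3)", lambda a, b, c: (a - b) * (a - c) * (b - c)),
    ("x1",                    lambda a, b, c: a),
    ("x1 + 2.x2 + 3.x3",      lambda a, b, c: a + 2 * b + 3 * c),
]

counts = [count_values(f, roots) for _, f in examples]
for (name, _), n in zip(examples, counts):
    print(name, "takes", n, "value(s)")
```

For these examples the counts are 1, 2, 3 and 6, all divisors of 3! = 6, in line with the argument above.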

Now, the higher-order formula f₁ has the following form:

f₁(x₁, x₂, x₃..., f₀(x₁, ...))

The function f₁ also enjoys a dual definition: as function of coefficients and as function of permuted roots. But now we start to pay more attention to the latter definition, since it will lead us to a conclusion.

Besides the roots, f₁ receives one or more zeroth-order functions as arguments. If f₁ returns m different values, raising any of them to the m-th power yields the same result, which may be irrational, but involving only the irrationality introduced by f₀.

For example, the solution of a quadratic equation has a single formula f₀=Δ^1/2 that introduces a single irrational number among the rationals. But the solution of a cubic equation has some f₀, which yields a number containing a square root, and some f₁ that yields a cubic root of a number containing the result of f₀. Therefore, the root of a cubic equation has two "layers" of irrationality within.

If this chain of functions exists, it is possible to invert them i.e. find the resolvents and a formula for the roots. Ruffini noted a function of 5 permutable parameters could not return 3 or 4 different values. Given this rationale, he concluded there cannot be a formula for the fifth degree equation.

Abel

Ruffini made the opposite mistake to Lagrange's: he assumed without proof that every irrational function (of the form shown above) can be expressed as a rational function of the roots. The values yielded by such functions were called "Ruffini irrationals", suggesting there could be other irrationals in the splitting field, unreachable by these functions.

This gap, among others, was closed by Abel. He proved that, if there is a formula to calculate the roots, it must indeed fit the pattern postulated by Ruffini. All irrationals of the splitting field are Ruffini irrationals, after all.

A heavily simplified version of the Abel-Ruffini proof is as follows. A chain of nested functions that solves the quintic equation would have a height of at least four orders:

f₀(x₁, x₂, x₃, x₄, x₅) ⇒ returns 2 different values
f₁(x₁, x₂, x₃, x₄, x₅, f₀) ⇒ returns 3 different values
f₂(x₁, x₂, x₃, x₄, x₅, f₀, f₁) ⇒ returns 4 different values
f₃(x₁, x₂, x₃, x₄, x₅, f₀, f₁, f₂) ⇒ returns 5 different values

The problem is, a function with 5 permutable parameters cannot yield 3 or 4 different values. Cauchy had recently shown that such a function may return just 1, 2 or 5 values (within the range 1..5). This means f₁ and f₂ cannot exist. Therefore, there cannot exist a solution fitting this pattern, therefore there is no closed formula for the roots using only arithmetic and radicals.

It is important to note that a quintic equation, or any higher-degree equation, always has at least one root. And such a root may be expressible by a formula using transcendental techniques, e.g. trigonometric functions, infinite sums, continued fractions, etc. What Abel-Ruffini proves is that, in general,

a) these roots cannot be expressed using a finite number of arithmetic operations and radicals; and/or

b) it is not possible to develop a closed formula with finite arithmetic and radical steps to determine these roots.

Now, let's review the equations with degree lower than 5, and find out why they are solvable, starting with the quadratic.

f₀(x₁, x₂) ⇒ returns 2 different values

No problem here, since a function with n permutable parameters can always return n different values.

Cubic equation:

f₀(x₁, x₂, x₃) ⇒ returns 2 different values
f₁(x₁, x₂, x₃, f₀) ⇒ returns 3 different values

The 0th order function is not a problem, since it is always possible to find a two-valued function, regardless of the number of permutable parameters.

A well-known example of a two-valued function is the square root of the discriminant, that is, the product of the differences between the roots. For every degree (not only quadratic!) the discriminant is 0 if there are multiple (i.e. repeated) roots. If all roots are distinct, its square root has a fixed absolute value, and its sign is toggled by the exchange of any two roots.
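The sign-toggling behaviour can be checked on the product of the differences between the roots, whose square is the discriminant (up to the usual normalization). The sketch below uses four sample roots, i.e. the quartic case; the function name and the sample values are ours.

```python
from fractions import Fraction
from itertools import combinations

def product_of_differences(roots):
    """Product of (x_i - x_j) over all pairs i < j."""
    result = Fraction(1)
    for i, j in combinations(range(len(roots)), 2):
        result *= roots[i] - roots[j]
    return result

roots = [Fraction(1), Fraction(2), Fraction(4), Fraction(7)]
swapped = [roots[1], roots[0]] + roots[2:]          # exchange the first two roots
repeated = [Fraction(1), Fraction(1), Fraction(2), Fraction(3)]

print(product_of_differences(roots))     # some non-zero value
print(product_of_differences(swapped))   # same absolute value, opposite sign
print(product_of_differences(repeated))  # zero, because of the repeated root
```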

Quartic equation:

f₀(x₁, x₂, x₃, x₄) ⇒ returns 2 different values
f₁(x₁, x₂, x₃, x₄, f₀) ⇒ returns 3 different values
f₂(x₁, x₂, x₃, x₄, f₀, f₁) ⇒ returns 4 different values

According to Cauchy, a function with n permutable parameters may yield m results, where m is the biggest prime dividing n!. In this case, n = 4 and 3 is the biggest prime that divides 4! = 24, so f₁ is feasible.

Galois

Abel proved the impossibility theorem for the general quintic equation. But there are quintic equations solvable with arithmetic and radicals, e.g.

x⁵ − 2 = 0

is easily solved, one root being 2^1/5.

Galois explained why some quintic (or higher-degree) equations can be solved with arithmetic tools, and others cannot. In passing, he created or hinted at many concepts that are staples of abstract algebra: fields, groups, morphisms, normal subgroups, etc.

We will come back to Galois's way of proving the impossibility theorem. But first we need to lay some groundwork.

Primitive roots of unity

The quintic equation

x⁵ − 2 = 0

is solvable with arithmetic. The Ruffini function tower that solves this equation is

f₀(x₁, x₂, x₃, x₄, x₅) ⇒ yields 2 values
f₁(x₁, x₂, x₃, x₄, x₅, f₀) ⇒ yields 2 values
f₂(x₁, x₂, x₃, x₄, x₅, f₀, f₁) ⇒ yields 5 different values

Actually, it is much easier to define f₂ as

f₂ = 2^1/5

This is just one of the five roots, but it is the only real root. The other complex roots are the products of 2^1/5 by the complex fifth roots of 1.

Given the latest definition, f₂ is self-sufficient. What about f₀ and f₁? Well, they are still necessary, but only to introduce these complex fifth roots of 1, which we have not met yet.

Let's tread lightly. As you know, (+1)²=(−1)²=1. Therefore, +1 and −1 are both square roots of unity. But only −1 is a primitive root, since it can generate the other root, namely (−1)×(−1)=(+1). The root +1 is called trivial since it cannot generate the other.

As we raise the primitive root to the powers 2, 3, etc. it generates +1 and −1 alternately. We can see that the repeated product of the primitive root creates a closed cycle.

The three cubic roots of 1 are: +1, (−1−i.3^1/2)/2 and (−1+i.3^1/2)/2. The latter two are primitive. Either of the primitive roots can generate all others as it is raised to square, cube and so on.

One funny thing here: despite being cubic roots, they are composed of square roots — i and 3^1/2. How come?! Well, the cubic roots of unity satisfy the following equation:

x³ − 1 = 0

Obviously x=1 satisfies the equation above (1 is always the trivial n-th root of 1), so we can factor the equation into

(x − 1)(x² + x + 1) = 0

The remaining irreducible polynomial is quadratic, and now we have a quadratic equation to solve, whose roots are the primitive cubic roots of unity. Being the roots of a quadratic equation, it is only fair they have square roots in their expressions.

The quadratic polynomial also tells us the sum of the three roots is equal to 0, and one primitive root is equal to the square of the other primitive root.

Any equation in the form

xⁿ − 1 = 0

can be factored to

(x − 1)(1 + x + x² + ... + xⁿ⁻¹) = 0

That is, a primitive n-th root of unity always satisfies an equation of degree smaller than n. This automatic downgrade is a powerful tool in algebra.

The fifth roots of 1 satisfy the equation

x⁵ − 1 = 0

that can be factored into (x − 1)(1 + x + x² + x³ + x⁴) = 0

The 5th primitive roots are solutions of a quartic equation. Therefore, they have two nested square roots, and we must define f₀ and f₁ to solve a solvable quintic equation.

f₀(x₁, x₂, x₃, x₄, x₅) = φ ("golden ratio") and its conjugate
f₁(x₁, x₂, x₃, x₄, x₅, f₀) = square roots of a+b.f₀

Note that, in the cases above, f₀ and f₁ are not functions of the coefficients. They yield 2 and 4 different constants, and the root permutations only shuffle among these constants.
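To make the two nested square roots concrete, the sketch below writes one primitive fifth root of unity with the standard closed forms cos(72°) = (5^1/2 − 1)/4 and sin(72°) = (10 + 2.5^1/2)^1/2 / 4, then multiplies it by 2^1/5 to obtain all five roots of x⁵ − 2 = 0. The inner square root 5^1/2 is the one tied to the golden ratio, since 2.cos(72°) = φ − 1. Variable names are ours, and the check is numeric (floating point), not symbolic.

```python
import math

sqrt5 = math.sqrt(5)                      # first layer: the irrationality behind the golden ratio
real_part = (sqrt5 - 1) / 4               # cos(72 degrees)
imag_part = math.sqrt(10 + 2 * sqrt5) / 4 # second layer: a square root containing sqrt(5)
zeta = complex(real_part, imag_part)      # one primitive fifth root of unity

# zeta satisfies both x^5 = 1 and 1 + x + x^2 + x^3 + x^4 = 0.
print(abs(zeta**5 - 1))
print(abs(sum(zeta**k for k in range(5))))

# The five roots of x^5 - 2 = 0: the real root 2^(1/5) times each fifth root of unity.
roots_of_two = [2 ** 0.2 * zeta**k for k in range(5)]
for r in roots_of_two:
    print(r, abs(r**5 - 2))
```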

From the point of view of Abel-Ruffini, the equation

x⁵ − 2 = 0

is solvable with a chain of arithmetic functions because it was not necessary to find a resolvent that yielded 3 different values (under permutations of 5 values).

Still on the subject of primitive roots of unity: in abstract algebra we are more interested in n-th roots where n is prime. When n is not prime, there will be roots that are neither trivial nor primitive.

For example, the set of quartic roots of unity is {+1, i, −1, −i}. The root −1 is not primitive, but it is not trivial either. Its powers generate a subset of the quartic roots (+1 and −1).
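A small experiment makes the distinction between trivial, primitive and "in-between" roots visible. Writing the n-th roots of unity as e^(2πik/n) for k = 0..n−1, the powers of the k-th root correspond to multiples of k modulo n, and the root is primitive exactly when k and n share no common factor (a standard fact that we use here without proof). Function names are ours.

```python
import cmath
from math import gcd

def roots_of_unity(n):
    """The n-th roots of unity, indexed by exponent k in e^(2*pi*i*k/n)."""
    return [cmath.exp(2j * cmath.pi * k / n) for k in range(n)]

def generated_exponents(k, n):
    """Exponents reached by taking successive powers of the k-th root."""
    return sorted({(k * m) % n for m in range(n)})

def primitive_exponents(n):
    """Exponents k whose root generates all n roots."""
    return [k for k in range(n) if gcd(k, n) == 1]

print(roots_of_unity(4))            # approximately 1, i, -1, -i
print(generated_exponents(1, 4))    # i generates everything
print(generated_exponents(2, 4))    # -1 only reaches +1 and -1
print(primitive_exponents(4))       # n not prime: some roots are left out
print(primitive_exponents(5))       # n prime: every non-trivial root is primitive
```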

Topological proof

A fairly recent proof of the quintic impossibility was developed by Arnold in 1963. I don't reproduce it here (honestly, I haven't fully "got it" yet), but the idea is clever: to "rotate" the coefficients' values around the origin in the complex plane.

The fundamental idea is gorgeous. Consider the equation

x² − 4 = 0

It is obvious that 2 is a root. But let's use Bhaskara anyway. Since c is the only non-zero coefficient, the formula simplifies to

x = (−4c)^1/2 / 2 = (−c)^1/2

x = (−(−4))^1/2 = 4^1/2 = 2

The trivial root of 4 is 2. We also know there is "another" root, −2. It is something we are force-fed in school without too much explanation. Sure, (−2)² is 4, but perhaps the math teacher didn't tell the whole story. Is this negative root really legit?

Now, let's show a convincing argument that non-trivial roots emerge from the properties of complex numbers.

Every complex number may be written in the form a+bi. It can also be written in the polar form r.e^iθ. For example, 4.e⁰=4, 4.e^iπ/2=4i, and 4.e^iπ=−4.

In particular, 4.e^i2π=4, and the angle 2π looks harmless. But, if we extract the square root of this number, it is no longer so harmless:

x = (4.e^i2π)^1/2
  = 4^1/2.e^(i2π/2)
  = 2.e^iπ
  = −2

If we recall the example equation

x² − 4 = 0

we see that the coefficient c=−4, so that −c=4 is the number under the square root. So, if we rotate c by 360° (which, at first sight, does not affect its value), the root of the equation rotates by 180°. The root only goes back to +2 if the coefficient is rotated by a multiple of 720°. That is, the root rotates at half the speed of the coefficient. This is no coincidence; it is because the equation has degree two.

In a trivial cubic equation like

x³ − 8 = 0

rotating the coefficient 360° causes the root to rotate just 120°, making it a complex number. The coefficient must rotate in multiples of 1080° for the root to come back to real.
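The rotation argument can be simulated: move the constant around the origin in small steps and, at each step, pick the root closest to the previous one, so that the root is followed continuously. The sketch below does this for equations written as xⁿ = c, with c starting at 4 (n=2) or 8 (n=3); rotating c here is equivalent to rotating the constant coefficient of the equation. The function name follow_root and the step count are our own choices.

```python
import cmath
import math

def follow_root(n, c0, start_root, turns, steps=3600):
    """Track one root of x^n = c while c makes `turns` full circles around 0."""
    root = start_root
    for s in range(1, steps + 1):
        theta = 2 * math.pi * turns * s / steps
        c = c0 * cmath.exp(1j * theta)
        # All n-th roots of the current value of c.
        base = abs(c) ** (1.0 / n) * cmath.exp(1j * cmath.phase(c) / n)
        candidates = [base * cmath.exp(2j * math.pi * k / n) for k in range(n)]
        # Continuity: keep the candidate nearest to where the root was.
        root = min(candidates, key=lambda z: abs(z - root))
    return root

print(follow_root(2, 4, 2, turns=1))   # close to -2: half a turn of the root
print(follow_root(2, 4, 2, turns=2))   # close to +2: back home after 720 degrees
print(follow_root(3, 8, 2, turns=1))   # a complex number, 120 degrees away from 2
print(follow_root(3, 8, 2, turns=3))   # close to +2: back home after 1080 degrees
```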

Rotating any coefficient 360° effectively provokes a permutation of the roots. In the case of a general cubic equation, the roots are composed of quadratic and cubic radicals, rotating at half and a third of the speed, respectively. In order to send all roots back to their original positions, the coefficient must rotate a multiple of 2160°, which is a multiple of both 720° and 1080°.

Galois theory

As mentioned before, Abel proved the impossibility for the general quintic equation. The very next question is: which quintics are solvable by arithmetic and radicals? Abel did not work on this problem, since he died at 26 and his focus had already moved on. It is impossible to know whether Abel would have earned the honors given to Galois, had he lived longer.

The greatest work of Galois was to translate the problem of solving a polynomial into the realm of groups of permutations. The polynomial is converted to a group. The more "intractable" the polynomial is, the bigger the group related to it. When this group grows beyond a certain size and structure, it is no longer solvable, meaning the roots of the respective polynomial cannot be expressed in radicals.

The thing is, group theory and abstract algebra didn't exist back then. Galois had to create the necessary toolchain from scratch, and that was genius.

As usual, let's start slow. Consider the first degree equation

ax + b = 0

If the coefficients are rational numbers, the solution x=−b/a is also a rational number. In technical language, the field Q is the splitting field of all polynomials in Q[x] of degree 1.

What is a field? It is an algebraic structure, formed by a set of elements, plus two commutative operations. For example, the set of rationals Q is a field under addition and multiplication, since we can add, subtract, multiply and divide (except by zero) any two rationals, and the result is always another rational number also belonging to Q.

A counterexample is the set of square matrices of a given size. Why? Because matrix multiplication is not commutative. They form an algebraic structure known as a ring.

The splitting field of a polynomial is the field in which the polynomial factors completely. In other words, it is the smallest field that contains all roots of that polynomial.

We have not yet explained what Q[x] is. This is the set of all polynomials whose coefficients belong to Q, that is, polynomials with rational coefficients. Polynomials with real or complex coefficients belong to R[x] and C[x], respectively.

Now, let's see the quadratic equation

x² − 3 = 0

The roots do not belong to the field Q, therefore Q is not the splitting field. We must extend the field of rationals by adjoining some irrational number, that we know to be 3^1/2.

In technical notation, Q(3^1/2) is the field of rationals extended with 3^1/2. This new field contains all rationals, plus the irrationals in the form a+b.3^1/2 (where a and b are rational).

But try to detach yourself from the true value of 3^1/2. Get into the shoes of the field Q, which knows integers and fractions only. Or think like an ancient Greek mathematician who did not accept the existence of irrational numbers.

From this backwards point of view, the root of that equation is a "thing" C. It must exist, but it cannot be mixed with "normal" numbers. The splitting field of the equation is Q(C). Or is it? Since a quadratic equation has two roots, we must have two "things": C and Ĉ. Do we actually need two field extensions?

These "things" are rubbed harder in our faces when we deal with finite fields. For example, in the field GF(2), the building block of the fields GF(2ⁿ) very often used in computing, there are only two numbers, 0 and 1. The XOR arithmetic is dead simple: 1+1=0 and 0−1=1. In such a field, the equation

x² + x + 1 = 0

seems to have no roots, since neither x=0 nor x=1 satisfies it. But a quadratic equation is supposed to have two roots! What now? Well, the roots exist, but only in a larger field, and they are "things" whose nature transcends the numbers 0 and 1. (TL;DR the roots are other polynomials.)
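To see these "things" at work, we can build the four-element field usually called GF(4): its elements are the polynomials 0, 1, t and t+1 with coefficients in GF(2), multiplied modulo t² + t + 1. The sketch below encodes each element as a 2-bit integer (bit 1 for t, bit 0 for the constant); this encoding and the function names are our own choices for illustration.

```python
MODULUS = 0b111  # the polynomial t^2 + t + 1

def gf4_add(a, b):
    """Addition of polynomials over GF(2) is a bitwise XOR."""
    return a ^ b

def gf4_mul(a, b):
    """Carry-less multiplication followed by reduction modulo t^2 + t + 1."""
    product = 0
    for shift in range(2):
        if (b >> shift) & 1:
            product ^= a << shift
    if product & 0b100:       # a t^2 term appeared: replace it using t^2 = t + 1
        product ^= MODULUS
    return product

def f(x):
    """Evaluate x^2 + x + 1 inside GF(4)."""
    return gf4_add(gf4_add(gf4_mul(x, x), x), 1)

roots = [x for x in range(4) if f(x) == 0]
print(roots)  # the encodings of t and t+1; neither 0 nor 1 is a root
```

The two roots are t (encoded as 2) and t+1 (encoded as 3). Their sum is 1 and their product is 1, as the Viète formulas demand for x² + x + 1 in arithmetic where −1 = 1.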

Let's go back to the infinite field Q. Not every field extension is algebraic, e.g. Q(π) is an extension but it is not algebraic since π is not a root of any polynomial with rational coefficients. In technical language, π has no minimal polynomial in Q[x].

For a field extension to be algebraic, the adjoined element must have a minimal polynomial in Q[x]. That is, the extension must contain at least one root of a polynomial with rational coefficients. For example, Q(3^1/2) is algebraic since it contains at least one root of

x² − 3 = 0

Note we can rewrite this equation as

x² = 3

In this form, it is easier to see a property of the irrational 3^1/2: when squared, it yields a rational number: +3. Transcendental numbers like π or e can't do that. This is the kind of algebraic extension that matters most here (a radical extension): Q(C), given that C does not belong to Q, but Cⁿ belongs to Q.

If the minimal polynomial of C has degree n, we say the degree of the extension Q(C) is exactly n. In the example equation, C is one root, and the equation is quadratic, so it must be an extension of degree 2. Many other traits of the extended field can be inferred from the extension degree.

Let's go back to the stage where we deemed C and Ĉ "things" without real-world value, only conceding that C and Ĉ are the roots of

x² − 3 = 0

From the point of view of Q (or an ancient Greek) either C or Ĉ is equally good (or bad). They are indistinguishable. Neither of them has (rational) value. But perhaps there is a mathematical relationship between C and Ĉ? Or can we replace C by any irrational value like 5^1/2?

Here we introduce the concept of automorphism. A field automorphism is an orderly and reversible mapping of certain elements of the field to other elements. An automorphism deserving of the name respects the following relations:

A(x+y) = A(x) + A(y)
A(xy) = A(x)A(y)

Playing around with these definitions, we soon find an automorphism won't change rational values. A possible automorphism is to exchange the "things" C and Ĉ:

A(x): C ⇔ Ĉ

A number composed of rational and irrational portions is itself irrational, so it is expected to be affected by the automorphism:

A(2.C + 0.5) = A(2).A(C) + A(0.5) = 2.Ĉ + 0.5

On the other hand, a number or formula that is purely rational is not affected:

A(2 + 0.5) = A(2) + A(0.5) = 2 + 0.5

Is this automorphism really valid? Given that C and Ĉ are algebraic extensions of degree 2 with the same minimal polynomial, the following identity holds:

C² = Ĉ² = 3

On the left side we only have irrationals, on the right side we have a rational number. The automorphism does not upset this identity:

A(C²) = A(3)
A(C)A(C) = 3
Ĉ.Ĉ = 3
Ĉ² = 3

The identity below, obvious at first glance,

C² = (−C)² = 3

shows that −C is also a root of the minimal polynomial of C and Ĉ. But, since it is a polynomial of degree 2, it has at most 2 roots. Therefore, −C must be equal to either C or Ĉ. C is by definition irrational, therefore it cannot be zero, therefore it cannot be equal to −C. The necessary conclusion is Ĉ=−C.

The same rationale shows that C=−Ĉ, and a single field extension — either Q(C) or Q(Ĉ) — is enough to split the equation

x² − 3 = 0

An automorphism B(x) that mapped 3^1/2 to 5^1/2 would not be acceptable, since it does not survive the simple test below:

B(3^1/2)² = B(3)
(5^1/2)² = 3
5 = 3 (false)

Going back to the valid automorphism A(x), note that it can be applied many times, but if applied an even number of times, it is the same as nothing:

A(A(C)) = A(Ĉ) = C
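The properties of A(x) can be checked mechanically if we model the numbers a+b.C as pairs of rationals and let the multiplication rule use only the fact that C² = 3. The class QSqrt3 and the function swap_roots below are illustrative names of ours, and the sample values x and y are arbitrary.

```python
from fractions import Fraction

class QSqrt3:
    """A number a + b.C, with a and b rational and C.C = 3."""
    def __init__(self, a, b=0):
        self.a = Fraction(a)
        self.b = Fraction(b)
    def __add__(self, other):
        return QSqrt3(self.a + other.a, self.b + other.b)
    def __mul__(self, other):
        # (a + b.C)(c + d.C) = (ac + 3bd) + (ad + bc).C
        return QSqrt3(self.a * other.a + 3 * self.b * other.b,
                      self.a * other.b + self.b * other.a)
    def __eq__(self, other):
        return (self.a, self.b) == (other.a, other.b)
    def __repr__(self):
        return f"{self.a} + {self.b}.C"

def swap_roots(x):
    """The automorphism A: exchanges C with its conjugate, i.e. C with -C."""
    return QSqrt3(x.a, -x.b)

C = QSqrt3(0, 1)
x = QSqrt3(Fraction(1, 2), 2)       # the number 2.C + 0.5 from the text
y = QSqrt3(-3, Fraction(5, 7))

print(swap_roots(x))                                           # 1/2 + -2.C
print(swap_roots(x + y) == swap_roots(x) + swap_roots(y))      # A respects sums
print(swap_roots(x * y) == swap_roots(x) * swap_roots(y))      # A respects products
print(swap_roots(swap_roots(x)) == x)                          # applying A twice does nothing
```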

The group of automorphisms of the equation

x² − 3 = 0

is composed of two operations: A(x) that replaces C⇔Ĉ, and E(x) that does nothing. It is a cyclic group, since the composition A.A=E. Groups of two elements are technically known as S₂ or C₂.

The following quadratic equation is related to a trivial group of automorphisms containing a single element, the identity E(x) that does not change anything:

x² − 4 = 0

That's because the roots of this equation are rational (+2 and −2), therefore the field Q is already the splitting field. No extensions are necessary.

The automorphisms only exist while we treat the irrational number as a "thing", without a palpable value. As soon as we admit C through a field extension, all numbers in the form a+bC are now accepted as "true" numbers, and their values cannot be messed around.

In technical language, the group of automorphisms of C over Q is S₂, but the group of automorphisms of C over Q(C) is S₁, the latter with a single element, the identity.

Let's look into a not-so-trivial equation:

x² − 4x + 13 = 0

This one has complex roots 2+3i and 2−3i. In this case, which is the "thing" C that we adjoin to the field Q, in order to split (and resolve) the equation? Would it be 2+3i, 2−3i, 3i, or i?

Answer: any of them will do. Each one of these "things" has an "anti-thing" (2−3i, 2+3i, −3i and −i, respectively). Starting with any of these values, we can fabricate the others using the form a+bC.

A cornerstone of Galois theory: the automorphism of a root is also a root. If we adopt Q(i) as the extension, the non-trivial automorphism maps i to −i, which correctly swaps the roots 2+3i and 2−3i. If we adopt Q(3i), the non-trivial automorphism maps 3i to −3i, which works equally well.

This cornerstone is one of the easiest to prove. Take an automorphism A(x) and a polynomial f(x) of degree n with rational coefficients and roots x₁..x_n,

f(x) = xⁿ + c₁xⁿ⁻¹ + ... c_n

If x_i is a root, then

0 = x_iⁿ + c₁x_iⁿ⁻¹ + ... c_n

Applying the automorphism to the equation,

A(0) = A(x_iⁿ + c₁x_iⁿ⁻¹ + ... c_n)
A(0) = A(x_iⁿ) + A(c₁x_iⁿ⁻¹) + ... A(c_n)
A(0) = A(x_i)ⁿ + A(c₁)A(x_i)ⁿ⁻¹ + ... A(c_n)
0 = A(x_i)ⁿ + c₁A(x_i)ⁿ⁻¹ + ... c_n

The last step uses the fact that the automorphism leaves 0 and every rational coefficient unchanged. Given that step, we find that A(x_i) must be a root as well.
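A quick numeric illustration of this proof, using the earlier equation x² − 4x + 13 = 0 and the automorphism that maps i to −i. This is just a sanity check with Python complex numbers, not a proof; the function names are ours.

```python
def f(x):
    """The example polynomial x^2 - 4x + 13, with rational coefficients."""
    return x * x - 4 * x + 13

def A(z):
    """Automorphism of Q(i) that maps i to -i (complex conjugation)."""
    return z.conjugate()

root = complex(2, 3)
sample = complex(1, 1)          # an arbitrary element of Q(i), not a root

assert f(root) == 0             # 2+3i is a root
assert f(A(root)) == 0          # so its image 2-3i is a root too
assert A(f(sample)) == f(A(sample))   # A passes through f because it fixes the coefficients
```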

Given a polynomial with rational coefficients, we can always extend Q with some irrationals to create the splitting field containing all roots. This will work for polynomials of every degree. A tautological method is to extend Q with the roots themselves. Kind of obvious, but it is important to establish that this method always works.

For polynomials with degree smaller than 5, we can always choose "simpler" irrationals to extend Q with and achieve the splitting field. In a previous example, we had the options Q(2+3i), Q(3i) or Q(i). Clearly Q(i) is the most elegant choice. Its minimal polynomial is

x² + 1 = 0

but Q(i) is also a splitting field of the other equation, and of many others.

Now, consider the equation

x² + 3 = 0

whose roots are i.3^1/2 and −i.3^1/2. We seem to have two "things", and we could extend the field Q twice to get the field Q(3^1/2)(i), but actually we just need a single extension Q(i.3^1/2).

Note the irrational number i.3^1/2 has the number i in its composition, but Q(i.3^1/2) is not a splitting field of the equation

x² + 1 = 0

since i cannot be found in isolation within the field Q(i.3^1/2). Dividing i.3^1/2 by 3^1/2 is not an option, because the pure 3^1/2 is also absent from the field. Every number inside this field has the form

a + b.i.3^1/2,

where a and b are rationals. There are no tuples a,b that can make it equal to just i or just 3^1/2. If we want a field that can split both the following equations:

x² + 1 = 0
x² + 3 = 0

then we really need a double extension Q(3^1/2)(i), in which the numbers are in the form

a + b.i + c.3^1/2 + d.i.3^1/2

The form above is the representation of an element of the field extension as a vector. The elements (1, i, 3^1/2, i.3^1/2) are the basis of this vector space. As long as a,b,c,d are rationals, the elements of the basis are linearly independent, meaning that every number in this field maps to a unique tuple of rationals (a,b,c,d).

A counterexample, that is, a defective basis with linearly dependent elements, would be

a + bω + cω²

where ω is the primitive cubic root of unity. Note that, if a=b=c, the total value is zero. So, a given vector (zero) can be expressed by an infinite number of tuples (a=b=c=0, a=b=c=1, a=b=c=2, ...). This happens because ω²=−1−ω, so ω² should not belong to the basis.
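The dependence can be checked numerically. The sketch below uses floating-point complex numbers (so equalities hold only up to rounding) and a helper named value that exists only for this example.

```python
import cmath

omega = cmath.exp(2j * cmath.pi / 3)    # a primitive cubic root of unity

def value(a, b, c):
    """Evaluate a + b*omega + c*omega^2 for rational (here: integer) a, b, c."""
    return a + b * omega + c * omega ** 2

# Different tuples, same number (zero): the three elements are dependent
assert abs(value(1, 1, 1)) < 1e-12
assert abs(value(2, 2, 2)) < 1e-12

# omega^2 = -1 - omega, so omega^2 can always be eliminated from the basis
assert abs(omega ** 2 - (-1 - omega)) < 1e-12
assert abs(value(1, 2, 3) - value(-2, -1, 0)) < 1e-12
```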

Now, let's look into a cubic equation:

x³ − 7 = 0

The roots are 7^1/3, ω7^1/3 and ω²7^1/3. Obviously, we need to extend Q with the irrational 7^1/3 to achieve a splitting field. It is an extension of degree 3, since (7^1/3)³=7. But is Q(7^1/3) enough to split the polynomial?

The answer is negative. We also need to extend Q with ω. We have seen that ω hides a secret: it is also the root of a quadratic equation. Therefore, it is an extension of degree 2, in spite of being the cubic root of 1.

The splitting field of the equation can be either Q(7^1/3)(ω) or Q(ω)(7^1/3). The vector representation of every number in this field is

a + b.7^1/3 + c.7^2/3 + dω + e.7^1/3ω + f.7^2/3ω

with a...f being rationals, as usual. Note the basis has 6 elements (2×3), and we need 7^2/3 in the basis to properly cover all numbers of this field.

Q(7^1/3) is an algebraic extension (whose minimal polynomial is the example equation, since it contains at least one root of it). But it is not a normal extension because it is not a splitting field of any equation; it does not contain all the roots of any polynomial, not even of the example equation.

Q(7^1/3)(ω) and Q(ω)(7^1/3) are normal extensions, since both contain all roots of some polynomial, namely the example equation. But only Q(ω)(7^1/3) is a normal series, since Q, Q(ω) and Q(ω)(7^1/3) are all normal, for some polynomial.

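Normal extensions are desirable because only they have a full set of automorphisms, as many as the degree of the extension. The field Q(7^1/3) has no automorphism other than the identity: 7^1/3 would have to be mapped to another root of the equation, but the other two roots are not in the field. Simply flipping the sign does not work either, since (7^1/3)³ is different from (−7^1/3)³.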

Yet another way is to extend Q with the 3 roots of the equation themselves. This is guaranteed to deliver the splitting field, but it is brute force, it is excessive (we have seen that adjoining 2 irrationals was enough) and it negates the final objective: to calculate the roots when we don't know them!

The extension Q(ω)(7^1/3) is an example of "tower of radicals", and Galois proved that the existence of a tower of radicals implies the possibility of finding an arithmetic formula to calculate the roots.

Every tower of algebraic extensions can be converted to a simple extension. Still in the cubic example, Q(ω)(7^1/3) is equivalent to Q(i.3^1/2.7^1/3). If we raise i.3^1/2.7^1/3 to the square, cube, etc. we get all the irrationals necessary to form the field basis:

(i.3^1/2.7^1/3)² = −3.7^2/3
(i.3^1/2.7^1/3)³ = −21.i.3^1/2 = −21.(2ω+1)
(i.3^1/2.7^1/3)⁴ = 63.7^1/3
(i.3^1/2.7^1/3)⁵ = 63.i.3^1/2.7^2/3 = 63.(2ω+1).7^2/3
(i.3^1/2.7^1/3)⁶ = −1323

This magic number that replaces a whole tower is named primitive element of an extension. In the example above, the primitive element replaces a tower of radicals with degrees 2 and 3, so the degree of the primitive element must be 6, and the respective minimal polynomial is sixth degree:

x⁶ + 1323 = 0
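A floating-point sanity check of the primitive element, offered as a sketch only. It confirms that the number is a root of the sixth-degree polynomial, and that both adjoined irrationals can be rebuilt from its powers (using i.3^1/2 = 2ω+1). The variable names are ours.

```python
import cmath

B = 7 ** (1 / 3)                  # real cubic root of 7
theta = 1j * 3 ** 0.5 * B         # the primitive element i.3^1/2.7^1/3

# theta is a root of x^6 + 1323, up to rounding error
assert abs(theta ** 6 + 1323) < 1e-6

# The two irrationals of the tower are recoverable from powers of theta
B_again = theta ** 4 / 63                     # since theta^4 = 63.7^1/3
omega_again = (-theta ** 3 / 21 - 1) / 2      # since theta^3 = -21.i.3^1/2
assert abs(B_again - B) < 1e-9
assert abs(omega_again - cmath.exp(2j * cmath.pi / 3)) < 1e-9
```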

Let's now find the possible automorphisms related to the equation

x³ − 7 = 0

From the point of view of Q, we have two "things", ω and B (B being 7^1/3), whose values are unknown to Q. The thing ω admits two automorphisms:

E(x) = V⁰(x): identity
V(x): ω ⇒ ω²

The thing B admits three automorphisms, with some help of ω:

E(x) = R⁰(x): identity
R(x): B ⇒ ωB
R²(x): B ⇒ ω²B

To check whether these automorphisms are really sound, you can do the same procedure shown for the quadratic, checking whether the identities ω³=1 and B³=7 hold under the automorphisms.

The automorphisms V(x) and R(x) have different effects over the roots of the equation. V(x) permutes two complex roots. R(x) rotates the three roots in a circular fashion. It is important to note the composition of these automorphisms is no longer commutative. V(R(x)) yields one result, while R(V(x)) yields another. By the way, R(R(V(x)))=V(R(x)).

The group of automorphisms of the example cubic equation has 6 elements, since there are 6 distinct possibilities of shuffling three roots: R, R.R, V, R.V, V.R, E (identity) — recalling that R.R.V=V.R. It is no coincidence the order of V(x) is 2 (because it has 2 elements), the order of R(x) is 3. The count of all combinations is 3×2=6. Groups with this characteristic are known as S₃.
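The bookkeeping above can be reproduced by treating each automorphism as a shuffle of the three roots. In the sketch below the roots B, ωB and ω²B are numbered 0, 1 and 2 (a numbering chosen only for this example), and the helper names after and generate are ours.

```python
# Roots of x^3 - 7 indexed as 0: B, 1: wB, 2: w^2.B
# An automorphism is a tuple p, meaning "root i is sent to root p[i]".
E = (0, 1, 2)
R = (1, 2, 0)      # B -> wB -> w^2.B -> B
V = (0, 2, 1)      # w -> w^2: swaps wB and w^2.B, leaves B alone

def after(f, g):
    """The automorphism f(g(x)): apply g first, then f."""
    return tuple(f[g[i]] for i in range(3))

def generate(gens):
    """All automorphisms reachable by composing the generators."""
    group, frontier = {E}, [E]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = after(g, p)
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

group = generate([R, V])
assert len(group) == 6                        # all 3! shuffles of the roots
assert after(V, R) != after(R, V)             # composition is not commutative
assert after(R, after(R, V)) == after(V, R)   # R(R(V(x))) = V(R(x))
```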

We said before the automorphism of a root is also a root. We know a cubic equation has 3 roots. So, if we apply all six automorphisms on some root (e.g. 7^1/3, equal to B) we should get only 3 distinct values. Let's check:

E(B) = B
R(B) = ω.B
R(R(B)) = R(ω.B) = R(ω)R(B) = ω.ω.B = ω²B
V(B) = B
R(V(B)) = R(B) = ω.B
V(R(B)) = V(ω.B) = V(ω)V(B) = ω²B

Testing with another root (ω.B), the same happens:

E(ω.B) = ω.B
R(ω.B) = R(ω)R(B) = ω.ω.B = ω²B
R(R(ω.B)) = R(ω.ω.B) = ω³B = B
V(ω.B) = V(ω)V(B) = ω²B
R(V(ω.B)) = R(ω²B) = ω²R(B) = ω³B = B
V(R(ω.B)) = V(ω.ω.B) = V(ω)V(ω)V(B) = ω⁴B = ω.B

When we extend Q with ω, making Q(ω), we fix ω. From that moment on, ω has a definite and palpable value, it is not a "thing" anymore. The valid automorphisms over Q(ω) preserve the value of ω, any number in the format a+bω, and of course they preserve all rationals (which are in the form a+0ω).

Given this new restriction, the remaining group of automorphisms has only the 3 remaining elements of R(x). This group of order 3 is commutative and cyclic. Note that, in spite of R(x) making use of ω to shuffle B, it does not touch the value of ω, nor any number in the form a+bω. It only affects irrational numbers that have B in their recipe (e.g. a+bB and a+bωB).

Finally, when we extend the field again to Q(ω)(7^1/3), the only possible automorphism is the identity (do-nothing), since there is no "thing" left. The roots of the polynomial, as well as all other intermediate results like discriminants, resolvents, etc., have definite values. There is an inverse relationship between the fields and the Galois groups:

Q ........... S₃
Q(ω) ........ C₃
Q(ω)(B) ...... C₁ (trivial group)

Therefore, increasing the height of the radical tower causes the shrinking of the respective Galois group.

An interesting fact: for a polynomial of degree n, the maximum degree of the extension (single or towered) to create the splitting field is n!. This is valid even for unsolvable equations of 5th degree or above, in which the only method to create the splitting field is to adjoin the roots themselves.

The reason is the following: suppose a fifth degree equation with roots x₁..x₅. We adjoin a root x_a to the field Q, creating the field Q(x_a), which is an extension of degree 5 in relation to Q. Now x_a is a "first-class citizen" in the realm of numbers. The polynomial is no longer irreducible, we can divide it by (x−x_a), and the quotient is equivalent to

(x−x_b)(x−x_c)(x−x_d)(x−x_e)

This is a fourth degree polynomial. At this point, x_b..x_e are roots of a fourth degree equation. The next extension Q(x_a)(x_b) will have degree 4 in relation to Q(x_a), Q(x_a)(x_b)(x_c) has degree 3 in relation to Q(x_a)(x_b), and so on. The total degree of the extension Q(x₁..x₅) in relation to Q, is 120 (5×4×3×2×1).

It is noteworthy that, despite the fact we can adjoin the roots in any order, the ones that are adjoined first are awarded a higher degree. This happens because all roots are related (for the Viète formulas to deliver rational coefficients). As soon as one root is known, the others become a little easier to find.

At the end of the process, the last adjoined root has degree 1, that is, it does not extend the field at all. Q(x_a,x_b,x_c,x_d) is the same field as Q(x₁..x₅). This means that, in an n-degree equation, once we know n−1 roots, the last one is easily determined.
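A small illustration of "the last root comes for free", using the cubic x³ − 7 = 0 from before rather than a quintic. By the Viète formulas the sum of the roots equals minus the coefficient of x², which is 0 here. The sketch works in floating point and the names are ours.

```python
import cmath

B = 7 ** (1 / 3)
omega = cmath.exp(2j * cmath.pi / 3)

known_roots = [B, omega * B]          # pretend only two of the three roots are known

# x^3 + 0.x^2 + 0.x - 7: the roots add up to -0 = 0
sum_of_roots = 0
last_root = sum_of_roots - sum(known_roots)

assert abs(last_root - omega ** 2 * B) < 1e-12   # it is the missing root w^2.B
assert abs(last_root ** 3 - 7) < 1e-9            # and it satisfies the equation
```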

A meticulous study of the formulas of 2nd, 3rd and 4th degree equations shows that each step of each formula is related to one field extension, that maps to a reduction of the Galois group. And, being the way it is, the problem can be approached in a more abstract fashion, looking only at the groups.

The Galois group of a polynomial relates to the complexity of finding a formula for the roots using only arithmetic and radicals. If the Galois group is solvable, such a formula exists. The definition of solvable group will be given later.

The general cubic equation has the Galois group S₃. Even if it has only real roots, the intermediate steps in the root formula must use complex numbers, and we need to have ω around. The only case in which an irreducible cubic equation has the smaller group C₃ is when the discriminant is a perfect square, which means the square root of the discriminant is rational.

The biggest possible group of a quartic equation is S₄, whose normal series can be

S₄ ⊳ A₄ ⊳ V₄ ⊳ C₂ ⊳ C₁

The groups in the above normal series have 24, 12, 4, 2, and 1 elements, respectively. Note that 24/12=2, 12/4=3, 4/2=2, 2/1=2, all quotients are prime numbers. Also, the series goes down to C₁, the trivial group with 1 element.

The definition of "normal subgroup" will be given later, but it is related to a normal field extension — that one which contains all the roots of some polynomial, and whose automorphisms make a group.

Group quotients are important because they are groups themselves. The group S₄ has order 24, the group A₄ has order 12. But the quotient between them is the tiny group C₂, and this is important because it represents the automorphisms of a field extension of degree 2 e.g. that one with irrational square roots +C and −C.

In the process of solving a quartic equation, we start by extending the field Q with a square root, whose group of automorphisms is C₂. When we do that, a "thing" whose value was unknown became a first-class number. In response, the group of permutations of roots is shrunk from S₄, being "divided" by C₂. The result (so far) is the group A₄.

Another possible normal series for milder quartic equations e.g. the biquadratics, is

D₄ ⊳ V₄ ⊳ C₂ ⊳ C₁

D₄ has 8 elements, it is a subgroup of S₄ but not a normal subgroup. This suggests the necessary tooling to solve this type of equation is different than the general case. But the trailer of the normal series is the same. Biquadratic equations with 4 real roots have an even smaller Galois group, V₄, and they are even easier to solve.

A group is solvable when all quotients of its normal series are abelian groups. Groups that have prime order (i.e. a prime number of elements) are always cyclic and abelian. All quotients found in the normal series of S₄ have small prime orders, so S₄ is abundantly solvable.

Every subgroup of a solvable group, even a non-normal subgroup, is solvable. For example, D₄ is not a normal subgroup of S₄, but the latter is solvable, so is the former.

A quotient between groups only exists when the "divisor" is a normal subgroup of the "dividend". In order to find out whether a group is solvable, we must look into all quotients between adjacent groups of the subgroup series. This series must be a normal series, otherwise some of these quotients would not exist.

The normal series of the general quintic equation is:

S₅ ⊳ A₅ ⊳ C₁

The quotients are 120/60=2 and 60/1=60. The group A₅ is simple, that is, it has no normal subgroups (save for the trivial C₁), which breaks the chain and makes S₅ unsolvable.
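One mechanical way to watch the chain break at A₅ is to compute the derived series, that is, to repeatedly take the subgroup generated by all commutators a⁻¹b⁻¹ab. This is a different tool from the normal series written above, used here only because it is easy to brute-force: a finite group is solvable exactly when this series reaches the trivial group. The sketch below (function names are ours) lists the group orders along the series.

```python
from itertools import permutations

def then(p, q):
    """Apply p first, then q (tuples mapping index -> image)."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def generate(gens, identity):
    """Subgroup generated by gens (closure under composition)."""
    group, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = then(p, g)
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

def derived_series_orders(n):
    """Orders of S_n, its commutator subgroup, and so on until it stops shrinking."""
    identity = tuple(range(n))
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        commutators = {then(then(inverse(a), inverse(b)), then(a, b))
                       for a in group for b in group}
        smaller = generate(commutators, identity)
        if len(smaller) == len(group):
            return orders
        orders.append(len(smaller))
        group = smaller

print(derived_series_orders(4))   # reaches order 1: solvable
print(derived_series_orders(5))   # gets stuck at order 60 (A5): not solvable
```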

In plain language, the group A₅ is too intricate to be disassembled in smaller parts. The impossibility of gradually shrinking this group maps to the impossibility of building a splitting field through radical extensions, which means there is no arithmetic formula for the roots.

The only way to build the splitting field for the general quintic is to adjoin the roots themselves, at least 4 of them. (Adjoining 1, 2 or 3 roots does not give us a normal extension. It becomes normal with 4, since the field becomes splitting and we can easily calculate the 5th root.) Unfortunately, this "possibility" does not help us at all when we want to calculate unknown roots...

On the other hand, the quotient 120/60=2 shows we can make the very first step of solving the equation, using only arithmetic. Every normal series of every polynomial equation of any degree starts with a quotient 2. This quotient is related to the discriminant formula.

For any degree, there exists a coefficient-based formula for the discriminant, and the square root of the discriminant always belongs to the splitting field containing the roots. The discriminant tells a lot about the nature of the roots e.g. a zero discriminant means there are multiple (non-unique) roots.

A solvable quintic equation has a Galois group F₂₀, whose normal series is

F₂₀ ⊳ D₅ (10 elem.) ⊳ C₅ ⊳ C₁

Note 20/10=2, 10/5=2 and 5/1=5. The first two quotients map to the calculation of the quintic roots of unity, that have degree 4 (2×2), and the latter quotient maps to the quintic root of some rational number.

F₂₀ is a subgroup of S₅, but it is not a normal subgroup. This is consistent with the fact that solvable quintic equations are fundamentally different from the general quintics, and the toolchain that solves one type is useless on the other type.

Going deeper

The group F₂₀ is generated by two automorphisms:

H(x): ω ⇒ ω²
R(x): B ⇒ ωB

where ω is a quintic primitive root of unity. And, in the case of the equation

x⁵ − 3 = 0

that we have been using as example, "B" is the real quintic root of 3.

With some effort, we find that H(x) maps ω to ω², this to ω⁴, this to ω³, and this to ω, creating an X-shaped cycle of 4 elements. In group lingo, the Galois group of H(x) has a generator (1243). Soon we will show how to work with this syntax of permutations.

Applying H(x) repeatedly on the example equation will shuffle the complex roots around, but the real root is fixed, so there are only 4 possible permutations.

The automorphism R(x) rotates the five roots in a circular fashion, achieving 5 different possibilities. The generator element is (12345). Combining R(x) and H(x) gives us 20 different ways of shuffling the roots.

Another possible automorphism is the traditional complex conjugation, i.e. map every complex number of the field to its conjugate. Let's call it C(x). Its generator is (14)(23), and it creates just two possible ways to permute the roots. How does it fit between H(x) and R(x)? Using the syntax of permutations, we find that

(1243)(1243) = (14)(23)

That is, the group generated by C(x) is just a subgroup of the group generated by H(x). It is a normal subgroup, since the group of H(x) has only 4 elements, so it must be abelian, and every subgroup of an abelian group is normal.
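The counts quoted in this section can be verified by brute force. The sketch below assumes a particular labelling of the roots of x⁵ − 3 = 0, chosen to match the generators quoted above: ωB, ω²B, ω³B, ω⁴B are numbered 1 to 4 and the real root B is number 5. The helper names are ours.

```python
def from_cycles(cycles, n=5):
    """Mapping tuple (index 0 unused) built from cycles such as [(1, 2, 4, 3)]."""
    perm = list(range(n + 1))
    for cycle in cycles:
        for k, item in enumerate(cycle):
            perm[item] = cycle[(k + 1) % len(cycle)]
    return tuple(perm)

def then(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def generate(gens):
    """Group generated by the given permutations."""
    identity = tuple(range(len(gens[0])))
    group, frontier = {identity}, [identity]
    while frontier:
        p = frontier.pop()
        for g in gens:
            q = then(p, g)
            if q not in group:
                group.add(q)
                frontier.append(q)
    return group

H = from_cycles([(1, 2, 4, 3)])       # w -> w^2, real root fixed
R = from_cycles([(1, 2, 3, 4, 5)])    # B -> wB, rotates all five roots
C = from_cycles([(1, 4), (2, 3)])     # complex conjugation

assert then(H, H) == C                # (1243)(1243) = (14)(23)
assert len(generate([H])) == 4
assert len(generate([R])) == 5
assert len(generate([R, C])) == 10    # the D5 layer
assert len(generate([R, H])) == 20    # F20 in full
```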

More about the permutation syntax

Group theory is heavily based on the study of permutation groups. For example, the word ABC has 6 possible anagrams or permutations: ABC, BCA, CAB, ACB, CBA, BAC. The isomorphic group is S₃ where "3" is the number of letters. The group itself has 6 elements, since 6=3×2×1=3! Every group S_n has order n!.

Permutation groups are the "kings of groups". They were the first to be studied, since they represent what happens when we shuffle roots in the context of the Galois theory; and it has been proven that every group, of any shape or form, is isomorphic to a subgroup of some permutation group.

In the literature, permutations are often expressed using the cycle notation or syntax. For example, (12) means "element 1 goes to position 2, the element 2 goes to the position 1". The parentheses delimit a cycle. The word ABC operated by (12) becomes BAC. The permutations (12) and (21) are the same, but the custom is to start by the lowest number.

(132) means "move element 1 to position 3, element 3 to position 2, and element 2 to position 1". ABC.(132)=BCA. A cycle with a single number e.g. (1) is the identity permutation, it does nothing ("move element 1 to position 1"). It is the neutral element present in every permutation group. The labels "id" or "e" are also used to represent the neutral permutation.

Compositions of permutations may be disjoint e.g. (12)(34) is disjoint since one cycle does not affect the other. When disjoint, they are commutative e.g. ABCD.(12)(34) = ABCD.(34).(12) = BADC.

On the other hand, (12)(13) and (13)(12) are not disjoint and are not commutative e.g. ABC.(12)(13) = CAB but ABC.(13)(12) = BCA.

A composition of permutations may often be simplified. Trivial example: (12)(12)=(1)=neutral. It is noteworthy to check that (12)(13)=(123) and (13)(12)=(132). Let's now describe the practical rule to calculate and simplify a composition, using the example:

(12)(23)

Start by the lowest number, whatever it is, and write it down:

(12)(23)=(1

The first cycle moves 1 to 2, and the next cycle moves 2 to 3. It is useful to say it aloud "1 goes to 2, and 2 goes to 3, so 1 goes to 3". Write 3 down:

(12)(23)=(13

Now, consider the number 3. It exists only at the right cycle, which moves it to 2, so

(12)(23)=(132

No more numbers left, so we are finished. A proof the procedure went well is to check that 2 goes to 1 (as it does, by means of left cycle), and the result cycle was correctly closed:

(12)(23)=(132)

In a case like (23)(23), 2 goes to 3 and 3 goes back to 2, so 2 ends up where it started. Something similar happens to 3. So the result would be

(23)(23)=(2)(3)

but (2) and (3) are neutral permutations, so they can be omitted, or replaced by the customary (1):

(23)(23)=(1)
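The practical rule just described (follow each number through the cycles from left to right, starting with the lowest one) is easy to automate. The function below is a sketch written for this text, with a made-up name; cycles are given as tuples and n is the number of elements being permuted.

```python
def simplify(cycles, n):
    """Compose cycles left to right and return the result in cycle notation."""
    image = {}
    for start in range(1, n + 1):
        item = start
        for cycle in cycles:               # follow the number through each cycle in turn
            if item in cycle:
                item = cycle[(cycle.index(item) + 1) % len(cycle)]
        image[start] = item

    seen, parts = set(), []
    for start in range(1, n + 1):          # open each cycle at the lowest unused number
        if start in seen or image[start] == start:
            continue                       # fixed points such as (2) are omitted
        cycle, item = [], start
        while item not in seen:
            seen.add(item)
            cycle.append(item)
            item = image[item]
        parts.append("(" + "".join(str(k) for k in cycle) + ")")
    return "".join(parts) or "(1)"         # nothing moved: the neutral permutation

assert simplify([(1, 2), (2, 3)], 3) == "(132)"
assert simplify([(1, 2), (1, 3)], 3) == "(123)"
assert simplify([(1, 3), (1, 2)], 3) == "(132)"
assert simplify([(2, 3), (2, 3)], 3) == "(1)"
```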

Caveat: it is very easy to confuse the group action with the final position of the permuted elements. For example, the word ABC does not correspond to the action (123), but to the neutral action (1). If we express the word using digits e.g. 123 instead of ABC, the confusion is even easier to take place. How to explain that "123" is not the result of action (123)?

The correspondence between anagrams and permutations of the base word ABC is as follows:

ABC   (1)
BAC   (12)
CBA   (13)
ACB   (23)
CAB   (123)
BCA   (132)

Since it is a group of permutations, the number of anagrams is the same as the group order. Any permutation action in S₃ different from the list above, can be simplified or converted to the canonical form.

Often we prefer to use the "language of groups" instead of the anagrams. Instead of saying BAC, we say (12), even when we mean the anagram instead of the action. When we want to apply the permutation (123) on BAC, we say (12)(123) instead of BAC.(123). Note that (12)(123) can be simplified to (13), which is immediately relatable to CBA. This is possible in permutation groups because there is a 1:1 mapping between actions and anagrams.

A very compact form of expressing a group or subgroup is using only the generators — those elements that, combined among themselves, produce all others. In the case of S₃, two possible generators are (123) and (23), so

S₃ = <(123),(23)>

It is left as an exercise to check that these two elements can be combined to reach all others. We could have chosen other generators e.g. (321) and (13). The important thing is to have an order-3 generator and an order-2 generator.

One last important concept: there are "even" and "odd" permutations. An even permutation can be expressed by an even number of transpositions. For example, in S₃, (132) is equal to (12)(23), so (132) is even. On the other hand, (12) is odd. The identity (1) is considered even.

In the literature, we find that A_n is the subgroup of S_n containing only the even permutations. The group A₃ is the subgroup <(123)> of S₃. On the other hand, the odd permutations cannot form a subgroup because they lack the identity (which must be even).

The composition of permutations is analogous to arithmetic sum: the combination of even permutations is even, the combination of two odd permutations is even, the combination of an odd number of odd permutations is odd.
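Parity can be computed from the cycle structure: a cycle of length k is worth k−1 transpositions. The sketch below uses 0-based tuples of images instead of cycle notation (so (1, 2, 0) is a 3-cycle); this representation and the function name are choices made for the example.

```python
from itertools import permutations

def parity(perm):
    """'even' or 'odd', counting k-1 transpositions for every cycle of length k."""
    seen, transpositions = set(), 0
    for start in range(len(perm)):
        length, item = 0, start
        while item not in seen:
            seen.add(item)
            item = perm[item]
            length += 1
        if length:
            transpositions += length - 1
    return "even" if transpositions % 2 == 0 else "odd"

S3 = list(permutations(range(3)))
A3 = [p for p in S3 if parity(p) == "even"]

assert len(A3) == 3                    # identity plus the two 3-cycles
assert parity((0, 1, 2)) == "even"     # identity
assert parity((1, 0, 2)) == "odd"      # a single swap
assert parity((1, 2, 0)) == "even"     # a 3-cycle
```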

What is a normal subgroup?

There are many definitions of normal subgroup. We will use hC = Ch. That is, to compose some element h of the parent group H with every element of the subgroup C, by the left or by the right, should yield the same result set, the coset.

This is easier done than said. Let's take S₃, whose generators can be (123) and (23). In isolation, <(23)> is a commutative subgroup of order 2, while <(123)> is commutative of order 3. But only <(123)> is a normal subgroup:

 h    C   = hC
(23)(1)   = (23)
(23)(123) = (12)
(23)(132) = (13)

  C   h   = Ch
  (1)(23) = (23)
(123)(23) = (13)
(132)(23) = (12)

Note that "multiplying" by (23) is not a commutative operation. (123)(23) yields a different result than (23)(123). But the coset obtained by composing (23) with the whole <(123)>, leftwise or rightwise, is the same: {(12),(13),(23)}. Note that sets are not ordered internally, so {a, b} = {b, a}.

Doing this exercise with the generator (23) is enough, but we can double check by doing the same with every element of S₃ and reach the same conclusion.

Note that a coset is a "layer" of a group, but it is not necessarily a subgroup!

Now, the subgroup {(1),(23)} is not normal:

 h    C   = hC
(123)(1)  = (123)
(123)(23) = (13)

 C    h   = Ch
(1) (123) = (123)
(23)(123) = (12)

The combination of {(1),(23)} with the remaining generator (123) yielded different cosets {(123), (13)} and {(123), (12)} as the combination was left-handed or right-handed.

Another, very compact, definition of normal subgroup is

h⁻¹Ch=C

where C is the subgroup and h is any element of the parent group H. Taking the subgroup {(1), (123), (132)} of S₃,

(23)⁻¹(123)(23) = (23)(123)(23) = (12)(23) = (132)

(132) still belongs to <(123)>. The operation was made easier due to (23) element being its own inverse.

Now consider the subgroup {(1), (23)}, and test with the generator h=(123),

(123)⁻¹(23)(123) = (321)(23)(123) = (12)(123) = (13)

(13) does not belong to <(23)>, therefore the tested subgroup is not normal.
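The compact definition h⁻¹Ch=C translates directly into a brute-force test. The sketch below represents the elements of S₃ as 0-based tuples of images, so (1, 2, 0) and (2, 0, 1) are the two 3-cycles and (0, 2, 1) is the swap written (23) in the text. Helper names are ours.

```python
from itertools import permutations

def then(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def is_normal(subgroup, group):
    """True when h^-1.c.h stays inside the subgroup for every h and c."""
    return all(then(then(inverse(h), c), h) in subgroup
               for h in group for c in subgroup)

S3 = set(permutations(range(3)))
C3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}    # <(123)>
C2 = {(0, 1, 2), (0, 2, 1)}               # <(23)>

assert is_normal(C3, S3)
assert not is_normal(C2, S3)
```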

Looking into the concept of group quotient

We have said that a quotient group only exists when the divisor is a normal subgroup of the dividend. Let's test this definition. We have just seen that C₃ and C₂ are subgroups of S₃, but only C₃ is a normal subgroup.

The "division" of S₃ by C₃ is found by "multiplying" each element of S₃ by the whole C₃ group, leftwise and rightwise, and tally the cosets we get:

(1) . {(1),(123),(132)} = {(1),(123),(132)} = A
(12) . {(1),(123),(132)} = {(12),(13),(23)} = B
(13) . {(1),(123),(132)} = {(13),(23),(12)} = B
(23) . {(1),(123),(132)} = {(23),(12),(13)} = B
(123) . {(1),(123),(132)} = {(123),(132),(1)} = A
(132) . {(1),(123),(132)} = {(132),(1),(123)} = A

{(1),(123),(132)} . (1) = {(1),(123),(132)} = A
{(1),(123),(132)} . (12) = {(12),(23),(13)} = B
{(1),(123),(132)} . (13) = {(13),(12),(23)} = B
{(1),(123),(132)} . (23) = {(23),(13),(12)} = B
{(1),(123),(132)} . (123) = {(123),(132),(1)} = A
{(1),(123),(132)} . (132) = {(132),(1),(123)} = A

We got two cosets, A and B, that occur in same frequency and equally scattered in left-handed and right-handed operations. We can say it is analogous to the arithmetic operation (6/3=2).

We can devise a conjugation group M(x) with two elements (1) and (23), that acts upon the cosets, swapping them. Conjugating or "multiplying" all elements of coset A by (23) yields the coset B, and vice-versa. So we can say M(A)=B, M(B)=A and M(M(A))=A. By the way, M(x) is exactly the quotient group we have mentioned before.

On the other hand, if we try to do the same with the subgroup C₂, generated by (23),

(1) . {(1),(23)} = {(1),(23)} = A
(12) . {(1),(23)} = {(12),(132)} = B
(13) . {(1),(23)} = {(13),(123)} = C
(23) . {(1),(23)} = {(23),(1)} = A
(123) . {(1),(23)} = {(123),(13)} = C
(132) . {(1),(23)} = {(132),(12)} = B

{(1),(23)} . (1) = {(1),(23)} = A
{(1),(23)} . (12) = {(12),(123)} = D
{(1),(23)} . (13) = {(13),(132)} = E
{(1),(23)} . (23) = {(23),(1)} = A
{(1),(23)} . (123) = {(123),(12)} = D
{(1),(23)} . (132) = {(132),(13)} = E

See? We got five different cosets, in different frequencies and irregularly scattered (e.g. D shows up only in right-handed compositions). It is not possible to devise a conjugation group with these cosets. Therefore, there is no quotient group.
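The two tallies can be repeated by machine. The sketch below collects the distinct cosets obtained from left-handed and right-handed compositions; elements of S₃ are 0-based tuples of images, and the function name cosets is ours.

```python
from itertools import permutations

def then(p, q):
    """Apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))

S3 = list(permutations(range(3)))
C3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]    # the subgroup of 3-cycles
C2 = [(0, 1, 2), (0, 2, 1)]               # the subgroup generated by one swap

def cosets(subgroup, side):
    """Distinct cosets h.C (side='left') or C.h (side='right') over all h in S3."""
    found = set()
    for h in S3:
        if side == "left":
            found.add(frozenset(then(h, c) for c in subgroup))
        else:
            found.add(frozenset(then(c, h) for c in subgroup))
    return found

# Normal subgroup: left and right cosets coincide, 6/3 = 2 of them
assert len(cosets(C3, "left") | cosets(C3, "right")) == 2

# Non-normal subgroup: 3 cosets on each side, but 5 different ones overall
assert len(cosets(C2, "left")) == 3
assert len(cosets(C2, "left") | cosets(C2, "right")) == 5
```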

If the subgroup generated by (23) was normal, the right-hand operations would generate the same cosets A,B,C of the left-handed operations. Then we would have 3 cosets in equal frequency, and the generator (123) would be able to conjugate them in a cyclic fashion.

There is a 6-element group divisible by 2: it is Z₆ (addition modulo 6), isomorphic to C₆. Dividing by C₂ works for this particular group because addition is commutative, and all subgroups of a commutative (abelian) group are normal. The non-trivial proper subgroups of (0,1,2,3,4,5) are (0, 3) and (0, 2, 4). Doing the coset test with (0, 3):

(0, 3) + 0 = (0, 3) = A
(0, 3) + 1 = (1, 4) = B
(0, 3) + 2 = (2, 5) = C
0 + (0, 3) = (0, 3) = A
1 + (0, 3) = (1, 4) = B
2 + (0, 3) = (2, 5) = C

Now we got 3 well-scattered cosets, respecting the analogy 6/2=3. Here, the group quotient panned out because C₂ is a normal subgroup of Z₆, while it was not a normal subgroup of S₃.
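For a group as small as Z₆, both the subgroups and the cosets can be found by exhaustive search. The sketch below treats a subset as a subgroup when it contains 0 and is closed under addition modulo n, which is sufficient for finite sets; the function names are ours.

```python
from itertools import combinations

def is_subgroup(subset, n):
    """Finite subset of Z_n containing 0 and closed under addition mod n."""
    return 0 in subset and all((a + b) % n in subset for a in subset for b in subset)

def subgroups(n):
    found = []
    for size in range(1, n + 1):
        for subset in combinations(range(n), size):
            if is_subgroup(set(subset), n):
                found.append(subset)
    return found

def cosets(subgroup, n):
    """Distinct cosets g + subgroup; left and right agree because addition commutes."""
    return {frozenset((g + h) % n for h in subgroup) for g in range(n)}

assert subgroups(6) == [(0,), (0, 3), (0, 2, 4), (0, 1, 2, 3, 4, 5)]
assert len(cosets((0, 3), 6)) == 3        # 6/2 = 3
assert len(cosets((0, 2, 4), 6)) == 2     # 6/3 = 2
```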

Conversely, the group Z₆ may be assembled by a direct product of groups Z₂ and Z₃:

Z₃ = (0, 1, 2)
Z₂ = (0, 1)
Z₆ = (0, 1, 2) × (0, 1) = (0:0, 0:1, 1:0, 1:1, 2:0, 2:1)
Replacing a:b by (2a+3b) modulo 6,
Z₆ = (0, 3, 2, 5, 4, 1)

A direct product is a simple crossing of two groups, in a way that both will be normal subgroups of the result. That is, there will be a quotient group between a direct product and either of the building blocks.

Now, there is the semidirect product. It is also a crossing of two groups, but only one will be a normal subgroup of the result. It is exactly what happens when we create S₃ through a product:

C₃ ~= {(1), (123), (132)}
C₂ ~= {(1), (23)}
S₃ = {(1)(1), (123)(1), (132)(1), (1)(23), (123)(23), (132)(23)}
S₃ = {(1), (123), (132), (23), (13), (12)}

We have expressed the groups C₃ and C₂ as subgroups of S₃, so the result is promptly recognizable as the elements of S₃. Despite the smaller groups being commutative themselves, the final product is not. There is a quotient group between S₃ and C₃ because the latter is a normal subgroup of the former, but there is no quotient between S₃ and C₂.

A direct product of two abelian groups is also abelian. Since S₃ is not abelian, it simply cannot be a direct product of groups isomorphic to C₂ and C₃.

Every permutation group S_n can be expressed as a semidirect product of the alternating group A_n (the normal subgroup of S_n that contains the even permutations of S_n) and C₂. The last question is whether A_n can be expressed as a semidirect product. The answer is "No" when n is 5 or greater.

The simple groups, those that don't have normal subgroups, are analogous to the prime numbers, since neither of them can be expressed as products of smaller non-trivial groups or numbers.

Why are quotients so important for Galois?

As said before, for a polynomial equation to have a formula for the roots, its Galois group must be solvable. We deem "solvable" the group with a normal subgroup series, and each quotient group between adjacent groups in this series must be an abelian group.

But why is this a requirement, and what is the relationship with field extensions?

As shown before, the extended field that contains all the roots of an n-degree equation has an extension degree of n! at the most. If n=5, n!=120. One tautological way to build this extension field is to adjoin the roots themselves, so we can take for granted the existence of this field.

If we don't know the roots, but we want to solve the equation using arithmetic and radicals, we must extend the field Q by adjoining square roots, cubic roots, and so on. An n-th root with n prime generates a cyclic group of automorphisms. Going back once again to the example

x² − 3 = 0

The roots 3^1/2 and −3^1/2 are interchangeable from the point of view of field Q. They are swapped by the automorphism group C₂. As soon as we extend the field to Q(3^1/2), the group of automorphisms shrinks to C₁, because 3^1/2 has just been accepted as a number with definite value and we cannot change it anymore.

Thinking it backwards: Q(3^1/2) is the splitting field of the equation above, since it contains all its roots. If we remove 3^1/2 from the field, we increase the number of possible automorphisms, from 1 to 2, since a value that was once well-known, is now a mystery "thing" C, and either +C or −C will satisfy the equation.

Let's consider the cubic equation

x³ − 5 = 0

The splitting field is Q(ω, B), where ω is the primitive cubic root of unity, and B is the real cubic root of 5. The Galois group of this field is {(1)}, with no automorphisms other than the neutral. Now, the objective is to remove extensions one by one until we reach the group S₃ with 6 automorphisms.

If we remove ω from the field, we stay with Q(B). This is not a good intermediate field, since it is not a normal extension field. But why is this bad? Because, of three equation roots (B, ωB and ω²B), two became unknowns (since ω became a "thing") but one is still a perfectly known value (B). Being known, it is fixed in place.

In technical language, the remaining group of automorphisms is not transitive, since we cannot move the root B to another position, and we cannot move another root into B's position. We broke the rule "every root can be carried to every other root by some automorphism": the image of B under any automorphism is still B.

The other option is to remove B from the field, leaving Q(ω). This is a good option, since B became an unknown with 3 automorphisms, and all 3 roots of the equation are now unknowns as well. Translating this to groups, we took the direct product of (1) and <(123)>, obtaining <(123)>. This is a transitive group.
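The difference between the two choices can be seen by computing orbits. In the sketch below the roots are numbered 1 = B, 2 = ωB, 3 = ω²B (a labelling assumed for this example), and the helper names from_cycles, generate and orbits are hypothetical. The group left after removing ω swaps roots 2 and 3 but never touches root 1, while the group left after removing B moves every root to every position.

```python
def compose(p, q):
    """Permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[i] for i in q)

def from_cycles(n, *cycles):
    """Build a permutation of {1..n} (stored 0-based) from disjoint 1-based cycles."""
    p = list(range(n))
    for c in cycles:
        for a, b in zip(c, c[1:] + c[:1]):
            p[a - 1] = b - 1
    return tuple(p)

def generate(gens):
    """Subgroup generated by gens (closure under composition)."""
    identity = tuple(range(len(gens[0])))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

def orbits(group, n):
    """Partition the root labels 1..n into orbits under the group."""
    seen, result = set(), []
    for i in range(n):
        if i not in seen:
            orb = {g[i] for g in group}
            seen |= orb
            result.append(sorted(x + 1 for x in orb))
    return result

fix_B = generate([from_cycles(3, (2, 3))])       # ω removed, B still known
rotate = generate([from_cycles(3, (1, 2, 3))])   # B removed, ω still known

print(orbits(fix_B, 3))    # [[1], [2, 3]]  -> not transitive
print(orbits(rotate, 3))   # [[1, 2, 3]]    -> transitive
```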

The next step is to remove ω from Q(ω), leaving Q alone. Now, ω is also an unknown thing with 2 automorphisms: {(1), (23)}, which is a non-normal subgroup of S₃. In the realm of Galois groups, we just did a semidirect product:

<(123)> ⋊ <(23)> = S₃ in full.

In an equation solvable by radicals, the Galois group is built up this way, by a chain of semidirect products. Each product is between a normal subgroup (representing all extensions removed so far) and a cyclic subgroup (representing the next extension being removed, which is an irrational p-th root). This sequence assembles a normal series, from (1) to the final group.

In order to recover the cyclic groups, related to the p-th roots, starting with the big group, we do the inverse operation of the semidirect product: the quotient. We take the quotients of adjacent groups of the normal series. For example, S₃/<(123)> yields a "quotient" isomorphic to <(23)>, and <(123)>/(1) = <(123)>.
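The cubic example can be verified by brute force. The sketch below, with hypothetical helper names and permutations stored as 0-based tuples, checks the four facts used above: <(123)> is normal in S₃, <(23)> is not, the products of their elements cover all of S₃, and the quotient S₃/<(123)> has exactly 2 cosets.

```python
from itertools import permutations

def compose(p, q):
    """Permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def from_cycles(n, *cycles):
    """Build a permutation of {1..n} (stored 0-based) from disjoint 1-based cycles."""
    p = list(range(n))
    for c in cycles:
        for a, b in zip(c, c[1:] + c[:1]):
            p[a - 1] = b - 1
    return tuple(p)

def generate(gens):
    """Subgroup generated by gens (closure under composition)."""
    identity = tuple(range(len(gens[0])))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

def is_normal(sub, group):
    """True when g s g^-1 stays inside sub for every g in group and s in sub."""
    return all(compose(compose(g, s), inverse(g)) in sub
               for g in group for s in sub)

S3 = set(permutations(range(3)))
N = generate([from_cycles(3, (1, 2, 3))])   # automorphisms tied to B
H = generate([from_cycles(3, (2, 3))])      # automorphisms tied to ω

products = {compose(n, h) for n in N for h in H}
cosets = {frozenset(compose(g, n) for n in N) for g in S3}

print(is_normal(N, S3), is_normal(H, S3))   # True False
print(products == S3, len(cosets))          # True 2
```

Only one of the two factors is normal, which is why the product is semidirect and not direct.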

One clarification remains to be made. The automorphism groups of ω and B are both cyclic, isomorphic to C₂ and C₃ respectively. Why is their product S₃ instead of C₆, given that C₂ and C₃ are both normal subgroups of the latter? It would be even better to use the direct product instead of the semidirect one, since we would have two normal series to choose from.

Answer: the group S₃ is our final objective, since it is a permutation group: it represents and abstracts accurately what happens when we permute the roots of a cubic equation. It is true that C₆ has 6 elements too, but it does not represent what happens in a game of permutations. It is also true that a polynomial equation of degree n may have a Galois group smaller than S_n, but such a group will always be a subgroup of S_n. C₆ is not a subgroup of S₃, so there is no chance to relate it to a cubic equation.
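A quick way to see that C₆ cannot live inside S₃ is to list the orders of the six permutations: a copy of C₆ would need an element of order 6. The sketch below (helper names are hypothetical, permutations are 0-based tuples) also checks that two transpositions do not commute, something impossible in the abelian group C₆.

```python
from itertools import permutations

def compose(p, q):
    """Permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[i] for i in q)

def order(p):
    """Smallest k >= 1 such that p composed with itself k times is the identity."""
    identity = tuple(range(len(p)))
    k, q = 1, p
    while q != identity:
        q = compose(p, q)
        k += 1
    return k

orders = sorted(order(p) for p in permutations(range(3)))
print(orders)               # [1, 2, 2, 2, 3, 3]

a = (1, 0, 2)               # the transposition (12)
b = (0, 2, 1)               # the transposition (23)
print(compose(a, b) == compose(b, a))   # False: S₃ is not abelian
```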

That being so, in order to combine C₃ and C₂ such that the result is S₃, or some subgroup of S₃, we first convert the elements of C₃ and C₂ to elements of S₃.

The only subgroup of S₃ with order 3 is <(123)>. By the way, this is a sufficient sign of normality: a subgroup that is the only one of its size is necessarily normal (the converse does not hold in general). On the other hand, there are three subgroups of S₃ with order 2: <(12)>, <(13)> and <(23)>. Taking the product of any of them with <(123)>, we get the whole S₃.
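These counts are small enough to enumerate. The sketch below, using hypothetical helper names, collects the cyclic subgroups of S₃ generated by each single element, groups them by size, and then checks that each order-2 subgroup together with the order-3 subgroup generates all 6 permutations.

```python
from itertools import permutations

def compose(p, q):
    """Permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[i] for i in q)

def generate(gens):
    """Subgroup generated by gens (closure under composition)."""
    identity = tuple(range(len(gens[0])))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

S3 = list(permutations(range(3)))

# One cyclic subgroup per element; the set removes duplicates.
cyclic = {frozenset(generate([p])) for p in S3}

by_order = {}
for sub in cyclic:
    by_order.setdefault(len(sub), []).append(sub)
counts = {size: len(subs) for size, subs in by_order.items()}

N = by_order[3][0]   # the unique subgroup of order 3
joins = [len(generate(list(N) + list(H))) for H in by_order[2]]

print(sorted(counts.items()))   # [(1, 1), (2, 3), (3, 1)]
print(joins)                    # [6, 6, 6]
```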

The group A₅, a normal subgroup of S₅, has no further normal subgroups apart from the trivial (1). The "quotient" is the A₅ group itself. This group is not cyclic (it is not even abelian), so it cannot be related to a 60th root extension. Being simple (no normal subgroups other than (1) and itself), this group cannot be built up by a sequence of semidirect products, therefore it cannot be mapped to a tower of field extensions by radicals.
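The simplicity of A₅ can be confirmed by a counting argument that fits in a few lines. A normal subgroup is a union of conjugacy classes that includes the identity, and its size must divide the size of the group. The sketch below (helper names are hypothetical; A₅ is built as the even permutations of 5 points) computes the class sizes and tries every possible union.

```python
from itertools import combinations, permutations

def compose(p, q):
    """Permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)

def is_even(p):
    """A permutation is even when it has an even number of inversions."""
    inversions = sum(p[i] > p[j]
                     for i in range(len(p)) for j in range(i + 1, len(p)))
    return inversions % 2 == 0

def conjugacy_classes(group):
    remaining, classes = set(group), []
    while remaining:
        x = next(iter(remaining))
        cls = {compose(compose(g, x), inverse(g)) for g in group}
        classes.append(cls)
        remaining -= cls
    return classes

A5 = [p for p in permutations(range(5)) if is_even(p)]
identity = tuple(range(5))

classes = conjugacy_classes(A5)
class_sizes = sorted(len(c) for c in classes)
other_sizes = [len(c) for c in classes if identity not in c]

# Try every union of classes containing the identity; keep sizes dividing 60.
normal_orders = set()
for r in range(len(other_sizes) + 1):
    for combo in combinations(other_sizes, r):
        total = 1 + sum(combo)
        if len(A5) % total == 0:
            normal_orders.add(total)

print(class_sizes)             # [1, 12, 12, 15, 20]
print(sorted(normal_orders))   # [1, 60]
```

The only sizes that survive are 1 and 60, that is, the trivial subgroup and A₅ itself.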

One possibility to rule out: perhaps the Galois group of a quintic equation is actually a subgroup of S₅ different from A₅? We have seen that solvable quintics map to the Galois group F₂₀. Perhaps it is the case that every quintic equation maps to F₂₀?

Suppose an irreducible quintic equation with 3 real roots and 2 complex ones; examples are easily found in textbooks. In this case, the automorphism C(x) that replaces complex numbers by their conjugates will swap the pair of complex roots, leaving the other 3 fixed. It is a simple transposition; the respective group element is something like (12) or (34).
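One concrete case, chosen here only as an illustration and not taken from the text above, is x⁵ − 4x + 2. The sketch below checks Eisenstein's criterion at the prime 2 (a sufficient test for irreducibility over Q) and counts sign changes of the polynomial at a few integers. Three sign changes give at least 3 real roots; since the derivative 5x⁴ − 4 has only two real zeros, there cannot be more than 3, so the remaining 2 roots are a complex conjugate pair. The function names are hypothetical.

```python
coeffs = [1, 0, 0, 0, -4, 2]   # x^5 - 4x + 2, highest degree first

def f(x):
    """Evaluate the polynomial with Horner's rule."""
    value = 0
    for c in coeffs:
        value = value * x + c
    return value

def eisenstein(coeffs, p):
    """Eisenstein's criterion at the prime p: p does not divide the leading
    coefficient, divides all the others, and p*p does not divide the last."""
    lead, *rest = coeffs
    return (lead % p != 0
            and all(c % p == 0 for c in rest)
            and rest[-1] % (p * p) != 0)

samples = [f(x) for x in range(-2, 3)]   # f(-2), f(-1), f(0), f(1), f(2)
sign_changes = sum(1 for a, b in zip(samples, samples[1:]) if a * b < 0)

print(eisenstein(coeffs, 2))   # True
print(samples)                 # [-22, 5, 2, -1, 26]
print(sign_changes)            # 3
```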

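On the other hand, every irreducible quintic equation admits a permutation that rotates all roots in a circular motion, e.g. (12345), in its Galois group. The field extension Q(x_i) (where x_i is any root) has degree 5, so the order of the Galois group is a multiple of 5, and by Cauchy's theorem the group contains an element of order 5, which in S₅ can only be a 5-cycle. Therefore, solvable or not, the Galois group of an irreducible quintic always has an order that is a multiple of 5.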

Well, one transposition and one rotation of all 5 roots are enough to generate the whole permutation group, S₅ in this case (this works because 5 is prime). So, at least for irreducible quintic polynomials with exactly 3 real roots, the Galois group has the maximum size, and they are not solvable by radicals.
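The claim that a transposition and a 5-cycle generate everything is easy to test by brute force. The sketch below uses (12) and (12345) as the two generators (any transposition together with any 5-cycle would do, since 5 is prime) and counts the elements of the generated group; the helper names are hypothetical.

```python
def compose(p, q):
    """Permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[i] for i in q)

def from_cycles(n, *cycles):
    """Build a permutation of {1..n} (stored 0-based) from disjoint 1-based cycles."""
    p = list(range(n))
    for c in cycles:
        for a, b in zip(c, c[1:] + c[:1]):
            p[a - 1] = b - 1
    return tuple(p)

def generate(gens):
    """Subgroup generated by gens (closure under composition)."""
    identity = tuple(range(len(gens[0])))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

swap = from_cycles(5, (1, 2))               # complex conjugation
rotation = from_cycles(5, (1, 2, 3, 4, 5))  # the 5-cycle

group = generate([swap, rotation])
print(len(group))   # 120 = 5!, so the group is all of S₅
```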

But a quintic equation with 4 complex roots may have a smaller Galois group. The permutation (1243) rotates the complex roots along an X-shaped orbit, and (12345) is the always-present circular rotation of all roots. These two permutations generate a subgroup of S₅ having just 20 elements. In this case, the ever-present complex conjugation automorphism would be (14)(23), but it is already included, since it is equal to (1243)².
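Both statements in the last paragraph can be checked the same way. The sketch below (hypothetical helper names, permutations stored as 0-based tuples) generates the group from (1243) and (12345), counts its elements, and confirms that applying (1243) twice gives (14)(23).

```python
def compose(p, q):
    """Permutation 'q first, then p'; permutations are tuples on 0..n-1."""
    return tuple(p[i] for i in q)

def from_cycles(n, *cycles):
    """Build a permutation of {1..n} (stored 0-based) from disjoint 1-based cycles."""
    p = list(range(n))
    for c in cycles:
        for a, b in zip(c, c[1:] + c[:1]):
            p[a - 1] = b - 1
    return tuple(p)

def generate(gens):
    """Subgroup generated by gens (closure under composition)."""
    identity = tuple(range(len(gens[0])))
    elems, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(s, g)
            if h not in elems:
                elems.add(h)
                frontier.append(h)
    return elems

tau = from_cycles(5, (1, 2, 4, 3))           # X-shaped rotation of the complex roots
sigma = from_cycles(5, (1, 2, 3, 4, 5))      # circular rotation of all roots
conjugation = from_cycles(5, (1, 4), (2, 3)) # complex conjugation

F20 = generate([tau, sigma])
print(len(F20))                          # 20
print(compose(tau, tau) == conjugation)  # True
print(conjugation in F20)                # True
```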

References

MAS 442, Galois Theory, Algebra I. Notes by A. F. Jarvis, with some reworking by K. Mackenzie and A. Weiss.
A classical introduction to Galois Theory. Newman, Stephen C. Wiley: 2012.
http://fermatslasttheorem.blogspot.com.ar/2008/10/abels-impossibility-proof.html
http://fermatslasttheorem.blogspot.com.br/2008/08/cauchys-theorem-on-permutations-of.html
Visual group theory, YouTube playlist by Prof. Matthew Macauley
Why you can't solve quintic equations (Galois theory approach) #SoME2 (YouTube)