Exploring Finite Fields, Part 4: The Power of Forgetting
algebra
finite field
haskell
Or: how I learned to stop worrying and appreciate the Monad.
Published
February 20, 2024
Modified
August 5, 2025
The last post in this series focused on understanding some small linear groups and implementing them on the computer over both a prime field and a prime power field.
The prime power case was particularly interesting. First, we adjoined the roots of a polynomial to the base field, GF(2). Rather than the traditional means of adding new symbols like α, we used companion matrices, which behave the same arithmetically. For example, for the smallest prime power field, GF(4), we use the polynomial p(x) = x^2 + x + 1, and map its symbolic roots (α and α^2) to matrices over GF(2):
data F4 = ZeroF4 | OneF4 | AlphaF4 | Alpha2F4 deriving Eq

field4 = [ZeroF4, OneF4, AlphaF4, Alpha2F4]

instance Show F4 where
  show ZeroF4   = "0"
  show OneF4    = "1"
  show AlphaF4  = "α"
  show Alpha2F4 = "α^2"

-- Addition and multiplication over F4
instance Num F4 where
  (+) ZeroF4 x         = x
  (+) OneF4 AlphaF4    = Alpha2F4
  (+) OneF4 Alpha2F4   = AlphaF4
  (+) AlphaF4 Alpha2F4 = OneF4
  (+) x y = if x == y then ZeroF4 else y + x
  (*) ZeroF4 x          = ZeroF4
  (*) x ZeroF4          = ZeroF4
  (*) OneF4 x           = x
  (*) AlphaF4 AlphaF4   = Alpha2F4
  (*) Alpha2F4 Alpha2F4 = AlphaF4
  (*) AlphaF4 Alpha2F4  = OneF4
  (*) x y = y * x
  abs    = id
  negate = id
  signum = id
  fromInteger = (cycle field4 !!) . fromInteger

-- Companion matrix of `p`, an irreducible polynomial of degree 2 over GF(2)
cP :: (Num a, Eq a, Integral a) => Matrix a
cP = companion $ Poly [1, 1, 1]

f ZeroF4   = zero 2
f OneF4    = eye 2
f AlphaF4  = cP
f Alpha2F4 = (`mod` 2) <$> cP |*| cP

field4M = map f field4
Finally, we constructed GL(2, 4) using matrices of matrices – not block matrices! This post will focus on studying this method in slightly more detail.
Reframing the Path Until Now
In the above description, we already mentioned larger structures over GF(2), namely polynomials and matrices. Since GF(4) can itself be described with matrices over GF(2), we can generalize f to give us two more maps:
f^*, which converts matrices over GF(4) to double-layered matrices over GF(2), and
f^\bullet, which converts polynomials over GF(4) to polynomials of matrices over GF(2)
Matrix Map
We examined the former map briefly in the previous post. More explicitly, we looked at a matrix B in SL(2, 4) which had the property that it was cyclic of order five. Then, to work with it without relying on symbols, we simply applied f over the contents of the matrix.
Code
-- Starred maps are instances of fmap composed with modding out
-- by the characteristic
fStar :: (Eq a, Num a, Integral a) => Matrix F4 -> Matrix (Matrix a)
fStar = fmap (fmap (`mod` 2) . f)

mBOrig = toMatrix [[ZeroF4, AlphaF4], [Alpha2F4, Alpha2F4]]
mBStar = fStar mBOrig

markdown $ "$$\\begin{gather*}" ++ concat
  [ -- First row, type of fStar
    "f^* : \\mathbb{F}_4 {}^{2 \\times 2}"
      ++ "\\longrightarrow"
      ++ "(\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}"
      ++ "\\\\[10pt]"
  , -- Second row, B
    "B = " ++ texifyMatrix' show mBOrig ++ "\\\\"
  , -- Third row, B*
    "B^* = f^*(B) = "
      ++ texifyMatrix' (\x -> "f(" ++ show x ++ ")") mBOrig
      ++ " = " ++ texifyMatrix' (texifyMatrix' show) mBStar
  ] ++ "\\end{gather*}$$"
We can do this because a matrix contains values in the domain of f, thus uniquely determining a way to change the internal structure (what Haskell calls a functor). Furthermore, due to the properties of f, it and f* commute with the determinant, as shown by the following diagram:
It should be noted that the determinant strips off the outer matrix. We could also consider the map det* , where we apply the determinant to the internal matrices (in Haskell terms, fmap determinant). This map isn’t as nice though, since:
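To make the commuting square concrete, here is a minimal sketch that uses plain lists of lists as a stand-in for the library's Matrix type. The values fZ, fO, fA, and fA2 are the images of 0, 1, α, and α^2 under f, written out under the companion-matrix representation above; the names are invented here for illustration.

```haskell
import Data.List (transpose)

type M = [[Int]]  -- a matrix over GF(2) as a list of rows

madd, mmul :: M -> M -> M
madd = zipWith (zipWith (\x y -> (x + y) `mod` 2))
mmul a b = [[sum (zipWith (*) r c) `mod` 2 | c <- transpose b] | r <- a]

-- Images of 0, 1, α, α^2 under f (cP and cP² from the companion matrix of p)
fZ, fO, fA, fA2 :: M
fZ  = [[0, 0], [0, 0]]
fO  = [[1, 0], [0, 1]]
fA  = [[0, 1], [1, 1]]
fA2 = [[1, 1], [1, 0]]

-- B* = [[f 0, f α], [f α^2, f α^2]]; its determinant over the block ring is
-- ad - bc, and minus is plus in characteristic 2
detBStar :: M
detBStar = madd (mmul fZ fA2) (mmul fA fA2)
```

Since det B = 1 in GF(4), the square commutes for B exactly when detBStar equals f(1), the 2×2 identity.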
Code
markdown $ "$$\\begin{align*}" ++ concat
  [ -- First row, det* of B
    "\\det {}^*(B^*) &= "
      ++ texifyMatrix' (("\\det" ++) . texifyMatrix' show) mBStar
      ++ " = " ++ texifyMatrix ((`mod` 2) . determinant <$> mBStar)
      ++ "\\\\ \\\\"
  , -- Second row, determinant of B*
    -- Note how the commutation between `determinant` and <$> fails
    "&\\neq" ++ texifyMatrix ((`mod` 2) <$> determinant mBStar)
      ++ " = " ++ "\\det(B^*)"
  , ""
  ] ++ "\\end{align*}$$"
Much like how we can change the internal structure of matrices, we can do the same for polynomials. For the purposes of demonstration, we’ll work with b = \lambda^2 + \alpha^2 \lambda + 1, the characteristic polynomial of B, since it has coefficients in the domain of f. We define the extended map f^\bullet as:
Code
-- Bulleted maps are also just instances of fmap, like the starred maps
fBullet :: (Eq a, Num a, Integral a) => Polynomial F4 -> Polynomial (Matrix a)
fBullet = fmap (fmap (`mod` 2) . f)
Since we’re looking at the characteristic polynomial of B, we might as well also look at the characteristic polynomial of B*, its image under f^*. We already looked at the determinant of this matrix, which is the constant term of the characteristic polynomial (up to sign). Therefore, it’s probably not surprising that f^\bullet and the characteristic polynomial commute in a similar fashion to the determinant.
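As a sanity check on this claim, here is a small sketch, again over toy list-of-lists matrices with the images of 0, 1, α, α^2 under f hard-coded. For a 2×2 matrix over a commutative ring, the characteristic polynomial is λ² − tr·λ + det, and the blocks of B* do commute since each is a power of cP. The coefficients computed below should match f^•(b) = Λ² + f(α^2)Λ + f(1).

```haskell
import Data.List (transpose)

type M = [[Int]]

madd, mmul :: M -> M -> M
madd = zipWith (zipWith (\x y -> (x + y) `mod` 2))
mmul a b = [[sum (zipWith (*) r c) `mod` 2 | c <- transpose b] | r <- a]

-- Images of 0, 1, α, α^2 under f
fZ, fO, fA, fA2 :: M
fZ  = [[0, 0], [0, 0]]
fO  = [[1, 0], [0, 1]]
fA  = [[0, 1], [1, 1]]
fA2 = [[1, 1], [1, 0]]

-- charpoly of B* over the (commuting) block ring: λ² - tr(B*)λ + det(B*).
-- Signs vanish in characteristic 2. Coefficients listed constant-term first.
charpolyBStar :: [M]
charpolyBStar = [detB, trB, fO]
  where
    trB  = madd fZ fA2                       -- sum of the diagonal blocks
    detB = madd (mmul fZ fA2) (mmul fA fA2)  -- ad + bc
```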
It should also be mentioned that charpoly*, taking the characteristic polynomials of the internal matrices, does not obey the same relationship. For one, the type is wrong: the codomain is a matrix containing polynomials, rather than a polynomial over matrices.
There does happen to be an isomorphism between the two structures (a direction of which we’ll discuss momentarily). But even by converting to the proper type, we already have a counterexample in the constant term from taking det* earlier.
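The direction of the isomorphism we need here goes from a matrix of polynomials to a polynomial of matrices: collect the degree-k coefficient of every entry into a single coefficient matrix. A sketch, with polynomials represented as coefficient lists (constant term first); the helper names are invented for illustration:

```haskell
type P = [Int]  -- a polynomial over GF(2), constant term first

-- Degree-k coefficient, treating missing entries as zero
coeff :: Int -> P -> Int
coeff k p = if k < length p then p !! k else 0

-- Matrix of polynomials -> list of coefficient matrices (constant term first)
toPolyOfMats :: [[P]] -> [[[Int]]]
toPolyOfMats m = [[[coeff k e | e <- row] | row <- m] | k <- [0 .. d]]
  where d = maximum [length e - 1 | row <- m, e <- row]
```

For example, toPolyOfMats [[[1,1],[0,1]],[[1],[0,1]]] gathers the constant terms into one matrix and the λ-coefficients into another.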
Clearly, layering matrices has several advantages over how we usually interpret block matrices. But what happens if we do “forget” about the internal structure?
Haskell implementation of forget
import Data.List (transpose)

-- Massively complicated point-free way to forget double matrices:
-- 1. Convert internal matrices to lists of lists
-- 2. Convert the external matrix to a list of lists
-- 3. There are now four layers of lists. Transpose the second and third.
-- 4. Concat the new third and fourth layers together
-- 5. Concat the first and second layers together
-- 6. Convert the list of lists back to a matrix
forget :: Matrix (Matrix a) -> Matrix a
forget = toMatrix . concatMap (fmap concat . transpose) . fromMatrix . fmap fromMatrix

-- To see why this is the structure, remember that we need to work with rows
-- of the external matrix at the same time.
-- We'd like to read across the whole row, but this involves descending into two matrices.
-- The `fmap transpose` allows us to collect rows in the way we expect.
-- For example, for the above matrix, we get `[[[0,0],[0,1]], [[0,0],[1,1]]]` after the transposition,
-- which are the first two rows, grouped by the matrix they belonged to.
-- Then, we can finally get the desired row by `fmap (fmap concat)`ing the rows together.
-- Finally, we `concat` once more to undo the column grouping.

mBHat = forget mBStar

markdown $ "$$\\begin{gather*}" ++ concat
  [ "\\text{forget} : (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}"
      ++ "\\longrightarrow \\mathbb{F}_2 {}^{4 \\times 4}"
      ++ "\\\\[10pt]"
  , "\\hat B = \\text{forget}(B^*) = \\text{forget}"
      ++ texifyMatrix' (texifyMatrix' show) mBStar
      ++ " = " ++ texifyMatrix mBHat
  , ""
  ] ++ "\\end{gather*}$$"
Like f, forget preserves addition and multiplication, a fact already familiar from block matrices. Further, by the properties of f, the internal matrices multiply just as elements of GF(4) do. Hence, this shows us directly that GL(2, 4) is a subgroup of GL(4, 2).
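We can spot-check this preservation with toy list-based matrices. This is only a sketch: bmul multiplies a matrix of blocks by the usual block formula, and forget mirrors the implementation above; bStar is B* with f(0), f(α), f(α^2) written out over GF(2).

```haskell
import Data.List (transpose)

type M = [[Int]]

madd, mmul :: M -> M -> M
madd = zipWith (zipWith (\x y -> (x + y) `mod` 2))
mmul a b = [[sum (zipWith (*) r c) `mod` 2 | c <- transpose b] | r <- a]

-- Add and multiply matrices whose entries are themselves matrices
badd, bmul :: [[M]] -> [[M]] -> [[M]]
badd = zipWith (zipWith madd)
bmul a b = [[foldr1 madd (zipWith mmul r c) | c <- transpose b] | r <- a]

-- Flatten a matrix of matrices into one matrix, as in `forget` above
forget :: [[M]] -> M
forget = concatMap (map concat . transpose)

-- B* from earlier, blocks written out over GF(2)
bStar :: [[M]]
bStar = [ [ [[0,0],[0,0]], [[0,1],[1,1]] ]
        , [ [[1,1],[1,0]], [[1,1],[1,0]] ] ]
```

Here forget (bmul bStar bStar) should coincide with mmul (forget bStar) (forget bStar), and likewise for addition.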
However, an obvious difference between layered and “forgotten” matrices is the determinant and characteristic polynomial:
It’s a relatively simple matter to move between determinants, since it’s straightforward to identify 1 and the identity matrix. However, a natural question to ask is whether there’s a way to reconcile or coerce the matrix polynomial into the “forgotten” one.
First, let’s formally establish a path from matrix polynomials to a matrix of polynomials. We need only use our friend from the second post – polynomial evaluation. Simply evaluating a matrix polynomial r at λI converts our matrix indeterminate (Λ) into a scalar one (λ).
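A sketch of this evaluation over toy types: a matrix polynomial is a list of coefficient matrices (constant term first), and evaluating at λI just regroups the coefficients entry-wise. Taking the determinant of the result, with polynomial arithmetic over GF(2), then yields a single scalar polynomial. The coefficients of charpoly(B*) below are the ones from the discussion above; all names are invented for illustration.

```haskell
type P = [Int]  -- polynomial over GF(2), constant term first

padd :: P -> P -> P
padd p q = map (`mod` 2) (zipWith (+) (pad p) (pad q))
  where n = max (length p) (length q)
        pad r = r ++ replicate (n - length r) 0

pmul :: P -> P -> P
pmul p q = [sum [coeff i p * coeff (k - i) q | i <- [0 .. k]] `mod` 2
           | k <- [0 .. length p + length q - 2]]
  where coeff i r = if i < length r then r !! i else 0

-- Evaluate a matrix polynomial at λI: entry (i, j) of the result collects
-- the (i, j) entries of the coefficient matrices
evalAtLambdaI :: [[[Int]]] -> [[P]]
evalAtLambdaI cs = [[[c !! i !! j | c <- cs] | j <- [0 .. n - 1]] | i <- [0 .. n - 1]]
  where n = length (head cs)

-- charpoly(B*) = Λ² + f(α^2)Λ + I, as coefficient matrices (constant first)
charpolyBStar :: [[[Int]]]
charpolyBStar = [[[1,0],[0,1]], [[1,1],[1,0]], [[1,0],[0,1]]]

-- Determinant of a 2×2 matrix of polynomials (minus is plus in char 2)
detPoly :: [[P]] -> P
detPoly [[a, b], [c, d]] = padd (pmul a d) (pmul b c)
detPoly _ = error "expected a 2x2 matrix"
```

Indeed, detPoly (evalAtLambdaI charpolyBStar) comes out to [1,1,1,1,1], i.e. λ^4 + λ^3 + λ^2 + λ + 1.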
It should be noted that we do not get the same results by taking the determinant after applying charpoly*, indicating that the above method is “correct”.
Since we can get \lambda^4 + \lambda^3 + \lambda^2 + \lambda + 1 in two ways, it's natural to assume this polynomial is significant in some way. In the language of the second post, the polynomial can also be written as 2_{31}, whose root we determined was cyclic of order 5. This happens to match the order of B in GL(2, 4).
Perhaps this is unsurprising, since there are only so many polynomials of degree 4 over GF(2). However, the reason we see it is more obvious if we look at the powers of scalar multiples of B. First, recall that f* takes us from a matrix over GF(4) to a matrix of matrices of GF(2). Then define a map g that gives us degree 4 polynomials:
\begin{gather*}
g : \mathbb{F}_4^{2 \times 2} \rightarrow \mathbb{F}_2[\lambda]
\\
g = \text{charpoly} \circ \text{forget} \circ f^*
\end{gather*}
The matrices in the middle and rightmost columns both have order 15 inside GL(2, 4). Correspondingly, both 10011_\lambda = 2_{19} and 11001_\lambda = 2_{25} are primitive, and so have roots of order 15 over GF(2).
A Field?
Since we have 15 matrices generated by the powers of one, you might wonder whether or not they can correspond to the nonzero elements of GF(16). And they can! In a sense, we've "borrowed" the order-15 elements from this "field" within GL(4, 2). However, none of the powers of this matrix are the companion matrix of either 2_{19} or 2_{25}.
Haskell demonstration of the field-like-ness of these matrices
All we really need to do is test additive closure, since the powers trivially commute and include the identity matrix.
-- Check whether n x n matrices (mod p) have additive closure
-- Supplement the identity, even if it is not already present
hasAdditiveClosure :: Integral a => Int -> a -> [Matrix a] -> Bool
hasAdditiveClosure n p xs = all (`elem` xs') sums
  where
    -- Add in the zero matrix
    xs' = zero n : xs
    -- Calculate all possible sums of pairs (mod p)
    sums = map (fmap (`mod` p)) $ (+) <$> xs' <*> xs'

-- Generate the powers of x, then test if they form a field (mod p)
generatesField :: Integral a => Int -> a -> Matrix a -> Bool
generatesField n p x = hasAdditiveClosure n p xs
  where xs = map (fmap (`mod` p) . (x ^)) [1 .. p ^ n - 1]

print $ generatesField 4 2 $ forget $ fStar $ fmap (AlphaF4 *) mBOrig
True
More directly, we might also observe that α^2 B is the companion matrix of an irreducible polynomial over GF(4), namely q(x) = x^2 - \alpha x - \alpha.
Both the “forgotten” matrices and the aforementioned companion matrices lie within GL(4, 2). A natural question to ask is whether we can make fields by the following process:
Filter out all order-15 elements of GL(4, 2)
Partition the elements and their powers into their respective order-15 subgroups
Add the zero matrix into each class
Check whether all classes are additively closed (and are therefore fields)
In this case, it happens to be true, but proving this in general is difficult, and I haven’t done so.
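The four steps above can be sketched directly, if by brute force, over toy list-of-lists matrices. Enumerating all 2^16 candidate 4×4 matrices is slow but feasible; allFields evaluates the claim lazily, and mAB is the forgotten image of αB from before (all names invented for illustration).

```haskell
import Control.Monad (replicateM)
import Data.List (transpose, nub, sort)

type M = [[Int]]

madd, mmul :: M -> M -> M
madd = zipWith (zipWith (\x y -> (x + y) `mod` 2))
mmul a b = [[sum (zipWith (*) r c) `mod` 2 | c <- transpose b] | r <- a]

i4, z4 :: M
i4 = [[fromEnum (r == c) | c <- [0 .. 3]] | r <- [0 .. 3 :: Int]]
z4 = replicate 4 (replicate 4 0)

-- Smallest k in [1..15] with m^k = I, if any
order :: M -> Maybe Int
order m = lookup i4 (zip (take 15 (tail (iterate (mmul m) i4))) [1 ..])

-- Step 1: all order-15 elements of GL(4, 2)
order15 :: [M]
order15 = [m | m <- map (chunk 4) (replicateM 16 [0, 1]), order m == Just 15]
  where chunk _ [] = []
        chunk n xs = take n xs : chunk n (drop n xs)

-- Step 2: partition them into their cyclic subgroups
-- (the sorted list of powers is a canonical name for the subgroup)
classes :: [[M]]
classes = nub [sort (take 15 (iterate (mmul m) m)) | m <- order15]

-- Steps 3 and 4: add the zero matrix and check additive closure
isField :: [M] -> Bool
isField xs = and [madd x y `elem` xs' | x <- xs', y <- xs']
  where xs' = z4 : xs

allFields :: Bool
allFields = all isField classes

-- The forgotten image of αB, an order-15 element from earlier
mAB :: M
mAB = [[0,0,1,1],[0,0,1,0],[1,0,1,0],[0,1,0,1]]
```

Checking a single known class, such as the powers of mAB, is cheap; allFields itself takes a while to crunch through the full enumeration.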
Expanding Dimensions
Of course, we need not focus only on GF(4) – we can just as easily work over GL(2, 2^r) for r other than 2. In this case, the internal matrices will be r×r while the external one remains 2×2. Nor do we have to work exclusively with 2×2 matrices – we can work over GL(n, 2^r). In either circumstance, the "borrowing" of elements of larger order still occurs. This is summarized by the following diagram:
Here, f_r is our map from GF(2^r) to r×r matrices, and f_{nr} is a similar map. r must be greater than 1 for us to properly make use of matrix arithmetic. Similarly, n must be greater than 1 for the leftmost GL. Thus, nr is a composite number. Here, k is a proper factor of 2^{nr} - 1; in the prior discussion, k was 5 and 2^{nr} - 1 was 15.
Recall that primitive polynomials over GF(2^{nr}) have roots of order 2^{nr} - 1. This number can never be prime: the only primes of the form 2^m - 1 are the Mersenne primes, for which m itself must be prime, while nr is composite. Thus, a GL of prime dimension can never lend to a GL over a field of larger order with the same characteristic. Conversely, GL(nr + 1, 2) trivially contains GL(nr, 2) by fixing a subspace. So we do eventually see elements of order 2^m - 1 for both prime and composite m.
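The number-theoretic fact being used is the standard one: if 2^m - 1 is prime, then m must be prime (the converse fails, e.g. 2^{11} - 1 = 2047 = 23 · 89). A quick empirical check over small m, as a sketch:

```haskell
-- Trial-division primality test, fine for small inputs
isPrime :: Integer -> Bool
isPrime n = n > 1 && null [d | d <- takeWhile (\d -> d * d <= n) [2 ..], n `mod` d == 0]

-- Whenever 2^m - 1 is prime, m is prime; contrapositively, for composite nr,
-- 2^(nr) - 1 is always composite
mersenneExponentsArePrime :: Bool
mersenneExponentsArePrime = and [isPrime m | m <- [2 .. 30], isPrime (2 ^ m - 1)]
```

In particular, 2^4 - 1 = 15 is composite, which is exactly what let GL(4, 2) lend its order-15 elements.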
Other Primes
This concern about prime dimensions is unique to characteristic 2. For any other prime p, p^m - 1 is composite, since it is at the very least even. All other remarks about the above diagram still hold for any other prime p.
In addition, the diagram where we found a correspondence between the orders of elements in GL(2, 2^2) and GF(2^{2 \times 2}) via the characteristic polynomial also generalizes. Though I have not proven it, I strongly suspect the following diagram commutes, at least when K is a finite field:
Over larger primes, the gap between GL and SL may grow ever larger, but SL over a prime power field seems to inject into SL over a prime field. If the above diagram commutes, then the prior statement follows.
Monadicity and Injections
The action of forgetting the internal structure may sound somewhat familiar if you know your Haskell. Remember that for lists, we can do something similar – converting [[1,2,3],[4,5,6]] to [1,2,3,4,5,6] is just a matter of applying concat. This is an instance in which we know lists to behave like a monad. Despite being an indecipherable bit of jargon to newcomers, it just means we:
can apply functions inside the structure (for example, to the elements of a list),
have a sensible injection into the structure (creating singleton lists, called return), and
can reduce two layers to one (concat, or join for monads in general).
Monads are traditionally defined using the operator >>=, but join = (>>= id)
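In Haskell terms, a quick illustration of the list case:

```haskell
import Control.Monad (join)

nested :: [[Int]]
nested = [[1, 2, 3], [4, 5, 6]]

-- For lists, join is concat, and both agree with the (>>= id) definition
flat, flat' :: [Int]
flat  = join nested      -- [1,2,3,4,5,6]
flat' = nested >>= id
```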
Just comparing the types of join :: Monad m => m (m a) -> m a and forget :: Matrix (Matrix a) -> Matrix a suggests that Matrix (meaning square matrices) could be a monad, and further, one which respects addition and multiplication. Of course, this is only true when our internal matrices are all the same size. In the above diagrams, this restriction has held implicitly, but it should be stated explicitly, since Matrix a specifies no dimension.
Condition 2 gives us some trouble, though. For one, only "numbers" (elements of a ring) can go inside matrices, which restricts where monadicity can hold. More importantly, we have a lot of freedom in what dimension we choose to inject into. For example, we might pick a return that uses 1×1 matrices (which add no additional structure). We might also pick return_2, which scalar-multiplies its argument by a 2×2 identity matrix instead.
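For illustration, the two candidate injections over list-of-lists matrices (return1 and return2 are names invented here, not part of any library):

```haskell
-- Inject a scalar as a 1×1 matrix
return1 :: a -> [[a]]
return1 x = [[x]]

-- Inject a scalar as a scalar multiple of the 2×2 identity
return2 :: Num a => a -> [[a]]
return2 x = [[x, 0], [0, x]]
```

Both are plausible units for a would-be Matrix monad, which is exactly the ambiguity described above.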
Unfortunately, there’s no good answer. At the very least, we can close our eyes and pretend that we have a nice diagram:
As one last note on the monadicity of matrices, I have played around with an alternative Matrix type that includes scalars alongside proper matrices, which would allow for a simple canonical injection. Unfortunately, it complicates join: since internal scalars must be matched up with identity matrices of the right size, the responsibility of sizing the internal matrices is simply placed front and center.
Closing
At this point, I've gone on far too long about algebra. One nagging curiosity makes me wonder whether there are any diagrams like the following:
Or in English, whether “rebracketing” certain nr × nr matrices can be traced back to not only a degree r field extension, but also one of degree n.
The mathematician in me tells me to believe in well-defined structures. Matrices are one such structure, with myriad applications. However, the computer scientist in me laments that the use of these structures is buried in symbols, and that layering them is at best glossed over. There is clear utility and interest in doing so; otherwise, the diagrams shown above would not exist.
Of course, there's plenty of reason not to go down this route. For one, it's plainly inefficient – GPUs are built on matrix operations being as efficient as possible, i.e., without the layering. It's also a confusing entry point for people just learning matrices. I'd still argue, though, that the method is useful for learning about more complex topics, like field extensions.