Exploring Finite Fields, Part 4: The Power of Forgetting
algebra
finite field
haskell
Or: how I learned to stop worrying and appreciate the Monad.
Published
February 20, 2024
Modified
August 5, 2025
The last post in this series focused on understanding some small linear groups and implementing them on the computer over both a prime field and a prime power field.
The prime power case was particularly interesting. First, we adjoined the roots of a polynomial to the base field, GF(2). Rather than the traditional means of adding new symbols like α, we used companion matrices, which behave the same arithmetically. For example, for the smallest prime power field, GF(4), we use the polynomial p(x) = x^2 + x + 1, and map its symbolic roots (α and α^2) to matrices over GF(2):
data F4 = ZeroF4 | OneF4 | AlphaF4 | Alpha2F4 deriving Eq

field4 = [ZeroF4, OneF4, AlphaF4, Alpha2F4]

instance Show F4 where
  show ZeroF4   = "0"
  show OneF4    = "1"
  show AlphaF4  = "α"
  show Alpha2F4 = "α^2"

-- Addition and multiplication over F4
instance Num F4 where
  (+) ZeroF4 x         = x
  (+) OneF4 AlphaF4    = Alpha2F4
  (+) OneF4 Alpha2F4   = AlphaF4
  (+) AlphaF4 Alpha2F4 = OneF4
  (+) x y = if x == y then ZeroF4 else y + x
  (*) ZeroF4 x          = ZeroF4
  (*) x ZeroF4          = ZeroF4
  (*) OneF4 x           = x
  (*) AlphaF4 AlphaF4   = Alpha2F4
  (*) Alpha2F4 Alpha2F4 = AlphaF4
  (*) AlphaF4 Alpha2F4  = OneF4
  (*) x y = y * x
  abs    = id
  negate = id
  signum = id
  fromInteger = (cycle field4 !!) . fromInteger

-- Companion matrix of `p`, an irreducible polynomial of degree 2 over GF(2)
cP :: (Num a, Eq a, Integral a) => Matrix a
cP = companion $ Poly [1, 1, 1]

f ZeroF4   = zero 2
f OneF4    = eye 2
f AlphaF4  = cP
f Alpha2F4 = (`mod` 2) <$> cP |*| cP

field4M = map f field4
Finally, we constructed GL(2, 4) using matrices of matrices – not block matrices! This post will focus on studying this method in slightly more detail.
Reframing the Path Until Now
In the above description, we already mentioned larger structures over GF(2), namely polynomials and matrices. Since GF(4) can itself be described with matrices over GF(2), we can generalize f to give us two more maps:
f^*, which converts matrices over GF(4) to double-layered matrices over GF(2), and
f^\bullet, which converts polynomials over GF(4) to polynomials of matrices over GF(2)
Matrix Map
We examined the former map briefly in the previous post. More explicitly, we looked at a matrix B in SL(2, 4) which had the property that it was cyclic of order five. Then, to work with it without relying on symbols, we simply applied f over the contents of the matrix.
Code
-- Starred maps are instances of fmap composed with modding out
-- by the characteristic
fStar :: (Eq a, Num a, Integral a) => Matrix F4 -> Matrix (Matrix a)
fStar = fmap (fmap (`mod` 2) . f)

mBOrig = toMatrix [[ZeroF4, AlphaF4], [Alpha2F4, Alpha2F4]]
mBStar = fStar mBOrig

markdown $ "$$\\begin{gather*}" ++ concat
  [ -- First row, type of fStar
    "f^* : \\mathbb{F}_4 {}^{2 \\times 2}"
      ++ "\\longrightarrow"
      ++ "(\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}"
      ++ "\\\\[10pt]"
  , -- Second row, B
    "B = " ++ texifyMatrix' show mBOrig ++ "\\\\"
  , -- Third row, B*
    "B^* = f^*(B) = "
      ++ texifyMatrix' (\x -> "f(" ++ show x ++ ")") mBOrig
      ++ " = " ++ texifyMatrix' (texifyMatrix' show) mBStar
  ] ++ "\\end{gather*}$$"
We can do this because a matrix contains values in the domain of f, thus uniquely determining a way to change the internal structure (what Haskell calls a functor). Furthermore, due to the properties of f, it and f* commute with the determinant, as shown by the following diagram:
It should be noted that the determinant strips off the outer matrix. We could also consider the map det* , where we apply the determinant to the internal matrices (in Haskell terms, fmap determinant). This map isn’t as nice though, since:
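To make the commuting square concrete, here is a minimal sketch that uses plain lists of lists as a stand-in for the library's Matrix type. The values fZ, fO, fA, and fA2 are the images of 0, 1, α, and α^2 under f, written out under the companion-matrix representation above; the names are invented here for illustration.

```haskell
import Data.List (transpose)

type M = [[Int]]  -- a matrix over GF(2) as a list of rows

madd, mmul :: M -> M -> M
madd = zipWith (zipWith (\x y -> (x + y) `mod` 2))
mmul a b = [[sum (zipWith (*) r c) `mod` 2 | c <- transpose b] | r <- a]

-- Images of 0, 1, α, α^2 under f (cP and cP² from the companion matrix of p)
fZ, fO, fA, fA2 :: M
fZ  = [[0, 0], [0, 0]]
fO  = [[1, 0], [0, 1]]
fA  = [[0, 1], [1, 1]]
fA2 = [[1, 1], [1, 0]]

-- B* = [[f 0, f α], [f α^2, f α^2]]; its determinant over the block ring is
-- ad - bc, and minus is plus in characteristic 2
detBStar :: M
detBStar = madd (mmul fZ fA2) (mmul fA fA2)
```

Since det B = 1 in GF(4), the square commutes for B exactly when detBStar equals f(1), the 2×2 identity.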
Code
markdown $ "$$\\begin{align*}" ++ concat
  [ -- First row, det* of B
    "\\det {}^*(B^*) &= "
      ++ texifyMatrix' (("\\det" ++) . texifyMatrix' show) mBStar
      ++ " = " ++ texifyMatrix ((`mod` 2) . determinant <$> mBStar)
      ++ "\\\\ \\\\"
  , -- Second row, determinant of B*
    -- Note how the commutation between `determinant` and <$> fails
    "&\\neq" ++ texifyMatrix ((`mod` 2) <$> determinant mBStar)
      ++ " = " ++ "\\det(B^*)"
  , ""
  ] ++ "\\end{align*}$$"
Much like how we can change the internal structure of matrices, we can do the same for polynomials. For the purposes of demonstration, we’ll work with b = \lambda^2 + \alpha^2 \lambda + 1, the characteristic polynomial of B, since it has coefficients in the domain of f. We define the extended map f^\bullet as:
Code
-- Bulleted maps are also just instances of fmap, like the starred maps
fBullet :: (Eq a, Num a, Integral a) => Polynomial F4 -> Polynomial (Matrix a)
fBullet = fmap (fmap (`mod` 2) . f)
Since we’re looking at the characteristic polynomial of B, we might as well also look at the characteristic polynomial of B*, its image under f^*. We already looked at the determinant of this matrix, which is the constant term of the characteristic polynomial (up to sign). Therefore, it’s probably not surprising that f^\bullet and the characteristic polynomial commute in a similar fashion to the determinant.
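As a sanity check on this claim, here is a small sketch, again over toy list-of-lists matrices with the images of 0, 1, α, α^2 under f hard-coded. For a 2×2 matrix over a commutative ring, the characteristic polynomial is λ² − tr·λ + det, and the blocks of B* do commute since each is a power of cP. The coefficients computed below should match f^•(b) = Λ² + f(α^2)Λ + f(1).

```haskell
import Data.List (transpose)

type M = [[Int]]

madd, mmul :: M -> M -> M
madd = zipWith (zipWith (\x y -> (x + y) `mod` 2))
mmul a b = [[sum (zipWith (*) r c) `mod` 2 | c <- transpose b] | r <- a]

-- Images of 0, 1, α, α^2 under f
fZ, fO, fA, fA2 :: M
fZ  = [[0, 0], [0, 0]]
fO  = [[1, 0], [0, 1]]
fA  = [[0, 1], [1, 1]]
fA2 = [[1, 1], [1, 0]]

-- charpoly of B* over the (commuting) block ring: λ² - tr(B*)λ + det(B*).
-- Signs vanish in characteristic 2. Coefficients listed constant-term first.
charpolyBStar :: [M]
charpolyBStar = [detB, trB, fO]
  where
    trB  = madd fZ fA2                       -- sum of the diagonal blocks
    detB = madd (mmul fZ fA2) (mmul fA fA2)  -- ad + bc
```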
It should also be mentioned that charpoly*, taking the characteristic polynomials of the internal matrices, does not obey the same relationship. For one, the type is wrong: the codomain is a matrix containing polynomials, rather than a polynomial over matrices.
There does happen to be an isomorphism between the two structures (a direction of which we’ll discuss momentarily). But even by converting to the proper type, we already have a counterexample in the constant term from taking det* earlier.
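The direction of the isomorphism we need here goes from a matrix of polynomials to a polynomial of matrices: collect the degree-k coefficient of every entry into a single coefficient matrix. A sketch, with polynomials represented as coefficient lists (constant term first); the helper names are invented for illustration:

```haskell
type P = [Int]  -- a polynomial over GF(2), constant term first

-- Degree-k coefficient, treating missing entries as zero
coeff :: Int -> P -> Int
coeff k p = if k < length p then p !! k else 0

-- Matrix of polynomials -> list of coefficient matrices (constant term first)
toPolyOfMats :: [[P]] -> [[[Int]]]
toPolyOfMats m = [[[coeff k e | e <- row] | row <- m] | k <- [0 .. d]]
  where d = maximum [length e - 1 | row <- m, e <- row]
```

For example, toPolyOfMats [[[1,1],[0,1]],[[1],[0,1]]] gathers the constant terms into one matrix and the λ-coefficients into another.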
Clearly, layering matrices has several advantages over how we usually interpret block matrices. But what happens if we do “forget” about the internal structure?
Haskell implementation of forget
import Data.List (transpose)

-- Massively complicated point-free way to forget double matrices:
-- 1. Convert internal matrices to lists of lists
-- 2. Convert the external matrix to a list of lists
-- 3. There are now four layers of lists. Transpose the second and third.
-- 4. Concat the new third and fourth layers together
-- 5. Concat the first and second layers together
-- 6. Convert the list of lists back to a matrix
forget :: Matrix (Matrix a) -> Matrix a
forget = toMatrix . concatMap (fmap concat . transpose) . fromMatrix . fmap fromMatrix

-- To see why this is the structure, remember that we need to work with rows
-- of the external matrix at the same time.
-- We'd like to read across the whole row, but this involves descending into two matrices.
-- The `fmap transpose` allows us to collect rows in the way we expect.
-- For example, for the above matrix, we get `[[[0,0],[0,1]], [[0,0],[1,1]]]` after the transposition,
-- which are the first two rows, grouped by the matrix they belonged to.
-- Then, we can finally get the desired row by `fmap (fmap concat)`ing the rows together.
-- Finally, we `concat` once more to undo the column grouping.

mBHat = forget mBStar

markdown $ "$$\\begin{gather*}" ++ concat
  [ "\\text{forget} : (\\mathbb{F}_2 {}^{2 \\times 2})^{2 \\times 2}"
      ++ "\\longrightarrow \\mathbb{F}_2 {}^{4 \\times 4}"
      ++ "\\\\[10pt]"
  , "\\hat B = \\text{forget}(B^*) = \\text{forget}"
      ++ texifyMatrix' (texifyMatrix' show) mBStar
      ++ " = " ++ texifyMatrix mBHat
  , ""
  ] ++ "\\end{gather*}$$"
Like f, forget preserves addition and multiplication, a fact already familiar from block matrices. Further, by the properties of f, the internal matrices multiply just as elements of GF(4) do. Hence, this shows us directly that GL(2, 4) is a subgroup of GL(4, 2).
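We can spot-check this preservation with toy list-based matrices. This is only a sketch: bmul multiplies a matrix of blocks by the usual block formula, and forget mirrors the implementation above; bStar is B* with f(0), f(α), f(α^2) written out over GF(2).

```haskell
import Data.List (transpose)

type M = [[Int]]

madd, mmul :: M -> M -> M
madd = zipWith (zipWith (\x y -> (x + y) `mod` 2))
mmul a b = [[sum (zipWith (*) r c) `mod` 2 | c <- transpose b] | r <- a]

-- Add and multiply matrices whose entries are themselves matrices
badd, bmul :: [[M]] -> [[M]] -> [[M]]
badd = zipWith (zipWith madd)
bmul a b = [[foldr1 madd (zipWith mmul r c) | c <- transpose b] | r <- a]

-- Flatten a matrix of matrices into one matrix, as in `forget` above
forget :: [[M]] -> M
forget = concatMap (map concat . transpose)

-- B* from earlier, blocks written out over GF(2)
bStar :: [[M]]
bStar = [ [ [[0,0],[0,0]], [[0,1],[1,1]] ]
        , [ [[1,1],[1,0]], [[1,1],[1,0]] ] ]
```

Here forget (bmul bStar bStar) should coincide with mmul (forget bStar) (forget bStar), and likewise for addition.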
However, an obvious difference between layered and “forgotten” matrices is the determinant and characteristic polynomial:
It’s a relatively simple matter to move between determinants, since it’s straightforward to identify 1 and the identity matrix. However, a natural question to ask is whether there’s a way to reconcile or coerce the matrix polynomial into the “forgotten” one.
First, let’s formally establish a path from matrix polynomials to a matrix of polynomials. We need only use our friend from the second post – polynomial evaluation. Simply evaluating a matrix polynomial r at λI converts our matrix indeterminate (Λ) into a scalar one (λ).
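A sketch of this evaluation over toy types: a matrix polynomial is a list of coefficient matrices (constant term first), and evaluating at λI just regroups the coefficients entry-wise. Taking the determinant of the result, with polynomial arithmetic over GF(2), then yields a single scalar polynomial. The coefficients of charpoly(B*) below are the ones from the discussion above; all names are invented for illustration.

```haskell
type P = [Int]  -- polynomial over GF(2), constant term first

padd :: P -> P -> P
padd p q = map (`mod` 2) (zipWith (+) (pad p) (pad q))
  where n = max (length p) (length q)
        pad r = r ++ replicate (n - length r) 0

pmul :: P -> P -> P
pmul p q = [sum [coeff i p * coeff (k - i) q | i <- [0 .. k]] `mod` 2
           | k <- [0 .. length p + length q - 2]]
  where coeff i r = if i < length r then r !! i else 0

-- Evaluate a matrix polynomial at λI: entry (i, j) of the result collects
-- the (i, j) entries of the coefficient matrices
evalAtLambdaI :: [[[Int]]] -> [[P]]
evalAtLambdaI cs = [[[c !! i !! j | c <- cs] | j <- [0 .. n - 1]] | i <- [0 .. n - 1]]
  where n = length (head cs)

-- charpoly(B*) = Λ² + f(α^2)Λ + I, as coefficient matrices (constant first)
charpolyBStar :: [[[Int]]]
charpolyBStar = [[[1,0],[0,1]], [[1,1],[1,0]], [[1,0],[0,1]]]

-- Determinant of a 2×2 matrix of polynomials (minus is plus in char 2)
detPoly :: [[P]] -> P
detPoly [[a, b], [c, d]] = padd (pmul a d) (pmul b c)
detPoly _ = error "expected a 2x2 matrix"
```

Indeed, detPoly (evalAtLambdaI charpolyBStar) comes out to [1,1,1,1,1], i.e. λ^4 + λ^3 + λ^2 + λ + 1.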
It should be noted that we do not get the same results by taking the determinant after applying charpoly*, indicating that the above method is “correct”.
Since we can get \lambda^4 + \lambda^3 + \lambda^2 + \lambda + 1 in two ways, it's natural to assume this polynomial is significant in some way. In the language of the second post, the polynomial can also be written as 2_{31}, whose root we determined was cyclic of order 5. This happens to match the order of B in GL(2, 4).
Perhaps this is unsurprising, since there are only so many polynomials of degree 4 over GF(2). However, the reason we see it is more obvious if we look at the powers of scalar multiples of B. First, recall that f* takes us from a matrix over GF(4) to a matrix of matrices of GF(2). Then define a map g that gives us degree 4 polynomials:
\begin{gather*}
g : \mathbb{F}_4^{2 \times 2} \rightarrow \mathbb{F}_2[\lambda]
\\
g = \text{charpoly} \circ \text{forget} \circ f^*
\end{gather*}
The matrices in the middle and rightmost columns both have order 15 inside GL(2, 4). Correspondingly, both 10011_\lambda = 2_{19} and 11001_\lambda = 2_{25} are primitive, and so have roots of order 15 over GF(2).
A Field?
Since we have 15 matrices generated by the powers of one, you might wonder whether or not they can correspond to the nonzero elements of GF(16). And they can! In a sense, we've "borrowed" the order-15 elements from this "field" within GL(4, 2). However, none of the powers of this matrix are the companion matrix of either 2_{19} or 2_{25}.
Haskell demonstration of the field-like-ness of these matrices
All we really need to do is test additive closure, since the powers trivially commute and include the identity matrix.
-- Check whether n x n matrices (mod p) have additive closure
-- Supplement the identity, even if it is not already present
hasAdditiveClosure :: Integral a => Int -> a -> [Matrix a] -> Bool
hasAdditiveClosure n p xs = all (`elem` xs') sums
  where
    -- Add in the zero matrix
    xs' = zero n : xs
    -- Calculate all possible sums of pairs (mod p)
    sums = map (fmap (`mod` p)) $ (+) <$> xs' <*> xs'

-- Generate the powers of x, then test if they form a field (mod p)
generatesField :: Integral a => Int -> a -> Matrix a -> Bool
generatesField n p x = hasAdditiveClosure n p xs
  where xs = map (fmap (`mod` p) . (x ^)) [1 .. p ^ n - 1]

print $ generatesField 4 2 $ forget $ fStar $ fmap (AlphaF4 *) mBOrig
True
More directly, we might also observe that α^2 B is the companion matrix of an irreducible polynomial over GF(4), namely q(x) = x^2 - \alpha x - \alpha.
Both the “forgotten” matrices and the aforementioned companion matrices lie within GL(4, 2). A natural question to ask is whether we can make fields by the following process:
Filter out all order-15 elements of GL(4, 2)
Partition the elements and their powers into their respective order-15 subgroups
Add the zero matrix into each class
Check whether all classes are additively closed (and are therefore fields)
In this case, it happens to be true, but proving this in general is difficult, and I haven’t done so.
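The four steps above can be sketched directly, if by brute force, over toy list-of-lists matrices. Enumerating all 2^16 candidate 4×4 matrices is slow but feasible; allFields evaluates the claim lazily, and mAB is the forgotten image of αB from before (all names invented for illustration).

```haskell
import Control.Monad (replicateM)
import Data.List (transpose, nub, sort)

type M = [[Int]]

madd, mmul :: M -> M -> M
madd = zipWith (zipWith (\x y -> (x + y) `mod` 2))
mmul a b = [[sum (zipWith (*) r c) `mod` 2 | c <- transpose b] | r <- a]

i4, z4 :: M
i4 = [[fromEnum (r == c) | c <- [0 .. 3]] | r <- [0 .. 3 :: Int]]
z4 = replicate 4 (replicate 4 0)

-- Smallest k in [1..15] with m^k = I, if any
order :: M -> Maybe Int
order m = lookup i4 (zip (take 15 (tail (iterate (mmul m) i4))) [1 ..])

-- Step 1: all order-15 elements of GL(4, 2)
order15 :: [M]
order15 = [m | m <- map (chunk 4) (replicateM 16 [0, 1]), order m == Just 15]
  where chunk _ [] = []
        chunk n xs = take n xs : chunk n (drop n xs)

-- Step 2: partition them into their cyclic subgroups
-- (the sorted list of powers is a canonical name for the subgroup)
classes :: [[M]]
classes = nub [sort (take 15 (iterate (mmul m) m)) | m <- order15]

-- Steps 3 and 4: add the zero matrix and check additive closure
isField :: [M] -> Bool
isField xs = and [madd x y `elem` xs' | x <- xs', y <- xs']
  where xs' = z4 : xs

allFields :: Bool
allFields = all isField classes

-- The forgotten image of αB, an order-15 element from earlier
mAB :: M
mAB = [[0,0,1,1],[0,0,1,0],[1,0,1,0],[0,1,0,1]]
```

Checking a single known class, such as the powers of mAB, is cheap; allFields itself takes a while to crunch through the full enumeration.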
Expanding Dimensions
Of course, we need not focus only on GF(4) – we can just as easily work over GL(2, 2^r) for r other than 2. In this case, the internal matrices will be r×r while the external one remains 2×2. Nor do we have to work exclusively with 2×2 matrices – we can work over GL(n, 2^r). In either circumstance, the "borrowing" of elements of larger order still occurs. This is summarized by the following diagram:
Here, f_r is our map from GF(2^r) to r×r matrices, and f_{nr} is a similar map. r must be greater than 1 for us to properly make use of matrix arithmetic. Similarly, n must be greater than 1 for the leftmost GL. Thus, nr is a composite number. Here, k is a proper factor of 2^{nr} - 1; in the prior discussion, k was 5 and 2^{nr} - 1 was 15.
Recall that primitive polynomials over GF(2^{nr}) have roots of order 2^{nr} - 1. This number can never be prime: the only primes of the form 2^m - 1 are the Mersenne primes, for which m itself must be prime, while nr is composite. Thus, a GL of prime dimension can never lend to a GL over a field of larger order with the same characteristic. Conversely, GL(nr + 1, 2) trivially contains GL(nr, 2) by fixing a subspace. So we do eventually see elements of order 2^m - 1 for both prime and composite m.
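The number-theoretic fact being used is the standard one: if 2^m - 1 is prime, then m must be prime (the converse fails, e.g. 2^{11} - 1 = 2047 = 23 · 89). A quick empirical check over small m, as a sketch:

```haskell
-- Trial-division primality test, fine for small inputs
isPrime :: Integer -> Bool
isPrime n = n > 1 && null [d | d <- takeWhile (\d -> d * d <= n) [2 ..], n `mod` d == 0]

-- Whenever 2^m - 1 is prime, m is prime; contrapositively, for composite nr,
-- 2^(nr) - 1 is always composite
mersenneExponentsArePrime :: Bool
mersenneExponentsArePrime = and [isPrime m | m <- [2 .. 30], isPrime (2 ^ m - 1)]
```

In particular, 2^4 - 1 = 15 is composite, which is exactly what let GL(4, 2) lend its order-15 elements.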
Other Primes
This concern about prime dimensions is unique to characteristic 2. For any other prime p, p^m - 1 is composite, since it is at the very least even. All other remarks about the above diagram still hold for any other prime p.
In addition, the diagram where we found a correspondence between the orders of elements in GL(2, 2^2) and GF(2^{2 \times 2}) via the characteristic polynomial also generalizes. Though I have not proven it, I strongly suspect the following diagram commutes, at least when K is a finite field:
Over larger primes, the gap between GL and SL may grow ever larger, but SL over a prime power field seems to inject into SL over a prime field. If the above diagram commutes, then the prior statement follows.
Monadicity and Injections
The action of forgetting the internal structure may sound somewhat familiar if you know your Haskell. Remember that for lists, we can do something similar – converting [[1,2,3],[4,5,6]] to [1,2,3,4,5,6] is just a matter of applying concat. This is an instance in which we know lists to behave like a monad. Despite being an indecipherable bit of jargon to newcomers, it just means we:
can apply functions inside the structure (for example, to the elements of a list),
have a sensible injection into the structure (creating singleton lists, called return), and
can reduce two layers to one (concat, or join for monads in general).
Monads are traditionally defined using the operator >>=, but join = (>>= id)
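In Haskell terms, a quick illustration of the list case:

```haskell
import Control.Monad (join)

nested :: [[Int]]
nested = [[1, 2, 3], [4, 5, 6]]

-- For lists, join is concat, and both agree with the (>>= id) definition
flat, flat' :: [Int]
flat  = join nested      -- [1,2,3,4,5,6]
flat' = nested >>= id
```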
Just comparing the types of join :: Monad m => m (m a) -> m a and forget :: Matrix (Matrix a) -> Matrix a suggests that Matrix (meaning square matrices) could be a monad, and further, one which respects addition and multiplication. Of course, this is only true when our internal matrices are all the same size. In the above diagrams, this restriction has held implicitly, but it should be stated explicitly, since Matrix a specifies no dimension.
Condition 2 gives us some trouble, though. For one, only "numbers" (elements of a ring) can go inside matrices, which restricts where monadicity can hold. More importantly, we have a lot of freedom in what dimension we choose to inject into. For example, we might pick a return that uses 1×1 matrices (which add no additional structure). We might also pick return_2, which scalar-multiplies its argument by a 2×2 identity matrix instead.
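For illustration, the two candidate injections over list-of-lists matrices (return1 and return2 are names invented here, not part of any library):

```haskell
-- Inject a scalar as a 1×1 matrix
return1 :: a -> [[a]]
return1 x = [[x]]

-- Inject a scalar as a scalar multiple of the 2×2 identity
return2 :: Num a => a -> [[a]]
return2 x = [[x, 0], [0, x]]
```

Both are plausible units for a would-be Matrix monad, which is exactly the ambiguity described above.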
Unfortunately, there’s no good answer. At the very least, we can close our eyes and pretend that we have a nice diagram:
As one last note on the monadicity of matrices, I have played around with an alternative Matrix type that includes scalars alongside proper matrices, which would allow for a simple canonical injection. Unfortunately, it complicates join: since internal scalars must be matched up with identity matrices of the right size, the responsibility of sizing the internal matrices is simply placed front and center.
Closing
At this point, I've gone on far too long about algebra. One nagging curiosity makes me wonder whether there are any diagrams like the following:
Or in English, whether “rebracketing” certain nr × nr matrices can be traced back to not only a degree r field extension, but also one of degree n.
The mathematician in me tells me to believe in well-defined structures. Matrices are one such structure, with myriad applications. However, the computer scientist in me laments that the use of these structures is buried in symbols, and that layering them is at best glossed over. There is clear utility and interest in doing so; otherwise, the diagrams shown above would not exist.
Of course, there's plenty of reason not to go down this route. For one, it's plainly inefficient – GPUs are built on matrix operations being as efficient as possible, i.e., without the layering. It's also a confusing entry point for people just learning matrices. I'd still argue, though, that the method is useful for learning about more complex topics, like field extensions.