If we’re going to talk about abstract math, we need to define some terms. We’ll start with the foundational ones and try to iteratively increase complexity.
What is a category?
A category C consists of:
- A class of objects ob(C)
- For each pair of objects (A, B), a class of morphisms hom(A, B)
- For each triple (A, B, C), a composition operation ∘, such that ∘ : hom(B, C) × hom(A, B) → hom(A, C)
- For each object A, an identity morphism idA ∈ hom(A, A)
Such that these two axioms hold:
- Associativity: h ∘ (g ∘ f) = (h ∘ g) ∘ f
- Identity: idB ∘ f = f = f ∘ idA for all f: A → B
Types as a category
I know that’s a lot if it’s your first time seeing it. But we’re all programmers here, right? Let’s talk about a category we all know: the types. The types themselves—strings and ints and whatnot—are the objects. Morphisms are functions that take us from one type to another. The hom annotation used above is just a shorthand for “the set of all morphisms from A → B”—in this example, C++ functions.
// foo is a morphism from string -> size_t
size_t foo(string s) { return s.length(); }
If we have a function from typeA to typeB, and another function from typeB to typeC, we can easily write a function from typeA to typeC by combining (composing) them:
bool is_zero(size_t st) { return st == 0; }
// bar is morphism from string -> bool, formed by
// composing foo and is_zero
bool bar(string s) { return is_zero(foo(s)); }
For every type we can write an identity function that returns whatever it is passed unchanged.
template <typename A>
A id(A arg) { return arg; }
The axioms look abstract but they’re things you rely on constantly without thinking about it. Associativity:
char a_or_b(bool b) { return b ? 'a' : 'b'; }
// comp_one : a_or_b . is_zero
char comp_one(size_t st) { return a_or_b(is_zero(st)); }
// comp_two : is_zero . foo
bool comp_two(string s) { return is_zero(foo(s)); }
// (a_or_b . is_zero) . foo = a_or_b . (is_zero . foo)
string s = "yay category theory";
char a = comp_one(foo(s));
char b = a_or_b(comp_two(s));
assert(a == b);
And finally, identity:
// foo . id is the same as id . foo
size_t a = id(foo(s));
size_t b = foo(id(s));
assert(a == b);
This is an example of a category in which there are potentially many morphisms from A → B. But it doesn’t always have to be like this.
Posets as categories
Let’s take a look at another category, this time from math—partially ordered sets (posets). Say you have the set of the integers ℤ, and the morphism is ≤:
- If a ≤ b, then there’s a morphism a → b. If not, there isn’t.
- If a ≤ b and b ≤ c, then a ≤ c (composition).
- For every integer there’s an identity morphism, because every integer is less than or equal to itself.
- Associativity is trivial. Let’s say you have three morphisms in a row:
- f : x → y (so x ≤ y)
- g : y → z (so y ≤ z)
- h : z → w (so z ≤ w)
- Then (h ∘ g) ∘ f means that we compose z ≤ w and y ≤ z, getting y ≤ w, which we then compose with x ≤ y, to end up with x ≤ w.
- It’s easy to see that h ∘ (g ∘ f) will result in the same thing.
- Finally, identity:
- id : x ≤ x (the identity morphism)
- f : x → y (so x ≤ y)
- then id ∘ f and f ∘ id are equal.
Unlike with our types discussion, the morphisms for posets aren’t functions but rather relationships. There’s a “less-than-or-equal” relationship between 1 and 2, but there isn’t one between 8 and 7. This means unlike functions in programming (where there may be many functions that take a string and return an int), there is only at most a single relationship between two objects in a poset. Either there is a relationship, or there’s not. That’s why associativity and identity are trivial—any composition of morphisms that starts and ends in the same place must be the same because there’s only one of them!
Posets are an example of a thin category, where there is either zero or one morphism between any two objects.
Fly me to the moon
Finally, let’s take a turn to the everyday: flight plans.
- Objects: airports (MCO, SFO, PVG, DXB, etc.)
- Morphisms: flights between those airports
- Composition: If there’s a flight from MCO → SFO, and a flight from SFO → PVG, then taking that combined flight is the equivalent of taking a single flight MCO → PVG (though after that many hours in economy you may not be in the mood for category theory).
- Identity Morphism: There’s an “identity flight,” which is a flight that doesn’t take off, so you depart from and arrive at the same airport (let’s hope you’re not on the tarmac for too long).
- Associativity: a little strained, but there’s no difference in flying from MCO → SFO → PVG on one day and then PVG → DXB on the next, versus flying from MCO → SFO on one day and SFO → PVG → DXB on the next. Whether you view your trip as
(MCO → SFO → PVG)then→ DXBorMCO →then(SFO → PVG → DXB), the total path is the same. - Identity: Taking the “identity flight” MCO → MCO then flying to SFO is the same as flying from MCO → SFO and taking the SFO → SFO “identity flight.”
This example is easy to imagine, but it actually has a bit of a wrinkle definitionally. Is the flight from MCO → ORD → PVG the same as the flight from MCO → SFO → PVG?
- If we take our morphism to be “is there a flight route between these cities,” then we have a thin category again, where the morphism either exists or it doesn’t.
- If we consider those two different morphisms (and if you’re flying in January they are definitely different), then we have a free category, where the morphisms are all possible paths (unique sequences of flights) and no further structure is imposed.
That doesn’t matter very much for our discussion here, but it’s interesting nonetheless.
Whew
We covered a lot of ground today. We defined what a category is, and looked at three examples: types and functions, integers under ≤, and airports and flights. The examples look nothing like each other, but they all satisfy the same four conditions and two axioms. That’s the point: a category is an abstraction over structure, not over content. Next week we’ll look at one more example that I think will change how you think about something you use every day.