Set Theory
Definitions
A set is an unordered, duplication-free collection of values (values = strings, numbers, functions, objects, sets, whatever).
Examples:
{0, 1, 2, 3, 5, 8, 13}
{}
{a, e, i, o, u}
{{}, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3}}
Unordered means order doesn't matter:
{1, 2, 3} = {1, 3, 2} = {2, 1, 3} = etc.
Duplication-free means that duplicated elements are ignored:
{1, 1, 2, 3} = {1, 2, 3}
Suppose S = {1, 2, 3}, then 1, 2, and 3 are the members of S. Equivalently, S contains 1, 2, and 3.
Notation:
1 S & 2 S & 3 S & 4 S
We use a special symbol to denote the empty set:
{} =
The cardinality or size of a set is the number of members:
|S| = 3
|| = 0
A multiset (or bag) is an unordered collection of values.
Notation:
S = [1, 2, 3] = multiset containing 1, 2, & 3 (Just made up [].)
Duplicates are not ignored in multisets:
[1, 1, 2, 3] != [1, 2, 3]
A list (or sequence) is an ordered collection of values. Duplicates are allowed and order is important.
For example:
(1, 2, 3) != (3, 2, 1)
(1, 1, 2, 3) != (1, 2, 3)
Each element in a list has a unique position. The first element has position 0.
For example, suppose L = (a, e, i, o, u), then L[0] = a, L[1] = e, etc.
More notation:
In general, there are two ways to describe a set.
Set lister notation:
A = {1, 2, 3}
Set builder notation:
A = {x | x == 1 || x == 2 || x == 3}
In English:
A = the set of all x such that x equals 1 or x equals 2 or x equals 3.
Set lister notation is convenient for small sets. We sometimes stretch it to describe infinite sets by using ellipses:
EVEN = {0, 2, 4, 6, ... }
Set builder notation is more formal since any property can come after the "such that" pipe:
PRIME = {x | 1 < x & x = a * b implies a = x || b = x}
Operations
Assume A and B are sets, then so are:
A B = {x | x A || x B}
A X B = {(x, y) | x A & y B}
Ak = A X A X A ... X A (k-times)
A – B = {x | x A & x B}
A B = {x | x A & x B}
P(A) = {x | x A}
A -> B = {f | f is a function & domain(f) A & range(f) B}
A is a subset of B if every member of A is also a member of B. Notation: A B
A equals B (A = B) if A B & B A
Note: For each set A, A
It is conventional to write f A->B as f: A->B.
Famous Sets
NAT = {0, 1, 2, ... }
INT = {0, 1, -1, 2, -2, ... }
RAT = {a/b | a, b INT & 0 < b}
REAL = all possible magnitudes and their negatives
STRING = all strings of char
BOOLEAN = {true, false}
Infinities
Yes, there is more than one infinity. In fact, there are infinity infinities, somemore infiniter than others!
Theorem: For every set A, |P(A)| = 2|A|
Proof: By induction on |A|.
Induction is a cool way to prove things. Here's the idea. Start by defining:
W = { n NAT | if |A| = n, then |P(A)| = 2n}
In other words:
W = { n NAT | the theorem is true if |A| = n}
Our goal will be to prove our theorem by proving W = NAT. We do this by showing:
(i) 0 W
(ii) if n - 1 W, then n W
In other words, W contains the increment of any member by (ii), and since 0 W by (i), then 1 W. Applying (ii) again we conclude that 2 W, and so on.
The cool feature of induction is that (ii) allows us to assume n – 1 W. This is a powerful assumption. In essence a proof by induction is analogous to a recursive algorithm.
proof of (i): If |A| = 0, then A = and P(A) = {} and |P(A)| = 1 = 20.
proof of (ii): Assume |A| = n > 0. Pick a random element aA. Let B = A – {a}. Then |P(B)| = 2n-1 by induction. Of course P(B) P(A).
P(A) – P(B) = {x {a} | x B}, so |P(A – B)| = |P(B)| and therefore |P(A)| = 2 * 2n-1 = 2n.
QED.
Assume A and B are infinite sets. What does it mean to say that |A| = |B|?
Definition |A| = |B| if there is an f: A -> B such that:
- x != y implies f(x) != f(y) (f is one-to-one)
- for every b: B there is an a: A such that f(a) = b (f is onto)
A one-to-one onto function is called a bijection. Note that if f is a bijection, then so is f-1: B -> A, where f(f-1(a)) = a.
Note, using this definition, it's possible that a proper subset can have the same cardinality as the superset.
Theorem: |NAT| = |EVEN|
Proof: This is true because the function f(n) = 2 * n is a bijection. QED.
Theorem: If A = {n NAT | n = 2k for some k}, then |A| = |NAT|
Proof: Left as an exercise.
Theorem (Euclid): If A = {n NAT | n is prime}, then |A| = |NAT|
Proof: Clearly |A| <= |NAT|. If |A| < |NAT|, then A = {p0, p1, p2, ...,pn}, where 2 = p0 < p1 < p2 < ... < pn.
Let k = (p0*p1*p2*...*pn), and let p = k + 1. Assume p = r * pi for some i and r, then p = pi*s + 1 = pi * r, so pi = 1/(r – s). This is impossible if pi is a NAT larger than 1, hence p is a prime larger than all of the primes assumed to exist.
QED.
Theorem: |A x B| <= |A| * |B|
Proof: Left as an exercise.
Theorem: |NAT| = |INT| = |RAT| = |STRING|
Proof: Left as an exercise.
Theorem: |NAT x NAT| = |NAT|
Proof: Left as an exercise.
Theorem: |A B| <= |A| + |B|
Proof: Left as an exercise.
Theorem: A B implies |A| <= |B|
Proof: Left as an exercise.
Theorem: If A NAT and A is finite (i.e., |A| < |NAT|), then |NAT – A| = |NAT|
Proof: Left as an exercise.
Theorem: If A = all binary strings, then |A| = |NAT|
Proof: Left as an exercise.
Theorem: If A = all valid C programs, then |A| = |NAT|
Proof: Left as an exercise.
Theorem: If A = all protons in the universe, then |A| <= |NAT|
Proof: Left as an exercise.
Theorem (Cantor): |NAT| < |P(NAT)|
Proof: Clearly |Nat| <= |P(NAT)|.
Now assume there is a bijection f: NAT -> P(NAT). Let D = {n: NAT | n f(n)}. Since f is onto, f(m) = D for some m. Is m a member of D? If yes, then mf(m), but this implies m D. On the other hand, if m f(m), then by definition m D, i.e., m f(m). A contradiction either way!
QED.
Theorem: |A| < |P(A)| for any set A.
Proof: Left as an exercise.
Notation: Even though |A| may be infinite, we still use exponentiation notation to denote the cardinality of its power set: |P(A)| = 2|A|.
We also write |P(A)| = |A|+. So for all A, |A| < 2|A| = |A|+.
Theorem: if 2 <= |B|, then |A| < |A->B|
Proof: Note that |P(A)| = |A->BOOLEAN| and D = {n: NAT | f(n) = false}.
More generally, assume f: A -> (A->B) is a bijection. Let g:A->B be defined by g(a) != f(a)(a). This should be possible since B contains more than one element. Now assume g = f(k) for some k A. then g(k) != f(k)(k) = g(k), a contradiction!
QED.
Theorem: |NAT| < |REAL|
Proof: The proof depends on a famous bijection from NAT->NAT to REAL.
So, we have an infinity of infinities: |NAT| < |P(NAT)| < |P(P(NAT))| < ...
Notation: |NAT| = 0
0 is called countable infinity since any set with this cardinality can be put in a one-to-one correspondence with NAT, i.e., it can be counted.
The next infinity after 0 is 1, and so on:
012 < ... < 0 < ...
Generalized Continuum Hypothesis: i+1 = i+.
This is a hypothesis because it is known that in some models of set theory it is true, while it is false in others. For example, in some weird models of set theory 0+ = 2.