Catalan Adventure

Catalan Adventure

“You like mathematical puzzles, don’t you?”, said a friend. “Try this one: it’s called the ‘brackets problem’.”

Consider any mathematical expression, such as (a+b)(c+d). Get rid of everything except the brackets, leaving ()(). How many legitimate orderings of these brackets are there? In this case with two (pairs of) brackets, the answer is two: (()) and ()(), or, letting O mean ‘open’ and C mean ‘closed’: OOCC and OCOC.

So, letting F(N) denote the function for N brackets, we have F(2)=2. And obviously F(1) =1. We want to find the function F(N) and prove that our formula is correct.

A useful recurrence relation

It would be good to be able to generate the first few terms of the sequence F(N) and it turns out that there is a way to do this. It helps to represent the problem in a sort of ‘mountain range’ diagram, whereby bracket openings are ascents and bracket closings are descents. Thus OOCC and OCOC look like:

and

Admissible solutions to the brackets problem for any N can be represented as mountain ranges as above, the key constraint being that these can never go below ground level, i.e. there can never have been more Cs than Os at any point along the way.

For any N, we can classify solutions according to their ‘first return’, i.e. the number of brackets, K, after which they return to ground level for the first time. In the example above K=2 for OOCC and K=1 for OCOC. Illustrated below is a solution for N=5, with K=3.

How many such solutions are there? Since, by construction, the first set of three brackets starts with an O and ends with a C, there are F(2) permutations of that set: the interior part, above the dashed line, forms a 2-bracket problem of its own. And for the last set of two brackets, we just have the original brackets problem for N=2, therefore F(2) permutations again. This gives a total of F(2)*F(2) permutations for N=5 and K=3. Adding up for all values of K from 1 to 5 and defining F(0)=1 gives:

F(5)=F(0)*F(4)+F(1)*F(3)+F(2)*F(2)+F(3)*F(1)+F(4)*F(0)

The obvious general form of this recurrence relation can be used to construct F(N) for any given N in terms of preceding values of F. It is therefore easy to generate the first few values after N=2:

F(3)=5, F(4)=14, F(5)=42, F(6)=132, F(7)=429, F(8)=1430

Guessing the answer

The brackets problem has a close affinity with the simple combinatorial problem of finding the number of ways of choosing N objects from 2N, viz: (2N)!/(N!)2. Indeed this would be the solution, were it not for the side constraint that there may never be more Cs than Os as we move along the sequence. Denoting this function by P(N), we have:

P(1)=2, P(2)=6, P(3)=20, P(4)=70, P(5)=252 …

Comparing F(N) with P(N) it appears that P(N)=(N+1)*F(N) so that

F(N) = (2N)!/(N!(N+1)!)

It remains to prove that this beguilingly simple formula is the correct one.

Pascal’s triangle with a reflecting barrier

The values of P(N) appear in the central spine of Pascal’s triangle, and the numbers in each box in the triangle can be obtained, sequentially moving down, by adding up the numbers in the boxes immediately above. So that P(3)=20 tells us that there are twenty ways of ordering a collection of three Os and three Cs: if an O is a move down and to the right and a C is a move down and to the left, then – equivalently – 20 is the number of routes to the box in question, starting at the top of the triangle.

We can construct an equivalent diagram for the brackets problem, simply by removing all of the cells to the left of the spine, as shown on the next page.

The values of F(N) are in the yellow shaded boxes. It is of interest to compare the values in the other (green) boxes with their equivalents in the Pascal triangle. Let us define G(N, Z) as the value in the box Z steps to the right of the box containing F(N). So G(N,0)=F(N). In the figure above G(4,0)=14, G(4,1)=28, G(4,2)=20, G(4,3)=7 and G(4,4)=1. We define the equivalent function for the (half) Pascal triangle below as Q(N,Z),

It is well-known, and easy to see, that:

Q(N,Z) = (2N)!/(N+K)!(N-K)!

Inspection of the diagrams suggests the conjecture

G(N,Z) = (2Z+1)(Z+N+1)-1Q(N,Z)(0)

This fits, of course, for Z=0 but also for Z>0. For example we have G(4,1)=28 and Q(4,1)=56. In this case (2Z+1)/(Z+N+1) = ½.

Proof by induction

We suppose the conjectured formula (0) to be true for N=n and show that it then holds for N=n+1. Doing this requires a little care, because the ‘parents’ (or perhaps ‘grandparents’) in row n of a given cell in the row (n+1) differ at the edges from in the middle, as can be seen from the diagram.

Using the diagram, for K=0, we have:

G(n+1, 0) = G(n,0)+G(n,1)(1)

For K=1, 2, …, n-1, we have three (grand)parents, so

G(n+1,K) = G(n, K-1)+2G(n,K)+G(n,K+1)(2)

For K=n, we have only two parents

G(n+1,n) = G(n,n-1) + 2G(n,n)(3)

And for K=n+1, we have just one parent

G(n+1,n+1)=G(n,n)(4)

We have to prove that formula (0) yields the correct result for G(n+1,K) in all four equations.

Considering theRHS of equation (1), our hypothesized formula for G gives:

G(n+1, 0)= (n+1)-1Q(n,0) +3(n+2)-1Q(n,1)

= (n+1)-1Q(n,0) +3n(n+2)-1(n+1)-1 Q(n,0)

= 2(2n+1)(n+1)-1 (n+2)-1Q(n,0)

Using Q(n+1,0) = (2n+2)(2n+1)(n+1)-2Q(n,0) = 2(2n+1)(n+1)-1Q(n,0), we obtain

G(n+1, 0)= (n+2)-1Q(n+1,0) as required.

Similar calculations verify that (2), (3) and (4) also hold. Since we can see that (0) holds for, say, N=2, it must therefore hold for all N, completing the proof.

Endnote

Long after arriving at the above solution to the brackets problem (which was itself long after the problem had been posed to me) I discovered that the problem was well known under another name. The numbers F(N) are known as the Catalan numbers. Wikipedia at the time of writing (March 2015) gives no fewer than five proofs of the formula. This paper provides a sixth.

There is, of course, nothing particularly interesting about this proof, although it serves to illustrate the principle that one may sometimes be able to solve a given problem by solving a more general one first. Although I discovered the recurrence relation and had guessed the correct formula F(N) quite quickly, I was able to make progress only when it occurred to me much later to consider the Pascal triangle as a whole, thus leading me to the relation between G(N, K) and Q(N, K).

RKE March 2015