CMU ACM ICPC Selection Round 1

The CMU ACM ICPC Team Selection Round 1 featured a mix of problems from old ICPC regional contests. Although I did not do very well due to a number of stupid bugs, problem statement misreadings, failure to make key observations, and the such, I have been going through the problems trying to solve them. The problem statements are available here. Here is a selection of explanations:

1 Problem A: Binary Tree

This is a dynamic programming problem. First, let us ignore the “up” instructions. Suppose that we are given some set of instructions:


How do we count the number of distinct paths in the binary tree that you can make from taking any subset of these instructions? This is the same as asking, how many distinct subsequences you can make from the string. Suppose you already know the answer to the string containing the last n-1 letters RRLR, and you add a new one in front.

If you add L to the front, let’s see what subsequences we can get from LRRLR. At first glance it looks like it’s twice as many as the subsequences of RRLR, since for any subsequence of RRLR, you can add an L in front of it. However, this will double count the string LR, since you can get this from both adding L to R, as well as just taking the subsequence LR itself. This will also double count the string L, which can be obtained from both adding L to the empty string, as well as just taking the subsequence L itself. In fact, this double counts all subsequences of R. See a pattern? What we really want is twice the subsequences of RRLR minus the number of subsequences past the next occurrence of L.

1 \displaystyle{\begin{align}f(\text{LRRLR}) = 2f(\text{RRLR}) - f(\text{R})\end{align}}

Conversely, when adding R, the number of subsequences is twice the number of subsequences of the old string, minus the number of subsequences for the substring past the next occurrence of R.

Now, we can take into account the U commands. Suppose you arrived at your present spot by some path formed from the input S, say, RLRRR. If we go up one level, and then go down a path starting with R, then it is not a new path at all. Instead, we only want to count the subsequences starting with L. So, starting from the position of the U, we find the next occurrence of L, and add the dynamic programming result for that subproblem to our answer. Conversely, if your path ends with L, we would want to find the next occurrence of R instead.

2 Problem B: Strange Billboard


If we pick the following tiles to flip (probably a poor choice):


Then we would get:


Observe that the only way you can clear the tiles in the first row, without having to flip any more tiles in the first row, is to flip the tiles directly below the remaining X tiles, and not to flip any other tile in the second row (or else you would mess up the first row forever). This tells you exactly which tiles you can and cannot flip in the second row. Likewise, having done this, we know exactly which tiles we must flip on the third row, and so forth.

When you reach the last row, if it is already cleared, then hooray! You have found a solution. Otherwise, it is not a valid solution.

The brute force part is to try all the possible patterns of tiles to flip in the first row. Since there are at most 16 columns, you will only have to try 65536 possibilities. It is easy to implement this efficiently using bitwise methods. In GCC, the __builtin_popcount(int) method tells you the number of bits set in an integer efficiently.

3 Problem C: Phone Cell

This is a classical computational geometry problem. Simply put, you are given a number of points in the plane, and you wish to find the best place to put a unit disk such that it covers the maximum number of points.

3.1 Slow solution

The naive way to approach this is to observe that, for any solution, without loss of generality, you can shift the solution such that two points lie on the edge of the circle. Thus, you can try all possible ways to do this: for each pair of points, find the two circles that pass through these points. Then, for each of those circles, go through all the other points and see if they fit inside. This takes O(n^3), and since n can be a thousand, it is too slow. You can try to speed it up using quadtrees or something, but it is not worth it.

3.2 Fast solution

The proper way is an angular sweep. For each point p, rotate a circle p such that p is on the boundary of the circle. For each other point, we can compute the angle at which the circle will hit the point (start event), and the angle at which the circle will leave the point (stop event). There are 2n events, and sorting them takes n\log{n} time. So, in total, the algorithm runs in O(n^2\log{n}). This is fast enough.

The trigonometry needed to compute the positions of the events will be left as an exercise for the reader.

3.3 Slightly faster solution

It is possible to do this in O(n^2) expected time using O(n^2) space. The observation is that, if a circle centered at some point q covers the point p, then it is equivalent to saying that a circle centered at p covers q. Thus, we can draw a unit circle centered at every point, and find out where is covered by the most circles. You can compute a circle arrangement, iterate through all the faces, and find the face covered by the most points. This is mathematically very similar to computing a line arrangement, for which it is possible to use a randomized insertion algorithm which runs in expected O(n^2) time. The precise implementation details are beyond the scope of this blog post.

4 Problem D: Key Task

This is a straightforward implementation of breadth-first search. A tricky thing is that, actually, the maze is of size R\times C \times 16 since there are 16 possible states of key possession (0000, 0001, …, 1111). This problem is conceptually easy, but has numerous implementation annoyances. For example, I got several wrong submissions and took a long time because I didn’t notice that there could be no exit or more than one exit.

5 Problem E: Weird Numbers

This is an implementation/math problem. Converting from a negative base number is fairly easy. Given the digits, d\_m\cdots d\_1 the result is straightforward:

2 \displaystyle{\begin{align}n = d\_0 - d\_1 R + d\_2 R^2 - \cdots\end{align}}

Converting to a negative base is slightly less easy. Suppose you are converting a number to a positive base R instead, and get digits

3 \displaystyle{\begin{align}d\_m',\cdots d\_0'\end{align}}

Now you want to change the digits to that of the negative base: d\_m\cdots d\_1. First, observe that d\_0 = d\_0'. Next, if d\_1 is nonzero, then d\_1 = R - d\_1', with the caveat that you will have to add one to d\_2 (and the carry can propagate onwards, possibly resulting in more digits). This is because

4 \displaystyle{\begin{align}R^2 + d\_1 (-R) = (R-d\_1) R\end{align}}

So, simply iterate from the least significant digit to the most significant digit. Even with the most naive implementation, it is only O(m^2) for m digits. But the number of digits only goes up to around 20, since n<1000000<2^{20}.

6 Problem F: Rectangular Polygons

This problem has a beautiful solution. Let us group points both by x-coordinate and y-coordinate. In each case, when you look at a single column or row of points, you will have an even number of points, like so (for example):

    *     *  *   *         *    *

The key observation is that they must be connected as such:

    *-----*  *---*         *----*

This is because each point must be connected to both a vertical edge and a horizontal edge.

In this way, by looking at each row and column, you can write down an adjacency list of connections between points. Then, it suffices to traverse the graph to output the desired polygon. Since the problem asks for only the clockwise traversal, you can traverse it in both directions and output whichever one has the positive signed area.

7 Problem I: House Cleaning

This problem is essentially about finding a minimum spanning tree in a lattice graph. The edge between any two vertices is the minimum of the heights of those two rooms. You can solve it easily using either Prim’s algorithm or Kruskal’s algorithm. I personally find the former easier to implement (actually, I tried implementing the latter a few times, got incorrect output due to stupid bugs in my union-find data structure, and switched to Prim’s algorithm). You can store the “frontier” of Prim’s algorithm in a priority_queue, which is available in the c++ standard library.