My Take on Aliens' Trick

July 24, 2019

There was a problem in International Olympiad in Informatics 2016 named 'Aliens'. Its full solution uses an unconventional technique which I'll be trying to explain in this post. The trick is often named after this problem, "Aliens' Trick". Frankly speaking, this technique was literally alien to me untill few moments before I started writing this post.

My feeling - when I realized the underlying idea of the trick. Full version available here.

The Background Problem

The problem which ignited the process of realization of this technique is Building a Tall Barn from USACO 2017 January Contest, Platinum Division.

Abridged Statement: You are constructing an $N$-story building and you have $K$ ($1 \leq N \leq K \leq 10^{12}$ and $N \leq 10^5$) cows to do this job. You have to assign every cow to work at exactly one floor and each floor must have at least one cow assigned to it. If you assign $c_i$ cows to the $i$-th floor, it will take $a_i/c_i$ units of time to complete that floor. Also, floor $i$ must be completed before construction can begin on floor $i+1$. You need to compute the minimum total time in which the construction can be completed, if the cows are assigned to work on floors in an optimal fashion, that is, you have find the minimum possible value of $\sum_{i=1}^{N}a_i/c_i$. Output this number rounded to the nearest integer.

A Greedy Solution. The following simple (and intuitive) greedy algorithm works: set $c_i = 1, \forall i$. Now iterate this procedure $K - N$ times: pick an index $j$ such that it maximizes the value $\frac{a_j}{c_j} - \frac{a_j}{c_j + 1} = \frac{a_j}{c_j (c_j + 1)}$ and set the value of $c_j$ to $c_j + 1$. At some moment, when the $i$-th floor has $c_i$ cows assigned to it, let's call $\frac{a_i}{c_i (c_i + 1)}$ the profiting weight of $i$-th floor. If we let $T = \sum_{i=1}^{N} a_i / c_i$ which is initially $\sum_{i=1}^{N} a_i$, then we can say that in each iteration we are reducing our $T$ by as much as we can. And also taking the maximum profiting weight in each iteration yields a correct algorithm because every floor's profiting weight decreases as $c_i$ increases. Due to the fact that $K$ is quite big, simulating $K - N$ iterations (maybe even with a priority queue) would take tremendous amount of time — roughly $10^9 + 7$ tons of ice will melt in Anterctica in that time.

Some properties and notations. Let $f: \{x \in \mathbb{Z} | x \in [N, \infty)\} \rightarrow \mathbb{R}$ be a function with $f(i)$ being the minimum total time in which the barn can be completed using $i$ cows. Also, let's define $m_i$ as the slope at $x = i$, which is, for simplicity, $f(i-1) - f(i)$ (note that according to normal definition of slope, it would be $f(i) - f(i-1)$). In words, $m_i$ is the amount of decrease when moving from $i-1$ cows to $i$ cows. It's clear that $f$ is decreasing, i.e. $f(i + 1) < f(i)$. Also from the greedy algorithm, we observe that the decrease of $f$ is also non-increasing, that is, $f(i) - f(i+1) \le f(i-1) - f(i) \implies m_{i+1} \le m_{i}$ (when going from left to right along the $x$-axis, the fall is less steeper than the previous one) which makes the function convex.

Figure 1: Graph of $f$ (only for $x \in [N, 100]$) for $N = 20, K = 50$ and $a_i = \texttt{random_value} \in [1, 10]$. Plotted points are $(i, f(i))$.

If you had infinitely many cows you would keep employing them. But if you had to pay some amount of penalty for using each of those cows you wouldn't use many of them. Let $g(m)$ be the minimum amount of time we will need to complete construction if we have to pay a penalty time (cows need rest while working, don't they?) of $m$ per cow (say each cow spends a total of $m$ units of time resting throughout the whole building process) and $g_c(m)$ $(g_c: D_{g_c} \rightarrow R_{g_c}, R_{g_c} \subseteq \mathbb{N})$ be the number of cows in such assignment (maximize the number of cows when multiple assignment minimizes time). We will refer calculating $g$ and $g_c$ as $K$-loose version of the main problem. To solve this, we may just adapt the previous greedy algorithm, but, we must stop in such an iteration when our maximum profiting weight is less than $m$ (note that, 'less' comparison allows us to maximize the number of cows, however, if we had used 'less than or equal to' comparison, we would have minimized the number of cows). Note that, we already have a way to calculate final $c_i$ in $O(1)$ — for floor $i$, you will keep adding cows till the following inequality satisfies: $\frac{a_i}{c_i (c_i + 1)} \ge m \implies c^2 + c - \frac{a_i}{m} \le 0$ which implies final $c_i = \left \lfloor \frac{\sqrt{1 + \frac{4a_i}{m}} - 1}{2} \right \rfloor + 1$. Now we can find $g(m)$ and $g_c(m)$ in $O(N)$. But what actually $g(m)$ and $g_c(m)$ are calculating?

Figure 2: Using $g$ and $g_c$ we can actually find the coordinates of the red point (the touchpoint).

Upon giving some slope $m$ (again, our slope is the additive inverse of the traditional slope), the two functions give us the point where the line with slope $m$ touches the lower convex hull of the function $f$ — the coordinate of that point is $(g_c(m), f(g_c(m)))$. As $g$ is non-increasing, we can gradually decrease the value of $m$ and at some point we will encounter $g_c(m) = K$. By properly engineering the monotonicity of $g_c$, we can binary search for which $m$ we will obtain $g_c(m) = K$. We also know that $f(g_c(m)) = g(m) - m \cdot g_c(m)$. Subtituting $g_c(m) = K$, we get $f(K) = g(m) - K\cdot m$.

Figure 3: Example of what $(g_c(m), f(g_c(m)))$ will be when some consecutive segments' slopes coincide with $m$.

But here's a catch that may have been irritating you for a while: what if there're few consecutive slopes which are all equal to $m$. Formally, consider the scenario where $m_l = m_{l+1} = \ldots = m_{r} = m$ $(l < r)$. From the definition of $g_c$, in these kind of abnormalities $g_c(m)$ will be equal to $r$. As a result of this, the range of $g_c$, $R_{g_c}$ will miss out all the integers $ \in [l, r)$. So, if $K \in [l, r)$, we will not find any such intended $m$ for sure.

Improvise. Adapt. Overcome. Now, we have several informations: $m_l = m_{l+1} = \ldots = m_{r}$ $(l \le r)$ and $K \in [l, r]$ (note that, I've changed the right end of the interval from excluded to included and '$<$' to '$\le$' because we are about to generalize for any $K$). To tackle this, instead of trying to attain $g_c(x) = K$, we will find maximum $m_{opt}$, such that $g_c(m_{opt}) \ge K$. This will yield $m_{opt} = m_r = m_K$. Note that, $\forall i \in [l-1, r]$, $$f(i) = f(K) - (i - K) \cdot m_{opt}$$ \begin{equation} \implies f(K) = f(i) + (i - K) \cdot m_{opt} \end{equation} \begin{equation} \implies f(K) = f(r) + (r - K) \cdot m_{opt} \end{equation} Also, we know that, $g_c(m_i) = i$ and $f(g_c(m)) = g(m) - m \cdot g_c(m)$, which implies: $$f(r) = g(m_r) - m_r \cdot r$$ \begin{equation} \implies f(r) = g(m_{opt}) - m_{opt} \cdot r \end{equation} By adding $m_{opt} \cdot (r - K)$ to both sides of $(3)$- \begin{equation} f(r) + m_{opt} \cdot (r - K) = g(m_{opt}) - m_{opt} \cdot r + m_{opt} \cdot (r - K) \end{equation} Combining $(4)$ and $(2)$ we get, \begin{equation} f(K) = g(m_{opt}) - m_{opt} \cdot K \end{equation} Now, we just need to binary search $m_{opt}$! An important thing to note in this problem is, we are binary searching over $\mathbb{R}$. So, $m_{opt}$ might not be exactly equal to $m_K$ $( = m_r)$. Instead it might be slightly smaller than $m_K$, leading to inequality of $(1)$. Precisely, we need to rewrite $(1)$ as $f(K) = f(i) + (i - K) \cdot (m_{opt} + \epsilon)$ for some $\epsilon \ge 0$. This would derail $(5)$ to become a not so eye pleasing figure. But, the more iterations we are going to perform for the search, the smaller $\epsilon$ gets, the more precise our answer becomes with the current formula of $f(K)$. Also, the last line of the problem statement states that, "it is guaranteed that the solution will be more than $0.1$ from the boundary between two integers". However, when we are working with integers, we won't need to consider this issue as $m_{opt}$ will be exactly $m_K$.

One might think that, calculating $f(K)$ by adding something to $f(r)$ may not work all the times, because we may not always find such arrangement which gives us a cost of $f(r) + m_r \cdot (r-K)$. But given the fact that the function is convex and properly defined over the relevant elements (usually a contiguous range, e.g. for this problem it is $[N, K]$) allows us to assume that such arrangement exists.

All we need to do is...

Convince ourselves that the function is convex/concave (just making sure that the sequence of slopes is monotonic) (maybe write up some bruteforce before coding a fast enough solution and submit, see if it works for small subtasks).
P.S. If cost function (weight of directed edge in terms of DAG) satisfies quadrange inequality, dp function will be convex/concave (depending on the quadrange inequality sign). See resources 6 to completely understand what are these.
Being able to solve the $K$-loose version of the main problem in reasonable time and memory complexity.
Binary search.
Arsenal.
Celebrate.

Some Related Problems and Solutions of them

~~Coming soon...~~

Problem 1: You are given a weighted graph, some of edges are colored in black and others in white. You need to find minimum total weight of a spanning tree which uses esactly $K$ black edges. $|V| \le 10^5, |E| \le 2 \cdot 10^5$.
P.S. Assume the graph is connected even when no black edge is used. Otherwise, we get some 'holes' in the function.
Let $f(k)$ be the ans if $K = k$. Now, $f$ is concave. Binary seach on slope.
Bonus: Can you prove $f$'s concavity?

Problem 2: Codeforces 673E - Levels and Regions

Problem 3: IOI 2016 Aliens. Along with aliens' trick, you need convex hull trick optimization to solve this problem (the $K$-loose version needs cht). I added a link to cht optimization on the next section.

Problem 4: Codeforces 739E - Gosha is hunting. (some discussion)

Problem 5: B of this contest. This problem needs another idea to solve its $K$-loose version, which is 1D1D dp optimization (you can find resource of that in the next section). Vaguely speaking, consider some dp recurrence: $f_i = \max_{j=0}^{i-1} \left \{ f_j + w_{j,i} \right \}$ and let $g_i = \arg \min_{j=0}^{i-1} \left \{ f_j + w_{j, i} \right \}$. Then the following is true: $$\forall i \le j, w_{i,j} + w_{i+1,j+1} \le w_{i+1,j} + w_{i,j+1} \implies \forall i \le j, g_i \le g_j$$. Monotonicity of $g$ allows us to binary search (P.S. w/ some noise though).

You can find 2 and 3's editorial in the internet. 1 is from some chinese team selection test.

More Resources

A delicious treat for you If you can read bangla.
Lagrange Multiplier
More DP optimizations
1D1D DP Optimization
Convex Hull Trick - PEGWiki

See this for rigorous proofs.

Special thanks to Rezwan Arefin who introduced this trick to me. You can check out his blog here, it is awesome!