Friday, September 14, 2012

TopCoder SRM 556

DIV 2 250 - ChocolateBar

You just bought a chocolate bar, each square of the chocolate bar contains a letter in the range 'a' - 'z'. You decide to give this whole chocolate bar to your friend, but he is a really picky guy... he just like chocolate bars that do not contain repeated letters. You are able to cut 0 or more bars from the beginning and the end of your chocolate bar.

You are asked to return the maximum possible length of the chocolate bar that you can share with your friend.

First we are going to explain the naive solution, because the maximum length of the chocolate bar is at most 50, we can simply iterate over all the chocolate bars beginning at index \(i\) and ending at index \(j\), where \(j \geqslant i\). For each of this chocolate bars we can store the frequency of the letters seen so far, if one of the letter frequency is greater or equal than 2, we should not consider this chocolate bar, otherwise, we  calculate the length of the chocolate bar partition which is \(j - i + 1\) and stay with the maximum length as the result. The overall time complexity of this solution is \(O(n^{3})\).

After the contest I was thinking hmmm can we do better? the answer is yes. 

The idea is the following, the new algorithm keep two pointers \(lo\) and \(hi\) the first one represent the beginning of the sequence and the later one the ending. In order to guarantee the correctness of the results, at each step we should maintain the following invariant: "the frequency of the characters between \(lo\) and \(hi\) is less or equal to 1".

Given this, at each step we try to extend our current sequence one element more if the new sequence violate the invariant, we increase the pointer \(lo\) until the invariant holds again, otherwise, continue with the next element of the sequence. Simultaneously, we keep track of the maximum length partition that maintain the invariant, we can easily calculate this with the formula \(hi - lo + 1\). The overall time complexity of this solution is \(O(n + n)\).

DIV 2 500 - XorTravelingSalesman 

You are a very particular traveling salesman, every time you move to a new city your current profit is calculate as \(P \, \mathrm{XOR}\, V_{i}\) where \(V_{i}\) is the profit cost associated with the \(i\)-th city. You are allowed to move to the same city more that one time. Return the maximal profit that you can achieve.

The main idea to solve this problem is to notice that there is not so many states to consider, we are given maximum 50 cities and each of the Vi cost  do not exceeds 1,023, which give us a total of 51,150 possible states. We can simply iterate over those states using dfs and keep the maximum profit among them. 

Monday, September 03, 2012

Counting inversions in an array using Binary Indexed Tree


Given an array \(A\) of \(N\) integers, an inversion of the array is defined as any pair of indexes \((i,j)\) such that \(i \leqslant j\) and \(A[i] \geqslant A[j]\).
For example, the array \(a = \{2, 3, 1, 5, 4\}\) has three inversions: \((1,3)\), \((2,3)\), \((4,5)\), for the pairs of entries \((2,1)\), \((3,1)\), \((5,4)\).

Traditionally the problem of counting the inversions in an array is solved by using a modified version of Merge Sort. In this article we are going to explain another approach using Binary Indexed Tree (BIT, also known as Fenwick Tree). The benefit of this method is that once you understand its mechanics, can be easily extended to many other problems.


This article assume that you have some basic knowledge of Binary Indexed Trees, if not please first refer to this tutorial.

Replacing the values ​​of the array with indexes

Usually when we are implementing a BIT is necessarily to map the original values of the array to a new range with values between \([1, N]\), where \(N\) is the size of the array. This is due to the following reasons:

(1) The values in one or more \(A[i]\) entry are too high or too low. 
(e.g. \(10^{12}\) or \(10^{-12}\)).

For example imagine that we are given an array of 3 integers:
\(\{1, 10^{12}, 5\}\)
This means that if we want to construct a frequency table for our BIT data structure, we are going to need at least an array of \(10^{12}\) elements. Believe me not a good idea...

(2) The values in one or more \(A[i]\) entry are negative
Because we are using arrays it's not possible to handle in our BIT frequency of negative values (e.g. we are not able to do \(freq[-12]\)).

A simple way to deal with this issues is to replace the original values of the target array for indexes that maintain its relative order.  

For example, given the following array:

The first step is to make a copy of the original array \(A\) let's call it \(B\). Then we proceed to sort \(B\) in non-descending order as follow:

Using binary search over the array \(B\) we are going to seek for every element in the array \(A\), and stored the resulting position indexes (1-based) in the new array \(A\).

\(binary\_search(B, 9) = 4\)      found at position 4 of the array B
\(binary\_search(B, 1) = 1\)      found at position 1 of the array B
\(binary\_search(B, 0) = 0\)      found at position 0 of the array B
\(binary\_search(B, 5) = 3\)      found at position 3 of the array B
\(binary\_search(B, 4) = 2\)      found at position 2 of the array B

The resulting array after increment each position by one is the following:

The following C++ code fragment illustrate the ideas previously explained:

Counting inversions with the accumulate frequency 

The idea to count the inversions with BIT is not to complicate, we start iterating our target array in reverse order, at each point we should ask ourselves the following question "How many numbers less than \(A[i]\) have already occurred in the array so far?" this number corresponds to the total number of inversions beginning at some given index. For example consider the following array \(\{3, 2, 1\}\) when we reach the element 3 we already seen two terms that are less than the number 3, which are 2 and 1. This means that the total number of inversions beginning at the term 3 is two.

Having this ideas in mind let's see how we can applied BIT to answer the previous question:
  1. \(\mathbf{read}(idx)\) - accumulate frequency from index 1 to idx
  2. \(\mathbf{update}(idx, val)\) - update the accumulate frequency at point idx and update the tree.
  3. cumulative frequency array - this array represents the cumulative frequencies (e.g. \(c[3] = f[1] + f[2] + f[3])\) , as a note to the reader this array is not used for the BIT, in this article we used as a way of illustrating the inner workings of this data structure.
Step 1: Initially the cumulative frequency table is empty, we start the process with the element 3, the last one in our array.

how many numbers less than 3 have we seen so far
\(x = read(3 - 1)\) = 0
\(inv\_counter = inv\_counter + x\)

update the count of 3's so far 
\(update(3, +1)\)
\(inv\_counter = 0\)

Step 2: The cumulative frequency of value 3 was increased in the previous step, this is why the \(read(4 - 1)\) count the inversion \((4,3)\).

how many numbers less than 4 have we seen so far
\(x = read(4 - 1) = 1\)
\(inv\_counter = inv\_counter + x\)

update the count of 4's so far
\(update(4, +1)\)
\(inv\_counter = 1\)

Step 3: The term 1 is the lowest in our array, this is why there is no inversions beginning at 1.

how many numbers less than 1 have we seen so far
\(x = read(1 - 1) = 0\)
\(inv\_counter = inv\_counter + x\)

update the count of 1's so far
\(update(1, +1)\)
\(inv\_counter = 1\)

Step 4: Theres is only one inversion involving the value 2 and 1.

how many numbers less than 2 have we seen so far
\(x = read(2 - 1) = 1\)
\(inv\_counter = inv\_counter + x\)

update the count of 2's so far
\(update(2, +1)\)
\(inv\_counter = 2\)

Step 5: There are 4 inversions involving the term 5: \((5,2)\), \((5,1)\), \((5,4)\) and \((5,3)\).

how many numbers less than 5 have we seen so far
\(x = read(5 - 1) = 4\)
\(inv\_counter = inv\_counter + x\)

update the count of 5's so far  
\(update(5, +1)\)
\(inv\_counter = 6\)

The total number of inversion in the array is 6.

The overall time complexity of this solution is \(O(N logN)\), the following code corresponds to a complete implementation of the ideas explained in this tutorial:

Related problems

UVa 10810 Ultra-QuickSort
UVa 11495 Bubbles and Buckets
SPOJ 6256 Inversion Count
Codeforces Beta Round #57 E. Enemy is weak