Thursday, June 06, 2013

Codeforces Round #185

A. Whose sentence is it?

You got the chat record of two of your friends Freda and Rainbow, by mere curiosity you want to classified the sentences by the following categories: (1) Written by Freda, (2) Written by Rainbow or (3) You don't know who is the writer.

To do that you know that Freda always says "lala." (without quotes) at the end of the sentence. Rainbow in the other hand, always says "miao." (without quotes)  at the beginning of the sentence.

This is an implementation problem. There are just 3 cases to consider:

(1) If the sentence begins with prefix "miao." and don't end with the suffix "lala.", then it was written by Rainbow.
(2) If the sentence ends with suffix  "lala." and don't begin with the prefix "miao.", then it was written by Freda.
(3) Otherwise, you don't know who is the writer of the sentence.

B. Archer

There is a match of archery between SmallR and Zanoes, they try to shoot a certain target in turns, and SmallR shoots first. The probability that SmallR hits the target is of \(\frac{a}{b}\) while for Zanoes is \(\frac{c}{d}\). Given 4 integers \(a\), \(b\), \(c\), \(d\), return the probability of SmallR will win the match.

Let's first define some notation:

\(P(A) = \frac{a}{b}\) probability of SmallR hit the target
\(P(B) = \frac{c}{d}\) probability of Zanoes hit the target
\(1 - P(A)\) probability of SmallR don't hit the target
\(1 - P(B)\) probability of Zanoes don't hit the target
\(P(W)\)  probability of SmallR winning the archery match

Before writing any math, let's get some intuition about what is happening on each turn. There are 2 players shooting the target, that's give us 4 possibles outcomes:

(1) SmallR hits the target and Zanoes hits the target
(2) SmallR hits the target and Zanoes miss the target
(3) SmallR miss the target and Zanoes hit the target
(4) SmallR miss the target and Zanoes miss the target

Now because SmallR shoots first the probability of him winning in the first turn is equal to \(P(A) = \frac{a}{b}\).  If neither players win the match, or in other words, if SmallR and Zanoes both miss in the first turn, then, the probability that SmallR win in the second turn is \((1 - P(A)) \cdot (1 - P(B)) \cdot P(A)\).

Generalizing from the previous idea, we have that the probability that SmallR win in the \(k\)-th turn is: \( ((1 - P(A))(1 - P(B)))^{k-1} \cdot P(A)\)

This means that our answer, the probability of SmallR winning is given by the expression:
If we look carefully this turn out to be an Infinite geometric series, so we can reduce the infinite series to the expression:

Saturday, May 18, 2013

Google Codejam 2013 Round 1C

Problem A. Consonants

In brief, given a string \(S\) return the total number of substrings containing at least \(N\) consecutive consonants. The vowels are define as a, e, i, o, and u, and our consonants are the remaining 21 letters.

Let's begin by solving the small input of this problem, because the size of the string \(L\) can be at most 100. We just simple iterate over all the possibles substrings of the given string. For each substring check if contains at least one occurrence of \(N\) consecutive consonants, if that's the case we count that substring in our final answer. Things get a little bit more interesting when we are facing the large input, the size of the given string \(L\) can be at most \(10^{6}\). Clearly, the previous algorithm is not going to be fast enough. So we need a different approach to attack this problem.

First let's get some intuition with the following example:

S =  "quxrtz"
L =  6
N = 2

Imagine that we are iterating our string, until we find the substring "xr" which has \(N = 2\) consecutive consonants:

Now we should ask ourselves the following question, how many substrings  contains the string "xr"?

Listing all the substring of the given string S, we have in total 9 of them:

{"q", "u", "x", "r", "t", "z", "qu", "ux", "xr", "rt", "tz", "qux", "uxr",
"xrt", "rtz", "quxr", "uxrt", "xrtz", "quxrt", "uxrtz", "quxrtz"}

Where this answer come from? let's imagine for a second that we are building all the substrings that contains the string "xr". To do that we could either add characters to the front or to the end of "xr".

In the case of the left side we have 3 different possibilities, we can add "u", "qu" or "" (the empty string) to the front of "xr". In the same manner, in the case of the right side we also have 3 different possibilities, we can add "t", "tz" or "" (the empty string) to the end of the string "xr". By the rule of product we have \(3 \cdot  3 = 9\) substring including the string "xr".

Formalizing more the previous statement, let's call \(i\) the index of the last consonant that we found (0-based), and, let's call \(C\) the count of consecutive consonants so far. If the condition \(C \geq N\) holds, our count of substring that include this \(N\) consonants is given by the expression \((L - i) \cdot (i - N + 1)\).

There is just one problem remaining, what happen then if we apply the same method to the next N consecutive consonants:

We end up double counting some of the substrings (in red color the substrings that we have counted twice):

{"q", "u", "x", "r", "t", "z", "qu", "ux", "xr", "rt", "tz", "qux", "uxr",
"xrt", "rtz", "quxr", "uxrt", "xrtz", "quxrt", "uxrtz", "quxrtz"}

The solution to this issue is simple, instead of taking the whole left part of our string, we just take the part that we haven't count yet, we can do that by maintaining a pointer to the last \((i - N + 1)\) seen so far.

Adding this details left, we end up with the following expression \((L - i) \cdot (i - N + 1 - P)\) where \(P\) is the pointer to the last \((i - N + 1)\).

Thursday, May 09, 2013

Google Codejam 2013 Round 1A

It's been a while since my last post, I want to apologize to the readers of my blog, I was kind of busy studying Chinese.

There is not to much left to say about this match, there is an excellent official contest analysis. Nevertheless, I want to share some ideas that may help somebody to better understand the problems.

A. Bullseye

Let's first begin with a brute force solution of the small-input, according to the problem statement Maria begins first by drawing a circle of radius \(r + 1\) around the first white circle of radius \(r\), if there is enough paint she continue with a ring of radius \(r + 3\) around a circle of radius \(r + 2\) and so on...

So the area of the \(k\) ring correspond to the expression:
\( A_{k} = (r + 2k - 1)^{2} - (r +2k - 2)^{2} \pi \) \(cm^{2}\)

For example to paint the black rings showed in the previous picture we need:

\(S = ((r + 1) ^2 - r^2) +  ((r +3)^{2} - (r + 2)^{2})\) millilitres of black paint.

Let's don't forget that according to the statement 1 \(ml\) of paint is required to cover an area of \(\pi\) \(cm^{2}\).

Because for the small-input the constraints are low, we can just simply sum up the areas of the rings until we run out of black paint:

Until this point everything seems like a fairy tale, unfortunate the constraints for the large-input are kind of evil :). To understand the idea behind the solution we may need some understanding of arithmetic progressions.

An arithmetic progression is a sequence of numbers such that the difference between the consecutive terms is constant, let's call this constant value \(d\), so the \(n\)-term of certain sequence \(a\) is given by the formula:

\( a_{n} = a_{1} + (n-1)d \)

In order to get some intuition, let's first take a sequence of 3 rings, by our previous area formula we get something like this:

\(a = ((r + 1) ^2 - r^2) , ((r +3)^{2} - (r + 2)^{2}) , ((r +5)^{2} - (r + 4)^{2})\)

So far our expression don't look really arithmetic :) let's apply some algebra and see where it take us:

\(a = (r^{2} + 2r + 1 - r^{2}) ,  (r^{2} + 6r + 9 - (r^{2} + 4r + 4))
\\ ,  (r^{2} + 10r + 25 - (r^{2} + 8r + 16))\)

\(a = (2r + 1) ,  ( 2r + 5)  , (2r + 9)\) 

Given this expression is easy to see that we have arithmetic progression, with a constant factor \(d = 4\):

\(a = (2r + 1) ,  ( 2r + 5)  , (2r + 9), \cdots, (2r + 4k - 3) \)
\( a_{k} =  2r + 4k - 3 \)

Another way of getting the s \(k\)-th term of our arithmetic sequence, is applying some algebra to the given area formula:

\( a_{k} =  (2k + r -1)^{2} - (2k + r -2)^{2} \)
\( a_{k} =  (2k + r -1)^{2} - ((2k + r - 1) - 1)^{2} \)
\( a_{k} =  (2k + r -1)^{2} - (2k + r - 1)^{2} + 2(2k + r - 1) - 1 \)
\( a_{k} =  2(2k + r - 1) - 1 \)
\( a_{k} =  2r + 4k - 3 \)

Now our next goal is the sum of the area of the first \(k\) black rings, such that  \( S_{k} \leq t\) where \(t\) is the total amount of black paint available.

To calculate the sum of an arithmetic progression we use the following formula \(S_{k} = \frac{k(a_{1} + a_{k})}{2} \), here is the proof.

By substituting \(a_{1} = 2r + 1\) and \(a_{k} = 2r + 4k - 3\) we get:

\(S_{k} = \frac{k(2r + 1 + 2r + 4k - 3)}{2} \)
\(S_{k} = \frac{n(4r + 4k - 2)}{2} \)
\(S_{k} = k(2r + 2k - 1)\) 

Up to this point we already have all the math that we need, but there still one more problem, we can't check value after value until some \(k\) holds, we are going to need binary search to look for the \(k\) that maximize our result. Because the constraints are a little bit to high, we should be careful with overflow. A work around is use Python, so you don't have to worry about overflow.

Wednesday, February 06, 2013

UVA 12526 - Cellphone Typing

You are part of a research team that is developing a new technology to save time when typing text messages in mobile devices. This new technology aids users to skip key-strokes by the following conditions:
  1. The first letter always has to be input manually (can't be skipped).
  2. If a given prefix \(c_{1}, c_{2}, \cdots, c_{n}\) has been input into the system, and the user tried to insert a letter c such that for all the words that starts with \(c_{1}, c_{2}, \cdots, c_{n}\) also starts with \(c_{1}, c_{2}, \cdots, c_{n}\mathbf{c}\) then this key-stroke can be skipped. Otherwise the system waits for the users input.
Your task is, given the dictionary of words, calculate the average of key-strokes to input those words in the new system.

This problem can be solved using the Trie data structure, if you are not familiar with Tries I personally recommend the topcoder tutorial.

Back to the business :) In order to get a better intuition of the task let's first build a Trie with the following words  `hello', `hell', `heaven' and `bye':

As we can see in the graph bellow the numbers in red are the vertex prefix counter, in simple terms, we can say that every time a word is insert in our Trie a certain path is follow along the tree, and for each of the vertex in that path we keep a counter that records all the visits.

For example the letter H has a prefix count of 3 because there are 3 words that begin with H, `hello', `hell' and `heaven'. We could also say that the prefix count of H is 3 because there are 3 paths that crossed by H.

Understanding this let's focus in our goal the average number of key-strokes per word. Looking the picture below is not hard to see that we can only skip key-strokes when two consecutive nodes has the same prefix count.
Let's try to convince ourselves about the previous statement, assume that we are moving along a path in our Trie, along this path we visit the nodes in the following order \(1, 2, 3, \cdots, n, n + 1\) if a node \(n\) and \(n+1\) had the same prefix count implies that all the paths that past through \(n+1\) also past through \(n\) so we can say then that all the words that cross by the node \(n\) also cross by the node \(n+1\).

Finally, To get our result we just need to create a method that count the number of key-strokes. This method should not count the strokes of consecutive vertex with the same prefix count. The overall time complexity of this algorithm is \(O(|L|)\) where \(|L|\) is the length of the concatenate dictionary.