Tuesday 4 August 2015

Amazon Internship Selection Experience

Since the internship season has started, it's really evident that it must come with some of it's side effects mostly lack of sleep and constant nervousness (whether you will get selected or not) after completely screwing up my Codenation Online Test (partly because the Testcases were wrong partly because I couldn't do even one question completely) Amazon was to visit the campus for selection of interns. Luckily I managed to get selected in it and concluding the on campus internship season for myself quite early (phew what a relief).

So I am just going to describe the whole process of Internship selection alongside some of the personal experiences and opinions, so if your sole purpose of reading this blog is to ask me "What were the questions asked in the interview ?" I would suggest you can skip a majority of sections and go directly to the interview part.

Begining of the day.
Tell you what, nothing messes up with your sleeping cycle like summer vacations you may sleep at 9:00 pm , 3:00 am , 5:00 am or don't sleep at all throughout the night and then remain like a retard througout the day. So I tried sleeping early on the day before let's say 9:00 pm but woke up at 12:30 am and that's start of the day for me. So not being able to sleep now I had two choices either to remain uneasy for the rest of the night or do something productive, something that would help me with my interviews etc.
I decided to open up https://www.hackerrank.com/domains/data-structures to practice some data structures, I would recommend practicing from here if you are reading geeksforgeeks instead of just cramming up the codes for various problems. Atleast do all the sections before the advanced data structures section here for two possible reasons.
  1. It's easy and contains the implementation of standard DS problems asked in interviews and online tests of various companies.
  2. Because it's Hackerrank (the platform is used by various companies to conduct their Online tests so you will get familiar with the environment).
I coded up a little bit of trees and Disjoint sets Linked lists etc the problems that remained. When I was finished it was about 3:30 am I realized I should definitely sleep now because otherwise it would be really difficult to perform well tommorow. So I tried to sleep but sadly couldn't. In the morning I got ready for the day left the house at 7:30 am reached the campus at 8:00 am as it was the reporting time (Bless those TnP guys calling us 2 hrs before the actual start of any test or interviews). 

The Online test.
The online test started at about 9:30 am it was supposed to be a 1hr 30 min round with 20 Multiple choice questions and 2 programming questions.
The objective questions were related to various topics Data structures, Algorithms, outputs of codes and recursions, C-syntax, logic puzzles etc. You could actually manage to do sufficient amount of MCQ's pretty straightforward stuff.

So the two programming questions.

1) Given an array print the minimum number of elements to be deleted such that the maximum of the array is less than or equal to the twice of the minimum element of the array.

let's say given the array:- 2 3 4 5 6 7
we can delete the elements 2 and 7 and we will get the array 3 4 5 6 also it's not the only solution, there can be multiple solution too.

Solution:-
At first I tried greedy solutions which I knew wouldn't work like sorting the array and keep on deleting the minimums or maximums or even alternating between the maximums and minimums.
Then it finally struck me to perform a binary search on the array to solve it (I don't know how exactly but it somehow struck me).

I could just select the i'th element let's say and find the index of rightmost j'th element which is at maximum twice the i'th element and delete all the elements outside of this segment. no need to implement binary search if you already know how to use upper_bound()
//Warning untested code
  1. int f(vector<int>v,int n)  
  2. {  
  3.     sort(v.begin(),v.end());  
  4.     int ans=(int)(1e9);  
  5.     for(int i=0;i<n;i++){  
  6.         int tmp=i-1;  
  7.         tmp+=(v.end()-upper_bound(v.begin(),v.end(),2*v[i]));  
  8.         ans=min(ans,tmp);  
  9.     }  
  10.     return ans;  
  11. }  

2) Given a string find the first non repeating character and return the character as a string otherwise         return "NONE".
    For example lets say the string is "twitter" the returned string should be "w". Remember that the         characters are case sensitive.
    Solution:- Straightforward use of maps although there are other solutions also.
 
  1. string f(string s){  
  2.     map<char,int>ma;  
  3.     for(char ele:s){  
  4.         ma[ele]+=1;  
  5.     }  
  6.     for(char ele:s){  
  7.         if(ma[ele]==1){  
  8.             string str="";  
  9.             str.push_back(ele);  
  10.             return str;  
  11.         }  
  12.     }  
  13.     return "NONE";  
  14. }  
 
    That was sufficient for the online round. as far as I remember I could do 18 out of 22 questions.
    not attempting the other ones because there was -0.25 for each wrong answer.

The seemingly unending wait
The test concluded as far as I remember between 11:00-12:00 am and then we were supposed to wait for the guys in the second batch to get finished and then wait for our shortlist. This wait seemed unending mainly because it was stressful thinking about how you have performed also the weather hasn't been kind in past few days It was really hot in there. So here we were waiting to get shortlisted and suddenly comes a ping on someone's phone regarding a shortlist. 27 people were shortlisted out those seemingly 200 people who gave the online test. I guess 6 from I.T.(my department) and 21 from  C.O.E  4 out of all of the people shortlisted from my class performed an awkward group hug :P yeah that included me too.

Pre Interview
We were supposed to report at the 4th floor of Library Block at 2:30 all 27 of us. The nice elevator moment was when there was an overload and with polite greetings the last two guys were asked to use the stairs. So we arrived at the 4th floor which I would say is still under construction and I believe has been the same for the last 2 years. We were asked to wait (as usual) by the TnP guys in the sort of abandoned lab where all you could find was dusty chairs. Anyways it was time for freshening up and a major lot hit the washrooms (of course I am not going to describe what happens in there what are you expecting ?). We all freshened up and they starting calling us in chronological order of the shortlist I believe. Pre Interview Jitters are pretty common but I was a bit relaxed I don't know how and why.

The first Interview.
The candidates were being called in one after another and then I hear my name being called "Gaurav Kumar" So I went to the room following my interviewer.
He asked me how was my day. I replied it was fine with a smile. We reached to the room I sat down I was handed some white sheets and told to write the code on paper. He just scanned my resume and asked for any projects I said no sir I haven't done any, he told me its cool no problem. Then he began explaining the interview process, he said if at any point I feel I have enocountered the problem before I could just explain the solution verbally to him so that we don't waste time on it and I could explain my thought process.

So my first question was to delete the nodes out of a bst which don't lie in a specified range.
The question was taken from here :- http://www.geeksforgeeks.org/remove-bst-keys-outside-the-given-range/
The gfg code is really really clean but alas I didn't prepare the problem from here but what I coded up was a bit brute but also sufficient.

All I could think of was a recursive function
  1. void f(struct node * root, int a,int b,struct node * ptr,int id)  
this would go from the root if at any point root->data < a I would recurse onto the right subtree
if at any point the root->data > b i would recurse onto the left subtree. if both were true I would recurse on both I was taking ptr as the parent of the previous selection. I still think my implementation was not really good.
The interviewer said it was fine but I was not freeing up the nodes I could get a memory leak. so I added up the free statements but then he pointed out another memory leak stating the left or right subtree would not be freed then I fathomed enough courage to implement another function g() which would specifically take the job of freeing up the nodes in post order way. He asked me the complexity I told him that since i am checking each node in the BST I just bruted the very standard question but Interviewer proceeded to the next question.

My second question was a bit of a puzzle type question:-
given a river with points marked as 1,2,3,4....n in one order along one bank, and one in random order on the other bank, you can join the two same marked points together and that would be a bridge, so you have to maximize the number of bridges and such that no two bridges intersect.





I immediately told the interviewer that this was a problem I have seen before and it would be solved with Dynamic programming and the problem is called longest increasing subsequence problem, which can be solved in O(n^2) time using Dynamic Programming he said okay we move to the next question. I also asked for the confirmation he said If you wouldn't have been correct I wouldn't let you go away with the question. So cool, screwing a bit of the first question doesn't mattered now. because I could straightaway solve the Second.

My third question was to find the number of possible decoding for  a number string :-

for example :-
 if given string is "121"
 it can be decoded as 1-2-1 = "ABA"
 or it can be decoded as 12-1 = "LA"
 or it can be decoded as 1-21 = "AU"
so for 121 answer is 3.

I first coded a linear dp solution, the interviewer pointed a bug, I dealt with it and explained my bug free solution like.
I told him that dp[i] stores the number of ways of decoding the substring starting from 0 to i'th index.
//Warning untested code

  1. int f(string s,int n){  
  2.     int dp[n+1];  
  3.     fill(dp,dp+n+1,0);  
  4.     dp[0]=1;  
  5.     for(int i=1;i<n;i++)  
  6.     {  
  7.         if(s[i]=='0'){  
  8.             if(i-2>=0)  
  9.             dp[i]=dp[i-2];  
  10.             else  
  11.             dp[i]=1;  
  12.         }  
  13.         else{  
  14.             if(s[i-1]!='0'){  
  15.                 dp[i]+=dp[i-1];  
  16.                 if(i-2>=0 && (s[i-1]-'0')*10+(s[i]-'0')<=26){  
  17.                     dp[i]+=dp[i-2];  
  18.                 }  
  19.                 else if(i-2<0 &&(s[i-1]-'0')*10+(s[i]-'0')<=26){  
  20.                     dp[i]+=1;  
  21.                 }  
  22.             }  
  23.             else{  
  24.                 dp[i]=dp[i-1];  
  25.             }  
  26.         }  
  27.     }  
  28.     return dp[n-1];  
  29. }  

He said alright and told me that he was done with the interview and asked if there were any questions from my side, I just customarily asked him about the learning opportunities for an intern at Amazon he told something.
I came out of the room the TnP guy told me to wait in another room.

The seemingly unending wait part 2
My first interview ended at about 5:30pm and then begun the unending wait part 2 because all the interviewers were busy taking the interviews of other round 1 candidates I was confined to another room, during this time I was texting with dad, then received calls from some of my freinds,  I called back dad by going outside the room. then got back to the room again waiting and waiting and waiting, 2 hours passed I wasn't getting alloted for the interview two of my classmates who got shortlisted were not selected after round 1 I felt bad for them but what else I could do, this also made me a bit nervous and I started overthinking but then few deep breaths worked like magic, I was trying as hard as possible to not lose hope and focus. Finally I got called for interview.

The second interview
The second interview started close to 7:30pm for me I got into the room greeted the interviewer with a smile. He told me to sit. I sat down and he described as how the interview will proceed, He told me that the would start by introducing himself then he would ask for my introduction then he will ask what all subjects we have studied and what all subjects I liked and then there will be some technical questions and finally he would let me ask some questions to him.
He started off with his name and told me that he worked at managament at Amazon, and some product that he has designed for Amazon.
Then he asked for my introduction with my CV in his hand,
Now is the point I felt Is he really asking for my introduction ?? he has my resume in his hand he could just read it. What if I say My name is Gaurav Kumar and he says get lost ? because that's too stupid to say.
I started to stammer while speaking anything and then told him that "Sir, the point is you have my CV in your hand I am struggling to find what's not there in the CV which I could tell you because you might have read my CV and telling the things which you already know wouldn't be nice." He just kept the CV on the other side of the table and said don't worry I didn't even read it you go on. (Pheww.. relief  I thought it was some sort of a trick but it wasn't). Then I introduced myself although stammering again then told him about my academics my branch and my course duration my interestes like competitive programming, problem solving , and Code Golf.  Then I told him that I have like Data Structures and Algorithms, course and Introduction to programming course in the 3rd sem and 2nd sem he was noting them down in his notebook.
Then he said okay gaurav give me a 5 minute dump on all the data structures you know their advantages and disadvantages etc.
I was like okay here we go now:-
Arrays :- contiguous memory allocation referenced by a same name. Text book definition.
accessing of data O(1) and update of data like insertion O(n) searching in a sorted array O(log(n)).
Linked list :- searching O(n) and deletion/insertion O(1).
Trees :- Hierarchial Data Structures, representing some sort of hierarchy in between two nodes. searching worst case O(n) if n nodes then n-1 edges etc etc.
Binary search trees :- data structures with log(n) search and O(1) update / deletion and insertion time.

And because you only live once I blurted out two advanced data structures.

DSU and Segment Tree. partly because I was confident in them.

Then he gave a technical question to find the 3rd distinct maxima in an array.
like for example if the given array is:- 4 4 3 2 the answer should be 2

I said the worst solution would be to sort the array and check for the third maxima, then I tried to implement  a O(n) solution, using three variables max1,max2,max3 and one pass through the array he said, there should be a industry level code covering all edge test cases.

I wrote a solution and as usual he pointed an edge test case like 4,4,2 I asked him that sir there is no third maxima in this case he said that's why you should make the code fool proof. I tried correcting it he then moved on.

He asked me if I have studied Operating Systems,
I said yes, then he asked some textbook questions on Operating systems, like what is Cache memory, what is a deadlock , what are the deadlock detection strategies, how do you describe a fuctioning of cache memory.

Then came the favorite topic of interviewers,
The Hash Table,
he asked me what is a hash table, how does hash table fuctions, what is lookup time , disadvantages of hash table, advantages of hash table, under what conditions one can use a hash table, I didn't really like hash tables but could possibly make reasonable amount of answers it was fine.

Final technical question:-
Given a stream of 8 bit integers find the number of set bits in them as fast as possible. I told him there is a reasonable way to do it using builtin_popcount() in O(1) he asked how it was O(1)? I told him that it is O(number of bits) but since you have specified the number of bits it becomes O(1). He said now consider builtin_popcount in not available, I implemented the textbook code.


  1. int f(int n){  
  2.     int ctr=0;  
  3.     for(int i=0;i<8;i++){  
  4.        if(n&(1<<i)){  
  5.          ctr++;  
  6.        }  
  7.     }  
  8.     return ctr;  
  9. }  

he asked for something else, So I used the fenwick tree implementation code

  1. int f(int n){  
  2.     int ctr=0;  
  3.     while(n){  
  4.       n-=(n&(-n));  
  5.       ctr++;  
  6.     }  
  7.     return ctr;  
  8. }  

he asked anyway to do it using hashing I said sir I don't really like hash tables that much because as far as I have experienced the logarithmic time lookup is sufficient like in case of a map or set, rather than hash which has some possiblility of anti hash test case and collisions. He then decided it was the end of it , He said he's done with the interview and asked me if I had any questions. I asked him to explain to me the management system at Amazon, he breifed me about it. I then apologized to him for stammering in the interview and told him that it was my first ever interview, he said it's not a problem at all. He then told me to go , there was no HR round because it was already close to 9:00 pm so I picked up my bag and trudged back home.

Aftermath
Results came close to 11:30 pm and I was selected I was 6th in the overall list really happy after the selections. 12 people were selected. 9 from COE and 3 from IT. I went on to inform my parents about the result.
Me :- "Mom, Dad I got selected for an internship at Amazon."
Them :- "Go to sleep it's pretty late. We will talk tommorow."

Yes the congratulations have been pouring in from many people, I would like to thank everyone for their wishes, and yes the party is due. I would still consider myself to be lucky to have such nice interviewers and managing to clear both the rounds.


Thanks for reading guys,
will write more frequently if possible

Tuesday 24 February 2015

Number Theory February

An Editorial on Hackerrank gave a really nice review of certain number theory concepts:-

Namely
  1. Lucas' Theorem to determinemodulo p where p is prime.
  2. Chineese remainder theoram to find x modulo (a1.a2)  where x modulo(a1) and x modulo(a2) are given.
  3. Extended Euclidean algorithm to find pair of integers c1,c2 such that. 
So remaining posts will be about these concepts only.

 Lucas' theorem
                                        \binom{m}{n}\equiv\prod_{i=0}^k\binom{m_i}{n_i}\pmod p,
Where
                            m=m_kp^k+m_{k-1}p^{k-1}+\cdots +m_1p+m_0,
                                n=n_kp^k+n_{k-1}p^{k-1}+\cdots +n_1p+n_0
Images taken from :- Lucas Theorem Wikipedia

But these images don't provide much of an explaination as such, (at least when I read them)
The approach of Lucas' thoram can be summarized by a simple procedure In the following manner. To find the  modulp p where p is a prime
  1. Express m in the base p 
  2. Express n in the base p
  3. Incase the expansion of n contains less number of digits than m, pad leading zeroes to the left of the expansion of n
  4. Take digitwise combination of each pair of digits.
  5. Multiply the obtained terms to find the value of desired combination and take modulo p.
As usual the proof is still not touched because haven't been able to prove it using my own arguements do look at the wikipedia article for the proof incase needed.

Now will be building our number theory library so hence I will be posting the codes for the above mentioned theorams:-

Python :-
  1. from math import factorial  
  2. f=factorial  
  3. def ncr(a,b):  
  4.     if b>a:  
  5.         return 0  
  6.     return f(a)/(f(a-b)*f(b))  
  7. def conv(a,b):  
  8.     s=[]  
  9.     while a:  
  10.         s.append(a%b)  
  11.         a=a//b  
  12.     return s[::-1]  
  13. def main():  
  14.     n,m,p=map(int,raw_input().split()) #input m , n and the prime  
  15.     k1=conv(n,p) #convert to base p  
  16.     k2=conv(m,p)  
  17.     if len(k2) < len(k1):  
  18.         k2= [0]*(len(k1)-len(k2))+k2 #pad in leading zeroes  
  19.     l=[]  
  20.     for i in xrange(len(k1)):  
  21.         l.append(ncr(k1[i],k2[i])) #take digitwise ncr  
  22.     prod=1  
  23.     for i in l:  
  24.         prod*=i  
  25.         prod%=p  
  26.     print prod  
  27. main()  





Tuesday 3 February 2015

Musings on Topoogical Sort

A fairly recent Codeforces round made me apply the Topological Sorting algorithm. And the make use of this piece of code.

  1. #define MAXN 1000010  
  2. vector<int>graph[MAXN];  
  3. bool vis[MAXN];  
  4. int n=MAXN;  
  5. void topodef(int a,stack<int>&s)//topo DFS  
  6. {  
  7.     vis[a]=true;  
  8.     for(vector<int>::iterator it=graph[a].begin();it!=graph[a].end();it++)  
  9.     {  
  10.         if(!vis[*it])  
  11.         {  
  12.             topodef(*it,s);  
  13.         }  
  14.     }  
  15.     s.push(a);  
  16. }  
  17.   
  18. void toposort()  
  19. {  
  20.     stack<int>s;  
  21.     fill(vis,vis+n,false);  
  22.     for(int i=0;i<n;i++)  
  23.     {  
  24.         if(!vis[i])  
  25.         {  
  26.             topodef(i,s);  
  27.         }  
  28.     }  
  29.     while(!s.empty())  
  30.     {  
  31.         cout<<(char)(s.top());  
  32.         s.pop();  
  33.     }  
  34.     cout<<endl;  
  35. }  

Now not to mention the fact that I was in doubt about one concept of Topological sort. How does this Algorithm selects the node with the least indegree and outputs it first ? And I cleared this doubt of mine fairly recently. And constructed this example to assist the reasoning further.
Let's assume the graph is like this

Not to mention the fact that it's acyclic and directed.
So let's see what happens when we call our toposort sub routine.
we have:-
1) Graph {{5},{1},{1},{1},{}} //graph represented as an adjaceny list.
2) A stack of integers S={} // blank currently
3) Boolean array vis[5]={false,false,false,false,false}
We find the first node which is currently unvisited by looping and it turns out to be 1 in our 1 based notation graph  (lets stick to 1 based notation for a while).
so we find 1 is the current unvisited node.
Now we dfs from node 1 remember the stack is still empty and vis[1]=true.
since the adjaceny list of 1 is {5}
we again perform the dfs with node  5 since adj list of node  5 is empty
S={} , vis[5]={true,false,false,false,true}
now we return back to the parent call and prior to that we push node  5 into the stack
S={5} , vis[5]={true,false,false,false,true}
now since no other element is present in adjacency list of 1we return to Toposort() but only after pushing node 1 into the stack
S={1,5} , vis[5]={true,false,false,false,true}
now we find that next unvisited node is node  2
so we mark it visited and dfs from node  2
S={1,5} , vis[5]={true,true,false,false,true}
since 1 (the only element in the adjacency list of 2) is visited already, we push node 2 into stack and return
S={2,1,5} , vis[5]={true,true,false,false,true}
again the same case for node 3 as that of node 2
S={3,2,1,5} , vis[5]={true,true,true,false,true}
and finally same case for node 4 in the graph
S={4,3,2,1,5} , vis[5]={true,true,true,true,true}
So there you have it the topological ordering 4,3,2,1,5

So as a footnote I can say is ahem!
It isn't necessary that the vertex which the Topological sort subroutine choses first, should have a minimum indegree the recursive nature of topological sort will push the ones having a less indegree than the current chosen node above the chosen node in the stack (not to mention again the acyclic and directed nature of graph are followed).

Thanks for reading guys
Dobranich ...A little bit of Ukrainian I learnt recently :)