Friday, March 4, 2011

NetApp的Phone Interview惨败(四)

8. 给我9999个数字,数值连续从1到10000,但缺了一个。要求用一种非常简洁的方法得出缺了哪个数字?

完全不会,面试官的答案是从1连续加到10000,得到A;再把给定的9999个数字加起来,得到B;然后A - B即可。

他后来又说,如果是9998个数字,数值从1到10000,缺两个。这种情况该怎么处理?

以下是我在mitbbs上求助和高手的应助:
---
给我9999个数字,数值连续从1到10000,但缺了一个。要求用一种非常简洁的方法得出缺了哪个数字?

by johny

方法很多:
a. 全部加起来,看缺多少,就是哪个数
b. 1xor2xor...10000,然后继续xor给的9999个数字

by Chevy

这第一个方法就是面试官后来跟我说的。不过他后来又说,如果是9998个数字,数值从1到10000,缺两个,加法就貌似不行了。这种情况下,该用什么方法呢?

by johny

两个的话,可以解方程:
x+y+其他9998个数 = 1+...+10000
x^2+y^2+...=1^2+...+10000^2

by Chevy

两个仍然用xor
不然很可能溢出

by fzblg

不明白这个xor解法的原理是什么,可以简单讲一下或给个参考文献吗?

by johny

xor有个性质就是,对于一个数a,a xor a = 0
所以用这个性质其他9999个数都消掉了。

by Chevy

没找到,我试着讲一下:仍然按照Chevy的方法做。

如果是一个数:最后的结果就是M。
如果是两个数:结果就是M1 xor M2。

从这个结果中挑一个bit为1的,就可以把原数组划分为两组。missing numbers分别在这两个组中。

然后,按照1个数的方法再分别做一遍。据说这个思路可以扩展到3个,4个数去。

有个类似的http://geeksforgeeks.org/?p=2457

by fzblg
---
从这些帖子,我可以理解一个数的解法。但对于两个数的解法,还是有些迷惑。



去到fzblg提供的链接,仔细读了一番,终于基本搞明白了。

Given an array in which all numbers except two are repeated once. (i.e. we have 2n+2 numbers and n numbers are occurring twice and remaining two have occurred once). Find those two numbers in the most efficient way.

Method 1(Use Sorting)
First sort all the elements. In the sorted array, by comparing adjacent elements we can easily get the non-repeating elements. Time complexity of this method is O(nLogn).

Method 2(Use XOR)
Let x and y be the non-repeating elements we are looking for and arr[] be the input array. First calculate the XOR of all the array elements.

        xor = arr[0]^arr[1]^arr[2].....arr[n-1]

All the bits that are set in xor will be set in one non-repeating element (x or y) and not in other. So if we take any set bit of xor and divide the elements of the array in two sets – one set of elements with same bit set and other set with same bit not set. By doing so, we will get x in one set and y in another set. Now if we do XOR of all the elements in first set, we will get first non-repeating element, and by doing same in other set we will get the second non-repeating element.

Let us see an example.
        arr[] = {2, 4, 7, 9, 2, 4}
1) Get the XOR of all the elements.
        xor = 2^4^7^9^2^4 = 14 (1110)
2) Get a number which has only one set bit of the xor.
    Since we can easily get the rightmost set bit, let us use it.
        set_bit_no = xor & ~(xor-1) = (1110) & ~(1101) = 0010
    Now set_bit_no will have only set as rightmost set bit of xor.
3) Now divide the elements in two sets and do xor of
    elements in each set, and we get the non-repeating
    elements 7 and 9. Please see implementation for this
    step.

Implementation:

#include <stdio.h>
#include <stdlib.h>

/* This function sets the values of *x and
   *y to non-repeating elements in an array
   arr[] of size n */
void get2NonRepeatingNos(int arr[], int n, int *x, int *y)
{
  int xor = arr[0]; /* Will hold xor of all elements */
  int set_bit_no;  /* Will have only single set bit of xor */
  int i;
  *x = 0;
  *y = 0;  

    /* Get the xor of all elements */
  for(i = 1; i < n; i++)
    xor ^= arr[i];

  /* Get the rightmost set bit in set_bit_no */
  set_bit_no = xor & ~(xor-1);

    /* Now divide elements in two sets by comparing
     rightmost set bit of xor with bit at same
     position in each element. */
  for (i=0; i < n; i++)
  {
    if(arr[i] & set_bit_no)
      *x = *x ^ arr[i]; /* XOR of 1st set */
    else
      *y = *y ^ arr[i]; /* XOR of 2nd set */
  }


/* Driver program to test above function */
int main()

  int arr[] = {2, 3, 7, 9, 11, 2, 3, 11};
//johny.0
  /* if we declare "int x" and "int y" here, and
     call "get2NonRepeatingNos(..., &x, &y)", we
     would not need to free them later */
//johny.1
  int *x = (int *)malloc(sizeof(int));
  int *y = (int *)malloc(sizeof(int));

  get2NonRepeatingNos(arr, 8, x, y); 
  printf("The non-repeating elements are %d and %d",
         *x, *y); 
  getchar();

//johny.0
  free(x);
  free(y);

  return 0; 
//johny.1
}

Time Complexity: O(n)
Auxiliary Space: O(1)


Can some1 explain me the logical part of how it works? I find it difficult to analyse it.

by nn

XOR of two same numbers results in 0(000..00).

XOR of two different numbers x and y results in a number which contains set bits at the places where x and y differ. So if x and y are 10...0100 and 11...1001, then result would be 01...1101.

So the idea is to XOR all the elements in set. In the result xor, all repeating elements would nullify each other. The result would contain the set bits where two non-repeating elements differ.

Now, if we take any set bit of the result xor and again do XOR of the subset where that particular bit is set, we get the one non-repeating element. And for other non-repeating element we can take the subset where that particular bit is not set.

We have chosen the rightmost set bit of the xor as it is easy to find out.

by Sandeep


这个帖子和里面的code基本把要点都讲到了,非常好。其实,昨天面试官出的题目本质上就是这个题。设想9998个数的数值是1到10000,缺了两个。然后我再补上连续的1到10000,原来找漏的问题不就是变成找出两个non-repeated numbers了吗?


对于缺两个数的情形,程锦后来提出一个巧妙的算法:在1~10000中,大致找到一个中点,比如5000这个数,分别计算1+...+5000 = A;5001+...+10000 = B。接着再把给定的9998个数分成两堆,一堆是1到5000,另一堆是5001到10000,分别求两堆的和,设为C和D。然后A-B,再C-D,如果两个差俱不为0,那么说明漏掉的数一边一个,得出的两个差值就是漏掉的两个数。如果任何一个差为0,就说明两个漏掉的数在另一堆里面。假设是在5001到10000这一堆里面,则下一步为在这一堆里面找一个中点,再重复上面的步骤。这样,总有一个时刻,会有两个差都不为0的时候。

这是个类似binary search的解法。虽然程锦没有CS的背景,可她有一颗比我聪明得多的脑袋,呵呵。

No comments:

Post a Comment