Took some time think over this and I made an assumption, suppose there is no repeating numbers. So we want to find the single missing number ranging from 0 to 2^32-1 unsinged integers in a random sequence.
First, need background of unique factorial theorem.
Second, search prime numbers 32 bit in Internet, you will find we have total about 200,xxx,xxx prime numbers in this scope.
Here is the deal.
suppose we have a_1 ... a_m prime numbers. a_1*...*a_m is the ceiling of 2^32 -1 with any extra prime number needed. I think the count of a_1 to a_m will be far less than 200,xxx,xxx. Think over it?
So put a_1 and a_m in a hashmap. The value will be the count of given prime number.
From 0 to 2^32-1 - 1, the hashmap above will have a unique sequence.
Then we process the input, factor each number in the input data. Count the presence of each prime number (add to the according value in the hashmap). We will detect the prime numbers fail to present, compare to an total sequence of 0 to 2^32-1. Their multiple is the single missed numbers.
If there are multiple missed numbers, we can use a second pass, use the different combinations of the missed primes, screening the input data again.
So it will be good the OS has a local prime table.
Just for discussion for potential improvement of the situation. Please feel free to comment. Thank off irrelevants!
【 在 creation (努力自由泳50m/45sec !) 的大作中提到: 】
: 常见的。 为严谨起见我逐字抄下来
: given a sequential file that contains ____ at most four billion 32-bit
: integers -_____ in random order, find a 32-bit integer that isn't in the
: 简单的办法，如果有4G内存， 就做一个bitmap 可以找到。 书里讨论如果只有几百
: byte内存， 但可以写文件， 怎么弄。
: 书上说，把要找的range 分两半 ， scan 过整个文件，看每一个range 里有多少， 肯
: 定有一个range 不够满，那就继续找那个range, 这样range 指数变小，最后为size 1,
: 就是要找的missing number.
提示: 本博文来自于 JobHunting 版