Thanks to Shan and Yasar for taking the time to answer my question in the previous post!
I don't know the exact answer either, but I am happy with your answers. So, you may be called 'Mr. Smarts' going forward! :)
Ok. Let’s once again become energetic Comp.Science students and discuss this issue in detail.
Question is: "What's the most efficient way to sort a million 32-bit integers?"
I searched a lot on Google for the answer, but found no direct one. Google is very smart at asking interview questions whose answers you probably can't just find on the internet.
Let it be.
My answer goes like this:
In Comp.Science, there are lots of classical problems. When you encounter such a question, try to see if you can match it with some existing problem, or try to find the base 'Problem Class/Type'. Yes, you can map most problems to a Data Structure or Algorithm!
Consider the priorities and scenarios in this case.
(Before that, remember Sorting Thumb Rule #1: Bubble Sort is the wrong way to go! :) )
1. If memory is a constraint:
If you need to sort a million integers on a system with a tight memory constraint, go for External Sorting, Merge Sort, or a combined External Merge Sort strategy.
Merge Sort takes only O(n log n) time (in the average case as well as the worst case)! The combined external sort mechanism applies the same 'Divide and Conquer' idea: sort chunks that fit in memory, spill each sorted run to disk, then merge the runs.
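A minimal sketch of that external merge idea in Python (the chunk size and the text-file run format are just illustrative assumptions, not a tuned implementation):

```python
import heapq
import tempfile

def external_sort(numbers, chunk_size=100_000):
    """Sort ints that may not fit in memory: sort fixed-size chunks
    in memory, spill each sorted run to a temp file, then k-way
    merge all the runs with a heap."""
    run_files = []

    def spill(chunk):
        f = tempfile.TemporaryFile(mode="w+")
        f.writelines(f"{x}\n" for x in sorted(chunk))  # in-memory sort of one chunk
        f.seek(0)
        run_files.append(f)

    chunk = []
    for x in numbers:
        chunk.append(x)
        if len(chunk) >= chunk_size:
            spill(chunk)
            chunk = []
    if chunk:
        spill(chunk)

    # Each run is already sorted, so heapq.merge streams out the
    # global order while keeping only one line per run in memory.
    runs = ((int(line) for line in f) for f in run_files)
    yield from heapq.merge(*runs)
```

For example, `list(external_sort([5, 3, 1, 4, 2], chunk_size=2))` merges three tiny runs back into one sorted sequence.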
2. If speed is the priority (and not memory):
Three choices: Quick Sort, Heap Sort, Merge Sort.
Quick Sort: Gives you O(n log n) speed on average. But this sorting approach is a little dangerous: with a naive pivot choice (say, always the first or last element), input that is close to already-sorted slows Quick Sort down to O(n^2). A random pivot largely avoids this trap.
Also, Quick Sort is an unstable sort (though stability doesn't matter for plain integers)! Quick Sort is based on recursion (recursion means stack, and stack means more memory), so the repeated recursive calls consume extra stack space.
But you stand a good chance to use parallelism. That's because Quick Sort creates lots of independent partitions, so you can distribute these partitions across various processing units (say, across a cluster).
One important thing: if you have N processors and each processor handles a partition, the sorting work drops to roughly O((n log n)/N).
That is a near-linear speedup in the number of processors (though partitioning and combining the results add some overhead)!
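Here is a small sketch of the random-pivot fix mentioned above (this list-based version trades Quick Sort's in-place property for clarity; it is an illustration, not a production sort):

```python
import random

def quicksort(a):
    """Recursive Quick Sort with a RANDOM pivot, so nearly-sorted
    input no longer triggers the O(n^2) worst case."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    # 'less' and 'greater' are independent sub-problems -- this is
    # exactly what makes Quick Sort easy to farm out to parallel workers.
    return quicksort(less) + equal + quicksort(greater)
```

Even an already-sorted input like `list(range(1000))` now sorts without the quadratic blow-up, because the pivot is chosen at random rather than from a fixed position.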
Heap Sort:
Takes very little memory – it sorts in place with O(1) extra space. Good point!
Runs in O(n log n) time, even in the worst case – very good point! :)
Not as easy to partition for parallelism as Quick Sort, though.
One should definitely consider Heap Sort when speed is the priority.
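As a quick illustration, Heap Sort can be sketched with Python's heapq module (a textbook in-place version would heapify the array and sift elements down within it; this copy-based sketch just shows the two phases):

```python
import heapq

def heap_sort(a):
    """Heap Sort: build a min-heap in O(n), then pop the minimum
    n times at O(log n) each -- O(n log n) even in the worst case."""
    heap = list(a)
    heapq.heapify(heap)  # O(n) bottom-up heap construction
    return [heapq.heappop(heap) for _ in range(len(heap))]
```

For example, `heap_sort([5, 1, 4, 2, 3])` pops the heap minimum five times to produce the sorted list.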
PS: I have NOT taken or received this answer from anywhere/anybody! The answer is purely based on my own thoughts, so it MAY NOT be 100% correct!