I am working on an interview question from Glassdoor Software Engineer
The question is
Given a list of one million numbers, how will you find the top n numbers from the list in an efficient way
Here is a solution an author gave from same link
- create a min heap
- take first n of m elements and place in the heap (O(n))
- for each (m-n) remaining elements, if it is greater than find-min of the heap, insert into heap and delete min. (worst case O((m-n)log n) if the list is sorted.
net result is you can do this in O(n) memory usage and worst-case O((m-n)logn) runtime.
I agree with the author's algorithm and the author's assessment of the space complexity of this algorithm. What I have an issue with is the author's analysis of the runtime for insertion into heap and overall time
For the step "take first n of m elements and place in the heap", wouldn't that run in O(nlogn)? At least according to my class notes Heap Add, insertion would be O(logn) and because you are inserting n elements, the runtime of that whole step would be O(nlogn).
Taking that into consideration, wouldn't the overall runtime of this entire algorithm be, using big oh addition from Big Oh Addition
O(nlogn + (m-n)logn) = O(mlogn)