## praalhans

It seems to me that, with the widespread deployment of many-core servers in large, well-connected data centers, it is reasonable to analyse algorithmic complexity not only in terms of primitive computational steps on a single processor, but also under the assumption that a large number of logical processors is available, possibly as many as the original problem's input size.

Let's analyse the complexity of sorting a list.

Classical example: a typical merge sort algorithm, working on an input of size $n$, has a worst-case complexity of $O(n\log n)$ comparison operations. Its output is a list of $n$ indices such that the corresponding elements are in sorted order.
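To make the setup concrete, here is a minimal single-processor sketch (not taken from any particular implementation) of a merge sort that outputs the index permutation rather than the sorted values, and counts the comparisons it performs:

```python
def merge_sort_indices(xs):
    """Return (order, comparisons): a permutation of indices such that
    xs[order[0]] <= xs[order[1]] <= ..., plus the comparison count."""
    comparisons = 0

    def merge(left, right):
        nonlocal comparisons
        merged = []
        i = j = 0
        while i < len(left) and j < len(right):
            comparisons += 1  # one comparison of list elements
            if xs[left[i]] <= xs[right[j]]:
                merged.append(left[i]); i += 1
            else:
                merged.append(right[j]); j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    def sort(idx):
        if len(idx) <= 1:
            return idx
        mid = len(idx) // 2
        return merge(sort(idx[:mid]), sort(idx[mid:]))

    return sort(list(range(len(xs)))), comparisons

order, comps = merge_sort_indices([3, 1, 2])
# order == [1, 2, 0]: element 1 (value 1) first, then 2, then 0
```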

Distributed example: assume we have $n$ processors available, each processor having access to all $n$ elements of the list. The processors are fully connected, and we assume zero communication overhead. We want to measure the number of *necessarily sequential* comparison operations, up to the point in time that a single processor knows the output of the algorithm.

Clearly, every processor can run the merge sort algorithm individually, and we end up with $O(n\log n)$ again. Can we do better by sharing information? My take would be the following: each processor is assigned a unique element of the list (say, each processor has an identifier that uniquely corresponds to an element). Each processor then walks over the list, counting the number of elements smaller than its designated element; this count is exactly the element's rank in the sorted order. Finally, it sends this number to a predefined master processor. Of course, this algorithm only works for lists with unique elements.

The complexity of this algorithm in terms of comparisons per processor is essentially that of a linear scan, $O(n)$. It sends $O(n)$ messages, one per processor, whereas the classical single-processor merge sort sends none. Perhaps better distributed sorting algorithms exist.

Obviously there is a shortcoming to this assumption: sorting a list of 1,000,000,000,000 elements requires just as many processors, which at present may be unrealistic. Still, there is no reason why this would not become a realistic assumption in the near future.

As a corollary of this assumption, we forcefully collapse $P = NP$ by taking this approach, since we can let the processors each take different non-deterministic choices that are explored in parallel. (Note that this requires the number of processors to match the number of non-deterministic branches, which may be exponential in $n$ rather than equal to $n$.)
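The branching idea can be illustrated with a toy SAT check (a hypothetical example, not from the post): each candidate truth assignment corresponds to one non-deterministic branch, and each branch is verified in polynomial time by its own simulated "processor". Only the number of branches, $2^k$ for $k$ variables, is large:

```python
from itertools import product

def sat_parallel(clauses, k):
    """clauses: CNF as a list of clauses, each literal a (var_index, polarity)
    pair. Returns True iff some assignment of the k variables satisfies all
    clauses. Each check() call is the polynomial-time work of one branch."""
    def check(assignment):
        return all(any(assignment[v] == p for v, p in clause)
                   for clause in clauses)
    # Sequentially enumerate the 2^k branches; in the model, each branch
    # would run on its own processor simultaneously.
    return any(check(a) for a in product([False, True], repeat=k))

# (x0 or x1) and (not x0 or x1) is satisfiable, e.g. with x1 = True
sat_parallel([[(0, True), (1, True)], [(0, False), (1, True)]], 2)  # → True
```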

Does anyone agree/disagree with this notion of measuring algorithms? Perhaps someone can point me to relevant literature?