algorithmsortingexternal-sorting

what is the complexity of parallel external sort


I'm wondering what is the complexity when i making parallel external sort.

Suppose I have big array N and limited memory. F.e 1 billion entries to sort and only 1k in entries memory.

for this case i've splitted the big array into K sorted files with chunk size B using parallel threads , and save in Disk.

After that read from all files , merged back to new array using with priprityQueue and threads.

I need to calculate the complexity with big O notation.

and what happened to complexity if i would use multi process lets say N processors ?

is it ~O(N/10 * log N) ?? 

thanks


Solution

  • The time complexity is going to be O(n log(n)) regardless of the number of processors and/or the number of external drives. The total time will be T(n/a logb(n)), but since a and b are constants, the time complexity remains the same at O(n log(n)), even if the time is say 10 times as fast.

    It's not clear to me what you mean by "parallel" external sort. I'll assume multiple cores or multiple processors, but are there also multiple drives? Do all N cores or processors share the same memory that only holds 1k elements or does each core or processor have its own "1k" of memory (in effect having "Nk" of memory)?


    external merge sort in general

    On the initial pass, the input array is read in chunks of size B, (1k elements), sorted, then written to K sorted files. The end result of this initial pass is K sorted files of size B (1k elements). All of the remaining passes will repeatedly merge the sorted files until a single sorted file is produced.

    The initial pass is normally cpu bound, and using multiple cores or processors for sorting each chunk of size B will reduce the time. Any sorting method or any stable sorting method can be used for the initial pass.

    For the merge phase, being able to perform I/O in parallel with doing merge operations will reduce the time. Using multi-threading to overlap I/O with merge operations will reduce time and be simpler than using asynchronous I/O to do the same thing. I'm not aware of a way to use multi-threading to reduce the time for a k-way merge operation.

    For a k-way merge, the files are read in smaller chunks of size B/(k+1). This allows for k input buffers and 1 output buffer for a k-way merge operation.

    For hard drives, random access overhead is an issue, say transfer rate is 200 MB/s, and average random access overhead is 0.01 seconds, which is the same amount of time to transfer 2 MB. If buffer size is 2 MB, then random access overhead effectively cuts transfer rate by 1/2 to ~100 MB/s. If buffer size is 8 KB, then random access overhead effectively cuts transfer rate by 1/250 to ~0.8 MB/s. With a small buffer, a 2-way merge will be faster, due to the overhead of random access.

    For SSDs in a non-server setup, usually there's no command queuing, and command overhead is about .0001 second on reads, .000025 second on writes. Transfer rate is about 500 MB/s for Sata interface SSDs. If buffer size is 2MB, the command overhead is insignificant. If buffer size is 4KB, then read rate is cut by 1/12.5 to ~ 40 MB/s, and write rate cut by 1/3.125 to ~160 MB/s. So if buffer size is small enough, again a 2-way merge will be faster.

    On a PC, these small buffer scenarios are unlikely. In the case of the gnu sort for huge text files, with default settings, it allocates a bit over 1GB of ram, creating 1GB sorted files on the initial pass, and does a 16-way merge, so buffer size is 1GB/17 ~= 60 MB. (The 17 is for 16 input buffers, 1 output buffer).


    Consider the case of where all of the data fits in memory, and that the memory consists of k sorted lists. The time complexity for merging the lists will be O(n log(k)), regardless if a 2-way merge sort is used, merging pairs of lists in any order or if a k-way merge sort is used to merge all the lists in one pass.

    I did some actual testing of this on my system, Intel 3770K 3.5ghz, Windows 7 Pro 64 bit. For a heap based k-way merge, with k = 16, transfer rate ~ 235 MB/sec, with k = 4, transfer rate ~ 495 MB/sec. For a non-heap 4-way merge, transfer rate ~ 1195 MB/sec. Hard drive transfer rates are typically 70 MB/sec to 200 MB/sec. Typical SSD transfer rate is ~500 MB/sec. Expensive server type SSDs (SAS or PCIe) are up to ~2GB/sec read, ~1.2GB/sec write.