Thursday, August 26, 2010

Simple Quicksort in F#

Here is a very simple quicksort function in F#. Quicksort is a generally fast, general purpose sorting algorithm. It works well because it is a divide and conquer algorithm that does all of its work in place.

Quicksort implementations can be cache friendly because it keeps working on smaller and smaller portions of the data set. If the data is in an array, the computer won't have as many cache misses while the algorithm is running. In modern computers, memory bandwidth is one of the limiting factors in how fast things can go.

F# is a multi-paradigm language, which is a good thing for Quicksort. Although I like working with functional code, this code looks a lot nicer when we use mutable state and arrays.

This implementation does the bare minimum needed. The fanciest part of the code is the median-of-three pivot selection. The reason I included this pivot selection is because the rest of the code is simpler if we can assume that the first element of a subarray is smaller than the current pivot. The loops inside the main partition function get a little cleaner.

In an F# program, code must be defined before it is used. So, the main function here is at the bottom of the listing. To get a top down view, read from the bottom up, or vice versa.

The main algorithm consists of two steps (besides checking for empty subarrays)

  1. Pick a pivot element, and partition the sub-array into a chunk less than or equal to the pivot, and a chunk greater than or equal to the pivot.
  2. Recursively sort the subarrays.

The recursion stops when we have a subarray of length 1. In the code here, we have another case for when the subarray has length 2.

There are many other improvements that we can make to the basic algorithm. In the future I may implement some of them, and also, I would like to show how to test this code with FsCheck, a randomized test framework.

let inline swap (data:'a[]) a b =
    let t = data.[a]
    data.[a] <- data.[b]
    data.[b] <- t
/// Sort three values from an array.  Put the smallest
/// at index a, biggest at c, and median at b.    
let inline median_of_three (data:'a[]) a b c =
    if data.[b] < data.[a] then
        swap data a b
    if data.[c] < data.[b] then
        swap data c b
        if data.[b] < data.[a] then
            swap data a b 

/// partition the data from index a to b (inclusive)
/// use the element at position b as the pivot, and
/// assume data.[a] is <= data.[b].
let inline partition (data:'a[]) a b =
    // After this is called, return the new position
    // of the pivot value.  After the call, all values before
    // the pivot position should be <= to the pivot, and all values
    // after should be >= the pivot.
    let mutable i = a - 1
    let mutable j = b
    let pivot = data.[b]
    while i < j do
        i <- i + 1
        while data.[i] < pivot do
            i <- i + 1

        j <- j - 1
        while data.[j] > pivot do
            j <- j - 1
        if i < j then
            swap data i j
    // Now, i is >= j.  data.[i] is >= pivot.
    // data.[j] <= pivot.  So, move the pivot
    // to the middle where it belongs, and move
    // the value at i to the end of the array.
    swap data i b


let inline sort (data:'a[]) =
    let rec loop a b =
        if a >= b then ()
        elif a + 1 = b then
            if data.[a] > data.[b] then swap data a b
            let c = (a+b)/2
            median_of_three data a b c
            let p = partition data a b
            loop a (p-1)
            loop (p+1) b
    loop 0 (data.Length - 1) | CBAMC

No comments: