Sorting and Related Functions

# Sorting and Related Functions

Julia has an extensive, flexible API for sorting and interacting with already-sorted arrays of values. By default, Julia picks reasonable algorithms and sorts in standard ascending order:

``````julia> sort([2,3,1])
3-element Array{Int64,1}:
1
2
3``````

You can easily sort in reverse order as well:

``````julia> sort([2,3,1], rev=true)
3-element Array{Int64,1}:
3
2
1``````

To sort an array in-place, use the "bang" version of the sort function:

``````julia> a = [2,3,1];

julia> sort!(a);

julia> a
3-element Array{Int64,1}:
1
2
3``````

Instead of directly sorting an array, you can compute a permutation of the array's indices that puts the array into sorted order:

``````julia> v = randn(5)
5-element Array{Float64,1}:
0.297288
0.382396
-0.597634
-0.0104452
-0.839027

julia> p = sortperm(v)
5-element Array{Int64,1}:
5
3
4
1
2

julia> v[p]
5-element Array{Float64,1}:
-0.839027
-0.597634
-0.0104452
0.297288
0.382396``````

Arrays can easily be sorted according to an arbitrary transformation of their values:

``````julia> sort(v, by=abs)
5-element Array{Float64,1}:
-0.0104452
0.297288
0.382396
-0.597634
-0.839027``````

Or in reverse order by a transformation:

``````julia> sort(v, by=abs, rev=true)
5-element Array{Float64,1}:
-0.839027
-0.597634
0.382396
0.297288
-0.0104452``````

If needed, the sorting algorithm can be chosen:

``````julia> sort(v, alg=InsertionSort)
5-element Array{Float64,1}:
-0.839027
-0.597634
-0.0104452
0.297288
0.382396``````

All the sorting and order related functions rely on a "less than" relation defining a total order on the values to be manipulated. The `isless` function is invoked by default, but the relation can be specified via the `lt` keyword.

## Sorting Functions

``sort!(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)``

Sort the vector `v` in place. `QuickSort` is used by default for numeric arrays while `MergeSort` is used for other arrays. You can specify an algorithm to use via the `alg` keyword (see Sorting Algorithms for available algorithms). The `by` keyword lets you provide a function that will be applied to each element before comparison; the `lt` keyword allows providing a custom "less than" function; use `rev=true` to reverse the sorting order. These options are independent and can be used together in all possible combinations: if both `by` and `lt` are specified, the `lt` function is applied to the result of the `by` function; `rev=true` reverses whatever ordering specified via the `by` and `lt` keywords.

Examples

``````julia> v = [3, 1, 2]; sort!(v); v
3-element Array{Int64,1}:
1
2
3

julia> v = [3, 1, 2]; sort!(v, rev = true); v
3-element Array{Int64,1}:
3
2
1

julia> v = [(1, "c"), (3, "a"), (2, "b")]; sort!(v, by = x -> x); v
3-element Array{Tuple{Int64,String},1}:
(1, "c")
(2, "b")
(3, "a")

julia> v = [(1, "c"), (3, "a"), (2, "b")]; sort!(v, by = x -> x); v
3-element Array{Tuple{Int64,String},1}:
(3, "a")
(2, "b")
(1, "c")``````
source
``sort(v; alg::Algorithm=defalg(v), lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)``

Variant of `sort!` that returns a sorted copy of `v` leaving `v` itself unmodified.

Examples

``````julia> v = [3, 1, 2];

julia> sort(v)
3-element Array{Int64,1}:
1
2
3

julia> v
3-element Array{Int64,1}:
3
1
2``````
source
``sort(A; dims::Integer, alg::Algorithm=DEFAULT_UNSTABLE, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)``

Sort a multidimensional array `A` along the given dimension. See `sort!` for a description of possible keyword arguments.

Examples

``````julia> A = [4 3; 1 2]
2×2 Array{Int64,2}:
4  3
1  2

julia> sort(A, dims = 1)
2×2 Array{Int64,2}:
1  2
4  3

julia> sort(A, dims = 2)
2×2 Array{Int64,2}:
3  4
1  2``````
source
``sortperm(v; alg::Algorithm=DEFAULT_UNSTABLE, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)``

Return a permutation vector `I` that puts `v[I]` in sorted order. The order is specified using the same keywords as `sort!`. The permutation is guaranteed to be stable even if the sorting algorithm is unstable, meaning that indices of equal elements appear in ascending order.

See also `sortperm!`.

Examples

``````julia> v = [3, 1, 2];

julia> p = sortperm(v)
3-element Array{Int64,1}:
2
3
1

julia> v[p]
3-element Array{Int64,1}:
1
2
3``````
source
``sortperm!(ix, v; alg::Algorithm=DEFAULT_UNSTABLE, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward, initialized::Bool=false)``

Like `sortperm`, but accepts a preallocated index vector `ix`. If `initialized` is `false` (the default), `ix` is initialized to contain the values `1:length(v)`.

Examples

``````julia> v = [3, 1, 2]; p = zeros(Int, 3);

julia> sortperm!(p, v); p
3-element Array{Int64,1}:
2
3
1

julia> v[p]
3-element Array{Int64,1}:
1
2
3``````
source
``sortslices(A; dims, alg::Algorithm=DEFAULT_UNSTABLE, lt=isless, by=identity, rev::Bool=false, order::Ordering=Forward)``

Sort slices of an array `A`. The required keyword argument `dims` must be either an integer or a tuple of integers. It specifies the dimension(s) over which the slices are sorted.

E.g., if `A` is a matrix, `dims=1` will sort rows, `dims=2` will sort columns. Note that the default comparison function on one dimensional slices sorts lexicographically.

For the remaining keyword arguments, see the documentation of `sort!`.

Examples

``````julia> sortslices([7 3 5; -1 6 4; 9 -2 8], dims=1) # Sort rows
3×3 Array{Int64,2}:
-1   6  4
7   3  5
9  -2  8

julia> sortslices([7 3 5; -1 6 4; 9 -2 8], dims=1, lt=(x,y)->isless(x,y))
3×3 Array{Int64,2}:
9  -2  8
7   3  5
-1   6  4

julia> sortslices([7 3 5; -1 6 4; 9 -2 8], dims=1, rev=true)
3×3 Array{Int64,2}:
9  -2  8
7   3  5
-1   6  4

julia> sortslices([7 3 5; 6 -1 -4; 9 -2 8], dims=2) # Sort columns
3×3 Array{Int64,2}:
3   5  7
-1  -4  6
-2   8  9

julia> sortslices([7 3 5; 6 -1 -4; 9 -2 8], dims=2, alg=InsertionSort, lt=(x,y)->isless(x,y))
3×3 Array{Int64,2}:
5   3  7
-4  -1  6
8  -2  9

julia> sortslices([7 3 5; 6 -1 -4; 9 -2 8], dims=2, rev=true)
3×3 Array{Int64,2}:
7   5   3
6  -4  -1
9   8  -2``````

Higher dimensions

`sortslices` extends naturally to higher dimensions. E.g., if `A` is a a 2x2x2 array, `sortslices(A, dims=3)` will sort slices within the 3rd dimension, passing the 2x2 slices `A[:, :, 1]` and `A[:, :, 2]` to the comparison function. Note that while there is no default order on higher-dimensional slices, you may use the `by` or `lt` keyword argument to specify such an order.

If `dims` is a tuple, the order of the dimensions in `dims` is relevant and specifies the linear order of the slices. E.g., if `A` is three dimensional and `dims` is `(1, 2)`, the orderings of the first two dimensions are re-arranged such such that the slices (of the remaining third dimension) are sorted. If `dims` is `(2, 1)` instead, the same slices will be taken, but the result order will be row-major instead.

Higher dimensional examples

``````julia> A = permutedims(reshape([4 3; 2 1; 'A' 'B'; 'C' 'D'], (2, 2, 2)), (1, 3, 2))
2×2×2 Array{Any,3}:
[:, :, 1] =
4  3
2  1

[:, :, 2] =
'A'  'B'
'C'  'D'

julia> sortslices(A, dims=(1,2))
2×2×2 Array{Any,3}:
[:, :, 1] =
1  3
2  4

[:, :, 2] =
'D'  'B'
'C'  'A'

julia> sortslices(A, dims=(2,1))
2×2×2 Array{Any,3}:
[:, :, 1] =
1  2
3  4

[:, :, 2] =
'D'  'C'
'B'  'A'

julia> sortslices(reshape([5; 4; 3; 2; 1], (1,1,5)), dims=3, by=x->x[1,1])
1×1×5 Array{Int64,3}:
[:, :, 1] =
1

[:, :, 2] =
2

[:, :, 3] =
3

[:, :, 4] =
4

[:, :, 5] =
5``````
source

## Order-Related Functions

``issorted(v, lt=isless, by=identity, rev:Bool=false, order::Ordering=Forward)``

Test whether a vector is in sorted order. The `lt`, `by` and `rev` keywords modify what order is considered to be sorted just as they do for `sort`.

Examples

``````julia> issorted([1, 2, 3])
true

julia> issorted([(1, "b"), (2, "a")], by = x -> x)
true

julia> issorted([(1, "b"), (2, "a")], by = x -> x)
false

julia> issorted([(1, "b"), (2, "a")], by = x -> x, rev=true)
true``````
source
``searchsorted(a, x; by=<transform>, lt=<comparison>, rev=false)``

Return the range of indices of `a` which compare as equal to `x` (using binary search) according to the order specified by the `by`, `lt` and `rev` keywords, assuming that `a` is already sorted in that order. Return an empty range located at the insertion point if `a` does not contain values equal to `x`.

Examples

``````julia> a = [4, 3, 2, 1]
4-element Array{Int64,1}:
4
3
2
1

julia> searchsorted(a, 4)
5:4

julia> searchsorted(a, 4, rev=true)
1:1``````
source
``searchsortedfirst(a, x; by=<transform>, lt=<comparison>, rev=false)``

Return the index of the first value in `a` greater than or equal to `x`, according to the specified order. Return `length(a) + 1` if `x` is greater than all values in `a`. `a` is assumed to be sorted.

Examples

``````julia> searchsortedfirst([1, 2, 4, 5, 14], 4)
3

julia> searchsortedfirst([1, 2, 4, 5, 14], 4, rev=true)
1

julia> searchsortedfirst([1, 2, 4, 5, 14], 15)
6``````
source
``searchsortedlast(a, x; by=<transform>, lt=<comparison>, rev=false)``

Return the index of the last value in `a` less than or equal to `x`, according to the specified order. Return `0` if `x` is less than all values in `a`. `a` is assumed to be sorted.

Examples

``````julia> searchsortedlast([1, 2, 4, 5, 14], 4)
3

julia> searchsortedlast([1, 2, 4, 5, 14], 4, rev=true)
5

julia> searchsortedlast([1, 2, 4, 5, 14], -1)
0``````
source
``partialsort!(v, k; by=<transform>, lt=<comparison>, rev=false)``

Partially sort the vector `v` in place, according to the order specified by `by`, `lt` and `rev` so that the value at index `k` (or range of adjacent values if `k` is a range) occurs at the position where it would appear if the array were fully sorted via a non-stable algorithm. If `k` is a single index, that value is returned; if `k` is a range, an array of values at those indices is returned. Note that `partialsort!` does not fully sort the input array.

Examples

``````julia> a = [1, 2, 4, 3, 4]
5-element Array{Int64,1}:
1
2
4
3
4

julia> partialsort!(a, 4)
4

julia> a
5-element Array{Int64,1}:
1
2
3
4
4

julia> a = [1, 2, 4, 3, 4]
5-element Array{Int64,1}:
1
2
4
3
4

julia> partialsort!(a, 4, rev=true)
2

julia> a
5-element Array{Int64,1}:
4
4
3
2
1``````
source
``partialsort(v, k, by=<transform>, lt=<comparison>, rev=false)``

Variant of `partialsort!` which copies `v` before partially sorting it, thereby returning the same thing as `partialsort!` but leaving `v` unmodified.

source
``partialsortperm(v, k; by=<transform>, lt=<comparison>, rev=false)``

Return a partial permutation `I` of the vector `v`, so that `v[I]` returns values of a fully sorted version of `v` at index `k`. If `k` is a range, a vector of indices is returned; if `k` is an integer, a single index is returned. The order is specified using the same keywords as `sort!`. The permutation is stable, meaning that indices of equal elements appear in ascending order.

Note that this function is equivalent to, but more efficient than, calling `sortperm(...)[k]`.

Examples

``````julia> v = [3, 1, 2, 1];

julia> v[partialsortperm(v, 1)]
1

julia> p = partialsortperm(v, 1:3)
3-element view(::Array{Int64,1}, 1:3) with eltype Int64:
2
4
3

julia> v[p]
3-element Array{Int64,1}:
1
1
2``````
source
``partialsortperm!(ix, v, k; by=<transform>, lt=<comparison>, rev=false, initialized=false)``

Like `partialsortperm`, but accepts a preallocated index vector `ix`. If `initialized` is `false` (the default), `ix` is initialized to contain the values `1:length(ix)`.

source

## Sorting Algorithms

There are currently four sorting algorithms available in base Julia:

• `InsertionSort`
• `QuickSort`
• `PartialQuickSort(k)`
• `MergeSort`

`InsertionSort` is an O(n^2) stable sorting algorithm. It is efficient for very small `n`, and is used internally by `QuickSort`.

`QuickSort` is an O(n log n) sorting algorithm which is in-place, very fast, but not stable – i.e. elements which are considered equal will not remain in the same order in which they originally appeared in the array to be sorted. `QuickSort` is the default algorithm for numeric values, including integers and floats.

`PartialQuickSort(k)` is similar to `QuickSort`, but the output array is only sorted up to index `k` if `k` is an integer, or in the range of `k` if `k` is an `OrdinalRange`. For example:

``````x = rand(1:500, 100)
k = 50
k2 = 50:100
s = sort(x; alg=QuickSort)
ps = sort(x; alg=PartialQuickSort(k))
qs = sort(x; alg=PartialQuickSort(k2))
map(issorted, (s, ps, qs))             # => (true, false, false)
map(x->issorted(x[1:k]), (s, ps, qs))  # => (true, true, false)
map(x->issorted(x[k2]), (s, ps, qs))   # => (true, false, true)
s[1:k] == ps[1:k]                      # => true
s[k2] == qs[k2]                        # => true``````

`MergeSort` is an O(n log n) stable sorting algorithm but is not in-place – it requires a temporary array of half the size of the input array – and is typically not quite as fast as `QuickSort`. It is the default algorithm for non-numeric data.

The default sorting algorithms are chosen on the basis that they are fast and stable, or appear to be so. For numeric types indeed, `QuickSort` is selected as it is faster and indistinguishable in this case from a stable sort (unless the array records its mutations in some way). The stability property comes at a non-negligible cost, so if you don't need it, you may want to explicitly specify your preferred algorithm, e.g. `sort!(v, alg=QuickSort)`.

The mechanism by which Julia picks default sorting algorithms is implemented via the `Base.Sort.defalg` function. It allows a particular algorithm to be registered as the default in all sorting functions for specific arrays. For example, here are the two default methods from `sort.jl`:

``````defalg(v::AbstractArray) = MergeSort
defalg(v::AbstractArray{<:Number}) = QuickSort``````

As for numeric arrays, choosing a non-stable default algorithm for array types for which the notion of a stable sort is meaningless (i.e. when two values comparing equal can not be distinguished) may make sense.