sort
sort – general sort function
Calling sequence
[y [,p]] = sort(x)
[y [,p]] = sort(x, type=type_of_sort, dir=order)
[y, p] = sort(x, type=type_of_sort, dir=order, ind_type=ptype)
[y [,p]] = sort(x, type_of_sort, order)
[y, p] = sort(x, type_of_sort, order, ptype)
Parameters
- x: matrix or vector of numbers or strings.
- type=type_of_sort: a string among
"g","gb","gm","gs","c","r","ldc","ldr","lc","lr" (default is "g")
- dir=order: must be "i" for an increasing sort or "d" for a decreasing sort (current
default)
- ind_type=ptype: a string among "int","double" (default is "double")
- y: sorted output
- p: vector (or matrix) of indices such that y=x(p) or y=x(:,p) or y=x(p,:) or other
meanings depending of the sort type. p could be stored in an array of floating point
numbers (Mat) (the default) or in an array of usual C integers (IMat of itype int32) if
you use ind_type = "int"
Description
This function is an interface for various kind of sorts depending of the input argument
type_of_sort:
type_of_sort = "g","gb","gm","gs"
In this case the function sorts the vector x (if x is a
matrix it is considered as a big vector using the major column order) in decreasing order by default
or when order="d" and in increasing order when order="i".
|
|
| "g" | quick sort a la Bentley, McIlroy's ”Engineering a Sort Function” |
| | (tends to be the fastest when x have many non unique elements) |
|
|
| "gb" | quick sort a la Sedgewick |
| | (tends to be the fastest when x have few non unique elements) |
|
|
| "gm" | merge sort |
| | generally slower than the 2 first by always in O(nlog(n)) |
|
|
| "gs" | stable quick sort |
| | (adapted quick sort a la Sedgewick) |
|
|
| |
Currently only "g" and "gs" are available for string vectors.
type_of_sort = "c"
In this case each row of x is sorted independently of the others. p is a
matrix of indices and y(i,:) should be equal to x(i,p(i,:)).
type_of_sort = "r"
In this case each column of x is sorted independently of the others. p is a
matrix of indices and y(:,j) should be equal to x(p(:,j),j).
type_of_sort = "lc"
In this case the columns are sorted using lexical order (increasing order if
order="i"). p is a vector of indices such that y should be equal to x(:,p). Important note : for
matrix of floating point numbers a cast double to int is done first, so use type_of_sort = "ldc" to
avoid this problem.
type_of_sort = "ldc"
Lexical columns sorting for matrix of floating point numbers (see option
”lc”)
type_of_sort = "lr"
In this case the rows are sorted using lexical order (increasing order if
order="i"). p is a vector of indices such that y should be equal to x(p,:). Important note : for
matrix of floating point numbers a cast double to int is done first, so use type_of_sort = "ldr" to
avoid this problem.
type_of_sort = "ldr"
Lexical rows sorting for matrix of floating point numbers (see option
”lr”).
Remarks
- For matrix of floating point numbers, Nan values are considered larger than Inf. So all
nans are positionned at the end in case of an increasing sort and at the beginning for a
decreasing sort.
- stable sorts are not provided for other kind of sort than "g" ; if you need such a
feature:
- for lexical sorting of columns (”lc” or ”ldc”), you can add a last row 1:n to your
matrix:
[B,ind] = sort([A;1:size(A,2)],"ldc", "i"); B($,:)=[];
- for lexical sorting of rows (”lr” or ”ldr”), you can add a last column 1:m to your
matrix:
[B,ind] = sort([A,1:size(A,1)],"ldr", "i"); B(:,$)=[];
- for sorting each column independantly (”c”), do a loop with a stable sort (”gm” or ”gs”)
on each column:
B = zeros(size(A)); ind = zeros(size(A));
for j=1:size(A,2), [B(:,j),ind(:,j)] = sort(A(:,j),"gm", "i"); end
- for sorting each row independantly (”c”), do a loop with a stable sort (”gm” or ”gs”) on
each row:
B = zeros(size(A)); ind = zeros(size(A));
for i=1:size(A,1), [B(i,:),ind(i,:)] = sort(A(i,:),"gm", "i"); end
Examples
example 1
sort of a vector of numbers:
x = [0.5, -1, 2, 2, -1, 0.5, 2, -1, 2]; // increasing sort
y = sort(x,dir="i") // or y = sort(x,"g","i") [y,p] = sort(x,dir="i")
y.equal[x(p)] // must be true // decreasing sort y = sort(x,dir="d") // or y = sort(x,"g","d")
[y,p] = sort(x,dir="d") y.equal[x(p)] // must be true // behavior with special values
x(2)= %nan;x(3)=%inf; x(7)=%nan; x(8)=-%inf y = sort(x,dir="i") y = sort(x,dir="d")
example 2
sort a vector of strings:
x = ["toto", "foo", "bar", "toto", "foobar", "bar", "toto", "foo", "bar"] y = sort(x,dir="i")
y = sort(x,dir="d") [y,p] = sort(x,dir="i") y.equal[x(p)] // must be true
example 3
lexical sorts
A = [0.5, -1, 2; 0.3, 4, 0.7; 0.3, -1, 2] // increasing row lexical sort
B = sort(A,type="ldr",dir="i") // or B = sort(A,"ldr","i") [B,p] = sort(A,type="ldr",dir="i")
B.equal[A(p,:)] // must be true // increasing column lexical sort
B = sort(A,type="ldc",dir="i") [B,p] = sort(A,"ldc","i") B.equal[A(:,p)] // must be true
example 4
speed issues
// a case where all or near all vector components are different x = rand(1e6,1);
// g sort (should be slower than gb here) t=cputime(); y = sort(x,"g","i"); cputime()-t
// gb sort (should be faster here) t=cputime(); y = sort(x,"gb","i"); cputime()-t
// merge sort (generally slower than g and gb) t=cputime(); y = sort(x,"gm","i"); cputime()-t
// a case with many equal components x = grand(1e6,1,"uin",1,100);
// g sort (should be faster here) t=cputime(); y = sort(x,"g","i"); cputime()-t
// gb sort (should be slower than g) t=cputime(); y = sort(x,"gb","i"); cputime()-t
// merge sort t=cputime(); y = sort(x,"gm","i"); cputime()-t
See also
issorted , unique
Authors
Bentley and McIlroy 's code adapted and modified by Jean-Philippe Chancelier,”gb” and ”gs”
codes by Bruno Pincon.