@@ -32,9 +32,7 @@ versions corresponding to differend types of array arguments.
32
32
33
33
The ` int_size ` parameter is used to specify the kind of integer used
34
34
in indexing the various arrays. Currently the module sets ` int_size `
35
- to the value of ` int64 ` from the intrinsic ` ISO_FORTRAN_ENV ` module.
36
- For many applications a value of ` INT32 ` would be sufficient for
37
- addressing and would save some stack space for the subroutines,
35
+ to the value of ` int64 ` from the ` stdlib_kinds ` module.
38
36
39
37
### The module subroutines
40
38
@@ -43,29 +41,26 @@ subroutines intended to sort three different kinds of arrays of
43
41
data:
44
42
* ` ORD_SORT ` is intended to sort simple arrays of intrinsic data
45
43
that have significant sections that were partially ordered before
46
- the sort; and
44
+ the sort;
47
45
* ` ORD_SORTING ` is intended to provide indices for sorting arrays of
48
46
derived type data, based on the ordering of an intrinsic component
49
- of the derived type.
47
+ of the derived type; and
50
48
* ` UNORD_SORT ` is intended to sort simple arrays of intrinsic data
51
- that are effectively unordered before the sort;
49
+ that are effectively unordered before the sort.
52
50
53
- #### The ` ORD_SORT ` subroutine
51
+ #### Licensing
54
52
55
- ` ORD_SORT ` is a translation of the [ ` rust sort ` sorting algorithm]
56
- (https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs )
57
- which in turn is inspired by the [ ` timsort ` algorithm of Tim Peters]
58
- (http://svn.python.org/projects/python/trunk/Objects/listsort.txt ).
59
- ` ORD_SORT ` is a hybrid stable comparison algorithm combining ` merge sort ` ,
60
- and ` insertion sort ` . It has always at worst O(N Ln(N)) runtime
61
- performance in sorting random data, having a performance about 15-25%
62
- slower than ` UNORD_SORT ` on such data. However it has much better
63
- performance than ` UNORD_SORT ` on partially sorted data, having O(N)
64
- performance on uniformly increasing or decreasing data.
53
+ The Fortran Standard Library is distributed under the MIT
54
+ License. However components of the library may be based on code with
55
+ additional licensing restriction. In particular ` ORD_SORT ` ,
56
+ ` ORD_SORTING ` , and ` UNORD_SORT ` are translations of codes with their
57
+ own distribution restrictions.
65
58
66
- The [ ` rust sort ` implementation]
67
- (https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs )
68
- is distributed with the header:
59
+ The ` ORD_SORT ` and ` ORD_SORTING ` subroutines are essentially
60
+ translations to Fortran 2008 of the ` "rust" sort ` of the Rust Lsnguage
61
+ distributed as part of
62
+ [ ` slice.rs ` ] ( https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs ) .
63
+ The header of the ` slice.rs ` file has as its licensing requirements:
69
64
70
65
Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
71
66
file at the top-level directory of this distribution and at
@@ -77,19 +72,48 @@ is distributed with the header:
77
72
option. This file may not be copied, modified, or distributed
78
73
except according to those terms.
79
74
80
- so the license for the original code is compatible with the use of
75
+ so the license for the ` slice.rs ` code is compatible with the use of
81
76
modified versions of the code in the Fortran Standard Library under
82
77
the MIT license.
83
78
84
- As with ` timsort ` , ` ORD_SORT ` is a stable hybrid algorithm.
85
- It begins by traversing the array starting in its tail attempting to
86
- identify ` runs ` in the array, where a run is either a uniformly
87
- decreasing sequence, ` ARRAY(i-1) > ARRAY(i) ` , or non-decreasing,
88
- ` ARRAY(i-1) <= ARRAY(i) ` , sequence. Decreasing sequences are reversed.
89
- Then, if the sequence has less than ` MIN_RUN ` elements, previous
90
- elements in the array are added to the run using ` insertion sort `
91
- until the run contains ` MIN_RUN ` elements or the array is completely
92
- processed. As each run is identified the start and length of the run
79
+ The ` UNORD_SORT ` subroutine is essentially a translation to Fortran
80
+ 2008 of the
81
+ [ ` introsort ` ] ((http://www.cs.rpi.edu/~musser/gp/introsort.ps ) of David
82
+ Musser. David Musser has given permission to include a variant of
83
+ ` introsort ` in the Fortran Standard Library under the MIT license
84
+ provided we cite:
85
+
86
+ Musser, D.R., “Introspective Sorting and Selection Algorithms,”
87
+ Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997).
88
+
89
+ as the official source of the algorithm.
90
+
91
+
92
+ #### The ` ORD_SORT ` subroutine
93
+
94
+ ` ORD_SORT ` is a translation of the ` "Rust" sort ` sorting algorithm
95
+ contained in [ ` slice.rs ` ]
96
+ (https://github.com/rust-lang/rust/blob/90eb44a5897c39e3dff9c7e48e3973671dcd9496/src/liballoc/slice.rs ).
97
+ ` "Rust" sort ` , in turn, is inspired by the [ ` timsort ` algorithm]
98
+ (http://svn.python.org/projects/python/trunk/Objects/listsort.txt )
99
+ that Tim Peters created for the Python Language.
100
+ ` ORD_SORT ` is a hybrid stable comparison algorithm combining ` merge sort ` ,
101
+ and ` insertion sort ` . It has always at worst O(N Ln(N)) runtime
102
+ performance in sorting random data, having a performance about 15-25%
103
+ slower than ` UNORD_SORT ` on such data. However it has much better
104
+ performance than ` UNORD_SORT ` on partially sorted data, having O(N)
105
+ performance on uniformly increasing or decreasing data.
106
+
107
+
108
+ ` ORD_SORt ` begins by traversing the array starting in its tail
109
+ attempting to identify ` runs ` in the array, where a run is either a
110
+ uniformly decreasing sequence, ` ARRAY(i-1) > ARRAY(i) ` , or a
111
+ non-decreasing, ` ARRAY(i-1) <= ARRAY(i) ` , sequence. Once deliminated
112
+ decreasing sequences are reversed in their order. Then, if the
113
+ sequence has less than ` MIN_RUN ` elements, previous elements in the
114
+ array are added to the run using ` insertion sort ` until the run
115
+ contains ` MIN_RUN ` elements or the array is completely processed. As
116
+ each run is identified the start and length of the run
93
117
is then pushed onto a stack and the stack is then processed using
94
118
` merge ` until it obeys the stack invariants:
95
119
@@ -101,39 +125,22 @@ Ln(N))`. However, because of the identification of decreasing and
101
125
non-decreasing runs, processing of structured data can be much faster,
102
126
with processing of uniformly decreasing or non-decreasing arrays being
103
127
of order O(N). The result in our tests is that ` ORD_SORT ` is about
104
- 15- 25% slower than ` UNORD_SORT ` on purely random data, depending on
105
- the compiler, but can be more than an order of magnitude faster than
106
- ` UNORD_SORT ` on highly structured data. As a modified ` merge sort ` ,
107
- ` ORD_SORT ` requires the use of a "scratch" array, that may be provided
108
- as an optional ` work ` argument or allocated internally on the stack.
128
+ 25% slower than ` UNORD_SORT ` on purely random data, depending on
129
+ the compiler, but can be ` Ln(N) ` faster than ` UNORD_SORT ` on highly
130
+ structured data. As a modified ` merge sort ` , ` ORD_SORT ` requires the
131
+ use of a "scratch" array, that may be provided as an optional ` work `
132
+ argument or allocated internally on the stack.
109
133
110
134
#### The ` ORD_SORTING ` subroutine
111
135
112
- The ` UNORD_SORT ` and ` ORD_SORT ` subroutines can sort isolated arrays
113
- of intrinsic types, but do nothing for the sorting of arrays of
114
- derived types. For arrays of derived types what is useful is an array
115
- of indices that maps the original array to an array sorted based on the
116
- value of a component of the derived type. For such a sort, a stable
117
- sort is useful, therefore the module provides a subroutine,
118
- ` ORD_SORTING ` , that generates such an array of indices based on
119
- the ` ORD_SORT ` algorithm.
120
-
121
- As ` ORD_SORT ` is also based on the ` rust sort ` algorithm the `rust
122
- sort` license must be acknowledged:
123
-
124
- Copyright 2012-2015 The Rust Project Developers. See the COPYRIGHT
125
- file at the top-level directory of this distribution and at
126
- http://rust-lang.org/COPYRIGHT.
127
-
128
- Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
129
- http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
130
- <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
131
- option. This file may not be copied, modified, or distributed
132
- except according to those terms.
133
-
134
- noting that the Fortran Standard Library is released under the MIT
135
- license so that incorporation of the ` rust sort ` algorithm is
136
- compatible with its license.
136
+ The ` UNORD_SORT ` and ` ORD_SORT ` subroutines can sort rank 1 isolated
137
+ arrays of intrinsic types, but do nothing for the coordinated sorting
138
+ of related data, e.g., multiple related rank 1 arrays, higher rank
139
+ arrays, or arrays of derived types. For such related data, what is
140
+ useful is an array of indices that maps a rank 1 array to its sorted
141
+ form. For such a sort, a stable sort is useful, therefore the module
142
+ provides a subroutine, ` ORD_SORTING ` , that generates such an array of
143
+ indices based on the ` ORD_SORT ` algorithm.
137
144
138
145
The logic of ` ORD_SORTING ` parallels that of ` ORD_SORT ` , with
139
146
additional housekeeping to keep the array of indices consistent with
@@ -145,19 +152,12 @@ internally on the stack.
145
152
146
153
#### The ` UNORD_SORT ` subroutines
147
154
148
- ` UNORD_SORT ` uses the [ ` introsort ` sorting algorithm of David Musser]
149
- (http://www.cs.rpi.edu/~musser/gp/introsort.ps ). ` introsort ` is a hybrid
150
- unstable comparison algorithm combining ` quicksort ` , ` insertion sort ` ,
151
- and ` heap sort ` . While this algorithm's runtime performance is always
152
- O(N Ln(N)), it is relatively fast on randomly ordered data, but
153
- inconsistent in performance on partly sorted data. David Musser has
154
- given permission to include a variant of ` introsort ` in the Fortran
155
- Standard Library under the MIT license provided we cite:
156
-
157
- Musser, D.R., “Introspective Sorting and Selection Algorithms,”
158
- Software—Practice and Experience, Vol. 27(8), 983–993 (August 1997).
159
-
160
- as the official source of the algorithm.
155
+ ` UNORD_SORT ` uses the ` introsort ` sorting algorithm of David Musser.
156
+ ` introsort ` is a hybrid unstable comparison algorithm combining
157
+ ` quicksort ` , ` insertion sort ` , and ` heap sort ` . While this algorithm's
158
+ runtime performance is always O(N Ln(N)), it is relatively fast on
159
+ randomly ordered data, but inconsistent in performance on partly
160
+ sorted data.as the official source of the algorithm.
161
161
162
162
As with ` introsort ` , ` UNORD_SORT ` is an unstable hybrid algorithm.
163
163
First it examines the array and estimates the depth of recursion a
@@ -260,7 +260,7 @@ element of `array` is a `NaN`.
260
260
...
261
261
```
262
262
263
- #### ` ord_sorting ` - creates an arry of sorting indices for an input array.
263
+ #### ` ord_sorting ` - creates an array of sorting indices for an input array.
264
264
265
265
##### Status
266
266
@@ -329,8 +329,52 @@ is a `NaN`. It should be emphasized that the order of `array` will
329
329
typically be different on return.
330
330
331
331
332
- ##### Example
332
+ ##### Examples
333
+
334
+ Sorting a related rank one array:
335
+
336
+ ``` Fortran
337
+ subroutine sort_related_data( a, b, work, index, iwork )
338
+ ! Sort `b` in terms or its related array `a`
339
+ integer, intent(inout) :: a(:)
340
+ integer(int32), intent(inout) :: b(:) ! The same size as a
341
+ integer(int32), intent(inout) :: work(:)
342
+ integer(int_size), intent(inout) :: index(:)
343
+ integer(int_size), intent(inout) :: iwork(:)
344
+ ! Find the indices to sort a
345
+ call ord_sorting(a, index(1:size(a)),&
346
+ work(1:size(a)/2), iwork(1:size(a)/2))
347
+ ! Sort b based on the sorting of a
348
+ b(:) = b( index(1:size(a)) )
349
+ end subroutine sort_related_data
350
+ ```
351
+
352
+ Sorting a rank 2 array based on the data in a column
353
+
354
+ ``` Fortran
355
+ subroutine sort_related_data( array, column, work, index, iwork )
356
+ ! Sort `a_data` in terms or its component `a`
357
+ integer, intent(inout) :: a(:,:)
358
+ integer(int32), intent(in) :: column
359
+ integer(int32), intent(inout) :: work(:)
360
+ integer(int_size), intent(inout) :: index(:)
361
+ integer(int_size), intent(inout) :: iwork(:)
362
+ integer, allocatable :: dummy(:)
363
+ integer :: i
364
+ allocate(dummy(size(a, dim=1)))
365
+ ! Extract a component of `a_data`
366
+ dummy(:) = a(:, column)
367
+ ! Find the indices to sort the column
368
+ call ord_sorting(dummy, index(1:size(dummy)),&
369
+ work(1:size(dummy)/2), iwork(1:size(dummy)/2))
370
+ ! Sort a based on the sorting of its column
371
+ do i=1, size(a, dim=2)
372
+ a(:, i) = a(index(1:size(a, dim=1)), i)
373
+ end do
374
+ end subroutine sort_related_data
375
+ ```
333
376
377
+ Sorting an array of a derived type based on the dsta in one component
334
378
``` Fortran
335
379
subroutine sort_a_data( a_data, a, work, index, iwork )
336
380
! Sort `a_data` in terms or its component `a`
@@ -455,9 +499,9 @@ arrays each of size `2**20`:
455
499
* Random-10 - the final ten elements of the increasing array are
456
500
replaced by random values.
457
501
458
- These benchmarks have been performed on two different processors , both
502
+ These benchmarks have been performed on two different compilers , both
459
503
on a MacBook Pro, featuring a 2.3 GHz Quad-Core Intel Core i5, with 8
460
- GB 2133 MHz LPDDR3 memory. The first processor was GNU Fortran
504
+ GB 2133 MHz LPDDR3 memory. The first compiler was GNU Fortran
461
505
(GCC) 10.2.0, with the following results:
462
506
463
507
| Type | Elements | Order | Method | Time (s) |
@@ -490,7 +534,7 @@ GB 2133 MHz LPDDR3 memory. The first processor was GNU Fortran
490
534
| Integer | 1048576 | Random 3 | Unord_Sort | 0.13826 |
491
535
| Integer | 1048576 | Random 10 | Unord_Sort | 0.35356 |
492
536
493
- The second processor was ifort (IFORT) 18.0.3 20180410, with the
537
+ The second compiler was ifort (IFORT) 18.0.3 20180410, with the
494
538
following results:
495
539
496
540
| Type | Elements | Order | Method | Time (s) |
0 commit comments