Thank you
@bplus for your tests.
In the morning did write the 'Magnetica' partitioning, which helps greatly in "few distinct keys" scenario, usually 15x to 3x speed boost it gives. However, it hurts speed badly (2x) in "many distinct keys", the problem lies in STILL not well written right part traversal (Jndx), doing unnecessary swaps has to be fixed...
Your test (modified) is given below with ability (just change line 8) to run in "few distinct keys", otherwise it is the most common "many distinct keys":
"few distinct keys" scenario:
_Title "Recursive Vrs NonRecursive test with Quicksort" 'b+ 2021-11-01, Sanmayce added some stuff 2021-Nov-04 ArrayHigh = 10000000
n(i) = i '13 'i
For i
= ArrayHigh
To 2 Step -1 ' Fisher Yates Shuffle m(i) = n(i) ' copy m off n
z(i) = n(i) ' copy m off n
Print "Test arrays ready. Testing Recursive QuickSort first." n(1) = ArrayHigh
m(1) = n(1)
z(1) = n(1)
QuickSort 1, ArrayHigh
QuickSortT#
= Timer(.001) - T#
For i
= 1 To ArrayHigh
- 1 Print "QuickSort checked, QuickSort time was"; QuickSortT#
Quicksort_QB64
QuicksortQB64T#
= Timer(.001) - T#
For i
= 1 To ArrayHigh
- 1 Print "Quicksort_QB64 checked, Quicksort_QB64 time was"; QuicksortQB64T#
Quicksort_QB64_v2
QuicksortQB64T#
= Timer(.001) - T#
For i
= 1 To ArrayHigh
- 1 Print "Quicksort_QB64_v2 checked, Quicksort_QB64_v2 time was"; QuicksortQB64T#
' recursive
Lo = start: Hi = finish
Middle = n((Lo + Hi) \ 2)
Lo = Lo + 1: Hi = Hi - 1
If Hi
> start
Then QuickSort start
, Hi
If Lo
< finish
Then QuickSort Lo
, finish
' non recursive
' ref: https://rosettacode.org/wiki/Sorting_algorithms/Quicksort#QB64
LeftMargin = Left
StackPtr = 0
StackPtr = StackPtr + 1
Stack&&(StackPtr + LeftMargin) = Left
StackPtr = StackPtr + 1
Stack&&(StackPtr + LeftMargin) = Right
Right = Stack&&(StackPtr + LeftMargin)
StackPtr = StackPtr - 1
Left = Stack&&(StackPtr + LeftMargin)
StackPtr = StackPtr - 1
Pivot~&& = m((Left + Right) \ 2)
Indx = Left
Jndx = Right
Indx = Indx + 1
Jndx = Jndx - 1
Indx = Indx + 1
Jndx = Jndx - 1
StackPtr = StackPtr + 1
Stack&&(StackPtr + LeftMargin) = Indx
StackPtr = StackPtr + 1
Stack&&(StackPtr + LeftMargin) = Right
Right = Jndx
StackPtr = 0
StackPtr = StackPtr + 1
Stack&&(StackPtr + LeftMargin) = Left
StackPtr = StackPtr + 1
Stack&&(StackPtr + LeftMargin) = Right
Right = Stack&&(StackPtr)
Left = Stack&&(StackPtr - 1)
StackPtr = StackPtr - 2
' 'Classica' partitioning [
'Pivot~&& = z((Left + Right) \ 2)
'Indx = Left
'Jndx = Right
'Do
' Do While (z(Indx) < Pivot~&&)
' Indx = Indx + 1
' Loop
' Do While (z(Jndx) > Pivot~&&)
' Jndx = Jndx - 1
' Loop
' If Indx <= Jndx Then
' If Indx < Jndx Then Swap z(Indx), z(Jndx)
' Indx = Indx + 1
' Jndx = Jndx - 1
' End If
'Loop While Indx <= Jndx
' 'Classica' partitioning ]
' 'Magnetica' partitioning [
Jndx = Right
PL = Left
PR = Left
Swap z
((Left
+ Right
) \
2), z
(PL
) PL = PL + 1
PR = PR + 1
PR = PR + 1
Jndx = Jndx - 1
Jndx = PL - 1
Indx = PR + 1
' 'Magnetica' partitioning ]
StackPtr = StackPtr + 2
Stack&&(StackPtr - 1) = Indx
Stack&&(StackPtr) = Right
Right = Jndx
Consider having two records/fields (list of all living 7 billion people names with their birth date), now, the second column/field if it is to be sorted we cannot afford using the "Classica" partitioning because we enter the "few distinct keys" scenario with some 100 unique values. I am against "patches" of some pseudo-smart heuristics trying to adapt "on the fly" to the type of keys in use, rather, I would prefer having two dedicated subroutines, one dealing with "few" the other with "many".
Of course if we can come up with fixing the "Magnetica" then there will be one subroutine only, nice and dirty-cheap, heh-heh.
Also, very often one needs sorting all the files on some drive by their year, not a good feeling 'Classica' to force you wait 7 seconds as in example above for 10,000,000 files.
In practice, there are more brutal sort cases, such as sorting occurrences of 600 billion English 5-grams (phrases of 5 words), the nasty 600GB x 4bytes =~2TB.
You are right
@jack, but there is one aspect you should consider, being dirty from what standpoint!
As far as I know, using runtime stack (in addition to program's needs) is not a good practice, saw few discussions in the past where some Linux kernel gurus explicitly stated recursive code is not welcome at all, the topic was some very refined B-tree implementation, but recursive. Simply, the stack is considered an internal zone, endangered in case of entering deep recursion or for some reason stack being used extensively to the brim, hah-hah, hence the super popular stackoverflow site.
Oof, forgot to share the benchmarks done on some modern machines:
https://www.overclock.net/threads/benchmark-quicksort-says-sorting-2-billion-qwords.1794855/