and my 2 cents from tool box, load and sort array at same time:
_TITLE "Load and sort to array at same time" 'B+ 2019-10-26 '2019-10-27 edited 2x with Steve's improvements thanks Steve!
' 2019-11-03 mod for name score data file
DIM d
(1 TO 100) AS dat
'dimensioned to something known to be > data lines i = i + 1
d(k) = d(k - 1)
d(j).s = s: d(j).n = N$
Here's the only reason why I don't usually advocate the "load and sort" method:
Even with integer values and a fairly smallish data set (only 20,000 items), the load and sort method takes 5 seconds to run on my machine. Loading then sorting takes about 0.6 seconds...
Now, part of the issue here is with opening the file for INPUT, which is slower than heck to process. If we replace it with opening the file for BINARY access and using LINE INPUT and VAL, we change the times to 4.5 seconds and 0.15 seconds like so:
In the time it takes us to load and sort 20000 numbers, we can instead load 500000 numbers and then sort them, as illustrated below:
CONST limit
= 500000 'number of items to load and sort
FOR i
= 0 TO limit
'print a ton of entries
n(k) = n(k - 1)
n(j) = num
PRINT USING "###.### seconds to load and sort 20,000 items together"; t1##
- t##
Sort m
PRINT USING "###.### seconds to load and then sort 500,000 items"; t3##
- t2##
'Convert our offset data over to something we can work with
_MEMPUT m1
, m1.OFFSET
, m.ELEMENTSIZE:
_MEMGET m1
, m1.OFFSET
, ES
'Element Size _MEMPUT m1
, m1.OFFSET
, m.SIZE:
_MEMGET m1
, m1.OFFSET
, EC
'Element Count will temporily hold the WHOLE array size
EC = EC / ES - 1 'Now we take the whole element size / the size of the elements and get our actual element count. We subtract 1 so our arrays start at 0 and not 1.
'And work with it!
i = 0
temp1(t1) = temp1(t1) + 1
i = i + 1
i1 = -128
counter = counter + 1
temp1(i1) = temp1(i1) - 1
i1 = i1 + 1
i = 0
temp2(t2) = temp2(t2) + 1
i = i + 1
i1 = -32768
counter = counter + 1
temp2(i1) = temp2(i1) - 1
i1 = i1 + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 4
o1 = m.OFFSET + (i + gap) * 4
swapped = -1
i = i + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 4
o1 = m.OFFSET + (i + gap) * 4
swapped = -1
i = i + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 8
o1 = m.OFFSET + (i + gap) * 8
swapped = -1
i = i + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 32
o1 = m.OFFSET + (i + gap) * 32
swapped = -1
i = i + 1
gap = EC
gap
= INT(gap
/ 1.247330950103979) i = 0
swapped = 0
o = m.OFFSET + i * ES
o1 = m.OFFSET + (i + gap) * ES
T7c = T7b
swapped = -1
i = i + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 8
o1 = m.OFFSET + (i + gap) * 8
swapped = -1
i = i + 1
CASE 11:
'_UNSIGNED _BYTE i = 0
temp11(t11) = temp11(t11) + 1
i = i + 1
i1 = 0
counter = counter + 1
temp11(i1) = temp11(i1) - 1
i1 = i1 + 1
CASE 12 '_UNSIGNED INTEGER i = 0
temp12(t12) = temp12(t12) + 1
i = i + 1
i1 = 0
counter = counter + 1
temp12(i1) = temp12(i1) - 1
i1 = i1 + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 4
o1 = m.OFFSET + (i + gap) * 4
swapped = -1
i = i + 1
CASE 18:
'_UNSIGNED _INTEGER64 gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 8
o1 = m.OFFSET + (i + gap) * 8
swapped = -1
i = i + 1
Loading and sorting as you go works good for small data sets, but what you're really implementing is an Insertion Sort method, and there's a lot of other sorting routines which are much more efficient out there. The more data you need to add into an array, the less useful an insertion sort method becomes. If you have 100,000 sorted items and need to add a new sorted item into the set, then it's definitely the way to go. If you have 100,000 items that all need to be sorted completely to begin with, then you should generally choose a more efficient sorting method.
Since the memsort routine is a standard part of my toolbox, I almost never bother with the "load and sort" method. I generally just load an array and then sort it and be done with it, like the examples above. ;)
And just for fun, and to really highlight the difference in various sorting methods performance, I took the file loading times completely out and wrote this little demo:
CONST limit
= 2000000 'number of items to load and sort
FOR i
= 0 TO limit
'print a ton of entries n(i) = num
Array(i) = num
num = n(i)
n(k) = n(k - 1)
n(j) = num
PRINT USING "###.### seconds to insertion sort 20,000 items together"; t1##
- t##
Sort m
PRINT USING "###.### seconds to mem sort 2,000,000 items"; t3##
- t2##
'Convert our offset data over to something we can work with
_MEMPUT m1
, m1.OFFSET
, m.ELEMENTSIZE:
_MEMGET m1
, m1.OFFSET
, ES
'Element Size _MEMPUT m1
, m1.OFFSET
, m.SIZE:
_MEMGET m1
, m1.OFFSET
, EC
'Element Count will temporily hold the WHOLE array size
EC = EC / ES - 1 'Now we take the whole element size / the size of the elements and get our actual element count. We subtract 1 so our arrays start at 0 and not 1.
'And work with it!
i = 0
temp1(t1) = temp1(t1) + 1
i = i + 1
i1 = -128
counter = counter + 1
temp1(i1) = temp1(i1) - 1
i1 = i1 + 1
i = 0
temp2(t2) = temp2(t2) + 1
i = i + 1
i1 = -32768
counter = counter + 1
temp2(i1) = temp2(i1) - 1
i1 = i1 + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 4
o1 = m.OFFSET + (i + gap) * 4
swapped = -1
i = i + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 4
o1 = m.OFFSET + (i + gap) * 4
swapped = -1
i = i + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 8
o1 = m.OFFSET + (i + gap) * 8
swapped = -1
i = i + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 32
o1 = m.OFFSET + (i + gap) * 32
swapped = -1
i = i + 1
gap = EC
gap
= INT(gap
/ 1.247330950103979) i = 0
swapped = 0
o = m.OFFSET + i * ES
o1 = m.OFFSET + (i + gap) * ES
T7c = T7b
swapped = -1
i = i + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 8
o1 = m.OFFSET + (i + gap) * 8
swapped = -1
i = i + 1
CASE 11:
'_UNSIGNED _BYTE i = 0
temp11(t11) = temp11(t11) + 1
i = i + 1
i1 = 0
counter = counter + 1
temp11(i1) = temp11(i1) - 1
i1 = i1 + 1
CASE 12 '_UNSIGNED INTEGER i = 0
temp12(t12) = temp12(t12) + 1
i = i + 1
i1 = 0
counter = counter + 1
temp12(i1) = temp12(i1) - 1
i1 = i1 + 1
gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 4
o1 = m.OFFSET + (i + gap) * 4
swapped = -1
i = i + 1
CASE 18:
'_UNSIGNED _INTEGER64 gap = EC
gap = 10 * gap \ 13
i = 0
swapped = 0
o = m.OFFSET + i * 8
o1 = m.OFFSET + (i + gap) * 8
swapped = -1
i = i + 1
4.34 seconds to insertion sort 20,000 items together.
0.14 seconds to mem sort 2,000,000 items together.
30 times faster, and we sorted 100 times more items... I think that goes to show why generally speaking, I prefer to just load the array and then sort the array. ;)