Author Topic: A skeleton code for Text Scroller via Drag-and-Drop  (Read 29995 times)

0 Members and 1 Guest are viewing this topic.

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #60 on: December 07, 2021, 10:13:37 pm »
With Quicksort 'Magnetica' in our toolbox, now Masakari r.8.1++ has been enriched with an unseen functionality, namely, sorting by linesize!

Sometimes, the need to view broken/long lines arises, then Masakari comes sorting them, FAST!

Looking what the most powerful editor UltraEdit v28.20 (latest version) can do with loaded file:
https://www.ultraedit.com/wiki/images/2/25/Ultraedit_sort.png

I see only the basic functionality, but when comes to spotting/browsing longest lines in huge files UE is not ultra enough, to fill the gap, here comes the fastest sorter combined with the fastest scheme of representing the lines - a pair of QWORDS: Offset:Length, naturally the sorting is simplified down to sorting all those pairs without accessing external RAM nor computing length of lines.

Again the longest XML file (English Wikipedia) is our testdataset:

 
111.png


The shortcut 'Shift+1' sorts by Length, the longest line is at the bottom:

 
222.png


The shortcut 'Shift+2' sorts by Offset, in fact, it returns the original file (instead of reloading it) i.e. REVERT.

As far as I know, there is still no tool/editor capable to sort 1+ billion lines in  (5:01:02 - 4:59:15) = 107 seconds and allow to browse all the longest lines.
He learns not to learn and reverts to what all men pass by.

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #61 on: December 09, 2021, 09:37:31 am »
Being a nasty bottleneck in Case-Insensitive searching, UCASE$() function needs a overhaul.

Glad to share another superfast function - UCASE_XMM, it replaces the superslow UCASE$() built-in.
Vectorization using XMM registers trumps bigtime the scalar variant.

 
manatarka.png


My intent is to include more superfast functions in next edition of my "manatarka" custom library, QB64 deserves some boost in textual department, as:

- a replacement for INSTR() built-in, memmem(), the FASTEST variant (to my knowledge) - Railgun();
- 128bit hasher, the FASTEST variant (to my knowledge) - Gumbotron();
- UCASE_YMM, the AVX variant.
* UCASE_showdown_SIMD.zip (Filesize: 2.31 MB, Downloads: 211)
He learns not to learn and reverts to what all men pass by.

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #62 on: December 10, 2021, 02:26:35 am »
Did some comparisons, INSTR() vs my SCALAR and VECTOR Railgun() functions, the bottomline is INSTR() is no match to the monstrously optimized memmem() variants of mine.

Okay, the testmachine is my laptop with i5-7200U, 36GB DDR4, the testdatafile is the ~4MB King James Bible.

The speedup is awesome, keep in mind the longer the Needle/Pattern the bigger the speedup becomes, short needles in range 2..6 are not that indicative of how superfast Railgun() is:

My SCALAR memmem(), called Railgun_Trolldom, is 2543/1413= 1.79x faster than INSTR() when the Needle is "gods".
My VECTOR memmem(), called Railgun_Nyotengu, is 6359/1413= 4.50x faster than INSTR() when the Needle is "gods".

As for the UCASE_XMM, the speedup is 30x to 60x:

 
r2.png


In the future, UCASE_XMM is to be incorporated into NyoTengu_XMM - then we will have the FASTEST Case-Insensitive INSTR() counterpart!

The manatarka.h housing the UCASE_XMM, Railgun_Trolldom and Railgun_Nyotengu is attached along with the QB64 test program.
Enjoy!
* UCASE_showdown_v2_SIMD.zip (Filesize: 17.93 MB, Downloads: 188)
He learns not to learn and reverts to what all men pass by.

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • View Profile
    • Steve’s QB64 Archive Forum
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #63 on: December 10, 2021, 02:52:27 am »
Try https://www.qb64.org/wiki/STRICMP when you don't care about case while comparing strings.
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #64 on: December 10, 2021, 03:06:46 am »
Try https://www.qb64.org/wiki/STRICMP when you don't care about case while comparing strings.

This is good to have as well, but I am not aware of INSTR() that is Case-Insensitive, let alone superfast.
If we are forced to search for "gods" in the Project Gutenberg KJV Bible, case-insensitively then we have 2 more hits i.e. "Gods".
The idea is QB64 community to have even faster functions than GLIBC.
He learns not to learn and reverts to what all men pass by.

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #65 on: December 11, 2021, 01:10:13 am »
AFAIK, until now, QB64 lacks a general purpose hash function, my long experience with hashing (mostly for look-up tables) comes in handy in form of Gumbotron().

My Gumbotron hash function (more in Reply #47 in this thread) is the fastest 128bit hasher, it comes in XMM and YMM variants i.e. for SSE4.1 and AVX2, while generating the same hash. The screenshot shows the Gumbotron_YMM giving the correct hash (as it should be, given in the orange console by Nakamichi):

 
r3.png


It returns a pair of QWORDS, so if the usecase demands 64bit hash you can use either loPart or HiPart.
It is nearly as fast as the fastest 64bit hashers, but much stronger. In practice, 64bit hashers generate a lot of collisions even for casual tasks dealing with few million (try 7 billion - e.g. the names of all people) keys, nah, Gumbotron's 2x64bit scheme proves supereffective, my latest compressor Nakamichi uses this very hash without any problems - heavily tested hasher - battle-tested it is.
Imagine a search scenario where you have many keys/elements (different in length, or not), then by hashing them you can tighten them down to 16 bytes, after that you can use Quicksort 'Magnetica' to sort those hashes, and "finalize" with goodold Binary-Search.
In Nakamichi, I exploit Gumbotron even better, due to its superior 128bit-ness, it is used as a lossy compressor - shrinking all kind of keys 16+ in size (18, ... 256) down to 16 bytes. If there was a single collision then big problems would ensue, but the linear mixer provided by AES is doing great job, so, no problema.

The full source (and Windows binary) is attached.
* UCASE_showdown_v3_SIMD.zip (Filesize: 17.69 MB, Downloads: 244)
He learns not to learn and reverts to what all men pass by.

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #66 on: December 26, 2021, 11:48:14 pm »
I was able to speed up a little (~300ms, down to 1,677 ms) the spell-check rate by checking only the words with equal length.
That is, the previous stupid checking of each word against the whole 1,000,000+ Binary-Search pool now was limited only to THE fraction of the pool where the corresponding length is located - which lowers the number of seeks/comparisons.

Additionally, all the UNFAMILIAR words, now, are dumped as an index, but placed in the very beginning, in order of appearance.
Thanks to that feature, one is allowed quickly to spot misspelled words, as I did only with a few glances, 2 buggy "daughters" found:
daughers
daughers


Project Gutenberg failed to deliver quality, obviously the correctors still rely on non-automatic methods.

Finally, we have a decent benchmark utilizing the Binary-Search in a speedy manner, it is a good start for all kind of experiments since it is creating as a result the .HTM counterpart (with all unfamiliar words in BOLD, all OED distinct words are used as a spell-check wordlist) of our .TXT file:

 
r7_1.png


It throws all the words of KJV Bible at OED 2nd edition, which is 700,000+ words against the 1,000,000+ OED words:

 
r7_2.png


The spotted errors are to go along with the entire .HTM into .DOC|.PDF file for hardcopy i.e. paper reviewing, with good old pencils:

 
r7_3.png


Spell-checking KJV Bible in 1.677 seconds, can we do better?

 
satanella_sharpen_.png


On Linux, the newer GCC (v10.2.1) generates 100ms faster code, lowering the conversion down to:

 
Screenshot at 2021-12-27 05-22-44.png


 
Screenshot at 2021-12-27 05-32-30.png


Anyway, this TXT2HTM code will be part of incoming Masakari r.9, enabling having spell-checked .HTMs only a key and few seconds away...

The source code and binaries for Linux and Windows (also the OED wordlist) are in the attached package.
* KJV-BIBLE_vs_Oxford_r2_ELF-EXE.7z (Filesize: 12.46 MB, Downloads: 219)
He learns not to learn and reverts to what all men pass by.

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #67 on: February 22, 2022, 03:52:53 am »
Glad to share the latest-n-fastest Quicksort around - Magnetica r.5 - the source code is attached.

 
Quicksort Showdown_2022-Feb-22.pdf.png


I spotted a conditional swap within the mainloop, now, in r.5 this branch is removed and put outside the hottest loop.

So, r.5 is algorithmically superior compared to r.4, the performance varies between AMD and Intel, between ICL and GCC, overall it pushes the limits once more.

Behold the simplicity itself:
Code: QB64: [Select]
  1. // Magnetica r.5BB:
  2. /*
  3. ; mark_description "Intel(R) C++ Compiler XE for applications running on Intel(R) 64, Version 15.0.0.108 Build 20140726";
  4. ; mark_description "-S -O3 -D_N_HIGH_PRIORITY";
  5.  
  6. Quicksort_QB64_v9       PROC
  7. ; parameter 1: rcx
  8. ; parameter 2: rdx
  9. ; parameter 3: r8
  10. .B12.1::                        ; Preds .B12.0
  11.         push      r13                                           ;1172.72
  12.         push      r14                                           ;1172.72
  13.         push      r15                                           ;1172.72
  14.         push      rbp                                           ;1172.72
  15.         mov       eax, 800024                                   ;1172.72
  16.         call      __chkstk                                      ;1172.72
  17.         sub       rsp, 800024                                   ;1172.72
  18.         mov       eax, 2                                        ;1191.5
  19.         mov       QWORD PTR [40+rsp], rdx                       ;1190.17
  20.         mov       QWORD PTR [48+rsp], r8                        ;1191.17
  21.                                
  22. .B12.2::                        ; Preds .B12.20 .B12.1
  23.         mov       rdx, QWORD PTR [32+rsp+rax*8]                 ;1193.17
  24.         mov       r13, QWORD PTR [24+rsp+rax*8]                 ;1194.16
  25.         add       rax, -2                                       ;1195.31
  26.         cmp       r13, rdx                                      ;1197.47
  27.         jge       .B12.20       ; Prob 10%                      ;1197.47
  28.                                
  29. .B12.3::                        ; Preds .B12.2
  30.         mov       r9, r13                                       ;1199.13
  31.                                
  32. .B12.4::                        ; Preds .B12.18 .B12.3
  33.         mov       r10, QWORD PTR [rcx+r13*8]                    ;1229.13
  34.         lea       r15, QWORD PTR [r13+rdx]                      ;1229.48
  35.         sar       r15, 1                                        ;1229.56
  36.         mov       r14, r13                                      ;1200.13
  37.         mov       r11, rdx                                      ;1198.13
  38.         mov       rbp, r9                                       ;1199.13
  39.         mov       r9, r14                                       ;1200.13
  40.         mov       r8, QWORD PTR [rcx+r15*8]                     ;1229.13
  41.         mov       QWORD PTR [rcx+r15*8], r10                    ;1229.13
  42.         mov       QWORD PTR [rcx+r13*8], r8                     ;1229.13
  43.                                
  44. .B12.5::                        ; Preds .B12.4 .B12.13
  45.         inc       r14                                           ;1282.27
  46.         mov       r10, QWORD PTR [rcx+r14*8]                    ;1283.29
  47.         cmp       r8, r10                                       ;1283.29
  48.         ja        .B12.12       ; Prob 22%                      ;1283.29
  49.                                
  50. .B12.6::                        ; Preds .B12.5
  51.         jae       .B12.13       ; Prob 50%                      ;1287.36
  52.                                
  53. .B12.7::                        ; Preds .B12.6
  54.         mov       r15, QWORD PTR [rcx+r11*8]                    ;1288.35
  55.         cmp       r8, r15                                       ;1288.35
  56.         jae       .B12.11       ; Prob 10%                      ;1288.35
  57.                                
  58. .B12.9::                        ; Preds .B12.7 .B12.9
  59.         dec       r11                                           ;1289.39
  60.         mov       r15, QWORD PTR [rcx+r11*8]                    ;1288.35
  61.         cmp       r8, r15                                       ;1288.35
  62.         jb        .B12.9        ; Prob 82%                      ;1288.35
  63.                                
  64. .B12.11::                       ; Preds .B12.9 .B12.7
  65.         mov       QWORD PTR [rcx+r14*8], r15                    ;1291.21
  66.         dec       r14                                           ;1296.31
  67.         mov       QWORD PTR [rcx+r11*8], r10                    ;1291.21
  68.         dec       r11                                           ;1295.35
  69.         jmp       .B12.13       ; Prob 100%                     ;1295.35
  70.                                
  71. .B12.12::                       ; Preds .B12.5
  72.         mov       r15, QWORD PTR [rcx+rbp*8]                    ;1284.21
  73.         mov       QWORD PTR [rcx+rbp*8], r10                    ;1284.21
  74.         inc       rbp                                           ;1285.31
  75.         mov       QWORD PTR [rcx+r14*8], r15                    ;1284.21
  76.                                
  77. .B12.13::                       ; Preds .B12.12 .B12.11 .B12.6
  78.         cmp       r14, r11                                      ;1281.24
  79.         jl        .B12.5        ; Prob 82%                      ;1281.24
  80.                                
  81. .B12.14::                       ; Preds .B12.13
  82.         jle       .B12.16       ; Prob 78%                      ;1299.22
  83.                                
  84. .B12.15::                       ; Preds .B12.14
  85.         mov       r8, QWORD PTR [8+rcx+r11*8]                   ;1300.21
  86.         mov       r10, QWORD PTR [8+rcx+r14*8]                  ;1300.21
  87.         mov       QWORD PTR [8+rcx+r14*8], r8                   ;1300.21
  88.         mov       QWORD PTR [8+rcx+r11*8], r10                  ;1300.21
  89.                                
  90. .B12.16::                       ; Preds .B12.14 .B12.15
  91.         inc       r14                                           ;1655.25
  92.         cmp       rdx, r14                                      ;1730.49
  93.         jle       .B12.18       ; Prob 62%                      ;1730.49
  94.                                
  95. .B12.17::                       ; Preds .B12.16
  96.         add       rax, 2                                        ;1731.39
  97.         mov       QWORD PTR [24+rsp+rax*8], r14                 ;1732.17
  98.         mov       QWORD PTR [32+rsp+rax*8], rdx                 ;1733.17
  99.                                
  100. .B12.18::                       ; Preds .B12.17 .B12.16
  101.         lea       rdx, QWORD PTR [-1+rbp]                       ;1654.25
  102.         cmp       r13, rdx                                      ;1197.47
  103.         jl        .B12.4        ; Prob 82%                      ;1197.47
  104.                                
  105. .B12.20::                       ; Preds .B12.18 .B12.2
  106.         DB        15                                            ;1743.28
  107.         DB        31                                            ;1743.28
  108.         DB        64                                            ;1743.28
  109.         DB        0                                             ;1743.28
  110.         DB        15                                            ;1743.28
  111.         DB        31                                            ;1743.28
  112.         DB        128                                           ;1743.28
  113.         DB        0                                             ;1743.28
  114.         DB        0                                             ;1743.28
  115.         DB        0                                             ;1743.28
  116.         DB        0                                             ;1743.28
  117.         test      rax, rax                                      ;1743.28
  118.         jne       .B12.2        ; Prob 99%                      ;1743.28
  119.                                
  120. .B12.21::                       ; Preds .B12.20
  121.         add       rsp, 800024                                   ;1753.1
  122.         pop       rbp                                           ;1753.1
  123.         pop       r15                                           ;1753.1
  124.         pop       r14                                           ;1753.1
  125.         pop       r13                                           ;1753.1
  126.         ret                                                     ;1753.1
  127.         ALIGN     16
  128.                                
  129. .B12.22::
  130. ; mark_end;
  131. Quicksort_QB64_v9 ENDP
  132. */

In order gladly to say QB64 stands for "Quickness" once again, my wish is QB64 community to have the qsm.h header (built-in function to QB64 library) outperforming both qsort() C implementations of Intel and GNU, enjoy.
* Quicksort_says_rev9.7z (Filesize: 17.12 MB, Downloads: 232)
« Last Edit: February 22, 2022, 04:17:20 am by Sanmayce »
He learns not to learn and reverts to what all men pass by.

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #68 on: March 08, 2022, 05:10:05 pm »
A week ago, Scandum released a killer, the current BEST Quicksort, named Crumsort, being the FASTEST unstable, in-place, Quicksort in my tests.

As I see it, he did dive deep in sorting, achieving outstanding results.

Yet, intuitively, I see different approach, which potentially can reach and even surpass his awesome speeds, to be seen...

Currently, the picture of hi-speed sorting is this (in order to reproduce the results, the benchmark package is attached):

Code: [Select]
// Test run: 2022-Mar-08:
// Laptop "Compressionette", Intel 'Kaby Lake' i5-7200U 3.1GHz max turbo, 36GB DDR4 2133MHz:
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
// | Performer/Keys     | #1, FEW distinct          | #2, MANY distinct         | #3, MANYmore distinct     | #4, ALL distinct          | #5, ALLmore distinct       | #6, ALLmax distinct         |
// +--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
// |  Operating System, | Windows 10, | Fedora 35,  | Windows 10, | Fedora 35,  | Windows 10, | Fedora 35,  | Windows 10, | Fedora 35,  | Windows 10, | Fedora 35,   | Windows 10,  | Fedora 35,   |
// |      Compiler, -O3 | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1   | Intel v15.0  | GCC 11.2.1   |
// +--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
// | qsort              |  59 seconds | 377 seconds | 336 seconds | 541 seconds | 157 seconds | 195 seconds | 435 seconds | 534 seconds | 851 seconds | 1036 seconds |         N.A. |         N.A. |
// | Magnetica v.13     |  31 seconds |  30 seconds | 202 seconds | 196 seconds |  88 seconds |  85 seconds | 259 seconds | 250 seconds | 506 seconds |  493 seconds |         N.A. |         N.A. |
// | Bentley-McIlroy    |  38 seconds |  36 seconds | 205 seconds | 208 seconds |  92 seconds |  94 seconds | 279 seconds | 281 seconds | 544 seconds |  553 seconds |         N.A. |         N.A. |
// | Crumsort           |  30 seconds |  32 seconds | 132 seconds | 150 seconds |  64 seconds |  70 seconds | 184 seconds | 192 seconds | 357 seconds |  376 seconds |         N.A. |         N.A. |
// +--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
// | Best Time (bare    |                           |                           |                           |                           |                            |                             |
// | bone in-place QS): |  30s PARITY               | 132s for Crumsort         |  64s for Crumsort         | 184s for Crumsort         | 357s for Crumsort          | N.A.                        |
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
Code: [Select]
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
// | Performer/Keys     | #1, FEW distinct          | #2, MANY distinct         | #3, MANYmore distinct     | #4, ALL distinct          | #5, ALLmore distinct       | #6, ALLmax distinct         |
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
// |  Operating System, | Fedora 35, GCC 11.2.1     | Fedora 35, GCC 11.2.1     | Fedora 35, GCC 11.2.1     | Fedora 35, GCC 11.2.1     | Fedora 35, GCC 11.2.1      | Fedora 35, GCC 11.2.1       |
// |      Compiler, -O3 | instructions; IPC         | instructions; IPC         | instructions; IPC         | instructions; IPC         | instructions; IPC          | instructions; IPC           |
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
// | qsort              |   3,302,993,934,921; 2.75 |   2,983,579,082,155; 1.75 |     886,263,153,476; 1.41 |   2,352,769,705,563; 1.38 |    4,527,367,288,814; 1.39 |                        N.A. |
// | Magnetica v.13     |     131,873,917,282; 1.00 |     658,478,895,600; 1.02 |     309,149,594,748; 1.08 |     884,297,729,161; 1.06 |    1,726,931,029,634; 1.08 |                        N.A. |
// | Bentley-McIlroy    |     164,915,835,204; 1.09 |     681,956,155,364; 1.00 |     322,584,009,352; 1.04 |     944,038,959,690; 1.02 |    1,719,825,062,847; 0.97 |                        N.A. |
// | Crumsort           |     312,328,497,447; 2.29 |   1,295,551,817,497; 2.57 |     603,276,911,007; 2.53 |   1,597,685,291,532; 2.46 |    3,091,001,982,856; 2.51 |                        N.A. |
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+

Code: [Select]
// Speed Roster, (the base speed 1.00x is GLIBC's qsort):
// Rank #1: 2683/820=  3.27x =  32+150+ 70+192+ 376=  820 seconds for Crumsort
// Rank #2: 2683/1054= 2.54x =  30+196+ 85+250+ 493= 1054 seconds for Magnetica v.13
// Rank #3: 2683/1172= 2.28x =  36+208+ 94+281+ 553= 1172 seconds for Bentley-McIlroy
// Rank #4: 2683/2683= 1.00x = 377+541+195+534+1036= 2683 seconds for GLIBC's qsort
//
// Legend (The time is exactly the Sort process time):
// #1,FEW = 2,233,861,800 keys, of them distinct = 10; 178,708,944 bytes 22338618_QWORDS.bin; elements = 178,708,944/8 *100; // Keys are 100 times duplicated
// #2,MANY = 2,482,300,900 keys, of them distinct = 2,847,531; 24,823,016 bytes mobythesaurus.txt; elements = 24823016 -8+1; // BuildingBlocks are size-order+1, they are 100 times duplicated
// #3,MANYmore = 1,137,582,073 keys, of them distinct = 77,275,994; 1,137,582,080 bytes linux-5.15.25.tar; elements = 1137582080 -8+1; // BuildingBlocks are size-order+1
// #4,ALL = 2,009,333,753 keys, of them distinct = 1,912,608,132; 2,009,333,760 bytes Fedora-Workstation-Live-x86_64-35-1.2.iso; elements = 2009333760 -8+1; // BuildingBlocks are size-order+1
// #5,ALLmore = 3,803,483,825 keys, of them distinct = 3,346,259,533; 3,803,483,832 bytes Fedora-Workstation-35-1.2.aarch64.raw.xz; elements = 3803483832 -8+1; // BuildingBlocks are size-order+1
// #6,ALLmax = 7,798,235,435 keys, of them distinct = 6,770,144,405; 7,798,235,442 bytes math.stackexchange.com_en_all_2019-02.zim; elements = 7798235442 -8+1; // BuildingBlocks are size-order+1
Code: [Select]
// Notes:
// - Scandum's Crumsort is the FASTEST in-place sorter, known to me, hail Scandum!
// - All the runs were in "Current priority class is REALTIME_PRIORITY_CLASS" for Windows and "Current priority is -20." for Linux;
// To see more stats (the tables were deriving from) see 'log_i5-7200U_MAR08.txt';
// - Benchmark needs 32GB RAM, and 64GB for the 6th testset;
// - The whole package (except the 3rd, 4th, 5th and 6th datasets) is downloadable at:
//   www.sanmayce.com/QS_showdown_r13.zip
// - To reproduce the roster, run on Windows or Linux:
//   - BENCH_ICL32GB.BAT
//   - BENCH_ICL64GB.BAT
//   - sh bench_gcc32GB.sh
//   - sh bench_gcc64GB.sh
// - 3rd dataset is downloadable at:
// https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.15.25.tar.xz
// - 4th dataset is downloadable at:
// https://download.fedoraproject.org/pub/fedora/linux/releases/35/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-35-1.2.iso
// - 5th dataset is downloadable at:
// https://download.fedoraproject.org/pub/fedora/linux/releases/35/Workstation/aarch64/images/Fedora-Workstation-35-1.2.aarch64.raw.xz
// - 6th dataset is downloadable at:
// https://download.kiwix.org/zim/stack_exchange/math.stackexchange.com_en_all_2019-02.zim
// - Managed to reduce the mainloop down to 68 bytes (from 71 in r.12), the gain comes from switching to pointers, had to do that long time ago, wanted to keep the etude ARRAY-syntax friendly. Now, r.12 uses arrays, whereas r.13 uses pointers. Another turning point, GCC now beats ICL in all tests.

The current best is at:
https://github.com/scandum/crumsort

Here comes Magnetica v13, a view from the kitchen, this time around, GCC v11.2.1 proves superior to ICL v15.0, generating faster code in all tests:
Code: [Select]
// And the Assembly code generated by GCC 11.2.1 for Magnetica_v13:
// 001 Quicksort_QB64_v13:
// 002 .LFB214:
// 003 .cfi_startproc
// 004 pushq %r15
// 005 .cfi_def_cfa_offset 16
// 006 .cfi_offset 15, -16
// 007 leaq -8(%rdi,%rsi,8), %r10
// 008 movq %rdi, %xmm0
// 009 movq %rdi, %rax
// 010 pushq %r14
// 011 .cfi_def_cfa_offset 24
// 012 .cfi_offset 14, -24
// 013 movq %r10, %xmm2
// 014 movl $2, %r15d
// 015 pushq %r13
// 016 .cfi_def_cfa_offset 32
// 017 .cfi_offset 13, -32
// 018 punpcklqdq %xmm2, %xmm0
// 019 pushq %r12
// 020 .cfi_def_cfa_offset 40
// 021 .cfi_offset 12, -40
// 022 pushq %rbp
// 023 .cfi_def_cfa_offset 48
// 024 .cfi_offset 6, -48
// 025 pushq %rbx
// 026 .cfi_def_cfa_offset 56
// 027 .cfi_offset 3, -56
// 028 subq $143920, %rsp
// 029 .cfi_def_cfa_offset 143976
// 030 movups %xmm0, -96(%rsp)
// 031 .p2align 4,,10
// 032 .p2align 3
// 033 .L5115:
// 034 movq %r10, %rcx
// 035 subq $2, %r15
// 036 subq %rax, %rcx
// 037 testq %rcx, %rcx
// 038 jle .L5096
// 039 leaq 8(%rax), %rbx
// 040 .L5113:
// 041 movq %rcx, %rsi
// 042 movq (%rax), %rdx
// 043 sarq $3, %rsi
// 044 cmpq $55, %rcx
// 045 ja .L5097
// 046 jmp *.L5099(,%rsi,8)
// 047 .section .rodata
// 048 .align 8
// 049 .align 4
// 050 .L5099:
// 051 .quad .L5097
// 052 .quad .L5104
// 053 .quad .L5103
// 054 .quad .L5102
// 055 .quad .L5101
// 056 .quad .L5100
// 057 .quad .L5098
// 058 .text
// 059 .p2align 4,,10
// 060 .p2align 3
// 061 .L5098:
// 062 movq 8(%rax), %rbp
// 063 movq 16(%rax), %rbx
// 064 xorl %ecx, %ecx
// 065 movq 24(%rax), %r11
// 066 movq 32(%rax), %r10
// 067 cmpq %rdx, %rbp
// 068 movq 40(%rax), %r9
// 069 movq 48(%rax), %r12
// 070 setb %cl
// 071 cmpq %rdx, %rbx
// 072 adcl $0, %ecx
// 073 cmpq %rdx, %r11
// 074 adcl $0, %ecx
// 075 cmpq %rdx, %r10
// 076 adcl $0, %ecx
// 077 cmpq %rdx, %r9
// 078 adcl $0, %ecx
// 079 cmpq %rdx, %r12
// 080 adcl $0, %ecx
// 081 xorl %r8d, %r8d
// 082 cmpq %rdx, %rbp
// 083 movl %ecx, -108(%rsp)
// 084 setnb %r8b
// 085 xorl %ecx, %ecx
// 086 cmpq %rbx, %rbp
// 087 seta %cl
// 088 addl %ecx, %r8d
// 089 xorl %ecx, %ecx
// 090 cmpq %r11, %rbp
// 091 seta %cl
// 092 addl %ecx, %r8d
// 093 xorl %ecx, %ecx
// 094 cmpq %r10, %rbp
// 095 seta %cl
// 096 addl %ecx, %r8d
// 097 xorl %ecx, %ecx
// 098 cmpq %r9, %rbp
// 099 seta %cl
// 100 addl %ecx, %r8d
// 101 xorl %ecx, %ecx
// 102 cmpq %r12, %rbp
// 103 seta %cl
// 104 xorl %edi, %edi
// 105 addl %ecx, %r8d
// 106 cmpq %rdx, %rbx
// 107 setnb %dil
// 108 xorl %ecx, %ecx
// 109 cmpq %rbx, %rbp
// 110 setbe %cl
// 111 addl %ecx, %edi
// 112 xorl %ecx, %ecx
// 113 cmpq %r11, %rbx
// 114 seta %cl
// 115 addl %ecx, %edi
// 116 xorl %ecx, %ecx
// 117 cmpq %r10, %rbx
// 118 seta %cl
// 119 addl %ecx, %edi
// 120 xorl %ecx, %ecx
// 121 cmpq %r9, %rbx
// 122 seta %cl
// 123 addl %ecx, %edi
// 124 xorl %ecx, %ecx
// 125 cmpq %r12, %rbx
// 126 seta %cl
// 127 xorl %esi, %esi
// 128 addl %ecx, %edi
// 129 cmpq %rdx, %r11
// 130 setnb %sil
// 131 xorl %ecx, %ecx
// 132 cmpq %r11, %rbp
// 133 setbe %cl
// 134 addl %ecx, %esi
// 135 xorl %ecx, %ecx
// 136 cmpq %r11, %rbx
// 137 setbe %cl
// 138 addl %ecx, %esi
// 139 xorl %ecx, %ecx
// 140 cmpq %r10, %r11
// 141 seta %cl
// 142 addl %ecx, %esi
// 143 xorl %ecx, %ecx
// 144 cmpq %r9, %r11
// 145 seta %cl
// 146 addl %ecx, %esi
// 147 xorl %ecx, %ecx
// 148 cmpq %r12, %r11
// 149 seta %cl
// 150 addl %ecx, %esi
// 151 xorl %ecx, %ecx
// 152 cmpq %rdx, %r10
// 153 setnb %cl
// 154 xorl %r13d, %r13d
// 155 cmpq %r10, %rbp
// 156 setbe %r13b
// 157 addl %r13d, %ecx
// 158 xorl %r13d, %r13d
// 159 cmpq %r10, %rbx
// 160 setbe %r13b
// 161 addl %r13d, %ecx
// 162 xorl %r13d, %r13d
// 163 cmpq %r10, %r11
// 164 setbe %r13b
// 165 addl %r13d, %ecx
// 166 xorl %r13d, %r13d
// 167 cmpq %r9, %r10
// 168 seta %r13b
// 169 addl %r13d, %ecx
// 170 xorl %r13d, %r13d
// 171 cmpq %r12, %r10
// 172 seta %r13b
// 173 xorl %r14d, %r14d
// 174 addl %r13d, %ecx
// 175 cmpq %rdx, %r9
// 176 setnb %r14b
// 177 xorl %r13d, %r13d
// 178 cmpq %r9, %rbp
// 179 setbe %r13b
// 180 addl %r14d, %r13d
// 181 xorl %r14d, %r14d
// 182 cmpq %r9, %rbx
// 183 setbe %r14b
// 184 addl %r13d, %r14d
// 185 xorl %r13d, %r13d
// 186 cmpq %r9, %r11
// 187 setbe %r13b
// 188 addl %r14d, %r13d
// 189 xorl %r14d, %r14d
// 190 cmpq %r9, %r10
// 191 setbe %r14b
// 192 addl %r13d, %r14d
// 193 xorl %r13d, %r13d
// 194 cmpq %r12, %r9
// 195 seta %r13b
// 196 addl %r14d, %r13d
// 197 movslq -108(%rsp), %r14
// 198 movq %rdx, (%rax,%r14,8)
// 199 movslq %r8d, %rdx
// 200 movl -108(%rsp), %r14d
// 201 movq %rbp, (%rax,%rdx,8)
// 202 movslq %edi, %rdx
// 203 movq %rbx, (%rax,%rdx,8)
// 204 movslq %esi, %rdx
// 205 movq %r11, (%rax,%rdx,8)
// 206 movslq %ecx, %rdx
// 207 movq %r10, (%rax,%rdx,8)
// 208 movslq %r13d, %rdx
// 209 movq %r9, (%rax,%rdx,8)
// 210 leal (%r14,%r8), %edx
// 211 addl %edi, %edx
// 212 addl %esi, %edx
// 213 addl %ecx, %edx
// 214 movl $21, %ecx
// 215 addl %r13d, %edx
// 216 subl %edx, %ecx
// 217 movslq %ecx, %rdx
// 218 movq %r12, (%rax,%rdx,8)
// 219 .p2align 4,,10
// 220 .p2align 3
// 221 .L5096:
// 222 testq %r15, %r15
// 223 je .L5095
// 224 .L5121:
// 225 movq -104(%rsp,%r15,8), %r10
// 226 movq -112(%rsp,%r15,8), %rax
// 227 jmp .L5115
// 228 .p2align 4,,10
// 229 .p2align 3
// 230 .L5103:
// 231 movq 8(%rax), %rdi
// 232 movq 16(%rax), %r8
// 233 xorl %ecx, %ecx
// 234 cmpq %rdx, %rdi
// 235 setb %cl
// 236 cmpq %rdx, %r8
// 237 adcl $0, %ecx
// 238 xorl %esi, %esi
// 239 cmpq %rdx, %rdi
// 240 setnb %sil
// 241 xorl %r9d, %r9d
// 242 cmpq %r8, %rdi
// 243 seta %r9b
// 244 addl %r9d, %esi
// 245 movslq %ecx, %r9
// 246 movq %rdx, (%rax,%r9,8)
// 247 movslq %esi, %rdx
// 248 addl %esi, %ecx
// 249 movq %rdi, (%rax,%rdx,8)
// 250 movl $3, %edx
// 251 subl %ecx, %edx
// 252 movslq %edx, %rdx
// 253 movq %r8, (%rax,%rdx,8)
// 254 testq %r15, %r15
// 255 jne .L5121
// 256 .L5095:
// 257 addq $143920, %rsp
// 258 .cfi_remember_state
// 259 .cfi_def_cfa_offset 56
// 260 popq %rbx
// 261 .cfi_def_cfa_offset 48
// 262 popq %rbp
// 263 .cfi_def_cfa_offset 40
// 264 popq %r12
// 265 .cfi_def_cfa_offset 32
// 266 popq %r13
// 267 .cfi_def_cfa_offset 24
// 268 popq %r14
// 269 .cfi_def_cfa_offset 16
// 270 popq %r15
// 271 .cfi_def_cfa_offset 8
// 272 ret
// 273 .p2align 4,,10
// 274 .p2align 3
// 275 .L5104:
// 276 .cfi_restore_state
// 277 movq 8(%rax), %rcx
// 278 cmpq %rdx, %rcx
// 279 sbbq %rsi, %rsi
// 280 andl $8, %esi
// 281 cmpq %rdx, %rcx
// 282 movq %rdx, (%rax,%rsi)
// 283 movl $1, %edx
// 284 sbbl $0, %edx
// 285 movslq %edx, %rdx
// 286 movq %rcx, (%rax,%rdx,8)
// 287 jmp .L5096
// 288 .p2align 4,,10
// 289 .p2align 3
// 290 .L5101:
// 291 movq 8(%rax), %r10
// 292 movq 16(%rax), %r9
// 293 xorl %ecx, %ecx
// 294 movq 24(%rax), %r8
// 295 movq 32(%rax), %rdi
// 296 cmpq %rdx, %r10
// 297 setb %cl
// 298 cmpq %rdx, %r9
// 299 adcl $0, %ecx
// 300 cmpq %rdx, %r8
// 301 adcl $0, %ecx
// 302 cmpq %rdx, %rdi
// 303 adcl $0, %ecx
// 304 xorl %esi, %esi
// 305 cmpq %rdx, %r10
// 306 setnb %sil
// 307 xorl %r11d, %r11d
// 308 cmpq %r9, %r10
// 309 seta %r11b
// 310 addl %r11d, %esi
// 311 xorl %r11d, %r11d
// 312 cmpq %r8, %r10
// 313 seta %r11b
// 314 addl %r11d, %esi
// 315 xorl %r11d, %r11d
// 316 cmpq %rdi, %r10
// 317 seta %r11b
// 318 addl %esi, %r11d
// 319 xorl %esi, %esi
// 320 cmpq %rdx, %r9
// 321 setnb %sil
// 322 xorl %ebx, %ebx
// 323 cmpq %r9, %r10
// 324 setbe %bl
// 325 addl %ebx, %esi
// 326 xorl %ebx, %ebx
// 327 cmpq %r8, %r9
// 328 seta %bl
// 329 addl %ebx, %esi
// 330 xorl %ebx, %ebx
// 331 cmpq %rdi, %r9
// 332 seta %bl
// 333 addl %esi, %ebx
// 334 xorl %esi, %esi
// 335 cmpq %rdx, %r8
// 336 setnb %sil
// 337 xorl %ebp, %ebp
// 338 cmpq %r8, %r10
// 339 setbe %bpl
// 340 addl %ebp, %esi
// 341 xorl %ebp, %ebp
// 342 cmpq %r8, %r9
// 343 setbe %bpl
// 344 addl %ebp, %esi
// 345 xorl %ebp, %ebp
// 346 cmpq %rdi, %r8
// 347 seta %bpl
// 348 addl %ebp, %esi
// 349 movslq %ecx, %rbp
// 350 addl %r11d, %ecx
// 351 movq %rdx, (%rax,%rbp,8)
// 352 movslq %r11d, %rdx
// 353 addl %ebx, %ecx
// 354 movq %r10, (%rax,%rdx,8)
// 355 movslq %ebx, %rdx
// 356 addl %esi, %ecx
// 357 movq %r9, (%rax,%rdx,8)
// 358 movslq %esi, %rdx
// 359 movq %r8, (%rax,%rdx,8)
// 360 movl $10, %edx
// 361 subl %ecx, %edx
// 362 movslq %edx, %rdx
// 363 movq %rdi, (%rax,%rdx,8)
// 364 jmp .L5096
// 365 .p2align 4,,10
// 366 .p2align 3
// 367 .L5102:
// 368 movq 8(%rax), %r10
// 369 movq 16(%rax), %r9
// 370 xorl %ecx, %ecx
// 371 movq 24(%rax), %r8
// 372 cmpq %rdx, %r10
// 373 setb %cl
// 374 cmpq %rdx, %r9
// 375 adcl $0, %ecx
// 376 cmpq %rdx, %r8
// 377 adcl $0, %ecx
// 378 xorl %edi, %edi
// 379 cmpq %rdx, %r10
// 380 setnb %dil
// 381 xorl %esi, %esi
// 382 cmpq %r9, %r10
// 383 seta %sil
// 384 addl %esi, %edi
// 385 xorl %esi, %esi
// 386 cmpq %r8, %r10
// 387 seta %sil
// 388 addl %esi, %edi
// 389 xorl %esi, %esi
// 390 cmpq %rdx, %r9
// 391 setnb %sil
// 392 xorl %r11d, %r11d
// 393 cmpq %r9, %r10
// 394 setbe %r11b
// 395 addl %r11d, %esi
// 396 xorl %r11d, %r11d
// 397 cmpq %r8, %r9
// 398 seta %r11b
// 399 addl %r11d, %esi
// 400 movslq %ecx, %r11
// 401 addl %edi, %ecx
// 402 movq %rdx, (%rax,%r11,8)
// 403 movslq %edi, %rdx
// 404 addl %esi, %ecx
// 405 movq %r10, (%rax,%rdx,8)
// 406 movslq %esi, %rdx
// 407 movq %r9, (%rax,%rdx,8)
// 408 movl $6, %edx
// 409 subl %ecx, %edx
// 410 movslq %edx, %rdx
// 411 movq %r8, (%rax,%rdx,8)
// 412 jmp .L5096
// 413 .p2align 4,,10
// 414 .p2align 3
// 415 .L5100:
// 416 movq 8(%rax), %r12
// 417 movq 16(%rax), %rbp
// 418 xorl %ecx, %ecx
// 419 movq 24(%rax), %r11
// 420 movq 32(%rax), %r10
// 421 cmpq %rdx, %r12
// 422 movq 40(%rax), %rbx
// 423 setb %cl
// 424 cmpq %rdx, %rbp
// 425 adcl $0, %ecx
// 426 cmpq %rdx, %r11
// 427 adcl $0, %ecx
// 428 cmpq %rdx, %r10
// 429 adcl $0, %ecx
// 430 cmpq %rdx, %rbx
// 431 adcl $0, %ecx
// 432 xorl %r9d, %r9d
// 433 cmpq %rdx, %r12
// 434 setnb %r9b
// 435 xorl %esi, %esi
// 436 cmpq %rbp, %r12
// 437 seta %sil
// 438 addl %esi, %r9d
// 439 xorl %esi, %esi
// 440 cmpq %r11, %r12
// 441 seta %sil
// 442 addl %esi, %r9d
// 443 xorl %esi, %esi
// 444 cmpq %r10, %r12
// 445 seta %sil
// 446 addl %esi, %r9d
// 447 xorl %esi, %esi
// 448 cmpq %rbx, %r12
// 449 seta %sil
// 450 xorl %r8d, %r8d
// 451 addl %esi, %r9d
// 452 cmpq %rdx, %rbp
// 453 setnb %r8b
// 454 xorl %esi, %esi
// 455 cmpq %rbp, %r12
// 456 setbe %sil
// 457 addl %esi, %r8d
// 458 xorl %esi, %esi
// 459 cmpq %r11, %rbp
// 460 seta %sil
// 461 addl %esi, %r8d
// 462 xorl %esi, %esi
// 463 cmpq %r10, %rbp
// 464 seta %sil
// 465 addl %esi, %r8d
// 466 xorl %esi, %esi
// 467 cmpq %rbx, %rbp
// 468 seta %sil
// 469 xorl %edi, %edi
// 470 addl %esi, %r8d
// 471 cmpq %rdx, %r11
// 472 setnb %dil
// 473 xorl %esi, %esi
// 474 cmpq %r11, %r12
// 475 setbe %sil
// 476 addl %esi, %edi
// 477 xorl %esi, %esi
// 478 cmpq %r11, %rbp
// 479 setbe %sil
// 480 addl %esi, %edi
// 481 xorl %esi, %esi
// 482 cmpq %r10, %r11
// 483 seta %sil
// 484 addl %esi, %edi
// 485 xorl %esi, %esi
// 486 cmpq %rbx, %r11
// 487 seta %sil
// 488 addl %esi, %edi
// 489 xorl %esi, %esi
// 490 cmpq %rdx, %r10
// 491 setnb %sil
// 492 xorl %r13d, %r13d
// 493 cmpq %r10, %r12
// 494 setbe %r13b
// 495 addl %r13d, %esi
// 496 xorl %r13d, %r13d
// 497 cmpq %r10, %rbp
// 498 setbe %r13b
// 499 addl %r13d, %esi
// 500 xorl %r13d, %r13d
// 501 cmpq %r10, %r11
// 502 setbe %r13b
// 503 addl %r13d, %esi
// 504 xorl %r13d, %r13d
// 505 cmpq %rbx, %r10
// 506 seta %r13b
// 507 addl %r13d, %esi
// 508 movslq %ecx, %r13
// 509 movq %rdx, (%rax,%r13,8)
// 510 movslq %r9d, %rdx
// 511 movq %r12, (%rax,%rdx,8)
// 512 movslq %r8d, %rdx
// 513 movq %rbp, (%rax,%rdx,8)
// 514 movslq %edi, %rdx
// 515 movq %r11, (%rax,%rdx,8)
// 516 movslq %esi, %rdx
// 517 movq %r10, (%rax,%rdx,8)
// 518 leal (%rcx,%r9), %edx
// 519 movl $15, %ecx
// 520 addl %r8d, %edx
// 521 addl %edi, %edx
// 522 addl %esi, %edx
// 523 subl %edx, %ecx
// 524 movslq %ecx, %rdx
// 525 movq %rbx, (%rax,%rdx,8)
// 526 jmp .L5096
// 527 .L5097:
// 528 sarq $2, %rsi
// 529 leaq (%rax,%rsi,8), %rcx
// 530 movq (%rcx), %r8
// 531 movq %rdx, (%rcx)
// 532 movq %r8, (%rax)
// 533 cmpq %rax, %r10
// 534 jbe .L5105
// 535 movq %rax, %rdx
// 536 movq %rax, %r9
// 537 movq %r10, %rcx
// 538 jmp .L5111
// 539 .p2align 4,,10
// 540 .p2align 3
// 541 .L5122:
// 542 movq (%r9), %rdi
// 543 movq %rsi, (%r9)
// 544 leaq 16(%rdx), %rsi
// 545 addq $8, %r9
// 546 movq %rdi, 8(%rdx)
// 547 movq %r11, %rdx
// 548 .L5107:
// 549 cmpq %rdx, %rcx
// 550 jbe .L5112
// 551 .L5111:
// 552 movq 8(%rdx), %rsi
// 553 leaq 8(%rdx), %r11
// 554 cmpq %r8, %rsi
// 555 jb .L5122
// 556 ja .L5120
// 557 leaq 16(%rdx), %rsi
// 558 movq %r11, %rdx
// 559 cmpq %rdx, %rcx
// 560 ja .L5111
// 561 .L5112:
// 562 movq 8(%rdx), %rdi
// 563 movq %rsi, %xmm0
// 564 movq %rcx, %rsi
// 565 movq %r10, %xmm1
// 566 subq %rdx, %rsi
// 567 leaq -8(%r9), %r10
// 568 punpcklqdq %xmm1, %xmm0
// 569 addq $2, %r15
// 570 movq %rdi, %r8
// 571 sarq $63, %rsi
// 572 subq 8(%rcx), %r8
// 573 movups %xmm0, -112(%rsp,%r15,8)
// 574 andq %r8, %rsi
// 575 subq %rsi, %rdi
// 576 movq %rdi, 8(%rdx)
// 577 addq %rsi, 8(%rcx)
// 578 movq %r10, %rcx
// 579 subq %rax, %rcx
// 580 testq %rcx, %rcx
// 581 jg .L5113
// 582 jmp .L5096
// 583 .p2align 4,,10
// 584 .p2align 3
// 585 .L5110:
// 586 subq $8, %rcx
// 587 .L5120:
// 588 movq (%rcx), %rdi
// 589 cmpq %r8, %rdi
// 590 ja .L5110
// 591 movq %rdi, 8(%rdx)
// 592 subq $8, %rcx
// 593 movq %rsi, 8(%rcx)
// 594 movq %r11, %rsi
// 595 jmp .L5107
// 596 .p2align 4,,10
// 597 .p2align 3
// 598 .L5105:
// 599 movq %rbx, %rsi
// 600 movq %rax, %rdx
// 601 movq %rax, %r9
// 602 movq %r10, %rcx
// 603 jmp .L5112
// 604 .cfi_endproc

Tomorrow night, will run the GCC and ICL executables on Windows 10, the testmachine will be 'Brutalitto' - the mini-monster with Zen 2 4800H and 64GB DDR4 - running the heaviest sort benchmark on Internet - 7+ billion QWORDS...

Add-on:
Code: [Select]
Test run: 2022-Mar-09:
Laptop "Brutalitto", AMD 'Renoir' 4800H 4.3GHz max turbo, 64GB DDR4 3200MHz:
+--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
| Performer/Keys     | #1, FEW distinct          | #2, MANY distinct         | #3, MANYmore distinct     | #4, ALL distinct          | #5, ALLmore distinct       | #6, ALLmax distinct         |
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
|  Operating System, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10,  | Windows 10,  | Windows 10,  |
|      Compiler, -O3 | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1   | Intel v15.0  | GCC 11.2.1   |
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
| qsort              |  42 seconds |  45 seconds | 242 seconds | 280 seconds | 113 seconds | 131 seconds | 319 seconds | 354 seconds | 620 seconds |  695 seconds | 1282 seconds | 1438 seconds |
| Magnetica v.13     |  22 seconds |  21 seconds | 135 seconds | 134 seconds |  60 seconds |  59 seconds | 177 seconds | 177 seconds | 350 seconds |  349 seconds |  724 seconds |  722 seconds |
| Bentley-McIlroy    |  24 seconds |  24 seconds | 146 seconds | 142 seconds |  66 seconds |  64 seconds | 200 seconds | 193 seconds | 391 seconds |  376 seconds |  850 seconds |  803 seconds |
| Crumsort           |  20 seconds |  19 seconds |  91 seconds |  81 seconds |  44 seconds |  38 seconds | 126 seconds | 109 seconds | 246 seconds |  211 seconds |  567 seconds |  479 seconds |
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
| Best Time (bare    |                           |                           |                           |                           |                            |                             |
| bone in-place QS): |  19s for Crumsort         |  81s for Crumsort         |  38s for Crumsort         | 109s for Crumsort         | 211s for Crumsort          | 479s for Crumsort           |
+--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+

Code: [Select]
Speed Roster, (the base speed 1.00x is GLIBC's qsort):
Rank #1: 2943/937=  3.14x = 19+ 81+ 38+109+211+ 479=  937 seconds for Crumsort
Rank #2: 2943/1462= 2.01x = 21+134+ 59+177+349+ 722= 1462 seconds for Magnetica v.13
Rank #3: 2943/1602= 1.83x = 24+142+ 64+193+376+ 803= 1602 seconds for Bentley-McIlroy
Rank #4: 2943/2943= 1.00x = 45+280+131+354+695+1438= 2943 seconds for GLIBC's qsort
* QS_showdown_r13.7z (Filesize: 14.95 MB, Downloads: 237)
« Last Edit: March 10, 2022, 12:46:29 am by Sanmayce »
He learns not to learn and reverts to what all men pass by.

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #69 on: March 13, 2022, 08:54:21 pm »
A new showdown is ready - Scandum's crumsort v1.1.5.3 vs Magnetica r.14:

Test run: 2022-Mar-14:
Laptop "Compressionette", Intel 'Kaby Lake' i5-7200U 3.1GHz max turbo, 36GB DDR4 2133MHz:
Code: [Select]
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| Performer/Keys     | #1, FEW distinct        | #2, MANY distinct        | #3, MANYmore distinct    | #4, ALL distinct          | #5, ALLmore distinct      | #6, ALLmax distinct         |
+--------------------+-------------------------+--------------------------+--------------------------+-------------+-------------+---------------------------+--------------+--------------+
|  Operating System, |   Fedora 35, GCC 11.2.1 |    Fedora 35, GCC 11.2.1 |    Fedora 35, GCC 11.2.1 |     Fedora 35, GCC 11.2.1 |     Fedora 35, GCC 11.2.1 |       Fedora 35, GCC 11.2.1 |
|      Compiler, -O3 |       instructions; IPC |        instructions; IPC |        instructions; IPC |         instructions; IPC |         instructions; IPC |           instructions; IPC |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| qsort              |         385/342 seconds |          527/234 seconds |           165/55 seconds |           516/128 seconds |           999/252 seconds |                        N.A. |
|                    | 6,448,450,744,497; 2.82 |  4,921,980,445,033; 2.06 |  1,369,822,972,984; 1.94 |   3,250,596,357,540; 1.58 |   6,250,299,799,611; 1.59 |                        N.A. |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| Magnetica v.14     |            29/6 seconds |           198/47 seconds |            86/22 seconds |            241/66 seconds |           472/134 seconds |                        N.A. |
|                    |   171,452,642,615; 1.14 |    936,751,248,598; 1.17 |    443,844,553,192; 1.25 |   1,291,624,832,269; 1.28 |   2,503,942,225,085; 1.28 |                        N.A. |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| Bentley-McIlroy    |           38/13 seconds |           223/47 seconds |           100/26 seconds |            304/65 seconds |           597/139 seconds |                        N.A. |
|                    |   246,587,334,436; 1.23 |  1,105,253,285,645; 1.25 |    507,140,963,617; 1.23 |   1,460,132,564,676; 1.22 |   2,848,210,143,410; 1.22 |                        N.A. |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| Crumsort 1.1.5.3   |            29/4 seconds |            129/4 seconds |             61/2 seconds |            173/3  seconds |             339/7 seconds |                        N.A. |
|                    |   351,332,884,902; 2.43 |  1,284,147,329,900; 2.80 |    603,065,127,518; 2.77 |   1,605,371,408,284; 2.64 |   3,102,283,088,926; 2.71 |                        N.A. |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+

Legend (The time is exactly the Sort process time, first value is for unsorted, second one is for sorted).
* QS_showdown_r14.7z (Filesize: 11.34 MB, Downloads: 235)
« Last Edit: March 13, 2022, 09:14:09 pm by Sanmayce »
He learns not to learn and reverts to what all men pass by.

Offline mdijkens

  • Newbie
  • Posts: 34
    • View Profile
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #70 on: March 14, 2022, 04:45:19 am »
Very interesting thread! Thank you!

I am regularly working with really big files (some over to 100GB) and have been optimizing 2 functions to read/process big files very fast.

I thought I'd add them here for information:

Code: QB64: [Select]
  1. Function readBigFile~&& (file$) '1.5GB/sec
  2.   Const BLOCKSIZE = 4194304 '=64*65536 = 4 MB
  3.   fsize~&& = fileSize(file$)
  4.   Dim mem As _MEM: mem = _MemNew(fsize~&& + BLOCKSIZE)
  5.   Dim block As String * BLOCKSIZE '=64*65536
  6.   filenum% = FreeFile
  7.   Open file$ For Random Access Read As filenum% Len = BLOCKSIZE
  8.   blocks~& = .5 + (fsize~&& / BLOCKSIZE)
  9.   For blck~& = 1 To blocks~&
  10.     Get filenum%, , block
  11.     _MemPut mem, mem.OFFSET + mpos~&&, block
  12.     mpos~&& = mpos~&& + BLOCKSIZE
  13.   Next blck~&
  14.   Close filenum%
  15.         ' Process mem
  16.   _MemFree mem
  17.   readBigFile~&& = fsize~&&
  18.  
  19. Function fileSize~&& (file$)
  20.   If Not _FileExists(file$) Then fileSize~&& = 0: Exit Function
  21.   filenum% = FreeFile
  22.   Open file$ For Binary Access Read As filenum%
  23.   fileSize~&& = LOF(filenum%)
  24.   Close filenum%
  25.  

and for csv's:
Code: QB64: [Select]
  1. Function CSV.read& (fileName$, eol$) ' 4M lines/sec
  2.   Const BLOCKSIZE = 4194304 '=64*65536 = 4 MB
  3.   If Not _FileExists(fileName$) Then CSV.read& = 0: Exit Function
  4.   'Print "Reading lines from "; fileName$; " ";: cpos% = Pos(0) '@@ progress info
  5.   eoll% = Len(eol$)
  6.   Dim block As String * BLOCKSIZE
  7.   ff% = FreeFile
  8.   Open fileName$ For Binary Access Read As #ff%
  9.   blocks& = .5 + LOF(ff%) / Len(block)
  10.   sep& = 0
  11.   lines& = -1
  12.   For curblock& = 1 To blocks&
  13.     Get #ff%, , block
  14.     If curblock& > 1 Then
  15.       buf$ = Mid$(buf$, sep&) + block
  16.       r0& = InStr(buf$, eol$) + eoll%
  17.     Else
  18.       buf$ = block 'Mid$(block, InStr(block, Chr$(10)) + 1)
  19.       r0& = 1
  20.     End If
  21.     r1& = InStr(r0&, buf$, eol$)
  22.     Do While r1& >= r0& And r0& > 0
  23.       lin$ = Mid$(buf$, r0&, r1& - r0& + eoll%)
  24.       ' Process lin$
  25.       lines& = lines& + 1
  26.       sep& = r1&: r0& = r1& + eoll%: r1& = InStr(r0&, buf$, eol$)
  27.     Loop
  28.     'Locate , cpos%, 0: Print lines&; '@@ progress info
  29.   Next curblock&
  30.   Close #ff%
  31.   buf$ = ""
  32.   'Locate , cpos%, 0 '@@ progress info
  33.   CSV.read& = lines&
  34.  
~ 2 million lines of BASIC

Offline Sanmayce

  • Newbie
  • Posts: 63
  • Where is that English Text Sidekick?
    • View Profile
    • Sanmayce's home
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #71 on: March 14, 2022, 06:12:56 am »
Very interesting thread! Thank you!

I am regularly working with really big files (some over to 100GB) and have been optimizing 2 functions to read/process big files very fast.

I thought I'd add them here for information:
...

You are welcome, at the moment have no time to make some dedicated example of sorting (fixed-length or/and variable-length i.e. string (LF instead of NULL) type) tool.
Let me know what interests you most, what is the main bottleneck you encounter in your processings - the parsing, the sorting, the searching...
This thread is all about these BASIC things, but optimized, my wish is to share and thus show the power of QB64 (as in the old days when pairing QB and ASM was, resembles QB64 and C duo, now).

All these micro-projects contribute to the Masakari project, the idea being, we all to have powerful and fast text-oriented routines. If you haven't looked up the Masakari text loading (somewhat similar to yours), it allows sorting lines per size or by their offset in memory (i.e. returning to the original state), the quick usage being:

Code: QB64: [Select]
  1. Declare CustomType Library "qsm_linesize" ' Notice that 'CustomType' makes things work, using 'STATIC' gives errors during compilation
  2.     Sub Quicksort_QB64_v7_linesize (ByVal QWORDSoff As _Offset, Byval QWORDSlen As _Offset, Byval Left As _Integer64, Byval Right As _Integer64)
  3. 'Quicksort_QB64_v7_linesize MhandleOFF.OFFSET, MhandleLEN.OFFSET, 0, ElementsMinusOne
  4.  

For example, do you have a ... nifty (for Linux, Windows) filename parser? A month ago I was tired of not having such a function, and wrote DIRWALKER (the skeleton at the moment, listing AS IT SHOULD the current folder), wanted to write a tiny tool, but the most important work is done, the rest is just gluing into your code. See the  BAS+C source code in the attached archive. Wanna have the HEAVY-DUTY functionality of sorting FILESIZE, NAME, MODIFIED TIME fields/columns... for millions of files, FAST!

 
dirwalker.png
* Dirwalker.zip (Filesize: 1.09 MB, Downloads: 231)
« Last Edit: March 14, 2022, 06:25:45 am by Sanmayce »
He learns not to learn and reverts to what all men pass by.

Offline mdijkens

  • Newbie
  • Posts: 34
    • View Profile
Re: A skeleton code for Text Scroller via Drag-and-Drop
« Reply #72 on: March 14, 2022, 06:45:10 am »
For DIR-walking, I use my routine below.
If I need sorting I add a custom QuickSort for the specific column

Code: QB64: [Select]
  1. $VersionInfo:FILEVERSION#=22,02,28,2
  2. $VersionInfo:FileDescription=mdDir
  3. $VersionInfo:CompanyName=dijkens.com
  4. $VersionInfo:ProductName=mdDir
  5. $VersionInfo:InternalName=mdDir
  6. $VersionInfo:OriginalFilename=mdDir
  7. $VersionInfo:LegalCopyright=dijkens.com
  8. $VersionInfo:LegalTrademarks=dijkens.com
  9. $VersionInfo:Comments=dijkens.com
  10. $VersionInfo:Web=www.dijkens.com
  11.  
  12. '$Let DEVTEST = 1
  13.  
  14. Type parmType
  15.   fspec As String 'fpsec
  16.   subs As Integer 'include subs
  17.   search As Integer 'search terms
  18.   icase As Integer 'ignore case
  19. Dim Shared parm As parmType
  20. Dim Shared stext(10) As String
  21. Const FALSE = 0, TRUE = Not FALSE
  22.  
  23. init
  24. f& = Dir(parm.fspec)
  25. $If DEVTEST = DEFINED Then
  26.  
  27. '$include: 'DIR.BI'
  28.  
  29. Sub init
  30.   Print "mdDir.exe v"; getVersion$
  31.   getParams
  32.   $If DEVTEST = DEFINED Then
  33.     ChDir "E:\_todo\qb64dev"
  34.   $End If
  35.  
  36. Sub getParams
  37.   'd:\dev\qb64\*.bas /t console:only /c
  38.   For ccount% = 1 To Val(getCommand$(-1))
  39.     cmd$ = getCommand$(ccount%)
  40.     If InStr("-/", Left$(cmd$, 1)) > 0 Then
  41.       Select Case UCase$(Mid$(cmd$, 2, 1))
  42.         Case "S"
  43.           parm.subs = TRUE
  44.         Case "T"
  45.           stext(0) = "sinit"
  46.         Case "C"
  47.           parm.icase = TRUE
  48.         Case Else
  49.           Print " mdDIR [fspec] [/s] [/t search1 [...] [/c]]"
  50.           Print "    fspec           = [d:][path\][filespec] e.g. 'd:\data\*.txt'"
  51.           Print "   /s               = Include sub-directories"
  52.           Print "   /t search1 [...] = Search content for 'search1' (and '...')"
  53.           Print "   /c               = Ignore case in search"
  54.           $If DEVTEST = DEFINED Then
  55.             Sleep
  56.           $End If
  57.           System
  58.       End Select
  59.     ElseIf stext(0) = "" Then
  60.       parm.fspec = cmd$
  61.     Else
  62.       parm.search = parm.search + 1
  63.       stext$(parm.search) = cmd$
  64.     End If
  65.   Next ccount%
  66.   If parm.fspec = "" Then
  67.     parm.fspec = "*.*"
  68.   End If
  69.  
  70. Function Dir& (p$)
  71.   b% = DIR.CheckPath(p$)
  72.   If b% > 0 Then
  73.     ddcount& = 0
  74.     If Mid$(p$, b%) = "\*.*" Then
  75.       Print Left$(p$, b%)
  76.       f& = DIR.GetFiles(p$)
  77.       DIR.SortFiles
  78.       If parm.search = 0 Then PrintDirs
  79.       PrintFiles Left$(p$, b%)
  80.       If DIR.Dcount > 0 Then
  81.         ddcount& = DIR.Dcount
  82.         subdir$ = "|"
  83.         For i& = 1 To ddcount&
  84.           subdir$ = subdir$ + DIR.Dname(i&) + "|"
  85.         Next i&
  86.       End If
  87.     Else
  88.       d& = DIR.GetFiles(Left$(p$, b%) + "*")
  89.       If DIR.Dcount > 0 Then
  90.         DIR.SortFiles
  91.         ddcount& = DIR.Dcount
  92.         subdir$ = "|"
  93.         For i& = 1 To ddcount&
  94.           subdir$ = subdir$ + DIR.Dname(i&) + "|"
  95.         Next i&
  96.       End If
  97.       f& = DIR.GetFiles(p$)
  98.       If DIR.Fcount > 0 Then
  99.         DIR.SortFiles
  100.         Print Left$(p$, b%)
  101.         PrintFiles Left$(p$, b%)
  102.       End If
  103.     End If
  104.     If parm.subs Then
  105.       For i& = 1 To ddcount&
  106.         sp& = 0
  107.         For s& = 1 To i&
  108.           sp& = InStr(sp& + 1, subdir$, "|")
  109.         Next s&
  110.         sd$ = Mid$(subdir$, sp& + 1, InStr(sp& + 1, subdir$, "|") - sp& - 1)
  111.         x& = Dir(Left$(p$, b%) + sd$ + Mid$(p$, b%))
  112.       Next i&
  113.     End If
  114.   Else
  115.     f& = 0
  116.   End If
  117.   Dir& = f&
  118.  
  119. Sub PrintFiles (path$)
  120.   q$ = Chr$(34)
  121.   For i& = 1 To DIR.Fcount
  122.     If textSearch(path$ + DIR.Fname(i&)) Then
  123.       a = DIR.Fattr(i&)
  124.       Attr$ = Space$(4)
  125.       If (a And &H20) = &H20 Then
  126.         Mid$(Attr$, 1, 1) = "A" ' archive
  127.       End If
  128.       If (a And &H4) = &H4 Then
  129.         Mid$(Attr$, 2, 1) = "S" ' system
  130.       End If
  131.       If (a And &H2) = &H2 Then
  132.         Mid$(Attr$, 3, 1) = "H" ' hidden
  133.       End If
  134.       If (a And &H1) = &H1 Then
  135.         Mid$(Attr$, 4, 1) = "R" ' read-only
  136.       End If
  137.       Print Using "\          \##,###,###,### \  \ \                 \ "; DIR.Fshort(i&); DIR.Fsize(i&); Attr$; DIR.Ftime(i&);
  138.       Print DIR.Fname(i&)
  139.     End If
  140.   Next i&
  141.  
  142. Function textSearch (fileName$)
  143.   If parm.search = 0 Then textSearch = TRUE: Exit Function
  144.   If Not _FileExists(fileName$) Then textSearch = TRUE: Exit Function
  145.   Dim content As String, block As String * 4194304 '=64*65536
  146.   Dim blocks As _Unsigned Long, curblock As _Unsigned Long
  147.   bsize = Len(block)
  148.   ff% = FreeFile: Open fileName$ For Random Access Read As #ff% Len = bsize
  149.   fsize = LOF(ff%): content = Space$(fsize): blocks = .5 + fsize / bsize: cpos = 1
  150.   For curblock = 1 To blocks
  151.     Get #ff%, curblock, block
  152.     Mid$(content, cpos, bsize) = block
  153.     cpos = cpos + bsize
  154.   Next curblock
  155.   Close #ff%
  156.   If parm.icase Then content = LCase$(content)
  157.   For s% = 1 To parm.search
  158.     If InStr(content, LCase$(stext(s%))) = 0 Then Exit For
  159.   Next s%
  160.   content = ""
  161.   If s% > parm.search Then
  162.     textSearch = TRUE
  163.   Else
  164.     textSearch = FALSE
  165.   End If
  166.  
  167. Sub PrintDirs
  168.   For i& = 1 To DIR.Dcount
  169.     a = DIR.Dattr(i&)
  170.     Attr$ = Space$(4)
  171.     If (a And &H20) = &H20 Then
  172.       Mid$(Attr$, 1, 1) = "A" ' archive
  173.     End If
  174.     If (a And &H4) = &H4 Then
  175.       Mid$(Attr$, 2, 1) = "S" ' system
  176.     End If
  177.     If (a And &H2) = &H2 Then
  178.       Mid$(Attr$, 3, 1) = "H" ' hidden
  179.     End If
  180.     If (a And &H1) = &H1 Then
  181.       Mid$(Attr$, 4, 1) = "R" ' read-only
  182.     End If
  183.   PRINT USING "\          \     <DIR>     \  \ \                 \ "; _
  184.   DIR.Dshort(i&); Attr$; DIR.Dtime(i&);
  185.     Print DIR.Dname(i&)
  186.   Next i&
  187.  
  188. Function getCommand$ (n%)
  189.   Static cmd$(100), ccount As Integer
  190.   If cmd$(0) = "" Then
  191.       Function GetCommandLineA%& ()
  192.     End Declare
  193.     Dim m As _MEM, ms As String * 1000
  194.     a%& = GetCommandLineA: m = _Mem(a%&, Len(ms)): ms = _MemGet(m, m.OFFSET, String * 1000)
  195.     ccount = 0: sp0% = 1: sp1% = InStr(ms, " ")
  196.     Do While sp1% > 0
  197.       cmd$(ccount) = _Trim$(Mid$(ms, sp0%, sp1% - sp0%))
  198.       If cmd$(ccount) <> "" Then ccount = ccount + 1
  199.       sp0% = sp1% + 1: sp1% = InStr(sp1% + 1, ms, " ")
  200.     Loop
  201.     cmd$(ccount) = _Trim$(Mid$(ms, sp0%)): If Left$(cmd$(ccount), 1) = Chr$(0) Then ccount = ccount - 1
  202.     _MemFree m
  203.   End If
  204.   If n% < 0 Then
  205.     getCommand$ = _Trim$(Str$(ccount))
  206.   ElseIf n% <= ccount Then
  207.     getCommand$ = cmd$(n%)
  208.   Else
  209.     getCommand$ = ""
  210.   End If
  211.  
  212. Function getVersion$
  213.   Declare Library 'Directory Information using KERNEL32
  214.     Function GetModuleFileNameA (ByVal hModule As Long, lpFileName As String, Byval nSize As Long)
  215.   Declare Dynamic Library "version"
  216.     Function GetFileVersionInfoA (lptstrFilename As String, Byval dwHandle As Long, Byval dwLen As Long, lpData As String)
  217.   FileName$ = Space$(256): res = GetModuleFileNameA(0, FileName$, Len(FileName$))
  218.   lpData$ = Space$(1024): res = GetFileVersionInfoA(FileName$, 0, Len(lpData$), lpData$)
  219.   offset% = Asc(Mid$(lpData$, 3, 1))
  220.   getVersion$ = LTrim$(Str$(Asc(Mid$(lpData$, offset%-1, 1)))) + "." + _
  221.   LTrim$(Str$(Asc(Mid$(lpData$, offset%-3, 1)))) + "." + _
  222.   LTrim$(Str$(Asc(Mid$(lpData$, offset%+3, 1)))) + "." + _
  223.   LTrim$(Str$(Asc(Mid$(lpData$, offset%+1, 1))))
  224.  

DIR.BI:
Code: QB64: [Select]
  1. '   ÉÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍ»
  2. '   º      DIR.BI  v1.22 (20200815)     maurits@dijkens.com       º
  3. '   ÇÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄĶ
  4. '   º                                                             º
  5. '   º ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ º
  6. '   º ³                                                         ³ º
  7. '   º ³ FUNCTION DIR.CheckPath (dfspec$)                        ³ º
  8. '   º ³ FUNCTION DIR.GetFiles& (dfspec$)                        ³ º
  9. '   º ³ SUB DIR.SortFiles ()                                    ³ º
  10. '   º ³ SUB DIR.QSortFiles (leftN AS LONG, rightN AS LONG)      ³ º
  11. '   º ³ SUB DIR.QSortDirs (leftN AS INTEGER, rightN AS INTEGER) ³ º
  12. '   º ³                                                         ³ º
  13. '   º ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ º
  14. '   º                                                             º
  15. '   ÈÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍÍͼ
  16. Const __DIR_MAX_PATH = 260
  17. Const __INVALID_HANDLE_VALUE = -1
  18.  
  19. Const DIR.READONLY = &H1
  20. Const DIR.HIDDEN = &H2
  21. Const DIR.SYSTEM = &H4
  22. Const DIR.DIRECTORY = &H10
  23. Const DIR.ARCHIVE = &H20
  24. Const DIR.NORMAL = &H80
  25. Const DIR.TEMPORARY = &H100
  26. Const DIR.SYMBOLICLINK = &H400
  27. Const DIR.COMPRESSED = &H800
  28. Const DIR.NOTINDEXED = &H2000
  29. Const DIR.ENCRYPTED = &H4000
  30.  
  31.  
  32. Type __DIR_FILETIME_TYPE
  33.   dwLowDateTime As _Unsigned Long
  34.   dwHighDateTime As _Unsigned Long
  35.  
  36. Type __DIR_SYSTEMTIME_TYPE
  37.   wYear As Integer
  38.   wMonth As Integer
  39.   wDayOfWeek As Integer
  40.   wDay As Integer
  41.   wHour As Integer
  42.   wMinute As Integer
  43.   wSecond As Integer
  44.   wMilliseconds As Integer
  45.  
  46. Type WIN32_FIND_DATAA
  47.   dwFileAttributes As _Unsigned Long
  48.   ftCreationTime As __DIR_FILETIME_TYPE
  49.   ftLastAccessTime As __DIR_FILETIME_TYPE
  50.   ftLastWriteTime As __DIR_FILETIME_TYPE
  51.   nFileSizeHigh As _Unsigned Long
  52.   nFileSizeLow As _Unsigned Long
  53.   dwReserved0 As _Unsigned Long
  54.   dwReserved1 As _Unsigned Long
  55.   cFileName As String * __Dir_max_path
  56.   cAlternateFileName As String * 14
  57.  
  58.   Function FindFirstFileA~%& (ByVal lpFileName~%&, Byval lpFindFileData~%&)
  59.   Function FindNextFileA& (ByVal hFindFile~%&, Byval lpFindFileData~%&)
  60.   Function FindClose& (ByVal hFindFile~%&)
  61.   Function FileTimeToSystemTime& (lpFileTime As __DIR_FILETIME_TYPE, lpSystemTime As __DIR_SYSTEMTIME_TYPE)
  62.   Function FileTimeToLocalFileTime& (lpFileTime As __DIR_FILETIME_TYPE, lpLocalFileTime As __DIR_FILETIME_TYPE)
  63.  
  64. Dim Shared DIR.Fname(1) As String
  65. Dim Shared DIR.Dname(1) As String
  66. Dim Shared DIR.Fcount As Long
  67. Dim Shared DIR.Dcount As Long
  68. Dim Shared DIR.Fsize(1) As _Integer64
  69. Dim Shared DIR.Dtime(1) As String * 19 ' mm-dd-yyyy hh:mm:ss
  70. Dim Shared DIR.Ftime(1) As String * 19 ' mm-dd-yyyy hh:mm:ss
  71. Dim Shared DIR.Dattr(1) As _Unsigned Long
  72. Dim Shared DIR.Fattr(1) As _Unsigned Long
  73. Dim Shared DIR.Dshort(1) As String
  74. Dim Shared DIR.Fshort(1) As String
  75.  
  76. DIR.err:
  77. DIR.Error = Err
  78.  
  79. Function DIR.CheckPath% (dfspec$)
  80.   dfspec$ = _Trim$(dfspec$)
  81.   If dfspec$ = "" Or Right$(dfspec$, 1) = ":" Then
  82.     dfspec$ = dfspec$ + ".\*.*"
  83.   End If
  84.   For pend% = Len(dfspec$) To 1 Step -1
  85.     If Mid$(dfspec$, pend%, 1) = "\" Then
  86.       Exit For
  87.     End If
  88.   Next pend%
  89.   If _DirExists(dfspec$) Then
  90.     If pend% = Len(dfspec$) Then
  91.       dfspec$ = dfspec$ + "*.*"
  92.     Else
  93.       dfspec$ = dfspec$ + "\*.*"
  94.       pend% = Len(dfspec$) - 3
  95.     End If
  96.   ElseIf pend% = 0 Then
  97.     dp% = InStr(dfspec$, ":")
  98.     dfspec$ = Left$(dfspec$, dp%) + ".\" + Mid$(dfspec$, dp% + 1)
  99.     pend% = dp% + 2
  100.   End If
  101.   If _DirExists(Left$(dfspec$, pend%)) Then
  102.     DIR.Error = 0
  103.     On Error GoTo DIR.err
  104.     ChDir Left$(dfspec$, pend%)
  105.     On Error GoTo 0
  106.     dspec$ = _CWD$
  107.     If DIR.Error = 0 Then
  108.       If Right$(dspec$, 1) <> "\" Then
  109.         dspec$ = dspec$ + "\"
  110.       End If
  111.       DIR.CheckPath% = Len(dspec$)
  112.       dfspec$ = dspec$ + Mid$(dfspec$, pend% + 1)
  113.     Else
  114.       DIR.CheckPath% = 0
  115.     End If
  116.   Else
  117.     DIR.CheckPath% = 0
  118.   End If
  119.  
  120. Function DIR.GetFiles& (dfspec$)
  121.   Dim Attribute As _Unsigned Long
  122.   Dim ASCIIZ As String * 260
  123.   Dim finddata As WIN32_FIND_DATAA
  124.   Dim Wfile.Handle As _Unsigned _Offset
  125.   Dim SysTime As __DIR_SYSTEMTIME_TYPE
  126.   Dim LocalTime As __DIR_FILETIME_TYPE
  127.  
  128.   Var$ = dfspec$
  129.   DIR.Dcount = 0
  130.   DIR.Fcount = 0
  131.   ASCIIZ = Var$ + Chr$(0)
  132.   Wfile.Handle = FindFirstFileA(_Offset(ASCIIZ), _Offset(finddata))
  133.   If Wfile.Handle <> __INVALID_HANDLE_VALUE Then
  134.     Do
  135.       Attribute = finddata.dwFileAttributes
  136.       Filename$ = finddata.cFileName
  137.       Filename$ = Left$(Filename$, InStr(Filename$, Chr$(0)) - 1)
  138.       If Filename$ <> "." And Filename$ <> ".." And Filename$ <> "$RECYCLE.BIN" And Filename$ <> "System Volume Information" Then
  139.  
  140.         ' store date/time
  141.         x& = FileTimeToLocalFileTime&(finddata.ftLastWriteTime, LocalTime)
  142.         x& = FileTimeToSystemTime&(LocalTime, SysTime)
  143.         Var$ = Right$("00" + LTrim$(Str$(SysTime.wMonth)), 2) + "-"
  144.         Var$ = Var$ + Right$("00" + LTrim$(Str$(SysTime.wDay)), 2) + "-"
  145.         Var$ = Var$ + LTrim$(Str$(SysTime.wYear)) + " "
  146.         Var$ = Var$ + Right$("00" + LTrim$(Str$(SysTime.wHour)), 2) + ":"
  147.         Var$ = Var$ + Right$("00" + LTrim$(Str$(SysTime.wMinute)), 2) + ":"
  148.         Var$ = Var$ + Right$("00" + LTrim$(Str$(SysTime.wSecond)), 2)
  149.  
  150.         If (Attribute And DIR.DIRECTORY) = DIR.DIRECTORY Then
  151.           DIR.Dcount = DIR.Dcount + 1
  152.           ReDim _Preserve DIR.Dname(DIR.Dcount) As String
  153.           DIR.Dname(DIR.Dcount) = Filename$
  154.  
  155.           ReDim _Preserve DIR.Dtime(DIR.Dcount) As String * 19
  156.           DIR.Dtime(DIR.Dcount) = Var$
  157.  
  158.           ReDim _Preserve DIR.Dattr(DIR.Dcount) As _Unsigned Long
  159.           DIR.Dattr(DIR.Dcount) = Attribute
  160.  
  161.           Filename$ = finddata.cAlternateFileName
  162.           Filename$ = Left$(Filename$, InStr(Filename$, Chr$(0)) - 1)
  163.           ReDim _Preserve DIR.Dshort(DIR.Dcount) As String
  164.           If Filename$ <> "" Then
  165.             DIR.Dshort(DIR.Dcount) = Filename$
  166.           Else
  167.             DIR.Dshort(DIR.Dcount) = UCase$(DIR.Dname(DIR.Dcount))
  168.           End If
  169.         Else
  170.           DIR.Fcount = DIR.Fcount + 1
  171.           ReDim _Preserve DIR.Fname(DIR.Fcount) As String
  172.           DIR.Fname(DIR.Fcount) = Filename$
  173.           ReDim _Preserve DIR.Fsize(DIR.Fcount) As _Integer64
  174.           F&& = finddata.nFileSizeHigh * &H100000000~&& Or finddata.nFileSizeLow
  175.           DIR.Fsize(DIR.Fcount) = F&&
  176.  
  177.           ReDim _Preserve DIR.Ftime(DIR.Fcount) As String * 19
  178.           DIR.Ftime(DIR.Fcount) = Var$
  179.  
  180.           ReDim _Preserve DIR.Fattr(DIR.Fcount) As _Unsigned Long
  181.           DIR.Fattr(DIR.Fcount) = Attribute
  182.  
  183.           Filename$ = finddata.cAlternateFileName
  184.           Filename$ = Left$(Filename$, InStr(Filename$, Chr$(0)) - 1)
  185.           ReDim _Preserve DIR.Fshort(DIR.Fcount) As String
  186.           If Filename$ <> "" Then
  187.             DIR.Fshort(DIR.Fcount) = Filename$
  188.           Else
  189.             DIR.Fshort(DIR.Fcount) = UCase$(DIR.Fname(DIR.Fcount))
  190.           End If
  191.         End If
  192.       End If
  193.     Loop While FindNextFileA(Wfile.Handle, _Offset(finddata))
  194.     x& = FindClose(Wfile.Handle)
  195.     DIR.GetFiles& = DIR.Fcount
  196.   Else
  197.     DIR.GetFiles& = -1
  198.   End If
  199.  
  200. Sub DIR.SortFiles ()
  201.   DIR.QSortDirs 1, DIR.Dcount
  202.   DIR.QSortFiles 1, DIR.Fcount
  203.  
  204. Sub DIR.QSortFiles (leftN As Long, rightN As Long)
  205.   Dim pivot As Long, leftNIdx As Long, rightNIdx As Long
  206.   leftNIdx = leftN
  207.   rightNIdx = rightN
  208.   If (rightN - leftN) > 0 Then
  209.     pivot = (leftN + rightN) / 2
  210.     While (leftNIdx <= pivot) And (rightNIdx >= pivot)
  211.       While (UCase$(DIR.Fname(leftNIdx)) < UCase$(DIR.Fname(pivot))) And (leftNIdx <= pivot)
  212.         leftNIdx = leftNIdx + 1
  213.       Wend
  214.       While (UCase$(DIR.Fname(rightNIdx)) > UCase$(DIR.Fname(pivot))) And (rightNIdx >= pivot)
  215.         rightNIdx = rightNIdx - 1
  216.       Wend
  217.       Swap DIR.Fname(leftNIdx), DIR.Fname(rightNIdx)
  218.       Swap DIR.Fshort(leftNIdx), DIR.Fshort(rightNIdx)
  219.       Swap DIR.Fsize(leftNIdx), DIR.Fsize(rightNIdx)
  220.       Swap DIR.Ftime(leftNIdx), DIR.Ftime(rightNIdx)
  221.       Swap DIR.Fattr(leftNIdx), DIR.Fattr(rightNIdx)
  222.       leftNIdx = leftNIdx + 1
  223.       rightNIdx = rightNIdx - 1
  224.       If (leftNIdx - 1) = pivot Then
  225.         rightNIdx = rightNIdx + 1
  226.         pivot = rightNIdx
  227.       ElseIf (rightNIdx + 1) = pivot Then
  228.         leftNIdx = leftNIdx - 1
  229.         pivot = leftNIdx
  230.       End If
  231.     Wend
  232.     DIR.QSortFiles leftN, pivot - 1
  233.     DIR.QSortFiles pivot + 1, rightN
  234.   End If
  235.  
  236. Sub DIR.QSortDirs (leftN As Integer, rightN As Integer)
  237.   Dim pivot As Integer, leftNIdx As Integer, rightNIdx As Integer
  238.   leftNIdx = leftN
  239.   rightNIdx = rightN
  240.   If (rightN - leftN) > 0 Then
  241.     pivot = (leftN + rightN) / 2
  242.     While (leftNIdx <= pivot) And (rightNIdx >= pivot)
  243.       While (UCase$(DIR.Dname(leftNIdx)) < UCase$(DIR.Dname(pivot))) And (leftNIdx <= pivot)
  244.         leftNIdx = leftNIdx + 1
  245.       Wend
  246.       While (UCase$(DIR.Dname(rightNIdx)) > UCase$(DIR.Dname(pivot))) And (rightNIdx >= pivot)
  247.         rightNIdx = rightNIdx - 1
  248.       Wend
  249.       Swap DIR.Dname(leftNIdx), DIR.Dname(rightNIdx)
  250.       Swap DIR.Dshort(leftNIdx), DIR.Dshort(rightNIdx)
  251.       Swap DIR.Dtime(leftNIdx), DIR.Dtime(rightNIdx)
  252.       Swap DIR.Dattr(leftNIdx), DIR.Dattr(rightNIdx)
  253.       leftNIdx = leftNIdx + 1
  254.       rightNIdx = rightNIdx - 1
  255.       If (leftNIdx - 1) = pivot Then
  256.         rightNIdx = rightNIdx + 1
  257.         pivot = rightNIdx
  258.       ElseIf (rightNIdx + 1) = pivot Then
  259.         leftNIdx = leftNIdx - 1
  260.         pivot = leftNIdx
  261.       End If
  262.     Wend
  263.     DIR.QSortDirs leftN, pivot - 1
  264.     DIR.QSortDirs pivot + 1, rightN
  265.   End If
  266.  
~ 2 million lines of BASIC