Messages - Sanmayce

1
Programs / Re: A skeleton code for Text Scroller via Drag-and-Drop
« on: March 14, 2022, 06:12:56 am »
Very interesting thread! Thank you!

I am regularly working with really big files (some over 100GB) and have been optimizing 2 functions to read/process big files very fast.

I thought I'd add them here for information:
...

You are welcome. At the moment I have no time to make a dedicated example of a sorting tool (fixed-length and/or variable-length, i.e. string type terminated by LF instead of NULL).
Let me know what interests you most and what the main bottleneck in your processing is: the parsing, the sorting, the searching...
This thread is all about these BASIC things, but optimized. My wish is to share and thus show the power of QB64 (the QB64 and C duo now resembles the old days of pairing QB with ASM).

All these micro-projects contribute to the Masakari project, the idea being that we all have powerful and fast text-oriented routines. In case you haven't looked at the Masakari text loading (somewhat similar to yours): it allows sorting lines by size or by their offset in memory (i.e. returning to the original order). The quick usage is:

Code: QB64: [Select]
Declare CustomType Library "qsm_linesize" ' Notice that 'CustomType' makes things work, using 'STATIC' gives errors during compilation
    Sub Quicksort_QB64_v7_linesize (ByVal QWORDSoff As _Offset, ByVal QWORDSlen As _Offset, ByVal Left As _Integer64, ByVal Right As _Integer64)
End Declare
' Usage: Quicksort_QB64_v7_linesize MhandleOFF.OFFSET, MhandleLEN.OFFSET, 0, ElementsMinusOne

For example, do you have a ... nifty (for Linux, Windows) filename parser? A month ago I was tired of not having such a function and wrote DIRWALKER (a skeleton at the moment, listing the current folder AS IT SHOULD). I wanted to write a tiny tool, but the most important work is done; the rest is just gluing it into your code. See the BAS+C source code in the attached archive. Wanna have the HEAVY-DUTY functionality of sorting by the FILESIZE, NAME, MODIFIED TIME fields/columns... for millions of files, FAST!

 
[Attachment: dirwalker.png]

2
Programs / Re: A skeleton code for Text Scroller via Drag-and-Drop
« on: March 13, 2022, 08:54:21 pm »
A new showdown is ready - Scandum's crumsort v1.1.5.3 vs Magnetica r.14:

Test run: 2022-Mar-14:
Laptop "Compressionette", Intel 'Kaby Lake' i5-7200U 3.1GHz max turbo, 36GB DDR4 2133MHz:
Code: [Select]
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| Performer/Keys     | #1, FEW distinct        | #2, MANY distinct        | #3, MANYmore distinct    | #4, ALL distinct          | #5, ALLmore distinct      | #6, ALLmax distinct         |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
|  Operating System, |   Fedora 35, GCC 11.2.1 |    Fedora 35, GCC 11.2.1 |    Fedora 35, GCC 11.2.1 |     Fedora 35, GCC 11.2.1 |     Fedora 35, GCC 11.2.1 |       Fedora 35, GCC 11.2.1 |
|      Compiler, -O3 |       instructions; IPC |        instructions; IPC |        instructions; IPC |         instructions; IPC |         instructions; IPC |           instructions; IPC |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| qsort              |         385/342 seconds |          527/234 seconds |           165/55 seconds |           516/128 seconds |           999/252 seconds |                        N.A. |
|                    | 6,448,450,744,497; 2.82 |  4,921,980,445,033; 2.06 |  1,369,822,972,984; 1.94 |   3,250,596,357,540; 1.58 |   6,250,299,799,611; 1.59 |                        N.A. |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| Magnetica v.14     |            29/6 seconds |           198/47 seconds |            86/22 seconds |            241/66 seconds |           472/134 seconds |                        N.A. |
|                    |   171,452,642,615; 1.14 |    936,751,248,598; 1.17 |    443,844,553,192; 1.25 |   1,291,624,832,269; 1.28 |   2,503,942,225,085; 1.28 |                        N.A. |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| Bentley-McIlroy    |           38/13 seconds |           223/47 seconds |           100/26 seconds |            304/65 seconds |           597/139 seconds |                        N.A. |
|                    |   246,587,334,436; 1.23 |  1,105,253,285,645; 1.25 |    507,140,963,617; 1.23 |   1,460,132,564,676; 1.22 |   2,848,210,143,410; 1.22 |                        N.A. |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+
| Crumsort 1.1.5.3   |            29/4 seconds |            129/4 seconds |             61/2 seconds |             173/3 seconds |             339/7 seconds |                        N.A. |
|                    |   351,332,884,902; 2.43 |  1,284,147,329,900; 2.80 |    603,065,127,518; 2.77 |   1,605,371,408,284; 2.64 |   3,102,283,088,926; 2.71 |                        N.A. |
+--------------------+-------------------------+--------------------------+--------------------------+---------------------------+---------------------------+-----------------------------+

Legend: the time is exactly the sort process time; the first value is for unsorted input, the second one is for already sorted input.

3
Programs / Re: A skeleton code for Text Scroller via Drag-and-Drop
« on: March 08, 2022, 05:10:05 pm »
A week ago, Scandum released a killer: Crumsort, the current BEST Quicksort and the FASTEST unstable, in-place Quicksort in my tests.

As I see it, he dove deep into sorting, achieving outstanding results.

Yet, intuitively, I see a different approach, which can potentially reach and even surpass his awesome speeds; that remains to be seen...

Currently, the picture of high-speed sorting is this (the benchmark package is attached so that the results can be reproduced):

Code: [Select]
// Test run: 2022-Mar-08:
// Laptop "Compressionette", Intel 'Kaby Lake' i5-7200U 3.1GHz max turbo, 36GB DDR4 2133MHz:
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
// | Performer/Keys     | #1, FEW distinct          | #2, MANY distinct         | #3, MANYmore distinct     | #4, ALL distinct          | #5, ALLmore distinct       | #6, ALLmax distinct         |
// +--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
// |  Operating System, | Windows 10, | Fedora 35,  | Windows 10, | Fedora 35,  | Windows 10, | Fedora 35,  | Windows 10, | Fedora 35,  | Windows 10, | Fedora 35,   | Windows 10,  | Fedora 35,   |
// |      Compiler, -O3 | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1   | Intel v15.0  | GCC 11.2.1   |
// +--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
// | qsort              |  59 seconds | 377 seconds | 336 seconds | 541 seconds | 157 seconds | 195 seconds | 435 seconds | 534 seconds | 851 seconds | 1036 seconds |         N.A. |         N.A. |
// | Magnetica v.13     |  31 seconds |  30 seconds | 202 seconds | 196 seconds |  88 seconds |  85 seconds | 259 seconds | 250 seconds | 506 seconds |  493 seconds |         N.A. |         N.A. |
// | Bentley-McIlroy    |  38 seconds |  36 seconds | 205 seconds | 208 seconds |  92 seconds |  94 seconds | 279 seconds | 281 seconds | 544 seconds |  553 seconds |         N.A. |         N.A. |
// | Crumsort           |  30 seconds |  32 seconds | 132 seconds | 150 seconds |  64 seconds |  70 seconds | 184 seconds | 192 seconds | 357 seconds |  376 seconds |         N.A. |         N.A. |
// +--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
// | Best Time (bare    |                           |                           |                           |                           |                            |                             |
// | bone in-place QS): |  30s PARITY               | 132s for Crumsort         |  64s for Crumsort         | 184s for Crumsort         | 357s for Crumsort          | N.A.                        |
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
Code: [Select]
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
// | Performer/Keys     | #1, FEW distinct          | #2, MANY distinct         | #3, MANYmore distinct     | #4, ALL distinct          | #5, ALLmore distinct       | #6, ALLmax distinct         |
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
// |  Operating System, | Fedora 35, GCC 11.2.1     | Fedora 35, GCC 11.2.1     | Fedora 35, GCC 11.2.1     | Fedora 35, GCC 11.2.1     | Fedora 35, GCC 11.2.1      | Fedora 35, GCC 11.2.1       |
// |      Compiler, -O3 | instructions; IPC         | instructions; IPC         | instructions; IPC         | instructions; IPC         | instructions; IPC          | instructions; IPC           |
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
// | qsort              |   3,302,993,934,921; 2.75 |   2,983,579,082,155; 1.75 |     886,263,153,476; 1.41 |   2,352,769,705,563; 1.38 |    4,527,367,288,814; 1.39 |                        N.A. |
// | Magnetica v.13     |     131,873,917,282; 1.00 |     658,478,895,600; 1.02 |     309,149,594,748; 1.08 |     884,297,729,161; 1.06 |    1,726,931,029,634; 1.08 |                        N.A. |
// | Bentley-McIlroy    |     164,915,835,204; 1.09 |     681,956,155,364; 1.00 |     322,584,009,352; 1.04 |     944,038,959,690; 1.02 |    1,719,825,062,847; 0.97 |                        N.A. |
// | Crumsort           |     312,328,497,447; 2.29 |   1,295,551,817,497; 2.57 |     603,276,911,007; 2.53 |   1,597,685,291,532; 2.46 |    3,091,001,982,856; 2.51 |                        N.A. |
// +--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
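
The instruction counts and the timings fit together once IPC is taken into account: estimated cycles are instructions divided by IPC, which is how Crumsort can retire about twice the instructions of Magnetica v.13 and still finish first. A minimal C sketch of that arithmetic, using the #2 (MANY distinct) figures from the table above. The function names are made up for illustration and are not part of the benchmark package, and the sketch assumes the counters cover the same region as the timed sort.

```c
#include <assert.h>

/* Estimated core cycles = retired instructions / instructions-per-cycle.
   Hypothetical helper, not taken from the benchmark sources. */
double cycles_from_ipc(double instructions, double ipc) {
    assert(ipc > 0.0);
    return instructions / ipc;
}

/* #2, MANY distinct, Fedora 35 / GCC 11.2.1 row values from the table. */
double magnetica_cycles_case2(void) {
    return cycles_from_ipc(658478895600.0, 1.02);
}

double crumsort_cycles_case2(void) {
    return cycles_from_ipc(1295551817497.0, 2.57);
}
```

With these inputs Magnetica v.13 comes out near 646 billion cycles and Crumsort near 504 billion, a ratio of about 1.28, roughly in line with the 196 vs 150 seconds measured for the same column.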

Code: [Select]
// Speed Roster, (the base speed 1.00x is GLIBC's qsort):
// Rank #1: 2683/820=  3.27x =  32+150+ 70+192+ 376=  820 seconds for Crumsort
// Rank #2: 2683/1054= 2.54x =  30+196+ 85+250+ 493= 1054 seconds for Magnetica v.13
// Rank #3: 2683/1172= 2.28x =  36+208+ 94+281+ 553= 1172 seconds for Bentley-McIlroy
// Rank #4: 2683/2683= 1.00x = 377+541+195+534+1036= 2683 seconds for GLIBC's qsort
//
// Legend (The time is exactly the Sort process time):
// #1,FEW = 2,233,861,800 keys, of them distinct = 10; 178,708,944 bytes 22338618_QWORDS.bin; elements = 178,708,944/8 *100; // Keys are 100 times duplicated
// #2,MANY = 2,482,300,900 keys, of them distinct = 2,847,531; 24,823,016 bytes mobythesaurus.txt; elements = 24823016 -8+1; // BuildingBlocks are size-order+1, they are 100 times duplicated
// #3,MANYmore = 1,137,582,073 keys, of them distinct = 77,275,994; 1,137,582,080 bytes linux-5.15.25.tar; elements = 1137582080 -8+1; // BuildingBlocks are size-order+1
// #4,ALL = 2,009,333,753 keys, of them distinct = 1,912,608,132; 2,009,333,760 bytes Fedora-Workstation-Live-x86_64-35-1.2.iso; elements = 2009333760 -8+1; // BuildingBlocks are size-order+1
// #5,ALLmore = 3,803,483,825 keys, of them distinct = 3,346,259,533; 3,803,483,832 bytes Fedora-Workstation-35-1.2.aarch64.raw.xz; elements = 3803483832 -8+1; // BuildingBlocks are size-order+1
// #6,ALLmax = 7,798,235,435 keys, of them distinct = 6,770,144,405; 7,798,235,442 bytes math.stackexchange.com_en_all_2019-02.zim; elements = 7798235442 -8+1; // BuildingBlocks are size-order+1
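
The key counts in the legend follow from sliding an 8-byte window one byte at a time over each file, so a file of `size` bytes yields `size - 8 + 1` overlapping QWORD keys ("size-order+1" with order 8). A minimal C sketch of that construction, as the legend reads. The names `building_block_count` and `make_keys` are hypothetical, and the actual benchmark may build its keys differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ORDER 8  /* key width in bytes (one QWORD) */

/* Number of overlapping ORDER-byte building blocks in `size` bytes. */
uint64_t building_block_count(uint64_t size) {
    return size >= ORDER ? size - ORDER + 1 : 0;
}

/* Copy every overlapping 8-byte window of `buf` into `keys`.
   `keys` must have room for building_block_count(size) elements.
   memcpy sidesteps unaligned loads; byte order is the host's. */
uint64_t make_keys(const unsigned char *buf, uint64_t size, uint64_t *keys) {
    uint64_t n = building_block_count(size);
    assert(n == 0 || (buf != NULL && keys != NULL));
    for (uint64_t i = 0; i < n; i++)
        memcpy(&keys[i], buf + i, ORDER);
    return n;
}
```

For linux-5.15.25.tar this gives 1,137,582,080 - 8 + 1 = 1,137,582,073 keys, matching the #3 line; for mobythesaurus.txt the 24,823,009 blocks are then duplicated 100 times to reach 2,482,300,900.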
Code: [Select]
// Notes:
// - Scandum's Crumsort is the FASTEST in-place sorter, known to me, hail Scandum!
// - All the runs were in "Current priority class is REALTIME_PRIORITY_CLASS" for Windows and "Current priority is -20." for Linux;
// - To see more stats (the ones the tables were derived from) see 'log_i5-7200U_MAR08.txt';
// - Benchmark needs 32GB RAM, and 64GB for the 6th testset;
// - The whole package (except the 3rd, 4th, 5th and 6th datasets) is downloadable at:
//   www.sanmayce.com/QS_showdown_r13.zip
// - To reproduce the roster, run on Windows or Linux:
//   - BENCH_ICL32GB.BAT
//   - BENCH_ICL64GB.BAT
//   - sh bench_gcc32GB.sh
//   - sh bench_gcc64GB.sh
// - 3rd dataset is downloadable at:
// https://cdn.kernel.org/pub/linux/kernel/v5.x/linux-5.15.25.tar.xz
// - 4th dataset is downloadable at:
// https://download.fedoraproject.org/pub/fedora/linux/releases/35/Workstation/x86_64/iso/Fedora-Workstation-Live-x86_64-35-1.2.iso
// - 5th dataset is downloadable at:
// https://download.fedoraproject.org/pub/fedora/linux/releases/35/Workstation/aarch64/images/Fedora-Workstation-35-1.2.aarch64.raw.xz
// - 6th dataset is downloadable at:
// https://download.kiwix.org/zim/stack_exchange/math.stackexchange.com_en_all_2019-02.zim
// - Managed to reduce the mainloop to 68 bytes (from 71 in r.12); the gain comes from switching to pointers. I should have done that long ago, but wanted to keep the etude ARRAY-syntax friendly. Now r.12 uses arrays, whereas r.13 uses pointers. Another turning point: GCC now beats ICL in all tests.

The current best is at:
https://github.com/scandum/crumsort

Here comes Magnetica v13, a view from the kitchen. This time around, GCC v11.2.1 proves superior to ICL v15.0, generating faster code in all tests:
Code: [Select]
// And the Assembly code generated by GCC 11.2.1 for Magnetica_v13:
// 001 Quicksort_QB64_v13:
// 002 .LFB214:
// 003 .cfi_startproc
// 004 pushq %r15
// 005 .cfi_def_cfa_offset 16
// 006 .cfi_offset 15, -16
// 007 leaq -8(%rdi,%rsi,8), %r10
// 008 movq %rdi, %xmm0
// 009 movq %rdi, %rax
// 010 pushq %r14
// 011 .cfi_def_cfa_offset 24
// 012 .cfi_offset 14, -24
// 013 movq %r10, %xmm2
// 014 movl $2, %r15d
// 015 pushq %r13
// 016 .cfi_def_cfa_offset 32
// 017 .cfi_offset 13, -32
// 018 punpcklqdq %xmm2, %xmm0
// 019 pushq %r12
// 020 .cfi_def_cfa_offset 40
// 021 .cfi_offset 12, -40
// 022 pushq %rbp
// 023 .cfi_def_cfa_offset 48
// 024 .cfi_offset 6, -48
// 025 pushq %rbx
// 026 .cfi_def_cfa_offset 56
// 027 .cfi_offset 3, -56
// 028 subq $143920, %rsp
// 029 .cfi_def_cfa_offset 143976
// 030 movups %xmm0, -96(%rsp)
// 031 .p2align 4,,10
// 032 .p2align 3
// 033 .L5115:
// 034 movq %r10, %rcx
// 035 subq $2, %r15
// 036 subq %rax, %rcx
// 037 testq %rcx, %rcx
// 038 jle .L5096
// 039 leaq 8(%rax), %rbx
// 040 .L5113:
// 041 movq %rcx, %rsi
// 042 movq (%rax), %rdx
// 043 sarq $3, %rsi
// 044 cmpq $55, %rcx
// 045 ja .L5097
// 046 jmp *.L5099(,%rsi,8)
// 047 .section .rodata
// 048 .align 8
// 049 .align 4
// 050 .L5099:
// 051 .quad .L5097
// 052 .quad .L5104
// 053 .quad .L5103
// 054 .quad .L5102
// 055 .quad .L5101
// 056 .quad .L5100
// 057 .quad .L5098
// 058 .text
// 059 .p2align 4,,10
// 060 .p2align 3
// 061 .L5098:
// 062 movq 8(%rax), %rbp
// 063 movq 16(%rax), %rbx
// 064 xorl %ecx, %ecx
// 065 movq 24(%rax), %r11
// 066 movq 32(%rax), %r10
// 067 cmpq %rdx, %rbp
// 068 movq 40(%rax), %r9
// 069 movq 48(%rax), %r12
// 070 setb %cl
// 071 cmpq %rdx, %rbx
// 072 adcl $0, %ecx
// 073 cmpq %rdx, %r11
// 074 adcl $0, %ecx
// 075 cmpq %rdx, %r10
// 076 adcl $0, %ecx
// 077 cmpq %rdx, %r9
// 078 adcl $0, %ecx
// 079 cmpq %rdx, %r12
// 080 adcl $0, %ecx
// 081 xorl %r8d, %r8d
// 082 cmpq %rdx, %rbp
// 083 movl %ecx, -108(%rsp)
// 084 setnb %r8b
// 085 xorl %ecx, %ecx
// 086 cmpq %rbx, %rbp
// 087 seta %cl
// 088 addl %ecx, %r8d
// 089 xorl %ecx, %ecx
// 090 cmpq %r11, %rbp
// 091 seta %cl
// 092 addl %ecx, %r8d
// 093 xorl %ecx, %ecx
// 094 cmpq %r10, %rbp
// 095 seta %cl
// 096 addl %ecx, %r8d
// 097 xorl %ecx, %ecx
// 098 cmpq %r9, %rbp
// 099 seta %cl
// 100 addl %ecx, %r8d
// 101 xorl %ecx, %ecx
// 102 cmpq %r12, %rbp
// 103 seta %cl
// 104 xorl %edi, %edi
// 105 addl %ecx, %r8d
// 106 cmpq %rdx, %rbx
// 107 setnb %dil
// 108 xorl %ecx, %ecx
// 109 cmpq %rbx, %rbp
// 110 setbe %cl
// 111 addl %ecx, %edi
// 112 xorl %ecx, %ecx
// 113 cmpq %r11, %rbx
// 114 seta %cl
// 115 addl %ecx, %edi
// 116 xorl %ecx, %ecx
// 117 cmpq %r10, %rbx
// 118 seta %cl
// 119 addl %ecx, %edi
// 120 xorl %ecx, %ecx
// 121 cmpq %r9, %rbx
// 122 seta %cl
// 123 addl %ecx, %edi
// 124 xorl %ecx, %ecx
// 125 cmpq %r12, %rbx
// 126 seta %cl
// 127 xorl %esi, %esi
// 128 addl %ecx, %edi
// 129 cmpq %rdx, %r11
// 130 setnb %sil
// 131 xorl %ecx, %ecx
// 132 cmpq %r11, %rbp
// 133 setbe %cl
// 134 addl %ecx, %esi
// 135 xorl %ecx, %ecx
// 136 cmpq %r11, %rbx
// 137 setbe %cl
// 138 addl %ecx, %esi
// 139 xorl %ecx, %ecx
// 140 cmpq %r10, %r11
// 141 seta %cl
// 142 addl %ecx, %esi
// 143 xorl %ecx, %ecx
// 144 cmpq %r9, %r11
// 145 seta %cl
// 146 addl %ecx, %esi
// 147 xorl %ecx, %ecx
// 148 cmpq %r12, %r11
// 149 seta %cl
// 150 addl %ecx, %esi
// 151 xorl %ecx, %ecx
// 152 cmpq %rdx, %r10
// 153 setnb %cl
// 154 xorl %r13d, %r13d
// 155 cmpq %r10, %rbp
// 156 setbe %r13b
// 157 addl %r13d, %ecx
// 158 xorl %r13d, %r13d
// 159 cmpq %r10, %rbx
// 160 setbe %r13b
// 161 addl %r13d, %ecx
// 162 xorl %r13d, %r13d
// 163 cmpq %r10, %r11
// 164 setbe %r13b
// 165 addl %r13d, %ecx
// 166 xorl %r13d, %r13d
// 167 cmpq %r9, %r10
// 168 seta %r13b
// 169 addl %r13d, %ecx
// 170 xorl %r13d, %r13d
// 171 cmpq %r12, %r10
// 172 seta %r13b
// 173 xorl %r14d, %r14d
// 174 addl %r13d, %ecx
// 175 cmpq %rdx, %r9
// 176 setnb %r14b
// 177 xorl %r13d, %r13d
// 178 cmpq %r9, %rbp
// 179 setbe %r13b
// 180 addl %r14d, %r13d
// 181 xorl %r14d, %r14d
// 182 cmpq %r9, %rbx
// 183 setbe %r14b
// 184 addl %r13d, %r14d
// 185 xorl %r13d, %r13d
// 186 cmpq %r9, %r11
// 187 setbe %r13b
// 188 addl %r14d, %r13d
// 189 xorl %r14d, %r14d
// 190 cmpq %r9, %r10
// 191 setbe %r14b
// 192 addl %r13d, %r14d
// 193 xorl %r13d, %r13d
// 194 cmpq %r12, %r9
// 195 seta %r13b
// 196 addl %r14d, %r13d
// 197 movslq -108(%rsp), %r14
// 198 movq %rdx, (%rax,%r14,8)
// 199 movslq %r8d, %rdx
// 200 movl -108(%rsp), %r14d
// 201 movq %rbp, (%rax,%rdx,8)
// 202 movslq %edi, %rdx
// 203 movq %rbx, (%rax,%rdx,8)
// 204 movslq %esi, %rdx
// 205 movq %r11, (%rax,%rdx,8)
// 206 movslq %ecx, %rdx
// 207 movq %r10, (%rax,%rdx,8)
// 208 movslq %r13d, %rdx
// 209 movq %r9, (%rax,%rdx,8)
// 210 leal (%r14,%r8), %edx
// 211 addl %edi, %edx
// 212 addl %esi, %edx
// 213 addl %ecx, %edx
// 214 movl $21, %ecx
// 215 addl %r13d, %edx
// 216 subl %edx, %ecx
// 217 movslq %ecx, %rdx
// 218 movq %r12, (%rax,%rdx,8)
// 219 .p2align 4,,10
// 220 .p2align 3
// 221 .L5096:
// 222 testq %r15, %r15
// 223 je .L5095
// 224 .L5121:
// 225 movq -104(%rsp,%r15,8), %r10
// 226 movq -112(%rsp,%r15,8), %rax
// 227 jmp .L5115
// 228 .p2align 4,,10
// 229 .p2align 3
// 230 .L5103:
// 231 movq 8(%rax), %rdi
// 232 movq 16(%rax), %r8
// 233 xorl %ecx, %ecx
// 234 cmpq %rdx, %rdi
// 235 setb %cl
// 236 cmpq %rdx, %r8
// 237 adcl $0, %ecx
// 238 xorl %esi, %esi
// 239 cmpq %rdx, %rdi
// 240 setnb %sil
// 241 xorl %r9d, %r9d
// 242 cmpq %r8, %rdi
// 243 seta %r9b
// 244 addl %r9d, %esi
// 245 movslq %ecx, %r9
// 246 movq %rdx, (%rax,%r9,8)
// 247 movslq %esi, %rdx
// 248 addl %esi, %ecx
// 249 movq %rdi, (%rax,%rdx,8)
// 250 movl $3, %edx
// 251 subl %ecx, %edx
// 252 movslq %edx, %rdx
// 253 movq %r8, (%rax,%rdx,8)
// 254 testq %r15, %r15
// 255 jne .L5121
// 256 .L5095:
// 257 addq $143920, %rsp
// 258 .cfi_remember_state
// 259 .cfi_def_cfa_offset 56
// 260 popq %rbx
// 261 .cfi_def_cfa_offset 48
// 262 popq %rbp
// 263 .cfi_def_cfa_offset 40
// 264 popq %r12
// 265 .cfi_def_cfa_offset 32
// 266 popq %r13
// 267 .cfi_def_cfa_offset 24
// 268 popq %r14
// 269 .cfi_def_cfa_offset 16
// 270 popq %r15
// 271 .cfi_def_cfa_offset 8
// 272 ret
// 273 .p2align 4,,10
// 274 .p2align 3
// 275 .L5104:
// 276 .cfi_restore_state
// 277 movq 8(%rax), %rcx
// 278 cmpq %rdx, %rcx
// 279 sbbq %rsi, %rsi
// 280 andl $8, %esi
// 281 cmpq %rdx, %rcx
// 282 movq %rdx, (%rax,%rsi)
// 283 movl $1, %edx
// 284 sbbl $0, %edx
// 285 movslq %edx, %rdx
// 286 movq %rcx, (%rax,%rdx,8)
// 287 jmp .L5096
// 288 .p2align 4,,10
// 289 .p2align 3
// 290 .L5101:
// 291 movq 8(%rax), %r10
// 292 movq 16(%rax), %r9
// 293 xorl %ecx, %ecx
// 294 movq 24(%rax), %r8
// 295 movq 32(%rax), %rdi
// 296 cmpq %rdx, %r10
// 297 setb %cl
// 298 cmpq %rdx, %r9
// 299 adcl $0, %ecx
// 300 cmpq %rdx, %r8
// 301 adcl $0, %ecx
// 302 cmpq %rdx, %rdi
// 303 adcl $0, %ecx
// 304 xorl %esi, %esi
// 305 cmpq %rdx, %r10
// 306 setnb %sil
// 307 xorl %r11d, %r11d
// 308 cmpq %r9, %r10
// 309 seta %r11b
// 310 addl %r11d, %esi
// 311 xorl %r11d, %r11d
// 312 cmpq %r8, %r10
// 313 seta %r11b
// 314 addl %r11d, %esi
// 315 xorl %r11d, %r11d
// 316 cmpq %rdi, %r10
// 317 seta %r11b
// 318 addl %esi, %r11d
// 319 xorl %esi, %esi
// 320 cmpq %rdx, %r9
// 321 setnb %sil
// 322 xorl %ebx, %ebx
// 323 cmpq %r9, %r10
// 324 setbe %bl
// 325 addl %ebx, %esi
// 326 xorl %ebx, %ebx
// 327 cmpq %r8, %r9
// 328 seta %bl
// 329 addl %ebx, %esi
// 330 xorl %ebx, %ebx
// 331 cmpq %rdi, %r9
// 332 seta %bl
// 333 addl %esi, %ebx
// 334 xorl %esi, %esi
// 335 cmpq %rdx, %r8
// 336 setnb %sil
// 337 xorl %ebp, %ebp
// 338 cmpq %r8, %r10
// 339 setbe %bpl
// 340 addl %ebp, %esi
// 341 xorl %ebp, %ebp
// 342 cmpq %r8, %r9
// 343 setbe %bpl
// 344 addl %ebp, %esi
// 345 xorl %ebp, %ebp
// 346 cmpq %rdi, %r8
// 347 seta %bpl
// 348 addl %ebp, %esi
// 349 movslq %ecx, %rbp
// 350 addl %r11d, %ecx
// 351 movq %rdx, (%rax,%rbp,8)
// 352 movslq %r11d, %rdx
// 353 addl %ebx, %ecx
// 354 movq %r10, (%rax,%rdx,8)
// 355 movslq %ebx, %rdx
// 356 addl %esi, %ecx
// 357 movq %r9, (%rax,%rdx,8)
// 358 movslq %esi, %rdx
// 359 movq %r8, (%rax,%rdx,8)
// 360 movl $10, %edx
// 361 subl %ecx, %edx
// 362 movslq %edx, %rdx
// 363 movq %rdi, (%rax,%rdx,8)
// 364 jmp .L5096
// 365 .p2align 4,,10
// 366 .p2align 3
// 367 .L5102:
// 368 movq 8(%rax), %r10
// 369 movq 16(%rax), %r9
// 370 xorl %ecx, %ecx
// 371 movq 24(%rax), %r8
// 372 cmpq %rdx, %r10
// 373 setb %cl
// 374 cmpq %rdx, %r9
// 375 adcl $0, %ecx
// 376 cmpq %rdx, %r8
// 377 adcl $0, %ecx
// 378 xorl %edi, %edi
// 379 cmpq %rdx, %r10
// 380 setnb %dil
// 381 xorl %esi, %esi
// 382 cmpq %r9, %r10
// 383 seta %sil
// 384 addl %esi, %edi
// 385 xorl %esi, %esi
// 386 cmpq %r8, %r10
// 387 seta %sil
// 388 addl %esi, %edi
// 389 xorl %esi, %esi
// 390 cmpq %rdx, %r9
// 391 setnb %sil
// 392 xorl %r11d, %r11d
// 393 cmpq %r9, %r10
// 394 setbe %r11b
// 395 addl %r11d, %esi
// 396 xorl %r11d, %r11d
// 397 cmpq %r8, %r9
// 398 seta %r11b
// 399 addl %r11d, %esi
// 400 movslq %ecx, %r11
// 401 addl %edi, %ecx
// 402 movq %rdx, (%rax,%r11,8)
// 403 movslq %edi, %rdx
// 404 addl %esi, %ecx
// 405 movq %r10, (%rax,%rdx,8)
// 406 movslq %esi, %rdx
// 407 movq %r9, (%rax,%rdx,8)
// 408 movl $6, %edx
// 409 subl %ecx, %edx
// 410 movslq %edx, %rdx
// 411 movq %r8, (%rax,%rdx,8)
// 412 jmp .L5096
// 413 .p2align 4,,10
// 414 .p2align 3
// 415 .L5100:
// 416 movq 8(%rax), %r12
// 417 movq 16(%rax), %rbp
// 418 xorl %ecx, %ecx
// 419 movq 24(%rax), %r11
// 420 movq 32(%rax), %r10
// 421 cmpq %rdx, %r12
// 422 movq 40(%rax), %rbx
// 423 setb %cl
// 424 cmpq %rdx, %rbp
// 425 adcl $0, %ecx
// 426 cmpq %rdx, %r11
// 427 adcl $0, %ecx
// 428 cmpq %rdx, %r10
// 429 adcl $0, %ecx
// 430 cmpq %rdx, %rbx
// 431 adcl $0, %ecx
// 432 xorl %r9d, %r9d
// 433 cmpq %rdx, %r12
// 434 setnb %r9b
// 435 xorl %esi, %esi
// 436 cmpq %rbp, %r12
// 437 seta %sil
// 438 addl %esi, %r9d
// 439 xorl %esi, %esi
// 440 cmpq %r11, %r12
// 441 seta %sil
// 442 addl %esi, %r9d
// 443 xorl %esi, %esi
// 444 cmpq %r10, %r12
// 445 seta %sil
// 446 addl %esi, %r9d
// 447 xorl %esi, %esi
// 448 cmpq %rbx, %r12
// 449 seta %sil
// 450 xorl %r8d, %r8d
// 451 addl %esi, %r9d
// 452 cmpq %rdx, %rbp
// 453 setnb %r8b
// 454 xorl %esi, %esi
// 455 cmpq %rbp, %r12
// 456 setbe %sil
// 457 addl %esi, %r8d
// 458 xorl %esi, %esi
// 459 cmpq %r11, %rbp
// 460 seta %sil
// 461 addl %esi, %r8d
// 462 xorl %esi, %esi
// 463 cmpq %r10, %rbp
// 464 seta %sil
// 465 addl %esi, %r8d
// 466 xorl %esi, %esi
// 467 cmpq %rbx, %rbp
// 468 seta %sil
// 469 xorl %edi, %edi
// 470 addl %esi, %r8d
// 471 cmpq %rdx, %r11
// 472 setnb %dil
// 473 xorl %esi, %esi
// 474 cmpq %r11, %r12
// 475 setbe %sil
// 476 addl %esi, %edi
// 477 xorl %esi, %esi
// 478 cmpq %r11, %rbp
// 479 setbe %sil
// 480 addl %esi, %edi
// 481 xorl %esi, %esi
// 482 cmpq %r10, %r11
// 483 seta %sil
// 484 addl %esi, %edi
// 485 xorl %esi, %esi
// 486 cmpq %rbx, %r11
// 487 seta %sil
// 488 addl %esi, %edi
// 489 xorl %esi, %esi
// 490 cmpq %rdx, %r10
// 491 setnb %sil
// 492 xorl %r13d, %r13d
// 493 cmpq %r10, %r12
// 494 setbe %r13b
// 495 addl %r13d, %esi
// 496 xorl %r13d, %r13d
// 497 cmpq %r10, %rbp
// 498 setbe %r13b
// 499 addl %r13d, %esi
// 500 xorl %r13d, %r13d
// 501 cmpq %r10, %r11
// 502 setbe %r13b
// 503 addl %r13d, %esi
// 504 xorl %r13d, %r13d
// 505 cmpq %rbx, %r10
// 506 seta %r13b
// 507 addl %r13d, %esi
// 508 movslq %ecx, %r13
// 509 movq %rdx, (%rax,%r13,8)
// 510 movslq %r9d, %rdx
// 511 movq %r12, (%rax,%rdx,8)
// 512 movslq %r8d, %rdx
// 513 movq %rbp, (%rax,%rdx,8)
// 514 movslq %edi, %rdx
// 515 movq %r11, (%rax,%rdx,8)
// 516 movslq %esi, %rdx
// 517 movq %r10, (%rax,%rdx,8)
// 518 leal (%rcx,%r9), %edx
// 519 movl $15, %ecx
// 520 addl %r8d, %edx
// 521 addl %edi, %edx
// 522 addl %esi, %edx
// 523 subl %edx, %ecx
// 524 movslq %ecx, %rdx
// 525 movq %rbx, (%rax,%rdx,8)
// 526 jmp .L5096
// 527 .L5097:
// 528 sarq $2, %rsi
// 529 leaq (%rax,%rsi,8), %rcx
// 530 movq (%rcx), %r8
// 531 movq %rdx, (%rcx)
// 532 movq %r8, (%rax)
// 533 cmpq %rax, %r10
// 534 jbe .L5105
// 535 movq %rax, %rdx
// 536 movq %rax, %r9
// 537 movq %r10, %rcx
// 538 jmp .L5111
// 539 .p2align 4,,10
// 540 .p2align 3
// 541 .L5122:
// 542 movq (%r9), %rdi
// 543 movq %rsi, (%r9)
// 544 leaq 16(%rdx), %rsi
// 545 addq $8, %r9
// 546 movq %rdi, 8(%rdx)
// 547 movq %r11, %rdx
// 548 .L5107:
// 549 cmpq %rdx, %rcx
// 550 jbe .L5112
// 551 .L5111:
// 552 movq 8(%rdx), %rsi
// 553 leaq 8(%rdx), %r11
// 554 cmpq %r8, %rsi
// 555 jb .L5122
// 556 ja .L5120
// 557 leaq 16(%rdx), %rsi
// 558 movq %r11, %rdx
// 559 cmpq %rdx, %rcx
// 560 ja .L5111
// 561 .L5112:
// 562 movq 8(%rdx), %rdi
// 563 movq %rsi, %xmm0
// 564 movq %rcx, %rsi
// 565 movq %r10, %xmm1
// 566 subq %rdx, %rsi
// 567 leaq -8(%r9), %r10
// 568 punpcklqdq %xmm1, %xmm0
// 569 addq $2, %r15
// 570 movq %rdi, %r8
// 571 sarq $63, %rsi
// 572 subq 8(%rcx), %r8
// 573 movups %xmm0, -112(%rsp,%r15,8)
// 574 andq %r8, %rsi
// 575 subq %rsi, %rdi
// 576 movq %rdi, 8(%rdx)
// 577 addq %rsi, 8(%rcx)
// 578 movq %r10, %rcx
// 579 subq %rax, %rcx
// 580 testq %rcx, %rcx
// 581 jg .L5113
// 582 jmp .L5096
// 583 .p2align 4,,10
// 584 .p2align 3
// 585 .L5110:
// 586 subq $8, %rcx
// 587 .L5120:
// 588 movq (%rcx), %rdi
// 589 cmpq %r8, %rdi
// 590 ja .L5110
// 591 movq %rdi, 8(%rdx)
// 592 subq $8, %rcx
// 593 movq %rsi, 8(%rcx)
// 594 movq %r11, %rsi
// 595 jmp .L5107
// 596 .p2align 4,,10
// 597 .p2align 3
// 598 .L5105:
// 599 movq %rbx, %rsi
// 600 movq %rax, %rdx
// 601 movq %rax, %r9
// 602 movq %r10, %rcx
// 603 jmp .L5112
// 604 .cfi_endproc
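
Reading the listing, the jump table at .L5099 appears to dispatch partitions of up to 7 keys (the `cmpq $55` test on the byte span) to straight-line code that ranks every key by counting comparisons (the setb/seta/setbe/adcl runs) and then stores each key at its rank, with no data-dependent branches. The C below is a sketch of that idea reconstructed from the assembly, not the Magnetica source itself; `rank_sort_small` and `SMALL_MAX` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SMALL_MAX 7  /* assumed cutoff, inferred from the 'cmpq $55' test */

/* Rank sort for tiny arrays: a key's final index is the number of keys
   that must precede it. Each comparison contributes 0 or 1 to a counter
   instead of steering a branch. Ties are broken by original position,
   so the ranks always form a permutation of 0..n-1. */
void rank_sort_small(uint64_t *a, size_t n) {
    uint64_t v[SMALL_MAX];
    size_t rank[SMALL_MAX];
    assert(n <= SMALL_MAX);

    for (size_t i = 0; i < n; i++)
        v[i] = a[i];

    for (size_t i = 0; i < n; i++) {
        size_t r = 0;
        for (size_t j = 0; j < i; j++)
            r += (v[j] <= v[i]);   /* earlier keys win ties */
        for (size_t j = i + 1; j < n; j++)
            r += (v[j] < v[i]);    /* later keys must be strictly smaller */
        rank[i] = r;
    }

    for (size_t i = 0; i < n; i++)
        a[rank[i]] = v[i];
}
```

The generated code goes one step further than this sketch: it derives the last key's rank by subtracting the other ranks from n(n-1)/2, which is where the constants 3, 6, 10, 15 and 21 in the listing come from.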

Tomorrow night I will run the GCC and ICL executables on Windows 10. The test machine will be 'Brutalitto', the mini-monster with a Zen 2 4800H and 64GB DDR4, running the heaviest sort benchmark on the Internet: 7+ billion QWORDS...

Add-on:
Code: [Select]
Test run: 2022-Mar-09:
Laptop "Brutalitto", AMD 'Renoir' 4800H 4.3GHz max turbo, 64GB DDR4 3200MHz:
+--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+
| Performer/Keys     | #1, FEW distinct          | #2, MANY distinct         | #3, MANYmore distinct     | #4, ALL distinct          | #5, ALLmore distinct       | #6, ALLmax distinct         |
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
|  Operating System, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10, | Windows 10,  | Windows 10,  | Windows 10,  |
|      Compiler, -O3 | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1  | Intel v15.0 | GCC 11.2.1   | Intel v15.0  | GCC 11.2.1   |
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
| qsort              |  42 seconds |  45 seconds | 242 seconds | 280 seconds | 113 seconds | 131 seconds | 319 seconds | 354 seconds | 620 seconds |  695 seconds | 1282 seconds | 1438 seconds |
| Magnetica v.13     |  22 seconds |  21 seconds | 135 seconds | 134 seconds |  60 seconds |  59 seconds | 177 seconds | 177 seconds | 350 seconds |  349 seconds |  724 seconds |  722 seconds |
| Bentley-McIlroy    |  24 seconds |  24 seconds | 146 seconds | 142 seconds |  66 seconds |  64 seconds | 200 seconds | 193 seconds | 391 seconds |  376 seconds |  850 seconds |  803 seconds |
| Crumsort           |  20 seconds |  19 seconds |  91 seconds |  81 seconds |  44 seconds |  38 seconds | 126 seconds | 109 seconds | 246 seconds |  211 seconds |  567 seconds |  479 seconds |
+--------------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+-------------+--------------+--------------+--------------+
| Best Time (bare    |                           |                           |                           |                           |                            |                             |
| bone in-place QS): |  19s for Crumsort         |  81s for Crumsort         |  38s for Crumsort         | 109s for Crumsort         | 211s for Crumsort          | 479s for Crumsort           |
+--------------------+---------------------------+---------------------------+---------------------------+---------------------------+----------------------------+-----------------------------+

Code: [Select]
Speed Roster, (the base speed 1.00x is GLIBC's qsort):
Rank #1: 2943/937=  3.14x = 19+ 81+ 38+109+211+ 479=  937 seconds for Crumsort
Rank #2: 2943/1462= 2.01x = 21+134+ 59+177+349+ 722= 1462 seconds for Magnetica v.13
Rank #3: 2943/1602= 1.83x = 24+142+ 64+193+376+ 803= 1602 seconds for Bentley-McIlroy
Rank #4: 2943/2943= 1.00x = 45+280+131+354+695+1438= 2943 seconds for GLIBC's qsort
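
For anyone wanting to re-check the roster arithmetic, here is a minimal sketch in C. It assumes each total is the plain sum of the six GCC 11.2.1 timings from the table above and that the speedup is the qsort total divided by the sorter's total; the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define ROSTER_RUNS 6  /* six test files per sorter, as in the table */

/* Sum the six timings (in seconds) of one sorter. */
int roster_total(const int seconds[ROSTER_RUNS])
{
    int total = 0;
    for (size_t i = 0; i < ROSTER_RUNS; i++) {
        assert(seconds[i] > 0);
        total += seconds[i];
    }
    return total;
}

/* Speedup relative to the baseline total (GLIBC's qsort = 1.00x). */
double roster_speedup(int baseline_total, int sorter_total)
{
    assert(sorter_total > 0);
    return (double)baseline_total / (double)sorter_total;
}
```

Note that the roster seems to truncate the ratios to two decimals rather than round them, so printing with "%.2f" may differ in the last digit.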

4
Programs / Re: A skeleton code for Text Scroller via Drag-and-Drop
« on: February 22, 2022, 03:52:53 am »
Glad to share the latest-n-fastest Quicksort around - Magnetica r.5 - the source code is attached.

 
Quicksort Showdown_2022-Feb-22.pdf.png


I spotted a conditional swap within the main loop; in r.5 this branch is removed from the hottest loop and placed outside it.

So r.5 is algorithmically superior to r.4. The performance varies between AMD and Intel, and between ICL and GCC, but overall it pushes the limits once more.

Behold the simplicity itself:
Code: QB64: [Select]
// Magnetica r.5BB:
/*
; mark_description "Intel(R) C++ Compiler XE for applications running on Intel(R) 64, Version 15.0.0.108 Build 20140726";
; mark_description "-S -O3 -D_N_HIGH_PRIORITY";

Quicksort_QB64_v9       PROC
; parameter 1: rcx
; parameter 2: rdx
; parameter 3: r8
.B12.1::                        ; Preds .B12.0
        push      r13                                           ;1172.72
        push      r14                                           ;1172.72
        push      r15                                           ;1172.72
        push      rbp                                           ;1172.72
        mov       eax, 800024                                   ;1172.72
        call      __chkstk                                      ;1172.72
        sub       rsp, 800024                                   ;1172.72
        mov       eax, 2                                        ;1191.5
        mov       QWORD PTR [40+rsp], rdx                       ;1190.17
        mov       QWORD PTR [48+rsp], r8                        ;1191.17

.B12.2::                        ; Preds .B12.20 .B12.1
        mov       rdx, QWORD PTR [32+rsp+rax*8]                 ;1193.17
        mov       r13, QWORD PTR [24+rsp+rax*8]                 ;1194.16
        add       rax, -2                                       ;1195.31
        cmp       r13, rdx                                      ;1197.47
        jge       .B12.20       ; Prob 10%                      ;1197.47

.B12.3::                        ; Preds .B12.2
        mov       r9, r13                                       ;1199.13

.B12.4::                        ; Preds .B12.18 .B12.3
        mov       r10, QWORD PTR [rcx+r13*8]                    ;1229.13
        lea       r15, QWORD PTR [r13+rdx]                      ;1229.48
        sar       r15, 1                                        ;1229.56
        mov       r14, r13                                      ;1200.13
        mov       r11, rdx                                      ;1198.13
        mov       rbp, r9                                       ;1199.13
        mov       r9, r14                                       ;1200.13
        mov       r8, QWORD PTR [rcx+r15*8]                     ;1229.13
        mov       QWORD PTR [rcx+r15*8], r10                    ;1229.13
        mov       QWORD PTR [rcx+r13*8], r8                     ;1229.13

.B12.5::                        ; Preds .B12.4 .B12.13
        inc       r14                                           ;1282.27
        mov       r10, QWORD PTR [rcx+r14*8]                    ;1283.29
        cmp       r8, r10                                       ;1283.29
        ja        .B12.12       ; Prob 22%                      ;1283.29

.B12.6::                        ; Preds .B12.5
        jae       .B12.13       ; Prob 50%                      ;1287.36

.B12.7::                        ; Preds .B12.6
        mov       r15, QWORD PTR [rcx+r11*8]                    ;1288.35
        cmp       r8, r15                                       ;1288.35
        jae       .B12.11       ; Prob 10%                      ;1288.35

.B12.9::                        ; Preds .B12.7 .B12.9
        dec       r11                                           ;1289.39
        mov       r15, QWORD PTR [rcx+r11*8]                    ;1288.35
        cmp       r8, r15                                       ;1288.35
        jb        .B12.9        ; Prob 82%                      ;1288.35

.B12.11::                       ; Preds .B12.9 .B12.7
        mov       QWORD PTR [rcx+r14*8], r15                    ;1291.21
        dec       r14                                           ;1296.31
        mov       QWORD PTR [rcx+r11*8], r10                    ;1291.21
        dec       r11                                           ;1295.35
        jmp       .B12.13       ; Prob 100%                     ;1295.35

.B12.12::                       ; Preds .B12.5
        mov       r15, QWORD PTR [rcx+rbp*8]                    ;1284.21
        mov       QWORD PTR [rcx+rbp*8], r10                    ;1284.21
        inc       rbp                                           ;1285.31
        mov       QWORD PTR [rcx+r14*8], r15                    ;1284.21

.B12.13::                       ; Preds .B12.12 .B12.11 .B12.6
        cmp       r14, r11                                      ;1281.24
        jl        .B12.5        ; Prob 82%                      ;1281.24

.B12.14::                       ; Preds .B12.13
        jle       .B12.16       ; Prob 78%                      ;1299.22

.B12.15::                       ; Preds .B12.14
        mov       r8, QWORD PTR [8+rcx+r11*8]                   ;1300.21
        mov       r10, QWORD PTR [8+rcx+r14*8]                  ;1300.21
        mov       QWORD PTR [8+rcx+r14*8], r8                   ;1300.21
        mov       QWORD PTR [8+rcx+r11*8], r10                  ;1300.21

.B12.16::                       ; Preds .B12.14 .B12.15
        inc       r14                                           ;1655.25
        cmp       rdx, r14                                      ;1730.49
        jle       .B12.18       ; Prob 62%                      ;1730.49

.B12.17::                       ; Preds .B12.16
        add       rax, 2                                        ;1731.39
        mov       QWORD PTR [24+rsp+rax*8], r14                 ;1732.17
        mov       QWORD PTR [32+rsp+rax*8], rdx                 ;1733.17

.B12.18::                       ; Preds .B12.17 .B12.16
        lea       rdx, QWORD PTR [-1+rbp]                       ;1654.25
        cmp       r13, rdx                                      ;1197.47
        jl        .B12.4        ; Prob 82%                      ;1197.47

.B12.20::                       ; Preds .B12.18 .B12.2
        DB        15                                            ;1743.28
        DB        31                                            ;1743.28
        DB        64                                            ;1743.28
        DB        0                                             ;1743.28
        DB        15                                            ;1743.28
        DB        31                                            ;1743.28
        DB        128                                           ;1743.28
        DB        0                                             ;1743.28
        DB        0                                             ;1743.28
        DB        0                                             ;1743.28
        DB        0                                             ;1743.28
        test      rax, rax                                      ;1743.28
        jne       .B12.2        ; Prob 99%                      ;1743.28

.B12.21::                       ; Preds .B12.20
        add       rsp, 800024                                   ;1753.1
        pop       rbp                                           ;1753.1
        pop       r15                                           ;1753.1
        pop       r14                                           ;1753.1
        pop       r13                                           ;1753.1
        ret                                                     ;1753.1
        ALIGN     16

.B12.22::
; mark_end;
Quicksort_QB64_v9 ENDP
*/

So that we can gladly say once again that QB64 stands for "Quickness", my wish is for the QB64 community to have the qsm.h header (as a built-in function of the QB64 library) outperforming the qsort() implementations of both Intel and GNU. Enjoy.
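
For readers who prefer C to a compiler listing, here is a minimal sketch of the general shape such a routine can take: an iterative quicksort over 64-bit unsigned keys with an explicit stack and a three-way partition. To be clear, this is NOT the Magnetica r.5 source (that is in the attachment); the function name, the stack size and the textbook three-way partition are assumptions chosen for illustration, and no speed claims are made for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_STACK_SLOTS 128  /* hypothetical: 64 (left,right) pairs */

static void sketch_swap(uint64_t *a, uint64_t *b)
{
    uint64_t t = *a; *a = *b; *b = t;
}

/* Sorts keys[left..right] (inclusive bounds) ascending, unsigned compares. */
void sketch_quicksort_u64(uint64_t *keys, int64_t left, int64_t right)
{
    int64_t stack[SKETCH_STACK_SLOTS];
    int top = 0;

    assert(keys != NULL);
    stack[top++] = left;
    stack[top++] = right;

    while (top > 0) {
        right = stack[--top];
        left = stack[--top];
        while (left < right) {
            /* Pivot value taken from the middle of the range. */
            uint64_t pivot = keys[left + (right - left) / 2];
            int64_t lt = left, i = left, gt = right;

            /* Three-way partition: [<pivot | ==pivot | >pivot]. */
            while (i <= gt) {
                if (keys[i] < pivot)      sketch_swap(&keys[lt++], &keys[i++]);
                else if (keys[i] > pivot) sketch_swap(&keys[i], &keys[gt--]);
                else                      i++;
            }

            /* Push the larger side, keep looping on the smaller one,
               which keeps the stack depth logarithmic in the range size. */
            assert(top + 2 <= SKETCH_STACK_SLOTS);
            if (lt - left < right - gt) {
                stack[top++] = gt + 1;
                stack[top++] = right;
                right = lt - 1;
            } else {
                stack[top++] = left;
                stack[top++] = lt - 1;
                left = gt + 1;
            }
        }
    }
}
```

Usage would be along the lines of sketch_quicksort_u64(keys, 0, count - 1), mirroring the Left/Right style of call used elsewhere in this thread.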

5
Here comes the dedicated home thread of Ефиалти a.k.a. epiALTES - the Linux/Windows GUI audio application allowing simplistic navigation through your audio tree while visualizing the magnitude of the frequencies and the dynamic range of both channels:

 
Screenshot at 2022-01-11 20-21-13.png


As the thread title goes, playing .WAV files bigger than 4GB is supported. The header stores the size in a 4-byte field, which caps it at 4GB; however, e-pi-ALTES has two modes: when the file is bigger than 1GB (arbitrarily chosen) it switches to RAW mode, which disregards the header and reads ON-THE-FLY (without loading the file into RAM), thus enabling 4+GB support. For instance, I downloaded a YouTube video and converted/extracted the audio (with VLC); for 32+ hours of audio the file came to 20GB (try to navigate in those 100+ thousand seconds with [Shift+|Ctrl+]Arrows, the skip step is 1s/10s/100s). The funny thing is that neither KMP/POT player, nor Winamp, nor even VLC can play the 20GB .WAV - they stupidly read only some 5 hours of it:

 
20gb.png


The case-insensitive filter (shown at the bottom left corner) is applied instantaneously; the allowed keys are 'Backspace', 'Enter' and all letters, plus Up/Down/Home/End/PgUp/PgDn for scrolling:

 
Screenshot at 2022-01-11 20-23-15.png


The idea is to discard all unnecessary window pop-ups (as I saw Winamp doing) and to reduce the key presses to the bare minimum. In the next revisions I intend to add mouse support similar to the Masakari code; then the left hand will deal with the filtering while the right one rolls the wheel and clicks the third button instead of pressing 'Enter'... Often I wonder why the hell skilled coders cannot see how nifty simplicity is; here, not stressing the eyes with different colors/fonts/windows is the key to ... an easy-on-the-eyes GUI. Along with the simplistic approach, as a "by-product", comes the file-explorer-like list of your (usually) highly nested tree of files; seeing the "fullpath" gives the feeling of ... accessing a library.

The .ZIP package contains .EXE and .ELF files, i.e. the executables for Windows and Linux (32bit and 64bit).
Just extract the package in the parent folder of all your folders containing .MP3 and .WAV i.e. in the root of the tree of your phonotheque:

 
Screenshot at 2022-01-11 20-25-38.png


Make a link/shortcut to the .exe or .elf, place it on your desktop and enjoy!

Suggestions, feedback or criticism are welcome... the idea is for one unique player to be written in the QB64 language, showcasing the BASIC i.e. SIMPLISTIC approach to audio.

Edit, 2022-Jan-14:
Updated: fixed the path error by adding quotes; the bug happened for paths with spaces. Still, if there are no .MP3s or .WAVs in the current tree, an error will be thrown.

6
Programs / Re: Get frequency in sound using _Vince algorithm
« on: December 31, 2021, 12:19:34 am »
Glad to share the latest and finest e-pi-ALTES, it features the "snowfall" effect:

 
epi_r2++.png


Seeking (backwards or forwards) can be done with the Left/Right arrows as well.
The volume can be turned Up/Down with the Up/Down arrows as well.

Also, I did some screen-resolution-aware actions: on old laptops (with 1366x768) e-pi-ALTES enforces the 8x8 Toshiba font, while on FHD (1920x1080) laptops it enforces another superb Japanese font, 6x12 this time. My calculations on how to preserve undistorted scaling when you go FULL-SCREEN are in the source code as comments; hope it looks good.

I strongly recommend using e-pi-ALTES when listening to audiobooks or Japanese audio narration (the latter is AWESOMELY done); the visualization comes ... falling, as in the superb song ending the classic film "Roadhouse", which goes "The night comes falling".

I love this very revision, for the first time I see what I hear, hee-hee.

Having compiled it on Linux Fedora 35, here comes the promo video (this is what it will look like on old laptops, in 1680 or 1366 modes):



Have a nice new year!

7
Programs / Re: A skeleton code for Text Scroller via Drag-and-Drop
« on: December 26, 2021, 11:48:14 pm »
I was able to speed up the spell-check a little (by ~300ms, down to 1,677 ms) by checking each word only against the words of equal length.
That is, the previous stupid checking of each word against the whole 1,000,000+ Binary-Search pool is now limited to THE fraction of the pool where the corresponding length is located, which lowers the number of seeks/comparisons.

Additionally, all the UNFAMILIAR words are now dumped as an index placed at the very beginning, in order of appearance.
Thanks to that feature, one can quickly spot misspelled words; with only a few glances I found 2 buggy "daughters":
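
The length-limited lookup can be sketched in C roughly as follows. This is a minimal illustration under stated assumptions, not the TXT2HTM code from the attachment: it assumes the wordlist is sorted by length first and by strcmp order within each length, and the struct, the function names and the 63-character cap are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORD_LEN 63  /* hypothetical cap on word length */

typedef struct {
    const char **words;              /* sorted by length, then by strcmp */
    size_t count;
    size_t start[MAX_WORD_LEN + 2];  /* bucket for len: [start[len], start[len+1]) */
} LengthIndex;

/* Record where each length bucket begins in the sorted wordlist. */
void length_index_build(LengthIndex *ix, const char **words, size_t count)
{
    assert(ix != NULL && words != NULL);
    ix->words = words;
    ix->count = count;

    size_t pos = 0;
    for (size_t len = 0; len <= MAX_WORD_LEN + 1; len++) {
        while (pos < count && strlen(words[pos]) < len)
            pos++;
        ix->start[len] = pos;        /* first word with length >= len */
    }
}

/* Binary search restricted to the bucket of words with the same length. */
int length_index_contains(const LengthIndex *ix, const char *word)
{
    size_t len = strlen(word);
    if (len > MAX_WORD_LEN)
        return 0;

    size_t lo = ix->start[len], hi = ix->start[len + 1];
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        int c = strcmp(word, ix->words[mid]);
        if (c == 0) return 1;
        if (c < 0)  hi = mid;
        else        lo = mid + 1;
    }
    return 0;
}
```

Since each lookup touches only one bucket, the search starts from a much smaller range than the whole pool, which is where the saved seeks/comparisons come from.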
daughers
daughers


Project Gutenberg failed to deliver quality, obviously the correctors still rely on non-automatic methods.

Finally, we have a decent benchmark utilizing the Binary-Search in a speedy manner. It is a good start for all kinds of experiments, since as a result it creates the .HTM counterpart of our .TXT file (with all unfamiliar words in BOLD; all OED distinct words are used as the spell-check wordlist):

 
r7_1.png


It throws all the words of KJV Bible at OED 2nd edition, which is 700,000+ words against the 1,000,000+ OED words:

 
r7_2.png


The spotted errors are to go along with the entire .HTM into .DOC|.PDF file for hardcopy i.e. paper reviewing, with good old pencils:

 
r7_3.png


Spell-checking KJV Bible in 1.677 seconds, can we do better?

 
satanella_sharpen_.png


On Linux, the newer GCC (v10.2.1) generates 100ms faster code, lowering the conversion down to:

 
Screenshot at 2021-12-27 05-22-44.png


 
Screenshot at 2021-12-27 05-32-30.png


Anyway, this TXT2HTM code will be part of the incoming Masakari r.9, putting spell-checked .HTMs only a key press and a few seconds away...

The source code and binaries for Linux and Windows (also the OED wordlist) are in the attached package.

8
Programs / Re: Get frequency in sound using _Vince algorithm
« on: December 26, 2021, 04:11:16 am »
Glad to share the first operational revision, the QB64 source and the Windows executable (64bit, -O3 -mavx2) are in the attached file.
In order to play a WAV/MP3, just give the name as a parameter on the command line. Use quotes if necessary.

 
epi_r1+.png


My i5-7200U struggled bigtime when changing the FFT samples from 1024 to 4096, at 8192 the tearing is nasty.
I have an idea to precalculate (in next revisions) all the 8192 samples long chunk and to JUST draw the magnitudes.

Also, I encountered a nasty noise-like problem in the decoded stream (bug?!), many sequences with unnatural high values making the visual output blinking badly, what is this, anyone?!

EDIT, 2021-Dec-29:
The bug was fixed, revision 2 is now fully functional. I hate it when I don't have enough time to finish something; anyway, found it: the problem was in reading the decompressed stream at offsets not divisible by 2, duh.

I tested e-pi-ALTES with Japanese narration (Sekirei, Ninja Resurrection) - the visualization is as it should be - the vocal range is well presented, with some nice bell shapes in the Cyan zone, 1KHz-5KHz. To me, a visualizer of frequency magnitudes that fails to show a RICH presence in the vocal range is of no use.
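
For anyone hitting the same kind of noise, here is a minimal sketch in C of the alignment issue. It assumes 16-bit little-endian PCM (which is what "divisible by 2" suggests); the helper names are made up and this is not the actual e-pi-ALTES fix.

```c
#include <assert.h>
#include <stdint.h>

/* Round a byte offset down to the start of a whole sample frame.
   frame_bytes = channels * bytes per sample, e.g. 2 for 16-bit mono. */
uint64_t align_to_frame(uint64_t byte_offset, uint32_t frame_bytes)
{
    assert(frame_bytes > 0);
    return byte_offset - (byte_offset % frame_bytes);
}

/* Decode one signed 16-bit little-endian sample from two bytes. */
int16_t read_s16le(const uint8_t *p)
{
    uint16_t u = (uint16_t)(p[0] | ((uint16_t)p[1] << 8));
    return (int16_t)u;
}
```

Reading from an odd offset pairs the high byte of one sample with the low byte of the next, so quiet samples such as 1 and 2 decode as 512, which is the kind of unnaturally high value that makes the bars blink.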

 
epi_r2a.png


Visualizing the male vocal of the demon advisor Mori:

 
epi_r2b.png


You may press Alt+Enter to toggle fullscreen/windowed modes. On my i5-7200U the transforming speed is 25 frames-per-second for the highest FFT chunk (8192 samples); the chunk size can be 128, 256, 512, 1024, 2048, 4096 or 8192:

 
epi_r2c.png
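
How the chunk size relates to what the bars show can be sketched in C as below: bin k of an n-sample transform sits at k * sample_rate / n Hz, so bigger chunks give finer frequency steps. The function names are hypothetical and the 44100 Hz rate in the note after the code is an assumption; nothing here is taken from the e-pi-ALTES source.

```c
#include <assert.h>
#include <stddef.h>

/* Centre frequency in Hz of bin k for a chunk of n samples. */
double bin_to_hz(size_t k, size_t n, double sample_rate)
{
    assert(n > 0 && k <= n / 2 && sample_rate > 0.0);
    return (double)k * sample_rate / (double)n;
}

/* Index of the first bin at or above hz, clamped to the Nyquist bin n/2. */
size_t hz_to_first_bin(double hz, size_t n, double sample_rate)
{
    assert(n > 0 && hz >= 0.0 && sample_rate > 0.0);
    size_t k = 0;
    while (k < n / 2 && bin_to_hz(k, n, sample_rate) < hz)
        k++;
    return k;
}
```

At 44100 Hz, the 1KHz-5KHz vocal zone starts at bin 3 with 128 samples but at bin 186 with 8192 samples, which is why the bigger chunks show much more detail in that zone (and cost more CPU).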


As always, the source code is in the attached package, also 32bit and 64bit executables for Windows.

Enjoy!

9
Programs / Re: text file browser
« on: December 24, 2021, 01:23:44 am »
Hi Colonel_Panic,
in case you have not seen my text browser, you are welcome to chop whatever part you like from my Masakari source.

If you discard the unwanted parts, or rather salvage the useful ones, you can end up with even more simplistic viewer - which is always nice.

10
Programs / Re: Get frequency in sound using _Vince algorithm
« on: December 24, 2021, 01:10:58 am »
Glad to share the first DRAFT revision,
QB64 source plus the 32bit binary are attached.

 
epi1.png


The jumping of bars is too fast (even at 30 Frames-Per-Second), but this is as RAW as it gets; some tweaks are needed to smooth the animation, no idea at the moment which.

 
epi2.png


Critical feedback is welcome, as are suggestions on what can be improved.

11
Programs / Re: Binary Sequence Predictor "game"?
« on: December 23, 2021, 03:30:25 am »
Interesting, some time ago a fellow member posted some word-guesser, my suggestion is to try something new, namely "Letter-Guesser".

Instead of zeros and ones, the game would allow writing letters in a string (it could be more space-efficient if the written string is placed in a box instead of a line) and, in REAL-TIME, heh-heh, guess the next letter. Thus, whether or not the previous input was words making sense, it would be a useful ... auto-completion etude.

I have on my to-do list to write a phrase-guesser, deriving from millions of bi-grams (as a start), next year.

For a long time I have wanted to have (and share) a simple application allowing English words to be typed (in a kind of search box), with the word currently being typed PAIRED with the previous one, thus forming a bi-gram to be matched against some big-ass bi-gram corpus. Then when, for example, I type "Sylvester St", the predictor would suggest a few bi-grams, so people who are unaware of the family name of the beloved actor would be saved from the shame.
Here is what Google suggests when typing "Sy":

 
predict.png


Oh, and to boost its practicality: if the user started on the wrong foot, e.g. "Silvestor St", the predictor should be ready to fire the correct "Sylvester Stallone" - that is, it should be capable of detecting "errors" in the word before the last one as well.
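
The exact-prefix half of this idea can be sketched in C as below: with the bi-grams kept sorted, a binary search finds the first candidate and the matches follow contiguously. The function names are made up and the corpus is assumed to be a plain sorted array of strings; the error-tolerant part (catching "Silvestor") is deliberately NOT covered and would need some fuzzy matching on top.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Index of the first entry that is >= prefix in strcmp order (count if none). */
size_t bigram_lower_bound(const char **sorted, size_t count, const char *prefix)
{
    assert(sorted != NULL && prefix != NULL);
    size_t lo = 0, hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (strcmp(sorted[mid], prefix) < 0) lo = mid + 1;
        else                                 hi = mid;
    }
    return lo;
}

/* Stores up to max_out bi-grams starting with prefix; returns how many. */
size_t bigram_suggest(const char **sorted, size_t count, const char *prefix,
                      const char **out, size_t max_out)
{
    size_t plen = strlen(prefix);
    size_t i = bigram_lower_bound(sorted, count, prefix);
    size_t found = 0;

    /* Entries sharing the prefix sit next to each other in sorted order. */
    while (i < count && found < max_out &&
           strncmp(sorted[i], prefix, plen) == 0)
        out[found++] = sorted[i++];
    return found;
}
```

With a toy sorted list such as "Sylvester McCoy", "Sylvester Stallone", "Sylvester Stewart", "Sylvia Plath", typing "Sylvester St" yields the two "St..." entries, while "Silvestor St" yields nothing, which is exactly the case the error detection would have to handle.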

12
Programs / Re: Fast Fourier Transform
« on: December 23, 2021, 02:17:43 am »
Many thanks for sharing!
If I am successful, in the next few days I will share my BAR-JUMPER - to enable visual feedback for MP3/WAV files...

13
Programs / Re: Get frequency in sound using _Vince algorithm
« on: December 23, 2021, 12:32:42 am »
Love seeing those jumping bars, I am eager to announce what is on my mind - e-pi-ALTES - the QB64 Audio_(Frequency_Spectrum)_Visualizer:

 
epialtes copy.png


Three things inspired me to name my bar-jumper after the famous traitor:
- To honor Euler with 'e', to honor 'π' as well, as they are the basics of DFT (Discrete-Fourier-Transform);
- The etymology being "one who leaps upon";
- The superb dialogue in '300' movie, emphasizing the "beating heart", where the heart is the audio and spear/shield/sword are different frequencies:

Ephialtes : My father trained me to feel no fear to make spear and shield and sword as much a part of me as my own beating heart!
King Leonidas : [Ephialtes shows King Leonidas his thrust; it's good and the King is surprisingly impressed]  A fine thrust.

Indeed those jumping bars resemble the thrust of Epialtes' spear.

Note: Some write 'ph' and pronounce it as 'f', however the Greek historian Herodotus spelled it as Ἐπιάλτης, if you listen to the movie, the traitor himself pronounces P not F.

14
Programs / Re: Get frequency in sound using _Vince algorithm
« on: December 23, 2021, 12:19:58 am »
Nice, looks interesting.  I'll have to update QB64 at some point to get the new memsound.  I wish I would get more involved in this but I haven't had much interest in qb64 recently.  Btw, I did not invent these algorithms, it's decades old math translated into qb64.

Hi Vince and Petr,
thanks for sharing this exciting etude.

I want to write (with help and feedback from the community) a decent (at least not crappy) MP3/WAV frequency visualizer and will share my code in the next few days; at the moment I have a draft. The main goal is to have the jumping bars correspond to the actual frequency peaks, and to be not only synchronized but to drop no "frames".

By the way, @_vince, what was the source you used, can you share the link?
If your code fails to deliver, my intention is to write my own DFT etude, not FFT though - just the simplicity itself.

15
Speaking of QB64 IDE functionality, please consider adding the legacy shortcut 'Ctrl+Y' in future releases.
I love the current 2.01, but it falls short every time I need to put the current line into the clipboard while removing it; I used that too often in the past, truly useful it is.

Seeing that currently 'Ctrl+Y' is mapped onto REDO, I have a suggestion: remove REDO altogether while mimicking the Photoshop scheme:


UNDO - Ctrl+Z
Step Forward - Shift+Ctrl+Z
Step Backward - Alt+Ctrl+Z
