Author Topic: DIM a (Max Index) (Read 77718 times)

SMcNeill · « **Reply #30 on:** October 17, 2018, 09:39:10 pm »

Quote from: soniaa on October 17, 2018, 06:13:34 pm

The numbers I have
are in this style;
strings;
q1
q2
q3
....
q33
NOT q(1) etc.
The values in those strings are in the range 1-255

Can you share an example portion of the datafile/code? Like Codeguy, I'm lost as to how you can have both byte values and 33-bit values stored sequentially. How do you know how many bits/bytes to assign to a value? What delimiters are used? What structure does this data have?

More details are needed, if we're going to try and sort out a way to help speed things up for you.

codeguy · « **Reply #31 on:** October 17, 2018, 10:42:49 pm »

If you ever get this figured out, here's a decent in-place algorithm to use.
Performance profile for a DOUBLE-precision array, up to 134M+ elements (first column: upper array index n; second column: sort time in milliseconds):
1 0
3 0
7 0
15 0
31 0
63 0
127 0
255 0
511 0
1023 .9765625
2047 .9765625
4095 2.929688
8191 5.859375
16383 17.08984
32767 51.26953
65535 136.2305
131071 241.6992
262143 415.0391
524287 946.7773
1048575 1986.816
2097151 4613.77
4194303 9541.016
8388607 18500.98
16777215 42821.29
33554431 84074.22
67108863 186534.7
134217727 379580.1

demo code:

Code: QB64:

WIDTH 80
n& = 0
_CLIPBOARD$ = ""
DO
    n& = n& + 1
    REDIM a(0 TO n&) AS DOUBLE '* 65s
    DIM u AS LONG
    FOR u = LBOUND(a) TO UBOUND(a)
        a(u) = RND
    NEXT
    s# = TIMER(.001) '* DOUBLE keeps millisecond resolution; SINGLE loses it later in the day
    QuickSortIntrospective a(), LBOUND(a), UBOUND(a), 1
    f# = TIMER(.001)
    'd$ = "." + STRING$(15, "#")
    DIM l AS LONG
    l = LBOUND(a)
    '* Verify the result: STOP if any element is smaller than its predecessor
    FOR u = LBOUND(a) TO UBOUND(a)
        IF a(u) < a(l) THEN STOP
        'PRINT USING d$; a(u);
        l = u
    NEXT
    _CLIPBOARD$ = _CLIPBOARD$ + STR$(n&) + STR$(1000 * (f# - s#)) + CHR$(13) + CHR$(10)
    n& = n& * 2
LOOP
 

Complete IntroSort: it is in-place (uses no auxiliary array) and combines QuickSort, HeapSort and InsertionSort. This allows the maximum range of in-memory sorting in reasonable time.

Code: QB64:

'****************************************
'* The IntroSort algorithm extended to QB64 because I could find no pure QBxx code
'* 03Jun2017, by CodeGuy -- further mods for use in sorting library 03 Aug 2017
'* Introspective Sort (IntroSort) falls back to HeapSort once recursion goes deeper
'* than IntroSort_MaxRecurseLevel, and is good for highly redundant data (aka few unique).
'* For very short runs (32 elements or fewer), it falls back to InsertionSort.
 
SUB QuickSortIntrospective (CGSortLibArr() AS DOUBLE, IntroSort_start AS LONG, IntroSort_finish AS LONG, order AS INTEGER)
    DIM IntroSort_i AS LONG
    DIM IntroSort_J AS LONG
    STATIC IntroSort_level AS INTEGER
    STATIC IntroSort_MaxRecurseLevel AS INTEGER
    IntroSort_MaxRecurseLevel = 15
    IF IntroSort_start < IntroSort_finish THEN
        IF IntroSort_finish - IntroSort_start > 31 THEN
            IF IntroSort_level > IntroSort_MaxRecurseLevel THEN
                HeapSort CGSortLibArr(), IntroSort_start, IntroSort_finish, order
            ELSE
                IntroSort_level = IntroSort_level + 1
                QuickSortIJ CGSortLibArr(), IntroSort_start, IntroSort_finish, IntroSort_i, IntroSort_J, order
                QuickSortIntrospective CGSortLibArr(), IntroSort_start, IntroSort_J, order
                QuickSortIntrospective CGSortLibArr(), IntroSort_i, IntroSort_finish, order
                IntroSort_level = IntroSort_level - 1
            END IF
        ELSE
            InsertionSort CGSortLibArr(), IntroSort_start, IntroSort_finish, order
        END IF
    END IF
END SUB
 
SUB QuickSortIJ (CGSortLibArr() AS DOUBLE, start AS LONG, finish AS LONG, i AS LONG, j AS LONG, order AS INTEGER)
    DIM Compare AS DOUBLE '* MUST be the same type as CGSortLibArr()
    i = start
    j = finish
    Compare = CGSortLibArr(i + (j - i) \ 2)
    SELECT CASE order
        CASE 1
            DO
                DO WHILE CGSortLibArr(i) < Compare
                    i = i + 1
                LOOP
                DO WHILE CGSortLibArr(j) > Compare
                    j = j - 1
                LOOP
                IF i <= j THEN
                    IF i <> j THEN
                        SWAP CGSortLibArr(i), CGSortLibArr(j)
                    END IF
                    i = i + 1
                    j = j - 1
                END IF
            LOOP UNTIL i > j
        CASE ELSE
            DO
                DO WHILE CGSortLibArr(i) > Compare
                    i = i + 1
                LOOP
                DO WHILE CGSortLibArr(j) < Compare
                    j = j - 1
                LOOP
                IF i <= j THEN
                    IF i <> j THEN
                        SWAP CGSortLibArr(i), CGSortLibArr(j)
                    END IF
                    i = i + 1
                    j = j - 1
                END IF
            LOOP UNTIL i > j
    END SELECT
END SUB
 
SUB InsertionSort (CGSortLibArr() AS DOUBLE, start AS LONG, finish AS LONG, order AS INTEGER)
    DIM InSort_Local_ArrayTemp AS DOUBLE
    DIM InSort_Local_i AS LONG
    DIM InSort_Local_j AS LONG
    SELECT CASE order
        CASE 1
            FOR InSort_Local_i = start + 1 TO finish
                InSort_Local_j = InSort_Local_i - 1
                IF CGSortLibArr(InSort_Local_i) < CGSortLibArr(InSort_Local_j) THEN
                    InSort_Local_ArrayTemp = CGSortLibArr(InSort_Local_i)
                    DO UNTIL InSort_Local_j < start
                        IF (InSort_Local_ArrayTemp < CGSortLibArr(InSort_Local_j)) THEN
                            CGSortLibArr(InSort_Local_j + 1) = CGSortLibArr(InSort_Local_j)
                            InSort_Local_j = InSort_Local_j - 1
                        ELSE
                            EXIT DO
                        END IF
                    LOOP
                    CGSortLibArr(InSort_Local_j + 1) = InSort_Local_ArrayTemp
                END IF
            NEXT
        CASE ELSE
            FOR InSort_Local_i = start + 1 TO finish
                InSort_Local_j = InSort_Local_i - 1
                IF CGSortLibArr(InSort_Local_i) > CGSortLibArr(InSort_Local_j) THEN
                    InSort_Local_ArrayTemp = CGSortLibArr(InSort_Local_i)
                    DO UNTIL InSort_Local_j < start
                        IF (InSort_Local_ArrayTemp > CGSortLibArr(InSort_Local_j)) THEN
                            CGSortLibArr(InSort_Local_j + 1) = CGSortLibArr(InSort_Local_j)
                            InSort_Local_j = InSort_Local_j - 1
                        ELSE
                            EXIT DO
                        END IF
                    LOOP
                    CGSortLibArr(InSort_Local_j + 1) = InSort_Local_ArrayTemp
                END IF
            NEXT
    END SELECT
END SUB
 
SUB HeapSort (CGSortLibArr() AS DOUBLE, start AS LONG, finish AS LONG, order AS INTEGER)
    DIM i AS LONG
    DIM j AS LONG
    FOR i = start + 1 TO finish
        PercolateUp CGSortLibArr(), start, i, order
    NEXT i
 
    FOR i = finish TO start + 1 STEP -1
        SWAP CGSortLibArr(start), CGSortLibArr(i)
        PercolateDown CGSortLibArr(), start, i - 1, order
    NEXT i
END SUB
 
SUB PercolateDown (CGSortLibArr() AS DOUBLE, start AS LONG, MaxLevel AS LONG, order AS INTEGER)
    DIM i AS LONG
    DIM child AS LONG
    DIM ax AS LONG
    i = start
    '* Move the value in CGSortLibArr(start) down the heap until it has
    '* reached its proper node (that is, until it is less than its parent
    '* node or until it has reached MaxLevel, the bottom of the current heap):
    DO
        child = 2 * (i - start) + start ' Get the subscript for the Child node.
        '* Reached the bottom of the heap, so exit this procedure:
        IF child > MaxLevel THEN EXIT DO
        SELECT CASE order
            CASE 1
                '* If there are two Child nodes, find out which one is bigger:
                ax = child + 1
                IF ax <= MaxLevel THEN
                    IF CGSortLibArr(ax) > CGSortLibArr(child) THEN
                        child = ax
                    END IF
                END IF
 
                '* Move the value down if it is still not bigger than either one of
                '* its children:
                IF CGSortLibArr(i) < CGSortLibArr(child) THEN
                    SWAP CGSortLibArr(i), CGSortLibArr(child)
                    i = child
                ELSE
                    '* Otherwise, CGSortLibArr() has been restored to a heap from start to MaxLevel,
                    '* so exit:
                    EXIT DO
                END IF
            CASE ELSE
                '* If there are two Child nodes, find out which one is smaller:
                ax = child + 1
                IF ax <= MaxLevel THEN
                    IF CGSortLibArr(ax) < CGSortLibArr(child) THEN
                        child = ax
                    END IF
                END IF
                '* Move the value down if it is still not smaller than either one of
                '* its children:
                IF CGSortLibArr(i) > CGSortLibArr(child) THEN
                    SWAP CGSortLibArr(i), CGSortLibArr(child)
                    i = child
                ELSE
                    '* Otherwise, CGSortLibArr() has been restored to a heap from start to MaxLevel,
                    '* so exit:
                    EXIT DO
                END IF
        END SELECT
    LOOP
END SUB
 
SUB PercolateUp (CGSortLibArr() AS DOUBLE, start AS LONG, MaxLevel AS LONG, order AS INTEGER)
    DIM parent AS LONG
    DIM i AS LONG
    SELECT CASE order
        CASE 1
            i = MaxLevel
            '* Move the value in CGSortLibArr(MaxLevel) up the heap until it has
            '* reached its proper node (that is, until it is greater than either
            '* of its child nodes, or until it has reached start, the top of the heap):
            DO UNTIL i = start
                '* Get the subscript for the parent node.
                parent = start + (i - start) \ 2
                '* The value at the current node is still bigger than the value at
                '* its parent node, so swap these two array elements:
                IF CGSortLibArr(i) > CGSortLibArr(parent) THEN
                    SWAP CGSortLibArr(parent), CGSortLibArr(i)
                    i = parent
                ELSE
                    '* Otherwise, the element has reached its proper place in the heap,
                    '* so exit this procedure:
                    EXIT DO
                END IF
            LOOP
        CASE ELSE
            i = MaxLevel
            '* Move the value in CGSortLibArr(MaxLevel) up the heap until it has
            '* reached its proper node (that is, until it is smaller than either
            '* of its child nodes, or until it has reached start, the top of the heap):
            DO UNTIL i = start
                '* Get the subscript for the parent node.
                parent = start + (i - start) \ 2
                '* The value at the current node is still smaller than the value at
                '* its parent node, so swap these two array elements:
                IF CGSortLibArr(i) < CGSortLibArr(parent) THEN
                    SWAP CGSortLibArr(parent), CGSortLibArr(i)
                    i = parent
                ELSE
                    '* Otherwise, the element has reached its proper place in the heap,
                    '* so exit this procedure:
                    EXIT DO
                END IF
            LOOP
    END SELECT
END SUB
 

IntroSort, completely in-place. Don't forget to change AS LONG to AS _INTEGER64 to avoid overflow problems on indexes, and change the array types as appropriate.

Code to use _UNSIGNED LONG for indexes into arrays:

Code: QB64:

WIDTH 80
DIM testn AS _UNSIGNED LONG
testn = 0
'_CLIPBOARD$ = ""
DO
    testn = testn + 1
    REDIM a(0 TO testn) AS DOUBLE '* 65s
    DIM u AS _UNSIGNED LONG
    FOR u = LBOUND(a) TO UBOUND(a)
        a(u) = RND
    NEXT
    s# = TIMER(.001) '* DOUBLE keeps millisecond resolution; SINGLE loses it later in the day
    QuickSortIntrospective a(), LBOUND(a), UBOUND(a), 1
    f# = TIMER(.001)
    'd$ = "." + STRING$(15, "#")
    DIM lp AS _UNSIGNED LONG
    lp = LBOUND(a)
    '* Verify the result: STOP if any element is smaller than its predecessor
    FOR u = LBOUND(a) TO UBOUND(a)
        IF a(u) < a(lp) THEN STOP
        'PRINT USING d$; a(u);
        lp = u
    NEXT
    ''_CLIPBOARD$ = _CLIPBOARD$ + STR$(testn) + STR$(1000 * (f# - s#)) + CHR$(13) + CHR$(10)
    PRINT STR$(testn); STR$(1000 * (f# - s#))
    testn = testn * 2
LOOP
 
'****************************************
'* The IntroSort algorithm extended to QB64 because I could find no pure QBxx code
'* 03Jun2017, by CodeGuy -- further mods for use in sorting library 03 Aug 2017
'* Introspective Sort (IntroSort) falls back to HeapSort once recursion goes deeper
'* than IntroSort_MaxRecurseLevel, and is good for highly redundant data (aka few unique).
'* For very short runs (32 elements or fewer), it falls back to InsertionSort.
 
SUB QuickSortIntrospective (CGSortLibArr() AS DOUBLE, IntroSort_start AS _UNSIGNED LONG, IntroSort_finish AS _UNSIGNED LONG, order AS INTEGER)
    DIM IntroSort_i AS _UNSIGNED LONG
    DIM IntroSort_J AS _UNSIGNED LONG
    STATIC IntroSort_level AS INTEGER
    STATIC IntroSort_MaxRecurseLevel AS INTEGER
    IntroSort_MaxRecurseLevel = 15
    IF IntroSort_start < IntroSort_finish THEN
        IF IntroSort_finish - IntroSort_start > 31 THEN
            IF IntroSort_level > IntroSort_MaxRecurseLevel THEN
                HeapSort CGSortLibArr(), IntroSort_start, IntroSort_finish, order
            ELSE
                IntroSort_level = IntroSort_level + 1
                QuickSortIJ CGSortLibArr(), IntroSort_start, IntroSort_finish, IntroSort_i, IntroSort_J, order
                QuickSortIntrospective CGSortLibArr(), IntroSort_start, IntroSort_J, order
                QuickSortIntrospective CGSortLibArr(), IntroSort_i, IntroSort_finish, order
                IntroSort_level = IntroSort_level - 1
            END IF
        ELSE
            InsertionSort CGSortLibArr(), IntroSort_start, IntroSort_finish, order
        END IF
    END IF
END SUB
 
SUB QuickSortIJ (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, finish AS _UNSIGNED LONG, i AS _UNSIGNED LONG, j AS _UNSIGNED LONG, order AS INTEGER)
    DIM Compare AS DOUBLE '* MUST be the same type as CGSortLibArr()
    i = start
    j = finish
    Compare = CGSortLibArr(i + (j - i) \ 2)
    SELECT CASE order
        CASE 1
            DO
                DO WHILE CGSortLibArr(i) < Compare
                    i = i + 1
                LOOP
                DO WHILE CGSortLibArr(j) > Compare
                    j = j - 1
                LOOP
                IF i <= j THEN
                    IF i <> j THEN
                        SWAP CGSortLibArr(i), CGSortLibArr(j)
                    END IF
                    i = i + 1
                    j = j - 1
                END IF
            LOOP UNTIL i > j
        CASE ELSE
            DO
                DO WHILE CGSortLibArr(i) > Compare
                    i = i + 1
                LOOP
                DO WHILE CGSortLibArr(j) < Compare
                    j = j - 1
                LOOP
                IF i <= j THEN
                    IF i <> j THEN
                        SWAP CGSortLibArr(i), CGSortLibArr(j)
                    END IF
                    i = i + 1
                    j = j - 1
                END IF
            LOOP UNTIL i > j
    END SELECT
END SUB
 
SUB InsertionSort (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, finish AS _UNSIGNED LONG, order AS INTEGER)
    DIM InSort_Local_ArrayTemp AS DOUBLE
    DIM InSort_Local_i AS _UNSIGNED LONG
    DIM InSort_Local_j AS _UNSIGNED LONG
    SELECT CASE order
        CASE 1
            FOR InSort_Local_i = start + 1 TO finish
                InSort_Local_j = InSort_Local_i - 1
                IF CGSortLibArr(InSort_Local_i) < CGSortLibArr(InSort_Local_j) THEN
                    InSort_Local_ArrayTemp = CGSortLibArr(InSort_Local_i)
                    DO WHILE InSort_Local_j + 1 > start
                        IF (InSort_Local_ArrayTemp < CGSortLibArr(InSort_Local_j)) THEN
                            CGSortLibArr(InSort_Local_j + 1) = CGSortLibArr(InSort_Local_j)
                            InSort_Local_j = InSort_Local_j - 1
                        ELSE
                            EXIT DO
                        END IF
                    LOOP
                    CGSortLibArr(InSort_Local_j + 1) = InSort_Local_ArrayTemp
                END IF
            NEXT
        CASE ELSE
            FOR InSort_Local_i = start + 1 TO finish
                InSort_Local_j = InSort_Local_i - 1
                IF CGSortLibArr(InSort_Local_i) > CGSortLibArr(InSort_Local_j) THEN
                    InSort_Local_ArrayTemp = CGSortLibArr(InSort_Local_i)
                    DO WHILE InSort_Local_j + 1 > start
                        IF (InSort_Local_ArrayTemp > CGSortLibArr(InSort_Local_j)) THEN
                            CGSortLibArr(InSort_Local_j + 1) = CGSortLibArr(InSort_Local_j)
                            InSort_Local_j = InSort_Local_j - 1
                        ELSE
                            EXIT DO
                        END IF
                    LOOP
                    CGSortLibArr(InSort_Local_j + 1) = InSort_Local_ArrayTemp
                END IF
            NEXT
    END SELECT
END SUB
 
SUB HeapSort (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, finish AS _UNSIGNED LONG, order AS INTEGER)
    DIM i AS _UNSIGNED LONG
    DIM j AS _UNSIGNED LONG
    FOR i = start + 1 TO finish
        PercolateUp CGSortLibArr(), start, i, order
    NEXT i
 
    FOR i = finish TO start + 1 STEP -1
        SWAP CGSortLibArr(start), CGSortLibArr(i)
        PercolateDown CGSortLibArr(), start, i - 1, order
    NEXT i
END SUB
 
SUB PercolateDown (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, MaxLevel AS _UNSIGNED LONG, order AS INTEGER)
    DIM i AS _UNSIGNED LONG
    DIM child AS _UNSIGNED LONG
    DIM ax AS _UNSIGNED LONG
    i = start
    '* Move the value in CGSortLibArr(start) down the heap until it has
    '* reached its proper node (that is, until it is less than its parent
    '* node or until it has reached MaxLevel, the bottom of the current heap):
    DO
        child = 2 * (i - start) + start ' Get the subscript for the Child node.
        '* Reached the bottom of the heap, so exit this procedure:
        IF child > MaxLevel THEN EXIT DO
        SELECT CASE order
            CASE 1
                '* If there are two Child nodes, find out which one is bigger:
                ax = child + 1
                IF ax <= MaxLevel THEN
                    IF CGSortLibArr(ax) > CGSortLibArr(child) THEN
                        child = ax
                    END IF
                END IF
 
                '* Move the value down if it is still not bigger than either one of
                '* its children:
                IF CGSortLibArr(i) < CGSortLibArr(child) THEN
                    SWAP CGSortLibArr(i), CGSortLibArr(child)
                    i = child
                ELSE
                    '* Otherwise, CGSortLibArr() has been restored to a heap from start to MaxLevel,
                    '* so exit:
                    EXIT DO
                END IF
            CASE ELSE
                '* If there are two Child nodes, find out which one is smaller:
                ax = child + 1
                IF ax <= MaxLevel THEN
                    IF CGSortLibArr(ax) < CGSortLibArr(child) THEN
                        child = ax
                    END IF
                END IF
                '* Move the value down if it is still not smaller than either one of
                '* its children:
                IF CGSortLibArr(i) > CGSortLibArr(child) THEN
                    SWAP CGSortLibArr(i), CGSortLibArr(child)
                    i = child
                ELSE
                    '* Otherwise, CGSortLibArr() has been restored to a heap from start to MaxLevel,
                    '* so exit:
                    EXIT DO
                END IF
        END SELECT
    LOOP
END SUB
 
SUB PercolateUp (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, MaxLevel AS _UNSIGNED LONG, order AS INTEGER)
    DIM parent AS _UNSIGNED LONG
    DIM i AS _UNSIGNED LONG
    SELECT CASE order
        CASE 1
            i = MaxLevel
            '* Move the value in CGSortLibArr(MaxLevel) up the heap until it has
            '* reached its proper node (that is, until it is greater than either
            '* of its child nodes, or until it has reached start, the top of the heap):
            DO UNTIL i = start
                '* Get the subscript for the parent node.
                parent = start + (i - start) \ 2
                '* The value at the current node is still bigger than the value at
                '* its parent node, so swap these two array elements:
                IF CGSortLibArr(i) > CGSortLibArr(parent) THEN
                    SWAP CGSortLibArr(parent), CGSortLibArr(i)
                    i = parent
                ELSE
                    '* Otherwise, the element has reached its proper place in the heap,
                    '* so exit this procedure:
                    EXIT DO
                END IF
            LOOP
        CASE ELSE
            i = MaxLevel
            '* Move the value in CGSortLibArr(MaxLevel) up the heap until it has
            '* reached its proper node (that is, until it is smaller than either
            '* of its child nodes, or until it has reached start, the top of the heap):
            DO UNTIL i = start
                '* Get the subscript for the parent node.
                parent = start + (i - start) \ 2
                '* The value at the current node is still smaller than the value at
                '* its parent node, so swap these two array elements:
                IF CGSortLibArr(i) < CGSortLibArr(parent) THEN
                    SWAP CGSortLibArr(parent), CGSortLibArr(i)
                    i = parent
                ELSE
                    '* Otherwise, the element has reached its proper place in the heap,
                    '* so exit this procedure:
                    EXIT DO
                END IF
            LOOP
    END SELECT
END SUB
 

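InsertionSort was not very friendly to _UNSIGNED until its inner loop test was changed from DO UNTIL InSort_Local_j < start to DO WHILE InSort_Local_j + 1 > start, this way: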

Code: QB64:

WIDTH 80
DIM testn AS _UNSIGNED LONG
testn = 0
_CLIPBOARD$ = ""
DO
    testn = testn + 1
    REDIM a(0 TO testn) AS DOUBLE '* 65s
    DIM u AS _UNSIGNED LONG
    FOR u = LBOUND(a) TO UBOUND(a)
        a(u) = RND
    NEXT
    s# = TIMER(.001) '* DOUBLE keeps millisecond resolution; SINGLE loses it later in the day
    QuickSortIntrospective a(), LBOUND(a), UBOUND(a), 1
    f# = TIMER(.001)
    'd$ = "." + STRING$(15, "#")
    DIM lp AS _UNSIGNED LONG
    lp = LBOUND(a)
    '* Verify the result: STOP if any element is smaller than its predecessor
    FOR u = LBOUND(a) TO UBOUND(a)
        IF a(u) < a(lp) THEN STOP
        'PRINT USING d$; a(u);
        lp = u
    NEXT
    _CLIPBOARD$ = _CLIPBOARD$ + STR$(testn) + STR$(1000 * (f# - s#)) + CHR$(13) + CHR$(10)
    testn = testn * 2
LOOP
 
 
'****************************************
'* The IntroSort algorithm extended to QB64 because I could find no pure QBxx code
'* 03Jun2017, by CodeGuy -- further mods for use in sorting library 03 Aug 2017
'* Introspective Sort (IntroSort) falls back to HeapSort once recursion goes deeper
'* than IntroSort_MaxRecurseLevel, and is good for highly redundant data (aka few unique).
'* For very short runs (32 elements or fewer), it falls back to InsertionSort.
 
SUB QuickSortIntrospective (CGSortLibArr() AS DOUBLE, IntroSort_start AS _UNSIGNED LONG, IntroSort_finish AS _UNSIGNED LONG, order AS INTEGER)
    DIM IntroSort_i AS _UNSIGNED LONG
    DIM IntroSort_J AS _UNSIGNED LONG
    STATIC IntroSort_level AS INTEGER
    STATIC IntroSort_MaxRecurseLevel AS INTEGER
    IntroSort_MaxRecurseLevel = 15
    IF IntroSort_start < IntroSort_finish THEN
        IF IntroSort_finish - IntroSort_start > 31 THEN
            IF IntroSort_level > IntroSort_MaxRecurseLevel THEN
                HeapSort CGSortLibArr(), IntroSort_start, IntroSort_finish, order
            ELSE
                IntroSort_level = IntroSort_level + 1
                QuickSortIJ CGSortLibArr(), IntroSort_start, IntroSort_finish, IntroSort_i, IntroSort_J, order
                QuickSortIntrospective CGSortLibArr(), IntroSort_start, IntroSort_J, order
                QuickSortIntrospective CGSortLibArr(), IntroSort_i, IntroSort_finish, order
                IntroSort_level = IntroSort_level - 1
            END IF
        ELSE
            InsertionSort CGSortLibArr(), IntroSort_start, IntroSort_finish, order
        END IF
    END IF
END SUB
 
SUB QuickSortIJ (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, finish AS _UNSIGNED LONG, i AS _UNSIGNED LONG, j AS _UNSIGNED LONG, order AS INTEGER)
    DIM Compare AS DOUBLE '* MUST be the same type as CGSortLibArr()
    i = start
    j = finish
    Compare = CGSortLibArr(i + (j - i) \ 2)
    SELECT CASE order
        CASE 1
            DO
                DO WHILE CGSortLibArr(i) < Compare
                    i = i + 1
                LOOP
                DO WHILE CGSortLibArr(j) > Compare
                    j = j - 1
                LOOP
                IF i <= j THEN
                    IF i <> j THEN
                        SWAP CGSortLibArr(i), CGSortLibArr(j)
                    END IF
                    i = i + 1
                    j = j - 1
                END IF
            LOOP UNTIL i > j
        CASE ELSE
            DO
                DO WHILE CGSortLibArr(i) > Compare
                    i = i + 1
                LOOP
                DO WHILE CGSortLibArr(j) < Compare
                    j = j - 1
                LOOP
                IF i <= j THEN
                    IF i <> j THEN
                        SWAP CGSortLibArr(i), CGSortLibArr(j)
                    END IF
                    i = i + 1
                    j = j - 1
                END IF
            LOOP UNTIL i > j
    END SELECT
END SUB
 
SUB InsertionSort (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, finish AS _UNSIGNED LONG, order AS INTEGER)
    DIM InSort_Local_ArrayTemp AS DOUBLE
    DIM InSort_Local_i AS _UNSIGNED LONG
    DIM InSort_Local_j AS _UNSIGNED LONG
    SELECT CASE order
        CASE 1
            FOR InSort_Local_i = start + 1 TO finish
                InSort_Local_j = InSort_Local_i - 1
                IF CGSortLibArr(InSort_Local_i) < CGSortLibArr(InSort_Local_j) THEN
                    InSort_Local_ArrayTemp = CGSortLibArr(InSort_Local_i)
                    DO WHILE InSort_Local_j + 1 > start
                        IF (InSort_Local_ArrayTemp < CGSortLibArr(InSort_Local_j)) THEN
                            CGSortLibArr(InSort_Local_j + 1) = CGSortLibArr(InSort_Local_j)
                            InSort_Local_j = InSort_Local_j - 1
                        ELSE
                            EXIT DO
                        END IF
                    LOOP
                    CGSortLibArr(InSort_Local_j + 1) = InSort_Local_ArrayTemp
                END IF
            NEXT
        CASE ELSE
            FOR InSort_Local_i = start + 1 TO finish
                InSort_Local_j = InSort_Local_i - 1
                IF CGSortLibArr(InSort_Local_i) > CGSortLibArr(InSort_Local_j) THEN
                    InSort_Local_ArrayTemp = CGSortLibArr(InSort_Local_i)
                    DO WHILE InSort_Local_j + 1 > start
                        IF (InSort_Local_ArrayTemp > CGSortLibArr(InSort_Local_j)) THEN
                            CGSortLibArr(InSort_Local_j + 1) = CGSortLibArr(InSort_Local_j)
                            InSort_Local_j = InSort_Local_j - 1
                        ELSE
                            EXIT DO
                        END IF
                    LOOP
                    CGSortLibArr(InSort_Local_j + 1) = InSort_Local_ArrayTemp
                END IF
            NEXT
    END SELECT
END SUB
 
SUB HeapSort (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, finish AS _UNSIGNED LONG, order AS INTEGER)
    DIM i AS _UNSIGNED LONG
    DIM j AS _UNSIGNED LONG
    FOR i = start + 1 TO finish
        PercolateUp CGSortLibArr(), start, i, order
    NEXT i
 
    FOR i = finish TO start + 1 STEP -1
        SWAP CGSortLibArr(start), CGSortLibArr(i)
        PercolateDown CGSortLibArr(), start, i - 1, order
    NEXT i
END SUB
 
SUB PercolateDown (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, MaxLevel AS _UNSIGNED LONG, order AS INTEGER)
    DIM i AS _UNSIGNED LONG
    DIM child AS _UNSIGNED LONG
    DIM ax AS _UNSIGNED LONG
    i = start
    '* Move the value in CGSortLibArr(start) down the heap until it has
    '* reached its proper node (that is, until it is less than its parent
    '* node or until it has reached MaxLevel, the bottom of the current heap):
    DO
        child = 2 * (i - start) + start ' Get the subscript for the Child node.
        '* Reached the bottom of the heap, so exit this procedure:
        IF child > MaxLevel THEN EXIT DO
        SELECT CASE order
            CASE 1
                '* If there are two Child nodes, find out which one is bigger:
                ax = child + 1
                IF ax <= MaxLevel THEN
                    IF CGSortLibArr(ax) > CGSortLibArr(child) THEN
                        child = ax
                    END IF
                END IF
 
                '* Move the value down if it is still not bigger than either one of
                '* its children:
                IF CGSortLibArr(i) < CGSortLibArr(child) THEN
                    SWAP CGSortLibArr(i), CGSortLibArr(child)
                    i = child
                ELSE
                    '* Otherwise, CGSortLibArr() has been restored to a heap from start to MaxLevel,
                    '* so exit:
                    EXIT DO
                END IF
            CASE ELSE
                '* If there are two Child nodes, find out which one is smaller:
                ax = child + 1
                IF ax <= MaxLevel THEN
                    IF CGSortLibArr(ax) < CGSortLibArr(child) THEN
                        child = ax
                    END IF
                END IF
                '* Move the value down if it is still not smaller than either one of
                '* its children:
                IF CGSortLibArr(i) > CGSortLibArr(child) THEN
                    SWAP CGSortLibArr(i), CGSortLibArr(child)
                    i = child
                ELSE
                    '* Otherwise, CGSortLibArr() has been restored to a heap from start to MaxLevel,
                    '* so exit:
                    EXIT DO
                END IF
        END SELECT
    LOOP
END SUB
 
SUB PercolateUp (CGSortLibArr() AS DOUBLE, start AS _UNSIGNED LONG, MaxLevel AS _UNSIGNED LONG, order AS INTEGER)
    DIM parent AS _UNSIGNED LONG
    DIM i AS _UNSIGNED LONG
    SELECT CASE order
        CASE 1
            i = MaxLevel
            '* Move the value in CGSortLibArr(MaxLevel) up the heap until it has
            '* reached its proper node (that is, until it is greater than either
            '* of its child nodes, or until it has reached start, the top of the heap):
            DO UNTIL i = start
                '* Get the subscript for the parent node.
                parent = start + (i - start) \ 2
                '* The value at the current node is still bigger than the value at
                '* its parent node, so swap these two array elements:
                IF CGSortLibArr(i) > CGSortLibArr(parent) THEN
                    SWAP CGSortLibArr(parent), CGSortLibArr(i)
                    i = parent
                ELSE
                    '* Otherwise, the element has reached its proper place in the heap,
                    '* so exit this procedure:
                    EXIT DO
                END IF
            LOOP
        CASE ELSE
            i = MaxLevel
            '* Move the value in CGSortLibArr(MaxLevel) up the heap until it has
            '* reached its proper node (that is, until it is smaller than either
            '* of its child nodes, or until it has reached start, the top of the heap):
            DO UNTIL i = start
                '* Get the subscript for the parent node.
                parent = start + (i - start) \ 2
                '* The value at the current node is still smaller than the value at
                '* its parent node, so swap these two array elements:
                IF CGSortLibArr(i) < CGSortLibArr(parent) THEN
                    SWAP CGSortLibArr(parent), CGSortLibArr(i)
                    i = parent
                ELSE
                    '* Otherwise, the element has reached its proper place in the heap,
                    '* so exit this procedure:
                    EXIT DO
                END IF
            LOOP
    END SELECT
END SUB
 

codeguy · « **Reply #32 on:** October 18, 2018, 03:43:42 am »

There really is no point changing from LONG, because long before you overflow the range of LONG numbers you'll run out of heap space anyway, even storing _BYTE arrays. At most 1, MAYBE 2 GB will be in memory while the other 4 or 5 GB sit untouched, awaiting processing. My suggestion? Split whatever it is you're sorting into files of no more than 1 GB and sort each individually. Then merge those files sequentially. Even optimistically, I wouldn't expect anything greater than 1/2 million BYTE-size elements per second for each GHz of CPU speed from a generalized comparison sort. You MIGHT be lucky and get 1.5 GB/GHz this way. Yes, there are specialized sorts that can handle FAR better hourly volume, but the information you've given us makes choosing a truly suitable algorithm pretty much impossible. Believe me, I know. I am among the most viewed and upvoted in Sorting algorithms on Quora, as well as a Top Writer for 2018. I don't know everything, but I am very well versed in what I do know. Please feel free to add more specifics and clarify what you really need so we can give a more satisfactory answer. I have only discussed POSSIBLE solutions to your problem as described, as that is all the information you've given allows.
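
To make the "sort each file, then merge" suggestion concrete, here is a minimal sketch of the merge step for two files. It assumes each file already holds DOUBLE values in ascending order; the file names chunk1.bin, chunk2.bin and merged.bin are placeholders, and it reads one value at a time for clarity rather than speed.

```qb64
'Merge two already-sorted binary files of DOUBLE values into one sorted file.
'The three file names are placeholders for this sketch.
DIM a AS DOUBLE, b AS DOUBLE
DIM posA AS _INTEGER64, posB AS _INTEGER64
DIM lenA AS _INTEGER64, lenB AS _INTEGER64

OPEN "chunk1.bin" FOR BINARY AS #1
OPEN "chunk2.bin" FOR BINARY AS #2
OPEN "merged.bin" FOR OUTPUT AS #3: CLOSE #3 'start with an empty output file
OPEN "merged.bin" FOR BINARY AS #3

lenA = LOF(1): lenB = LOF(2)
posA = 1: posB = 1 'byte positions in a BINARY file start at 1
IF posA <= lenA THEN GET #1, posA, a
IF posB <= lenB THEN GET #2, posB, b

'While both files still have data, write the smaller head value.
DO WHILE posA <= lenA AND posB <= lenB
    IF a <= b THEN
        PUT #3, , a
        posA = posA + 8 'a DOUBLE occupies 8 bytes
        IF posA <= lenA THEN GET #1, posA, a
    ELSE
        PUT #3, , b
        posB = posB + 8
        IF posB <= lenB THEN GET #2, posB, b
    END IF
LOOP

'Copy whatever is left in the file that has not run out.
DO WHILE posA <= lenA
    PUT #3, , a
    posA = posA + 8
    IF posA <= lenA THEN GET #1, posA, a
LOOP
DO WHILE posB <= lenB
    PUT #3, , b
    posB = posB + 8
    IF posB <= lenB THEN GET #2, posB, b
LOOP
CLOSE
```

Only two values are in memory at any moment, which is the point of merging sequentially. A real run would want buffered reads, and more than two chunks can be handled by merging pairwise.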

soniaa · « **Reply #33 on:** October 18, 2018, 11:57:10 am »

Quote from: soniaa on October 17, 2018, 06:13:34 pm

The numbers I have
are in this style;

ARRAY (DIMensioned _INTEGER64) - (NOT strings) (message modified)
q1
q2
q3
....
the last is q31

NOT q(1) etc.

The values in those ARRAY are in the range (1 - 8.589.934.590)

In the program these numbers do not arrive in sequence, and I do not need to sort them.

Every time I acquire a number, I need to know whether that number has already been processed;
q1 .... q31
- IF already processed THEN howmanytimesthisnumberalreadyprocessed = howmanytimesthisnumberalreadyprocessed + 1
(howmanytimesthisnumberalreadyprocessed is in the range 0 - 255 (_UNSIGNED _BYTE))

- IF NOT already processed THEN record it, because it is a new number
(in this case, to increase speed, it is not necessary to also assign 1 to howmanytimesthisnumberalreadyprocessed)

SMcNeill · « **Reply #34 on:** October 18, 2018, 12:38:17 pm »

Easiest way in this case is a BINARY file:

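DIM number AS _INTEGER64
DIM TimesProcessed AS _UNSIGNED _BYTE
OPEN "RecordCount.bin" FOR BINARY AS #1

DO
    INPUT number
    IF number = -1 THEN EXIT DO 'or some other exit code; test it before using it as a file position
    GET #1, number, TimesProcessed 'the byte position is the number itself
    TimesProcessed = TimesProcessed + 1 'an _UNSIGNED _BYTE holds at most 255
    PUT #1, number, TimesProcessed
LOOP
CLOSE #1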

The above gets a number, puts it on the hard drive in a file which will be 8,589,934,590 bytes in size, and each byte will tell you how many times that number has "been processed". (Up to a maximum of 255 times).

***********************************

Now, this method is assuming you have 8 billion records to work with. If you have a limited dataset (the q31 almost seems to indicate there's only 31 records with values from 0 to 8.5B??), then I'd build an index, store the value and count, and keep it all in memory for speed and efficiency.

But, since it seems like you're going to process billions of records (2 years to run), saving the results in a file seems to me to be the best way to do this.
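
For the limited-dataset case mentioned above, here is one possible sketch of an in-memory index. It uses an open-addressing hash table with linear probing. The names TableSize, Keys, Counts and CountValue are made up for this illustration, and the table size is an assumption that must stay comfortably larger than the number of distinct values you expect.

```qb64
'In-memory counter: open-addressing hash table keyed by the _INTEGER64 value.
'TableSize is an assumed capacity, not a figure from this thread.
CONST TableSize = 1000003
DIM SHARED Keys(0 TO TableSize - 1) AS _INTEGER64 '0 marks an empty slot (valid values start at 1)
DIM SHARED Counts(0 TO TableSize - 1) AS _UNSIGNED _BYTE

PRINT CountValue(8589934590) 'first sighting of this value
PRINT CountValue(42) 'first sighting of a different value
PRINT CountValue(8589934590) 'second sighting of the first value
END

'Records one occurrence of v and returns how many times it has now been seen.
FUNCTION CountValue% (v AS _INTEGER64)
    DIM slot AS LONG
    slot = v MOD TableSize
    'Linear probing: step forward until we find v or an empty slot.
    DO UNTIL Keys(slot) = 0 OR Keys(slot) = v
        slot = slot + 1
        IF slot = TableSize THEN slot = 0
    LOOP
    Keys(slot) = v
    IF Counts(slot) < 255 THEN Counts(slot) = Counts(slot) + 1 'stay at 255 instead of wrapping
    CountValue% = Counts(slot)
END FUNCTION
```

Each entry costs 9 bytes here, so this only beats the file approach while the number of distinct values fits in RAM. The probe loop never ends if the table fills up, which is why the capacity has to be chosen generously.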

soniaa · « **Reply #35 on:** October 18, 2018, 12:43:01 pm »

The strategies I tried are the following:

strategy1;
create a file.txt for every number that the program creates
and write howmanytimesthisnumberalreadyprocessed into this .txt file
To do this I do;
- IF _FILEEXISTS (0 means it is a new number, -1 means it is not a new number)
- OPEN FOR APPEND (if the file does not exist, this creates it)
- WRITE 1 (appends a new line containing "1" to the .txt file - this way is also good enough)
- CLOSE

strategy2;
DIM Array(1 TO 8589934590) AS _UNSIGNED _BYTE
- IF Array(mynumber) = 0 THEN GOTO ......
- IF Array(mynumber) > 0 THEN Array(mynumber) = Array(mynumber) + 1

strategy3;
OPEN "c.dat" FOR BINARY AS #1
GET #1, mynumber, howmanytimesthisnumberalreadyprocessed

- IF howmanytimesthisnumberalreadyprocessed > 0 THEN PUT #1, mynumber, howmanytimesthisnumberalreadyprocessed + 1: GOTO...

- PUT #1, mynumber, 1

SMcNeill · « **Reply #36 on:** October 18, 2018, 12:48:50 pm »

Strategy 3 doesn't even require an IF; just an increment since 0 + 1 = 1

Get #1, number, TimesProcessed
TimesProcessed = TimesProcessed + 1
Put #1, number, TimesProcessed

Logically, if the value is 0, it's going to increment to 1 when you do the math.

TempodiBasic · « **Reply #37 on:** October 18, 2018, 01:44:32 pm »

Hi guys, hi Soniaa, and welcome.

1. It seems that you must calculate the frequency of each number in a gigabyte-sized set of data, with values in the range you posted:

Quote

The numbers I have
are in this style;

ARRAY (DIMensioned _INTEGER64) - (NOT strings) (message modified)
q1
q2
q3
....
the last is q31

NOT q(1) etc.

The values in those ARRAY are in the range (1 - 8.589.934.590) ( _INTEGER64 )

Quote

my question;
in the program I'm making, I estimate that the processing will take 2 years.
I have to adopt several strategies to get the maximum speed.

In some phases of the program, I have numbers in the range (1 to 8.589.934.591) (33bit)
NOT all numbers in the range, ONLY 20%.

I have to check if the number I'm processing has already been processed and how many times.

2. About https://www.qb64.org/forum/index.php?topic=705.msg5867#msg5867
If you cannot write in English, please help yourself with Google Translate; it is better than nothing! That way you can break the language barrier and get more from the help of the great coders who are answering you (except me, I am only a hobby coder).

3. About this:

Quote

(howmanytimesthisnumberalreadyprocessed is in the range 0 - 255 (_UNSIGNED _BYTE))

My questions, to understand better (you are talking about processing for at least 2 years, so an error would waste a lot of time!):
3.1 How can you be sure that the frequency cannot be more than 255?
3.2 Must the numbers that are in the range (1 - 8.589.934.590) but are never acquired be registered with frequency 0?

4. About the ARRAY

Quote

ARRAY (DIMensioned _INTEGER64) - (NOT strings) (message modified)
q1
q2
q3
....
the last is q31

NOT q(1) etc.

Quote

Every time I acquire a number, I need to know whether that number has already been processed;
q1 .... q31
- IF already processed THEN howmanytimesthisnumberalreadyprocessed = howmanytimesthisnumberalreadyprocessed + 1
(howmanytimesthisnumberalreadyprocessed is in the range 0 - 255 (_UNSIGNED _BYTE))

- IF NOT already processed THEN record it, because it is a new number
(in this case, to increase speed, it is not necessary to also assign 1 to howmanytimesthisnumberalreadyprocessed)

Are you sure that this is an array? You talk of an array q1 q2 q3 up to q31, and not q(1) q(2).... It seems to me that this is packed field data to be managed with a UDT.
4.1 How can you distinguish among q1 q2 q3 ... q31 of the array while acquiring the data? Or do you acquire a single number with a large range?
4.2 How do you know that you have only q31 numbers to process and to count the frequency of?

5. Can you be clearer by using pseudocode in large chunks? E.g.

Quote

acquire a large number ranging from 1 to 8....... -> record it in a list of processed numbers -> count how many times it has been processed

Thanks for reading

soniaa · « **Reply #38 on:** October 18, 2018, 02:41:28 pm »

Thanks for all the replies;
I do not care about ordering the numbers.

In reply to;

SMcNeill;
the IF is necessary for me because;
- if GET acquisition >1 THEN put value +1 AND goto inaditection
- else PUT value 1 AND goto otherdirection

TempodiBasic;
3.1 - its max is 255
3.2 - it is not necessary
4.1 - I wrote the program so that it does not use q(1), because that gives low performance; q1 is much faster
4.2 - the program loops through 31 steps, forward and back, billions of times (brute force)
5.0 - ?????

soniaa · « **Reply #39 on:** October 18, 2018, 02:58:20 pm »

The numbers generated by the array q1 q2 .... q31 can be stored together in a single registry,
but in some way they must be associated with howmanytimesalreadyprocessed (range 1 - 255)

The registry must not keep any reference to whether a number arrived from q1 or q2...

SMcNeill · « **Reply #40 on:** October 18, 2018, 03:10:20 pm »

Quote from: soniaa on October 18, 2018, 02:41:28 pm

SMcNeill;
the IF is necessary for me because;
- if GET acquisition >1 THEN put value +1 AND goto inaditection
- else PUT value 1 AND goto otherdirection

The IF action is independent of the GET/ PUT action.

Get #1, number, TimesProcessed
TimesProcessed = TimesProcessed + 1
Put #1, number, TimesProcessed
IF TimesProcessed > 1 THEN GOTO inaditection ELSE GOTO otherdirection

soniaa · « **Reply #41 on:** October 18, 2018, 03:48:46 pm »

Quote from: SMcNeill on October 18, 2018, 03:10:20 pm

Quote from: soniaa on October 18, 2018, 02:41:28 pm
SMcNeill;
the IF is necessary for me because;
- if GET acquisition >1 THEN put value +1 AND goto inaditection
- else PUT value 1 AND goto otherdirection

The IF action is independent of the GET/ PUT action.

Get #1, number, TimesProcessed
TimesProcessed = TimesProcessed + 1
Put #1, number, TimesProcessed
IF TimesProcessed > 1 THEN GOTO inaditection ELSE GOTO otherdirection

****what you wrote is exactly the same as what I wrote !!??...
...or no?

SMcNeill · « **Reply #42 on:** October 18, 2018, 04:02:31 pm »

Here's a real world implementation of how I'd handle this problem:

Code: QB64: [Select]

DIM q AS _INTEGER64
DIM TimesProcessed AS _UNSIGNED _BYTE
 
RANDOMIZE TIMER
 
PRINT "Clearing and Opening Multiple files.  This will take a few seconds."
FOR i = 0 TO 8589 'multiple files to hold our values, since we have such a large range for output
    file$ = "FileCounter" + LTRIM$(RTRIM$(STR$(i))) + ".bin"
    OPEN file$ FOR OUTPUT AS #i + 1: CLOSE #i + 1 'blank the file if it exists
    OPEN file$ FOR BINARY AS #i + 1 'open a new file to track values
NEXT
 
Limit = 100 'The number of passes we're going to process (times 31, since you have values from q1 to q31)
 
 
StartTime## = TIMER
FOR i = 1 TO Limit
    FOR j = 1 TO 31 '31 times since the values are for q1 to q31
        q = INT(RND * 8589934590)
        FileNumber = q \ 1000000
        ValueStored = q MOD 1000000 + 1 'file positions start at 1, so a remainder of 0 maps to byte 1
        GET #FileNumber + 1, ValueStored, TimesProcessed
        TimesProcessed = TimesProcessed + 1
        PUT #FileNumber + 1, ValueStored, TimesProcessed
        IF TimesProcessed = 1 THEN PRINT "New Entry", ELSE PRINT "Processed"; TimesProcessed; " already",
    NEXT
NEXT
CLOSE 'close all files
PRINT
FinishTime## = TIMER - StartTime##
 
PRINT USING "#####.##### seconds to process 31 numbers, ### times."; FinishTime##, Limit
 
PRINT
PRINT
PRINT "Clean Up the Disk?  (Y/N)"
DO
    _LIMIT 30
    i$ = UCASE$(INKEY$)
LOOP UNTIL i$ = "Y" OR i$ = "N"
 
IF i$ = "Y" THEN
    PRINT
    PRINT "Cleaning disk... This too may take several seconds."
    FOR i = 0 TO 8589 'multiple files to hold our values, since we have such a large range for output
        file$ = "FileCounter" + LTRIM$(RTRIM$(STR$(i))) + ".bin"
        KILL file$
    NEXT
END IF
SYSTEM
 

Now, a few things to note:

1) Notice that I've broken my data down to 1MB files? This is because it's much faster for the drive to reserve space for a new file that's 1MB in size, than it is to do the same for one that's 8GB in size. In fact, it's a lot faster to create almost 9000 1MB files, than it is for my drive to create and reserve space for a single 8GB one. Play around with the values if you want and see how the speed is affected...

2) It still takes me about 40 seconds to process 100 sets of numbers using this method, but that's the hit you're going to get for having and using so much file access. Times will probably be much better on your SSD than they are on my old drive here, but it's still going to be a lot slower than running some sort of counter in memory.

3) You've never told us what the MAXIMUM AMOUNT OF DATA NEEDS TO BE PROCESSED. If there's less than a million total records, I'd use an index to see what values have appeared, increment them, and keep the whole process in memory. You've told us that there's 31 values, ranging from 0 to 8.5B, but you haven't mentioned how many times you're going to repeat this process. If there's 31 total values, there's absolutely no reason to use file access to track number of times a value is processed. I'd only go the file route, like above, if it was a process which I had to repeat billions of times for those q1 to q31 numbers. (And how can you be certain that they're only going to occur less than 255 times total, if there's billions of times of them being processed?)

From what you've said, the above is the best way I can think of to deal with the problem you've presented: tracking a counter over a limitless number of passes, of 31 values, which range from 0 to 8.5B... If what you need ISN'T for a limitless number of passes, then give us an idea of the max number of passes which are used with your data. There are MUCH better ways to deal with the issue with smaller datasets.
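
To show how one value lands in a file and a byte position under this scheme, here is a small worked sketch. The variable back is only there for the illustration, and the + 1 is there because byte positions in a BINARY file start at 1, so a remainder of 0 still needs a valid position.

```qb64
'Map a 33-bit value to (file number, byte position) and back again.
DIM q AS _INTEGER64, FileNumber AS _INTEGER64
DIM ValueStored AS _INTEGER64, back AS _INTEGER64

q = 8589934589 'an example value near the top of the range
FileNumber = q \ 1000000 '8589, so this value lives in FileCounter8589.bin
ValueStored = q MOD 1000000 + 1 '934589 + 1 = 934590
back = FileNumber * 1000000 + ValueStored - 1 '8589934589, the original value again

PRINT FileNumber; ValueStored; back
```

With one million positions per file, each file tops out at 1,000,000 bytes, and 8590 files cover the whole range.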

soniaa · « **Reply #43 on:** October 18, 2018, 07:11:31 pm »

I repeat;
The numbers I want to store for comparison are in the range 0 to 33 bits.
The program does not create all the numbers in that range, only approximately 20-30%.
The program creates these numbers billions of times.
Every time the program creates a number, I have to know if that number has already been processed, then store how many times.
That is all,
nothing else.

I think the OPEN c.dat FOR BINARY solution is, for the moment, the best one I have tried.
If you have another solution that is much faster, write to me.

TempodiBasic · « **Reply #44 on:** October 18, 2018, 07:43:36 pm »

@soniaa

I can't imagine how to structure a program to do what you are talking about!
So I read your requests in this thread again:

the first:

Quote

I have 8.000.000.000 items,
the value of each one is max 100
I want to create:
Dim q(8000000000) as _byte
BUT THIS produces an error

On my TOSHIBA i3 with 8GB of RAM it gives me an Out of memory error on compiling.

the second:

Quote

I have a good idea;
Much of my data has the value 0,
so I do not need to create array elements for it;
this way the array elements I want to DIM are approximately 3.000.000.000
Now the BIG problem;
I want to put my data into the arrays one by one,
example:
visitornumber1000000000 = 5
visitornumber2000000000 = 7
I do;
DIM a(1000000000 to 1000000000) as _BYTE
a(1000000000) = 5

But why do you create an array made of one single element with index 1000000000?

the third:

Quote

I have tested this;
1 REM $DYNAMIC
2 DIM a(1000000000 to 1000000000) as _BYTE
3 a(1000000000) = 5
4 REDIM _PRESERVE a(2000000000 to 2000000000)
But I receive an error

I cannot agree.
If I copy and paste your code into the QB64 IDE, it compiles with no error!
But we must observe that there is a logic error: the array a() AS _BYTE on line 2 and the array a() on line 4 are not the same array; the first is of type _BYTE while the second is of type SINGLE.
Moreover, if I add AS _BYTE at the end of line 4, it also runs OK in QB64.

the fourth

Quote

In some phases of the program, I have numbers in the range (1 to 8.589.934.591) (33bit)
NOT all numbers in the range, ONLY 20%.

I have to check if the number I'm processing has already been processed and how many times.

How many numbers? 20% of how many? We must process only 20% of the numbers because only they are in the range.
We must record which number was processed and how many times. But 20% of infinity is infinity! :-)
How can you be specific about the number of numbers to process?

the fifth

Quote

The numbers I have
are in this style;

ARRAY (DIMensioned _INTEGER64) - (NOT strings) (message modified)
q1
q2
q3
....
the last is q31

NOT q(1) etc.

The values in those ARRAY are in the range (1 - 8.589.934.590) ( _INTEGER64 )

Here you decide to use 31 _INTEGER64 variables. You do not say why!

the sixth:

Quote

In the program these numbers do not arrive in sequence, and I do not need to sort them.

Every time I acquire a number, I need to know whether that number has already been processed;
q1 .... q31

Again, we do not know how many numbers! Why do you use a cluster of 31 _INTEGER64 variables?

the seventh

Quote

strategy1;
create a file.txt for every number that the program creates
and write howmanytimesthisnumberalreadyprocessed into this .txt file
To do this I do;
- IF _FILEEXISTS (0 means it is a new number, -1 means it is not a new number)
- OPEN FOR APPEND (if the file does not exist, this creates it)
- WRITE 1 (appends a new line containing "1" to the .txt file - this way is also good enough)
- CLOSE

IMHO file/disk access is slower than RAM access, even if you optimize the strategy shown above.

the eighth

Quote

strategy2;
DIM Array(1 TO 8589934590) AS _UNSIGNED _BYTE
- IF Array(mynumber) = 0 THEN GOTO ......
- IF Array(mynumber) > 0 THEN Array(mynumber) = Array(mynumber) + 1

This strategy can work if you have enough RAM, but in that case you have already solved the issue.

the ninth

Quote

strategy3;
OPEN "c.dat" FOR BINARY AS #1
GET #1, mynumber, howmanytimesthisnumberalreadyprocessed

- IF howmanytimesthisnumberalreadyprocessed > 0 THEN PUT #1, mynumber, howmanytimesthisnumberalreadyprocessed + 1: GOTO...

- PUT #1, mynumber, 1

IMHO access to a file on disk in binary mode should be faster than access to txt files, but still slower than RAM access.

the tenth

Quote

the program loops through 31 steps, forward and back, billions of times (brute force)

Here I ask: why a loop of 31 steps, and not a single-step loop run 31 times, since all 31 variables are _INTEGER64?

the eleventh

Quote

The numbers generated by the array q1 q2 .... q31 can be stored together in a single registry,
but in some way they must be associated with howmanytimesalreadyprocessed (range 1 - 255)

A simple UDT can do this:

Code: QB64: [Select]

TYPE q31
    HowMany AS _UNSIGNED _BYTE
    q AS _INTEGER64
END TYPE

Each variable of this type uses 9 bytes of RAM (8 for the _INTEGER64 and 1 for the _BYTE). Using this structure you can save or load each variable to or from disk with a single access: the first byte is loaded into HowMany and the last 8 bytes into q, which records the value of the number.
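
A minimal sketch of that single-access save and load, assuming a scratch file named q31demo.bin (a placeholder for this example) and made-up sample values:

```qb64
'Write one q31 record to disk and read it back with a single PUT and a single GET.
TYPE q31
    HowMany AS _UNSIGNED _BYTE
    q AS _INTEGER64
END TYPE

DIM rec AS q31, back AS q31
rec.HowMany = 3 'sample values, for illustration only
rec.q = 8589934590

OPEN "q31demo.bin" FOR BINARY AS #1
PUT #1, 1, rec 'writes the whole record starting at byte 1
GET #1, 1, back 'reads the same bytes into a second variable
CLOSE #1

PRINT LEN(rec); back.HowMany; back.q 'record size in bytes, then the two fields
```

With fixed-size records like this, record number n starts at byte (n - 1) * LEN(rec) + 1, so any record can be reached directly without reading the ones before it.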

the twelfth

Quote

The numbers I want to store for comparison are in the range 0 to 33 bits.
The program does not create all the numbers in that range, only approximately 20-30%.
The program creates these numbers billions of times.
Every time the program creates a number, I have to know if that number has already been processed, then store how many times.
That is all,

The other 70-80% of the numbers generated by the program are < 1 or > 8.589.934.590;
you process only the numbers in the range 1 - 8.589.934.590 and you count their frequency....
Well, the 31 _INTEGER64 variables seem to be only a design choice of yours...
Now I can create a suitable number generator to emulate your problem and show my solution!

Why not help yourself with Google Translate or Reverso? It is not perfect, but at least you would build your English sentences more correctly.
Not all the members of this forum are English or American.

Good Coding
