Author Topic: associative arrays / dictionaries in QB64? (Read 5949 times)

bplus · « **Reply #15 on:** December 20, 2020, 03:03:15 pm »

Quote from: madscijr on December 20, 2020, 01:16:22 pm

Well as you know with computer programming, you can build stuff however you like!

What will be great, will be when AI gets smart enough, that we can create our own languages without having to actually program low-level code. Just tell the AI in plain English what you want the syntax to be like, etc., and it will create a compiler for that. And eventually the tools will be smart enough to where they can translate code written in one programming language into some other language (and paradigm!) of your choice. That's coming, but I'm not holding my breath!

Oh yeah, that's (almost) here already, probably in colleges and universities, math, science and research labs, Google, IBM, Microsoft... One of Fellippe's episodes talked about some of that!

TempodiBasic · « **Reply #16 on:** December 20, 2020, 07:16:37 pm »

Hi MadSciJr

You asked for another way to create a dictionary?
Bplus has shown you a UDT (TYPE...END TYPE) and an array of UDTs!
Here I show a two-dimensional string array that holds both parts of each entry: the key (a string) and the value (a number, stored as a string).

Code: QB64:

' #############################################################################
' EMULATE AN ASSOCIATIVE ARRAY
 
' BASED OFF SOME DISCUSSION AT
' https://www.qb64.org/forum/index.php?topic=1001.15
 
' Tries to emulate a simple dictionary / associative array in QB64.
 
' Some limitations
' 1. Uses a fixed array size
' 2. Only stores integer values
' 3. Not very efficient (simply appends new keys to the end of the array,
'    and searches for them in linear order)
' 4. Keys and values share one 2-D string array, so numeric values
'    are converted with STR$ on write and VAL on read.
 
' Questions
' 1. Instead of a fixed size, can we implement this with variable arrays and REDIM?
' 2. How can we get it to store mixed data types?
'    We could store the values as string, and have a third array of "type",
'    but would we have to manually cast each retrieved value with VAL?
' 3. What are some other ways this could be better implemented?
 
' #############################################################################
 
' =============================================================================
' GLOBAL DECLARATIONS a$=string, i%=integer, L&=long, s!=single, d#=double
 
' boolean constants
CONST FALSE = 0
CONST TRUE = NOT FALSE
 
CONST cMax = 40
CONST dKey = 2, dValue = 1
' =============================================================================
' GLOBAL VARIABLES
DIM ProgramPath$
DIM ProgramName$
 
' =============================================================================
' INITIALIZE
ProgramName$ = MID$(COMMAND$(0), _INSTRREV(COMMAND$(0), "\") + 1)
ProgramPath$ = LEFT$(COMMAND$(0), _INSTRREV(COMMAND$(0), "\"))
 
 
' =============================================================================
' RUN TEST
TestDictionary
 
' =============================================================================
' FINISH
PRINT ProgramName$ + " finished."
SYSTEM ' must come last: it exits immediately, so nothing after it runs
 
' /////////////////////////////////////////////////////////////////////////////
 
SUB TestDictionary ()
    REDIM arrKey(0 TO cMax, dValue TO dKey) AS STRING ' element (0, dValue) holds the number of stored entries
    DIM iLoop AS INTEGER
    DIM bResult AS INTEGER
    DIM sKey AS STRING
    DIM iValue AS INTEGER
    DIM iDefault AS INTEGER
    DIM iIndex AS INTEGER
 
    CLS
 
    PRINT "-------------------------------------------------------------------------------"
    PRINT "Initializing dictionary..."
    InitDictionary arrKey()
    'PRINT "arrKey(0, dValue) = " + arrKey(0, dValue)
 
    PRINT: PRINT DumpDictionary$(arrKey()): WaitForEnter
 
    PRINT "-------------------------------------------------------------------------------"
    PRINT "Writing some values..."
 
    sKey = "apple": iValue = 3:
    bResult = SaveDictionary%(arrKey(), sKey, iValue)
    PRINT "SaveDictionary%(arrKey(),  " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iValue) + ") returns " + IIFSTR$(bResult, "TRUE", "FALSE")
 
    sKey = "africa": iValue = 1:
    bResult = SaveDictionary%(arrKey(), sKey, iValue)
    PRINT "SaveDictionary%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iValue) + ") returns " + IIFSTR$(bResult, "TRUE", "FALSE")
 
    sKey = "zebra": iValue = 0:
    bResult = SaveDictionary%(arrKey(), sKey, iValue)
    PRINT "SaveDictionary%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iValue) + ") returns " + IIFSTR$(bResult, "TRUE", "FALSE")
 
    sKey = "xylophone": iValue = 2:
    bResult = SaveDictionary%(arrKey(), sKey, iValue)
    PRINT "SaveDictionary%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iValue) + ") returns " + IIFSTR$(bResult, "TRUE", "FALSE")
 
    sKey = "zebra": iValue = 4:
    bResult = SaveDictionary%(arrKey(), sKey, iValue)
    PRINT "SaveDictionary%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iValue) + ") returns " + IIFSTR$(bResult, "TRUE", "FALSE")
 
    PRINT: PRINT DumpDictionary$(arrKey()): WaitForEnter
 
    PRINT "-------------------------------------------------------------------------------"
    PRINT "Deleting some keys..."
 
    sKey = "zebra": bResult = DeleteKey%(arrKey(), sKey)
    PRINT "DeleteKey%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ") returns " + IIFSTR$(bResult, "TRUE", "FALSE")
 
    sKey = "nonesuch": bResult = DeleteKey%(arrKey(), sKey)
    PRINT "DeleteKey%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ") returns " + IIFSTR$(bResult, "TRUE", "FALSE")
 
    sKey = "xylophone": bResult = DeleteKey%(arrKey(), sKey)
    PRINT "DeleteKey%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ") returns " + IIFSTR$(bResult, "TRUE", "FALSE")
 
    PRINT: PRINT DumpDictionary$(arrKey()): WaitForEnter
 
    PRINT "-------------------------------------------------------------------------------"
    PRINT "Look for keys..."
 
    sKey = "africa": iIndex = FoundKey%(arrKey(), sKey)
    PRINT "FoundKey%(arrKey(),  " + CHR$(34) + sKey + CHR$(34) + ") returns " + cstr$(iIndex) + " which evaluates to " + IIFSTR$(iIndex, "TRUE", "FALSE")
 
    sKey = "xenophobe": iIndex = FoundKey%(arrKey(), sKey)
    PRINT "FoundKey%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ") returns " + cstr$(iIndex) + " which evaluates to " + IIFSTR$(iIndex, "TRUE", "FALSE")
 
    sKey = "APPLE": iIndex = FoundKey%(arrKey(), sKey)
    PRINT "FoundKey%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ") returns " + cstr$(iIndex) + " which evaluates to " + IIFSTR$(iIndex, "TRUE", "FALSE")
 
    sKey = "zambia": iIndex = FoundKey%(arrKey(), sKey)
    PRINT "FoundKey%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ") returns " + cstr$(iIndex) + " which evaluates to " + IIFSTR$(iIndex, "TRUE", "FALSE")
 
    PRINT: PRINT DumpDictionary$(arrKey()): WaitForEnter
 
    PRINT "-------------------------------------------------------------------------------"
    PRINT "Retrieving values..."
 
    sKey = "africa": iDefault = -1000
    PRINT "ReadDictionary%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iDefault) + ") returns: " + cstr$(ReadDictionary%(arrKey(), sKey, iDefault))
 
    sKey = "zebra": iDefault = -3000
    PRINT "ReadDictionary%(arrKey(),  " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iDefault) + ") returns: " + cstr$(ReadDictionary%(arrKey(), sKey, iDefault))
 
    sKey = "apple": iDefault = -2000
    PRINT "ReadDictionary%(arrKey(), " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iDefault) + ") returns: " + cstr$(ReadDictionary%(arrKey(), sKey, iDefault))
 
    sKey = "nonesuch": iDefault = -4000
    PRINT "ReadDictionary%(arrKey(),  " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iDefault) + ") returns: " + cstr$(ReadDictionary%(arrKey(), sKey, iDefault))
 
    sKey = "xylophone": iDefault = -5000
    PRINT "ReadDictionary%(arrKey(),  " + CHR$(34) + sKey + CHR$(34) + ", " + cstr$(iDefault) + ") returns: " + cstr$(ReadDictionary%(arrKey(), sKey, iDefault))
 
    PRINT: PRINT DumpDictionary$(arrKey()): WaitForEnter
 
    PRINT "-------------------------------------------------------------------------------"
    PRINT "This concludes the test of the pseudo-associative array / dictionary."
    PRINT
 
    WaitForEnter
END SUB ' TestDictionary
 
' /////////////////////////////////////////////////////////////////////////////
 
SUB WaitForEnter
    DIM in$
    INPUT "Press <ENTER> to continue", in$
END SUB ' WaitForEnter
 
' /////////////////////////////////////////////////////////////////////////////
 
SUB InitDictionary (arrKey() AS STRING)
    ' Row 0 is reserved: (0, dValue) stores the entry count as a string
    arrKey(0, dValue) = _TRIM$(STR$(0))
END SUB ' InitDictionary
 
' /////////////////////////////////////////////////////////////////////////////
 
FUNCTION SaveDictionary% (arrKey() AS STRING, sKey AS STRING, iValue AS INTEGER)
    DIM iLoop AS INTEGER
    DIM iIndex AS INTEGER

    iIndex = 0
    FOR iLoop = 1 TO VAL(arrKey(0, dValue)) ' ubound(arrKey)
        IF LCASE$(arrKey(iLoop, dKey)) = LCASE$(sKey) THEN
            iIndex = iLoop
            EXIT FOR
        END IF
    NEXT iLoop
    IF (iIndex > 0) THEN
        ' KEY EXISTS, UPDATE VALUE
        arrKey(iIndex, dValue) = _TRIM$(STR$(iValue))
        SaveDictionary% = TRUE
    ELSE
        ' KEY DOESN'T EXIST, ADD IT
 
        ' IS THERE ROOM?
        iIndex = VAL(arrKey(0, dValue)) + 1
        IF iIndex > UBOUND(arrKey) THEN
            ' DICTIONARY FULL!
            SaveDictionary% = FALSE
        ELSE
            ' ADD VALUE AND KEY
            arrKey(0, dValue) = _TRIM$(STR$(iIndex))
            arrKey(iIndex, dValue) = _TRIM$(STR$(iValue))
            arrKey(iIndex, dKey) = sKey
            SaveDictionary% = TRUE
        END IF
    END IF
END FUNCTION ' SaveDictionary%
 
' /////////////////////////////////////////////////////////////////////////////
 
FUNCTION FoundKey% (arrKey() AS STRING, sKey AS STRING)
    DIM iLoop AS INTEGER
    DIM iIndex AS INTEGER
 
    iIndex = 0
    FOR iLoop = 1 TO VAL(arrKey(0, dValue)) ' ubound(arrKey)
        IF LCASE$(arrKey(iLoop, dKey)) = LCASE$(sKey) THEN
            iIndex = iLoop
            EXIT FOR
        END IF
    NEXT iLoop
 
    FoundKey% = iIndex
END FUNCTION ' FoundKey%
 
' /////////////////////////////////////////////////////////////////////////////
 
FUNCTION DeleteKey% (arrKey() AS STRING, sKey AS STRING)
    DIM iLoop AS INTEGER
    DIM iIndex AS INTEGER
    DIM bResult AS INTEGER
 
    iIndex = 0
    FOR iLoop = 1 TO VAL(arrKey(0, dValue)) ' ubound(arrKey)
        IF LCASE$(arrKey(iLoop, dKey)) = LCASE$(sKey) THEN
            iIndex = iLoop
            EXIT FOR
        END IF
    NEXT iLoop
 
    IF (iIndex = 0) THEN
        bResult = FALSE
    ELSE
        IF iIndex < VAL(arrKey(0, dValue)) THEN
            FOR iLoop = iIndex TO (VAL(arrKey(0, dValue)) - 1)
                arrKey(iLoop, dKey) = arrKey(iLoop + 1, dKey)
                arrKey(iLoop, dValue) = arrKey(iLoop + 1, dValue)
            NEXT iLoop
        END IF
        arrKey(0, dValue) = _TRIM$(STR$(VAL(arrKey(0, dValue)) - 1))
        bResult = TRUE
    END IF
 
    DeleteKey% = bResult
END FUNCTION ' DeleteKey%
 
' /////////////////////////////////////////////////////////////////////////////
 
FUNCTION DumpDictionary$ (arrKey() AS STRING)
    DIM iLoop AS INTEGER
    DIM sResult AS STRING
    sResult = ""
    sResult = sResult + "Dictionary size: " + arrKey(0, dValue) + CHR$(13)
    FOR iLoop = 1 TO VAL(arrKey(0, dValue)) ' ubound(arrKey)
        sResult = sResult + "Item(" + CHR$(34) + arrKey(iLoop, dKey) + CHR$(34) + ") = " + (arrKey(iLoop, dValue)) + CHR$(13)
    NEXT iLoop
    DumpDictionary$ = sResult
END FUNCTION ' DumpDictionary$
 
' /////////////////////////////////////////////////////////////////////////////
 
FUNCTION ReadDictionary% (arrKey() AS STRING, sKey AS STRING, iDefault AS INTEGER)
    DIM iLoop AS INTEGER
    DIM iIndex AS INTEGER
 
    iIndex = 0
    FOR iLoop = 1 TO VAL(arrKey(0, dValue)) ' ubound(arrKey)
        IF LCASE$(arrKey(iLoop, dKey)) = LCASE$(sKey) THEN
            iIndex = iLoop
            EXIT FOR
        END IF
    NEXT iLoop
 
    IF (iIndex = 0) THEN
        ReadDictionary% = iDefault
    ELSE
        ReadDictionary% = VAL(arrKey(iIndex, dValue))
    END IF
END FUNCTION ' ReadDictionary%
 
' /////////////////////////////////////////////////////////////////////////////
 
FUNCTION cstr$ (myValue)
    cstr$ = LTRIM$(RTRIM$(STR$(myValue)))
END FUNCTION ' cstr$
 
' /////////////////////////////////////////////////////////////////////////////
 
FUNCTION IIF (Condition, IfTrue, IfFalse)
    IF Condition THEN IIF = IfTrue ELSE IIF = IfFalse
END FUNCTION
 
' /////////////////////////////////////////////////////////////////////////////
 
FUNCTION IIFSTR$ (Condition, IfTrue$, IfFalse$)
    IF Condition THEN IIFSTR$ = IfTrue$ ELSE IIFSTR$ = IfFalse$
END FUNCTION
 
' /////////////////////////////////////////////////////////////////////////////
 
 

If you run this code you will see that it works, and the output is the same as yours except for the iDefault values in the "Retrieving values" section. I dislike ambiguous values, so I use distinct thousands for iDefault (-1000, -2000, ... -5000)!
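
On question 1 in the header comments (REDIM instead of a fixed size), here is a minimal sketch of how it could go, assuming the keys live in a one-dimensional dynamic array to keep things simple. The names arrKeys, iCount and AppendKey are made up for this example and are not part of the program above.

```qb64
' Sketch only: grow a dynamic key array on demand with REDIM _PRESERVE.
' arrKeys, iCount and AppendKey are hypothetical names for illustration.
REDIM arrKeys(1 TO 4) AS STRING ' REDIM (not DIM) makes the array dynamic
DIM iCount AS INTEGER
DIM iLoop AS INTEGER

FOR iLoop = 1 TO 10
    AppendKey arrKeys(), iCount, "key" + _TRIM$(STR$(iLoop))
NEXT iLoop

PRINT "Stored"; iCount; "keys, capacity is now"; UBOUND(arrKeys)
END

SUB AppendKey (arrKeys() AS STRING, iCount AS INTEGER, sKey AS STRING)
    ' Double the capacity when the array is full;
    ' _PRESERVE keeps the elements that are already stored
    IF iCount >= UBOUND(arrKeys) THEN
        REDIM _PRESERVE arrKeys(1 TO UBOUND(arrKeys) * 2) AS STRING
    END IF
    iCount = iCount + 1
    arrKeys(iCount) = sKey
END SUB
```

Doubling the capacity, rather than growing by one element per insert, keeps the number of REDIM calls small as the dictionary grows.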

About your little database:
1. Did you design it to overwrite the value when the key is already in the dictionary, or is that a side effect of how it is written?
In other words: did you decide not to allow duplicate keys?

2. You use a plain linear search to find a key in the database. Bplus suggested a binary search to speed up lookups (which needs the keys kept in sorted order). I want to stress that if you want a fast database you need one index for each field of the structure that you search on. The index array, of integers, would have 2 dimensions: one for keys and the other for values.

3. Why do you keep the main module short and do all its work in a SUB (TestDictionary)? That way you give up SHARED, DIM SHARED and REDIM SHARED, and you have to pass the data as parameters.
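
To illustrate point 2, here is a minimal sketch of a binary search, assuming the keys are kept in ascending order and stored in lowercase in a one-dimensional string array. The names arrSorted and BinaryFind% are hypothetical and not part of the code above.

```qb64
' Sketch only: binary search over keys kept in ascending order.
' arrSorted and BinaryFind% are hypothetical names for illustration.
DIM arrSorted(1 TO 5) AS STRING
arrSorted(1) = "africa"
arrSorted(2) = "apple"
arrSorted(3) = "xylophone"
arrSorted(4) = "zambia"
arrSorted(5) = "zebra"

PRINT "xylophone is at index"; BinaryFind%(arrSorted(), 5, "xylophone")
PRINT "nonesuch is at index"; BinaryFind%(arrSorted(), 5, "nonesuch")
END

' Returns the index of sKey in arrSorted(1 TO iCount), or 0 if it is absent.
FUNCTION BinaryFind% (arrSorted() AS STRING, iCount AS INTEGER, sKey AS STRING)
    DIM iLow AS INTEGER
    DIM iHigh AS INTEGER
    DIM iMid AS INTEGER

    BinaryFind% = 0
    iLow = 1
    iHigh = iCount
    DO WHILE iLow <= iHigh
        iMid = (iLow + iHigh) \ 2 ' integer division picks the middle slot
        IF arrSorted(iMid) = sKey THEN
            BinaryFind% = iMid
            EXIT FUNCTION
        ELSEIF arrSorted(iMid) < sKey THEN
            iLow = iMid + 1 ' key can only be in the upper half
        ELSE
            iHigh = iMid - 1 ' key can only be in the lower half
        END IF
    LOOP
END FUNCTION
```

The trade-off is that every insert has to put the new key in its sorted position, so some of the work moves from lookups to writes.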

luke · « **Reply #17 on:** December 20, 2020, 08:12:31 pm »

My interpreter uses a hash table for storing program symbols. I've pulled it out and added a small demo program.

It does proper hashing of the contents and can support any kind of data for the "value" half because it works with a UDT, but it does require two SHARED arrays and a SHARED variable. I suppose if you really wanted to you could convert it to a _MEM based thing and keep it all as local variables, but I only needed one instance of the table in my case.

Code:

DEFLNG A-Z

'Object to store in the symbol table
TYPE symtab_entry_t
    identifier AS STRING
    v1 AS LONG
    v2 AS LONG
    v3 AS LONG
END TYPE

'Actual stored objects (REDIM makes the array dynamic so it can be resized later)
REDIM SHARED symtab(1000) AS symtab_entry_t
'Mapping between hash and index
REDIM SHARED symtab_map(1750) AS LONG
DIM SHARED symtab_last_entry AS LONG

'The symtab optionally supports transactions; calling symtab_rollback will
'remove all items added since the last call to symtab_commit.
DIM SHARED symtab_last_commit_id



DIM entry AS symtab_entry_t
entry.identifier = "rambunctious"
entry.v1 = 12
entry.v2 = 5
entry.v3 = 7
symtab_add_entry entry

entry.identifier = "contradict"
entry.v1 = 10
entry.v2 = 3
entry.v3 = 7
symtab_add_entry entry

entry.identifier = "explode"
entry.v1 = 7
entry.v2 = 3
entry.v3 = 4
symtab_add_entry entry

PRINT "Table has"; symtab_last_entry; "entries:"
FOR i = 1 TO symtab_last_entry
    PRINT "    "; symtab(i).identifier
NEXT i

DO
    INPUT "Select item:", lookupkey$
    result = symtab_get_id(lookupkey$)
    IF result = 0 THEN
        PRINT lookupkey$; " not in table"
    ELSE
        PRINT lookupkey$; " has"; symtab(result).v1; "letters;"; symtab(result).v2; "vowels and"; symtab(result).v3; "consonants."
    END IF
LOOP



'Copyright 2020 Luke Ceddia
'SPDX-License-Identifier: Apache-2.0
'symtab.bm - Symbol Table

SUB symtab_add_entry (entry AS symtab_entry_t)
    symtab_expand_if_needed
    symtab_last_entry = symtab_last_entry + 1
    symtab(symtab_last_entry) = entry
    symtab_map_insert entry.identifier, symtab_last_entry
END SUB

FUNCTION symtab_get_id (identifier$)
    h~& = symtab_hash~&(identifier$, UBOUND(symtab_map))
    DO
        id = symtab_map(h~&)
        IF id = 0 THEN
            EXIT FUNCTION
        END IF
        IF symtab(id).identifier = identifier$ THEN
            symtab_get_id = id
            EXIT FUNCTION
        END IF
        h~& = (h~& + 1) MOD (UBOUND(symtab_map) + 1)
    LOOP
END FUNCTION

'I'd like to be able to use this code for recoverable errors when in interactive
'mode (rollback if an invalid line was entered), but it's not clear how one does
'recoverable error handling in the parser since we don't have proper exceptions.
SUB symtab_commit
    symtab_last_commit_id = symtab_last_entry
END SUB

SUB symtab_rollback
    'Would it be more efficient to do this in reverse order?
    'Does anyone care about how fast it is?
    FOR i = symtab_last_commit_id + 1 TO symtab_last_entry
        identifier$ = symtab(i).identifier
        h~& = symtab_hash~&(identifier$, UBOUND(symtab_map))
        DO
            id = symtab_map(h~&)
            IF symtab(id).identifier = identifier$ THEN EXIT DO
            h~& = (h~& + 1) MOD (UBOUND(symtab_map) + 1)
        LOOP
        symtab_map(h~&) = 0
    NEXT i
    symtab_last_entry = symtab_last_commit_id
END SUB

'Strictly internal functions below
SUB symtab_expand_if_needed
    CONST SYMTAB_MAX_LOADING = 0.75
    CONST SYMTAB_GROWTH_FACTOR = 2
    IF symtab_last_entry = UBOUND(symtab) THEN
        REDIM _PRESERVE symtab(UBOUND(symtab) * SYMTAB_GROWTH_FACTOR) AS symtab_entry_t
    END IF

    IF symtab_last_entry / UBOUND(symtab_map) <= SYMTAB_MAX_LOADING THEN EXIT SUB
    REDIM symtab_map(UBOUND(symtab_map) * SYMTAB_GROWTH_FACTOR)
    FOR i = 1 TO symtab_last_entry
        symtab_map_insert symtab(i).identifier, i
    NEXT i
END SUB

SUB symtab_map_insert (k$, v)
    h~& = symtab_hash~&(k$, UBOUND(symtab_map))
    DO
        IF symtab_map(h~&) = 0 THEN EXIT DO
        h~& = (h~& + 1) MOD (UBOUND(symtab_map) + 1)
    LOOP
    symtab_map(h~&) = v
END SUB

'http://www.cse.yorku.ca/~oz/hash.html
'Attributed to D. J. Bernstein
FUNCTION symtab_hash~& (k$, max)
    hash~& = 5381
    FOR i = 1 TO LEN(k$)
        hash~& = ((hash~& * 33) XOR ASC(k$, i)) MOD max
    NEXT i
    '0<=hash<=max-1, so 1<=hash+1<=max
    symtab_hash~& = hash~& + 1
END FUNCTION
