Author Topic: WordCracker (Read 8002 times)

Zeppelin · « **on:** October 07, 2018, 03:56:15 am »

Hey,
I've run into an issue with my program.
I'm trying to make a program that, when given a string of 9 letters and 1 key letter, prints out all the possible words that can be made from these letters and include the key letter.
For example:
9 Letters: ABCDEFGHI
Key Letter: A
PRINTS:
BAD
CAB
DEAF
etc....

I am using a .txt database full of words from the Oxford dictionary, and each time I run the program nothing is output. I can't find the issue.

Thanks,
Zeppelin

P.S. The program and wordlist are attached below.

SMcNeill · « **Reply #1 on:** October 07, 2018, 07:37:00 am »

http://qb64.freeforums.net/thread/42/scrabble-word-maker -- Sounds like what you're wanting is basically the same thing I did here. Give it a look and see if it helps.

bplus · « **Reply #2 on:** October 07, 2018, 10:33:25 am »

Code: QB64: [Select]

'SPLIT UP LETTERS
FOR n = 1 TO 9
    ltr$(n) = MID$(in$, n, 1)
NEXT n
PRINT LEN(ltr$)  '>>>>>>>>>>>> 0 !!!
END
 

Code: QB64: [Select]

            FOR x = 1 TO LEN(in$)  '<<<<<<<<<<<<<<<<<<<<<<< change to this?
                IF ltr$(x) = templtr$ THEN
                    ltr$(x) = ""
                    count = count + 1
                END IF
            NEXT x
 

OH!!! This is screwing you up too!

Code: QB64: [Select]

    FOR n = 1 TO 9
        ltr$(n) = MID$(in$, n, 1)
    NEXT n

You are using n for the word index from the file, and then changing n here when resetting ltr$() array!!!
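
To see why reusing the index hurts so much, here is a small sketch in Python (not QB64) that mimics a counting loop whose body runs a second counting loop on the same variable. The function name, its parameters and the step cap are made up for illustration, and it assumes the original program had the loop shape described above, since that program was only attached and not posted.

```python
def visited_words(word_count, letter_count=9, reuse_index=True, max_steps=5):
    """Return the word indices an outer counting loop actually visits.

    max_steps is a hypothetical safety cap so the demonstration terminates.
    """
    visited = []
    n = 1
    while n <= word_count and len(visited) < max_steps:
        visited.append(n)              # "process" word number n
        if reuse_index:
            # Inner reset loop reuses n, like FOR n = 1 TO 9 ... NEXT n
            n = 1
            while n <= letter_count:
                n += 1                 # leaves n at letter_count + 1
        else:
            # Inner reset loop uses its own counter m
            m = 1
            while m <= letter_count:
                m += 1
        n += 1                         # the outer NEXT n
    return visited


print(visited_words(100, reuse_index=True))   # stuck on the same index
print(visited_words(5, reuse_index=False))    # visits every word once
```

With the shared counter, the outer loop looks at word 1, jumps to word 11 and then keeps landing on word 11, which would fit a program that never prints anything and never finishes.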

bplus · « **Reply #3 on:** October 07, 2018, 10:47:30 am »

OK now get stuff printed in reasonable time!

Code: QB64: [Select]

 
DIM word$(84100)
DIM key$
DIM SHARED ltr$(9)  '<<<<<<<<<<< shared to debug
 
'SET KEY AND LETTERS
key$ = "a"
in$ = "abcdefghi"
lenin = LEN(in$) '<<<<<<<<<<< for faster processing
 
'SPLIT UP LETTERS
FOR n = 1 TO 9
    ltr$(n) = MID$(in$, n, 1)
NEXT n
 
'LOAD ALL WORDS FROM .TXT FILE
PRINT "LOADING...."
OPEN "WordList.txt" FOR INPUT AS #1  '<<<<<<<<< moved to more appropriate place
WHILE NOT EOF(1)
    filecount = filecount + 1
    INPUT #1, line$
    word$(filecount) = line$
WEND
CLOSE #1
CLS
 
 
'RUN THROUGH WORDS
FOR n = 1 TO filecount
    findkey = INSTR(word$(n), key$)
 
    IF findkey THEN
        FOR i = 1 TO LEN(word$(n))
            templtr$ = MID$(word$(n), i, 1)
            FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
                IF ltr$(x) = templtr$ THEN
                    ltr$(x) = ""
                    count = count + 1
                END IF
                'debugging
                'PRINT "Update: file word = "; word$(n); " and here is current letters crossed off: "; letters$
                'INPUT "OK press enter... "; wate$
            NEXT x
        NEXT i
 
        IF count = LEN(word$(n)) THEN
            PRINT word$(n)
        END IF
    END IF
 
    FOR m = 1 TO 9  'main fix #2 n's  to m's
        ltr$(m) = MID$(in$, m, 1)
    NEXT m
    count = 0
NEXT n
 
PRINT "DONE..."
 
'for debugging
FUNCTION letters$ ()
    FOR i = 1 TO 9
        b$ = b$ + ltr$(i)
    NEXT
    letters$ = "*" + b$ + "*"
END FUNCTION
 

DONE? Looks like the logic was OK, just the variable assignments needed fixing.
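
For anyone following along, the crossing-off logic can be sketched outside QB64 like this. It is a minimal Python sketch, not a translation of the program above: `can_build` and the tiny word list are made-up names and data, and this version crosses off only one matching letter per step.

```python
def can_build(word, letters, key):
    """Return True if word contains key and can be spelled from letters."""
    if key not in word:
        return False
    pool = list(letters)        # fresh copy per word, like resetting ltr$()
    for ch in word:
        if ch in pool:
            pool.remove(ch)     # cross off one copy of the letter
        else:
            return False        # letter missing or already used up
    return True


words = ["bad", "cab", "deaf", "dad", "big"]   # stand-in for the word list
found = [w for w in words if can_build(w, "abcdefghi", "a")]
print(found)
```

"dad" is rejected because there is only one "d" to cross off, and "big" is rejected because it lacks the key letter.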

codeguy · « **Reply #4 on:** October 07, 2018, 02:09:15 pm »

Gives you every permutation of letters, numbers, etc. If parray() is ordered, so is the output.

Code: QB64: [Select]

WIDTH 80, 43
DIM parray(0 TO 3) AS LONG
FOR i = 0 TO UBOUND(parray)
    parray(i) = i
NEXT
PRINT
PRINT
PRINT "permutations"
Permute parray(), 0, UBOUND(parray), np
DO
    x$ = INKEY$
LOOP UNTIL x$ > ""
SYSTEM
 
SUB DisplayResults (PArray() AS LONG, start, finish, np AS DOUBLE)
DIM i AS LONG
PRINT USING "#,###,###,###,###"; np;
FOR i = LBOUND(parray) TO UBOUND(parray)
    PRINT PArray(i);
NEXT
PRINT
END SUB
 
SUB Rotate (parray() AS LONG, Start AS LONG, finish AS LONG)
DIM ts AS LONG
ts = parray(Start)
FOR i = Start TO finish - 1
    SWAP parray(i), parray(i + 1)
NEXT
parray(finish) = ts
END SUB
 
SUB Permute (parray() AS LONG, start AS LONG, finish AS LONG, np AS DOUBLE)
np = np + 1
DisplayResults parray(), LBOUND(parray), UBOUND(parray), np
IF start < finish THEN
    DIM i AS LONG
    DIM j AS LONG
    FOR i = finish - 1 TO start STEP -1
        FOR j = i + 1 TO finish
            SWAP parray(i), parray(j)
            Permute parray(), i + 1, finish, np
        NEXT
        Rotate parray(), i, finish
    NEXT
END IF
END SUB
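
For comparison only, here is how the same "ordered input gives ordered output" property looks with a library routine. This is a Python sketch that swaps in the standard `itertools.permutations` function; it is not the swap/rotate method used in the QB64 code above, and `ordered_permutations` is a made-up helper name.

```python
from itertools import permutations


def ordered_permutations(items):
    """All permutations of items, produced in lexical order."""
    # itertools.permutations emits tuples in sorted order when its input is sorted
    return list(permutations(sorted(items)))


perms = ordered_permutations([2, 0, 3, 1])
print(len(perms), perms[0], perms[-1])
```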
 

bplus · « **Reply #5 on:** October 07, 2018, 02:52:59 pm »

Hi codeguy,

I thought about permutations and then thought nah! won't work with real words at varying lengths.

But I could be wrong, wouldn't be first time.

Wanna race? You modify your code for checking for real words and I will see if I can optimize Zeppelin's code some more.... post in 24 hours?

SMcNeill · « **Reply #6 on:** October 07, 2018, 03:32:28 pm »

Here's how I'd go, I think:

First, reduce the word list to exclude any words longer than 9 letters. No need to search for impossible combinations.
Then count the letters in each word. Save these in an array WordLetters(1 TO WordCount, 1 TO 26).

Then count the letters in the target string (the 9 available letters).
Compare. If, for every letter, the target count >= the word's count, then it's a match!

************************
The word list is going to be rather limited (less than 50k words I'd imagine), and most searches will terminate rather quickly. (need an "A", don't have any? Quit the search at this point.)

I really don't think you'd need to worry about optimizing for speed any more than that, to be honest.
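
A rough sketch of that letter-count comparison, written in Python rather than QB64. The function name `matches` is made up for illustration, and the rule it assumes is that a word matches when every letter it needs is available at least that many times.

```python
from collections import Counter


def matches(word, letters, key):
    """True if word contains key and its letter counts fit within letters."""
    # Cheap rejections first: no key letter, or longer than the letter pool
    if key not in word or len(word) > len(letters):
        return False
    need = Counter(word)       # how many of each letter the word uses
    have = Counter(letters)    # how many of each letter are available
    return all(have[ch] >= n for ch, n in need.items())


print(matches("deaf", "abcdefghi", "a"))   # every needed letter is available
print(matches("dad", "abcdefghi", "a"))    # needs two d's, only one available
```

The counts for each dictionary word could be computed once at load time, so each search is just a handful of integer comparisons per word.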

bplus · « **Reply #7 on:** October 07, 2018, 03:44:47 pm »

Hmm... I think you are saying permutations is slower too?

I am thinking this part:

Code: QB64: [Select]

'SET KEY AND LETTERS
key$ = "a"
in$ = "abcdefghi"
lenin = LEN(in$) '<<<<<<<<<<< for faster processing
 

key$ and in$ might be entered at run time, so there might not be exactly nine letters, and in$ might be a real word rather than a segment of the alphabet. That's how I would use this code for word games or word-game solving. Real words might also contain the same letter 2 or 3 times, and I was planning on optimizing for such cases.

SMcNeill · « **Reply #8 on:** October 07, 2018, 04:28:46 pm »

Here's an easy fix for you to optimize your code and run it faster. (An increase of about 15x faster on my machine.)

I'll post 2 routines here so you can run them and compare for speed differences:

YOURS (modified to loop and run for a set amount of time -- 5 seconds in this case):

Code: QB64: [Select]

DIM word$(84100)
DIM key$
DIM SHARED ltr$(9) '<<<<<<<<<<< shared to debug
 
'SET KEY AND LETTERS
key$ = "a"
in$ = "abcdefghi"
lenin = LEN(in$) '<<<<<<<<<<< for faster processing
 
'SPLIT UP LETTERS
FOR n = 1 TO 9
    ltr$(n) = MID$(in$, n, 1)
NEXT n
 
 
 
'LOAD ALL WORDS FROM .TXT FILE
PRINT "LOADING...."
OPEN "WordList.txt" FOR INPUT AS #1 '<<<<<<<<< moved to more appropriate place
filecount = 0
WHILE NOT EOF(1)
    filecount = filecount + 1
    INPUT #1, line$
    word$(filecount) = line$
WEND
CLOSE #1
 
 
t# = TIMER
 
timelimit = 5
DO UNTIL TIMER > t# + timelimit
    CLS
    loopsran = loopsran + 1
 
 
 
    'RUN THROUGH WORDS
    FOR n = 1 TO filecount
        findkey = INSTR(word$(n), key$)
 
        IF findkey THEN
            FOR i = 1 TO LEN(word$(n))
                templtr$ = MID$(word$(n), i, 1)
                FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
                    IF ltr$(x) = templtr$ THEN
                        ltr$(x) = ""
                        count = count + 1
                    END IF
                    'debugging
                    'PRINT "Update: file word = "; word$(n); " and here is current letters crossed off: "; letters$
                    'INPUT "OK press enter... "; wate$
                NEXT x
            NEXT i
 
            IF count = LEN(word$(n)) THEN
                PRINT word$(n)
            END IF
        END IF
 
        FOR m = 1 TO 9 'main fix #2 n's  to m's
            ltr$(m) = MID$(in$, m, 1)
        NEXT m
        count = 0
    NEXT n
 
    PRINT "DONE..."
LOOP
PRINT USING "###,###,###,###,### loops ran in ##.# seconds"; loopsran, timelimit
 
 
'for debugging
FUNCTION letters$ ()
    FOR i = 1 TO 9
        b$ = b$ + ltr$(i)
    NEXT
    letters$ = "*" + b$ + "*"
END FUNCTION

Modified:

Code: QB64: [Select]

DEFLNG A-Z
DIM word$(84100)
DIM key$
DIM SHARED ltr(9) '<<<<<<<<<<< shared to debug
 
'SET KEY AND LETTERS
key$ = "a"
in$ = "abcdefghi"
lenin = LEN(in$) '<<<<<<<<<<< for faster processing
 
'SPLIT UP LETTERS
FOR n = 1 TO 9
    ltr(n) = ASC(in$, n)
NEXT n
 
 
 
 
 
'LOAD ALL WORDS FROM .TXT FILE
PRINT "LOADING...."
OPEN "WordList.txt" FOR BINARY AS #1 '<<<<<<<<< moved to more appropriate place
filecount = 0
WHILE NOT EOF(1)
    LINE INPUT #1, line$
    IF LEN(line$) < 10 THEN
        filecount = filecount + 1
        word$(filecount) = line$
    END IF
WEND
CLOSE #1
 
 
t# = TIMER
timelimit = 5
DO UNTIL TIMER > t# + timelimit
    CLS
    loopsran = loopsran + 1
 
 
    'RUN THROUGH WORDS
    FOR n = 1 TO filecount
        findkey = INSTR(word$(n), key$)
 
        IF findkey THEN
            FOR i = 1 TO LEN(word$(n))
                templtr = ASC(word$(n), i)
                FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
                    IF ltr(x) = templtr THEN
                        ltr(x) = 0
                        count = count + 1
                    END IF
                NEXT x
                IF count <> i GOTO skipmore
            NEXT i
 
            IF count = LEN(word$(n)) THEN
                PRINT word$(n),
            END IF
        END IF
 
        skipmore:
        FOR m = 1 TO 9 'main fix #2 n's  to m's
            ltr(m) = ASC(in$, m)
        NEXT m
        count = 0
    NEXT n
 
    PRINT "DONE..."
LOOP
PRINT USING "###,###,###,###,### loops ran in ##.# seconds"; loopsran, timelimit
 
'for debugging
FUNCTION letters$ ()
    FOR i = 1 TO 9
        IF ltr(i) THEN b$ = b$ + CHR$(ltr(i)) 'ltr() now holds ASCII codes; 0 = crossed off
    NEXT
    letters$ = "*" + b$ + "*"
END FUNCTION

Yours will run 22 times in 5 seconds, mine 324 times...

The changes?

#1) STRINGS are gone. Why do we need them? Especially, WHY DO WE NEED MID$??

When going for speed routines, ASC outperforms MID$(x$,y,1) every time, hands down! This is a major performance boost.

#2) The word list is limited to begin with.

IF LEN(line$) < 10 THEN
filecount = filecount + 1
word$(filecount) = line$
END IF

You're not going to find 10-letter word matches with only 9 letters.

#3) (Not timed, but a huge speed boost): Changed file from INPUT to BINARY and changed INPUT # to LINE INPUT#. This makes loading the word list a whole heck of a lot faster. Even faster would be to load it all at once and then parse the words out of it, but who wants to go through the trouble for all of the few microseconds we'd save in this case?

Simple little things, but they affect the performance like crazy.
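
The "load it all at once and then parse the words out of it" idea, together with the length filter and the switch from one-character strings to character codes, can be sketched like this. It is Python rather than QB64; `parse_words` and `to_codes` are made-up names, and lowercasing the words is an assumption about the word list.

```python
def parse_words(raw, max_len=9):
    """Split raw file bytes into lowercase words no longer than max_len."""
    words = []
    for line in raw.splitlines():      # handles both LF and CRLF line endings
        word = line.decode("ascii", errors="ignore").strip().lower()
        if word and len(word) <= max_len:
            words.append(word)         # longer words can never match
    return words


def to_codes(word):
    """Character codes for a word, the analogue of ASC at each position."""
    return list(word.encode("ascii"))


sample = b"bad\r\ncab\nabbreviation\n"
print(parse_words(sample))
print(to_codes("abc"))
```

With a real file, the bytes would come from something like `open("WordList.txt", "rb").read()`.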

SMcNeill · « **Reply #9 on:** October 07, 2018, 05:01:34 pm »

And, if we create a letter index for our words (since we're going to do a limit by key$), we can almost double the performance. (Probably more for certain letters which aren't going to be indexed as much as "a" is, in this case.)

Code: QB64: [Select]

DEFLNG A-Z
DIM word$(84100)
DIM key$
DIM SHARED ltr(9) '<<<<<<<<<<< shared to debug
DIM backup(9)
DIM keycount(97 TO 122, 50000) 'letter/index
 
'SET KEY AND LETTERS
key$ = "a"
in$ = "abcdefghi"
lenin = LEN(in$) '<<<<<<<<<<< for faster processing
 
'SPLIT UP LETTERS
FOR n = 1 TO 9
    ltr(n) = ASC(in$, n)
    backup(n) = ltr(n)
NEXT n
 
 
 
 
 
'LOAD ALL WORDS FROM .TXT FILE
PRINT "LOADING...."
OPEN "WordList.txt" FOR BINARY AS #1 '<<<<<<<<< moved to more appropriate place
filecount = 0
WHILE NOT EOF(1)
    LINE INPUT #1, line$
    IF LEN(line$) < 10 THEN
        filecount = filecount + 1
        word$(filecount) = line$
        FOR i = 97 TO 122
            IF INSTR(line$, CHR$(i)) THEN
                keycount(i, 0) = keycount(i, 0) + 1 'record 0 tells us how many there are for that letter
                keycount(i, keycount(i, 0)) = filecount 'add the word number to the proper index
            END IF
        NEXT
    END IF
WEND
CLOSE #1
 
 
t# = TIMER
timelimit = 5
DO UNTIL TIMER > t# + timelimit
    CLS
    loopsran = loopsran + 1
 
 
    'RUN THROUGH WORDS
    findkey = ASC(key$)
    FOR n = 1 TO keycount(findkey, 0)
        w$ = word$(keycount(findkey, n))
        l = LEN(w$)
        FOR i = 1 TO l
            templtr = ASC(w$, i)
            FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
                IF ltr(x) = templtr THEN
                    ltr(x) = 0
                    count = count + 1
                END IF
            NEXT x
            IF count <> i GOTO skipmore
        NEXT i
 
        IF count = l THEN PRINT w$,
 
        skipmore:
        FOR m = 1 TO 9 'main fix #2 n's  to m's
            ltr(m) = backup(m)
        NEXT m
        count = 0
    NEXT n
 
    PRINT "DONE..."
LOOP
PRINT USING "###,###,###,###,### loops ran in ##.# seconds"; loopsran, timelimit
 
'for debugging
FUNCTION letters$ ()
    FOR i = 1 TO 9
        IF ltr(i) THEN b$ = b$ + CHR$(ltr(i)) 'ltr() now holds ASCII codes; 0 = crossed off
    NEXT
    letters$ = "*" + b$ + "*"
END FUNCTION
 

No need to check each and every word for a letter, if we already have built an index to tell us which words contain those letters. ;)

(Current run time is 488 loops in 5 seconds on my PC, if you're curious about the actual number to compare improvement.)
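
The letter index boils down to "for each letter, the list of word numbers that contain it". A minimal sketch of that idea in Python (not QB64), using a dictionary instead of a fixed two-dimensional array; `build_index` and the sample words are made up for illustration.

```python
def build_index(words):
    """Map each letter to the positions of the words that contain it."""
    index = {}
    for i, word in enumerate(words):
        for ch in set(word):                   # record each word once per letter
            index.setdefault(ch, []).append(i)
    return index


words = ["bad", "cab", "big", "deaf"]
index = build_index(words)
print([words[i] for i in index["a"]])   # only words containing "a" get searched
```

A search for a given key letter then only has to loop over `index[key]` instead of the whole word list.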

bplus · « **Reply #10 on:** October 07, 2018, 05:01:54 pm »

Here's what I had in mind:

Code: QB64: [Select]

' WordCrack mod 1.bas B+ 2018-10-07
 
INPUT "Enter a string to build words from "; in$
'load words
OPEN "WordList.txt" FOR BINARY AS #1
gulp& = LOF(1)
buff$ = STRING$(gulp&, " ")
GET #1, , buff$
CLOSE #1
REDIM word$(0)
Split buff$, CHR$(10), word$()
filecount& = UBOUND(word$)
'RUN THROUGH WORDS
FOR n& = 0 TO filecount&
    c$ = in$
    OK% = -1
    FOR i% = 1 TO LEN(word$(n&))
        p% = INSTR(c$, MID$(word$(n&), i%, 1))
        IF p% = 0 THEN
            OK% = 0: EXIT FOR
        ELSE
            MID$(c$, p%, 1) = "+"
        END IF
    NEXT
    IF OK% THEN PRINT word$(n&); ", ";
NEXT
PRINT "DONE..."
 
 
SUB Split (mystr AS STRING, delim AS STRING, arr() AS STRING)
    ' bplus modifications of Galleon fix of Bulrush Split reply #13
    ' http://www.[abandoned, outdated and now likely malicious qb64 dot net website - don’t go there]/forum/index.php?topic=1612.0
    ' this sub further developed and tested here: \test\Strings\Split test.bas
    DIM copy AS STRING, p AS LONG, curpos AS LONG, arrpos AS LONG, lc AS LONG, dpos AS LONG
    copy = mystr 'make copy since we are messing with mystr
    'special case if delim is space, probably want to remove all excess space
    IF delim = " " THEN
        copy = RTRIM$(LTRIM$(copy))
        p = INSTR(copy, "  ")
        WHILE p > 0
            copy = MID$(copy, 1, p - 1) + MID$(copy, p + 1)
            p = INSTR(copy, "  ")
        WEND
    END IF
    curpos = 1
    arrpos = 0
    lc = LEN(copy)
    dpos = INSTR(curpos, copy, delim)
    DO UNTIL dpos = 0
        arr(arrpos) = MID$(copy, curpos, dpos - curpos)
        arrpos = arrpos + 1
        REDIM _PRESERVE arr(arrpos + 1) AS STRING
        curpos = dpos + LEN(delim)
        dpos = INSTR(curpos, copy, delim)
    LOOP
    arr(arrpos) = MID$(copy, curpos)
    REDIM _PRESERVE arr(arrpos) AS STRING
END SUB
 

Looks pretty simple to me.

Yes, Binary load most significant speed improvement! :)

SMcNeill · « **Reply #11 on:** October 07, 2018, 05:26:23 pm »

And another small modification of the original gives us another decent speed boost:

Code: QB64: [Select]

DEFLNG A-Z
DIM word$(84100)
DIM keys
DIM SHARED ltr(1 TO 9) '<<<<<<<<<<< shared to debug
DIM backup(1 TO 9)
DIM keycount(97 TO 122, 50000) 'letter/index
DIM m1 AS _MEM, m2 AS _MEM 'just for quick array restoration
m1 = _MEM(ltr()): m2 = _MEM(backup())
 
'SET KEY AND LETTERS
keys = ASC("a")
in$ = "abcdefghi"
lenin = LEN(in$) '<<<<<<<<<<< for faster processing
 
'SPLIT UP LETTERS
FOR n = 1 TO 9
    ltr(n) = ASC(in$, n)
NEXT n
_MEMPUT m2, m2.OFFSET, ltr()
 
 
'LOAD ALL WORDS FROM .TXT FILE
PRINT "LOADING...."
OPEN "WordList.txt" FOR BINARY AS #1 '<<<<<<<<< moved to more appropriate place
filecount = 0
WHILE NOT EOF(1)
    LINE INPUT #1, line$
    IF LEN(line$) < 10 THEN
        filecount = filecount + 1
        word$(filecount) = line$
        FOR i = 97 TO 122
            IF INSTR(line$, CHR$(i)) THEN
                keycount(i, 0) = keycount(i, 0) + 1 'record 0 tells us how many there are for that letter
                keycount(i, keycount(i, 0)) = filecount 'add the word number to the proper index
            END IF
        NEXT
    END IF
WEND
CLOSE #1
 
 
t# = TIMER
timelimit = 5
 
 
DO UNTIL TIMER > t# + timelimit
    CLS
    loopsran = loopsran + 1
 
 
    'RUN THROUGH WORDS
    FOR n = 1 TO keycount(keys, 0)
        w$ = word$(keycount(keys, n))
        l = LEN(w$)
        FOR i = 1 TO l
            templtr = ASC(w$, i)
            FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
                IF ltr(x) = templtr THEN
                    ltr(x) = 0
                    count = count + 1
                    EXIT FOR
                END IF
            NEXT x
            IF count <> i GOTO skipmore
        NEXT i
 
        PRINT w$,
 
        skipmore:
        $CHECKING:OFF
        _MEMPUT m1, m1.OFFSET, backup()
        $CHECKING:ON
        count = 0
    NEXT n
 
    PRINT "DONE..."
LOOP
PRINT USING "###,###,###,###,### loops ran in ##.# seconds"; loopsran, timelimit
 
'for debugging
FUNCTION letters$ ()
    FOR i = 1 TO 9
        IF ltr(i) THEN b$ = b$ + CHR$(ltr(i)) 'ltr() now holds ASCII codes; 0 = crossed off
    NEXT
    letters$ = "*" + b$ + "*"
END FUNCTION
 

Instead of a FOR...NEXT loop to reset the search list, I simply restore it from a backup array with _MEM.

From:

FOR m = 1 TO 9 'main fix #2 n's to m's
ltr$(m) = MID$(in$, m, 1)
NEXT m

To:

$CHECKING:OFF
_MEMPUT m1, m1.OFFSET, backup()
$CHECKING:ON

*****************
*****************

I'm now getting ~900 runs in a short 5 second period. I'd call that fast enough for just about anything I'd need to use it for. (And a nice improvement from the original 22 runs in 5 seconds.) ;)
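
The two ideas in this version, stopping at the first matching letter and restoring the working array from a backup in one bulk copy, can be sketched as follows. This is Python, not QB64: slice assignment stands in for the _MEMPUT restore only as a loose analogy, and `count_buildable` is a made-up name.

```python
def count_buildable(words, letters):
    """Count the words that can be spelled from letters, using each letter once."""
    backup = [ord(ch) for ch in letters]   # pristine copy, never modified
    work = backup[:]                       # working copy that gets crossed off
    hits = 0
    for word in words:
        ok = True
        for ch in word:
            code = ord(ch)
            if code in work:
                work[work.index(code)] = 0   # cross off the first match only
            else:
                ok = False
                break                        # give up on this word early
        if ok:
            hits += 1
        work[:] = backup                     # bulk restore instead of a per-element loop
    return hits


print(count_buildable(["bad", "dad", "cab"], "abcdefghi"))
```

Without the restore step, "cab" would fail here because "bad" had already used up the "a" and the "b".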

bplus · « **Reply #12 on:** October 07, 2018, 05:40:28 pm »

I am getting 363 cycles for abcdefghi in 5 secs with Windows 10 Intel Core i5 processor without any preprocessing of the word$() array.

bplus · « **Reply #13 on:** October 07, 2018, 07:24:40 pm »

Steve, your last code improvement is running 1067 or so loops on my machine.

Yeah OK, checking word length first does make sense and adds another 100 loops in 5 secs:

Code: QB64: [Select]

' WordCrack mod 1.bas B+ 2018-10-07
 
' now with timer mod
 
'INPUT "Enter a string to build words from "; in$
in$ = "abcdefghi"
in$ = LCASE$(in$)
lenin = LEN(in$)
 
'load words
OPEN "WordList.txt" FOR BINARY AS #1
gulp& = LOF(1)
buff$ = STRING$(gulp&, " ")
GET #1, , buff$
CLOSE #1
REDIM word$(0)
Split buff$, CHR$(10), word$()
filecount& = UBOUND(word$)
start! = TIMER
WHILE TIMER - start! < 5
    'RUN THROUGH WORDS
    FOR n& = 0 TO filecount&
        IF LEN(word$(n&)) <= lenin THEN
            c$ = in$
            OK% = -1
            FOR i% = 1 TO LEN(word$(n&))
                p% = INSTR(c$, MID$(word$(n&), i%, 1))
                IF p% = 0 THEN
                    OK% = 0: EXIT FOR
                ELSE
                    MID$(c$, p%, 1) = "+"
                END IF
            NEXT
            IF OK% THEN PRINT word$(n&); ", ";
        END IF
    NEXT
    counter = counter + 1
WEND
PRINT "Loop count in 5 secs ="; counter
END
 
SUB Split (mystr AS STRING, delim AS STRING, arr() AS STRING)
    ' bplus modifications of Galleon fix of Bulrush Split reply #13
    ' http://www.[abandoned, outdated and now likely malicious qb64 dot net website - don’t go there]/forum/index.php?topic=1612.0
    ' this sub further developed and tested here: \test\Strings\Split test.bas
    DIM copy AS STRING, p AS LONG, curpos AS LONG, arrpos AS LONG, lc AS LONG, dpos AS LONG
    copy = mystr 'make copy since we are messing with mystr
    'special case if delim is space, probably want to remove all excess space
    IF delim = " " THEN
        copy = RTRIM$(LTRIM$(copy))
        p = INSTR(copy, "  ")
        WHILE p > 0
            copy = MID$(copy, 1, p - 1) + MID$(copy, p + 1)
            p = INSTR(copy, "  ")
        WEND
    END IF
    curpos = 1
    arrpos = 0
    lc = LEN(copy)
    dpos = INSTR(curpos, copy, delim)
    DO UNTIL dpos = 0
        arr(arrpos) = MID$(copy, curpos, dpos - curpos)
        arrpos = arrpos + 1
        REDIM _PRESERVE arr(arrpos + 1) AS STRING
        curpos = dpos + LEN(delim)
        dpos = INSTR(curpos, copy, delim)
    LOOP
    arr(arrpos) = MID$(copy, curpos)
    REDIM _PRESERVE arr(arrpos) AS STRING
END SUB
 

codeguy · « **Reply #14 on:** October 08, 2018, 12:23:37 am »

This generates strings and retrieves matches for permutations of letters, creating words up to 8 characters long, in order too. On my humble machine it takes (39/64) s. Because the permutations are generated in lexical order AND the word list is in lexical order (both ascending), this becomes nothing more than a simple merge/find algorithm, the same one used for batch database updates. Quick? Judge for yourself: it found 58 matches in 322560 variable-length permutations. I am uncertain how many loops/second this code executes, but I'm gonna say it's 322560/(39/64), roughly 529329 trials/second on my humble 2.16GHz machine. On my machine, this will also execute about 8 times in 5 seconds, perhaps a wee bit more. I am uncertain how fast Steve's CPU is, but mine is no speed demon, nor is it super slow. Mine works out to roughly 240,000 trials/second per GHz.

Code: QB64: [Select]

WIDTH 80, 43
DIM parray(0 TO 7) AS LONG '* checks words up to 8 characters long
theword$ = "sequoias"
FOR i = 0 TO UBOUND(parray)
    parray(i) = ASC(theword$, i + 1)
NEXT
s& = LBOUND(parray)
h& = UBOUND(parray)
DO
    an& = s& + 1
    FOR q& = an& TO h&
        IF parray(q&) < parray(s&) THEN
            SWAP parray(q&), parray(s&)
        END IF
    NEXT
    s& = an&
LOOP WHILE s& <= h&
PRINT
PRINT
Wlist% = FREEFILE
OPEN ".\wordlist.txt" FOR BINARY AS Wlist%
PRINT LOF(Wlist%)
chunk$ = INPUT$(LOF(Wlist%), Wlist%)
CLOSE Wlist%
REDIM words$(0 TO 999999)
Wct& = 0
FOR u& = 1 TO LEN(chunk$)
    IF ASC(chunk$, u&) = 10 OR u& > LEN(chunk$) THEN
        IF LEN(w$) <= 8 THEN
            words$(Wct&) = w$
            PRINT w$; Wct&
            Wct& = Wct& + 1
        END IF
        w$ = ""
    ELSE
        w$ = w$ + MID$(chunk$, u&, 1)
    END IF
NEXT
chunk$ = ""
WordIndex& = 0 '* same name as the variable passed to Permute below
nmatches& = 0
'_DELAY 60
t! = TIMER(.001)
PRINT "permutations"
Permute parray(), 0, UBOUND(parray), np#, words$(), WordIndex&, Wct&, nmatches&, npermtrials#
f! = TIMER(.001)
PRINT f! - t!; nmatches&; np#; npermtrials#
DO
    x$ = INKEY$
LOOP UNTIL x$ > ""
SYSTEM
 
SUB DisplayResults (PArray() AS LONG, start, finish, np AS DOUBLE)
    DIM i AS LONG
    PRINT USING "#,###,###,###,###"; np;
    FOR i = LBOUND(parray) TO UBOUND(parray)
        PRINT PArray(i);
    NEXT
    PRINT
END SUB
 
SUB Rotate (parray() AS LONG, Start AS LONG, finish AS LONG)
    DIM ts AS LONG
    ts = parray(Start)
    FOR i = Start TO finish - 1
        SWAP parray(i), parray(i + 1)
    NEXT
    parray(finish) = ts
END SUB
 
SUB Permute (parray() AS LONG, start AS LONG, finish AS LONG, np AS DOUBLE, words$(), index&, wct&, matchcount&, NPermsTried#)
    np = np + 1
    IF index& < wct& THEN
        FOR a& = 0 TO UBOUND(parray)
            v$ = array2word$(parray(), LBOUND(parray), LBOUND(parray) + a&)
            DO
                IF words$(index&) < v$ THEN
                    index& = index& + 1
                ELSE
                    NPermsTried# = NPermsTried# + 1
                    IF words$(index&) = v$ THEN
                        matchcount& = matchcount& + 1
                        PRINT v$; ":match:"; words$(index&); index&; matchcount&
                    END IF
                    EXIT DO
                END IF
            LOOP
        NEXT
    ELSE
        EXIT SUB
    END IF
    IF start < finish THEN
        DIM i AS LONG
        DIM j AS LONG
        FOR i = finish - 1 TO start STEP -1
            FOR j = i + 1 TO finish
                SWAP parray(i), parray(j)
                Permute parray(), i + 1, finish, np, words$(), index&, wct&, matchcount&, NPermsTried#
            NEXT
            Rotate parray(), i, finish
        NEXT
    END IF
END SUB
 
FUNCTION array2word$ (parray() AS LONG, start AS LONG, finish AS LONG)
    m$ = SPACE$(finish - start + 1)
    POSITION& = 1
    FOR z& = start TO finish
        MID$(m$, POSITION&) = CHR$(parray(z&))
        POSITION& = POSITION& + 1
    NEXT
    array2word$ = m$
END FUNCTION
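
Stripped of the permutation generator, the merge/find step is a single pass over two sorted lists. Here is a minimal sketch of that step alone, in Python rather than QB64. It is not a translation of Permute: the candidates are built up front with the standard `itertools.permutations`, and `merge_find` and the sample words are made up for illustration.

```python
from itertools import permutations


def merge_find(candidates, words):
    """Return entries present in both sorted lists, using one pass over each."""
    matches = []
    i = j = 0
    while i < len(candidates) and j < len(words):
        if candidates[i] < words[j]:
            i += 1                      # candidate is not a word, try the next one
        elif candidates[i] > words[j]:
            j += 1                      # word cannot be built, move down the list
        else:
            matches.append(words[j])
            i += 1
            j += 1
    return matches


letters = "abd"
# Every arrangement of 1..3 of the letters, deduplicated and sorted
candidates = sorted({"".join(p)
                     for r in range(1, len(letters) + 1)
                     for p in permutations(letters, r)})
words = ["ab", "bad", "cab", "dab", "dad"]   # must already be sorted
result = merge_find(candidates, words)
print(result)
```

Neither index ever moves backwards, which is why the cost grows with the combined length of the two lists rather than with their product.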
 
