Author Topic: WordCracker  (Read 8002 times)

0 Members and 1 Guest are viewing this topic.

Offline Zeppelin

  • Newbie
  • Posts: 43
    • Zeppelin Games ItchIo
WordCracker
« on: October 07, 2018, 03:56:15 am »
Hey,
I run into a issue with my program.
Im trying to make a program that will, when given a string of 9 letters and 1 key letter to print out all the possible words using these letters.
For example:
9 Letters: ABCDEFGHI
Key Letter: A
PRINTS:
BAD
CAB
DEAF
etc....

I am using a .txt database full of words from the Oxford dictionary and each time I run the program nothing is output. I cant find the issue.

Thanks,
Zeppelin

Ps. The program and wordlist are attached below.
* WordCrack.bas (Filesize: 0.98 KB, Downloads: 205)
* WordList.txt (Filesize: 775.81 KB, Downloads: 27960)
+[--->++<]>+.+++[->++++<]>.[--->+<]>+.-[---->+<]>++.+[->+++<]>+.+++++++++++.----------.[--->+<]>----.+[---->+<]>+++.---[->++++<]>.------------.+.++++++++++.+[---->+<]>+++.+[->+++<]>++.[--->+<]>+.[->+++<]>+.++++++++++.+.>++++++++++.

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • Steve’s QB64 Archive Forum
Re: WordCracker
« Reply #1 on: October 07, 2018, 07:37:00 am »
http://qb64.freeforums.net/thread/42/scrabble-word-maker -- Sounds like what you're wanting is basically the same thing I've did here.  Give it a look and see if it helps.
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
Re: WordCracker
« Reply #2 on: October 07, 2018, 10:33:25 am »
Code: QB64: [Select]
  1. 'SPLIT UP LETTERS
  2. FOR n = 1 TO 9
  3.     ltr$(n) = MID$(in$, n, 1)
  4. PRINT LEN(ltr$)  '>>>>>>>>>>>> 0 !!!
  5.  

Code: QB64: [Select]
  1.             FOR x = 1 TO LEN(in$)  '<<<<<<<<<<<<<<<<<<<<<<< change to this?
  2.                 IF ltr$(x) = templtr$ THEN
  3.                     ltr$(x) = ""
  4.                     count = count + 1
  5.                 END IF
  6.             NEXT x
  7.  

OH!!! This is screwing you up too!

Code: QB64: [Select]
  1.     FOR n = 1 TO 9
  2.         ltr$(n) = MID$(in$, n, 1)
  3.     NEXT n
You are using n for the word index from the file, and then changing n here when resetting ltr$() array!!!
« Last Edit: October 07, 2018, 10:39:21 am by bplus »

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
Re: WordCracker
« Reply #3 on: October 07, 2018, 10:47:30 am »
OK now get stuff printed in reasonable time!
Code: QB64: [Select]
  1.  
  2. DIM word$(84100)
  3. DIM SHARED ltr$(9)  '<<<<<<<<<<< shared to debug
  4.  
  5. 'SET KEY AND LETTERS
  6. key$ = "a"
  7. in$ = "abcdefghi"
  8. lenin = LEN(in$) '<<<<<<<<<<< for faster processing
  9.  
  10. 'SPLIT UP LETTERS
  11. FOR n = 1 TO 9
  12.     ltr$(n) = MID$(in$, n, 1)
  13.  
  14. 'LOAD ALL WORDS FROM .TXT FILE
  15. PRINT "LOADING...."
  16. OPEN "WordList.txt" FOR INPUT AS #1  '<<<<<<<<< moved to more appropriate place
  17.     filecount = filecount + 1
  18.     INPUT #1, line$
  19.     word$(filecount) = line$
  20.  
  21.  
  22. 'RUN THROUGH WORDS
  23. FOR n = 1 TO filecount
  24.     findkey = INSTR(word$(n), key$)
  25.  
  26.     IF findkey THEN
  27.         FOR i = 1 TO LEN(word$(n))
  28.             templtr$ = MID$(word$(n), i, 1)
  29.             FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
  30.                 IF ltr$(x) = templtr$ THEN
  31.                     ltr$(x) = ""
  32.                     count = count + 1
  33.                 END IF
  34.                 'debugging
  35.                 'PRINT "Update: file word = "; word$(n); " and here is current letters crossed off: "; letters$
  36.                 'INPUT "OK press enter... "; wate$
  37.             NEXT x
  38.         NEXT i
  39.  
  40.         IF count = LEN(word$(n)) THEN
  41.             PRINT word$(n)
  42.         END IF
  43.     END IF
  44.  
  45.     FOR m = 1 TO 9  'main fix #2 n's  to m's
  46.         ltr$(m) = MID$(in$, m, 1)
  47.     NEXT m
  48.     count = 0
  49.  
  50. PRINT "DONE..."
  51.  
  52. 'for debugging
  53. FUNCTION letters$ ()
  54.     FOR i = 1 TO 9
  55.         b$ = b$ + ltr$(i)
  56.     NEXT
  57.     letters$ = "*" + b$ + "*"
  58.  

DONE? Looks like the logic was OK, just the variable assignments needed fixing.
Word crack.PNG
* Word crack.PNG (Filesize: 22.07 KB, Dimensions: 728x679, Views: 329)
« Last Edit: October 07, 2018, 10:57:43 am by bplus »

Offline codeguy

  • Forum Regular
  • Posts: 174
Re: WordCracker
« Reply #4 on: October 07, 2018, 02:09:15 pm »
Gives you every permutation of letters, numbers, etc. If parray() is ordered, so is the output.
Code: QB64: [Select]
  1. WIDTH 80, 43
  2. DIM parray(0 TO 3) AS LONG
  3. FOR i = 0 TO UBOUND(parray)
  4.     parray(i) = i
  5. PRINT "permutations"
  6. Permute parray(), 0, UBOUND(parray), np
  7.     x$ = INKEY$
  8. LOOP UNTIL x$ > ""
  9.  
  10. SUB DisplayResults (PArray() AS LONG, start, finish, np AS DOUBLE)
  11. PRINT USING "#,###,###,###,###"; np;
  12. FOR i = LBOUND(parray) TO UBOUND(parray)
  13.     PRINT PArray(i);
  14.  
  15. SUB Rotate (parray() AS LONG, Start AS LONG, finish AS LONG)
  16. ts = parray(Start)
  17. FOR i = Start TO finish - 1
  18.     SWAP parray(i), parray(i + 1)
  19. parray(finish) = ts
  20.  
  21. SUB Permute (parray() AS LONG, start AS LONG, finish AS LONG, np AS DOUBLE)
  22. np = np + 1
  23. DisplayResults parray(), LBOUND(parray), UBOUND(parray), np
  24. IF start < finish THEN
  25.     DIM i AS LONG
  26.     DIM j AS LONG
  27.     FOR i = finish - 1 TO start STEP -1
  28.         FOR j = i + 1 TO finish
  29.             SWAP parray(i), parray(j)
  30.             Permute parray(), i + 1, finish, np
  31.         NEXT
  32.         Rotate parray(), i, finish
  33.     NEXT
  34.  

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
Re: WordCracker
« Reply #5 on: October 07, 2018, 02:52:59 pm »
Hi codeguy,

I thought about permutations and then thought nah! won't work with real words at varying lengths.

But I could be wrong, wouldn't be first time.

Wanna race? You modify your code for checking for real words and I will see if I can optimize Zeppellin's code some more.... post in 24 hours?



Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • Steve’s QB64 Archive Forum
Re: WordCracker
« Reply #6 on: October 07, 2018, 03:32:28 pm »
Hi codeguy,

I thought about permutations and then thought nah! won't work with real words at varying lengths.

But I could be wrong, wouldn't be first time.

Wanna race? You modify your code for checking for real words and I will see if I can optimize Zeppellin's code some more.... post in 24 hours?

Here's how I'd go, I think:

First, reduce the word list to exclude any words > 9 digits.  No need to search for impossible combinations.
Then count letters.  Save these in an array WordLetters(1 TO WordCount,1 TO 26).

Then count letters in the target word.
Compare.  If target count > word count then it's a match!

************************
The word list is going to be rather limited (less than 50k words I'd imagine), and most searches will terminate rather quickly.  (need an "A", don't have any?  Quit the search at this point.) 

I really don't think you'd need to worry about optimizing for speed any more than that, to be honest.
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
Re: WordCracker
« Reply #7 on: October 07, 2018, 03:44:47 pm »
Hi codeguy,

I thought about permutations and then thought nah! won't work with real words at varying lengths.

But I could be wrong, wouldn't be first time.

Wanna race? You modify your code for checking for real words and I will see if I can optimize Zeppellin's code some more.... post in 24 hours?

Here's how I'd go, I think:

First, reduce the word list to exclude any words > 9 digits.  No need to search for impossible combinations.
Then count letters.  Save these in an array WordLetters(1 TO WordCount,1 TO 26).

Then count letters in the target word.
Compare.  If target count > word count then it's a match!

************************
The word list is going to be rather limited (less than 50k words I'd imagine), and most searches will terminate rather quickly.  (need an "A", don't have any?  Quit the search at this point.) 

I really don't think you'd need to worry about optimizing for speed any more than that, to be honest.

Hmm... I think you are saying permutations is slower too?

I am thinking this part:
Code: QB64: [Select]
  1. 'SET KEY AND LETTERS
  2. key$ = "a"
  3. in$ = "abcdefghi"
  4. lenin = LEN(in$) '<<<<<<<<<<< for faster processing
  5.  

key$ and in$ might be input into the program at run time, so the letters might not be nine and in$ might be real words too not segments of the alphabet. That's how I would use this code for word games or word game solving. And the real words might have 2 or 3 of the same letters. I was planning on optimizing for such events.

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • Steve’s QB64 Archive Forum
Re: WordCracker
« Reply #8 on: October 07, 2018, 04:28:46 pm »
Here's an easy fix for you to optimize your code and run it faster.  (An increase of about 15x faster on my machine.) 

I'll post 2 routines here so you can run them and compare for speed differences:

YOURS (modified to loop and run for a set amount of time -- 5 seconds in this case):
Code: QB64: [Select]
  1. DIM word$(84100)
  2. DIM SHARED ltr$(9) '<<<<<<<<<<< shared to debug
  3.  
  4. 'SET KEY AND LETTERS
  5. key$ = "a"
  6. in$ = "abcdefghi"
  7. lenin = LEN(in$) '<<<<<<<<<<< for faster processing
  8.  
  9. 'SPLIT UP LETTERS
  10. FOR n = 1 TO 9
  11.     ltr$(n) = MID$(in$, n, 1)
  12.  
  13.  
  14.  
  15. 'LOAD ALL WORDS FROM .TXT FILE
  16. PRINT "LOADING...."
  17. OPEN "WordList.txt" FOR INPUT AS #1 '<<<<<<<<< moved to more appropriate place
  18. filecount = 0
  19.     filecount = filecount + 1
  20.     INPUT #1, line$
  21.     word$(filecount) = line$
  22.  
  23.  
  24. t# = TIMER
  25.  
  26. timelimit = 5
  27. DO UNTIL TIMER > t# + timelimit
  28.     CLS
  29.     loopsran = loopsran + 1
  30.  
  31.  
  32.  
  33.     'RUN THROUGH WORDS
  34.     FOR n = 1 TO filecount
  35.         findkey = INSTR(word$(n), key$)
  36.  
  37.         IF findkey THEN
  38.             FOR i = 1 TO LEN(word$(n))
  39.                 templtr$ = MID$(word$(n), i, 1)
  40.                 FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
  41.                     IF ltr$(x) = templtr$ THEN
  42.                         ltr$(x) = ""
  43.                         count = count + 1
  44.                     END IF
  45.                     'debugging
  46.                     'PRINT "Update: file word = "; word$(n); " and here is current letters crossed off: "; letters$
  47.                     'INPUT "OK press enter... "; wate$
  48.                 NEXT x
  49.             NEXT i
  50.  
  51.             IF count = LEN(word$(n)) THEN
  52.                 PRINT word$(n)
  53.             END IF
  54.         END IF
  55.  
  56.         FOR m = 1 TO 9 'main fix #2 n's  to m's
  57.             ltr$(m) = MID$(in$, m, 1)
  58.         NEXT m
  59.         count = 0
  60.     NEXT n
  61.  
  62.     PRINT "DONE..."
  63. PRINT USING "###,###,###,###,### loops ran in ##.# seconds"; loopsran, timelimit
  64.  
  65.  
  66. 'for debugging
  67. FUNCTION letters$ ()
  68.     FOR i = 1 TO 9
  69.         b$ = b$ + ltr$(i)
  70.     NEXT
  71.     letters$ = "*" + b$ + "*"

Modified:
Code: QB64: [Select]
  1. DEFLNG A-Z
  2. DIM word$(84100)
  3. DIM SHARED ltr(9) '<<<<<<<<<<< shared to debug
  4.  
  5. 'SET KEY AND LETTERS
  6. key$ = "a"
  7. in$ = "abcdefghi"
  8. lenin = LEN(in$) '<<<<<<<<<<< for faster processing
  9.  
  10. 'SPLIT UP LETTERS
  11. FOR n = 1 TO 9
  12.     ltr(n) = ASC(in$, n)
  13.  
  14.  
  15.  
  16.  
  17.  
  18. 'LOAD ALL WORDS FROM .TXT FILE
  19. PRINT "LOADING...."
  20. OPEN "WordList.txt" FOR BINARY AS #1 '<<<<<<<<< moved to more appropriate place
  21. filecount = 0
  22.     LINE INPUT #1, line$
  23.     IF LEN(line$) < 10 THEN
  24.         filecount = filecount + 1
  25.         word$(filecount) = line$
  26.     END IF
  27.  
  28.  
  29. t# = TIMER
  30. timelimit = 5
  31. DO UNTIL TIMER > t# + timelimit
  32.     CLS
  33.     loopsran = loopsran + 1
  34.  
  35.  
  36.     'RUN THROUGH WORDS
  37.     FOR n = 1 TO filecount
  38.         findkey = INSTR(word$(n), key$)
  39.  
  40.         IF findkey THEN
  41.             FOR i = 1 TO LEN(word$(n))
  42.                 templtr = ASC(word$(n), i)
  43.                 FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
  44.                     IF ltr(x) = templtr THEN
  45.                         ltr(x) = 0
  46.                         count = count + 1
  47.                     END IF
  48.                 NEXT x
  49.                 IF count <> i GOTO skipmore
  50.             NEXT i
  51.  
  52.             IF count = LEN(word$(n)) THEN
  53.                 PRINT word$(n),
  54.             END IF
  55.         END IF
  56.  
  57.         skipmore:
  58.         FOR m = 1 TO 9 'main fix #2 n's  to m's
  59.             ltr(m) = ASC(in$, m)
  60.         NEXT m
  61.         count = 0
  62.     NEXT n
  63.  
  64.     PRINT "DONE..."
  65. PRINT USING "###,###,###,###,### loops ran in ##.# seconds"; loopsran, timelimit
  66.  
  67. 'for debugging
  68. FUNCTION letters$ ()
  69.     FOR i = 1 TO 9
  70.         b$ = b$ + ltr$(i)
  71.     NEXT
  72.     letters$ = "*" + b$ + "*"

Yours will run 22 times in 5 seconds, mine 324 times...

The changes?

#1) STRINGS are gone.  Why do we need them?  Especially, WHY DO WE NEED MID$??

When going for speed routines, ASC outperforms MID$(x$,y,1) every time, hands down!  This is a major performance boost.

#2) The word list is limited to begin with.

    IF LEN(line$) < 10 THEN
        filecount = filecount + 1
        word$(filecount) = line$
    END IF

You're not going to find 10 letter word matches with only 9 letter words.

#3) (Not timed, but a huge speed boost): Changed file from INPUT to BINARY and changed INPUT # to LINE INPUT#.  This makes loading the word list a whole heck of a lot faster.  Even faster would be to load it all at once and then parse the words out of it, but who wants to go through the trouble for all of the few microseconds we'd save in this case?

Simple little things, but the affect the performance like crazy.
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • Steve’s QB64 Archive Forum
Re: WordCracker
« Reply #9 on: October 07, 2018, 05:01:34 pm »
And, if we create a letter index for our words (since we're going to do a limit by key$), we can almost double the performance.  (Probably more for certain letters which aren't going to be indexed as much as "a" is, in this case.)

Code: QB64: [Select]
  1. DEFLNG A-Z
  2. DIM word$(84100)
  3. DIM SHARED ltr(9) '<<<<<<<<<<< shared to debug
  4. DIM backup(9)
  5. DIM keycount(97 TO 122, 50000) 'letter/index
  6.  
  7. 'SET KEY AND LETTERS
  8. key$ = "a"
  9. in$ = "abcdefghi"
  10. lenin = LEN(in$) '<<<<<<<<<<< for faster processing
  11.  
  12. 'SPLIT UP LETTERS
  13. FOR n = 1 TO 9
  14.     ltr(n) = ASC(in$, n)
  15.     backup(n) = ltr(n)
  16.  
  17.  
  18.  
  19.  
  20.  
  21. 'LOAD ALL WORDS FROM .TXT FILE
  22. PRINT "LOADING...."
  23. OPEN "WordList.txt" FOR BINARY AS #1 '<<<<<<<<< moved to more appropriate place
  24. filecount = 0
  25.     LINE INPUT #1, line$
  26.     IF LEN(line$) < 10 THEN
  27.         filecount = filecount + 1
  28.         word$(filecount) = line$
  29.         FOR i = 97 TO 122
  30.             IF INSTR(line$, CHR$(i)) THEN
  31.                 keycount(i, 0) = keycount(i, 0) + 1 'record 0 tells us how many there are for that letter
  32.                 keycount(i, keycount(i, 0)) = filecount 'add the word number to the proper index
  33.             END IF
  34.         NEXT
  35.     END IF
  36.  
  37.  
  38. t# = TIMER
  39. timelimit = 5
  40. DO UNTIL TIMER > t# + timelimit
  41.     CLS
  42.     loopsran = loopsran + 1
  43.  
  44.  
  45.     'RUN THROUGH WORDS
  46.     findkey = ASC(key$)
  47.     FOR n = 1 TO keycount(findkey, 0)
  48.         w$ = word$(keycount(findkey, n))
  49.         l = LEN(w$)
  50.         FOR i = 1 TO l
  51.             templtr = ASC(w$, i)
  52.             FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
  53.                 IF ltr(x) = templtr THEN
  54.                     ltr(x) = 0
  55.                     count = count + 1
  56.                 END IF
  57.             NEXT x
  58.             IF count <> i GOTO skipmore
  59.         NEXT i
  60.  
  61.         IF count = l THEN PRINT w$,
  62.  
  63.         skipmore:
  64.         FOR m = 1 TO 9 'main fix #2 n's  to m's
  65.             ltr(m) = backup(m)
  66.         NEXT m
  67.         count = 0
  68.     NEXT n
  69.  
  70.     PRINT "DONE..."
  71. PRINT USING "###,###,###,###,### loops ran in ##.# seconds"; loopsran, timelimit
  72.  
  73. 'for debugging
  74. FUNCTION letters$ ()
  75.     FOR i = 1 TO 9
  76.         b$ = b$ + ltr$(i)
  77.     NEXT
  78.     letters$ = "*" + b$ + "*"
  79.  

No need to check each and every word for a letter, if we already have built an index to tell us which words contain those letters.  ;)

(Current run time is 488 loops in 5 seconds on my PC, if you're curious about the actual number to compare improvement.)
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
Re: WordCracker
« Reply #10 on: October 07, 2018, 05:01:54 pm »
Here's what I had in mind:
Code: QB64: [Select]
  1. ' WordCrack mod 1.bas B+ 2018-10-07
  2.  
  3. INPUT "Enter a string to build words from "; in$
  4. 'load words
  5. OPEN "WordList.txt" FOR BINARY AS #1
  6. gulp& = LOF(1)
  7. buff$ = STRING$(gulp&, " ")
  8. GET #1, , buff$
  9. REDIM word$(0)
  10. Split buff$, CHR$(10), word$()
  11. filecount& = UBOUND(word$)
  12. 'RUN THROUGH WORDS
  13. FOR n& = 0 TO filecount&
  14.     c$ = in$
  15.     OK% = -1
  16.     FOR i% = 1 TO LEN(word$(n&))
  17.         p% = INSTR(c$, MID$(word$(n&), i%, 1))
  18.         IF p% = 0 THEN
  19.             OK% = 0: EXIT FOR
  20.         ELSE
  21.             MID$(c$, p%, 1) = "+"
  22.         END IF
  23.     NEXT
  24.     IF OK% THEN PRINT word$(n&); ", ";
  25. PRINT "DONE..."
  26.  
  27.  
  28. SUB Split (mystr AS STRING, delim AS STRING, arr() AS STRING)
  29.     ' bplus modifications of Galleon fix of Bulrush Split reply #13
  30.     ' http://www.[abandoned, outdated and now likely malicious qb64 dot net website - don’t go there]/forum/index.php?topic=1612.0
  31.     ' this sub further developed and tested here: \test\Strings\Split test.bas
  32.     DIM copy AS STRING, p AS LONG, curpos AS LONG, arrpos AS LONG, lc AS LONG, dpos AS LONG
  33.     copy = mystr 'make copy since we are messing with mystr
  34.     'special case if delim is space, probably want to remove all excess space
  35.     IF delim = " " THEN
  36.         copy = RTRIM$(LTRIM$(copy))
  37.         p = INSTR(copy, "  ")
  38.         WHILE p > 0
  39.             copy = MID$(copy, 1, p - 1) + MID$(copy, p + 1)
  40.             p = INSTR(copy, "  ")
  41.         WEND
  42.     END IF
  43.     curpos = 1
  44.     arrpos = 0
  45.     lc = LEN(copy)
  46.     dpos = INSTR(curpos, copy, delim)
  47.     DO UNTIL dpos = 0
  48.         arr(arrpos) = MID$(copy, curpos, dpos - curpos)
  49.         arrpos = arrpos + 1
  50.         REDIM _PRESERVE arr(arrpos + 1) AS STRING
  51.         curpos = dpos + LEN(delim)
  52.         dpos = INSTR(curpos, copy, delim)
  53.     LOOP
  54.     arr(arrpos) = MID$(copy, curpos)
  55.     REDIM _PRESERVE arr(arrpos) AS STRING
  56.  

Looks pretty simple to me.

Yes, Binary load most significant speed improvement! :)

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • Steve’s QB64 Archive Forum
Re: WordCracker
« Reply #11 on: October 07, 2018, 05:26:23 pm »
And another small modification of the original gives us another decent speed boost:

Code: QB64: [Select]
  1. DEFLNG A-Z
  2. DIM word$(84100)
  3. DIM keys
  4. DIM SHARED ltr(1 TO 9) '<<<<<<<<<<< shared to debug
  5. DIM backup(1 TO 9)
  6. DIM keycount(97 TO 122, 50000) 'letter/index
  7. DIM m1 AS _MEM, m2 AS _MEM 'just for quick array restoration
  8. m1 = _MEM(ltr()): m2 = _MEM(backup())
  9.  
  10. 'SET KEY AND LETTERS
  11. keys = ASC("a")
  12. in$ = "abcdefghi"
  13. lenin = LEN(in$) '<<<<<<<<<<< for faster processing
  14.  
  15. 'SPLIT UP LETTERS
  16. FOR n = 1 TO 9
  17.     ltr(n) = ASC(in$, n)
  18. _MEMPUT m2, m2.OFFSET, ltr()
  19.  
  20.  
  21. 'LOAD ALL WORDS FROM .TXT FILE
  22. PRINT "LOADING...."
  23. OPEN "WordList.txt" FOR BINARY AS #1 '<<<<<<<<< moved to more appropriate place
  24. filecount = 0
  25.     LINE INPUT #1, line$
  26.     IF LEN(line$) < 10 THEN
  27.         filecount = filecount + 1
  28.         word$(filecount) = line$
  29.         FOR i = 97 TO 122
  30.             IF INSTR(line$, CHR$(i)) THEN
  31.                 keycount(i, 0) = keycount(i, 0) + 1 'record 0 tells us how many there are for that letter
  32.                 keycount(i, keycount(i, 0)) = filecount 'add the word number to the proper index
  33.             END IF
  34.         NEXT
  35.     END IF
  36.  
  37.  
  38. t# = TIMER
  39. timelimit = 5
  40.  
  41.  
  42. DO UNTIL TIMER > t# + timelimit
  43.     CLS
  44.     loopsran = loopsran + 1
  45.  
  46.  
  47.     'RUN THROUGH WORDS
  48.     FOR n = 1 TO keycount(keys, 0)
  49.         w$ = word$(keycount(keys, n))
  50.         l = LEN(w$)
  51.         FOR i = 1 TO l
  52.             templtr = ASC(w$, i)
  53.             FOR x = 1 TO lenin '<<<<<<<<<<<<<<<<< main fix #1
  54.                 IF ltr(x) = templtr THEN
  55.                     ltr(x) = 0
  56.                     count = count + 1
  57.                     EXIT FOR
  58.                 END IF
  59.             NEXT x
  60.             IF count <> i GOTO skipmore
  61.         NEXT i
  62.  
  63.         PRINT w$,
  64.  
  65.         skipmore:
  66.         $CHECKING:OFF
  67.         _MEMPUT m1, m1.OFFSET, backup()
  68.         $CHECKING:ON
  69.         count = 0
  70.     NEXT n
  71.  
  72.     PRINT "DONE..."
  73. PRINT USING "###,###,###,###,### loops ran in ##.# seconds"; loopsran, timelimit
  74.  
  75. 'for debugging
  76. FUNCTION letters$ ()
  77.     FOR i = 1 TO 9
  78.         b$ = b$ + ltr$(i)
  79.     NEXT
  80.     letters$ = "*" + b$ + "*"
  81.  

Instead of a FOR...LOOP to reset the search list, I simply restored it with a backup array with _MEM.

From:

        FOR m = 1 TO 9 'main fix #2 n's  to m's
            ltr$(m) = MID$(in$, m, 1)
        NEXT m

To:

        $CHECKING:OFF
        _MEMPUT m1, m1.OFFSET, backup()
        $CHECKING:ON

*****************
*****************

I'm now getting ~900 runs in a short 5 second period.  I'd call that fast enough for just about anything I'd need to use it for.  (And a nice improvement from the original 22 runs in 5 seconds.)  ;)
« Last Edit: October 07, 2018, 05:29:51 pm by SMcNeill »
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
Re: WordCracker
« Reply #12 on: October 07, 2018, 05:40:28 pm »
I am getting 363 cycles for abcdefghi in 5 secs with Windows 10 Intel Core i5 processor without any preprocessing of the word$() array.

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
Re: WordCracker
« Reply #13 on: October 07, 2018, 07:24:40 pm »
Steve your last code improvement is running 1067 or so loops on my machine.

Yeah OK, checking word length first does make sense and adds another 100 loops in 5 secs:
Code: QB64: [Select]
  1. ' WordCrack mod 1.bas B+ 2018-10-07
  2.  
  3. ' now with timer mod
  4.  
  5. 'INPUT "Enter a string to build words from "; in$
  6. in$ = "abcdefghi"
  7. in$ = LCASE$(in$)
  8. lenin = LEN(in$)
  9.  
  10. 'load words
  11. OPEN "WordList.txt" FOR BINARY AS #1
  12. gulp& = LOF(1)
  13. buff$ = STRING$(gulp&, " ")
  14. GET #1, , buff$
  15. REDIM word$(0)
  16. Split buff$, CHR$(10), word$()
  17. filecount& = UBOUND(word$)
  18. start! = TIMER
  19. WHILE TIMER - start! < 5
  20.     'RUN THROUGH WORDS
  21.     FOR n& = 0 TO filecount&
  22.         IF LEN(word$(n&)) <= lenin THEN
  23.             c$ = in$
  24.             OK% = -1
  25.             FOR i% = 1 TO LEN(word$(n&))
  26.                 p% = INSTR(c$, MID$(word$(n&), i%, 1))
  27.                 IF p% = 0 THEN
  28.                     OK% = 0: EXIT FOR
  29.                 ELSE
  30.                     MID$(c$, p%, 1) = "+"
  31.                 END IF
  32.             NEXT
  33.             IF OK% THEN PRINT word$(n&); ", ";
  34.         END IF
  35.     NEXT
  36.     counter = counter + 1
  37. PRINT "Loop count in 5 secs ="; counter
  38.  
  39. SUB Split (mystr AS STRING, delim AS STRING, arr() AS STRING)
  40.     ' bplus modifications of Galleon fix of Bulrush Split reply #13
  41.     ' http://www.[abandoned, outdated and now likely malicious qb64 dot net website - don’t go there]/forum/index.php?topic=1612.0
  42.     ' this sub further developed and tested here: \test\Strings\Split test.bas
  43.     DIM copy AS STRING, p AS LONG, curpos AS LONG, arrpos AS LONG, lc AS LONG, dpos AS LONG
  44.     copy = mystr 'make copy since we are messing with mystr
  45.     'special case if delim is space, probably want to remove all excess space
  46.     IF delim = " " THEN
  47.         copy = RTRIM$(LTRIM$(copy))
  48.         p = INSTR(copy, "  ")
  49.         WHILE p > 0
  50.             copy = MID$(copy, 1, p - 1) + MID$(copy, p + 1)
  51.             p = INSTR(copy, "  ")
  52.         WEND
  53.     END IF
  54.     curpos = 1
  55.     arrpos = 0
  56.     lc = LEN(copy)
  57.     dpos = INSTR(curpos, copy, delim)
  58.     DO UNTIL dpos = 0
  59.         arr(arrpos) = MID$(copy, curpos, dpos - curpos)
  60.         arrpos = arrpos + 1
  61.         REDIM _PRESERVE arr(arrpos + 1) AS STRING
  62.         curpos = dpos + LEN(delim)
  63.         dpos = INSTR(curpos, copy, delim)
  64.     LOOP
  65.     arr(arrpos) = MID$(copy, curpos)
  66.     REDIM _PRESERVE arr(arrpos) AS STRING
  67.  
« Last Edit: October 07, 2018, 08:02:23 pm by bplus »

Offline codeguy

  • Forum Regular
  • Posts: 174
Re: WordCracker
« Reply #14 on: October 08, 2018, 12:23:37 am »
generates strings and retrieves matches for permutations of letters creating words up to 8 characters long. in order too. On my humble machine, (39/64)s. Because permutations are generated in lexical order AND the wordlist is in lexical order (both ascending), this becomes nothing more than a simple merge/find algorithm, the same used for batch database updates. Quick? Judge for yourself. found 58 matches in 322560 variable-length permutations. I am uncertain how many loops/second this code executes, but I'm gonna say it's 322560/(39/64), roughly 529329 trials/second on my humble 2.16GHz machine. On my machine, this will also execute about 8 times, perhaps a wee bit more. I am uncertain how fast Steve's CPU is, but mine is no speed demon nor is it super slow. Mine is roughly 240,000 trials/GHz.
Code: QB64: [Select]
  1. WIDTH 80, 43
  2. DIM parray(0 TO 7) AS LONG '* checks words up to 8 characters long
  3. theword$ = "sequoias"
  4. FOR i = 0 TO UBOUND(parray)
  5.     parray(i) = ASC(theword$, i + 1)
  6. s& = LBOUND(parray)
  7. h& = UBOUND(parray)
  8.     an& = s& + 1
  9.     FOR q& = an& TO h&
  10.         IF parray(q&) < parray(s&) THEN
  11.             SWAP parray(q&), parray(s&)
  12.         END IF
  13.     NEXT
  14.     s& = an&
  15. LOOP WHILE s& <= h&
  16. Wlist% = FREEFILE
  17. OPEN ".\wordlist.txt" FOR BINARY AS Wlist%
  18. PRINT LOF(Wlist%)
  19. chunk$ = INPUT$(LOF(Wlist%), Wlist%)
  20. CLOSE Wlist%
  21. REDIM words$(0 TO 999999)
  22. Wct& = 0
  23. FOR u& = 1 TO LEN(chunk$)
  24.     IF ASC(chunk$, u&) = 10 OR u& > LEN(chunk$) THEN
  25.         IF LEN(w$) <= 8 THEN
  26.             words$(Wct&) = w$
  27.             PRINT w$; Wct&
  28.             Wct& = Wct& + 1
  29.         END IF
  30.         w$ = ""
  31.     ELSE
  32.         w$ = w$ + MID$(chunk$, u&, 1)
  33.     END IF
  34. chunk$ = ""
  35. WordsIndex& = 0
  36. nmatches& = 0
  37. '_DELAY 60
  38. t! = TIMER(.001)
  39. PRINT "permutations"
  40. Permute parray(), 0, UBOUND(parray), np#, words$(), WordIndex&, Wct&, nmatches&, npermtrials#
  41. f! = TIMER(.001)
  42. PRINT f! - t!; nmatches&; np#; npermtrials#
  43.     x$ = INKEY$
  44. LOOP UNTIL x$ > ""
  45.  
  46. SUB DisplayResults (PArray() AS LONG, start, finish, np AS DOUBLE)
  47.     DIM i AS LONG
  48.     PRINT USING "#,###,###,###,###"; np;
  49.     FOR i = LBOUND(parray) TO UBOUND(parray)
  50.         PRINT PArray(i);
  51.     NEXT
  52.     PRINT
  53.  
  54. SUB Rotate (parray() AS LONG, Start AS LONG, finish AS LONG)
  55.     DIM ts AS LONG
  56.     ts = parray(Start)
  57.     FOR i = Start TO finish - 1
  58.         SWAP parray(i), parray(i + 1)
  59.     NEXT
  60.     parray(finish) = ts
  61.  
  62. SUB Permute (parray() AS LONG, start AS LONG, finish AS LONG, np AS DOUBLE, words$(), index&, wct&, matchcount&, NPermsTried#)
  63.     np = np + 1
  64.     IF index& < wct& THEN
  65.         FOR a& = 0 TO UBOUND(parray)
  66.             v$ = array2word$(parray(), LBOUND(parray), LBOUND(parray) + a&)
  67.             DO
  68.                 IF words$(index&) < v$ THEN
  69.                     index& = index& + 1
  70.                 ELSE
  71.                     NPermsTried# = NPermsTried# + 1
  72.                     IF words$(index&) = v$ THEN
  73.                         matchcount& = matchcount& + 1
  74.                         PRINT v$; ":match:"; words$(index&); index&; matchcount&
  75.                     END IF
  76.                     EXIT DO
  77.                 END IF
  78.             LOOP
  79.         NEXT
  80.     ELSE
  81.         EXIT SUB
  82.     END IF
  83.     IF start < finish THEN
  84.         DIM i AS LONG
  85.         DIM j AS LONG
  86.         FOR i = finish - 1 TO start STEP -1
  87.             FOR j = i + 1 TO finish
  88.                 SWAP parray(i), parray(j)
  89.                 Permute parray(), i + 1, finish, np, words$(), index&, wct&, matchcount&, NPermsTried#
  90.             NEXT
  91.             Rotate parray(), i, finish
  92.         NEXT
  93.     END IF
  94.  
  95. FUNCTION array2word$ (parray() AS LONG, start AS LONG, finish AS LONG)
  96.     m$ = SPACE$(finish - start + 1)
  97.     POSITION& = 1
  98.     FOR z& = start TO finish
  99.         MID$(m$, POSITION&) = CHR$(parray(z&))
  100.         POSITION& = POSITION& + 1
  101.     NEXT
  102.     array2word$ = m$
  103.  
« Last Edit: October 08, 2018, 01:32:00 am by codeguy »