Author Topic: Word ladder - Rosetta Code  (Read 21191 times)

0 Members and 1 Guest are viewing this topic.

Offline david_uwi

  • Newbie
  • Posts: 71
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #30 on: September 10, 2021, 02:29:23 pm »
you can replace
if w(i)="*****" then 500
with
if w(i)<>"*****" then
.
.
end if
and put an "end if" before the 500 label. I seems to make it go faster - who would have thought.

The matrix z4(a,b) is storing all the letter movements. The matrix q4(a,b) is storing all the words that can be formed on each iteration. Maybe they could be combined somehow but it could be a big job. I am having difficulty remembering how it all works as this is a remnant from previous versions which has survived.
If memory storage was still at a premium I would have to work something out, but it isn't.

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #31 on: September 11, 2021, 08:38:06 pm »
In studying david_uwi code first thing I notice is that there is no Type and specially no appending to arrays using REDIM _Preserve

so throw out these:
Code: QB64: [Select]
  1. SUB Split (SplitMeString AS STRING, delim AS STRING, loadMeArray() AS STRING)
  2.     DIM curpos AS LONG, arrpos AS LONG, LD AS LONG, dpos AS LONG 'fix use the Lbound the array already has
  3.     curpos = 1: arrpos = LBOUND(loadMeArray): LD = LEN(delim)
  4.     dpos = INSTR(curpos, SplitMeString, delim)
  5.     DO UNTIL dpos = 0
  6.         loadMeArray(arrpos) = MID$(SplitMeString, curpos, dpos - curpos)
  7.         arrpos = arrpos + 1
  8.         IF arrpos > UBOUND(loadMeArray) THEN REDIM _PRESERVE loadMeArray(LBOUND(loadMeArray) TO UBOUND(loadMeArray) + 1000) AS STRING
  9.         curpos = dpos + LD
  10.         dpos = INSTR(curpos, SplitMeString, delim)
  11.     LOOP
  12.     loadMeArray(arrpos) = MID$(SplitMeString, curpos)
  13.     REDIM _PRESERVE loadMeArray(LBOUND(loadMeArray) TO arrpos) AS STRING 'get the ubound correct
  14.  
  15. SUB cAppend (arr() AS connectType, cWrd$, w$)
  16.     REDIM _PRESERVE arr(LBOUND(arr) TO UBOUND(arr) + 1) AS connectType
  17.     arr(UBOUND(arr)).connect = cWrd$
  18.     arr(UBOUND(arr)).word = w$
  19.  

Make simple arrays that are over dim'd and use max indexes to track last item in arrays instead and you get this:
Code: QB64: [Select]
  1. _Title "Word ladder v5 - Rosetta Code" 'b+ start 2021-09-03 ref: http://rosettacode.org/wiki/Word_ladder
  2. ' 2021-09-10  v5 is modified from v3 (original) plus times and TheList uBound added to the report
  3. ' minus connectType will use 2 simple oversized arrays that have their own indexes
  4. ' OK for this version I cut 2.3 mins off v3 version by above changes, working my way towards david_uwi
  5. ' best time.
  6.  
  7. DefLng A-Z
  8. Dim start!
  9. start! = Timer(.001)
  10. Print ladder$("boy", "man") '     quick
  11. Print ladder$("girl", "lady") '   this takes awhile
  12. Print ladder$("john", "jane") '   quick enough
  13. Print ladder$("alien", "drool") ' cool but takes a long long time!
  14. Print ladder$("child", "adult") ' and this takes awhile
  15. Print ladder$("play", "ball") '   goes quick
  16. Print ladder$("fun", "job") '     ditto
  17. Print Timer(.001) - start!
  18.  
  19.  
  20. Function ladder$ (w1$, w2$) ' assume two words are same length
  21.     Dim start!, wl, wd$, maxWI, connect$, i, found, j, done, listplace, target$, trail$, testChange, lastChange
  22.     'If Len(w1$) <> Len(w2$) Then ladder$ = "Error: the 2 words have different letter amounts": Exit Function
  23.     start! = Timer(.001)
  24.     wl = Len(w1$)
  25.     Dim w$(4100) ' over dim so no append or redim preserve, miminize if's
  26.     Open "unixdict.txt" For Input As #1
  27.     While EOF(1) = 0
  28.         Input #1, wd$ ' w is for word
  29.         If Len(wd$) = wl Then maxWI = maxWI + 1: w$(maxWI) = wd$ 'words all same length in strings, save time no seperators
  30.     Wend
  31.     Close #1 ' done with file
  32.     ReDim ListConnects$(1000000), ListWords$(1000000), maxLI
  33.     connect$ = w1$
  34.     newConnect:
  35.     For i = 1 To maxWI
  36.         If oneChange%(connect$, w$(i)) Then
  37.             If maxLI = 0 Then
  38.                 maxLI = 1
  39.                 ListConnects$(1) = connect$
  40.                 ListWords$(1) = w$(i)
  41.             Else ' add to list only if word isn't a connect
  42.                 found = 0
  43.                 For j = 1 To maxLI
  44.                     If w$(i) = ListConnects$(j) Then found = -1: Exit For
  45.                 Next
  46.                 If found = 0 And w$(i) <> connect$ Then
  47.                     maxLI = maxLI + 1
  48.                     ListConnects$(maxLI) = connect$
  49.                     ListWords$(maxLI) = w$(i)
  50.                 End If
  51.             End If
  52.             If w$(i) = w2$ Then done = -1: Exit For
  53.         End If
  54.     Next
  55.     If done = 0 Then
  56.         listplace = listplace + 1
  57.         If listplace > maxLI Then
  58.             ladder$ = "Could NOT connect " + w1$ + " to " + w2$
  59.         Else
  60.             connect$ = ListWords$(listplace)
  61.             GoTo newConnect
  62.         End If
  63.     Else
  64.         'should be able to backtrack our path?
  65.         ladder$ = "Could connect " + w1$ + " to " + w2$
  66.         target$ = w2$: trail$ = w2$
  67.         again:
  68.         For i = 1 To maxLI
  69.             If ListWords$(i) = target$ Then target$ = ListConnects$(i): trail$ = target$ + " > " + trail$: GoTo again
  70.         Next
  71.         ladder$ = ladder$ + Chr$(10) + trail$
  72.     End If
  73.     ladder$ = ladder$ + Chr$(10) + Str$(Timer(.001) - start!) + " secs, TheList() has" + Str$(maxLI) + " items."
  74.  
  75. Function oneChange% (last$, test$)
  76.     Dim i, strike, ll
  77.     ll = Len(last$)
  78.     For i = 1 To ll
  79.         If Mid$(last$, i, 1) <> Mid$(test$, i, 1) Then strike = strike + 1
  80.     Next
  81.     If strike = 1 Then oneChange% = -1

and cut 2.3 mins or so off time!

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #32 on: September 11, 2021, 08:50:05 pm »
But david_uwi is tracking the change place when connecting one word to next and somehow reducing the number of new words to continue the path to target word. After much fiddling I track the change position going from connect word to next and am able to reduce the number of new connections to check, not nearly as few as david_uwi but significantly less than I was having to check. Now my time is down to between 90 and 100 secs comparing to my 3rd mod of david_uwi's code to get the 2 output screens about the same for comparing so
my v6 version of Word Ladder:
Code: QB64: [Select]
  1. _Title "Word ladder v6 - Rosetta Code" 'b+ start 2021-09-03 ref: http://rosettacode.org/wiki/Word_ladder
  2. ' 2021-09-10  v5 is modified from v3 (original) plus times and TheList uBound added to the report
  3. ' minus connectType will use 2 simple oversized arrays that have their own indexes
  4. ' OK for this version I cut 2.3 mins off v3 version by above changes, working my way towards david_uwi
  5. ' best time.
  6. ' 2021-09-11 v6 try to use the fact that a change of letter at position x cannot occur between n and n+1 on a path.
  7. ' eg boy > bot > bon will never happen in a shortest path that is 2 changes in a row of the 3rd letter.
  8. ' so here in v6 we will make a third array d() d for delta that tracks the change from connect to word
  9. ' modify oneChange Function to Change and it will return the letter position changed if only one letter was changed.
  10.  
  11. DefLng A-Z
  12. Dim start!
  13. start! = Timer(.001)
  14. ladder "boy", "man" '     quick
  15. ladder "girl", "lady" '   this takes awhile
  16. ladder "john", "jane" '   quick enough
  17. ladder "alien", "drool" ' cool but takes a long long time!
  18. ladder "child", "adult" ' and this takes awhile
  19. ladder "play", "ball" '   goes quick
  20. ladder "fun", "job" '     ditto
  21. Print: Print "Total time:"; Timer(.001) - start!
  22.  
  23. Sub ladder (w1$, w2$) ' assume two words are same length
  24.     Dim start!, wl, wd$, maxWI, connect$, i, found, j, done, listplace, target$, trail$, testChange, lastChange
  25.     'If Len(w1$) <> Len(w2$) Then ladder$ = "Error: the 2 words have different letter amounts": Exit Function
  26.     start! = Timer(.001)
  27.     wl = Len(w1$)
  28.     Dim w$(4100) ' over dim so no append or redim preserve, miminize if's
  29.     Open "unixdict.txt" For Input As #1
  30.     While EOF(1) = 0
  31.         Input #1, wd$ ' w is for word
  32.         If Len(wd$) = wl Then maxWI = maxWI + 1: w$(maxWI) = wd$ 'words all same length in strings, save time no seperators
  33.     Wend
  34.     Close #1 ' done with file
  35.     ReDim ListConnects$(1000000), ListWords$(1000000), ListChanges(1000000), maxLI
  36.     connect$ = w1$
  37.     newConnect:
  38.     For i = 1 To maxWI
  39.         testChange = Change%(connect$, w$(i))
  40.         If testChange Then
  41.             If maxLI = 0 Then
  42.                 maxLI = 1
  43.                 ListConnects$(1) = connect$
  44.                 ListWords$(1) = w$(i)
  45.             Else ' add to list only if word isn't a connect
  46.                 found = 0
  47.                 For j = 1 To maxLI
  48.                     If w$(i) = ListConnects$(j) Then found = -1: Exit For
  49.                 Next
  50.                 If found = 0 And w$(i) <> connect$ Then
  51.                     If testChange <> lastChange Then
  52.                         maxLI = maxLI + 1
  53.                         ListConnects$(maxLI) = connect$
  54.                         ListWords$(maxLI) = w$(i)
  55.                         ListChanges(maxLI) = testChange
  56.                     End If
  57.                 End If
  58.             End If
  59.             If w$(i) = w2$ Then done = -1: Exit For
  60.         End If
  61.     Next
  62.     If done = 0 Then
  63.         listplace = listplace + 1
  64.         If listplace > maxLI Then
  65.             Print "Could NOT connect " + w1$ + " to " + w2$
  66.         Else
  67.             connect$ = ListWords$(listplace)
  68.             lastChange = ListChanges(listplace)
  69.             GoTo newConnect
  70.         End If
  71.     Else
  72.         'should be able to backtrack our path?
  73.         'ladder$ = "Could connect " + w1$ + " to " + w2$
  74.         target$ = w2$: trail$ = w2$
  75.         again:
  76.         For i = 1 To maxLI
  77.             If ListWords$(i) = target$ Then target$ = ListConnects$(i): trail$ = target$ + " > " + trail$: GoTo again
  78.         Next
  79.         Print trail$
  80.     End If
  81.     Print Str$(Timer(.001) - start!) + " secs, Max index:" + Str$(maxLI)
  82.  
  83. Function Change% (last$, test$)
  84.     Dim i, strike, L, saveI
  85.     L = Len(last$)
  86.     For i = 1 To L
  87.         If Mid$(last$, i, 1) <> Mid$(test$, i, 1) Then strike = strike + 1: saveI = i
  88.     Next
  89.     If strike = 1 Then Change% = saveI
  90.  
  91.  

and david_uwi mod 3 by bplus:
Code: QB64: [Select]
  1. _Title "Word ladder v3 b+ mod david_uwi" '2021-09-10 david has already improved his first version!
  2. ' ref:  https://www.qb64.org/forum/index.php?topic=4157.msg135293#msg135293
  3. ' get original version above, I am modifying below to study
  4. ' 2021-09-11 in v3 b+ mod david_uwi I modify to compare whole set to my orignal whole set
  5.  
  6. Dim start!
  7. start! = Timer(.001)
  8. ladder "boy", "man" '     quick
  9. ladder "girl", "lady" '   this takes awhile
  10. ladder "john", "jane" '   quick enough
  11. ladder "alien", "drool" ' cool but takes a long long time!
  12. ladder "child", "adult" ' and this takes awhile
  13. ladder "play", "ball" '   goes quick
  14. ladder "fun", "job" '     ditto
  15. Print: Print "Total time:"; Timer(.001) - start!
  16.  
  17. Sub ladder (q1$, q2$)
  18.     Open "unixdict.txt" For Input As #1
  19.     'OPEN "c:\cw1\english3.txt" FOR INPUT AS #1 'bigger dictionary
  20.     Dim w(10000) As String * 5 'make sure we have enough storage!!
  21.     Dim q4(10000, 100) As String * 5 ' fixed string! storing connect words
  22.     Dim k1(100)
  23.     Dim z4(10000, 100) As String
  24.     tt = Timer(.001) 'include time taken to load to RAM bplus mod to accuracy (.001)
  25.  
  26.     n = Len(q1$)
  27.     If n < 5 Then q1$ = q1$ + Space$(5 - n): q2$ = q2$ + Space$(5 - n)
  28.     Do Until EOF(1)
  29.         Input #1, a$
  30.         If Len(a$) = n Then
  31.             k = k + 1
  32.             w(k) = a$
  33.             If a$ = q1$ Then w(k) = "*****"
  34.         End If
  35.     Loop
  36.     Close #1
  37.     'tt = TIMER
  38.     jk = 1
  39.     k1(jk) = 1
  40.     q4(1, 1) = q1$
  41.     Do
  42.         For i = k To 1 Step -1
  43.             If w(i) = "*****" Then _Continue '  500
  44.             cnt = 0
  45.             For kk = 1 To k1(jk)
  46.                 cnt = 0
  47.  
  48.                 For j = 1 To n
  49.                     If Mid$(w(i), j, 1) = Mid$(q4(kk, jk), j, 1) Then cnt = cnt + 1 Else zz = j
  50.                 Next j
  51.                 If cnt = n - 1 Then
  52.                     k1(jk + 1) = k1(jk + 1) + 1
  53.                     q4(k1(jk + 1), jk + 1) = w(i)
  54.                     z4(k1(jk + 1), jk + 1) = z4(kk, jk) + Mid$(w(i), zz, 1) + Chr$(zz + 48) + " "
  55.                     w(i) = "*****"
  56.                 End If
  57.             Next kk
  58.         Next i
  59.         kflag = 0
  60.         For i = 1 To k1(jk + 1)
  61.             If q4(i, jk + 1) = q2$ Then kflag = 99: final$ = z4(i, jk + 1)
  62.         Next i
  63.         'Print
  64.         'Print jk; k1(jk + 1)
  65.         If k1(jk + 1) = 0 Then kflag = 99
  66.         jk = jk + 1
  67.         If kflag = 99 Then Exit Do
  68.     Loop
  69.     If k1(jk) = 0 Then Print "No path found! from "; q1$; " to "; q2$; ' b+ removed a print blank line
  70.     If k1(jk) > 0 Then
  71.         xlen = Len(final$)
  72.         'Print:
  73.         Print q1$; " ";
  74.         For i = 1 To xlen Step 3
  75.             c1$ = Mid$(final$, i, 1)
  76.             c2$ = Mid$(final$, i + 1, 1)
  77.             Mid$(q1$, Val(c2$), 1) = c1$
  78.             Print q1$; " ";
  79.         Next i
  80.     End If
  81.     Print: Print "time taken = "; Timer(.001) - tt; " seconds"
  82.  

now we have 2 screens to compare: see attached

So I am closing in :)

My version 6 versus my 3rd mod of david_uwi.PNG
* My version 6 versus my 3rd mod of david_uwi.PNG (Filesize: 34.01 KB, Dimensions: 1283x368, Views: 172)
« Last Edit: September 11, 2021, 08:52:07 pm by bplus »

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #33 on: September 11, 2021, 08:56:42 pm »
But ho!

I see a way to cut david_uwi's time very significantly by gulping in the file, splitting it and then processing it:
Code: QB64: [Select]
  1. _Title "Word ladder v4 b+ mod david_uwi" '2021-09-10 david has already improved his first version!
  2. ' ref:  https://www.qb64.org/forum/index.php?topic=4157.msg135293#msg135293
  3. ' get original version above, I am modifying below to study
  4. ' 2021-09-11 in v3 b+ mod david_uwi I modify to compare whole set to my orignal whole set
  5. ' 2021-09-11 I think I can load the words faster
  6.  
  7. Dim start!
  8. start! = Timer(.001)
  9. ladder "boy", "man" '     quick
  10. ladder "girl", "lady" '   this takes awhile
  11. ladder "john", "jane" '   quick enough
  12. ladder "alien", "drool" ' cool but takes a long long time!
  13. ladder "child", "adult" ' and this takes awhile
  14. ladder "play", "ball" '   goes quick
  15. ladder "fun", "job" '     ditto
  16. Print: Print "Total time:"; Timer(.001) - start!
  17.  
  18. Sub ladder (q1$, q2$)
  19.  
  20.     'Open "unixdict.txt" For Input As #1
  21.     'OPEN "c:\cw1\english3.txt" FOR INPUT AS #1 'bigger dictionary
  22.     Dim w(10000) As String * 5 'make sure we have enough storage!!
  23.     Dim q4(10000, 100) As String * 5 ' fixed string! storing connect words
  24.     Dim k1(100)
  25.     Dim z4(10000, 100) As String
  26.     tt = Timer(.001) 'include time taken to load to RAM bplus mod to accuracy (.001)
  27.  
  28.     Open "unixdict.txt" For Binary As #1 ' take the file in a gulp
  29.     buf$ = Space$(LOF(1))
  30.     Get #1, , buf$
  31.     Close #1
  32.     ReDim Fwords$(1 To 1), FI
  33.     Split buf$, Chr$(10), Fwords$()
  34.     FI = 1
  35.     n = Len(q1$)
  36.     If n < 5 Then q1$ = q1$ + Space$(5 - n): q2$ = q2$ + Space$(5 - n)
  37.     While FI <= UBound(fwords$)
  38.         If Len(Fwords$(FI)) = n Then
  39.             k = k + 1
  40.             If Fwords$(FI) = q1$ Then w(k) = "*****" Else w(k) = Fwords$(FI)
  41.         End If
  42.         FI = FI + 1
  43.     Wend
  44.     'tt = TIMER
  45.     jk = 1
  46.     k1(jk) = 1
  47.     q4(1, 1) = q1$
  48.     Do
  49.         For i = k To 1 Step -1
  50.             If w(i) = "*****" Then _Continue '  500
  51.             cnt = 0
  52.             For kk = 1 To k1(jk)
  53.                 cnt = 0
  54.                 For j = 1 To n
  55.                     If Mid$(w(i), j, 1) = Mid$(q4(kk, jk), j, 1) Then cnt = cnt + 1 Else zz = j
  56.                 Next j
  57.                 If cnt = n - 1 Then
  58.                     k1(jk + 1) = k1(jk + 1) + 1
  59.                     q4(k1(jk + 1), jk + 1) = w(i)
  60.                     z4(k1(jk + 1), jk + 1) = z4(kk, jk) + Mid$(w(i), zz, 1) + Chr$(zz + 48) + " "
  61.                     w(i) = "*****"
  62.                 End If
  63.             Next kk
  64.         Next i
  65.         kflag = 0
  66.         For i = 1 To k1(jk + 1)
  67.             If q4(i, jk + 1) = q2$ Then kflag = 99: final$ = z4(i, jk + 1)
  68.         Next i
  69.         'Print
  70.         'Print jk; k1(jk + 1)
  71.         If k1(jk + 1) = 0 Then kflag = 99
  72.         jk = jk + 1
  73.         If kflag = 99 Then Exit Do
  74.     Loop
  75.     If k1(jk) = 0 Then Print "No path found! from "; q1$; " to "; q2$; ' b+ removed a print blank line
  76.     If k1(jk) > 0 Then
  77.         xlen = Len(final$)
  78.         'Print:
  79.         Print q1$; " ";
  80.         For i = 1 To xlen Step 3
  81.             c1$ = Mid$(final$, i, 1)
  82.             c2$ = Mid$(final$, i + 1, 1)
  83.             Mid$(q1$, Val(c2$), 1) = c1$
  84.             Print q1$; " ";
  85.         Next i
  86.     End If
  87.     Print: Print "time taken = "; Timer(.001) - tt; " seconds"
  88.  
  89. Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
  90.     Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
  91.     curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
  92.     dpos = InStr(curpos, SplitMeString, delim)
  93.     Do Until dpos = 0
  94.         loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
  95.         arrpos = arrpos + 1
  96.         If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
  97.         curpos = dpos + LD
  98.         dpos = InStr(curpos, SplitMeString, delim)
  99.     Loop
  100.     loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
  101.     ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
  102.  
  103.  

 
Word ladder v4 david_uwi b+ mod 4.PNG



Offline johnno56

  • Forum Resident
  • Posts: 1270
  • Live long and prosper.
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #34 on: September 11, 2021, 09:12:13 pm »
Cool... 9.19  Sooo much faster... Nicely done!
Logic is the beginning of wisdom.

Offline david_uwi

  • Newbie
  • Posts: 71
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #35 on: September 12, 2021, 03:31:18 am »
Yes it needed tidying up thanks for doing that. I am not familiar with many of the new QB64 commands.
I particularly like the _continue (like I'm using FORTRAN again).
The kflag=99 is also a bit of a mess, but apart from that I think it is a wrap!


Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #36 on: September 12, 2021, 12:27:01 pm »
Another 2.5 sec shaved off with just obvious time saving stuff:
- access disk file once for whole set and share the array and it's ubound
- default to Long type instead of Single
- removed _continue, though I'm glad david-uwi became aware of option and it doesn't save time, it just looks cleaner code wise. It was david's idea anyway.

Code: QB64: [Select]
  1. _Title "Word ladder v5 b+ mod david_uwi" '2021-09-10 david has already improved his first version!
  2. ' ref:  https://www.qb64.org/forum/index.php?topic=4157.msg135293#msg135293
  3. ' get original version above, I am modifying below to study
  4. ' 2021-09-11 in v3 b+ mod david_uwi I modify to compare whole set to my orignal whole set
  5. ' 2021-09-11 in v4 I think I can load the words faster
  6. ' 2021-09-11 in v5 lets just call the external file once, use david's idea instead of _continue,
  7. ' I think it looks cleaner without _continue which tool out skipping a For loop by a THEN (GOTO) line #.
  8. ' Ave of 5 runs before this (mod 4) was 11.27.. see if we improve still more, also DefLng A-Z  oh wow.
  9. ' Ave of 5 runs now 8.74 another 2.5 secs shaved off
  10.  
  11. DefLng A-Z ' 2 secs off from old Single default!
  12. ReDim Shared Fwords$(1 To 1), UBF ' ubound of Fwords$
  13. start! = Timer(.001)
  14.  
  15. ' do just once for all ladder calls
  16. Open "unixdict.txt" For Binary As #1 ' take the file in a gulp
  17. buf$ = Space$(LOF(1))
  18. Get #1, , buf$
  19. Split buf$, Chr$(10), Fwords$()
  20. UBF = UBound(fwords$)
  21.  
  22. ' test set of ladder calls
  23. ladder "boy", "man" '     quick
  24. ladder "girl", "lady" '   this takes awhile
  25. ladder "john", "jane" '   quick enough
  26. ladder "alien", "drool" ' cool but takes a long long time!
  27. ladder "child", "adult" ' and this takes awhile
  28. ladder "play", "ball" '   goes quick
  29. ladder "fun", "job" '     ditto
  30. Print: Print "Total time including one disk file access:"; Timer(.001) - start!
  31.  
  32. Sub ladder (q1$, q2$)
  33.     tt! = Timer(.001) 'include time taken to load to RAM bplus mod to accuracy (.001)
  34.     Dim w(10000) As String * 5 'make sure we have enough storage!!
  35.     Dim q4(10000, 100) As String * 5 ' fixed string! storing connect words
  36.     Dim k1(100)
  37.     Dim z4(10000, 100) As String
  38.     FI = 1
  39.     n = Len(q1$)
  40.     If n < 5 Then q1$ = q1$ + Space$(5 - n): q2$ = q2$ + Space$(5 - n)
  41.     While FI <= UBF
  42.         If Len(Fwords$(FI)) = n Then
  43.             k = k + 1
  44.             If Fwords$(FI) = q1$ Then w(k) = "*****" Else w(k) = Fwords$(FI)
  45.         End If
  46.         FI = FI + 1
  47.     Wend
  48.     jk = 1
  49.     k1(jk) = 1
  50.     q4(1, 1) = q1$
  51.     Do
  52.         For i = k To 1 Step -1
  53.             If w(i) <> "*****" Then '_Continue '  500  just check before entering loop
  54.                 cnt = 0
  55.                 For kk = 1 To k1(jk)
  56.                     cnt = 0
  57.                     For j = 1 To n
  58.                         If Mid$(w(i), j, 1) = Mid$(q4(kk, jk), j, 1) Then cnt = cnt + 1 Else zz = j
  59.                     Next j
  60.                     If cnt = n - 1 Then
  61.                         k1(jk + 1) = k1(jk + 1) + 1
  62.                         q4(k1(jk + 1), jk + 1) = w(i)
  63.                         z4(k1(jk + 1), jk + 1) = z4(kk, jk) + Mid$(w(i), zz, 1) + Chr$(zz + 48) + " "
  64.                         w(i) = "*****"
  65.                     End If
  66.                 Next kk
  67.             End If
  68.         Next i
  69.         kflag = 0
  70.         For i = 1 To k1(jk + 1)
  71.             If q4(i, jk + 1) = q2$ Then kflag = 99: final$ = z4(i, jk + 1)
  72.         Next i
  73.         If k1(jk + 1) = 0 Then kflag = 99
  74.         jk = jk + 1
  75.         If kflag = 99 Then Exit Do
  76.     Loop
  77.     If k1(jk) = 0 Then Print "No path found! from "; q1$; " to "; q2$; ' b+ removed a print blank line
  78.     If k1(jk) > 0 Then
  79.         xlen = Len(final$)
  80.         'Print:
  81.         Print q1$; " ";
  82.         For i = 1 To xlen Step 3
  83.             c1$ = Mid$(final$, i, 1)
  84.             c2$ = Mid$(final$, i + 1, 1)
  85.             Mid$(q1$, Val(c2$), 1) = c1$
  86.             Print q1$; " ";
  87.         Next i
  88.     End If
  89.     Print: Print "time taken = "; Timer(.001) - tt!; " seconds"
  90.  
  91. Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
  92.     Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
  93.     curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
  94.     dpos = InStr(curpos, SplitMeString, delim)
  95.     Do Until dpos = 0
  96.         loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
  97.         arrpos = arrpos + 1
  98.         If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
  99.         curpos = dpos + LD
  100.         dpos = InStr(curpos, SplitMeString, delim)
  101.     Loop
  102.     loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
  103.     ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
  104.  
  105.  

I am now averaging 8.74 secs, fluky low 8.08 and highest high 9.1 removed from ave calc, on my aging Windows 10-64 laptop.

« Last Edit: September 12, 2021, 12:30:06 pm by bplus »

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • View Profile
    • Steve’s QB64 Archive Forum
Re: Word ladder - Rosetta Code
« Reply #37 on: September 12, 2021, 01:05:42 pm »
If you’re looking for speed, wouldn’t the quickest way be to built and compare an index and not the actual strings?

For example, I have 10 words: aaa, bbb, ccc, dog, dot, fog, hot, hog, xxx, zzz

I’d build a base index and then a connection index:

1) aaa:
2) bbb:
3) ccc:
4) dog: 5, 6, 8
5) dot: 4, 7
6) fog: 4, 8
7) hot: 5, 8
8) hog: 4, 6, 7

The only words in your list that “dog” can transform into a single letter at a time is word 5, 6, or 8…

If I want to transform word 4 (dog) into word 7 (hot), I’d just run a recursive loop checking only to see if my indexes would ever match.

START: 4
PASS 1, CHECK 1: 5 (no match to 7)
     RECURSIVE PASS 2, CHECK 1 (from 5 to…) 4 (no match to 7)
            RECURSIVE PASS 3, CHECK 1 (from 4 to…) NO CHECK, ALREADY COVERED
     RECURSIVE PASS 2, CHECK 2 (from 5 to…) 7 (match!  abort checking!!)

4 to 5 to 7

OR….

dog to dot to hot

Like this, you’d be building a recursive connection tree using the INTEGER index values, rather than any sort of string comparison.  As for run times, I don’t see why you’d ever have more than a fraction of a second involved in your search routines.

You’re not checking to see if words match via character by character…. You’re just checking to see if one integer value can be chained in such a matter to reach another.

One things for certain: Index linking would definitely remove any differences in search times due to word length.  5 letter words would chain just as quickly via index as 3 letter ones…

Note: Prebuild your initial connection index once and be done with it.  Don’t rebuild it with each and every search run. 
« Last Edit: September 12, 2021, 01:09:52 pm by SMcNeill »
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #38 on: September 12, 2021, 02:17:05 pm »
I like integer indexes for words but you can't skip comparing letters between 2 words to see if they are one change different though using asc instead of mid$ might go faster.

Also for shortest path the letter change has to be a different position between level n and level n+1 in path, so you need the 2 words not indexes to see what letter position changes. Level n+ 2 can have the same change position as level n but can't be same word as n.

(So at least store that also when building list of one offs once.)
« Last Edit: September 12, 2021, 02:22:52 pm by bplus »

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #39 on: September 12, 2021, 02:45:06 pm »
Quote
though using asc instead of mid$ might go faster.

Oh it does go faster, shaved off another 4.29 secs to 4.45 ave run on my system:
Code: QB64: [Select]
  1. _Title "Word ladder v6 b+ mod david_uwi" '2021-09-10 david has already improved his first version!
  2. ' ref:  https://www.qb64.org/forum/index.php?topic=4157.msg135293#msg135293
  3. ' get original version above, I am modifying below to study
  4. ' 2021-09-11 in v3 b+ mod david_uwi I modify to compare whole set to my orignal whole set
  5. ' 2021-09-11 in v4 I think I can load the words faster
  6. ' 2021-09-12 in v5 lets just call the external file once, use david's idea instead of _continue,
  7. ' I think it looks cleaner without _continue which tool out skipping a For loop by a THEN (GOTO) line #.
  8. ' Ave of 5 runs before this (mod 4) was 11.27.. see if we improve still more, also DefLng A-Z  oh wow.
  9. ' Ave of 5 runs now 8.74 another 2.5 secs shaved off
  10. ' 2021-09-12 in v6 though using asc instead of mid$ might go faster.
  11. ' Ave of 5 runs now is 4.45 secs another 4.29 sec off!
  12.  
  13. DefLng A-Z ' 2 secs off from old Single default!
  14. ReDim Shared Fwords$(1 To 1), UBF ' ubound of Fwords$
  15. start! = Timer(.001)
  16.  
  17. ' do just once for all ladder calls
  18. Open "unixdict.txt" For Binary As #1 ' take the file in a gulp
  19. buf$ = Space$(LOF(1))
  20. Get #1, , buf$
  21. Split buf$, Chr$(10), Fwords$()
  22. UBF = UBound(fwords$)
  23.  
  24. ' test set of ladder calls
  25. ladder "boy", "man" '     quick
  26. ladder "girl", "lady" '   this takes awhile
  27. ladder "john", "jane" '   quick enough
  28. ladder "alien", "drool" ' cool but takes a long long time!
  29. ladder "child", "adult" ' and this takes awhile
  30. ladder "play", "ball" '   goes quick
  31. ladder "fun", "job" '     ditto
  32. Print: Print "Total time including one disk file access:"; Timer(.001) - start!
  33.  
  34. Sub ladder (q1$, q2$)
  35.     tt! = Timer(.001) 'include time taken to load to RAM bplus mod to accuracy (.001)
  36.     Dim w(10000) As String * 5 'make sure we have enough storage!!
  37.     Dim q4(10000, 100) As String * 5 ' fixed string! storing connect words
  38.     Dim k1(100)
  39.     Dim z4(10000, 100) As String
  40.     FI = 1
  41.     n = Len(q1$)
  42.     If n < 5 Then q1$ = q1$ + Space$(5 - n): q2$ = q2$ + Space$(5 - n)
  43.     While FI <= UBF
  44.         If Len(Fwords$(FI)) = n Then
  45.             k = k + 1
  46.             If Fwords$(FI) = q1$ Then w(k) = "*****" Else w(k) = Fwords$(FI)
  47.         End If
  48.         FI = FI + 1
  49.     Wend
  50.     jk = 1
  51.     k1(jk) = 1
  52.     q4(1, 1) = q1$
  53.     Do
  54.         For i = k To 1 Step -1
  55.             If w(i) <> "*****" Then '_Continue '  500  just check before entering loop
  56.                 cnt = 0
  57.                 For kk = 1 To k1(jk)
  58.                     cnt = 0
  59.                     For j = 1 To n
  60.                         If Asc(w(i), j) = Asc(q4(kk, jk), j) Then cnt = cnt + 1 Else zz = j
  61.                         'If Mid$(w(i), j, 1) = Mid$(q4(kk, jk), j, 1) Then cnt = cnt + 1 Else zz = j
  62.                     Next j
  63.                     If cnt = n - 1 Then
  64.                         k1(jk + 1) = k1(jk + 1) + 1
  65.                         q4(k1(jk + 1), jk + 1) = w(i)
  66.                         z4(k1(jk + 1), jk + 1) = z4(kk, jk) + Mid$(w(i), zz, 1) + Chr$(zz + 48) + " "
  67.                         w(i) = "*****"
  68.                     End If
  69.                 Next kk
  70.             End If
  71.         Next i
  72.         kflag = 0
  73.         For i = 1 To k1(jk + 1)
  74.             If q4(i, jk + 1) = q2$ Then kflag = 99: final$ = z4(i, jk + 1)
  75.         Next i
  76.         If k1(jk + 1) = 0 Then kflag = 99
  77.         jk = jk + 1
  78.         If kflag = 99 Then Exit Do
  79.     Loop
  80.     If k1(jk) = 0 Then Print "No path found! from "; q1$; " to "; q2$; ' b+ removed a print blank line
  81.     If k1(jk) > 0 Then
  82.         xlen = Len(final$)
  83.         'Print:
  84.         Print q1$; " ";
  85.         For i = 1 To xlen Step 3
  86.             c1$ = Mid$(final$, i, 1)
  87.             c2$ = Mid$(final$, i + 1, 1)
  88.             Mid$(q1$, Val(c2$), 1) = c1$
  89.             Print q1$; " ";
  90.         Next i
  91.     End If
  92.     Print: Print "time taken = "; Timer(.001) - tt!; " seconds"
  93.  
  94. Sub Split (SplitMeString As String, delim As String, loadMeArray() As String)
  95.     Dim curpos As Long, arrpos As Long, LD As Long, dpos As Long 'fix use the Lbound the array already has
  96.     curpos = 1: arrpos = LBound(loadMeArray): LD = Len(delim)
  97.     dpos = InStr(curpos, SplitMeString, delim)
  98.     Do Until dpos = 0
  99.         loadMeArray(arrpos) = Mid$(SplitMeString, curpos, dpos - curpos)
  100.         arrpos = arrpos + 1
  101.         If arrpos > UBound(loadMeArray) Then ReDim _Preserve loadMeArray(LBound(loadMeArray) To UBound(loadMeArray) + 1000) As String
  102.         curpos = dpos + LD
  103.         dpos = InStr(curpos, SplitMeString, delim)
  104.     Loop
  105.     loadMeArray(arrpos) = Mid$(SplitMeString, curpos)
  106.     ReDim _Preserve loadMeArray(LBound(loadMeArray) To arrpos) As String 'get the ubound correct
  107.  
  108.  

I'm sorry I missed it doing the other obvious time shaving things.

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #40 on: September 12, 2021, 03:18:44 pm »
BTW here is some intel on the unixdict.txt file:

As the number of letters increase the number of words increase up to 7 that I tested but makes sense as each letter allows for 26 times more permutations. 3 letter words < 800 and 7 letter < 4100.

The most one off's any one word had was 30, if I recall it was for a 3 letter word that was tan or rhymes with it.

I did preprocess all the words down to n letter word files with the word followed by all the one offs on the same line and then put a count of the one offs for the word at the start of the line (in one version of preprocessing).

The preprocessing took so much more time than what david_uwi got for alien to drool time that I gave up on the preprocessing idea Thurs or Friday.

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • View Profile
    • Steve’s QB64 Archive Forum
Re: Word ladder - Rosetta Code
« Reply #41 on: September 12, 2021, 03:31:29 pm »
I like integer indexes for words but you can't skip comparing letters between 2 words to see….

That’s why I mentioned building an index once.  Currently, you’re checking for your word connections repeatedly one letter at a time.  (Unless I’m reading the code wrong.)

                    FOR j = 1 TO n
                        IF ASC(w(i), j) = ASC(q4(kk, jk), j) THEN cnt = cnt + 1 ELSE zz = j
                    NEXT j

Build that connection list once, and then be done with it.

Think of it as solving this mini-problem first: How many words are only one letter different from each other?

bog - cog - dog - fog - gog - hog - jog - log - nog - tog - and so on.

Make that list first, and then you only need to recursively check to see if any of those words connect to the word you’re looking for (or if the words they’re connected to are connected to…).

I figured you guys have worked hard on this with your method, so I’d just suggest an indexing method and let you try it first, before I did.  I really don’t want to just barge in and but into your work.  ;D

If you’re not interested in testing it out however, or if you just want me to see what I can come up with on my own, let me know, and I might try out a few different methods for speed comparisons. 
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • View Profile
    • Steve’s QB64 Archive Forum
Re: Word ladder - Rosetta Code
« Reply #42 on: September 12, 2021, 03:39:08 pm »
The preprocessing took so much more time than what david_uwi got for alien to drool time that I gave up on the preprocessing idea Thurs or Friday.

For preprocessing, do a letter count comparison, like we did in the word boggle routines or the scramble routines before.  Those did word comparisons blazingly fast.

If you guys don’t mind me butting in, I might give this a go in a few days when I get free time.  I can’t swear I can do faster, but it’d be interesting to try.  ;)
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline bplus

  • Global Moderator
  • Forum Resident
  • Posts: 8053
  • b = b + ...
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #43 on: September 12, 2021, 04:44:50 pm »
Have at it Steve always like your input on things usually very helpful.

I've still been meaning to give david_uwi's variable more meaningful names so I can follow his code all the way through. I am thinking the q4 or the z4 arrays, one of them, is saving the one offs list so it isn't done over and over but I can't be sure.

For sure, he is checking way fewer things each round than I in my versions.
« Last Edit: September 12, 2021, 04:46:48 pm by bplus »

Offline johnno56

  • Forum Resident
  • Posts: 1270
  • Live long and prosper.
    • View Profile
Re: Word ladder - Rosetta Code
« Reply #44 on: September 12, 2021, 09:29:21 pm »
New time of 3.88 seconds... Definitely MUCH faster. Well done!
Logic is the beginning of wisdom.