And a timed comparison of the three routines:
a(0) = ""
d(0) = " "
a(1) = " test test    test " 'good no error!
d(1) = " "
a(2) = " test"
d(2) = " "
a(3) = "3d,z6d,z1 10 #d,z5"
d(3) = ",z"
a(4) = "Monday, , Wednesday, THursday, Friday, , Sunday"
d(4) = ", "
FOR j = 1 TO Limit 'repeat the process multiple times so we can time it.
    SteveSplit a(i), d(i), results1()
FOR j = 1 TO Limit 'repeat the process multiple times so we can time it.
    Split a(i), d(i), results2()
FOR j = 1 TO Limit 'repeat the process multiple times so we can time it.
    Lsplit a(i), d(i), results3()
PRINT USING "###.#### ###.#### ###.####"; t1# - t#, t2# - t1#, t3# - t2#
'notes: REDIM the array(0) to be loaded before calling Split '<<<<<<<<<<<<<<<<<<<<<<< IMPORTANT!!!!
' bplus modifications of Galleon fix of Bulrush Split reply #13
' http://xmaxw.[abandoned, outdated and now likely malicious qb64 dot net website - don’t go there]/forum/index.php?topic=1612.0
' this sub further developed and tested here: \test\Strings\Split test.bas
copy = mystr 'make copy since we are messing with mystr when the delimiter is a space
'special case if delim is space, probably want to remove all excess space
copy = MID$(copy, 1, p - 1) + MID$(copy, p + 1)
curpos = 1
arrpos = 0
dpos = INSTR(curpos, copy, delim)
arr(arrpos) = MID$(copy, curpos, dpos - curpos)
arrpos = arrpos + 1
curpos = dpos + LEN(delim)
dpos = INSTR(curpos, copy, delim)
arr(arrpos) = MID$(copy, curpos)
' Luke 2019-02-15
'Split in$ into pieces, chopping at every occurrence of delimiter$. Multiple consecutive occurrences
'of delimiter$ are treated as a single instance. The chopped pieces are stored in result$().
'
'delimiter$ must be one character long.
'result$() must have been REDIMmed previously.
SUB Lsplit (in$, delimiter$, result$())
start = 1
start = start + 1
finish = INSTR(start, in$, delimiter$)
result$(UBOUND(result$)) = MID$(in$, start, finish - start)
start = finish + 1
'Combine all elements of in$() into a single string with delimiter$ separating the elements.
result$ = result$ + delimiter$ + in$(i)
join$ = result$
SUB SteveSplit (text$, delimiter$, storage_array() AS STRING)
count = count + 1
i = INSTR(text$, delimiter$)
storage_array(count) = LEFT$(text$, i - 1)
SteveSplit MID$(text$, i + LEN(delimiter$)), delimiter$, storage_array()
storage_array(count) = text$
count = 0
On my machine, the test speeds are as follows:

         SteveSplit, Split (bplus), LSplit (Luke)
TEST #0: 0.1650, 0.4395, 0.3296
TEST #1: 2.3076, 2.3076, 1.5386
TEST #2: 0.4399, 0.5493, 0.5493
TEST #3: 1.0439, 1.2637, 1.5381 (False results)
TEST #4: 1.9229, 2.1431, 2.6924 (False results)

In tests 0, 2, 3, 4, SteveSplit ran fastest.
In test 1, LSplit was the fastest routine.
The speed difference in Test #1 comes down to a general philosophy of *how* splitting should behave. My routine treats every space as a delimiter, so a run of extra spaces generates several null strings. In the other routines, the extra spaces are simply ignored and removed.
Luke's routine is written to accept only a single character as the delimiter, so for tests 3 and 4, which use a 2-character delimiter, it produces false results.
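To see how a one-character assumption goes wrong with a two-character delimiter, here is a minimal Python sketch. It is not Luke's code, and `split_one_char_step` is a hypothetical name. It assumes only what the `start = finish + 1` step in the listing suggests: after each match, the scan moves forward by one character instead of by the length of the delimiter. It does not model the merging of consecutive delimiters.

```python
def split_one_char_step(text, delimiter):
    """Split text, but step past each match by ONE character only.

    Hypothetical illustration: this mimics a splitter written for
    single-character delimiters, not any routine from this thread.
    """
    pieces = []
    start = 0
    while True:
        finish = text.find(delimiter, start)
        if finish == -1:
            pieces.append(text[start:])   # no more delimiters: keep the tail
            return pieces
        pieces.append(text[start:finish])
        start = finish + 1                # correct only when len(delimiter) == 1


sample = "3d,z6d,z1 10 #d,z5"             # test string a(3) from the post
broken = split_one_char_step(sample, ",z")
correct = sample.split(",z")              # the built-in steps by len(delimiter)

print(broken)    # ['3d', 'z6d', 'z1 10 #d', 'z5']
print(correct)   # ['3d', '6d', '1 10 #d', '5']
```

Under that assumption, the second character of the delimiter leaks into the front of every piece after the first. With a one-character delimiter the sketch and the built-in agree.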
******************
As to WHY my routine produces those extra null strings, it's simply to give a uniform answer to the question, "How would we behave if we used a FIND AND REPLACE to change all the spaces to commas?"
Instead of a(1) = " test test    test ", what results would we expect if a(1) = ",test,test,,,,test," and we used a comma as a delimiter instead of a space? Would we not then count whatever sits between each pair of commas, even if it is nothing at all, as a null string?
I'd think so, and if two commas side-by-side designate a null-string result between them, then I feel like two spaces side-by-side should do the same for space-delimited data. This concept also allows me to perfectly reproduce the original data when using a join function, without automatically losing those spaces, which may (or may not) have been important for formatting or data purposes.
If I don't want the extra spaces, all I need to do is ignore those null-strings.
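The same idea can be checked quickly in Python, whose built-in `str.split` happens to offer both behaviours. This is only an illustration of the principle, not a port of any routine above, and it assumes a(1) has four spaces before the last "test" (to match the comma version). Splitting on an explicit delimiter keeps the null strings, so joining the pieces gives back the original text exactly, and dropping the null strings afterwards is a one-line filter.

```python
text = " test test    test "        # a(1), assuming four spaces before the last "test"

pieces = text.split(" ")            # explicit delimiter: null strings are kept
rebuilt = " ".join(pieces)          # joining restores the original text exactly

no_nulls = [p for p in pieces if p != ""]   # ignore the null strings when unwanted
collapsed = text.split()            # no argument: runs of whitespace are collapsed

print(pieces)            # ['', 'test', 'test', '', '', '', 'test', '']
print(rebuilt == text)   # True
print(no_nulls)          # ['test', 'test', 'test']
```

Going the other way is not possible: once the extra spaces have been collapsed, a join cannot tell how many there were.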