Versatile String Parsing Function by RhoSigma
Junior Librarian:
Versatile String Parsing Function
Author: @RhoSigma
Source: qb64.org Forum
URL: https://www.qb64.org/forum/index.php?topic=4142.0
Version: 2021-08-27
Author's Description:
I guess every developer sooner or later needs a parsing function like this, whether to split a simple line of text into its individual words, quickly read CSV data into an array, break up a path specification into its folder names, or get the individual options of a command line or a URL query string.
Obviously such a function must be able to recognize several separator characters (up to five here) and to suppress splitting inside quoted sections. A special feature of this function is that it can optionally use different characters for the opening and closing quotes, which makes it possible, for example, to read out sections in parentheses or brackets.
For usage, see the full description in the separate HTML document (included in the compressed file).
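As a rough illustration of the use cases mentioned above, here is a minimal usage sketch. It is based only on the function signature in the listing, not on the HTML documentation. The names words$, fields$, n& and i& are made up for this example. It assumes the output array is dynamic (created with REDIM), that an empty quoChars$ falls back to the double quote, and that the return value is the upper bound of the filled array (or -1 if nothing was found), which is how the listing reads.

--- Code: QB64: ---
'hypothetical usage sketch, not taken from the author's documentation;
'the ParseLine& function must be placed below this main code

'split a text line into words, separator is a single space
REDIM words$(0) 'dynamic array, will be resized by ParseLine&
n& = ParseLine&("The quick brown fox", " ", "", words$(), 0)
FOR i& = 0 TO n&
    PRINT words$(i&)
NEXT

'split a CSV line, separator is a comma; the default double quotes
'are assumed to keep the comma inside "pear, green" from splitting
REDIM fields$(0)
n& = ParseLine&("apple," + CHR$(34) + "pear, green" + CHR$(34) + ",kiwi", ",", "", fields$(), 0)
FOR i& = 0 TO n&
    PRINT fields$(i&)
NEXT
--- End code ---

Tracing the listing by hand, the first call should deliver four words and the second three fields, with the quoted field returned without its surrounding quotes. The FOR loops simply do not run if the function returns -1.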
Source Code:
--- Code: QB64: ---
'--- Full description available in separate HTML document.
'---------------------------------------------------------------------
FUNCTION ParseLine& (inpLine$, sepChars$, quoChars$, outArray$(), minUB&)
'--- option _explicit requirements ---
DIM ilen&, icnt&, slen%, s1%, s2%, s3%, s4%, s5%, q1%, q2%
DIM oalb&, oaub&, ocnt&, flag%, ch%, nest%, spos&, epos&
'--- so far return nothing ---
ParseLine& = -1
'--- init & check some runtime variables ---
ilen& = LEN(inpLine$): icnt& = 1
IF ilen& = 0 THEN EXIT FUNCTION
slen% = LEN(sepChars$)
IF slen% > 0 THEN s1% = ASC(sepChars$, 1)
IF slen% > 1 THEN s2% = ASC(sepChars$, 2)
IF slen% > 2 THEN s3% = ASC(sepChars$, 3)
IF slen% > 3 THEN s4% = ASC(sepChars$, 4)
IF slen% > 4 THEN s5% = ASC(sepChars$, 5)
IF slen% > 5 THEN slen% = 5 'max. 5 chars, ignore the rest
IF LEN(quoChars$) > 0 THEN q1% = ASC(quoChars$, 1): ELSE q1% = 34
IF LEN(quoChars$) > 1 THEN q2% = ASC(quoChars$, 2): ELSE q2% = q1%
oalb& = LBOUND(outArray$): oaub& = UBOUND(outArray$): ocnt& = oalb&
'--- skip preceding separators ---
plSkipSepas:
flag% = 0
WHILE icnt& <= ilen& AND NOT flag%
    ch% = ASC(inpLine$, icnt&)
    SELECT CASE slen%
        CASE 0: flag% = -1
        CASE 1: flag% = ch% <> s1%
        CASE 2: flag% = ch% <> s1% AND ch% <> s2%
        CASE 3: flag% = ch% <> s1% AND ch% <> s2% AND ch% <> s3%
        CASE 4: flag% = ch% <> s1% AND ch% <> s2% AND ch% <> s3% AND ch% <> s4%
        CASE 5: flag% = ch% <> s1% AND ch% <> s2% AND ch% <> s3% AND ch% <> s4% AND ch% <> s5%
    END SELECT
    icnt& = icnt& + 1
WEND
IF NOT flag% THEN 'nothing else? - then exit
    IF ocnt& > oalb& GOTO plEnd
    EXIT FUNCTION
END IF
'--- redim to clear array on 1st word/component ---
IF ocnt& = oalb& THEN REDIM outArray$(oalb& TO oaub&)
'--- expand array, if required ---
plNextWord:
IF ocnt& > oaub& THEN
    oaub& = oaub& + 10
    REDIM _PRESERVE outArray$(oalb& TO oaub&)
END IF
'--- get current word/component until next separator ---
flag% = 0: nest% = 0: spos& = icnt& - 1
WHILE icnt& <= ilen& AND NOT flag%
    IF ch% = q1% AND nest% = 0 THEN
        nest% = 1
    ELSEIF ch% = q1% AND nest% > 0 THEN
        nest% = nest% + 1
    ELSEIF ch% = q2% AND nest% > 0 THEN
        nest% = nest% - 1
    END IF
    ch% = ASC(inpLine$, icnt&)
    SELECT CASE slen%
        CASE 0: flag% = (nest% = 0 AND (ch% = q1%)) OR (nest% = 1 AND ch% = q2%)
        CASE 1: flag% = (nest% = 0 AND (ch% = s1% OR ch% = q1%)) OR (nest% = 1 AND ch% = q2%)
        CASE 2: flag% = (nest% = 0 AND (ch% = s1% OR ch% = s2% OR ch% = q1%)) OR (nest% = 1 AND ch% = q2%)
        CASE 3: flag% = (nest% = 0 AND (ch% = s1% OR ch% = s2% OR ch% = s3% OR ch% = q1%)) OR (nest% = 1 AND ch% = q2%)
        CASE 4: flag% = (nest% = 0 AND (ch% = s1% OR ch% = s2% OR ch% = s3% OR ch% = s4% OR ch% = q1%)) OR (nest% = 1 AND ch% = q2%)
        CASE 5: flag% = (nest% = 0 AND (ch% = s1% OR ch% = s2% OR ch% = s3% OR ch% = s4% OR ch% = s5% OR ch% = q1%)) OR (nest% = 1 AND ch% = q2%)
    END SELECT
    icnt& = icnt& + 1
WEND
epos& = icnt& - 1
IF ASC(inpLine$, spos&) = q1% THEN spos& = spos& + 1
outArray$(ocnt&) = MID$(inpLine$, spos&, epos& - spos&)
ocnt& = ocnt& + 1
'--- more words/components following? ---
IF flag% AND ch% = q1% AND nest% = 0 GOTO plNextWord
IF flag% GOTO plSkipSepas
IF (ch% <> q1%) AND (ch% <> q2% OR nest% = 0) THEN outArray$(ocnt& - 1) = outArray$(ocnt& - 1) + CHR$(ch%)
'--- final array size adjustment, then exit ---
plEnd:
IF ocnt& - 1 < minUB& THEN ocnt& = minUB& + 1
REDIM _PRESERVE outArray$(oalb& TO (ocnt& - 1))
ParseLine& = ocnt& - 1
END FUNCTION
--- End code ---
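The option of using different opening and closing quote characters might be exercised as in the following sketch. It is only an illustration derived from reading the listing, not an example from the author's documentation. The array name parts$ and the variables n& and i& are hypothetical. It assumes that the first character of quoChars$ is treated as the opening quote and the second as the closing quote.

--- Code: QB64: ---
'hypothetical usage sketch: "(" opens and ")" closes a quoted section,
'so the spaces inside the parentheses should not split the components;
'the ParseLine& function must be placed below this main code
REDIM parts$(0) 'dynamic array, will be resized by ParseLine&
n& = ParseLine&("(first item) second (third item)", " ", "()", parts$(), 0)
FOR i& = 0 TO n&
    PRINT "["; parts$(i&); "]"
NEXT
--- End code ---

Tracing the listing by hand, this should yield the three components first item, second and third item, with the parentheses stripped from the quoted ones.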
Attachments:
ParseLine.zip ParseLine.7z