QB64.org Forum
Active Forums => QB64 Discussion => Topic started by: MLambert on March 06, 2021, 04:58:12 am
-
Hi,
Some of my data has Serbian names containing ć which I need to convert to c.
The UTF-8 encoding (a two-byte code) is C4 87 in hex, and the decimal code point is 263. I was using C487$ = Chr$(263) and looking for C487$ in the field,
but Chr$ only goes up to 255. How can I get around this?
The code I am using is: If X$ = C487$ Then X$ = "c"
As long as the argument to CHR$ is below 256 everything is OK, but when it is above 255 I get a function error, meaning I cannot pass CHR$ a value above 255.
Thanks,
Mike
-
"Some of my data has Serbian names containing ć which I need to convert to c."
This seems like a good way to offend a lot of Serbians.
If you're looking to print such text I recommend using UPrintString from RhoSigma's utility: https://www.qb64.org/forum/index.php?topic=2248.msg130123
Among UTF-8's nice properties is that you can just search for the byte sequence CHR$(&HC4) + CHR$(&H87) in your full string. It's then up to you to use LEFT$/RIGHT$/MID$ as appropriate. You could also use the (potentially) more aesthetically pleasing MKI$(&H87C4), but note the endian swap: MKI$ stores the low byte first.
But I would like to know why you're replacing ć with c, because they're entirely different letters.
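A small sketch of the byte-sequence idea, assuming ć is U+0107 (UTF-8 bytes C4 87). The variable name cAcute is only for illustration, and the MKI$ comparison is there so you can check the byte order on your own build:

```basic
' Assumption: the character is U+0107, encoded in UTF-8 as the bytes C4 87.
DIM cAcute AS STRING
cAcute = CHR$(&HC4) + CHR$(&H87) ' two one-byte characters, each below 256

' Show the length of the sequence and each byte in hex.
PRINT LEN(cAcute)
PRINT HEX$(ASC(cAcute, 1)); " "; HEX$(ASC(cAcute, 2))

' MKI$ writes the low byte first, so the hex digits appear swapped.
IF cAcute = MKI$(&H87C4) THEN PRINT "MKI$ gives the same two bytes"
```

CHR$ is never asked for a value above 255 here, because each byte of the UTF-8 sequence is handled on its own.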
-
Thanks, Luke.
How do I
"search for the sequence CHR$(&HC4) + CHR$(&H87) in your full string"?
Mike
-
http://www.qb64.org/wiki/INSTR
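A minimal sketch of how INSTR could be combined with LEFT$ and MID$ to do the replacement, assuming ć is stored as the UTF-8 bytes C4 87. ReplaceAll$ and the variable names are hypothetical helpers written for this example, not something built into QB64:

```basic
DIM cAcute AS STRING
DIM person AS STRING

cAcute = CHR$(&HC4) + CHR$(&H87) ' assumed UTF-8 bytes for the character
person = "Petrovi" + cAcute ' built from bytes so the source file encoding does not matter

PRINT ReplaceAll$(person, cAcute, "c")
END

' Hypothetical helper: replaces every occurrence of target in src with repl.
FUNCTION ReplaceAll$ (src AS STRING, target AS STRING, repl AS STRING)
    DIM result AS STRING
    DIM p AS LONG

    result = src
    IF LEN(target) = 0 THEN ' nothing to search for; avoid an endless loop
        ReplaceAll$ = result
        EXIT FUNCTION
    END IF

    p = INSTR(result, target)
    DO WHILE p > 0
        ' Keep everything before the match, add the replacement, then the rest.
        result = LEFT$(result, p - 1) + repl + MID$(result, p + LEN(target))
        ' Continue searching after the text that was just inserted.
        p = INSTR(p + LEN(repl), result, target)
    LOOP

    ReplaceAll$ = result
END FUNCTION
```

The same call works on a whole record rather than a single field, since INSTR only cares about the byte sequence and not where it sits in the string.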