Topic: Double Byte Characters


Offline MLambert

Double Byte Characters
« on: March 06, 2021, 04:58:12 am »
Hi,
Some of my data has Serbian names containing ć which I need to convert to c.
The UTF-8 bytes in hex (a double-byte code) are C4 8B, and the decimal value is 267. I was using C48B$ = CHR$(267) and looking for C48B$ in the field,
but CHR$ only goes up to 255. How can I get around this?

The code I am using is: IF X$ = C48B$ THEN X$ = "c"

As long as the argument to CHR$ is below 256 everything is OK, but with a value above 255 I get a function error, so I cannot use CHR$ with values above 255.

Thanks,

Mike

Offline luke

Re: Double Byte Characters
« Reply #1 on: March 06, 2021, 06:02:27 am »
Quote from MLambert: "Some of my data has Serbian names containing ć which I need to convert to c."
This seems like a good way to offend a lot of Serbians.

If you're looking to print such text I recommend using UPrintString from RhoSigma's utility: https://www.qb64.org/forum/index.php?topic=2248.msg130123

One of UTF-8's nice properties is that you can just search for the byte sequence CHR$(&HC4) + CHR$(&H8B) in your full string. It's then up to you to use LEFT$/RIGHT$/MID$ as appropriate. You could also use the (potentially) more aesthetically pleasing MKI$(&H8BC4), but note the endian swap.
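
A minimal sketch of that search-and-replace, under a few assumptions: ReplaceAll$ is a hypothetical helper written for this example (not a QB64 built-in), surname$ is made-up sample data, and the byte pair is kept in a variable so it is easy to change. That last point matters because the pair quoted earlier in the thread, C4 8B, is the UTF-8 encoding of U+010B, while ć (U+0107) encodes as C4 87, so it is worth checking a hex dump of the real data first.

```basic
' Byte pair to look for. C4 8B is the pair quoted in the thread;
' use CHR$(&HC4) + CHR$(&H87) if the data really contains U+0107.
target$ = CHR$(&HC4) + CHR$(&H8B)

' Assumed sample value, built from raw bytes so the encoding of
' the source file does not matter.
surname$ = "Mili" + target$

PRINT ReplaceAll$(surname$, target$, "c") ' prints Milic
END

' Hypothetical helper: replace every occurrence of find$ in text$ with repl$.
FUNCTION ReplaceAll$ (text$, find$, repl$)
    DIM p AS LONG
    result$ = text$
    ' An empty search string would match everywhere, so return early.
    IF LEN(find$) = 0 THEN
        ReplaceAll$ = result$
        EXIT FUNCTION
    END IF
    p = INSTR(result$, find$)
    DO WHILE p > 0
        ' Keep the text before the match, insert repl$, keep the text after it.
        result$ = LEFT$(result$, p - 1) + repl$ + MID$(result$, p + LEN(find$))
        ' Resume the search just past the inserted replacement.
        p = INSTR(p + LEN(repl$), result$, find$)
    LOOP
    ReplaceAll$ = result$
END FUNCTION
```

Because the comparison is done on raw bytes, the same helper should work for any other multi-byte character: pass its UTF-8 byte sequence as find$.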

But I would like to know why you're replacing ć with c, because they're entirely different letters.
« Last Edit: March 06, 2021, 06:04:51 am by luke »

Offline MLambert

Re: Double Byte Characters
« Reply #2 on: March 07, 2021, 02:50:14 am »
Thanks, Luke.

How do I "search for the sequence CHR$(&HC4) + CHR$(&H8B) in your full string"?

Mike
« Last Edit: March 07, 2021, 02:58:44 am by MLambert »

Offline luke

Re: Double Byte Characters
« Reply #3 on: March 07, 2021, 06:18:04 am »