Programs / Re: Print Unicode, UTF-8
« on: April 06, 2020, 08:55:59 am »
Just to join the club, I also have a QB64 function that converts the first character of a UTF-8 string into its Unicode code point number.
Not knowing about your efforts, I wrote this as an aid for (hopefully automated) translation among many languages.
Once again I did something needless :-)
It is written in such a way that after 8 years, even I can understand what I was doing today. And that is not always the case :-)
Well, it may be helpful to somebody unfamiliar with the Unicode standard.
Code: QB64:
'------------------------------
' INPUT: UTF-8 string
'------------------------------
' OUTPUT: ERROR UTF2UNICODE < 0
' ---------------------
' OK UTF2UNICODE >= 0 AND UTF2UNICODE <= &H10FFFF
' The recognised Unicode character is removed from the beginning of the argument. This can be turned off (see at the bottom).
'------------------------------
UTF2UNICODE& = -1 'Invalid argument
result = -2 'Unspecified error
'head-byte only (ASCII)
' 0xxxxxxx
'---------------------
chlen = 1
result = hb
'head-byte + data-byte
' 110xxxxx 10yyyyyy
'---------------------
chlen = 2
'result = (hb AND &B00011111) * &B01000000
result = (hb AND &H1F) * &H40 ' head-byte shifted left 6 places result | 0000 0000 0000 0000 0000 0xxx xx00 0000 |
'result = result OR (db AND &B00111111)
'head-byte + data-byte1 + data-byte2
' 1110xxxx 10yyyyyy 10zzzzzz
'-----------------------------------
chlen = 3
'result = (hb AND &B00001111) * &B 0001 0000 0000 0000
result = (hb AND &HF) * &H1000 ' head-byte shifted left 12 places result | 0000 0000 0000 0000 xxxx 0000 0000 0000 |
'head-byte + data-byte1 + data-byte2 + data-byte3
' 11110xxx 10yyyyyy 10zzzzzz 10wwwwww
'------------------------------------------------
chlen = 4
'result = (hb AND &B00000111) * &B 0000 0100 0000 0000 0000 0000
result = (hb AND &H7) * &H40000 ' head-byte shifted left 18 places result | 0000 0000 000x xx00 0000 0000 0000 0000 |
'Not a head-byte.
result = hb
UTF2UNICODE& = result
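For anyone who wants to cross-check the bit masks and shifts outside QB64, here is a minimal Python sketch of the same idea. It is not a port of my function: the name `utf2unicode`, the tuple return value and the exact places where -1 and -2 are returned are assumptions made for this illustration. It reads the head byte, picks the length from its top bits, then ORs in six bits from each data byte. It does not reject overlong encodings or surrogates.

```python
from typing import Tuple


def utf2unicode(data: bytes) -> Tuple[int, bytes]:
    """Decode the first UTF-8 character of `data` (illustrative sketch).

    Returns (code_point, remaining_bytes). A negative code_point is an
    error: -1 for an empty argument, -2 for a truncated or malformed
    sequence (both error placements are assumptions of this sketch).
    """
    if not data:
        return -1, data                      # invalid argument
    hb = data[0]                             # head byte
    if hb < 0x80:                            # 0xxxxxxx
        chlen, result = 1, hb
    elif hb & 0xE0 == 0xC0:                  # 110xxxxx 10yyyyyy
        chlen, result = 2, (hb & 0x1F) << 6
    elif hb & 0xF0 == 0xE0:                  # 1110xxxx 10yyyyyy 10zzzzzz
        chlen, result = 3, (hb & 0x0F) << 12
    elif hb & 0xF8 == 0xF0:                  # 11110xxx 10yyyyyy 10zzzzzz 10wwwwww
        chlen, result = 4, (hb & 0x07) << 18
    else:                                    # not a head byte: return it unchanged
        return hb, data[1:]
    if len(data) < chlen:
        return -2, data                      # sequence is cut short
    for i in range(1, chlen):
        db = data[i]                         # data byte, must look like 10xxxxxx
        if db & 0xC0 != 0x80:
            return -2, data
        # The last data byte lands in bits 0..5, the one before in bits 6..11, and so on.
        result |= (db & 0x3F) << (6 * (chlen - 1 - i))
    return result, data[chlen:]
```

Called as `code, rest = utf2unicode("é€".encode("utf-8"))`, it gives `code == 0xE9` and leaves the three bytes of "€" in `rest`, which matches the "remove the recognised character from the beginning" behaviour described in the header comment.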
I intend to add some more "error codes" (negative returns) at some later point. Right now I have to deal with another issue.
It is obvious that there is bad blood between the QB64 IDE and my OS (Linux Mint).
By