Author Topic: BASE64 (Read 6354 times)

SMcNeill · « **Reply #15 on:** December 23, 2019, 11:46:57 pm »

Quote from: Petr on December 23, 2019, 05:43:13 pm

And you say - 7 bit encoding? Hmmmm :) Is there a standard, something that specifies which characters to use? Definitely, I'll try to do this!

If I was going for 7-bit encoding, I’d go with 45 + 7-bit value, giving you a range of CHR$(45) to CHR$(172), which all copy/paste and display nicely in the QB64 IDE.

The reason why I’d start at 45?

It’s the next symbol after the comma, and we’d want to avoid the low-value control codes, the quote, and the comma, so we won’t mess up our DATA statement. It also allows for ease of conversion back and forth, without the need for a SELECT CASE table like Base64 uses. (Just use CHR$(value + 45) and ASC(char$) - 45 to convert.)
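For anyone following along, here is a minimal sketch of that offset conversion. The names OFFSET, v, c$ and back are just illustrative and not part of any posted routine:

```basic
' Sketch: map a 7-bit value (0 to 127) to its character and back again.
' OFFSET is the 45 chosen above; all names here are illustrative.
CONST OFFSET = 45
FOR v = 0 TO 127 STEP 127            ' just the two ends of the range: 0 and 127
    c$ = CHR$(v + OFFSET)            ' encode: add the offset
    back = ASC(c$) - OFFSET          ' decode: subtract the same offset
    PRINT v; "->"; ASC(c$); "->"; back
NEXT
```

The two ends of the range land on character codes 45 and 172.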

I don’t think there’s any standard base-128 encoding which displays nicely in all programs (most would just use 7-bit ASCII, and the first 32 values of it are control codes), so you get to set whatever suits you the best for personal use. ;)

SMcNeill · « **Reply #16 on:** December 25, 2019, 06:27:50 am »

And a Base128 version, just in time for Christmas for you Petr, if you're interested in it:

Code: QB64:

text$ = "ABCDEF"
PRINT "ORIGINAL: "; text$
a$ = B256to128(text$)
PRINT "BASE128 : "; a$
b$ = B128to256(a$)
PRINT "BASE256 : "; b$
 
 
 
FUNCTION B256to128$ (text$)
    l = 8 * LEN(text$)
    DIM A(1 TO l) AS _UNSIGNED _BIT, b AS _UNSIGNED _BYTE
    'convert the text to the 8 bit array
    FOR i = 1 TO LEN(text$)
        b = ASC(text$, i)
        p = (i - 1) * 8 + 1
        FOR j = 0 TO 7
            IF b AND (2 ^ j) THEN A(p + j) = 1 ELSE A(p + j) = 0
        NEXT
    NEXT
    'convert the array to 7bit strings
    FOR i = 1 TO l STEP 7
        b = 0
        FOR j = 6 TO 0 STEP -1
            IF i + j <= l THEN
                IF A(i + j) THEN b = b + (2 ^ j)
            END IF
        NEXT
        b = b + 45
        t$ = t$ + CHR$(b)
    NEXT
    B256to128 = t$
END FUNCTION
 
FUNCTION B128to256$ (text$)
    l = 7 * LEN(text$)
    DIM A(1 TO l) AS _UNSIGNED _BIT, b AS _UNSIGNED _BYTE
    'unpack each character (minus the 45 offset) into 7 bits of the array
    FOR i = 1 TO LEN(text$)
        b = ASC(text$, i) - 45
        FOR j = 0 TO 6
            p = p + 1: IF p > l THEN EXIT FOR
            IF b AND (2 ^ j) THEN A(p) = 1 ELSE A(p) = 0
        NEXT
    NEXT
    'convert the array to 8bit strings
    p = 0
    FOR i = 1 TO l STEP 8
        b = 0
        FOR j = 0 TO 7
            p = p + 1
            IF p > l THEN EXIT FOR
            IF A(p) THEN b = b + (2 ^ j)
        NEXT
        t$ = t$ + CHR$(b)
    NEXT
    B128to256 = t$
END FUNCTION

I, myself, might start using this for several things. The overhead is about as small as we can possibly get it while still keeping formatting that works in a DATA statement without causing the IDE any problems. Hex would double the length of the string, from 6 characters to 12, whereas this only grows it to about 8/7 of its original size, from 6 characters to 7. :)
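To put rough numbers on that, here is a small sketch comparing the two lengths for 1 to 7 input bytes. It assumes two hex digits per byte and one output character per 7 bits, rounded up, with no padding stripped. The variable names are only illustrative.

```basic
' Sketch: encoded length for n input bytes, hex versus base 128.
PRINT "bytes", "hex", "base128"
FOR n = 1 TO 7
    hexLen = 2 * n                   ' two hex digits per byte
    b128Len = (8 * n + 6) \ 7        ' 8n bits, 7 per character, rounded up
    PRINT n, hexLen, b128Len
NEXT
```

At 7 input bytes the 56 bits fit exactly into 8 characters, which is where the 8/7 ratio comes from.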

(There are probably much faster ways to do this with the new bit shifting routines, but they're not in the official language yet, so I stuck with something simple to work with -- just convert the text to a bit array, and then read the bits as needed/wanted from that array to make our base 128 or base 256 characters.)
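As one possible take on the faster route, here is a sketch that skips the bit array and keeps a small running accumulator, using plain integer math instead of the new bit shifting routines. B256to128Acc$ is a made-up name. It assumes the same lowest-bit-first packing and the same +45 offset as the code above, so for plain ASCII text it should give the same characters.

```basic
' Sketch only: accumulator-based encoder, assuming lowest-bit-first packing.
FUNCTION B256to128Acc$ (text$)
    DIM acc AS LONG, nbits AS LONG, i AS LONG
    FOR i = 1 TO LEN(text$)
        'place the 8 new bits above whatever bits are still pending
        acc = acc + ASC(text$, i) * (2 ^ nbits)
        nbits = nbits + 8
        DO WHILE nbits >= 7
            t$ = t$ + CHR$((acc AND 127) + 45)   'emit the low 7 bits
            acc = acc \ 128                      'drop them from the accumulator
            nbits = nbits - 7
        LOOP
    NEXT
    'leftover bits (fewer than 7) go out as one last zero-padded character
    IF nbits > 0 THEN t$ = t$ + CHR$(acc + 45)
    B256to128Acc$ = t$
END FUNCTION
```

The accumulator never holds more than 14 bits at once, so a LONG is more than enough room.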

Petr · « **Reply #17 on:** December 25, 2019, 09:48:13 am »

Thank you Steve, I'll look into it later. I make gifts for children, woman, myself .... I need 6 hands, three legs and 4 heads ... Just a quick look and a try: the expected output for a 1-byte input that fits in 7 bits or less is 7 bits, not 2 bytes. Specifically, try entering "!" [00100001], which is 6 bits long, but your function returns 2 bytes. That doesn't seem to be okay.

For 7-bit encoding, 7 bytes of input (8 * 7 = 56 bits) are used,
which are written as 8 seven-bit numbers. If the input is shorter, the missing bits are filled with zeroes on the left.

SMcNeill · « **Reply #18 on:** December 25, 2019, 10:37:12 am »

Quote from: Petr on December 25, 2019, 09:48:13 am

Thank you Steve, I'll look into it later. I make gifts for children, woman, myself .... I need 6 hands, three legs and 4 heads ... Just a quick look and a try: the expected output for a 1-byte input that fits in 7 bits or less is 7 bits, not 2 bytes. Specifically, try entering "!" [00100001], which is 6 bits long, but your function returns 2 bytes. That doesn't seem to be okay.

For 7-bit encoding, 7 bytes of input (8 * 7 = 56 bits) are used,
which are written as 8 seven-bit numbers. If the input is shorter, the missing bits are filled with zeroes on the left.

It’s just a case of making certain all data is preserved. The most you’d ever save is a single byte, so it doesn’t seem like it’s worth the hit to performance to strip off those leading 0’s.

Space is CHR$(32), which is &B00100000.... As long as it’s a singular character, we could strip off those leading 0’s and write it as 100000, but if we’re dealing with 2 spaces in a row, we’d have to preserve at least one of them.

00100000,00100000 would translate to: 00,1000000,0100000... Again, only 1 real byte we can save, from those leading 2 bits in the left byte.

00100000,00100000,00100000 would translate to: 001,0000000,1000000,0100000... At 3 bytes of spaces, we no longer save that single byte of leftover 0’s, as we have a significant 1 in the last byte.
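Here is a quick sketch that redoes that regrouping by machine, in case anyone wants to check it. It writes each space as the digits 00100000, then peels 7 digits at a time off the right-hand end. This only mirrors the written illustration (the variable names are made up); the posted functions work on a bit array rather than on digit strings.

```basic
' Sketch: regroup the digits of 1 to 3 spaces into 7-bit groups, from the right.
FOR n = 1 TO 3
    bits$ = ""
    FOR k = 1 TO n
        bits$ = bits$ + "00100000"           'CHR$(32) as 8 binary digits
    NEXT
    groups$ = ""
    DO WHILE LEN(bits$) > 7
        groups$ = "," + RIGHT$(bits$, 7) + groups$   'take the lowest 7 digits
        bits$ = LEFT$(bits$, LEN(bits$) - 7)         'keep what is left over
    LOOP
    PRINT bits$ + groups$                    'leftover digits first, then the groups
NEXT
```

For two and three spaces it prints the same groups as written out above.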

The most we’d ever have is a single byte of padding, and I’m not that concerned over saving that singular byte. It doesn’t bother me to keep it in there, to keep the conversion process simple and efficient.

A simple string check can remove that extra byte, if it’s really bothersome: IF LEN(converted_text$) MOD 8 <> 0 AND ASC(RIGHT$(converted_text$, 1)) = 45 THEN strip off that extra character, which is just padding of unused 0’s.

Quote

If the input is shorter, the missing bits are filled with zeroes on the left.

Either add padding at decode time, or leave padding at encode time — both are perfectly valid means of converting. I’m just doing the second, rather than the first. ;)

SMcNeill · « **Reply #19 on:** December 25, 2019, 03:30:24 pm »

Took a few moments to write up the change for you, as I mentioned above, now that Christmas lunch is over and everyone is just lazing around and nodding off napping.

Code: QB64:

SCREEN _NEWIMAGE(800, 600, 32)
 
FOR i = 0 TO 25 'Print A to Z, from the alphabet
    text$ = text$ + CHR$(65 + i)
    a$ = B256to128(text$)
    PRINT "B128: "; a$;
    LOCATE , 40
    b$ = B128to256(a$)
    PRINT "B256: "; b$
    SLEEP
NEXT
_DELAY 2
END
 
FUNCTION B256to128$ (text$)
    l = 8 * LEN(text$)
    DIM A(1 TO l) AS _UNSIGNED _BIT, b AS _UNSIGNED _BYTE
    'convert the text to the 8 bit array
    FOR i = 1 TO LEN(text$)
        b = ASC(text$, i)
        p = (i - 1) * 8 + 1
        FOR j = 0 TO 7
            IF b AND (2 ^ j) THEN A(p + j) = 1 ELSE A(p + j) = 0
        NEXT
    NEXT
    'convert the array to 7bit strings
    FOR i = 1 TO l STEP 7
        b = 0
        FOR j = 6 TO 0 STEP -1
            IF i + j <= l THEN
                IF A(i + j) THEN b = b + (2 ^ j)
            END IF
        NEXT
        b = b + 45
        t$ = t$ + CHR$(b)
    NEXT
    IF LEN(t$) MOD 8 <> 0 AND RIGHT$(t$, 1) = "-" THEN t$ = LEFT$(t$, LEN(t$) - 1)
    B256to128 = t$
END FUNCTION
 
FUNCTION B128to256$ (text$)
    l = 7 * LEN(text$)
    DIM A(1 TO l) AS _UNSIGNED _BIT, b AS _UNSIGNED _BYTE
    'unpack each character (minus the 45 offset) into 7 bits of the array
    FOR i = 1 TO LEN(text$)
        b = ASC(text$, i) - 45
        FOR j = 0 TO 6
            p = p + 1: IF p > l THEN EXIT FOR
            IF b AND (2 ^ j) THEN A(p) = 1 ELSE A(p) = 0
        NEXT
    NEXT
    'convert the array to 8bit strings
    p = 0
    FOR i = 1 TO l STEP 8
        b = 0
        FOR j = 0 TO 7
            p = p + 1
            IF p > l THEN EXIT FOR
            IF A(p) THEN b = b + (2 ^ j)
        NEXT
        t$ = t$ + CHR$(b)
    NEXT
    B128to256 = t$
END FUNCTION
 

The only real change here was the addition of this one line to strip off that extra byte for padding:

Code:

IF LEN(t$) MOD 8 <> 0 AND RIGHT$(t$, 1) = "-" THEN t$ = LEFT$(t$, LEN(t$) - 1)

Petr · « **Reply #20 on:** December 25, 2019, 04:42:57 pm »

So, I took my previous program, edited it, set it up to start from CHR$(40), and it works.

Here is the first output of Base128. This is just a decoder for the encoded content. The code can't be copied from this forum's code block.
It may not work for you, because characters above 127 are used for the encoding. Try this ZIP file, which contains the BAS and EXE.
