Author Topic: DATA constants with type suffix, opinions needed... (Read 7099 times)

RhoSigma · « **on:** February 07, 2021, 08:59:21 am »

I'm currently working on a fix for GitHub issue #124 to allow type suffixes for numeric values in DATA/READ constructs. However, the current implementation imposes some limits, so I'd like to ask for opinions on how far the support for type suffixes should go here.

In general there are two very similar functions in libqb.cpp called "n_inputnumberfromdata" and "n_inputnumberfromfile". The interesting thing is that the "file" function simply ignores any invalid chars, whereas the "data" function throws a "Syntax Error". My first thought was to just ignore those conditions in the "data" function too, but my second thought was: Wait a moment!! - A file can usually be seen as an external 3rd party resource that is not under the control of the application developer, hence it is absolutely legitimate to just ignore any misformatted file data and try to interpret the read number as well as possible (best guess). - In DATA/READ constructs, on the other hand, the application developer can be held responsible for any nonsense he enters into the DATA lines, hence kickin' his ass with a "Syntax Error" is absolutely legitimate here too.

So what "n_inputnumberfromdata" basically does is nothing more than copy the chars found in the DATA lines into a temporary buffer, checking while doing so that there are number related chars only (such as -, +, ., digits 0-9 etc.). It does not, however, evaluate the read number in any way yet, as that is done one step further by the functions "func_read_float", "func_read_int64" and "func_read_uint64". Which of these 3 functions is called depends on the variable types you give in the READ statement. These functions take the chars from the temporary buffer (see above) and evaluate them into the actual number, but as you can see they simply use the biggest possible variable type to do so. Hence type suffixes on the numbers only easily allow us to distinguish between integer types (%%, %, &, &&), floating point types (!, #, ##) and unsigned (~) for integers, but we can't distinguish between the actual size/precision of the variable types. Implementing that would need a massive intervention into the whole function calling chain involved in reading numbers from DATA and from files, which use the same evaluation functions.
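The two-step split described above can be sketched roughly as follows. This is only an illustration and not the actual libqb.cpp code: the names scan_number_chars and evaluate_widest_float are hypothetical stand-ins, the accepted character set is an assumption, and only decimal notation is handled. It shows why a suffix such as & trips the character check before any evaluation happens, and why the evaluation step has no idea which suffix was written.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Step 1 (hypothetical stand-in for the scan in n_inputnumberfromdata):
// copy the field into a temporary buffer and reject any char that is not
// number related. Returns false where a "Syntax Error" would be raised.
bool scan_number_chars(const std::string& field, std::string& buffer) {
    buffer.clear();
    for (char c : field) {
        bool ok = (c >= '0' && c <= '9') || c == '+' || c == '-' ||
                  c == '.' || c == 'E' || c == 'e';
        if (!ok) return false;   // e.g. a type suffix like '&' ends up here
        buffer.push_back(c);
    }
    return true;
}

// Step 2 (hypothetical stand-in for func_read_float): evaluate the buffer
// using the widest floating point type. Nothing in here knows about the
// size or precision of the variable that READ will finally store into.
long double evaluate_widest_float(const std::string& buffer) {
    return std::strtold(buffer.c_str(), nullptr);
}
```

With this sketch, scan_number_chars("12.5", buf) succeeds and evaluate_widest_float(buf) yields 12.5, while scan_number_chars("125&", buf) fails in step 1, which is the situation the issue is about.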

I'd prefer to forgo such massive interventions, as we would risk a whole bunch of new number related bugs. So I'll let you know what I have so far and listen to your thoughts on whether that is enough type suffix support or not.

1.) Type suffixes in DATA numbers are generally ignored, no matter their kind or count, i.e. you could write any combination of suffixes, even non-existent or ambiguous ones like %~&!, ###&~ etc. However, here I personally tend towards improving it to allow real suffixes only, which would not be that big a deal from the point I'm at right now.

2.) If unsigned (~) is found, but the number is obviously entered as a negative number (leading -) in the DATA line, it's an error.

3.) If an integer type (%,&) is found and the number would obviously evaluate into a floating point value, it's an error, e.g. 1.111& or 1E-5%, but not 1.111E+3& or 1E+5%. Hence, if the exponent (implied or explicitly given) is not negative and at least equal to the number of digits after the decimal dot, then the integer condition is fulfilled.

4.) If more digits, dots or exponents follow after the first type symbol, it's an error, e.g. 125&.45, 125%E+5 etc.

5.) HEX/OCT/BIN numbers are introduced with & and need to be followed by one of the letters H/O/B (case ignored), otherwise it's an error. For suffixes see 1.), but allowing integer suffixes only is an option.

6.) HEX/OCT/BIN is always evaluated as an unsigned value unless it's in the _INTEGER64 range, where it would be signed if the most significant bit is set (this is all imposed by the fact that the evaluation functions operate on int64/uint64 only). Changing this behaviour would require a lot of intervention, as the type suffixes are known to "n_inputnumberfromdata" only, but the size/sign of the evaluation would need to go into the "func_read_int64/uint64" functions. An explicit sign precheck on bits 63/31/15/7 would also need to be put into "n_inputnumberfromdata" then.

And even if this all worked as expected, it would impose a new rule on how to write HEX/OCT/BIN DATA correctly, e.g. &Hff, &H8000, &H81f2a000 would then evaluate into negative numbers unless they are given with explicit types, e.g. &Hff&, &H8000~%, &H81f2a000&&. That would certainly confuse many people again; I know we've had such discussions several times in the forum already.
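To make the sign problem from 6.) concrete, here is a minimal sketch of what a width-aware interpretation of a HEX/OCT/BIN value could look like. The helper interpret_raw and its parameters are hypothetical and nothing like it is claimed to exist in libqb.cpp; it only assumes that the raw value has already been evaluated as uint64 and that the intended width in bits (8, 16, 32 or 64) is known from somewhere.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: reinterpret the low `bits` bits of an already
// evaluated HEX/OCT/BIN value as signed or unsigned.
// `bits` is assumed to be 8, 16, 32 or 64.
std::int64_t interpret_raw(std::uint64_t raw, int bits, bool is_unsigned) {
    if (bits == 64) {
        // Plain reinterpretation; an unsigned 64-bit value with the top bit
        // set cannot be represented in the int64 return type of this sketch.
        return static_cast<std::int64_t>(raw);
    }
    const std::uint64_t mask = (std::uint64_t{1} << bits) - 1;
    const std::uint64_t sign_bit = std::uint64_t{1} << (bits - 1);
    const std::uint64_t value = raw & mask;
    if (!is_unsigned && (value & sign_bit)) {
        // Sign bit set: subtract 2^bits to get the negative value.
        return static_cast<std::int64_t>(value) -
               static_cast<std::int64_t>(mask) - 1;
    }
    return static_cast<std::int64_t>(value);
}
```

Under these assumptions interpret_raw(0xff, 8, false) gives -1 and interpret_raw(0x8000, 16, false) gives -32768, whereas the explicitly typed forms from the text come out positive: &Hff& corresponds to interpret_raw(0xff, 32, false), which is 255, and &H8000~% corresponds to interpret_raw(0x8000, 16, true), which is 32768.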

So now it's time for you guys, what do you think?? - Done as is, or what??

FellippeHeitor · « **Reply #1 on:** February 07, 2021, 09:11:10 am »

I say make the routine ignore the suffix for the sake of retrocompatibility, without throwing an error if they're there. In all these years I honestly didn't even know you could do that in QB4.5 - and since only now someone pointed it out, I must assume it's not the most common thing people would have done back in the day.

So, if a suffix is found, let it be stripped away/ignored. It's the variable that was passed that will either store it properly or overflow. Both of which can be handled by the compiler as is.

RhoSigma · « **Reply #2 on:** February 07, 2021, 09:33:59 am »

Quote from: FellippeHeitor on February 07, 2021, 09:11:10 am

I say make the routine ignore the suffix for the sake of retrocompatibility, without throwing an error if they're there. In all these years I honestly didn't even know you could do that in QB4.5 - and since only now someone pointed it out, I must assume it's not the most common thing people would have done back in the day.

So, if a suffix is found, let it be stripped away/ignored. It's the variable that was passed that will either store it properly or overflow. Both of which can be handled by the compiler as is.

Well, that would really be the easiest way and very close to my 1st thought: just do it the same way as when reading the values from a file and ignore them. And it makes sense for the sake of retrocompatibility, as there's not really a way to use the types for what they are, since reading and evaluating are done in different functions.

However, stripping my stuff down to the "Type Ignorer" part would still leave the error conditions mentioned in 4.) and 5.) in place.

SMcNeill · « **Reply #3 on:** February 07, 2021, 11:12:06 am »

Here’s a question Rho, since you’ve dug this deep into things:

Would it be possible to add a new _DATABLOCK type structure to take advantage of the read from file syntax, and make it internal? An example usage might be:

_DATABLOCK my_image:
All sorts of raw data here, full of characters that DATA would normally object to.
END _DATABLOCK

Then, to read that data, all one would need to do is:
RESTORE my_image
READ raw_image_data$

FellippeHeitor · « **Reply #4 on:** February 07, 2021, 11:13:08 am »

The IDE would have a terrible time with binary data flowing freely around.

SMcNeill · « **Reply #5 on:** February 07, 2021, 11:14:59 am »

Quote from: FellippeHeitor on February 07, 2021, 11:13:08 am

The IDE would have a terrible time with binary data flowing freely around.

Not really. It can just interact with the lines between like it does with an invalid $IF block. ;)

FellippeHeitor · « **Reply #6 on:** February 07, 2021, 11:15:45 am »

CHR$(9) gets converted to spaces, CHR$(10) is a new line marker. Binary data in text source code is not a good idea.

FellippeHeitor · « **Reply #7 on:** February 07, 2021, 11:16:55 am »

Also, completely alien to what Rho is helping us solve. Let us not derail.

RhoSigma · « **Reply #8 on:** February 07, 2021, 11:32:41 am »

@SMcNeill, my favorite answer to such kinds of tasks, which I usually give my boss: "Oh well, theoretically I can do practically everything."

From the pure processing side of things I'd say no problem, but I agree with Fellippe: placing binary data directly into an ASCII text source file isn't really wise. Even if the IDE accepted it, what happens if you paste your program into a codebox here? Even a C++ compiler doesn't allow for that; you would need to link in your binary data as an object file later.

RhoSigma · « **Reply #9 on:** February 07, 2021, 05:13:22 pm »

@FellippeHeitor, as no more suggestions came in here (as you assumed, it seems not to be a very popular practice), I've made the changes as mentioned above:

Type symbols are ignored, but numbers cannot continue after those. Also HEX/OCT/BIN require a valid type introducer.
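As a rough illustration of that final behaviour, the check could look something like the sketch below. It is not the code from the pull request: accept_data_number and its helpers are hypothetical names, the set of suffix chars (% & ! # ~) is an assumption, and like the original scan it only looks at which chars appear where, without evaluating the number.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Assumed set of type symbols; any combination of them is tolerated.
static bool is_suffix_char(char c) {
    return c == '%' || c == '&' || c == '!' || c == '#' || c == '~';
}

static bool is_digit_in_base(char c, int base) {
    if (base == 16) return std::isxdigit(static_cast<unsigned char>(c)) != 0;
    if (base == 8) return c >= '0' && c <= '7';
    return c == '0' || c == '1';   // base 2
}

// Hypothetical checker: true if the DATA field would be accepted,
// false where a "Syntax Error" would be expected.
bool accept_data_number(const std::string& s) {
    std::size_t i = 0;
    const std::size_t n = s.size();
    if (i < n && s[i] == '&') {
        // HEX/OCT/BIN: '&' must be followed by H, O or B (case ignored).
        if (i + 1 >= n) return false;
        const int t = std::toupper(static_cast<unsigned char>(s[i + 1]));
        const int base = (t == 'H') ? 16 : (t == 'O') ? 8 : (t == 'B') ? 2 : 0;
        if (base == 0) return false;
        i += 2;
        while (i < n && is_digit_in_base(s[i], base)) ++i;
    } else {
        // Decimal notation: sign, digits, dot and exponent chars.
        while (i < n && (std::isdigit(static_cast<unsigned char>(s[i])) ||
                         s[i] == '+' || s[i] == '-' || s[i] == '.' ||
                         s[i] == 'E' || s[i] == 'e')) ++i;
    }
    // From the first type symbol on, only type symbols may follow.
    while (i < n && is_suffix_char(s[i])) ++i;
    return i == n;
}
```

In this sketch 125&, 1.5E+3# and &H8000~% are accepted with the suffix simply skipped, while 125&.45, 125%E+5 and &X12 are rejected.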

Pull request ready for review...

FellippeHeitor · « **Reply #10 on:** February 07, 2021, 05:48:33 pm »

Thank you so much, Rho. Much appreciated.

PS: Merged into development!

RhoSigma · « **Reply #11 on:** February 10, 2021, 08:58:44 am »

@FellippeHeitor, thanks for merging it in, just got the latest build and it works as intended. The issue #124 can be closed then too.

FellippeHeitor · « **Reply #12 on:** February 10, 2021, 09:02:01 am »

True.
