Author Topic: find last record in a random file (Read 24249 times)

badger · « **on:** September 29, 2020, 03:54:24 pm »

Hello

How does one go about finding the last record in a random file that has lots of records?

Badger

bplus · « **Reply #1 on:** September 29, 2020, 03:57:50 pm »

If all else fails divide the file size by record size.

badger · « **Reply #2 on:** September 29, 2020, 04:06:34 pm »

Hello

Now that rings a bell. I have not used any form of QB since the 90s, lol. Thanks!

badger

Pete · « **Reply #3 on:** September 29, 2020, 08:01:19 pm »

Here's an example of what Mark (bplus) recommended...

Warning: This will make a file in the local directory named: myfiletest.txt. If such a file already exists, not made from this program, it will corrupt it by overwriting the existing data with the data elements in this program.

Code: QB64

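DIM a AS STRING * 20
DIM i AS INTEGER
' Write one fixed-length record per DATA item, stopping at the EOD marker.
OPEN "myfiletest.txt" FOR RANDOM AS #1 LEN = LEN(a)
DO
    READ a
    IF MID$(a, 1, 3) = "EOD" THEN EXIT DO
    i = i + 1
    PUT #1, i, a
LOOP
CLOSE #1

' Reopen the file; the last record number is file size \ record size.
OPEN "myfiletest.txt" FOR RANDOM AS #1 LEN = LEN(a)
GET #1, LOF(1) \ LEN(a), a
PRINT LOF(1), LEN(a), LOF(1) \ LEN(a), a
CLOSE #1
END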
 
DATA dog,cat,rabbit,horse,mouse,elephant,moose,racoon,emu,butterfly,EOD
 

Pete

bplus · « **Reply #4 on:** September 29, 2020, 08:45:23 pm »

Hey Pete, glad you are keeping an eye on this. I haven't opened a file for Random since the 90's either!

TempodiBasic · « **Reply #5 on:** September 30, 2020, 04:48:50 am »

Quote

Hey Pete, glad you are keeping an eye on this. I haven't opened a file for Random since the 90's either!

In my limited experience, I thought that RANDOM access to file data was the best... until I understood the power and flexibility of BINARY access.

Here is a version using BINARY access to the file... you can see the slight difference in syntax.

Code: QB64

DIM a AS STRING * 20
DIM i AS INTEGER
OPEN "myfiletest.txt" FOR BINARY AS #1 LEN = LEN(a)
COLOR 2, 1: LOCATE 2, 1: PRINT " Written into file"
DO
    READ a
    IF MID$(a, 1, 3) = "EOD" THEN EXIT DO
    i = i + 1
    PUT #1, , a
    LOCATE , 1: PRINT a
LOOP
CLOSE #1
COLOR 1, 2: LOCATE 2, 30: PRINT "Read from file"
OPEN "myfiletest.txt" FOR BINARY AS #1 LEN = LEN(a)
DO WHILE NOT EOF(1)
    GET #1, , a
    LOCATE , 30: PRINT a
LOOP
COLOR 4, 7: PRINT " The last record is..."
' BINARY positions are 1-based byte offsets, so the last record starts at LOF - LEN + 1.
GET #1, LOF(1) - LEN(a) + 1, a
PRINT LOF(1), LEN(a), LOF(1) - LEN(a) + 1, a
COLOR 15, 0
END
 
DATA dog,cat,rabbit,horse,mouse,elephant,moose,racoon,emu,butterfly, free_pizza,EOD
 
 

If you run this code, you can GET the last record just like Pete's code does... and this time it's FREE PIZZA!

SMcNeill · « **Reply #6 on:** September 30, 2020, 06:45:38 am »

There’s very little difference between RANDOM access and BINARY access. What it all boils down to is basically:

With RANDOM access, you GET/PUT to record numbers.
With BINARY access you GET/PUT to byte positions: (record number - 1) * record size + 1.

That’s basically the whole dang difference!

OPEN file$ FOR RANDOM AS #1 LEN = 100
PUT #1, 100, yourdata

If the above is putting data into the 100th record of a RANDOM file, below would be the exact same process in binary format (assuming yourdata is 100 bytes long).

OPEN file$ FOR BINARY AS #1
PUT #1, (100 - 1) * LEN(yourdata) + 1, yourdata

That’s it!
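To make the record-number-to-byte-position mapping concrete, here is a minimal sketch. The file name recpos.tmp and the helper BytePos& are made up for illustration and are not part of QB64. It writes five fixed-length records with RANDOM access, then reads record 3 back with BINARY access, relying on BINARY positions counting bytes from 1.

```basic
' Sketch only: "recpos.tmp" and BytePos& are hypothetical names for this demo.
DIM rec AS STRING * 20
DIM n AS LONG

IF _FILEEXISTS("recpos.tmp") THEN KILL "recpos.tmp" ' start from an empty file

' Write records 1 to 5 by record number.
OPEN "recpos.tmp" FOR RANDOM AS #1 LEN = LEN(rec)
FOR n = 1 TO 5
    rec = "record" + STR$(n)
    PUT #1, n, rec
NEXT
CLOSE #1

' Read record 3 back by byte position.
OPEN "recpos.tmp" FOR BINARY AS #1
GET #1, BytePos&(3, LEN(rec)), rec
PRINT rec
CLOSE #1
END

' Byte position of the first byte of a record (records and bytes both count from 1).
FUNCTION BytePos& (recordNumber AS LONG, recordSize AS LONG)
    BytePos& = (recordNumber - 1) * recordSize + 1
END FUNCTION
```

With 20-byte records, record 3 starts at byte 41, not byte 60.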

There’s just one big advantage to using BINARY over RANDOM: you can make room for a header that stores database information (such as a password, last date accessed, description, version, etc.) in the same file as your records.

Honestly, the sooner one “graduates” up to binary, instead of random, the more possibilities they open up in their code with things they can program and do easily.

Pete · « **Reply #7 on:** September 30, 2020, 12:55:08 pm »

You can do that header calculation with RANDOM, too. Just reserve however many records you need at the start of the file for those extras, and start reading your data after the last header record. You could set aside 10 records for header info, and use:

startread% = 10
GET #1, startread% + i, a

Just sayin'

Pete

SMcNeill · « **Reply #8 on:** September 30, 2020, 01:41:56 pm »

You could, but you couldn’t read that header in RANDOM format. It wouldn’t match your same data structure.

My data might be:
CustomerName AS STRING * 25
Age AS SINGLE
Phone AS _INTEGER64

The header might be:
Descriptor: QB64 Demo Database
Programmer: SMcNeill
Year: 2020
Version: 0.159a
Last Update: 09/30/2020:13:36:54
Total Records: 42
Deleted Records: 12

Trying to squeeze that information into a couple of RANDOM records is possible, but it’s gonna be a complete PITA to do so.

With BINARY, it’s as simple as PUT #1, 1, Header, and then to access our data it’s PUT #1, 1 + LEN(Header) + (RecordOn - 1) * LEN(Record), Record (with RecordOn counting from 1).
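A minimal sketch of that header-plus-records layout, under some assumptions: the HeaderType fields are a cut-down, made-up version of the header listed above, and the file name headerdemo.tmp and the sample values are purely illustrative. It also shows how the last record can be found from the file size once the header length is subtracted.

```basic
' Sketch only: type layouts, file name and sample values are hypothetical.
TYPE HeaderType
    Descriptor AS STRING * 32
    Version AS STRING * 8
    TotalRecords AS LONG
END TYPE

TYPE CustomerType
    CustomerName AS STRING * 25
    Age AS SINGLE
    Phone AS _INTEGER64
END TYPE

DIM Header AS HeaderType
DIM Record AS CustomerType
DIM recordCount AS LONG, i AS LONG

IF _FILEEXISTS("headerdemo.tmp") THEN KILL "headerdemo.tmp" ' start from an empty file
OPEN "headerdemo.tmp" FOR BINARY AS #1

' The header occupies bytes 1 to LEN(Header).
Header.Descriptor = "QB64 Demo Database"
Header.Version = "0.159a"
Header.TotalRecords = 3
PUT #1, 1, Header

' Record i (counting from 1) starts right after the header.
FOR i = 1 TO Header.TotalRecords
    Record.CustomerName = "Customer" + STR$(i)
    Record.Age = 20 + i
    Record.Phone = 5550000 + i
    PUT #1, 1 + LEN(Header) + (i - 1) * LEN(Record), Record
NEXT

' Last record: subtract the header from the file size, then divide by record size.
recordCount = (LOF(1) - LEN(Header)) \ LEN(Record)
GET #1, 1 + LEN(Header) + (recordCount - 1) * LEN(Record), Record
PRINT recordCount; RTRIM$(Record.CustomerName)
CLOSE #1
END
```

Because the header and the records are separate TYPEs, each keeps its own natural data types, and nothing has to be parsed back out of strings.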

JohnUKresults · « **Reply #9 on:** September 30, 2020, 07:30:13 pm »

Am confused as to what benefits Binary has over Random where you have defined field sizes in records.

Is it quicker to access or sort, more efficient for memory etc.?

How backwards compatible will it be with QB45 programs (believe it or not I still have some laptops using XP 32 bit)

Would you be OK for me to post some code samples showing how I currently store data for my programs then you can advise how Binary might be better?

Cheers

OldMoses · « **Reply #10 on:** September 30, 2020, 08:35:54 pm »

I would find that an enlightening exercise/discussion. I know nothing of BINARY access, but from what Steve describes, I should probably be using BINARY. Up to this point, I've used RANDOM exclusively and have lately found it to be somewhat "limiting".

Pete · « **Reply #11 on:** September 30, 2020, 10:02:20 pm »

@Steve

Here's a quirky way to work with headers and RA files. Instead of defining file entries of a specific fixed length, just calculate the longest string, add 2 (a variable-length string in a RANDOM file is stored with a 2-byte length prefix), and don't ever open the file in Notepad, because it's ugly... but it works.

Code: QB64

RESTORE radata
DO
    READ a$
    IF a$ = "EOD" THEN EXIT DO
    IF LEN(a$) > longeststring% THEN longeststring% = LEN(a$)
LOOP
OPEN "tmp9.tmp" FOR RANDOM AS #1 LEN = longeststring% + 2
a$ = "Pete": PUT #1, 1, a$
a$ = TIME$: PUT #1, 2, a$
a$ = DATE$: PUT #1, 3, a$
startread% = 5
RESTORE radata
i = 0
DO
    i = i + 1
    READ a$
    IF a$ = "EOD" THEN EXIT DO
    PUT #1, startread% - 1 + i, a$
LOOP
GET #1, 1, header$
PRINT "Header 1: "; header$
GET #1, 2, header$
PRINT "Header 2: "; header$
GET #1, 3, header$
PRINT "Header 3: "; header$
PRINT
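DO
    LINE INPUT "Input a number from 1 to 9 (blank line to quit): "; x$
    IF x$ = "" THEN EXIT DO ' leave the loop so CLOSE #1 below is actually reached
    IF VAL(x$) >= 1 AND VAL(x$) <= 9 THEN
        GET #1, startread% - 1 + VAL(x$), a$
        PRINT a$, LEN(a$)
    ELSE
        PRINT "Redo from start..."
    END IF
LOOP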
CLOSE #1
END
 
radata:
DATA first entry,second entry,third entry,fourth entry,fifth entry,sixth entry,seventh entry,eighth entry,ninth entry,EOD
 

SMcNeill · « **Reply #12 on:** October 01, 2020, 01:47:28 am »

Quote from: JohnUKresults on September 30, 2020, 07:30:13 pm

Am confused as to what benefits Binary has over Random where you have defined field sizes in records.

Is it quicker to access or sort, more efficient for memory etc.?

How backwards compatible will it be with QB45 programs (believe it or not I still have some laptops using XP 32 bit)

Would you be OK for me to post some code samples showing how I currently store data for my programs then you can advise how Binary might be better?

Cheers

From my testing in the past, BINARY access is usually a bit faster than RANDOM access, but it’s barely enough to worry about unless you’re just dealing with excessively large amounts of data.

The main reasons I’d advocate folks using BINARY over RANDOM are:

1) Ease of using headers, and for stacking data “chunks”, such as many image formats do. (PNG files are chunks of header, palette, pixel color data, all contained in a single file.)

2) Once one is familiar with BINARY access, the exact same principles apply to, and are used with, all the _MEM commands. Get used to working with one, and the other comes naturally. (Think of it as the progression from using LOCATE x, y: PRINT to using _PRINTSTRING(x, y)...)

If I had existing programs working using RANDOM, I wouldn’t bother to swap them to BINARY, but for new programs I’d use BINARY instead. As for backwards compatibility, you don’t need to worry — both RANDOM and BINARY are fully usable in both QB45 and QB64.

If you want to post a few of your old programs, I’d be happy to show how simple the conversion process is, so you can compare the slight differences between the two methods. ;)

SMcNeill · « **Reply #13 on:** October 01, 2020, 01:56:28 am »

@Pete: The biggest problem with that method is you have to parse and convert all those strings into the proper data types. It *can* be done, but it’s more work than anyone should ever have to do, to avoid a much simpler method. ;)

JohnUKresults · « **Reply #14 on:** October 01, 2020, 07:04:32 am »

Hi

Thanks for the replies. I only work with text data, so I don't need to worry about images and suchlike. I'm compiling times for runners at races, so I need to store certain brief details about them, pared down to the minimum I need for any event. Each race has its own set of files (there are 6 files, which combine into effectively a relational database), but I'll just touch on a couple here.

First, I suppose this is the equivalent of your header, I have a very small text file with data like this storing details of the event:
143,"Podium Sub-30.00 10k, Barrowford","19th September 2020","998",21
143 shows the number of entries, then it's the name and date of the race, 998 is the number of club names stored (a separate file to look up by reference) and 21 is the number of finish times recorded, so if there had been 40 finishers, 21 would be replaced by 40. I open this file briefly (sequential read, as always in the same order) after selecting the event, to read in those variables for use in the main program. They are updated during operation and then written back to that file when I have finished with the data, with any changes reflected.
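A minimal sketch of that sequential read, with assumptions flagged: the file name event.tmp and the variable names are made up for illustration, and the file is written first with WRITE # only so the sketch is self-contained. WRITE # quotes strings and separates items with commas, which produces the same layout as the line shown above.

```basic
' Sketch only: "event.tmp" and the variable names are hypothetical.
DIM entries AS INTEGER, finishers AS INTEGER
DIM raceName AS STRING, raceDate AS STRING, clubCount AS STRING

' Create the event details file in the comma-separated, quoted-string layout.
OPEN "event.tmp" FOR OUTPUT AS #1
WRITE #1, 143, "Podium Sub-30.00 10k, Barrowford", "19th September 2020", "998", 21
CLOSE #1

' INPUT # reads the items back in the same order; quoted strings may contain commas.
OPEN "event.tmp" FOR INPUT AS #1
INPUT #1, entries, raceName, raceDate, clubCount, finishers
CLOSE #1

PRINT entries; "entries,"; finishers; "finishers"
PRINT raceName; " on "; raceDate; ", clubs stored: "; clubCount
```

The club count is read as a string here because it is stored in quotes in the file.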

Next is the main runner data, for example:
1 Callum Smith MS 606
2 Mohd Ahmed MS 452 29.114
3 Emile France MS 79 28.461
4 Graham White MS 79 29.437
All of these are fixed length fields:
Runner number (4 characters), name (20 characters), category (3 characters), club number (3 characters), time (8 characters), position (4 characters), and a free text field (5 characters) to set any markers relating to sub-events or simply to flag up certain things like local runners.

Time remains blank if no time recorded
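As a rough sketch of how that layout could map onto a TYPE: the field names, the file name runners.tmp and the choice to keep every field as a fixed-length string are assumptions for illustration, with only the widths taken from the list above. The widths add up to 47 bytes per record, and the last record number again comes from dividing the file size by the record size.

```basic
' Sketch only: field names and "runners.tmp" are hypothetical; widths follow the list above.
TYPE RunnerType
    RunnerNo AS STRING * 4
    FullName AS STRING * 20
    Category AS STRING * 3
    ClubNo AS STRING * 3
    FinishTime AS STRING * 8
    Place AS STRING * 4
    Marker AS STRING * 5
END TYPE

DIM runner AS RunnerType
DIM i AS LONG, lastRec AS LONG

IF _FILEEXISTS("runners.tmp") THEN KILL "runners.tmp" ' start from an empty file
OPEN "runners.tmp" FOR RANDOM AS #1 LEN = LEN(runner) ' 47 bytes per record

FOR i = 1 TO 4
    READ no$, nm$, cat$, club$, tm$
    runner.RunnerNo = no$: runner.FullName = nm$: runner.Category = cat$
    runner.ClubNo = club$: runner.FinishTime = tm$ ' stays blank if no time recorded
    runner.Place = "": runner.Marker = "" ' unused here, padded with spaces
    PUT #1, i, runner
NEXT

lastRec = LOF(1) \ LEN(runner) ' last record number = file size \ record size
GET #1, lastRec, runner
PRINT lastRec; RTRIM$(runner.FullName); " "; RTRIM$(runner.FinishTime)
CLOSE #1
END

DATA 1,Callum Smith,MS,606,""
DATA 2,Mohd Ahmed,MS,452,29.114
DATA 3,Emile France,MS,79,28.461
DATA 4,Graham White,MS,79,29.437
```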

There are some other files with additional data (all effectively keyed on the race number) but if I can see how to read and write to the main data file then the same principles will apply to the other files.

I never have races over about 3000, so there are never massive amounts of data to be manipulating. I generally copy info from the random access files into arrays for sorting, and use temporary files as well, so apart from record sorting to store in numerical order during data entry, the files don't change much.

Sorting etc. only takes fractions of a second, so this is probably a bit of a theoretical exercise for me, rather than something I might actually change over to, but it would be interesting to see how it might work in practice.

Am I correct in assuming that there wouldn't be much change in the actual file structure for the main file, as each record would need to remain a fixed length so as to access it accurately, or would binary files enable me to have flexibility in field length?

Thanks again for your input.
