Active Forums => QB64 Discussion => Topic started by: badger on September 29, 2020, 03:54:24 pm

Title: find last record in a random file
Post by: badger on September 29, 2020, 03:54:24 pm

Hello

How does one go about finding the last record in a random file that has lots of records?

Badger

Title: Re: find last record in a random file
Post by: bplus on September 29, 2020, 03:57:50 pm

If all else fails, divide the file size by the record size.

Title: Re: find last record in a random file
Post by: badger on September 29, 2020, 04:06:34 pm

Hello

Now that rings a bell. I have not used any form of QB since the 90s, lol. Thanks.

badger

Title: Re: find last record in a random file
Post by: Pete on September 29, 2020, 08:01:19 pm

Here's an example of what Mark (bplus) recommended...

Warning: This will make a file in the local directory named: myfiletest.txt. If such a file already exists, not made from this program, it will corrupt it by overwriting the existing data with the data elements in this program.

Code: QB64: [Select]

DIM a AS STRING * 20
DIM i AS INTEGER
' Write one fixed-length record per DATA item until the EOD marker.
OPEN "myfiletest.txt" FOR RANDOM AS #1 LEN = LEN(a)
DO
    READ a
    IF MID$(a, 1, 3) = "EOD" THEN EXIT DO
    i = i + 1
    PUT #1, i, a
LOOP
CLOSE #1

' Last record number = file size \ record size.
OPEN "myfiletest.txt" FOR RANDOM AS #1 LEN = LEN(a)
GET #1, LOF(1) \ LEN(a), a
PRINT LOF(1), LEN(a), LOF(1) \ LEN(a), a
CLOSE #1
END
 
DATA dog,cat,rabbit,horse,mouse,elephant,moose,racoon,emu,butterfly,EOD
 

Pete

Title: Re: find last record in a random file
Post by: bplus on September 29, 2020, 08:45:23 pm

Hey Pete, glad you are keeping an eye on this. I haven't opened a file for Random since the 90's either!

Title: Re: find last record in a random file
Post by: TempodiBasic on September 30, 2020, 04:48:50 am

In my limited experience, I thought that RANDOM access to file data was the best... until I understood the power and flexibility of BINARY access to files.

Here is a version using BINARY access to the file... you can see the slight difference in syntax.

Code: QB64: [Select]

DIM a AS STRING * 20
DIM n AS LONG
OPEN "myfiletest.txt" FOR BINARY AS #1 LEN = LEN(a)
COLOR 2, 1: LOCATE 2, 1: PRINT " Written into file"
DO
    READ a
    IF MID$(a, 1, 3) = "EOD" THEN EXIT DO
    PUT #1, , a
    LOCATE , 1: PRINT a
LOOP
CLOSE #1
COLOR 1, 2: LOCATE 2, 30: PRINT "Read from file"
OPEN "myfiletest.txt" FOR BINARY AS #1 LEN = LEN(a)
' Loop by record count; EOF only turns true after a GET runs past the end.
FOR n = 1 TO LOF(1) \ LEN(a)
    GET #1, , a
    LOCATE , 30: PRINT a
NEXT
COLOR 4, 7: PRINT " The last record is..."
' Byte positions start at 1, so the last record begins at LOF - LEN + 1.
GET #1, LOF(1) - LEN(a) + 1, a
PRINT LOF(1), LEN(a), LOF(1) - LEN(a) + 1, a
COLOR 15, 0
END
 
DATA dog,cat,rabbit,horse,mouse,elephant,moose,racoon,emu,butterfly, free_pizza,EOD
 
 

If you run this code, you can GET the same result as Pete's code! FREE PIZZA

Title: Re: find last record in a random file
Post by: SMcNeill on September 30, 2020, 06:45:38 am

There’s very little difference between RANDOM access and BINARY access. What it all boils down to is basically:

With RANDOM access, you GET/PUT to record numbers.
With BINARY access, you GET/PUT to byte positions: (record number - 1) * record size + 1.

That’s basically the whole dang difference!

OPEN file$ FOR RANDOM AS #1 LEN = 100
PUT #1, 100, yourdata

If the above is putting data into the 100th record of a RANDOM file, below would be the exact same process in binary format (assuming yourdata is 100 bytes long, to match LEN = 100).

OPEN file$ FOR BINARY AS #1
PUT #1, (100 - 1) * LEN(yourdata) + 1, yourdata

That’s it!

There’s just one big advantage to using BINARY over RANDOM: you can set aside room for a header that stores database information (such as a password, last date accessed, description, version, etc.) in the same file as your records.

Honestly, the sooner one “graduates” up to binary, instead of random, the more possibilities they open up in their code with things they can program and do easily.
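
A minimal sketch of that equivalence, assuming a scratch file named offsetdemo.tmp and 20-byte records (both made up for illustration): the same record is fetched once by RANDOM record number and once by BINARY byte position.

```qb64
' Sketch: one record reached by RANDOM record number and by BINARY byte position.
' "offsetdemo.tmp" is a hypothetical scratch file name.
DIM rec AS STRING * 20
DIM n AS LONG

OPEN "offsetdemo.tmp" FOR OUTPUT AS #1: CLOSE #1 ' start from an empty file
OPEN "offsetdemo.tmp" FOR RANDOM AS #1 LEN = LEN(rec)
FOR n = 1 TO 5
    rec = "record" + STR$(n)
    PUT #1, n, rec
NEXT
rec = "": GET #1, 3, rec ' RANDOM: ask for record number 3
PRINT "RANDOM record 3: ["; rec; "]"
CLOSE #1

OPEN "offsetdemo.tmp" FOR BINARY AS #1
n = 3
rec = "": GET #1, (n - 1) * LEN(rec) + 1, rec ' BINARY: byte positions start at 1
PRINT "BINARY byte"; (n - 1) * LEN(rec) + 1; ": ["; rec; "]"
CLOSE #1
END
```

Both PRINT lines should show the same 20-character record, since record 3 of a 20-byte RANDOM file starts at byte 41.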

Title: Re: find last record in a random file
Post by: Pete on September 30, 2020, 12:55:08 pm

You can do that header calculation with RANDOM, too. Just reserve however many records you need at the start of the file for those extras, and start reading your data after the last header record. You could reserve 10 records for header info, and use:

startread% = 10
GET #1, startread% + i, a

Just sayin'

Pete

Title: Re: find last record in a random file
Post by: SMcNeill on September 30, 2020, 01:41:56 pm

You could, but you couldn’t read that header in RANDOM format. It wouldn’t match your same data structure.

My data might be:
CustomerName AS STRING * 25
Age AS SINGLE
Phone AS _INTEGER64

The header might be:
Descriptor: QB64 Demo Database
Programmer: SMcNeill
Year: 2020
Version: 0.159a
Last Update: 09/31/2020:13:36:54
Total Records: 42
Deleted Records: 12

Trying to squeeze that information into a couple of RANDOM records is possible, but it’s gonna be a complete PITA to do so.

With BINARY, it’s as simple as PUT #1, 1, Header, and then accessing our data is PUT #1, 1 + LEN(Header) + RecordOn * LEN(Record), Record (with RecordOn counting from 0 for the first record).
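
A rough sketch of that layout, under some assumptions: the header is cut down to three fields, and the field names, sample values, and the file name headerdemo.tmp are invented for illustration. RecordOn counts from 0.

```qb64
' Sketch: a header followed by fixed-size records in one BINARY file.
' Type names, field names and "headerdemo.tmp" are hypothetical.
TYPE HeaderType
    Descriptor AS STRING * 32
    DbVersion AS STRING * 8
    TotalRecords AS LONG
END TYPE

TYPE RecordType
    CustomerName AS STRING * 25
    Age AS SINGLE
    Phone AS _INTEGER64
END TYPE

DIM Header AS HeaderType
DIM Record AS RecordType
DIM RecordOn AS LONG

OPEN "headerdemo.tmp" FOR OUTPUT AS #1: CLOSE #1 ' start from an empty file
OPEN "headerdemo.tmp" FOR BINARY AS #1

Header.Descriptor = "QB64 Demo Database"
Header.DbVersion = "0.159a"
Header.TotalRecords = 2
PUT #1, 1, Header ' the header sits at byte 1

Record.CustomerName = "First Customer": Record.Age = 30: Record.Phone = 5551234
RecordOn = 0 ' first record starts right after the header
PUT #1, 1 + LEN(Header) + RecordOn * LEN(Record), Record

Record.CustomerName = "Second Customer": Record.Age = 41: Record.Phone = 5555678
RecordOn = 1
PUT #1, 1 + LEN(Header) + RecordOn * LEN(Record), Record

' Read the header and the second record back.
GET #1, 1, Header
RecordOn = 1
GET #1, 1 + LEN(Header) + RecordOn * LEN(Record), Record
PRINT RTRIM$(Header.Descriptor); " v"; RTRIM$(Header.DbVersion); ", records:"; Header.TotalRecords
PRINT RTRIM$(Record.CustomerName); Record.Age; Record.Phone
CLOSE #1
END
```

The header and the records have different sizes and different field types, yet each is read and written in one GET or PUT with no string parsing.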

Title: Re: find last record in a random file
Post by: JohnUKresults on September 30, 2020, 07:30:13 pm

Am confused as to what benefits Binary has over Random where you have defined field sizes in records.

Is it quicker to access or sort, more efficient for memory etc.?

How backwards compatible will it be with QB45 programs (believe it or not I still have some laptops using XP 32 bit)

Would you be OK for me to post some code samples showing how I currently store data for my programs then you can advise how Binary might be better?

Cheers

Title: Re: find last record in a random file
Post by: OldMoses on September 30, 2020, 08:35:54 pm

I would find that an enlightening exercise/discussion. I know nothing of BINARY access, but from what Steve describes, I should probably be using BINARY. Up to this point, I've used RANDOM exclusively and have lately found it to be somewhat "limiting".

Title: Re: find last record in a random file
Post by: Pete on September 30, 2020, 10:02:20 pm

@Steve

Here's a quirky way to work with headers and RA files. Instead of defining file entries of a specific fixed length, just calculate the longest string, add 2, and don't ever open the file in Notepad, because it's ugly... but it works.

Code: QB64: [Select]

RESTORE radata
DO
    READ a$
    IF a$ = "EOD" THEN EXIT DO
    IF LEN(a$) > longeststring% THEN longeststring% = LEN(a$)
LOOP
OPEN "tmp9.tmp" FOR RANDOM AS #1 LEN = longeststring% + 2
a$ = "Pete": PUT #1, 1, a$
a$ = TIME$: PUT #1, 2, a$
a$ = DATE$: PUT #1, 3, a$
startread% = 5
RESTORE radata
i = 0
DO
    i = i + 1
    READ a$
    IF a$ = "EOD" THEN EXIT DO
    PUT #1, startread% - 1 + i, a$
LOOP
GET #1, 1, header$
PRINT "Header 1: "; header$
GET #1, 2, header$
PRINT "Header 2: "; header$
GET #1, 3, header$
PRINT "Header 3: "; header$
PRINT
DO
    LINE INPUT "Input a number from 1 to 9: "; x$
    IF VAL(x$) >= 1 AND VAL(x$) <= 9 THEN
        GET #1, startread% - 1 + VAL(x$), a$
        PRINT a$, LEN(a$)
    ELSE
        PRINT "Redo from start..."
    END IF
LOOP
CLOSE #1
END
 
radata:
DATA first entry,second entry,third entry,fourth entry,fifth entry,sixth entry,seventh entry,eighth entry,ninth entry,EOD
 

Title: Re: find last record in a random file
Post by: SMcNeill on October 01, 2020, 01:47:28 am

Quote from: JohnUKresults on September 30, 2020, 07:30:13 pm

Am confused as to what benefits Binary has over Random where you have defined field sizes in records.

Is it quicker to access or sort, more efficient for memory etc.?

How backwards compatible will it be with QB45 programs (believe it or not I still have some laptops using XP 32 bit)

Would you be OK for me to post some code samples showing how I currently store data for my programs then you can advise how Binary might be better?

Cheers

From my testing in the past, BINARY access is usually a bit faster than RANDOM access, but it’s barely enough to worry about unless you’re just dealing with excessively large amounts of data.

The main reasons I’d advocate folks using BINARY over RANDOM are:

1) Ease of using headers, and for stacking data “chunks”, such as many image formats do. (PNG files are chunks of header, palette, pixel color data, all contained in a single file.)

2) Once one is familiar with BINARY access, the exact same principles apply to, and are used with, all the _MEM commands. Get used to working with one, and the other comes naturally. (Think of it as the progression from using LOCATE x, y: PRINT, to using _PRINTSTRING(x, y...).)

If I had existing programs working using RANDOM, I wouldn’t bother to swap them to BINARY, but for new programs I’d use BINARY instead. As for backwards compatibility, you don’t need to worry — both RANDOM and BINARY are fully usable in both QB45 and QB64.

If you want to post a few of your old programs, I’d be happy to show how simple the conversion process is, so you can compare the slight differences between the two methods. ;)

Title: Re: find last record in a random file
Post by: SMcNeill on October 01, 2020, 01:56:28 am

@Pete: The biggest problem with that method is you have to parse and convert all those strings into the proper data types. It *can* be done, but it’s more work than anyone should ever have to do, to avoid a much simpler method. ;)

Title: Re: find last record in a random file
Post by: JohnUKresults on October 01, 2020, 07:04:32 am

Hi

Thanks for the replies. I only work with text data, so images and such like I don't need to worry about. I'm compiling times for runners at races, so need to store certain brief details about them, pared down to the minimum I need for any event. Each race has its own set of files (there are 6 files, which combine into effectively a relational database), but I'll just touch on a couple here.

First, I suppose this is the equivalent of your header, I have a very small text file with data like this storing details of the event:
143,"Podium Sub-30.00 10k, Barrowford","19th September 2020","998",21
143 shows the number of entries, then it's the name and date of the race, 998 is the number of club names stored (a separate file to look up by reference) and 21 is the number of finish times recorded, so if there had been 40 finishers, 21 would be replaced by 40. I open this file briefly (sequential read, as always in the same order) after selecting the event, to read in those variables for use in the main program. They are updated during operation and then written back to that file when I have finished with the data, with any changes reflected.

Next is the main runner data, for example:
1 Callum Smith MS 606
2 Mohd Ahmed MS 452 29.114
3 Emile France MS 79 28.461
4 Graham White MS 79 29.437
All of these are fixed length fields:
Runner number (4 characters), name (20 characters), category (3 characters), club number (3 characters), time (8 characters), position (4 characters), and a free text field (5 characters) to set any markers relating to sub-events or simply to flag up certain things like local runners.

Time remains blank if no time recorded
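
As a sketch only, those field widths could be written as a TYPE like the one below. The type name, field names, and the scratch file runners.tmp are assumptions, not taken from the actual program. The same 47-byte record is written by RANDOM record number and read back by BINARY byte position, so the file layout is identical either way.

```qb64
' Sketch only: names are made up; widths come from the field list above.
TYPE RunnerRec
    RunnerNum AS STRING * 4
    RunnerName AS STRING * 20
    Category AS STRING * 3
    ClubNum AS STRING * 3
    FinishTime AS STRING * 8
    Position AS STRING * 4
    Marker AS STRING * 5
END TYPE

DIM runner AS RunnerRec
DIM n AS LONG

PRINT "Bytes per record:"; LEN(runner) ' 4 + 20 + 3 + 3 + 8 + 4 + 5 = 47

OPEN "runners.tmp" FOR OUTPUT AS #1: CLOSE #1 ' start from an empty file
OPEN "runners.tmp" FOR RANDOM AS #1 LEN = LEN(runner)
runner.RunnerNum = "2": runner.RunnerName = "Mohd Ahmed": runner.Category = "MS"
runner.ClubNum = "452": runner.FinishTime = "29.114"
PUT #1, 2, runner ' RANDOM: record number 2
CLOSE #1

OPEN "runners.tmp" FOR BINARY AS #1
n = 2
GET #1, (n - 1) * LEN(runner) + 1, runner ' BINARY: byte position of record 2
PRINT RTRIM$(runner.RunnerName); " "; RTRIM$(runner.FinishTime)
CLOSE #1
END
```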

There are some other files with additional data (all effectively keyed on the race number) but if I can see how to read and write to the main data file then the same principles will apply to the other files.

I never have races over about 3000, so there are never massive amounts of data to be manipulating. I generally copy info from the random access files into arrays for sorting, and use temporary files as well, so apart from record sorting to store in numerical order during data entry, the files don't change much.

Sorting etc. only takes fractions of a second, so this is probably a bit of a theoretical exercise for me, rather than something I might actually change over to, but it would be interesting to see how it might work in practice.

Am I correct in assuming that there wouldn't be much change in the actual file structure for the main, as each record would need to remain a fixed length so as to access it accurately, or would binary files enable me to have flexibility in field length?

Thanks again for your input.

Title: Re: find last record in a random file
Post by: Petr on October 01, 2020, 08:06:06 am

Hi. You can write variable-length text to the file. It can then be read back, of course. It's all about organizing the data in the file. For this purpose, I personally prefer to use files in binary mode. The following program shows how to do this. Basically, the first record is 4 bytes long and tells you how many records you have in the file. It is followed by two arrays whose size is 4 * number of records (because they are of type LONG). These hold the number of bytes each text record occupies. The first array is for the names and the second for the texts.

Of course, with this solution, if you want to jump to record number 5 immediately, you would first have to add up the lengths of the previous 4 strings, so it would be appropriate to add another field to the record in the file that indicates the start byte position of the record in the file. (This can be obtained with the SEEK function.)

Code: QB64: [Select]

'example of how to write variable-length strings to a file

file$ = "example.dat"
'the file starts with a header: the first 4 bytes = total records in file,
'followed by two arrays of LONG string lengths (names first, then texts).

ff = FREEFILE
OPEN file$ FOR BINARY AS ff
GET ff, , TotalRecords& '       read how many records the file contains
IF TotalRecords& THEN
    PRINT "Total records in file:"; TotalRecords&
    DIM LenghtsName(TotalRecords&) AS LONG
    DIM LenghtText(TotalRecords&) AS LONG
    GET ff, , LenghtsName()
    GET ff, , LenghtText()
 
 
    FOR viewit = 1 TO TotalRecords&
        nam$ = SPACE$(LenghtsName(viewit))
        text$ = SPACE$(LenghtText(viewit))
        GET ff, , nam$
        GET ff, , text$
 
 
        PRINT "Record: "; viewit
        PRINT "Name: "; nam$
        PRINT "Text: "; text$
    NEXT
    END
ELSE
 
    n$ = " "
    DO UNTIL n$ = ""
        INPUT "Insert name: (empty string for writing file and exit)"; n$
        INPUT "Insert text:"; t$
 
        IF LEN(n$) THEN
            record& = record& + 1
 
            REDIM _PRESERVE TextLenghtA(record&) AS LONG
            REDIM _PRESERVE TextLenghtB(record&) AS LONG
            REDIM _PRESERVE names(record&) AS STRING
            REDIM _PRESERVE textes(record&) AS STRING
 
            TextLenghtA(record&) = LEN(n$)
            TextLenghtB(record&) = LEN(t$)
            names(record&) = n$
            textes(record&) = t$
        END IF
    LOOP
END IF
 
writeit:
'first record is the total number of records in the file:   (& = LONG type, 4-byte value)
PUT ff, 1, record&
'next come the string lengths for name$:    (all lengths for all name$)
PUT ff, , TextLenghtA()
'next come the string lengths for text$:    (all lengths for all text$)
PUT ff, , TextLenghtB()
'next records are: Name$ and then Text$ (element 0 is unused, so start at 1)
FOR R = 1 TO record&
    PUT ff, , names(R)
    PUT ff, , textes(R)
NEXT
 
PRINT "File is saved. Run program again for view file content."
CLOSE ff
END
 

Title: Re: find last record in a random file
Post by: JohnUKresults on October 01, 2020, 01:26:57 pm

ok thanks Petr. Looks like I'd have my work cut out and have to think a bit differently!

I don't think I'll be able to change what I already have as the program is about 8000 lines long, and there's a lot of file access throughout it, but it's certainly worth bearing in mind for some later stuff.

Cheers

Title: Re: find last record in a random file
Post by: badger on October 01, 2020, 01:55:03 pm

hello

Will the following check for a dup value in a record?
Anything that starts with a z is integer64, l is long, i is integer, and s is string. sr is my general input variable.
TYPE customer
id AS STRING * 10
sfirst_name AS STRING * 25
slast_name AS STRING * 25
saddress AS STRING * 30
scity AS STRING * 25
sstate AS STRING * 2
szip AS STRING * 5
sphone AS STRING * 10
sfiller AS STRING * 130
END TYPE
DIM custinforec AS customer
lcustinforeclen = LEN(custinforec)

FOR ux = 1 TO LOF(1) \ lcustinforeclen
GET #1, (ux - 1) * lcustinforeclen + 1, custinforec ' BINARY byte position of record ux
IF custinforec.id = sr THEN iflag2 = 1
NEXT ux

Title: Re: find last record in a random file
Post by: badger on October 01, 2020, 02:07:14 pm

hello

Will this check for a dup in a file?

z is integer64, s is string, l is long, and custinforec is of type customer.

Code: QB64: [Select]

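dupidsearch:
zcustinforecnum = 0 ' counts records from 0
WHILE zcustinforecnum < LOF(1) \ lcustinforeclen
        GET #1, zcustinforecnum * lcustinforeclen + 1, custinforec ' BINARY byte position
        IF custinforec.id = sr THEN BEEP
        zcustinforecnum = zcustinforecnum + 1
WEND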

Title: Re: find last record in a random file
Post by: bplus on October 01, 2020, 02:16:32 pm

@badger

Before you GET #, you have to specify whether you are opening for Random Access or Binary. Random is looking for a record position and Binary is looking for a byte position.

Title: Re: find last record in a random file
Post by: badger on October 01, 2020, 02:33:36 pm

Hello

sorry for the double post

The binary files are opened at the start of the program, then closed at the program's close.
I just need to know if my logic here is sound. Instead of BEEP, that will be handled elsewhere in the program.

Badger

Title: Re: find last record in a random file
Post by: Pete on October 01, 2020, 03:09:42 pm

Phil, here's a fun one you might want to try. It finds if any of the records are duplicated, by loading the entire binary file in one chunk, and parsing it. Thanks to the Amazing Steve for this one chunk file load approach, and thanks to the more amazing me for finding cool ways to use it. :D

Code: QB64: [Select]

OPEN "mytemp.tmp" FOR OUTPUT AS #1: CLOSE #1
CONST fieldsize = 20
DIM a AS STRING * FIELDSIZE
OPEN "mytemp.tmp" FOR BINARY AS #1
DO
    READ a
    IF MID$(a, 1, 3) = "EOD" THEN EXIT DO
    PUT #1, , a
LOOP
CLOSE #1
 
OPEN "mytemp.tmp" FOR BINARY AS #1
x$ = SPACE$(LOF(1)) ' Set x$ as spaces equal to the length of the entire binary file.
GET #1, , x$
PRINT x$
PRINT: PRINT
' Now parse for duplicates
i = 1: j = 0
DO UNTIL j * fieldsize >= LEN(x$)
    j = j + 1
    find$ = MID$(x$, i, fieldsize)
    IF INSTR(i + fieldsize, x$, find$) THEN PRINT j; find$ + "This entry is duplicated in the records."
    i = i + fieldsize
LOOP
END
DATA elephant,dog,cat,rabbit,frog,horse,dog,mouse,elephant,pig,EOD

Pete

PS: If you don't intend on using fixed field lengths, you can do this, but you have to put some kind of divider like | between each file record, so you don't get a solid string like dogcatrabbit, etc. With a divider, you would get dog|cat|rabbit| See what I mean? Those dividers would have to be part of the database construction.

Title: Re: find last record in a random file
Post by: badger on October 01, 2020, 03:49:39 pm

Pete

What do you mean by fixed field lengths? LOL, I have gotten myself cornfused LOL.... Here is my type def:

Code: QB64: [Select]

TYPE customer
        id AS STRING * 10
        sfirst_name AS STRING * 25
        slast_name AS STRING * 25
        saddress AS STRING * 30
        scity AS STRING * 25
        sstate AS STRING * 2
        szip AS STRING * 5
        sphone AS STRING * 10
        sfiller AS STRING * 124
END TYPE

This is what I assume fixed field lengths are. I like to use 256-byte records, so I add a filler string to the end for future upgrades without having to change the file data.

Badger

Title: Re: find last record in a random file
Post by: Pete on October 01, 2020, 08:24:47 pm

By fixed length, I mean using DIM, as in DIM a AS STRING * 10

So what does that mean? Dog is only three characters, right? Well when you make a = "Dog" the word dog becomes a 10 character length string. Three alphabetical characters, followed by 7 spaces, CHR$(32).

Code: QB64: [Select]

DIM a AS STRING * 10
a = "Dog"
PRINT "a = ["; a; "]";
FOR i = 1 TO 10
    PRINT ASC(MID$(a, i, 1));
NEXT
PRINT: PRINT
OPEN "mytemp2.txt" FOR RANDOM AS #1 LEN = 10
PUT #1, 1, a
a = ""
PRINT "a = ["; a; "]"
PRINT
GET #1, 1, a
PRINT "a = ["; a; "]";
FOR i = 1 TO 10
    PRINT ASC(MID$(a, i, 1));
NEXT

Now if you wanted the record coming out of the file to be without those added spaces, you would need to convert it to: a = RTRIM$(a)

To me, the most difficult thing about fixed length RA files is setting the length, if you are not sure of the largest length record you may be placing into the database. This generally leads to a bloated RA file format, like those animal names being placed in a LEN = 80 RA file. I mean how many animals have a name over 80 characters long, right?

Pete

Title: Re: find last record in a random file
Post by: bplus on October 01, 2020, 10:27:32 pm

I think if I was going to do a RA data base file today, I wouldn't, especially if it were to need lots of strings.

Pete's last reply reminded me of all the wasted space involved with RA.

Nope, it would be Binary only to load and store. Use a record divider with one character, why not CHR$(10), and another character to divide the fields, maybe "|". Open for Binary and load it as a giant string. Split the string to records, then split the records as needed to fields. Then reverse that process to write it back to the file again after adding and editing all the data. I might not even need a Type definition, just a double array with field names, with values from each record going in and out. Probably could sort by array index number.

I wonder what I am forgetting?
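
A minimal sketch of that idea, with CHR$(10) ending each record and "|" between fields. The sample data and the file name delimdemo.txt are made up, and the splitting is done with plain INSTR loops rather than any library routine.

```qb64
' Sketch: a delimited text database loaded as one string and split by hand.
' "delimdemo.txt" and the sample rows are hypothetical.
DIM whole AS STRING, recText AS STRING
DIM p AS LONG, q AS LONG, f AS LONG, recNum AS LONG

OPEN "delimdemo.txt" FOR OUTPUT AS #1
PRINT #1, "dog|4 legs|barks"; CHR$(10);
PRINT #1, "emu|2 legs|booms"; CHR$(10);
CLOSE #1

OPEN "delimdemo.txt" FOR BINARY AS #1
whole = SPACE$(LOF(1)) ' size the string to hold the entire file
GET #1, , whole
CLOSE #1

p = 1
DO WHILE p <= LEN(whole)
    q = INSTR(p, whole, CHR$(10)) ' end of the current record
    IF q = 0 THEN q = LEN(whole) + 1 ' last record without a trailing divider
    recText = MID$(whole, p, q - p)
    p = q + 1
    recNum = recNum + 1
    PRINT "Record"; recNum; ":";
    DO ' split the record into fields
        f = INSTR(recText, "|")
        IF f = 0 THEN PRINT " ["; recText; "]": EXIT DO
        PRINT " ["; LEFT$(recText, f - 1); "]";
        recText = MID$(recText, f + 1)
    LOOP
LOOP
END
```

The resulting file stays editable in a text editor, at the cost of rewriting the whole file after any change.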

Title: Re: find last record in a random file
Post by: SMcNeill on October 02, 2020, 01:11:30 am

If I need variable length strings in a database, I always write TWO databases. The first is my data, the 2nd is my index to the data.

For example, my data might be: dog, cheese, elephant, basketball

My first database would be binary data of :dogcheeseelephantbasketball

My second database would be the index of 3, 9, 17, 27

When I need any record, all I need is to check 2 indexes. For example, to find the 3rd record in my database, I look at index 2 and 3. Index 2 ends at byte 9, so I start reading the main database at byte 10, and I finish reading the database at Index 3, or at byte 17.

From byte 10 to 17 is “elephant”.
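
A small sketch of that two-file layout using the four words above. It assumes each index entry is a 4-byte LONG holding the end byte of a record; the file names words.dat and words.idx are invented for illustration.

```qb64
' Sketch: a data file of packed strings plus an index file of end positions.
' "words.dat" and "words.idx" are hypothetical names.
DIM word AS STRING
DIM endByte AS LONG, prevEnd AS LONG
DIM want AS LONG, i AS LONG

OPEN "words.dat" FOR OUTPUT AS #1: CLOSE #1 ' start from empty files
OPEN "words.idx" FOR OUTPUT AS #2: CLOSE #2
OPEN "words.dat" FOR BINARY AS #1
OPEN "words.idx" FOR BINARY AS #2

FOR i = 1 TO 4
    READ word
    PUT #1, , word ' data file: dogcheeseelephantbasketball
    endByte = endByte + LEN(word)
    PUT #2, , endByte ' index file: 3, 9, 17, 27
NEXT

want = 3
prevEnd = 0 ' record 1 has no previous entry, so it starts at byte 1
IF want > 1 THEN GET #2, (want - 2) * 4 + 1, prevEnd
GET #2, (want - 1) * 4 + 1, endByte
word = SPACE$(endByte - prevEnd) ' size the string to the record length
GET #1, prevEnd + 1, word
PRINT "Record"; want; "spans bytes"; prevEnd + 1; "to"; endByte; ": "; word
CLOSE #1, #2
END

DATA dog,cheese,elephant,basketball
```

With want = 3 this reads index entries 2 and 3 (9 and 17) and then fetches bytes 10 to 17 of the data file.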

Hopefully, I should be home later today, so if you guys want a demo, I can work up a simple example later this afternoon, if anyone would find it useful.

Title: Re: find last record in a random file
Post by: TempodiBasic on October 02, 2020, 04:45:09 am

Steve's technique makes me think about the three database with data plus a pointer to the next data, or with 2 pointers: 1 for the previous data and 1 for the next data...
Yes, going deeper into the DB structure! I love it.

Title: Re: find last record in a random file
Post by: bplus on October 02, 2020, 12:37:34 pm

Well I always liked data files I could edit in txt editor.

Steve, you change one thing in that index thing and you have to re-adjust everything.

Title: Re: find last record in a random file
Post by: SMcNeill on October 02, 2020, 12:58:04 pm

Take a look at the little sample I posted a bit ago. You can alter/change your data file, with very little editing of the main data. The index is just pointers to your data, so if your data gets longer, you just move it to the end of the existing data and simply point the position to where it’s at.
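
One way the "move it to the end and re-point" idea might look is sketched below. This is an illustration under stated assumptions: each index entry stores a start byte and a length (a different index shape from the end-byte list earlier in the thread), the index is kept in an array instead of a second file, and growdemo.dat is a made-up name.

```qb64
' Sketch: when a record grows, append the new text and update only its index entry.
' IndexEntry and "growdemo.dat" are hypothetical.
TYPE IndexEntry
    StartByte AS LONG
    ByteLen AS LONG
END TYPE

DIM idx(1 TO 3) AS IndexEntry
DIM word AS STRING
DIM i AS LONG, nextFree AS LONG

OPEN "growdemo.dat" FOR OUTPUT AS #1: CLOSE #1 ' start from an empty file
OPEN "growdemo.dat" FOR BINARY AS #1
nextFree = 1 ' next unused byte position in the data file
FOR i = 1 TO 3
    READ word
    idx(i).StartByte = nextFree
    idx(i).ByteLen = LEN(word)
    PUT #1, nextFree, word
    nextFree = nextFree + LEN(word)
NEXT

' Record 2 grows from "cat" to a longer word: append it and re-point the index.
word = "caterpillar"
idx(2).StartByte = nextFree
idx(2).ByteLen = LEN(word)
PUT #1, nextFree, word
nextFree = nextFree + LEN(word)

FOR i = 1 TO 3
    word = SPACE$(idx(i).ByteLen)
    GET #1, idx(i).StartByte, word
    PRINT i; word
NEXT
CLOSE #1
END

DATA dog,cat,emu
```

The old "cat" bytes stay in the file as unused space; the other records and their index entries are untouched.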

QB64.org Forum
