Author Topic: Is there someway to speed up reading a text file  (Read 5565 times)


Offline EricE

  • Forum Regular
  • Posts: 114
Re: Is there someway to speed up reading a text file
« Reply #15 on: February 26, 2020, 04:14:07 am »
Hi MLambert,

As for the internal implementation of QB64's read buffer, I will leave that to the QB64 developers.

A very large file does not need to be completely loaded into memory, only chunks of it need be read into a buffer for processing.
The application program would implement its own read buffer.
3,000,000 records of 600 characters each should be no problem for a QB64 program to handle.
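
For example, a minimal sketch of such an application-level read buffer might look like this (the file name and the chunk size are just placeholders to illustrate the idea):

' Read the file in fixed-size chunks instead of loading it all at once.
OPEN "bigfile.txt" FOR BINARY AS #1
DO WHILE SEEK(1) <= LOF(1)
    chunk$ = SPACE$(1000000)                            ' 1 MB working buffer
    IF LOF(1) - SEEK(1) + 1 < LEN(chunk$) THEN chunk$ = SPACE$(LOF(1) - SEEK(1) + 1)
    GET #1, , chunk$                                    ' pull the next chunk from disk
    ' ... process chunk$ here ...
LOOP
CLOSE #1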

The actual processing would depend on the format of the data to be read.
That is, on the A$ and B$ variables in your "INPUT #1, A$, B$, " statement.

In my opinion, a solution might be very close here if we had a little more information about the data to be processed.

But it would be good to get input from the QB64 developers as well.
 
« Last Edit: February 26, 2020, 04:18:27 am by EricE »

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • Steve’s QB64 Archive Forum
Re: Is there someway to speed up reading a text file
« Reply #16 on: February 26, 2020, 09:43:20 am »
Since these are variable-length fields, with a varying number of inputs per record and an unwieldy total size, here’s what I’d do:

1) Open the file for binary and read what I consider a reasonable buffer. 100,000,000 bytes is a good size if you’re dealing with GB totals...

2) Use _INSTRREV to find the last CRLF in that 100 MB buffer. That’s the point where you stop processing on this pass and start buffering from on the next pass, so you don’t split a record in two.

3) Parse the data, using commas and CRLF characters as the delimiters.

4) Once parsing up to that last CRLF is done, repeat the process from the position you found in step 2, until the whole multi-GB file is handled (a rough sketch of the loop is below).
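
In QB64, that loop might look something like the sketch below. The file name and the ParseChunk SUB are placeholders (ParseChunk is where your comma/CRLF splitting from step 3 would go), and the chunk size is whatever your machine is comfortable with:

' Chunked binary read that breaks on the last CRLF so no record gets cut in half.
CONST ChunkSize = 100000000                            ' ~100 MB per pass
DIM buffer AS STRING, carry AS STRING
DIM lastCRLF AS LONG, bytesLeft AS _INTEGER64

OPEN "bigdata.txt" FOR BINARY AS #1                    ' step 1: open for binary
DO
    bytesLeft = LOF(1) - SEEK(1) + 1
    IF bytesLeft <= 0 THEN EXIT DO
    IF bytesLeft < ChunkSize THEN buffer = SPACE$(bytesLeft) ELSE buffer = SPACE$(ChunkSize)
    GET #1, , buffer                                   ' read the next chunk
    buffer = carry + buffer                            ' prepend the partial record left over from the last pass
    lastCRLF = _INSTRREV(buffer, CHR$(13) + CHR$(10))  ' step 2: last CRLF in the buffer
    IF lastCRLF = 0 THEN lastCRLF = LEN(buffer)        ' no CRLF found: hand over the whole chunk
    carry = MID$(buffer, lastCRLF + 2)                 ' step 4: everything after the last CRLF waits for the next pass
    ParseChunk LEFT$(buffer, lastCRLF + 1)             ' step 3: parse the complete records
LOOP
IF LEN(carry) THEN ParseChunk carry                    ' final record if the file does not end with CRLF
CLOSE #1
END

SUB ParseChunk (chunk AS STRING)
    ' Placeholder: split chunk on CRLF, then split each record on commas.
END SUB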



INPUT # reads sequentially from the disk a single byte at a time.  It’s SLOOOOOOOOOW.  Binary files read at the size of your disk clusters/sectors — usually 4096+ bytes per pass nowadays, so read times are a FRACTION of using INPUT #.  Parsing from memory is much faster than parsing from disk, and will do the job in a fraction of the time.

Personally, I don’t see why you can’t read the whole file in one go and then process it.  3,000,000 records of 600 characters is 1.8 GB, and most machines can handle that readily.  I’ve got a few applications which use 22GB of RAM to load extensive datasets all at once into memory, and I’ve never had an issue with them on 32GB of total system RAM...  As long as your PC has enough memory, I don’t see why you’d have any problem just loading in one go and then parsing.
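
If it does fit, the one-shot load is only a few lines (the file name is a placeholder):

OPEN "bigdata.txt" FOR BINARY AS #1
whole$ = SPACE$(LOF(1))      ' one buffer the size of the entire file
GET #1, 1, whole$            ' read it all in a single pass
CLOSE #1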
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline MLambert

  • Forum Regular
  • Posts: 115
Re: Is there someway to speed up reading a text file
« Reply #17 on: February 28, 2020, 04:48:42 am »
I don't read the whole file into memory because one day it will blow the limits...

The rest of the advice sounds great and I will test the binary read.

Thanks again everyone.

Mike