Topic: windows data packing

Offline SMcNeill

  • QB64 Developer
  • Forum Resident
  • Posts: 3972
    • Steve’s QB64 Archive Forum
windows data packing
« on: February 12, 2021, 01:39:07 am »
For a long time now, there's been some difficulty in using Windows API functions in QB64x64.  The reason for this is the way 64-bit structures are aligned: 8-byte members (pointers, handles, _INTEGER64 values) have to start on 8-byte boundaries, whereas most structures in QB64x32 only need 4-byte alignment.  After working with it quite a bit, I think I've finally sorted out when and where to put padding into our data types.

TYPE foo
    x as INTEGER
    y as INTEGER
    z as _INTEGER64
END TYPE

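Now, the above might be a simple data structure for a Windows API call, and yet, it won't work for us in QB64 as written.  WHY?  Because QB64 stores the members of a TYPE back to back with no gaps, while the C compiler on the Windows side puts every member on a multiple of its own size, so an 8-byte item has to start on an 8-byte boundary (that's what makes loading, unloading, and storage into internal registers easy).  What we need to do is start at the top of our type and count bytes: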

Integer = 2
Integer = 2 (Total 4)
Integer64 = 8 (Total 12)

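Now, whenever the next item is an 8-byte one, the running total in front of it has to be a multiple of 8.  If it isn't, we need padding in front of that item to bring the total up to the next multiple of 8.  In this case, the two INTEGERs leave us at 4, so 8 - 4 = 4, leading to us needing 4 bytes of padding between the last two items: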

TYPE foo
    x as INTEGER
    y as INTEGER
    Padding1 AS STRING * 4 '4 bytes of padding to align to 8-byte rules.
    z as _INTEGER64
END TYPE

Now, if we count our bytes, we have a perfect 8-byte alignment from start to finish.

Integer = 2
Integer = 2 (4 total)
String padding = 4 (8 total; here we now need 8 - 8, or 0 more bytes padding.  Reset counter and continue on...)
Integer64 = 8 (8 total.  No padding needed.  Reset counter and continue on...)
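The counting above boils down to one small calculation.  Here's a minimal sketch of it in C, not anything from QB64 itself; padding_before is a made-up helper name, and it assumes you pass in the running byte total and the alignment the next item needs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of padding bytes needed so that the next
   member starts on a multiple of `align`.
   `offset` is the running byte total so far; `align` must be nonzero. */
size_t padding_before(size_t offset, size_t align)
{
    assert(align != 0);
    size_t remainder = offset % align;
    return remainder == 0 ? 0 : align - remainder;
}
```

With the numbers from the example, padding_before(4, 8) comes out as 4 (that's Padding1), and padding_before(8, 8) comes out as 0.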

And that's basically how 64-bit Windows API data structures are packed and transferred for us.  Once you sort out the trick to it, they're not actually that hard to figure out and manually account for, and the above is your basic method for figuring out where, and how much, padding you need for those 64-bit data structures.
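If you have a C compiler handy, you can check your counting against what it actually does.  This is only a sketch: struct foo_c and the two function names are invented here, and it assumes int16_t and int64_t are fair stand-ins for INTEGER and _INTEGER64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C mirror of the first TYPE foo (the one without manual padding).
   int16_t stands in for INTEGER, int64_t for _INTEGER64. */
struct foo_c {
    int16_t x;
    int16_t y;
    int64_t z;   /* the compiler inserts any padding it needs before this */
};

/* Offset of z from the start of the struct, as the compiler laid it out. */
size_t foo_z_offset(void) { return offsetof(struct foo_c, z); }

/* Total size of the struct, including any padding. */
size_t foo_size(void) { return sizeof(struct foo_c); }
```

On a 64-bit build I'd expect foo_z_offset() to come back as 8 and foo_size() as 16, which matches the padded TYPE above.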



Now, with that said, even 32-bit data structures follow this same rule -- except they pack on 4-byte alignments.  Usually, it seems most Windows API structures are laid out to account for this naturally, but you should be aware of this fact just for the odd case where someone forgot to.

For example, if you look at the structure for STARTUPINFO, you'll see this:  https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/ns-processthreadsapi-startupinfoa

typedef struct _STARTUPINFOA {
  DWORD  cb;
  LPSTR  lpReserved;
  LPSTR  lpDesktop;
  LPSTR  lpTitle;
  DWORD  dwX;
  DWORD  dwY;
  DWORD  dwXSize;
  DWORD  dwYSize;
  DWORD  dwXCountChars;
  DWORD  dwYCountChars;
  DWORD  dwFillAttribute;
  DWORD  dwFlags;
  WORD   wShowWindow;
  WORD   cbReserved2;
  LPBYTE lpReserved2;
  HANDLE hStdInput;
  HANDLE hStdOutput;
  HANDLE hStdError;
} STARTUPINFOA, *LPSTARTUPINFOA;

Start at the top and count down, and you'll see that in 32-bit (where the pointers and handles are 4 bytes each) we always stay on a 4-byte alignment.  At least, we do, until we get down to wShowWindow, which is only 2 bytes -- but immediately following it is another 2 bytes called cbReserved2...  cbReserved2 must always be 0...  Whatever else Microsoft reserved it for, those two WORDs together fill exactly 4 bytes, so the pointer after them (lpReserved2) still lands on a 4-byte boundary!
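To see where the compiler really puts those fields, here's a rough stand-in for STARTUPINFOA that compiles without windows.h.  It's a sketch under assumptions: DWORD is taken as uint32_t, WORD as uint16_t, and LPSTR, LPBYTE and HANDLE as plain pointers; the struct and function names are made up for this post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for STARTUPINFOA using portable types (assumed equivalents). */
struct startupinfo_sketch {
    uint32_t cb;
    char    *lpReserved;
    char    *lpDesktop;
    char    *lpTitle;
    uint32_t dwX, dwY, dwXSize, dwYSize;
    uint32_t dwXCountChars, dwYCountChars;
    uint32_t dwFillAttribute, dwFlags;
    uint16_t wShowWindow;
    uint16_t cbReserved2;
    uint8_t *lpReserved2;
    void    *hStdInput, *hStdOutput, *hStdError;
};

/* Offsets and size as the compiler laid them out for the current target. */
size_t si_offset_wShowWindow(void) { return offsetof(struct startupinfo_sketch, wShowWindow); }
size_t si_offset_cbReserved2(void) { return offsetof(struct startupinfo_sketch, cbReserved2); }
size_t si_offset_lpReserved2(void) { return offsetof(struct startupinfo_sketch, lpReserved2); }
size_t si_size(void)               { return sizeof(struct startupinfo_sketch); }
```

With 4-byte pointers I'd expect a total of 68 bytes and no gaps anywhere.  With 8-byte pointers the compiler has to add 4 bytes after cb and another 4 after cbReserved2, for 104 bytes in all, and those two gaps are exactly what you'd have to pad by hand in a QB64x64 TYPE.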

Nearly all (I won't say all because as a programmer I've learned to always expect an edge case somewhere where someone forgot something) Windows API structures are created with this concept built-in.  Most of the time, you shouldn't ever have to add padding to 32-bit data structures, BUT you should at least keep in mind that it MAY be required.

TYPE foo
    x as INTEGER
    y AS LONG
    z AS INTEGER
END TYPE

To stay on those 4-byte alignment boundaries, the above would need to look like this for a 32-bit OS:

TYPE foo
    x as integer
    padding1 as string * 2
    y as long
    z as integer
    padding2 as string * 2
END TYPE
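The same kind of check works for this one.  Again just a sketch with invented names, assuming int16_t for INTEGER and int32_t for LONG:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* C mirror of the unpadded TYPE foo (INTEGER, LONG, INTEGER). */
struct foo2_c {
    int16_t x;
    int32_t y;
    int16_t z;
};

/* Offsets and total size as chosen by the compiler. */
size_t foo2_y_offset(void) { return offsetof(struct foo2_c, y); }
size_t foo2_z_offset(void) { return offsetof(struct foo2_c, z); }
size_t foo2_size(void)     { return sizeof(struct foo2_c); }
```

Offsets of 4 and 8 and a total size of 12 would line up with padding1 and padding2 in the TYPE above.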

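And, once alignment is taken into consideration, the same data structure looks no different for a 64-bit OS.  Nothing in it is wider than 4 bytes, so the 8-byte rule never comes into play (it would if one of the members were an _INTEGER64, or a pointer or handle, which are 8 bytes in 64-bit):

TYPE foo
    x as integer
    padding1 as string * 2
    y as long
    z as integer
    padding2 as string * 2
END TYPE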



Now, as far as I can tell, that's basically all there is to it.  If I'm wrong with something (which Lord knows, I'm NEVER wrong!), then kindly speak up and let us know -- I'd love to correct whatever misconception or flaw there might be in my thinking for this, for certain! 
https://github.com/SteveMcNeill/Steve64 — A github collection of all things Steve!

Offline RhoSigma

  • QB64 Developer
  • Forum Resident
  • Posts: 565
Re: windows data packing
« Reply #1 on: February 12, 2021, 02:35:34 am »
Hi Steve, good description,
want to point out that the very same rules must be applied when passing variable argument lists (the ... in C/C++) to stdlib functions, as I've done in my QB-StdLib wrappers made a couple of years ago. As those lists are not possible in QB64 directly, I use a variable-length string, add in the numbers and/or string pointers using the various MKx$ functions, and do the padding by adding further MKI$(0) and MKL$(0) as required for the 32/64-bit versions.
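One way to picture that kind of padded buffer, sketched in C.  This is not the QB-StdLib wrapper code; append_padded is a hypothetical helper, and the slot size (4 for a 32-bit build, 8 for a 64-bit one) is an assumption about how the arguments are laid out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy `len` bytes of `data` into `buf` at `pos`,
   then zero-fill up to the next multiple of `slot`.
   Returns the new write position. The caller must make sure `buf`
   is large enough for the data plus padding. */
size_t append_padded(unsigned char *buf, size_t pos,
                     const void *data, size_t len, size_t slot)
{
    assert(slot != 0);
    memcpy(buf + pos, data, len);
    pos += len;
    while (pos % slot != 0) {
        buf[pos++] = 0;   /* same job as appending MKI$(0) or MKL$(0) */
    }
    return pos;
}
```

Appending a 2-byte value with a slot of 8 writes the 2 bytes followed by 6 zero bytes, so the next value starts at position 8.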
My Projects:   https://qb64forum.alephc.xyz/index.php?topic=809
GuiTools - A graphic UI framework (can do multiple UI forms/windows in one program)
Libraries - ImageProcess, StringBuffers (virt. files), MD5/SHA2-Hash, LZW etc.
Bonus - Blankers, QB64/Notepad++ setup pack

Offline Dav

  • Forum Resident
  • Posts: 792
Re: windows data packing
« Reply #2 on: February 12, 2021, 07:43:06 am »
Very informative. I've been kind of unsure about all this. Thanks for the post.

- Dav