I made some adjustments there. I think a While loop is as fast as a Do loop, and both are faster than For, so I left the loops alone. The known bottleneck is Point, so I replaced it with MemGet, and I used MemPut instead of PSet. Next, before analyzing the image, I copy it into an array of a user-defined RGB type, one index = one pixel, so that RED32, GREEN32 and BLUE32 no longer have to be called. The fill color components are now global variables, NRed, NGreen and NBlue (all of type _unsigned _byte).
I've also found that adding DefLng A-Z at the beginning of both my source code and yours speeds things up further in both cases.
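
For anyone following along, here is a rough sketch of the idea, not the actual program from this thread. The type name RGBPixel, the array px, the 64x64 test image and the sample pixel position are made up for illustration. It assumes a 32-bit image, where each pixel takes 4 bytes in memory in the order blue, green, red, alpha, and it uses the underscore-prefixed keyword spellings.

```
DEFLNG A-Z ' untyped variables become LONG instead of SINGLE

TYPE RGBPixel ' hypothetical name; field order matches 32-bit image memory
    B AS _UNSIGNED _BYTE
    G AS _UNSIGNED _BYTE
    R AS _UNSIGNED _BYTE
    A AS _UNSIGNED _BYTE
END TYPE

' Fill color components kept global, as described in the post
DIM SHARED NRed AS _UNSIGNED _BYTE, NGreen AS _UNSIGNED _BYTE, NBlue AS _UNSIGNED _BYTE
NRed = 255: NGreen = 128: NBlue = 0

img = _NEWIMAGE(64, 64, 32) ' stand-in for the real picture (assumption)
_DEST img
CLS , _RGB32(10, 20, 30)
_DEST 0

w = _WIDTH(img): h = _HEIGHT(img)
DIM m AS _MEM
m = _MEMIMAGE(img)

' Copy the image once into an array: one index = one pixel
REDIM px(w * h - 1) AS RGBPixel
i = 0
DO WHILE i < w * h
    _MEMGET m, m.OFFSET + i * 4, px(i) ' replaces POINT
    i = i + 1
LOOP

' Components are now plain field reads, no _RED32/_GREEN32/_BLUE32 calls
PRINT px(0).R; px(0).G; px(0).B

' Write one pixel back with _MEMPUT instead of PSET
DIM c AS _UNSIGNED LONG
x = 5: y = 7 ' sample position (assumption)
c = _RGB32(NRed, NGreen, NBlue)
_MEMPUT m, m.OFFSET + (y * w + x) * 4, c

_MEMFREE m
```

The pixel at (x, y) sits at byte offset (y * w + x) * 4 from the start of the image memory, so the same index arithmetic works for both reading and writing.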