My first thought on this is - what's the end result supposed to look like? That totally influences the beginning...
... because (insert spiel about doing things in proper 3D here) ... but forgetting that for a second ...
It seems that getting the pseudo-height correct isn't going to be the main problem; plotting is. If you have monsters, projectiles, speech bubbles, or whatever else, the question is how to draw them in the proper layers so things don't overlap incorrectly. The quick solution I see is to stack your data layers together like a cake, and then slice the scene not horizontally, but in the so-called z-direction (into the screen), from background to foreground. That way, you can just do the painter's algorithm, which is especially easy because of the locked perspective.
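To make the back-to-front idea concrete, here's a rough sketch of the painter's algorithm on a tiny character grid. Everything in it is made up for illustration: the `Sprite` class, its `depth` field (I'm assuming larger means further into the screen), and the `paint` function are not from any particular engine, so treat it as a starting point rather than the way to do it.

```python
from dataclasses import dataclass


@dataclass
class Sprite:
    """Hypothetical drawable: an axis-aligned rectangle filled with one glyph."""
    name: str
    x: int       # left column on screen
    y: int       # top row on screen
    depth: int   # assumption: larger value = further into the screen
    glyph: str   # single character used to "paint" the sprite
    w: int = 1   # width in cells
    h: int = 1   # height in cells


def paint(sprites, width, height, blank="."):
    """Draw sprites back to front so nearer ones overwrite further ones."""
    canvas = [[blank] * width for _ in range(height)]
    # Furthest first. sorted() is stable, so sprites at equal depth
    # keep the order they were given in.
    for s in sorted(sprites, key=lambda s: s.depth, reverse=True):
        for row in range(max(s.y, 0), min(s.y + s.h, height)):
            for col in range(max(s.x, 0), min(s.x + s.w, width)):
                canvas[row][col] = s.glyph
    return ["".join(r) for r in canvas]


# Deliberately listed out of order; the sort decides what ends up on top.
scene = [
    Sprite("speech bubble", x=2, y=0, depth=0, glyph="B"),
    Sprite("terrain", x=0, y=0, depth=2, glyph="T", w=4, h=2),
    Sprite("monster", x=1, y=0, depth=1, glyph="M", w=2),
]
for line in paint(scene, width=4, height=2):
    print(line)
```

Because the perspective is locked, a single depth number per object is enough to sort on; with a free camera you'd have to recompute the ordering every time the view changes.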