I have some very old, legacy Fortran code that I am trying to speed up. A major performance issue is that it uses many very large 3-D arrays stored as (Y,Z,X), and individual (Z,X) layers have to be communicated over a network.
This requires me to traverse the arrays and buffer each layer before sending: Fortran uses column-major order, so Y varies fastest in memory and the elements of a single (Z,X) layer are strided rather than contiguous. Is there an easy way, such as a compiler flag or a refactoring tool, to make my (Z,X) layers sit in contiguous memory, or to switch Fortran to row-major array order?
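
For concreteness, here is a minimal sketch of the kind of buffering I mean. The names `a`, `buf`, and `j` and the array sizes are made up for illustration and are not taken from the actual code, which may pack with explicit loops rather than an array section.

```fortran
program pack_layer
  implicit none
  ! Hypothetical sizes, chosen only for illustration
  integer, parameter :: ny = 4, nz = 3, nx = 5
  real :: a(ny, nz, nx)   ! stored as (Y,Z,X); Y varies fastest in memory
  real :: buf(nz, nx)     ! contiguous buffer holding one (Z,X) layer
  integer :: iy, iz, ix, j

  ! Fill the array with recognizable values
  do ix = 1, nx
    do iz = 1, nz
      do iy = 1, ny
        a(iy, iz, ix) = real(100*iy + 10*iz + ix)
      end do
    end do
  end do

  ! Pick one layer at fixed Y. Its elements are ny reals apart in memory,
  ! so they are gathered into a contiguous buffer before sending.
  j = 2
  buf = a(j, :, :)

  ! buf is now contiguous and is what would be handed to the network send
  print *, buf(1, 1), buf(nz, nx)
end program pack_layer
```

This copy, repeated for every layer of every array, is the overhead I would like to avoid.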