Improve speed when reading netCDF table data (no grids)
Target version: Candidate for next minor release
Because GMT (a) processes table data record-by-record and (b) can do this across ASCII, native binary, and netCDF tables, it is considerably slower at reading netCDF tables, since only a single row is accessed at a time. A rewrite would take advantage of the new GMT->hidden.mem_coord[col] arrays and read in entire columns (or as many rows as the preallocated memory can hold, then read the next section of rows when required). The rec-by-rec machinery can then "read" from these in-memory columns, and the initial read will be much faster since entire columns are processed in one access.
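The column-buffering idea above can be sketched as follows. All names here (TableCache, load_chunk, get_record) and the synthetic data source are illustrative stand-ins, not GMT's actual API; in a real implementation the bulk fill would come from one netCDF hyperslab call (e.g. nc_get_vara_double) per column.

```c
/* Sketch of column-wise buffering for netCDF tables: read whole
 * column sections in one access, then serve the usual rec-by-rec
 * interface from memory.  Hypothetical names, not GMT's real API. */
#include <stddef.h>
#include <stdlib.h>

#define N_COLS 2
#define CHUNK_ROWS 4            /* rows pre-read per bulk access */

typedef struct {
    double *mem_coord[N_COLS];  /* one preallocated buffer per column */
    size_t chunk_start;         /* first file row held in the buffers */
    size_t chunk_len;           /* valid rows currently buffered */
    size_t n_rows;              /* total rows in the "file" */
} TableCache;

/* Stand-in for a bulk netCDF call (e.g. one nc_get_vara_double per
 * column): fill an entire column section in a single access. */
static void load_chunk(TableCache *c, size_t start) {
    c->chunk_start = start;
    c->chunk_len = (start + CHUNK_ROWS <= c->n_rows) ? CHUNK_ROWS : c->n_rows - start;
    for (size_t col = 0; col < N_COLS; col++)
        for (size_t i = 0; i < c->chunk_len; i++)
            c->mem_coord[col][i] = (double)(100 * col + start + i); /* synthetic data */
}

/* Record-by-record "read", served from the column buffers; only a
 * chunk-boundary crossing touches the file again. */
static void get_record(TableCache *c, size_t row, double out[N_COLS]) {
    if (row < c->chunk_start || row >= c->chunk_start + c->chunk_len)
        load_chunk(c, row - row % CHUNK_ROWS);   /* next section of rows */
    for (size_t col = 0; col < N_COLS; col++)
        out[col] = c->mem_coord[col][row - c->chunk_start];
}
```

Reading CHUNK_ROWS rows per column in one access amortizes the per-call netCDF overhead that makes single-row access slow, while the rec-by-rec interface stays unchanged for callers.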
#2 Updated by Remko over 2 years ago
- Status changed from Resolved to Closed
> (or as many rows as the preallocated memory can hold, then read the next section of rows when required).
GMT_prep_tmp_arrays will always allocate at least as many rows as the major dimension of the netCDF file that is to be read.
To be blunt, GMT_prep_tmp_arrays should be given a limit on what it can allocate in the line:
while (row >= GMT->hidden.mem_rows) GMT->hidden.mem_rows <<= 1; /* Double up until enough */
Otherwise, mem_rows can overflow or grow to hideous proportions.
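A capped version of that doubling loop might look like the sketch below. The cap name MEM_ROW_LIMIT and the free-standing function are illustrative assumptions, not actual GMT constants or code; when a requested row exceeds the cap, the caller would fall back to reading the file in sections rather than growing the buffers further.

```c
/* Sketch of a capped doubling policy for mem_rows.  MEM_ROW_LIMIT is
 * a hypothetical ceiling, not a real GMT constant. */
#include <stddef.h>

#define MEM_ROW_LIMIT ((size_t)1 << 26)  /* example ceiling: 64M rows */

/* Double mem_rows until it covers `row`, but never past the cap, so
 * the left shift cannot overflow or grow to hideous proportions. */
static size_t grow_mem_rows(size_t row, size_t mem_rows) {
    while (row >= mem_rows && mem_rows < MEM_ROW_LIMIT)
        mem_rows <<= 1;
    return (mem_rows > MEM_ROW_LIMIT) ? MEM_ROW_LIMIT : mem_rows;
}
```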
I tested the current implementation on a reasonably sized file and was pleased with the increased performance. Hence I close this issue.