Feature #124

minmax -Af does not handle varying number of fields between each file

Added by Florian almost 6 years ago. Updated about 5 years ago.

Status: Feedback
Start date: 2012-07-31
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: -
Target version: -
Platform:

Description

Example:

echo 1 2 > foo2
echo 1 2 3 > foo3
minmax -Af foo3 foo2
foo3: N = 1    <1/1>    <2/2>    <3/3>
minmax (gmt_io.c:1373): Mismatch between actual (2) and expected (3) fields near line 3

History

#1 Updated by Paul almost 6 years ago

  • Status changed from New to Feedback

Well, since it may be asked to yield a total min/max answer we cannot have missing columns etc. It is a requirement that all files given on the command line have the same number of columns. This follows from the API that merges any number of files and sources into a virtual single dataset.

#2 Updated by Florian almost 6 years ago

> Well, since it may be asked to yield a total min/max answer we cannot have missing columns etc. It is a requirement that all files given on the command line have the same number of columns. This follows from the API that merges any number of files and sources into a virtual single dataset.

This is fine for -Aa and -As but -Af should report the range for each file individually. So in the case of -Af minmax must not merge the files but has to read each table individually.

#3 Updated by Paul almost 6 years ago

  • Status changed from Feedback to Resolved

It would seem this would add much complexity to the i/o. As it now stands we have adopted a model in which any number of input tables + stdin etc. are treated as a single virtual data set. To do this we had to impose the condition that the number of columns is constant across all files. This is reasonable or even expected for the vast majority of cases. Yes, this makes your particular example fail, but there are scripting workarounds for that case, which are considerably easier than redoing the i/o.
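One such scripting workaround could look like the sketch below, which calls minmax once per file so that each call only sees a constant number of columns. This is an illustration, not something posted in the thread: the function name per_file_range and the MINMAX variable are hypothetical, and it assumes that minmax run on a single file prints that file's range line.

```shell
#!/bin/sh
# Workaround sketch: run minmax separately on every table instead of passing
# all files in one call. MINMAX is a hypothetical override hook (not a GMT
# feature) so that a different program name can be substituted.
MINMAX=${MINMAX:-minmax}

# per_file_range FILE...  (hypothetical helper)
# Runs $MINMAX once per file; a failure on one file does not stop the loop.
per_file_range() {
    for f in "$@"; do
        "$MINMAX" "$f" || echo "per_file_range: $MINMAX failed on $f" >&2
    done
}
```

With the files from the example, per_file_range foo3 foo2 would run minmax twice, once on foo3 and once on foo2, so the field-count mismatch between the two files never reaches the reader.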

#4 Updated by Florian almost 6 years ago

Your decision to drop support for a varying number of fields makes option -Af somewhat obsolete. Wouldn't it be reasonably simple to implement an outer loop in minmax.c that treats each file as a separate dataset instead of passing the arguments to the API directly?

#5 Updated by Paul over 5 years ago

  • Status changed from Resolved to Feedback

No, not simple since we would have to replicate much of the machinery presently in the API. I.e., we would have to detect that there are in fact input files and not stdin or memory references (GMT_minmax is currently called from at least one other place, passing data via memory reference). So that is a huge amount of work just to make this odd case work.

#6 Updated by Florian over 5 years ago

Is there any way to make this error a warning then?

minmax (gmt_io.c:1375): Mismatch between actual (2) and expected (3) fields near line 3

There is also no way to scan for the number of input cols in every input dataset, right?

#7 Updated by Paul about 5 years ago

  • Tracker changed from Bug to Feature

This is an old thread. No, it is not possible to scan for input cols without adding much complexity. I think you just have to find peace with the notion that certain things in GMT are done a certain way, and the idea of passing files with different numbers of columns to GMT is not going to work without completely redesigning the i/o. And clearly that is not worth the effort. Nevertheless, I am simply moving this thread from a bug to a feature request.
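Outside GMT, the number of columns per file can still be checked before the files are handed to minmax. The sketch below uses awk for that; column_counts is a hypothetical helper, and it assumes whitespace-delimited ASCII tables in which lines starting with '#' or '>' are headers or segment markers and can be skipped.

```shell
#!/bin/sh
# column_counts FILE...  (hypothetical helper, not part of GMT)
# Prints "<file> <n>" for every distinct field count n found in each file.
# A file that yields more than one line has an inconsistent column count.
column_counts() {
    for f in "$@"; do
        awk -v name="$f" '
            NF == 0 || /^[#>]/ { next }      # skip blank, header and segment lines
            !seen[NF]++ { print name, NF }   # report each new field count once
        ' "$f"
    done
}
```

Files that report the same count can then be grouped into one minmax call, and the remaining ones processed separately.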
