Comment 3 for bug 1023329

marcofgalli (marcofgalli) wrote :

Dear Javier, thank you for the tests on this bug.

Below I give a "short and quick" way to reproduce the bug, followed by a more detailed one. In both cases you first need the example input data file, which is available here:

https://code.zmaw.de/attachments/3103/t2m_ei_1979.grb

To trigger the bug, change the working directory to the one where you downloaded t2m_ei_1979.grb and type:

cdo sellonlatbox,23,31,-25,-31 t2m_ei_1979.grb tagliato_t2m_ei_1979.grb

This command produces, in the same working directory, the file tagliato_t2m_ei_1979.grb, whose metadata section carries wrong time information. This single command is enough to expose the malfunction.

To explain why this behaviour of cdo is wrong, I give here more details about the problem I wanted to solve, what I expected and what I got instead.

The input file, t2m_ei_1979.grb, contains data in grib 1 format (see, for instance, http://en.wikipedia.org/wiki/GRIB and http://www.wmo.int/pages/prog/www/WDM/Guides/Guide-binary-2.html). I think that the differences between the grib 1 and grib 2 data formats are not relevant in this context; however, I'm 100% sure that the data file used here is in grib 1 (ask if you need further details). The data refer to the year 1979, and they are all two-metre temperatures on a latitude-longitude grid covering the whole of South Africa and some surrounding areas (lat -20.00 to -40.00 by 0.25, long 15.00 to 35.00 by 0.25).

They are structured as follows: at 00 and at 12 UTC of each day of the year, a forecast of two-metre temperature is initialised (this step is called the analysis, a piece of jargon that helps to interpret how cdo malfunctions). After each analysis the forecast is run, and the data are written to the grib file every three hours, up to twelve hours after each initialisation. Only the forecasts, and not the initial values, are stored in the grib file: for each day of 1979, the original file t2m_ei_1979.grb contains the forecasts valid at 03, 06, 09 and 12 from the initialisation at 00, and the forecasts valid at 15, 18, 21 and 24 from the initialisation at 12. It is important to underline that all this time information is stored in a special descriptive section within the grib data file itself, written, basically, as the initialisation time (accordingly called the analysis time) and the forecast time (to be added to the analysis time to get the validity time).
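As a small illustration of the rule "analysis time plus forecast time gives validity time", here is a minimal Python sketch. The function name validity_time is hypothetical and only serves this example; the YYYYMMDDHH format for the analysis time is the one wgrib prints.

```python
from datetime import datetime, timedelta


def validity_time(analysis: str, forecast_hours: int) -> datetime:
    """Return the validity time for an analysis time given as YYYYMMDDHH."""
    start = datetime.strptime(analysis, "%Y%m%d%H")
    return start + timedelta(hours=forecast_hours)


# The eight records stored for 1 January 1979: two initialisations,
# each followed by forecasts at 3, 6, 9 and 12 hours.
for analysis in ("1979010100", "1979010112"):
    for lead in (3, 6, 9, 12):
        valid = validity_time(analysis, lead)
        print(analysis, f"+{lead:02d}h", "->", valid.strftime("%Y-%m-%d %H UTC"))
```

Note that the last record of each day (initialisation at 12, forecast time 12 hours) is valid at 00 UTC of the following day, which is the "24" mentioned above.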

I typed the command

cdo sellonlatbox,23,31,-25,-31 t2m_ei_1979.grb tagliato_t2m_ei_1979.grb

because I needed to work on a smaller area than the original one. As described in the relevant section of the cdo user guide,

https://code.zmaw.de/embedded/cdo/1.5.5/cdo.html#x1-1070002.3.3

the command I reported above cuts the input data file (t2m_ei_1979.grb) down to a smaller area and writes the result to the output data file (tagliato_t2m_ei_1979.grb; "tagliato" is Italian for "it has been cut"), leaving the input untouched.
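To give an idea of how much the cut shrinks each field, the following Python sketch counts the grid points inside the requested box. It is only an estimate under two assumptions that I have not checked against the cdo source: that the grid includes both end points of the ranges quoted above, and that points lying exactly on the box boundary are kept. The helper names axis and in_box are hypothetical.

```python
def axis(start: float, stop: float, step: float) -> list:
    """Evenly spaced coordinates from start to stop, both ends included (assumption)."""
    count = round(abs(stop - start) / step) + 1
    sign = 1 if stop >= start else -1
    return [start + sign * i * step for i in range(count)]


def in_box(values: list, a: float, b: float) -> list:
    """Coordinates lying between a and b, boundaries included (assumption)."""
    low, high = min(a, b), max(a, b)
    return [v for v in values if low <= v <= high]


# Grid of the input file as described above.
lats = axis(-20.0, -40.0, 0.25)
lons = axis(15.0, 35.0, 0.25)

# Box requested with sellonlatbox,23,31,-25,-31 (lon1,lon2,lat1,lat2).
cut_lons = in_box(lons, 23, 31)
cut_lats = in_box(lats, -25, -31)

print("input grid:", len(lons), "x", len(lats), "=", len(lons) * len(lats), "points")
print("cut grid:  ", len(cut_lons), "x", len(cut_lats), "=", len(cut_lons) * len(cut_lats), "points")
```

Only the number of points per field changes with the cut; the number of records and their time information should stay the same.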

The trusted and well-tested utility wgrib is an alternative tool for getting information about grib 1 files. It is available in Ubuntu within the grads package, or you can download it from the developers' source:

ftp://ftp.cpc.ncep.noaa.gov/wd51we/wgrib/wgrib.c

and compile it with gcc alone, i.e.

gcc -o wgrib wgrib.c

There are many ways to verify that wgrib does what it is supposed to do very well, so I think this utility need not be investigated.

Without going into the details of wgrib usage, it is important to know that the command (keeping only the first 8 lines, corresponding to the first day, rather than the full output of 8 records per day for a whole year)

wgrib -v t2m_ei_1979.grb | head -n 8

gives as output

1:0:D=1979010100:2T:sfc:kpds=167,1,0:3hr fcst:type=9:winds are N/S:"2 metre temperature [K]
2:13230:D=1979010100:2T:sfc:kpds=167,1,0:6hr fcst:type=9:winds are N/S:"2 metre temperature [K]
3:26460:D=1979010100:2T:sfc:kpds=167,1,0:9hr fcst:type=9:winds are N/S:"2 metre temperature [K]
4:39690:D=1979010100:2T:sfc:kpds=167,1,0:12hr fcst:type=9:winds are N/S:"2 metre temperature [K]
5:52920:D=1979010112:2T:sfc:kpds=167,1,0:3hr fcst:type=9:winds are N/S:"2 metre temperature [K]
6:66150:D=1979010112:2T:sfc:kpds=167,1,0:6hr fcst:type=9:winds are N/S:"2 metre temperature [K]
7:79380:D=1979010112:2T:sfc:kpds=167,1,0:9hr fcst:type=9:winds are N/S:"2 metre temperature [K]
8:92610:D=1979010112:2T:sfc:kpds=167,1,0:12hr fcst:type=9:winds are N/S:"2 metre temperature [K]

For the present purposes, it is sufficient to read this listing as colon-separated information about the individual records within the grib file. Each line refers to one record in the grib file (identified by the first field of each line, which ranges, in this example, from 1 to 8). The third field of each line is very important in this context, because it gives the analysis time (expressed as YYYYMMDDHH); the seventh is the other important one, because it gives the forecast time (3hr fcst, 6hr fcst...).
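For anyone who prefers to read these fields with a script, here is a minimal Python sketch that splits one line of the wgrib -v output on colons and keeps the fields discussed above. The names Record and parse_wgrib_line are hypothetical. The second field is the byte position of the record in the file, as far as I know.

```python
from typing import NamedTuple


class Record(NamedTuple):
    number: int        # first field: record number
    offset: int        # second field: byte position of the record in the file
    analysis: str      # third field without the "D=" prefix, as YYYYMMDDHH
    forecast: str      # seventh field, e.g. "3hr fcst" or "anl"


def parse_wgrib_line(line: str) -> Record:
    """Split one line of `wgrib -v` output into the fields used in this report."""
    fields = line.strip().split(":")
    return Record(
        number=int(fields[0]),
        offset=int(fields[1]),
        analysis=fields[2].split("=", 1)[1],
        forecast=fields[6],
    )


line = '2:13230:D=1979010100:2T:sfc:kpds=167,1,0:6hr fcst:type=9:winds are N/S:"2 metre temperature [K]'
print(parse_wgrib_line(line))
```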

If one installs cdo from the Ubuntu repositories, and types the command

cdo sellonlatbox,23,31,-25,-31 t2m_ei_1979.grb tagliato_t2m_ei_1979.grb

then checks the output file tagliato_t2m_ei_1979.grb with wgrib, this is the result:

wgrib -v tagliato_t2m_ei_1979.grb | head -n 8

1:0:D=1979010100:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]
2:1758:D=1979010100:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]
3:3516:D=1979010100:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]
4:5274:D=1979010100:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]
5:7032:D=1979010112:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]
6:8790:D=1979010112:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]
7:10548:D=1979010112:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]
8:12306:D=1979010112:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]

This is wrong because all the information about the forecast time is lost (instead of 3hr fcst, 6hr fcst... it reports "anl", which stands for "analysis", i.e. a forecast time of 0 hours).
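The same check can be automated. The following Python sketch compares the seventh field of the wgrib -v lines before and after the cut and lists the records whose time information changed. The lines are the first two records of each listing quoted in this comment; the function names are hypothetical.

```python
ORIGINAL = [
    '1:0:D=1979010100:2T:sfc:kpds=167,1,0:3hr fcst:type=9:winds are N/S:"2 metre temperature [K]',
    '2:13230:D=1979010100:2T:sfc:kpds=167,1,0:6hr fcst:type=9:winds are N/S:"2 metre temperature [K]',
]
CUT_ON_UBUNTU = [
    '1:0:D=1979010100:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]',
    '2:1758:D=1979010100:2T:sfc:kpds=167,1,0:anl:type=analysis:winds are N/S:"2 metre temperature [K]',
]


def time_field(line: str) -> str:
    """Seventh colon-separated field of a `wgrib -v` line (forecast time)."""
    return line.split(":")[6]


def records_with_changed_time(before: list, after: list) -> list:
    """Record numbers (starting at 1) whose forecast-time field differs."""
    return [
        number
        for number, (b, a) in enumerate(zip(before, after), start=1)
        if time_field(b) != time_field(a)
    ]


print(records_with_changed_time(ORIGINAL, CUT_ON_UBUNTU))
```

An empty list would mean that the cut preserved the forecast time of every record compared; with the Ubuntu output above, both records are reported.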

It is obvious, in my opinion, that merely cutting the data grid spatially should not modify its time information.

If one uses cdo on a Debian system (I tried this on Debian 6.0.6 squeeze), after installing it from the official Debian repositories, the result is correct and, as expected, it is

wgrib -v tagliato_t2m_ei_1979.grb | head -n 8

1:0:D=1979010100:2T:sfc:kpds=167,1,0:3hr fcst:winds are N/S:"2 metre temperature [K]
2:1736:D=1979010100:2T:sfc:kpds=167,1,0:6hr fcst:winds are N/S:"2 metre temperature [K]
3:3472:D=1979010100:2T:sfc:kpds=167,1,0:9hr fcst:winds are N/S:"2 metre temperature [K]
4:5208:D=1979010100:2T:sfc:kpds=167,1,0:12hr fcst:winds are N/S:"2 metre temperature [K]
5:6944:D=1979010112:2T:sfc:kpds=167,1,0:3hr fcst:winds are N/S:"2 metre temperature [K]
6:8680:D=1979010112:2T:sfc:kpds=167,1,0:6hr fcst:winds are N/S:"2 metre temperature [K]
7:10416:D=1979010112:2T:sfc:kpds=167,1,0:9hr fcst:winds are N/S:"2 metre temperature [K]
8:12152:D=1979010112:2T:sfc:kpds=167,1,0:12hr fcst:winds are N/S:"2 metre temperature [K]

The same happens if one downloads cdo from its developers' site, as I wrote in my previous post, and compiles the program from its source code. The problem is that the Ubuntu packager probably disabled cdo's internal grib 1 library by setting the configure option "--disable-cgribex" (which is not the cdo default). As a result, the Ubuntu build uses the grib_api library for both grib 1 and grib 2, but this is not supported for grib 1 (and, in fact, it produces wrong results).

This should answer both questions: how to reproduce the bug, and how to tell whether cdo works correctly or not (by checking its results with wgrib).

I'm attaching neither the expected correct output file nor the wrong one because, quite simply, I cannot find a tool to upload them here. However, here is the correct output. Although it was obtained with an older version of cdo, it is correct (I can confirm it personally: I tried that later with the release of cdo shipping with Ubuntu) and, as I found, the problem is related not to the cdo release but to its compilation configuration:

https://code.zmaw.de/attachments/3104/tagliato_t2m_ei_1979_cdo-1.4.6.grb

And this one is the wrong one (same observations as above apply):

https://code.zmaw.de/attachments/3105/tagliato_t2m_ei_1979_cdo-1.5.3.grb

You can check the correctness of the time information in the above two data files yourself, after my brief wgrib tutorial :) !

That should be all. I underline that I found this bug for
       cdo | 1.5.3.dfsg.1-2 precise
       cdo | 1.5.4+dfsg.1-4build1 quantal
and I have not yet given raring a chance.

Faithfully,
Marco Galli