Time in ARM NetCDF Files

 

Introduction

This document explains most of the issues related to the use of time in ARM netCDF data files.

Time Zones

All ARM netCDF files are in UTC. Note that this has some implications for solar-based data; we tend to split our files at midnight, but the sun is still up at 0000 UTC at SGP in the late spring and summer, and all the time at TWP. This means a given solar arc may be broken across two different files. That’s just the way it is; using local time in ARM files would have been a bigger mess.

Note that splitting files at 0000 UTC is not an ARM standard, and many datastreams follow different splitting conventions.

Epoch Time

All ARM netCDF files represent time as “seconds since 00:00:00 UTC on January 1, 1970,” which is called the “epoch time” and is the standard way to measure time in the UNIX world. For example, an epoch time of 1 means “Thu Jan 1 00:00:01 1970”; an epoch time of 992794875 is “Sun Jun 17 16:21:15 2001.”

Epoch times are very easy for computers to deal with. There are standard C and Fortran libraries available on most UNIX systems for manipulating data in this format (cf. gmtime(3C) and related functions). Similarly, Perl has several functions that deal directly with epoch times. In addition, epoch times deal with month or year rollovers very cleanly, and were completely impervious to any Y2K issues.

The downside of using the epoch time is that human beings have trouble understanding it. There are several methods for converting an epoch time into a human-readable format; see Conversion Examples and Hints, below.

Time Variables: base_time and time_offset

There are two time variables for every ARM netCDF file: base_time, which contains a single scalar value, and time_offset, which contains a time-series of values, one for each data sample in the file. The epoch time for sample i is given by the value base_time + time_offset[i]. In this way, we can put most of our nine or ten digits of epoch time in just one variable, keeping the time_offset (in theory) fairly small and thus more readable. The variable base_time is stored in the netCDF file as a long integer, while time_offset is a double precision floating point.

It is important to note that the value of base_time in a given file is neither defined nor standardized. In most cases, base_time is the time of the very first sample in that file, and the first value of time_offset is 0. However, in some situations this is not true; in particular, if many files were generated at once, they may all have the same base_time, and thus the very first time_offset in the later files will not be 0.

The only standard we have on the value of base_time is in conjunction with time_offset: base_time + time_offset[0] will always be the sample time corresponding to the time stamp in the ARM netCDF file name. Beyond that, base_time and time_offset[0] can be anything, as long as they add up to the right sample time. To represent an epoch time of 877996800, any of the following pairs is acceptable and gives identical results:

base_time     time_offset
877994880     1920
877596800     400000
0             877996800

What this means is that when working with ARM data, you should always use base_time + time_offset[0] as your basis. Do not convert the base_time to a human-readable string and use that as a date in the title of your plot. Instead, convert base_time + time_offset[0]. Do not plot values against just the time_offset variable; plot against time_offset[i] - time_offset[0].
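
The rule above can be sketched in C. This is only an illustration, not ARM library code: the helper names sample_epoch and elapsed_seconds are hypothetical, and the base_time and time_offset values are assumed to have already been read from the file.

```c
#include <stddef.h>

/* Hypothetical helper: epoch time of sample i, i.e. base_time + time_offset[i]. */
double sample_epoch(long base_time, const double *time_offset, size_t i)
{
    return (double)base_time + time_offset[i];
}

/* Hypothetical helper: seconds elapsed since the first sample in the file.
   This is the quantity to plot against, whatever base_time happens to be. */
double elapsed_seconds(const double *time_offset, size_t i)
{
    return time_offset[i] - time_offset[0];
}
```

Calling sample_epoch with i = 0 on any of the three base_time and time_offset pairs listed above gives the same epoch time, 877996800, which is the value to convert for a plot title.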

Note also that while base_time is stored as a long integer in the netCDF file, time_offset is stored as a double precision floating point number. This allows ARM datastreams to have sample intervals of less than one second (with fractional time_offsets), if the need ever arises.

New Time Field

As of October 2004, a field called time is included in all new ARM netCDF files. This field is defined as “seconds since midnight” (more specifically: seconds since midnight of the day of the first sample of this data file), which is a more natural way of describing time when looking at daily files.
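
For files that predate the time field, a comparable value can be derived from base_time and time_offset. The following C sketch shows one way to do it, assuming that “midnight” means 0000 UTC and that all times fall after 1970; the helper name seconds_since_midnight is hypothetical.

```c
/* Hypothetical helper: convert a sample's epoch time to "seconds since
   midnight (UTC) of the day of the first sample". Assumes epoch times at or
   after 1970, so integer division truncates to the start of the day. */
double seconds_since_midnight(double first_sample_epoch, double epoch)
{
    const long long sec_per_day = 24LL * 60LL * 60LL;
    long long midnight =
        ((long long)first_sample_epoch / sec_per_day) * sec_per_day;

    return epoch - (double)midnight;
}
```

Here first_sample_epoch is base_time + time_offset[0] and epoch is base_time + time_offset[i]. Under this definition, a sample taken after the following midnight would simply have a value above 86400.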

In addition, by naming the field time, we now have what netCDF calls a “coordinate variable”: a field with the same name as the time dimension. This makes it easier to use generic netCDF tools with ARM data. (See, for example, the COARDS netCDF conventions at https://ferret.pmel.noaa.gov/Ferret/.)

The base_time and time_offset fields are still in use, so most tools and scripts used in the past will still work. The only exception is if you read in fields by field index rather than name – i.e., the third field, the fifth field, etc.

Because the time field will be placed right after the time_offset field, the scientific fields will no longer have the same field position as before – what used to be the third field in the file will now be the fourth field. The best solution to this is to modify your tools to read by field name and not rely on the field position. ARM netCDF files are not guaranteed to keep the same field positions between revisions.

Conversion Examples and Hints

Perl Example
A quick and dirty way to read an epoch time is to take advantage of Perl’s gmtime() function:

% perl -e 'print scalar gmtime(992794875), "\n"'
which returns:

Sun Jun 17 16:21:15 2001
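The C (and Fortran) function ctime() produces a string in the same format, but it converts the epoch time to local time rather than UTC; in C, asctime(gmtime(&t)) gives the UTC equivalent.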

C Example
For more complicated manipulations, you can write a program in C, Fortran, or Perl using the gmtime() functions; consult that function’s documentation in those languages for more details. The following C program returns the year and day of the year (i.e., days since Dec 31 of the previous year, or the “julian day”), given an epoch time:

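#include <stdio.h>
#include <time.h>

/* program time_jday: print the year and day of year for an epoch time */
int main(void)
{
    time_t epoch = 992794875;       /* seconds since 1970-01-01 00:00:00 UTC */
    struct tm *t = gmtime(&epoch);  /* broken-down time in UTC */

    /* tm_year counts from 1900; tm_yday is 0-based */
    printf("The year is: %d\n", t->tm_year + 1900);
    printf("The julian day is: %d\n", t->tm_yday + 1);
    return 0;
}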
which returns:

% time_jday
The year is: 2001
The julian day is: 168

Fortran Example
Under Fortran in Solaris, the following program converts an epoch time into
the YYYYMMDD.HHMMSS format you see in timestamps for ARM netCDF files:

      program ymd_fortran
      INTEGER*4 epoch, t(9)

      epoch=992794875
      CALL gmtime(epoch, t)
      write(*,737) t(6)+1900, t(5)+1, t(4), t(3), t(2), t(1)
 737  format(I4.4,I2.2,I2.2,".",I2.2,I2.2,I2.2)
      end
which returns:

% ymd_fortran
20010617.162115

IDL Example
The display and analysis package IDL does not use epoch time, but uses a similar
method with a different offset. (Time 0 in IDL is January 1, 4713 BCE, at noon.)
To convert an epoch time in IDL, use the julday and caldat
functions to get the right offsets. This code fragment in IDL:

stime = 992794875L
secPerDay = 24L*60L*60L
;; get IDL julian day of the start of the epoch (January 1, 1970)
day1 = julday(1,1,1970)
;; get julian day of first time sample
jday_foo = long(day1 + stime/secPerDay)
;; now find the month, day, and year for this day
caldat, jday_foo, mm, dd, yyyy
followed by a print statement, will produce the following output:

IDL> print, "The day string is: ", yyyy, mm, dd, $
format="(A,I4.4,I2.2,I2.2)"
The day string is: 20010617

Notes on Generating Epoch Times
If you want to create your own epoch times from a different time format (for
example, to generate your own ARM-like netCDF files), the standard C function
to do so is called mktime(). Note, however, that unlike gmtime(),
mktime() deals with local time, so there will be an offset in your
final answer depending upon which time zone your machine is in and, potentially,
whether Daylight Saving Time is in effect. To get around this, at least
on UNIX-based systems, you can set the environment variable TZ
to “GMT” (or “UTC”), either in the environment your code runs in or inside
your code itself with the putenv function:


putenv("TZ=GMT");
epoch = (long) mktime(t);
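
A more complete version of this fragment might look like the sketch below. It assumes a POSIX system and swaps in setenv() and tzset() for putenv(); the helper name utc_to_epoch is hypothetical. The value "UTC0" is a POSIX-style TZ string describing a zone named UTC with zero offset.

```c
#define _POSIX_C_SOURCE 200112L  /* expose setenv() and tzset() */
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: epoch time for a UTC calendar date and time.
   Side effect: sets TZ to UTC for the rest of the process. */
long utc_to_epoch(int year, int month, int day, int hour, int min, int sec)
{
    struct tm t = {0};

    t.tm_year  = year - 1900;  /* tm_year counts from 1900 */
    t.tm_mon   = month - 1;    /* tm_mon is 0-based */
    t.tm_mday  = day;
    t.tm_hour  = hour;
    t.tm_min   = min;
    t.tm_sec   = sec;
    t.tm_isdst = 0;            /* no daylight saving time in UTC */

    setenv("TZ", "UTC0", 1);   /* make mktime() interpret t as UTC */
    tzset();
    return (long)mktime(&t);
}
```

For the time stamp 20010617.162115, the call would be utc_to_epoch(2001, 6, 17, 16, 21, 15). Because the function changes TZ for the whole process, any later local-time conversions in the same program will also be done in UTC.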

There does not seem to be a mktime() (or a putenv()) in the standard Fortran libraries, but under Solaris you can call C functions from your Fortran code. You just need to use a special “pragma” statement to tell the Fortran compiler to link with the C libraries. The following code
demonstrates these concepts:

      program mktime_fortran
      external Mktime !$pragma C( Mktime )
      external Putenv !$pragma C( Putenv )
      INTEGER*4 epoch, t(9)
C timestamp: 20010617.162115
      t(1)=15
      t(2)=21
      t(3)=16
      t(4)=17
      t(5)=5
      t(6)=101
C tm_isdst: Daylight Saving Time is not in effect
      t(9)=0
      call putenv("TZ=GMT")
      epoch=mktime(t)
      write(*,737) epoch
 737  format(I12)
      end
which will produce the following output:

% mktime_fortran
992794875
For more information, see the man page for mktime(3C) and the chapter on the C/Fortran interface in Solaris at docs.sun.com

Contact Information

Please send questions or comments to Tim Shippert.