

Functions for Dealing with Time Stamps

A common use for awk programs is processing log files that contain time stamp information indicating when a particular log record was written. Many programs log their time stamps in the form returned by the time system call, which is the number of seconds since a particular epoch. On POSIX systems, it is the number of seconds since midnight, January 1, 1970, UTC.

In order to make it easier to process such log files, and to produce useful reports, gawk provides two functions for working with time stamps. Both of these are gawk extensions; they are not specified in the POSIX standard, nor are they in any other known version of awk.

Optional parameters are enclosed in square brackets ("[" and "]").

systime()
This function returns the current time as the number of seconds since the system epoch. On POSIX systems, this is the number of seconds since midnight, January 1, 1970, UTC. The epoch, and therefore the number returned, may be different on other systems.
strftime([format [, timestamp]])
This function returns a string. It is similar to the function of the same name in ANSI C. The time specified by timestamp is used to produce a string, based on the contents of the format string. The timestamp is in the same format as the value returned by the systime function. If no timestamp argument is supplied, gawk will use the current time of day as the time stamp. If no format argument is supplied, strftime uses "%a %b %d %H:%M:%S %Z %Y". This format string produces output (almost) equivalent to that of the date utility. (Versions of gawk prior to 3.0 require the format argument.)

The systime function allows you to compare a time stamp from a log file with the current time of day. In particular, it is easy to determine how long ago a particular record was logged. It also allows you to produce log records using the "seconds since the epoch" format.

The strftime function allows you to easily turn a time stamp into human-readable information. It is similar in nature to the sprintf function (see section Built-in Functions for String Manipulation), in that it copies non-format specification characters verbatim to the returned string, while substituting date and time values for format specifications in the format string.

strftime is guaranteed by the ANSI C standard to support the following date format specifications:

%a
The locale's abbreviated weekday name.
%A
The locale's full weekday name.
%b
The locale's abbreviated month name.
%B
The locale's full month name.
%c
The locale's "appropriate" date and time representation.
%d
The day of the month as a decimal number (01--31).
%H
The hour (24-hour clock) as a decimal number (00--23).
%I
The hour (12-hour clock) as a decimal number (01--12).
%j
The day of the year as a decimal number (001--366).
%m
The month as a decimal number (01--12).
%M
The minute as a decimal number (00--59).
%p
The locale's equivalent of the AM/PM designations associated with a 12-hour clock.
%S
The second as a decimal number (00--60). The value 60 allows for the occasional leap second.
%U
The week number of the year (the first Sunday as the first day of week one) as a decimal number (00--53).
%w
The weekday as a decimal number (0--6). Sunday is day zero.
%W
The week number of the year (the first Monday as the first day of week one) as a decimal number (00--53).
%x
The locale's "appropriate" date representation.
%X
The locale's "appropriate" time representation.
%y
The year without century as a decimal number (00--99).
%Y
The year with century as a decimal number (e.g., 1995).
%Z
The time zone name or abbreviation, or no characters if no time zone is determinable.
%%
A literal `%'.

If a conversion specifier is not one of the above, the behavior is undefined.

Informally, a locale is the geographic place in which a program is meant to run. For example, a common way to abbreviate the date September 4, 1991 in the United States would be "9/4/91". In many countries in Europe, however, it would be abbreviated "4.9.91". Thus, the `%x' specification in a "US" locale might produce `9/4/91', while in a "EUROPE" locale, it might produce `4.9.91'. The ANSI C standard defines a default "C" locale, which is an environment that is typical of what most C programmers are used to.

A public-domain C version of strftime is supplied with gawk for systems that are not yet fully ANSI-compliant. If that version is used to compile gawk (see section Installing gawk), then the following additional format specifications are available:

%D
Equivalent to specifying `%m/%d/%y'.
%e
The day of the month, padded with a space if it is only one digit.
%h
Equivalent to `%b', above.
%n
A newline character (ASCII LF).
%r
Equivalent to specifying `%I:%M:%S %p'.
%R
Equivalent to specifying `%H:%M'.
%T
Equivalent to specifying `%H:%M:%S'.
%t
A tab character.
%k
The hour (24-hour clock) as a decimal number (0--23). Single-digit numbers are padded with a space.
%l
The hour (12-hour clock) as a decimal number (1--12). Single-digit numbers are padded with a space.
%C
The century, as a number between 00 and 99.
%u
The weekday as a decimal number (1--7). Monday is day one.
%V
The week number of the year (the first Monday as the first day of week one) as a decimal number (01--53). The method for determining the week number is as specified by ISO 8601 (to wit: if the week containing January 1 has four or more days in the new year, then it is week one, otherwise it is week 53 of the previous year and the next week is week one).
%G
The year with century of the ISO week number, as a decimal number. For example, January 1, 1993, is in week 53 of 1992. Thus, the year of its ISO week number is 1992, even though its year is 1993. Similarly, December 31, 1973, is in week 1 of 1974. Thus, the year of its ISO week number is 1974, even though its year is 1973.
%g
The year without century of the ISO week number, as a decimal number (00--99).
%Ec %EC %Ex %Ey %EY %Od %Oe %OH %OI
%Om %OM %OS %Ou %OU %OV %Ow %OW %Oy
These are "alternate representations" for the specifications that use only the second letter (`%c', `%C', and so on). They are recognized, but their normal representations are used. (These facilitate compliance with the POSIX date utility.)
%v
The date in VMS format (e.g., 20-JUN-1991).
%z
The timezone offset in a +HHMM format (e.g., the format necessary to produce RFC-822/RFC-1036 date headers).

This example is an awk implementation of the POSIX date utility. Normally, the date utility prints the current date and time of day in a well-known format. However, if you provide an argument to it that begins with a `+', date copies non-format specifier characters to the standard output, and interprets the current time according to the format specifiers in the string. For example:

$ date '+Today is %A, %B %d, %Y.'
-| Today is Thursday, July 11, 1991.

Here is the gawk version of the date utility. It has a shell "wrapper" to handle the `-u' option, which requires that date run as if the time zone were set to UTC.

#! /bin/sh
#
# date --- approximate the P1003.2 'date' command

case $1 in
-u)  TZ=GMT0     # use UTC
     export TZ
     shift ;;
esac

gawk 'BEGIN  {
    format = "%a %b %d %H:%M:%S %Z %Y"
    exitval = 0

    if (ARGC > 2)
        exitval = 1
    else if (ARGC == 2) {
        format = ARGV[1]
        if (format ~ /^\+/)
            format = substr(format, 2)   # remove leading +
    }
    print strftime(format)
    exit exitval
}' "$@"

