When awk reads an input record, the interpreter automatically
separates, or parses, the record into chunks called fields. By default,
fields are separated by whitespace, like words in a line.
Whitespace in awk means any string of one or more spaces,
tabs or newlines;(5) other characters, such as formfeed, that
other languages consider whitespace are not considered
whitespace by awk.
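
The default splitting rule can be tried directly from the shell. The following is a small sketch, not part of the original text; the input words `alpha', `beta' and `gamma' are made up for illustration, and printf is used only to produce a real tab character.

```shell
# Hypothetical input: three words separated by several spaces and one tab.
# Under the default rules, any run of spaces and tabs acts as one separator.
fields=$(printf 'alpha   beta\tgamma\n' | awk '{ print NF; print $2 }')
echo "$fields"
```

The first line printed is the number of fields awk found, and the second is the second field.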
The purpose of fields is to make it more convenient for you to refer to
these pieces of the record. You don't have to use them--you can
operate on the whole record if you wish--but fields are what make
simple awk programs so powerful.
To refer to a field in an awk program, you use a dollar sign,
`$', followed by the number of the field you want. Thus, $1
refers to the first field, $2 to the second, and so on. For
example, suppose the following is a line of input:
This seems like a pretty nice example.
Here the first field, or $1, is `This'; the second field, or
$2, is `seems'; and so on. Note that the last field,
$7, is `example.'. Because there is no space between the
`e' and the `.', the period is considered part of the seventh
field.
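
The field references described above can be checked by feeding the sample line to awk. This is a minimal sketch; the shell variable names `line', `first' and `seventh' are chosen here for illustration only.

```shell
# The sample input line from the text.
line='This seems like a pretty nice example.'

# $1 is the first field; $7 is the seventh, including the trailing period.
first=$(echo "$line" | awk '{ print $1 }')
seventh=$(echo "$line" | awk '{ print $7 }')

echo "$first"
echo "$seventh"
```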
NF is a built-in variable whose value
is the number of fields in the current record.
awk updates the value of NF automatically, each time
a record is read.
No matter how many fields there are, the last field in a record can be
represented by $NF. So, in the example above, $NF would
be the same as $7, which is `example.'. Why this works is
explained below (see section Non-constant Field Numbers).
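
As a quick check of NF and $NF on the same sample line, something like the following can be run. It is only a sketch; the shell variable names are illustrative.

```shell
# The sample input line from the text.
line='This seems like a pretty nice example.'

# NF is the number of fields in the record; $NF is the last field.
count=$(echo "$line" | awk '{ print NF }')
last=$(echo "$line" | awk '{ print $NF }')

echo "$count"
echo "$last"
```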
If you try to reference a field beyond the last one, such as $8
when the record has only seven fields, you get the empty string.
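
One way to see the empty string is to print the field between brackets. This is a sketch using the same sample line; the brackets are added only to make the empty value visible.

```shell
# The sample input line from the text; it has only seven fields.
line='This seems like a pretty nice example.'

# $8 does not exist, so the brackets end up with nothing between them.
eighth=$(echo "$line" | awk '{ print "[" $8 "]" }')

echo "$eighth"
```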
$0, which looks like a reference to the "zeroth" field, is
a special case: it represents the whole input record. $0 is
used when you are not interested in fields.
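
For instance, printing $0 for the sample line gives back the line unchanged. A minimal sketch, with illustrative shell variable names:

```shell
# The sample input line from the text.
line='This seems like a pretty nice example.'

# $0 is the whole record, so the output is the same as the input line.
whole=$(echo "$line" | awk '{ print $0 }')

echo "$whole"
```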
Here are some more examples:
$ awk '$1 ~ /foo/ { print $0 }' BBS-list
-| fooey 555-1234 2400/1200/300 B
-| foot 555-6699 1200/300 B
-| macfoo 555-6480 1200/300 A
-| sabafoo 555-2127 1200/300 C
This example prints each record in the file `BBS-list' whose first
field contains the string `foo'. The operator `~' is called a
matching operator (see section How to Use Regular Expressions);
it tests whether a string (here, the field $1) matches a given regular
expression.
By contrast, the following example looks for `foo' in the entire record and prints the first field and the last field for each input record containing a match.
$ awk '/foo/ { print $1, $NF }' BBS-list
-| fooey B
-| foot B
-| macfoo A
-| sabafoo C
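
The difference between the two patterns shows up when `foo' appears somewhere other than the first field. The sketch below assumes a small stand-in file rather than the real `BBS-list': the `fooey' and `foot' lines are taken from the output above, while the `aardvark' and `notes' lines are invented here for illustration.

```shell
# Build a temporary stand-in for BBS-list (not the real file).
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
aardvark 555-5553 1200/300 B
fooey 555-1234 2400/1200/300 B
foot 555-6699 1200/300 B
notes 555-0000 1200/foo A
EOF

# Match `foo' in the first field only; the `notes' line is skipped.
by_field=$(awk '$1 ~ /foo/ { print $0 }' "$tmp")

# Match `foo' anywhere in the record; the `notes' line is included.
by_record=$(awk '/foo/ { print $1, $NF }' "$tmp")

echo "$by_field"
echo "$by_record"
rm -f "$tmp"
```

With this input, the first command prints two records and the second prints three pairs of first and last fields.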