The GNU Awk User's Guide - Multi-dimensional

Go to the first, previous, next, last section, table of contents.

Multi-dimensional Arrays

A multi-dimensional array is an array in which an element is identified by a sequence of indices, instead of a single index. For example, a two-dimensional array requires two indices. The usual way (in most languages, including awk) to refer to an element of a two-dimensional array named grid is with grid[x,y].

Multi-dimensional arrays are supported in awk through concatenation of indices into one string. What happens is that awk converts the indices into strings (see section Conversion of Strings and Numbers) and concatenates them together, with a separator between them. This creates a single string that describes the values of the separate indices. The combined string is used as a single index into an ordinary, one-dimensional array. The separator used is the value of the built-in variable SUBSEP.

For example, suppose we evaluate the expression `foo[5,12] = "value"' when the value of SUBSEP is "@". The numbers five and 12 are converted to strings and concatenated with an `@' between them, yielding "5@12"; thus, the array element foo["5@12"] is set to "value".

Once the element's value is stored, awk has no record of whether it was stored with a single index or a sequence of indices. The two expressions `foo[5,12]' and `foo[5 SUBSEP 12]' are always equivalent.

The default value of SUBSEP is the string "\034", which contains a non-printing character that is unlikely to appear in an awk program or in most input data.

The usefulness of choosing an unlikely character comes from the fact that index values that contain a string matching SUBSEP lead to combined strings that are ambiguous. Suppose that SUBSEP were "@"; then `foo["a@b", "c"]' and `foo["a", "b@c"]' would be indistinguishable because both would actually be stored as `foo["a@b@c"]'.

You can test whether a particular index-sequence exists in a "multi-dimensional" array with the same operator `in' used for single dimensional arrays. Instead of a single index as the left-hand operand, write the whole sequence of indices, separated by commas, in parentheses:

(subscript1, subscript2, ...) in array

The following example treats its input as a two-dimensional array of fields; it rotates this array 90 degrees clockwise and prints the result. It assumes that all lines have the same number of elements.

awk '{
     if (max_nf < NF)
          max_nf = NF
     max_nr = NR
     for (x = 1; x <= NF; x++)
          vector[x, NR] = $x
}

END {
     for (x = 1; x <= max_nf; x++) {
          for (y = max_nr; y >= 1; --y)
               printf("%s ", vector[x, y])
          printf("\n")
     }
}'

When given the input:

1 2 3 4 5 6
2 3 4 5 6 1
3 4 5 6 1 2
4 5 6 1 2 3

it produces:

Go to the first, previous, next, last section, table of contents.