Contents | Prev | Next

Arrays

Most of the necessary introduction to arrays is in the last section of Chapter 4, which shows how you define and initialize an array. Holding objects is the focus of this chapter, and an array is just one way to hold objects. But there are a number of other ways to hold objects, so what makes an array special?

There are two issues that distinguish arrays from other types of collections: efficiency and type. The array is the most efficient way that Java provides to store and access a sequence of objects (actually, object handles). The array is a simple linear sequence, which makes element access fast, but you pay for this speed: when you create an array object, its size is fixed and cannot be changed for the lifetime of that array object. You might suggest creating an array of a particular size and then, if you run out of space, creating a new one and moving all the handles from the old one to the new one. This is the behavior of the Vector class, which will be studied later in the chapter. However, because of the overhead of this size flexibility, a Vector is measurably less efficient than an array.

The vector class in C++ does know the type of objects it holds, but it has a different drawback when compared with arrays in Java: the C++ vector’s operator[] doesn’t do bounds checking, so you can run past the end. (It’s possible, however, to ask how big the vector is, and the at( ) method does perform bounds checking.) In Java, you get bounds checking regardless of whether you’re using an array or a collection – you’ll get a RuntimeException if you exceed the bounds. As you’ll learn in Chapter 9, this type of exception indicates a programmer error and thus you don’t need to check for it in your code. As an aside, the reason the C++ vector doesn’t check bounds with every access is speed – in Java you have the constant performance overhead of bounds checking all the time for both arrays and collections.

The other generic collection classes that will be studied in this chapter, Vector, Stack, and Hashtable, all deal with objects as if they had no specific type. That is, they treat them as type Object, the root class of all classes in Java. This works fine from one standpoint: you need to build only one collection, and any Java object will go into that collection. (Except for primitives – these can be placed in collections as constants using the Java primitive wrapper classes, or as changeable values by wrapping in your own class.) This is the second place where an array is superior to the generic collections: when you create an array, you create it to hold a specific type. This means that you get compile-time type checking to prevent you from putting the wrong type in, or mistaking the type that you’re extracting. Of course, Java will prevent you from sending an inappropriate message to an object, either at compile-time or at run-time. So it’s not as if it’s riskier one way or the other, it’s just nicer if the compiler points it out to you, faster at run-time, and there’s less likelihood that the end user will get surprised by an exception.

For efficiency and type checking it’s always worth trying to use an array if you can. However, when you’re trying to solve a more general problem arrays can be too restrictive. After looking at arrays, the rest of this chapter will be devoted to the collection classes provided by Java.

Arrays are first-class objects

Regardless of what type of array you’re working with, the array identifier is actually a handle to a true object that’s created on the heap. The heap object can be created either implicitly, as part of the array initialization syntax, or explicitly with a new expression. Part of the heap object (in fact, the only field or method you can access) is the read-only length member that tells you how many elements can be stored in that array object. The ‘ []’ syntax is the only other access that you have to the array object.

The following example shows the various ways that an array can be initialized, and how the array handles can be assigned to different array objects. It also shows that arrays of objects and arrays of primitives are almost identical in their use. The only difference is that arrays of objects hold handles while arrays of primitives hold the primitive values directly. (See page 97 if you have trouble executing this program.)

//: ArraySize.java
// Initialization & re-assignment of arrays
package c08;

class Weeble {} // A small mythical creature

public class ArraySize {
  public static void main(String[] args) {
    // Arrays of objects:
    Weeble[] a; // Null handle
    Weeble[] b = new Weeble[5]; // Null handles
    Weeble[] c = new Weeble[4];
    for(int i = 0; i < c.length; i++)
      c[i] = new Weeble();
    Weeble[] d = {
      new Weeble(), new Weeble(), new Weeble()
    };
    // Compile error: variable a not initialized:
    //!System.out.println("a.length=" + a.length);
    System.out.println("b.length = " + b.length);
    // The handles inside the array are 
    // automatically initialized to null:
    for(int i = 0; i < b.length; i++)
      System.out.println("b[" + i + "]=" + b[i]);
    System.out.println("c.length = " + c.length);
    System.out.println("d.length = " + d.length);
    a = d;
    System.out.println("a.length = " + a.length);
    // Java 1.1 initialization syntax:
    a = new Weeble[] {
      new Weeble(), new Weeble()
    };
    System.out.println("a.length = " + a.length);

    // Arrays of primitives:
    int[] e; // Null handle
    int[] f = new int[5];
    int[] g = new int[4];
    for(int i = 0; i < g.length; i++)
      g[i] = i*i;
    int[] h = { 11, 47, 93 };
    // Compile error: variable e not initialized:
    //!System.out.println("e.length=" + e.length);
    System.out.println("f.length = " + f.length);
    // The primitives inside the array are
    // automatically initialized to zero:
    for(int i = 0; i < f.length; i++)
      System.out.println("f[" + i + "]=" + f[i]);
    System.out.println("g.length = " + g.length);
    System.out.println("h.length = " + h.length);
    e = h;
    System.out.println("e.length = " + e.length);
    // Java 1.1 initialization syntax:
    e = new int[] { 1, 2 };
    System.out.println("e.length = " + e.length);
  }
} ///:~

Here’s the output from the program:

b.length = 5
b[0]=null
b[1]=null
b[2]=null
b[3]=null
b[4]=null
c.length = 4
d.length = 3
a.length = 3
a.length = 2
f.length = 5
f[0]=0
f[1]=0
f[2]=0
f[3]=0
f[4]=0
g.length = 4
h.length = 3
e.length = 3
e.length = 2

The array a is initially just a null handle, and the compiler prevents you from doing anything with this handle until you’ve properly initialized it. The array b is initialized to point to an array of Weeble handles, but no actual Weeble objects are ever placed in that array. However, you can still ask what the size of the array is, since b is pointing to a legitimate object. This brings up a slight drawback: you can’t find out how many elements are actually in the array, since length tells you only how many elements can be placed in the array; that is, the size of the array object, not the number of elements it actually holds. However, when an array object is created its handles are automatically initialized to null so you can see whether a particular array slot has an object in it by checking to see whether it’s null. Similarly, an array of primitives is automatically initialized to zero for numeric types, null for char, and false for boolean.

Array c shows the creation of the array object followed by the assignment of Weeble objects to all the slots in the array. Array d shows the “aggregate initialization” syntax that causes the array object to be created (implicitly with new on the heap, just like for array c) and initialized with Weeble objects, all in one statement.

The expression

a = d;

shows how you can take a handle that’s attached to one array object and assign it to another array object, just as you can do with any other type of object handle. Now both a and d are pointing to the same array object on the heap.

Java 1.1 adds a new array initialization syntax, which could be thought of as a “dynamic aggregate initialization.” The Java 1.0 aggregate initialization used by d must be used at the point of d’s definition, but with the Java 1.1 syntax you can create and initialize an array object anywhere. For example, suppose hide( ) is a method that takes an array of Weeble objects. You could call it by saying:

hide(d);

but in Java 1.1 you can also dynamically create the array you want to pass as the argument:

hide(new Weeble[] { new Weeble(), new Weeble() });

This new syntax provides a more convenient way to write code in some situations.

The second part of the above example shows that primitive arrays work just like object arrays except that primitive arrays hold the primitive values directly.

Collections of primitives

Collection classes can hold only handles to objects. An array, however, can be created to hold primitives directly, as well as handles to objects. It is possible to use the “wrapper” classes such as Integer, Double, etc. to place primitive values inside a collection, but as you’ll see later in this chapter in the WordCount.java example, the wrapper classes for primitives are only somewhat useful anyway. Whether you put primitives in arrays or wrap them in a class that’s placed in a collection is a question of efficiency. It’s much more efficient to create and access an array of primitives than a collection of wrapped primitives.

Of course, if you’re using a primitive type and you need the flexibility of a collection that automatically expands when more space is needed, the array won’t work and you’re forced to use a collection of wrapped primitives. You might think that there should be a specialized type of Vector for each of the primitive data types, but Java doesn’t provide this for you. Some sort of templatizing mechanism might someday provide a better way for Java to handle this problem. [32]

Returning an array

Suppose you’re writing a method and you don’t just want to return one thing, but a whole bunch of things. Languages like C and C++ make this difficult because you can’t just return an array, only a pointer to an array. This introduces problems because it becomes messy to control the lifetime of the array, which easily leads to memory leaks.

Java takes a similar approach, but you just “return an array.” Actually, of course, you’re returning a handle to an array, but with Java you never worry about responsibility for that array – it will be around as long as you need it, and the garbage collector will clean it up when you’re done.

As an example, consider returning an array of String:

//: IceCream.java
// Returning arrays from methods

public class IceCream {
  static String[] flav = {
    "Chocolate", "Strawberry",
    "Vanilla Fudge Swirl", "Mint Chip",
    "Mocha Almond Fudge", "Rum Raisin",
    "Praline Cream", "Mud Pie" 
  };
  static String[] flavorSet(int n) {
    // Force it to be positive & within bounds:
    n = Math.abs(n) % (flav.length + 1);
    String[] results = new String[n];
    int[] picks = new int[n];
    for(int i = 0; i < picks.length; i++)
      picks[i] = -1;
    for(int i = 0; i < picks.length; i++) {
      retry:
      while(true) {
        int t =
          (int)(Math.random() * flav.length);
        for(int j = 0; j < i; j++)
          if(picks[j] == t) continue retry;
        picks[i] = t;
        results[i] = flav[t];
        break;
      }
    }
    return results;
  }
  public static void main(String[] args) {
    for(int i = 0; i < 20; i++) {
      System.out.println(
        "flavorSet(" + i + ") = ");
      String[] fl = flavorSet(flav.length);
      for(int j = 0; j < fl.length; j++)
        System.out.println("\t" + fl[j]);
    }
  }
} ///:~

The method flavorSet( ) creates an array of String called results. The size of this array is n, determined by the argument you pass into the method. Then it proceeds to choose flavors randomly from the array flav and place them into results, which it finally returns. Returning an array is just like returning any other object – it’s a handle. It’s not important that the array was created within flavorSet( ), or that the array was created anyplace else, for that matter. The garbage collector takes care of cleaning up the array when you’re done with it, and the array will persist for as long as you need it.

As an aside, notice that when flavorSet( ) chooses flavors randomly, it ensures that a random choice hasn’t been picked before. This is performed in a seemingly infinite while loop that keeps making random choices until it finds one that’s not already in the picks array. (Of course, a String comparison could also have been performed to see if the random choice was already in the results array, but String comparisons are inefficient.) If it’s successful it adds the entry and breaks out to go find the next one ( i gets incremented). But if t is a number that’s already in picks, then a labeled continue is used to jump back two levels, which forces a new t to be selected. It’s particularly convincing to watch this happen with a debugger.

main( ) prints out 20 full sets of flavors, so you can see that flavorSet( ) chooses the flavors in a random order each time. It’s easiest to see this if you redirect the output into a file. And while you’re looking at the file, remember, you’re not really hungry. (You just want the ice cream, you don’t need it.)

[32] This is one of the places where C++ is distinctly superior to Java, since C++ supports parameterized types with the template keyword.

Contents | Prev | Next