Collection classes
The CLDT collection classes are Array, Bag, ByteArray, DBString, Dictionary, IdentityDictionary, Interval, LookupTable, OrderedCollection, Set, SortedCollection, String and Symbol. They are described in Chapters 9 and 10 of the Blue Book.
Note:
Three seldom-used Blue Book collection classes have been excluded from CLDT, namely LinkedList, MappedCollection, and RunArray. The design of LinkedList betrays too much of its implementation and is not an industrial-strength List abstract data type. RunArray was used mainly in the original Smalltalk graphics system to store bitmaps. MappedCollection allows for one level of indirection in Dictionary accesses and is seldom used in practice. Moreover, in most cases, the Dictionary message at:ifPresent: can be used to achieve the same effect.
A collection is a group of objects that can be manipulated and operated on either individually or as a whole. The objects contained by a collection are called its elements. The number of elements in a collection is its size. Most collections are generic containers that can hold any kind of object, but five collections (ByteArray, DBString, Interval, String, and Symbol) place restrictions on the class of their elements. As a rule, developers should use care when defining recursive collections, that is, collections that have themselves, directly or indirectly, as one of their own elements. In a recursive collection there are operations, for example, =, that can fail in some implementations.
The elements of a collection can be operated on as a whole using enumeration methods that traverse each element of the collection. The canonical enumeration method is do:. These enumeration methods replace the for, while, or do constructs used in traditional programming languages to build iterative procedures. Any collection that is ordered has a standard enumeration order, that is, a standard order in which elements are traversed by the do: operation. They are in fact traversed in order of monotonically increasing indices. This ordering is guaranteed to be maintained from traversal to traversal. Unordered collections by definition have no standard traversal order and are therefore traversed in an undefined order. In particular, subsequent traversals of the same unordered collection are not guaranteed to access elements in the same order.
Collections can be indexed or nonindexed. Indexed collections allow individual elements to be accessed by means of an external key. The key can be either an integer or an arbitrary object, depending on the collection.
Bag and Set are the simplest kinds of collections, because they are unordered and do not support an indexing mechanism. Bags can contain duplicate elements, while Sets cannot. Otherwise, their behavior is identical. Neither Bags nor Sets can contain an instance of UndefinedObject (usually referred to as nil); that is, attempts to add nil to a Bag or Set are ignored. The elements of a Bag or Set cannot be individually accessed, because only group operations are supported.
A Dictionary is an unordered collection whose elements are accessed by an explicitly assigned external key. Keys are often strings or symbols, but in principle any object can be used as a Dictionary key. Key matching for storing or accessing elements is based on the equality operation =. Consequently, keys must be unique with respect to the equality operation, that is, two elements cannot be associated with the same key nor can two keys be equal. There are no restrictions on the elements stored in a Dictionary.
IdentityDictionary is a specialization of Dictionary, that uses the equivalence operation (==) rather than equality (=) to match keys. In practice, the keys of an IdentityDictionary are usually symbols, although it is not a requirement.
LookupTable has the same protocols as Dictionary but does not contain any associations. This makes LookupTable faster than Dictionary for everything except associationsDo:. It also requires less space.
The elements of the ordered collections are all indexed by integer keys that begin at 1 and increase. The ordered collections are Array, ByteArray, DBString, OrderedCollection, SortedCollection, String, and Symbol. OrderedCollection offers the most comprehensive protocol of any of the collection classes, and is consequently the one used most often by developers. OrderedCollections do not have a fixed size, but can grow as needed to hold additional elements. The elements of an OrderedCollection can be any class of object. SortedCollection is similar to OrderedCollection, except that the elements are held in sorted order. The sort ordering is determined by the sort block associated with the SortedCollection. A sort block is a block with two formal parameters, which answers a Boolean value when provided with any two elements of the collection. If the answer is true then meter should be sorted before the second, whereas false indicates that the opposite order is correct.
The size of Array, ByteArray, DBString, String, and Symbol collections is fixed when they are created, and cannot be changed. Fixed-size collections are normally implemented as contiguous blocks of memory; consequently they can usually support efficient accessing, copying, and traversing operations. While the elements of an Array can be any object, the elements of a ByteArray can be integers between 0 and 255 (in other words, bytes). The elements of DBString, String and Symbol must be characters. The elements of String must be characters ranging in value from 0 to 255, whereas the elements of DBString can be characters in the range 0 to 65535, that is, double-byte characters. A Symbol is distinguished from a String in that it is immutable, that is, a Symbol is a read-only object once it has been created.
Note:
Symbols cannot contain characters whose value exceeds 255.
Porting tip:
The class DBString is analagous to the class TwoByteString in Objectworks\Smalltalk, and to the class DoubleByteString in Smalltalk/V. Because CLDT does not support Symbols that contain double-byte characters, the Objectworks\Smalltalk class TwoByteSymbol, and Smalltalk/V class DoubleByteSymbol have no CLDT equivalent.
An Interval is not really a collection at all, but a generator that represents a geometric progression. The elements of an Interval must be numbers (integers, fractions, or floats). Rather than store its elements, an Interval computes them when required, using the start, stop, and increment values provided when it is created.