UnicodeView
Description
An abstract class whose subclasses provide a representational stream over a Unicode component. It allows users to access the various Unicode representations of a Unicode component (<UnicodeString>, <Grapheme>, <UnicodeScalar>, ...) in a flexible and uniform way.
Views also partially conform to a read-only <SequenceableCollection>. There are various APIs that allow for Accessing, Conversion, Copying and Iteration of views that do not impact the view's current position.
Instance State
• viewId: <Integer> constant value from the UnicodeViewIds pool dictionary. This gives generic VM primitives the information they need to perform view-specific operations.
• componentId: <Integer> constant value from the UnicodeComponentIds pool dictionary. This gives VM primitives additional information regarding the component that they are viewing.
• bytes: <Object> byte object providing the actual byte content that the view uses to stream over.
• offset: <Integer> offset into the byte object at which the view should start.
• len: <Integer> viewable length of the byte object.
• position: <Integer> view-relative position into the component.
• bookmark: <Integer> byte-relative position into the bytes of the component.
Bi-Directional
In addition to the forward cursoring behavior of traditional Smalltalk <Stream>s, views may also provide backward cursoring if the class answers true to #supportsBidirectionalStreaming.
View instances can be queried with the instance method #isBidirectional.
Below are examples of how to use backward cursoring.
"Backwards cursoring using #previous API. Similar to #next, it will raise an exception if an attempt is made
to go beyond the view boundaries"
| view |
view := 'ab' utf8.
view setToEnd.
self assert: [view atEnd].
self assert: [view previous = $b value].
self assert: [view previous = $a value].
self assert: [view atStart].
self assert: [[view previous. false] on: Exception do: [:ex | ex exitWith: true]].
"Backwards cursoring using #tryPrevious API. Similar to #tryNext, nil will be answered if an attempt is made
to go beyond the view boundaries"
view := 'ab' utf8.
view setToEnd.
self assert: [view atEnd].
self assert: [view tryPrevious = $b value].
self assert: [view tryPrevious = $a value].
self assert: [view atStart].
self assert: [view tryPrevious isNil].
Copy-On-Write
Views hook into a Unicode string's copy-on-write functionality. This means when a view is created on a <UnicodeString>, the content of the string can be considered immutable. Even if the Unicode string is modified after the fact, the view will see the content as it was when the view was created.
While bridging APIs have been provided on non-Unicode objects such as <String> to give access to the power of views, they do not support copy-on-write semantics. This means that views on these objects read from mutable data, and care should be taken if a user plans to mutate the data while working with a view on it.
| str view |
"Immutable View on UnicodeString"
str := 'abc' asUnicodeString.
view := str utf8.
str replaceFrom: 1 to: 3 with: 'xyz'.
self assert: [view contents = (Utf8 with: $a value with: $b value with: $c value)].
"Mutable View on String"
str := 'abc' copy.
view := str utf8.
str replaceFrom: 1 to: 3 with: 'xyz'.
self assert: [view contents = (Utf8 with: $x value with: $y value with: $z value)].
Collection Conformance
Views have an API that allows for accessing elements without consuming input. This gives views some limited ability to act like read-only <SequenceableCollection>s.
| view |
"Get a grapheme view on '<CrLf>abc'"
view := (String crlf, 'abc') graphemes.
"Get the CrLf grapheme at index 1. This is not guaranteed to be performed in constant-time O(1)"
self assert: [(view at: 1) = Grapheme crlf].
"Get the size of the view. This is not guaranteed to be performed in constant-time O(1)"
self assert: [view size = 4].
"Iterate and collect up all graphemes into an OrderedCollection"
self assert: [
(view inject: OrderedCollection new into: [:all :each | all add: each. all]) asArray = (view contentsInto: Array)].
"Find the index of an element"
self assert: [(view indexOf: $c asGrapheme) = 4].
"Does the view include the element?"
self assert: [view includes: $c asGrapheme]
Iteration methods begin at the current position, so they iterate over the remaining elements in the view. If you want to start from the beginning of the view, you must reset the view before calling the iteration method. You can also save the current position object beforehand so you can restore the original position quickly.
| view position str |
view := 'Smalltalk' asUnicodeString graphemes.
"Grab the next 5 elements and save that position"
self assert: [(view next: 5) = 'Small' asUnicodeString].
position := view position.
"Reset the view to the beginning"
view reset.
"Iterate through each element and add to str"
str := UnicodeString new.
view do: [:g | str add: g].
self assert: [str = 'Smalltalk' asUnicodeString].
"Restore the position that was saved off above"
view position: position.
self assert: [view upToEnd = 'talk' asUnicodeString]
Copying
Copying a Unicode view answers another Unicode view rather than a new collection, which makes copy a performant API.
| view |
view := 'smalltalk' graphemes.
[view isEmpty] whileFalse: [view := view copyFrom: 2. Transcript show: view contents; cr]
Contents
A view can collect up its entire contents into a desired collection. This is similar to what normal Smalltalk streams can do, but it also allows some flexibility in defining the result collection type (without conversion overhead).
All views will have a default collection type. This will be the collection answered when #contents is requested.
self assert: ['abc' graphemes contents isUnicodeString].
self assert: [
'abc' unicodeScalars contents isArray and: ['abc' unicodeScalars contents allSatisfy: [:s | s isUnicodeScalar]]].
self assert: ['abc' utf8 contents isKindOf: Utf8].
self assert: ['abc' utf16 contents isKindOf: Utf16].
self assert: ['abc' utf32 contents isKindOf: Utf32].
All views can also represent their content as a <ByteArray>. This will expose the actual byte-representation of the encoded contents of the view.
"Grapheme views will produce their contents as utf8 encoded bytes in a <ByteArray> container"
self assert: ['abc' graphemes asByteArray = #[97 98 99]].
"UnicodeScalar views will produce their contents as utf8 encoded bytes in a <ByteArray> container"
self assert: ['abc' unicodeScalars asByteArray = #[97 98 99]].
"Utf8 views will produce their contents as utf8 encoded bytes in a <ByteArray> container"
self assert: ['abc' utf8 asByteArray = #[97 98 99]].
"Utf16 little-endian views will produce their contents as utf16 encoded bytes in a <ByteArray> container"
self assert: ['abc' utf16LE asByteArray = #[97 0 98 0 99 0]].
"Utf32 big-endian views will produce their contents as utf32 encoded bytes in a <ByteArray> container"
self assert: ['abc' utf32BE asByteArray = #[0 0 0 97 0 0 0 98 0 0 0 99]].
Views are very efficient at making copies of themselves, and this can be used to help get the contents of a range in the view.
self assert: [('Smalltalk' graphemes copyFrom: 6) contents = 'talk' asUnicodeString]
Positions
Positions are first-class objects in Unicode views. Unlike an index, it's better to think of a position as the location between the elements such that, in a valid range, calling #next on a view at position n will yield the element in the view at index n+1.
The vertical pipes below represent positions while the carets represent indices.
"
| a | b | c |
  ^   ^   ^
"
Unicode views are often doing a lot of transcoding underneath when moving from one position to the next. For example, a <UnicodeString> stores its contents in UTF-8, but it presents its elements to the user as a collection of <Grapheme>s. A given position from a grapheme view on a <UnicodeString> might have a logical position that is very different from its associated position in the UTF-8 backing storage. Unlike a simple index, a <UnicodeViewPosition> object keeps track of this so streams can be efficiently positioned.
| positions view |
view := (String crlf , 'abc') asUnicodeString graphemes.
"Get all grapheme positions in the stream"
positions := view positions.
"<CrLf> is one grapheme but two UTF-8 bytes. This means position 2 (just after <CrLf> and 'a') has a value of 2,
but its bookmark (encoded offset into backing storage) will be 3"
view position: 2.
self assert: [view position value = 2].
self assert: [view position bookmark = 3].
"Reposition the stream to the end directly - no transcoding required"
view position: positions last.
self assert: [view previous = $c asGrapheme].
"Find the position, such that calling #next will yield the element to find"
view reset.
self assert: [(view positionOf: $b asGrapheme) value = 2].
Streaming
Unicode views have a rich streaming interface. They have a traditional <ReadStream> API that provides high levels of compatibility with the [view atEnd] whileFalse: [view next] style of programming. Views also have streaming APIs prefixed with try that will answer nil instead of raising exceptions when the view cursor runs out of bounds. Views may optionally support bidirectional streaming, which means you can request previous elements until you are at the beginning (i.e. [view atStart] whileFalse: [view previous]).
| view |
view := 'Smalltalk' asUnicodeString graphemes.
"Supports traditional method of cursoring through streams"
[view atEnd] whileFalse: [view next].
view reset.
[view atEnd] whileFalse: [view skip].
"Same example but using tryNext/trySkip.
Avoids throwing exceptions when running off the end"
view reset.
[view tryNext notNil] whileTrue: [].
view reset.
[view trySkip] whileTrue: [].
self assert: [view tryNext isNil].
"Lots of convenience methods for streaming"
view reset.
self assert: [(view next: 5) = 'Small'].
self assert: [(view tryNext: 1000) = 'talk'].
view reset.
self assert: [view peek = $S asGrapheme].
view next: 5.
self assert: [view contents = 'Smalltalk' asUnicodeString].
self assert: [view remainingContents = 'talk' asUnicodeString].
self assert: [view upToEnd = 'talk'].
self assert: [view atEnd].
"Views can even go backwards!"
self assert: [view previous = $k asGrapheme].
[view atStart] whileFalse: [view previous].
self assert: [view tryPrevious isNil]
@see 'Class Comments' of subclasses for more information on specific views.
Class Methods
on:
Answer a read stream that produces an immutable view on
@aUnicodeComponent.
Arguments:
aUnicodeComponent - <UnicodeString | Grapheme | UnicodeScalar | String | ByteArray>
Answers:
<UnicodeView subclass> instance
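As a rough illustration, a view can presumably be created directly from a concrete subclass. This sketch assumes that GraphemeView (the class named under #asByteArray) inherits #on: and that the result behaves like the view answered by the #graphemes bridging API; neither point is confirmed by this comment.

```smalltalk
| view |
"Assumption: GraphemeView responds to #on: as inherited from UnicodeView"
view := GraphemeView on: 'abc' asUnicodeString.
self assert: [view isUnicodeView].
self assert: [view contents = 'abc' asUnicodeString]
```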
supportsBidirectionalStreaming
Answers true if the concrete view supports bidirectional streaming.
This will give access to APIs like #previous, #tryPrevious.
All views can move forward using #next, #tryNext and related.
Answers:
<Boolean> true if bidirectional, false otherwise.
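A minimal sketch of guarding backward cursoring with the class-side query. It assumes the class-side answer and the instance-side #isBidirectional agree for a given view, which is implied but not stated outright here.

```smalltalk
| view |
view := 'abc' graphemes.
view setToEnd.
"Only attempt #previous when the concrete view class supports it"
view class supportsBidirectionalStreaming
	ifTrue: [self assert: [view previous = $c asGrapheme]]
	ifFalse: [self assert: [view isBidirectional not]]
```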
viewClassFor:
Answer the view class for a given @anId.
@anId should be one of the constants defined in the pool dictionary 'UnicodeViewIds'
Arguments:
anId - <Integer> constant from UnicodeViewIds pool dictionary
Answers:
<UnicodeView class> or nil if there are no matches
Instance Methods
asByteArray
Convert the complete contents of the view into a byte array.
GraphemeView - UTF-8 representation of all graphemes in the view
UnicodeScalarView - UTF-8 representation of all unicode scalars in the view
Utf8View - UTF-8 representation in a <ByteArray> container (instead of a <Utf8> container)
Utf16View (and subclasses) - endian-sensitive byte representation of all UTF-16 code units in the view
Utf32View (and subclasses) - endian-sensitive byte representation of all UTF-32 code units in the view
Example:
self assert: ['a' utf16LE asByteArray = #[97 0]].
self assert: ['a' utf16BE asByteArray = #[0 97]].
self assert: [String crlf graphemes asByteArray = #[13 10]]
Answers:
<ByteArray>
at:
Answer the element at a particular index within the stream.
This will not consume any input and the position will be the
same before and after the call
Notes:
This is not guaranteed to be performed in constant-time O(1).
Access times may increase as @anIndex gets larger.
Example:
self assert: [('abc' graphemes at: 2) = $b asGrapheme]
Arguments:
anIndex - <Integer>
Answers:
<Object>
atEnd
Answer a Boolean which is true if the receiver cannot
access any more objects, and false otherwise.
Example:
| view |
view := 'abc' graphemes.
self assert: [view atEnd not].
view skip: 3.
self assert: [view atEnd].
Answers:
<Boolean>
atStart
Answer a Boolean which is true if the receiver is positioned at the
start of the view (no previous objects are accessible), and false otherwise.
Example:
| view |
view := 'abc' graphemes.
view setToEnd.
self assert: [view atStart not].
view previous; previous; previous.
self assert: [view atStart].
Answers:
<Boolean>
close
Compat: Take no action.
contents
Answer the collection that the receiver is streaming over.
Example:
self assert: ['abc' graphemes contents = 'abc' asUnicodeString].
self assert: [String crlf unicodeScalars contents = { UnicodeScalar cr. UnicodeScalar lf }].
self assert: ['abc' utf8 contents = (Utf8 with: $a value with: $b value with: $c value)]
Answers:
<Object> instance of view collection
contentsInto:
Answer the collection over which the receiver is viewing, placed in a new instance of @aCollectionClass.
@aCollectionClass options depend on the view type:
GraphemeView - Any byte-shaped object or pointer-shaped object class
(e.g. ByteArray, Array...)
UnicodeScalarView - Any byte-shaped object or pointer-shaped object class
(e.g. ByteArray, Array...)
Utf8View - Any byte-shaped object class
(e.g. ByteArray, Utf8)
Utf16View - Any byte-shaped or word-shaped object class
(e.g. ByteArray, Utf16)
Utf32View - Any byte-shaped or long-shaped object class
(e.g. ByteArray, Utf32)
Example:
self assert: [('abc' graphemes contentsInto: Array) = 'abc' asUnicodeString asArray].
self assert: [(String crlf unicodeScalars contentsInto: ByteArray) = #[13 10]].
Arguments:
aCollectionClass - <Object>
Answers:
<Object> instance of @aCollectionClass
copy
Answer a shallow copy of this view.
API usage on this copy will not impact the receiver.
Example:
| view peek |
view := 'abc' graphemes.
self assert: [view next = $a asGrapheme].
self assert: [(peek := view copy next) = $b asGrapheme].
self assert: [view next = $b asGrapheme].
Answers:
<UnicodeView> copy
copyFrom:
Answer a new view of the collection over which the receiver is viewing
from @index to the end of the view content.
If @index is < 1 or @index > size, then an empty view will be answered.
Example:
self assert: [('abcde' utf8 copyFrom: 3) asByteArray = #[99 100 101]].
self assert: [('abcde' graphemes copyFrom: 2) contents = 'bcde' asUnicodeString]
Arguments:
index - <Integer>
Answers:
<UnicodeView> new view over the range [index, self size] or an empty view [0, 0]
copyFrom:to:
Answer a new view of the collection over which the receiver is viewing
from @firstIndex to @lastIndex (inclusive).
If @firstIndex is < 1 or @firstIndex > size, then an empty view will be answered.
If @lastIndex is < 1 or @lastIndex > size, then an empty view will be answered.
If @lastIndex < @firstIndex, then an empty view will be answered.
Example:
self assert: [('abcdef' utf8 copyFrom: 3 to: 5) asByteArray = #[99 100 101]].
self assert: [('abcdef' graphemes copyFrom: 2 to: 5) contents = 'bcde' asUnicodeString]
Arguments:
firstIndex - <Integer>
lastIndex - <Integer>
Answers:
<UnicodeView> new view over the range [firstIndex, lastIndex] or an empty view [0, 0]
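The out-of-range rules above could be exercised roughly as follows. This is only a sketch: it assumes #isEmpty is a suitable way to detect the empty view that is answered in each case.

```smalltalk
"firstIndex < 1 answers an empty view"
self assert: [('abc' graphemes copyFrom: 0 to: 2) isEmpty].
"lastIndex > size answers an empty view"
self assert: [('abc' graphemes copyFrom: 2 to: 9) isEmpty].
"lastIndex < firstIndex answers an empty view"
self assert: [('abc' graphemes copyFrom: 3 to: 2) isEmpty]
```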
do:
Evaluate the one argument block, aBlock for each of the remaining
elements accessible by the receiver.
Fail if aBlock is not a one-argument Block.
@Note: The stream position will remain the same before/after the call.
Example:
'abc' graphemes do: [:e | Transcript show: e asString]
Arguments:
aBlock - <Block> 1-arg block
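Because only the remaining elements are visited, the starting position matters. The following sketch, built only from messages described in this comment, advances the view first and then checks that iteration starts from there and leaves the position untouched.

```smalltalk
| view seen |
view := 'abcde' graphemes.
view next: 2.
seen := OrderedCollection new.
"Only the remaining elements (c, d, e) are visited"
view do: [:e | seen add: e].
self assert: [seen size = 3].
self assert: [seen first = $c asGrapheme].
"The position is the same as before the call"
self assert: [view position value = 2]
```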
doWithIndex:
Evaluate the two argument block, aBlock using
each element of the receiver, in order, and the element
index. Fail if aBlock is not a two-argument Block.
@Note: The stream position will remain the same before/after the call.
Example:
'abc' graphemes doWithIndex: [:e :i | self assert: [('abc' at: i) asGrapheme = e]]
Arguments:
aBlock - <Block> 2-arg block with <Object> element and <Integer> index
doWithPosition:
Evaluate the two argument block, aBlock using
each element of the receiver, in order, and the element's
position. Fail if aBlock is not a two-argument Block.
The position of an element relates to the cursor between elements.
The first position is always 0, the last position in a collection of n elements
is n. The total number of positions in a collection of n elements is n+1.
The vertical pipes below represent positions while the carets represent indices
| a | b | c |
  ^   ^   ^
@Note: The stream position will remain the same before/after the call.
Example:
| view positions |
view := 'abc' graphemes.
positions := view positions.
view doWithPosition: [:e :p | self assert: [p = positions removeFirst]].
self assert: [positions isEmpty]
Arguments:
aBlock - <Block> 2-arg block with <Object> element and <UnicodeViewPosition> position
first
Answer the first element in the view or raise index out of range
exception if no elements available.
Example:
self assert: ['abc' graphemes first = $a asGrapheme].
self assert: [(['' graphemes first = $a asGrapheme. false] on: Exception do: [:ex | ex exitWith: true])].
Answers:
<Object>
Raises:
ExCLDTIndexOutOfRange if no elements are available
firstPosition
Answer the first position in the stream before any elements
have been processed
Example:
| view |
view := 'abc' graphemes.
view positionOf: $d asGrapheme ifAbsent: [view firstPosition]
Answers:
<UnicodeViewPosition>
flush
Compat: Take no action
includes:
Answer true if there exists an element in the view that is equivalent
to @anElement.
If the receiver does not contain an element that is equivalent to
@anElement, answer false.
Searching begins at the current position. After the call, the position
of the stream will be restored to where it was before the call.
Example:
self assert: ['abcde' unicodeScalars includes: $c asUnicodeScalar].
self assert: [('abcde' unicodeScalars includes: $z asUnicodeScalar) not].
Arguments:
anElement - <Object>
Answers:
<Boolean> true if found, false otherwise
indexOf:
Answer an <Integer> which is the first index within the view
of an element that is equivalent to @anElement.
If the receiver does not contain an element that is equivalent to
@anElement, answer 0.
Searching begins at the current position. After the call, the position
of the stream will be restored to where it was before the call.
Example:
self assert: [('abcde' unicodeScalars indexOf: $c asUnicodeScalar) = 3].
self assert: [('abcde' unicodeScalars indexOf: $z asUnicodeScalar) = 0].
Arguments:
anElement - <Object>
Answers:
<Integer> index if found, 0 otherwise
indexOf:ifAbsent:
Answer an <Integer> which is the first index within the view
of an element that is equivalent to @anElement.
If the receiver does not contain an element that is equivalent to
@anElement, answer the result of evaluating the zero argument block,
exceptionBlock.
Searching begins at the current position. After the call, the position
of the stream will be restored to where it was before the call.
Example:
self assert: [('abcde' unicodeScalars indexOf: $c asUnicodeScalar ifAbsent: ['None']) = 3].
self assert: [('abcde' unicodeScalars indexOf: $z asUnicodeScalar ifAbsent: ['None']) = 'None'].
Arguments:
anElement - <Object>
exceptionBlock - <Block> 0-arg
Answers:
<Integer> index if found
<Object> result of exception block if not found
inject:into:
Answer an Object which is the final result of iteratively evaluating
the two argument block, @aBlock using the previous result of evaluating
@aBlock and each element of the view as arguments. The first
argument of @aBlock represents the result of the previous iteration
and the second represents an element in the view. The initial
value for the first block argument is the argument @thisValue.
Fail if aBlock is not a two-argument Block
Fail if the evaluation of aBlock cannot be successfully used
as the first parameter on subsequent iterations.
Example:
self assert: [('abc' graphemes inject: 0 into: [:sum :e | sum + e utf8 size]) = 3]
Arguments:
thisValue - <Object>
aBlock - <Block> 2-argument block
Answers:
<Object>
isBidirectional
Answers true if the concrete view supports bidirectional streaming.
This will give access to APIs like #previous, #tryPrevious.
All views can move forward using #next, #tryNext and related.
Example:
| view |
view := 'abcde' graphemes.
[view atEnd] whileFalse: [view next].
self assert: [view atEnd].
view isBidirectional ifTrue: [
[view atStart] whileFalse: [view previous].
self assert: [view atStart]
].
Answers:
<Boolean> true if bidirectional, false otherwise.
isEmpty
Answer true if the contents of the view are empty.
This is relative to the complete contents and is not
impacted by the current position.
Example:
self assert: ['' graphemes isEmpty].
self assert: ['abc' graphemes isEmpty not].
Answers:
<Boolean>
isUnicodeView
Answer true if the receiver is a unicode view, false otherwise.
Answers:
<Boolean>
lineDelimiter
Answer the line delimiter of the receiver.
Notes:
Some subclasses may not support line delimiters and will answer nil.
Answers:
<Object>
lineDelimiter:
Set the line delimiter of the receiver.
This is only designed to be implemented by subclasses
that support line endings
Arguments:
anObject - <Object>
next
Answer an Object that is the next accessible by the
receiver. Change the state of the receiver so that
returned object is no longer accessible.
Raise exception if at the end.
Example:
| view |
view := ('abc' , String crlf) graphemes.
self assert: [view next = $a].
self assert: [view next = $b].
self assert: [view next = $c].
self assert: [view next = Grapheme crlf].
self assert: [[view next. false] on: Exception do: [:ex | ex exitWith: true]].
Answers:
<Object>
Raises:
<Exception> ExCLDTIndexOutOfRange
next:
Answer a collection containing the next @anInteger elements from the view.
If @anInteger < 1, an empty collection is answered
Example:
| view |
view := ('abc' , String crlf) graphemes.
self assert: [(view next: 3) = 'abc'].
self assert: [(view next: 1) = UnicodeString crlf].
self assert: [[(view next: 1). false] on: Exception do: [:ex | ex exitWith: true]].
Arguments:
anInteger - <Integer>
Answers:
<Object> instance of view collection class
Raises:
<Exception> ExCLDTIndexOutOfRange
peek
Answer an Object that is the next accessible by the receiver.
The state of the receiver is not changed, so the returned object remains accessible.
Answer nil if the view is atEnd.
Example:
| view |
view := 'abc' graphemes.
self assert: [view peek = $a asGrapheme].
self assert: ['' graphemes peek isNil].
Answers:
<Object> or nil if at end
position
Answer a <UnicodeViewPosition> representing the current position of access for
the receiver. This can be used later to efficiently cursor to positions in variable-length
views.
Unlike a <PositionableStream>, this uses a <UnicodeViewPosition> object instead of an <Integer>.
This is because transcoding may be required to move from one position to another. This <UnicodeViewPosition>
carries around with it an 'encoded offset' that can be used for constant time positioning.
Example:
| view pos |
view := 'abc' graphemes.
view position: 3.
pos := view position.
view reset.
self assert: [view position value = 0].
view position: pos.
self assert: [view position value = 3]
Answers:
<UnicodeViewPosition>
position:
Set the receiver's position reference to argument aPosition.
Answer self.
Example:
| view pos |
view := 'abcde' graphemes.
pos := view setToEnd; position.
self assert: [(view reset; next: 3) = 'abc'].
view position: pos.
self assert: [view previous = $e]
Arguments:
aPosition - <UnicodeViewPosition | Integer>
positionOf:
Answer a <UnicodeViewPosition> which is the first position within the view
such that calling #next will yield an element equivalent to @anElement.
If the receiver does not contain an element that is equivalent to
@anElement, answer the first position (position 0).
The position of an element relates to the cursor between elements.
The first position is always 0, the last position in a collection of n elements
is n. The total number of positions in a collection of n elements is n+1.
The vertical pipes below represent positions while the carets represent indices
| a | b | c |
  ^   ^   ^
Searching begins at the current position. After the call, the position
of the stream will be restored to where it was before the call.
Example:
self assert: [('abcde' unicodeScalars positionOf: $c asUnicodeScalar) value = 2].
self assert: [('abcde' unicodeScalars positionOf: $z asUnicodeScalar) value = 0].
Arguments:
anElement - <Object>
Answers:
<UnicodeViewPosition> position if found or first position if not found
positionOf:ifAbsent:
Answer a <UnicodeViewPosition> which is the first position within the view
such that calling #next will yield an element equivalent to @anElement.
If the receiver does not contain an element that is equivalent to
@anElement, answer the result of evaluating the zero argument block,
exceptionBlock.
The position of an element relates to the cursor between elements.
The first position is always 0, the last position in a collection of n elements
is n. The total number of positions in a collection of n elements is n+1.
The vertical pipes below represent positions while the carets represent indices
| a | b | c |
  ^   ^   ^
Searching begins at the current position. After the call, the position
of the stream will be restored to where it was before the call.
Example:
| view |
view := 'abcde' unicodeScalars.
self assert: [(view positionOf: $c asUnicodeScalar ifAbsent: [view firstPosition]) value = 2].
self assert: [(view positionOf: $z asUnicodeScalar ifAbsent: [view firstPosition]) value = 0].
Arguments:
anElement - <Object>
exceptionBlock - <Block> 0-arg
Answers:
<UnicodeViewPosition> position if found
<Object> result of exception block if not found
positions
Answer the sequenceable collection of all positions in this view
starting at position 0 and ending at the position after the last
element.
The grapheme positions in the unicode string in the example
below have the following layout (where a position is displayed as <pos #>)
| <0> 'a' <1> 'b' <2> 'c' <3> |
Example:
| view positions |
view := 'abc' graphemes.
positions := view positions.
view position: positions last.
self assert: [view atEnd].
view position: (positions at: 3).
self assert: [view next = $c asGrapheme].
self assert: [view position = positions last]
Answers:
<SequenceableCollection>
previous
Answer an Object that is the previous accessible by the
receiver. Change the state of the receiver so that
returned object is no longer accessible.
Raise exception if an attempt is made to view before the
start.
Example:
| view |
view := ('abc' , String crlf) graphemes.
view setToEnd.
self assert: [view previous = Grapheme crlf].
self assert: [view previous = $c].
self assert: [view previous = $b].
self assert: [view previous = $a].
self assert: [[view previous. false] on: Exception do: [:ex | ex exitWith: true]].
Answers:
<Object>
Raises:
<Exception> ExCLDTIndexOutOfRange
remainingContents
Answer the collection over which the receiver is viewing starting at the current position.
Example:
self assert: [('abcde' graphemes next; next; contents) = 'abcde'].
self assert: [('abcde' graphemes next; next; remainingContents) = 'cde']
Answers:
<Object> instance of view collection
remainingContentsInto:
Answer the collection over which the receiver is viewing starting at the current position
and place the result in a new instance of @aCollectionClass
Example:
self assert: [('abcde' graphemes next; next; contentsInto: Array) = 'abcde' asArray].
self assert: [('abcde' graphemes next; next; remainingContentsInto: Array) = 'cde' asArray]
Arguments:
aCollectionClass - <Object>
Answers:
<Object> instance of @aCollectionClass
reset
Set the receiver's position reference to 0.
This also resets the bookmark used for positioning within the internal representation
of the component
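A short sketch of the effect of #reset, using only messages described elsewhere in this comment. It checks the logical position only; the bookmark value after a reset is deliberately not asserted, since it is not spelled out here.

```smalltalk
| view |
view := (String crlf , 'abc') asUnicodeString graphemes.
view next: 2.
self assert: [view position value = 2].
"Return to the beginning of the view"
view reset.
self assert: [view atStart].
self assert: [view position value = 0].
self assert: [view next = Grapheme crlf]
```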
reverseDo:
Evaluate the one argument block, aBlock for each of the
elements accessible by the receiver, in reverse order.
Fail if aBlock is not a one-argument Block.
@Note: The stream position will remain the same before/after the call.
Example:
| expected |
expected := 'abc' reverse asOrderedCollection collect: [:e | e asGrapheme].
'abc' graphemes reverseDo: [:e | self assert: [e = expected removeFirst]].
self assert: [expected isEmpty]
Arguments:
aBlock - <Block> 1-arg block
setToEnd
Set the position of the receiver to be the size of the
underlying contents
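A brief sketch of #setToEnd, assuming (as the #positions example suggests) that the end position equals the last entry answered by #positions.

```smalltalk
| view |
view := 'abc' graphemes.
view setToEnd.
self assert: [view atEnd].
"The end position is the position after the last element"
self assert: [view position = view positions last].
self assert: [view tryNext isNil]
```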
size
Answer the number of elements in the view.
Notes:
Depending on the view, this may be an expensive calculation if transcoding is required.
Callers are encouraged to review usage to see if this is really required.
For example, switch out calls like
view size = 0 ifTrue: [...]
with
view isEmpty ifTrue: [...]
Example:
self assert: [('abc' , String crlf) graphemes size = 4].
self assert: [('abc' , String crlf) unicodeScalars size = 5].
Answers:
<Integer>
skip
Read and discard the next element from the view.
Fail if the view is atEnd.
Example:
self assert: [('abcde' utf8 skip; skip; upToEnd) asByteArray = 'cde' asByteArray]
Answers:
<Boolean> true if skip, false if atEnd
Raises:
<Exception> ExCLDTIndexOutOfRange
skip:
Read and discard @anInteger number of elements from the view.
Fail if the number of elements in the view < @anInteger
Example:
self assert: [('abcde' utf8 skip: 2; upToEnd) asByteArray = 'cde' asByteArray]
Arguments:
anInteger - <Integer>
Raises:
<Exception> ExCLDTIndexOutOfRange
skipTo:
Read and discard elements just past the occurrence of @anObject.
Example:
self assert: [('abcde' graphemes skipTo: $c; upToEnd) = 'de'].
self assert: [('abcde' graphemes skipTo: $z; upToEnd) = '']
Arguments:
anObject - <Object>
Answers:
<Boolean> true if found, false otherwise
skipToAll:
Attempt to read and discard elements just past the occurrence of @aSequenceableCollection.
Answer true if all elements in @aSequenceableCollection occurred, else answer false.
Example:
self assert: ['abcde' graphemes skipToAll: 'bc'].
self assert: [('abcde' graphemes skipToAll: 'bc'; upToEnd) = 'de'].
self assert: [('abcde' graphemes skipToAll: 'zzz') not].
self assert: [('abcde' graphemes skipToAll: 'zzz'; upToEnd) = ''].
Arguments:
aSequenceableCollection - <SequenceableCollection>
Answers:
<Boolean>
skipToAny:
Read and discard elements beyond the next occurrence
of an element that exists in @aSequentialCollection or if none,
to the end of stream.
Answer true if an element in @aSequentialCollection
occurred, else answer false.
Example:
self assert: ['abcde' graphemes skipToAny: 'bd'].
self assert: [('abcde' graphemes skipToAny: 'bd'; upToEnd) = 'cde'].
self assert: [('abcde' graphemes skipToAny: 'zzz') not].
self assert: [('abcde' graphemes skipToAny: 'zzz'; upToEnd) = ''].
Arguments:
aSequentialCollection - <SequenceableCollection>
Answers:
<Boolean>
supportsLineDelimiters
Subclasses that support line delimiters should override
and answer true. This will allow for APIs like #nextLine.
Answers:
<Boolean>
tryNext
Answer an Object that is the next accessible by the
receiver. Change the state of the receiver so that
returned object is no longer accessible.
Answer nil if the view is atEnd
Example:
| view |
view := ('abc' , String crlf) graphemes.
self assert: [view tryNext = $a].
self assert: [view tryNext = $b].
self assert: [view tryNext = $c].
self assert: [view tryNext = Grapheme crlf].
self assert: [view tryNext isNil].
Answers:
<Object>
tryNext:
Answer a collection containing UP TO the next @anInteger elements from the view.
If @anInteger < 1, an empty collection is answered
Example:
| view |
view := ('abc' , String crlf) graphemes.
self assert: [(view tryNext: 3) = 'abc'].
self assert: [(view tryNext: 1) = UnicodeString crlf].
self assert: [(view tryNext: 1) isEmpty].
Arguments:
anInteger - <Integer>
Answers:
<Object> instance of view collection class
tryPrevious
Answer an Object that is the previous accessible by the
receiver. Change the state of the receiver so that
returned object is no longer accessible.
Answer nil if the view is already atStart
Example:
| view |
view := ('abc' , String crlf) graphemes.
view setToEnd.
self assert: [view tryPrevious = Grapheme crlf].
self assert: [view tryPrevious = $c].
self assert: [view tryPrevious = $b].
self assert: [view tryPrevious = $a].
self assert: [view tryPrevious isNil].
Answers:
<Object>
trySkip
Change the state of the receiver so that the next object is no longer accessible.
Answers true if there are more elements in the view, false if the view is atEnd
Example:
| view |
view := 'abcde' utf8.
self assert: [(view trySkip; trySkip; upToEnd) asByteArray = 'cde' asByteArray].
self assert: [view trySkip not]
Answers:
<Boolean>
trySkip:
Attempt to read and discard @anInteger number of elements from the view.
Answer the number of elements actually skipped
Example:
| view |
view := 'abcde' utf8.
self assert: [(view trySkip: 2; upToEnd) asByteArray = 'cde' asByteArray].
view previous.
self assert: [(view trySkip: 2) = 1]
Arguments:
anInteger - <Integer>
Answers:
<Integer> number actually skipped
upTo:
Answers a collection of all of the objects in the view
beginning from the current position up to, but not including,
@anObject.
Example:
self assert: [('abcde' graphemes upTo: $c) = 'ab'].
self assert: [('abcde' graphemes upTo: $z) = 'abcde']
Arguments:
anObject - <Object>
Answers:
<Object> instance of view collection class
upToAll:
Answers a collection of all of the objects in the view beginning from the current position up to,
but not including, @aSequenceableCollection
Example:
self assert: [('abcde' graphemes upToAll: 'bc') = 'a'].
self assert: [('abcde' graphemes upToAll: 'bc'; upToEnd) = 'de'].
self assert: [('abcde' graphemes upToAll: 'zzz') = 'abcde'].
self assert: [('abcde' graphemes upToAll: 'zzz'; upToEnd) isEmpty].
Arguments:
aSequenceableCollection - <SequenceableCollection>
Answers:
<Object> instance of view collection class
upToAny:
Answers a collection of all of the objects in the view up to, but not including, the next occurrence
of any element that exists in @aSequenceableCollection. If no such element is found and the end of the
view is encountered, a collection of the objects read is returned.
Example:
self assert: [('abcde' graphemes upToAny: 'bd') = 'a'].
self assert: [('abcde' graphemes upToAny: 'bd'; upToEnd) = 'cde'].
self assert: [('abcde' graphemes upToAny: 'zzz') = 'abcde'].
self assert: [('abcde' graphemes upToAny: 'zzz'; upToEnd) isEmpty].
Arguments:
aSequenceableCollection - <SequenceableCollection>
Answers:
<Object> instance of view collection class
upToEnd
Answer a collection containing all of the remaining elements, from the current position to the end of the view.
If there are no more elements available to be read, then an empty collection is answered.
Example:
self assert: ['abcde' graphemes upToEnd = 'abcde'].
self assert: [('abcde' graphemes next: 2; upToEnd) = 'cde'].
self assert: ['' graphemes upToEnd = '']
Answers:
<Object> instance of view collection class
with:do:
Iteratively evaluate the two argument block, @aBlock using each element of
this view and the corresponding element of @aUnicodeView.
@Note - Unlike some other variations of this API, this version DOES allow processing
when the views are of different sizes. The processing will terminate when either view runs out of
elements to process.
Fail if aUnicodeView is not a kind of UnicodeView.
Fail if aBlock is not a two-argument block.
Example:
'abc' graphemes with: 'abc' utf8 do: [:g :u | self assert: [g asciiValue = u]]
Arguments:
aUnicodeView - <UnicodeView>
aBlock - <Block> 2-arg block
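The note about differently sized views can be sketched as follows. Based on the description above, the block is expected to run once per pair until the shorter view is exhausted; the counter variable is introduced purely for illustration.

```smalltalk
| count |
count := 0.
"The grapheme view has 5 elements, the UTF-8 view only 3"
'abcde' graphemes with: 'abc' utf8 do: [:g :u | count := count + 1].
self assert: [count = 3]
```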