UnicodeReadStream
Description
This is an adapter class used for bridging <UnicodeView>s with <ReadStream>s.
By default, this streams 'graphemes', which are user-perceived characters. A grapheme is represented in VAST by a <Grapheme> object. A <UnicodeString> in VAST is to be thought of as a <Collection> of <Grapheme>s.
If you need more technical parsing precision or closer line-ending compatibility with <Character>, you can put this stream into Unicode scalar mode by calling #switchToUnicodeScalarMode. A Unicode scalar is represented in VAST by a <UnicodeScalar> object. A <UnicodeScalar> can represent any Unicode code point except those in the surrogate range (U+D800 to U+DFFF), which is reserved for UTF-16 encoding.
If you are working with pure Unicode, then consider using views rather than this adapter class.
@see the class category Views on a <UnicodeString> for more details.
Instance State
CLDT-API
This adapter redefines the necessary <ReadStream> (and superclass) APIs to allow for efficient streaming of a <UnicodeString>. In most cases, this means delegating to the internal view, which tends to implement operations more efficiently for variable-width collections than <ReadStream> does.
Modes
This stream can go into different modes which define how elements of the stream are to be interpreted. While the default mode is graphemes, you can switch to different modes using the APIs in the Modes category. Switching a mode will always reset the stream to the beginning.
For example, if you wanted to process a string object as a <Collection> of <UnicodeScalar>s, you could do the following:
| stream |
stream := 'Smalltalk' asUnicodeString readStream.
"Process stream as unicode scalars"
stream switchToUnicodeScalarMode.
self assert: [stream next = $S asUnicodeScalar].
self assert: [(stream next: 8) = ('Smalltalk' asUnicodeString unicodeScalars copyFrom: 2) contents]
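The reset behavior described above can be sketched as follows. This is a minimal illustration that uses only messages documented on this page; it assumes the reset applies regardless of how far the stream has already been read.

```smalltalk
| stream |
stream := 'Smalltalk' asUnicodeString readStream.
"Consume the first five graphemes"
stream next: 5.
"Switching modes is documented to reset the stream to the beginning"
stream switchToUnicodeScalarMode.
self assert: [stream next = $S asUnicodeScalar].
"Switching back resets again, so the first grapheme is answered"
stream switchToGraphemeMode.
self assert: [stream next = $S asGrapheme]
```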
Class Methods
None
Instance Methods
<details> atEnd
Answer a Boolean which is true if the receiver cannot
access any more objects, and false otherwise.
Example:
self assert: [UnicodeString new readStream atEnd].
self assert: ['Smalltalk' asUnicodeString readStream atEnd not].
Answers:
<Boolean>
</details>
<details> isEmpty
<pre><code> Answer true if the contents of the view are empty. This is relative to the complete contents and is not impacted by the current position.
 Example:
   self assert: [UnicodeString new readStream isEmpty].
   self assert: ['Smalltalk' asUnicodeString readStream isEmpty not]
 Answers:
   <Boolean>
</code></pre>
</details>
<details> lineDelimiter
Return the receiver's line delimiter.
Answers:
<Object>
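A hedged sketch of how the getter might be used together with #lineDelimiter: to change the delimiter temporarily. It assumes, without confirmation from this page, that the object answered by #lineDelimiter is accepted back by #lineDelimiter:. The variable names `saved` and `line` are illustrative only.

```smalltalk
| stream saved line |
stream := ('Small' , String lf , 'talk' , String cr , 'er') asUnicodeString readStream.
"Remember whatever delimiter the stream currently uses"
saved := stream lineDelimiter.
"Read one line delimited by cr only: 'Small' , String lf , 'talk'"
stream lineDelimiter: Grapheme cr.
line := stream nextLine.
"Restore the earlier delimiter (assumption: the setter accepts the getter's answer)"
stream lineDelimiter: saved
```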
</details>
<details> lineDelimiter:
<pre><code> Set the receiver's line delimiter to be delimiter, and answer the receiver.
 Example:
   | stream |
   stream := ('Small' , String lf , 'talk' , String cr , 'er') asUnicodeString readStream.
   self assert: [(stream lineDelimiter: Grapheme cr; nextLine) = ('Small' , String lf , 'talk')].
 Arguments:
   delimiter -
     Grapheme mode:
       <Grapheme> grapheme delim
       <UnicodeScalar> scalar delim
       <UnicodeString> graphemes
       <Array> of <implementors of #asGrapheme>
       Compat: <String | Character>
     Unicode Scalar mode:
       <UnicodeScalar> scalar delim
       <Grapheme> grapheme delim
       <UnicodeString> graphemes
       <Array> of <implementors of #asUnicodeScalar>
       Compat: <String | Character>
 Answers:
   <UnicodeReadStream> self
</code></pre>
</details>
<details> next
<pre><code> Answer the next object accessible by the receiver. Change the state of the receiver so that the returned object is no longer accessible.
 Example:
   self assert: [('Smalltalk' asUnicodeString readStream next; next; next) = $a asGrapheme].
   self assert: [('Smalltalk' asUnicodeString readStream switchToUnicodeScalarMode; next; next; next) = $a asUnicodeScalar].
 Answers:
   <Object> view object
</code></pre>
</details>
<details> next:
<pre><code> Answer a collection containing the next @anInteger elements from the view. If @anInteger < 1, an empty collection is answered.
 Example:
   self assert: [('Smalltalk' asUnicodeString readStream next: 5) = 'Small'].
   self assert: [| stream |
     stream := 'Smalltalk' asUnicodeString readStream.
     (stream switchToUnicodeScalarMode; next: 5) = 'Small' unicodeScalars contents]
 Arguments:
   anInteger - <Integer>
 Answers:
   <Object> instance of view collection class
 Raises:
   <Exception> ExCLDTIndexOutOfRange
</code></pre>
</details>
<details> next:into:startingAt:
<pre><code> Answer @anIndexedCollection with the next @anInteger number of items from the receiver, stored starting at position @initialPosition.
 If the receiver's state is such that there are fewer than @anInteger elements between its current position and the end of the stream, the operation will fail, and the receiver will be left in a state such that it answers true to the atEnd message.
 Example:
   | col |
   col := Array new: 5.
   'Smalltalk' asUnicodeString readStream next: 5 into: col startingAt: 1.
   self assert: [col = 'Small' asUnicodeString asArray]
 Arguments:
   anInteger - <Integer>
   anIndexedCollection - <Collection>
   initialPosition - <Integer>
 Answers:
   <Collection> - anIndexedCollection
</code></pre>
</details>
<details> nextLine
<pre><code> Answer the elements between the current position and the next lineDelimiter.
 Example:
   | stream |
   stream := ('Small' , String lf , 'talk' , String cr , 'er' , String crlf , 's') asUnicodeString readStream.
   self assert: [stream nextLine = 'Small'].
   self assert: [stream nextLine = 'talk'].
   self assert: [stream nextLine = 'er'].
   self assert: [stream nextLine = 's'].
   stream switchToUnicodeScalarMode.
   self assert: [stream nextLine = 'Small' unicodeScalars contents].
   self assert: [stream nextLine = 'talk' unicodeScalars contents].
   self assert: [stream nextLine = 'er' unicodeScalars contents].
   self assert: [stream nextLine = 's' unicodeScalars contents].
   self assert: stream atEnd.
 Answers:
   <Object> view-dependent
</code></pre>
</details>
<details> peek
Answer the next object accessible by the receiver.
The state of the receiver is not changed, so the returned object remains accessible.
Answer nil if the view is atEnd.
Example:
self assert: [('' asUnicodeString readStream peek) isNil].
self assert: [('Smalltalk' asUnicodeString readStream peek) = $S asGrapheme].
self assert: [('Smalltalk' asUnicodeString readStream switchToUnicodeScalarMode; peek) = $S asUnicodeScalar].
Answers:
<Object> or nil if at end
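A minimal sketch contrasting #peek with #next, assuming the conventional stream semantics in which peeking leaves the position untouched:

```smalltalk
| stream |
stream := 'Smalltalk' asUnicodeString readStream.
"Peeking twice answers the same element because the position does not move"
self assert: [stream peek = $S asGrapheme].
self assert: [stream peek = $S asGrapheme].
"#next answers that same element and then advances past it"
self assert: [stream next = $S asGrapheme].
self assert: [stream peek = $m asGrapheme]
```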
</details>
<details> position:
<pre><code> Set the receiver's position reference to the argument @anInteger. Answer self.
 Example:
   | stream pos |
   stream := 'abcde' asUnicodeString readStream.
   pos := stream setToEnd; position.
   self assert: [(stream reset; next: 3) = 'abc'].
   stream position: pos.
   self assert: [stream position = pos]
 Arguments:
   anInteger - <Integer>
</code></pre>
</details>
<details> setToEnd
Set the position of the receiver to be the size of the
underlying contents
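A short sketch of the effect of #setToEnd, using only messages documented on this page:

```smalltalk
| stream |
stream := 'abcde' asUnicodeString readStream.
stream setToEnd.
"Nothing is left to read once the position is at the end"
self assert: [stream atEnd].
self assert: [stream upToEnd isEmpty]
```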
</details>
<details> size
Answer the number of elements in the view.
Example:
self assert: [('Smalltalk' , String crlf) asUnicodeString
readStream size = 10].
self assert: [('Smalltalk' , String crlf) asUnicodeString
readStream switchToUnicodeScalarMode size = 11].
Answers:
<Integer>
</details>
<details> skip:
<pre><code> Increment the receiver's current reference position by @anInteger. Fail if @anInteger is not a kind of Integer.
 Example:
   self assert: [('abcde' asUnicodeString readStream skip: 2; upToEnd) = 'cde']
 Arguments:
   anInteger - <Integer>
 Raises:
   <Exception> ExCLDTIndexOutOfRange
</code></pre>
</details>
<details> skipTo:
<pre><code> Read and discard elements up to and including the next occurrence of @anObject. If @anObject does not occur, the receiver is left at the end of the stream.
 Example:
   self assert: [('abcde' asUnicodeString readStream skipTo: $c; upToEnd) = 'de'].
   self assert: [('abcde' asUnicodeString readStream skipTo: $z; upToEnd) = '']
 Arguments:
   anObject - <Object>
 Answers:
   <Boolean> true if found, false otherwise
</code></pre>
</details>
<details> skipToAll:
<pre><code> Attempt to read and discard elements up to and including the next occurrence of @aSequentialCollection. Answer true if all elements in @aSequentialCollection occurred, else answer false.
 Note: If @aSequentialCollection is an EsString, then we attempt to convert it to a UnicodeString.
 Example:
   self assert: ['abcde' asUnicodeString readStream skipToAll: 'bc'].
   self assert: [('abcde' asUnicodeString readStream skipToAll: 'bc'; upToEnd) = 'de'].
   self assert: [('abcde' asUnicodeString readStream skipToAll: 'zzz') not].
   self assert: [('abcde' asUnicodeString readStream skipToAll: 'zzz'; upToEnd) = ''].
 Arguments:
   aSequentialCollection - <SequenceableCollection>
 Answers:
   <Boolean>
</code></pre>
</details>
<details> skipToAny:
<pre><code> Read and discard elements up to and including the next occurrence of any element that exists in @aSequentialCollection or, if none occurs, to the end of the stream.
 Answer true if an element in @aSequentialCollection occurred, else answer false.
 Note: If @aSequentialCollection is an EsString, then we attempt to convert it to a UnicodeString.
 Example:
   self assert: ['abcde' asUnicodeString readStream skipToAny: 'bd'].
   self assert: [('abcde' asUnicodeString readStream skipToAny: 'bd'; upToEnd) = 'cde'].
   self assert: [('abcde' asUnicodeString readStream skipToAny: 'zzz') not].
   self assert: [('abcde' asUnicodeString readStream skipToAny: 'zzz'; upToEnd) = ''].
 Arguments:
   aSequentialCollection - <SequenceableCollection>
 Answers:
   <Boolean>
</code></pre>
</details>
<details> switchToGraphemeMode
Switch the mode to graphemes.
This will reset the stream.
Calls like #next will answer <Grapheme> objects.
Calls like #next:/#contents will answer <UnicodeString> objects
Example:
self assert: [UnicodeString crlf readStream switchToGraphemeMode size = 1].
self assert: [UnicodeString crlf readStream switchToUnicodeScalarMode size = 2]
</details>
<details> switchToUnicodeScalarMode
<pre><code> Switch the mode to Unicode scalars. This will reset the stream.
 Calls like #next will answer <UnicodeScalar> objects.
 Calls like #next:/#contents will answer <Array> of <UnicodeScalar>s.
 Example:
   self assert: [UnicodeString crlf readStream switchToGraphemeMode size = 1].
   self assert: [UnicodeString crlf readStream switchToUnicodeScalarMode size = 2].
</code></pre>
</details>
<details> upTo:
<pre><code> Answer a collection of all of the objects in the view beginning from the current position up to, but not including, @anObject.
 Example:
   self assert: [('abcde' asUnicodeString readStream upTo: $c) = 'ab'].
   self assert: [('abcde' asUnicodeString readStream upTo: $z) = 'abcde']
 Arguments:
   anObject - <Object>
 Answers:
   <Object> instance of view collection class
</code></pre>
</details>
<details> upToAll:
<pre><code> Answer a collection of all of the objects in the view beginning from the current position up to, but not including, @aSequenceableCollection.
 Note: If @aSequenceableCollection is an EsString, then we attempt to convert it to a UnicodeString.
 Example:
   self assert: [('abcde' asUnicodeString readStream upToAll: 'bc') = 'a'].
   self assert: [('abcde' asUnicodeString readStream upToAll: 'bc'; upToEnd) = 'de'].
   self assert: [('abcde' asUnicodeString readStream upToAll: 'zzz') = 'abcde'].
   self assert: [('abcde' asUnicodeString readStream upToAll: 'zzz'; upToEnd) isEmpty].
 Arguments:
   aSequenceableCollection - <SequenceableCollection>
 Answers:
   <Object> instance of view collection class
</code></pre>
</details>
<details> upToAny:
<pre><code> Answer a collection of all of the objects in the view from the current position up to, but not including, the next occurrence of any element that exists in @aSequenceableCollection. If no such element is found before the end of the view, a collection of the objects read is answered.
 Note: If @aSequenceableCollection is an EsString, then we attempt to convert it to a UnicodeString.
 Example:
   self assert: [('abcde' asUnicodeString readStream upToAny: 'bd') = 'a'].
   self assert: [('abcde' asUnicodeString readStream upToAny: 'bd'; upToEnd) = 'cde'].
   self assert: [('abcde' asUnicodeString readStream upToAny: 'zzz') = 'abcde'].
   self assert: [('abcde' asUnicodeString readStream upToAny: 'zzz'; upToEnd) isEmpty].
 Arguments:
   aSequenceableCollection - <SequenceableCollection>
 Answers:
   <Object> instance of view collection class
</code></pre>
</details>
<details> upToEnd
<pre><code> Answer a collection containing all remaining elements of the view, from the current position to the end. If there are no more elements available to be read, then an empty collection is answered.
 Example:
   self assert: ['abcde' asUnicodeString readStream upToEnd = 'abcde'].
   self assert: [('abcde' asUnicodeString readStream next: 2; upToEnd) = 'cde'].
   self assert: ['' asUnicodeString readStream upToEnd = '']
 Answers:
   <Object> instance of view collection class
</code></pre>
</details>