Document Object Model (DOM) parser
A DOM parser represents an XML document as a tree of nodes in your program. The DOM provides a common way to access the data structures of structured documents. In VA Smalltalk, you can use a DOM parser to process an XML file.
Document Object Model (DOM)
The DOM is a W3C standard that describes mechanisms for software developers and Web script authors to access and manipulate parsed XML (and HTML) content. The DOM is platform- and language-neutral.
The DOM presents documents as a hierarchy of node objects that also implement other, more specialized interfaces. Methods that are part of the Document Object Model are saved in the AbtDOM-API category. The DOM API is documented at http://www.w3.org/TR/DOM-Level-2-Core.
When to use the DOM parser
Use the DOM parser (AbtXmlDomParser) to read an XML file and return a representation of the file as a tree of objects. Most objects in the tree are instances of subclasses of AbtDOMNode. You can then traverse the document tree and execute actions on the tree structure.
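Because the DOM is language-neutral, the parse-then-traverse pattern described above can be sketched in any language that implements the W3C interfaces. The following is a minimal illustration that uses Python's standard xml.dom.minidom module, not the VA Smalltalk API; the sample XML and the helper name element_paths are assumptions made for the example.

```python
from xml.dom import Node
from xml.dom.minidom import parseString


def element_paths(node, prefix=""):
    """Walk the tree depth-first and return the path of every element."""
    paths = []
    for child in node.childNodes:
        # Text, comment, and other node types are skipped; only elements count.
        if child.nodeType == Node.ELEMENT_NODE:
            path = prefix + "/" + child.tagName
            paths.append(path)
            paths.extend(element_paths(child, path))
    return paths


# Hypothetical sample document used only for this illustration.
SAMPLE = (
    "<order><customer>Ann</customer>"
    "<items><item>pen</item><item>ink</item></items></order>"
)

document = parseString(SAMPLE)   # returns a Document node (root of the tree)
paths = element_paths(document)
print(paths)
```

The same traversal applies to a tree returned by any DOM level-2 implementation: start at the Document node, iterate over childNodes, and test nodeType to decide how to handle each node.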
VA Smalltalk support for the DOM level-2 specification
The VA Smalltalk XML DOM parser is implemented as a SAX-2 parser that supplies SAX event handlers to construct DOM objects. Currently, VA Smalltalk supports the core interfaces of the DOM level-2 specification.
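The idea of building a DOM tree from SAX events can be sketched as follows. This is an illustration of the general technique in Python (xml.sax plus xml.dom.minidom), not a description of how the VA Smalltalk classes are implemented; the names DomBuilder and build_dom are hypothetical.

```python
import xml.sax
from xml.dom.minidom import Document


class DomBuilder(xml.sax.ContentHandler):
    """SAX content handler that assembles a DOM tree from parse events."""

    def __init__(self):
        super().__init__()
        self.document = Document()
        # Stack of open nodes; the node on top is the current parent.
        self._stack = [self.document]

    def startElement(self, name, attrs):
        element = self.document.createElement(name)
        for attr_name in attrs.getNames():
            element.setAttribute(attr_name, attrs.getValue(attr_name))
        self._stack[-1].appendChild(element)
        self._stack.append(element)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        parent = self._stack[-1]
        # A Document node cannot hold text, so only elements receive it.
        if parent is not self.document:
            parent.appendChild(self.document.createTextNode(content))


def build_dom(xml_text):
    """Parse xml_text with SAX and return the DOM Document that was built."""
    handler = DomBuilder()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    handler.document.normalize()  # merge text delivered in several chunks
    return handler.document


dom = build_dom('<order id="7"><item>pen</item></order>')
print(dom.documentElement.tagName, dom.documentElement.getAttribute("id"))
```

Each start-element event creates a node and pushes it on a stack, each end-element event pops it, and character events become text nodes under the current parent. This sketch handles only elements, attributes, and text; a full implementation would also handle comments, processing instructions, and the document type.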
The following table lists the Smalltalk classes that implement the interfaces from DOM level-2:
DOM level-2 interface | VA Smalltalk class |
Attr | AbtDOMAttr |
CDATASection | AbtDOMCDataSection |
Comment | AbtDOMComment |
Document | AbtDOMDocument |
DocumentFragment | AbtDOMDocumentFragment |
DocumentType | AbtDOMDocumentType |
DOMImplementation | AbtDOMImplementation |
DOMString | String |
Element | AbtDOMElement |
Entity | AbtDOMEntity |
EntityReference | AbtDOMEntityReference |
DOMException | SgmlException |
NamedNodeMap | AbtDOMNamedNodeMap |
Node | AbtDOMNode |
NodeList | AbtDOMNodeList |
Notation | AbtDOMNotation |
ProcessingInstruction | AbtDOMProcessingInstruction |
VA Smalltalk deviations from the DOM level-2 specification
The VA Smalltalk DOM parser deviates from the DOM level-2 specification as follows:
• The DOMString interface deviates from the DOM level-2 specification. Smalltalk String objects are used wherever the specification calls for DOMStrings. Smalltalk Strings are stored in the code page of the active system, whereas DOMStrings are encoded as UTF-16.
• Some of the character manipulation functions of the DOM level-2 specification take arguments expressed in 16-bit units. The VA Smalltalk DOM support always uses 8-bit (1-byte) units, so wherever the DOM specification refers to a 16-bit unit, VA Smalltalk uses a single byte.
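The practical effect of counting in bytes rather than 16-bit units can be illustrated by measuring the same text both ways. The sketch below uses Python only to do the arithmetic and does not call any VA Smalltalk code; the helper names and the choice of cp1252 and cp932 as example system code pages are assumptions.

```python
def utf16_units(text):
    """Length of text in 16-bit units, as the DOM specification counts it."""
    return len(text.encode("utf-16-le")) // 2


def code_page_bytes(text, code_page):
    """Length of text in 8-bit units when stored in the given code page."""
    return len(text.encode(code_page))


# In a single-byte code page, both counts agree for ASCII text.
ascii_text = "abc"
print(utf16_units(ascii_text), code_page_bytes(ascii_text, "cp1252"))

# In a double-byte code page, each of these characters occupies two bytes,
# so byte-based offsets and lengths differ from the 16-bit unit counts.
kanji_text = "日本語"
print(utf16_units(kanji_text), code_page_bytes(kanji_text, "cp932"))
```

When the two counts differ, offsets and lengths passed to character manipulation functions do not carry over unchanged from code written against a UTF-16 based DOM implementation.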
Last modified date: 05/14/2020