"The contents of this file are subject to the ParcPlace Public License Version 1.0 (the `License'); you may not use this file except in compliance with the License.  You may obtain a copy of the License at
     http://www.parcplace.com/support/opensource/PPL-1.0.html

Software distributed under the License is distributed on an `AS IS' basis, WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License for the specific language governing rights and limitations under the License.

The Original Code is the VisualWorks XML Framework.

The Initial Developer of the Original Code is Cincom Systems Inc. Portions created by Cincom are Copyright (C) 2000 Cincom Systems Inc. All Rights Reserved.

Contributor(s):"

'From VisualWorks®, Release 5i.2 feb00.2 of February 29, 2000 on March 12, 2000 at 3:39:37 pm'!


Smalltalk addSubspace: #XML!
XML import: #URIResolver from: NetworkClients!
XML at: #PredefinedEntities put: nil!
XML at: #CanonicalXMLEntities put: nil!
XML at: #IllegalCharacters put: nil!
XML at: #XML_URI put: 'http://www.w3.org/XML/1998/namespace'!


Namespace current: XML!
Object subclass: #Pattern
    instanceVariableNames: 'followSet '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!




!Pattern class methodsFor: 'instance creation'!

new
    ^super new initialize! !



Pattern comment: '
The element structure of an XML document may, for validation purposes, be constrained using element type and attribute-list declarations. An element type declaration constrains the element''s content by constraining which element types can appear as children of the element.The constraint includes a content model, a simple grammar or pattern governing the allowed types of child elements and the order in which they are allowed to appear. These content models are represented by this XML.Pattern class and it''s subclasses.

Constraint rules or patterns maybe complex (ComplexPattern and it''s subclasses) or simple (ConcretePattern and it''s subclasses).
 
Subclasses must implement the following messages:
    coercing
	alternateHeads
	pushDownFollowSet
    testing
	isSimple

Instance Variables:
    followSet	<MethodFilterOr | MethodFilterAnd | Collection>  comment
'!


!Pattern methodsFor: 'initialize'!

initialize
    followSet := OrderedCollection new: 2! !

!Pattern methodsFor: 'coercing'!

addFollow: aNode
    followSet add: aNode!

addFollows: aList
    followSet addAll: aList!

alternateHeads
    ^self subclassResponsibility!

followSet
    ^followSet!

normalize
    | list done t r result |
    list := OrderedCollection
		with: (result := InitialPattern new addFollow: self)
		with: self
		with: TerminalPattern new.
    self addFollow: list last.
    done := OrderedCollection new.
    [list isEmpty]
	whileFalse:
		[t := list removeFirst.
		r := t pushDownFollowSet.
		r isNil
			ifTrue: [done add: t]
			ifFalse: [list addAll: r]].
    list := done.
    done := OrderedCollection new.
    [list isEmpty]
	whileFalse:
		[t := list removeFirst.
		t normalizeFollowSet
			ifTrue: [done add: t]
			ifFalse: [list add: t]].
    done do: [:p |
	p isSimple ifFalse: [self error: 'Incomplete translation'].
	p followSet do: [:p1 |
		p1 isSimple ifFalse: [self error: 'Incomplete translation']]].
    ^result!

normalizeFollowSet
    | changed oldFollow newFollow |
    oldFollow := IdentitySet withAll: followSet.
    newFollow := IdentitySet new.
    oldFollow do: [:pat |
	newFollow addAll: pat alternateHeads].
    changed := newFollow size ~= oldFollow size or: [(newFollow - oldFollow) size > 0].
    followSet := newFollow asOrderedCollection.
    ^changed not!

pushDownFollowSet
    ^self subclassResponsibility! !

!Pattern methodsFor: 'testing'!

isSimple
    ^self subclassResponsibility! !

!Pattern methodsFor: 'printing'!

printOn: aStream
    aStream nextPutAll: self description! !


Object subclass: #DocumentType
    instanceVariableNames: 'attributeDefs elementDefs generalEntities parameterEntities notations '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Parsing'!




!DocumentType class methodsFor: 'instance creation'!

new
    ^super new initialize! !



DocumentType comment: '
This class represents an XML document type definition or DTD. The document type declaration can point to an external subset containing markup declarations, or can contain the markup declarations directly in an internal subset, or can do both. The DTD for a document consists of both subsets taken together.

Instance Variables:
    attributeDefs	<Dictionary>
    elementDefs	<Dictionary>
    generalEntities	<Dictionary>
    parameterEntities	<Dictionary>
    notations	<Dictionary>
'!


!DocumentType methodsFor: 'initialize'!

initialize
    notations := Dictionary new.
    elementDefs := Dictionary new.
    attributeDefs := Dictionary new.
    generalEntities := Dictionary new.
    parameterEntities := Dictionary new.! !

!DocumentType methodsFor: 'accessing'!

attributeFor: key subKey: k2
    | val |
    (val := attributeDefs at: key asString ifAbsent: []) isNil
	ifTrue: [XMLParser invalid: ('The attribute "%1 %2" has not been defined'
						bindWith: key asString
						with: k2 asString)].
    ^val at: k2 asString
	ifAbsent: [XMLParser invalid: ('The attribute "%1 %2" has not been defined'
						bindWith: key asString
						with: k2 asString)]!

attributeFor: key subKey: k2 put: value
    | dict |
    dict := attributeDefs at: key asString ifAbsentPut: [Dictionary new].
    (dict includesKey: k2 asString)
	ifTrue: [^self warn: ('The attribute "%1 %2" has been defined more than once'
				bindWith: key asString
				with: k2 asString)].
    (value type isID and: [dict contains: [:attr | attr type isID]])
	ifTrue: [^XMLParser invalid: ('The element %1 has two attributes typed as ID' bindWith: key asString)].
    dict at: k2 asString put: value!

attributesFor: key
    ^attributeDefs at: key asString ifAbsent: [Dictionary new]!

elementFor: key
    | val |
    (val := elementDefs at: key asString ifAbsent: []) isNil
	ifTrue: [self warn: ('The element "%1" has not been defined'
						bindWith: key asString)].
    ^val!

elementFor: key put: value
    (elementDefs includesKey: key asString)
	ifTrue: [^self warn: ('The element "%1" has been defined more than once'
					bindWith: key asString)].
    elementDefs at: key asString put: value!

generalEntityAt: key
    "We do some tricks to make sure that, if the value
    is predefined in the parser, we use the predefined
    value. We could just store the predefined values
    in with the general ones, but we don't want to show
    warnings if the user (very correctly) defines them.
    An enhancement would be to let the user use his own
    values rather than the predefined ones, but we know
    that the predefined ones will be correct--we don't know
    that his will be."

    | val |
    val := PredefinedEntities at: key ifAbsent: [].
    val isNil
	ifTrue: [val := generalEntities at: key ifAbsent: []].
    ^val!

generalEntityAt: key put: value
    (generalEntities includesKey: key)
	ifTrue: [^self warn: ('The general entity "%1" has been defined more than once'
					bindWith: key asString)].
    generalEntities at: key put: value!

notationAt: name
    ^notations at: name ifAbsent: [XMLParser invalid: 'Reference to an undeclared Notation']!

notationAt: name ifAbsent: aBlock
    ^notations at: name ifAbsent: aBlock!

notationAt: name put: notation
    (notations includesKey: name)
	ifTrue: [XMLParser invalid: 'Duplicate definitions for a Notation'].
    notations at: name put: notation!

parameterEntityAt: key
    ^parameterEntities at: key ifAbsent: []!

parameterEntityAt: key put: value
    (parameterEntities includesKey: key)
	ifTrue: [^self warn: ('The parameter entity "%1" has been defined more than once'
					bindWith: key)].
    parameterEntities at: key put: value! !

!DocumentType methodsFor: 'private'!

completeValidationAgainst: aParser
    generalEntities keysAndValuesDo: [:eName :entity |
	entity completeValidationAgainst: aParser].
    attributeDefs keysAndValuesDo: [:eName :attribs |
	attribs keysAndValuesDo: [:aName :attrib |
		attrib completeValidationAgainst: aParser]]!

warn: aString
    ^Warning signal: aString! !


Link subclass: #StreamWrapper
    instanceVariableNames: 'stream isInternal protocol name usedAsExternal entity cr lf '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Parsing'!




!StreamWrapper class methodsFor: 'instance creation'!

emptyWithExtraSpace: space
    ^self stream: (space ifTrue: ['  '] ifFalse: ['']) readStream
	protocol: 'internal'
	name: nil
	entity: nil!

stream: str protocol: key name: value entity: entity
    ^self new stream: str protocol: key name: value entity: entity! !



StreamWrapper comment: '
This class is used by the XMLParser to wrap both internal and external streams with proper encoding before handing them to the parser for processing

Instance Variables:
    stream			<EncodedStream>  stream being wrapped
    isInternal		<Boolean>  true if the stream is internal and hencer doesn''t need careful line-end treatment
    protocol			<String>  name of stream type, ''internal'' or ''file''
    name			<String | nil>  the name of the stream, if it is named
    usedAsExternal	<Boolean>  flag used to override protocol and say how stream is being used?
    entity			<Entity | nil>  if wrapping on behalf of an Entity this is it?
    cr				<Character>  cache of Character cr
    lf				<Character>  cache of Character lf
'!


!StreamWrapper methodsFor: 'initialize'!

stream: str protocol: key name: value entity: ent
    stream := str.
    isInternal := key = 'internal'.
    protocol := key.
    name := value.
    entity := ent.
    cr := Character cr.
    lf := Character lf.!

usedAsExternal
    ^usedAsExternal!

usedAsExternal: aBoolean
    usedAsExternal := aBoolean! !

!StreamWrapper methodsFor: 'accessing'!

characterSize: aCharacter
    ^1 "(self stream respondsTo: #encoder)
	ifTrue: [self stream encoder characterSize: aCharacter]
	ifFalse: [1]"!

checkEncoding
    "| encoding |
    encoding := [stream encoding] on: Error do: [:ex | ex returnWith: #null].
    encoding = #UTF_8
	ifTrue:
		[| c1 c2 pos |
		pos := stream position.
		stream setBinary: true.
		c1 := stream next.
		c2 := stream next.
		stream setBinary: false.
		(c2 notNil and: [c1 * c2 = 16rFD02])
			ifTrue: [stream encoder: (UTF16StreamEncoder new
								forByte1: c1 byte2: c2)]
			ifFalse: [stream position: pos]]"!

close
    stream close!

contentsFor: aParser
    | s |
    s := (String new: 100) writeStream.
    [self atEnd]
	whileFalse: [s nextPut: (self nextFor: aParser)].
    ^s contents!

entity
    ^entity!

name
    | streamName |
    name notNil ifTrue: [^name].
    stream isExternalStream ifFalse: [^nil].
    streamName := [stream name] on: Error do: [:ex| ex returnWith: nil].
    streamName notNil ifTrue: [streamName replaceAll: Directory pathSeparator with: $/].
    ^streamName!

protocol
    ^protocol!

stream
    ^stream! !

!StreamWrapper methodsFor: 'streaming'!

nextFor: aParser
    | ch n |
    ch := stream atEnd ifTrue: [ nil ] ifFalse: [ stream next ].
    isInternal
	ifFalse:
		[lf isNil ifTrue: [self halt].
		ch == cr
			ifTrue:
				[stream peekFor: lf.
				ch := aParser eol]
			ifFalse: [ch == lf
				ifTrue: [ch := aParser eol]]].
    (ch isNil
		or: [(n := ch asInteger) < 16r110000   "IllegalCharacters size"
		and: [(IllegalCharacters at: n+1) = 0]])
	ifFalse: [aParser notPermitted: 'a character with Unicode value ', n printString].
    ^ch!

skip: n
    stream skip: n! !

!StreamWrapper methodsFor: 'testing'!

atEnd
    ^stream atEnd! !

!StreamWrapper methodsFor: 'declaration'!

encodingDeclIn: aParser
    | enc |
    ^stream peek = $e
	ifTrue:
		[| encoding |
		self mustFind: 'encoding' errorOn: aParser.
		self skipSpaceIn: aParser.
		self mustFind: '=' errorOn: aParser.
		self skipSpaceIn: aParser.
		encoding := self quotedString.
		aParser validateEncoding: encoding.
		((stream respondsTo: #encoding)
				and: [stream encoding asLowercase ~= (XMLParser mapEncoding: encoding) asLowercase])
			ifTrue:
				["enc := (StreamEncoder new:
						(XMLParser mapEncoding: encoding) asSymbol)
							initializeForFiles.
				stream encoder: enc"].
		true]
	ifFalse: [false]!

expected: string
    XMLParser malformed: string, ' expected, but not found'!

mustFind: str errorOn: aParser
    (self skipIf: str)
	ifFalse: [aParser expected: str].!

quotedString
    (stream peekFor: $")
	ifTrue: [^(stream upTo: $") asString].
    (stream peekFor: $')
	ifTrue: [^(stream upTo: $') asString].
    self expected: 'quoted string'!

sdDeclIn: aParser
    ^stream peek = $s
	ifTrue:
		[| word |
		self mustFind: 'standalone' errorOn: aParser.
		self skipSpaceIn: aParser.
		self mustFind: '=' errorOn: aParser.
		self skipSpaceIn: aParser.
		word := self quotedString.
		(#('yes' 'no') includes: word)
			ifFalse: [aParser expected: 'yes or no'].
		true]
	ifFalse: [false]!

skipIf: str
    | p |
    p := stream position.
    1 to: str size do: [:i |
	(stream peekFor: (str at: i))
		ifFalse:
			[stream position: p.
			^false]].
    ^true!

skipSpaceIn: aParser
    | p space |
    space := false.
    [p := stream position.
    #(9 10 13 32) includes: (self nextFor: aParser) asInteger]
	whileTrue: [space := true].
    stream position: p.
    ^space!

textDeclIn: aParser
    self checkEncoding.
    ^(self skipIf: '<?xml')
	ifTrue:
		[| hasSpace |
		hasSpace := self skipSpaceIn: aParser.
		hasSpace
			ifTrue: [(self versionInfoIn: aParser) isNil
				ifFalse: [hasSpace := self skipSpaceIn: aParser]].
		hasSpace
			ifTrue: [(self encodingDeclIn: aParser)
				ifFalse: [self expected: 'encoding']]
			ifFalse: [(self encodingDeclIn: aParser)
				ifTrue: [self expected: 'white space']].
		self skipSpaceIn: aParser.
		self mustFind: '?>' errorOn: aParser.
		true]
	ifFalse: [false]!

versionInfoIn: aParser
    | version |
    ^stream peek = $v
	ifTrue:
		[self mustFind: 'version' errorOn: aParser.
		self skipSpaceIn: aParser.
		self mustFind: '=' errorOn: aParser.
		self skipSpaceIn: aParser.
		version := self quotedString.
		version = '1.0' ifFalse: [self expected: 'version 1.0'].
		version]
	ifFalse: [nil]!

xmlDeclIn: aParser
    self checkEncoding.
    ^(self skipIf: '<?xml')
	ifTrue:
		[| hasSpace version |
		(self skipSpaceIn: aParser)
			ifTrue: [version := self versionInfoIn: aParser]
			ifFalse: [version := nil].
		version = nil ifTrue: [self expected: 'version'].
		aParser documentNode xmlVersion: version.
		hasSpace := self skipSpaceIn: aParser.
		hasSpace
			ifTrue: [(self encodingDeclIn: aParser)
				ifTrue: [hasSpace := self skipSpaceIn: aParser]]
			ifFalse: [(self encodingDeclIn: aParser)
				ifTrue: [self expected: 'white space']].
		hasSpace
			ifTrue: [(self sdDeclIn: aParser)
				ifTrue: [hasSpace := self skipSpaceIn: aParser]]
			ifFalse: [(self sdDeclIn: aParser)
				ifTrue: [self expected: 'white space']].
		self mustFind: '?>' errorOn: aParser.
		true]
	ifFalse: [false]! !

!StreamWrapper methodsFor: 'private'!

error: aStringOrMessage
    ^XMLSignal signal: aStringOrMessage asString! !


Object subclass: #NodeTag
    instanceVariableNames: 'namespace type qualifier '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Parsing'!


NodeTag comment: '
In XML all elements or nodes are delimited by start and end tags (or empty-element tags) and instances of this class are used to represent these element tags. The name of the tag (type instance variable) gives the element type.


Instance Variables:
    namespace	<String>
    type	<String> Name of tag, used to indicate element type
    qualifier	<String>
'!


!NodeTag methodsFor: 'initialize'!

qualifier: q ns: ns type: typeStr
    namespace := ns.
    type := typeStr.
    qualifier := q! !

!NodeTag methodsFor: 'accessing'!

expandedName
    ^namespace isEmpty
	ifTrue: [type]
	ifFalse: [namespace, '#', type]!

namespace
    ^namespace!

qualifier
    ^qualifier!

type
    ^type! !

!NodeTag methodsFor: 'converting'!

asString
    ^qualifier isEmpty
	ifTrue: [type]
	ifFalse: [qualifier, ':', type]! !

!NodeTag methodsFor: 'testing'!

< aNodeTag
    ^self asString < aNodeTag asString
!

isLike: aName
    ^aName isString
	ifTrue: [namespace isEmpty and: [type = aName]]
	ifFalse: [namespace = aName namespace and: [type = aName type]]! !

!NodeTag methodsFor: 'printing'!

printOn: aStream
    aStream nextPutAll: '{', self asString, '}'! !


Pattern subclass: #ConcretePattern
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!


ConcretePattern comment: '
This class is the superclass to what are considered ''simple'' patterns or constraint rules in the element content declarations. As seen from the class hiererarchy, instances of simple patterns include AnyPattern, EmptyPattern, InitialPattern, NamePattern, PCDATAPattern and TerminalPattern.

Subclasses must implement the following messages:
    testing
	matches:
'!


!ConcretePattern methodsFor: 'accessing'!

followSetDescription
    | s |
    s := (String new: 32) writeStream.
    s nextPut: $(.
    followSet do: [:n | s nextPutAll: n tag]
	separatedBy: [s space].
    s nextPut: $).
    ^s contents! !

!ConcretePattern methodsFor: 'testing'!

canTerminate
    ^followSet contains: [:p | p isTerminator]!

couldBeText
    ^false!

isSimple
    ^true!

isTerminator
    ^false!

matches: aNode
    self subclassResponsibility! !

!ConcretePattern methodsFor: 'coercing'!

alternateHeads
    ^Array with: self!

pushDownFollowSet
    ^nil! !

!ConcretePattern methodsFor: 'validation'!

validate: node
    | couldBeText |
    couldBeText := false.
    self followSet do: [:i |
	i couldBeText
		ifTrue: [couldBeText := true].
	(i matches: node)
		ifTrue: [^i]].
    couldBeText
	ifFalse: [node isBlankText ifTrue: [^self]].
    ^nil! !


ConcretePattern subclass: #EmptyPattern
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!


EmptyPattern comment: '
This class represents the EMPTY element content constraint in an element type declaration. According to the XML 1.0 specification the EMPTY element declaration indicates that the element has no content'!


!EmptyPattern methodsFor: 'coercing'!

alternateHeads
    ^followSet! !

!EmptyPattern methodsFor: 'testing'!

matches: aNode
    ^false! !


ConcretePattern subclass: #PCDATAPattern
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!


PCDATAPattern comment: '
This class represents a content constraint or pattern in an element type declaration indicating that the element content includes parsed character data. This is typically used in mixed content type patterns and is signified by the presence of the string ''#PCDATA'' in the element content
declaration.'!


!PCDATAPattern methodsFor: 'accessing'!

description
    ^'#PCDATA'! !

!PCDATAPattern methodsFor: 'testing'!

matches: aNode
    ^aNode isText! !


Object subclass: #SAXDriver
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-SAX'!


SAXDriver comment: '
This class includes supports for the Simple API for XML (SAX), an event-driven API for parsing XML documents.

This class is not yet complete'!


!SAXDriver methodsFor: 'document events'!

document
    ^nil!

endDocument!
endDocumentFragment!
startDocument!
startDocumentFragment! !

!SAXDriver methodsFor: 'elements'!

endElement!
sourcePosition: pos inStream: str!
startElement: name atts: atts! !

!SAXDriver methodsFor: 'characters'!

characters: aString!
ignorableWhitespace: aString! !

!SAXDriver methodsFor: 'other'!

attribute: name value: value
    ^Attribute name: name value: value!

comment: data!
notation: name value: val
    ^Notation new name: name identifiers: val!

processingInstruction: target data: data! !


Object subclass: #AttributeDef
    instanceVariableNames: 'name default type flags '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


AttributeDef comment: '
XML documents may contain attribute-list declarations that are used to define the set of attributes pertaining to a given element type. These attribute-list declarations are also used to establish type constraints for the attributes and to provide default values for attributes. Attribute-list declarations contain attribute definitions and this class is used to instantiate these attribute definitions.

An attribute definition specifies the name (name instance variable) of the attribute, the data type of the attribute (type instance variable) and an optional default value (default instance variable) for the attribute

Instance Variables:
    name	<XML.NodeTag> 		name of attribute
    default	<Object>  				default value, if any
    type	<XML.AttributeType>	type used for validation
    flags	<Integer>				encoding for fixed, implied and required type attributes
'!


!AttributeDef methodsFor: 'accessing'!

default
    ^default!

default: n
    flags := 0.
    default := nil.
    n = #required
	ifTrue: [flags := 1]
	ifFalse: [n = #implied
		ifTrue: [flags := 2]
		ifFalse:
			[n class == Association
				ifFalse: [self error: 'Invalid default'].
			n key ifTrue: [flags := 4].
			default := n value]]!

hasDefault
    ^(self isImplied or: [self isRequired]) not!

isFixed
    ^(flags bitAnd: 4) = 4!

isImplied
    ^(flags bitAnd: 2) = 2!

isRequired
    ^(flags bitAnd: 1) = 1!

name
    ^name!

name: n
    name := n!

tag
    ^name!

type
    ^type!

type: n
    type := n! !

!AttributeDef methodsFor: 'validating'!

completeValidationAgainst: aParser
    ^self type completeValidationAgainst: aParser from: self!

selfValidateFor: aParser
    type validateDefinition: self for: aParser!

validateValueOf: anAttribute for: aParser
    type validateValueOf: anAttribute for: aParser.
    (self isFixed not or: [anAttribute value = self default])
	ifFalse: [aParser invalid: ('The attribute "%1" was declared FIXED, but the value used in the document ("%2") did not match the default ("%3")'
					bindWith: anAttribute tag asString
					with: anAttribute value
					with: self default)].! !

!AttributeDef methodsFor: 'private'!

value
    ^self default!

value: str
    default := str! !


Object subclass: #XMLParser
    instanceVariableNames: 'sourceStack hereChar lastSource currentSource documentNode dtd unresolvedIDREFs builder validating ignore eol '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Parsing'!

Error subclass: #XMLSignal
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Errors'!

XMLSignal subclass: #InvalidSignal
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Errors'!

XMLSignal subclass: #MalformedSignal
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Errors'!


!XMLParser class methodsFor: 'class initialization'!

initialize
    "XMLParser initialize"

    CanonicalXMLEntities := Array new: 256.
    CanonicalXMLEntities at: Character cr value + 1 put: '&#13;';
	at: Character lf value + 1 put: '&#10;';
	at: Character tab value + 1 put: '&#9;';
	at: $& value + 1 put: '&amp;';
	at: $< value + 1 put: '&lt;';
	at: $> value + 1 put: '&gt;';
	at: $" value + 1 put: '&quot;';
	at: $' value + 1 put: '&apos;'.

    PredefinedEntities := Dictionary new
        at: 'amp' put: (GeneralEntity new name: 'amp'; text: '&#38;');
        at: 'lt' put: (GeneralEntity new name: 'lt'; text: '&#60;');
        at: 'gt' put: (GeneralEntity new name: 'gt'; text: '&#62;');
        at: 'apos' put: (GeneralEntity new name: 'apos'; text: '&#39;');
        at: 'quot' put: (GeneralEntity new name: 'quot'; text: '&#34;');
        yourself.

    IllegalCharacters := LargeWordArray new: 16r110000.
    ((0 to: 16r1F) asSet - #(9 10 13))
	do: [:i | IllegalCharacters at: i+1 put: 1].
    16rD800 to: 16rDFFF
	do: [:i | IllegalCharacters at: i+1 put: 1].
    16rFFFE to: 16rFFFF
	do: [:i | IllegalCharacters at: i+1 put: 1].
    IllegalCharacters compress! !

!XMLParser class methodsFor: 'instance creation'!

on: aStream
    ^self new on: aStream!

processDocumentInFilename: aFilename 
    ^self processDocumentInFilename: aFilename beforeScanDo: [:parser | ]!

processDocumentInFilename: aFilename beforeScanDo: aBlock
    | stream p |
    stream := (File name: aFilename) readStream.
    p := self on: stream.
    aBlock value: p.
    ^p scanDocument!

processDocumentString: aString
    ^self processDocumentString: aString beforeScanDo: [:parser | ]!

processDocumentString: aString beforeScanDo: aBlock
    | p |
    p := self on: aString readStream.
    aBlock value: p.
    ^p scanDocument! !

!XMLParser class methodsFor: 'utilities'!

invalid: aString
    ^InvalidSignal signal: aString!

malformed: aString
    ^MalformedSignal signal: aString!

mapEncoding: anEncoding
    | enc |
    enc := anEncoding asLowercase.
    enc = 'utf-8' ifTrue: [#UTF8].
    enc = 'utf-16' ifTrue: [#UTF16].
    enc = 'iso-8859-1' ifTrue: [#ISO88591].
    ^enc!

readFileContents: fn
    | s p |
    s := StreamWrapper
		stream: (File name: fn) readStream
		protocol: 'file'
		name: nil	
		entity: nil.
    p := self new.
    p lineEndLF.
    ^[s checkEncoding.
      s contentsFor: p]
	ensure: [s close]! !



XMLParser comment: '
This class represents the main XML processor in the system. This  XMLParser may be used as a validating or non-validating parser to scan and process an XML document and provide access to it''s content and structure to a smalltalk application. This XMLParser tries to follow the guidelines laid out in the W3C XML Version 1.0 specification.

Instance Variables:
    sourceStack	<XML.StreamWrapper>  stack of input streams that handles inclusion.
    hereChar	<Character>  the current character being parsed
    lastSource	<XML.StreamWrapper>  record of previous source used to check correct nesting
    currentSource	<XML.StreamWrapper>  current input stream (the top of sourceStack)
    documentNode	<XML.Document>  the document created by parsing
    dtd	<XML.DocumentType>  the document type definition for the current document
    unresolvedIDREFs	<Collection>  collection of IDREfs that have yet to be resolved; used for validation
    builder	<XML.NodeBuilder>  node builder
    validating	<Boolean>  if true then the parse validates the XML
    ignore	<Boolean>  ?
    eol	<Character>  the end-of-line character in the source stream
'!


!XMLParser methodsFor: 'initialize'!

builder: anXMLNodeBuilder
    builder := anXMLNodeBuilder!

lineEndCR
    eol := Character cr!

lineEndLF
    eol := Character lf!

on: inputStream
    validating isNil ifTrue: [validating := true].
    sourceStack := self wrapStream: inputStream.
    builder isNil
	ifTrue: [builder := NodeBuilder new].
    documentNode := Document new.
    dtd := DocumentType new.
    documentNode dtd: dtd.
    unresolvedIDREFs := Set new.
    ignore := false.
    eol := Character lf!

wrapStream: aStream
    ^StreamWrapper
	stream: aStream
	protocol: (aStream isExternalStream
					ifTrue: ['file']
					ifFalse: ['internal'])
	name: nil
	entity: nil! !

!XMLParser methodsFor: 'accessing'!

dtd
    ^dtd!

eol
    ^eol!

sourceWrapper
    ^sourceStack "last"!

validate: aBoolean
    validating := aBoolean! !

!XMLParser methodsFor: 'testing'!

hasExpanded: anEntity
    | s |
    s := sourceStack.
    [s isNil] whileFalse:
	[s entity == anEntity
		ifTrue: [self malformed: ('The %1 entity "%2" invokes itself recursively'
					bindWith: anEntity entityType
					with: anEntity name)].
	s := s nextLink].
    ^false!

isValidating
    ^validating!

shouldTestWFCEntityDeclared
    ^self documentNode hasDTD not
	or: [(self documentNode hasExternalDTD not
		and: [self documentNode usesParameterEntities not])
	or: [self documentNode isDeclaredStandalone]]! !

!XMLParser methodsFor: 'api'!

comment
    | comment str1 |
    str1 := currentSource.
    ^(self skipIf: '<!--')
	ifTrue:
		[comment := self completeComment: str1.
		(ignore or: [comment isDiscarded])
			ifFalse: [self documentNode addNode: comment].
		true]
	ifFalse: [false]!

docTypeDecl
    | nm id |
    ^(self skipIf: '<!DOCTYPE')
	ifTrue:
		[self forceSpace.
		self documentNode noteDTD.
		nm := self getQualifiedName.
		nm yourself.
		self skipSpace.
		(id := self externalID: #docType) notNil ifTrue: [self skipSpace].
		self sourceWrapper usedAsExternal: false.
		(self skipIf: '[')
			ifTrue: [[self skipIf: ']']
				whileFalse: [self dtdEntry]].
		self sourceWrapper usedAsExternal: nil.
		id isNil ifFalse: [self dtdFile: id].
		self skipSpace.
		self mustFind: '>'.
		self isValidating ifTrue: [dtd completeValidationAgainst: self].
		true]
	ifFalse: [false]!

latestURI
    | s nm |
    s := self fullSourceStack reverse detect: [:i | i protocol ~= 'internal'] ifNone: [nil].
    ^s isNil
	ifTrue:
		[nm := Directory append: 'xxx' to: Directory working.
		nm replaceAll: Directory pathSeparator with: $/.
		'file' -> nm]
	ifFalse: [s protocol -> s name]!

misc
    ^self atEnd not and: [self skipSpace or: [self comment or: [self pi]]]!

pi
    | str1 pi |
    str1 := currentSource.
    ^(self skipIf: '<?')
	ifTrue:
		[pi := self completePI: str1.
		ignore ifFalse: [self documentNode addNode: pi].
		true]
	ifFalse: [false]!

prolog
    self sourceWrapper xmlDeclIn: self.        "This is optional."
    self getNextChar.
    [self misc] whileTrue.
    self docTypeDecl
	ifTrue: [[self misc] whileTrue].!

pushSource: aStreamWrapper
    aStreamWrapper nextLink: sourceStack.
    sourceStack := aStreamWrapper!

scanDocument
    ^[self prolog.
    self documentNode addNode: self element.
    [self misc] whileTrue.
    self atEnd ifFalse: [self expected: 'comment or processing instruction'].
    self documentNode updateIDs.
    self checkUnresolvedIDREFs.
    self documentNode]
	ensure: [self closeAllFiles]! !

!XMLParser methodsFor: 'DTD processing'!

conditionalSect
    | nm oldIgnore |
    hereChar = $< ifFalse: [^false].
    self inInternalSubset ifTrue: [^false].
    ^(self skipIf: '<![')
	ifTrue:
		[self skipSpaceInDTD.
		nm := self getSimpleName.
		(#('INCLUDE' 'IGNORE') includes: nm)
			ifFalse: [self expected: 'INCLUDE or IGNORE'].
		oldIgnore := ignore.
		ignore := ignore or: [nm = 'IGNORE'].
		self skipSpaceInDTD.
		self mustFind: '['.
		[self skipIf: ']]>']
			whileFalse: [self dtdEntry].
		ignore := oldIgnore.
		true]
	ifFalse: [false]!

dtdEntry
    ((self PERef: #dtdEntry) or:
		[self markUpDecl or:
		[self conditionalSect or:
		[self skipSpace]]])
	ifFalse: [self expected: 'markup declaration or PE reference']!

dtdFile: newURI
    | uri str |
    self documentNode noteExternalDTD.
    currentSource skip: -1.
	"So we don't lose hereChar."
    uri := URIResolver resolve: newURI last from: self latestURI.
    self pushSource: (str := StreamWrapper
					stream: (URIResolver openStreamOn: uri)
					protocol: uri key
					name: uri value
					entity: nil).
    str usedAsExternal: true.
    str textDeclIn: self.
    self getNextChar.
    [self fullSourceStack includes: str]
	whileTrue: [self dtdEntry].!

externalID: usage
    "Usage may be #docType, #entity, or #notation.
    DocType is treated specially, since PE references are not allowed.
    Notation is treated specially since the system identifier of the
	PUBLIC form is optional."

    | lit2 lit1 forceSpace skipSpace |
    forceSpace := [usage == #docType
				ifTrue: [self forceSpace]
				ifFalse: [self forceSpaceInDTD]].
    skipSpace := [usage == #docType
				ifTrue: [self skipSpace]
				ifFalse: [self skipSpaceInDTD]].
    ^(self skipIf: 'SYSTEM')
	ifTrue:
		[forceSpace value.
		lit2 := self systemLiteral.
		Array with: lit2]
	ifFalse: [(self skipIf: 'PUBLIC')
		ifTrue:
			[forceSpace value.
			lit1 := self pubIdLiteral.
			usage == #notation
				ifTrue:
					[(skipSpace value and:
							[hereChar = $' or: [hereChar = $"]])
						ifTrue: [lit2 := self systemLiteral]
						ifFalse: [lit2 := nil]]
				ifFalse:
					[forceSpace value.
					lit2 := self systemLiteral].
			Array with: lit1 with: lit2]
		ifFalse:
			[nil]]!

inInternalSubset
    self fullSourceStack reverseDo:
	[:str |
	str usedAsExternal isNil
		ifFalse: [^str usedAsExternal not]].
    self error: 'Not currently processing the DTD'!

markUpDecl
    ^self elementDecl
	or: [self attListDecl
	or: [self entityDecl
	or: [self notationDecl
	or: [self pi
	or: [self comment]]]]]!

notationDecl
    | nm id str |
    str := currentSource.
    ^(self skipIf: '<!NOTATION')
	ifTrue:
		[self forceSpaceInDTD.
		nm := self getSimpleName.
		self forceSpaceInDTD.
		id := self externalID: #notation.
		ignore ifFalse: [dtd notationAt: nm put: (builder notation: nm value: id)].
		self skipSpaceInDTD.
		self mustFind: '>'.
		str == lastSource
			ifFalse: [self invalid: 'Improper nesting of declarations within a parameter entity'].
		true]
	ifFalse: [false]!

pubIdLiteral
    | str |
    str := self quotedString.
    str do: [:ch |
	((' -''()+,./:=?;!*#@$_%' includes: ch)
			or: [ch asInteger = 10
			or: [ch asInteger = 13
			or: [ch < 127
			and: [ch isLetter or: [ch isDigit]]]]])
		ifFalse: [self expected: 'valid public id character']].
    ^str!

systemLiteral
    ^self quotedString! !

!XMLParser methodsFor: 'entity processing'!

entityDecl
    | nm def str |
    str := currentSource.
    ^(self skipIf: '<!ENTITY')
	ifTrue:
		[self forceSpace.
		hereChar = $%
			ifTrue:
				[self getNextChar; forceSpaceInDTD.
				nm := self getSimpleName.
				self forceSpaceInDTD.
				def := self peDef.
				def name: nm.
				ignore ifFalse: [self dtd parameterEntityAt: nm put: def]]
			ifFalse:
				[self skipSpaceInDTD.
				nm := self getSimpleName.
				self forceSpaceInDTD.
				def := self entityDef.
				def name: nm.
				ignore ifFalse: [self dtd generalEntityAt: nm put: def]].
		self skipSpaceInDTD.
		self mustFind: '>'.
		str == lastSource
			ifFalse: [self invalid: 'Improper nesting of declarations within a parameter entity'].
		true]
	ifFalse: [false]!

entityDef
    | val ndata |
    ^(val := self entityValue) isNil
	ifTrue: [(val := self externalID: #entity) isNil
		ifTrue: [self expected: 'entity value or external id']
		ifFalse:
			[ndata := self nDataDecl.
			GeneralEntity new
				externalFrom: val;
				ndata: ndata]]
	ifFalse: [GeneralEntity new text: val]!

entityValue
    | data aQuote s str1 |
    aQuote := hereChar.
    (aQuote = $' or: [aQuote = $"]) ifFalse: [^nil].
    s := currentSource.
    self getNextChar.
    data := (String new: 32) writeStream.
    OrderedCollection new.
    [hereChar isNil
	ifTrue: [self expected: (String with: aQuote)].
    (hereChar = aQuote and: [s = currentSource])]
	whileFalse:
		[hereChar = $&
			ifTrue:
				[str1 := currentSource.
				(self skipIf: '&#')
					ifTrue: [self charEntity: data startedIn: str1]
					ifFalse: [self getNextChar; generalEntity: data]]
			ifFalse: [(self PERef: #data)
				ifFalse:
					[data nextPut: hereChar.
					self getNextChar]]].
    self getNextChar.
    ^data contents!

generalEntity: str
    | nm |
    nm := self getSimpleName.
    hereChar = $;
	ifFalse: [self expected: 'semicolon'].
    str nextPut: $&; nextPutAll: nm; nextPut: $;.
    self getNextChar!

nDataDecl
    ^self skipSpaceInDTD
	ifTrue: [(self skipIf: 'NDATA')
		ifTrue:
			[self forceSpaceInDTD.
			self getSimpleName]
		ifFalse: [nil]]
	ifFalse: [nil]!

peDef
    | val |
    ^(val := self entityValue) isNil
	ifTrue: [(val := self externalID: #entity) isNil
		ifTrue: [self expected: 'entity value or external id']
		ifFalse:
			[ParameterEntity new
				externalFrom: val]]
	ifFalse: [ParameterEntity new text: val]!

PERef: refType
    | nm exp |
    ^(hereChar = $%)
	ifTrue:
		[self getNextChar.
		(self inInternalSubset and: [refType ~= #dtdEntry])
			ifTrue: [self notPermitted: 'Parameter entity reference in the internal DTD, inside a declaration'].
		nm := self getSimpleName.
		hereChar = $; ifFalse: [self expected: 'semicolon'].
		exp := self dtd parameterEntityAt: nm.
		exp isNil
			ifTrue: [self warn: ('The parameter entity "%1" has not been defined'
						bindWith: nm)].
		exp isNil
			ifTrue: [self isValidating
				ifTrue:
					[self invalid: 'Parameter entity used but not defined'.
					self getNextChar]
				ifFalse:
					[self pushSource: (StreamWrapper
							emptyWithExtraSpace: refType ~= #data).
					self getNextChar]]
			ifFalse:
				[exp streamFor: self addSpaces: refType ~= #data].
		(refType ~= #data and: [self sourceWrapper protocol ~= 'internal'])
			ifTrue: [self sourceWrapper usedAsExternal: true].
		true]
	ifFalse: [false]! !

!XMLParser methodsFor: 'element def processing'!

completeChildren: str
    | div items node |
    items := OrderedCollection with: self cp.
    self skipSpaceInDTD.
    div := nil.
    [self skipIf: ')']
	whileFalse:
		[div isNil ifTrue:
			[(',|' includes: hereChar) ifFalse: [self expected: ', or |'].
			div := hereChar].
		div = hereChar ifFalse: [self expected: (String with: div)].
		self getNextChar; skipSpaceInDTD.
		items add: self cp.
		self skipSpaceInDTD].
    (self isValidating and: [lastSource ~~ str])
	ifTrue: [self expected: 'proper nesting of parentheses within entities'].
    div isNil ifTrue: [div := $,].
    div = $,
	ifTrue: [node := SequencePattern on: items]
	ifFalse: [node := ChoicePattern on: items].
    ('*+?' includes: hereChar)
	ifTrue:
		[node := ModifiedPattern on: node type: hereChar.
		self getNextChar].
    ^node!

completeMixedContent: str
    "we already have the #PCDATA finished."
    | names |
    self skipSpaceInDTD.
    names := OrderedCollection new.
    [hereChar = $)]
	whileFalse:
		[self mustFind: '|'.
		self skipSpaceInDTD.
		names add: (NamePattern named: self getQualifiedName).
		self skipSpaceInDTD].
    (self isValidating and: [currentSource ~~ str])
	ifTrue: [self expected: 'proper nesting of parentheses within entities'].
    names size = 0
	ifTrue: [self mustFind: ')']
	ifFalse: [self mustFind: ')*'].
    ^MixedPattern on: names!

contentsSpec
    | str |
    ^(self skipIf: 'ANY')
	ifTrue: [AnyPattern new]
	ifFalse: [(self skipIf: 'EMPTY')
		ifTrue: [EmptyPattern new]
		ifFalse:
			[str := currentSource.
			self mustFind: '('.
			self skipSpaceInDTD.
			(self skipIf: '#PCDATA')
				ifTrue: [self completeMixedContent: str]
				ifFalse: [self completeChildren: str]]]!

cp
    | node str |
    str := currentSource.
    ^(self skipIf: '(')
	ifTrue: [self completeChildren: str]
	ifFalse:
		[node := NamePattern named: self getQualifiedName.
		('*+?' includes: hereChar)
			ifTrue:
				[node := ModifiedPattern on: node type: hereChar.
				self getNextChar].
		node]!

elementDecl
    | nm cSpec str |
    str := currentSource.
    ^(self skipIf: '<!ELEMENT')
	ifTrue:
		[self forceSpaceInDTD.
		nm := self getQualifiedName.
		self forceSpaceInDTD.
		cSpec := self contentsSpec normalize.
		ignore ifFalse: [self dtd elementFor: nm put: cSpec].
		self skipSpaceInDTD.
		self mustFind: '>'.
		str == lastSource
			ifFalse: [self invalid: 'Improper nesting of declarations within a parameter entity'].
		true]
	ifFalse: [false]! !

!XMLParser methodsFor: 'element processing'!

charEntity: data startedIn: str1
    | base digit n d |
    hereChar = $x
	ifTrue:
		[base := 16.
		digit := 'hex digit'.
		self getNextChar]
	ifFalse:
		[base := 10.
		digit := 'digit'].
    n := 0.
    [hereChar = $;]
	whileFalse:
		[d := hereChar digitValue.
		(d >= 0 and: [d < base]) ifFalse: [self expected: digit].
		n := n * base + d.
		self getNextChar].
    str1 = currentSource
	ifFalse: [self expected: 'proper nesting of character entities inside other entities'].
    (n >= 16r110000   "IllegalCharacters size"
		or: [(IllegalCharacters at: n+1) = 1])
	ifTrue: [self notPermitted: 'a character with Unicode value ', n printString].
    data nextPut: (Character value: n).
    self getNextChar!

closeTag: tag return: elements
    | nm |
    nm := self getQualifiedName.
    nm := builder correctTag: nm.
    self skipSpace.
    self mustFind: '>'.
    nm = tag
	ifFalse: [self expected: 'close tag for ', tag asString].
    ^elements!

completeCDATA: str1
    | str data |
    data := (String new: 32) writeStream.
    [str := self upToAll: ']>'.
    str last = $]]
	whileFalse:
		[data nextPutAll: str; nextPutAll: ']>'].
    lastSource = str1
	ifFalse: [self expected: 'proper nesting of CDATA in entities'].
    data nextPutAll: (str copyFrom: 1 to: str size - 1).
    ^builder makeText: data contents!

completeComment: str1
    | str data comment |
    data := (String new: 32) writeStream.
    [str := self upToAll: '->'.
    str last = $-]
	whileFalse:
		[data nextPutAll: str; nextPutAll: '->'].
    data nextPutAll: (str copyFrom: 1 to: str size - 1).
    comment := builder comment: data contents.
    (comment text findString: '--' startingAt: 1) = 0
	ifFalse: [self notPermitted: 'doubled hyphens in comments'].
    (comment text size > 0 and: [comment text last = $-])
	ifTrue: [self notPermitted: 'a hyphen as the last character in a comment'].
    lastSource = str1
	ifFalse: [self expected: 'proper nesting of comments in entities'].
    ^comment!

completePI: str1
    | nm pi |
    nm := self getSimpleName.
    nm = 'xml' ifTrue: [self notPermitted: '"xml" declaration, except at the beginning of the file'].
    nm asLowercase = 'xml' ifTrue: [self notPermitted: '''xml'' as part of a name'].
    self skipSpace
	ifTrue:
		[pi := self upToAll: '?>']
	ifFalse:
		[pi := ''.
		self mustFind: '?>'].
    lastSource = str1
	ifFalse: [self expected: 'proper nesting of programming instructions in entities'].
    ^builder pi: nm text: pi!

element
    | str1 startPosition |
    str1 := currentSource.
    startPosition := str1 stream position - (str1 characterSize: hereChar).
    ^self elementAtPosition: startPosition!

elementAtPosition: startPosition
    | attributes nm str1 elements p tag |
    str1 := currentSource.
    self mustFind: '<'.
    nm := self getQualifiedName.
    ^builder pushTag: nm
	whileDoing:
		[attributes := self processAttributes.
		self isValidating
			ifTrue: [attributes := self validateAttributes: attributes for: builder currentTag].
		(self skipIf: '/>')
			ifTrue: 
				[str1 = lastSource ifFalse: [self expected: 'elements properly nested within entities'].
				self isValidating
					ifTrue:
						[tag := builder currentTag.
						p := self dtd elementFor: tag.
						p isNil
							ifTrue: [self notPermitted: ('Using a tag (%1) without declaring it' bindWith: tag asString)]
							ifFalse: [p validateTag: tag content: #() for: self]].
				builder postProcessElement: (builder
					tag: builder currentTag
					attributes: attributes
					elements: nil
					position: startPosition
					stream: str1)]
			ifFalse: [(self skipIf: '>')
				ifTrue: 
					[str1 = lastSource ifFalse: [self expected: 'elements properly nested within entities'].
					elements := self elementContent: builder currentTag openedIn: str1.
					builder postProcessElement: (builder
						tag: builder currentTag
						attributes: attributes
						elements: elements
						position: startPosition
						stream: str1)]
				ifFalse: [self expected: 'end of start tag']]]!

elementContent: tag openedIn: str
    | data elements str1 result p |
    data := (String new: 32) writeStream.
    elements := OrderedCollection new.
    [hereChar isNil
	ifTrue: [self expected: ('end tag for %%<%1>' bindWith: tag)].
    hereChar = $<
	ifTrue:
		[data position = 0
			ifFalse:
				[data := data contents.
"###					(data findString: ']]>' startingAt: 1) = 0
					ifFalse: [self halt: 'including ]]> in element content'].
"					self with: elements add: (builder makeText: data).
				data := (String new: 32) writeStream].
		str1 := currentSource.
		(self skipIf: '</')
			ifTrue:
				[result := self closeTag: tag return: elements asArray.
				str == lastSource
					ifFalse: [self expected: 'elements properly nested within entities'].
				self isValidating
					ifTrue:
						[p := self dtd elementFor: tag.
						p isNil
							ifTrue: [self invalid: ('Using a tag (%1) without declaring it is not valid' bindWith: tag)]
							ifFalse: [p validateTag: tag content: result for: self]].
				^result]
			ifFalse: [(self skipIf: '<?')
				ifTrue: [self with: elements add: (self completePI: str1)]
				ifFalse: [(self skipIf: '<![CDATA[')
					ifTrue: [self with: elements add: (self completeCDATA: str1)]
					ifFalse: [(self skipIf: '<!--')
						ifTrue: [self with: elements add: (self completeComment: str1)]
						ifFalse: [self with: elements add: self element]]]]]
	ifFalse: [hereChar = $&
		ifTrue:
			[str1 := currentSource.
			(self skipIf: '&#')
				ifTrue: [self charEntity: data startedIn: str1]
				ifFalse: [self getNextChar; generalEntityInText: data canBeExternal: true]]
		ifFalse:
			[data nextPut: hereChar.
			self getNextChar]]] repeat!

generalEntityInText: str canBeExternal: external
    | exp nm str1 |
    str1 := lastSource.
    nm := self getSimpleName.
    hereChar = $;
	ifFalse: [self expected: 'semicolon'].
    currentSource = str1
	ifFalse: [self expected: 'proper nesting of entity references within other entity references'].
    exp := self dtd generalEntityAt: nm.
    exp isNil
	ifTrue: [self warn: ('The general entity "%1" has not been defined'
						bindWith: nm)].
    exp isNil
	ifTrue:
		[self shouldTestWFCEntityDeclared
			ifTrue: [self malformed: 'General entity used but not defined'].
		str nextPut: $&; nextPutAll: nm; nextPut: $;.
		self getNextChar]
	ifFalse:
		[(external or: [exp isExternal not])
			ifFalse: [self notPermitted: 'external entities in attribute values'].
		exp isParsed
			ifFalse: [self malformed: 'References to unparsed entities other than in an attribute of type ENTITY are not permitted'].
		exp streamFor: self].!

isValidTag: aTag
    ^true! !

!XMLParser methodsFor: 'attribute def processing'!

attListDecl
    | nm str1 attr |
    str1 := currentSource.
    ^(self skipIf: '<!ATTLIST')
	ifTrue:
		[self forceSpaceInDTD.
		nm := self getQualifiedName.
		[self skipSpaceInDTD.
		self skipIf: '>']
			whileFalse:
				[self skipSpaceInDTD.
				attr := AttributeDef new name: self getQualifiedName.
				self forceSpaceInDTD.
				attr type: self attType.
				self forceSpaceInDTD.
				attr default: self defaultDecl.
				self isValidating ifTrue: [attr selfValidateFor: self].
				ignore ifFalse: [self dtd attributeFor: nm subKey: attr name put: attr]].
		str1 == lastSource
			ifFalse: [self invalid: 'Improper nesting of declarations within a parameter entity'].
		true]
	ifFalse: [false]!

attType
    | nm all type |
    ^hereChar = $(
	ifTrue: [self enumeration]
	ifFalse:
		[nm := self getSimpleName.
		all := #('NOTATION' 'CDATA' 'ID'
					'IDREF' 'IDREFS'
					'ENTITY' 'ENTITIES'
					'NMTOKEN' 'NMTOKENS').
		(all includes: nm)
			ifFalse: [self expected: 'one of ', all storeString].
		type := #(#{NOTATION_AT} #{CDATA_AT} #{ID_AT}
					#{IDREF_AT} #{IDREFS_AT}
					#{ENTITY_AT} #{ENTITIES_AT}
					#{NMTOKEN_AT} #{NMTOKENS_AT})
				at: (all indexOf: nm).
		nm = 'NOTATION'
			ifTrue: [self completeNotationType]
			ifFalse: [type value new]].!

completeNotationType
    | nm |
    self forceSpaceInDTD.
    self mustFind: '('.
    self skipSpaceInDTD.
    nm := OrderedCollection with: self getSimpleName.
    self skipSpaceInDTD.
    [self skipIf: '|']
	whileTrue:
		[self skipSpaceInDTD.
		nm add: self getSimpleName.
		self skipSpaceInDTD].
    self mustFind: ')'.
    ^NOTATION_AT typeNames: nm!

defaultDecl
    | fixed default |
    ^(self skipIf: '#REQUIRED')
	ifTrue: [#required]
	ifFalse: [(self skipIf: '#IMPLIED')
		ifTrue: [#implied]
		ifFalse:
			[fixed := self skipIf: '#FIXED'.
			fixed ifTrue:
				[self forceSpaceInDTD].
			default := self attValue.
			default isNil ifTrue: [self expected: 'quoted value for the attribute''s default'].
			fixed -> default]]!

enumeration
    | nm |
    self mustFind: '('.
    self skipSpaceInDTD.
    nm := OrderedCollection with: self nmToken.
    self skipSpaceInDTD.
    [self skipIf: '|']
	whileTrue:
		[self skipSpaceInDTD.
		nm add: self nmToken.
		self skipSpaceInDTD].
    self mustFind: ')'.
    ^Enumeration_AT withAll: nm! !

!XMLParser methodsFor: 'attribute processing'!

attribute
    | nm value |
    nm := self getQualifiedName.
    self skipSpace.
    self mustFind: '='.
    self skipSpace.
    value := self attValue.
    value isNil ifTrue: [self expected: 'quoted value for the attribute'].
    ^builder attribute: nm value: value!

attValue
    | data aQuote s str1 |
    aQuote := hereChar.
    (aQuote = $' or: [aQuote = $"]) ifFalse: [^nil].
    s := currentSource.
    self getNextChar.
    data := (String new: 32) writeStream.
    OrderedCollection new.
    [(hereChar = aQuote and: [s = currentSource])]
	whileFalse:
		[hereChar = $<
			ifTrue: [self malformed: '< not permitted in attribute values; use &lt;'].
		hereChar = $&
			ifTrue:
				[str1 := currentSource.
				(self skipIf: '&#')
					ifTrue: [self charEntity: data startedIn: str1]
					ifFalse: [self getNextChar; generalEntityInText: data canBeExternal: false]]
			ifFalse:
				[hereChar asInteger < 16r20
					ifTrue: [data space]
					ifFalse: [data nextPut: hereChar].
				self getNextChar]].
    self getNextChar.
    ^data contents!

isValidName: aTag
    aTag size = 0 ifTrue: [^false].
    (self isValidNameStart: aTag first)
	ifFalse: [^false].
    2 to: aTag size do: [:i |
	(self isValidNameChar: (aTag at: i))
		ifFalse: [^false]].
    ^true!

isValidNmToken: aTag
    aTag size = 0 ifTrue: [^false].
    1 to: aTag size do: [:i |
	(self isValidNameChar: (aTag at: i))
		ifFalse: [^false]].
    ^true!

processAttributes
    | attributes |
    attributes := nil.
    [self skipSpace.
    self isValidNameStart: hereChar]
	whileTrue:
		[attributes isNil ifTrue: [attributes := OrderedCollection new].
		attributes add: self attribute.
		(attributes collect: [:i | i key]) asSet size = attributes size
			ifFalse: [self notPermitted: 'two attributes with the same name']].
    attributes := builder resolveNamespaces: attributes.
    ^attributes!

quotedString
    | string |
    hereChar = $"
	ifTrue:
		[string := self upTo: $".
		self getNextChar.
		^string].
    hereChar = $'
	ifTrue:
		[string := self upTo: $'.
		self getNextChar.
		^string].
    self expected: 'quoted string'!

validateAttributes: attributes for: tag
    | attr attributeList keys |
    attr := self dtd attributesFor: tag.
    attributeList := attributes isNil ifTrue: [#()] ifFalse: [attributes].
    keys := Set new.
    attributeList do: [:i |
	(keys includes: i key)
		ifTrue: [self malformed: ('the attribute %1 was used twice in this element''s tag'
							bindWith: i key)]
		ifFalse: [keys add: i key].
	(attr includesKey: i key asString)
		ifFalse: [self invalid: ('the attribute %1 was not defined in the DTD'
							bindWith: i key)]].
    attr do: [:adef | | a |
	a := attributeList detect: [:at | at key isLike: adef name] ifNone: [].
	a isNil
		ifTrue: [adef hasDefault
			ifTrue: [attributeList := attributeList copyWith:
						(builder attribute: adef name value: adef default)]
			ifFalse: [adef isRequired
				ifTrue: [self invalid: ('"%1" elements are required to have a "%2" attribute'
								bindWith: tag asString
								with: adef name asString)]]]
		ifFalse: [adef validateValueOf: a for: self]].
    ^attributeList size = 0
	ifTrue: [nil]
	ifFalse: [attributeList]! !

!XMLParser methodsFor: 'IDs'!

checkUnresolvedIDREFs
    (self isValidating and: [unresolvedIDREFs isEmpty not])
	ifTrue: [self invalid: ('The IDREFs %1 have not been resolved to IDs' bindWith: unresolvedIDREFs asSortedCollection asArray)]!

rememberIDREF: anID
    self documentNode atID: anID ifAbsent: [unresolvedIDREFs add: anID]!

resolveIDREF: anID
    unresolvedIDREFs remove: anID ifAbsent: []! !

!XMLParser methodsFor: 'streaming'!

atEnd
    [sourceStack isNil ifTrue: [^true].
    sourceStack atEnd]
	whileTrue:
		[sourceStack close.
		sourceStack := sourceStack nextLink].
    ^false!

forceSpace
    self skipSpace ifFalse: [self expected: 'white space'].!

forceSpaceInDTD
    self skipSpaceInDTD ifFalse: [self expected: 'white space'].!

getNextChar
    ^hereChar := self nextChar!

mustFind: str
    (self skipIf: str)
	ifFalse: [self expected: '"', str, '"']!

nextChar
    | ch |
    self atEnd.
    sourceStack isNil ifTrue: [^nil].
    lastSource := currentSource.
    currentSource := sourceStack.
    ch := currentSource nextFor: self.
    ^ch!

skipIf: str
    | p oc |
    hereChar = str first ifFalse: [^false].
    p := self sourceWrapper stream position.
    oc := hereChar.
    1 to: str size do: [:i |
	hereChar = (str at: i)
		ifFalse:
			[self sourceWrapper stream position: p.
			hereChar := oc.
			^false].
	lastSource := currentSource.
	currentSource := self sourceWrapper.
	hereChar := self sourceWrapper nextFor: self].
    hereChar isNil
	ifTrue: [self getNextChar].
    ^true!

skipSpace
    | n |
    n := 0.
    [hereChar notNil and: [#(9 10 13 32) includes: hereChar asInteger]] whileTrue:
	[n := n+1.
	self getNextChar].
    ^n > 0!

skipSpaceInDTD
    | space |
    space := self skipSpace.
    [self PERef: #dtd]
	whileTrue: [space := self skipSpace | space].
    ^space!

upTo: aCharacter
    "Answer a subcollection from position to the occurrence (if any, exclusive) of anObject.
     The stream is left positioned after anObject.
    If anObject is not found answer everything."

    | newStream element |
    newStream := (String new: 64) writeStream.
    [self atEnd]
	whileFalse:
		[element := self nextChar.
		element = aCharacter
			ifTrue: [^newStream contents].
		newStream nextPut: element.].
    self expected: (String with: aCharacter).
    ^newStream contents!

upToAll: target
    " Answer a subcollection from the current position
    up to the occurrence (if any, not inclusive) of target,
    and leave the stream positioned before the occurrence.
    If no occurrence is found, answer the entire remaining
    stream contents, and leave the stream positioned at the end.
    We are going to cheat here, and assume that the first
    character in the target only occurs once in the target, so
    that we don't have to backtrack."

    | str i |
    (target occurrencesOf: target first) = 1
	ifFalse: [self error: 'The target collection is ambiguous.'].
    self sourceWrapper skip: -1.
    str := (String new: 32) writeStream.
    [str nextPutAll: (self upTo: target first).
    i := 2.
    [i <= target size and:
		[self nextChar = (target at: i)]]
	whileTrue:
		[i := i+1].
    i <= target size]
	whileTrue:
		[str nextPutAll: (target copyFrom: 1 to: i - 1).
		self sourceWrapper skip: -1].
    self getNextChar.
    ^str contents! !

!XMLParser methodsFor: 'private'!

closeAllFiles
    self fullSourceStack do: [:str | str close]!

documentNode
    ^documentNode!

error: aStringOrMessage
    ^XMLSignal signal: aStringOrMessage asString!

expected: string
    self malformed: string, ' expected, but not found'!

fullSourceStack
    | out s |
    out := OrderedCollection new.
    s := sourceStack.
    [s isNil]
	whileFalse:
		[out addFirst: s.
		s := s nextLink].
    ^out!

getQualifiedName
    | nm |
    nm := self getSimpleName.
    ^hereChar = $:
	ifTrue:
		[self getNextChar.
		NodeTag new qualifier: nm ns: '' type: self getSimpleName]
	ifFalse:
		[NodeTag new qualifier: '' ns: '' type: nm]!

getSimpleName
    | s |
    (self isValidNameStart: hereChar) ifFalse: [^self expected: 'name'].
    s := (String new: 16) writeStream.
    s nextPut: hereChar.
    [self getNextChar.
    self isValidNameChar: hereChar] whileTrue: [s nextPut: hereChar].
    ^s contents!

invalid: aString
    self class invalid: aString!

isValidNameChar: c
    ^c isLetter
	or: [c isDigit
	or: ['-_.' includes: c]]!

isValidNameStart: c
    ^c isLetter or: [c = $_]!

malformed: aString
    self class malformed: aString!

nmToken
    | s |
    (self isValidNameChar: hereChar) ifFalse: [^self expected: 'NmToken'].
    s := (String new: 16) writeStream.
    s nextPut: hereChar.
    [self getNextChar.
    self isValidNameChar: hereChar] whileTrue: [s nextPut: hereChar].
    ^s contents!

notPermitted: string
    self malformed: string, ' is not permitted'!

validateEncoding: encName
    | c |
    encName size = 0 ifTrue: [self expected: 'non-empty encoding name'].
    c := encName first.
    (c asInteger < 128 and: [c isLetter])
	ifFalse: [self malformed: ('The first letter of the encoding ("%1") must be an ASCII alphabetic letter'
					bindWith: encName)].
    2 to: encName size do: [:i |
	c := encName at: i.
	(c asInteger < 128 and: [c isLetter or: [c isDigit or: ['._-' includes: c]]])
		ifFalse: [self malformed: ('The first name of the encoding ("%1") must be ''.'', ''_'', ''-'', or an ASCII letter or digit'
					bindWith: encName)]]!

warn: aString
    Warning signal: aString!

with: list add: node
    node isDiscarded
	ifFalse: [list add: node]! !


Object subclass: #XMLNodeBuilder
    instanceVariableNames: 'tagStack tags '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Parsing'!




!XMLNodeBuilder class methodsFor: 'instance creation'!

new
    ^super new initialize! !



XMLNodeBuilder comment: '
This class along with it''s subclass NodeBuilder is used by the XMLParser to distill an XML document into it''s component elements. Since XML elements are tag delimited and nest
properly within each other in a well-formed XML document, this class contains code to
process the tags and build a dictionary of xml elements

Instance Variables:
    tagStack	<OrderedCollection> 
    tags	<Dictionary>
'!


!XMLNodeBuilder methodsFor: 'initialize'!

initialize
    tagStack := OrderedCollection new.
    tags := Dictionary new.! !

!XMLNodeBuilder methodsFor: 'accessing'!

currentTag
    ^tagStack last tag!

pushTag: tag whileDoing: aBlock
    tagStack addLast: (ElementContext new tag: tag).
    ^aBlock ensure: [tagStack removeLast]! !

!XMLNodeBuilder methodsFor: 'namespaces'!

correctAttributeTag: attribute
    | ns tag key qual type |
    qual := attribute tag qualifier.
    qual isEmpty
	ifTrue: [^self].
    type := attribute tag type.
    ns := self findNamespace: qual.
    key := Array with: qual with: ns with: type.
    tag := tags at: key ifAbsentPut: [NodeTag new qualifier: qual ns: ns type: type].
    attribute tag: tag!

correctTag: tag
    | ns type key qualifier |
    qualifier := tag qualifier.
    type := tag type.
    ns := self findNamespace: qualifier.
    key := Array with: qualifier with: ns with: type.
    ^tags at: key ifAbsentPut: [NodeTag new qualifier: qualifier ns: ns type: type]!

findNamespace: ns
    | nsURI |
    ns = 'xml' ifTrue: [^XML_URI].
    ns = 'xmlns' ifTrue: [^'<!-- xml namespace -->'].
    tagStack size to: 1 by: -1 do: [:i |
	nsURI := (tagStack at: i) findNamespace: ns.
	nsURI = nil ifFalse: [^nsURI]].
    ^ns = ''
	ifTrue: ['']
	ifFalse: [XMLParser invalid: ('The namespace qualifier %1 has not been bound to a namespace URI' bindWith: ns)]!

postProcessElement: anElement
    tagStack last definesNamespaces
	ifTrue: [anElement namespaces: tagStack last namespaces].
    ^anElement!

resolveNamespaces: attributes
    | newAttributes |
    attributes isNil
	ifTrue: [newAttributes := #()]
	ifFalse:
		[newAttributes := OrderedCollection new: attributes size.
		attributes do: [:attr |
			newAttributes add: attr.
			attr tag qualifier = 'xmlns'
				ifTrue: [tagStack last defineNamespace: attr]
				ifFalse: [(attr tag isLike: 'xmlns')
					ifTrue: [tagStack last defineDefaultNamespace: attr]
					ifFalse: []]].
		newAttributes do: [:attr | self correctAttributeTag: attr]].
    tagStack last tag: (self correctTag: tagStack last tag).
    ^newAttributes isEmpty
	ifTrue: [nil]
	ifFalse: [newAttributes asArray]! !


Object subclass: #Node
    instanceVariableNames: 'parent flags '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!




!Node class methodsFor: 'instance creation'!

new
    ^super new initialize! !



Node comment: '
An XML.Node represents a logical component of an XML Document. Logically, the
document is composed of declarations, elements, comments, character references, and 
processing instructions, all of which are indicated in the document by explicit markup. 
The various subclasses of XML.Node represent these various components.

Subclasses must implement the following messages:
    printing
	printCanonicalOn:
	printHTMLOn:
	printNoIndentOn:endSpacing:spacing:

Instance Variables:
    parent	<XML.Node | nil> 
    flags	<SmallInteger> 
'!


!Node methodsFor: 'initialize'!

initialize
    flags := 0.! !

!Node methodsFor: 'accessing'!

discard
    self flags: (self flags bitOr: 1)!

document
    ^parent document!

expandedName
    ^''!

flags
    ^flags!

flags: flagBits
    flags := flagBits!

parent
    ^parent!

parent: aNode
    parent := aNode!

selectNodes: aBlock
    ^#()!

tag
    ^nil! !

!Node methodsFor: 'printing'!

canonicalPrintString
    | s |
    s := (String new: 1024) writeStream.
    self printCanonicalOn: s.
    ^s contents!

noIndentPrintString
    | s |
    s := (String new: 1024) writeStream.
    self printNoIndentOn: s.
    ^s contents!

printCanonical: text on: aStream
    "Print myself on the stream in the form described by James
    Clark's canonical XML."

    text do: [:c || entity |
	entity := CanonicalXMLEntities at: c value + 1.
	entity isNil
	    ifTrue: [ aStream nextPut: c ]
	    ifFalse: [ aStream nextPutAll: entity ]
    ]!

printCanonicalOn: aStream
    "Print myself on the stream in the form described by James
    Clark's canonical XML."

    self subclassResponsibility!

printHTMLOn: aStream
    "Print myself on the stream in a form usual for HTML."

    self subclassResponsibility!

printNoIndentOn: aStream
    "Print myself on the stream with line breaks between adjacent
    elements, but no indentation."

    self printNoIndentOn: aStream
	endSpacing: [:node :list | aStream nl]
	spacing: [:node :list | aStream nl]!

printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock
    "Print myself on the stream with line breaks between adjacent
    elements, but no indentation."

    self subclassResponsibility!

printOn: aStream
    self printOn: aStream depth: 0!

simpleDescription
    ^self printString! !

!Node methodsFor: 'testing'!

hasAncestor: aNode
    | p |
    p := self parent.
    [p isNil] whileFalse:
	[p == aNode ifTrue: [^true].
	p := p parent].
    ^false!

hasSubNodes
    ^false!

isAttribute
    ^false!

isBlankText
    ^false!

isComment
    ^false!

isContent
    ^false!

isDiscarded
    ^(self flags bitAnd: 1) = 1!

isDocument
    ^false!

isElement
    ^false!

isLike: aNode
    ^self class == aNode class!

isProcessingInstruction
    ^false!

isText
    ^false!

precedes: aNode
    | n1 n2 |
    aNode document == self document
	ifFalse: [self error: 'These nodes can''t be ordered. They are not in the same document.'].
    aNode == self document
	ifTrue: [^false].
    self == self document
	ifTrue: [^true].
    n1 := self.
    n2 := aNode.
    (n2 hasAncestor: n1) ifTrue: [^true].
    (n1 hasAncestor: n2) ifTrue: [^false].
    [n1 parent == n2 parent] whileFalse:
	[[n1 parent hasAncestor: n2 parent] whileTrue: [n1 := n1 parent].
	[n2 parent hasAncestor: n1 parent] whileTrue: [n2 := n2 parent].
	n1 parent == n2 parent
		ifFalse: [n1 := n1 parent. n2 := n2 parent]].
    ^(n1 parent indexOf: n1) < (n1 parent indexOf: n2)!

verifyContents
    ^true! !

!Node methodsFor: 'enumerating'!

nodesDo: aBlock
    aBlock value: self! !

!Node methodsFor: 'namespaces'!

findNamespaceAt: qualifier
    | ns node |
    qualifier = 'xml' ifTrue: [^XML_URI].
    ns := nil.
    node := self.
    [node isElement and: [ns isNil]]
	whileTrue:
		[ns := node namespaceAt: qualifier.
		node := node parent].
    ^ns!

findQualifierAtNamespace: ns
    | qual node |
    qual := nil.
    node := self.
    [node isElement and: [qual isNil]]
	whileTrue:
		[qual := node qualifierAtNamespace: ns.
		node := node parent].
    ^qual!

namespaceAt: qualifier
    ^nil!

qualifierAtNamespace: ns
    ^nil! !


Node subclass: #Comment
    instanceVariableNames: 'text '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!


Comment comment: '
This class represents an XML comment. XML comments may appear anywhere in an XML document outside other markup or within the document type declaration at places allowed by grammar.

 XML comments are delimited by the start-tag ''<!--'' and the end-tag ''-->''. 

According to the XML 1.0 specification, for compatibilty, double-hyphens (the string ''--'') must not occur within comments.

Instance Variables:
    text	<String>  contents of the comment element
'!


!Comment methodsFor: 'printing'!

printCanonicalOn: aStream
    ^self!

printHTMLOn: aStream
    self printOn: aStream!

printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock
    ^self printOn: aStream!

printOn: aStream depth: indent
    aStream nextPutAll: '<!--', (text isNil ifTrue: [''] ifFalse: [text]), '-->'! !

!Comment methodsFor: 'accessing'!

text
    ^text!

text: aText
    text := aText! !

!Comment methodsFor: 'testing'!

isComment
    ^true! !


Node subclass: #Notation
    instanceVariableNames: 'name publicID systemID '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!


Notation comment: '
This class represents an XML Notation declaration. Notations are XML elements/nodes which identify by name the format of unparsed entities, the format of elements which bear a notation
attribute or the application to which a processing instruction is addressed.

Notations are delimited by the start-tag ''<!NOTATION'' and end-tag ''>''

The name instance variable provides a name or identifier for the notation, for use in entity and
attribute specifications. The publicID instance variable provides an external identifier which allows
the XML processor or the client application to locate a helper application capable of processing
data in the given notation. The systemID variable allows the parser to optionally resolve the
publicID into the system identifier, file name, or other information needed to allow the application to call a processor for data in the notation.


Instance Variables:
    name	<XML.NodeTag>
    publicID	<XML.URIResolver>
    systemID	<XML.URIResolver>
'!


!Notation methodsFor: 'initialize'!

name: aName identifiers: anArray
    name := aName.
    anArray size = 1
	ifTrue: [systemID := anArray at: 1]
	ifFalse: [anArray size = 2
		ifTrue:
			[systemID := anArray at: 2.
			publicID := anArray at: 1]
		ifFalse: [self error: 'Invalid PUBLIC / SYSTEM identifiers']]! !


Node subclass: #Element
    instanceVariableNames: 'tag attributes namespaces elements definition
			    userData '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!




!Element class methodsFor: 'instance creation'!

tag: tag
    ^self new setTag: tag attributes: nil elements: nil!

tag: tag attributes: attributes elements: elements
    ^self new setTag: tag attributes: attributes elements: elements!

tag: tag elements: elements
    ^self new setTag: tag attributes: nil elements: elements! !



Element comment: '
XML document element boundaries are either delimited by start-tags and end-tags, or, for empty elements, by an empty-element tag. Each element has a type, identified by name, sometimes called its "generic identifier" (GI), and may have a set of attribute specifications. Each attribute specification has a name and a value.

Instance Variables:
    tag				<String | NodeTag>  the tag name of this element
    attributes		<Collection>  comment
    namespaces	<Dictionary>  comment
    elements		<SequenceableCollection>  comment
    definition		<Object>  suspect this is unused
'!


!Element methodsFor: 'initialize'!

initialize
    super initialize.
    tag := 'undefined'.
    attributes := #().! !

!Element methodsFor: 'accessing'!

anyElementNamed: elementName
    "This will return the receiver if its name matches the requirement."

    | list |
    list := self anyElementsNamed: elementName.
    list size > 1 ifTrue: [self error: 'There is not a unique element with this tag'].
    ^list isEmpty ifFalse: [list first]!

anyElementsNamed: elementName
    "This includes the receiver as one of the possibilities."

    | list |
    list := OrderedCollection new.
    self nodesDo: [:e |
	(e isElement and: [e tag isLike: elementName])
		ifTrue: [list add: e]].
    ^list!

attributes
    ^attributes isNil
	ifTrue: [#()]
	ifFalse: [attributes]!

characterData
    | str |
    str := (String new: 128) writeStream.
    self characterDataOnto: str.
    ^str contents!

characterDataOnto: str
    self elements do: [:i |
	i isContent ifTrue: [i characterDataOnto: str]].!

definition
    ^definition!

definition: aPattern
    definition := aPattern!

description
    ^'a %%<%1> element' bindWith: tag!

elementNamed: elementName
    | list |
    list := self elementsNamed: elementName.
    list size = 1 ifFalse: [self error: 'There is not a unique element with this tag'].
    ^list first!

elements
    ^elements isNil
	ifTrue: [#()]
	ifFalse: [elements]!

elementsNamed: elementName
    ^self elements select: [:e | e isElement and: [e tag isLike: elementName]]!

expandedName
    ^tag expandedName!

indexOf: aChild
    aChild parent == self ifFalse: [^nil].
    ^aChild isAttribute
	ifTrue: [-1]
	ifFalse: [elements indexOf: aChild ifAbsent: [nil]]!

namespaces: aDictionary
    namespaces := aDictionary!

selectNodes: aBlock
    ^self attributes, self elements select: aBlock!

tag
    ^tag!

userData
    ^userData!

userData: anObject
    userData := anObject!

valueOfAttribute: attributeName ifAbsent: aBlock
    ^(self attributes
	detect: [:a | a tag isLike: attributeName]
	ifNone: [^aBlock value]) value! !

!Element methodsFor: 'printing'!

printCanonicalOn: aStream
    | elem az |
    aStream nextPut: $<.
    aStream nextPutAll: tag asString.
    attributes isNil
	ifFalse:
		[az := attributes asSortedCollection: [:x :y | x key < y key].
		az do: [:at |
			aStream space.
			at printCanonicalOn: aStream]].
    aStream nextPut: $>.
    elem := elements isNil
	ifTrue: [#()]
	ifFalse: [elements" reject: [:str | str isBlankText]"].
    elem do: [:e |
	e printCanonicalOn: aStream].
    aStream nextPutAll: '</', tag asString, '>'!

printHTMLOn: aStream
    | elem az |
    aStream nextPut: $<.
    aStream nextPutAll: tag asString.
    attributes isNil
	ifFalse:
		[az := attributes asSortedCollection: [:x :y | x key < y key].
		az do: [:at |
			aStream space.
			at printOn: aStream]].
    aStream nextPut: $>.
    elem := elements isNil
	ifTrue: [#()]
	ifFalse: [elements" reject: [:str | str isBlankText]"].
    self isHTMLBlock ifTrue: [aStream nl].
    elem do: [:e |
	e printHTMLOn: aStream.
	self isHTMLBlock ifTrue: [aStream nl]].
    aStream nextPutAll: '</', tag asString, '>'!

printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock
    | elem az |
    aStream nextPut: $<.
    aStream nextPutAll: tag asString.
    attributes isNil
	ifFalse:
		[az := attributes asSortedCollection: [:x :y | x key < y key].
		az do: [:at |
			aStream space.
			at printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock]].
    aStream nextPut: $>.
    elem := elements isNil
	ifTrue: [#()]
	ifFalse: [elements reject: [:e | e isBlankText]].
    (elem contains: [:e | e isElement])
	ifTrue:
		[endSpacingBlock value: self value: elem.
		elem do: [:e | e
				printNoIndentOn: aStream
				endSpacing: endSpacingBlock
				spacing: spacingBlock]
			separatedBy: [spacingBlock value: self value: elem].
		endSpacingBlock value: self value: elem]
	ifFalse: [elem do: [:e | e
				printNoIndentOn: aStream
				endSpacing: endSpacingBlock
				spacing: spacingBlock]].
    aStream nextPutAll: '</', tag asString, '>'!

printOn: aStream depth: indent
    | elem |
    aStream nextPut: $<.
    aStream nextPutAll: tag asString.
    attributes isNil
	ifFalse: [1 to: attributes size do: 
		[:i | | a |
		a := attributes at: i.
		aStream space.
		a printOn: aStream]].
    elements isNil
	ifTrue: [aStream nextPutAll: '/>']
	ifFalse:
		[aStream nextPut: $>.
		elem := elements reject: [:str | str isText and: [str isStripped]].
		(elem size <= 1 and: [(elem contains: [:n | n isText not]) not])
			ifTrue: [elem do: [:e |
						e printOn: aStream depth: indent+1]]
			ifFalse:
				[1 to: elem size do: [:i | | e |
					e := elem at: i.
					aStream crtab: indent+1.
					e isString 
						ifTrue: [aStream nextPutAll: e]
						ifFalse: [e printOn: aStream depth: indent+1]].
				aStream crtab: indent ].
		aStream nextPutAll: '</', tag asString, '>']!

simpleDescription
    ^'<', self tag asString, '>'! !

!Element methodsFor: 'namespaces'!

namespaceAt: qualifier
    ^namespaces isNil
	ifTrue: [nil]
	ifFalse: [namespaces at: qualifier ifAbsent: [nil]]!

qualifierAtNamespace: ns
    ^namespaces isNil
	ifTrue: [nil]
	ifFalse:
		[namespaces keysAndValuesDo: 
			[:qualifier :namespace | namespace = ns ifTrue: [^qualifier]].
		nil]! !

!Element methodsFor: 'private'!

attributes: a
    attributes := a.
    a isNil ifFalse: [a do: [:i | i parent: self]].!

condenseList
    elements isNil
	ifFalse: [elements size = 0
		ifTrue: [elements := nil]
		ifFalse: [elements := elements asArray]]!

condenseText
    | elmts str tc |
    elmts := (Array new: elements size) writeStream.
    str := nil.
    elements do: [:elm |
	elm isText
		ifTrue:
			[str isNil ifTrue: [str := (String new: elm text size) writeStream].
			tc := elm class.
			str nextPutAll: elm text]
		ifFalse:
			[str isNil ifFalse: [elmts nextPut: (tc new text: str contents)].
			str := nil.
			elmts nextPut: elm]].
    str isNil ifFalse: [elmts nextPut: (tc new text: str contents)].
    elements := elmts contents.!

elements: e
    elements := e.
    self isEmpty
	ifFalse:
		[self condenseText.
		elements do: [:elm | elm parent: self]]!

setTag: t attributes: a elements: e
    tag := t isString
		ifTrue: [NodeTag new qualifier: '' ns: '' type: t]
		ifFalse: [t].
    self attributes: a.
    self elements: e! !

!Element methodsFor: 'testing'!

hasSubNodes
    ^elements size > 0 or: [attributes size > 0]!

isContent
    ^true!

isElement
    ^true!

isEmpty
    ^elements isNil!

isHTMLBlock
    ^#('p' 'html' 'head' 'body') includes: tag type asLowercase!

isLike: aNode
    ^self class == aNode class
	and: [self tag isLike: aNode tag]!

verifyContents
    self isEmpty
	ifFalse: [elements do: [:elm | elm verifyContents]]! !

!Element methodsFor: 'enumerating'!

nodesDo: aBlock
    aBlock value: self.
    1 to: self attributes size do: [:i |
	(self attributes at: i) nodesDo: aBlock].
    1 to: self elements size do: [:i |
	(self elements at: i) nodesDo: aBlock]! !

!Element methodsFor: 'modifying'!

addAttribute: aNode
    attributes isNil ifTrue: [
	attributes := OrderedCollection new ].
    (attributes class == OrderedCollection)
	ifFalse: [ attributes := attributes asOrderedCollection ].

    attributes add: aNode!

addNode: aNode
    elements isNil
	ifTrue: [elements := OrderedCollection new]
	ifFalse: [elements class == Array
		ifTrue: [elements := elements asOrderedCollection]].
    elements add: aNode!

removeAttribute: aNode
    attributes isNil ifFalse: [
	(attributes class == OrderedCollection)
	    ifFalse: [ attributes := attributes asOrderedCollection ].
        attributes remove: aNode ifAbsent: [ ].
        attributes isEmpty ifTrue: [ attributes := nil ]
    ]!

removeNode: aNode
    elements isNil ifFalse: [
	(elements class == OrderedCollection)
	    ifFalse: [ elements := elements asOrderedCollection ].
        elements remove: aNode ifAbsent: [ ].
        elements isEmpty ifTrue: [ elements := nil ]
    ]! !


Node subclass: #Document
    instanceVariableNames: 'root nodes xmlVersion dtd ids '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!


Document comment: '
This class represents an XML document entity and serves as the root of the document entity tree. Each XML document has one entity (root) called the document entity, which serves as the starting point for the XML processor and may contain the whole document (nodes collection). 

According to the XML 1.0 specification, XML documents may and should begin with an XML declaration which specifies the version of XML (xmlVersion instance variable) being used. 

The XML document type declaration which must appear before the first element in a document contains or points to markup declarations that provide the grammar for this document. This grammar is known as document type definition (dtd instance variable). An XML document is valid if it has an associated document type declaration and if the document complies with the constraints expressed in it

Instance Variables:
    root	<XML.Node>  
    nodes	<Collection>  
    xmlVersion	<String>  
	Version of the XML specification to which the document is conformant
    dtd	<XML.DocumentType>
	Associated document type definition 
    ids	<KeyedCollection | Dictionary>  
'!


!Document methodsFor: 'initialize'!

initialize
    super initialize.
    nodes := OrderedCollection new.
    ids := Dictionary new.! !

!Document methodsFor: 'accessing'!

addNamespaceDefinitions
    | d tag |
    d := Dictionary new.
    self nodesDo: [:aNode |
	tag := aNode tag.
	tag isNil
		ifFalse:
			[(d at: tag qualifier ifAbsent: [tag namespace]) = tag namespace
				ifFalse: [self error: 'Using the same tag for multiple namespaces is not currently supported'].
			d at: tag qualifier put: tag namespace]].
    (d at: '' ifAbsent: ['']) = ''
	ifTrue: [d removeKey: '' ifAbsent: []].
    d removeKey: 'xml' ifAbsent: [].
    d removeKey: 'xmlns' ifAbsent: [].
    self root namespaces: d!

addNode: aNode
    nodes add: aNode.
    aNode parent: self.
    aNode isElement
	ifTrue: [root isNil
		ifTrue: [root := aNode]
		ifFalse: [self error: 'It is illegal to have more than one element node at the top level in a document']]!

document
    ^self!

dtd
    ^dtd!

dtd: aDTD
    dtd := aDTD!

elements
    ^nodes!

root
    ^root!

selectNodes: aBlock
    ^nodes select: aBlock!

setRoot: aNode
    root := aNode!

xmlVersion: aString
    xmlVersion := aString! !

!Document methodsFor: 'IDs'!

atID: key ifAbsent: aBlock
    ^ids at: key ifAbsent: aBlock!

registerID: attribute from: aParser
    (ids includesKey: attribute value)
	ifTrue: [aParser invalid: ('The id "%1" was used more than once' bindWith: attribute value)].
    ids at: attribute value put: attribute.
    aParser resolveIDREF: attribute value!

updateIDs
    | attr |
    ids keys do: [:key |
	attr := ids at: key.
	(attr isAttribute and: [attr value = key])
		ifFalse: [self error: 'Attempt to update the ID dictionary twice'].
	ids at: key put: attr parent].! !

!Document methodsFor: 'flags'!

hasDTD
    ^(self flags bitAnd: 16r100) = 16r100!

hasExternalDTD
    ^(self flags bitAnd: 16r200) = 16r200!

noteDTD
    self flags: (self flags bitOr: 16r100)!

noteExternalDTD
    self flags: (self flags bitOr: 16r200)!

notePEReference
    self flags: (self flags bitOr: 16r400)!

usesParameterEntities
    ^(self flags bitAnd: 16r400) = 16r400! !

!Document methodsFor: 'testing'!

hasSubNodes
    ^nodes size > 0!

isContent
    ^true!

isDocument
    ^true!

verifyContents
    root verifyContents! !

!Document methodsFor: 'printing'!

printCanonicalOn: aStream
    nodes do: [:n | n printCanonicalOn: aStream]!

printHTMLOn: aStream
    nodes do: [:n | n printHTMLOn: aStream]!

printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock
    nodes do: [:n | n printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock]!

printOn: aStream
    nodes do: [:n | n printOn: aStream. aStream nl]! !

!Document methodsFor: 'enumerating'!

nodesDo: aBlock
    aBlock value: self.
    1 to: self elements size do: [:i |
	(self elements at: i) nodesDo: aBlock]! !


ConcretePattern subclass: #AnyPattern
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!


AnyPattern comment: '
This class represents the ANY element content constraint in an element type declaration. According to the XML 1.0 specification the ANY pattern/rule is used to indicate to the validating
parser that the element can contain any elements in any order, as long as it doesn''t break any of the other rules of XML and the types of any child elements have been declared.'!


!AnyPattern methodsFor: 'accessing'!

description
    ^'ANY'! !

!AnyPattern methodsFor: 'coercing'!

alternateHeads
    ^followSet copyWith: self!

pushDownFollowSet
    self addFollow: self.
    ^nil! !

!AnyPattern methodsFor: 'testing'!

couldBeText
    ^true!

matches: aNode
    ^true! !

ConcretePattern subclass: #NamePattern
    instanceVariableNames: 'name '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!




!NamePattern class methodsFor: 'instance creation'!

named: aName
    ^self new named: aName! !



NamePattern comment: '
This class represents a content constraint in an element type declaration such that the declaration includes the names of the element types that may appear as children in the element''s content

Instance Variables:
    name	<XML.NodeTag>'!


!NamePattern methodsFor: 'initialize'!

named: aName
    name := aName! !

!NamePattern methodsFor: 'accessing'!

description
    ^name printString!

name
    ^name! !

!NamePattern methodsFor: 'testing'!

matches: aNode
    ^(aNode isElement and: [name isLike: aNode tag])
	"or: [aNode isText not]"! !


XMLNodeBuilder subclass: #NodeBuilder
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Parsing'!


NodeBuilder comment: '
This class along with it''s superclass XMLNodeBuilder is used by the XMLParser to distill an XML document into it''s component elements. This NodeBuilder class in particular is used to create instances of the various XML elements that are included in the scanned-in XML document or document string.'!


!NodeBuilder methodsFor: 'building'!

attribute: name value: value
    ^Attribute name: name value: value!

comment: aText
    ^Comment new text: aText!

makeText: text
    ^Text text: text!

notation: name value: val
    ^Notation new name: name identifiers: val!

pi: nm text: text
    ^PI new name: nm text: text!

tag: tag attributes: attributes elements: elements position: p stream: stream
    ^Element tag: tag attributes: attributes elements: elements! !


Object subclass: #ElementContext
    instanceVariableNames: 'tag namespaces '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Parsing'!


ElementContext comment: '
This class includes some functionality to support XML namespaces. XML namespaces provide a simple method for qualifying element and attribute names used in XML documents 

Instance Variables:
    tag	<XML.NodeTag>
    namespaces	<Dictionary>
'!


!ElementContext methodsFor: 'accessing'!

namespaces
    namespaces isNil ifTrue: [namespaces := Dictionary new].
    ^namespaces!

tag
    ^tag!

tag: aTag
    tag := aTag isString
		ifTrue: [NodeTag new qualifier: '' ns: '' type: aTag]
		ifFalse: [aTag].! !

!ElementContext methodsFor: 'namespaces'!

defineDefaultNamespace: attribute
    self namespaces at: '' put: attribute value!

defineNamespace: attribute
    (#('xmlns' 'xml') includes: attribute tag type)
	ifTrue: [self error: ('It is illegal to redefine the qualifier "%1".' bindWith: attribute tag type)].
    attribute value isEmpty ifTrue: [XMLParser invalid: 'It is not permitted to have an empty URI as a namespace name'].
    self namespaces at: attribute tag type put: attribute value!

findNamespace: ns
    ^namespaces isNil
	ifTrue: [nil]
	ifFalse: [namespaces at: ns ifAbsent: [nil]]! !

!ElementContext methodsFor: 'testing'!

definesNamespaces
    ^namespaces notNil and: [namespaces isEmpty not]! !


SAXDriver subclass: #DOM_SAXDriver
    instanceVariableNames: 'stack document '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-SAX'!


DOM_SAXDriver comment: '
This class represents a specialized type of SAX (Simple API for XML) processor that follows the ''object model'' for processing XML documents to build a Document Object Model (DOM) tree from the processed XML document.

Note: This class implementation is not yet complete

Instance Variables:
    stack	<OrderedCollection>
    document	<XML.Document>
'!


!DOM_SAXDriver methodsFor: 'document events'!

document
    ^document!

endDocument
    document := stack removeLast.
    document isDocument ifFalse: [self error: 'End of Document not expected'].
    stack isEmpty ifFalse: [self error: 'End of Document not expected'].!

endDocumentFragment
    document := stack removeLast.
    document isDocument ifFalse: [self error: 'End of Document not expected'].
    stack isEmpty ifFalse: [self error: 'End of Document not expected'].!

startDocument
    stack := OrderedCollection with: Document new!

startDocumentFragment
    stack := OrderedCollection with: Document new! !

!DOM_SAXDriver methodsFor: 'elements'!

endElement
    stack size = 0 ifTrue: [^self].
    stack removeLast condenseList!

startElement: name atts: atts
    | elm |
    stack size = 0 ifTrue: [^self].
    elm := self createElement: name attributes: atts.
    stack last addNode: elm.
    stack addLast: elm! !

!DOM_SAXDriver methodsFor: 'characters'!

characters: aString
    stack last addNode: (Text text: aString)! !

!DOM_SAXDriver methodsFor: 'other'!

comment: data
    stack last addNode: (Comment new text: data)!

processingInstruction: target data: data
    stack last addNode: (PI new name: target text: data)! !

!DOM_SAXDriver methodsFor: 'private'!

createElement: name attributes: attr
    ^Element tag: name attributes: attr elements: OrderedCollection new! !


ConcretePattern subclass: #InitialPattern
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!


InitialPattern comment: '
Since an element''s content declaration may include multiple constraint rules or patterns, instances of this class are used to indicate to the XML parser, the initial or first rule in the declaration'!


!InitialPattern methodsFor: 'validation'!

validateTag: tag content: content for: aParser
    | n |
    n := self.
    content do: [:elm |
	elm isContent
		ifTrue:
			[n := n validate: elm.
			n isNil ifTrue: [aParser notPermitted: elm description, ' at this point in the "', tag, '" node']]].
    n canTerminate
	ifFalse: [aParser expected: 'one of ', n followSetDescription].
    ^self! !


Object subclass: #AttributeType
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


AttributeType comment: '
This class with its various subclasses represents the type of the XML attribute. According to the XML 1.0 specification, XML attribute types are of three kinds:a string type, a set of tokenized types, and enumerated types. The string type may take any literal string as a value, the tokenized types have varying lexical and semantic constraints and the enumerated type attibutes can take one of a list of values provided in the declaration.

Subclasses of AttributeType represent these various types of XML attributes '!


!AttributeType methodsFor: 'validating'!

completeValidationAgainst: aParser from: anAttribute
    ^self!

simpleValidateValueOf: anAttribute for: aParser
    | v |
    v := anAttribute value copy.
    v replaceAll: Character cr with: Character space.
    v replaceAll: Character lf with: Character space.
    v replaceAll: Character tab with: Character space.
    anAttribute value: v!

validateDefinition: anAttributeDefinition for: aParser
    anAttributeDefinition hasDefault
	ifTrue: [self validateValueOf: anAttributeDefinition for: aParser]!

validateValueOf: anAttribute for: aParser
    "We're going to do this the hard way for now."

    | v v1 |
    self simpleValidateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    [v1 := v copyReplaceAll: '  ' with: ' '.
    v1 = v] whileFalse: [v := v1].
    (v size > 1 and: [v first = Character space])
	ifTrue: [v := v copyFrom: 2 to: v size].
    (v size > 1 and: [v last = Character space])
	ifTrue: [v := v copyFrom: 1 to: v size - 1].
    anAttribute value: v! !

!AttributeType methodsFor: 'testing'!

isID
    ^false! !


AttributeType subclass: #IDREFS_AT
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


IDREFS_AT comment: '
This class represents the IDREFS attribute type. This is a tokenized type of attribute and for an XML document to be valid, each of the values of IDREFS type attributes must match each of the values of some ID attribute on some element in the XML document.'!


!IDREFS_AT methodsFor: 'validating'!

validateValueOf: anAttribute for: aParser
    | v |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (v tokensBasedOn: Character space) do: [:nm |
	(aParser isValidName: nm)
		ifFalse: [aParser invalid: ('An IDREFS attribute (%1="%2") does not match the required syntax of a list of Names.'
				bindWith: anAttribute tag asString
				with: v)].
	aParser rememberIDREF: nm]! !


AttributeType subclass: #NMTOKEN_AT
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


NMTOKEN_AT comment: '
This class represents the NMTOKEN attribute type. This is a tokenized type of attribute and for the purposes of validation, values of NMTOKEN type attributes must match a Nmtoken, which is any mixture of legal name characters as defined in the XML 1.0 specification. '!


!NMTOKEN_AT methodsFor: 'validating'!

validateValueOf: anAttribute for: aParser
    | v |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (v includes: Character space)
	ifTrue: [aParser notPermitted: 'white space in IDREF attributes'].
    (aParser isValidNmToken: v)
	ifFalse: [aParser invalid: ('An NMTOKEN attribute (%1="%2") does not match the required syntax of an NmToken.'
				bindWith: anAttribute tag asString
				with: v)]! !


AttributeType subclass: #NOTATION_AT
    instanceVariableNames: 'typeNames '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!




!NOTATION_AT class methodsFor: 'instance creation'!

typeNames: list
    ^self new typeNames: list! !



NOTATION_AT comment: '
This class represents the NOTATION attribute type. A NOTATION attribute identifies a notation element, declared in the DTD with associated system and/or public identifiers, to be used in interpreting the element to which the attribute is attached.

Instance Variables:
    typeNames	<SequenceableCollection>
'!


!NOTATION_AT methodsFor: 'accessing'!

typeNames
    ^typeNames!

typeNames: aList
    typeNames := aList! !

!NOTATION_AT methodsFor: 'validating'!

completeValidationAgainst: aParser from: anAttributeDef
    typeNames do: [:nm |
	aParser notationAt: nm ifAbsent:
		[aParser invalid: ('Undeclared Notation "%1" used by attribute type "%2"'
						bindWith: nm
						with: anAttributeDef tag asString)]]!

validateValueOf: anAttribute for: aParser
    | v |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (typeNames includes: v)
	ifFalse: [aParser invalid: ('A NOTATION attribute (%1="%2") should have had a value from %3.'
				bindWith: anAttribute tag asString
				with: v
				with: typeNames asArray)].! !


AttributeType subclass: #ENTITIES_AT
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


ENTITIES_AT comment: '
This class represents the ENTITIES attribute type. This is a tokenized type of attribute that signifies to the XML parser that for the purposes of validating, the values of entities type attributes must match each of the names of unparsed entities declared in the document type definition.'!


!ENTITIES_AT methodsFor: 'validating'!

completeValidationAgainst: aParser from: anAttribute
    ^self validateValueOf: anAttribute for: aParser!

validateDefinition: anAttributeDefinition for: aParser
    ^self!

validateValueOf: anAttribute for: aParser
    | v ent |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (v tokensBasedOn: Character space) do: [:nm |
	(aParser isValidName: nm)
		ifFalse: [aParser invalid: ('An ENTITIES attribute (%1="%2") does not match the required syntax of a list of Names.'
				bindWith: anAttribute tag asString
				with: v)].
	ent := aParser dtd generalEntityAt: nm.
	ent isNil
		ifTrue: [aParser invalid: ('Undeclared unparsed entity "%1" used by attribute type "%2"'
					bindWith: nm
					with: anAttribute tag asString)]
		ifFalse: [ent isParsed
			ifTrue: [aParser invalid: ('The entity "%1" used by attribute type "%2" is a parsed entity and should be unparsed'
						bindWith: nm
						with: anAttribute tag asString)]
			ifFalse: []]]! !


AttributeType subclass: #IDREF_AT
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


IDREF_AT comment: '
This class represents the IDREF attribute type. This is a tokenized type of attribute and for an XML document to be valid, values of IDREF type attributes must match the value of some ID attribute on some element in the XML document.

ID and IDREF attributes together provide a simple inside-the-document linking mechanism with every IDREF attribute required to point to an ID attribute as stated above.'!


!IDREF_AT methodsFor: 'validating'!

validateValueOf: anAttribute for: aParser
    | v |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (v includes: Character space)
	ifTrue: [aParser notPermitted: 'white space in IDREF attributes'].
    (aParser isValidName: v)
	ifFalse: [aParser invalid: ('An IDREF attribute (%1="%2") does not match the required syntax of a Name.'
				bindWith: anAttribute tag asString
				with: v)].
    aParser rememberIDREF: v! !


AttributeType subclass: #NMTOKENS_AT
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


NMTOKENS_AT comment: '
This class represents the NMTOKENS attribute type. This is a tokenized type of attribute and for the purposes of validation, values of each NMTOKENS type attributes must match each Nmtoken, which is any mixture of legal name characters as defined in the XML 1.0 specification. '!


!NMTOKENS_AT methodsFor: 'validating'!

validateValueOf: anAttribute for: aParser
    | v |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (v tokensBasedOn: Character space) do: [:nm |
	(aParser isValidNmToken: nm)
		ifFalse: [aParser invalid: ('An NMTOKENS attribute (%1="%2") does not match the required syntax of a list of NmTokens.'
				bindWith: anAttribute tag asString
				with: v)]]! !


Node subclass: #Text
    instanceVariableNames: 'text stripped '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!




!Text class methodsFor: 'instance creation'!

text: aString
    ^self new text: aString! !



Text comment: '
This class represents an XML textual object, i.e. a sequence of legal characters as defined in the XML 1.0 specification and may represent markup or character data.


Instance Variables:
    text	<CharacterArray | nil>
    stripped	<Boolean> 
'!


!Text methodsFor: 'accessing'!

characterData
    ^self text!

characterDataOnto: str
    str nextPutAll: self text!

description
    ^'text'!

strip: aBoolean
    stripped := aBoolean.!

text
    ^text!

text: aText
    text := aText.
    stripped isNil ifTrue: [stripped := false].! !

!Text methodsFor: 'printing'!

printCanonicalOn: aStream
    text isNil ifTrue: [^self].
    self isStripped
	ifFalse: [self printCanonical: text on: aStream.]!

printHTMLOn: aStream
    text isNil ifTrue: [^self].
    self isStripped
	ifFalse: [self printCanonical: text on: aStream.]!

printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock
    text isNil ifTrue: [^self].
    self isStripped
	ifFalse: [aStream nextPutAll: text]!

printOn: aStream depth: indent
    aStream nextPutAll: (text isNil ifTrue: ['&nil;'] ifFalse: [text])! !

!Text methodsFor: 'testing'!

isBlankText
    ^(text contains: [:i | i isSeparator not]) not!

isContent
    ^true!

isStripped
    ^stripped!

isText
    ^true! !


Node subclass: #Entity
    instanceVariableNames: 'name text systemID publicID '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!


Entity comment: '
An XML document may consist of one or many storage units called entities. All XML entities have content and are idententified by name. 

Entities may be either parsed or unparsed. This class and it''s subclasses GeneralEntity and ParameterEntity represent parsed entities. These entities are invoked by name using entity references and their contents are held in the text instance variable

Entities may also be internal or external. If the content of the entity is given in the declaration (within the document) itself then the entity is called an internal entity. If the entity is not internal to the document and is declared elsewhere it''s called an external entity. 

External entities have a system identifier (systemID instance variable) that is an URI which may be used to retrieve the entity. In addition to a system identifier, an external entity declaration may include a public identifier (publicID instance variable). The XMLParser uses the publicID to try to generate an alternative URI to retrive the entity''s contents.

Subclasses must implement the following messages:
    accessing
	entityType

Instance Variables:
    name	<XML.NodeTag>  
			Identifies the entity in an entity referece
    text	<SequenceableCollection | Filename | IOAccessor | ExternalConnection | LogicalFilename | EncodedStreamConstructor>
			The entity''s contents
    systemID	<XML.URIResolver>
			URI used to retrieve an external entity''s contents
    publicID	<XML.URIResolver>
			Alternative URI used to retrieve an external entity''s contents
'!


!Entity methodsFor: 'accessing'!

entityType
    ^self subclassResponsibility!

externalFrom: anArray
    anArray class == Array
	ifFalse: [self error: 'External ID is expected to be an Array'].
    anArray size = 1
	ifTrue: [systemID := anArray at: 1]
	ifFalse: [anArray size = 2
		ifTrue:
			[publicID := anArray at: 1.
			systemID := anArray at: 2]
		ifFalse:
			[self error: 'External ID has too many or too few identifiers']]!

name
    ^name!

name: aName
    name := aName!

streamFor: aParser
    (aParser hasExpanded: self)
	ifTrue: [aParser malformed: 'Can''t expand this entity; it is defined recursively'].
    text isNil
	ifTrue:
		[| uri str |
		uri := URIResolver resolve: systemID from: aParser latestURI.
		aParser pushSource: (str := StreamWrapper
							stream: (URIResolver openStreamOn: uri)
							protocol: uri key
							name: uri value
							entity: self).
		str textDeclIn: aParser.
		aParser getNextChar]
	ifFalse:
		[aParser pushSource: (StreamWrapper
							stream: text readStream
							protocol: 'internal'
							name: nil
							entity: self).
		aParser getNextChar].!

text: aString
    text := aString! !

!Entity methodsFor: 'testing'!

isExternal
    ^publicID notNil or: [systemID notNil]!

isParsed
    ^true! !

!Entity methodsFor: 'printing'!

printOn: aStream
    self basicPrintOn: aStream.
    text isNil
	ifTrue: [aStream nextPutAll: '(',systemID,')']
	ifFalse: [aStream nextPutAll: '(',text,')']! !


Entity subclass: #GeneralEntity
    instanceVariableNames: 'ndata '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!


GeneralEntity comment: '
This class represents a general entity which is a parsed entity for use within the XML document content.

Instance Variables:
    ndata	<Notation>
'!


!GeneralEntity methodsFor: 'accessing'!

entityType
    ^'generic'!

ndata: aNotifierNameOrNil
    ndata := aNotifierNameOrNil! !

!GeneralEntity methodsFor: 'testing'!

isParsed
    ^ndata isNil! !

!GeneralEntity methodsFor: 'validation'!

completeValidationAgainst: aParser
    ndata isNil
	ifFalse: [aParser dtd notationAt: ndata ifAbsent:
			[aParser invalid: ('Unparsed entity "%1" uses an undeclared notation "%2"'
					bindWith: name
					with: ndata)]]! !


Entity subclass: #ParameterEntity
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!


ParameterEntity comment: '
This class represents a parameter entity which is a parsed entity for use within the document type definition. '!


!ParameterEntity methodsFor: 'accessing'!

entityType
    ^'parameter'!

streamFor: aParser addSpaces: spaces
    | myText |
    (aParser hasExpanded: self)
	ifTrue: [aParser malformed: 'Can''t expand this entity; it is defined recursively'].
    text isNil
	ifTrue:
		[| uri str |
		uri := URIResolver resolve: systemID from: aParser latestURI.
		str := StreamWrapper
					stream: (URIResolver openStreamOn: uri)
					protocol: uri key
					name: uri value
					entity: self.
		str textDeclIn: aParser.
		text := str stream upToEnd.
		str close].
    myText := text.
    spaces ifTrue: [myText := ' ', text, ' '].
    aParser pushSource: (StreamWrapper
					stream: myText readStream
					protocol: 'internal'
					name: nil
					entity: self).
    aParser getNextChar.! !


AttributeType subclass: #Enumeration_AT
    instanceVariableNames: 'values '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!




!Enumeration_AT class methodsFor: 'instance creation'!

withAll: list
    ^self new values: list! !



Enumeration_AT comment: '
This class represents the Enumeration attribute type. Enumerated attributes can take one of a list of values provided in the declaration.

Instance Variables:
    values	<Collection>
'!


!Enumeration_AT methodsFor: 'accessing'!

values: aList
    values := aList! !

!Enumeration_AT methodsFor: 'validating'!

validateValueOf: anAttribute for: aParser
    | v |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (values includes: v)
	ifFalse: [aParser invalid: ('An attribute (%1="%2") should have had a value from %3.'
				bindWith: anAttribute tag asString
				with: v
				with: values asArray)]! !


AttributeType subclass: #ID_AT
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


ID_AT comment: '
This class represents the ID attribute type. This is also a tokenized type of attribute and values of ID type attributes must match legal names as defined in the XML 1.0 specification. 

For an XML document to be valid, ID values must uniquely identify the elements which bear them; i.e. A name must not appear more than once in an XML document as a value of this type. Also for validity purposes, an ID attribute must have a declared default of #IMPLIED or #REQUIRED in the DTD.

ID and IDREF attributes together provide a simple inside-the-document linking mechanism with every IDREF attribute required to point to an ID attribute.'!


!ID_AT methodsFor: 'validating'!

validateDefinition: anAttributeDefinition for: aParser
    anAttributeDefinition hasDefault
	ifTrue: [XMLParser invalid: 'ID attributes must be either #REQUIRED or #IMPLIED']!

validateValueOf: anAttribute for: aParser
    | v |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (v includes: Character space)
	ifTrue: [aParser notPermitted: 'white space in ID attributes'].
    (aParser isValidName: v)
	ifFalse: [aParser invalid: ('An ID attribute (%1="%2") does not match the required syntax of a Name.'
				bindWith: anAttribute tag asString
				with: v)].
    aParser documentNode registerID: anAttribute from: aParser! !

!ID_AT methodsFor: 'testing'!

isID
    ^true! !


AttributeType subclass: #ENTITY_AT
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


ENTITY_AT comment: '
This class represents the ENTITY attribute type. This is a tokenized type of attribute that signifies to the XML parser that for the purposes of validating, the values of entity type attributes must match the name of an unparsed entity declared in the document type definition.'!


!ENTITY_AT methodsFor: 'validating'!

completeValidationAgainst: aParser from: anAttribute
    ^self validateValueOf: anAttribute for: aParser!

validateDefinition: anAttributeDefinition for: aParser
    ^self!

validateValueOf: anAttribute for: aParser
    | v ent |
    super validateValueOf: anAttribute for: aParser.
    v := anAttribute value.
    (v includes: Character space)
	ifTrue: [aParser notPermitted: 'white space in ENTITY attributes'].
    (aParser isValidName: v)
	ifFalse: [aParser invalid: ('An ENTITY attribute (%1="%2") does not match the required syntax of a Name.'
				bindWith: anAttribute tag asString
				with: v)].
    ent := aParser dtd generalEntityAt: v.
    ent isNil
	ifTrue: [aParser invalid: ('Undeclared unparsed entity "%1" used by attribute type "%2"'
					bindWith: v
					with: anAttribute tag asString)]
	ifFalse: [ent isParsed
		ifTrue: [aParser invalid: ('The entity "%1" used by attribute type "%2" is a parsed entity and should be unparsed'
						bindWith: v
						with: anAttribute tag asString)]
		ifFalse: []]! !


Node subclass: #Attribute
    instanceVariableNames: 'name value '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!




!Attribute class methodsFor: 'instance creation'!

name: nm value: val
    ^self new name: nm value: val! !



Attribute comment: '
XML elements may have name-value pairs called attributes associated with them. This class instantiates an XML attribute.

Instance Variables:
    name	<XML.Node | XML.NodeTag>	tag name
    value	<Object>					tag value
'!


!Attribute methodsFor: 'initialize'!

name: nm value: val
    name := nm isString
		ifTrue: [NodeTag new qualifier: '' ns: '' type: nm]
		ifFalse: [nm].
    value := val!

tag: aTag
    name := aTag! !

!Attribute methodsFor: 'accessing'!

characterData
    ^self value!

expandedName
    ^name expandedName!

key
    ^name!

tag
    ^name!

value
    ^value!

value: aValue
    value := aValue! !

!Attribute methodsFor: 'printing'!

printCanonicalOn: aStream
    aStream nextPutAll: self tag asString, '="'.
    self printCanonical: value on: aStream.
    aStream nextPutAll: '"'.!

printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock
    aStream nextPutAll: self tag asString, '="'.
    self printCanonical: value on: aStream.
    aStream nextPutAll: '"'.!

printOn: aStream
    self printCanonicalOn: aStream!

simpleDescription
    ^'@', self key! !

!Attribute methodsFor: 'testing'!

isAttribute
    ^true!

isLike: aNode
    ^self class == aNode class
	and: [self tag isLike: aNode tag]! !


Pattern subclass: #ComplexPattern
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!


ComplexPattern comment: '
This class is the superclass to what are considered ''complex'' patterns or rules in the element content declarations. As seen from the class hiererarchy, instances of complex patterns include ChoicePattern, MixedPattern, ModifiedPattern and SequencePattern.'!


!ComplexPattern methodsFor: 'testing'!

isSimple
    ^false! !


ComplexPattern subclass: #ChoicePattern
    instanceVariableNames: 'items '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!




!ChoicePattern class methodsFor: 'instance creation'!

on: aList
    ^self new on: aList! !



ChoicePattern comment: '
This class represents the ''choice'' element content constraint in an element type declaration. According to the XML 1.0 specification, the ''choice'' pattern/rule signifies that any content particle in a choice list (declared in the DTD) may appear in the element content at the location where the choice list appears in the grammar

Instance Variables:
    items	<Collection | CCompoundType | MC_FileBTree | Stream>
			Collection of content particles'!


!ChoicePattern methodsFor: 'initialize'!

on: aList
    items := aList! !

!ChoicePattern methodsFor: 'coercing'!

alternateHeads
    ^items!

pushDownFollowSet
    items do: [:i | i addFollows: followSet].
    ^items! !


Node subclass: #PI
    instanceVariableNames: 'name text '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Nodes'!




!PI class methodsFor: 'instance creation'!

name: nm text: aString
    ^self new name: nm text: aString! !



PI comment: '
This class represents the XML Processing Instruction element. Processing instructions allow XML documents to contain instructions for applications.  XML processing instructions are delimited by the start-tag ''<?'' and the end-tag ''?>''. According to the XML 1.0 specification, the target names "XML", "xml" and so on are reserved for standardization

Instance Variables:
    name	<XML.NodeTag>	the target of this processing instruction, used to identify the application to which this processing instruction is directed.
    text		<String>			the processing instructions themselves'!


!PI methodsFor: 'initialize'!

name: nm text: aString
    name := nm.
    text := aString! !

!PI methodsFor: 'accessing'!

name
    ^name!

text
    ^text! !

!PI methodsFor: 'printing'!

printCanonicalOn: aStream
    aStream nextPutAll: '<?', name, ' ', text, '?>'!

printHTMLOn: aStream
    aStream nextPutAll: '<?', name, ' ', text, '?>'!

printNoIndentOn: aStream endSpacing: endSpacingBlock spacing: spacingBlock
    aStream nextPutAll: '<?', name, ' ', text, '?>'!

printOn: aStream depth: indent
    aStream nextPutAll: '<?', name, ' ', text, '?>'! !

!PI methodsFor: 'testing'!

isLike: aNode
    ^self class == aNode class
	and: [self name isLike: aNode name]!

isProcessingInstruction
    ^true! !


AttributeType subclass: #CDATA_AT
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Attributes'!


CDATA_AT comment: '
This class represents the CDATA attribute type. A CDATA attribute is used to escape blocks of text containing characters which would otherwise be recognized as markup.

CDATA sections are demilited by the start-tag ''<![CDATA['' and the end-tag '']]''.'!


!CDATA_AT methodsFor: 'validating'!

validateValueOf: anAttribute for: aParser
    self simpleValidateValueOf: anAttribute for: aParser.! !


ComplexPattern subclass: #MixedPattern
    instanceVariableNames: 'items '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!




!MixedPattern class methodsFor: 'instance creation'!

on: aList
    1 to: aList size do: [:i |
	i+1 to: aList size do: [:j |
		((aList at: i) name asString = (aList at: j) name asString)
			ifTrue: [XMLParser invalid: 'Duplicate element names in a mixed content specification.']]].
    ^self new on: (aList size = 0 ifTrue: [#()] ifFalse: [aList])! !



MixedPattern comment: '
This class represents the ''mixed'' element content constraint in an element type declaration. An element type has mixed content when elements of that type may contain both other child elements and character data (text) as specified in the element content declaration.

Note: For mixed content type elements, one can''t control the order in which the child elements, mixed in among the text, appear.

Instance Variables:
    items	<SequenceableCollection>  comment
'!


!MixedPattern methodsFor: 'initialize'!

on: aList
    items := (Array with: PCDATAPattern new), aList! !

!MixedPattern methodsFor: 'coercing'!

alternateHeads
    ^items, followSet!

pushDownFollowSet
    items do: [:i | i addFollow: self; addFollows: followSet].
    ^items! !

!MixedPattern methodsFor: 'testing'!

couldBeText
    ^true! !


ComplexPattern subclass: #ModifiedPattern
    instanceVariableNames: 'node modification '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!




!ModifiedPattern class methodsFor: 'instance creation'!

on: aNode type: t
    ^self new on: aNode type: t! !



ModifiedPattern comment: '
XML element content declarations can have certain optional characters following an element
name or pattern. These characters govern whether the element or the content particle may occur
one or more (+), zero or more (*), or zero or one (?) times in the element content. This class
represents these patterns or rules

Instance Variables:
    node	<XML.Pattern>
    modification	<Character> Optional character denoting content element occurances
'!


!ModifiedPattern methodsFor: 'initialize'!

on: aNode type: t
    node := aNode.
    modification := t.! !

!ModifiedPattern methodsFor: 'coercing'!

alternateHeads
    ^(modification = $* or: [modification = $?])
	ifTrue: [followSet copyWith: node]
	ifFalse: [Array with: node]!

pushDownFollowSet
    (modification = $+ or: [modification = $*])
	ifTrue: [node addFollow: self].
    node addFollows: followSet.
    ^Array with: node! !


ComplexPattern subclass: #SequencePattern
    instanceVariableNames: 'items '
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!




!SequencePattern class methodsFor: 'instance creation'!

on: aList
    ^self new on: aList! !



SequencePattern comment: '
This class represents the ''sequence'' element content constraint in an element type declaration. According to the XML 1.0 specification, the ''sequence'' pattern/rule signifies that content particles occuring in a sequence list (declared in the DTD) must each appear in the element content in the order given in the list.

Instance Variables:
    items	<SequenceableCollection> Collection of content particles
'!


!SequencePattern methodsFor: 'initialize'!

on: aList
    items := aList! !

!SequencePattern methodsFor: 'coercing'!

alternateHeads
    ^Array with: items first!

pushDownFollowSet
    1 to: items size - 1 do: [:i |
	(items at: i) addFollow: (items at: i+1)].
    items last addFollows: followSet.
    ^items! !


ConcretePattern subclass: #TerminalPattern
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'XML-Patterns'!


TerminalPattern comment: '
Since an element''s content declaration may include multiple constraint rules or patterns, instances of this class are used to indicate to the XML parser, the last or terminal rule in the declaration'!


!TerminalPattern methodsFor: 'accessing'!

description
    ^'<close tag>'! !

!TerminalPattern methodsFor: 'testing'!

isTerminator
    ^true!

matches: aNode
    ^false! !



XML XMLParser initialize!

Namespace current: Smalltalk!

!Stream methodsFor: 'compatibility'!

crtab: indent
    self "nl;" next: indent put: $ .
! !

!CharacterArray methodsFor: 'compatibility'!

findString: x startingAt: y
    ^self indexOfSubCollection: x startingAt: y ifAbsent: [ 0 ]
! !
