Let's discuss a SAX API for Ada.
In short, SAX parsing works as follows:
The user program registers its callbacks (a handler) with the SAX parser and
starts parsing by providing a document URL.
The parser calls the handler at the appropriate points of the parsing process.
For instance, when the parser encounters a new tag, the handler receives
the tag name and the tag's attribute list.
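The callback flow described above can be sketched roughly as follows. This is only a toy illustration: the names Start_Tag_Handler, On_Start_Tag and Parse are assumptions made for this sketch, the document is passed as a String instead of a URL, and the attribute list is handed over as raw text rather than as a parsed list.

```ada
with Ada.Text_IO;

procedure Sax_Demo is

   --  Hypothetical handler profile: tag name plus raw attribute text.
   type Start_Tag_Handler is
     access procedure (Name : String; Attributes : String);

   Tags_Seen : Natural := 0;

   --  User callback, called by the parser for every opening tag.
   procedure On_Start_Tag (Name : String; Attributes : String) is
   begin
      Tags_Seen := Tags_Seen + 1;
      Ada.Text_IO.Put_Line
        ("tag: " & Name & "  attributes: [" & Attributes & "]");
   end On_Start_Tag;

   --  Toy parser: reports each "<name attributes>" it finds.
   --  Assumes every '<' has a matching '>'; comments, CDATA,
   --  self-closing tags and other XML rules are not handled.
   procedure Parse (Document : String; Handler : Start_Tag_Handler) is
      I         : Positive := Document'First;
      Name_Last : Positive;
      Tag_Last  : Positive;
   begin
      while I < Document'Last loop
         if Document (I) = '<'
           and then Document (I + 1) /= '/'
           and then Document (I + 1) /= '?'
           and then Document (I + 1) /= '!'
         then
            --  Locate the closing '>' of this tag.
            Tag_Last := I + 1;
            while Document (Tag_Last) /= '>' loop
               Tag_Last := Tag_Last + 1;
            end loop;
            --  The name ends at the first space or at '>'.
            Name_Last := I + 1;
            while Name_Last < Tag_Last
              and then Document (Name_Last) /= ' '
            loop
               Name_Last := Name_Last + 1;
            end loop;
            --  Both arguments are slices of Document, not copies made here.
            Handler (Document (I + 1 .. Name_Last - 1),
                     Document (Name_Last + 1 .. Tag_Last - 1));
            I := Tag_Last;
         end if;
         I := I + 1;
      end loop;
   end Parse;

begin
   --  "Registering" the handler here simply means passing it to Parse.
   Parse ("<doc><item id=""1""></item></doc>", On_Start_Tag'Access);
   Ada.Text_IO.Put_Line ("opening tags:" & Natural'Image (Tags_Seen));
end Sax_Demo;
```

A real API would more likely group several callbacks (start tag, end tag, character data) in one handler object; a single access-to-procedure is used here only to keep the sketch short.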
Let's look at the general structure of an XML parser written in Ada.
XML document, presented as a stream (Ada.Streams)
    Data type: Ada.Streams.Stream_Element_Array
Encoding support engine
    Data type: Character, Wide_Character
    Data type: access String, access Wide_String
We need efficient data delivery throughout this process
to make the parser fast. This requires two things.
The encoding support engine should be able to transform
a Stream_Element_Array into a String or Wide_String.
This part of the parser is implementation-dependent because
Stream_Element'Size may vary between implementations.
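As a minimal sketch of such a transformation, here is a Latin-1 decoder that maps each stream element to one Character. The function name To_Latin_1 is an assumption for this example, and the code only works where Stream_Element'Size is 8, which is common but not guaranteed; it checks this at run time.

```ada
with Ada.Streams; use Ada.Streams;
with Ada.Text_IO;

procedure Decode_Demo is

   --  Hypothetical helper: Latin-1 decoding, one element per Character.
   --  Assumes Stream_Element'Size = 8, which is common but not guaranteed.
   function To_Latin_1 (Data : Stream_Element_Array) return String is
      Result : String (1 .. Data'Length);
   begin
      if Stream_Element'Size /= 8 then
         raise Program_Error with "Stream_Element is not 8 bits wide";
      end if;
      for J in Data'Range loop
         Result (1 + Natural (J - Data'First)) :=
           Character'Val (Natural (Data (J)));
      end loop;
      return Result;
   end To_Latin_1;

   --  The bytes of "<a>" as they would arrive from a stream.
   Raw  : constant Stream_Element_Array :=
     (1 => 16#3C#, 2 => 16#61#, 3 => 16#3E#);
   Text : constant String := To_Latin_1 (Raw);

begin
   Ada.Text_IO.Put_Line (Text);
end Decode_Demo;
```

Note that this version copies every element once; a production engine would probably try to avoid even that for encodings where the representations coincide.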
The parser, in turn, should be able to pass data to the handler efficiently.
The tag name and character data can be passed as a slice of the parser's buffer:
callUserHandler( buffer(10..20) ).
This way we avoid copying the data, because the compiler
typically chooses to pass array arguments by reference. The main
question is how to pass the list of attributes.
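The slice-passing idea can be tried out with a small program like the one below. The buffer contents and the name Call_User_Handler are made up for this sketch; whether the slice is really passed by reference remains the compiler's decision.

```ada
with Ada.Text_IO;

procedure Slice_Demo is

   --  Stand-in for the parser's buffer; positions 10 .. 20 hold the text.
   Buffer : constant String := "<a><name>Hello world</name>";

   Seen_First, Seen_Last : Natural := 0;

   --  Hypothetical handler: receives character data as a String.
   procedure Call_User_Handler (Text : String) is
   begin
      --  A slice keeps the index bounds of the buffer it came from,
      --  so handlers should use Text'Range rather than assume 1 .. N.
      Seen_First := Text'First;
      Seen_Last  := Text'Last;
      Ada.Text_IO.Put_Line (Text);
   end Call_User_Handler;

begin
   --  No explicit copy is written here; whether the slice is passed by
   --  reference is the compiler's choice.
   Call_User_Handler (Buffer (10 .. 20));
   Ada.Text_IO.Put_Line
     (Natural'Image (Seen_First) & " .." & Natural'Image (Seen_Last));
end Slice_Demo;
```

One consequence worth keeping in mind: the handler sees the bounds 10 .. 20, not 1 .. 11, so handler code must not assume that the string starts at index 1.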
Let's examine the following choices:
function getAttrValue(al:AttrList;index:Positive) return String;
This requires copying the value to and from the secondary stack.
function getAttrValue(al:AttrList;index:Positive) return access String;
This requires dynamic memory allocation, because
buffer(10..20)'Access is illegal (a slice is not an aliased object).
In addition, the handler and the parser need an agreement about
deallocating the memory, to avoid dangling references and memory leaks.
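To make the cost of the second choice concrete, here is a sketch of one possible agreement: the parser allocates a copy of the value and the handler is responsible for freeing it. The names String_Access, Get_Attr_Value and Free are assumptions for this example, and the attribute lookup is reduced to a fixed slice.

```ada
with Ada.Text_IO;
with Ada.Unchecked_Deallocation;

procedure Attr_Demo is

   type String_Access is access String;
   procedure Free is
     new Ada.Unchecked_Deallocation (String, String_Access);

   --  Stand-in for the parser's internal buffer: id="42"
   Buffer : constant String := "id=""42""";

   --  Parser side (hypothetical): the value is copied to the heap,
   --  because Buffer (5 .. 6)'Access would be rejected by the compiler.
   function Get_Attr_Value return String_Access is
   begin
      return new String'(Buffer (5 .. 6));
   end Get_Attr_Value;

   Value : String_Access := Get_Attr_Value;

begin
   Ada.Text_IO.Put_Line (Value.all);
   --  Agreed rule in this sketch: the handler owns the result and frees it.
   Free (Value);
   --  Free resets Value to null, so it cannot dangle afterwards.
   if Value = null then
      Ada.Text_IO.Put_Line ("released");
   end if;
end Attr_Demo;
```

The allocation and the explicit Free on every attribute access are exactly the overhead this choice introduces; if the handler forgets the Free, the memory leaks.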
The second question is which character types to use.
The XML specification requires that a parser accept XML documents
in (at least) the UTF-8 and UTF-16 encodings; Latin-1 is a common addition.
We can use the Wide_Character type to store characters from all of these encodings.
But sometimes it is more convenient to use Character.
One possible way out is to pass UTF-8 text in a String argument.
This conforms to the specification's requirements but makes
coding more difficult, because a single UTF-8 character may occupy
several Characters in the String.
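The difficulty can be seen in a small example. The sketch below assumes an Ada 2012 compiler, since it uses the standard package Ada.Strings.UTF_Encoding.Wide_Strings; the sample text is chosen only for illustration.

```ada
with Ada.Strings.UTF_Encoding.Wide_Strings;
with Ada.Text_IO;

procedure Utf8_Demo is

   package Conv renames Ada.Strings.UTF_Encoding.Wide_Strings;

   --  "cafe" with an e-acute, in UTF-8: the last letter takes
   --  two Characters (16#C3#, 16#A9#).
   Utf_8 : constant String :=
     "caf" & Character'Val (16#C3#) & Character'Val (16#A9#);

   --  The same text with one Wide_Character per letter.
   Decoded : constant Wide_String := Conv.Decode (Utf_8);

begin
   Ada.Text_IO.Put_Line
     ("Characters in the String:     " & Natural'Image (Utf_8'Length));
   Ada.Text_IO.Put_Line
     ("Wide_Characters after decode: " & Natural'Image (Decoded'Length));
   Ada.Text_IO.Put_Line
     ("Code of the last letter:      "
      & Natural'Image (Wide_Character'Pos (Decoded (Decoded'Last))));
end Utf8_Demo;
```

A handler that receives the UTF-8 String cannot index it letter by letter, because Utf_8'Length counts encoded Characters, not letters.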
If we make the right decisions on these two questions,
the XML API will perform well.
I propose an example of a SAX parser in Ada
as a basis for further work and discussion.
Please forgive my poor English.
I hope the sense of this article is clear enough.