[wsf-c-dev] Access to raw xml stream [was: AXIOM NG ]

James Clark james at wso2.com
Sun Dec 10 02:37:03 PST 2006


On Sun, 2006-12-10 at 15:25 +0600, Samisa Abeysinghe wrote:
> I got a question. (In fact, this is a question that was asked by a user 
> on current AXIOM)
> 
> Can one access the raw XML stream directly? (With current AXIOM, you 
> can't; you have to serialize the OM tree you have built to get the 
> original XML representation)

From the application's perspective, the way you do it is to do

  input_stream_t *input_stream_from_xml_element(xml_element_t *);

An xml_element_t is just a promise to supply an XML element in your
choice of format.  It's not inherently serialized or non-serialized. If
you ask an xml_element_t to give you the XML element in serialized form,
depending on the implementation of the particular xml_element, it may or
may not be able to do it without serializing.
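To make the "promise" idea concrete, here is a minimal sketch. Only the
names input_stream_t, xml_element_t and input_stream_from_xml_element()
come from this message; the struct layouts, the function-pointer field and
the two implementations (raw_as_input_stream, tree_as_input_stream) are
assumptions made up for illustration, not a proposed design.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layout: a stream is just a view onto some bytes. */
typedef struct input_stream {
    const char *data;
    size_t len;
} input_stream_t;

typedef struct xml_element xml_element_t;

/* Hypothetical layout: the element only promises it can produce a stream;
   how it does so is up to the implementation behind the function pointer. */
struct xml_element {
    input_stream_t *(*as_input_stream)(xml_element_t *self);
    const char *raw;      /* already-serialized bytes, if the element has them */
    const char *name;     /* element name, used by the tree-backed variant */
    char scratch[64];     /* buffer for the variant that must serialize */
    input_stream_t stream;
    int did_serialize;    /* set to 1 when real serialization work was needed */
};

/* Implementation backed by raw input: no serialization needed. */
static input_stream_t *raw_as_input_stream(xml_element_t *self)
{
    self->stream.data = self->raw;
    self->stream.len = strlen(self->raw);
    return &self->stream;
}

/* Implementation backed by a (trivial) tree: has to serialize first. */
static input_stream_t *tree_as_input_stream(xml_element_t *self)
{
    int n = snprintf(self->scratch, sizeof self->scratch, "<%s/>", self->name);
    if (n < 0 || (size_t)n >= sizeof self->scratch)
        return NULL;      /* name too long for the scratch buffer */
    self->did_serialize = 1;
    self->stream.data = self->scratch;
    self->stream.len = (size_t)n;
    return &self->stream;
}

/* The application-facing call: same request, different cost. */
input_stream_t *input_stream_from_xml_element(xml_element_t *e)
{
    assert(e != NULL && e->as_input_stream != NULL);
    return e->as_input_stream(e);
}
```

The caller sees the same call either way; whether any serialization
happens depends only on which implementation sits behind the element.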

For example, let's suppose you want to schema validate the payload.  The
SOAP stack will give you the payload as a USE_ONCE xml_element, which
might be implemented as a set of in-scope namespaces plus a pointer into
an input buffer, plus an input stream for the rest of the input.
input_stream_from_xml_element() would need to

- reserialize the start-tag of the root payload element in order to
insert the additional in-scope namespaces

- create an input stream from the serialized start-tag, the remaining
portion of the current input buffer and the input stream for the rest of
the input

- the tricky bit is to truncate the input stream after the payload
end-tag before the body and envelope end-tags; probably the best way is
to put a wrapper round the input_stream that does a quick preparse just
looking for left angle-brackets and matching up start-/end-tags
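The "quick preparse" in the last step could look roughly like the sketch
below. It assumes the whole payload is already in one buffer (a real
wrapper would have to carry its state across reads), and the name
payload_length and its helpers are hypothetical. It only matches tags by
depth; it does not check names or handle DOCTYPE declarations, which
should not occur inside a SOAP body anyway.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Does the text at p (avail bytes) begin with prefix? */
static int starts_with(const char *p, size_t avail, const char *prefix)
{
    size_t n = strlen(prefix);
    return avail >= n && memcmp(p, prefix, n) == 0;
}

/* Index just past the first occurrence of end at or after from, or 0. */
static size_t skip_past(const char *buf, size_t len, size_t from, const char *end)
{
    size_t n = strlen(end);
    for (size_t i = from; i + n <= len; i++)
        if (memcmp(buf + i, end, n) == 0)
            return i + n;
    return 0;
}

/* buf[0] must be the '<' of the payload start-tag.  Returns the number of
   bytes up to and including the matching end-tag, or 0 if the buffer ends
   before the element is complete or does not begin with a start-tag. */
size_t payload_length(const char *buf, size_t len)
{
    size_t i = 0;
    int depth = 0;

    assert(buf != NULL);
    if (len == 0 || buf[0] != '<')
        return 0;
    while (i < len) {
        const char *term = NULL;
        if (buf[i] != '<') { i++; continue; }          /* character data */

        /* Comments, CDATA sections and PIs may contain '<' and '>'. */
        if (starts_with(buf + i, len - i, "<!--")) term = "-->";
        else if (starts_with(buf + i, len - i, "<![CDATA[")) term = "]]>";
        else if (starts_with(buf + i, len - i, "<?")) term = "?>";
        if (term != NULL) {
            i = skip_past(buf, len, i + 2, term);
            if (i == 0) return 0;                      /* construct not closed yet */
            continue;
        }

        /* A tag: find its '>' while ignoring '>' inside quoted attributes. */
        int is_end = (i + 1 < len && buf[i + 1] == '/');
        size_t j = i + 1;
        char quote = 0;
        while (j < len && (quote || buf[j] != '>')) {
            if (quote) { if (buf[j] == quote) quote = 0; }
            else if (buf[j] == '"' || buf[j] == '\'') quote = buf[j];
            j++;
        }
        if (j >= len) return 0;                        /* tag not closed yet */

        if (is_end) {
            if (depth == 0) return 0;                  /* end-tag with nothing open */
            depth--;
        } else if (buf[j - 1] != '/') {
            depth++;                                   /* start-tag, not <empty/> */
        }
        i = j + 1;
        if (depth == 0) return i;                      /* payload element closed */
    }
    return 0;
}
```

The wrapper would then hand out at most that many bytes and report end of
input afterwards, so the body and envelope end-tags never reach the
consumer.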

> If the answer is no to the above question, then how could one do
> something like schema validation on incoming XML if they want to?

Just getting the raw XML is not enough.  The raw XML of the payload,
exactly as it appears in the message, may lack the namespace
declarations that you would need to parse it, because they can be
declared on an enclosing element such as the Body or Envelope. What you
need is to get a representation of the payload as serialized XML, but
without unnecessarily parsing and serializing. In other words, I
believe you want

- the semantics of "serialize this element node as an XML document",
with

- the performance of "give me the raw XML stream directly"
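As a rough illustration of how cheap the "semantics" part can be: only the
payload's start-tag has to be rewritten to carry the inherited namespace
declarations, and everything after it can be passed through untouched. The
sketch below is an assumption-laden example, not an existing API: ns_decl_t
and add_ns_decls are hypothetical names, the caller is assumed to pass only
declarations that are not already present on the tag, and the URIs are
assumed to contain no characters that need escaping in an attribute value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: one in-scope namespace; prefix "" means the default one. */
typedef struct {
    const char *prefix;
    const char *uri;
} ns_decl_t;

/* Copies start_tag into out with the given declarations inserted right
   after the element name.  Returns the length written, or -1 if out is
   too small or start_tag does not begin with '<'. */
int add_ns_decls(const char *start_tag, const ns_decl_t *ns, size_t n_ns,
                 char *out, size_t out_size)
{
    size_t name_end, used;
    int n;

    assert(start_tag != NULL && out != NULL);
    if (start_tag[0] != '<')
        return -1;

    /* The element name ends at the first whitespace, '/' or '>'. */
    name_end = 1 + strcspn(start_tag + 1, " \t\r\n/>");
    n = snprintf(out, out_size, "%.*s", (int)name_end, start_tag);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < n_ns; i++) {
        if (ns[i].prefix[0] == '\0')
            n = snprintf(out + used, out_size - used,
                         " xmlns=\"%s\"", ns[i].uri);
        else
            n = snprintf(out + used, out_size - used,
                         " xmlns:%s=\"%s\"", ns[i].prefix, ns[i].uri);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }

    /* The rest of the tag (attributes and closing '>') is copied as-is. */
    n = snprintf(out + used, out_size - used, "%s", start_tag + name_end);
    if (n < 0 || (size_t)n >= out_size - used)
        return -1;
    return (int)(used + (size_t)n);
}
```

A validator fed the rewritten start-tag followed by the untouched remainder
of the payload sees a self-contained XML document, yet nothing beyond the
start-tag was parsed or re-serialized.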

James





