[Wsf-general] The configuration file for the database service

James Clark james at wso2.com
Mon Feb 26 23:07:48 PST 2007


On Mon, 2007-02-26 at 17:03 +0530, Chamil Thanthrimudalige wrote:
> hi all,
> 
> The configurations for the DBService will be in a file with the  
> extension "dbs". The configuration file so far looks like below.
> 
> <data-service name="UserDataService" targetNameSpace="http:// 
> www.wso2.org/types">
> 	<connection>
> 		<driver>org.apache.derby.jdbc.EmbeddedDriver</driver>
> 		<protocol>jdbc:derby:../database/WSO2WSAS_DB</protocol>
> 		<user>wso2wsas</user>
> 		<password>wso2wsas</password>
> 	</connection>
> 	<operation name="getUser">
> 		<input/>
> 		<output/>
> 	</operation>
> </data-service>
> 
> Details on the connection section needs to be generalized. So I would  
> like get some input from a JDBC expert no how best we can capture the  
> details.

I would hope that the .dbs format could be used for C as well as Java,
which implies the connection section needs to work not just for JDBC but
also for MySQL and any other native code database we might want to
support. The information that a user might need to supply about a
connection is virtually limitless.  For example, MySQL has a gazillion
options any of which the user might need to specify. I think the
simplest solution is for the connection element to have a zero or more
property children, with some conventions for property names, e.g.

<connection>
  <property name="user">wsas</property>
  <property name="jdbc.driver">org.apache.derby.jdbc.EmbeddedDriver</property>
</connection>

A fancier solution would use XML namespaces.

> The input element will take the form of a list of parameters and a  
> prepared sql statement. The question marks in the statement will be  
> replaced with the values of the parameters in order when being  
> executed. The parameter element will include the xpath expression to  
> get the parameter value from the input message.
> Eg:
> <input>
> 	<parameter>row/name</parameter>
> 	<parameter>row/age</parameter>
> 	<sqlStatement>select * from user where name=? or age = ?</sqlStatement>
> </input>

I'll refer to somebody who's using our data service to expose some of
their database content as a web service as a "data provider".  I think
data provider is going to want to provide a WSDL to potential consumers
of their web service.  So we need to think about how this WSDL is going
to get created.  I think we should assume that the average data provider
has reasonable SQL skills, but probably not very good XML skills.  So I
don't think we can expect that a data provider is going to be able to
write the XML schema for their WSDL.  This means we've got to generate
the XML schema for them.

Using XPath here seems to me unnecessarily general, but more importantly
will make it hard to generate the schema.  If instead we say that
parameters always occur as child elements of the root element of the
payload, then it would become possible to generate a schema from the
<input> element.  (Operations that do updates are probably going to need
richer structures, but I suspect queries will be the most common use of
the data service.)

Do we need to force the user always to specify parameter elements?  We
can tell how many parameters the prepared statement needs, and we can
choose default names (like <param>) for the parameters.

Another issue is parameter typing. The generated schema ought to be able
to specify the type of the parameters and our code needs to
intelligently convert between schema types and SQL types.  I would be
inclined to specify the SQL type in the parameter, and then our code
would choose the appropriate XML schema type (following the assumption
that the user would be likely to know more SQL than XML schema).

Finally, the interface of the Web Service shouldn't be coupled too much
to the implementation.  It should be possible for parameters to occur in
a different order in the SOAP request than in the SQL query.

> The output can be custom formated in two easy ways. 

Two different ways is not a good idea. These should be unified.

> The type  
> attribute will say which format is being used. "rename" means that it  
> is a simple column rename and "template" will indicate that an xml  
> template will be used.
> 
> First way is by doing a simple rename of the columns. Here the rename  
> element will indicate the source column name by "from" attribute and  
> target from "to" attribute.
> Eg:
> <output wrap-rows="rowName">
> 	<rename xmlns:my="http://www.wso2.org/types" from="ISINDEX"  
> to="IndexedState"/>
> </output>

What's the xmlns:my declaration doing?

> Or else it can be formated using a xml template. Here the param  
> element will indicate the column of the dataset.
> Eg:
> <output type="template" wrap-rows="rowName">
> 	<name><param colName="name"/></name>
> 	<age><param colName="age"/></age>
> </optput>
> 
> The wrap-rows attribute on the output element will give the repeating  
> container element for grouping the individual records. If it is not  
> given then it is assumed that only a single row will be returned, if  
> more than one result is returned then the system will throw an error.

I think we should make it possible to leave out the <output/> element,
in which case we should provide reasonable default behaviour.

wrap-rows is mixing two things that should be independent:

- a way for the user to change the name of the element used for rows
(should probably default to <row>)

- a way for the user to assert that the query will return at most one
row

I would suggest the following XML syntax for output:

<output singleRow="true|false"
        rowElementName="row"
        rowsElementName="rows"
        useNil="true|false"
        defaultNamespace="..."
        columnDefault="element|attribute|omit">
   <attribute columnName="country"/>
   <element elementName="IndexedState" columnName="ISINDEX"/>
   ...
</output>

Note that everything is optional, so for example,

<output>
  <element elementName="IndexedState" columnName="ISINDEX"/>
</output>

is allowed, as is no <output> element at all.

More details:

- The singleRow attribute defaults to false.  If singleRow=true, then no
rows element is output, and it's an error if there's more than one row.
(How should we handle the case where singleRow="true", but the query
returns and empty set of rows?)

- The rowsElementName attribute default to "rows".

- The rowElementName attribute default to "row".

- The "element" element causes a column to be emitted as an element.

- The "attribute" element causes the column to be emitted as an
attribute rather than an element.

- The elementName attribute on "element" and the attributeName attribute
on "attribute" default to the value of the columnName attribute.

- There are also attributes on "element"/"attribute" controlling type
conversion.

- If useNil is true, then an empty element with an xsi:nil is emitted
for a column that has a null value (nothing is ever emitted for a null
column that is reprented by an attribute); otherwise nothing is emitted
for a column with a null value.

- The columnDefault attribute says what to do with columns for which
there is no "element" or "attribute" element.  The default is "element".

- What order should elements for columns be output in?  I would say
elements mentioned by "element" elements should be output according to
the order in which they are mentioned, followed by any other elements in
the order returned by the query.

- The names of elements and attributes specified in attribute values
should be able to be qnames. The namespace of unqualified names of
elements should be the value of defaultNamespace (which defaults to "").

Automatically generating a schema (except when columnDefault="omit") is
going to require that you can get the database to give you the result
type of the SQL.  MySQL can.  I don't know if straight JDBC can do that.
If it can't, then you'll need to use an extension or a separate tool or
something.

James







More information about the Wsf-general mailing list