[Wsf-general] data services and resources

Sanjiva Weerawarana sanjiva at wso2.com
Wed Apr 18 00:39:42 PDT 2007


I agree the language integrated query work makes it so much easier to build a 
wide class of applications. Java is without a doubt far behind in language 
innovation now.

On ADO Entities: is that really different from EJB CMP and EJB QL?

Sanjiva.

James Clark wrote:
> The ADO.NET Entity Framework
> 
>  http://msdn2.microsoft.com/en-us/library/aa697427(VS.80).aspx
> 
> seems to have some ideas that are relevant to this.
> 
> (It's scary how far Microsoft's language-integrated query work puts it
> ahead of the Java world.)
> 
> James
> 
> On Mon, 2007-03-19 at 17:53 +0700, James Clark wrote:
>> I've been wondering whether a rather different approach to specifying
>> the data service might be preferable.  I haven't worked this approach
>> out in anything like the detail that our current approach has been
>> worked out in, but I will try to explain the basic idea.
>>
>> The overall philosophy is to be higher-level, more declarative, easier
>> to use, but less flexible and less powerful.  With this philosophy it's
>> not a goal that the user should be able to design an arbitrary web
>> service or REST interface to the information in the database and then
>> use the configuration file to specify that design.  Instead the user
>> gets to decide the kind of reading, writing and searching of the data
>> that they require the web service/REST interface to provide and we
>> automatically create a good-quality web service/REST interface that
>> meets that requirement, together with some modest level of tweakability.
>>
>> The fundamental concept is an "entity-set".  The data service would
>> declare one or more named entity-sets. An entity-set is (surprise,
>> surprise) a set of entities.  Each entity has an identifier that
>> uniquely identifies it within its entity-set.  On the database side of
>> things, for each entity set there would be a corresponding table; the
>> primary key would correspond to the entity's identifier (for simplicity,
>> at first I would expect we wouldn't handle multi-part primary keys).
>> However, not every table corresponds to an entity-set.  On the REST side
>> of things for each entity there's a resource that directly corresponds
>> to that entity; there may also be other resources that provide
>> alternative views of the entity.
>>
>> The data service specification would declare one or more top-level named
>> entity-sets.  In my order database example, the top-level entity-sets
>> might be named "products", "orders" and "customers". The name of the
>> database table corresponding to a particular named entity set would
>> obviously default to the name of the entity set.
>>
>> The second key part of this approach is dealing with things like the
>> order_items table in my example, where information that is logically
>> associated with one entity is in a separate database table from the
>> entity-set to which the entity belongs.  I think the way to handle this
>> is to use the concept that a table that does not correspond to a
>> top-level entity-set can be "owned by" a table that does.  So for
>> example, the order_items table would be owned by the orders table.  For
>> some cases, it may be necessary to be explicit about how the rows of the
>> owned table relate to the rows of the owning table, but my guess is that
>> in most cases you can do the right thing by looking at the primary
>> key/foreign key information in the database schema.
>>
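The ownership guess from primary key/foreign key information could be sketched roughly as follows. This is only an illustration, assuming SQLite as the database and a hypothetical helper name (infer_owned_tables); the table definitions are made up to match the order example.

```python
import sqlite3

def infer_owned_tables(conn, entity_sets):
    """Map each non-entity-set table to the entity-set tables it references.

    Hypothetical helper: returns {table: [(owner, from_column, to_column)]}.
    """
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    owned = {}
    for table in tables:
        if table in entity_sets:
            continue
        # Each row of PRAGMA foreign_key_list is
        # (id, seq, referenced_table, from_column, to_column, ...).
        for fk in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            if fk[2] in entity_sets:
                owned.setdefault(table, []).append((fk[2], fk[3], fk[4]))
    return owned

# Assumed schema for the order database example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(id),
        product_id INTEGER REFERENCES products(id),
        quantity INTEGER
    );
""")
owned = infer_owned_tables(conn, {"products", "orders"})
```

Here order_items references both orders and products, so the schema alone yields two candidate owners. That seems to be exactly the kind of case where the configuration would have to say explicitly that orders is the owner.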
>> I believe it's possible to provide a basic REST interface for many
>> databases using just
>>
>> - the top-level entity-sets and their corresponding tables,
>>
>> - ownership relationships between tables, and
>>
>> - the database schema
>>
>> Obviously there are lots of different ways of providing a REST
>> interface, but I think most of them can be intelligently defaulted (or
>> even fixed). Let's assume we want to expose the REST interface at
>> http://example.com/db/.
>>
>> - There needs to be an XML representation of an entity that both can be
>> generated from the database and also allows the database to be updated
>> from the XML representation.  The database schema is enough to allow a
>> reasonable default.  Any customization facilities mustn't be so flexible
>> that they inhibit automatic generation of an XML schema or mapping from
>> the XML back to the database.  The tricky bit will be handling ownership
>> relationships.  My guess is that you can mostly do the right thing by
>> looking at the primary key/foreign keys. It should also be possible to
>> automatically turn foreign keys into the appropriate URI because you can
>> tell from the configuration where the URI for the resource corresponding
>> to an entity is.
>>
>> - There needs to be a URI for the resource corresponding to each
>> entity-set.  This can be defaulted from the name of the entity-set: for
>> example, the orders entity-set might be at
>> http://example.com/db/orders/.  A GET on that would provide a listing of
>> all the entities in the entity-set. There would need to be a
>> configurable limit on the number of entities returned by such a GET and
>> a way to iterate over large entity sets (e.g. using queries to specify
>> the range of the result to return).  There should be some configuration
>> that says what fields of the entity are returned in a GET on the
>> entity-set: obviously the URI of the entity needs to be there; you might
>> want just that, you might want a single title-like field (as in Atom) to
>> be returned, or you might want all fields to be returned.
>>
>> - There needs to be a URI for the resource corresponding to each entity.
>> This would default to the URI for the entity-set plus the canonical
>> lexical representation of the primary key.  For example,
>> http://example.com/db/orders/12345.  A GET on this would return the XML
>> representation of the entity.  The existing entity could be modified by
>> doing a PUT on its URI. DELETE on the URI will delete the entity.
>>
>> - There are two ways that a new entity might be added: doing a POST on
>> the _entity-set_ URI or doing a PUT on the _entity_ URI.  It should be
>> possible to automatically figure out which is the right way for a
>> particular entity set based on whether the primary key is autogenerated
>> (I think you can get this from the database schema): POST if it's
>> autogenerated, PUT if it's not.
>>
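The URI defaults and the POST-versus-PUT rule in the list above can be sketched in a few lines. The function names are hypothetical, and whether the key is autogenerated is simply passed in as a flag here rather than read from the database schema.

```python
from urllib.parse import quote

BASE_URI = "http://example.com/db/"  # base URI used in the text

def entity_set_uri(entity_set):
    """Default URI of an entity-set: base URI plus the entity-set name."""
    return f"{BASE_URI}{quote(entity_set, safe='')}/"

def entity_uri(entity_set, primary_key):
    """Default URI of an entity: entity-set URI plus the primary key."""
    return entity_set_uri(entity_set) + quote(str(primary_key), safe="")

def creation_request(entity_set, primary_key, key_is_autogenerated):
    """Return the (HTTP method, URI) a client would use to add an entity."""
    if key_is_autogenerated:
        # The database assigns the key, so the client POSTs to the entity-set.
        return ("POST", entity_set_uri(entity_set))
    # The client chooses the key, so it PUTs directly to the entity URI.
    return ("PUT", entity_uri(entity_set, primary_key))
```

For example, creation_request("orders", None, True) gives a POST to http://example.com/db/orders/, while a products entity-set with client-chosen keys would be created with a PUT on the entity URI.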
>> The next big thing that a REST interface would need is some searching
>> capability.  A starting point is to allow the user to specify that
>> certain fields are searchable. For example, if they specify that the
>> country field of the customers entity-set is searchable, then
>> http://example.com/db/customers?country=US would return a listing of all
>> customers with a country field equal to US.  The next step might be to
>> allow a query parameter to be associated with an SQL expression. For
>> example, we might want http://example.com/db/customers?min-age=18 to
>> give us a list of all customers aged at least 18.   The configuration
>> might have something like this:
>>
>> <query-param name="min-age" type="int">  
>>  <field name="dateOfBirth"/> - now > 18 years
>> </query-param>
>>
>> This would allow query parameters to compose properly with no extra work
>> (e.g. http://example.com/db/customers?min-age=18&country=US would "just
>> work").
>>
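A rough sketch of why the parameters compose: if each query parameter maps to one SQL condition, the conditions can simply be ANDed together. The QUERY_PARAMS table, the build_query name and the SQLite date arithmetic below are assumptions for illustration, not the proposed configuration format.

```python
import sqlite3
from urllib.parse import parse_qsl, urlsplit

# Hypothetical configuration: query parameter name -> (SQL condition, type).
QUERY_PARAMS = {
    "country": ("country = ?", str),
    # SQLite-specific: born on or before the date N years ago.
    "min-age": ("dateOfBirth <= date('now', '-' || ? || ' years')", int),
}

def build_query(table, uri):
    """Turn the query string of a request URI into SQL plus bound values."""
    conditions, values = [], []
    for name, raw in parse_qsl(urlsplit(uri).query):
        condition, convert = QUERY_PARAMS[name]  # KeyError if not configured
        conditions.append(condition)
        values.append(convert(raw))
    # The table name comes from the configuration, never from the request.
    sql = f"SELECT name FROM {table}"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, values

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, country TEXT, dateOfBirth TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    ("Ann", "US", "1950-01-01"),
    ("Cy", "FR", "1950-01-01"),
])
conn.execute("INSERT INTO customers VALUES ('Bob', 'US', date('now', '-5 years'))")

sql, values = build_query(
    "customers", "http://example.com/db/customers?min-age=18&country=US")
adults_in_us = [row[0] for row in conn.execute(sql, values)]
```

Only the values are taken from the request and they are always bound as parameters, so adding another searchable field is just one more entry in the table.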
>> We would also probably want a way to provide different views of the
>> entities, e.g. that excluded certain fields.
>>
>> By working at a relatively high level, we can automatically do
>> several nice things for the user:
>>
>> - we can automatically provide introspection facilities (e.g. WADL),
>> complete with XSD and RELAX NG schemas
>>
>> - we should (I think) be able to automatically generate ETags; this is
>> important for cacheability and crucial for dealing with concurrent
>> updates
>>
>> - it should be a small step to get an Atom interface as well
>>
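One possible way to generate ETags automatically is to hash a canonical form of the entity's fields. This is just a sketch under that assumption; a version column or last-modified timestamp in the table would work as well, and the function names are hypothetical.

```python
import hashlib
import json

def entity_etag(row):
    """Derive a strong ETag from a canonical serialization of the row."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"), default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f'"{digest[:32]}"'  # ETag values are quoted strings

def may_update(if_match_header, current_row):
    """Accept a PUT only if the client's If-Match value is still current."""
    return if_match_header == entity_etag(current_row)

order = {"id": 12345, "status": "open", "total": "19.90"}
tag = entity_etag(order)

# Another client changes the order before our PUT arrives.
changed = dict(order, status="shipped")
```

With this, may_update(tag, order) holds while the entity is unchanged, and may_update(tag, changed) fails, so the second writer would be told to re-fetch instead of silently overwriting.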
>> So far I've focused on REST.  That's partly because I think we have a
>> bit of corporate REST deficit at the moment, and partly because I think
>> it's easier to go from a REST interface to a WSDL (service-oriented)
>> interface than vice-versa.  How might a WSDL interface be specified?  I
>> envisage there being a number of built-in methods such as add,
>> addMultiple, delete, deleteMatching, search, iterate, get which could
>> apply to an entity or entity-set.  The basic idea would be that the user
>> would identify which built-in methods are allowed for which entity-sets.
>> Each built-in method would have some number of configurable parameters.
>> For example, the user might specify that they want to enable the "add"
>> method for the "customers" entity-set.  By default we might choose a
>> WSDL operation name of addCustomer, but there would be a configurable
>> parameter that would allow it to be changed to createCustomer.  There
>> might be some configurable parameters at the entity-set level: for
>> example, the singular noun to be used (e.g. so that you can have a table
>> called "people" and get methods called "addPerson", "removePerson").
>>
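The operation-name defaulting described above might look something like this. The function and its parameters are hypothetical, and the singular-noun guess (dropping a trailing "s") is deliberately naive, which is why the entity-set level setting would be needed.

```python
def default_operation_name(method, entity_set, singular=None, override=None):
    """Build a WSDL operation name such as addCustomer.

    method     -- built-in method name, e.g. "add" or "remove"
    entity_set -- entity-set name, e.g. "customers"
    singular   -- optional configured singular noun, e.g. "person"
    override   -- optional configured operation name, used as-is
    """
    if override is not None:
        return override
    if singular is None:
        # Naive default: strip a trailing "s" from the entity-set name.
        singular = entity_set[:-1] if entity_set.endswith("s") else entity_set
    return method + singular[:1].upper() + singular[1:]
```

So default_operation_name("add", "customers") gives addCustomer, passing override="createCustomer" renames it, and a "people" table configured with singular="person" gets addPerson and removePerson.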
>> Given the built-in method and the database schema it should be possible
>> to automatically generate a tasteful default WSDL interface.  The user
>> wouldn't need to worry about writing an XSD schema: even when the input
>> XML is complex, the semantics of the builtin method together with the
>> database schema should be enough to allow us to create the XSD for the
>> user.  Apart from automatically generating the WSDL, another nice thing
>> we should be able to do for the user in the WS-* world is automatically
>> support WS-Transfer and WS-Enumeration.  Maybe we could even have a
>> method that generates events when the database is modified (though this
>> would require permission to create database triggers).
>>
>> In some cases, the built-in methods may not be sufficient. I envisage
>> providing two ways to go beyond this.  The first way would require XML
>> and SQL skills but not programming skills.  This would be quite similar
>> to what we have at the moment: the user would provide a fragment of SQL,
>> perhaps an XSD for the output XML or more likely the input XML, perhaps
>> an XPath to get the input XML into SQL parameters, perhaps an XSLT to
>> get the SQL into the desired XML form.  The second way, which would
>> require programming skills, would be to make the set of built-in methods
>> extensible. The user would be able to extend the available built-in
>> methods just by dropping in a jar file containing a class that
>> implements a particular interface.  The tricky bit would be designing
>> this interface: maybe it would work by generating SQL/XSD/XPath/XSLT, or
>> maybe it would work completely differently.
>>
>> In terms of tooling, I think this is declarative enough that it should
>> be possible to create a nice, easy to use Ajax interface that works on
>> the XML configuration file, which would be guided by an XML representation
>> of the database schema.
>>
>> This message is already rather long. I haven't talked about what I see
>> as the problems with the current approach.  I can do that if people
>> want.  The fundamental reason why I prefer the approach I've outlined
>> above is that I think it's better for the specification to express as
>> much as it can at as high a semantic level as possible. I don't think
>> there's a big technical risk in the kind of approach I'm suggesting: it
>> has a lot of conceptual similarities to object-relational mapping
>> technologies, such as the Java Persistence API
>> (http://java.sun.com/developer/technicalArticles/J2EE/jpa/).
>>
>> BTW, if anybody's a bit rusty on databases, I would recommend this book:
>> http://www.amazon.com/Database-Systems-Complete-Hector-Garcia-Molina/dp/0130319953/ (the Amazon customer reviews page has an amusing mixture of 1-star and 5-star reviews).
>>
>> James
>>
> 
> 

-- 
Sanjiva Weerawarana, Ph.D.
Founder, Chairman & CEO; WSO2, Inc.; http://www.wso2.com/
email: sanjiva at wso2.com; cell: +94 77 787 6880; fax: +1 509 691 2000

"Oxygenating the Web Service Platform."



