[Wsf-general] data services and resources

James Clark james at wso2.com
Fri Apr 13 23:21:30 PDT 2007


The ADO.NET Entity Framework

 http://msdn2.microsoft.com/en-us/library/aa697427(VS.80).aspx

seems to have some ideas that are relevant to this.

(It's scary how far Microsoft's language-integrated query work puts it
ahead of the Java world.)

James

On Mon, 2007-03-19 at 17:53 +0700, James Clark wrote:
> I've been wondering whether a rather different approach to specifying
> the data service might be preferable.  I haven't worked this approach
> out in anything like the detail that our current approach has been
> worked out in, but I will try to explain the basic idea.
> 
> The overall philosophy is to be higher-level, more declarative, easier
> to use, but less flexible and less powerful.  With this philosophy it's
> not a goal that the user should be able to design an arbitrary web
> service or REST interface to the information in the database and then
> use the configuration file to specify that design.  Instead the user
> gets to decide the kind of reading, writing and searching of the data
> that they require the web service/REST interface to provide and we
> automatically create a good-quality web service/REST interface that
> meets that requirement, together with some modest level of tweakability.
> 
> The fundamental concept is an "entity-set".  The data service would
> declare one or more named entity-sets. An entity-set is (surprise,
> surprise) a set of entities.  Each entity has an identifier that
> uniquely identifies it within its entity-set.  On the database side of
> things, for each entity set there would be a corresponding table; the
> primary key would correspond to the entity's identifier (for simplicity,
> at first I would expect we wouldn't handle multi-part primary keys).
> However, not every table corresponds to an entity-set.  On the REST side
> of things for each entity there's a resource that directly corresponds
> to that entity; there may also be other resources that provide
> alternative views of the entity.
> 
> The data service specification would declare one or more top-level named
> entity-sets.  In my order database example, the top-level entity-sets
> might be named "products", "orders" and "customers". The name of the
> database table corresponding to a particular named entity set would
> obviously default to the name of the entity set.
> 
> The second key part of this approach is dealing with things like the
> order_items table in my example, where information that is logically
> associated with one entity is in a separate database table from the
> entity-set to which the entity belongs.  I think the way to handle this
> is to use the concept that a table that does not correspond to a
> top-level entity-set can be "owned by" a table that does.  So for
> example, the order_items table would be owned by the orders table.  For
> some cases, it may be necessary to be explicit about how the rows of the
> owned table relate to the rows of the owning table, but my guess is that
> in most cases you can do the right thing by looking at the primary
> key/foreign key information in the database schema.
> 
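A rough sketch of the ownership guess described above, using SQLite only
because it ships with Python.  The heuristic is an assumption, not something
the proposal specifies: a non-top-level table is treated as owned by a
top-level table when one of its foreign keys to that table is also part of its
own primary key.  The function name infer_owners and the column names in the
sample schema are made up for illustration.

```python
import sqlite3


def infer_owners(conn, top_level):
    """Guess which top-level table owns each non-top-level table.

    Assumed heuristic: the owned table has a foreign key to the owner,
    and that foreign-key column is part of the owned table's primary key.
    """
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    owners = {}
    for table in tables:
        if table in top_level:
            continue
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        pk_cols = {r[1] for r in conn.execute(f"PRAGMA table_info({table})")
                   if r[5] > 0}
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            ref_table, from_col = fk[2], fk[3]
            if ref_table in top_level and from_col in pk_cols:
                owners[table] = ref_table
    return owners


conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT);
CREATE TABLE order_items (
    order_id INTEGER REFERENCES orders(id),
    line_no INTEGER,
    product_id INTEGER REFERENCES products(id),
    quantity INTEGER,
    PRIMARY KEY (order_id, line_no)
);
""")
owners = infer_owners(conn, {"products", "orders"})
```

Here order_items references both orders and products, but only order_id is
part of its primary key, so the sketch picks orders as the owner.  Schemas
that do not follow this pattern would need the explicit configuration
mentioned above.
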
> I believe it's possible to provide a basic REST interface for many
> databases using just
> 
> - the top-level entity-sets and their corresponding tables,
> 
> - ownership relationships between tables, and
> 
> - the database schema
> 
> Obviously there are lots of different ways of providing a REST
> interface, but I think most of them can be intelligently defaulted (or
> even fixed). Let's assume we want to expose the REST interface at
> http://example.com/db/.
> 
> - There needs to be an XML representation of an entity that can both be
> generated from the database and used to update the database.  The
> database schema is enough to allow a
> reasonable default.  Any customization facilities mustn't be so flexible
> that they inhibit automatic generation of an XML schema or mapping from
> the XML back to the database.  The tricky bit will be handling ownership
> relationships.  My guess is that you can mostly do the right thing by
> looking at the primary key/foreign keys. It should also be possible to
> automatically turn foreign keys into the appropriate URI because you can
> tell from the configuration where the URI for the resource corresponding
> to an entity is.
> 
> - There needs to be a URI for the resource corresponding to each
> entity-set.  This can be defaulted from the name of the entity-set: for
> example, the orders entity-set might be at
> http://example.com/db/orders/.  A GET on that would provide a listing of
> all the entities in the entity-set. There would need to be a
> configurable limit on the number of entities returned by such a GET and
> a way to iterate over large entity sets (e.g. using queries to specify
> the range of the result to return).  There should be some configuration
> that says what fields of the entity are returned in a GET on the
> entity-set: obviously the URI of the entity needs to be there; you might
> want just that, you might want a single title-like field (as in Atom) to
> be returned, or you might want all fields to be returned.
> 
> - There needs to be a URI for the resource corresponding to each entity.
> This would default to the URI for the entity-set plus the canonical
> lexical representation of the primary key.  For example,
> http://example.com/db/orders/12345.  A GET on this would return the XML
> representation of the entity.  The existing entity could be modified by
> doing a PUT on its URI. DELETE on the URI will delete the entity.
> 
> - There are two ways that a new entity might be added: doing a POST on
> the _entity-set_ URI or doing a PUT on the _entity_ URI.  It should be
> possible to automatically figure out which is the right way for a
> particular entity set based on whether the primary key is autogenerated
> (I think you can get this from the database schema): POST if it's
> autogenerated, PUT if it's not.
> 
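The URI defaults and the POST-versus-PUT rule in the list above can be
sketched in a few lines.  This is only an illustration: the function names are
hypothetical, and percent-encoding the key is an assumed stand-in for the
"canonical lexical representation" of the primary key.

```python
from urllib.parse import quote

# Base URI taken from the example in the text.
BASE = "http://example.com/db/"


def entity_set_uri(entity_set):
    """Default URI for an entity-set: base plus the entity-set name."""
    return BASE + quote(entity_set, safe="") + "/"


def entity_uri(entity_set, key):
    """Default URI for an entity: entity-set URI plus the primary key."""
    return entity_set_uri(entity_set) + quote(str(key), safe="")


def creation_method(key_is_autogenerated):
    """POST to the entity-set if the database assigns the key,
    PUT to the entity URI if the client supplies it."""
    return "POST" if key_is_autogenerated else "PUT"
```

With these defaults, entity_uri("orders", 12345) gives
http://example.com/db/orders/12345, matching the example in the list.
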
> The next big thing that a REST interface would need is some searching
> capability.  A starting point is to allow the user to specify that
> certain fields are searchable. For example, if they specify that the
> country field of the customers entity-set is searchable, then
> http://example.com/db/customers?country=US would return a listing of all
> customers with a country field equal to US.  The next step might be to
> allow a query parameter to be associated with an SQL expression. For
> example, we might want http://example.com/db/customers?min-age=18 to
> give us a list of all customers aged at least 18.   The configuration
> might have something like this:
> 
> <query-param name="min-age" type="int">
>  now - <field name="dateOfBirth"/> >= 18 years
> </query-param>
> 
> This would allow query parameters to compose properly with no extra work
> (e.g. http://example.com/db/customers?min-age=18&country=US would "just
> work").
> 
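One way the composition could work is to turn each query parameter into an SQL
fragment and join the fragments with AND.  The sketch below assumes this
design; the SEARCHABLE and QUERY_PARAMS tables, the build_query function, and
the SQLite date expression are all illustrative choices, not part of the
proposal.

```python
from urllib.parse import urlsplit, parse_qsl

# Fields the configuration marks as directly searchable (assumed).
SEARCHABLE = {"country"}

# Query parameters bound to an SQL expression with one placeholder,
# plus a converter for the declared parameter type (assumed layout).
QUERY_PARAMS = {
    "min-age": ("dateOfBirth <= date('now', '-' || ? || ' years')", int),
}


def build_query(table, url):
    """Translate the query string of url into a parameterized SELECT."""
    clauses, args = [], []
    for name, value in parse_qsl(urlsplit(url).query):
        if name in SEARCHABLE:
            clauses.append(f"{name} = ?")
            args.append(value)
        elif name in QUERY_PARAMS:
            sql, convert = QUERY_PARAMS[name]
            clauses.append(f"({sql})")
            args.append(convert(value))
        else:
            raise ValueError(f"unknown query parameter: {name}")
    where = " AND ".join(clauses)
    sql = f"SELECT * FROM {table}" + (f" WHERE {where}" if where else "")
    return sql, args
```

Because every parameter contributes an independent clause, a URL such as
http://example.com/db/customers?min-age=18&country=US needs no special
handling.  Values are always passed as bound parameters, never spliced into
the SQL text.
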
> We would also probably want a way to provide different views of the
> entities, e.g. that excluded certain fields.
> 
> By working at a relatively high level, we can automatically do
> several nice things for the user:
> 
> - we can automatically provide introspection facilities (e.g. WADL),
> complete with XSD and RELAX NG schemas
> 
> - we should (I think) be able to automatically generate ETags; this is
> important for cacheability and crucial for dealing with concurrent
> updates
> 
> - it should be a small step to get an Atom interface as well
> 
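As an illustration of the ETag point, one simple option is to hash a canonical
serialization of the entity's fields and compare it against the If-Match
header on a PUT.  This is a sketch under assumptions: the function names are
hypothetical, the row is modelled as a plain dict, and a real implementation
might prefer a version column or timestamp over hashing.

```python
import hashlib


def make_etag(row):
    """Derive a strong ETag from an entity's fields.

    Fields are sorted by name so the tag does not depend on column order.
    """
    canonical = "\x1f".join(f"{k}={row[k]!r}" for k in sorted(row))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f'"{digest[:16]}"'


def conditional_put_status(if_match, current_row):
    """Return the HTTP status a PUT would get for a given If-Match value.

    None means the client sent no If-Match header; this sketch allows
    the update unconditionally in that case.
    """
    if if_match is None or if_match == "*":
        return 200
    if if_match == make_etag(current_row):
        return 200
    return 412  # Precondition Failed: the entity changed since it was read
```

A client that read an order, edited it, and sent the PUT with the ETag it
received would get 412 if someone else had modified the row in between.
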
> So far I've focused on REST.  That's partly because I think we have a
> bit of a corporate REST deficit at the moment, and partly because I think
> it's easier to go from a REST interface to a WSDL (service-oriented)
> interface than vice-versa.  How might a WSDL interface be specified?  I
> envisage there being a number of built-in methods such as add,
> addMultiple, delete, deleteMatching, search, iterate, get which could
> apply to an entity or entity-set.  The basic idea would be that the user
> would identify which built-in methods are allowed for which entity-sets.
> Each built-in method would have some number of configurable parameters.
> For example, the user might specify that they want to enable the "add"
> method for the "customers" entity-set.  By default we might choose a
> WSDL operation name of addCustomer, but there would be a configurable
> parameter that would allow it to be changed to createCustomer.  There
> might be some configurable parameters at the entity-set level: for
> example, the singular noun to be used (e.g. so that you can have a table
> called "people" and get methods called "addPerson", "removePerson").
> 
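The operation-naming defaults described above might look something like the
following.  The function and its parameters are hypothetical, and stripping a
trailing "s" is only a naive fallback for when no singular noun is configured.

```python
def operation_name(method, entity_set, singular=None, override=None):
    """Default WSDL operation name for a built-in method on an entity-set.

    override: an explicitly configured operation name, used as-is.
    singular: the configured singular noun for the entity-set, if any.
    """
    if override:
        return override
    if singular is None:
        # Naive fallback: drop a trailing "s" ("customers" -> "customer").
        singular = entity_set[:-1] if entity_set.endswith("s") else entity_set
    return method + singular[0].upper() + singular[1:]
```

So operation_name("add", "customers") yields addCustomer, while a table called
"people" needs singular="person" to yield addPerson.
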
> Given the built-in method and the database schema it should be possible
> to automatically generate a tasteful default WSDL interface.  The user
> wouldn't need to worry about writing an XSD schema: even when the input
> XML is complex, the semantics of the built-in method together with the
> database schema should be enough to allow us to create the XSD for the
> user.  Apart from automatically generating the WSDL, another nice thing
> we should be able to do for the user in the WS-* world is automatically
> support WS-Transfer and WS-Enumeration.  Maybe we could even have a
> method that generates events when the database is modified (though this
> would require permission to create database triggers).
> 
> In some cases, the built-in methods may not be sufficient. I envisage
> providing two ways to go beyond this.  The first way would require XML
> and SQL skills but not programming skills.  This would be quite similar
> to what we have at the moment: the user would provide a fragment of SQL,
> perhaps an XSD for the output XML or more likely the input XML, perhaps
> an XPath to get the input XML into SQL parameters, perhaps an XSLT to
> get the SQL into the desired XML form.  The second way, which would
> require programming skills, would be to make the set of built-in methods
> extensible. The user would be able to extend the available built-in
> methods just by dropping in a jar file containing a class that
> implements a particular interface.  The tricky bit would be designing
> this interface: maybe it would work by generating SQL/XSD/XPath/XSLT, or
> maybe it would work completely differently.
> 
> In terms of tooling, I think this is declarative enough that it should
> be possible to create a nice, easy to use Ajax interface that works on
> the XML configuration file, which would be guided by an XML representation
> of the database schema.
> 
> This message is already rather long. I haven't talked about what I see
> as the problems with the current approach.  I can do that if people
> want.  The fundamental reason why I prefer the approach I've outlined
> above is that I think it's better for the specification to express as
> much as it can at as high a semantic level as possible. I don't think
> there's a big technical risk in the kind of approach I'm suggesting: it
> has a lot of conceptual similarities to object-relational mapping
> technologies, such as the Java Persistence API
> (http://java.sun.com/developer/technicalArticles/J2EE/jpa/).
> 
> BTW, if anybody's a bit rusty on databases, I would recommend this book:
> http://www.amazon.com/Database-Systems-Complete-Hector-Garcia-Molina/dp/0130319953/ (the Amazon customer reviews page has an amusing mixture of 1-star and 5-star reviews).
> 
> James