[Wsf-general] data services and resources
James Clark
james at wso2.com
Fri Apr 13 23:21:30 PDT 2007
The ADO.NET Entity Framework
http://msdn2.microsoft.com/en-us/library/aa697427(VS.80).aspx
seems to have some ideas that are relevant to this.
(It's scary how far Microsoft's language-integrated query work puts it
ahead of the Java world.)
James
On Mon, 2007-03-19 at 17:53 +0700, James Clark wrote:
> I've been wondering whether a rather different approach to specifying
> the data service might be preferable. I haven't worked this approach
> out in anything like the detail that our current approach has been
> worked out in, but I will try to explain the basic idea.
>
> The overall philosophy is to be higher-level, more declarative, easier
> to use, but less flexible and less powerful. With this philosophy it's
> not a goal that the user should be able to design an arbitrary web
> service or REST interface to the information in the database and then
> use the configuration file to specify that design. Instead the user
> gets to decide the kind of reading, writing and searching of the data
> that they require the web service/REST interface to provide and we
> automatically create a good-quality web service/REST interface that
> meets that requirement, together with some modest level of tweakability.
>
> The fundamental concept is an "entity-set". The data service would
> declare one or more named entity-sets. An entity-set is (surprise,
> surprise) a set of entities. Each entity has an identifier that
> uniquely identifies it within its entity-set. On the database side of
> things, for each entity set there would be a corresponding table; the
> primary key would correspond to the entity's identifier (for simplicity,
> at first I would expect we wouldn't handle multi-part primary keys).
> However, not every table corresponds to an entity-set. On the REST side
> of things for each entity there's a resource that directly corresponds
> to that entity; there may also be other resources that provide
> alternative views of the entity.
>
> The data service specification would declare one or more top-level named
> entity-sets. In my order database example, the top-level entity-sets
> might be named "products", "orders" and "customers". The name of the
> database table corresponding with a particular named entity set would
> obviously default to the name of the entity set.
>
> The second key part of this approach is dealing with things like the
> order_items table in my example, where information that is logically
> associated with one entity is in a separate database table from the
> entity-set to which the entity belongs. I think the way to handle this
> is to use the concept that a table that does not correspond to a
> top-level entity-set can be "owned by" a table that does. So for
> example, the order_items table would be owned by the orders tables. For
> some cases, it may be necessary to be explicit about how the rows of the
> owned table relate to the rows of the owning table, but my guess is that
> in most cases you can do the right thing by looking at the primary
> key/foreign key information in the database schema.
>
> I believe it's possible to provide a basic REST interface for many
> databases using just
>
> - the top-level entity-sets and their corresponding tables,
>
> - ownership relationships between tables, and
>
> - the database schema
>
> Obviously there are lots of different ways of providing a REST
> interface, but I think most of them can be intelligently defaulted (or
> even fixed). Let's assume we want to expose the REST interface at
> http://example.com/db/.
>
> - There needs to be an XML representation of an entity that both can be
> generated from the database and also allows the database to be updated
> from the XML representation. The database schema is enough to allow a
> reasonable default. Any customization facilities mustn't be so flexible
> that they inhibit automatic generation of an XML schema or mapping from
> the XML back to the database. The tricky bit will be handling ownership
> relationships. My guess is that you can mostly do the right thing by
> looking at the primary key/foreign keys. It should also be possible to
> automatically turn foreign keys into the appropriate URI because you can
> tell from the configuration where the URI for the resource corresponding
> to an entity is.
>
> - There needs to be a URI for the resource corresponding to each
> entity-set. This can be defaulted from the name of the entity-set: for
> example, the orders entity-set might be at
> http://example.com/db/orders/. A GET on that would provide a listing of
> all the entities in the entity-set. There would need to be a
> configurable limit on the number of entities returned by such a GET and
> a way to iterate over large entity sets (e.g. using queries to specify
> the range of the result to return). There should be some configuration
> that says what fields of the entity are returned in a GET on the
> entity-set: obviously the URI of the entity needs to be there; you might
> want just that, you might want a single title-like field (as in Atom) to
> be returned, or you might want all fields to be returned.
>
> - There needs to be a URI for the resource corresponding to each entity.
> This would default to the URI for the entity-set plus the canonical
> lexical representation of the primary key. For example,
> http://example.com/db/orders/12345. A GET on this would return the XML
> representation of the entity. The existing entity could be modified by
> doing a PUT on its URI. DELETE on the URI will delete the entity.
>
> - There are two ways that a new entity might be added: doing a POST on
> the _entity-set_ URI or doing a PUT on the _entity_ URI. It should be
> possible to automatically figure out which is the right way for a
> particular entity set based on whether the primary key is autogenerated
> (I think you can get this from the database schema): POST if it's
> autogenerated, PUT if it's not.
>
> The next big thing that a REST interface would need is some searching
> capability. A starting point is to allow the user to specify that
> certain fields are searchable. For example, if they specify that the
> country field of the customers entity-set is searchable. Then
> http://example.com/db/customers?country=US would return a listing of all
> customers with a country field equal to US. The next step might be to
> allow the a query parameter to be associated with an SQL expression. For
> example, we might want http://example.com/db/customers?min-age=18 to
> give us a list of all customers aged at least 18. The configuration
> might have something like this:
>
> <query-param name="min-age" type="int">
> <field name="dateOfBirth"/> - now > 18 years
> </query-param>
>
> This would allow query parameters to compose properly with no extra work
> (e.g. http://example.com/db/customers?min-age=18&country=US would "just
> work").
>
> We would also probably want a way to provide different views of the
> entities, e.g. that excluded certain fields.
>
> By working at a relatively high level, we can automatically can do
> several nice things for the user:
>
> - we can automatically provide introspection facilities (e.g. WADL),
> complete with XSD and RELAX NG schemas
>
> - we should (I think) be able to automatically generate ETags; this
> important for cacheability and crucial for dealing with concurrent
> updates
>
> - it should be a small step to get an Atom interface as well
>
> So far I've focused on REST. That's partly because I think we have a
> bit of corporate REST deficit at the moment, and partly because I think
> it's easy to go from a REST interface to a WSDL (service-oriented)
> interface than vice-versa. How might a WSDL interface be specified? I
> envisage there being a number of built-in methods such as add,
> addMultiple, delete, deleteMatching, search, iterate, get which could
> apply to an entity or entity-set. The basic idea would be that the user
> would identify which built-in methods are allowed for which entity-sets.
> Each built-in method would have some number of configurable parameters.
> For example, the user might specify that they want to enable the "add"
> method for the "customers" entity-set. By default we might choose a
> WSDL operation name of addCustomer, but there would be a configurable
> parameter that would allow it to be changed to createCustomer. There
> might be some configurable parameters at the entity-set level: for
> example, the singular noun to be used (e.g. so that you can have a table
> called "people" and get methods called "addPerson", "removePerson").
>
> Given the built-in method and the database schema it should be possible
> to automatically generate a tasteful default WSDL interface. The user
> wouldn't need to worry about writing an XSD schema: even when the input
> XML is complex, the semantics of the builtin method together with the
> database schema should be enough to allow us to create the XSD for the
> user. Apart from automatically generating the WSDL, another nice thing
> we should be able to do for the user in the WS-* world is automatically
> support WS-Transfer and WS-Enumeration. Maybe we could even have a
> method that generates events when the database is modified (though this
> would require permission to create database triggers).
>
> In some cases, the built-in methods may not be sufficient. I envisage
> providing two ways to go beyond this. The first way would require XML
> and SQL skills but not programming skills. This would be quite similar
> to what we have at the moment: the user would provide a fragment of SQL,
> perhaps an XSD for the output XML or more likely the input XML, perhaps
> an XPath to get the input XML into SQL parameters, perhaps an XSLT to
> get the SQL into the desired XML form. The second way, which would
> require programming skills, would be to make the set of built-in methods
> extensible. The user would be able to extend the available built-in
> methods just by dropping in a jar file containing a class that
> implements a particular interface. The tricky bit would be designing
> this interface: maybe it would work by generating SQL/XSD/XPath/XSLT, or
> maybe it would work completely differently.
>
> In terms of tooling, I think this is declarative enough that it should
> be possible to create a nice, easy to use Ajax interface that works on
> the XML configuration file, which would guided by an XML representation
> of the database schema.
>
> This message is already rather long. I haven't talked about what I see
> as the problems with the current approach. I can do that if people
> want. The fundamental reason why I prefer the approach I've outlined
> above is that I think it's better for the specification to express as
> much as it can at as high a semantic level as possible. I don't think
> there's a big technical risk in the kind of approach I'm suggesting: it
> has a lot of conceptual similarities to object-relational mapping
> technologies, such as the Java Persistence API
> (http://java.sun.com/developer/technicalArticles/J2EE/jpa/).
>
> BTW, if anybody's a bit rusty on databases, I would recommend this book:
> http://www.amazon.com/Database-Systems-Complete-Hector-Garcia-Molina/dp/0130319953/ (the Amazon customer reviews page has an amusing mixture of 1-star and 5-star reviews).
>
> James
>
>
>
>
>
>
>
> _______________________________________________
> Wsf-general mailing list
> Wsf-general at wso2.org
> http://wso2.org/cgi-bin/mailman/listinfo/wsf-general
More information about the Wsf-general
mailing list