[mashup-dev] ScraperService or ScraperHostObject?

Ruwan Linton ruwan at wso2.com
Sat Sep 1 05:21:33 PDT 2007


Keith,

----- Original Message ----- 
From: "Keith Chapman" <keith at wso2.com>
To: <mashup-dev at wso2.org>
Sent: Saturday, September 01, 2007 9:54 AM
Subject: Re: [mashup-dev] ScraperService or ScraperHostObject?


> Sanjiva Weerawarana wrote:
>> How about having a scraper service and a standard stub for it always?
> This can be done, but it does not give us the full advantage of WebHarvest 
> (WH). The following are a few limitations I have observed.
>
> 1. WH allows you to include other config files using an include tag (this 
> cannot be used in the ScraperService scenario).
> 2. Cannot read and write to the file system directly through WH using its 
> file tag.
> 3. A WH config can define variables using the var-def tag, and during 
> execution of the config it will assign various parts of the data to these 
> variables (depending on the way the config is written). The user can then 
> query the result using the defined variables (even our host object does 
> not support this scenario yet, though it can be implemented trivially). 
> This cannot be done in the ScraperService case either.
>
> I'm more inclined to the HostObject case, but it's not only XPath that we 
> will be losing. We will be losing XQuery and XSLT too.
>
> I'm not exactly sure of the consequences of having Saxon in the global 
> class path. That would help us get 100% of WH using a host object, but it 
> is going to affect all the other XSLT processing that we have (?stub, 
> ?try-it).
>
> Note: Saxon issues a warning when processing XSLT 1.0 docs, saying that an 
> XSLT 1.0 doc was processed using an XSLT 2.0 processor.

AFAIK, Saxon has its own XSLT 1.0 processor; maybe the code instructs Saxon 
to use the XSLT 2.0 processor.
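
To check which processor actually gets picked up, something like the sketch 
below might help. It assumes the XSLT code goes through the standard JAXP 
lookup (TransformerFactory.newInstance()), which I have not verified for our 
?stub and ?try-it paths. The class name XsltFactoryCheck, its helper methods 
and the sample stylesheet are made up for illustration. JAXP consults the 
javax.xml.transform.TransformerFactory system property and service providers 
on the class path before falling back to the platform default, so a Saxon jar 
on the global class path can change the result for everything.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical diagnostic class; not part of the mashup server code base.
public class XsltFactoryCheck {

    // A tiny XSLT 1.0 stylesheet that copies the text of <greeting> to the output.
    static final String SAMPLE_XSLT =
        "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"text\"/>"
        + "<xsl:template match=\"/\"><xsl:value-of select=\"greeting\"/></xsl:template>"
        + "</xsl:stylesheet>";

    // Returns the class name of whichever factory the JAXP lookup resolves to.
    // With Saxon on the class path this would typically be Saxon's factory
    // rather than the JDK default.
    static String resolvedFactoryName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    // Runs the given stylesheet over the given XML with the resolved factory.
    static String transform(String xslt, String xml) {
        try {
            TransformerFactory factory = TransformerFactory.newInstance();
            Transformer transformer =
                factory.newTransformer(new StreamSource(new StringReader(xslt)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("XSLT processing failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Resolved factory: " + resolvedFactoryName());
        System.out.println(transform(SAMPLE_XSLT, "<greeting>hello</greeting>"));
    }
}
```

Running this once with and once without Saxon on the class path should show 
whether the resolved factory changes. If it does, code that must stay on the 
old processor could name its factory class explicitly instead of relying on 
the lookup.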

Thanks,
Ruwan

>
> Thanks,
> Keith.
>> That way it feels like a host object but the object is simply a stub to 
>> the service.
>>
>> Sanjiva.
>>
>> Jonathan Marsh wrote:
>>> If XPath doesn't work in the host object, it doesn't give us much 
>>> choice.
>>> But in all other respects I think a host object is preferable.
>>>
>>> Jonathan Marsh - http://www.wso2.com - 
>>> http://auburnmarshes.spaces.live.com
>>>
>>>> -----Original Message-----
>>>> From: mashup-dev-bounces at wso2.org [mailto:mashup-dev-bounces at wso2.org]
>>>> On Behalf Of Keith Chapman
>>>> Sent: Friday, August 31, 2007 4:34 AM
>>>> To: mashup-dev at wso2.org
>>>> Subject: [mashup-dev] ScraperService or ScraperHostObject?
>>>>
>>>> Hi all,
>>>>
>>>> I managed to get the scraper service working, which means that we have
>>>> both a host object and a service at the moment. Both mechanisms have
>>>> their advantages and disadvantages.
>>>>
>>>> Having as a host object.
>>>> ---------------------------------------
>>>>
>>>> 1. Convenient to use
>>>> 2. Cannot support XPath (unless we have Saxon in our class path)
>>>> 3. Can directly scrape a page and save parts of it to the file system
>>>> through the WebHarvest config file itself.
>>>>
>>>>
>>>> Having a service.
>>>> ----------------------------
>>>>
>>>> 1. Can use XPath (because we can include Saxon in the lib of the service,
>>>> which does not affect the global classpath).
>>>> 2. Cannot use file elements in config.
>>>>
>>>>
>>>> We cannot have both of these implementations either (unless we have
>>>> Saxon in the global classpath). They can't coexist because the first
>>>> requires WebHarvest to be on the global classpath and the second
>>>> requires it to be inside the lib of the service.
>>>>
>>>> Are we going to decide to use one of these?
>>>>
>>>> What do you guys think?
>>>>
>>>> Thanks,
>>>> Keith.
>>>>
>>>> _______________________________________________
>>>> Mashup-dev mailing list
>>>> Mashup-dev at wso2.org
>>>> http://www.wso2.org/cgi-bin/mailman/listinfo/mashup-dev