load test

underscore_dot.yahoo.com's picture
Hi there. I'm testing BPS-2.0.0 and ESB-3.0.0 on my PC (Windows 7, 2 GB RAM, Intel Core Duo, jdk1.6.0_21). The test consists of a BPEL process that consumes a data service running on the ESB. The data service offers 4 operations: generate id, create record, read record and delete record.

I first launched a load test against the data service with 100 concurrent users invoking the "generate id" operation. It performed fast and stable. Then I tried the same with the BPEL process running on BPS-2.0.0, but it soon started to throw "No ManagedConnections available within configured blocking timeout ( 30000 [ms] )" (see attached timeout-exception.txt). I thought that 100 users might be too many, so I tried with 10 concurrent users. It reached the same situation after 21 minutes. This makes me think that connections are somehow not being freed.

After researching how to optimize performance I found the following alternatives:
- Increase pool timeout and/or size (I could not find out exactly how to do this, though).
- Process instance data cleanup: http://ode.apache.org/instance-data-cleanup.html

So I tried instance data cleanup by configuring deploy.xml with <cleanup on="always" />. The result was that every service invocation failed with the following exception (see attached cleanup-exception.txt): org.apache.openjpa.persistence.ArgumentException: Encountered deleted object "org.apache.ode.dao.jpa.MessageDAOImpl@743dae" in persistent field "org.apache.ode.dao.jpa.MessageExchangeDAOImpl._request" of managed object "org.apache.ode.dao.jpa.MessageExchangeDAOImpl@1147d1a" during flush.

As the timeout exception seems related to access to the H2 database where BPS stores BPEL process state, I configured my BPEL process deploy.xml with <in-memory>true</in-memory> so that, supposedly, no state would be stored in the database. I ran this last test with 100 concurrent users and saw a big increase in performance, although it degraded over time and from time to time threw "Call to testWSO2Service.process timed out(30000 ms). {org.wso2.carbon.bpel.ode.integration.BPELService}" (see attached in-memory-timeout.txt).

However, in-memory processes probably won't be an option for me. I think there's a problem with how the state database is being accessed. If not, how should I fine-tune the configuration to make it perform better? Is there a known issue that matches the above behaviour? Is it related to ODE, WSO2, ...? Is it related to the environment where ESB and BPS are running? Many thanks in advance.
Attachments (size):
- timeout-exception.txt (4.93 KB)
- cleanup-exception.txt (32.58 KB)
- in-memory-timeout.txt (2.95 KB)
- test.zip (9.05 KB)
- loadtest.zip (1.97 KB)
- results-in-memory.zip (121.15 KB)
- results-persisted-state.zip (294 KB)
- results-Persisted10u86400s400mc4000mcph.zip (150.9 KB)
- communication-link-failure-exception.txt (19.02 KB)
- results-Persisted-NoEvents40u3600s400mc4000mcph.zip (65.82 KB)
- BPS - Error sending message to Axis2 for ODE mex.txt (5.37 KB)
- ESB - Two services cannot have same name.txt (2.44 KB)
- results-Persisted-NoEvents10u86400s400mc4000mcph.zip (423.9 KB)
- results-Persisted-activityLifecycleEvents10u43200s400mc4000mcph.zip (216.17 KB)
- results-Persisted-activityLifecycleEvents10u43200s400mc4000mcph-2.zip (313.4 KB)
waruna's picture

The embedded H2 db may not support high concurrency, so you may want to try configuring an external database [1] (e.g. MySQL). To configure instance clean-up, refer to [2].
[1] - http://wso2.org/project/bps/2.0.0/docs/user_guide.html#Configuring-Ext-DS
[2] - http://wso2.org/project/bps/2.0.0/docs/user_guide.html#Using-Process-Instance-Cleanup
milinda's picture

Please share the test artifacts if possible

Try increasing the max connections per host and max total connections for the multi-threaded HTTP connection manager using the corresponding configuration elements in bps.xml. Uncomment those lines and change the parameters to suit your scenario.

For example, say your process calls 5 external services and the maximum concurrency you expect is 10. If all those external services are located on the same host, there will be 50 connections to that host at a given time. This can vary depending on the situation, and the number above is just a rough calculation. So we should allow at least 50 max connections per host, and we should allow more total connections as well. For example, 50 max connections per host and 500 total connections will be enough for the situation I described. But you have to calculate the best value for your situation.

Also, please attach the test artifacts to this post if possible. Thanks, Milinda
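The rough calculation in the reply above can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name, its parameters and the `headroom` factor are hypothetical and are not BPS settings. The reply only says that the total limit should be larger than the per-host limit; the factor of 10 is an assumption chosen to reproduce the 50/500 example.

```python
def estimate_connection_limits(services_per_process, peak_concurrency, headroom=10):
    """Rough lower bounds for the HTTP connection manager limits.

    Assumes the worst case described in the reply: every concurrent
    process instance holds one connection to each external service,
    and all of those services are located on the same host.
    """
    per_host = services_per_process * peak_concurrency
    # headroom is a hypothetical multiplier (not a BPS parameter) used
    # only to pick a total limit larger than the per-host limit.
    total = per_host * headroom
    return per_host, total


per_host, total = estimate_connection_limits(services_per_process=5, peak_concurrency=10)
print(f"max connections per host >= {per_host}, max total connections >= {total}")
```

The result is a starting point only. As the reply says, the best values depend on the actual scenario and should be confirmed with a load test.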
underscore_dot.yahoo.com's picture

test artifacts

Hi, Milinda. See test.zip among the attached files. It includes:
- A MySQL database dump.
- A data service, pruebaDataService.dbs, which relies on that MySQL database.
- The BPEL service, testWSO2.zip, which invokes operations on the data service.
Regards.
milinda's picture

We'll try out this and let you know the results

We'll try out your scenarios and let you know which optimizations you can make.
underscore_dot.yahoo.com's picture

test artifacts

I have also attached:
- loadtest.zip: contains a SoapUI 3.5.1 load test.
- results-in-memory.zip: two Excel files containing the results of two tests:
  * results-InMemory20u3600s120mc1200mcph.xls: in-memory process invoked by 20 concurrent users for 1 h, with maxConnectionsPerHost=120 and maxTotalConnections=1200.
  * results-InMemory25u3600s120mc1200mcph.xls: same as the previous one, except that it was run with 25 concurrent users. Note that this one had 11 errors.
john.campaner.s1.com's picture

Similar Problems With External Database

We are having similar problems using a DB2 external database. We have also tried similar fixes without much success. We did find however that the load test was highly successful when using the process in "in-memory" mode. Any guidance here would be helpful.
underscore_dot.yahoo.com's picture

test artifacts

I'm now attaching new test results, this time using MySQL persisted state. In results-persisted-state.zip you will find:
- results-Persisted20u3600s120mc1200mcph.xls: ok
- results-Persisted30u3600s120mc1200mcph.xls: ok
- results-Persisted40u3600s120mc1200mcph.xls: with errors
- results-Persisted40u3600s400mc4000mcph.xls: ok
I can no longer see the exceptions I used to see while using H2. Response time is higher than with in-memory, but variation is lower. Progress is more ... steady. Regards.
milinda's picture

This is the expected behavior

The above results are normal when persistence is enabled. The high response time is caused by the large number of database operations that need to be carried out for persistence. You can improve performance by disabling event generation [1] and by configuring instance cleanup as described in the following two blog posts [2][3].
[1] http://ode.apache.org/ode-execution-events.html
[2] http://blog.mpathirage.com/2010/05/31/wso2-bps-instance-cleanup-task/
[3] http://blog.mpathirage.com/2010/05/10/wso2-bps-process-instance-data-cleanup/
underscore_dot.yahoo.com's picture

test artifacts

More test results (see results-Persisted10u86400s400mc4000mcph.zip). This time a 1-day test with 10 concurrent users. It failed. The BPS server was left in an inconsistent state, no longer responding to requests. I had to recreate the state database and redeploy the test BPEL process.

The BPS server was throwing exceptions like the following (see attached communication-link-failure-exception.txt): org.apache.ode.scheduler.simple.DatabaseException: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: No operations allowed after connection closed.Connection was implicitly closed by the driver.

I still don't know whether the exception was caused by an inappropriate MySQL configuration, a MySQL limitation, ... I found some 15 million records in table ODE_EVENT.
underscore_dot.yahoo.com's picture

I'll definitely try to limit the number of events logged to the ODE database. After my last attempt at a 1-day, 10-user run I found that table ODE_EVENT had some 15 million records.

Process instance cleanup is something I've already tried and mentioned in my first post. It simply doesn't work for me, as it fails with the following exception for all requests: org.apache.openjpa.persistence.ArgumentException: Encountered deleted object "org.apache.ode.dao.jpa.MessageDAOImpl@743dae" in persistent field "org.apache.ode.dao.jpa.MessageExchangeDAOImpl._request" of managed object "org.apache.ode.dao.jpa.MessageExchangeDAOImpl@1147d1a" during flush.

As for the instance cleanup task, I still have not been able to run my test continuously for more than 1 day without it failing. I understand you cannot configure this task to run more than once a day. Am I right? Regards.
underscore_dot.yahoo.com's picture

test artifacts

And now, in results-Persisted-NoEvents40u3600s400mc4000mcph.zip, the results after removing all events by configuring deploy.xml with <process-events generate="none"/>. Compared to the results in results-Persisted40u3600s400mc4000mcph.xls:
- Throughput has increased from 3.1 tps to 7.78 tps.
- Average response time has decreased from 12025.56 ms to 4347.57 ms.
- Maximum response time has decreased from 28277 ms to 8473 ms.
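To put the comparison above in relative terms, here is a small Python sketch that computes the percentage change for each metric. The numbers are the ones quoted in the post; the helper name `relative_change` and the dictionary layout are only illustrative.

```python
def relative_change(before, after):
    """Percentage change from before to after (negative means a decrease)."""
    return (after - before) / before * 100.0


# (with all events, with generate="none"), values as reported in the post
metrics = {
    "throughput (tps)": (3.1, 7.78),
    "average response time (ms)": (12025.56, 4347.57),
    "maximum response time (ms)": (28277, 8473),
}

for name, (before, after) in metrics.items():
    change = relative_change(before, after)
    print(f"{name}: {before} -> {after} ({change:+.1f}%)")
```

By this calculation throughput rises by roughly 150%, while average and maximum response times drop by roughly 64% and 70% respectively.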
underscore_dot.yahoo.com's picture

test artifacts

A 10-user/1-day test run without persisting any events: failed. I've uploaded the results in results-Persisted-NoEvents10u86400s400mc4000mcph.zip. I found exceptions in both the BPS and ESB server log files.

- This piece of the ESB log was looping continuously. I think it started when the test started reporting errors (see attached ESB - Two services cannot have same name.txt):
2010-08-05 08:16:19,502 [-] [Timer-7] INFO DeploymentEngine org.apache.axis2.deployment.DeploymentException: Two services cannot have same name. A service with the echo name already exists in the system.
2010-08-05 08:16:19,541 [-] [Timer-7] ERROR ServiceDeployer The version service, which is not valid, caused Two services cannot have same name. A service with the version name already exists in the system.
org.apache.axis2.AxisFault: Two services cannot have same name. A service with the version name already exists in the system.

- The BPS server was logging the following (see attached BPS - Error sending message to Axis2 for ODE mex.txt):
[2010-08-05 08:12:23,198] ERROR - Error sending message to Axis2 for ODE mex {PartnerRoleMex#hqejbhcnphr5hk717dbuj1 [PID null] calling org.apache.ode.il.epr.WSAEndpoint@b3743c.SiguienteId(...)} {org.wso2.carbon.bpel.ode.integration.ExternalService}
org.apache.axis2.AxisFault: The input stream for an incoming message is null.
...
[2010-08-05 08:12:23,211] INFO - ActivityRecovery: Registering activity 17, failure reason: Error sending message to Axis2 for ODE mex {PartnerRoleMex#hqejbhcnphr5hk717dbuj1 [PID null] calling org.apache.ode.il.epr.WSAEndpoint@b3743c.SiguienteId(...)} on channel 42 {org.apache.ode.bpel.engine.BpelRuntimeContextImpl}
[2010-08-05 08:12:23,211] INFO - ActivityRecovery: Registering activity 17, failure reason: Error sending message to Axis2 for ODE mex {PartnerRoleMex#hqejbhcnphr5hk717dbuj1 [PID null] calling org.apache.ode.il.epr.WSAEndpoint@b3743c.SiguienteId(...)} on channel 42 {org.apache.ode.bpel.engine.BpelRuntimeContextImpl}
[2010-08-05 08:12:23,759] ERROR - Call to testWSO2Service.process timed out(30000 ms). {org.wso2.carbon.bpel.ode.integration.BPELService}
java.util.concurrent.TimeoutException
...

I will need further configuration tuning, whether for BPS, ESB or MySQL. What do you suggest? Kind regards.
milinda's picture

Will try to find the issues in BPS

We are currently working heavily on Stratos BPS and making major changes to the WSO2 BPS core, so I haven't had time to test your scenario. I'll try to find the issues ASAP and provide you with a solution.
underscore_dot.yahoo.com's picture

test artifacts

Test results for a 10-user/12 h test run: failed. This time I persisted only activityLifecycle events (see results-Persisted-activityLifecycleEvents10u43200s400mc4000mcph.zip). Unfortunately I did not check the database status, but next time I'll check it with SHOW TABLE STATUS FROM <database_name>; I think the only solution is to make instance cleanup work.
underscore_dot.yahoo.com's picture

test artifacts

I've repeated the last 10-user/12 h test run: failed. This time I started with an empty ODE process state database. results-Persisted-activityLifecycleEvents10u43200s400mc4000mcph-2.zip includes both the test results and the output of SHOW TABLE STATUS FROM ODE. Table ode_event reached almost 15 GB and its index 500 MB. Compared to the previous test run, most of the errors happened at the same point in time.
underscore_dot.yahoo.com's picture

ODE version

Hi, Milinda. In http://ode.apache.org/instance-data-cleanup.html I read "This feature is only available in Ode 1.3 or later". Which ODE version is BPS 2.0 running? Many thanks.
milinda's picture

We used ODE 2.x experimental

We used ODE 2.x experimental for BPS 2.0.0 and are currently working on migrating to the current trunk. But process instance cleanup is available in BPS 2.0.0 too.
underscore_dot.yahoo.com's picture

- I've been trying different combinations of the cleanup configuration, and the most I was able to configure without it throwing any exceptions is the one below (I still have to run a load test on it).
<cleanup on="success">
  <!--category>instance</category>
  <category>variables</category>
  <category>messages</category-->
  <category>correlations</category>
  <category>events</category>
</cleanup>

- I've also tried the BPS cleanup configuration. I can see this in the log file:
[2010-08-30 12:00:11,007] INFO - Running instance clean-up task... {org.wso2.carbon.bpel.ode.integration.instancecleanup.InstanceCleanupTask}
[2010-08-30 12:00:11,454] INFO - Instance clean-up execution completed... {org.wso2.carbon.bpel.ode.integration.instancecleanup.InstanceCleanupTask}
Despite this, I cannot see any data being cleaned. I ran a 2-day load test last weekend (with <process-events generate="all"/>) and it ended up filling a 500 GB disk (MySQL's InnoDB ibdata1 file was 400 GB!!).
underscore_dot.yahoo.com's picture

test artifacts

I have finished a 10-user/2-day test with apparently no errors. At least SoapUI did not report any. This time I configured instance cleanup as in http://wso2.org/forum/thread/10239#comment-15448, that is, cleaning process events from the ODE database. I'd upload the results file but, as the file upload feature is no longer available, I'll post the statistics here:
min: 432
max: 24803
avg: 1437.79
cnt: 789735
tps: 4.57
bytes: 277196985
bps: 1604
err: 0
MySQL's InnoDB ibdata1 reached 22.7 GB. Still too big.

There is one thing I noticed for which I still need an explanation. The number of requests sent to BPS was 789735 (see the cnt statistic), but SELECT count(*) FROM `ode`.`ode_process_instance`; returns 782069. That is, there were 7666 fewer process instances than requests sent to BPS. There might have been problems while persisting process state. SoapUI did not report any errors (timeout, SOAP fault, ...), though. Regards.
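As a cross-check of the figures in the post above, the following Python sketch recomputes the rates and the gap between requests and stored process instances. It assumes the test ran for exactly 2 days (172800 s), which the post does not state precisely. The variable names are illustrative.

```python
requests_sent = 789_735      # "cnt" reported by SoapUI
total_bytes = 277_196_985    # "bytes" reported by SoapUI
instances_in_db = 782_069    # SELECT count(*) FROM ode.ode_process_instance

# Assumption: the test lasted exactly 2 days; the post only says "2day".
test_duration_s = 2 * 24 * 3600

tps = requests_sent / test_duration_s
bps = total_bytes / test_duration_s
missing = requests_sent - instances_in_db
missing_pct = missing / requests_sent * 100

print(f"tps: {tps:.2f}")
print(f"bps: {bps:.0f}")
print(f"requests without a stored instance: {missing} ({missing_pct:.2f}% of requests)")
```

Under that assumption the recomputed tps and bps agree with the values SoapUI reported, and the 7666 missing instances amount to a little under 1% of all requests.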
waruna's picture

We found an issue in deleting messages. Even though the messages category is enabled, cleanup only deletes message routes, not the messages themselves. We are working on a fix for this issue.
underscore_dot.yahoo.com's picture

Issue link

I guess you will open an issue in the issue tracker. Would you post its link in this thread? Regards.
waruna's picture

You may track the issue through: https://wso2.org/jira/browse/CARBON-7698