Ticket #801 (closed task: fixed)

Opened 9 months ago

Last modified 5 months ago

Test: Overload Mulgara with TCP connections to test behavior of RMI

Reported by: russ Assigned to: ronald
Priority: critical Milestone: 0.9.0
Component: topaz-mulgara Version: 0.8.2.1-SNAPSHOT
Keywords: error Cc:
Blocking: Blocked By:

Description

plosone.log error.

how do i troubleshoot/trace back to root cause?

2008-02-27 05:02:06,384 ERROR ItqlStore(PLoSClinicalTrials)> Failed to close connection after rollback failure [TP-Processor136 org.topazproject.otm.stores.\
ItqlStore]
AxisFault
 faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
 faultSubcode:
 faultString: java.net.ConnectException: Connection refused
 faultActor:
 faultNode:
 faultDetail:
        {http://xml.apache.org/axis/}stackTrace:java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
        at java.net.Socket.connect(Socket.java:516)
        at sun.reflect.GeneratedMethodAccessor35.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.apache.axis.components.net.DefaultSocketFactory.create(DefaultSocketFactory.java:153)
        at org.apache.axis.components.net.DefaultSocketFactory.create(DefaultSocketFactory.java:120)
        at org.apache.axis.transport.http.HTTPSender.getSocket(HTTPSender.java:191)
        at org.apache.axis.transport.http.HTTPSender.writeToSocket(HTTPSender.java:404)
        at org.apache.axis.transport.http.HTTPSender.invoke(HTTPSender.java:138)
        at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
        at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
        at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
        at org.apache.axis.client.AxisClient.invoke(AxisClient.java:165)
        at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
        at org.apache.axis.client.Call.invoke(Call.java:2767)
        at org.apache.axis.client.Call.invoke(Call.java:2443)
        at org.apache.axis.client.Call.invoke(Call.java:2366)
        at org.apache.axis.client.Call.invoke(Call.java:1812)
        at org.topazproject.mulgara.itql.service.ItqlInterpreterBeanServicePortSoapBindingStub.close(ItqlInterpreterBeanServicePortSoapBindingStub.java:238)
        at org.topazproject.mulgara.itql.ItqlHelper.close(ItqlHelper.java:393)
        at org.topazproject.otm.stores.ItqlStore$ItqlStoreConnection.abort(ItqlStore.java:917)
        at org.topazproject.otm.stores.ItqlStore$ItqlStoreConnection.rollback(ItqlStore.java:901)
        at org.topazproject.otm.Transaction.rollback(Transaction.java:85)
        at org.topazproject.otm.util.TransactionHelper.doInTxE(TransactionHelper.java:112)
        at org.plos.xacml.cond.OtmQueryFunction.executeQuery(OtmQueryFunction.java:71)
        at org.plos.xacml.cond.DBQueryFunction.evaluate(DBQueryFunction.java:149)
        at com.sun.xacml.cond.Apply.evaluate(Unknown Source)
        at org.plos.xacml.cond.CachedBagFunction.evaluate(CachedBagFunction.java:161)
        at com.sun.xacml.cond.Apply.evaluate(Unknown Source)
        at com.sun.xacml.cond.FunctionBase.evalArgs(Unknown Source)
        at com.sun.xacml.cond.GeneralBagFunction.evaluate(Unknown Source)
        at com.sun.xacml.cond.Apply.evaluate(Unknown Source)
        at com.sun.xacml.cond.FunctionBase.evalArgs(Unknown Source)
        at com.sun.xacml.cond.ComparisonFunction.evaluate(Unknown Source)
        at com.sun.xacml.cond.Apply.evaluate(Unknown Source)
        at com.sun.xacml.cond.LogicalFunction.evaluate(Unknown Source)
        at com.sun.xacml.cond.Apply.evaluate(Unknown Source)
        at com.sun.xacml.Rule.evaluate(Unknown Source)
        at com.sun.xacml.combine.FirstApplicableRuleAlg.combine(Unknown Source)
        at com.sun.xacml.AbstractPolicy.evaluate(Unknown Source)
        at com.sun.xacml.PolicyReference.evaluate(Unknown Source)
        at com.sun.xacml.combine.FirstApplicablePolicyAlg.combine(Unknown Source)
        at com.sun.xacml.AbstractPolicy.evaluate(Unknown Source)
        at com.sun.xacml.PDP.evaluateContext(Unknown Source)
        at com.sun.xacml.PDP.evaluate(Unknown Source)
        at org.plos.xacml.DenyBiasedPEP.evaluate(DenyBiasedPEP.java:120)
        at org.plos.xacml.AbstractSimplePEP.checkAccess(AbstractSimplePEP.java:132)
        at org.plos.xacml.AbstractSimplePEP.checkAccess(AbstractSimplePEP.java:148)
        at org.plos.article.service.ArticleOtmService.getObjectInfo(ArticleOtmService.java:467)
        at org.plos.article.service.ArticleOtmService$$FastClassByCGLIB$$45a772ae.invoke(<generated>)
        at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:149)
        at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:695)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:144)
        at org.springframework.aop.support.DelegatingIntroductionInterceptor.doProceed(DelegatingIntroductionInterceptor.java:131)
        at org.springframework.aop.support.DelegatingIntroductionInterceptor.invoke(DelegatingIntroductionInterceptor.java:119)
        at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:166)
        at org.springframework.aop.framework.Cglib2AopProxy$DynamicAdvisedInterceptor.intercept(Cglib2AopProxy.java:630)
        at org.plos.article.service.ArticleOtmService$$EnhancerByCGLIB$$5a59a63c.getObjectInfo(<generated>)
        at org.plos.article.action.FetchObjectAction.fetchFirstObject(FetchObjectAction.java:63)
        at sun.reflect.GeneratedMethodAccessor638.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at com.opensymphony.xwork2.DefaultActionInvocation.invokeAction(DefaultActionInvocation.java:404)
        at com.opensymphony.xwork2.DefaultActionInvocation.invokeActionOnly(DefaultActionInvocation.java:267)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:229)
        at org.plos.user.EnsureUserAccountInterceptor.intercept(EnsureUserAccountInterceptor.java:50)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at org.plos.user.UserAccountsInterceptor.intercept(UserAccountsInterceptor.java:77)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at org.plos.journal.SessionFiltersInterceptor.intercept(SessionFiltersInterceptor.java:46)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at com.opensymphony.xwork2.interceptor.DefaultWorkflowInterceptor.doIntercept(DefaultWorkflowInterceptor.java:221)
        at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:86)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at com.opensymphony.xwork2.validator.ValidationInterceptor.doIntercept(ValidationInterceptor.java:150)
        at org.apache.struts2.interceptor.validation.AnnotationValidationInterceptor.doIntercept(AnnotationValidationInterceptor.java:48)
        at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:86)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at com.opensymphony.xwork2.interceptor.ConversionErrorInterceptor.intercept(ConversionErrorInterceptor.java:123)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at com.opensymphony.xwork2.interceptor.ParametersInterceptor.doIntercept(ParametersInterceptor.java:167)
        at com.opensymphony.xwork2.interceptor.MethodFilterInterceptor.intercept(MethodFilterInterceptor.java:86)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at com.opensymphony.xwork2.interceptor.StaticParametersInterceptor.intercept(StaticParametersInterceptor.java:105)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at org.apache.struts2.interceptor.FileUploadInterceptor.intercept(FileUploadInterceptor.java:207)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at com.opensymphony.xwork2.interceptor.ModelDrivenInterceptor.intercept(ModelDrivenInterceptor.java:74)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
        at com.opensymphony.xwork2.DefaultActionInvocation.invoke(DefaultActionInvocation.java:221)
        at org.apache.struts2.interceptor.debugging.DebuggingInterceptor.intercept(DebuggingInterceptor.java:206)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:224)
        at com.opensymphony.xwork2.DefaultActionInvocation$2.doProfiling(DefaultActionInvocation.java:223)
        at com.opensymphony.xwork2.util.profiling.UtilTimerStack.profile(UtilTimerStack.java:455)
...
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:773)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:895)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
        at java.lang.Thread.run(Thread.java:595)

        {http://xml.apache.org/axis/}hostname:plostopaz01.localdomain

java.net.ConnectException: Connection refused
        at org.apache.axis.AxisFault.makeFault(AxisFault.java:101)
        at org.apache.axis.transport.http.HTTPSender.invoke(HTTPSender.java:154)
        at org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
        at org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
        at org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
        at org.apache.axis.client.AxisClient.invoke(AxisClient.java:165)
        at org.apache.axis.client.Call.invokeEngine(Call.java:2784)
        at org.apache.axis.client.Call.invoke(Call.java:2767)
        at org.apache.axis.client.Call.invoke(Call.java:2443)
        at org.apache.axis.client.Call.invoke(Call.java:2366)
        at org.apache.axis.client.Call.invoke(Call.java:1812)
        at org.topazproject.mulgara.itql.service.ItqlInterpreterBeanServicePortSoapBindingStub.close(ItqlInterpreterBeanServicePortSoapBindingStub.java:238)
        at org.topazproject.mulgara.itql.ItqlHelper.close(ItqlHelper.java:393)
        at org.topazproject.otm.stores.ItqlStore$ItqlStoreConnection.abort(ItqlStore.java:917)
        at org.topazproject.otm.stores.ItqlStore$ItqlStoreConnection.rollback(ItqlStore.java:901)
        at org.topazproject.otm.Transaction.rollback(Transaction.java:85)
        at org.topazproject.otm.util.TransactionHelper.doInTxE(TransactionHelper.java:112)
        at org.plos.xacml.cond.OtmQueryFunction.executeQuery(OtmQueryFunction.java:71)
        at org.plos.xacml.cond.DBQueryFunction.evaluate(DBQueryFunction.java:149)
        at com.sun.xacml.cond.Apply.evaluate(Unknown Source)
...
        at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:773)
        at org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
        at org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:895)
        at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
        at java.lang.Thread.run(Thread.java:595)
Caused by: java.net.ConnectException: Connection refused
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
        at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
        at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)

Dependency Graph

Change History

02/27/08 11:27:57 changed by russ

here are the next few related loglines:

2008-02-27 05:02:06,394 WARN  TransactionHelper(PLoSClinicalTrials)> rollback failed [TP-Processor136 org.topazproject.otm.util.TransactionHelper]
org.topazproject.otm.OtmException: error rolling back transaction
Caused by: java.net.ConnectException: Connection refused
2008-02-27 05:02:06,396 WARN  DBQueryFunction(PLoSClinicalTrials)> query '
                select obj.state from ObjectInfo obj where obj.id = :p1 ;
              ' execution failed. [TP-Processor136 org.plos.xacml.cond.DBQueryFunction]
org.plos.xacml.cond.DBQueryFunction$QueryException: query '
                select obj.state from ObjectInfo obj where obj.id = :p1 ;
              ' execution failed.
Caused by: org.topazproject.otm.OtmException: error starting transaction
Caused by: java.net.ConnectException: Connection refused
2008-02-27 05:02:14,946 ERROR ItqlStore(PLoSNTD)> Failed to close connection after rollback failure [TP-Processor73 been through net.sf.ehcache.constructs.w\
eb.filter.SimplePageCachingFilter org.topazproject.otm.stores.ItqlStore]
AxisFault
 faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
 faultSubcode:
 faultString: java.net.ConnectException: Connection refused

it goes on in this manner for quite some time...

02/27/08 13:03:53 changed by jsuttor

  • status changed from new to assigned.
  • milestone set to pubApp_0.8.2.2.

03/10/08 16:33:08 changed by rich

I also see these errors in the logs which may be related

2008-03-09 10:07:02,622 ERROR ItqlStore(PLoSONE)> Failed to close connection after rollback failure [TP-Processor207 org.topazproject.otm.stores.ItqlStore]
AxisFault
 faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException
 faultSubcode:
 faultString: java.net.ConnectException: Connection timed out
 faultActor:
 faultNode:
 faultDetail:

(follow-up: ↓ 5 ) 03/10/08 16:46:41 changed by amit

Is there anything in the logs before which indicates something closed the connection for some reason?

(in reply to: ↑ 4 ; follow-up: ↓ 6 ) 03/10/08 19:49:59 changed by ronald

Replying to amit:

Is there anything in the logs before which indicates something closed the connection for some reason?

We need to be careful with the term "connection" here, as it's being used for two somewhat different things. On the one hand there are the TCP connections, which is what is being referred to in the "java.net.ConnectException: Connection refused"; on the other hand there are "ItqlStoreConnection"s which is what is being referred to in the "Failed to close connection" message and which represent an OTM internal connection abstraction that has nothing to do with TCP connections, but instead designates a conversation spanning a transaction.

The "Failed to close connection" per se is not a huge deal - it means that the underlying mulgara Session object was not closed. This should not be particularly problematic in itself as the mulgara Session will get discarded when the http-session times out.

However, the "error rolling back transaction" just after the "Failed to close connection" message is indeed problematic: it means the transaction will stay active on mulgara (and hence block everybody else) until the transaction times out. So this needs to be investigated and fixed. Since the underlying cause is a "java.net.ConnectException" the problem is that mulgara is refusing connections, probably due to having run out of threads and the tcp backlog being full.

(in reply to: ↑ 5 ) 03/20/08 11:04:40 changed by jsuttor

  • owner changed from jsuttor to ronald.
  • status changed from assigned to new.
  • component changed from publishing-app to topaz-mulgara.
  • type changed from clarification to defect.

Replying to ronald:

However, the "error rolling back transaction" just after the "Failed to close connection" message is indeed problematic: it means the transaction will stay active on mulgara (and hence block everybody else) until the transaction times out. So this needs to be investigated and fixed.

it appears the fix is in the OTM transaction scope so assigning to Ronald.

03/27/08 16:39:01 changed by rich

  • milestone changed from pubApp_0.8.2.2 to 0.9.0.

Moving to 0.9 for diagnostics/testing with RMI. From Ronald:

Because 0.9 has a new mulgara and will be using RMI instead of SOAP, things are different for the two.

For the current stuff (0.8.*) I'm not sure there's much we can do, as I think it happens because mulgara is overloaded with tcp connections. The best I can think of is that we try and put such failed otm-connections on a queue and repeatedly try to roll-back the tx and close the session.

For 0.9 things are quite different because with the use of RMI we don't actually control tcp-connections (we strictly speaking don't now either, but the way axis is configured we effectively can). For 0.9 we need to run some test and figure out how things behave when we try to overload mulgara.

03/28/08 09:37:05 changed by rich

  • summary changed from ERROR: Failed to close connection after rollback failure; faultString: java.net.ConnectException: Connection refused to Test: Overload Mulgara with TCP connections to test behavior of RMI.

05/06/08 10:37:45 changed by amit

  • type changed from defect to task.
  • blocking changed.
  • blockedby changed.

06/23/08 15:43:14 changed by amit

  • owner changed from ronald to amit.

06/26/08 18:15:45 changed by amit

  • owner changed from amit to ronald.

Assigning this back to Ronald to figure out if we can fix the multiple thread problem related to singleton parser.

06/27/08 15:58:44 changed by amit

  • priority changed from high to critical.

Upgrading as it would be good to know load handling capabilities here.

07/07/08 11:42:55 changed by amit

  • status changed from new to closed.
  • resolution set to fixed.

With changes in mulgara from r1057 to r1069 all the problems related to concurrency and time-outs using MulgaraRMI.groovy have been fixed. MulgaraRMI can be used to overload Mulgara directly with lots of threads doing {delete/insert/query/[commit|rollback]} to simulate very heavy loads and to determine performance characteristics.