ATG Dynamo DRP Hung threads solution

June 19th, 2007

What is a DRP Thread?
DRP stands for Dynamo Request Protocol. It is the protocol used for communication between an HTTP server (web server) and the application server (ATG). When a user clicks a link or submits a form in a web browser, that information, the “request”, is sent to a web server. The web server examines the request and determines whether it needs to be passed on to the application server. If so, it sends the request using DRP. When the application server receives the request, it starts a DRP thread to handle it. That thread performs the necessary actions and returns the response to the web server, which then returns it to the user’s browser.

Why do they get hung?
The main reason a thread hangs is that it cannot finish its request in a timely manner, typically because a query is slow or never returns. When data is updated continuously, some queries lose efficiency over time because their execution plans need to be re-analyzed. Over the last couple of years we have seen this in several of our queries.

Since we have a set number of available DRP threads (40 in our case), when they are all hanging, the application server is no longer able to serve requests.
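
To illustrate why a fixed pool stops serving once every thread is stuck, here is a minimal sketch that uses a plain java.util.concurrent pool rather than ATG's DRP server. The class name, the pool size of 2, and the latch standing in for a query that never returns are all assumptions made for the demonstration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical demo, not ATG code: a fixed pool whose workers are all blocked
// cannot start new work until one of them is released.
class PoolExhaustionDemo {

    // Returns true if a new request completes within waitMillis while every
    // pool thread is stuck on a "query" that does not return.
    static boolean newRequestServedWhileHung(int poolSize, long waitMillis) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        CountDownLatch neverReturningQuery = new CountDownLatch(1);
        CountDownLatch allBusy = new CountDownLatch(poolSize);
        try {
            // Occupy every thread with a request that hangs.
            for (int i = 0; i < poolSize; i++) {
                pool.submit(() -> {
                    allBusy.countDown();
                    try {
                        neverReturningQuery.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            allBusy.await();
            // The new request is queued, but no thread is free to run it.
            Future<String> next = pool.submit(() -> "served");
            try {
                next.get(waitMillis, TimeUnit.MILLISECONDS);
                return true;
            } catch (TimeoutException e) {
                return false;
            }
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            neverReturningQuery.countDown(); // release the stuck workers
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("served while hung: " + newRequestServedWhileHung(2, 200));
    }
}
```

With both workers blocked, the extra request times out and the method returns false, which is the same situation as all 40 DRP threads hanging at once.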

This is an enterprise-wide problem: any ATG application can run into it if its queries don’t return.

The solution to the problem

Overview
The solution is to monitor and identify hung DRP threads, send email about their hung status, and kill them before they can do damage. This requires a monitor running inside the application server that periodically takes an inventory of the running DRP threads. When a thread’s age exceeds a predetermined threshold, the thread is terminated, and information about it is sent to a predetermined email address for evaluation.

The following control inputs for the DRP thread monitor are stored in a property file, which is loaded at startup.

Whether the monitor starts automatically.
threadMonitorAutoStart = true

How often the monitor checks the DRP threads.
threadMonitorWaitInterval = 5000

The percentage of DRP threads that must be in use before the age limit below is applied.
threadMonitorThreshold = 80

The age limit for threads. When usage is above threadMonitorThreshold, a thread older than this is killed.
thresholdAgeLimit = 300000

The maximum age of any thread. A thread over this age is killed regardless of usage.
maxThreadAge = 1800000

The list of email addresses notified when a DRP thread is killed.
emailNotificationList = hwilliamson@greattastingjava.com
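
The post does not show how the property file is read, so the following is only a sketch of one way to do it with java.util.Properties. The class DrpMonitorSettings, its field names, and the assumption that multiple email addresses are comma-separated are all hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Hypothetical holder for the monitor settings; not part of ATG.
class DrpMonitorSettings {
    final boolean autoStart;
    final long waitInterval;
    final int threshold;
    final long thresholdAgeLimit;
    final long maxThreadAge;
    final List<String> emailNotificationList;

    private DrpMonitorSettings(Properties p) {
        autoStart = Boolean.parseBoolean(get(p, "threadMonitorAutoStart"));
        waitInterval = Long.parseLong(get(p, "threadMonitorWaitInterval"));
        threshold = Integer.parseInt(get(p, "threadMonitorThreshold"));
        thresholdAgeLimit = Long.parseLong(get(p, "thresholdAgeLimit"));
        maxThreadAge = Long.parseLong(get(p, "maxThreadAge"));
        // Assumption: multiple addresses are separated by commas.
        emailNotificationList =
                Arrays.asList(get(p, "emailNotificationList").split("\\s*,\\s*"));
    }

    // Fails fast on a missing key and trims trailing whitespace,
    // which Properties.load does not strip from values.
    private static String get(Properties p, String key) {
        String value = p.getProperty(key);
        if (value == null) {
            throw new IllegalArgumentException("Missing property: " + key);
        }
        return value.trim();
    }

    static DrpMonitorSettings load(Reader reader) {
        Properties p = new Properties();
        try {
            p.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new DrpMonitorSettings(p);
    }

    public static void main(String[] args) {
        String text = "threadMonitorAutoStart = true\n"
                + "threadMonitorWaitInterval = 5000\n"
                + "threadMonitorThreshold= 80\n"
                + "thresholdAgeLimit = 300000\n"
                + "maxThreadAge= 1800000\n"
                + "emailNotificationList = hwilliamson@greattastingjava.com\n";
        DrpMonitorSettings s = load(new StringReader(text));
        System.out.println(s.threshold + " " + s.maxThreadAge + " " + s.emailNotificationList);
    }
}
```

Reading from a Reader rather than a file path keeps the sketch easy to try out; in a real deployment a FileReader or a classpath resource stream would take its place.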

Pseudo Code

1. Application server starts up
2. Starts DRP monitor
    a. If threadMonitorAutoStart = true, then the monitor continues to start up, else it stops.
    b. Loop until forced stop
        i. Check DRP server; for each thread handling a request:
            1. If (percent active threads > threadMonitorThreshold && age of thread > thresholdAgeLimit) || age of thread > maxThreadAge then
                a. Email info about hung thread
                b. Kill thread
        ii. Sleep threadMonitorWaitInterval
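
The kill condition in the pseudocode can be isolated as a small pure function, which makes it easy to check against the configured values. This is a sketch only; the class ThreadKillPolicy and the method shouldKill are hypothetical names, not part of ATG.

```java
// Hypothetical helper that encodes the kill condition from the pseudocode.
class ThreadKillPolicy {

    // A thread is killed when the server is busy AND the thread is past the
    // age limit, or when the thread is past the maximum age regardless of load.
    static boolean shouldKill(double percentActive, long threadAge,
                              double threadMonitorThreshold,
                              long thresholdAgeLimit, long maxThreadAge) {
        boolean busyAndOld = percentActive > threadMonitorThreshold
                && threadAge > thresholdAgeLimit;
        boolean tooOld = threadAge > maxThreadAge;
        return busyAndOld || tooOld;
    }

    public static void main(String[] args) {
        // Values from the property file: threshold 80, limits 300000 and 1800000.
        System.out.println(shouldKill(90.0, 400000L, 80.0, 300000L, 1800000L));  // busy and old
        System.out.println(shouldKill(50.0, 400000L, 80.0, 300000L, 1800000L));  // old, but not busy
        System.out.println(shouldKill(50.0, 1900000L, 80.0, 300000L, 1800000L)); // past the maximum age
    }
}
```

With the settings shown earlier, a thread past thresholdAgeLimit survives while fewer than 80 percent of the threads are in use, but no thread survives past maxThreadAge.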

Code Sample

while (!forcedStop) {
    try {
        RequestServerHandler[] handlers = drpServer.getRequestHandlers();
        int activeCount = drpServer.getActiveHandlerCount();

        // Multiply by 100.0 before dividing; int / int would truncate to 0 or 1.
        double percentActive = (activeCount * 100.0) / handlers.length;

        for (int i = 0; i < handlers.length; i++) {
            if (handlers[i] instanceof DrpServerConnection) {
                DrpServerConnection connection = (DrpServerConnection) handlers[i];

                if (connection.getHandlingRequest()) {
                    // Read the request details once, before the handler is killed.
                    String path = connection.getCurrentRequestPathInfo();
                    long time = connection.getCurrentRequestTime();

                    if ((percentActive > threadMonitorThreshold && time > thresholdAgeLimit)
                            || time > maxThreadAge) {
                        logger.debug("killing:" + connection);
                        connection.killHandler();

                        // Report the killed request to the notification list.
                        Mail mail = new Mail();
                        mail.postMail(emailNotificationList, "DRP Thread Killed",
                                "Path:" + path + "\n" + "Time:" + time + "\n",
                                "hwilliamson@greattastingjava.com");
                        logger.debug("emailing:" + emailNotificationList);
                    }
                }
            }
        }
        sleepDRPThreadMonitor();
    } catch (Exception ex) {
        ex.printStackTrace();
    }
}
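
One detail worth checking in any version of this loop is how the percentage of active handlers is computed. In Java, dividing one int by another truncates the result before any multiplication by 100.0 takes place. The standalone sketch below (class and method names are hypothetical) shows the difference for 35 active handlers out of 40.

```java
// Hypothetical demo of integer truncation in a percentage calculation.
class PercentActive {

    // Divides first: int / int truncates, so the result is 0.0
    // unless every handler is active.
    static double truncated(int active, int total) {
        return (active / total) * 100.0;
    }

    // Multiplies by 100.0 first, so the division is done in floating point.
    static double exact(int active, int total) {
        return (active * 100.0) / total;
    }

    public static void main(String[] args) {
        System.out.println(truncated(35, 40)); // 0.0
        System.out.println(exact(35, 40));     // 87.5
    }
}
```

With the truncating form, a threshold of 80 is only ever exceeded when all handlers are active, so the age-limit check would rarely fire.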

Example code: Java Spring Framework HelloWorld example

January 30th, 2006

I have been trying to learn about the Java Spring Framework,
and here is a great HelloWorld example I found: HERE


Configuring JDBC data source/JNDI name in Rational application development Environment 6.0 for Oracle

January 6th, 2006

Configuring the JDBC JNDI name in RAD is a lot harder than it needs to be.
Here is how to do it.
Read the rest of this entry »

Message Beans: Container Vs. Bean Managed Transactions, XA Transaction problem

December 27th, 2005

I recently tried to get a container-managed transaction message bean to read from an MQ queue and then read from an Oracle database. I got an error indicating that I did not have XA transactions turned on for my MQ queue. Then I dug a little deeper…
Read the rest of this entry »

MQ 5.3 performance testing with Message Beans and Websphere 6.0

December 15th, 2005
Earlier in the week I did some testing of MQ with message beans.
These are the results:
10,000 very small messages of 10 bytes each, going through 1 queue with persistence turned on, took about 37 seconds.
10,000 large messages of 10,000 bytes each, going through 1 queue with persistence turned on, took about 90 seconds.
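
For comparison, those timings can be converted into rough throughput figures. The sketch below is only arithmetic on the numbers quoted above; the class and method names are made up for the example.

```java
// Hypothetical helper: converts the measured timings into throughput.
class MqThroughput {

    static double messagesPerSecond(int messages, double seconds) {
        return messages / seconds;
    }

    static double bytesPerSecond(int messages, int bytesPerMessage, double seconds) {
        // Cast first so the product is computed in floating point.
        return (double) messages * bytesPerMessage / seconds;
    }

    public static void main(String[] args) {
        System.out.println("small msgs/s:  " + messagesPerSecond(10000, 37));
        System.out.println("large msgs/s:  " + messagesPerSecond(10000, 90));
        System.out.println("small bytes/s: " + bytesPerSecond(10000, 10, 37));
        System.out.println("large bytes/s: " + bytesPerSecond(10000, 10000, 90));
    }
}
```

That works out to roughly 270 messages per second for the small messages and about 111 per second for the large ones. In bytes, the large-message run moves about 1.1 MB per second against about 2.7 KB per second for the small one, which suggests per-message overhead dominates for small payloads.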


JNDI look up in Websphere 6.0 to Test MQ 5.3 Queue, Message Beans with JMS

December 12th, 2005
In my other post, Configuring Rational Software Development Platform Test Environment 6.0 to use a WebSphere MQ 5.3 Queue,
I used JMSAdmin as my service provider for inserting objects into the queue.
QueueIBMTest.java uses WebSphere 6.0 as the service provider to insert items into an MQ queue using JMS.
Read the rest of this entry »

Configuring Rational Software Development Platform Test Environment 6.0 to use a WebSphere MQ 5.3 Queue

December 12th, 2005

If you have ever tried to get a Message Bean working in Rational Software Development Platform 6.0 with WebSphere MQ 5.3, you know why this tutorial is needed. Good luck…

As you go through this tutorial, there are a few things to keep on hand:

  • Queue Manager Name : QM_headley (default on my machine)
  • Queue Manager Port : 1414 (default)
  • Queue name : headleyqueue (created this under QM_headley in MQ Explorer)
  • File System url : file:/C:/JNDI-Directory/mq
  • File System JNDI Connection Factory name : headleyCF
  • File System JNDI Queue name : headleyq
  • Websphere Provider Connection Factory name: jms/headleyCF
  • Websphere Provider Queue name: jms/headleyq
  • Listener port name: headleyListener
  • Test Java code: QueueTest.java QueuePut.java

Read the rest of this entry »