Debugging EHCache in a cluster
Yesterday I had to find a bug in EHCache in a cluster installation, and wanted to use the EHCache remote debugger, as described here : http://ehcache.org/documentation/remotedebugger.html
It turns out the documentation wasn't very clear, and it was even less clear where the package could be found. In fact it can be retrieved here: http://sourceforge.net/projects/ehcache/files/ehcache-debugger/ (note that the debugger seems to go by either "remote debugger" or "debugger", but it's the same code base).
Now the tricky part was to make it work with Jahia. The debugger works by actually joining the cluster as an EHCache cluster node. What the documentation doesn't tell you is that, in order to participate in the cluster, it must be able to deserialize all the objects it receives in the cluster messages, which is why your application JARs are required on its classpath. Also, I have tested it with JGroups replication and it seems to work fine, so it is not limited to setups using RMI replication.
Another problem with the documentation is that its example command line mixes the -classpath and -jar command line options, which doesn't work with JDK 1.5: when -jar is given, the java launcher uses the JAR file as the only source of user classes and ignores -classpath. So the example command line from the documentation will not work. Also, as is the case with Jahia, there might be a lot of application JARs, so listing them all by hand can be quite tedious. I put together a little shell script that automatically builds the classpath for a Jahia installation, which I am showing here:
debug_ehcache.sh
--------------------------
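#!/bin/sh
# Print a colon-separated classpath containing every JAR in the given directory.
buildClassPath() {
    if [ $# -ne 1 ]; then
        echo "Jar directory must be specified." >&2
        exit 1
    fi
    jar_dir=$1
    class_path=
    # Use a glob rather than parsing the output of ls.
    for i in "$jar_dir"/*.jar
    do
        # Skip the unexpanded pattern when the directory contains no JARs.
        [ -e "$i" ] || continue
        if [ -z "$class_path" ]; then
            class_path=$i
        else
            class_path=$class_path:$i
        fi
    done
    echo "$class_path"
}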
JAHIA_LIBS=/Users/loom/java/deployments/jahia-6-0-hotfix/apache-tomcat-6.0.18/webapps/ROOT/WEB-INF/lib
JAHIA_SHARED_LIBS=/Users/loom/java/deployments/jahia-6-0-hotfix/apache-tomcat-6.0.18/lib
JAHIA_CLASSPATH=`buildClassPath ${JAHIA_LIBS}`
JAHIA_SHARED_CLASSPATH=`buildClassPath ${JAHIA_SHARED_LIBS}`
CLASSPATH=${JAHIA_CLASSPATH}:${JAHIA_SHARED_CLASSPATH}:./backport-util-concurrent-3.1.jar:./commons-logging-1.0.4.jar:./commons-collections-3.2.jar:./jsr107cache-1.0.jar:./ehcache-debugger-1.5.0.jar
export CLASSPATH
# Pass all script arguments through to the debugger unchanged.
java net.sf.ehcache.distribution.RemoteDebugger ehcache-jahia_cluster.xml "$@"
-------------------
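As a side note, on Java 6 and later the java launcher expands a classpath entry ending in /* to all the .jar files of that directory, so the loop above is only really needed on JDK 1.5. Here is a minimal sketch of that alternative; the wildcard_classpath helper is a name I made up for illustration, and I have not tried this variant against the debugger itself:

```shell
#!/bin/sh
# wildcard_classpath DIR...
# Prints "DIR1/*:DIR2/*:..." for use with java -classpath (Java 6 or later).
# The "*" is deliberately left unexpanded: the java launcher expands it,
# not the shell, so the result must always be used inside double quotes.
wildcard_classpath() {
    cp=
    for dir in "$@"
    do
        # Append ":" only when cp already holds a previous entry.
        cp=${cp:+$cp:}$dir/*
    done
    echo "$cp"
}

# Possible usage (assumes JAHIA_LIBS and JAHIA_SHARED_LIBS are set as in the script):
#   java -classpath "$(wildcard_classpath "$JAHIA_LIBS" "$JAHIA_SHARED_LIBS"):./ehcache-debugger-1.5.0.jar" \
#       net.sf.ehcache.distribution.RemoteDebugger ehcache-jahia_cluster.xml "$@"
```

The other JARs listed in the script (commons-logging and so on) would still have to be appended to that classpath in the same way.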
Before launching the script, make sure you copy your EHCache configuration file from WEB-INF/classes/ehcache-jahia_cluster.xml into the directory you run the script from, since the script refers to it by a relative path. Also, as this file uses variables injected from the jahia.properties file, you will have to replace the variables with the real values, as in the example below:
<cacheManagerPeerProviderFactory
class="net.sf.ehcache.distribution.jgroups.JGroupsCacheManagerPeerProviderFactory" properties="connect=TCP_NIO(start_port=7870;bind_addr=127.0.0.1;loopback=true;recv_buf_size=20000000;send_buf_size=640000;discard_incompatible_packets=true;max_bundle_size=64000;max_bundle_timeout=30;use_incoming_packet_handler=true;enable_bundling=true;use_send_queues=false;sock_conn_timeout=300;skip_suspected_members=true;use_concurrent_stack=true): TCPPING(initial_hosts=127.0.0.1[7870],127.0.0.1[7871];port_range=10;timeout=3000;num_initial_members=2): MERGE2(max_interval=100000;min_interval=20000): FD_SOCK: FD(timeout=10000;max_tries=5;shun=true): VERIFY_SUSPECT(timeout=1500): pbcast.NAKACK(gc_lag=100;retransmit_timeout=3000;discard_delivered_msgs=true): pbcast.STABLE: pbcast.GMS(join_timeout=5000;shun=true;print_local_addr=true): VIEW_SYNC(avg_send_interval=60000): FC(max_credits=2000000;min_threshold=0.10): FRAG2(frag_size=60000)" propertySeparator="::" />
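If there are many variables to replace, the substitution can be scripted. The sketch below assumes the variables appear in the file as ${name}; the helper names and the variable name hypothetical.bind.addr are made up for illustration, so check your own file for the real ones:

```shell
#!/bin/sh
# fill_placeholder NAME VALUE
# Copies stdin to stdout, replacing every ${NAME} with VALUE.
# Caveats: NAME is treated as a regular expression (a "." matches any
# character), and VALUE must not contain "|", "&" or "\".
fill_placeholder() {
    sed "s|\\\${$1}|$2|g"
}

# has_placeholders FILE
# Succeeds, and prints the offending lines with their line numbers,
# if FILE still contains a literal "${".
has_placeholders() {
    grep -nF '${' "$1"
}

# Possible usage (hypothetical variable name):
#   fill_placeholder hypothetical.bind.addr 127.0.0.1 \
#       < WEB-INF/classes/ehcache-jahia_cluster.xml > ehcache-jahia_cluster.xml
#   has_placeholders ehcache-jahia_cluster.xml && echo "some variables are left"
```

Running has_placeholders on the copied file before starting the debugger is a cheap way to catch a variable that was forgotten.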
You can then start using the script to listen to a cache in a Jahia cluster installation. Here is an example command line :
./debug_ehcache.sh SkeletonCache
The output looks like this :
Received removeAll notification.
Cache: SkeletonCache Notifications received: 1 Elements in cache: 0
Cache: SkeletonCache Notifications received: 1 Elements in cache: 0
Cache: SkeletonCache Notifications received: 1 Elements in cache: 0
Received put notification for element [ key = 2-normal-en-administrators|administrators|guest|users-$$$#$#G_ContentPage_2WORKFLOWSTATE-normalLANGUAGECODE-en#$#G_SITE-2#$#G_USERNAME-administrators|administrators|guest|users, value=org.jahia.services.cache.CacheEntry@9d04dc, version=1, hitCount=0, CreationTime = 1252940741198, LastAccessTime = 0 ]
Cache: SkeletonCache Notifications received: 2 Elements in cache: 1
Cache: SkeletonCache Notifications received: 2 Elements in cache: 1
Cache: SkeletonCache Notifications received: 2 Elements in cache: 1
Received put notification for element [ key = 2-normal-en-guest:0-$$$#$#G_ContentPage_2WORKFLOWSTATE-normalLANGUAGECODE-en#$#G_USERNAME-guest:0#$#G_SITE-2, value=org.jahia.services.cache.CacheEntry@8caee7, version=1, hitCount=0, CreationTime = 1252940747195, LastAccessTime = 0 ]
Cache: SkeletonCache Notifications received: 3 Elements in cache: 2
Cache: SkeletonCache Notifications received: 3 Elements in cache: 2
Cache: SkeletonCache Notifications received: 3 Elements in cache: 2
Received put notification for element [ key = 302-normal-en-guest:0-$$$#$#G_USERNAME-guest:0#$#G_SITE-2#$#G_ContentPage_302WORKFLOWSTATE-normalLANGUAGECODE-en, value=org.jahia.services.cache.CacheEntry@535057, version=1, hitCount=0, CreationTime = 1252940754196, LastAccessTime = 0 ]
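On a busy cluster this output gets long quickly, so it can help to save it to a file and count the notifications by type. This is a minimal sketch that assumes the output format shown above, where each notification line starts with "Received <type> notification"; summarize_notifications is just a name I picked:

```shell
#!/bin/sh
# summarize_notifications
# Reads debugger output on stdin and prints one "<type> <count>" line per
# notification type (put, removeAll, ...), sorted by type name.
summarize_notifications() {
    awk '$1 == "Received" && $3 ~ /^notification/ { count[$2]++ }
         END { for (t in count) print t, count[t] }' | sort
}

# Possible usage:
#   ./debug_ehcache.sh SkeletonCache > debugger.log    (stop it with Ctrl-C)
#   summarize_notifications < debugger.log
```

For the sample output above this gives one removeAll and three put notifications. The periodic "Cache: ..." status lines are ignored.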
This can really be a neat tool to diagnose cache problems, or simply to get a feel for how Jahia is using EHCache to communicate within a cluster. I hope this little blog entry will help you use this debugger because, despite the tricky setup, it is very useful and neatly designed.

