Java soft-references – usage consequences on a memory based architecture (data grid/caching, cloud)

My previous article on Java references introduced the Java referencing model and the useful features it offers. One of them, the soft reference, is as I explained very useful for caches and buffers in a VM.

In this article I will discuss the particular effect that soft references can have on a Java cloud or data grid. We will focus on GigaSpaces, as it is currently one of the more active end-to-end cloud-computing application servers built in Java.

When a developer uses a soft reference, he lets the JVM take control of part of the heap and manage it as it sees fit. The developer has essentially no control over that part. Garbage collection cycles are controlled by the JVM, but the developer can still request one explicitly (unless -XX:+DisableExplicitGC is set, which we do not consider good practice). As we all know, an explicit GC call is no guarantee of a real full GC cycle, but it often helps reduce the heap in various scenarios. It is, however, not guaranteed to clear softly reachable objects, so it cannot be relied on to release the memory held by soft references.

So “why can this be bad?” After all, the goal of a soft reference is to act as a cache: use the heap but still guarantee the safety of the JVM (no risk of OutOfMemoryError). Since there is no direct impact on most applications, we can ask instead what impact soft references have on heap monitoring. The objects held by soft references are there, but they do not consume heap in a “risky way”: they cannot be held responsible for a memory leak, or even counted as “real usage”.

However, someone monitoring the heap may conclude that it is genuinely full and that the application is at risk. People can be educated to suspect that the memory is held by soft references, but no statistic in the JVM clearly shows it.

The problem is even more relevant when an application monitors itself and takes action when a given SLA is reached. This is what will, and should, happen in many cases on a Java cloud. An SLA can then be triggered without any good reason.

This is why soft references must be used with caution in those environments.
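To make the monitoring problem concrete, here is a minimal sketch of a self-monitoring check built on the standard java.lang.management API. The class name HeapSlaCheck, its method names and the 80% threshold are assumptions made for illustration; this is not how GigaSpaces or any other product implements its SLAs. The point is that the heap usage reported by the JVM includes memory held only through soft references, so a check like this one cannot tell reclaimable cache from real usage.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;

// Hypothetical self-monitoring check; names and threshold are illustrative only.
final class HeapSlaCheck {

    // Fraction of the heap in use. Falls back to the committed size when the
    // JVM reports no maximum (MemoryUsage.getMax() returns -1 in that case).
    static double usedRatio(long usedBytes, long maxBytes, long committedBytes) {
        long limit = maxBytes > 0 ? maxBytes : committedBytes;
        return limit <= 0 ? 0.0 : (double) usedBytes / limit;
    }

    // Reads the current heap figures from the JVM.
    static double currentUsedRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return usedRatio(heap.getUsed(), heap.getMax(), heap.getCommitted());
    }

    // The SLA is considered breached once usage reaches the threshold.
    static boolean breachesSla(double usedRatio, double threshold) {
        return usedRatio >= threshold;
    }

    public static void main(String[] args) {
        double before = currentUsedRatio();

        // Hold 16 MB through soft references only: the JVM may reclaim this
        // memory under pressure, yet it is reported as used heap until then.
        List<SoftReference<byte[]>> cache = new ArrayList<SoftReference<byte[]>>();
        for (int i = 0; i < 16; i++) {
            cache.add(new SoftReference<byte[]>(new byte[1024 * 1024]));
        }

        double after = currentUsedRatio();
        System.out.printf("used ratio before: %.3f, after: %.3f%n", before, after);
        System.out.println("breaches 80% SLA: " + breachesSla(after, 0.80));
    }
}
```

The exact figures depend on the JVM, the heap size and the collector, so the sketch prints them rather than promising a particular value.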

 

If we look at the example of GigaSpaces:

- As a cloud provider, GigaSpaces helps you monitor heap and memory usage and define SLAs based on them. As we already saw, this can be really dangerous (depending on your SLA action) if many soft references are used. We can run into scenarios where some part of your application will not be relocated on a VM because of these soft references. I really want to emphasize that such scenarios mostly result from bad usage of soft references in the application code, or sometimes from some really extreme situations that I will detail later.

- As a data-grid vendor, GigaSpaces has a very nice memory protection feature that allows the Spaces to protect themselves from running out of memory by cancelling the write operations performed on them. These protections are based on memory SLAs (see http://www.gigaspaces.com/wiki/display/XAP66/Memory+Management+Facility#MemoryManagementFacility-MemoryUsage) and are affected in the same way when the memory is consumed by soft references.

Again, the scenarios that lead to such situations are rare and quite extreme. But let me detail one of them that can happen with GigaSpaces.

The GigaSpaces software uses soft references in its communication protocol, LRMI, to manage buffers. In most situations this never causes high heap consumption. However, when many clients connect to the space concurrently to request big objects or many small objects, we may run into a situation where many buffers are created and consume memory. This typically happens in grid-computing scenarios, when the whole grid (we mostly encounter DataSynapse or Platform systems) tries to connect and perform requests on a single data-grid node.

The question that comes next is: “so what can I do in such a case?”

The truth is that there is no easy answer to this question. The -XX:SoftRefLRUPolicyMSPerMB parameter helps tune the collection of soft references in the Sun JVM. However, this parameter is not part of the Java specification, and it is not guaranteed to exist in other JVMs.

I would therefore recommend using a Sun JVM in such use cases, to be able to perform the tuning that the application requires.
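As an illustration of that tuning, the flag takes a number of milliseconds: on the Sun HotSpot JVM, a softly reachable object is kept for roughly that many milliseconds per free megabyte of heap after it was last accessed, and the default is 1000. The sketch below shows a launch line in a comment and a small helper that checks whether the running JVM was started with the flag. The class name SoftRefPolicy, the jar and main class names, and the value 100 are assumptions chosen for the example, not recommendations.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Launch example (HotSpot only; the value and application names are illustrative):
//   java -XX:SoftRefLRUPolicyMSPerMB=100 -cp app.jar com.example.Main
final class SoftRefPolicy {

    private static final String FLAG = "-XX:SoftRefLRUPolicyMSPerMB=";

    // Returns the configured value in milliseconds per free megabyte,
    // or -1 when the flag is absent or its value is not a number.
    static long configuredMsPerMb(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(FLAG)) {
                try {
                    return Long.parseLong(arg.substring(FLAG.length()));
                } catch (NumberFormatException e) {
                    return -1;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Arguments passed to the JVM itself, not to the application.
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        long value = configuredMsPerMb(jvmArgs);
        System.out.println(value < 0
                ? "flag not set, the JVM default applies"
                : "SoftRefLRUPolicyMSPerMB=" + value);
    }
}
```

A lower value makes the JVM release softly referenced objects sooner, which keeps the reported heap usage closer to the real usage at the cost of a less effective cache.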

I don’t know if any of you have been involved in similar situations; if so, I would like to hear how you handled them.

Java references

This article discusses the different Java reference types, and the benefits and consequences of using them. Many Java developers don’t really know about them and may not understand the JVM’s behaviour when interacting with third-party code that uses them.

Strong-reference

The ‘classic’ referencing model is the strong-reference.

//create a strong reference
Integer myInteger = new Integer(0);

In that case the developer manages a strong reference to an Integer.
As long as the developer keeps a reference to this Integer, it is not eligible for garbage collection. Once the object is unreferenced (by assigning null to the variable, or when the reference goes out of scope), the object becomes eligible for garbage collection in the classical garbage collection cycles.

New reference types in Java 1.2 and later

Starting with Java 1.2, three new reference implementations were introduced. As detailed later, each of them is closely tied to the way its referenced object is managed and garbage collected.

  • WeakReference
  • PhantomReference
  • SoftReference

All of them extend the base class Reference and are part of the java.lang.ref package (see http://java.sun.com/javase/6/docs/api/java/lang/ref/package-summary.html).

WeakReference

A WeakReference is an object that holds a reference to another object but does not prevent it from being garbage collected. Back to the previous example: to get a WeakReference to the previously mentioned Integer, a Java developer would use the following:

Integer myInteger = new Integer(0);
// create a weak reference to the previous integer
WeakReference<Integer> weakReference = new WeakReference<Integer>(myInteger);

In such a case, the component that manages weakReference can access myInteger using

Integer integerOtherRef = weakReference.get();

If a strong reference to the referenced object exists somewhere in your code, the get method will always return a reference to the object. If the only reference to the object is your WeakReference, the object is eligible for garbage collection. This means that as long as no garbage collection occurs, the get method still returns a reference to the object. However, once a garbage collection detects that the object is only weakly reachable, the reference is cleared and the get method returns null.
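The behaviour of get can be observed with a small sketch like the following. The class name WeakReferenceDemo and the helper isAlive are made up for the example. Only the first result is guaranteed: while a strong reference exists, get returns the object. What happens after the strong reference is dropped depends on when the JVM actually collects, since System.gc() is only a request.

```java
import java.lang.ref.WeakReference;

// Illustrative demo class; not part of any library.
final class WeakReferenceDemo {

    // True while the referent has not been cleared by the garbage collector.
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<Object>(strong);

        System.gc();
        // A strong reference still exists, so get() must return the same object.
        System.out.println("same object while strongly referenced: " + (weak.get() == strong));

        strong = null;   // only the weak reference remains
        System.gc();     // a request only; the JVM decides when to collect
        System.out.println("still alive after dropping the strong reference: " + isAlive(weak));
    }
}
```

On most JVMs the second line reports false, but code should never depend on that timing: always check the result of get for null before using it.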

Usage

The most interesting usage of WeakReference is the ability to link an object to another one whose life-cycle we don’t control, without creating any risk of a memory leak.
For example, some code that you don’t manage uses different foo objects. The third-party code is responsible for managing the foo life-cycles: it creates some of them and de-references them. You can get a reference to a foo, but no event tells you when a foo is de-referenced.
In that case, using a WeakReference is a very good practice: you won’t cause a leak by keeping references to objects that would otherwise be collected. A common usage of WeakReference is a map whose keys are objects you don’t manage and whose values are your responsibility. In such a case you want the map entry to be released as soon as the key (the object you don’t manage) is. WeakHashMap does exactly that.
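The map scenario described above can be sketched with java.util.WeakHashMap. The class name FooMetadata and its methods are hypothetical, and a plain Object stands in for the third-party foo. Two details are worth keeping in mind: WeakHashMap compares keys with equals and hashCode, and a value must not hold a strong reference to its own key, otherwise the entry can never be released.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical holder for data we attach to objects owned by third-party code.
final class FooMetadata {

    // Keys are weakly referenced: an entry goes away once its key is collected.
    private final Map<Object, String> notes = new WeakHashMap<Object, String>();

    void annotate(Object foo, String note) {
        notes.put(foo, note);
    }

    String noteFor(Object foo) {
        return notes.get(foo);
    }

    int size() {
        return notes.size();
    }

    public static void main(String[] args) {
        FooMetadata metadata = new FooMetadata();
        Object foo = new Object();   // stands in for an object managed elsewhere
        metadata.annotate(foo, "seen at startup");
        System.out.println("note: " + metadata.noteFor(foo));

        foo = null;     // the third-party code drops its foo
        System.gc();    // a request only; the entry disappears once the key is collected
        System.out.println("entries left: " + metadata.size());
    }
}
```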

PhantomReference

To make it short, a PhantomReference behaves like a WeakReference whose get method always returns null.

Usage

At first glance it seems to have no use: why would I want to keep a reference to an object that I cannot even retrieve?
In fact, the goal of PhantomReference is to track the garbage collection of an object. This is where I need to introduce the ReferenceQueue.
The ReferenceQueue is a queue to which the garbage collector appends reference objects after it detects the appropriate reachability changes.
When you give a ReferenceQueue to a PhantomReference or WeakReference constructor, the garbage collector appends the Reference to your queue once the referenced object becomes eligible for collection.

Integer myInteger = new Integer(0);
// create a reference queue to retrieve references to objects ready to be collected
ReferenceQueue<Integer> referenceQueue = new ReferenceQueue<Integer>();
// create a phantom reference that will be associated to the reference queue
PhantomReference<Integer> phantomReference = new PhantomReference<Integer>(myInteger, referenceQueue);

The interest of the PhantomReference is therefore to learn, by polling the queue, that a given object is being garbage collected.
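Polling the queue could look like the sketch below. The class CollectionTracker and its methods are invented for the illustration. One detail that is easy to miss: the PhantomReference object itself must stay reachable, otherwise it is collected like any other object and never reaches the queue, which is why the tracker keeps the pending references in a set.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper that reports when tracked objects have been collected.
final class CollectionTracker {

    private final ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
    // Keeps the reference objects themselves reachable until they are enqueued.
    private final Set<Reference<?>> pending = new HashSet<Reference<?>>();

    // Starts tracking the given object without preventing its collection.
    Reference<Object> track(Object target) {
        PhantomReference<Object> ref = new PhantomReference<Object>(target, queue);
        pending.add(ref);
        return ref;
    }

    // Polls the queue and returns how many tracked objects were reported
    // as collected since the previous call.
    int drainCollected() {
        int collected = 0;
        Reference<?> ref;
        while ((ref = queue.poll()) != null) {
            pending.remove(ref);
            collected++;
        }
        return collected;
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) throws InterruptedException {
        CollectionTracker tracker = new CollectionTracker();
        Object resource = new Object();
        Reference<Object> ref = tracker.track(resource);
        System.out.println("get() returns: " + ref.get());   // a phantom reference never exposes its referent

        resource = null;      // drop the only strong reference
        System.gc();          // a request only
        Thread.sleep(200);    // give the collector some time to enqueue the reference
        System.out.println("collected so far: " + tracker.drainCollected());
    }
}
```

Whether the last line reports 0 or 1 depends on the JVM and on timing, so a real component would poll periodically or block on the queue from a dedicated thread.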

SoftReference

The last reference type is the SoftReference. SoftReferences refer to objects that do not follow the classical garbage collection cycles. An object that is reachable only through a SoftReference is not necessarily collected, even when no strong reference to it remains. The JVM decides by itself when the object must be collected; the only guarantee is that all such references are cleared before an OutOfMemoryError is thrown. Basically, the behaviour of such references depends on the heap available in the JVM.

Usage

A soft reference is really good for caching purposes. The JVM makes sure that it will not cause any trouble related to heap usage (SoftReference usage alone cannot cause an OutOfMemoryError) and tries to keep softly referenced objects for as long as it can.
Basically, SoftReference is ideal for flexible and safe caching in a JVM.

// create a strong reference
Integer myInteger = new Integer(0);
// create a soft reference
SoftReference<Integer> softReference = new SoftReference<Integer>(myInteger);
// unreference the strong reference: the JVM is now managing the life-cycle of the soft-reference
myInteger = null;
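
Building on the snippet above, a very small cache based on soft references could be sketched as follows. The class SoftCache is hypothetical and deliberately minimal: it has no size limit and no loader, and it only removes an entry when a lookup finds that the JVM has already cleared it. A caller must always be ready for get to return null and reload the value from its source.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical minimal cache whose values may be reclaimed by the JVM under memory pressure.
final class SoftCache<K, V> {

    private final ConcurrentMap<K, SoftReference<V>> entries =
            new ConcurrentHashMap<K, SoftReference<V>>();

    void put(K key, V value) {
        entries.put(key, new SoftReference<V>(value));
    }

    // Returns the cached value, or null if it is absent or was cleared by the JVM.
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            // The referent was reclaimed: drop the stale entry, but only if it
            // has not been replaced by another thread in the meantime.
            entries.remove(key, ref);
        }
        return value;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SoftCache<String, byte[]> cache = new SoftCache<String, byte[]>();
        cache.put("report", new byte[1024]);

        byte[] cached = cache.get("report");
        System.out.println(cached != null ? "hit" : "miss, reload from the source");
    }
}
```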

Sources, links and related reading

Java GC tuning official documentation and JavaDoc:

http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html#1.1.Other%20Considerations%7Coutline

http://java.sun.com/javase/6/docs/api/java/lang/ref/package-summary.html

Related blogs:

http://weblogs.java.net/blog/enicholas/archive/2006/05/understanding_w.html

http://www.artima.com/insidejvm/ed2/gc17.html

Related projects:

Even though I haven’t really tested it or looked at it deeply, I found this cache implementation based on weak references, which is a good illustration of the capabilities of Java’s alternative reference types.

http://rcache.sourceforge.net/