Concurrency and performance

Massively concurrent programming will be the future of development. No doubt about that. But manipulating lots of threads sharing resources is not an easy task.
Even if you manage to make your program correct (which is tricky, given the lack of tools or methodology to verify it), you still have to look at performance.

Using multithreaded programming techniques is no guarantee of better performance. Far from it. Recently, on a customer site, I improved performance by more than an order of magnitude just by minimizing synchronization cost.

Here are a few simple tips to help you benefit from all your cores:

  • Minimize synchronized block length and synchronized method usage
  • Replace the synchronized keyword with Lock
  • Replace Object.wait and Object.notify with Condition
  • Avoid thread-safe classes where thread safety is not needed (like StringBuffer or the old collection implementations such as Vector and Hashtable)
  • Rely on Java 5 concurrency features
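
As a rough illustration of the Lock and Condition tips, here is a minimal sketch of a one-slot mailbox. The Mailbox class and its method names are hypothetical, made up for this example. One lock carries two conditions, so a signal only wakes up the kind of thread that can actually make progress, which Object.notify cannot express.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical one-slot mailbox, only meant to illustrate Lock and Condition. */
class Mailbox {
	private final Lock lock = new ReentrantLock();
	private final Condition notEmpty = lock.newCondition();
	private final Condition notFull = lock.newCondition();
	private String message; // null means the slot is empty

	void put(String newMessage) throws InterruptedException {
		lock.lock();
		try {
			// always re-check the state in a loop after waking up
			while (message != null) {
				notFull.await();
			}
			message = newMessage;
			notEmpty.signal(); // wakes a consumer only
		} finally {
			lock.unlock();
		}
	}

	String take() throws InterruptedException {
		lock.lock();
		try {
			while (message == null) {
				notEmpty.await();
			}
			String result = message;
			message = null;
			notFull.signal(); // wakes a producer only
			return result;
		} finally {
			lock.unlock();
		}
	}
}
```

The unlock call sits in a finally block because, unlike synchronized, a Lock is not released automatically when an exception is thrown.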

This last point is certainly the most important. Java 5 introduced many concurrency-related structures.

Atomic* types let you safely manipulate primitive-like objects. This is much more efficient than synchronizing every access to, say, your global statistics. See the javadoc for more details.
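
To make the statistics case concrete, here is a small sketch using AtomicLong. The RequestStatistics class and its method names are invented for the example; only AtomicLong itself comes from the JDK.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical global statistics holder; names are illustrative only. */
class RequestStatistics {
	private final AtomicLong requestCount = new AtomicLong();

	/** Atomically adds one and returns the new total, without any lock. */
	long recordRequest() {
		return requestCount.incrementAndGet();
	}

	long getRequestCount() {
		return requestCount.get();
	}

	public static void main(String[] args) throws InterruptedException {
		final RequestStatistics statistics = new RequestStatistics();
		Thread[] workers = new Thread[4];
		for (int t = 0; t < workers.length; t++) {
			workers[t] = new Thread(new Runnable() {
				public void run() {
					for (int i = 0; i < 100000; i++) {
						statistics.recordRequest();
					}
				}
			});
			workers[t].start();
		}
		for (Thread worker : workers) {
			worker.join();
		}
		// 4 threads x 100000 increments, none lost
		System.out.println("Requests recorded : " + statistics.getRequestCount());
	}
}
```

With a plain long field and no synchronization, the same program would typically lose updates.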

ConcurrentHashMap, CopyOnWriteArrayList, CopyOnWriteArraySet and others offer thread-safe, optimized collections. They are more lightweight than relying on the Collections.synchronized* methods. See the javadoc for more details.
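
Here is one possible way to use ConcurrentHashMap for a per-key counter without any synchronized block. The HitCounter class is hypothetical; the sketch relies on putIfAbsent, which returns the value already present or null when our value was stored.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical per-key hit counter, safe to call from many threads. */
class HitCounter {
	private final ConcurrentMap<String, AtomicInteger> hits =
			new ConcurrentHashMap<String, AtomicInteger>();

	/** Increments the counter for the key and returns the new value. */
	int hit(String key) {
		AtomicInteger counter = hits.get(key);
		if (counter == null) {
			AtomicInteger fresh = new AtomicInteger();
			// another thread may have registered a counter in the meantime
			counter = hits.putIfAbsent(key, fresh);
			if (counter == null) {
				counter = fresh;
			}
		}
		return counter.incrementAndGet();
	}

	int get(String key) {
		AtomicInteger counter = hits.get(key);
		return counter == null ? 0 : counter.get();
	}
}
```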

New libraries adding more concurrent structures are now available, here are a few:

Interesting new metaphors to split complex tasks into independent subtasks should be added in Java 7.

Using Maven2 projects at googlecode.com

googlecode.com offers a nice hosting platform for your open source projects. You get SCM (either subversion or mercurial), a wiki, a bug tracker and download facilities.

Here are a few tips to facilitate Maven2 project integration with it (subversion is used as I am not familiar with mercurial yet).

Subversion access configuration

Add this information to your pom.xml:

<scm>
<connection>scm:svn:http://my-project.googlecode.com/svn</connection>
<developerConnection>scm:svn:https://my-project.googlecode.com/svn</developerConnection>
<url>https://my-project.googlecode.com/svn</url>
</scm>

When accessing the subversion repository (for instance while running mvn release:prepare and mvn release:perform) you must provide your googlecode username and password to Maven, using command-line parameters

-Dusername=googlecode.username -Dpassword=googlecode.password

or by modifying your settings.xml file

<server>
<!-- Will be used for scm connection -->
<id>googlecode.com</id>
<username>googlecode.username</username>
<password>googlecode.password</password>
</server>

Your googlecode password can be found under the Source tab of your googlecode.com project website: click on ‘When prompted, enter your generated googlecode.com password.’.

Issue management configuration

Add this information to your pom.xml:

<issueManagement>
<system>Google Code</system>
<url>http://code.google.com/p/my-project/issues/list</url>
</issueManagement>

Repository on subversion configuration

You can rely on googlecode subversion to provide access to your maven artifacts.

Add this information to your pom.xml:

<distributionManagement>
<repository>
<id>my-project-release</id>
<url>dav:https://my-project.googlecode.com/svn/maven2/releases</url>
</repository>
</distributionManagement>

An alternative would be to rely on the subversion wagon implementation as described here.

You will also need to modify your settings.xml file to provide googlecode.com credentials.

<server>
<id>my-project-release</id>
<username>googlecode.username</username>
<password>googlecode.password</password>
</server>

Web pages hosting

If you want to commit html/css files to your svn repository and still be able to browse them normally, you will have to fix the mime type, which defaults to ‘text/text’. With subversion you can set the mime type manually with these commands:

svn propset svn:mime-type 'text/html' *.html
svn propset svn:mime-type 'text/css' *.css

Java Performance – tips

In my day-to-day work at FastConnect I often deal with Java performance, which is critical in many of our projects. While digging into my code and the Java language I found some tips that helped us perform better. I decided to share some of them with you in this article.

Note that these tips are varied. Some of them are quite extreme and can complicate the code in ways that are not always welcome, so they must be used carefully and only when really required. Others, on the other hand, can improve readability as well as performance. As always, good documentation and structure are necessary to keep the whole thing comprehensible.

Limit Object creations

One technique that is easy to implement and that helps in frequently called code paths is to limit the creation of objects. As you know, creating an object first requires the JVM to allocate space for it and perform all the linking operations, and then to manage the object in the garbage collection process. These two operations can have a performance impact when dealing with frequent short-lived object creation. See the following example:

package fr.fastconnect.java.performance;

public class ObjectCreationOverheadDemo {
	/* Static fields and methods */
	/**
	 * 10.000.000
	 */
	private final static int ITERATIONS=10000000;

	public static void main(String[] args) {
		// run both tests once without measuring, so that JVM
		// initialisation is not included in the measurement
		executeWithNewObject();
		executeWithExistingObject();
		// perform test and measure performance
		long start = System.currentTimeMillis();
		executeWithNewObject();
		long duration = System.currentTimeMillis()-start;
		System.out.println("Duration with new object creation : " + duration);
		start = System.currentTimeMillis();
		executeWithExistingObject();
		duration = System.currentTimeMillis()-start;
		System.out.println("Duration with single object       : " + duration);
	}

	private static void executeWithNewObject() {
		for(int i=0;i<ITERATIONS;i++) {
			ObjectCreationOverheadDemo demoObject = new ObjectCreationOverheadDemo();
			demoObject.intValue = i;
			demoObject.longValue = i;
			/*.. Application usage of the object ..*/
		}
	}

	private static void executeWithExistingObject() {
		ObjectCreationOverheadDemo demoObject = new ObjectCreationOverheadDemo();
		for(int i=0;i<ITERATIONS;i++) {
			demoObject.intValue = i;
			demoObject.longValue = i;
			/*.. Application usage of the object ..*/
		}
	}

	/*
	 * Object fields
	 */
	private int intValue;
	private long longValue;
}

Running the test on my machine I got the following output, which confirms the impact of object creation.

Duration with new object creation : 174
Duration with single object : 26

Note that these are small objects and the impact, while real, is small: about 150 milliseconds over ten million iterations may seem limited. However, think about the impact with bigger objects, in particular on garbage collection.

Working with String:

Java String methods are very fast; however, misusing String features can have a very heavy impact on your program's performance.

String concatenation

When you perform multiple string concatenations, for instance in a loop, use a StringBuilder instead of ‘+’ between the strings. Each ‘+’ in a loop copies the whole string built so far into a new String, whereas StringBuilder appends to a single growing buffer.

package fr.fastconnect.java.performance;

public class StringConcatenationDemo {
	/* Static fields and methods */
	/**
	 * 10.000
	 */
	private final static int ITERATIONS=10000;
	private final static String strToAppend = "Str To Append";

	public static void main(String[] args) {
		// run both tests once without measuring, so that JVM
		// initialisation is not included in the measurement
		executeStringPlus();
		executeStringBuilder();
		// perform test and measure performance
		long start = System.currentTimeMillis();
		String strPlusResult = executeStringPlus();
		long duration = System.currentTimeMillis()-start;
		System.out.println("Duration with '+' concatenation : " + duration);
		start = System.currentTimeMillis();
		String strBuilderResult = executeStringBuilder();
		duration = System.currentTimeMillis()-start;
		System.out.println("Duration with StringBuilder     : " + duration);
		System.out.println("String are the same : "+strPlusResult.equals(strBuilderResult));
	}

	private static String executeStringPlus() {
		String resultString="";
		for(int i=0;i<ITERATIONS;i++) {
			resultString += strToAppend;
		}
		return resultString;
	}

	private static String executeStringBuilder() {
		StringBuilder strBuilder = new StringBuilder();
		for(int i=0;i<ITERATIONS;i++) {
			strBuilder.append(strToAppend);
		}
		return strBuilder.toString();
	}
}

Running this example on my machine I got the following output:
Duration with ‘+’ concatenation : 7685
Duration with StringBuilder : 7
String are the same : true

Here again the benefit of StringBuilder is clear.

getBytes vs custom implementation

The getBytes method in Java is quite slow, mainly because of character encoding parameters and validations. In certain circumstances, where the encoding is the same on every machine and the text contains only characters that fit in a single byte (plain ASCII, for example), we can use an optimised custom implementation:

package fr.fastconnect.java.performance;

import java.util.Arrays;

public class StringGetBytesDemo {
	/* Static fields and methods */
	/**
	 * 100.000
	 */
	private final static int ITERATIONS=100000;
	private final static String str = "String to get bytes from";

	public static void main(String[] args) {
		// run both tests once without measuring, so that JVM
		// initialisation is not included in the measurement
		execute();
		executeCustom();
		// perform test and measure performance
		long start = System.currentTimeMillis();
		byte[] bytes = execute();
		long duration = System.currentTimeMillis()-start;
		System.out.println("Duration with getBytes : " + duration);
		start = System.currentTimeMillis();
		byte[] bytesCustom = executeCustom();
		duration = System.currentTimeMillis()-start;
		System.out.println("Duration with custom impl     : " + duration);
		System.out.println("byte arrays are the same : "+Arrays.equals(bytes, bytesCustom));
	}

	private static byte[] execute() {
		for(int i=0;i<ITERATIONS-1;i++) {
			str.getBytes();
		}
		return str.getBytes();
	}

	private static byte[] executeCustom() {
		for(int i=0;i<ITERATIONS-1;i++) {
			char buffer[] = new char[str.length()];
			int length = str.length();
			str.getChars(0, length, buffer, 0);
			byte b[] = new byte[length];
			for (int j = 0; j < length; j++) {
				b[j] = (byte) buffer[j];
			}
		}
		char buffer[] = new char[str.length()];
		int length = str.length();
		str.getChars(0, length, buffer, 0);
		byte b[] = new byte[length];
		for (int j = 0; j < length; j++) {
			b[j] = (byte) buffer[j];
		}
		return b;
	}
}

Note that in many cases the Java getBytes and the custom method will produce exactly the same byte array, but this is something you must check in your tests (providing a switch to use one or the other can also be a good idea). Running this example on my machine I got the following output:
Duration with getBytes : 74
Duration with custom impl : 15
byte arrays are the same : true
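
To show when the two approaches stop agreeing, here is a small check. The CharCastCheck class is hypothetical, and comparing against ISO-8859-1 is my own choice for the example: it is the charset in which every character up to 255 maps to the byte with the same value, which is exactly what the cast does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper that checks whether the char-to-byte cast is safe for a text. */
class CharCastCheck {
	/** Same technique as executeCustom: keep only the low 8 bits of each char. */
	static byte[] castToBytes(String text) {
		byte[] result = new byte[text.length()];
		for (int i = 0; i < result.length; i++) {
			result[i] = (byte) text.charAt(i);
		}
		return result;
	}

	/** True when the cast yields the same bytes as an ISO-8859-1 encoding. */
	static boolean matchesLatin1(String text) {
		return Arrays.equals(castToBytes(text), text.getBytes(StandardCharsets.ISO_8859_1));
	}

	public static void main(String[] args) {
		System.out.println(matchesLatin1("String to get bytes from")); // prints true
		// the euro sign (U+20AC) does not fit in one byte, the cast silently truncates it
		System.out.println(matchesLatin1("price: 10 \u20AC")); // prints false
	}
}
```

With a multi-byte default charset such as UTF-8, even an accented Latin-1 character would make the custom method and the no-argument getBytes disagree, so the cast is only a safe shortcut for text you know to be plain ASCII.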

Serialization tips

In many situations serialization can be costly in terms of both size and performance. The two are linked: the cost of IO operations depends on the size of the data, and the goal of serialization is, as you know, to send data over an IO channel.
In such situations, when performance is a must, it may be useful to use custom serialization.

Use Externalizable

To enable custom serialization for your object, implement the Externalizable interface. This interface declares two methods, writeExternal and readExternal, that let you handle the object's serialization yourself.
When you then serialize your objects through the standard Java mechanism, Java calls your Externalizable methods instead of the native serialization process.
Be careful to use this only if the custom serialization has better performance or a smaller footprint than classic Java serialization. This is mostly the case for objects that have complex fields and not only primitives.
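
A minimal sketch of what this could look like follows. The Quote class, its fields and the roundTrip helper are hypothetical. Note the public no-argument constructor, which Java needs in order to create the instance before calling readExternal.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;

/** Hypothetical value object that writes its own fields. */
class Quote implements Externalizable {
	private String symbol;
	private double price;

	/** Required by Externalizable: used when the object is read back. */
	public Quote() {
	}

	Quote(String symbol, double price) {
		this.symbol = symbol;
		this.price = price;
	}

	String getSymbol() { return symbol; }
	double getPrice() { return price; }

	public void writeExternal(ObjectOutput out) throws IOException {
		out.writeUTF(symbol);
		out.writeDouble(price);
	}

	public void readExternal(ObjectInput in) throws IOException {
		// fields must be read in the order they were written
		symbol = in.readUTF();
		price = in.readDouble();
	}

	/** Serializes and deserializes through the standard object streams. */
	static Quote roundTrip(Quote original) throws IOException, ClassNotFoundException {
		ByteArrayOutputStream bytes = new ByteArrayOutputStream();
		ObjectOutputStream out = new ObjectOutputStream(bytes);
		out.writeObject(original);
		out.close();
		ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
		return (Quote) in.readObject();
	}
}
```

The calling code does not change: it still uses writeObject and readObject, and only the body written for each Quote is under your control.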

Create byte array on your own

In some really extreme situations you may want to go even deeper and build the byte array yourself before writing it to a network or file stream (or basically any stream). Classic Java serialization writes class metadata along with the data, and all this overhead can be avoided with a custom byte array. Building the byte array yourself also lets you do custom batching and so on. This is really powerful but hurts code readability, so such implementations must be used carefully, be well documented and use methods with explicit names.
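
To give an idea of the overhead, here is a sketch that encodes a small object by hand and compares it with standard serialization. The Tick class and the 12-byte big-endian layout are assumptions made for this example, not a recommended wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class HandMadeEncodingDemo {
	/** Hypothetical record: an int id followed by a long timestamp. */
	static class Tick implements Serializable {
		private static final long serialVersionUID = 1L;
		final int id;
		final long timestamp;

		Tick(int id, long timestamp) {
			this.id = id;
			this.timestamp = timestamp;
		}
	}

	/** Writes the two fields big-endian: 4 bytes for id, 8 for timestamp. */
	static byte[] encode(Tick tick) {
		byte[] data = new byte[12];
		for (int i = 0; i < 4; i++) {
			data[i] = (byte) (tick.id >>> (24 - 8 * i));
		}
		for (int i = 0; i < 8; i++) {
			data[4 + i] = (byte) (tick.timestamp >>> (56 - 8 * i));
		}
		return data;
	}

	static Tick decode(byte[] data) {
		int id = 0;
		for (int i = 0; i < 4; i++) {
			id = (id << 8) | (data[i] & 0xFF);
		}
		long timestamp = 0;
		for (int i = 0; i < 8; i++) {
			timestamp = (timestamp << 8) | (data[4 + i] & 0xFF);
		}
		return new Tick(id, timestamp);
	}

	/** Size of the same object written with standard Java serialization. */
	static int javaSerializedSize(Tick tick) {
		try {
			ByteArrayOutputStream bytes = new ByteArrayOutputStream();
			ObjectOutputStream out = new ObjectOutputStream(bytes);
			out.writeObject(tick);
			out.close();
			return bytes.size();
		} catch (IOException e) {
			throw new IllegalStateException(e);
		}
	}

	public static void main(String[] args) {
		Tick tick = new Tick(42, System.currentTimeMillis());
		System.out.println("Hand-made encoding size   : " + encode(tick).length);
		System.out.println("Java serialization size   : " + javaSerializedSize(tick));
	}
}
```

The hand-made form is always 12 bytes, while the serialized form also carries the stream header and the class description.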

Try to size your data to avoid buffer resizing

When Java developers need to create a byte array (serializing data, dealing with various buffers…), the most commonly used class is ByteArrayOutputStream. This object lets you write bytes to a stream and stores them in a byte array. Its main feature is that it can grow this array at any time as you keep putting data in it. However, this growth is costly: each time the internal array is full, a bigger one is allocated and the existing content is copied into it. If you can size the data correctly, or if you can send the data in chunks of a defined size, I encourage you to use the ByteBuffer class, which will perform a lot better! The constraint is that its capacity is fixed: the byte array cannot be resized.
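
As a sketch of the fixed-size approach, assuming a hypothetical record made of an int, a long and a double (20 bytes in total):

```java
import java.nio.ByteBuffer;

/** Hypothetical fixed-size record encoder based on ByteBuffer. */
class FixedSizeEncoding {
	static final int RECORD_SIZE = 4 + 8 + 8; // int + long + double

	static byte[] encode(int id, long timestamp, double price) {
		// the capacity is known up front, so nothing is ever resized or copied
		ByteBuffer buffer = ByteBuffer.allocate(RECORD_SIZE);
		buffer.putInt(id).putLong(timestamp).putDouble(price);
		// a heap buffer exposes its backing array directly
		return buffer.array();
	}

	static int decodeId(byte[] record) {
		return ByteBuffer.wrap(record).getInt(0);
	}

	static double decodePrice(byte[] record) {
		// the price starts after the int (4 bytes) and the long (8 bytes)
		return ByteBuffer.wrap(record).getDouble(12);
	}
}
```

Writing past the capacity throws a BufferOverflowException instead of growing. If you have to stay with ByteArrayOutputStream, its constructor taking an initial size at least avoids the early resizes.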

When working directly with lower-level layers such as native networking, it is even possible to allocate the ByteBuffer in native memory (a direct buffer)! Putting data in it and handing it to the network layer will be even faster, but you won’t be able to get a backing byte array back in your Java application! Also, this kind of buffer uses native memory that lives outside the Java heap, so be even more careful regarding memory leaks ;)
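
A direct buffer is obtained with ByteBuffer.allocateDirect. The small sketch below (DirectBufferDemo is a hypothetical name) writes a value, flips the buffer as you would before handing it to a channel, and reads it back with the relative get methods, since no array is available.

```java
import java.nio.ByteBuffer;

/** Hypothetical demo of a buffer allocated outside the Java heap. */
class DirectBufferDemo {
	static long writeAndReadBack(long value) {
		ByteBuffer direct = ByteBuffer.allocateDirect(8);
		direct.putLong(value);
		// flip switches from writing to reading: limit = position, position = 0
		direct.flip();
		return direct.getLong();
	}

	public static void main(String[] args) {
		ByteBuffer direct = ByteBuffer.allocateDirect(8);
		System.out.println("isDirect : " + direct.isDirect());
		// typically false for direct buffers: check before calling array()
		System.out.println("hasArray : " + direct.hasArray());
		System.out.println("value    : " + writeAndReadBack(42L));
	}
}
```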

Multithreading:

Avoid thread common variables and synchronization of resources

When you can, avoid sharing any ‘thread common variables’. When you use a buffer, for example, try to create one per thread, even if you have to merge them at the end. To do so you can either use ThreadLocal variables or, even better, use completely separate instances (when you use Spring, remember that although singleton is the default bean scope, prototype scope can sometimes be a better approach). Even if this sounds basic, keep it in mind: a well-performing multithreaded implementation is only possible with the very minimum of synchronization points.
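
Here is what a per-thread buffer could look like with ThreadLocal. PerThreadBuffer and its format method are made-up names for the example.

```java
/** Hypothetical formatter where every thread reuses its own StringBuilder. */
class PerThreadBuffer {
	private static final ThreadLocal<StringBuilder> BUFFER = new ThreadLocal<StringBuilder>() {
		@Override
		protected StringBuilder initialValue() {
			// called once per thread, on the first get()
			return new StringBuilder(64);
		}
	};

	static String format(String key, int value) {
		StringBuilder builder = BUFFER.get();
		builder.setLength(0); // reset the buffer left over from the previous call
		builder.append(key).append('=').append(value);
		return builder.toString();
	}
}
```

No lock is needed because a thread can only ever see its own StringBuilder, and the buffer is also reused across calls instead of being recreated.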

Queue instead of synchronize

When your processing can be done asynchronously and you have a synchronization point, queue the work (java.util.concurrent.BlockingQueue) and use a single consumer. This way the other threads are not blocked by the synchronization point, which can help your application perform better. If you are afraid that the queue might grow too much, don’t worry: a BlockingQueue can be bounded. Once the limit is reached, producers block until there is room, so you pay for the synchronization again… Anyway, in many situations this can help ;)
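
A minimal sketch of this pattern, assuming a hypothetical AsyncLogWriter whose "slow resource" is just a StringBuilder: producers only enqueue, and a single consumer thread is the only one touching the sink. The capacity of 1024 and the STOP sentinel are arbitrary choices for the example.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical asynchronous writer: many producers, one consumer. */
class AsyncLogWriter {
	// sentinel compared by identity, so it can never clash with a real line
	private static final String STOP = new String("STOP");

	// bounded queue: put() blocks once 1024 lines are waiting
	private final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(1024);
	// only the consumer thread touches the sink, so it needs no lock
	private final StringBuilder sink = new StringBuilder();

	private final Thread consumer = new Thread(new Runnable() {
		public void run() {
			try {
				for (String line = queue.take(); line != STOP; line = queue.take()) {
					sink.append(line).append('\n');
				}
			} catch (InterruptedException e) {
				Thread.currentThread().interrupt();
			}
		}
	});

	void start() {
		consumer.start();
	}

	/** Called by producers; returns as soon as the line is queued. */
	void log(String line) throws InterruptedException {
		queue.put(line);
	}

	/** Asks the consumer to finish, waits for it and returns what was written. */
	String stopAndGet() throws InterruptedException {
		queue.put(STOP);
		consumer.join();
		return sink.toString();
	}
}
```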

Use optimized concurrent classes

Use the java.util.concurrent package, which contains many collections and tools with high-performance implementations for concurrent scenarios. Its classes will usually perform a lot better than other implementations or hand-written synchronized code.

Links

http://java.sun.com/developer/technicalArticles/Programming/Performance/

This list is of course not exhaustive, and I invite each of you to comment on the article and add your own tips ;)