Java Finalization is Bullshit

The purpose of this article is to discuss some poorly understood mechanisms of the Java Virtual Machine. If you already know all of them, you should contact our HR department about working at FastConnect ;-)

Consider the following program. If we run it with the Java heap limited to 8 MB (using the option -Xmx8m), it fails with an ‘OutOfMemoryError’. But sometimes this happens after 1000 loops, and other times after only 100 loops. It seems random.

class Padding
{
	private byte[] padding;
	Padding() {
		this.padding = new byte[1024*1024]; //1Mbyte
	}
	protected void finalize() throws Throwable {
		this.padding[0] = this.padding[1];
	}
}

public class Main
{
	public static void main(String[] args) throws Exception {
		int cpt = 0;
		try {
			for(;; ++cpt) {
				new Padding();
			}
		} catch (OutOfMemoryError oom) {
			System.err.println("oom after "+cpt+" loops.");
			throw oom;
		}
	}
}

Here is a little quiz. Can you explain this behavior? Is there a leak in the program? Why is the number of iterations before the OOM random?

Before reading on, I suggest you run this little program yourself. You may have to increase the size of the padding array to reproduce the described behaviour.
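When experimenting, it can help to confirm that the heap limit is really applied. The helper below is a minimal sketch: the class name HeapInfo and its methods are hypothetical (they are not part of the original program), and it only relies on the standard java.lang.Runtime API.

```java
// Hypothetical helper, not part of the original program.
final class HeapInfo {
	// Upper bound the JVM will try to use for the heap (driven by -Xmx).
	static long maxBytes() {
		return Runtime.getRuntime().maxMemory();
	}

	// Approximate number of heap bytes currently in use.
	static long usedBytes() {
		Runtime rt = Runtime.getRuntime();
		return rt.totalMemory() - rt.freeMemory();
	}

	public static void main(String[] args) {
		System.err.println("max heap:  " + maxBytes() / (1024 * 1024) + " MB");
		System.err.println("used heap: " + usedBytes() / (1024 * 1024) + " MB");
	}
}
```

Run it with the same -Xmx option as the test program. The reported maximum is only approximately the requested value, because the JVM may exclude part of the heap from this figure.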

The first question is not very difficult. The random behavior is due to the garbage collector. To provide good performance, recent Java VMs manage memory concurrently with the execution of your code. That is why, if the Java heap is not big enough, an OOM is possible even if there is no leak, and even if the live objects always fit within the Java heap…

That’s why you must never be stingy with memory. A Java program needs plenty of free memory for its memory management to be safe and efficient.

The next questions are trickier. If we remove the ‘finalize method’, the program works fine, even with a very small heap. Can you explain why?

class Padding
{
	private byte[] padding;
	Padding() {
		this.padding = new byte[1024*1024]; //1Mbyte
	}
}

public class Main
{
	public static void main(String[] args) throws Exception {
		for(;;) {
			new Padding();
		}
	}
}

Again, I suggest you run this program. If you open ‘jconsole’ or ‘jvisualvm’, you should see very unusual memory activity: the curve is flat (apart from a few very rare bumps). So, I have two questions:

  • Why is the curve of this very simple program so flat?
  • Why does the existence of a ‘finalize method’ disrupt memory management?

As a Java expert, I expect you have already read (several times) the documentation about the Java garbage collector, its tuning and its algorithms.

The default GC uses a generational algorithm in which allocation in the Eden space is as cheap as a stack allocation: a simple pointer increment. With our simple program, Padding objects are never promoted from the young generation to the tenured generation during a collection. This means that objects are allocated very quickly, one after the other. At some point later the copying GC is triggered, finds no live object, and copies nothing. In this particular (but not rare) situation, Java memory management is more efficient than C++ malloc, because allocations are stack-like and deallocations are batched.
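To make the idea concrete, here is a toy model of such an allocation region. It is only a sketch under simplifying assumptions: ToyEden and ToyEdenDemo are hypothetical names, addresses are plain integers, and every object is assumed to be dead when a collection starts (as in our loop). It is not how a real JVM is implemented.

```java
// Toy model only: hypothetical names, not JVM code.
final class ToyEden {
	private final int capacity;  // size of the region
	private int top = 0;         // next free offset (the "bump pointer")
	private int collections = 0; // number of simulated young collections

	ToyEden(int capacity) {
		this.capacity = capacity;
	}

	// Allocating is a bounds check plus a pointer increment.
	int allocate(int size) {
		if (size > capacity) {
			throw new OutOfMemoryError("object larger than the region");
		}
		if (top + size > capacity) {
			collect();
		}
		int address = top;
		top += size;
		return address;
	}

	// Assumption: nothing is alive, so there is nothing to copy.
	// Freeing the whole region is just resetting the pointer.
	private void collect() {
		top = 0;
		++collections;
	}

	int collections() {
		return collections;
	}
}

class ToyEdenDemo {
	public static void main(String[] args) {
		ToyEden eden = new ToyEden(8);
		for (int i = 0; i < 1000; ++i) {
			eden.allocate(1);
		}
		System.err.println("collections: " + eden.collections());
	}
}
```

In this model the cost of freeing does not depend on the number of dead objects, which is the point of the flat curve: used memory climbs to the top of the region and drops back to zero, over and over.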

I hope my explanations were clear. If not, you should read the following article: Java theory and practice: Urban performance legends, revisited.

But we have not answered the last and trickiest question: why does the existence of a ‘finalize method’ disrupt memory management? A hint is in the title of this article. Please consider the following program:

class Padding
{
	static java.util.Set<Padding> retention = new java.util.HashSet<Padding>();
	private byte[] padding;
	private int id;
	Padding(int id) {
		this.id = id;
		this.padding = new byte[1024*1024]; //1Mbyte
	}
	protected void finalize() throws Throwable {
		System.err.println("finalize is called for Padding[id="+id+"]");
		retention.add(this);
	}
}

public class Main
{
	public static void main(String[] args) throws Exception {
		int cpt = 0;
		try {
			for(;; ++cpt) {
				new Padding(cpt);
			}
		} catch (OutOfMemoryError oom) {
			System.err.println("oom after "+cpt+" loops.");
			throw oom;
		}
	}
}

Again, I suggest you test it several times with different heap sizes (-Xmx64m for example). Its behaviour is very stable: for a given heap size, it always fails at the same point.

On blogs and forums, I often read “finalize is called when the object is reclaimed”. This is wrong! When a finalizable object is no longer referenced, the JVM adds it to its finalization queue. At some point later, the JVM’s finalizer thread dequeues it, calls its ‘finalize method’, and records that its finalizer has been called. Only when the garbage collector later rediscovers that the object is unreachable does it reclaim it.
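The lifecycle just described can be sketched as a small state machine. This is a toy model, not JVM code: ToyFinalization, its states and the resurrectsItself flag are hypothetical names introduced for illustration. The flag stands for a ‘finalize method’ that stores ‘this’ somewhere reachable, as the last program does with its retention set.

```java
// Toy model only: hypothetical names, not JVM code.
final class ToyFinalization {
	enum State { UNFINALIZED, QUEUED, FINALIZED, RECLAIMED }

	static final class Obj {
		State state = State.UNFINALIZED;
		boolean reachable = true;
		boolean resurrectsItself = false; // finalize() makes it reachable again
	}

	private final java.util.List<Obj> heap = new java.util.ArrayList<Obj>();
	private final java.util.Deque<Obj> queue = new java.util.ArrayDeque<Obj>();

	Obj allocate() {
		Obj o = new Obj();
		heap.add(o);
		return o;
	}

	// One collection cycle: an unreachable object that still needs
	// finalization is only queued and keeps its memory. It is reclaimed
	// by a later cycle, once its finalizer has run.
	void gc() {
		for (Obj o : heap) {
			if (o.reachable) {
				continue;
			}
			if (o.state == State.UNFINALIZED) {
				o.state = State.QUEUED;
				queue.add(o);
			} else if (o.state == State.FINALIZED) {
				o.state = State.RECLAIMED;
			}
		}
	}

	// What the finalizer thread does whenever it gets to run.
	void runFinalizerThread() {
		Obj o;
		while ((o = queue.poll()) != null) {
			o.state = State.FINALIZED; // finalize() would be called here
			if (o.resurrectsItself) {
				o.reachable = true;
			}
		}
	}

	// Number of objects whose memory has not been given back yet.
	int retained() {
		int n = 0;
		for (Obj o : heap) {
			if (o.state != State.RECLAIMED) {
				++n;
			}
		}
		return n;
	}
}
```

In this model an object needs at least two collection cycles, with a run of the finalizer thread in between, before its memory is released. If the allocating loop is faster than the finalizer thread, queued objects pile up, and an object that resurrects itself is never reclaimed at all.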

This behaviour is a problem: it makes the code of the garbage collector very complex, it is error-prone for JVM providers, and it consumes a lot of memory. It is the main reason why finalization is absent from the Java specifications for mobile phones and (credit) cards.

Besides, there is no good use of Java finalization. For example, if you rely on a ‘finalize method’ to close sockets, you will have a big problem when you have plenty of free memory but not enough file descriptors: nothing triggers a collection, so the sockets stay open. Garbage collectors are good at managing managed memory, and only that.
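A common alternative is to release such resources explicitly and deterministically. The sketch below uses try-with-resources (available since Java 7) with a hypothetical ToyConnection class standing in for a socket; the static counter exists only so that the demo can show that nothing stays open.

```java
// Hypothetical resource standing in for a socket or a file handle.
final class ToyConnection implements AutoCloseable {
	static int open = 0; // number of connections currently open
	private boolean closed = false;

	ToyConnection() {
		++open;
	}

	// Releases the resource; calling it twice is harmless.
	@Override
	public void close() {
		if (!closed) {
			closed = true;
			--open;
		}
	}
}

class ToyConnectionDemo {
	public static void main(String[] args) {
		for (int i = 0; i < 1000; ++i) {
			try (ToyConnection c = new ToyConnection()) {
				// use the connection here
			} // close() runs here, even if the body throws
		}
		System.err.println("still open: " + ToyConnection.open);
	}
}
```

Here the resource is released as soon as the block ends, so the number of open descriptors no longer depends on when, or whether, the garbage collector runs.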

You can refer to this article which explains how finalization works and how not to use it.

Cyril Martin (mcoolive).


About Cyril Martin

Architect with 8 years of experience in development and administration, with particular expertise in object-oriented programming, distributed and complex environments, and grid computing, distributed cache, NoSQL database, compilation and virtualization technologies. Currently head of the HPC and Cloud Computing divisions at FastConnect.