Thursday, February 28, 2013

Flushing with a volatile!

Core Java provides several ways to make the memory actions of one thread visible to other threads. You can probably think of synchronization, marking a variable volatile, or using a thread-safe collection from the java.util.concurrent package.

Today we will explore another, less obvious approach. We will use a volatile variable to ensure visibility of another, non-volatile variable. Let's start with a simple but flawed example:
class BadTask implements Runnable {
    // Plain field: nothing makes a write from another thread visible here.
    boolean keepRunning = true;

    @Override
    public void run() {
        // Spin until some other thread clears the flag.
        while (keepRunning) {
        }
        System.out.println("Done.");
    }
}

public class VolatileExp {
    public static void main(String[] args) throws InterruptedException {
        BadTask r = new BadTask();
        new Thread(r).start();
        Thread.sleep(1000);
        // Plain write: the worker thread is not guaranteed to ever see it.
        r.keepRunning = false;
        System.out.println("keepRunning is false");
    }
}
The intention of this code is to let BadTask run for 1 second and then stop it by setting the keepRunning boolean to false. As simple as it looks, this code is likely to fail: BadTask may never notice the change and keep running until you terminate the program manually. If it works fine in your environment, try a different one. In my case the above code fails consistently on the following setup:
$ cat /proc/cpuinfo | egrep "core id|physical id" | tr -d "\n" | sed s/physical/\\nphysical/g | grep -v ^$ | sort | uniq | wc -l
12
$ java -version
java version "1.6.0_26"
Java(TM) SE Runtime Environment (build 1.6.0_26-b03)
Java HotSpot(TM) 64-Bit Server VM (build 20.1-b02, mixed mode)
$ cat /proc/version
Linux version 2.6.32-33-server (buildd@allspice) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5) ) #72-Ubuntu SMP Fri Jul 29 21:21:55 UTC 2011
If the program does not stop, you may wonder what happened. In short, nothing in the code forces the write made by the main thread to become visible to the thread running BadTask. The two threads usually run on different cores, each with its own registers and caches, and the JIT compiler is also free to read keepRunning once and keep it in a register, since the loop itself never changes it. Either way, the Java memory model does not guarantee that the new value of keepRunning ever reaches the code running in the other thread.

OK, how can we fix it? The simplest and most correct way is to mark this variable volatile. Another approach would be to acquire a common lock whenever accessing it, but that would definitely be overkill.
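
For reference, here is a minimal sketch of that simplest fix. The names VolatileFlagDemo, Task and stopsWithin are made up for this illustration, and the daemon thread and the timeout are my additions so that the demo always terminates; they are not part of the original example.

```java
class VolatileFlagDemo {
    static class Task implements Runnable {
        // volatile: each loop iteration must re-read the field, and a write
        // made by another thread is guaranteed to become visible.
        volatile boolean keepRunning = true;

        @Override
        public void run() {
            while (keepRunning) {
                // busy-wait until another thread clears the flag
            }
            System.out.println("Done.");
        }
    }

    // Hypothetical helper: starts a Task, clears the flag after delayMillis
    // and reports whether the worker stopped within timeoutMillis.
    static boolean stopsWithin(long delayMillis, long timeoutMillis) {
        Task task = new Task();
        Thread worker = new Thread(task);
        worker.setDaemon(true); // a stuck worker must not keep the JVM alive
        worker.start();
        try {
            Thread.sleep(delayMillis);
            task.keepRunning = false; // volatile write
            worker.join(timeoutMillis);
        } catch (InterruptedException e) {
            task.keepRunning = false;
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopsWithin(1000, 5000));
    }
}
```

The only change that matters compared to the flawed version is the volatile keyword on keepRunning; everything else is scaffolding for observing the result.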

So what will we do today? We will introduce another variable marked with the volatile keyword! In the above code it does not make much sense and only serves to demonstrate some aspects of the Java memory model. But think about a scenario with several variables like keepRunning. Have a look at the code below, which no longer has the visibility problem:
class BadTask implements Runnable {
    boolean keepRunning = true;
    // Volatile guard: its write/read pair carries keepRunning along.
    volatile boolean flush = true;

    @Override
    public void run() {
        while (keepRunning) {
            // Volatile read; the value is ignored, only the memory effect matters.
            boolean ignored = flush;
        }
        System.out.println("BadTask is done.");
    }
}

public class VolatileExp {
    public static void main(String[] args) throws InterruptedException {
        BadTask r = new BadTask();
        new Thread(r).start();
        Thread.sleep(1000);
        r.keepRunning = false;
        // Volatile write, placed after the plain write on purpose.
        r.flush = false;
        System.out.println("keepRunning is false");
    }
}
So, as already mentioned, we have introduced a new volatile variable, “flush”. We do two things with it. First, we write to it in the main thread, right after modifying the non-volatile keepRunning variable. Second, we read it in the thread running BadTask.
Now, how come the new value of keepRunning becomes visible to BadTask? This is guaranteed by the current Java memory model. According to JSR-133, “writing to a volatile field has the same memory effect as a monitor release, and reading from a volatile field has the same memory effect as a monitor acquire”. Thus, everything one thread did to memory before writing to a volatile variable is visible to another thread once that thread has read the written value from the same variable. The order matters: the main thread writes keepRunning before flush, and BadTask reads flush before its next check of keepRunning.
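
The scenario with several variables could look roughly like the sketch below: three plain fields are published by a single volatile write and picked up after a single volatile read. All names (PiggybackDemo, Settings, publish, awaitAndRead, demo) are invented for this illustration, so treat it as a sketch of the idea rather than production code.

```java
class PiggybackDemo {
    static class Settings {
        int width;              // plain field
        int height;             // plain field
        String name;            // plain field
        volatile boolean ready; // the only volatile field, used as a guard

        void publish(int w, int h, String n) {
            width = w;
            height = h;
            name = n;
            ready = true; // volatile write comes last
        }

        String awaitAndRead() {
            while (!ready) {
                // volatile read comes first; spin until the guard is set
            }
            // These plain reads happen after the volatile read that saw true,
            // so they see the values written before the volatile write.
            return name + " " + width + "x" + height;
        }
    }

    // Hypothetical helper: runs one reader thread against one publisher.
    static String demo() {
        Settings s = new Settings();
        String[] result = new String[1];
        Thread reader = new Thread(() -> result[0] = s.awaitAndRead());
        reader.start();
        s.publish(640, 480, "window");
        try {
            reader.join(); // join makes result[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Swapping the order on either side breaks the guarantee: if ready were written before the plain fields, or read after them, the reader could observe stale values.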

This is an advanced technique that should be used sparingly, and only when performance has the highest priority. If you are looking for a real-life use of it, have a look at ConcurrentHashMap from the java.util.concurrent package.
