|
The Java Specialists' Newsletter
Issue 155 2008-01-16
Category:
Concurrency
Java version: 5+
Subscribe Free
RSS Feed
The Law of the Micromanagerby Dr. Heinz M. KabutzAbstract:
In good Dilbert style, we want to avoid having
Pointy-Haired-Bosses (PHBs) in our code. Commonly called
micromanagers, they can make a system work extremely
inefficiently. My prediction is that in the next few years,
as the number of cores increases per CPU, lock contention
is going to be the biggest performance problem facing
companies.
Welcome to the 155th issue of The Java(tm) Specialists' Newsletter. Last Tuesday, we had
a bit of a break from the rain here on Crete, so I packed my
laptop in my Suzuki Jimny and headed for a more interesting location.
As you are a Java Specialist reader this will almost certainly be of
interest. We have recently launched a course specifically designed for top
Java professionals like yourself. The course is the result of all the
knowledge and experience gained from publishing 150 advanced Java
Newsletters, teaching hundreds of seminars and writing thousands of lines of
Java code and offers Java programmers the chance to truly master the Java
Programming Language. For more information check out the
Master's Course.
Plus we are currently offering the Java Specialist Master Course in
Oslo, Norway from
the 28th to 31st of January, together with our partner Bouvet.
The Law of the Micromanager
We are looking at a series of laws to make sense of Java concurrency. Just
to remind you, here they are again. In this newsletter, we will look at the
Law of the Micromanager.
- The Law of the Sabotaged Doorbell
- The Law of the Distracted Spearfisherman
- The Law of the Overstocked Haberdashery
- The Law of the Blind Spot
- The Law of the Leaked Memo
- The Law of the Corrupt Politician
- The Law of the Micromanager
- The Law of Cretan Driving
- The Law of Sudden Riches
- The Law of the Uneaten Spinach
The Law of the Micromanager
Even in life, it wastes effort and
frustrates the other threads.
mi·cro·man·age: to manage or control with excessive attention to
minor details.
When tuning the performance of a system, we typically start
by measuring the utilisation of hardware (CPU, disk, memory
and IO). If we do not find the bottleneck, we go one step up
and look at the number of threads plus the garbage collector
activity. If we still cannot find the bottleneck, in all
likelihood we have a thread contention problem, unless our
measurements or test harness were incorrect.
In the past few laws, I have emphasised the importance of
protecting data against corruption. However, if we add too
much protection and at the wrong places, we may end up
serializing the entire system. The CPUs can then not be
utilized fully, since they are waiting for one core to exit a
critical section.
A friend of mine sent me some code that he found during a code review. The
programmers had declared String WRITE_LOCK_OBJECT, pointing to
a constant String, like so:
String WRITE_LOCK_OBJECT = "WRITE_LOCK_OBJECT";
// Later on in the class
synchronized(WRITE_LOCK_OBJECT) { ... }
But wait a minute? String constants are stored in the intern
table, thus if the String
"WRITE_LOCK_OBJECT" occurs in two
classes, they will point to the same object in the constant
pool. We can demonstrate this with classes A and B, which
each contain fields with a constant String.
public class A {
private String WRITE_LOCK_OBJECT = "WRITE_LOCK_OBJECT";
public void compareLocks(Object other) {
if (other == WRITE_LOCK_OBJECT) {
System.out.println("they are identical!");
System.out.println(
System.identityHashCode(WRITE_LOCK_OBJECT));
System.out.println(
System.identityHashCode(other));
} else {
System.out.println("they do not match");
}
}
public static void main(String[] args) {
A a1 = new A();
A a2 = new A();
a1.compareLocks(a2.WRITE_LOCK_OBJECT);
}
}
When you run A.main(), you see that with two instances of A, the field
WRITE_LOCK_OBJECT is pointing to the same String instance.
they are identical!
11394033
11394033
Similarly, in B we compare the internal String to the String inside A, and
again they are identical:
public class B {
private String WRITE_LOCK_OBJECT = "WRITE_LOCK_OBJECT";
public static void main(String[] args) {
B b = new B();
A a = new A();
a.compareLocks(b.WRITE_LOCK_OBJECT);
}
}
they are identical!
11394033
11394033
If the String had been created with new, it would
have been a different object, but I still think this is a bad idiom to use
for locking:
public class C {
private String WRITE_LOCK_OBJECT =
new String("WRITE_LOCK_OBJECT");
public static void main(String[] args) {
C c = new C();
A a = new A();
a.compareLocks(c.WRITE_LOCK_OBJECT);
a.compareLocks(c.WRITE_LOCK_OBJECT.intern());
}
}
As we would expect, the Strings are now different objects, but if we
intern() it, it would point to the same object again:
they do not match
they are identical!
11394033
11394033
Since he had repeated this pattern throughout his code, his entire system
was synchronizing on one lock object. This would not only cause terrible
contention, but also raise the possibilities of deadlocks and livelocks.
You could also lose signals by having several threads waiting on a lock by
mistake. Basically, this is a really bad idea, so please do not use
it in your code.
Amdahl's Law
Before we continue, we should consider Amdahl's Law in relation to
parallelization. According to Wikipedia,
Amdahl's law states that if F is the fraction of
a calculation that is sequential (i.e. cannot benefit from parallelization),
and (1 - F) is the fraction that can be
parallelized, then the maximum speedup that can be achieved by using
N processors is
1
-------------
F + (1 - F)/N
For example, if F is even just 10%, the problem
can be sped up by a maximum of a factor of 10, no matter what the
value of N is. For all practical
reasons, the benefit of adding more cores decreases as we get
closer to the theoretical maximum speed-up of 10. Here
is an example of how we can calculate it:
public class Amdahl {
private static double amdahl(double f, int n) {
return 1.0 / (f + (1 - f) / n);
}
public static void main(String[] args) {
for (int i = 1; i < 10000; i *= 3) {
System.out.println("amdahl(0.1, " + i + ") = " + amdahl(0.1, i));
}
}
}
We can see from the output that the benefit of adding more
cores decreases as we get closer to the theoretical maximum
of 10:
amdahl(0.1, 1) = 1.0
amdahl(0.1, 3) = 2.5
amdahl(0.1, 9) = 5.0
amdahl(0.1, 27) = 7.5
amdahl(0.1, 81) = 9.0
amdahl(0.1, 243) = 9.642857142857142
amdahl(0.1, 729) = 9.878048780487804
amdahl(0.1, 2187) = 9.959016393442623
amdahl(0.1, 6561) = 9.986301369863012
(Thanks to Jason Oikonomidis and Scott Walker for pointing
this out)
Atomics and Compare-And-Swap (CAS)
Since Java 5, the language includes support for atomics. Instead of
synchronizing access to our fields, we can use atomic references or
atomic primitives. Atomics use the Compare-And-Swap
approach, supported by hardware (the CMPXCHG instruction on Intel).
For example, if you want to do a ++i operation with
AtomicInteger, you would call the
AtomicInteger.addAndGet(int delta) method. This
would use an optimistic algorithm that assumes we will not have a conflict.
If we do have a conflict, we simply try again. The addAndGet()
method does something like this:
-
get the current value and store it in a local variable current
-
store the new value (current + delta) in a local variable next
-
call compareAndSet with the current value and the next value
-
inside the compareAndSet, if the current value matches the
value in the AtomicInteger, then return true; otherwise false
-
if compareAndSet returns true, we return next; otherwise start from
1.
We can thus have thread safe code without explicit locking. There is still
a memory barrier though, since the field inside the AtomicInteger is marked
as volatile to prevent the visibility problem seen
(or perhaps not seen would be more appropriate ;-) in The Law of the Blind Spot.
If we look back at the example in our previous law, The Law of the Corrupt Politician, we can simplify
it greatly by using an AtomicInteger, instead of explicitly
locking. In addition, the throughput should be better as well:
import java.util.concurrent.atomic.AtomicInteger;
public class BankAccount {
private final AtomicInteger balance =
new AtomicInteger();
public BankAccount(int balance) {
this.balance.set(balance);
}
public void deposit(int amount) {
balance.addAndGet(amount);
}
public void withdraw(int amount) {
deposit(-amount);
}
public int getBalance() {
return balance.intValue();
}
}
There are other approaches for reducing contention. For
example, instead of using HashMap or
Hashtable for shared
data, we could use the ConcurrentHashMap. This
map partitions the buckets into several sections, which can
then be modified independently. When we construct the
ConcurrentHashMap, we can specify how many
partitions we would like to have, called the
concurrencyLevel (default is 16). The
ConcurrentHashMap is an excellent class for
reducing contention.
Another useful class in the JDK is the
ConcurrentLinkedQueue, which uses an efficient
wait-free algorithm based on one described in Simple,
Fast, and Practical Non-Blocking and Blocking Concurrent
Queue Algorithms. It also uses the compare-and-swap
approach that we saw with the atomic classes.
Since Java 6, we also have the ConcurrentSkipListMap
and the ConcurrentSkipListSet as part of the
JDK.
There is a lot more that I could write on the subject of
contention, but I encourage you to do your own research on
this very important topic, which will become even more
essential as the number of cores increases.
Kind regards from Crete
Heinz
Concurrency Articles
Related Java Course
|