Home of The JavaSpecialists' Newsletter

024Self-tuning FIFO Queues

Posted: 2001-07-06Category: PerformanceJava Version: Dr. Heinz M. Kabutz



Welcome to the 24th issue of The Java(tm) Specialists' Newsletter. Winter has arrived in Cape Town with great rain and cold temperatures, much to the chagrin of the author. The only way to combat such is to sit next to the heater with a glass of South African 1997 Zonnebloem Cabernet Sauvignon and write Java newsletters. This newsletter has taken the most time to date - performance tests tend to have that effect on me :(((

Before I go on today's voyage, some feedback regarding the Socket Wheel idea:

I'm not quite sure who gave the best contributions regarding possible improvements for last newsletter's SocketWheel. Neil Bartlett (Algorithmics Canada) and John Vlissides (IBM USA) pointed out that the design was very close to the Reactor pattern by Schmidt. This is a great plus mark for patterns; they truly are things that OO developers around the world would come up with independently. Josh Rehman pointed out that it is not too clever to try minimze memory as that is very cheap nowadays, which I agree with. He also pointed out that JDK 1.4 has non-blocking IO making it possibly a lot easier to implement such a SocketWheel. Ecce Jezuch (Poland) suggested using the available() method to find out how many bytes would be available without blocking, but unfortunately under Windows the JDK always returned 0.

James Pereira provided some excellent information regarding sockets. It's quite technical, so I'm including it verbatim:

"Registered Ports, ports between 1024 and 49151, are listed by the IANA and on most systems can be used by applications or programs executed by users. Table C.2 specifies the port used by the server process as its contact port. The IANA registers uses of these ports as a convenience to the Internet community. To the extent possible, these same port assignments are used with UDP. The Registered Ports are in the numerical range of 1024-49151. The Registered Ports between 1024 and 5000 are also referred to as the Ephemeral Ports. At least on Windows , The TCP/Stack (OS) re-uses these ports internally on every socket connection cycling from 1024...5000 wrapping around to 1024 again. This could lead to some interesting problems if sockets are opened and close very quickly as there is usually a time delay before that port is made available again...

"Second, the number of user-accessible ephemeral ports that can be used to source outbound connections is configurable with the MaxUserPort registry entry (HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters). By default, when an application requests any socket from the system to use for an outbound call, a port between the values of 1024 and 5000 is supplied. You can use the MaxUserPort registry entry to set the value of the highest port number to be used for outbound connections. For example, setting this value to 10000 would make approximately 9000 user ports available for outbound connections. For more details, see RFC 793. See also the MaxFreeTcbs and MaxHashTableSize registry settings (HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters).

"BTW: The stack and memory size issue is only an NT issue. On NT the stack size is configurabale via linker and EditBin.exe unfortunately we don't want to mess with the VM."

This information should solve the problem that I encountered with too many sockets causing exceptions.

Ideally we should be able to specify stack size per thread as some threads will need a lot and others only a little, so I still think that we have a problem with many threads and their stack size.

Another excellent contribution was made by Craig Larman, author of "Java 2 Performance and Idiom Guide", who suggested using an approach of multiplexing sockets described in his book to minimize the number of sockets needed per client connection. Next week I will try and write about this idea and give you a solution that hopefully will work acceptably. I always recommend Craig's book as an excellent book for the Java fundi who wants to have his mind activated regarding performance ideas. It is very difficult to write anything about performance as it is so dependent on your implementation and hardware. This leads me to this week's newsletters on self-tuning FIFO queues....

NEW: Refactoring to Java 8 Lambdas and Streams Workshop Are you currently using Java 6 or 7 and would like to see how Java 8 can improve your code base? Are you tired of courses that teach you a whole bunch of techniques that you cannot apply in your world? Check out our one day intensive Refactoring to Java 8 Lambdas and Streams Workshop.

Self-tuning FIFO Queues

HotSpot(tm) has caused a lot of trouble for the Java Specialist in the field, especially those of us with a few years experience. All of a sudden, all the hard-earned performance knowledge was wiped out in one foul swoop. The only thing left for us to do was to write the simplest code that could possibly work and let the HotStop compiler sort out performance for us. I liken it to the great financial crisis of Japan in the 90's, where no one knew whether he was coming or going and all the old certainties went out of the window. Luckily, unlike in Japan, we are not taking this too seriously, so our windows are still closed. Fortunately everyone knows that Java is slow ;-)

A lot of the performance tricks we used in old code actually make our code slower under HotSpot. Since we don't know what the performance of our code will be for a specific platform, would it be completely hairbrained to write self-tuning code? Write 3 algorithms, let the program measure on the target platform which performs best, and choose that algorithm for the duration of the VM.

To illustrate this idea, I want to write a FIFO queue that is based on a java.util.List implementation. A while ago I discovered that java.util.ArrayList is sometimes faster than java.util.LinkedList for FIFO queue implementations. The switch over occurs at a specific point in time, which we can measure beforehand. The switch-over point is dependent on the VM, whether we are using HotSpot(tm), etc. For example, on my little notebook with 256MB RAM and Pentium III 700, the cross-over point is 50 elements in the list when I use HotSpot, but 500 elements when I switch off hotspot compiling (sometimes!).

The interface that the FIFO queues will implement is very simple:

public interface FIFO {
  /** Add an object to the end of the FIFO queue */
  boolean add(Object o);
  /** Remove an object from the front of the FIFO queue */
  Object remove();
  /** Return the number of elements in the FIFO queue */
  int size();

We implement this interface and extend ArrayList and LinkedList:

// FIFOArrayList.java
import java.util.ArrayList;
public class FIFOArrayList extends ArrayList implements FIFO {
  public Object remove() {
    return remove(0);
// FIFOLinkedList.java
import java.util.LinkedList;
public class FIFOLinkedList extends LinkedList implements FIFO {
  public Object remove() {
    return remove(0);

We also write a SwappingOverFIFOQueue which has values for HIGH and LOW water marks. When we reach a HIGH water mark and we are busy using an ArrayList, we start using a LinkedList. On the contrary, if we reach a LOW water mark and we are busy using a LinkedList we start using an ArrayList.

In foresight to my next example, I have made it possible to set the watermarks, which also checks the optimal list types for all the lists currently in the system. We have to be careful to not get a memory leak by keeping handles to the instances of the SwappingOverFIFOQueue so we use WeakReferences to hold the references. Have a look at newsletter 15 for a discussion on Weak / Soft References.

// SwappingOverFIFOQueue.java
import java.util.*;
import java.lang.ref.WeakReference;
public final class SwappingOverFIFOQueue implements FIFO {
  /** The low value after which we switch over to ArrayList */
  private static int LOW = 30;
  /** The high value after which we switch down to LinkedList */
  private static int HIGH = 70;
  /** This list contains weak references of instances of this
      class */
  private static List instances = new LinkedList();
  /** We add the weak references in an initializer block */
    instances.add(new WeakReference(this));
  /** When we set the low and high water marks we go through all
      the existing instances and check for the optimal list type.
      If the references is null we remove the WeakReference from
      our instance list. */
  public static void setWaterMarks(int low, int high) {
    LOW = low;
    HIGH = high;
    Iterator it = instances.iterator();
    while(it.hasNext()) {
      WeakReference ref = (WeakReference)it.next();
      SwappingOverFIFOQueue q = (SwappingOverFIFOQueue)ref.get();
      if (q == null) {
      } else {

  private List list = new ArrayList();
  public Class getListType() { return list.getClass(); }
  private int size = 0;
  public boolean add(Object o) {
    try {
      return true;
    } finally {
      if (++size == HIGH) checkOptimalListType();
  public Object remove() {
    try {
      return list.remove(0);
    } finally {
      if (--size == LOW) checkOptimalListType();
  public int size() {
    return size;
  private void checkOptimalListType() {
    if (size >= HIGH && (!(list instanceof LinkedList))) {
      list = new LinkedList(list);
    } else if (size <= LOW && (!(list instanceof ArrayList))) {
      list = new ArrayList(list);

My test program takes the number of entries in the queue and then illustrates how often we can add/remove in 2 seconds for each of the queues. I found that you get the best performance results when you run your tests for about 2 seconds each, so I count iterations rather than milliseconds.

// SwappingOverFIFOQueueTest.java
import java.util.*;
public class SwappingOverFIFOQueueTest {
  private static int ENTRIES;
  public static void test(FIFO queue) {
    for (int i=0; i<ENTRIES; i++) {
      queue.add(new Object());
    long up_to = System.currentTimeMillis() + 2000;
    int iterations = 0;
    while(System.currentTimeMillis() <= up_to) {
      queue.add(new Object());
    System.out.println("\t" + iterations + " iterations");
  public static void main(String[] args) {
    if (args.length != 1) {
        "Usage: java SwappingOverFIFOQueueTest entries");
    ENTRIES = Integer.parseInt(args[0]);
    System.out.println("Entries = " + ENTRIES);
    test(new FIFOArrayList());
    test(new FIFOLinkedList());
    SwappingOverFIFOQueue q = new SwappingOverFIFOQueue();
    System.out.println("Current queue implementation " +

On my notebook, when I run this program with 0 entries, I get the following output:

Entries = 0
class FIFOArrayList
        4552883 iterations
class FIFOLinkedList
        2551017 iterations
class SwappingOverFIFOQueue
        3594810 iterations

With 100 entries I get:

Entries = 100
class FIFOArrayList
        1800877 iterations
class FIFOLinkedList
        2509328 iterations
class SwappingOverFIFOQueue
        2158451 iterations

And with 10000 entries I get:

Entries = 10000
class FIFOArrayList
        49500 iterations
class FIFOLinkedList
        812933 iterations
class SwappingOverFIFOQueue
        758657 iterations

We can thus see that the SwappingFIFOQueue is always faster than the worst case and slower than the best case, as one would logically expect. However, I chose the HIGH and LOW values from some tests that I made on my notebook, for that specific JVM. If I take the JDK 1.2.2 that comes with JBuilder, for 100 entries I get:

Entries = 100
class FIFOArrayList
        1434122 iterations
class FIFOLinkedList
        1307108 iterations
class SwappingOverFIFOQueue
        1178115 iterations

Or if I use the -Xint mode for JDK 1.3 under Windows to switch off the hotspot compiler, for 100 entries I get

Entries = 100
class FIFOArrayList
        497550 iterations
class FIFOLinkedList
        480599 iterations
class SwappingOverFIFOQueue
        392314 iterations

In both cases, the values of the SwappingOverFIFOQueue were worse than for both the ArrayList and the LinkedList.

We therefore write a Profiler that finds ideal HIGH/LOW water marks for the JVM that is running on your system and sets up the SwappingOver water marks.

// SwappingOverFIFOQueueProfiler.java
import java.util.*;
  For the sake of brevity we only consider two implementations
  of java.util.List, namely java.util.ArrayList and
  java.util.LinkedList. */
public class SwappingOverFIFOQueueProfiler {
  private static boolean isArrayListFaster(int entries) {
    System.out.println("isArrayListFaster(" + entries + ")");
    return countIterations(new ArrayList(), entries) >
      countIterations(new LinkedList(), entries);
  private static int countIterations(List list, int entries) {
    for (int i=0; i<entries; i++) {
      list.add(new Object());
    long end = System.currentTimeMillis() + 1000;
    int iterations = 0;
    while(System.currentTimeMillis() <= end) {
      list.add(new Object());
    return iterations;

  static {
    int checks = 0;
    int watermark = 1;
    int bestWatermark = 0;
    for (int i=0; i<16; i++) {
      if (isArrayListFaster(watermark)) {
        bestWatermark = Math.max(watermark, bestWatermark);
        watermark *= 2.0;
      } else {
        watermark *= 0.75;
        if (watermark <= bestWatermark)
          watermark *= 1.25;
    System.out.println("Best watermark = " + bestWatermark);
    int low = (int)(bestWatermark * 0.75);
    int high = (int)(bestWatermark * 1.25);
    System.out.println("Setting LOW to " +  low +
      " and HIGH to " + high);
    SwappingOverFIFOQueue.setWaterMarks(low, high);
  public static void main(String[] args) {
    SwappingOverFIFOQueueTest.main(new String[] { "0" });
    SwappingOverFIFOQueueTest.main(new String[] { "100" });
    SwappingOverFIFOQueueTest.main(new String[] { "10000" });

If we load this class in our system then it will do measurements of where the swap-over between ArrayList and LinkedList performance occurs. On my computer, with JDK 1.3 and HotSpot, the swap-over was measured to happen at about 32 entries in the list. When I switch off the HotSpot, it occurs at about 121 entries, and under JDK 1.2.2 it happens at about 303.

After spending about 10 hours on this stupid newsletter, I have to conclude that it would be better to stick to a LinkedList for a FIFO queue as it is a better "average" performer. Perhaps the lesson I've learnt from this newsletter is that we must be careful of writing code which is too complicated as it tends to be more difficult to optimize. As performance guru Craig Larman pointed out though, we must be sure to not ignore performance altogether; our customers might just kill the project if the prototypes perform like dogs.

I always appreciate any feedback, both positive and negative, so please keep sending your ideas and suggestions. Please also remember to take the time to send this newsletter to others who are interested in Java.



Related Articles

Browse the Newsletter Archive

About the Author


Java Champion, author of the Javaspecialists Newsletter, conference speaking regular... About Heinz

Java Training

We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.

Java Consulting

Nobody ever wants to call a Java performance consultant, but with first-hand experience repairing and improving commercial Java applications - JavaSpecialists are a good place to start...

Threading Emergency?

If your system is down, we will review it for 15 minutes and give you our findings for just 1 € without any obligation.