Home of The JavaSpecialists' Newsletter

220bavailableProcessors() Follow-up

Posted: 2014-05-30Category: Java 8Java Version: 1.8+Dr. Heinz M. Kabutz

Abstract: Whenever we have thread pools in our system, we need to make their size configurable. In this follow-up, we continue looking at some issues with the common fork/join pool that is used in Java 8.


A quick follow-up to the previous newsletter on Runtime.getRuntime().availableProcessors().

The parallelism of the common pool is typically calculated as availableProcessors() - 1, unless we specify it differently as a system property. However, if you run Java on a single-core machine, it sets the common pool parallelism to 1, whereas the more correct value would be zero. You might wonder whether anybody on this planet still has single-core machines? Actually, it has become vogue to give developers shared virtual development environments, where they are allocated a single core but oodles of memory. With virtualized boxes, it is quite likely that you will have just one core. The problem comes in when code such as the parallelSort algorithm tries to make a decision whether to run in parallel or to default to the old sort mechanism. In the Arrays.parallelSort() method, they look at the value of ForkJoinPool.getCommonPoolParallelism(). Since the value for a single-core and dual-core system is the same (1), it calls the default Arrays.sort() method. I thus stand by my assertion that the 30% performance boost claim on a dual-core machine is at best thumb sucked.

However, for streams it works a little bit better. Since for a dual-core system, we have a Fork/Join pool of size 1, we actually will have two threads working: 1. The thread in the pool and 2. the submitting thread. I made the typical mistake where you are asked to count all the people in the room and you forget to include yourself in the total :-) So parallel() does work - sort of. It has the correct number of workers based on the value that is returned by availableProcessors(). Except of course if you have a single-core machine, in which case parallel() will use two threads instead of just the one that is mapped to a physical core.

Pierre-yves Saumont kindly sent me a piece of code that proves that parallel() defaults to the availableProcessors() value (with some slight modifications by me - any mistakes are my fault :-))

import java.util.*;
import java.util.concurrent.*;
import java.util.stream.*;

public class CountThreads {
  private static Map<String, Integer> map = new ConcurrentHashMap<>();

  public static void main(String... args) {
    System.out.println(IntStream.range(1, 1_000_000).
    map.forEach((k, v) -> System.out.println(k + ": " + v));

  public static boolean isPrime(int n) {
        (k, v) -> v == null ? 1 : v + 1);
    if (n % 2 == 0) return false;
    for (int i = 3; i * i <= n; i += 2) {
      if (n % i == 0) {
        return false;
    return true;

On a dual-core and single-core systems, you'd see the following output:

ForkJoinPool.commonPool-worker-1: 499999
main: 500000

On my 1-4-2 machine, with 8 hyperthreads, we see that 8 threads were part of the work:

ForkJoinPool.commonPool-worker-7: 125000
ForkJoinPool.commonPool-worker-1: 125000
ForkJoinPool.commonPool-worker-2: 125000
main: 125000
ForkJoinPool.commonPool-worker-5: 125000
ForkJoinPool.commonPool-worker-6: 156249
ForkJoinPool.commonPool-worker-3: 125000
ForkJoinPool.commonPool-worker-4: 93750

If you measure the performance of something like parallel sort, which requires lots of memory access, you might find that your hyperthreads will give you close to linear speedup. This is because the biggest bottleneck is the speed at which the cache lines can be filled. Hyperthreading allows some of that to happen in parallel. Finding prime numbers is less dependent on memory and thus you would not see as much of a speedup beyond your physical cores.

In addition, the sorting algorithm for parallelSort() is different. For some of the data sets that I tried, a single-threaded Arrays.sort() was much faster than an 8-threaded Arrays.parallelSort(), especially if the data set was already sorted.

Lastly, I tried writing the CountThreads code with the Fork/Join framework. It came out looking like this:

import java.util.concurrent.*;

public class CountThreadsForkJoin {
  public static void main(String... args) {
        new PrimeSearcher(1, 1_000_000)));

  private static class PrimeSearcher extends RecursiveTask<Integer> {
    private static final int THRESHOLD = 10000;
    private final int from;
    private final int to;

    public PrimeSearcher(int from, int to) {
      this.from = from;
      this.to = to;

    protected Integer compute() {
      if (to - from < THRESHOLD) {
        return serialCount();
      int from0 = from;
      int to0 = from + (to - from) / 2;
      int from1 = to0 + 1;
      int to1 = to;
      PrimeSearcher ps0 = new PrimeSearcher(from0, to0);
      PrimeSearcher ps1 = new PrimeSearcher(from1, to1);
      return ps1.invoke() + ps0.join();

    private Integer serialCount() {
      int count = 0;
      for (int i = from; i < to; i++) {
        if (CountThreads.isPrime(i)) count++;
      return count;

I would like to meet the person who thinks that Java 8 streams are not an improvement over the pain of manually coding Fork/Join:

import java.util.stream.*;

public class CountThreadsStream {
  public static void main(String... args) {
    System.out.println(IntStream.range(1, 1_000_000).

But time will tell how many use cases we can find for parallel() in the real world.

Kind regards from Crete



Related Articles

Browse the Newsletter Archive

About the Author


Java Champion, author of the Javaspecialists Newsletter, conference speaking regular... About Heinz

Java Training

We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.

Java Consulting

Nobody ever wants to call a Java performance consultant, but with first-hand experience repairing and improving commercial Java applications - JavaSpecialists are a good place to start...

Threading Emergency?

If your system is down, we will review it for 15 minutes and give you our findings for just 1 € without any obligation.