Javaspecialists

Running on Java 24+36-3646 (Preview)

Home of The JavaSpecialists' Newsletter

220bRuntime.getRuntime().availableProcessors() (Follow-Up)

Author: Dr Heinz M. KabutzDate: 2014-05-30Java Version: 8Sources on GitHubCategory: Concurrency

Abstract: Whenever we have thread pools in our system, we need to make their size configurable. In this follow-up, we continue looking at some issues with the common fork/join pool that is used in Java 8.

A quick follow-up to the previous newsletter on Runtime.getRuntime().availableProcessors().

The parallelism of the common pool is typically calculated as availableProcessors() - 1, unless we specify it differently as a system property. However, if you run Java on a single-core machine, it sets the common pool parallelism to 1, whereas the more correct value would be zero. You might wonder whether anybody on this planet still has single-core machines? Actually, it has become vogue to give developers shared virtual development environments, where they are allocated a single core but oodles of memory. With virtualized boxes, it is quite likely that you will have just one core. The problem comes in when code such as the parallelSort algorithm tries to make a decision whether to run in parallel or to default to the old sort mechanism. In the Arrays.parallelSort() method, they look at the value of ForkJoinPool.getCommonPoolParallelism(). Since the value for a single-core and dual-core system is the same (1), it calls the default Arrays.sort() method. I thus stand by my assertion that the 30% performance boost claim on a dual-core machine is at best thumb sucked.

However, for streams it works a little bit better. Since for a dual-core system, we have a Fork/Join pool of size 1, we actually will have two threads working: 1. The thread in the pool and 2. the submitting thread. I made the typical mistake where you are asked to count all the people in the room and you forget to include yourself in the total :-) So parallel() does work - sort of. It has the correct number of workers based on the value that is returned by availableProcessors(). Except of course if you have a single-core machine, in which case parallel() will use two threads instead of just the one that is mapped to a physical core.

Pierre-yves Saumont kindly sent me a piece of code that proves that parallel() defaults to the availableProcessors() value (with some slight modifications by me - any mistakes are my fault :-))

import java.util.*;
import java.util.concurrent.*;
import java.util.stream.*;

public class CountThreads {
  private static Map<String, Integer> map = new ConcurrentHashMap<>();

  public static void main(String... args) {
    System.out.println(IntStream.range(1, 1_000_000).
        parallel().filter(CountThreads::isPrime).count());
    map.forEach((k, v) -> System.out.println(k + ": " + v));
  }

  public static boolean isPrime(int n) {
    map.compute(Thread.currentThread().getName(),
        (k, v) -> v == null ? 1 : v + 1);
    if (n % 2 == 0) return false;
    for (int i = 3; i * i <= n; i += 2) {
      if (n % i == 0) {
        return false;
      }
    }
    return true;
  }
}

On a dual-core and single-core systems, you'd see the following output:

78498
ForkJoinPool.commonPool-worker-1: 499999
main: 500000

On my 1-4-2 machine, with 8 hyperthreads, we see that 8 threads were part of the work:

78498
ForkJoinPool.commonPool-worker-7: 125000
ForkJoinPool.commonPool-worker-1: 125000
ForkJoinPool.commonPool-worker-2: 125000
main: 125000
ForkJoinPool.commonPool-worker-5: 125000
ForkJoinPool.commonPool-worker-6: 156249
ForkJoinPool.commonPool-worker-3: 125000
ForkJoinPool.commonPool-worker-4: 93750

If you measure the performance of something like parallel sort, which requires lots of memory access, you might find that your hyperthreads will give you close to linear speedup. This is because the biggest bottleneck is the speed at which the cache lines can be filled. Hyperthreading allows some of that to happen in parallel. Finding prime numbers is less dependent on memory and thus you would not see as much of a speedup beyond your physical cores.

In addition, the sorting algorithm for parallelSort() is different. For some of the data sets that I tried, a single-threaded Arrays.sort() was much faster than an 8-threaded Arrays.parallelSort(), especially if the data set was already sorted.

Lastly, I tried writing the CountThreads code with the Fork/Join framework. It came out looking like this:

import java.util.concurrent.*;

public class CountThreadsForkJoin {
  public static void main(String... args) {
    System.out.println(ForkJoinPool.commonPool().invoke(
        new PrimeSearcher(1, 1_000_000)));
  }

  private static class PrimeSearcher extends RecursiveTask<Integer> {
    private static final int THRESHOLD = 10000;
    private final int from;
    private final int to;

    public PrimeSearcher(int from, int to) {
      this.from = from;
      this.to = to;
    }

    protected Integer compute() {
      if (to - from < THRESHOLD) {
        return serialCount();
      }
      int from0 = from;
      int to0 = from + (to - from) / 2;
      int from1 = to0 + 1;
      int to1 = to;
      PrimeSearcher ps0 = new PrimeSearcher(from0, to0);
      PrimeSearcher ps1 = new PrimeSearcher(from1, to1);
      ps0.fork();
      return ps1.invoke() + ps0.join();
    }

    private Integer serialCount() {
      int count = 0;
      for (int i = from; i < to; i++) {
        if (CountThreads.isPrime(i)) count++;
      }
      return count;
    }
  }
}

I would like to meet the person who thinks that Java 8 streams are not an improvement over the pain of manually coding Fork/Join:

import java.util.stream.*;

public class CountThreadsStream {
  public static void main(String... args) {
    System.out.println(IntStream.range(1, 1_000_000).
        parallel().filter(CountThreads::isPrime).count());
  }
}

But time will tell how many use cases we can find for parallel() in the real world.

Kind regards from Crete

Heinz

Comments

We are always happy to receive comments from our readers. Feel free to send me a comment via email or discuss the newsletter in our JavaSpecialists Slack Channel (Get an invite here)

When you load these comments, you'll be connected to Disqus. Privacy Statement.

Browse the Newsletter Archive

Categories:

Subscribe Now

Get: every new article, exclusive invites to live webinars, our top-10 articles ever, access to expert tutorials & more!

Agree to JavaSpecialists Privacy Policy *

Read our Privacy Policy

About the Author

Java Champion, author of the Javaspecialists Newsletter, conference speaking regular... About Heinz

Superpack

Our entire Java Specialists Training in one huge bundle more...

Free Java Book

Java Training

We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations.

Java Consulting

We can help make your Java application run faster and trouble-shoot concurrency and performance bugs...

JavaSpecialists

We deliver relevant courses, by top Java developers to produce more resourceful and efficient programmers within their organisations. Find Out More

Javaspecialists

220bRuntime.getRuntime().availableProcessors() (Follow-Up)

Author: Dr Heinz M. KabutzDate: 2014-05-30Java Version: 8Sources on GitHubCategory: Concurrency

Comments

Related Articles

Browse the Newsletter Archive

About the Author

Superpack

Free Java Book

Java Training

Java Consulting

JavaSpecialists

Quick Links

Connect With Us

Javaspecialists

220bRuntime.getRuntime().availableProcessors() (Follow-Up)

Author: Dr Heinz M. KabutzDate: 2014-05-30Java Version: 8Sources on GitHubCategory: Concurrency

Comments

Related Articles

282 Biased locking a goner, but better things Loom ... 2020-07-21

160 The Law of the Uneaten Lutefisk 2008-05-12

241b Concurrency Puzzle Helpful Hints 2016-10-04

Browse the Newsletter Archive

Subscribe Now

About the Author

Superpack

Free Java Book

Java Training

Java Consulting

JavaSpecialists

Quick Links

Connect With Us