258 ShuffleCollector

Author: Dr. Heinz M. Kabutz
Date: 2018-04-09
Java Version: 8
Category: Tips and Tricks

Abstract: Sorting a stream is easy. But what if we want the opposite: shuffling? We can shuffle a List with Collections.shuffle(List). But how can we apply that to a Stream? In this newsletter we show how with Collectors.collectingAndThen().

 

Welcome to the 258th edition of The Java(tm) Specialists' Newsletter. We celebrated Orthodox Easter Sunday in the remote Cretan mountain village of Kampoi. Our kids had hopped into the Viano and after about an hour of winding roads we alighted at our friends' house. My son had come earlier and with Koumparos Giorgos was grilling a huge mountain of lamb chops. They urged me to start munching right away. "Braaier's privilege", we call that in South Africa. A cup of excellent Nikolioudakis wine was thrust into my hands with shouts of "Chronia Polla". Best lamb I tasted on Crete, and I have sampled extensively. We feasted and chatted and had a thoroughly enjoyable and relaxing time. Everyone should experience Greek Easter. Just one piece of advice if you celebrate it on Crete: don't go hiking in lonely forests on that particular day.

javaspecialists.teachable.com: Please visit our new self-study course catalog to see how you can upskill your Java knowledge.

ShuffleCollector

Our JavaSpecialists.EU website is migrating to Java 9. Before, it was on Java 7. Yes, yes, what were we thinking? Selling a course on refactoring to Java 8 streams and then not doing the transition for our own site? A lot of the students attending our courses in 2018 are stuck in the Java 6 universe on various projects. In an ideal world, we would all be deploying under Java 10. But there are practical considerations that prevent us from moving as fast as we want to. JavaSpecialists.EU purred along in Java 7, with not enough "win" in the new language features to encourage us to migrate. Don't fix what ain't broken. Recently, with a bit of spare time available, we upgraded to Java 9 despite the lacklustre tangible benefits.

I am pleased with the result. The old Collection filtering and transformation code is gone and in its stead we have Java 8 Streams and Lambdas. A sight of beauty. Only one small method kept the ugly collection mess: getRandomRelatedNewsletters(). When someone reads one of our articles, we suggest three random newsletters of the same category. We used Collections.shuffle(List) to mix them around a bit. Here is the pre-Java 8 code:

public Collection<Newsletter> getRandomRelatedNewsletters(
    String issue, String category, int numberOfNewsletters) {
  List<Newsletter> result = new ArrayList<>(
    getNewsletters(category));
  result.remove(new Newsletter(issue));
  Collections.shuffle(result);
  return numberOfNewsletters < result.size() ?
     result.subList(0, numberOfNewsletters) : result;
}

A few smells: It would be nicer to generate a stream that we can then filter to exclude the current issue, rather than calling remove() on the result list. The return statement is also ugly. Why should we care if the list is already smaller than the desired numberOfNewsletters, as long as it is not larger? And the way that we have to call shuffle on the list is not nice either.

I set out to explore whether we could shuffle the stream itself. After several attempts, I created the ShuffleCollector. It produces a List of items and then shuffles them, taking an optional Random supplier, and then converts them back to a Stream. I used the Collectors.collectingAndThen() method, which allows us to add a "finisher" to a standard Collector. Here is how our code now looks:

public Collection<Newsletter> getRandomRelatedNewsletters(
    String issue, String category, int numberOfNewsletters) {
  return getNewsletters(category).stream()
      .filter(newsletter -> !newsletter.getIssue().equals(issue))
      .collect(ShuffleCollector.shuffle())
      .limit(numberOfNewsletters)
      .collect(toList());
}

This reads better than the previous version. We are saying that we want all newsletters of a particular category. We filter out the current issue, shuffle them, limit the stream to the numberOfNewsletters and lastly collect them to a List.

We could have let collect(ShuffleCollector.shuffle()) return a List, but a Stream is more practical. This way we can very easily produce a maximum sub-stream, or map them to another type, or do whatever else we might want to with a stream.
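To see what a "finisher" does in isolation, here is a minimal sketch of Collectors.collectingAndThen() with a different finisher: it collects to a List and then wraps it with Collections.unmodifiableList(). The class name FinisherDemo and the method readOnlyList() are made up for this illustration and are not part of the website code.

```java
import java.util.*;
import java.util.stream.*;

class FinisherDemo {
  // Hypothetical helper: collect to a List, then apply a finisher that
  // wraps the result so that callers cannot modify it.
  static <T> Collector<T, ?, List<T>> readOnlyList() {
    return Collectors.collectingAndThen(
        Collectors.toList(), Collections::unmodifiableList);
  }

  public static void main(String... args) {
    List<String> names = Stream.of("b", "a", "c")
        .sorted()
        .collect(readOnlyList());
    System.out.println(names);
    try {
      names.add("d"); // the finisher's wrapper rejects modification
    } catch (UnsupportedOperationException e) {
      System.out.println("read-only");
    }
  }
}
```

The finisher runs once, after all elements have been accumulated. That is the hook we need for shuffling, since a shuffle can only happen once every element is known.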

Here is what ShuffleCollector looks like:

import java.util.*;
import java.util.concurrent.*;
import java.util.function.*;
import java.util.stream.*;

public class ShuffleCollector {
  public static <T> Collector<T, ?, Stream<T>> shuffle() {
    return shuffle(ThreadLocalRandom::current);
  }

  public static <T> Collector<T, ?, Stream<T>> shuffle(
      Supplier<? extends Random> random) {
    return Collectors.collectingAndThen(Collectors.toList(),
        ts -> {
          Collections.shuffle(ts, random.get());
          return ts.stream();
        });
  }
}
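One caveat: the Javadoc for Collectors.toList() makes no guarantees about the mutability of the returned List, although in practice it has been an ArrayList. A more defensive variant, sketched below under the assumption that we want a guaranteed mutable list, collects into an ArrayList explicitly. The class name DefensiveShuffleCollector and the helper randomLengths() are hypothetical and only serve this example.

```java
import java.util.*;
import java.util.function.*;
import java.util.stream.*;

class DefensiveShuffleCollector {
  // Same idea as ShuffleCollector.shuffle(Supplier), but the downstream
  // collector is asked for an ArrayList, which is certainly mutable.
  static <T> Collector<T, ?, Stream<T>> shuffle(
      Supplier<? extends Random> random) {
    Collector<T, ?, List<T>> toArrayList =
        Collectors.toCollection(ArrayList::new);
    return Collectors.collectingAndThen(toArrayList,
        ts -> {
          Collections.shuffle(ts, random.get());
          return ts.stream();
        });
  }

  // Hypothetical usage: pick up to 'max' random words, map each to its length.
  static List<Integer> randomLengths(
      List<String> words, int max, Supplier<? extends Random> random) {
    return words.stream()
        .collect(shuffle(random))
        .limit(max)            // fewer than max elements is fine
        .map(String::length)   // ordinary stream operations still apply
        .collect(Collectors.toList());
  }

  public static void main(String... args) {
    List<String> words = Arrays.asList("alpha", "beta", "gamma", "pi");
    System.out.println(randomLengths(words, 3, () -> new Random(42)));
    System.out.println(randomLengths(words, 10, () -> new Random(42)));
  }
}
```

The usage method also shows why returning a Stream is convenient: limit() and map() follow the shuffle without any intermediate variables.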

By default we want to shuffle using ThreadLocalRandom, since it is by far the fastest random number generator in the JDK. However, we might not know which thread will end up calling the finishing lambda in the second shuffle() method, so it is better to pass in a Supplier<? extends Random> than an instance of ThreadLocalRandom. Since Java 8, ThreadLocalRandom is a singleton and the seed is stored in the Thread object. It is seeded in the current() method. We thus should never store an instance of ThreadLocalRandom in a field or pass it to a method. It should never escape from the methods in which we use it. We can store it in a local variable, as long as no lambda captures it.
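The following sketch illustrates the point about threads. It builds the collector on one thread, runs collect() on another thread named "worker", and records which thread actually called the supplier. The class RandomSupplierDemo and its method names are invented for this example, and it carries its own copy of the collector so that it compiles alone.

```java
import java.util.*;
import java.util.concurrent.*;
import java.util.concurrent.atomic.*;
import java.util.function.*;
import java.util.stream.*;

class RandomSupplierDemo {
  // Local copy of the collector shown above.
  static <T> Collector<T, ?, Stream<T>> shuffle(
      Supplier<? extends Random> random) {
    return Collectors.collectingAndThen(Collectors.toList(),
        ts -> {
          Collections.shuffle(ts, random.get());
          return ts.stream();
        });
  }

  // Creates the collector on the calling thread, collects on "worker",
  // and returns the name of the thread that invoked the supplier.
  static String threadThatAskedForRandom() {
    AtomicReference<String> asker = new AtomicReference<>();
    Supplier<Random> recording = () -> {
      asker.set(Thread.currentThread().getName());
      return ThreadLocalRandom.current(); // looked up by the finishing thread
    };
    Collector<Integer, ?, Stream<Integer>> collector = shuffle(recording);
    Thread worker = new Thread(
        () -> Stream.of(1, 2, 3).collect(collector).forEach(i -> { }),
        "worker");
    worker.start();
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return asker.get();
  }

  public static void main(String... args) {
    System.out.println(threadThatAskedForRandom());
  }
}
```

Because the supplier is only invoked inside the finisher, ThreadLocalRandom.current() is called by the thread that does the shuffling, here the worker. Had we passed in the ThreadLocalRandom instance itself, it would have been obtained on the thread that created the collector.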

We did not implement this method for primitives, since there is no Arrays.shuffle(int[]) method, only Collections.shuffle(List). In all likelihood a shuffled object stream will be sufficient. If needed, we can create a shuffled primitive stream by boxing, shuffling and unboxing again: .boxed().collect(ShuffleCollector.shuffle()).mapToInt(Integer::intValue).

Here is a working example:

import java.util.*;
import java.util.concurrent.*;
import java.util.function.*;
import java.util.stream.*;

public class PrimitiveShuffleCollectorTest {
  private static void printRandom(
      int from, int upto, Supplier<Random> randomSupplier) {
    int[] shuffled = IntStream.range(from, upto)
        .boxed()
        .collect(ShuffleCollector.shuffle(randomSupplier))
        .limit(5)
        .mapToInt(Integer::intValue)
        .toArray();
    System.out.println(Arrays.toString(shuffled));
  }

  public static void main(String... args) {
    printRandom(0, 10, ThreadLocalRandom::current);
    printRandom(0, 10, () -> new Random(0));
  }
}

Output is something like:

[5, 4, 8, 2, 0]
[4, 8, 9, 6, 3]

The first line changes between calls. The second line is always the same, since we have seeded the Random with 0.
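If the boxing ever turned out to be too expensive, one alternative would be to shuffle the int[] directly. The sketch below is not what we use on the website. It swaps in a hand-written Fisher-Yates shuffle, and the class name IntArrayShuffle and method shuffled() are made up for this illustration.

```java
import java.util.*;
import java.util.stream.*;

class IntArrayShuffle {
  // Drains the stream into an array and shuffles it in place with
  // Fisher-Yates: walk from the end, swapping each slot with a randomly
  // chosen slot at or before it.
  static int[] shuffled(IntStream stream, Random random) {
    int[] values = stream.toArray();
    for (int i = values.length - 1; i > 0; i--) {
      int j = random.nextInt(i + 1); // 0 <= j <= i
      int tmp = values[i];
      values[i] = values[j];
      values[j] = tmp;
    }
    return values;
  }

  public static void main(String... args) {
    int[] result = shuffled(IntStream.range(0, 10), new Random(0));
    System.out.println(Arrays.toString(result));
  }
}
```

This returns an array rather than a stream. We can carry on with Arrays.stream(result) if further stream operations are needed.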

Kind regards from Crete

Heinz

 

