Java 1.9 Diary


  • void ifPresentOrElse(Consumer<? super T> action, Runnable emptyAction)

    Applies the action to the value if one is present, otherwise runs the empty action. Example:

    Optional<Integer> option = Optional.of(1);
    option.ifPresentOrElse(System.out::println,()->System.out.println("not found"));
  • Optional<T> or(Supplier<? extends Optional<? extends T>> supplier)

    Returns this Optional if a value is present, otherwise returns the Optional produced by the supplier. Say we want to try to compute some value using different methods and stop at the first one that succeeds:

    Optional<Integer> fromX() { /* ... */ }
    Optional<Integer> fromY() { /* ... */ }

    Optional<Integer> result = fromX().or(() -> fromY());
  • Stream<T> stream()

    Converts an Optional to a Stream, which is very useful when combined with streams. Say you have a number of Optionals and you want to drop the empty ones and keep the values of the non-empty ones: simply convert each to a Stream and flatMap.

Stream<Integer> nonEmpty = Stream.of(fromX(), fromY(), fromZ()).flatMap(Optional::stream);


  • T requireNonNullElse(T obj, T defaultObj)

    Returns the first argument if it is non-null, otherwise returns defaultObj; throws a NullPointerException if both obj and defaultObj are null.

  • T requireNonNullElseGet(T obj, Supplier<? extends T> supplier)

    Similar to the previous method, but this one takes a supplier rather than a value, so the default is only computed when it is actually needed.
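
    Both helpers live in java.util.Objects. A quick sketch of how they behave (variable names are illustrative):

    String value = null;
    String a = Objects.requireNonNullElse(value, "fallback");          // "fallback"
    String b = Objects.requireNonNullElseGet(value, () -> "computed"); // supplier runs only because value is null
    String c = Objects.requireNonNullElse("present", "fallback");      // "present"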

  • int checkIndex(int index, int length)

    Throws an IndexOutOfBoundsException if index < 0, index >= length, or length < 0; otherwise returns index.

  • int checkFromToIndex(int fromIndex, int toIndex, int length)

    Throws an IndexOutOfBoundsException if fromIndex < 0, fromIndex > toIndex, toIndex > length, or length < 0; otherwise returns fromIndex.

  • int checkFromIndexSize(int fromIndex, int size, int length)

    Throws an IndexOutOfBoundsException if fromIndex < 0, size < 0, fromIndex + size > length, or length < 0; otherwise returns fromIndex.
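
    A quick sketch of the three checks, all static methods on java.util.Objects (the argument values are illustrative):

    Objects.checkIndex(3, 10);            // ok, returns 3
    Objects.checkFromToIndex(2, 5, 10);   // ok, returns 2
    Objects.checkFromIndexSize(2, 5, 10); // ok, returns 2
    Objects.checkIndex(10, 10);           // throws IndexOutOfBoundsException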


  • Map<K,V> of()

    Returns an empty immutable Map

    Map<String,Integer> empty = Map.of();
  • Map<K,V> of(K k1, V v1, ..., K k10, V v10)

    Returns an immutable Map that contains the provided key-value pairs

    Map<String,Integer> pairs = Map.of("hi", 1, "there", 2);
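
    A few properties of these factories worth remembering, noted here as comments on the example above:

    // pairs.put("new", 3)       -> UnsupportedOperationException: the returned map is immutable
    // Map.of("hi", 1, "hi", 2)  -> IllegalArgumentException: duplicate keys are rejected
    // Map.of("hi", null)        -> NullPointerException: null keys and values are rejected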

Linked Lists Are Bad (Part 1)

This might not be a surprising fact to any performance hacker, but it can be quite shocking from a Big-O point of view.

The claim: Linked Lists are bad and don’t have any use case.

In this post we will see that even in the simplest and most common use case, sequentially iterating over the list (a linear scan), an ArrayList performs much better.

Problem: Linear scan over a list.

LinkedList Big-O: O(n)

ArrayList Big-O: O(n)

So both structures have the same Big-O characteristics for this use case. Now let’s hack a quick JMH benchmark to see if this holds.

@State(Scope.Benchmark)
public class ListsBenchmark {
    private LinkedList<Integer> linkedList;
    private ArrayList<Integer> arrayList;

    @Setup
    public void setup() {
        linkedList = new Random().ints().limit(10_000_000).boxed()
                .collect(Collectors.toCollection(LinkedList::new));
        arrayList = new Random().ints().limit(10_000_000).boxed()
                .collect(Collectors.toCollection(ArrayList::new));
    }

    @Benchmark
    public void iterateLinkedList(Blackhole blackhole) {
        for (Integer i : linkedList) blackhole.consume(i);
    }

    @Benchmark
    public void iterateArrayList(Blackhole blackhole) {
        for (Integer i : arrayList) blackhole.consume(i);
    }
}

When running it with the following configuration

public static void main(String[] args) throws RunnerException {
    // builder calls reconstructed to match the results below (avgt, 10 iterations, ms/op)
    Options options = new OptionsBuilder().include(ListsBenchmark.class.getSimpleName())
            .mode(Mode.AverageTime).timeUnit(TimeUnit.MILLISECONDS)
            .forks(1).measurementIterations(10).build();
    new Runner(options).run();
}

It gives the following

Benchmark                         Mode  Cnt    Score    Error  Units
ListsBenchmark.iterateArrayList   avgt   10   58.399 ± 14.834  ms/op
ListsBenchmark.iterateLinkedList  avgt   10  132.461 ±  7.075  ms/op

Showing that scanning an ArrayList is way more efficient than scanning a LinkedList.



Locality. Locality matters. Reading a value from main memory is expensive compared to reading it from the cache. Arrays get the best out of your cache lines: a single load pulls a batch of adjacent, relevant elements into the cache, while chasing linked-list pointers scatters reads all over the heap.

Memories are the new disks. Cliff Click.

Another problem with linked lists is their memory footprint. Every node requires a separate object allocation and holds at least two references, to the next and previous nodes. This problem simply doesn’t exist in ArrayList, where all you allocate is one big backing array plus a few bookkeeping fields.
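
If you want a feel for the footprint difference, here is a rough, unscientific sketch; it assumes an otherwise quiet JVM, and the numbers will vary by JVM and collector:

Runtime rt = Runtime.getRuntime();
System.gc();
long before = rt.totalMemory() - rt.freeMemory();

List<Integer> list = new LinkedList<>();   // swap for new ArrayList<>() and compare
for (int i = 0; i < 1_000_000; i++) {
    list.add(i);
}

System.gc();
long after = rt.totalMemory() - rt.freeMemory();
System.out.println("approx bytes used: " + (after - before));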

Higher-Kinded Types: Any Use?

What is a kind anyway?

From Odersky’s paper Generics of a Higher Kind: Conceptually, kinds are used to distinguish a type parameter that stands for a proper type, such as List[Int], from a type parameter that abstracts over a type constructor, such as List.

And from the same paper: generalisation to types that abstract over types that abstract over types.

In simple words, higher-kinded types are higher-order types: just as higher-order functions let functions be treated as values (passed as parameters and returned from functions), higher-kinded types let type constructors be abstracted over as type parameters.

What do languages with poor type-systems like Java do for expressing higher-kinded types?

They give up! A map or a filter over a Stream in Java returns a Stream, and the programmer is asked to explicitly convert the stream back to the desired proper type:

List<Integer> lengths = strings.stream().map(c -> c.length()).collect(Collectors.toList());

Libraries like Guava provide an endless list of static helpers with almost the same functionality in order to give you the same type back.
Whereas in Scala a map over a List returns a List and a map over an Option returns an Option.

How would one achieve type soundness like this?

Higher-kinded types are the answer.


Let’s start by writing a Functor using a higher-kinded type.

  trait Functor[F[_]] {
    def map[A, B](fa: F[_ <: A])(f: A => B): F[B]
  }

Note that funny F[_]! It is not a plain type parameter; it abstracts over a type constructor, a higher-order type.

Now let’s reinvent a couple of algebraic data types.

  sealed trait Maybe[+A]
  case class Just[A](value: A) extends Maybe[A]
  case object Nope extends Maybe[Nothing]

  sealed trait Try[+A]
  case class Failed(throwable: Throwable) extends Try[Nothing]
  case class Result[A](value: A) extends Try[A]

And their respective functors so we can map over them.

  object MaybeFunctor extends Functor[Maybe] {
    override def map[A, B](fa: Maybe[_ <: A])(f: A => B): Maybe[B] = fa match {
      case Nope    => Nope
      case Just(a) => Just(f(a))
    }
  }

  object TryFunctor extends Functor[Try] {
    override def map[A, B](fa: Try[_ <: A])(f: A => B): Try[B] = fa match {
      case failed @ Failed(_) => failed
      case Result(value)      => Result(f(value))
    }
  }

Nothing special so far; we could just call the specialised functor directly and get the proper type back. However, with this abstraction we can express a universal map that returns the most specific type.

def universalMap[A, B, F[_]](source: F[_ <: A])(fn: A => B)(implicit functor: Functor[F]): F[B] =
  functor.map[A, B](source)(fn)

Using the universal map.

val option: Maybe[String] = Nope
implicit val choiceFunctor = MaybeFunctor
val optionMapped: Maybe[Int] = universalMap(option)(_.length)
val tryValue: Try[String] = Result("Foo")
implicit val tryFunctor = TryFunctor
val tryMapped: Try[Int] = universalMap(tryValue)(_.length)

Have you noticed the return types? The universal map returns a Maybe when mapped over a Maybe and a Try when mapped over a Try, with no loss of type information.



Building a Custom Collector

In a previous post we looked at parallel reduction and gave a conceptual background on using the ForkJoin model to parallelise an embarrassingly parallel operation. Reduction is at the heart of building parallel algorithms. In this post, we will take this knowledge a step further and use it to build a custom Collector that works with a parallel pipeline.

Motivation: Why Would You Build a Collector?

  • Curiosity.
  • The collector you need is not provided by the JDK.
  • You don’t want to use old-style loops, not because loops are bad but because they are hard to parallelise.
  • The last thing you want to do is write a ForkJoin task by hand.

A SumAndCount Collector

The collector that we will be building works on a Stream of Integers, accumulates the sum in a BigDecimal, and counts the elements in the source Stream.

public class SumAndCount {
    BigDecimal sum = BigDecimal.ZERO;
    int count;
}

Ingredients for a Parallel Collector

  • Supplier, a function that instantiates a fresh partial result.
  • BiConsumer, the accumulator: a function that takes a partial result and an element from the source stream and folds the element into that partial result.
  • BinaryOperator, the combiner: a merge function that takes two partial results, merges them, and returns the result.

Putting it all together

Given the ingredients above, we are ready to go. Note that no synchronization is required here: each partial result is confined to one chunk of the work until the combiner merges it, and parallelism is not concurrency.


Collector<Integer, SumAndCount, SumAndCount> collector =
        Collector.of(SumAndCount::new, (sumAndCount, item) -> {
            sumAndCount.sum = sumAndCount.sum.add(BigDecimal.valueOf(item));
            sumAndCount.count++;
        }, (s1, s2) -> {
            s1.sum = s1.sum.add(s2.sum);
            s1.count += s2.count;
            return s1;
        });

That’s a collector that we can hand to the Stream API, and the API will take it from there and parallelise the job for us.

SumAndCount sumAndCount = Stream.iterate(0, i -> i + 10).limit(8_000_000).parallel().collect(collector);

Parallel Reduction

In this post we will look at how we would go about building a general-purpose parallel reduce using the ForkJoin model.

By the way, you don’t need to do this if you are on Java 1.8 or a later version, because Streams already support reduction and you should use them rather than rolling your own; this is for educational purposes only.
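
For reference, here is roughly what that looks like with Streams, mirroring the sum and max examples built by hand at the end of this post (the array is illustrative):

Integer[] a = IntStream.rangeClosed(1, 1_000_000).boxed().toArray(Integer[]::new);

int sum = Arrays.stream(a).parallel().reduce(0, (x, y) -> x + y);
int max = Arrays.stream(a).parallel().reduce(Integer.MIN_VALUE, Math::max);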

The idea of reduction is very simple: given a dataset, transform it into a single value. An example of reduce would be: given an array of integers, return the sum of all its elements. Such an operation can be done using a simple loop and an accumulator.

int sum = 0;
for (int element : array) {
    sum += element;
}

The only problem with the code above is that it doesn’t scale to multi-core: given an infinite number of processors, it will still run in O(n) time. By the end of this post, you should be able to convince yourself that such an operation can be done in O(lg n) time as the number of processors tends to infinity.

A Divide and Conquer Sum.

Instead of calculating the sum by scanning the entire array sequentially, we can use a divide and conquer algorithm, something very similar to merge sort: divide the array into two parts and solve each subproblem recursively. As in merge sort, we are aiming for a recursion tree of depth O(lg n), although in practice our recursion tree will be shorter than that, because we stop splitting once a part is small enough to sum sequentially.

A Naive Implementation.

A naive way would be to have a shared atomic accumulator: split the array into parts recursively, and when the size of a part is small enough to be computed sequentially, calculate the partial sum and add it to the accumulator. At the end, the accumulator holds the total sum of the elements.
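
A minimal sketch of that naive approach (the threshold, the names, and the half-open ranges are illustrative, not the final implementation below):

class NaiveSum extends RecursiveAction {
    static final int THRESHOLD = 1_000;
    final int[] a; final int low; final int high; final AtomicLong total;

    NaiveSum(int[] a, int low, int high, AtomicLong total) {
        this.a = a; this.low = low; this.high = high; this.total = total;
    }

    @Override
    protected void compute() {
        if (high - low <= THRESHOLD) {
            long sum = 0;
            for (int i = low; i < high; i++) {
                sum += a[i];
            }
            total.addAndGet(sum); // every task funnels through this one accumulator
        } else {
            int mid = (low + high) >>> 1;
            invokeAll(new NaiveSum(a, low, mid, total), new NaiveSum(a, mid, high, total));
        }
    }
}

Invoking new NaiveSum(a, 0, a.length, total) on a ForkJoinPool and then reading total gives the sum.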

The algorithm is definitely correct; it derives its correctness from the sequential one anyway. The only problem is that it suffers from contention: all worker threads have to coordinate when they are ready to contribute their partial sums. This introduces a sequential bottleneck, and according to Amdahl’s law, the sequential part of an algorithm is what governs its performance.

One more problem with the algorithm described above is that it is not cache friendly: atomics are usually implemented using a CAS instruction. Although CAS scales better than a mutex, as it doesn’t suffer from context switching, it still requires cache line invalidation on every update.

A Better Solution.

A better solution is a shared-nothing one, where the solution tree looks like a balanced binary tree: when we reach a leaf (a partial sum), we combine it with its sibling’s partial solution, building the aggregated result bottom up.


A pseudo code.

if (size <= sequential_threshold) {
  sum = 0
  for (i : a) {
    sum += i
  }
  return sum
} else {
  leftArray  = a[0..mid]
  rightArray = a[mid+1..length]
  leftTask  = fork(sum(leftArray))
  rightTask = fork(sum(rightArray))
  return leftTask.join() + rightTask.join()
}


A General Purpose Reduce

But this looks like a pattern: we often accumulate, not necessarily a sum; counting is a similar thing, so is MAX, so is MIN. So, can we build a general-purpose reduce? Of course we can. However, there is a limitation on what can be achieved with this pattern: the operation has to be associative, in simple words, the grouping order doesn’t matter. Sum, for example, is associative.

A + (B + C) = (A+B) + C

So is MAX

MAX(1,-77,13,4) = MAX(4,1,13,-77)

A property that division doesn’t have, since grouping matters: (8 / 4) / 2 = 1 while 8 / (4 / 2) = 4.

Ingredients for our Recipe 

Now it’s time to think of the tools needed for that.

  • We need a source collection, one that we can split efficiently. A linked list would be a horrible choice, because splitting it requires linear time. Arrays to the rescue! Arrays split in constant time, remember merge sort? Moreover, arrays get the best out of cache lines because of locality.
  • An operator, a binary operator in particular. Something that takes two parameters, does something, and returns a result. For that, we will be using BinaryOperator from JDK 1.8.
  • ForkJoin framework.
  • An identity or seed, a value that we can use as a baseline (an “initialiser”) for our computation. For sum it would be 0, for MAX it would be Integer.MIN_VALUE.

Java Implementation

The code for this is incredibly simple. All we need is a simple RecursiveTask and a tiny wrapper around it that hides some details.

static class ReductionTask<T> extends RecursiveTask<T> {

    private final T[] a;
    private final int low;
    private final int high;
    private final BinaryOperator<T> operator;

    private static final int THRESHOLD = 1000;

    public ReductionTask(T[] a, BinaryOperator<T> operator) {
        this(a, 0, a.length - 1, operator);
    }

    private ReductionTask(T[] a, int low, int high, BinaryOperator<T> operator) {
        this.a = a;
        this.low = low;
        this.high = high;
        this.operator = operator;
    }

    @Override
    protected T compute() {
        if (high - low <= THRESHOLD) {
            // small enough: reduce this range sequentially
            T result = a[low];
            for (int i = low + 1; i <= high; i++) {
                result = operator.apply(result, a[i]);
            }
            return result;
        }
        // otherwise split in two and fork both halves
        int mid = (low + high) >>> 1;
        ReductionTask<T> leftTask = new ReductionTask<>(a, low, mid, operator);
        ReductionTask<T> rightTask = new ReductionTask<>(a, mid + 1, high, operator);
        ForkJoinTask<T> fork1 = leftTask.fork();
        ForkJoinTask<T> fork2 = rightTask.fork();
        return operator.apply(fork1.join(), fork2.join());
    }
}

And a tiny class to submit the task to the ForkJoinPool

public class ArrayReducer<T> {

    private final T[] a;
    private final T identity;
    private final BinaryOperator<T> operator;

    public ArrayReducer(T[] a, T identity, BinaryOperator<T> operator) {
        this.a = a;
        this.identity = identity;
        this.operator = operator;
    }

    public T compute() throws ExecutionException, InterruptedException {
        if (a.length == 0) {
            return this.identity;
        }
        ReductionTask<T> reductionTask = new ReductionTask<>(a, operator);
        ForkJoinTask<T> task = ForkJoinPool.commonPool().submit(reductionTask);
        return operator.apply(identity, task.get());
    }
}


Now we can use this abstraction to perform our parallel sum as follows

ArrayReducer<Integer> reducer = new ArrayReducer<>(a, 0, (x, y) -> x + y);
int sum = reducer.compute();

The same could be used for MAX as well.

ArrayReducer<Integer> reducer = new ArrayReducer<>(a, Integer.MIN_VALUE, Math::max);
int max = reducer.compute();


Don’t Abuse toLowerCase

Most usages of String.toLowerCase and String.toUpperCase in our programs have nothing to do with lower- or upper-casing strings; all we want to achieve most of the time is a case-insensitive comparison.

The problem is that toLowerCase allocates a new string, which involves an array copy, and produces unnecessary garbage for no good reason. The good news is that there is a better and more concise way to achieve this: String.equalsIgnoreCase(String).


Here is a JMH benchmark that shows the difference between the two approaches for testing string equality.
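
A minimal sketch of such a benchmark could look like this (the test strings are illustrative):

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class StringEqualsBenchmark {

    private String left = "Some Fairly Long Header Value";
    private String right = "SOME FAIRLY LONG HEADER VALUE";

    @Benchmark
    public boolean withLowerCase() {
        // allocates two new strings on every call
        return left.toLowerCase().equals(right.toLowerCase());
    }

    @Benchmark
    public boolean withIgnoreCases() {
        // compares in place, no allocation
        return left.equalsIgnoreCase(right);
    }
}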

And the results when I run it on my machine.

Benchmark                              Mode  Cnt    Score    Error  Units
StringEqualsBenchmark.withIgnoreCases  avgt   10   18.299 ±  0.319  ns/op
StringEqualsBenchmark.withLowerCase    avgt   10  280.870 ± 14.076  us/op

Do not Start an Executor in a static Block

Don’t do that, because  your program will hang forever.

Code, show me some code.

public class Foo {

    static {
        try {
            ExecutorService executorService = Executors.newSingleThreadExecutor();
            int sum = executorService.submit(() -> {
                int x = 0;
                for (int i = 0; i < 10; i++) {
                    x += i;
                }
                return x;
            }).get(); // blocks until the task finishes
            System.out.println(sum);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
    }
}

This program will hang forever due to a class initialisation deadlock: the thread running Foo’s static initialiser blocks on get(), while the executor thread cannot run the lambda (which is compiled to a method of Foo) until Foo’s initialisation has completed, so the two threads wait on each other forever.

Obviously the same behaviour can be reproduced with parallel streams, where tasks are submitted to the common ForkJoinPool behind the scenes.

static {
    Integer[] a = IntStream.range(1, 50_000).boxed().toArray(Integer[]::new);
    List<Integer> filtered = Arrays.asList(a)
            .parallelStream().filter(c -> c % 2 == 0).collect(Collectors.toList());
}


This is not a bug; as described here, it is just a stupidly written program.

Thoughts: I believe that tools (compilers/IDEs) should raise a warning about this. FindBugs spots it, because FindBugs rocks…

The 10 Worst Things In Java

I’ve been thinking about this for a while and I finally managed to compile a list of the 10 worst things in Java. And by the way, I love Java, the language and the platform. So let’s get started.

  1. Weak, Soft, and Phantom references. These are evil: if you like chasing weird bugs in an unpredictable program, then use them. Basically the program’s behaviour ends up depending on the GC, which is unpredictable.
  2. Every object is a mutex; if you have a reference to an object you can lock on it.
    List<Integer> list = getList();
    synchronized (list) { /* anyone else holding this reference can contend on the same lock */ }

    That is bad and can cause deadlocks if some inconsistent lock ordering is going on. Dedicated Lock objects were added to the JDK later.

  3. Every object is a condition; this is similar to the one above. The wait, notify, and notifyAll methods reside in Object. Proper Condition objects were added to the language later.
  4. Every object has hashCode and equals inherited from Object. Programmers always have to remember to implement them in order to store objects in collections. Interfaces should have been added for those:
    interface Hashable {
      int hashCode();
    }
    interface Equatable<T> {
      boolean equals(T other);
    }
  5. Checked Exceptions. They become even worse with lambdas and method references. Anders Hejlsberg described their ugliness here. Although I believe throws isn’t too bad as an annotation for the programmer and the machine.
  6. Arrays are covariant, I’ve posted about it previously.
  7. Bytes are always signed
  8. Method overloading resolution rules are a tricky business, I wrote about it previously here and Lukas Eder wrote a great blog about it as well.
  9. The extends and super keywords are a confusing way to describe covariant and contravariant types. You always need to remember PECS (Producer Extends, Consumer Super); see the sketch after this list.
  10. Cloneable has no clone method.
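
A small PECS sketch for item 9 (the collections are illustrative):

List<? extends Number> producer = List.of(1, 2.0, 3L);    // a producer: you read Numbers out of it
Number first = producer.get(0);                           // ok
// producer.add(4);                                       // does not compile: can't write into "? extends"

List<? super Integer> consumer = new ArrayList<Number>(); // a consumer: you write Integers into it
consumer.add(4);                                          // ok
Object raw = consumer.get(0);                             // reads only come back as Object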

Method Overloading/ The Ugly Parts

Method overloading resolution in C-style languages is a tricky business; I am referring here to Java and C# in particular. Resolution rules seem obvious to the novice.

 void fn(int x);
 void fn(int x,int y);

So far so simple, but what happens if we have the following?

void fn(float x);
void fn(double x);
fn(1); // call fn with an int

The value 1 will be resolved into a 32-bit integer and we have two candidates, one that takes a float and the other a double. Both types can hold an int, so which one will it choose? The answer is fn(float). Why? Because fn(float) is more specific: float is a 32-bit floating point type and double is a 64-bit floating point type, so any float can be widened into a double, and hence float wins 🙂

Now let’s do overloading on reference type parameters.

 class Foo{}
 class Bar extends Foo{}
 void fn(Foo foo);
 void fn(Bar bar);
 fn(null); // call fn with null

So we are calling fn with a null argument. There are two candidates, Foo and Bar; every Bar is a Foo but not every Foo is a Bar, and hence it chooses Bar because it is more specific 🙂
If we have a third candidate that takes a String, as follows:

 class Foo{}
 class Bar extends Foo{}
 void fn(Foo foo);
 void fn(Bar bar);
 void fn(String s);
 fn(null); // ambiguous call, does not compile

Then it becomes an ambiguous call, because there is no subtyping relation between String and Bar.
It really becomes a pain with lambdas and method references. Consider the following.

 ExecutorService executorService = Executors.newSingleThreadExecutor();
 executorService.submit(() -> {
     throw new RuntimeException();
 });

Now you guess it: would it choose Callable or Runnable?
The answer is Callable: when a lambda fits both, the candidate whose functional interface returns a value beats the one that returns void 🙂 You can assign the result to a Future as follows:

 ExecutorService executorService = Executors.newSingleThreadExecutor();
 Future<Object> future = executorService.submit(() -> {
     throw new RuntimeException();
 });

If you want to master it, you should read the specification every night before you go to bed.

Conclusion: There are very good reasons why languages with strong type systems like Haskell don’t allow method overloading.

When T doesn’t Match Itself

That’s a true story.
If a method’s return type is List<T>, that doesn’t mean you are guaranteed to get a List<T>; it might, and it might not. Not even a list of a subtype of T: in Java you actually have to declare covariance explicitly.

class Foo{}
class Bar extends Foo{}
List<Foo> foos = new ArrayList<Bar>(); // doesn't compile
List<? extends Foo> foos = new ArrayList<Bar>(); // compiles

But that’s fine. The real problem is when T doesn’t even match itself, for example:

static <T> List<T> getTs(){
   String[] strings = {"type","system"};
   List<T> list = new ArrayList<>();
   for (String string : strings) {
       list.add((T) string); // unchecked cast, erased at runtime
   }
   return list;
}

List<Foo> foos = getTs(); // compiles fine, runs fine

// problems happen here
// java.lang.ClassCastException: java.lang.String cannot be cast to Foo
for (Foo foo : foos) {
}

And by the way, this code is not so artificial; here is a real-world example using the Gson API for JSON parsing:

static <T> List<T> getTs(){
  Type type = new TypeToken<List<T>>() {}.getType();
  List<T> list = new Gson().fromJson("[{id:1}]", type);
  return list;
}

List<Foo> foos = getTs(); // compiles and runs fine, but doesn't actually contain Foos