Parallel stream in Java 8
-
Last Updated: June 13, 2024
-
By: javahandson
-
In this article, we will look at what a parallel stream is in Java 8, and at when and how to use one effectively.
Parallel streams in Java 8 provide a way to process collections in parallel, making it easier to use multiple CPU cores for improved performance.
In the previous articles, we learned how the Stream interface helps us manipulate data effectively. Here we will learn about parallel streams, which let us execute a pipeline of operations in parallel.
For example, if we have a very large set of numbers and need to add all of them, then instead of running the sum sequentially on a single thread, we can use a parallel stream to run it on multiple threads.
numbers = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 }
With a parallel stream, the numbers above are split into multiple chunks (automatically, without any extra code) and the sum is computed on each chunk. At the end, the results of all the chunks are added together to produce the final result.
Thread1: chunk1 = { 1, 2, 3, 4, 5 }, sum1 = 15
Thread2: chunk2 = { 6, 7, 8, 9, 10 }, sum2 = 40
result = sum1 + sum2 = 55
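Since parallel streams are usually created from collections, here is a minimal sketch of the same sum using the parallelStream() method of a List. The class and method names (CollectionParallelSum, sum) are made up for illustration; how the list is actually chunked is decided by the runtime and may not match the two chunks shown above.

```java
import java.util.Arrays;
import java.util.List;

class CollectionParallelSum {

    // Sums a list through the collection's own parallel stream.
    static int sum(List<Integer> numbers) {
        return numbers.parallelStream()
                .mapToInt(Integer::intValue) // unbox each Integer to an int
                .sum();                      // partial sums are combined for us
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);
        System.out.println("Sum: " + sum(numbers)); // prints "Sum: 55"
    }
}
```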
Before Java 7, achieving this meant first splitting the numbers into subparts manually in code and assigning each subpart to a thread. We then had to synchronize the threads properly to avoid race conditions, wait for all of them to finish, and finally combine their results into a final result. This process is difficult and error-prone.
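To make the manual steps concrete, here is a rough sketch of the hand-written approach with two threads. It is only an illustration: the names ManualThreadSum, sumRange and sumWithTwoThreads are hypothetical, and lambdas are used for brevity (real pre-Java 8 code would use anonymous Runnable classes instead).

```java
class ManualThreadSum {

    // Sums numbers[from] .. numbers[to - 1] on the calling thread.
    static long sumRange(long[] numbers, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += numbers[i];
        }
        return sum;
    }

    static long sumWithTwoThreads(long[] numbers) throws InterruptedException {
        int mid = numbers.length / 2;       // step 1: split manually
        long[] partial = new long[2];       // each thread writes only its own slot

        // step 2: assign each subpart to a thread
        Thread t1 = new Thread(() -> partial[0] = sumRange(numbers, 0, mid));
        Thread t2 = new Thread(() -> partial[1] = sumRange(numbers, mid, numbers.length));
        t1.start();
        t2.start();

        // step 3: wait for both threads; join() also makes their writes visible
        t1.join();
        t2.join();

        // step 4: combine the partial results
        return partial[0] + partial[1];
    }

    public static void main(String[] args) throws InterruptedException {
        long[] numbers = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println("Sum: " + sumWithTwoThreads(numbers)); // prints "Sum: 55"
    }
}
```

Even in this small case we have to decide the split point, avoid sharing a single accumulator, and remember to join both threads before reading the results.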
Java 7 introduced a framework called fork/join to perform these operations more consistently and in a less error-prone way. Java 8 parallel stream uses the fork/join framework internally to process the stream in parallel.
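For comparison, a sum written directly against the fork/join framework might look like the sketch below. This is not what the stream library does line for line; it only shows the split, fork, and join idea. The class name ForkJoinSum and the THRESHOLD value are assumptions chosen for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;

class ForkJoinSum extends RecursiveTask<Long> {

    // Assumed cut-off: ranges this small are summed directly instead of split.
    private static final int THRESHOLD = 1_000;

    private final long[] numbers;
    private final int start; // inclusive
    private final int end;   // exclusive

    ForkJoinSum(long[] numbers, int start, int end) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Long compute() {
        if (end - start <= THRESHOLD) {
            long sum = 0;
            for (int i = start; i < end; i++) {
                sum += numbers[i];
            }
            return sum;
        }
        int mid = start + (end - start) / 2;
        ForkJoinSum left = new ForkJoinSum(numbers, start, mid);
        ForkJoinSum right = new ForkJoinSum(numbers, mid, end);
        left.fork();                         // schedule the left half asynchronously
        long rightResult = right.compute();  // compute the right half on this thread
        long leftResult = left.join();       // wait for the left half
        return leftResult + rightResult;
    }

    static long sum(long[] numbers) {
        return ForkJoinPool.commonPool().invoke(new ForkJoinSum(numbers, 0, numbers.length));
    }

    public static void main(String[] args) {
        long[] numbers = LongStream.rangeClosed(1, 100_000).toArray();
        System.out.println("Sum: " + sum(numbers)); // prints "Sum: 5000050000"
    }
}
```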
In the below examples, we will see how we can add numbers in an iterative, sequential, and parallel manner.
Write a program to find the sum of the first n numbers using the Iterative approach.
package com.javahandson.parallel.data.processing;

public class SumIterative {

    public static void main(String[] args) {
        long n = 100;
        long sum = 0;
        // Add the numbers 1..n one at a time on the main thread
        for (long i = 1L; i <= n; i++) {
            sum += i;
        }
        System.out.println("Sum of first " + n + " numbers is : " + sum);
    }
}

Output:
Sum of first 100 numbers is : 5050
Write a program to find the sum of the first n numbers using a sequential stream.
package com.javahandson.parallel.data.processing;

import java.util.stream.Stream;

public class SumSequential {

    public static void main(String[] args) {
        long n = 100;
        long sum = Stream.iterate(1L, i -> i + 1)
                .limit(n)
                .reduce(0L, Long::sum);
        System.out.println("Sum of first " + n + " numbers is : " + sum);
    }
}

Output:
Sum of first 100 numbers is : 5050
Write a program to find the sum of the first n numbers using a parallel stream.
package com.javahandson.parallel.data.processing;

import java.util.stream.Stream;

public class SumParallel {

    public static void main(String[] args) {
        long n = 100;
        long sum = Stream.iterate(1L, i -> i + 1)
                .limit(n)
                .parallel()
                .reduce(0L, Long::sum);
        System.out.println("Sum of first " + n + " numbers is : " + sum);
    }
}

Output:
Sum of first 100 numbers is : 5050
When we use a parallel stream, we don't have to worry about splitting the stream or about synchronization, and we don't need to know how many threads were used to process it. So using a parallel stream is very easy.
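If we are curious about which threads actually did the work, a small sketch like the following can show it. The class and method names are hypothetical. Note that the thread names are collected in a concurrent set on purpose, because several threads write to it at the same time. The output differs from machine to machine, so none is shown.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.stream.LongStream;

class ParallelStreamThreads {

    // Returns the names of the threads that processed at least one element.
    static Set<String> threadsUsed(long n) {
        Set<String> names = ConcurrentHashMap.newKeySet(); // thread-safe set
        LongStream.rangeClosed(1, n)
                .parallel()
                .forEach(i -> names.add(Thread.currentThread().getName()));
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Available processors: "
                + Runtime.getRuntime().availableProcessors());
        System.out.println("Common pool parallelism: "
                + ForkJoinPool.commonPool().getParallelism());
        System.out.println("Threads used: " + threadsUsed(1_000));
    }
}
```

By default, parallel streams run on the common fork/join pool, whose size is derived from the number of available processors.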
But when do we have to use the parallel stream? Can we use it every time? The answer is No. We will try to explore when and how to use the parallel stream in the below sections.
It is natural to assume that a parallel stream is more efficient than a sequential stream, because a single task is split into multiple tasks that run on different threads at the same time.
But that is not always correct. A parallel stream is not always better than a sequential one. Let us try to understand it better using some examples.
Write a program to calculate the time taken to perform the sum of the first 100000000 numbers using an iterative approach
package com.javahandson.parallel.data.processing;

public class IterativePerformance {

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        long sum = iterativeSum(100000000);
        long endTime = System.nanoTime();

        // Calculate duration in milliseconds
        long duration = (endTime - startTime) / 1000000;

        // Output the result and the time taken
        System.out.println("Sum: " + sum);
        System.out.println("Duration: " + duration + " ms");
    }

    public static long iterativeSum(long n) {
        long sum = 0;
        for (long i = 1L; i <= n; i++) {
            sum += i;
        }
        return sum;
    }
}

Output:
Sum: 5000000050000000
Duration: 206 ms
Write a program to calculate the time taken to perform the sum of the first 100000000 numbers using a sequential approach.
package com.javahandson.parallel.data.processing;

import java.util.stream.Stream;

public class SequentialPerformance {

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        long sum = sequentialSum(100000000);
        long endTime = System.nanoTime();

        // Calculate duration in milliseconds
        long duration = (endTime - startTime) / 1000000;

        // Output the result and the time taken
        System.out.println("Sum: " + sum);
        System.out.println("Duration: " + duration + " ms");
    }

    public static long sequentialSum(long n) {
        return Stream.iterate(1L, i -> i + 1)
                .limit(n)
                .reduce(0L, Long::sum);
    }
}

Output:
Sum: 5000000050000000
Duration: 3806 ms
Write a program to calculate the time taken to perform the sum of the first 100000000 numbers using a parallel approach.
package com.javahandson.parallel.data.processing;

import java.util.stream.Stream;

public class ParallelPerformance {

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        long sum = parallelSum(100000000);
        long endTime = System.nanoTime();

        // Calculate duration in milliseconds
        long duration = (endTime - startTime) / 1000000;

        // Output the result and the time taken
        System.out.println("Sum: " + sum);
        System.out.println("Duration: " + duration + " ms");
    }

    public static long parallelSum(long n) {
        return Stream.iterate(1L, i -> i + 1)
                .limit(n)
                .parallel()
                .reduce(0L, Long::sum);
    }
}

Output:
Sum: 5000000050000000
Duration: 6464 ms
The above results are quite shocking right? The parallel version of the summing method is much slower than the iterative and sequential one. So what went wrong?
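There are actually 2 issues. First, Stream.iterate generates boxed Long objects, which have to be unboxed to primitive values before they can be added. Second, Stream.iterate is difficult to divide into independent chunks, because each element is computed from the previous one.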
Because of these two issues, unboxing and splitting, the parallel version does extra work: it spends time unboxing every element, and it spends time dividing the stream into chunks. On top of that, the chunks have to be moved to different CPU cores, which is also a costly operation. So we should not misuse parallel streams, as they can worsen the overall performance of our programs.
So the next question is how we can correctly use the parallel stream. We will understand this in the next section.
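In the above example, if we can overcome the unboxing and splitting issues, then we can say we are making proper use of parallel streams.
This is possible by using LongStream.rangeClosed instead of Stream.iterate. LongStream.rangeClosed works directly on primitive long values, so there is no boxing or unboxing, and it produces a range of numbers that can be split into independent chunks easily.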
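One way to see the difference between the two sources is to inspect the characteristics their spliterators report. The sketch below is an illustration under that assumption; the class and method names are hypothetical. A source that reports SIZED and SUBSIZED knows its exact size and the size of every piece it is split into, which is what makes even chunking cheap.

```java
import java.util.Spliterator;
import java.util.stream.LongStream;
import java.util.stream.Stream;

class SourceCharacteristics {

    // Stream.iterate yields boxed Long values and does not know its size.
    static boolean iterateIsSized() {
        Spliterator<Long> boxed = Stream.iterate(1L, i -> i + 1).spliterator();
        return boxed.hasCharacteristics(Spliterator.SIZED);
    }

    // LongStream.rangeClosed yields primitive longs over a range of known size.
    static boolean rangeIsSized(long n) {
        Spliterator.OfLong primitive = LongStream.rangeClosed(1, n).spliterator();
        return primitive.hasCharacteristics(Spliterator.SIZED | Spliterator.SUBSIZED);
    }

    public static void main(String[] args) {
        System.out.println("Stream.iterate SIZED: " + iterateIsSized());              // false
        System.out.println("LongStream.rangeClosed SIZED: " + rangeIsSized(100000000)); // true
    }
}
```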
Write a program to calculate the time taken to perform the sum of the first 100000000 numbers using LongStream.rangeClosed and parallel approach.
package com.javahandson.parallel.data.processing;

import java.util.stream.LongStream;

public class ParallelPerformance {

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        long sum = parallelSum(100000000);
        long endTime = System.nanoTime();

        // Calculate duration in milliseconds
        long duration = (endTime - startTime) / 1000000;

        // Output the result and the time taken
        System.out.println("Sum: " + sum);
        System.out.println("Duration: " + duration + " ms");
    }

    public static long parallelSum(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .reduce(0L, Long::sum);
    }
}

Output:
Sum: 5000000050000000
Duration: 166 ms
Finally, we are using the parallel stream effectively: summing 100000000 numbers with the parallel stream now takes less time than both the iterative and the sequential stream approaches. This also demonstrates that choosing the right data structure before making the code parallel can matter more than parallelism itself. It does not guarantee the best performance in every case, so it is always worth measuring.
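The timings in this article come from a single run each, so they also include JVM warm-up. A slightly more robust sketch, assuming we simply repeat the measurement and keep the fastest run, could look like this. The names SumBenchmark and fastestMillis are hypothetical, and for serious benchmarking a dedicated harness such as JMH is a better choice.

```java
import java.util.function.LongUnaryOperator;
import java.util.stream.LongStream;

class SumBenchmark {

    // Runs the summing function several times and returns the fastest duration in ms.
    static long fastestMillis(LongUnaryOperator adder, long n, int runs) {
        long fastest = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            long sum = adder.applyAsLong(n);
            long duration = (System.nanoTime() - start) / 1_000_000;
            // Guard against measuring a wrong answer quickly
            if (sum != n * (n + 1) / 2) {
                throw new IllegalStateException("Unexpected sum: " + sum);
            }
            fastest = Math.min(fastest, duration);
        }
        return fastest;
    }

    static long rangedSum(long n) {
        return LongStream.rangeClosed(1, n).sum();
    }

    static long parallelRangedSum(long n) {
        return LongStream.rangeClosed(1, n).parallel().sum();
    }

    public static void main(String[] args) {
        long n = 100_000_000;
        System.out.println("Sequential range: "
                + fastestMillis(SumBenchmark::rangedSum, n, 10) + " ms");
        System.out.println("Parallel range: "
                + fastestMillis(SumBenchmark::parallelRangedSum, n, 10) + " ms");
    }
}
```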
We should not use parallel streams with algorithms that mutate shared state, because they can produce the wrong output. The time taken might be less, but there is no point if the result is wrong.
We will demonstrate this with an example. Here we will create an Addition class with an add method that adds each number to an instance variable called total.
We will call this add method from a parallel stream operation. Since the operation is parallel, multiple threads will call the add method concurrently and mutate the value of the total variable. We will check the results below.
package com.javahandson.parallel.data.processing;

import java.util.stream.LongStream;

class Addition {
    long total = 0;

    // Not thread-safe: total += number is a read, an add, and a write
    public void add(long number) {
        total += number;
    }
}

public class ParallelStateMutation {

    public static void main(String[] args) {
        Addition addition = new Addition();
        LongStream.rangeClosed(1, 100000000)
                .parallel()
                .forEach(addition::add);
        System.out.println(addition.total);
    }
}

Output when run the 1st time : 560219812676612
Output when run the 2nd time : 565548884907988
Output when run the 3rd time : 643044342188016
In the above program, we get a different answer every time, and every answer shown is wrong. This happens because multiple threads access the addition object concurrently and modify total. The statement total += number is not atomic: two threads can read the same old value, and one of the updates is then lost. Here the time taken by the algorithm does not matter. Hence we should avoid using a parallel stream when we are mutating the shared state of an object.
In the above example, if we remove the parallel operation, then the result will be consistent and correct.
package com.javahandson.parallel.data.processing;

import java.util.stream.LongStream;

class Addition {
    long total = 0;

    public void add(long number) {
        total += number;
    }
}

public class ParallelStateMutation {

    public static void main(String[] args) {
        Addition addition = new Addition();
        // No parallel() call: a single thread updates total
        LongStream.rangeClosed(1, 100000000)
                .forEach(addition::add);
        System.out.println(addition.total);
    }
}

Output when run the 1st time : 5000000050000000
Output when run the 2nd time : 5000000050000000
Output when run the 3rd time : 5000000050000000
Here the result is constant as we are using only one thread to update the value of the total variable.
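If we still want the sum to run in parallel, one option is to avoid the shared variable altogether and let the stream combine the partial results itself. The sketch below shows that, along with a second variant that keeps a shared accumulator but uses the thread-safe LongAdder class. The class and method names are hypothetical, and this is just one possible way to fix the example.

```java
import java.util.concurrent.atomic.LongAdder;
import java.util.stream.LongStream;

class SafeParallelSum {

    // Preferred: no shared state. Each chunk is summed on its own
    // and the partial sums are combined by the stream.
    static long sumByReduction(long n) {
        return LongStream.rangeClosed(1, n)
                .parallel()
                .sum();
    }

    // If a shared accumulator cannot be avoided, use a thread-safe one.
    static long sumWithLongAdder(long n) {
        LongAdder total = new LongAdder();
        LongStream.rangeClosed(1, n)
                .parallel()
                .forEach(total::add);
        return total.sum();
    }

    public static void main(String[] args) {
        System.out.println(sumByReduction(100000000));   // prints 5000000050000000
        System.out.println(sumWithLongAdder(100000000)); // prints 5000000050000000
    }
}
```

The reduction is usually the better choice, since the threads never contend for the same variable.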
So this is all about Parallel streams in Java 8. If you have any questions on this topic, please raise them in the comments section. If you liked this article then please share this post with your friends and colleagues.