Grouping in Java 8
Last Updated: May 7, 2024
By: javahandson
In this article, we will look at what grouping is in Java 8 and walk through the different variants of the grouping methods with examples.
Grouping organizes similar data into groups based on one or more properties. Grouping is particularly powerful in combination with aggregate functions like SUM, AVG, COUNT, MAX, and MIN which perform calculations on grouped data, returning a single result per group.
Below are a couple of examples that will help to understand grouping in a better way.
Ex1. Count the number of employees in each department
Sample Data for Employees Table
EmployeeID | EmployeeName | DepartmentID |
1 | Shweta | D1 |
2 | Reshma | D3 |
3 | Shrunalini | D2 |
4 | Poornima | D1 |
5 | Shubhangi | D3 |
6 | Suraj | D1 |
7 | Iqbal | D2 |
8 | Kartik | D3 |
9 | Amit | D3 |
10 | Suchit | D2 |
Now we will group the Employees by DepartmentID and count the number of employees in each department. The result will be like below.
DepartmentID | NumberOfEmployees |
D1 | 3 |
D2 | 3 |
D3 | 4 |
Ex2. Group the Students by Subject and get the maximum mark for each Subject.
Sample Data for Student Table
StudentID | Name | Subject | Marks |
1 | Shweta | Mathematics | 95 |
2 | Reshma | English | 90 |
3 | Shrunalini | Physics | 78 |
4 | Poornima | Mathematics | 85 |
5 | Shubhangi | Chemistry | 91 |
6 | Suraj | Mathematics | 84 |
7 | Iqbal | English | 75 |
8 | Kartik | Physics | 82 |
9 | Amit | Mathematics | 65 |
10 | Suchit | Chemistry | 90 |
Now we will group the Students by subject and find the maximum marks for each subject. The result will be like below.
Subject | HighestMarks |
Mathematics | 95 |
Physics | 82 |
Chemistry | 91 |
English | 90 |
Java 8 provides a set of groupingBy methods that help us group data. These methods are part of the Collectors utility class and are typically used with streams.
There are 3 variants of the groupingBy method.
We will learn about each of them with examples in the sections below.
static<T, K> Collector<T,?,Map<K, List<T>>> groupingBy(Function<? super T, ? extends K> classifier)
a. static <T, K>: This indicates that groupingBy is a static method. <T, K> are generic type parameters, where T is the type of the stream elements and K is the type of the keys in the resulting Map.
b. Collector<T, ?, Map<K, List<T>>>: This describes the return type of the method. The method returns a Collector that collects elements of type T into a Map<K, List<T>>.
c. Function<? super T, ? extends K>: This is the only parameter of the method. This function is used to assign the elements of the stream to different groups. It takes an element of the stream (of type T or any of its supertypes) and returns a key (of type K or any of its subtypes).
Write a program to group the Students by subject.
package com.javahands.collectors.grouping;

public class Student {

    String name;
    String subject;
    int marks;

    public Student(String name, String subject, int marks) {
        this.name = name;
        this.subject = subject;
        this.marks = marks;
    }

    public String getSubject() {
        return subject;
    }

    // Needed by the later example that compares students with Student::getMarks.
    public int getMarks() {
        return marks;
    }

    @Override
    public String toString() {
        return "Student [name=" + name + ", subject=" + subject + ", marks=" + marks + "]";
    }
}
package com.javahands.collectors.grouping;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingEx {

    public static void main(String[] args) {
        List<Student> studentList = Arrays.asList(
                new Student("Shweta", "Mathematics", 95),
                new Student("Reshma", "English", 90),
                new Student("Shrunalini", "Physics", 78),
                new Student("Suraj", "Mathematics", 84),
                new Student("Iqbal", "English", 75),
                new Student("Kartik", "Physics", 82),
                new Student("Poornima", "Mathematics", 85),
                new Student("Amit", "Mathematics", 65));

        // Group the students by subject; each key maps to the list of students for that subject.
        Map<String, List<Student>> groupBySubjects = studentList.stream()
                .collect(Collectors.groupingBy(Student::getSubject));

        System.out.println(groupBySubjects);
    }
}

Output:
{English=[Student [name=Reshma, subject=English, marks=90], Student [name=Iqbal, subject=English, marks=75]], Mathematics=[Student [name=Shweta, subject=Mathematics, marks=95], Student [name=Suraj, subject=Mathematics, marks=84], Student [name=Poornima, subject=Mathematics, marks=85], Student [name=Amit, subject=Mathematics, marks=65]], Physics=[Student [name=Shrunalini, subject=Physics, marks=78], Student [name=Kartik, subject=Physics, marks=82]]}
In the above example, Student::getSubject is the classifier. The groupingBy collector applies it to every Student in the stream and groups the students by the subject it returns. The output is a map with the subject as the key and the list of students for that subject as the value.
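The classifier does not have to be a getter. Any function that turns an element into a key will do. The following is a minimal sketch of that idea, not code from the original example: the class name ClassifierSketch and the banding rule (90 and above is "A", 80 and above is "B", everything else is "C") are assumptions made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class ClassifierSketch {

    // Hypothetical banding rule, chosen only for illustration.
    public static String band(int marks) {
        if (marks >= 90) {
            return "A";
        }
        if (marks >= 80) {
            return "B";
        }
        return "C";
    }

    // The classifier computes the key for each element instead of reading a field.
    public static Map<String, List<Integer>> groupByBand(List<Integer> marks) {
        return marks.stream()
                .collect(Collectors.groupingBy(ClassifierSketch::band));
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> byBand =
                groupByBand(Arrays.asList(95, 90, 78, 84, 75, 82, 85, 65));

        System.out.println(byBand.get("A")); // [95, 90]
        System.out.println(byBand.get("B")); // [84, 82, 85]
        System.out.println(byBand.get("C")); // [78, 75, 65]
    }
}
```

Because the stream is sequential, each list keeps the elements in the order they were encountered.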
static <T, K, A, D> Collector<T, ?, Map<K,D>> groupingBy(Function<? super T, ? extends K> classifier, Collector<? super T, A, D> downstream)
This version of the groupingBy function is a more general and flexible form of the groupingBy collector that not only groups elements by a classifier but also allows for a further reduction of the grouped elements using another collector.
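a. static <T, K, A, D>: This indicates that the method is static. <T, K, A, D> are generic type parameters: T is the type of the stream elements, K is the type of the keys, A is the intermediate accumulation type of the downstream collector, and D is the result type of the downstream collector.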
b. Collector<T, ?, Map<K,D>>: This describes the return type of the method. The method returns a Collector that processes elements of type T and aggregates them into a Map<K,D>.
c. Function<? super T, ? extends K> classifier: This is a function that is used to determine the key for each element in the stream. It maps each element of the stream to a key of type K.
d. Collector<? super T, A, D> downstream: This parameter represents a downstream collector that is applied to the elements grouped under the same key.
It specifies how the classification results (grouping) are further reduced or transformed. The downstream collector operates on elements of type T (or its super types), accumulates them into a structure of type A, and finally produces a result of type D.
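To make the role of the downstream collector more concrete, here is a small sketch that plugs three different downstream collectors into the same classifier. It is an illustration under assumptions: the Score class and the DownstreamSketch name are hypothetical stand-ins, while Collectors.summingInt, Collectors.averagingInt, and Collectors.mapping are standard Collectors methods.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class DownstreamSketch {

    // Hypothetical element type standing in for a student record.
    public static class Score {
        private final String name;
        private final String subject;
        private final int marks;

        public Score(String name, String subject, int marks) {
            this.name = name;
            this.subject = subject;
            this.marks = marks;
        }

        public String getName() { return name; }
        public String getSubject() { return subject; }
        public int getMarks() { return marks; }
    }

    public static List<Score> sample() {
        return Arrays.asList(
                new Score("Shweta", "Mathematics", 95),
                new Score("Reshma", "English", 90),
                new Score("Suraj", "Mathematics", 84),
                new Score("Iqbal", "English", 75));
    }

    // Sum per group: the downstream result type D is Integer.
    public static Map<String, Integer> totalBySubject(List<Score> scores) {
        return scores.stream()
                .collect(Collectors.groupingBy(Score::getSubject,
                        Collectors.summingInt(Score::getMarks)));
    }

    // Average per group: D is Double.
    public static Map<String, Double> averageBySubject(List<Score> scores) {
        return scores.stream()
                .collect(Collectors.groupingBy(Score::getSubject,
                        Collectors.averagingInt(Score::getMarks)));
    }

    // Transform each element before collecting it: D is List<String>.
    public static Map<String, List<String>> namesBySubject(List<Score> scores) {
        return scores.stream()
                .collect(Collectors.groupingBy(Score::getSubject,
                        Collectors.mapping(Score::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Score> scores = sample();
        System.out.println(totalBySubject(scores).get("Mathematics")); // 179
        System.out.println(averageBySubject(scores).get("English"));   // 82.5
        System.out.println(namesBySubject(scores).get("Mathematics")); // [Shweta, Suraj]
    }
}
```

In each method only the downstream collector changes, and with it the value type of the resulting map.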
Write a program to count the number of employees in each department.
package com.javahands.collectors.grouping;

public class Employee {

    int id;
    String name;
    String department;

    public Employee(int id, String name, String department) {
        this.id = id;
        this.name = name;
        this.department = department;
    }

    public String getDepartment() {
        return department;
    }

    @Override
    public String toString() {
        return "Employee [id=" + id + ", name=" + name + ", department=" + department + "]";
    }
}
package com.javahands.collectors.grouping;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingDemo {

    public static void main(String[] args) {
        List<Employee> listOfEmployees = Arrays.asList(
                new Employee(1, "Shweta", "D1"),
                new Employee(2, "Reshma", "D3"),
                new Employee(3, "Poornima", "D3"),
                new Employee(4, "Shubhangi", "D1"),
                new Employee(5, "Suraj", "D1"),
                new Employee(6, "Iqbal", "D2"),
                new Employee(7, "Amit", "D1"),
                new Employee(8, "Suchit", "D2"));

        // Group by department, then reduce each group to its element count.
        Map<String, Long> countByDepartment = listOfEmployees.stream()
                .collect(Collectors.groupingBy(Employee::getDepartment, Collectors.counting()));

        System.out.println(countByDepartment);
    }
}

Output:
{D1=4, D2=2, D3=2}
Write a program to group the Students by Subject and get the maximum mark for each Subject.
package com.javahands.collectors.grouping;

import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class GroupingEx {

    public static void main(String[] args) {
        List<Student> studentList = Arrays.asList(
                new Student("Shweta", "Mathematics", 95),
                new Student("Reshma", "English", 90),
                new Student("Shrunalini", "Physics", 78),
                new Student("Suraj", "Mathematics", 84),
                new Student("Iqbal", "English", 75),
                new Student("Kartik", "Physics", 82),
                new Student("Poornima", "Mathematics", 85),
                new Student("Amit", "Mathematics", 65));

        // Group by subject and keep only the student with the highest marks in each group.
        // This relies on Student exposing a getMarks() getter.
        // maxBy returns an Optional, so each map value is an Optional<Student>.
        Map<String, Optional<Student>> groupBySubjects = studentList.stream()
                .collect(Collectors.groupingBy(Student::getSubject,
                        Collectors.maxBy(Comparator.comparing(Student::getMarks))));

        System.out.println(groupBySubjects);
    }
}

Output:
{English=Optional[Student [name=Reshma, subject=English, marks=90]], Mathematics=Optional[Student [name=Shweta, subject=Mathematics, marks=95]], Physics=Optional[Student [name=Kartik, subject=Physics, marks=82]]}
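The maxBy collector wraps each result in an Optional. If plain values are preferred, one possible approach is to wrap maxBy in Collectors.collectingAndThen and unwrap with Optional::get. The sketch below assumes a hypothetical Score class and TopScorerSketch name. Calling get() should be safe here because groupingBy only creates a group once it has seen at least one element for that key.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.stream.Collectors;

public class TopScorerSketch {

    // Hypothetical element type standing in for a student record.
    public static class Score {
        private final String name;
        private final String subject;
        private final int marks;

        public Score(String name, String subject, int marks) {
            this.name = name;
            this.subject = subject;
            this.marks = marks;
        }

        public String getName() { return name; }
        public String getSubject() { return subject; }
        public int getMarks() { return marks; }
    }

    public static List<Score> sample() {
        return Arrays.asList(
                new Score("Shweta", "Mathematics", 95),
                new Score("Reshma", "English", 90),
                new Score("Shrunalini", "Physics", 78),
                new Score("Suraj", "Mathematics", 84),
                new Score("Iqbal", "English", 75),
                new Score("Kartik", "Physics", 82));
    }

    // maxBy yields Optional<Score>; collectingAndThen applies Optional::get as a finisher,
    // so the map values are plain Score objects.
    public static Map<String, Score> topScorerBySubject(List<Score> scores) {
        return scores.stream()
                .collect(Collectors.groupingBy(Score::getSubject,
                        Collectors.collectingAndThen(
                                Collectors.maxBy(Comparator.comparingInt(Score::getMarks)),
                                Optional::get)));
    }

    public static void main(String[] args) {
        Map<String, Score> top = topScorerBySubject(sample());
        System.out.println(top.get("Mathematics").getName()); // Shweta
        System.out.println(top.get("English").getName());     // Reshma
        System.out.println(top.get("Physics").getName());     // Kartik
    }
}
```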
static <T, K, D, A, M extends Map<K, D>> Collector<T, ?, M> groupingBy(Function<? super T, ? extends K> classifier, Supplier<M> mapFactory, Collector<? super T, A, D> downstream)
This method allows us to group the elements of a stream by a specified classifier, process them with a specified downstream collector, and store the results in a map provided by a custom map factory.
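a. <T, K, D, A, M extends Map<K, D>>: These are the generic type parameters used by this method: T is the type of the stream elements, K is the type of the keys, D is the result type of the downstream collector, A is the intermediate accumulation type of the downstream collector, and M is the type of the resulting Map.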
b. Function<? super T, ? extends K> classifier: This is the classifier function used to assign a key to each element in the stream. It maps each element T to a key K.
c. Supplier<M> mapFactory: mapFactory provides a custom map implementation that is used for storing the results. The supplier is a factory method that produces a new empty map instance that will be populated by the collector.
d. Collector<? super T, A, D> downstream: This is the downstream collector that accumulates the elements of the stream once they are grouped by the classifier. The downstream collector works on elements of type T, accumulates them into a structure of type A, and produces a result of type D.
Write a program to count the number of employees in each department and store the results in a LinkedHashMap, so that the departments stay in the order in which they are first encountered.
package com.javahands.collectors.grouping;

import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public class GroupingDemo {

    public static void main(String[] args) {
        List<Employee> listOfEmployees = Arrays.asList(
                new Employee(2, "Reshma", "D3"),
                new Employee(1, "Shweta", "D1"),
                new Employee(3, "Poornima", "D3"),
                new Employee(4, "Shubhangi", "D1"),
                new Employee(5, "Suraj", "D1"),
                new Employee(6, "Iqbal", "D2"),
                new Employee(7, "Amit", "D1"),
                new Employee(8, "Suchit", "D2"));

        // LinkedHashMap::new is the map factory; it keeps keys in insertion order.
        Collector<Employee, ?, LinkedHashMap<String, Long>> collector = Collectors.groupingBy(
                Employee::getDepartment,
                LinkedHashMap::new,
                Collectors.counting());

        LinkedHashMap<String, Long> employeesByDepartment = listOfEmployees.stream()
                .collect(collector);

        System.out.println(employeesByDepartment);
    }
}

Output:
{D3=2, D1=4, D2=2}
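The map factory can supply other Map implementations as well. As a minimal sketch, passing TreeMap::new should give keys in sorted order rather than encounter order. The class name SortedGroupingSketch is hypothetical, and for brevity the stream elements here are plain department IDs, so Function.identity() serves as the classifier.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class SortedGroupingSketch {

    // TreeMap::new is the map factory, so the result keeps its keys in natural (sorted) order.
    public static TreeMap<String, Long> countSorted(List<String> departmentIds) {
        return departmentIds.stream()
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        TreeMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> departmentIds =
                Arrays.asList("D3", "D1", "D3", "D1", "D1", "D2", "D1", "D2");

        System.out.println(countSorted(departmentIds)); // {D1=4, D2=2, D3=2}
    }
}
```

Even though D3 appears first in the input, the TreeMap orders the keys as D1, D2, D3.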
In the above section, we have learned about the 3 types of groupingBy functions. Other than the above 3 types of groupingBy functions we also have a set of groupingByConcurrent functions, which are as follows.
static<T, K> Collector<T,?,ConcurrentMap<K, List<T>>> groupingByConcurrent(Function<? super T, ? extends K> classifier)
static <T, K, A, D> Collector<T, ?, ConcurrentMap<K,D>> groupingByConcurrent(Function<? super T, ? extends K> classifier, Collector<? super T, A, D> downstream)
static <T, K, D, A, M extends ConcurrentMap<K, D>> Collector<T, ?, M> groupingByConcurrent(Function<? super T, ? extends K> classifier, Supplier<M> mapFactory, Collector<? super T, A, D> downstream)
The groupingByConcurrent functions are almost the same as the groupingBy functions, with a few differences.
groupingBy: This method is typically used with sequential streams. It also produces correct results with parallel streams, so it is a safe choice for any stream, but it may not be as efficient as groupingByConcurrent in truly parallel scenarios.
groupingByConcurrent: It is best used with parallel streams when performance in a multithreaded environment is a priority and the order of elements within each group does not matter.
groupingBy: When used in a parallel stream, the operation is still safe, but not because of locking. Each thread accumulates its part of the stream into its own map, and the partial maps are then merged key by key. The Javadoc notes that this merge can be an expensive operation, which is why groupingBy is not specifically optimized for parallel pipelines.
groupingByConcurrent: As the name suggests, this method is specifically designed for concurrent execution in a parallel stream. It is a concurrent and unordered collector: all threads accumulate into one shared ConcurrentMap, such as ConcurrentHashMap, so no merge step is needed. This can lead to performance improvements when processing large datasets in parallel, at the cost of not preserving encounter order.
groupingBy: By default it stores the grouping results in a HashMap, although the Javadoc makes no guarantee about the type of the Map returned. We can also supply a custom map factory to use a different type of map, such as a TreeMap or LinkedHashMap.
groupingByConcurrent: It uses a ConcurrentHashMap by default to store the results, ensuring thread safety during updates. We can also provide a custom map factory if we want to use a different type of concurrent map.
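As a sketch of the custom concurrent map factory mentioned above, the three-argument groupingByConcurrent can be given ConcurrentSkipListMap::new, which is a ConcurrentMap that also keeps its keys sorted. The class name ConcurrentGroupingSketch and the use of plain department IDs as stream elements are assumptions made for brevity.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ConcurrentGroupingSketch {

    // All worker threads of the parallel stream accumulate into one shared ConcurrentSkipListMap.
    public static ConcurrentSkipListMap<String, Long> countConcurrently(List<String> departmentIds) {
        return departmentIds.parallelStream()
                .collect(Collectors.groupingByConcurrent(
                        Function.identity(),
                        ConcurrentSkipListMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        List<String> departmentIds =
                Arrays.asList("D3", "D1", "D3", "D1", "D1", "D2", "D1", "D2");

        System.out.println(countConcurrently(departmentIds)); // {D1=4, D2=2, D3=2}
    }
}
```

The counts do not depend on the order in which threads process the elements, so the result is the same on every run even though the stream is parallel. A downstream collector that builds a list would not give that guarantee for the order inside each list.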
Write a program to group the Students by subjects using groupingBy and groupingByConcurrent.
package com.javahands.collectors.grouping;

import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class GroupingEx {

    public static void main(String[] args) {
        List<Student> studentList = Arrays.asList(
                new Student("Shweta", "Mathematics", 95),
                new Student("Reshma", "English", 90),
                new Student("Shrunalini", "Physics", 78),
                new Student("Suraj", "Mathematics", 84),
                new Student("Iqbal", "English", 75),
                new Student("Kartik", "Physics", 82),
                new Student("Poornima", "Mathematics", 85),
                new Student("Amit", "Mathematics", 65));

        // Sequential stream with groupingBy: lists keep encounter order.
        Map<String, List<Student>> groupingBySubjects = studentList.stream()
                .collect(Collectors.groupingBy(Student::getSubject));

        System.out.println("Grouping By : " + groupingBySubjects);
        System.out.println("Resultant map type for groupingBy : " + groupingBySubjects.getClass());

        // Parallel stream with groupingByConcurrent: the order inside each list may vary between runs.
        Map<String, List<Student>> groupingByConcurrentSubjects = studentList.parallelStream()
                .collect(Collectors.groupingByConcurrent(Student::getSubject));

        System.out.println("Grouping By Concurrent : " + groupingByConcurrentSubjects);
        System.out.println("Resultant map type for groupingByConcurrent : " + groupingByConcurrentSubjects.getClass());
    }
}

Output:
Grouping By : {English=[Student [name=Reshma, subject=English, marks=90], Student [name=Iqbal, subject=English, marks=75]], Mathematics=[Student [name=Shweta, subject=Mathematics, marks=95], Student [name=Suraj, subject=Mathematics, marks=84], Student [name=Poornima, subject=Mathematics, marks=85], Student [name=Amit, subject=Mathematics, marks=65]], Physics=[Student [name=Shrunalini, subject=Physics, marks=78], Student [name=Kartik, subject=Physics, marks=82]]}
Resultant map type for groupingBy : class java.util.HashMap
Grouping By Concurrent : {English=[Student [name=Iqbal, subject=English, marks=75], Student [name=Reshma, subject=English, marks=90]], Mathematics=[Student [name=Amit, subject=Mathematics, marks=65], Student [name=Poornima, subject=Mathematics, marks=85], Student [name=Suraj, subject=Mathematics, marks=84], Student [name=Shweta, subject=Mathematics, marks=95]], Physics=[Student [name=Kartik, subject=Physics, marks=82], Student [name=Shrunalini, subject=Physics, marks=78]]}
Resultant map type for groupingByConcurrent : class java.util.concurrent.ConcurrentHashMap
So this is all about Grouping in Java 8. If you have any questions on this topic, please raise them in the comments section. If you liked this article then please share this post with your friends and colleagues.