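Grouping with Java Streams
When processing large amounts of data, we often need to classify or group records by some attribute before computing statistics or running further analysis. The Stream API introduced in Java 8 provides the groupingBy collector, which makes this kind of grouping straightforward. This article walks through grouping operations on Java streams in detail.
What is stream grouping?
Stream grouping means classifying the elements of a stream by a given attribute or condition, so that elements sharing the same attribute end up in the same group. It is similar to the GROUP BY clause in SQL. In the Java Stream API, grouping is done mainly through the Collectors.groupingBy() method.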
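Note
groupingBy is a collector passed to collect(), and collect() is a terminal operation of the Stream API: calling it triggers evaluation of the stream and returns the result.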
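To make the note concrete, here is a minimal sketch of how the terminal operation triggers the work. The class name LazyGroupingSketch, the helper groupByLength, and the sample words are illustrative assumptions, not part of the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyGroupingSketch {
    // Groups words by length; every element that flows through the pipeline
    // is also recorded in `seen` so we can observe when processing happens.
    static Map<Integer, List<String>> groupByLength(List<String> words, List<String> seen) {
        // peek is an intermediate operation: building the pipeline processes nothing yet.
        Stream<String> pipeline = words.stream().peek(seen::add);
        System.out.println("Seen before collect: " + seen.size());

        // collect is the terminal operation; groupingBy only describes how to accumulate.
        Map<Integer, List<String>> grouped =
                pipeline.collect(Collectors.groupingBy(String::length));
        System.out.println("Seen after collect: " + seen.size());
        return grouped;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        Map<Integer, List<String>> grouped =
                groupByLength(Arrays.asList("a", "bb", "cc", "ddd"), seen);
        // The two-letter words, in encounter order.
        System.out.println(grouped.get(2));
    }
}
```

The list passed in as seen stays empty until collect runs, which is why the size is printed both before and after the call.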
Basic grouping
Let's start with a simple example. Suppose we have a list of students and need to group them by grade.
First, define a Student class:
public class Student {
    private String name;
    private int grade;
    private String major;
    private double score;

    // Constructor and getters
    public Student(String name, int grade, String major, double score) {
        this.name = name;
        this.grade = grade;
        this.major = major;
        this.score = score;
    }

    public String getName() { return name; }
    public int getGrade() { return grade; }
    public String getMajor() { return major; }
    public double getScore() { return score; }

    @Override
    public String toString() {
        return name + "(grade=" + grade + ", major=" + major + ", score=" + score + ")";
    }
}
Now we create a list of students and group it by grade:
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class StreamGroupingExample {
    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
            new Student("Alice", 1, "Computer Science", 85.5),
            new Student("Bob", 2, "Physics", 76.0),
            new Student("Charlie", 1, "Mathematics", 92.5),
            new Student("David", 2, "Computer Science", 88.0),
            new Student("Eve", 3, "Mathematics", 90.5),
            new Student("Frank", 3, "Computer Science", 71.5)
        );

        // Group by grade
        Map<Integer, List<Student>> studentsByGrade = students.stream()
            .collect(Collectors.groupingBy(Student::getGrade));

        // Print the result
        studentsByGrade.forEach((grade, studentList) -> {
            System.out.println("Grade " + grade + " students:");
            studentList.forEach(student -> System.out.println(" " + student));
        });
    }
}
Output:
Grade 1 students:
 Alice(grade=1, major=Computer Science, score=85.5)
 Charlie(grade=1, major=Mathematics, score=92.5)
Grade 2 students:
 Bob(grade=2, major=Physics, score=76.0)
 David(grade=2, major=Computer Science, score=88.0)
Grade 3 students:
 Eve(grade=3, major=Mathematics, score=90.5)
 Frank(grade=3, major=Computer Science, score=71.5)
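In this example, Collectors.groupingBy(Student::getGrade) groups the students by grade. The result is a Map whose keys are the grades and whose values are the lists of students in each grade.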
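The Collectors documentation makes no guarantee about the type of Map returned by the one-argument groupingBy, so the key order of the result should not be relied on. When sorted keys matter, the three-argument overload accepts a map factory. The following is a minimal sketch under that assumption; the class SortedGroupingSketch, its trimmed-down Student stand-in, and the helper namesByGrade are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.stream.Collectors;

class SortedGroupingSketch {
    // Minimal stand-in for the article's Student class (only the fields used here).
    static class Student {
        private final String name;
        private final int grade;

        Student(String name, int grade) {
            this.name = name;
            this.grade = grade;
        }

        String getName() { return name; }
        int getGrade() { return grade; }
    }

    // groupingBy(classifier, mapFactory, downstream): TreeMap::new keeps the keys sorted,
    // and mapping(...) stores only the names in each group.
    static TreeMap<Integer, List<String>> namesByGrade(List<Student> students) {
        return students.stream().collect(Collectors.groupingBy(
                Student::getGrade,
                TreeMap::new,
                Collectors.mapping(Student::getName, Collectors.toList())));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Eve", 3),
                new Student("Alice", 1),
                new Student("Bob", 2),
                new Student("Charlie", 1));
        // Keys are iterated in ascending order regardless of input order.
        namesByGrade(students).forEach((grade, names) ->
                System.out.println("Grade " + grade + ": " + names));
    }
}
```

Passing LinkedHashMap::new instead would preserve the order in which each key was first encountered.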
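Multi-level grouping
Sometimes we need more than one level of grouping, for example first by grade and then by major. The Stream API supports this by nesting one groupingBy inside another as the downstream collector: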
// Multi-level grouping: first by grade, then by major
Map<Integer, Map<String, List<Student>>> studentsByGradeAndMajor = students.stream()
    .collect(Collectors.groupingBy(
        Student::getGrade,
        Collectors.groupingBy(Student::getMajor)
    ));

// Print the result
studentsByGradeAndMajor.forEach((grade, majorMap) -> {
    System.out.println("Grade " + grade + ":");
    majorMap.forEach((major, studentList) -> {
        System.out.println(" Major: " + major);
        studentList.forEach(student -> System.out.println(" " + student));
    });
});
Output (the order of the majors within a grade is not guaranteed and may differ on your machine):
Grade 1:
 Major: Computer Science
 Alice(grade=1, major=Computer Science, score=85.5)
 Major: Mathematics
 Charlie(grade=1, major=Mathematics, score=92.5)
Grade 2:
 Major: Physics
 Bob(grade=2, major=Physics, score=76.0)
 Major: Computer Science
 David(grade=2, major=Computer Science, score=88.0)
Grade 3:
 Major: Mathematics
 Eve(grade=3, major=Mathematics, score=90.5)
 Major: Computer Science
 Frank(grade=3, major=Computer Science, score=71.5)
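Once a nested map like this exists, looking up a single group means going through two levels, either of which may be missing. The sketch below shows one way to do that safely. The class NestedLookupSketch, the helper namesIn, and the reduced Student stand-in are assumptions made for illustration, and the inner lists hold names rather than full Student objects to keep the example short.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NestedLookupSketch {
    // Minimal stand-in for the article's Student class.
    static class Student {
        private final String name;
        private final int grade;
        private final String major;

        Student(String name, int grade, String major) {
            this.name = name;
            this.grade = grade;
            this.major = major;
        }

        String getName() { return name; }
        int getGrade() { return grade; }
        String getMajor() { return major; }
    }

    // Two-level grouping: grade, then major; the innermost lists keep only names.
    static Map<Integer, Map<String, List<String>>> group(List<Student> students) {
        return students.stream().collect(Collectors.groupingBy(
                Student::getGrade,
                Collectors.groupingBy(
                        Student::getMajor,
                        Collectors.mapping(Student::getName, Collectors.toList()))));
    }

    // Hypothetical helper: returns an empty list when the grade or the major has no entry.
    static List<String> namesIn(Map<Integer, Map<String, List<String>>> grouped,
                                int grade, String major) {
        return grouped
                .getOrDefault(grade, Collections.emptyMap())
                .getOrDefault(major, Collections.emptyList());
    }

    public static void main(String[] args) {
        Map<Integer, Map<String, List<String>>> grouped = group(Arrays.asList(
                new Student("Alice", 1, "Computer Science"),
                new Student("Charlie", 1, "Mathematics"),
                new Student("Bob", 2, "Physics")));
        System.out.println(namesIn(grouped, 1, "Mathematics"));
        System.out.println(namesIn(grouped, 4, "Physics")); // no grade 4: empty list, no exception
    }
}
```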
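Grouping with aggregation
Beyond plain grouping, we usually want to aggregate each group: count its elements, sum a field, compute an average, and so on. The Collectors class provides a rich set of aggregating collectors for this:
Counting per group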
// Count the students in each grade
Map<Integer, Long> studentCountByGrade = students.stream()
    .collect(Collectors.groupingBy(
        Student::getGrade,
        Collectors.counting()
    ));

System.out.println("Student count by grade:");
studentCountByGrade.forEach((grade, count) -> {
    System.out.println(" Grade " + grade + ": " + count + " students");
});
Output:
Student count by grade:
 Grade 1: 2 students
 Grade 2: 2 students
 Grade 3: 2 students
Averaging
// Compute the average score of the students in each grade
Map<Integer, Double> averageScoreByGrade = students.stream()
    .collect(Collectors.groupingBy(
        Student::getGrade,
        Collectors.averagingDouble(Student::getScore)
    ));

System.out.println("Average score by grade:");
averageScoreByGrade.forEach((grade, avgScore) -> {
    System.out.println(" Grade " + grade + ": " + avgScore);
});
Output:
Average score by grade:
 Grade 1: 89.0
 Grade 2: 82.0
 Grade 3: 81.0
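Summing follows the same pattern as counting and averaging. As a sketch, Collectors.summingDouble yields a per-grade total, and Collectors.summarizingDouble collects count, sum, min, max, and average in one pass. The class GroupSumSketch, its helper methods, and the reduced Student stand-in are illustrative assumptions; the scores are the ones used throughout the article.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class GroupSumSketch {
    // Minimal stand-in for the article's Student class (only grade and score).
    static class Student {
        private final int grade;
        private final double score;

        Student(int grade, double score) {
            this.grade = grade;
            this.score = score;
        }

        int getGrade() { return grade; }
        double getScore() { return score; }
    }

    // Total score per grade.
    static Map<Integer, Double> totalScoreByGrade(List<Student> students) {
        return students.stream().collect(Collectors.groupingBy(
                Student::getGrade,
                Collectors.summingDouble(Student::getScore)));
    }

    // Count, sum, min, max, and average per grade, gathered in a single pass.
    static Map<Integer, DoubleSummaryStatistics> statsByGrade(List<Student> students) {
        return students.stream().collect(Collectors.groupingBy(
                Student::getGrade,
                Collectors.summarizingDouble(Student::getScore)));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student(1, 85.5), new Student(2, 76.0), new Student(1, 92.5),
                new Student(2, 88.0), new Student(3, 90.5), new Student(3, 71.5));

        totalScoreByGrade(students).forEach((grade, total) ->
                System.out.println("Grade " + grade + " total: " + total));

        statsByGrade(students).forEach((grade, stats) ->
                System.out.println("Grade " + grade + " min=" + stats.getMin()
                        + ", max=" + stats.getMax() + ", avg=" + stats.getAverage()));
    }
}
```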
Finding the top score
// Find the highest-scoring student in each grade
// (this snippet additionally requires: import java.util.Comparator;)
Map<Integer, Student> topStudentByGrade = students.stream()
    .collect(Collectors.groupingBy(
        Student::getGrade,
        Collectors.collectingAndThen(
            Collectors.maxBy(
                Comparator.comparingDouble(Student::getScore)
            ),
            // maxBy yields an Optional; unwrap it for easier use
            optional -> optional.orElse(null)
        )
    ));

System.out.println("Top score by grade:");
topStudentByGrade.forEach((grade, student) -> {
    if (student != null) {
        System.out.println(" Grade " + grade + ": " + student.getName() + " - " + student.getScore());
    }
});
Output:
Top score by grade:
 Grade 1: Charlie - 92.5
 Grade 2: David - 88.0
 Grade 3: Eve - 90.5
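If unwrapping the Optional feels clumsy, one alternative is Collectors.toMap with a merge function that keeps the higher-scoring student. This is a different technique from the groupingBy and maxBy combination above, shown here only as a sketch; the class TopByGradeSketch, the helper topByGrade, and the reduced Student stand-in are hypothetical names.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

class TopByGradeSketch {
    // Minimal stand-in for the article's Student class.
    static class Student {
        private final String name;
        private final int grade;
        private final double score;

        Student(String name, int grade, double score) {
            this.name = name;
            this.grade = grade;
            this.score = score;
        }

        String getName() { return name; }
        int getGrade() { return grade; }
        double getScore() { return score; }
    }

    // toMap(keyMapper, valueMapper, mergeFunction): when two students share a grade,
    // the merge function keeps whichever one has the higher score.
    static Map<Integer, Student> topByGrade(List<Student> students) {
        return students.stream().collect(Collectors.toMap(
                Student::getGrade,
                Function.identity(),
                BinaryOperator.maxBy(Comparator.comparingDouble(Student::getScore))));
    }

    public static void main(String[] args) {
        List<Student> students = Arrays.asList(
                new Student("Alice", 1, 85.5), new Student("Bob", 2, 76.0),
                new Student("Charlie", 1, 92.5), new Student("David", 2, 88.0),
                new Student("Eve", 3, 90.5), new Student("Frank", 3, 71.5));

        topByGrade(students).forEach((grade, student) ->
                System.out.println("Grade " + grade + ": "
                        + student.getName() + " - " + student.getScore()));
    }
}
```

Because every key in the result comes from at least one element, the values are never null and no Optional handling is needed.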