JFR CPU-Time Profiling in Java 25
Last Updated: December 19, 2025
By: javahandson
Series
Learn Java in an easy way
JFR CPU-Time Profiling in Java 25 introduces a more accurate way to analyze performance by sampling based on actual CPU consumption instead of elapsed time. This article explains profiling from first principles, the limitations of old CPU profiling, how CPU-time sampling works, how samples become flame graphs, and how to enable, analyze, and interpret JFR CPU-Time Profiling safely in production.
When we hear that a Java application is “slow,” the immediate question is: Slow because of what?
CPU? Memory? A loop? Database? Network? Without clear answers, performance tuning often turns into guesswork. This is exactly why profiling exists.
A Java application that was working fine earlier may suddenly start showing symptoms such as rising response times, unexplained CPU usage, or occasional timeouts.
At this stage, we only see the symptoms, not the root cause.
The codebase may be large, with thousands of methods, multiple threads, and external calls. From the outside, everything looks normal—except performance.
When performance issues appear, many developers instinctively start guessing at fixes.
These guesses are often based on intuition, past experience, or whichever part of the code happens to look suspicious.
Unfortunately, this approach usually leads to wasted effort and changes that never touch the real bottleneck.
In short, guessing rarely fixes performance problems.
Performance tuning has two very different approaches: guessing and measuring.
Profiling is about measuring performance, not guessing it.
In the simplest terms, profiling means observing a running application and recording where time is actually spent.
A profiler answers questions like: Which methods consume the most CPU? Which call paths are the hottest? Which threads are actually doing the work?
You can think of profiling as a CCTV camera for your running code: it quietly records what actually happened, so you can replay the facts later instead of relying on assumptions.
Why Developers Often Optimize the Wrong Code
A key rule in performance engineering is: Most of the execution time is spent in a very small portion of the code. This is commonly explained using the 80/20 rule:
80% of execution time is spent in 20% of the code
Sometimes, even 90% in just 5%
Without profiling, developers tend to optimize the code that looks complicated rather than the code that is actually hot.
Profiling highlights the small fraction of code where execution time is really concentrated, so effort goes where it matters.
One of the most common sources of confusion in performance discussions is the difference between CPU time and elapsed (wall-clock) time.
Many beginners assume they are the same thing.
They are not.
Understanding this distinction is essential before learning profiling, because profilers measure CPU time, not just how long something appears to take.
Elapsed time is the real-world time that passes between two moments. For example, we call an API and the response comes back after 5 seconds.
Elapsed time = 5 seconds
Elapsed time includes everything: CPU work, waiting for I/O, lock contention, garbage collection pauses, and time spent scheduled off the CPU.
It measures how long the user waited, not how much work the CPU actually did.
CPU time is the amount of time the CPU actively spends executing instructions for your program. It counts only the time your code is genuinely running on a core.
It does not include time spent sleeping, blocked on I/O, waiting for locks, or waiting to be scheduled.
If a method runs for 5 seconds in elapsed time but spends most of that time waiting, the CPU time could be only a few milliseconds.
Simple Example: Waiting vs Computing
Let’s look at two very simple scenarios.
Scenario 1: Waiting
Thread.sleep(5000);
The thread is alive, but the CPU is not doing any work.
Scenario 2: Computing
for (long i = 0; i < 5_000_000_000L; i++) {
// busy computation
}
Here, the CPU is continuously executing instructions.
Same Elapsed Time, Very Different CPU Usage
| Scenario | Elapsed Time | CPU Time |
|---|---|---|
| Sleeping | 5 seconds | ~0 sec |
| Computing | 5 seconds | ~5 sec |
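If you want to see this difference yourself, here is a minimal sketch using the standard ThreadMXBean API. The sleep duration and loop size are arbitrary, and the exact numbers will vary by machine, but the sleeping phase should add almost nothing to the CPU-time counter.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuVsElapsed {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        long elapsedStart = System.nanoTime();
        long cpuStart = threads.getCurrentThreadCpuTime(); // returns -1 if unsupported on this JVM

        Thread.sleep(2000);                         // waiting: elapsed time grows, CPU time barely moves
        long sum = 0;
        for (long i = 0; i < 500_000_000L; i++) {   // computing: elapsed time and CPU time grow together
            sum += i;
        }

        long elapsedMs = (System.nanoTime() - elapsedStart) / 1_000_000;
        long cpuMs = (threads.getCurrentThreadCpuTime() - cpuStart) / 1_000_000;
        System.out.println("Elapsed: " + elapsedMs + " ms, CPU: " + cpuMs + " ms (sum=" + sum + ")");
    }
}
```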
The confusion happens because both scenarios take exactly the same amount of real time.
As a result, anything slow is often assumed to be CPU-heavy.
In reality, a large share of elapsed time is frequently spent waiting rather than computing.
Profilers focus on CPU time because it shows where the processor is actually doing work, which is the part you can optimize in code.
This is why profiling tools highlight compute-heavy call paths rather than methods that merely wait.
Before discussing advanced profiling techniques or new Java features, it is important to understand the tool that makes all of this possible: Java Flight Recorder, commonly known as JFR.
Java Flight Recorder is a built-in monitoring and profiling framework inside the JVM. It continuously collects detailed information about what a Java application is doing while it is running. This includes how the CPU is being used, how threads behave, how memory is allocated, and how the garbage collector operates. All this information is captured with very low overhead and stored in a compact recording file.
In simple terms, JFR allows us to observe the JVM from the inside, without changing application code or affecting normal execution.
Performance problems rarely happen in a clean, predictable way. They often appear only under real traffic, specific workloads, or production data. By the time a developer tries to reproduce the issue locally, the problem has already disappeared.
Traditional debugging tools are not well suited for these situations. Debuggers pause execution, adding logs changes timing behavior, and test environments rarely match production conditions. Java Flight Recorder was created to solve this exact problem: to observe real application behavior without disturbing it.
JFR is designed to run continuously, even in production, so that when a problem occurs, the data needed to understand it already exists.
At a high level, JFR works by recording events generated by the JVM. An event represents something meaningful that happened during execution—such as a thread running on the CPU, a method being sampled, a garbage collection pause, or a lock being contended.
These events are collected into a recording, which runs for a defined duration or continuously using a circular buffer. The recording is stored as a .jfr file, which can later be analyzed using tools like Java Mission Control.
The important point is that JFR does not rely on application code to produce data. The JVM itself emits structured events, each containing rich information such as timestamps, thread details, stack traces, and durations. This makes the recorded data accurate and consistent.
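As a small illustration of how little ceremony this requires, here is a minimal sketch that starts a recording from inside the application using the standard jdk.jfr API and dumps it to a file. The file name and the sleep are placeholders for whatever workload you actually want to observe.

```java
import java.nio.file.Path;
import jdk.jfr.Recording;

public class JfrRecordingDemo {
    public static void main(String[] args) throws Exception {
        try (Recording recording = new Recording()) {
            recording.start();                       // the JVM begins emitting events into this recording

            Thread.sleep(1000);                      // stand-in for the workload you want to observe

            recording.stop();
            recording.dump(Path.of("demo.jfr"));     // write a .jfr file for later analysis in JMC or `jfr`
        }
    }
}
```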
Many developers try to understand performance issues using logs, but logging and profiling serve very different purposes.
Logging is something we add manually. We decide where to log, what to log, and when to log. This means logs reflect what we think is important. Logs also introduce overhead, increase I/O, and often require redeploying the application to add new log statements.
Profiling, on the other hand, is observational. With JFR, the JVM records what actually happens during execution, not what we assumed would happen. No code changes are required, and the data is collected in a highly efficient way. Profiling captures facts, not guesses.
This is why profiling is far more reliable than logs when diagnosing performance problems.
A common concern is whether running JFR in production is risky. This concern is understandable, but JFR was specifically designed to be production-safe.
Because JFR is built directly into the JVM, it avoids heavy instrumentation and excessive data collection. The overhead is typically very low and predictable. Many organizations run JFR continuously in production with rotating recordings, so they always have historical data available if an issue occurs.
This design allows teams to analyze performance problems after they happen, without trying to reproduce them or adding emergency logging.
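As an illustration, an always-on setup might combine the standard StartFlightRecording options for disk-backed recording, retention limits, and a dump on exit; the limits below are arbitrary examples, not recommendations.

```
java \
  -XX:StartFlightRecording=disk=true,maxage=24h,maxsize=500m,dumponexit=true,filename=always-on.jfr \
  -jar app.jar
```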
Java Flight Recorder is the foundation of modern Java profiling. It provides a unified, reliable view of JVM behavior, covering CPU usage, threads, memory, and more—all in one place.
Advanced profiling features, including CPU-time profiling introduced in newer Java versions, build on top of JFR’s event-based architecture. Without JFR, these features would not be possible in a safe and scalable way.
Before Java 25, Java profiling—especially CPU profiling—was mainly based on wall-clock (elapsed time) sampling. The idea was simple: at regular intervals, the JVM would take a snapshot of running threads and record their stack traces. Over time, these snapshots were aggregated to identify which methods appeared most often. Methods that showed up frequently were considered “hot” and assumed to be responsible for high CPU usage.
This approach is known as sampling-based profiling, and it has been widely used because it is lightweight, fast, and safe to run even in production environments. Java Flight Recorder relied on this technique for many years, and for long-running, CPU-bound workloads, it often produced reasonably useful results.
However, this technique is based on an important assumption: if a method appears often in elapsed-time samples, it must be consuming a lot of CPU. In practice, this assumption does not always hold true.
Wall-clock sampling observes when a thread is seen, not how much CPU it has actually consumed. A thread can appear in a stack trace even when it is waiting for I/O, blocked on a lock, parked by the scheduler, or executing native code. From the profiler’s point of view, the thread looks active, but from the CPU’s point of view, it may be doing very little work.
There is also the issue of missed execution. Sampling happens at fixed intervals, and threads that run briefly between those intervals may never be captured at all, even though they consumed CPU time. This becomes especially problematic for short-lived tasks, bursty workloads, and highly concurrent applications.
Because of these limitations, traditional profiling sometimes produced results that did not match what system-level monitoring showed. CPU usage could be high, but the profiler would fail to clearly identify where that CPU was actually being spent. Developers would then optimize methods that merely appeared frequently in samples, while the real CPU hotspots remained hidden.
This gap between observed elapsed time and actual CPU usage became more noticeable as Java applications grew more complex, relied more on native libraries, and ran in cloud and containerized environments.
Understanding how profiling worked before Java 25 is important because it explains why improvements were needed. The limitations were not caused by bad tools, but by the fact that profiling was based on elapsed time rather than true CPU time.
A Simple Example: How Samples Are Taken
To understand profiling, it helps to see how sampling actually works using a very simple example.
Imagine a Java application with just two methods:
- calculateReport() – does heavy computation
- waitForDatabase() – waits for a database response
Now, assume the profiler takes one sample every 10 milliseconds.
Each time a sample is taken, the JVM records which thread is running and the stack trace it is currently executing.
Over one second, the profiler takes 100 samples.
Let’s say the samples look like this:
| Sample Count | Method Seen |
|---|---|
| 65 samples | waitForDatabase() |
| 35 samples | calculateReport() |
From this data, the profiler concludes that waitForDatabase() is “hotter” than calculateReport().
But this conclusion is misleading.
What Actually Happened?
In reality:
- waitForDatabase() spent most of its time waiting
- calculateReport() did almost all of the CPU work
The waiting method appeared frequently in samples because the thread stayed there for a long time in wall-clock terms. The computing method ran in short bursts and finished quickly, so it appeared less often. The profiler was not wrong—it was simply measuring elapsed time presence, not CPU usage.
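To make this concrete, here is a hypothetical sketch of the two methods from the example. The sleep and loop sizes are arbitrary, chosen only so that the waiting method dominates elapsed time while the computing method does essentially all of the CPU work.

```java
public class SamplingExample {

    // Mostly waiting: shows up in many wall-clock samples, but burns almost no CPU.
    static void waitForDatabase() throws InterruptedException {
        Thread.sleep(650);                    // stand-in for a blocking database call
    }

    // Mostly computing: short bursts of genuine CPU work.
    static long calculateReport() {
        long total = 0;
        for (long i = 0; i < 50_000_000L; i++) {
            total += i * 31;
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long result = 0;
        for (int i = 0; i < 10; i++) {
            waitForDatabase();
            result += calculateReport();
        }
        System.out.println(result);           // keep the result live so the loop is not optimized away
    }
}
```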
Traditional Java CPU profiling, which relied mainly on wall-clock (elapsed time) sampling, had several important limitations that affected accuracy and decision-making.
First, it measured presence, not actual CPU work – A method appeared “hot” simply because a thread was observed inside it frequently, even if the thread was mostly waiting for I/O, locks, or scheduler time. This made elapsed-time-heavy code look CPU-intensive when it was not.
Second, CPU-heavy work could be underrepresented or missed entirely – Sampling happens at fixed intervals, so short but intense bursts of computation could execute between samples and never be recorded. As a result, genuinely expensive code sometimes appeared insignificant in the profile.
Third, CPU usage and profiler results often did not match – System-level monitoring might show high CPU usage, while the Java profiler failed to clearly identify where that CPU time was spent. This mismatch reduced trust in profiling data and made root-cause analysis harder.
Finally, native code and modern execution patterns were poorly reflected. When Java code called into native libraries or ran in highly concurrent environments, traditional profiling struggled to correctly attribute CPU usage back to the originating Java methods. This became more problematic as applications grew more complex.
JEP 509 is a Java 25 enhancement that improves how CPU profiling works in the JVM. It introduces CPU-time–based profiling to Java Flight Recorder, allowing the profiler to measure actual CPU consumption instead of relying on elapsed (wall-clock) time sampling. The goal is not to replace existing profiling, but to make CPU attribution more accurate and reliable.
With JEP 509, sampling is triggered based on how much CPU time a thread has consumed, rather than how much real time has passed. This means that methods are sampled when they truly burn CPU, not when a thread happens to be observed while waiting or blocked. As a result, the profiling data aligns much more closely with what the operating system reports as CPU usage.
This feature is Linux-only and experimental in Java 25. Linux-only means it relies on CPU-time mechanisms provided by the Linux kernel. Experimental means the feature is available for use, but its behavior, configuration, or APIs may evolve in future Java releases based on real-world feedback.
At a high level, JEP 509 closes a long-standing gap in Java profiling: it finally allows developers to see where CPU time actually goes. This makes performance analysis more trustworthy and optimization efforts far more effective, especially in modern, highly concurrent Java applications.
To understand CPU-time sampling, it helps to clearly separate two very different ways of looking at execution: clock-time sampling and CPU-time sampling. Traditional profilers asked, “Where is the thread right now?” at fixed points in real time. CPU-time sampling asks a different question: “After the CPU has done a certain amount of work, where did that work happen?” This change in perspective is the core idea behind JEP 509.
In clock-time (elapsed-time) sampling, samples are taken every N milliseconds of real time, regardless of whether the thread is actively executing or just waiting. If a thread is blocked on I/O or sleeping, it can still appear repeatedly in samples simply because it stays in the same method for a long time. This causes waiting code to look expensive, even though the CPU is doing little or no work.
CPU-time sampling works differently. Instead of sampling every fixed interval of wall-clock time, the JVM triggers a sample only after a thread has actually consumed a certain amount of CPU time. In other words, the profiler waits until real computation has happened before taking a snapshot. If a thread is waiting, blocked, or sleeping, it does not consume CPU time, and therefore, no samples are taken during that period.
This is why waiting does not generate samples in CPU-time profiling. Waiting does not burn CPU cycles, so from the profiler’s point of view, nothing interesting is happening. Samples appear only when instructions are being executed on the CPU. This simple rule removes a major source of noise present in traditional profiling.
Consider a simple example with two methods. One method performs heavy computation in a loop, while another method calls an external service and waits for a response. In elapsed-time sampling, the waiting method may dominate the profile because the thread spends more real time there. In CPU-time sampling, the compute-heavy method dominates, because it is the only place where the CPU is actually doing work.
The result is a profile that aligns with reality. CPU-time sampling highlights the code paths that truly consume CPU resources and naturally ignores waiting, blocking, and idle time. This makes the profile much easier to reason about and far more reliable when deciding what to optimize.
At a high level, CPU-time sampling shifts profiling from “where was the thread observed?” to “where did the CPU spend its time?” That single shift is what makes Java 25’s profiling behavior fundamentally more accurate than what existed before.
Java 25 introduces a significant improvement to Java Flight Recorder by adding true CPU-time–based profiling. This enhancement is centered around a new JFR event called jdk.CPUTimeSample, which changes how and when profiling samples are collected. Instead of relying on elapsed (wall-clock) time, JFR can now sample execution based on actual CPU time consumed by a thread.
The jdk.CPUTimeSample event is generated when a thread has consumed a fixed amount of CPU time. In simple terms, the JVM now waits until the CPU has done a certain amount of real work before taking a sample. This ensures that every recorded sample represents actual computation, not waiting, blocking, or idle time. As a result, the profiling data directly reflects where CPU cycles are truly being spent.
To make this possible, Java 25 leverages Linux CPU timers. At a high level, these timers are provided by the Linux operating system and allow the JVM to track how much CPU time a thread has consumed, independent of wall-clock time. Because this mechanism is provided by the OS kernel, it is precise and efficient, but it also explains why the feature is Linux-only in its initial release.
Another important improvement is that CPU-time sampling works even when execution moves into native code. In earlier profiling approaches, time spent inside native libraries was often invisible or poorly attributed. With the new CPU-time sampling, JFR can track CPU consumption across Java and native boundaries and then correctly attribute that CPU usage back to the Java method that initiated the call.
This attribution is crucial for modern Java applications, which frequently rely on native code for networking, cryptography, compression, and I/O. Even though the work happens outside the JVM, developers can now see which Java call paths are responsible for CPU usage, making profiles far more accurate and actionable.
Overall, what’s new in Java 25 is not just another event, but a shift in how CPU profiling works. By sampling based on CPU time, using OS-level timers, and correctly handling native execution, JFR produces profiles that closely match real CPU usage and eliminate much of the ambiguity present in older profiling approaches.
Once CPU-time sampling is enabled, the JVM starts generating samples—but what exactly are these samples, and what happens to them next? Understanding this is important because profiling is not about exact timing. It is about collecting and interpreting samples correctly.
A single sample is not a measurement of how long a method ran. Instead, it is a snapshot taken after a certain amount of CPU time has been consumed. Each sample typically contains the thread that was running, the full Java stack trace at that moment, and metadata such as timestamps and CPU-related information. In the case of CPU-time profiling, the sample represents a point where a thread has just consumed a fixed amount of CPU time.
These samples are written into a Java Flight Recorder recording, which is stored as a .jfr file. Internally, the recording is a highly optimized binary format designed to store large numbers of events efficiently with minimal overhead. The JVM continuously appends samples to this recording while it is running, either until the recording is stopped or the buffer rotates.
It is important to understand that profiling works by counting samples, not by measuring exact durations. The profiler does not say, “This method took 12.3 milliseconds.” Instead, it says, “this method appeared in 120 out of 1,000 samples.” From this, we infer relative CPU usage. This statistical approach is what allows profiling to remain lightweight and safe for production.
More samples directly indicate more CPU consumption. If a method appears twice as often as another in CPU-time samples, it means the CPU spent roughly twice as much time executing that method or its callees. The exact timing is approximate, but the relative proportions are highly reliable.
This is why profilers can confidently highlight hotspots without measuring every instruction. Sampling trades exact precision for scalability and low overhead, and CPU-time sampling makes that tradeoff far more accurate than elapsed-time sampling ever could.
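To see that “counting samples” is really all there is to it, here is a minimal sketch that reads a recording with the standard jdk.jfr.consumer API and counts how often each top-of-stack method appears in jdk.CPUTimeSample events. The file name is a placeholder, and the sketch assumes the event is present in the recording.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class CountCpuSamples {
    public static void main(String[] args) throws Exception {
        Map<String, Integer> counts = new HashMap<>();
        try (RecordingFile recording = new RecordingFile(Path.of("cpu-time.jfr"))) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!event.getEventType().getName().equals("jdk.CPUTimeSample")) {
                    continue;
                }
                var stack = event.getStackTrace();
                if (stack == null || stack.getFrames().isEmpty()) {
                    continue;
                }
                // Leaf frame: the method that was on the CPU when the sample fired.
                var topFrame = stack.getFrames().get(0);
                String method = topFrame.getMethod().getType().getName()
                        + "." + topFrame.getMethod().getName();
                counts.merge(method, 1, Integer::sum);
            }
        }
        // More samples ≈ more CPU consumed; print the ten most frequent methods.
        counts.entrySet().stream()
              .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
              .limit(10)
              .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```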
This is the point where profiling stops being abstract and starts making sense.
Up to now, we’ve talked about samples—snapshots of stack traces taken when CPU time is consumed. But raw samples are just data. To understand them intuitively, we need a visual representation. That representation is the flame graph.
A flame graph is a visualization that shows where CPU time is spent across the call stacks of an application.
It does not show exact durations, timelines, or the order in which methods executed.
Instead, it shows which call paths the CPU-time samples landed in, and how often.
Think of a flame graph as a summary of all samples merged together.
How Do Stack Traces Turn into a Flame Graph?
Let’s start from samples, because flame graphs are nothing but samples counted and merged.
Step 1: The Actual Samples Collected
Assume CPU-time sampling produced 5 samples like this:
Sample 1: main → handleRequest → calculateFees
Sample 2: main → handleRequest → calculateFees
Sample 3: main → handleRequest → calculateFees
Sample 4: main → handleRequest → validateRequest
Sample 5: main → handleRequest → validateRequest
So the sample count per method is:
| Method | Samples |
|---|---|
| main | 5 |
| handleRequest | 5 |
| calculateFees | 3 |
| validateRequest | 2 |
This table is the truth. Everything in the flame graph comes directly from this.
Step 2: How Flame Graph Width Is Built
Flame graphs do not show samples vertically one by one. They merge identical call paths and expand them horizontally based on count. Think of each sample as one horizontal unit. Now lay them side by side:
Sample units (each block = 1 CPU-time sample)
[main][main][main][main][main]
Since main appears in all 5 samples, it gets 5 units of width.
Same for handleRequest:
[handleRequest][handleRequest][handleRequest][handleRequest][handleRequest]
Step 3: The Crucial Part — Splitting at the Leaf Methods
Now the stack branches at the last method:
Samples 1–3 → calculateFees
Samples 4–5 → validateRequest
So visually, the flame graph looks like this:
[calculateFees][calculateFees][calculateFees][validateReq  ][validateReq  ]
[                             handleRequest                               ]
[                                  main                                   ]
But that’s still a bit abstract — so let’s compress it into a proper flame-graph style bar.
Step 4: Final Flame Graph Representation
Width proportional to CPU samples
(each █ = 1 sample)
calculateFees ███
validateRequest ██
handleRequest █████
main █████
Now it should be obvious:
- calculateFees has 3 samples → 3 blocks
- validateRequest has 2 samples → 2 blocks
- Therefore, calculateFees is wider
- Wider = more CPU consumed
Because these samples were taken only after fixed amounts of CPU time, the width of each box is directly proportional to CPU consumed, and code that merely waits contributes no width at all.
So the flame graph is no longer misleading — it’s directly tied to CPU usage.
In a flame graph, width is nothing more than “how many times this method appeared in CPU-time samples.”
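For completeness, here is a sketch of how full stack paths from a recording could be folded into the “collapsed stack” text format that common flame-graph tools consume (one line per unique path, root first, followed by its sample count). The file name is a placeholder, and the semicolon-separated output format is simply the widely used convention, not something JFR itself produces.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordingFile;

public class CollapseStacks {
    public static void main(String[] args) throws Exception {
        Map<String, Long> pathCounts = new HashMap<>();
        try (RecordingFile recording = new RecordingFile(Path.of("cpu-time.jfr"))) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!event.getEventType().getName().equals("jdk.CPUTimeSample")) {
                    continue;
                }
                var stack = event.getStackTrace();
                if (stack == null || stack.getFrames().isEmpty()) {
                    continue;
                }
                List<RecordedFrame> frames = stack.getFrames();
                StringBuilder path = new StringBuilder();
                // Frames are reported leaf-first, so walk backwards to build root -> leaf.
                for (int i = frames.size() - 1; i >= 0; i--) {
                    RecordedFrame f = frames.get(i);
                    if (path.length() > 0) {
                        path.append(';');
                    }
                    path.append(f.getMethod().getType().getName())
                        .append('.')
                        .append(f.getMethod().getName());
                }
                pathCounts.merge(path.toString(), 1L, Long::sum); // identical paths merge; count = width
            }
        }
        // Each output line is one "bar" of the flame graph: the path and how many samples it received.
        pathCounts.forEach((path, count) -> System.out.println(path + " " + count));
    }
}
```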
Now that we understand the idea behind CPU-time sampling, let’s see how to actually turn it on in Java 25. This section is practical and focused on how you would really use it.
CPU-time profiling is enabled through Java Flight Recorder (JFR), just like other profiling features. The difference is that we explicitly enable the new jdk.CPUTimeSample event.
The simplest way to enable CPU-time profiling is when starting the JVM.
java \
  -XX:StartFlightRecording=filename=cpu-time.jfr \
  -XX:FlightRecorderOptions=stackdepth=256 \
  -jar app.jar
This starts a JFR recording immediately and writes it to cpu-time.jfr. At this point, JFR is running, but CPU-time sampling still needs to be enabled as an event.
CPU-time profiling is controlled via a JFR event setting, just like other JFR data. You can enable it either as part of the recording options at startup or when starting a recording on a running JVM.
At startup, append the event setting to the recording options:
java -XX:StartFlightRecording=jdk.CPUTimeSample#enabled=true,filename=cpu-time.jfr -jar app.jar
On a running JVM, the same setting can be passed to jcmd when starting a recording:
jcmd <pid> JFR.start jdk.CPUTimeSample#enabled=true filename=cpu-time.jfr
Once enabled, the JVM starts generating CPU-time samples whenever a thread consumes CPU.
CPU-time sampling uses a fixed CPU-time interval, not wall-clock time. You can control how frequently samples are taken.
Conceptually, this means a sample is produced each time a thread accumulates a fixed slice of CPU time, rather than on every tick of the wall clock.
Typical configurations look like:
- 10 ms of CPU time per sample → great detail, more samples
- 500 samples per second → capped overhead
Example, using the event's throttle setting (which accepts either a CPU-time period or a maximum rate):
jcmd <pid> JFR.start jdk.CPUTimeSample#enabled=true jdk.CPUTimeSample#throttle=10ms
Or using a rate limit:
jcmd <pid> JFR.start jdk.CPUTimeSample#enabled=true jdk.CPUTimeSample#throttle=500/s
The choice depends on what you are profiling.
If you are investigating a specific CPU spike, a short-lived burst of work, or a suspected hot loop, use short intervals (e.g., 10 ms) to get fine-grained detail.
If you are profiling continuously, or running in production where overhead must stay predictable, use throttled sampling (e.g., 500/s) to keep overhead minimal.
The key idea is that CPU-time sampling is already efficient, so even higher sampling rates are usually safe compared to traditional profilers.
Once you have a .jfr recording, the next question is: how do we actually read it in a beginner-friendly way? You have two good options: a quick CLI summary (great for servers/SSH) and a visual deep-dive using Mission Control (best for learning and “aha” moments).
Option 1: Quick CLI check with jfr view cpu-time-hot-methods
Java 25 adds a handy view that prints the hottest CPU methods from the CPU-time sampler:
jfr view cpu-time-hot-methods profile.jfr
This view is specifically mentioned alongside the new CPU-time profiling work. What you should understand as a beginner: this output is sample-based, not exact timing. It’s essentially saying: “These methods were on-CPU most often when the CPU-time sampler took snapshots.” More appearances ⇒ more CPU usage.
Option 2: Open the .jfr in JDK Mission Control (JMC)
For most people, the understanding clicks in JMC because you can actually see the profile. Open the recording in Mission Control, then open the Flame Graph / Flame View (JMC provides a dedicated Flame Graph view for aggregated stack traces).
In JMC, your beginner goal is simple: find the widest boxes in the flame graph. Width represents “how much CPU” (because it represents how many CPU-time samples are included in that stack).
What to look for as a beginner
Start with these three checks:
1. Widest leaf boxes (top of stacks) – Those are often the “where CPU is burned” methods (or very close to them).
2. The full hot path underneath – The width is inherited from callers → callees. So look down the stack to understand why a hot method is being reached.
3. Repeated patterns – If you see the same “shape” repeated across threads or time ranges, that’s a strong signal of a systemic hotspot.
Common mistakes while reading flame graphs
1. Thinking height means “more time.” – Height is just call depth. Width is what matters for CPU.
2. Treating sample counts as exact milliseconds – Sampling is statistical. It’s great for “what’s hottest,” not for exact timing numbers.
3. Blaming the widest parent method – Often, the parent is wide because it calls something hot. The real cost is usually in the leaf methods (or the specific branch that is wide).
4. Forgetting this is CPU-time profiling – If code is waiting (I/O, lock, sleep), it should not dominate CPU-time samples—so a “wide” section is a real CPU consumer, not “just waiting.” (That’s the whole point of the new CPU-time sampler.)
CPU-time profiling is best used when you want to understand where the CPU is actually being spent, not just where time appears to pass.
1. When CPU usage is high or unpredictable – If system metrics show high CPU, but it’s unclear which code paths are responsible, CPU-time profiling gives a trustworthy answer.
2. When optimizing performance or throughput – It helps identify true CPU hotspots, making it ideal for tuning algorithms, loops, serialization, parsing, encryption, or any compute-heavy logic.
3. In highly concurrent or multi-threaded applications – CPU-time sampling avoids noise from blocked or waiting threads and highlights the threads that are genuinely consuming CPU.
4. When native code is involved – If your application relies on native libraries (networking, crypto, compression), CPU-time profiling can correctly attribute that CPU usage back to Java call paths.
5. For production or near-production analysis – Its low overhead makes it safe for real workloads where attaching heavy profilers is not feasible.
CPU-time profiling is not the right tool for every performance problem.
1. When the problem is latency caused by waiting – If delays come from I/O, database calls, locks, or external services, CPU-time profiling will not highlight them, because waiting does not consume CPU.
2. When you need exact timing measurements – Sampling provides relative insights, not precise method execution times. For exact durations, tracing or instrumentation is more appropriate.
3. For very short-lived programs – If the application runs for only a few milliseconds or seconds, sampling may not collect enough data to be meaningful.
4. When analyzing memory or GC issues – CPU-time profiling focuses on CPU usage. For memory leaks or garbage collection problems, memory profiling and GC analysis tools are more effective.
CPU-time profiling in Java 25 is a major improvement, but it is not magic. Keeping these points in mind helps avoid wrong conclusions.
1. It is Linux-only and experimental (for now) – CPU-time profiling relies on Linux CPU timers, so it is available only on Linux in Java 25. Being experimental means it is safe to use, but behavior, defaults, or configuration may evolve in future releases.
2. It shows CPU usage, not waiting or latency – Code that waits for I/O, locks, or external systems will not appear prominently, because waiting does not consume CPU. This is expected behavior, not a missing feature.
3. Sampling is statistical, not exact – CPU-time profiling works by counting samples, not by measuring precise execution times. The results are excellent for identifying hotspots, but they are not suitable for exact millisecond-level measurements.
4. Hot methods may include their callees – A wide method in a flame graph often represents a hot call path, not necessarily a single expensive line of code. Always inspect the full stack to understand where the CPU is truly being spent.
5. Profiling should guide, not replace, reasoning – CPU-time profiles tell you where to look, not how to fix the problem. Use them alongside code knowledge, benchmarks, and metrics before making changes.
Performance problems are hard not because they are rare, but because they are often misunderstood. For years, Java developers relied on elapsed-time–based profiling, which worked reasonably well but frequently blurred the line between doing work and waiting for work. This made CPU analysis noisy and, at times, misleading.
With Java 25 and JEP 509, profiling takes a meaningful step forward. CPU-time–based sampling shifts the focus to what actually matters: where the CPU is truly spending its time. By sampling only after real CPU work is done, Java Flight Recorder produces profiles that align with system metrics, correctly handle native execution, and eliminate much of the guesswork from performance tuning.
The biggest win is clarity. Flame graphs become easier to trust, hotspots become easier to spot, and optimization efforts become more focused. Instead of asking “Why does this method look slow?”, developers can confidently ask “Is this where the CPU is being consumed?”
CPU-time profiling is not a replacement for all performance tools, nor is it a silver bullet. But when used for the right problems, it provides one of the most accurate and production-safe views of Java performance available today.