JFR CPU-Time Profiling in Java 25
Last Updated: December 19, 2025
By: javahandson
Series
Learn Java in an easy way
JFR CPU-Time Profiling in Java 25 introduces a more accurate way to analyze performance by sampling based on actual CPU consumption instead of elapsed time. This article explains profiling from first principles, the limitations of old CPU profiling, how CPU-time sampling works, how samples become flame graphs, and how to enable, analyze, and interpret JFR CPU-Time Profiling safely in production.
When we hear that a Java application is “slow,” the immediate question is: Slow because of what?
CPU? Memory? A loop? Database? Network? Without clear answers, performance tuning often turns into guesswork. This is exactly why profiling exists.
A Java application that was working fine earlier may suddenly start showing symptoms such as rising response times, unexplained CPU usage, or occasional timeouts.
At this stage, we only see the symptoms, not the root cause.
The codebase may be large, with thousands of methods, multiple threads, and external calls. From the outside, everything looks normal—except performance.
When performance issues appear, many developers instinctively start guessing at fixes.
These guesses are often based on intuition, past experience, or whichever part of the code happens to look suspicious.
Unfortunately, this approach usually leads to wasted effort and changes that never touch the real bottleneck.
In short, guessing rarely fixes performance problems.
Performance tuning has two very different approaches: guessing and measuring.
Profiling is about measuring performance, not guessing it.
In the simplest terms, profiling means observing a running application and recording where time is actually spent.
A profiler answers questions like: Which methods consume the most CPU? Which call paths are the hottest? Which threads are actually doing the work?
You can think of profiling as a CCTV camera for your running code: it quietly records what actually happened, so you can replay the facts later instead of relying on assumptions.
Why Developers Often Optimize the Wrong Code
A key rule in performance engineering is: Most of the execution time is spent in a very small portion of the code. This is commonly explained using the 80/20 rule:
80% of execution time is spent in 20% of the code
Sometimes, even 90% in just 5%
Without profiling, developers tend to optimize the code that looks complicated rather than the code that is actually hot.
Profiling highlights the small fraction of code where execution time is really concentrated, so effort goes where it matters.
One of the most common sources of confusion in performance discussions is the difference between CPU time and elapsed (wall-clock) time.
Many beginners assume they are the same thing.
They are not.
Understanding this distinction is essential before learning profiling, because profilers measure CPU time, not just how long something appears to take.
Elapsed time is the real-world time that passes between two moments. For example, we call an API and the response comes back after 5 seconds.
Elapsed time = 5 seconds
Elapsed time includes everything: CPU work, waiting for I/O, lock contention, garbage collection pauses, and time spent scheduled off the CPU.
It measures how long the user waited, not how much work the CPU actually did.
CPU time is the amount of time the CPU actively spends executing instructions for your program. It counts only the time your code is genuinely running on a core.
It does not include time spent sleeping, blocked on I/O, waiting for locks, or waiting to be scheduled.
If a method runs for 5 seconds in elapsed time but spends most of that time waiting, the CPU time could be only a few milliseconds.
Simple Example: Waiting vs Computing
Let’s look at two very simple scenarios.
Scenario 1: Waiting
Thread.sleep(5000);
The thread is alive, but the CPU is not doing any work.
Scenario 2: Computing
for (long i = 0; i < 5_000_000_000L; i++) {
// busy computation
}
Here, the CPU is continuously executing instructions.
Same Elapsed Time, Very Different CPU Usage
| Scenario | Elapsed Time | CPU Time |
|---|---|---|
| Sleeping | 5 seconds | ~0 sec |
| Computing | 5 seconds | ~5 sec |
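If you want to see this difference yourself, here is a minimal sketch using the standard ThreadMXBean API. The sleep duration and loop size are arbitrary, and the exact numbers will vary by machine, but the sleeping phase should add almost nothing to the CPU-time counter.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class CpuVsElapsed {
    public static void main(String[] args) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        long elapsedStart = System.nanoTime();
        long cpuStart = threads.getCurrentThreadCpuTime(); // returns -1 if unsupported on this JVM

        Thread.sleep(2000);                         // waiting: elapsed time grows, CPU time barely moves
        long sum = 0;
        for (long i = 0; i < 500_000_000L; i++) {   // computing: elapsed time and CPU time grow together
            sum += i;
        }

        long elapsedMs = (System.nanoTime() - elapsedStart) / 1_000_000;
        long cpuMs = (threads.getCurrentThreadCpuTime() - cpuStart) / 1_000_000;
        System.out.println("Elapsed: " + elapsedMs + " ms, CPU: " + cpuMs + " ms (sum=" + sum + ")");
    }
}
```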
The confusion happens because both scenarios take exactly the same amount of real time.
As a result, anything slow is often assumed to be CPU-heavy.
In reality, a large share of elapsed time is frequently spent waiting rather than computing.
Profilers focus on CPU time because it shows where the processor is actually doing work, which is the part you can optimize in code.
This is why profiling tools highlight compute-heavy call paths rather than methods that merely wait.
Before discussing advanced profiling techniques or new Java features, it is important to understand the tool that makes all of this possible: Java Flight Recorder, commonly known as JFR.
Java Flight Recorder is a built-in monitoring and profiling framework inside the JVM. It continuously collects detailed information about what a Java application is doing while it is running. This includes how the CPU is being used, how threads behave, how memory is allocated, and how the garbage collector operates. All this information is captured with very low overhead and stored in a compact recording file.
In simple terms, JFR allows us to observe the JVM from the inside, without changing application code or affecting normal execution.
Performance problems rarely happen in a clean, predictable way. They often appear only under real traffic, specific workloads, or production data. By the time a developer tries to reproduce the issue locally, the problem has already disappeared.
Traditional debugging tools are not well suited for these situations. Debuggers pause execution, adding logs changes timing behavior, and test environments rarely match production conditions. Java Flight Recorder was created to solve this exact problem: to observe real application behavior without disturbing it.
JFR is designed to run continuously, even in production, so that when a problem occurs, the data needed to understand it already exists.
At a high level, JFR works by recording events generated by the JVM. An event represents something meaningful that happened during execution—such as a thread running on the CPU, a method being sampled, a garbage collection pause, or a lock being contended.
These events are collected into a recording, which runs for a defined duration or continuously using a circular buffer. The recording is stored as a .jfr file, which can later be analyzed using tools like Java Mission Control.
The important point is that JFR does not rely on application code to produce data. The JVM itself emits structured events, each containing rich information such as timestamps, thread details, stack traces, and durations. This makes the recorded data accurate and consistent.
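As a small illustration of how little ceremony this requires, here is a minimal sketch that starts a recording from inside the application using the standard jdk.jfr API and dumps it to a file. The file name and the sleep are placeholders for whatever workload you actually want to observe.

```java
import java.nio.file.Path;
import jdk.jfr.Recording;

public class JfrRecordingDemo {
    public static void main(String[] args) throws Exception {
        try (Recording recording = new Recording()) {
            recording.start();                       // the JVM begins emitting events into this recording

            Thread.sleep(1000);                      // stand-in for the workload you want to observe

            recording.stop();
            recording.dump(Path.of("demo.jfr"));     // write a .jfr file for later analysis in JMC or `jfr`
        }
    }
}
```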
Many developers try to understand performance issues using logs, but logging and profiling serve very different purposes.
Logging is something we add manually. We decide where to log, what to log, and when to log. This means logs reflect what we think is important. Logs also introduce overhead, increase I/O, and often require redeploying the application to add new log statements.
Profiling, on the other hand, is observational. With JFR, the JVM records what actually happens during execution, not what we assumed would happen. No code changes are required, and the data is collected in a highly efficient way. Profiling captures facts, not guesses.
This is why profiling is far more reliable than logs when diagnosing performance problems.
A common concern is whether running JFR in production is risky. This concern is understandable, but JFR was specifically designed to be production-safe.
Because JFR is built directly into the JVM, it avoids heavy instrumentation and excessive data collection. The overhead is typically very low and predictable. Many organizations run JFR continuously in production with rotating recordings, so they always have historical data available if an issue occurs.
This design allows teams to analyze performance problems after they happen, without trying to reproduce them or adding emergency logging.
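As an illustration, an always-on setup might combine the standard StartFlightRecording options for disk-backed recording, retention limits, and a dump on exit; the limits below are arbitrary examples, not recommendations.

```
java \
  -XX:StartFlightRecording=disk=true,maxage=24h,maxsize=500m,dumponexit=true,filename=always-on.jfr \
  -jar app.jar
```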
Java Flight Recorder is the foundation of modern Java profiling. It provides a unified, reliable view of JVM behavior, covering CPU usage, threads, memory, and more—all in one place.
Advanced profiling features, including CPU-time profiling introduced in newer Java versions, build on top of JFR’s event-based architecture. Without JFR, these features would not be possible in a safe and scalable way.
Before Java 25, Java profiling—especially CPU profiling—was mainly based on wall-clock (elapsed time) sampling. The idea was simple: at regular intervals, the JVM would take a snapshot of running threads and record their stack traces. Over time, these snapshots were aggregated to identify which methods appeared most often. Methods that showed up frequently were considered “hot” and assumed to be responsible for high CPU usage.
This approach is known as sampling-based profiling, and it has been widely used because it is lightweight, fast, and safe to run even in production environments. Java Flight Recorder relied on this technique for many years, and for long-running, CPU-bound workloads, it often produced reasonably useful results.
However, this technique is based on an important assumption: if a method appears often in elapsed-time samples, it must be consuming a lot of CPU. In practice, this assumption does not always hold true.
Wall-clock sampling observes when a thread is seen, not how much CPU it has actually consumed. A thread can appear in a stack trace even when it is waiting for I/O, blocked on a lock, parked by the scheduler, or executing native code. From the profiler’s point of view, the thread looks active, but from the CPU’s point of view, it may be doing very little work.
There is also the issue of missed execution. Sampling happens at fixed intervals, and threads that run briefly between those intervals may never be captured at all, even though they consumed CPU time. This becomes especially problematic for short-lived tasks, bursty workloads, and highly concurrent applications.
Because of these limitations, traditional profiling sometimes produced results that did not match what system-level monitoring showed. CPU usage could be high, but the profiler would fail to clearly identify where that CPU was actually being spent. Developers would then optimize methods that merely appeared frequently in samples, while the real CPU hotspots remained hidden.
This gap between observed elapsed time and actual CPU usage became more noticeable as Java applications grew more complex, relied more on native libraries, and ran in cloud and containerized environments.
Understanding how profiling worked before Java 25 is important because it explains why improvements were needed. The limitations were not caused by bad tools, but by the fact that profiling was based on elapsed time rather than true CPU time.
A Simple Example: How Samples Are Taken
To understand profiling, it helps to see how sampling actually works using a very simple example.
Imagine a Java application with just two methods:
- calculateReport() – does heavy computation
- waitForDatabase() – waits for a database response
Now, assume the profiler takes one sample every 10 milliseconds.
Each time a sample is taken, the JVM records which thread is running and the stack trace it is currently executing.
Over one second, the profiler takes 100 samples.
Let’s say the samples look like this:
| Sample Count | Method Seen |
|---|---|
| 65 samples | waitForDatabase() |
| 35 samples | calculateReport() |
From this data, the profiler concludes that waitForDatabase() is “hotter” than calculateReport().
But this conclusion is misleading.
What Actually Happened?
In reality:
- waitForDatabase() spent most of its time waiting
- calculateReport() did almost all of the CPU work
The waiting method appeared frequently in samples because the thread stayed there for a long time in wall-clock terms. The computing method ran in short bursts and finished quickly, so it appeared less often. The profiler was not wrong—it was simply measuring elapsed time presence, not CPU usage.
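To make this concrete, here is a hypothetical sketch of the two methods from the example. The sleep and loop sizes are arbitrary, chosen only so that the waiting method dominates elapsed time while the computing method does essentially all of the CPU work.

```java
public class SamplingExample {

    // Mostly waiting: shows up in many wall-clock samples, but burns almost no CPU.
    static void waitForDatabase() throws InterruptedException {
        Thread.sleep(650);                    // stand-in for a blocking database call
    }

    // Mostly computing: short bursts of genuine CPU work.
    static long calculateReport() {
        long total = 0;
        for (long i = 0; i < 50_000_000L; i++) {
            total += i * 31;
        }
        return total;
    }

    public static void main(String[] args) throws InterruptedException {
        long result = 0;
        for (int i = 0; i < 10; i++) {
            waitForDatabase();
            result += calculateReport();
        }
        System.out.println(result);           // keep the result live so the loop is not optimized away
    }
}
```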
Traditional Java CPU profiling, which relied mainly on wall-clock (elapsed time) sampling, had several important limitations that affected accuracy and decision-making.
First, it measured presence, not actual CPU work – A method appeared “hot” simply because a thread was observed inside it frequently, even if the thread was mostly waiting for I/O, locks, or scheduler time. This made elapsed-time-heavy code look CPU-intensive when it was not.
Second, CPU-heavy work could be underrepresented or missed entirely – Sampling happens at fixed intervals, so short but intense bursts of computation could execute between samples and never be recorded. As a result, genuinely expensive code sometimes appeared insignificant in the profile.
Third, CPU usage and profiler results often did not match – System-level monitoring might show high CPU usage, while the Java profiler failed to clearly identify where that CPU time was spent. This mismatch reduced trust in profiling data and made root-cause analysis harder.
Finally, native code and modern execution patterns were poorly reflected. When Java code called into native libraries or ran in highly concurrent environments, traditional profiling struggled to correctly attribute CPU usage back to the originating Java methods. This became more problematic as applications grew more complex.
JEP 509 is a Java 25 enhancement that improves how CPU profiling works in the JVM. It introduces CPU-time–based profiling to Java Flight Recorder, allowing the profiler to measure actual CPU consumption instead of relying on elapsed (wall-clock) time sampling. The goal is not to replace existing profiling, but to make CPU attribution more accurate and reliable.
With JEP 509, sampling is triggered based on how much CPU time a thread has consumed, rather than how much real time has passed. This means that methods are sampled when they truly burn CPU, not when a thread happens to be observed while waiting or blocked. As a result, the profiling data aligns much more closely with what the operating system reports as CPU usage.
This feature is Linux-only and experimental in Java 25. Linux-only means it relies on CPU-time mechanisms provided by the Linux kernel. Experimental means the feature is available for use, but its behavior, configuration, or APIs may evolve in future Java releases based on real-world feedback.
At a high level, JEP 509 closes a long-standing gap in Java profiling: it finally allows developers to see where CPU time actually goes. This makes performance analysis more trustworthy and optimization efforts far more effective, especially in modern, highly concurrent Java applications.
To understand CPU-time sampling, it helps to clearly separate two very different ways of looking at execution: clock-time sampling and CPU-time sampling. Traditional profilers asked, “Where is the thread right now?” at fixed points in real time. CPU-time sampling asks a different question: “After the CPU has done a certain amount of work, where did that work happen?” This change in perspective is the core idea behind JEP 509.
In clock-time (elapsed-time) sampling, samples are taken every N milliseconds of real time, regardless of whether the thread is actively executing or just waiting. If a thread is blocked on I/O or sleeping, it can still appear repeatedly in samples simply because it stays in the same method for a long time. This causes waiting code to look expensive, even though the CPU is doing little or no work.
CPU-time sampling works differently. Instead of sampling every fixed interval of wall-clock time, the JVM triggers a sample only after a thread has actually consumed a certain amount of CPU time. In other words, the profiler waits until real computation has happened before taking a snapshot. If a thread is waiting, blocked, or sleeping, it does not consume CPU time, and therefore, no samples are taken during that period.
This is why waiting does not generate samples in CPU-time profiling. Waiting does not burn CPU cycles, so from the profiler’s point of view, nothing interesting is happening. Samples appear only when instructions are being executed on the CPU. This simple rule removes a major source of noise present in traditional profiling.
Consider a simple example with two methods. One method performs heavy computation in a loop, while another method calls an external service and waits for a response. In elapsed-time sampling, the waiting method may dominate the profile because the thread spends more real time there. In CPU-time sampling, the compute-heavy method dominates, because it is the only place where the CPU is actually doing work.
The result is a profile that aligns with reality. CPU-time sampling highlights the code paths that truly consume CPU resources and naturally ignores waiting, blocking, and idle time. This makes the profile much easier to reason about and far more reliable when deciding what to optimize.
At a high level, CPU-time sampling shifts profiling from “where was the thread observed?” to “where did the CPU spend its time?” That single shift is what makes Java 25’s profiling behavior fundamentally more accurate than what existed before.
Java 25 introduces a significant improvement to Java Flight Recorder by adding true CPU-time–based profiling. This enhancement is centered around a new JFR event called jdk.CPUTimeSample, which changes how and when profiling samples are collected. Instead of relying on elapsed (wall-clock) time, JFR can now sample execution based on actual CPU time consumed by a thread.
The jdk.CPUTimeSample event is generated when a thread has consumed a fixed amount of CPU time. In simple terms, the JVM now waits until the CPU has done a certain amount of real work before taking a sample. This ensures that every recorded sample represents actual computation, not waiting, blocking, or idle time. As a result, the profiling data directly reflects where CPU cycles are truly being spent.
To make this possible, Java 25 leverages Linux CPU timers. At a high level, these timers are provided by the Linux operating system and allow the JVM to track how much CPU time a thread has consumed, independent of wall-clock time. Because this mechanism is provided by the OS kernel, it is precise and efficient, but it also explains why the feature is Linux-only in its initial release.
Another important improvement is that CPU-time sampling works even when execution moves into native code. In earlier profiling approaches, time spent inside native libraries was often invisible or poorly attributed. With the new CPU-time sampling, JFR can track CPU consumption across Java and native boundaries and then correctly attribute that CPU usage back to the Java method that initiated the call.
This attribution is crucial for modern Java applications, which frequently rely on native code for networking, cryptography, compression, and I/O. Even though the work happens outside the JVM, developers can now see which Java call paths are responsible for CPU usage, making profiles far more accurate and actionable.
Overall, what’s new in Java 25 is not just another event, but a shift in how CPU profiling works. By sampling based on CPU time, using OS-level timers, and correctly handling native execution, JFR produces profiles that closely match real CPU usage and eliminate much of the ambiguity present in older profiling approaches.
Once CPU-time sampling is enabled, the JVM starts generating samples—but what exactly are these samples, and what happens to them next? Understanding this is important because profiling is not about exact timing. It is about collecting and interpreting samples correctly.
A single sample is not a measurement of how long a method ran. Instead, it is a snapshot taken after a certain amount of CPU time has been consumed. Each sample typically contains the thread that was running, the full Java stack trace at that moment, and metadata such as timestamps and CPU-related information. In the case of CPU-time profiling, the sample represents a point where a thread has just consumed a fixed amount of CPU time.
These samples are written into a Java Flight Recorder recording, which is stored as a .jfr file. Internally, the recording is a highly optimized binary format designed to store large numbers of events efficiently with minimal overhead. The JVM continuously appends samples to this recording while it is running, either until the recording is stopped or the buffer rotates.
It is important to understand that profiling works by counting samples, not by measuring exact durations. The profiler does not say, “This method took 12.3 milliseconds.” Instead, it says, “this method appeared in 120 out of 1,000 samples.” From this, we infer relative CPU usage. This statistical approach is what allows profiling to remain lightweight and safe for production.
More samples directly indicate more CPU consumption. If a method appears twice as often as another in CPU-time samples, it means the CPU spent roughly twice as much time executing that method or its callees. The exact timing is approximate, but the relative proportions are highly reliable.
This is why profilers can confidently highlight hotspots without measuring every instruction. Sampling trades exact precision for scalability and low overhead, and CPU-time sampling makes that tradeoff far more accurate than elapsed-time sampling ever could.
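To see that “counting samples” is really all there is to it, here is a minimal sketch that reads a recording with the standard jdk.jfr.consumer API and counts how often each top-of-stack method appears in jdk.CPUTimeSample events. The file name is a placeholder, and the sketch assumes the event is present in the recording.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordingFile;

public class CountCpuSamples {
    public static void main(String[] args) throws Exception {
        Map<String, Integer> counts = new HashMap<>();
        try (RecordingFile recording = new RecordingFile(Path.of("cpu-time.jfr"))) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!event.getEventType().getName().equals("jdk.CPUTimeSample")) {
                    continue;
                }
                var stack = event.getStackTrace();
                if (stack == null || stack.getFrames().isEmpty()) {
                    continue;
                }
                // Leaf frame: the method that was on the CPU when the sample fired.
                var topFrame = stack.getFrames().get(0);
                String method = topFrame.getMethod().getType().getName()
                        + "." + topFrame.getMethod().getName();
                counts.merge(method, 1, Integer::sum);
            }
        }
        // More samples ≈ more CPU consumed; print the ten most frequent methods.
        counts.entrySet().stream()
              .sorted(Map.Entry.<String, Integer>comparingByValue().reversed())
              .limit(10)
              .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```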
This is the point where profiling stops being abstract and starts making sense.
Up to now, we’ve talked about samples—snapshots of stack traces taken when CPU time is consumed. But raw samples are just data. To understand them intuitively, we need a visual representation. That representation is the flame graph.
A flame graph is a visualization that shows where CPU time is spent across the call stacks of an application.
It does not show exact durations, timelines, or the order in which methods executed.
Instead, it shows which call paths the CPU-time samples landed in, and how often.
Think of a flame graph as a summary of all samples merged together.
How Do Stack Traces Turn into a Flame Graph?
Let’s start from samples, because flame graphs are nothing but samples counted and merged.
Step 1: The Actual Samples Collected
Assume CPU-time sampling produced 5 samples like this:
Sample 1: main → handleRequest → calculateFees
Sample 2: main → handleRequest → calculateFees
Sample 3: main → handleRequest → calculateFees
Sample 4: main → handleRequest → validateRequest
Sample 5: main → handleRequest → validateRequest
So the sample count per method is:
| Method | Samples |
|---|---|
| main | 5 |
| handleRequest | 5 |
| calculateFees | 3 |
| validateRequest | 2 |
This table is the truth. Everything in the flame graph comes directly from this.
Step 2: How Flame Graph Width Is Built
Flame graphs do not show samples vertically one by one. They merge identical call paths and expand them horizontally based on count. Think of each sample as one horizontal unit. Now lay them side by side:
Sample units (each block = 1 CPU-time sample)
[main][main][main][main][main]
Since main appears in all 5 samples, it gets 5 units of width.
Same for handleRequest:
[handleRequest][handleRequest][handleRequest][handleRequest][handleRequest]
Step 3: The Crucial Part — Splitting at the Leaf Methods
Now the stack branches at the last method:
Samples 1–3 → calculateFees
Samples 4–5 → validateRequest
So visually, the flame graph looks like this:
[calculateFees][calculateFees][calculateFees][validateReq  ][validateReq  ]
[                             handleRequest                               ]
[                                  main                                   ]
But that’s still a bit abstract — so let’s compress it into a proper flame-graph style bar.
Step 4: Final Flame Graph Representation
Width proportional to CPU samples
(each █ = 1 sample)
calculateFees ███
validateRequest ██
handleRequest █████
main █████
Now it should be obvious:
- calculateFees has 3 samples → 3 blocks
- validateRequest has 2 samples → 2 blocks
- Therefore, calculateFees is wider
- Wider = more CPU consumed
Because these samples were taken only after fixed amounts of CPU time, the width of each box is directly proportional to CPU consumed, and code that merely waits contributes no width at all.
So the flame graph is no longer misleading — it’s directly tied to CPU usage.
In a flame graph, width is nothing more than “how many times this method appeared in CPU-time samples.”
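For completeness, here is a sketch of how full stack paths from a recording could be folded into the “collapsed stack” text format that common flame-graph tools consume (one line per unique path, root first, followed by its sample count). The file name is a placeholder, and the semicolon-separated output format is simply the widely used convention, not something JFR itself produces.

```java
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import jdk.jfr.consumer.RecordedEvent;
import jdk.jfr.consumer.RecordedFrame;
import jdk.jfr.consumer.RecordingFile;

public class CollapseStacks {
    public static void main(String[] args) throws Exception {
        Map<String, Long> pathCounts = new HashMap<>();
        try (RecordingFile recording = new RecordingFile(Path.of("cpu-time.jfr"))) {
            while (recording.hasMoreEvents()) {
                RecordedEvent event = recording.readEvent();
                if (!event.getEventType().getName().equals("jdk.CPUTimeSample")) {
                    continue;
                }
                var stack = event.getStackTrace();
                if (stack == null || stack.getFrames().isEmpty()) {
                    continue;
                }
                List<RecordedFrame> frames = stack.getFrames();
                StringBuilder path = new StringBuilder();
                // Frames are reported leaf-first, so walk backwards to build root -> leaf.
                for (int i = frames.size() - 1; i >= 0; i--) {
                    RecordedFrame f = frames.get(i);
                    if (path.length() > 0) {
                        path.append(';');
                    }
                    path.append(f.getMethod().getType().getName())
                        .append('.')
                        .append(f.getMethod().getName());
                }
                pathCounts.merge(path.toString(), 1L, Long::sum); // identical paths merge; count = width
            }
        }
        // Each output line is one "bar" of the flame graph: the path and how many samples it received.
        pathCounts.forEach((path, count) -> System.out.println(path + " " + count));
    }
}
```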
Now that we understand the idea behind CPU-time sampling, let’s see how to actually turn it on in Java 25. This section is practical and focused on how you would really use it.
CPU-time profiling is enabled through Java Flight Recorder (JFR), just like other profiling features. The difference is that we explicitly enable the new jdk.CPUTimeSample event.
The simplest way to enable CPU-time profiling is when starting the JVM.
java \
  -XX:StartFlightRecording=filename=cpu-time.jfr \
  -XX:FlightRecorderOptions=stackdepth=256 \
  -jar app.jar
This starts a JFR recording immediately and writes it to cpu-time.jfr. At this point, JFR is running, but CPU-time sampling still needs to be enabled as an event.
CPU-time profiling is controlled via a JFR event setting, just like other JFR data. You can enable it either as part of the recording options at startup or when starting a recording on a running JVM.
At startup, append the event setting to the recording options:
java -XX:StartFlightRecording=jdk.CPUTimeSample#enabled=true,filename=cpu-time.jfr -jar app.jar
On a running JVM, the same setting can be passed to jcmd when starting a recording:
jcmd <pid> JFR.start jdk.CPUTimeSample#enabled=true filename=cpu-time.jfr
Once enabled, the JVM starts generating CPU-time samples whenever a thread consumes CPU.
CPU-time sampling uses a fixed CPU-time interval, not wall-clock time. You can control how frequently samples are taken.
Conceptually, this means a sample is produced each time a thread accumulates a fixed slice of CPU time, rather than on every tick of the wall clock.
Typical configurations look like:
- 10 ms of CPU time per sample → great detail, more samples
- 500 samples per second → capped overhead
Example, using the event's throttle setting (which accepts either a CPU-time period or a maximum rate):
jcmd <pid> JFR.start jdk.CPUTimeSample#enabled=true jdk.CPUTimeSample#throttle=10ms
Or using a rate limit:
jcmd <pid> JFR.start jdk.CPUTimeSample#enabled=true jdk.CPUTimeSample#throttle=500/s
The choice depends on what you are profiling.
If you are investigating a specific CPU spike, a short-lived burst of work, or a suspected hot loop, use short intervals (e.g., 10 ms) to get fine-grained detail.
If you are profiling continuously, or running in production where overhead must stay predictable, use throttled sampling (e.g., 500/s) to keep overhead minimal.
The key idea is that CPU-time sampling is already efficient, so even higher sampling rates are usually safe compared to traditional profilers.
Once you have a .jfr recording, the next question is: how do we actually read it in a beginner-friendly way? You have two good options: a quick CLI summary (great for servers/SSH) and a visual deep-dive using Mission Control (best for learning and “aha” moments).
Option 1: Quick CLI check with jfr view cpu-time-hot-methods
Java 25 adds a handy view that prints the hottest CPU methods from the CPU-time sampler:
jfr view cpu-time-hot-methods profile.jfr
This view is specifically mentioned alongside the new CPU-time profiling work. What you should understand as a beginner: this output is sample-based, not exact timing. It’s essentially saying: “These methods were on-CPU most often when the CPU-time sampler took snapshots.” More appearances ⇒ more CPU usage.
Option 2: Open the .jfr in JDK Mission Control (JMC)
For most people, the understanding clicks in JMC because you can actually see the profile. Open the recording in Mission Control, then open the Flame Graph / Flame View (JMC provides a dedicated Flame Graph view for aggregated stack traces).
In JMC, your beginner goal is simple: find the widest boxes in the flame graph. Width represents “how much CPU” (because it represents how many CPU-time samples are included in that stack).
What to look for as a beginner
Start with these three checks:
1. Widest leaf boxes (top of stacks) – Those are often the “where CPU is burned” methods (or very close to them).
2. The full hot path underneath – The width is inherited from callers → callees. So look down the stack to understand why a hot method is being reached.
3. Repeated patterns – If you see the same “shape” repeated across threads or time ranges, that’s a strong signal of a systemic hotspot.
Common mistakes while reading flame graphs
1. Thinking height means “more time.” – Height is just call depth. Width is what matters for CPU.
2. Treating sample counts as exact milliseconds – Sampling is statistical. It’s great for “what’s hottest,” not for exact timing numbers.
3. Blaming the widest parent method – Often, the parent is wide because it calls something hot. The real cost is usually in the leaf methods (or the specific branch that is wide).
4. Forgetting this is CPU-time profiling – If code is waiting (I/O, lock, sleep), it should not dominate CPU-time samples—so a “wide” section is a real CPU consumer, not “just waiting.” (That’s the whole point of the new CPU-time sampler.)
CPU-time profiling is best used when you want to understand where the CPU is actually being spent, not just where time appears to pass.
1. When CPU usage is high or unpredictable – If system metrics show high CPU, but it’s unclear which code paths are responsible, CPU-time profiling gives a trustworthy answer.
2. When optimizing performance or throughput – It helps identify true CPU hotspots, making it ideal for tuning algorithms, loops, serialization, parsing, encryption, or any compute-heavy logic.
3. In highly concurrent or multi-threaded applications – CPU-time sampling avoids noise from blocked or waiting threads and highlights the threads that are genuinely consuming CPU.
4. When native code is involved – If your application relies on native libraries (networking, crypto, compression), CPU-time profiling can correctly attribute that CPU usage back to Java call paths.
5. For production or near-production analysis – Its low overhead makes it safe for real workloads where attaching heavy profilers is not feasible.
CPU-time profiling is not the right tool for every performance problem.
1. When the problem is latency caused by waiting – If delays come from I/O, database calls, locks, or external services, CPU-time profiling will not highlight them, because waiting does not consume CPU.
2. When you need exact timing measurements – Sampling provides relative insights, not precise method execution times. For exact durations, tracing or instrumentation is more appropriate.
3. For very short-lived programs – If the application runs for only a few milliseconds or seconds, sampling may not collect enough data to be meaningful.
4. When analyzing memory or GC issues – CPU-time profiling focuses on CPU usage. For memory leaks or garbage collection problems, memory profiling and GC analysis tools are more effective.
CPU-time profiling in Java 25 is a major improvement, but it is not magic. Keeping these points in mind helps avoid wrong conclusions.
1. It is Linux-only and experimental (for now) – CPU-time profiling relies on Linux CPU timers, so it is available only on Linux in Java 25. Being experimental means it is safe to use, but behavior, defaults, or configuration may evolve in future releases.
2. It shows CPU usage, not waiting or latency – Code that waits for I/O, locks, or external systems will not appear prominently, because waiting does not consume CPU. This is expected behavior, not a missing feature.
3. Sampling is statistical, not exact – CPU-time profiling works by counting samples, not by measuring precise execution times. The results are excellent for identifying hotspots, but they are not suitable for exact millisecond-level measurements.
4. Hot methods may include their callees – A wide method in a flame graph often represents a hot call path, not necessarily a single expensive line of code. Always inspect the full stack to understand where the CPU is truly being spent.
5. Profiling should guide, not replace, reasoning – CPU-time profiles tell you where to look, not how to fix the problem. Use them alongside code knowledge, benchmarks, and metrics before making changes.
Performance problems are hard not because they are rare, but because they are often misunderstood. For years, Java developers relied on elapsed-time–based profiling, which worked reasonably well but frequently blurred the line between doing work and waiting for work. This made CPU analysis noisy and, at times, misleading.
With Java 25 and JEP 509, profiling takes a meaningful step forward. CPU-time–based sampling shifts the focus to what actually matters: where the CPU is truly spending its time. By sampling only after real CPU work is done, Java Flight Recorder produces profiles that align with system metrics, correctly handle native execution, and eliminate much of the guesswork from performance tuning.
The biggest win is clarity. Flame graphs become easier to trust, hotspots become easier to spot, and optimization efforts become more focused. Instead of asking “Why does this method look slow?”, developers can confidently ask “Is this where the CPU is being consumed?”
CPU-time profiling is not a replacement for all performance tools, nor is it a silver bullet. But when used for the right problems, it provides one of the most accurate and production-safe views of Java performance available today.