Java Profiling Basics - What to Measure First
Before profiling, decide what you are optimizing: CPU, memory, I/O, or locks. This article suggests an order for what to measure first, lists common tools and metrics, and provides quick reference tables for locating bottlenecks.
Overview
- CPU: Hot methods, thread CPU usage. Tools: JProfiler, Async Profiler, Arthas (thread and profiler commands). If CPU is high, find the hot methods, then optimize algorithms or reduce the number of calls.
- Memory: Heap usage, allocation rate, GC, leaks. Tools: MAT, JProfiler, heap dump, GC logs. If memory is high or the JVM hits OOM, check the dominator tree and the reference chains that keep objects alive.
- I/O: Disk and network wait. Tools: flame graphs, thread stacks, iostat, network metrics. If I/O wait is high, optimize reads/writes or change storage/network.
- Locks: Contention, wait, deadlock. Tools: jstack, Arthas, JProfiler lock view. If lock contention is high, use finer-grained locks or lock-free structures.
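The "thread CPU usage" measurement in the CPU item can also be approximated from inside the JVM with the standard java.lang.management API. The sketch below is only an illustration: the class name ThreadCpuSnapshot and its Row type are hypothetical, and it reports cumulative CPU time since each thread started, not a usage rate.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of any tool named in this article.
class ThreadCpuSnapshot {

    /** One report row: thread name and cumulative CPU time in nanoseconds. */
    static final class Row {
        final String name;
        final long cpuNanos;

        Row(String name, long cpuNanos) {
            this.name = name;
            this.cpuNanos = cpuNanos;
        }
    }

    /** Returns up to `limit` live threads ordered by cumulative CPU time, highest first. */
    static List<Row> topThreads(int limit) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<Row> rows = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return rows; // this JVM cannot report per-thread CPU time
        }
        for (long id : mx.getAllThreadIds()) {
            long cpu = mx.getThreadCpuTime(id);     // -1 if the thread is no longer alive
            ThreadInfo info = mx.getThreadInfo(id); // null if the thread is no longer alive
            if (cpu >= 0 && info != null) {
                rows.add(new Row(info.getThreadName(), cpu));
            }
        }
        rows.sort((a, b) -> Long.compare(b.cpuNanos, a.cpuNanos));
        int n = Math.min(Math.max(limit, 0), rows.size());
        return new ArrayList<>(rows.subList(0, n));
    }

    public static void main(String[] args) {
        for (Row r : topThreads(5)) {
            System.out.printf("%-30s %d ms%n", r.name, r.cpuNanos / 1_000_000);
        }
    }
}
```

To turn this into a usage rate, take two snapshots a few seconds apart and compare the per-thread differences. A sampling profiler is still needed to see which methods those threads are executing.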
Example
Example 1: Suggested diagnosis order
| Priority | Symptom | Measure first | Example tools |
|---|---|---|---|
| 1 | Slow response, high CPU | CPU hotspots | Async Profiler, arthas profiler |
| 2 | OOM, memory growth | Heap, GC, dump | MAT, -XX:+HeapDumpOnOutOfMemoryError, GC logs |
| 3 | Slow response, low CPU | I/O, locks | Thread stacks, iostat, lock analysis |
| 4 | Intermittent stalls | GC pauses, locks | GC logs, jstack sampling |
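As a rough illustration, the priority order in the table can be written as a small decision function. The class name Triage, the boolean inputs, and the returned strings are hypothetical; in practice the inputs would come from monitoring data rather than hand-set flags.

```java
// Hypothetical encoding of the diagnosis-order table above.
class Triage {

    /** Returns what to measure first, checking symptoms in the table's priority order. */
    static String measureFirst(boolean slowResponse, boolean highCpu,
                               boolean memoryGrowthOrOom, boolean intermittentStalls) {
        if (slowResponse && highCpu) return "CPU hotspots";      // priority 1
        if (memoryGrowthOrOom) return "Heap, GC, dump";          // priority 2
        if (slowResponse) return "I/O, locks";                   // priority 3: slow, but CPU is low
        if (intermittentStalls) return "GC pauses, locks";       // priority 4
        return "No matching symptom";
    }

    public static void main(String[] args) {
        // Slow responses with low CPU point at I/O or lock waits.
        System.out.println(measureFirst(true, false, false, false));
    }
}
```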
Example 2: Key metrics
- Throughput: QPS, TPS; related to CPU and GC time.
- Latency: P50, P99, P999; related to GC pauses, locks, I/O.
- Memory: Heap usage, old gen ratio, promotion rate; related to Full GC and OOM.
- GC: Minor/Full frequency, pause times; use -Xlog:gc* (JDK 9+) to observe.
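Some of the memory and GC metrics above are also exposed at runtime through the standard java.lang.management MXBeans. The following is a minimal sketch under that assumption; the class name GcSnapshot and its method names are hypothetical, and the counters are cumulative since JVM start, so they complement GC logs rather than replace them.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper that reads heap and GC counters from the running JVM.
class GcSnapshot {

    /** Fraction of the maximum heap currently in use, or -1 if no maximum is defined. */
    static double heapUsedRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax(); // -1 when the JVM reports no maximum
        return max > 0 ? (double) heap.getUsed() / max : -1.0;
    }

    /** Total number of collections across all collectors since JVM start. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount()); // -1 means "undefined"
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime()); // -1 means "undefined"
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.printf("heap used ratio: %.2f%n", heapUsedRatio());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Collector names in the output depend on the garbage collector in use, so which beans correspond to minor and full collections varies between JVM configurations.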
Example 3: Tools and uses
| Tool | Use |
|---|---|
| jstack | Thread stacks, deadlock |
| jmap | Heap dump, heap summary |
| arthas | Online diagnosis, profiler, watch |
| MAT / JProfiler | Analyze dumps, leaks, dominator tree |
| Async Profiler | Low-overhead CPU/memory sampling |
| GC logs | GC count, pause, cause |
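The deadlock check listed for jstack has an in-process counterpart in the standard ThreadMXBean API. The sketch below assumes that API is sufficient for the case at hand; the class name DeadlockCheck and the message format are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that reports deadlocked threads in the current JVM.
class DeadlockCheck {

    /** Returns one description per deadlocked thread; the list is empty when none are found. */
    static List<String> describeDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // findDeadlockedThreads also covers java.util.concurrent locks, where supported;
        // otherwise fall back to monitor (synchronized) deadlocks only.
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        List<String> out = new ArrayList<>();
        if (ids == null) {
            return out; // null means no deadlock was detected
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) { // null if the thread ended in the meantime
                out.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> found = describeDeadlocks();
        System.out.println(found.isEmpty() ? "no deadlock detected" : String.join("\n", found));
    }
}
```

A check like this could run on a schedule and raise an alert, while jstack remains the simpler choice for one-off inspection of a process from outside.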
Core Mechanism / Behavior
- Sampling: Periodically capture thread or call stacks and aggregate them into hot spots. Low overhead; suitable for production.
- Instrumentation: Insert code at method entry and exit for precise counts and timings. Higher overhead; mostly used in test environments.
- Dump: A heap snapshot for offline analysis of objects and references. Taking one can cause a stop-the-world (STW) pause, so use full dumps with care in production; consider a lighter option such as a class histogram (jmap -histo) or sampling instead.
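The sampling idea can be illustrated with a toy sampler built on Thread.getAllStackTraces(). This is a sketch of the mechanism only, with a hypothetical class name TinySampler; stacks obtained this way are captured at safepoints, so the results are biased, and real profilers such as Async Profiler use other mechanisms to reduce that bias and the overhead.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy sampler: counts the top stack frame of RUNNABLE threads.
class TinySampler {

    /** Takes `samples` snapshots, `intervalMillis` apart, and counts top frames by method. */
    static Map<String, Integer> sample(int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        Thread self = Thread.currentThread();
        for (int i = 0; i < samples; i++) {
            for (Map.Entry<Thread, StackTraceElement[]> e : Thread.getAllStackTraces().entrySet()) {
                Thread t = e.getKey();
                StackTraceElement[] stack = e.getValue();
                // Skip the sampler itself, non-running threads, and threads without Java frames.
                if (t == self || t.getState() != Thread.State.RUNNABLE || stack.length == 0) {
                    continue;
                }
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt(); // preserve the interrupt and stop sampling
                break;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical busy workload so that the sampler has something to observe.
        Thread worker = new Thread(() -> {
            while (!Thread.currentThread().isInterrupted()) {
                Math.sqrt(System.nanoTime());
            }
        }, "busy-worker");
        worker.setDaemon(true);
        worker.start();

        sample(50, 10).entrySet().stream()
                .sorted((a, b) -> b.getValue() - a.getValue())
                .limit(5)
                .forEach(e -> System.out.println(e.getValue() + "  " + e.getKey()));
    }
}
```

Aggregating whole stacks instead of only the top frame is what produces the data behind a flame graph.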
Key Rules
- Define the goal first: Optimize latency or throughput? Then pick metrics and tools.
- Be careful with heavy operations in production: A full heap dump or frequent jmap calls can impact the service. Prefer low-intrusion tools such as Arthas and Async Profiler.
- Context matters: Interpret hot methods in business context to see if there is room for optimization (algorithms, caching, batching).
What's Next
See GC Basics and OOM Playbook for memory and GC. See Slow Query for database. See JIT/Escape Analysis for compiler optimizations.