GC Tuning Principles (Not a Flag List)

GC tuning should follow principles, not a fixed parameter list: first measure (logs, metrics), then identify the issue (pause, throughput, memory), and finally adjust incrementally and validate. This article gives principle-based guidance with a problem-symptom table and tuning order, not a flag dump.

Overview

  • Measure first: Enable GC logs (-Xlog:gc* on JDK 9 and later) and monitoring. Observe minor and full GC frequency, pause times, heap usage, and promotion rate. No data, no tuning.
  • Define goals: Low latency (e.g. P99 < 100ms) or high throughput? Large or small heap? Different goals lead to different strategies.
  • Address root causes first: Reducing allocations, shortening object lifetime, and fixing leaks often help more than tweaking GC flags. Tuning is a fallback, not the first step.
  • Iterate in small steps: Change one or two parameters at a time, observe, then decide. Avoid changing many parameters at once and losing traceability.
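As a starting point for "measure first", the JVM's cumulative GC counters can be read in-process through the standard java.lang.management API. The following is a minimal sketch, not a replacement for GC logs; the class name GcSnapshot and its read method are hypothetical names chosen for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: reads the cumulative GC counters of the running JVM. */
final class GcSnapshot {
    /** Returns collector name -> {collection count, accumulated collection time in ms}. */
    static Map<String, long[]> read() {
        Map<String, long[]> stats = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not report the value.
            stats.put(gc.getName(),
                      new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return stats;
    }

    public static void main(String[] args) {
        read().forEach((name, v) ->
            System.out.printf("%s: collections=%d, timeMs=%d%n", name, v[0], v[1]));
    }
}
```

Taking two snapshots a fixed interval apart and subtracting them gives collection frequency and time per interval, which can serve as a simple baseline. Collector names differ by collector and JDK version, so the sketch does not assume any particular name.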

Example

Example 1: Common issues and approaches

Symptom | Possible cause | Approach
Frequent Full GC | Old gen full, too much promotion, leak | Check leaks, tune young/old, promotion threshold, reduce large objects
Long Minor GC pauses | Young gen too large, high copy cost | Reduce young gen, check allocation rate
Low throughput | GC dominates total time | Reduce GC frequency (less allocation, larger heap) or use throughput collector
Metaspace OOM | Too many classes loaded | Set MaxMetaspaceSize, check dynamic class loading, hot reload
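To judge whether the "GC dominates total time" row applies, one rough indicator is the share of JVM uptime spent in collections. The sketch below assumes that summing getCollectionTime() across collectors is a good enough approximation; for concurrent collectors some beans may include concurrent work, so the figure can overstate actual pause time. The class and method names are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: rough share of elapsed time attributed to GC. */
final class GcOverhead {
    /** Fraction of elapsed time spent in GC, clamped to [0, 1]; 0 if uptime is not positive. */
    static double fraction(long gcTimeMs, long uptimeMs) {
        if (uptimeMs <= 0) return 0.0;
        return Math.min(1.0, Math.max(0.0, (double) gcTimeMs / uptimeMs));
    }

    /** Sums the accumulated collection time of all collectors that report one. */
    static long totalGcTimeMs() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when undefined for this collector
            if (t > 0) total += t;
        }
        return total;
    }

    public static void main(String[] args) {
        long uptimeMs = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC overhead: %.2f%%%n", 100.0 * fraction(totalGcTimeMs(), uptimeMs));
    }
}
```

What counts as "too much" depends on the goals defined earlier; the number is most useful when compared against the same application's baseline.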

Example 2: Tuning order (conceptual)

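  1. Set the heap size (give -Xms and -Xmx the same value to avoid resizing) and choose a collector (e.g. G1).
  2. If you have a pause target, set -XX:MaxGCPauseMillis (G1) or the equivalent for your collector. G1 treats this value as a soft goal, not a guarantee.
  3. Observe the young/old ratio and promotion rate; tune young generation size, survivor space sizing, and the tenuring threshold as needed. With G1, fixing the young generation size explicitly overrides the pause-time goal, so prefer adjusting the goal first.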
  4. If still insufficient, consider off-heap, large object handling, or a different collector (e.g. ZGC, Shenandoah for lower pauses).
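Steps 1 and 2 might translate into a command line such as java -Xms4g -Xmx4g -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -Xlog:gc*:file=gc.log -jar app.jar, where the sizes, the pause goal, and the file names are illustrative values only. A small startup check can confirm that the intended flags actually reached the JVM. This is a sketch under stated assumptions: StartupCheck and its methods are hypothetical, the comparison of -Xms and -Xmx is purely textual, and it assumes the last occurrence of a repeated flag is the one that applies.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical startup check for the flags chosen in the tuning steps. */
final class StartupCheck {
    /** Returns the value of the last argument starting with the prefix, or null if absent. */
    static String lastValue(List<String> jvmArgs, String prefix) {
        String value = null;
        for (String arg : jvmArgs) {
            if (arg.startsWith(prefix)) value = arg.substring(prefix.length());
        }
        return value;
    }

    /** True when -Xms and -Xmx are both present and textually equal (4g vs 4096m is NOT detected). */
    static boolean fixedHeap(List<String> jvmArgs) {
        String xms = lastValue(jvmArgs, "-Xms");
        String xmx = lastValue(jvmArgs, "-Xmx");
        return xms != null && xms.equalsIgnoreCase(xmx);
    }

    public static void main(String[] args) {
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Fixed heap: " + fixedHeap(jvmArgs));
        System.out.println("Pause goal (ms): " + lastValue(jvmArgs, "-XX:MaxGCPauseMillis="));
        System.out.println("Max heap (bytes): " + Runtime.getRuntime().maxMemory());
    }
}
```

Logging this once at startup makes each tuning iteration traceable: the GC log and the flags that produced it end up side by side.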

Example 3: What to avoid

  • Tuning blindly without metrics.
  • Changing many parameters at once so you cannot attribute effects.
  • Only increasing -Xmx without investigating leaks.
  • Treating GC tuning as a cure-all and ignoring code and architecture.
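Before raising -Xmx, it can help to check whether heap occupancy measured right after collections keeps climbing, which is one common hint of a leak. The sketch below reads post-collection usage from the standard memory pool beans and applies a deliberately simple "strictly rising" test. The class name, the method names, and the three-sample minimum are assumptions made for illustration; a rising trend is suggestive, not proof, since warming caches look similar.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: tracks heap occupancy as measured after collections. */
final class PostGcTrend {
    /** Sums heap-pool usage recorded after each pool's most recent collection. */
    static long usedAfterLastGcBytes() {
        long total = 0;
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) continue;
            MemoryUsage afterGc = pool.getCollectionUsage(); // null if the pool does not support it
            if (afterGc != null) total += afterGc.getUsed();
        }
        return total;
    }

    /** True when every sample is strictly greater than the previous one (needs at least 3 samples). */
    static boolean steadilyRising(long[] samples) {
        if (samples.length < 3) return false;
        for (int i = 1; i < samples.length; i++) {
            if (samples[i] <= samples[i - 1]) return false;
        }
        return true;
    }
}
```

Sampling usedAfterLastGcBytes() periodically under steady load and feeding the values to steadilyRising gives a cheap signal for when a heap dump or profiler session is worth the effort.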

Core Mechanism / Behavior

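  • Generational: Most short-lived objects are collected in the young generation; long-lived ones are promoted to the old generation, where they can eventually trigger a full GC. Reducing allocations and shortening object lifetimes eases GC pressure.
  • Pause: Shorter stop-the-world (STW) pauses mean better latency. G1 and ZGC keep pauses short by dividing the heap into regions and doing much of the work concurrently or incrementally. Throughput collectors (e.g. Parallel) may pause longer but spend less total time in GC.
  • Heap size: A heap that is too large can make individual collections (especially full GCs) take longer; one that is too small triggers collections more often. Balance pause time, throughput, and available memory.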
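A small illustration of reducing allocations at the source: a boxed accumulator can create a new object on many iterations, while a primitive one creates none. The class and method names are hypothetical, and whether the boxed version really allocates in a given run depends on JIT optimizations such as escape analysis, so measure before assuming a gain.

```java
/** Hypothetical demo: same computation with and without per-iteration boxing. */
final class AllocationDemo {
    /** Boxed accumulator: each += unboxes, adds, and re-boxes, which may allocate a new Long. */
    static long sumBoxed(int n) {
        Long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    /** Primitive accumulator: same result with no boxing in the loop. */
    static long sumPrimitive(int n) {
        long total = 0L;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxed(1_000) == sumPrimitive(1_000));
    }
}
```

Both methods return the same value; the difference shows up only in allocation rate, which is exactly what GC logs and allocation profilers make visible.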

Key Rules

  • Measure first: Use GC logs and monitoring; establish a baseline and compare before/after changes.
  • Business logic first: Reduce allocation, shorten lifetime, fix leaks; tuning is secondary.
  • Small, verifiable steps: Each change should be traceable and reversible; validate in load/staging before production.
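Comparing a baseline against a changed configuration needs a concrete number, for example a pause percentile checked against a goal such as P99 < 100ms. The sketch below uses the nearest-rank percentile definition on pause durations that are assumed to have been extracted from GC logs already; PauseStats and its methods are hypothetical names.

```java
import java.util.Arrays;

/** Hypothetical helper: percentile of GC pause durations and a goal check. */
final class PauseStats {
    /** Nearest-rank percentile (p in (0, 100]) of the given pause durations in milliseconds. */
    static double percentile(double[] pausesMs, double p) {
        if (pausesMs.length == 0) throw new IllegalArgumentException("no samples");
        if (p <= 0 || p > 100) throw new IllegalArgumentException("p must be in (0, 100]");
        double[] sorted = pausesMs.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0); // 1-based nearest rank
        return sorted[Math.max(rank, 1) - 1];
    }

    /** True when the p-th percentile pause is strictly below the goal. */
    static boolean meetsGoal(double[] pausesMs, double p, double goalMs) {
        return percentile(pausesMs, p) < goalMs;
    }
}
```

Running the same calculation on the baseline run and on the run after a change, under comparable load, shows whether the change moved the metric that the goal is actually defined on.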

What's Next

See GC Basics and G1 Overview for concepts. See OOM Playbook for memory issue diagnosis. See Java Profiling for measurement tools.