JDK12 ShenandoahGC trial scalpel

  java, jdk12

Order

This article mainly tries out ShenandoahGC newly introduced by JDK12.

ShenandoahGC

Shenandoah is a concurrent and parallel garbage collector
图片描述

  • Like ZGC, it is also a garbage collector for low-pause-time, but ZGC is implemented based on colored pointers, while Shenandoah GC is implemented based on brooks pointers.
  • Compared with G1 GC, g1evacuation is parallel but not concurrent, while Shenandoah’s evacuation is concurrent, which can reduce pause time better.
  • Like G1 GC, ShenandoahGC is also a region-based GC, except that ShenandoahGC has no logical generation, so there is no young/old

GC cycle

ShenandoahGC mainly has the following stages:
图片描述

Snapshot-at-the-beginning concurrent mark

This includes Init Mark (Pause)、Concurrent Mark、Final Mark(Pause); White (not yet visited)、Gray(visited, but references are not scanned yet)、Black(visited, and fully scanned) Color algorithm to mark

Concurrent evacuation

This is the evacuation phase different from G1, which is concurrent; Brooks Pointers (object version change with additional atomically changed indirection) algorithm to copy,

Concurrent update references (optional)

This includes Init update Refs (Pause)、Concurrent update Refs、Final update Refs(Pause)
Final Mark or Final update Refs may be followed by Concurrent cleanup and garbage collection.

Related parameters

Shenandoah initial parameter

     bool ShenandoahAcmpBarrier                    = true                                   {diagnostic} {default}
     bool ShenandoahAllocFailureALot               = false                                  {diagnostic} {default}
    uintx ShenandoahAllocSpikeFactor               = 5                                    {experimental} {default}
     intx ShenandoahAllocationStallThreshold       = 10000                                  {diagnostic} {default}
    uintx ShenandoahAllocationThreshold            = 0                                    {experimental} {default}
     bool ShenandoahAllocationTrace                = false                                  {diagnostic} {default}
     bool ShenandoahAllowMixedAllocs               = true                                   {diagnostic} {default}
     bool ShenandoahAlwaysClearSoftRefs            = false                                {experimental} {default}
     bool ShenandoahAlwaysPreTouch                 = false                                  {diagnostic} {default}
     bool ShenandoahCASBarrier                     = true                                   {diagnostic} {default}
     bool ShenandoahCloneBarrier                   = true                                   {diagnostic} {default}
    uintx ShenandoahCodeRootsStyle                 = 2                                    {experimental} {default}
     bool ShenandoahCommonGCStateLoads             = false                                {experimental} {default}
     bool ShenandoahConcurrentScanCodeRoots        = true                                 {experimental} {default}
    uintx ShenandoahControlIntervalAdjustPeriod    = 1000                                 {experimental} {default}
    uintx ShenandoahControlIntervalMax             = 10                                   {experimental} {default}
    uintx ShenandoahControlIntervalMin             = 1                                    {experimental} {default}
    uintx ShenandoahCriticalFreeThreshold          = 1                                    {experimental} {default}
     bool ShenandoahDecreaseRegisterPressure       = false                                  {diagnostic} {default}
     bool ShenandoahDegeneratedGC                  = true                                   {diagnostic} {default}
     bool ShenandoahDontIncreaseWBFreq             = true                                 {experimental} {default}
     bool ShenandoahElasticTLAB                    = true                                   {diagnostic} {default}
    uintx ShenandoahEvacAssist                     = 10                                   {experimental} {default}
    uintx ShenandoahEvacReserve                    = 5                                    {experimental} {default}
     bool ShenandoahEvacReserveOverflow            = true                                 {experimental} {default}
   double ShenandoahEvacWaste                      = 1.200000                             {experimental} {default}
    uintx ShenandoahFreeThreshold                  = 10                                   {experimental} {default}
    uintx ShenandoahFullGCThreshold                = 3                                    {experimental} {default}
    ccstr ShenandoahGCHeuristics                   = adaptive                             {experimental} {default}
    uintx ShenandoahGarbageThreshold               = 60                                   {experimental} {default}
    uintx ShenandoahGuaranteedGCInterval           = 300000                               {experimental} {default}
   size_t ShenandoahHeapRegionSize                 = 0                                    {experimental} {default}
     bool ShenandoahHumongousMoves                 = true                                 {experimental} {default}
     intx ShenandoahHumongousThreshold             = 100                                  {experimental} {default}
    uintx ShenandoahImmediateThreshold             = 90                                   {experimental} {default}
     bool ShenandoahImplicitGCInvokesConcurrent    = true                                 {experimental} {default}
    uintx ShenandoahInitFreeThreshold              = 70                                   {experimental} {default}
     bool ShenandoahKeepAliveBarrier               = true                                   {diagnostic} {default}
    uintx ShenandoahLearningSteps                  = 5                                    {experimental} {default}
     bool ShenandoahLoopOptsAfterExpansion         = true                                 {experimental} {default}
    uintx ShenandoahMarkLoopStride                 = 1000                                 {experimental} {default}
     intx ShenandoahMarkScanPrefetch               = 32                                   {experimental} {default}
   size_t ShenandoahMaxRegionSize                  = 33554432                             {experimental} {default}
    uintx ShenandoahMergeUpdateRefsMaxGap          = 200                                  {experimental} {default}
    uintx ShenandoahMergeUpdateRefsMinGap          = 100                                  {experimental} {default}
    uintx ShenandoahMinFreeThreshold               = 10                                   {experimental} {default}
   size_t ShenandoahMinRegionSize                  = 262144                               {experimental} {default}
     bool ShenandoahOOMDuringEvacALot              = false                                  {diagnostic} {default}
     bool ShenandoahOptimizeInstanceFinals         = false                                {experimental} {default}
     bool ShenandoahOptimizeStableFinals           = false                                {experimental} {default}
     bool ShenandoahOptimizeStaticFinals           = true                                 {experimental} {default}
     bool ShenandoahPacing                         = true                                 {experimental} {default}
    uintx ShenandoahPacingCycleSlack               = 10                                   {experimental} {default}
    uintx ShenandoahPacingIdleSlack                = 2                                    {experimental} {default}
    uintx ShenandoahPacingMaxDelay                 = 10                                   {experimental} {default}
   double ShenandoahPacingSurcharge                = 1.100000                             {experimental} {default}
    uintx ShenandoahParallelRegionStride           = 1024                                 {experimental} {default}
     uint ShenandoahParallelSafepointThreads       = 4                                    {experimental} {default}
     bool ShenandoahPreclean                       = true                                 {experimental} {default}
     bool ShenandoahReadBarrier                    = true                                   {diagnostic} {default}
    uintx ShenandoahRefProcFrequency               = 5                                    {experimental} {default}
     bool ShenandoahRegionSampling                 = true                                 {experimental} {command line}
      int ShenandoahRegionSamplingRate             = 40                                   {experimental} {default}
     bool ShenandoahSATBBarrier                    = true                                   {diagnostic} {default}
    uintx ShenandoahSATBBufferFlushInterval        = 100                                  {experimental} {default}
   size_t ShenandoahSATBBufferSize                 = 1024                                 {experimental} {default}
     bool ShenandoahStoreCheck                     = false                                  {diagnostic} {default}
     bool ShenandoahStoreValEnqueueBarrier         = false                                  {diagnostic} {default}
     bool ShenandoahStoreValReadBarrier            = true                                   {diagnostic} {default}
     bool ShenandoahSuspendibleWorkers             = false                                {experimental} {default}
   size_t ShenandoahTargetNumRegions               = 2048                                 {experimental} {default}
     bool ShenandoahTerminationTrace               = false                                  {diagnostic} {default}
     bool ShenandoahUncommit                       = true                                 {experimental} {default}
    uintx ShenandoahUncommitDelay                  = 300000                               {experimental} {default}
    uintx ShenandoahUnloadClassesFrequency         = 0                                    {experimental} {default}
    ccstr ShenandoahUpdateRefsEarly                = adaptive                             {experimental} {default}
     bool ShenandoahVerify                         = false                                  {diagnostic} {default}
     intx ShenandoahVerifyLevel                    = 4                                      {diagnostic} {default}
     bool ShenandoahWriteBarrier                   = true                                   {diagnostic} {default}

Some of them are used by diagnostic, such as ShenandoahAcmpBarrier, ShenandoahAllocFailureALot, Shenandoahallocationstall hreshold, etc.

Heuristics related parameters

    ccstr ShenandoahGCHeuristics                   = adaptive                             {experimental} {default}
    uintx ShenandoahInitFreeThreshold              = 70                                   {experimental} {default}
    uintx ShenandoahMinFreeThreshold               = 10                                   {experimental} {default}
    uintx ShenandoahAllocSpikeFactor               = 5                                    {experimental} {default}
    uintx ShenandoahGarbageThreshold               = 60                                   {experimental} {default}
    uintx ShenandoahFreeThreshold                  = 10                                   {experimental} {default}
    uintx ShenandoahAllocationThreshold            = 0                                    {experimental} {default}
    ccstr ShenandoahUpdateRefsEarly                = adaptive                             {experimental} {default}

Heuristics is mainly used to tell Shenandoah when to start a GC cycle, among which ShenandoahGCHeuristics is used to select different policies, and its optional values are adaptive (Default)、static、compact、passive(For diagnostic purposes)、aggressive(For diagnostic purposes)

  • The adaptive approach is mainly through ShenandoahInitFreeThreshold (Initial remaining free heap threshold for learning steps)、ShenandoahMinFreeThreshold(free space threshold at which heuristics triggers the GC unconditionally)、ShenandoahAllocSpikeFactor(How much heap to reserve for absorbing allocation spikes)、XX:ShenandoahGarbageThreshold(Sets the percentage of garbage a region need to contain before it can be marked for collection) to set the appropriate startup GC cycle
  • The static method is mainly based on heap occupancy and allocation pressure to decide whether to start GC cycle. The relevant parameters are: ShenandoahFreeThreshold (Set the percentage of free heap at which a GC cycle is started)、ShenandoahAllocationThreshold(Set percentage of memory allocated since last GC cycle before a new GC cycle is started)、ShenandoahGarbageThreshold
  • The compact mode is continuous. as long as allocation occurs, a new GC cycle will be started after the end of the previous GC cycle. the relevant parameters are ConcGCThreads (Trim down the number of concurrent GC threads to make more room for application to run)、ShenandoahAllocationThreshold
  • The passive mode is completely passive, which triggers STW when memory runs out, and is usually used for diagnostic purposes.
  • The aggressive mode is completely active, starting a new GC cycle (It is somewhat similar to the compact method.However, it will evacuate all live objects and is usually used for diagnostic.

Failure Modes

When allocation failure occurs, Shenandoah has some elegant degradation ladder to deal with this situation, as follows:

  • Pacing(<10 ms)

The ShenandoahPacing parameter is turned on by default. Pacer is used to stall the threads to which objects are being allocated when the gc is not fast enough. When the gc speed catches up, stall is released for these threads. Stall is not open-ended. There is a ShenandoahPacingMaxDelay (Unit millisecond) parameter can be set, and allocation will be generated once the value is exceeded. When allocation pressure is high, Pacer can do nothing. This time, Pacer will go to the next step.

  • Degenerated GC(<100 ms)

The ShenandoahDegeneratedGC parameter is turned on by default. in this degeneredcycle, the number of threads used by Shenandoah is taken from ParallelGCThreads instead of ConcCGThreads.

  • Full GC(>100 ms)

When there is not enough memory after Degenerated GC, enter Full GC cycle, which will compact as much as possible and then release memory to ensure OOM does not occur.

Example

Startup parameters

-server -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC -XX:+UsePerfData -XX:+ShenandoahRegionSampling -XX:ParallelGCThreads=4 -XX:ConcGCThreads=4 -XX:+UnlockDiagnosticVMOptions -Xlog:age*,ergo*,gc*=info

Gc log

[2019-03-21T15:12:53.771-0800][8707][gc] Consider -XX:+ClassUnloadingWithConcurrentMark if large pause times are observed on class-unloading sensitive workloads
[2019-03-21T15:12:53.862-0800][8707][gc,init] Regions: 2048 x 1024K
[2019-03-21T15:12:53.862-0800][8707][gc,init] Humongous object threshold: 1024K
[2019-03-21T15:12:53.863-0800][8707][gc,init] Max TLAB size: 1024K
[2019-03-21T15:12:53.863-0800][8707][gc,init] GC threads: 4 parallel, 4 concurrent
[2019-03-21T15:12:53.863-0800][8707][gc,init] Reference processing: parallel
[2019-03-21T15:12:53.864-0800][8707][gc     ] Heuristics ergonomically sets -XX:+ExplicitGCInvokesConcurrent
[2019-03-21T15:12:53.864-0800][8707][gc     ] Heuristics ergonomically sets -XX:+ShenandoahImplicitGCInvokesConcurrent
[2019-03-21T15:12:53.864-0800][8707][gc,init] Shenandoah heuristics: adaptive
[2019-03-21T15:12:53.864-0800][8707][gc,heap] Initialize Shenandoah heap with initial size 128M
[2019-03-21T15:12:53.865-0800][8707][gc,ergo] Pacer for Idle. Initial: 40M, Alloc Tax Rate: 1.0x
[2019-03-21T15:12:53.883-0800][8707][gc,init] Safepointing mechanism: global-page poll
[2019-03-21T15:12:53.883-0800][8707][gc     ] Using Shenandoah
[2019-03-21T15:12:53.884-0800][8707][gc,heap,coops] Heap address: 0x0000000780000000, size: 2048 MB, Compressed Oops mode: Zero based, Oop shift amount: 3
[2019-03-21T15:12:59.530-0800][14083][gc           ] Trigger: Metadata GC Threshold
[2019-03-21T15:12:59.532-0800][14083][gc,ergo      ] Free: 1813M (1813 regions), Max regular: 1024K, Max humongous: 1855488K, External frag: 1%, Internal frag: 0%
[2019-03-21T15:12:59.532-0800][14083][gc,ergo      ] Evacuation Reserve: 103M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.532-0800][14083][gc,start     ] GC(0) Concurrent reset
[2019-03-21T15:12:59.533-0800][14083][gc,task      ] GC(0) Using 4 of 4 workers for concurrent reset
[2019-03-21T15:12:59.533-0800][14083][gc           ] GC(0) Concurrent reset 132M->132M(2048M) 0.441ms
[2019-03-21T15:12:59.533-0800][15619][gc,start     ] GC(0) Pause Init Mark (process weakrefs) (unload classes)
[2019-03-21T15:12:59.533-0800][15619][gc,task      ] GC(0) Using 4 of 4 workers for init marking
[2019-03-21T15:12:59.541-0800][15619][gc,ergo      ] GC(0) Pacer for Mark. Expected Live: 204M, Free: 1813M, Non-Taxable: 181M, Alloc Tax Rate: 0.4x
[2019-03-21T15:12:59.541-0800][15619][gc           ] GC(0) Pause Init Mark (process weakrefs) (unload classes) 7.568ms
[2019-03-21T15:12:59.541-0800][14083][gc,start     ] GC(0) Concurrent marking (process weakrefs) (unload classes)
[2019-03-21T15:12:59.541-0800][14083][gc,task      ] GC(0) Using 4 of 4 workers for concurrent marking
[2019-03-21T15:12:59.619-0800][14083][gc           ] GC(0) Concurrent marking (process weakrefs) (unload classes) 132M->134M(2048M) 78.373ms
[2019-03-21T15:12:59.619-0800][14083][gc,start     ] GC(0) Concurrent precleaning
[2019-03-21T15:12:59.619-0800][14083][gc,task      ] GC(0) Using 1 of 4 workers for concurrent preclean
[2019-03-21T15:12:59.622-0800][14083][gc           ] GC(0) Concurrent precleaning 134M->134M(2048M) 2.397ms
[2019-03-21T15:12:59.622-0800][15619][gc,start     ] GC(0) Pause Final Mark (process weakrefs) (unload classes)
[2019-03-21T15:12:59.622-0800][15619][gc,task      ] GC(0) Using 4 of 4 workers for final marking
[2019-03-21T15:12:59.625-0800][15619][gc,stringtable] GC(0) Cleaned string table, strings: 13692 processed, 50 removed
[2019-03-21T15:12:59.626-0800][15619][gc,ergo       ] GC(0) Adaptive CSet Selection. Target Free: 204M, Actual Free: 1914M, Max CSet: 85M, Min Garbage: 0M
[2019-03-21T15:12:59.626-0800][15619][gc,ergo       ] GC(0) Collectable Garbage: 117M (97% of total), 8M CSet, 126 CSet regions
[2019-03-21T15:12:59.626-0800][15619][gc,ergo       ] GC(0) Immediate Garbage: 0M (0% of total), 0 regions
[2019-03-21T15:12:59.626-0800][15619][gc,ergo       ] GC(0) Pacer for Evacuation. Used CSet: 126M, Free: 1811M, Non-Taxable: 181M, Alloc Tax Rate: 1.1x
[2019-03-21T15:12:59.626-0800][15619][gc            ] GC(0) Pause Final Mark (process weakrefs) (unload classes) 4.712ms
[2019-03-21T15:12:59.626-0800][14083][gc,start      ] GC(0) Concurrent cleanup
[2019-03-21T15:12:59.627-0800][14083][gc            ] GC(0) Concurrent cleanup 134M->135M(2048M) 0.132ms
[2019-03-21T15:12:59.627-0800][14083][gc,ergo       ] GC(0) Free: 1810M (1810 regions), Max regular: 1024K, Max humongous: 1852416K, External frag: 1%, Internal frag: 0%
[2019-03-21T15:12:59.627-0800][14083][gc,ergo       ] GC(0) Evacuation Reserve: 102M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.627-0800][14083][gc,start      ] GC(0) Concurrent evacuation
[2019-03-21T15:12:59.627-0800][14083][gc,task       ] GC(0) Using 4 of 4 workers for concurrent evacuation
[2019-03-21T15:12:59.643-0800][14083][gc            ] GC(0) Concurrent evacuation 135M->145M(2048M) 15.912ms
[2019-03-21T15:12:59.643-0800][15619][gc,start      ] GC(0) Pause Init Update Refs
[2019-03-21T15:12:59.643-0800][15619][gc,ergo       ] GC(0) Pacer for Update Refs. Used: 145M, Free: 1810M, Non-Taxable: 181M, Alloc Tax Rate: 1.1x
[2019-03-21T15:12:59.643-0800][15619][gc            ] GC(0) Pause Init Update Refs 0.090ms
[2019-03-21T15:12:59.643-0800][14083][gc,start      ] GC(0) Concurrent update references
[2019-03-21T15:12:59.643-0800][14083][gc,task       ] GC(0) Using 4 of 4 workers for concurrent reference update
[2019-03-21T15:12:59.652-0800][14083][gc            ] GC(0) Concurrent update references 145M->147M(2048M) 9.028ms
[2019-03-21T15:12:59.652-0800][15619][gc,start      ] GC(0) Pause Final Update Refs
[2019-03-21T15:12:59.652-0800][15619][gc,task       ] GC(0) Using 4 of 4 workers for final reference update
[2019-03-21T15:12:59.653-0800][15619][gc            ] GC(0) Pause Final Update Refs 0.489ms
[2019-03-21T15:12:59.653-0800][14083][gc,start      ] GC(0) Concurrent cleanup
[2019-03-21T15:12:59.653-0800][14083][gc            ] GC(0) Concurrent cleanup 147M->21M(2048M) 0.088ms
[2019-03-21T15:12:59.653-0800][14083][gc,ergo       ] Free: 1924M (1924 regions), Max regular: 1024K, Max humongous: 1840128K, External frag: 7%, Internal frag: 0%
[2019-03-21T15:12:59.653-0800][14083][gc,ergo       ] Evacuation Reserve: 103M (103 regions), Max regular: 1024K
[2019-03-21T15:12:59.653-0800][14083][gc,ergo       ] Pacer for Idle. Initial: 40M, Alloc Tax Rate: 1.0x
[2019-03-21T15:17:59.666-0800][14083][gc            ] Trigger: Time since last GC (300009 ms) is larger than guaranteed interval (300000 ms)

gc visualizer

There is oneshenandoah-visualizerThe tool can be used to visualize ShenandoahGC with the following visualization effects:
图片描述

Summary

  • Shenandoah is a concurrent and parallel garbage collector. Like ZGC, it is also a garbage collector for low-pause-time, but ZGC is implemented based on colored pointers, while Shenandoah GC is implemented based on brooks pointers. Compared with G1 GC, g1evacuation is parallel but not concurrent, while Shenandoah’s evacuation is concurrent, which can reduce pause time; better. Like G1 GC, ShenandoahGC is also a region-based GC, except that ShenandoahGC has no logical generation, so there is no young/old
  • Shenandoah’s GC cycle mainly includes snapshot-at-the-begining concurrent mark.Including Init Mark (Pause)、Concurrent Mark、Final Mark(Pause)、Concurrent evacuation、Concurrent update references (optional)Including Init update Refs (Pause)、Concurrent update Refs、Final update Refs(Pause); After Final Mark or Final update Refs, it is possible to carry out Concurrent cleanup and garbage collection.
  • Heuristics is mainly used to tell Shenandoah when to start a GC, among which ShenandoahGCHeuristics is used to select different policies, and its optional values are adaptive (Default)、static、compact、passive(For diagnostic purposes)、aggressive(For diagnostic purposes); In addition, when allocation failure occurs, Shenandoah has some elegant degradation ladder to deal with this situation, including Pacing (<10 ms)、Degenerated GC(<100 ms)、Full GC(>100 ms)

doc