A Look at String Deduplication in G1 GC

  jvm

Preface

This article looks at the String Deduplication feature of G1 GC.

-XX:+UseStringDeduplication

  • JDK 8u20 added a String Deduplication feature to G1 GC: String objects with identical contents are made to share one backing char[], which reduces the memory overhead of duplicated strings.
  • The feature is off by default and is enabled with -XX:+UseStringDeduplication (this requires G1, i.e. -XX:+UseG1GC).
  • Roughly, the JVM keeps a record of weak references to char[] arrays together with their hash values. When it finds a String whose char[] has the same hash as a recorded one, it compares the two arrays character by character. If they match completely, the String's pointer is changed to refer to the recorded char[], so that its own former char[] can be garbage-collected.
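
The mechanism in the last bullet can be modelled in plain Java. This is only a conceptual sketch: the real work happens inside the collector in native code, and FakeString and DedupTable are hypothetical names made up for illustration (a real java.lang.String cannot be repointed like this from application code).

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Conceptual model only; not how HotSpot is actually implemented.
class DedupSketch {

    // Hypothetical stand-in for a JDK 8 String, which wraps a char[].
    static final class FakeString {
        char[] value;
        FakeString(String s) { this.value = s.toCharArray(); }
    }

    // Hypothetical table: content hash -> weakly referenced arrays with that hash.
    static final class DedupTable {
        private final Map<Integer, List<WeakReference<char[]>>> buckets = new HashMap<>();

        // Returns true if s was redirected to an array that was already recorded.
        boolean deduplicate(FakeString s) {
            int hash = Arrays.hashCode(s.value);
            List<WeakReference<char[]>> bucket =
                    buckets.computeIfAbsent(hash, k -> new ArrayList<>());
            for (Iterator<WeakReference<char[]>> it = bucket.iterator(); it.hasNext(); ) {
                char[] known = it.next().get();
                if (known == null) {
                    it.remove();              // array was collected, drop the stale entry
                } else if (known == s.value) {
                    return false;             // s already uses the recorded array
                } else if (Arrays.equals(known, s.value)) {
                    s.value = known;          // share it; the old array becomes garbage
                    return true;
                }
            }
            bucket.add(new WeakReference<>(s.value)); // first time this content is seen
            return false;
        }
    }

    public static void main(String[] args) {
        DedupTable table = new DedupTable();
        FakeString a = new FakeString("number is odd");
        FakeString b = new FakeString("number is odd");
        table.deduplicate(a);                         // records a's array
        System.out.println(table.deduplicate(b));     // b is redirected to a's array
        System.out.println(a.value == b.value);       // both now share one array
    }
}
```

The weak references matter here: the table itself must not keep an array alive once no string uses it any more.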

Example

Experimental code

    @Test
    public void testG1StringDeduplication() throws InterruptedException {
        // Build 10 million strings at runtime; only two distinct values exist.
        List<String> data = IntStream.rangeClosed(1, 10000000)
                .mapToObj(i -> "number is " + (i % 2 == 0 ? "even" : "odd"))
                .collect(Collectors.toList());
        // Request a GC before measuring (deduplication candidates are found during GC).
        System.gc();
        // RamUsageEstimator is a third-party helper, not part of the JDK.
        long bytes = RamUsageEstimator.sizeOfAll(data);
        System.out.println("string list size in MB:" + bytes*1.0/1024/1024);
        System.out.println("used heap size in MB:" + ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed()*1.0/1024/1024);
        System.out.println("used non heap size in MB:" + ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage().getUsed()*1.0/1024/1024);
    }
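
Before comparing runs, it can help to confirm which flags the JVM actually picked up. The following is a minimal sketch, assuming a HotSpot-based JVM where com.sun.management.HotSpotDiagnosticMXBean is available; DedupFlagCheck and flagValue are names chosen here for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class DedupFlagCheck {

    // Returns the current value of a HotSpot flag as a string,
    // or "unsupported" if this JVM does not know the flag.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unsupported";   // assumption: not a HotSpot-based JVM
            }
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return "unsupported";       // flag (or bean) unknown to this JVM
        }
    }

    public static void main(String[] args) {
        System.out.println("UseG1GC = " + flagValue("UseG1GC"));
        System.out.println("UseStringDeduplication = " + flagValue("UseStringDeduplication"));
    }
}
```

Running it with the same JVM options as the test shows whether both flags are set for that run.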

With String Deduplication disabled

-XX:+UseG1GC -XX:-UseStringDeduplication

The output is as follows:

string list size in MB:586.8727111816406
used heap size in MB:831.772346496582
used non heap size in MB:6.448394775390625
  • The JVM heap uses about 831 MB in total, of which the string list accounts for about 586 MB.

With String Deduplication enabled

-XX:+UseG1GC -XX:+UseStringDeduplication

The output is as follows:

string list size in MB:296.83294677734375
used heap size in MB:645.0970153808594
used non heap size in MB:6.376350402832031
  • The JVM heap uses about 645 MB in total, of which the string list accounts for about 296 MB.

Summary

  • JDK 8u20 added a String Deduplication feature to G1 GC: String objects with identical contents are made to share one backing char[], which reduces the memory overhead of duplicated strings.
  • The feature is off by default and is enabled with -XX:+UseStringDeduplication (this requires G1, i.e. -XX:+UseG1GC).
  • When many strings are duplicated, enabling String Deduplication with G1 GC does save memory: in this test the used heap dropped from about 831 MB to about 645 MB (roughly 22%), and the string list itself shrank by about half. This is an ideal case, though; ordinary applications may not contain nearly as many duplicate strings.
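
The savings can be worked out directly from the two outputs above. The snippet below is just that arithmetic, with the measured values copied in; DedupSavings and percentSaved are illustrative names.

```java
import java.util.Locale;

class DedupSavings {

    // Percentage by which 'after' is smaller than 'before'.
    static double percentSaved(double before, double after) {
        return (before - after) / before * 100.0;
    }

    public static void main(String[] args) {
        // Values in MB, taken from the two runs shown earlier.
        double heap = percentSaved(831.772346496582, 645.0970153808594);
        double list = percentSaved(586.8727111816406, 296.83294677734375);
        System.out.printf(Locale.ROOT, "used heap saved: %.1f%%%n", heap);   // about 22.4%
        System.out.printf(Locale.ROOT, "string list saved: %.1f%%%n", list); // about 49.4%
    }
}
```

The list shrinks by roughly half while the heap as a whole shrinks by less, since the used heap also contains objects other than the list.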

doc