<?xml version="1.0" encoding="UTF-8"?> | |
<!DOCTYPE reference PUBLIC "-//OASIS//DTD DITA Reference//EN" "reference.dtd" > | |
<reference id="ref_inspections_component_report" xml:lang="en-us"> | |
<title>Component Report</title> | |
<shortdesc>Analyze a component for possible memory waste and | |
other inefficiencies.</shortdesc> | |
<prolog> | |
<copyright> | |
<copyryear year=""></copyryear> | |
<copyrholder> | |
Copyright (c) 2008, 2010 SAP AG and others. | |
All rights reserved. This program and the accompanying materials | |
are made available under the terms of the Eclipse Public License v1.0 | |
which accompanies this distribution, and is available at | |
http://www.eclipse.org/legal/epl-v10.html | |
</copyrholder> | |
</copyright> | |
</prolog> | |
<refbody> | |
<section id="introduction"> | |
<title>Introduction</title> | |
<p>A heap dump contains millions of objects. But which of those | |
belong to your component? And what conclusions can you draw from | |
them? This is where the Component Report can help.</p> | |
<p> | |
Before starting, one has to decide what constitutes a component. | |
Typically, a component is either a set of classes in a | |
<b>common root package</b> | |
or a set of classes loaded by the same | |
<b>class loader</b> | |
. | |
</p> | |
<p> | |
Using this root set of objects, the component report calculates a | |
customized retained set. This retained set includes all objects kept | |
alive by the root set. Additionally, it assumes that all objects | |
that have become | |
<i>finalizable</i> | |
actually have been finalized and that also all soft references have | |
been cleared. | |
</p> | |
</section> | |
<section id="run"> | |
<title>Executing the Component Report</title> | |
<p> To run the report for a common root package, select the component | |
report from the tool bar and provide a regular expression to match | |
the package:</p> | |
<image href="component_report_package.png"> | |
<alt>Regular expression to match common root package to be | |
used for the component report.</alt> | |
</image> | |
<p> Alternatively, one can group the class histogram by class loader | |
and then right-click the appropriate class loader and select the | |
component report:</p> | |
<image href="component_report_classloader.png"> | |
<alt>Group histogram by class loader.</alt> | |
</image> | |
</section> | |
<section id="overview"> | |
<title>Overview</title> | |
<p>The component report is rendered as HTML. It is stored in a ZIP | |
file next to the heap dump file.</p> | |
<image href="component_report_overview.png"> | |
<alt>Overview section of the component report.</alt> | |
</image> | |
<p> | |
<ol outputclass="arrows"> | |
<li>Details about the size, the number of classes, the | |
number of objects and the number of different class loaders.</li> | |
<li>The pie chart shows the size of the component relative to | |
the total heap size.</li> | |
<li> | |
The | |
<xref href="top_consumers.dita" scope="local">Top Consumers</xref> | |
section lists the biggest object, classes, class loader and | |
packages which are retained by the component. It provides a good | |
overview of what is actually kept alive by the component. | |
</li> | |
<li> | |
<xref href="retained_set.dita" scope="local">Retained Set | |
</xref> | |
displays all objects grouped by classes which are retained. | |
</li> | |
</ol> | |
</p> | |
</section> | |
<section id="strings"> | |
<title>Duplicate Strings</title> | |
<p> | |
Duplicate Strings are a prime example for memory waste: multiple | |
char arrays with identical content. To find the duplicates, the | |
report | |
<xref href="group_by_value.dita" scope="local">groups</xref> | |
the char arrays by their value. It lists all char arrays with 10 or | |
more instances with identical content. | |
</p> | |
<p> | |
The content of the char arrays typically gives away ideas how to | |
reduce the duplicates: | |
<ul> | |
<li> | |
Sometimes the duplicate strings are used as | |
<b>keys or values in hash maps</b> | |
. For example, when reading heap dumps, MAT itself used to read | |
the char constant denoting the type of an attribute into memory. | |
It turned out that the heap was littered with many 'L's for | |
references, 'B's for bytes, and 'Z's for booleans, etc. By | |
replacing the | |
<codeph>char</codeph> | |
with an | |
<codeph>int</codeph> | |
, MAT could save some of the precious memory. Alternatively, | |
Enumerations could do the same trick. | |
</li> | |
<li> | |
When reading | |
<b>XML documents</b> | |
, fragments like | |
<codeph>UTF-8</codeph> | |
, tag names or tag content remains in memory. Again, think about | |
using Enumerations for the repetitive content. | |
</li> | |
<li> | |
Another option is | |
<xref | |
href="http://java.sun.com/javase/6/docs/api/java/lang/String.html#intern()" | |
format="html">interning</xref> | |
the String. This adds the string to a pool of strings which is | |
maintained privately by the class | |
<codeph>String</codeph> | |
. For each unique string, the pool will keep on instance alive. | |
However, if you are interning, make sure do it | |
<b>responsibly</b> | |
: A big pool of strings will have maintenance costs and one cannot | |
rely on interned strings being garbage collected. | |
</li> | |
</ul> | |
</p> | |
</section> | |
<section id="emptycol"> | |
<title>Empty Collections</title> | |
<p>Even if collections are empty, they usually consume memory | |
through their internal object array. Imagine a tree structure where | |
every node eagerly creates array lists to hold its children, but | |
only a few nodes actually possess children.</p> | |
<p> | |
One remedy is the lazy initialization of the collections: create the | |
collection only when it is actually needed. To find out who is | |
responsible for the empty collections, use the | |
<xref href="immediate_dominators.dita" scope="local">immediate | |
dominators</xref> | |
command. | |
</p> | |
</section> | |
<section id="colfillratio"> | |
<title>Collection Fill Ratio</title> | |
<p>Just like empty ones, collections with only a few elements | |
also take up a lot of memory. Again, the backing array of the | |
collection is the main culprit. The examination of the fill ratios | |
using a heap dump from a production system gives hints to what | |
initial capacity to use.</p> | |
</section> | |
<section id="softref"> | |
<title>Soft Reference Statistics</title> | |
<p> | |
Soft references are cleared by the virtual machine in response to | |
memory demand. Usually, soft references are used to implement | |
caches: keep the objects around while there is sufficient memory, | |
clear the objects if free memory becomes low. | |
<ul> | |
<li>Usually objects are cached, because they are expensive | |
to re-create. Across a whole application, soft referenced objects | |
might carry very different costs. However, the virtual machine | |
cannot know this and clears the objects on some least recently | |
used algorithm. From the outside, this is very unpredictable and | |
difficult to fine tune.</li> | |
<li> | |
Furthermore, soft references can impose a | |
<i>stop-the-world</i> | |
phase during garbage collection. Oversimplified, the GC marks the | |
object graph behind the soft references while the virtual machine | |
is stopped. | |
</li> | |
</ul> | |
</p> | |
</section> | |
<section id="finalizer"> | |
<title>Finalizer Statistics</title> | |
<p> | |
Objects which implement the | |
<codeph>finalize</codeph> | |
method are included in the component report, because those objects | |
can have serious implications for the memory of a Java Virtual | |
Machine: | |
<ul> | |
<li> | |
Whenever an object with finalizer is created, a corresponding | |
<codeph>java.lang.ref.Finalizer</codeph> | |
object is created. If the object is only reachable via its | |
finalizer, it is placed in the queue of the finalizer thread and | |
processed. Only then the next garbage collection will actually | |
free the memory. Therefore it takes at least two garbage | |
collections until the memory is freed. | |
</li> | |
<li>When using Sun's current virtual machine implementation, | |
the finalizer thread is a single thread processing the finalizer | |
objects sequentially. One blocking finalizer queue therefore can | |
easily keep alive big chunks of memory (all those other objects | |
ready to be finalized).</li> | |
<li> | |
Depending on the actual algorithm, finalizer may require a | |
<i>stop-the-world</i> | |
pause during garbage collections. This, of course, can have | |
serious implications for the responsiveness of the whole | |
application. | |
</li> | |
<li>Last not least, the time of execution of the finalizer is | |
up to the VM and therefore unpredictable.</li> | |
</ul> | |
</p> | |
</section> | |
<section id="mapcollision"> | |
<title>Map Collision Ratios</title> | |
<p>This sections analyzes the collision ratios of hash maps. Maps | |
place the values in different buckets based on the hash code of the | |
keys. If the hash code points to the same bucket, the elements | |
inside the bucket are typically compared linearly.</p> | |
<p>High collision ratios can indicate sub-optimal hash codes. | |
This is not a memory problem (a better hash code does not save | |
space) but rather performance problem because of the linear access | |
inside the buckets.</p> | |
</section> | |
</refbody> | |
</reference> |