<?php
/**
* *****************************************************************************
* Copyright (c) 2015, 2016 Eclipse Foundation and others.
* All rights reserved. This program and the accompanying materials
* are made available under the terms of the Eclipse Public License v1.0
* which accompanies this distribution, and is available at
* http://eclipse.org/legal/epl-v10.html
*
* Contributors:
* Eric Poirier (Eclipse Foundation) - Initial implementation
* Christopher Guindon (Eclipse Foundation)
* *****************************************************************************
*/
// This file must be included
if (basename(__FILE__) == basename($_SERVER['PHP_SELF'])) {
exit();
}
?>
<h1 class="article-title"><?php echo $pageTitle; ?></h1>
<p>It&rsquo;s commonly accepted that using open source components with known
vulnerabilities is a widespread problem that poses severe security risks to end users. In the
last few years, an increasing number of developers and researchers have addressed the problem from
different angles.</p>
<p>We were prompted to work on the security of open source supply chains after reading a
whitepaper entitled &ldquo;The Unfortunate Reality of Insecure Libraries&rdquo; [<a href="#ref_18">18</a>], which was
first authored by Jeff Williams and Arshan Dabirsiaghi in 2012.</p>
<p>
Fast-forward to 2019: My colleagues and I developed a code-centric approach to address this
problem [<a href="#ref_11">11</a>]. We released the solution as open source code in 2018, and the tool is now
available at the Eclipse Foundation as
<a href="https://projects.eclipse.org/projects/technology.steady">Eclipse Steady</a>.
</p>
<h2>Open Source Vulnerabilities Are a Significant Issue</h2>
<p>Open source components with known vulnerabilities have been the root cause of many data
breaches [<a href="#ref_19">19</a>], including the infamous Equifax data breach in 2017. Accordingly, the last two
editions of the Open Web Application Security Project (OWASP) top 10 most critical security risks
for Web applications, published in 2013 and 2017, highlight the widespread prevalence of this
security issue [<a href="#ref_1">1</a>, <a href="#ref_2">2</a>].</p>
<p>Since 2013, detecting this issue has become easier thanks to great open source
solutions such as OWASP Dependency Check [<a href="#ref_4">4</a>] and Retire.js [<a href="#ref_3">3</a>], as well as numerous commercial
solutions, many of which can be used to check open source projects for free. All of these
solutions compete in the software composition analysis (SCA) market [<a href="#ref_5">5</a>], which has grown
significantly over the past few years due to developers&rsquo; concerns about license compliance
and security.</p>
<h2>There Are Still Problems to Resolve</h2>
<p>While it sounds like the problem has been resolved and we can move on, that&rsquo;s not
the case for two main reasons.</p>
<p>The first reason is technical in nature and relates to the reachability of vulnerable
code in a given application context. Many tools rely on different kinds of metadata, such as Maven
artifact identifiers, to detect that an application depends on a vulnerable component version.
However, these tools cannot determine whether the vulnerable code can actually be executed in the
context of the application being analyzed, even though this is necessary to assess whether an
attacker can exploit the vulnerability. Components with vulnerable code that can never be executed do not
require an update, which can save considerable testing effort by software developers and users.</p>
<p>The second reason is the belief that the entire problem must be addressed by open
source solutions such as OWASP Dependency Check, Retire.js, or Eclipse Steady. These solutions
enable widespread tool adoption by open source and commercial software developers, and remain
independent of vendor-specific interests and infrastructures. Similar to vaccines, only very broad
tool adoption can ensure that open source ecosystems are healthy and trusted.</p>
<h2>Eclipse Steady Is a Code-Centric Approach to Open Source Vulnerabilities</h2>
<p>Work on Eclipse Steady started at SAP Security Research in 2014 with the development of
a code-centric approach to open source vulnerabilities, which boils down to identifying single
vulnerable open source methods [<a href="#ref_10">10</a>, <a href="#ref_11">11</a>]. This fine-grained approach allows for the application and
combination of all kinds of static and dynamic program analyses, which is out of scope for
coarse-grained approaches that only map component versions to vulnerability identifiers.</p>
<p>What started as a research prototype has matured into an industry-grade tool that is
used to scan all Java applications developed at SAP. More than 1,500 distinct projects have been
scanned since 2017, and there are more than 150,000 individual scans each month. The tool has been
open source since 2018 under the name &ldquo;vulnerability assessment tool&rdquo; [<a href="#ref_6">6</a>] and is now
in the process of being moved to the Eclipse Foundation as the Eclipse Steady project [<a href="#ref_7">7</a>].</p>
<p>
<strong>Note: </strong>At this time, the source code is still located in the
<a href="https://github.com/SAP/vulnerability-assessment-tool">SAP GitHub</a> repository.
However, it will soon be available in the
<a href="https://github.com/eclipse/steady">Eclipse GitHub</a> repository.
</p>
<h2>The Code-Centric Approach Is Crucial</h2>
<p>The technical approach relies on identifying source code with a given vulnerability
through automated analysis of so-called &ldquo;fix commits.&rdquo;</p>
<p>For example, the vulnerability CVE-2018-1000632 in Dom4j was fixed by commit e598e in
its source code repository [<a href="#ref_8">8</a>, <a href="#ref_9">9</a>]. As part of the commit, the method
<code>org.dom4j.tree.QNameCache.get(String,String)</code> was modified and the method
<code>org.dom4j.QName.validateName(String)</code> was added.</p>
<p>The signatures of vulnerable source code constructs, such as methods, as well as the
abstract syntax trees of the vulnerable and fixed versions, are stored in a PostgreSQL database
and can be consulted using a dedicated Web frontend.</p>
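<p>To make the idea concrete, here is a minimal sketch of how the changes of a fix commit could be
modeled in Java. It is not Eclipse Steady&rsquo;s actual data model: the class names
<code>ConstructChange</code> and <code>FixCommitSketch</code> and the helper
<code>signaturesToSearch</code> are hypothetical. Only the two Dom4j method signatures come from the
example above. The sketch assumes that a method added by the fix cannot occur in a release built
before the fix, so only modified or deleted methods are candidates to look for.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Kind of change a fix commit applied to a source code construct. */
enum ChangeType { ADDED, MODIFIED, DELETED }

/** One construct (here: a method) touched by a fix commit. Hypothetical model. */
final class ConstructChange {
    final String signature;
    final ChangeType type;

    ConstructChange(String signature, ChangeType type) {
        this.signature = signature;
        this.type = type;
    }
}

final class FixCommitSketch {
    /** Changes described in the text for CVE-2018-1000632 (Dom4j, commit e598e). */
    static List<ConstructChange> dom4jFix() {
        return List.of(
            new ConstructChange("org.dom4j.tree.QNameCache.get(String,String)", ChangeType.MODIFIED),
            new ConstructChange("org.dom4j.QName.validateName(String)", ChangeType.ADDED));
    }

    /**
     * Assumption of this sketch: a method added by the fix did not exist
     * before the fix, so only modified or deleted methods are returned.
     */
    static List<String> signaturesToSearch(List<ConstructChange> changes) {
        List<String> result = new ArrayList<>();
        for (ConstructChange change : changes) {
            if (change.type != ChangeType.ADDED) {
                result.add(change.signature);
            }
        }
        return result;
    }
}
```

<p>For the Dom4j example, <code>signaturesToSearch</code> returns only the signature of
<code>QNameCache.get(String,String)</code>.</p>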
<p>This information is then used to detect vulnerable code in application dependencies and
for static and dynamic reachability analyses.</p>
<p>Using this information, the detection of open source vulnerabilities consists of
identifying the signature of vulnerable methods in application dependencies, such as Java archives
(JARs), and determining whether the respective method body is equal, or at least closer, to the fixed
method body or to the vulnerable method body (obtained from the fix commit). This approach makes
detection very precise and robust against the re-bundling of Java classes, which is a very common
practice in the Java ecosystem.</p>
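<p>The comparison step can be illustrated with a deliberately simplified sketch. Eclipse Steady works
with abstract syntax trees and bytecode. The code below instead assumes that method bodies are
available as plain strings and uses token-set (Jaccard) similarity as a stand-in. All names
(<code>BodyComparisonSketch</code>, <code>classify</code>, <code>Verdict</code>) are hypothetical.</p>

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class BodyComparisonSketch {
    enum Verdict { VULNERABLE, FIXED, UNDECIDED }

    /** Splits a method body into word tokens (a crude stand-in for an AST). */
    static Set<String> tokens(String body) {
        Set<String> result = new HashSet<>(Arrays.asList(body.split("\\W+")));
        result.remove(""); // split() yields an empty token if the body starts with punctuation
        return result;
    }

    /** Jaccard similarity: size of the intersection divided by size of the union. */
    static double similarity(String a, String b) {
        Set<String> union = tokens(a);
        union.addAll(tokens(b));
        if (union.isEmpty()) {
            return 1.0; // two empty bodies are considered identical
        }
        Set<String> intersection = tokens(a);
        intersection.retainAll(tokens(b));
        return (double) intersection.size() / union.size();
    }

    /** Reports which known body the candidate found in a JAR is closer to. */
    static Verdict classify(String candidate, String vulnerableBody, String fixedBody) {
        double toVulnerable = similarity(candidate, vulnerableBody);
        double toFixed = similarity(candidate, fixedBody);
        if (toVulnerable > toFixed) return Verdict.VULNERABLE;
        if (toFixed > toVulnerable) return Verdict.FIXED;
        return Verdict.UNDECIDED; // a tie needs another strategy or manual review
    }
}
```

<p>The <code>UNDECIDED</code> outcome is kept on purpose: a similarity-based comparison cannot always
tell the two versions apart.</p>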
<p>If a given vulnerable method is found in an application dependency, Eclipse Steady can
perform static and dynamic analyses, alone or in combination. Figure 1 provides an overview of
findings for a sample application. Here, the red exclamation marks indicate that vulnerable code
is present, while the red footprints indicate the vulnerable code is potentially reachable
(according to static analysis) or has been executed (according to dynamic analysis).</p>
<p>
<strong>Figure 1: Overview of analysis results</strong>
</p>
<p><img src="images/6_1.png"/>
</p>
<p>The static analysis uses the open source tools WALA [<a href="#ref_12">12</a>] or Soot [<a href="#ref_13">13</a>] to build a call
graph that starts at the application methods, and then checks whether the graph contains vulnerable
methods. If any are found, the analysis concludes there is a potential execution path from an
application method to a vulnerable method.</p>
<p>For example, Figure 2 shows that the vulnerable constructor <code>Namespace(String,String)</code> is
reachable from the application methods <code>processRequest</code>, <code>doPost</code>, and <code>main</code> (highlighted in green).</p>
<p>
<strong>Figure 2: Execution path from application method to vulnerable method</strong>
</p>
<p><img src="images/6_2.png"/></p>
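<p>The reachability check itself can be sketched as a breadth-first search over a call graph. The
example below does not use WALA or Soot. It assumes the call graph is already available as a simple
map from caller to callees. The edges in <code>sampleGraph</code> are an assumed reading of the
example above, and the method <code>unusedHelper</code> is invented for contrast.</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilitySketch {
    /** Breadth-first search over caller -> callees edges, starting at the entry points. */
    static boolean isReachable(Map<String, List<String>> callGraph,
                               Set<String> entryPoints,
                               String target) {
        Deque<String> queue = new ArrayDeque<>(entryPoints);
        Set<String> visited = new HashSet<>(entryPoints);
        while (!queue.isEmpty()) {
            String method = queue.poll();
            if (method.equals(target)) {
                return true;
            }
            for (String callee : callGraph.getOrDefault(method, List.of())) {
                if (visited.add(callee)) { // add() returns false for already visited methods
                    queue.add(callee);
                }
            }
        }
        return false;
    }

    /** Hypothetical call graph; the edge order is an assumption of this sketch. */
    static Map<String, List<String>> sampleGraph() {
        return Map.of(
            "main", List.of("doPost"),
            "doPost", List.of("processRequest"),
            "processRequest", List.of("Namespace(String,String)"),
            "unusedHelper", List.of());
    }
}
```

<p>Starting from <code>main</code>, the vulnerable constructor is reachable. Starting from
<code>unusedHelper</code>, it is not.</p>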
<p>The dynamic analysis uses a dedicated Java agent to instrument all methods so the
execution of vulnerable methods can be detected. This analysis can be completed during execution
of JUnit and integration tests and with any standalone Java Virtual Machine (JVM).</p>
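<p>For readers unfamiliar with Java agents, the following sketch shows the standard entry point
offered by the <code>java.lang.instrument</code> package. It is not the Eclipse Steady agent: it only
records the names of the classes passed to it and leaves their bytecode unchanged, whereas tracing
individual method executions would additionally require rewriting the bytecode. The class name
<code>TraceAgentSketch</code> is hypothetical.</p>

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

final class TraceAgentSketch {
    /** Internal names (for example "org/dom4j/QName") of classes seen by the agent. */
    static final Set<String> SEEN_CLASSES = ConcurrentHashMap.newKeySet();

    /** Records each class handed to it; a real tracer would rewrite the bytecode here. */
    static final class RecordingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader,
                                String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null) {
                SEEN_CLASSES.add(className);
            }
            return null; // null tells the JVM to keep the class bytes unchanged
        }
    }

    /** Invoked by the JVM before main() when started with -javaagent:agent.jar. */
    public static void premain(String agentArgs, Instrumentation instrumentation) {
        instrumentation.addTransformer(new RecordingTransformer());
    }
}
```

<p>To be loaded by the JVM, such a class must be packaged in a JAR whose manifest names it in the
<code>Premain-Class</code> attribute.</p>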
<p>Static and dynamic analyses can also be combined to overcome the weaknesses of each
approach: the use of reflection, which limits static analysis, and limited test coverage, which
limits dynamic analysis. When they are combined, all methods executed from the application and its
dependencies during tests are used as entry points for call graph construction. Experiments have
confirmed that combining the two techniques results in a 7.9 percent increase in evidence that
vulnerable code is potentially executable [<a href="#ref_11">11</a>].</p>
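<p>Why the combination finds more can be shown with a small, self-contained sketch. It assumes a
hypothetical application in which <code>dispatch</code> invokes <code>handler</code> through
reflection, so the static call graph has no edge between them, and in which the tests execute
<code>handler</code> but never the vulnerable method. All method names are invented for
illustration.</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CombinedAnalysisSketch {
    /** Returns every method reachable from the entry points in the call graph. */
    static Set<String> reachableFrom(Map<String, List<String>> callGraph, Set<String> entryPoints) {
        Set<String> visited = new HashSet<>(entryPoints);
        Deque<String> queue = new ArrayDeque<>(entryPoints);
        while (!queue.isEmpty()) {
            for (String callee : callGraph.getOrDefault(queue.poll(), List.of())) {
                if (visited.add(callee)) {
                    queue.add(callee);
                }
            }
        }
        return visited;
    }

    /** Combined analysis: methods traced during tests become additional entry points. */
    static Set<String> combined(Map<String, List<String>> callGraph,
                                Set<String> applicationMethods,
                                Set<String> tracedMethods) {
        Set<String> entryPoints = new HashSet<>(applicationMethods);
        entryPoints.addAll(tracedMethods);
        return reachableFrom(callGraph, entryPoints);
    }

    /** Static call graph; the reflective call dispatch -> handler is missing. */
    static Map<String, List<String>> staticGraph() {
        return Map.of(
            "main", List.of("dispatch"),
            "handler", List.of("vulnerableMethod"));
    }
}
```

<p>With <code>main</code> as the only application entry point, static analysis alone stops at
<code>dispatch</code>. Adding the traced methods <code>main</code>, <code>dispatch</code>, and
<code>handler</code> as entry points makes <code>vulnerableMethod</code> reachable.</p>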
<p>The results from static and dynamic analyses are also used to compute update metrics
that help developers choose the best alternative when updating a vulnerable version to a
non-vulnerable version.</p>
<p>For example, the metrics consider whether the component API used by the application
changes and, as a result, whether an upgrade would result in compilation errors. If there are no
direct API calls from application methods to open source methods (so-called touchpoints), which is
typically the case with transitive dependencies, the metrics consider the stability of methods
between the version in use and the respective non-vulnerable alternative.</p>
<p>In Figure 3, 276 of 288 methods have an identical method signature and body in version
3.17 of Maven artifact <code>org.apache.poi:poi-ooxml</code>.</p>
<p>
<strong>Figure 3: Touchpoints and update metrics</strong>
</p>
<p><img src="images/6_3.png"/></p>
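<p>A stability metric of this kind can be sketched in a few lines. The code below is an
illustration, not Eclipse Steady&rsquo;s implementation. It assumes each library version is
represented as a map from method signature to some digest of the method body, and the names
<code>countStable</code> and <code>stableShare</code> are hypothetical.</p>

```java
import java.util.Map;

final class UpdateMetricSketch {
    /** Counts methods present in both versions with the same signature and body digest. */
    static int countStable(Map<String, String> versionInUse, Map<String, String> candidateVersion) {
        int stable = 0;
        for (Map.Entry<String, String> method : versionInUse.entrySet()) {
            // get() returns null if the signature no longer exists in the candidate version
            if (method.getValue().equals(candidateVersion.get(method.getKey()))) {
                stable++;
            }
        }
        return stable;
    }

    /** Share of stable methods, between 0.0 and 1.0. */
    static double stableShare(int stableMethods, int totalMethods) {
        if (totalMethods <= 0) {
            throw new IllegalArgumentException("totalMethods must be positive");
        }
        return (double) stableMethods / totalMethods;
    }
}
```

<p>With the numbers quoted above, <code>stableShare(276, 288)</code> is roughly 0.958, meaning that
about 96 percent of the methods are unchanged in the candidate version.</p>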
<h2>The Pros and Cons of a Code-Centric Approach</h2>
<p>The immediate advantage of a code-centric approach is precision in detecting vulnerable
code, no matter which archive or artifact contains it. In addition, it&rsquo;s possible to apply a
variety of software analysis techniques to determine, for example, the reachability of vulnerable
code. Future extensions of Eclipse Steady could go even further, perhaps moving toward slicing
(reducing) dependencies to the share of code that a given application uses.</p>
<p>However, we won&rsquo;t hide the fact that a code-centric approach also comes with a
cost for maintainers and users.</p>
<p>First, fix commits are not readily available for all vulnerabilities. They are
sometimes referenced by Common Vulnerabilities and Exposures (CVE) entries, as is the case with
CVE-2018-1000632 [<a href="#ref_9">9</a>]. In many other cases, they must be collected by manually searching through
issue trackers and commit histories, which can be tedious and inefficient. So far, we have
collected about 1,300 fix commits, and have made them available in a dedicated repository [<a href="#ref_14">14</a>].</p>
<p>To foster development of code-centric tools for vulnerability management independent of
Eclipse Steady, we strongly recommend that open source projects mention fix commit(s) in public
security advisories and communicate fix commits to the National Vulnerability Database (NVD) or to
MITRE Corporation for CVE entries. In other words: Community, tell the world about your fix
commits!</p>
<p>Second, the current implementation of Eclipse Steady requires fix commits to be
analyzed once by each development organization that wants to use Steady. So far, due to license
concerns, we have refrained from sharing the results of these analyses, which are basically the
project source code, in the repository [<a href="#ref_14">14</a>].</p>
<p>Third, the focus on code requires Eclipse Steady to dig deep into the specifics of
different programming languages. So far, we have only developed the full breadth of analyses for
Java. Python support is limited to detection of vulnerable code, and static and dynamic analyses
are not yet supported.</p>
<p>In comparison to Eclipse Steady, open source vulnerability scanners that rely on
metadata can be extended more easily toward different languages. The OWASP Dependency Check, for
example, fully supports Java and .NET, offers experimental support for Ruby, Node.js, and Python,
and offers limited support for C/C++ build systems [<a href="#ref_4">4</a>]. Being language-agnostic is particularly
useful in development projects that mix different programming languages because it avoids the need
for different tools.</p>
<h2>Getting Started With Eclipse Steady</h2>
<p>Today, using Eclipse Steady requires running several Docker containers to persist
vulnerability information and analysis results. This is facilitated through the provision of
Docker images on Docker Hub [<a href="#ref_15">15</a>] as well as Docker Compose files and Helm charts.</p>
<p>Next, users must analyze fix commits to populate the local PostgreSQL database with
detailed information about vulnerable methods (signatures and abstract syntax trees). This is
typically done for fix commits of open source projects, such as the ones shared through the SAP
repository. However, it can also be done for fix commits of proprietary software projects
maintained in private source code repositories.</p>
<p>Once these steps are complete, Java applications can be scanned using, for example, the
plugins for Maven and Gradle. Necessary configuration parameters include the URL of the backend
service as well as the token of a so-called workspace, which serves as a container for scan
results.</p>
<p>Assuming the template Maven profile [<a href="#ref_17">17</a>] has been included in the application&rsquo;s
pom.xml file, detection of vulnerable code can be triggered using the following Maven command:</p>
<p><code>mvn -Dvulas compile vulas:app</code></p>
<p>The static analysis starting from application code can be triggered as follows:</p>
<p><code>mvn -Dvulas compile vulas:a2c</code></p>
<p>More information about the various plugin goals can be found in the comprehensive user
manual available through the SAP GitHub [<a href="#ref_16">16</a>].</p>
<h2>Looking Ahead</h2>
<p>If you&rsquo;re asking yourself whether Eclipse Steady is ready for production, the
clear answer is yes because SAP has been successfully running the code for almost three years.
However, the effort required to continuously operate Steady in a private cloud, provide user
support, and maintain the vulnerability database keeps two engineers busy full time. Even if we
assume it will become easier to identify fix commits in the future, the time and effort required
exceeds the capacity of individual software developers and small development organizations.</p>
<p>We understand that these resource requirements inhibit tool adoption, so future
developments must try to lower the barrier.</p>
<p>First, we aim to improve management and synchronization of fix commits. It should be as
easy as possible for open source project maintainers to contribute new fix commits to repositories
[<a href="#ref_14">14</a>]. For users of Eclipse Steady, the local synchronization and analysis must happen in a
completely automated fashion.</p>
<p>Second, we aim to develop a version that makes the presence of an always-on central
component optional. In other words, users will be able to scan their applications without the need
to operate Docker containers. At the same time, bigger software organizations will be able to run
Steady with a central backend. This approach gives these organizations several interesting
features, including trend analyses and the potential to find all applications affected by a given
vulnerability.</p>
<p>Third, once the signature of a vulnerable method is found in a Java archive, Steady
compares its method body (Java bytecode) with the method bodies obtained from the source code
repository of the respective open source project (Java source code). Today, Eclipse Steady
requires running a periodic batch job that uses different strategies to perform this comparison
[<a href="#ref_11">11</a>]. The current implementation, however, does not always find an answer, in which case manual
intervention is required. To overcome this problem, we aim to develop a better bytecode-to-source-code
comparison, possibly using intermediate code representations such as Soot Jimple [<a href="#ref_13">13</a>].</p>
<h2>References</h2>
<p id="ref_1">
[1]&nbsp;<a
href="https://www.owasp.org/index.php/Top_10-2017_A9-Using_Components_with_Known_Vulnerabilities"
>https://www.owasp.org/index.php/Top_10-2017_A9-Using_Components_with_Known_Vulnerabilities</a>
</p>
<p id="ref_2">
[2]&nbsp;<a
href="https://www.owasp.org/index.php/Top_10_2013-A9-Using_Components_with_Known_Vulnerabilities"
>https://www.owasp.org/index.php/Top_10_2013-A9-Using_Components_with_Known_Vulnerabilities</a>
</p>
<p id="ref_3">
[3]&nbsp;<a href="https://github.com/retirejs/retire.js/">https://github.com/retirejs/retire.js/</a>
</p>
<p id="ref_4">
[4]&nbsp;<a href="https://www.owasp.org/index.php/OWASP_Dependency_Check">https://www.owasp.org/index.php/OWASP_Dependency_Check</a>
</p>
<p id="ref_5">
[5]&nbsp;<a
href="https://www.gartner.com/en/documents/3971011/technology-insight-for-software-composition-analysis"
>https://www.gartner.com/en/documents/3971011/technology-insight-for-software-composition-analysis</a>
</p>
<p id="ref_6">
[6]&nbsp;<a href="https://sap.github.io/vulnerability-assessment-tool/">https://sap.github.io/vulnerability-assessment-tool/</a>
</p>
<p id="ref_7">
[7]&nbsp;<a href="https://projects.eclipse.org/projects/technology.steady">https://projects.eclipse.org/projects/technology.steady</a>
</p>
<p id="ref_8">
[8]&nbsp;<a href="https://github.com/dom4j/dom4j/commit/e598e">https://github.com/dom4j/dom4j/commit/e598e</a>
</p>
<p id="ref_9">
[9]&nbsp;<a href="https://nvd.nist.gov/vuln/detail/CVE-2018-1000632">https://nvd.nist.gov/vuln/detail/CVE-2018-1000632</a>
</p>
<p id="ref_10">
[10]&nbsp;<a href="https://arxiv.org/pdf/1504.04971">https://arxiv.org/pdf/1504.04971</a>
</p>
<p id="ref_11">
[11]&nbsp;<a href="https://arxiv.org/abs/1806.05893">https://arxiv.org/abs/1806.05893</a>
</p>
<p id="ref_12">
[12]&nbsp;<a href="https://github.com/wala/WALA">https://github.com/wala/WALA</a>
</p>
<p id="ref_13">
[13]&nbsp;<a href="http://sable.github.io/soot/">http://sable.github.io/soot/</a>
</p>
<p id="ref_14">
[14]&nbsp;<a href="https://github.com/SAP/vulnerability-assessment-kb">https://github.com/SAP/vulnerability-assessment-kb</a>
</p>
<p id="ref_15">
[15]&nbsp;<a href="https://hub.docker.com/u/vulas">https://hub.docker.com/u/vulas</a>
</p>
<p id="ref_16">
[16]&nbsp;<a href="https://sap.github.io/vulnerability-assessment-tool/user/manuals/">https://sap.github.io/vulnerability-assessment-tool/user/manuals/</a>
</p>
<p id="ref_17">
[17]&nbsp;<a href="https://sap.github.io/vulnerability-assessment-tool/user/manuals/setup/#maven">https://sap.github.io/vulnerability-assessment-tool/user/manuals/setup/#maven</a>
</p>
<p id="ref_18">
[18]&nbsp;<a
href="https://cdn2.hubspot.net/hub/203759/file-1100864196-pdf/docs/Contrast_-_Insecure_Libraries_2014.pdf"
>https://cdn2.hubspot.net/hub/203759/file-1100864196-pdf/docs/Contrast_-_Insecure_Libraries_2014.pdf</a>
</p>
<p id="ref_19">
[19]&nbsp;<a href="https://snyk.io/blog/owasp-top-10-breaches/">https://snyk.io/blog/owasp-top-10-breaches/</a>
</p>
<div class="bottomitem">
<h3>About the Author</h3>
<div class="row">
<div class="col-sm-12">
<div class="row">
<div class="col-sm-8">
<img class="img-responsive"
src="/community/eclipse_newsletter/2019/december/images/henrik.png" alt="Henrik Plate"
/>
</div>
<div class="col-sm-16">
<p class="author-name">Henrik Plate</p>
<p>Senior Researcher<br>
SAP Security Research</p>
</div>
</div>
</div>
</div>
</div>