| <?php |
| /******************************************************************************* |
| * Copyright (c) 2015 Eclipse Foundation and others. |
| * All rights reserved. This program and the accompanying materials |
| * are made available under the terms of the Eclipse Public License v1.0 |
| * which accompanies this distribution, and is available at |
| * http://eclipse.org/legal/epl-v10.html |
| * |
| * Contributors: |
| * Eric Poirier (Eclipse Foundation) - Initial implementation |
| *******************************************************************************/ |
| ?> |
| |
| <h1 class="article-title"><?php echo $pageTitle; ?></h1> |
| <h2>John D. McGregor, J. Yates Monteith, John E. Ingram</h2> |
| <p> |
| <i>Strategic Software Engineering Research Group<br> Clemson |
| University<br> Clemson, SC 29634<br> {johnmc, jymonte, |
| jei}@clemson.edu |
| </i> |
| </p> |
| |
| <p>The science research enterprise – including organizations such as |
| universities, companies, and federal agencies – supports the |
| development of a large amount of software. In some cases, a large |
| community of scientific users comes to depend on the continued |
| availability of one of these software systems or one of its |
| constituent parts. Examples of these systems include Hadoop, R, |
| Eclipse, and many more. In the case of open-source software, much |
| of the software and software systems developed for scientific |
| users depends on numerous software packages, some with a long |
| lineage of “parent” software projects. The future of the system |
| being developed depends on these components being maintained, but |
| there are just too many open source software systems for |
| universities, companies, or federal agencies to support all of |
| them. The science research enterprise must strategically choose |
| which software systems to develop, support, and maintain and which |
| to petition the original producers to maintain.</p> |
| <p>When a software tool becomes popular outside the research group |
| that developed it, the continued use of the software system is a |
| point of risk for the advancement of scientific goals. Scientific |
| outcomes are dependent on the continued support of not just the |
| target software package, but also on the continued maintenance of |
| the ecosystem of software packages upon which a product depends. |
| When decisions must be made about continued funding for these |
| research projects, these decisions should be partially based on |
| the quality and availability of the supporting software |
| infrastructure and the proposed software’s future impact on its |
| scientific community.</p> |
| <p>Our operating premise is that software that is supported by a |
| healthy ecosystem [8] will be nurtured and sustained. This is |
| easier for “Big Science” projects [3] that involve professional |
| staff than it is for projects with one or two senior investigators |
| and a few graduate students. GitHub and similar development |
| support infrastructure facilitate some mechanical tasks, but small |
| groups may not have a computing specialist and may have a hard |
| time identifying and understanding how to use a robust |
| infrastructure. There is a substantial difference between a |
| warehouse such as GitHub, which stores discrete pieces of |
| software, and a development community, which stores software that |
| contributes to the specific products developed by the community.</p> |
| |
| <p> |
| The National Science Foundation (NSF) report <b><u>A VISION AND |
| STRATEGY FOR SOFTWARE FOR SCIENCE, ENGINEERING, AND EDUCATION</u></b> |
| [9] recommends that NSF “Support the creation and maintenance of |
| an innovative, integrated, reliable, sustainable and accessible |
| ecosystem of software and services that advances scientific |
| inquiry and application at unprecedented complexity and scale.” |
| Taking a software ecosystem approach addresses both organizational |
| issues and technical issues [2, 6]. An analysis of the ecosystem |
| surrounding a project could assist in evaluating requests for |
| funding software development. Such an analysis should include an |
| evaluation of the strength of the community support for the |
| software and the software’s fit with the larger context as defined |
| by the ecosystem’s architecture. The community’s contributions to |
| the software through add-ons, testing, and other continuing |
| activities are an important factor [5]. The analysis might also |
| include an evaluation of the product itself through the quality of |
| the code, architecture, and supporting elements such as automated |
| test cases [5]. |
| </p> |
| <p>This strategic view can be difficult to motivate in basic |
| scientific research projects where the return on investment is |
| even more indirect than for an open source product. The impacts of |
| a research project and its intellectual merit should be considered |
| in the context of value chain analysis to point out the balance |
| between cost and value. Evaluating the potential of start-up |
| companies and patents securing research results would also |
| strengthen the business case.</p> |
| <p>This is not simply an economic issue. Scientific research must be |
| reproducible. Changes to libraries somewhere in the supply chain |
| may affect results and be virtually impossible to trace. Having |
| access to the entire supply chain is essential to reproducibility. |
| A scientific software ecosystem should support reproducibility, as |
| does a commercial product development environment, by providing |
| meta-data that identifies the exact tool chain and software |
| component chains used to produce a specific set of results.</p> |
| <p> |
| There are numerous other issues regarding the sustainability of |
| scientific research software. Many of these issues have |
| surfaced in the Workshop on Sustainable Software for Science: |
| Practice and Experiences (WSSSPE) series (<a |
| target="_blank" |
| href="http://wssspe.researchcomputing.org.uk/wssspe2/cfp/">http://wssspe.researchcomputing.org.uk/wssspe2/cfp/</a>). |
| For example, Allen and Schmidt pointed out issues with |
| establishing a repository of code for a discipline including the |
| need for meta-data curation and giving the repository sufficient |
| within the discipline. They state that “the greatest inhibitors |
| relate to human nature, including the unwillingness of scientists |
| to share their codes openly, the effect of the lack of an adequate |
| reward system for software authorship, and the competitive |
| environment in astronomy” [1]. Habermann et al. [4] look at |
| sustainability from the point of view of data: “In order to be |
| sustainable in the long-term, data must be preserved in |
| well-documented, self-describing formats accessible on multiple |
| platforms using many programming languages.” |
| </p> |
| <p>Clemson University, a longtime member of the Eclipse Foundation, |
| joined the Eclipse Science Working Group with the goal of |
| participating in the formation of a model ecosystem that sustains |
| scientific research software for a domain. As part of a National |
| Science Foundation funded project, we have already produced |
| several studies and modified our ecosystem modeling technique to |
| facilitate understanding the available software within an |
| ecosystem [6, 7]. We look forward to participating in growing and |
| maturing the community and to raising awareness of the issues and |
| potential solutions to developing long-lived scientific research |
| software.</p> |
| <p>This work was partially funded by the National Science Foundation |
| grant #ACI-1343033.</p> |
| <ol> |
| <li>Alice Allen and Judy Schmidt. Looking before leaping: Creating |
| a software registry. http://arxiv.org/abs/1407.5378, 2014.</li> |
| <li>G. Chastek and J. D. McGregor, “It takes an ecosystem,” SSTC, |
| 2012.</li> |
| <li>The CRASH Report - 2011/12 (CAST Report on Application |
| Software Health), <a target="_blank" |
| href="http://www.castsoftware.com/resources/resource/whitepapers/cast-report-on-application-software-health?gad=otd">http://www.castsoftware.com/resources/resource/whitepapers/cast-report-on-application-software-health?gad=otd</a>. |
| </li> |
| <li>Habermann, Ted; Collette, Andrew; Vincena, Steve; Billings, |
| Jay Jay; Gerring, Matt; Hinsen, Konrad; Benger, Werner; Maia, |
| Filipe RNC; Byna, Suren; de Buyl, Pierre (2014): The |
| Hierarchical Data Format (HDF): A Foundation for Sustainable |
| Data and Software. <a target="_blank" |
| href="http://figshare.com/articles/The_Hierarchical_Data_Format_HDF_A_Foundation_for_Sustainable_Data_and_Software/1112485">http://dx.doi.org/10.6084/m9.figshare.1112485</a>. |
| </li> |
| <li>John D. McGregor. A method for analyzing software product line |
| ecosystems. First International Workshop on Software Ecosystems, |
| 73-80, 2008.</li> |
| <li>John Yates Monteith, John D. McGregor, and John E. Ingram. |
| 2014. Proposed metrics on ecosystem health. In Proceedings of |
| the 2014 ACM international workshop on Software-defined |
| ecosystems (BigSystem '14). ACM, New York, NY, USA, 33-36. |
| DOI=10.1145/2609441.2609643 <a target="_blank" |
| href="http://dl.acm.org/citation.cfm?doid=2609441.2609643">http://doi.acm.org/10.1145/2609441.2609643</a>. |
| </li> |
| <li>J. Yates Monteith, John D. McGregor, and John E. Ingram. 2014. |
| Scientific Research Software Ecosystems. In Proceedings of the |
| 2014 European Conference on Software Architecture Workshops |
| (ECSAW '14). ACM, New York, NY, USA, Article 9, 6 pages. |
| DOI=10.1145/2642803.2642812 <a target="_blank" |
| href="http://dl.acm.org/citation.cfm?doid=2642803.2642812">http://doi.acm.org/10.1145/2642803.2642812</a>. |
| </li> |
| <li>David G. Messerschmitt and Clemens Szyperski (2003). Software |
| Ecosystem: Understanding an Indispensable Technology and |
| Industry. Cambridge, MA, USA: MIT Press.</li> |
| <li>National Science Foundation, A VISION AND STRATEGY FOR |
| SOFTWARE FOR SCIENCE, ENGINEERING, AND EDUCATION: |
| CYBERINFRASTRUCTURE FRAMEWORK FOR THE 21ST CENTURY, <a |
| target="_blank" |
| href="http://www.nsf.gov/pubs/2012/nsf12113/nsf12113.pdf">www.nsf.gov/pubs/2012/nsf12113/nsf12113.pdf</a>. |
| </li> |
| </ol> |
| |
| <div class="bottomitem"> |
| <h3>About the Authors</h3> |
| |
| <div class="row"> |
| <div class="col-sm-12"> |
| <div class="row"> |
| <div class="col-sm-8"> |
| |
| </div> |
| <div class="col-sm-16"> |
| <p class="author-name"> |
| John D. McGregor<br /> |
| <a target="_blank" href="http://www.clemson.edu/">Clemson |
| University</a> |
| </p> |
| <ul class="author-link"> |
| </ul> |
| </div> |
| </div> |
| </div> |
| </div> |
| </div> |
| |