<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="keywords" content="SMILA/Specifications/Partitioning Storages" />
<link rel="shortcut icon" href="http://wiki.eclipse.org/SMILA/Specifications/favicon.ico" />
<link rel="search" type="application/opensearchdescription+xml" href="http://wiki.eclipse.org/opensearch_desc.php" title="Eclipsepedia (English)" />
<link rel="alternate" type="application/rss+xml" title="Eclipsepedia RSS Feed" href="http://wiki.eclipse.org/index.php?title=Special:Recentchanges&amp;feed=rss" />
<link rel="alternate" type="application/atom+xml" title="Eclipsepedia Atom Feed" href="http://wiki.eclipse.org/index.php?title=Special:Recentchanges&amp;feed=atom" />
<title>SMILA/Specifications/Partitioning Storages - Eclipsepedia</title>
<style type="text/css" media="screen,projection">/*<![CDATA[*/ @import "/skins/eclipsenova/novaWide.css?116"; /*]]>*/</style>
<link rel="stylesheet" type="text/css" media="print" href="http://wiki.eclipse.org/skins/eclipsenova/eclipsenovaPrint.css?116" />
<link rel="stylesheet" type="text/css" media="handheld" href="http://wiki.eclipse.org/skins/eclipsenova/handheld.css?116" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/header.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/tabs.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/visual.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/layout.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/footer.css" media="screen" />
<!--[if IE]><link rel="stylesheet" type="text/css" href="/skins/eclipsenova/IEpngfix.css" media="screen" /><![endif]-->
<!--[if lt IE 5.5000]><style type="text/css">@import "/skins/eclipsenova/IE50Fixes.css?116";</style> <![endif]-->
<!--[if IE 5.5000]><style type="text/css">@import "/skins/eclipsenova/IE55Fixes.css?116";</style><![endif]-->
<!--[if IE 6]><style type="text/css">@import "/skins/eclipsenova/IE60Fixes.css?116";</style><![endif]-->
<!--[if IE 7]><style type="text/css">@import "/skins/eclipsenova/IE70Fixes.css?116";</style><![endif]-->
<!--[if lt IE 7]><script type="text/javascript" src="/skins/common/IEFixes.js?116"></script>
<meta http-equiv="imagetoolbar" content="no" /><![endif]-->
<script type= "text/javascript">/*<![CDATA[*/
var skin = "eclipsenova";
var stylepath = "/skins";
var wgArticlePath = "/$1";
var wgScriptPath = "";
var wgScript = "/index.php";
var wgServer = "http://wiki.eclipse.org";
var wgCanonicalNamespace = "";
var wgCanonicalSpecialPageName = false;
var wgNamespaceNumber = 0;
var wgPageName = "SMILA/Specifications/Partitioning_Storages";
var wgTitle = "SMILA/Specifications/Partitioning Storages";
var wgAction = "view";
var wgRestrictionEdit = [];
var wgRestrictionMove = [];
var wgArticleId = "16740";
var wgIsArticle = true;
var wgUserName = null;
var wgUserGroups = null;
var wgUserLanguage = "en";
var wgContentLanguage = "en";
var wgBreakFrames = false;
var wgCurRevisionId = "286049";
var wgVersion = "1.12.0";
var wgEnableAPI = true;
var wgEnableWriteAPI = false;
/*]]>*/</script>
<script type="text/javascript" src="http://wiki.eclipse.org/skins/common/wikibits.js?116"><!-- wikibits js --></script>
<!-- Performance mods similar to those for bug 166401 -->
<script type="text/javascript" src="http://wiki.eclipse.org/index.php?title=-&amp;action=raw&amp;gen=js&amp;useskin=eclipsenova"><!-- site js --></script>
<!-- Head Scripts -->
<script type="text/javascript" src="http://wiki.eclipse.org/skins/common/ajax.js?116"></script>
<style type="text/css">/*<![CDATA[*/
.source-xml {line-height: normal; font-size: medium;}
.source-xml li {line-height: normal;}
/**
* GeSHi Dynamically Generated Stylesheet
* --------------------------------------
* Dynamically generated stylesheet for xml
* CSS class: source-xml, CSS id:
* GeSHi (C) 2004 - 2007 Nigel McNie (http://qbnz.com/highlighter)
*/
.source-xml .de1, .source-xml .de2 {font-family: 'Courier New', Courier, monospace; font-weight: normal;}
.source-xml {}
.source-xml .head {}
.source-xml .foot {}
.source-xml .imp {font-weight: bold; color: red;}
.source-xml .ln-xtra {color: #cc0; background-color: #ffc;}
.source-xml li {font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;}
.source-xml li.li2 {font-weight: bold;}
.source-xml .coMULTI {color: #808080; font-style: italic;}
.source-xml .es0 {color: #000099; font-weight: bold;}
.source-xml .br0 {color: #66cc66;}
.source-xml .st0 {color: #ff0000;}
.source-xml .nu0 {color: #cc66cc;}
.source-xml .sc0 {color: #00bbdd;}
.source-xml .sc1 {color: #ddbb00;}
.source-xml .sc2 {color: #339933;}
.source-xml .sc3 {color: #009900;}
.source-xml .re0 {color: #000066;}
.source-xml .re1 {font-weight: bold; color: black;}
.source-xml .re2 {font-weight: bold; color: black;}
/*]]>*/
</style>
<style type="text/css">/*<![CDATA[*/
@import "/index.php?title=MediaWiki:Geshi.css&usemsgcache=yes&action=raw&ctype=text/css&smaxage=18000";
/*]]>*/
</style><style type="text/css">/*<![CDATA[*/
.source-java {line-height: normal; font-size: medium;}
.source-java li {line-height: normal;}
/**
* GeSHi Dynamically Generated Stylesheet
* --------------------------------------
* Dynamically generated stylesheet for java
* CSS class: source-java, CSS id:
* GeSHi (C) 2004 - 2007 Nigel McNie (http://qbnz.com/highlighter)
*/
.source-java .de1, .source-java .de2 {font-family: 'Courier New', Courier, monospace; font-weight: normal;}
.source-java {}
.source-java .head {}
.source-java .foot {}
.source-java .imp {font-weight: bold; color: red;}
.source-java .ln-xtra {color: #cc0; background-color: #ffc;}
.source-java li {font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;}
.source-java li.li2 {font-weight: bold;}
.source-java .kw1 {color: #7F0055; font-weight: bold;}
.source-java .kw2 {color: #7F0055; font-weight: bold;}
.source-java .kw3 {color: #000000; font-weight: normal}
.source-java .kw4 {color: #7F0055; font-weight: bold;}
.source-java .co1 {color: #3F7F5F; font-style: italic;}
.source-java .co2 {color: #3F7F5F;}
.source-java .co3 {color: #3F7F5F; font-style: italic; font-weight: bold;}
.source-java .coMULTI {color: #3F5FBF; font-style: italic;}
.source-java .es0 {color: #000000;}
.source-java .br0 {color: #000000;}
.source-java .st0 {color: #2A00ff;}
.source-java .nu0 {color: #000000;}
.source-java .me1 {color: #000000;}
.source-java .me2 {color: #000000;}
/*]]>*/
</style>
<style type="text/css">/*<![CDATA[*/
@import "/index.php?title=MediaWiki:Geshi.css&usemsgcache=yes&action=raw&ctype=text/css&smaxage=18000";
/*]]>*/
</style> </head>
<body class="mediawiki ns-0 ltr page-SMILA_Specifications_Partitioning_Storages">
<div id="globalWrapper">
<div id="column-one">
<!-- Eclipse Additions for the Top Nav start here M. Ward-->
<div id="header">
<div id="header-graphic">
<img src="http://wiki.eclipse.org/skins/eclipsenova/eclipse.png" alt="Eclipse Wiki">
</div>
<!-- Pulled 101409 Mward -->
<div id="header-icons">
<div id="sites">
<ul id="sitesUL">
<li><a href="http://www.eclipse.org"><img src="http://dev.eclipse.org/custom_icons/eclipseIcon.png" width="28" height="28" alt="Eclipse Foundation" title="Eclipse Foundation" /><div>Eclipse Foundation</div></a></li>
<li><a href="http://marketplace.eclipse.org"><img src="http://dev.eclipse.org/custom_icons/marketplace.png" width="28" height="28" alt="Eclipse Marketplace" title="Eclipse Marketplace" /><div>Eclipse Marketplace</div></a></li>
<li><a href="https://bugs.eclipse.org/bugs"><img src="http://dev.eclipse.org/custom_icons/system-search-bw.png" width="28" height="28" alt="Bugzilla" title="Bugzilla" /><div>Bugzilla</div></a></li>
<li><a href="http://live.eclipse.org"><img src="http://dev.eclipse.org/custom_icons/audio-input-microphone-bw.png" width="28" height="28" alt="Live" title="Live" /><div>Eclipse Live</div></a></li>
<li><a href="http://planeteclipse.org"><img src="http://dev.eclipse.org/large_icons/devices/audio-card.png" width="28" height="28" alt="PlanetEclipse" title="Planet" /><div>Planet Eclipse</div></a></li>
<li><a href="http://portal.eclipse.org"><img src="http://dev.eclipse.org/custom_icons/preferences-system-network-proxy-bw.png" width="28" height="28" alt="Portal" title="Portal" /><div>My Foundation Portal</div></a></li>
</ul>
</div>
</div>
</div>
<!-- NEW HEADER STUFF HERE -->
<div id="header-menu">
<div id="header-nav">
<ul> <li><a class="first_one" href="http://wiki.eclipse.org/" target="_self">Home</a></li> <li><a href="http://www.eclipse.org/downloads/" target="_self">Downloads</a></li>
<li><a href="http://www.eclipse.org/users/" target="_self">Users</a></li>
<li><a href="http://www.eclipse.org/membership/" target="_self">Members</a></li>
<li><a href="http://wiki.eclipse.org/index.php/Development_Resources" target="_self">Committers</a></li>
<li><a href="http://www.eclipse.org/resources/" target="_self">Resources</a></li>
<li><a href="http://www.eclipse.org/projects/" target="_self">Projects</a></li>
<li><a href="http://www.eclipse.org/org/" target="_self">About Us</a></li>
</ul>
</div>
<div id="header-utils">
<!-- moved the search window here -->
<form action="http://wiki.eclipse.org/Special:Search" >
<input class="input" name="search" type="text" accesskey="f" value="" />
<input type='submit' onclick="this.submit();" name="go" id="searchGoButton" class="button" title="Go to a page with this exact name if one exists" value="Go" />&nbsp;
<input type='submit' onclick="this.submit();" name="fulltext" class="button" id="mw-searchButton" title="Search Eclipsepedia for this text" value="Search" />
</form>
</div>
</div>
<!-- Eclipse Additions for the Header stop here -->
<!-- Additions and mods for leftside nav Start here -->
<!--Started nav rip here-->
<!-- these are the nav controls main page, changes etc -->
<div id="novaContent" class="faux">
<div id="leftcol">
<ul id="leftnav">
<!-- these are the page controls, edit history etc -->
<li class="separator"><a class="separator">Navigation &#160;&#160;</a></li>
<li id="n-mainpage"><a href="http://wiki.eclipse.org/Main_Page">Main Page</a></li>
<li id="n-portal"><a href="http://wiki.eclipse.org/Eclipsepedia:Community_Portal">Community portal</a></li>
<li id="n-currentevents"><a href="http://wiki.eclipse.org/Eclipsepedia:Current_events">Current events</a></li>
<li id="n-recentchanges"><a href="http://wiki.eclipse.org/Special:Recentchanges">Recent changes</a></li>
<li id="n-randompage"><a href="http://wiki.eclipse.org/Special:Random">Random page</a></li>
<li id="n-help"><a href="http://wiki.eclipse.org/Help:Contents">Help</a></li>
<li class="separator"><a class="separator">Toolbox &#160;&#160;</a></li>
<li id="t-whatlinkshere"><a href="http://wiki.eclipse.org/Special:Whatlinkshere/SMILA/Specifications/Partitioning_Storages">What links here</a></li>
<li id="t-recentchangeslinked"><a href="http://wiki.eclipse.org/Special:Recentchangeslinked/SMILA/Specifications/Partitioning_Storages">Related changes</a></li>
<!-- This is the toolbox section -->
<li id="t-upload"><a href="http://wiki.eclipse.org/Special:Upload">Upload file</a></li>
<li id="t-specialpages"><a href="http://wiki.eclipse.org/Special:Specialpages">Special pages</a></li>
<li id="t-print"><a href="http://wiki.eclipse.org/index.php?title=SMILA/Specifications/Partitioning_Storages&amp;printable=yes">Printable version</a></li> <li id="t-permalink"><a href="http://wiki.eclipse.org/index.php?title=SMILA/Specifications/Partitioning_Storages&amp;oldid=286049">Permanent link</a></li> </ul>
</div>
<!-- Additions and mods for leftside nav End here -->
<div id="column-content">
<div id="content">
<a name="top" id="top"></a>
<script type="text/javascript"> if (window.isMSIE55) fixalpha(); </script>
<h1 class="firstHeading">SMILA/Specifications/Partitioning Storages</h1>
<div id="bodyContent">
<h3 id="siteSub">From Eclipsepedia</h3>
<!-- start content -->
<div class="messagebox" style="background-color: #def3fe; border: 1px solid #c5d7e0; color: black; padding: 5px; margin: 1ex 0; min-height: 35px; padding-left: 45px;">
<div style="float: left; margin-left: -40px;"><a href="http://wiki.eclipse.org/Image:Note.png" class="image" title="Note.png"><img alt="" src="http://wiki.eclipse.org/images/c/cc/Note.png" width="35" height="35" border="0" /></a></div>
<div><b>This page is out of date, and I'm not sure if we still need it at all. It should be either rewritten or deleted.</b><br /></div>
</div>
<table id="toc" class="toc" summary="Contents"><tr><td><div id="toctitle"><h2>Contents</h2></div>
<ul>
<li class="toclevel-1"><a href="Partitioning_Storages.html#Implementing_Storage_Points_for_Data_Backup_and_Reusing"><span class="tocnumber">1</span> <span class="toctext">Implementing Storage Points for Data Backup and Reusing</span></a>
<ul>
<li class="toclevel-2"><a href="Partitioning_Storages.html#Implementing_Storage_Points"><span class="tocnumber">1.1</span> <span class="toctext">Implementing Storage Points</span></a>
<ul>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Requirements"><span class="tocnumber">1.1.1</span> <span class="toctext">Requirements</span></a></li>
</ul>
</li>
<li class="toclevel-2"><a href="Partitioning_Storages.html#Architecture_overview"><span class="tocnumber">1.2</span> <span class="toctext">Architecture overview</span></a></li>
<li class="toclevel-2"><a href="Partitioning_Storages.html#Proposed_changes"><span class="tocnumber">1.3</span> <span class="toctext">Proposed changes</span></a>
<ul>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Storage_points_configuration"><span class="tocnumber">1.3.1</span> <span class="toctext">Storage points configuration</span></a></li>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Alternative:_Storage_Point_ID_as_OSGi_service_property"><span class="tocnumber">1.3.2</span> <span class="toctext">Alternative: Storage Point ID as OSGi service property</span></a></li>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Configuring_storage_points_for_DFP"><span class="tocnumber">1.3.3</span> <span class="toctext">Configuring storage points for DFP</span></a></li>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Passing_the_Storage-Location_to_BPEL.2FPipelets.2FBlackboard"><span class="tocnumber">1.3.4</span> <span class="toctext">Passing the Storage-Location to BPEL/Pipelets/Blackboard</span></a></li>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Changes_in_the_Blackboard_service"><span class="tocnumber">1.3.5</span> <span class="toctext">Changes in the Blackboard service</span></a></li>
</ul>
</li>
<li class="toclevel-2"><a href="Partitioning_Storages.html#Partitions"><span class="tocnumber">1.4</span> <span class="toctext">Partitions</span></a>
<ul>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Requirements_2"><span class="tocnumber">1.4.1</span> <span class="toctext">Requirements</span></a></li>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Changes_in_Storages_API"><span class="tocnumber">1.4.2</span> <span class="toctext">Changes in Storages API</span></a></li>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Alternative_implenmentation_using_OSGi_services"><span class="tocnumber">1.4.3</span> <span class="toctext">Alternative implementation using OSGi services</span></a></li>
<li class="toclevel-3"><a href="Partitioning_Storages.html#Proposed_further_changes"><span class="tocnumber">1.4.4</span> <span class="toctext">Proposed further changes</span></a></li>
</ul>
</li>
</ul>
</li>
</ul>
</td></tr></table><script type="text/javascript"> if (window.showTocToggle) { var tocShowText = "show"; var tocHideText = "hide"; showTocToggle(); } </script>
<a name="Implementing_Storage_Points_for_Data_Backup_and_Reusing"></a><h2> <span class="mw-headline"> Implementing Storage Points for Data Backup and Reusing </span></h2>
<a name="Implementing_Storage_Points"></a><h3> <span class="mw-headline"> Implementing Storage Points </span></h3>
<a name="Requirements"></a><h4> <span class="mw-headline"> Requirements </span></h4>
<p>The core of the SMILA framework consists of a pipeline in which data is pushed into a queue, from which it is fed into data flow processors (DFPs). The requirement for storage points is that it should be possible to store records in a specific “storage point” after each DFP. A storage point in this case means a storage configuration where data is saved; for example, it can be a partition in a local storage (see Chapter 2 for more information) or a remote storage.
Storage points should be configured in the following way:
</p>
<ol><li> Each DFP should have a configuration for the storage point from which data is loaded before BPEL processing (the “Input” storage point);
</li><li> Optionally, it should be possible to configure the storage point where data is stored after BPEL processing (the “Output” storage point). If this configuration is omitted, data should not be stored to any storage point after BPEL processing;
</li><li> If the “Input” and “Output” storage points have the same configuration, the data in the “Input” storage point should be overwritten after BPEL processing.
</li></ol>
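<p>As an illustration, the three cases could be expressed with the SourceStoragePoint and TargetStoragePoint rule attributes proposed later on this page. This is only a sketch: the rule names and the storage point IDs are made up, and the other rule attributes and the rule bodies are omitted.
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml">&lt;!-- Input and output differ: load from &quot;crawled&quot;, store results in &quot;enriched&quot; --&gt;
&lt;Rule Name=&quot;Enrich Rule&quot; SourceStoragePoint=&quot;crawled&quot; TargetStoragePoint=&quot;enriched&quot;&gt;
...
&lt;/Rule&gt;
&lt;!-- Output omitted: load from &quot;enriched&quot;, store nothing after BPEL processing --&gt;
&lt;Rule Name=&quot;Index Rule&quot; SourceStoragePoint=&quot;enriched&quot;&gt;
...
&lt;/Rule&gt;
&lt;!-- Input equals output: records in &quot;crawled&quot; are overwritten after BPEL processing --&gt;
&lt;Rule Name=&quot;Normalize Rule&quot; SourceStoragePoint=&quot;crawled&quot; TargetStoragePoint=&quot;crawled&quot;&gt;
...
&lt;/Rule&gt;</pre></div>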
<p>The goal of these modifications is that information stored in storage points can be accessed at any later time. This will solve the following problems:
</p>
<ol><li> Backup and recovery: It will be possible to make a backup copy of a specific state of the data
</li><li> Failure recovery during processing: Some DFPs involve expensive processing. With storage points it will be possible to continue processing from one of the previously saved states if a DFP fails
</li><li> Reusing data collected from previous DFPs: Data that results from executing a DFP sequence can be saved to a storage point and reused later
</li><li> Easy data management: It will be possible to easily manage saved states of data, for example to delete or move a storage point that contains obsolete data
</li></ol>
<p>However, none of this should make the basic configuration of a SMILA system more complicated: if one does not care about multiple storages at all, it should not be necessary to configure storage points throughout the configuration files; everything should work with the defaults.
</p>
<a name="Architecture_overview"></a><h3> <span class="mw-headline"> Architecture overview </span></h3>
<p>The following figure shows an overview of the core components of the SMILA framework:
</p><p><a href="http://wiki.eclipse.org/Image:SMILA-storagepoints-architecture-overview.png" class="image" title="SMILA Architecture Overview"><img alt="SMILA Architecture Overview" src="http://wiki.eclipse.org/images/c/c4/SMILA-storagepoints-architecture-overview.png" width="640" height="625" border="0" /></a>
</p><p>To support the above requirements, the components shown in Figure 1 should be changed in the following way:
</p><p>A. There should be a way to configure storage points for each DFP;
</p><p>B. The Blackboard service should be changed to handle storage points.
</p>
<a name="Proposed_changes"></a><h3> <span class="mw-headline"> Proposed changes </span></h3>
<a name="Storage_points_configuration"></a><h4> <span class="mw-headline"> Storage points configuration </span></h4>
<p>It is proposed to use an XML configuration file to configure storage points. Storage points will be identified by name, and the whole configuration will look like the following:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml"><span class="sc3"><span class="re1">&lt;StoragePoints<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;StoragePoint</span> <span class="re0">Id</span>=<span class="st0">&quot;point1&quot;</span><span class="re2">&gt;</span></span>
<span class="coMULTI">&lt;!-- storage point parameters, e.g. storage interface, partition, etc. --&gt;</span>
<span class="sc3"><span class="re1">&lt;/StoragePoint<span class="re2">&gt;</span></span></span>
...
<span class="sc3"><span class="re1">&lt;/StoragePoints<span class="re2">&gt;</span></span></span></pre></div>
<p>For example:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml">...
<span class="sc3"><span class="re1">&lt;StoragePoint</span> <span class="re0">Id</span>=<span class="st0">&quot;point1&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;XmlStorage</span> <span class="re0">Service</span>=<span class="st0">&quot;XmlStorageService&quot;</span> <span class="re0">Partition</span>=<span class="st0">&quot;A&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;BinaryStorage</span> <span class="re0">Service</span>=<span class="st0">&quot;BinaryStorageService&quot;</span> <span class="re0">Partition</span>=<span class="st0">&quot;B&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/StoragePoint<span class="re2">&gt;</span></span></span>
...</pre></div>
<p>A user can define a default storage point, which is used whenever no specific storage point ID is defined:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml">...
<span class="sc3"><span class="re1">&lt;StoragePoint</span> <span class="re0">Id</span>=<span class="st0">&quot;DefaultStoragePoint&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;XmlStorage</span> <span class="re0">Service</span>=<span class="st0">&quot;XmlStorageService&quot;</span> <span class="re0">Partition</span>=<span class="st0">&quot;A&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;BinaryStorage</span> <span class="re0">Service</span>=<span class="st0">&quot;BinaryStorageService&quot;</span> <span class="re0">Partition</span>=<span class="st0">&quot;B&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/StoragePoint<span class="re2">&gt;</span></span></span>
...</pre></div>
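<p>Put together, a complete configuration file might look like the following sketch. The element and attribute names are the ones proposed above; the IDs "crawled" and "enriched" and all partition names are hypothetical and only illustrate how each saved state of the data gets its own partitions.
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml">&lt;StoragePoints&gt;
  &lt;!-- used whenever no specific storage point ID is defined --&gt;
  &lt;StoragePoint Id=&quot;DefaultStoragePoint&quot;&gt;
    &lt;XmlStorage Service=&quot;XmlStorageService&quot; Partition=&quot;A&quot;/&gt;
    &lt;BinaryStorage Service=&quot;BinaryStorageService&quot; Partition=&quot;B&quot;/&gt;
  &lt;/StoragePoint&gt;
  &lt;!-- hypothetical: state of the data directly after crawling --&gt;
  &lt;StoragePoint Id=&quot;crawled&quot;&gt;
    &lt;XmlStorage Service=&quot;XmlStorageService&quot; Partition=&quot;crawled-xml&quot;/&gt;
    &lt;BinaryStorage Service=&quot;BinaryStorageService&quot; Partition=&quot;crawled-bin&quot;/&gt;
  &lt;/StoragePoint&gt;
  &lt;!-- hypothetical: state of the data after an expensive enrichment DFP --&gt;
  &lt;StoragePoint Id=&quot;enriched&quot;&gt;
    &lt;XmlStorage Service=&quot;XmlStorageService&quot; Partition=&quot;enriched-xml&quot;/&gt;
    &lt;BinaryStorage Service=&quot;BinaryStorageService&quot; Partition=&quot;enriched-bin&quot;/&gt;
  &lt;/StoragePoint&gt;
&lt;/StoragePoints&gt;</pre></div>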
<p><br />
To make configuration easier, the storage APIs can be normalized so that all storages implement the same interface. (A proposal on this subject was posted by Ivan to the mailing list.)
</p>
<a name="Alternative:_Storage_Point_ID_as_OSGi_service_property"></a><h4> <span class="mw-headline"> Alternative: Storage Point ID as OSGi service property </span></h4>
<p>In this alternative we do not need a centralized configuration of storages and storage points. Instead, we just add a storage point ID as an OSGi service property to each Record Metadata/Binary Storage service (passing the ID as a JMS property is discussed in the next section), e.g. in the DS component description of a binary storage service:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml"><span class="sc3"><span class="re1">&lt;component</span> <span class="re0">name</span>=<span class="st0">&quot;BinaryStorageService&quot;</span> <span class="re0">immediate</span>=<span class="st0">&quot;true&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;service<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;provide</span> <span class="re0">interface</span>=<span class="st0">&quot;org.eclipse.smila.binarystorage.BinaryStorageService&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/service<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;property</span> <span class="re0">name</span>=<span class="st0">&quot;smila.storage.point.id&quot;</span> <span class="re0">value</span>=<span class="st0">&quot;point1&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/component<span class="re2">&gt;</span></span></span></pre></div>
<p>And in an (XML) Record Metadata storage service:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml"><span class="sc3"><span class="re1">&lt;component</span> <span class="re0">name</span>=<span class="st0">&quot;XmlStorageService&quot;</span> <span class="re0">immediate</span>=<span class="st0">&quot;true&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;implementation</span> <span class="re0">class</span>=<span class="st0">&quot;org.eclipse.smila.xmlstorage.internal.impl.XmlStorageServiceImpl&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;service<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;provide</span> <span class="re0">interface</span>=<span class="st0">&quot;org.eclipse.smila.xmlstorage.XmlStorageService&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;provide</span> <span class="re0">interface</span>=<span class="st0">&quot;org.eclipse.smila.storage.RecordMetadataStorageService&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/service<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;property</span> <span class="re0">name</span>=<span class="st0">&quot;smila.storage.point.id&quot;</span> <span class="re0">value</span>=<span class="st0">&quot;point1&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/component<span class="re2">&gt;</span></span></span></pre></div>
<p>(Note that we also introduced a second interface here, which is more specialized for storing and reading record metadata for a Blackboard than the XmlStorageService but does not require the service to use XML to store record metadata.)
</p><p>A Blackboard that wants to use storage point "point1" would then just look for a RecordMetadataStorageService and a BinaryStorageService (for attachments) that have the property set to "point1". There would be no need to implement a central StoragePoint configuration facility.
</p>
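<p>With Declarative Services, one way to express such a lookup is a target filter on the references of the consuming component. The fragment below is only a sketch: the component name, the implementation class, and the bind/unbind method names are hypothetical, while the interface names and the property name are the ones used above. Note that a static target filter ties the component to a single storage point; selecting the storage point per Listener rule would instead need a programmatic service lookup with the same filter.
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml">&lt;component name=&quot;BlackboardForPoint1&quot; immediate=&quot;true&quot;&gt;
  &lt;!-- hypothetical implementation class --&gt;
  &lt;implementation class=&quot;org.example.blackboard.BlackboardImpl&quot;/&gt;
  &lt;!-- bind only the record metadata storage registered for storage point &quot;point1&quot; --&gt;
  &lt;reference name=&quot;recordMetadataStorage&quot;
             interface=&quot;org.eclipse.smila.storage.RecordMetadataStorageService&quot;
             bind=&quot;setRecordMetadataStorage&quot; unbind=&quot;unsetRecordMetadataStorage&quot;
             cardinality=&quot;1..1&quot; policy=&quot;static&quot;
             target=&quot;(smila.storage.point.id=point1)&quot;/&gt;
  &lt;!-- bind only the binary storage (attachments) registered for the same storage point --&gt;
  &lt;reference name=&quot;binaryStorage&quot;
             interface=&quot;org.eclipse.smila.binarystorage.BinaryStorageService&quot;
             bind=&quot;setBinaryStorage&quot; unbind=&quot;unsetBinaryStorage&quot;
             cardinality=&quot;1..1&quot; policy=&quot;static&quot;
             target=&quot;(smila.storage.point.id=point1)&quot;/&gt;
&lt;/component&gt;</pre></div>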
<a name="Configuring_storage_points_for_DFP"></a><h4> <span class="mw-headline"> Configuring storage points for DFP </span></h4>
<p>As shown in Figure 1, the Listener component is responsible for getting a record from the JMS queue, loading the record onto the Blackboard, and executing the BPEL workflow. Storage points cannot be configured inside the BPEL workflow, because the same BPEL workflow can be used in multiple DFPs and hence with different storage points.
It is therefore proposed to configure storage point IDs in the Listener rules. With such a configuration it will be possible to have a separate storage point configuration for each workflow, and all DFPs will be configured in a single place.
</p><p>There are two ways in which storage points can be configured:
</p><p>1. The Listener rule contains configuration only for the “Output” storage point. The “Input” storage point ID is read from the queue. After processing is finished, the “Output” storage point ID is sent back to the queue and becomes the “Input” configuration for the next DFP. The whole process is shown in the following picture:
</p><p><a href="http://wiki.eclipse.org/Image:SMILA-storagepoints-queue-option-1.png" class="image" title="Listener rules contain &quot;output&quot; point only"><img alt="Listener rules contain &quot;output&quot; point only" src="http://wiki.eclipse.org/images/d/dc/SMILA-storagepoints-queue-option-1.png" width="640" height="267" border="0" /></a>
</p><p>The advantage of this approach is that the user needs to care only about the “Output” storage point configuration, because the “Input” storage point configuration is obtained automatically from the queue. On the other hand, it can greatly complicate management, backup, and data reuse tasks, because it will not be possible to find out which storage point was used as “Input” when a particular Listener rule was applied.
</p><p>Example: Listener Config:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-java">&lt;Rule <span class="kw3">Name</span>=<span class="st0">&quot;ADD Rule&quot;</span> WaitMessageTimeout=<span class="st0">&quot;10&quot;</span> Workers=<span class="st0">&quot;2&quot;</span> TargetStoragePoint=<span class="st0">&quot;p1&quot;</span>&gt;
...
&lt;/Rule&gt;</pre></div>
<p>The TargetStoragePoint of the rule (which becomes the source storage point for the next DFP) is transferred over the queue either by storing it in the record as metadata or by sending it with the record as a JMS property (JMS properties are currently used for the DataSourceID).
</p><p>2. The Listener rule contains configuration for both the “Input” and the “Output” storage points. In this case the storage point configuration is not sent over the queue:
</p><p><a href="http://wiki.eclipse.org/Image:SMILA-storagepoints-queue-option-2.png" class="image" title="Listener rules contain &quot;input&quot; and &quot;output&quot; point"><img alt="Listener rules contain &quot;input&quot; and &quot;output&quot; point" src="http://wiki.eclipse.org/images/d/df/SMILA-storagepoints-queue-option-2.png" width="640" height="251" border="0" /></a>
</p><p>The advantage of this approach is that for any particular rule it will always be possible to find out which “Input” and “Output” storage points were used. It is then up to the user to make sure that the provided configuration is correct and consistent. Because this greatly simplifies backup and data management tasks, it is proposed to implement this way of configuration. The drawback is that the configuration becomes a little more complex: for example, if the same rule should be applied in two different DFP sequences but data should be loaded from different “Input” storage points, two rules have to be created, one for each “Input” storage point.
</p><p>Example: Listener Config:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-java">&lt;Rule <span class="kw3">Name</span>=<span class="st0">&quot;ADD Rule&quot;</span> WaitMessageTimeout=<span class="st0">&quot;10&quot;</span> Workers=<span class="st0">&quot;2&quot;</span> TargetStoragePoint=<span class="st0">&quot;p1&quot;</span> SourceStoragePoint=<span class="st0">&quot;p2&quot;</span>&gt;
...
&lt;/Rule&gt;</pre></div>
<p>Note: It might even be possible (and useful) to implement both: default "input" storage points could be defined in the Listener rules, while a storage point ID passed with the message could override the default.
</p><p><b>Rules regarding the alternative with OSGi properties</b>
</p><p>Example: Listener Config:
</p><p>The rule configuration is exactly the same:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-java">&lt;Rule <span class="kw3">Name</span>=<span class="st0">&quot;ADD Rule&quot;</span> WaitMessageTimeout=<span class="st0">&quot;10&quot;</span> Workers=<span class="st0">&quot;2&quot;</span> TargetStoragePoint=<span class="st0">&quot;p1&quot;</span> SourceStoragePoint=<span class="st0">&quot;p2&quot;</span>&gt;
...
&lt;/Rule&gt;</pre></div>
<p>To find the services providing the named storage points, the DFP has to look up services for the appropriate interfaces (RecordMetadataStorageService and BinaryStorageService) that have the specific storage point name set as the property "smila.storage.point.id". It is not necessary to know the name of the service registration or to specify two separate names for the Record and Binary storage services.
</p>
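<p>The lookup by property can be sketched as follows. This is only an illustration under stated assumptions: <code>SimpleServiceRegistry</code> is a hypothetical in-memory stand-in for the OSGi service registry, and the two storage interfaces are reduced to empty placeholders. In a real OSGi environment the filter string built by <code>filterFor</code> would be passed to <code>BundleContext.getServiceReferences(String clazz, String filter)</code> instead.</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-ins for the two storage interfaces named in the text. */
interface RecordMetadataStorageService { }
interface BinaryStorageService { }

/** Minimal in-memory stand-in for an OSGi service registry (an assumption, not the OSGi API). */
class SimpleServiceRegistry {
    /** Property key taken from the specification text. */
    static final String STORAGE_POINT_ID = "smila.storage.point.id";

    /** One registered service together with its interface and storage point ID. */
    private static final class Registration {
        final Class<?> type;
        final Object service;
        final String storagePointId;

        Registration(Class<?> type, Object service, String storagePointId) {
            this.type = type;
            this.service = service;
            this.storagePointId = storagePointId;
        }
    }

    private final List<Registration> registrations = new ArrayList<Registration>();

    /** Registers a service under its interface, tagged with a storage point ID. */
    <T> void register(Class<T> type, T service, String storagePointId) {
        registrations.add(new Registration(type, service, storagePointId));
    }

    /** Returns the first service of the given interface with a matching storage point ID, or null. */
    <T> T lookup(Class<T> type, String storagePointId) {
        for (Registration r : registrations) {
            if (r.type == type && r.storagePointId.equals(storagePointId)) {
                return type.cast(r.service);
            }
        }
        return null;
    }

    /** Builds the LDAP-style filter string that a real OSGi lookup would use. */
    static String filterFor(String storagePointId) {
        return "(" + STORAGE_POINT_ID + "=" + storagePointId + ")";
    }
}
```

<p>Note that the same storage point ID ("p2" in the listener example) is used for both interfaces, so one name is enough to find the Record and the Binary storage service.</p>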
<a name="Passing_the_Storage-Location_to_BPEL.2FPipelets.2FBlackboard"></a><h4> <span class="mw-headline"> Passing the Storage-Location to BPEL/Pipelets/Blackboard </span></h4>
<p>After the Listener obtains the storage point configuration, it should pass this configuration to the Blackboard so that records from that storage point can be loaded onto the Blackboard and processed by the BPEL workflow. This can be done in the following ways (each of which can be combined with both solutions above):
</p>
<ol><li> Storage point configuration passed as a Record Id property: The advantage of this approach is that it is always easy to find out which storage point a particular record belongs to. The disadvantage is that the record Id is an immutable object intended to be used as a hash key, so changing Id properties during processing may not be a good idea. (This would best apply to option 1 in the above section.)
</li><li> Storage point configuration passed as Record Metadata: In this case an attribute or annotation containing the storage point ID should be added to the Record metadata before processing.
</li><li> Storage point configuration passed separately from the record: In this case the record itself does not contain any information about the storage point configuration. E.g., in the case where the listener rules do not contain the input storage point, it could be passed in the queue message as a message property. This has the advantage that the listener can also select messages by the storage point of the contained records (e.g., to manage load balancing, or because not all listeners have access to all storage points).
</li></ol>
<p>Therefore we propose to use the third option. Note that in this case it is still possible to store the storage point ID in the record metadata for informational purposes (e.g. setting a field in the final search index so that the storage point ID can be read after a search). However, the storage point ID relevant for the queue listener will be the message property.
</p>
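<p>A minimal sketch of the third option, assuming the storage point ID travels as a string property on the queue message. <code>QueueMessage</code>, <code>ListenerRule</code> and the property name <code>StoragePointId</code> are hypothetical; the specification does not fix any of them. The sketch also shows the idea from the note above that a message property could override a default defined in the listener rule.</p>

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical queue message: a record ID plus string properties (not the SMILA or JMS API). */
class QueueMessage {
    final String recordId;
    final Map<String, String> properties = new HashMap<String, String>();

    QueueMessage(String recordId) {
        this.recordId = recordId;
    }
}

/** Hypothetical listener rule with an optional default input storage point. */
class ListenerRule {
    /** Assumed property name; the specification does not define one. */
    static final String STORAGE_POINT_PROPERTY = "StoragePointId";

    final String name;
    final String defaultSourceStoragePoint; // may be null if the rule defines no default

    ListenerRule(String name, String defaultSourceStoragePoint) {
        this.name = name;
        this.defaultSourceStoragePoint = defaultSourceStoragePoint;
    }

    /** The message property wins; otherwise the rule's default is used. */
    String resolveSourceStoragePoint(QueueMessage message) {
        String fromMessage = message.properties.get(STORAGE_POINT_PROPERTY);
        return fromMessage != null ? fromMessage : defaultSourceStoragePoint;
    }

    /** A listener that can reach only some storage points skips all other messages. */
    boolean accepts(QueueMessage message, Set<String> reachableStoragePoints) {
        String storagePoint = resolveSourceStoragePoint(message);
        return storagePoint != null && reachableStoragePoints.contains(storagePoint);
    }
}
```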
<a name="Changes_in_the_Blackboard_service"></a><h4> <span class="mw-headline"> Changes in the Blackboard service </span></h4>
<p>There are the following proposals for Blackboard service changes:
</p>
<ol><li> The Blackboard API will expose additional new methods that allow working with storage points: This imposes too many changes on clients of the Blackboard service, so we do not want to follow this road. Further details are omitted for now.
</li><li> The Blackboard API won’t be changed.
</li></ol>
<p>In the second case we introduce a new BlackboardManager service that returns references to the actual Blackboard instances using the default or a named storage point:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-java"><span class="kw1">interface</span> BlackboardManager <span class="br0">&#123;</span>
Blackboard getDefaultBlackboard<span class="br0">&#40;</span><span class="br0">&#41;</span>;
Blackboard getBlackboardForStoragePoint<span class="br0">&#40;</span><span class="kw3">String</span> storagePointId<span class="br0">&#41;</span>;
<span class="br0">&#125;</span>
&nbsp;
<span class="kw1">interface</span> Blackboard <span class="br0">&#123;</span>
&lt;contains current Blackboard API methods&gt;
<span class="br0">&#125;</span></pre></div>
<p>With this setup, the correct Blackboard reference must be passed to the BPEL workflow each time the workflow is executed. This can be done by the WorkflowProcessor, which sends the right Blackboard to the BPEL server. In their invocation, Pipelets and Processing Services get the Blackboard instance to be used from the processing engine anyway, so they will continue working with the Blackboard in the same way as they do now.
</p><p>Thus, the WorkflowProcessor (WP) process(…) method should be enhanced to accept the Blackboard instance as an additional method argument instead of being statically linked to a single blackboard:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-java">Id<span class="br0">&#91;</span><span class="br0">&#93;</span> process<span class="br0">&#40;</span><span class="kw3">String</span> workflowName, Blackboard blackboard, Id<span class="br0">&#91;</span><span class="br0">&#93;</span> recordIds<span class="br0">&#41;</span> <span class="kw1">throws</span> ProcessingException;</pre></div>
<p>This resembles the Pipelet/ProcessingService API. However, one difficulty may be finding a way to pass the information about which blackboard is to be used in a pipelet/processing service invocation around the integrated BPEL engine to the Pipelet/ServiceProcessingManagers. But I hope this can be solved.
</p>
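<p>How a listener might combine the BlackboardManager with the extended process method can be sketched as follows. This is a sketch under stated assumptions, not the SMILA implementation: <code>Blackboard</code>, <code>Id</code> and <code>ProcessingException</code> are reduced to placeholders, and <code>StoragePointBlackboard</code>, <code>CachingBlackboardManager</code> and <code>ListenerSketch</code> are hypothetical names. Only the two interfaces' method signatures are taken from the text above.</p>

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Placeholder for the existing Blackboard API; its methods are omitted here. */
interface Blackboard { }

interface BlackboardManager {
    Blackboard getDefaultBlackboard();
    Blackboard getBlackboardForStoragePoint(String storagePointId);
}

/** Placeholders for existing SMILA types (simplified assumptions). */
class Id {
    final String key;
    Id(String key) { this.key = key; }
}

class ProcessingException extends Exception {
    ProcessingException(String message) { super(message); }
}

interface WorkflowProcessor {
    Id[] process(String workflowName, Blackboard blackboard, Id[] recordIds) throws ProcessingException;
}

/** Hypothetical Blackboard that is bound to exactly one storage point. */
class StoragePointBlackboard implements Blackboard {
    final String storagePointId;
    StoragePointBlackboard(String storagePointId) { this.storagePointId = storagePointId; }
}

/** Hypothetical manager that creates one Blackboard per storage point and reuses it. */
class CachingBlackboardManager implements BlackboardManager {
    private final String defaultStoragePointId;
    private final ConcurrentMap<String, Blackboard> blackboards =
        new ConcurrentHashMap<String, Blackboard>();

    CachingBlackboardManager(String defaultStoragePointId) {
        this.defaultStoragePointId = defaultStoragePointId;
    }

    public Blackboard getDefaultBlackboard() {
        return getBlackboardForStoragePoint(defaultStoragePointId);
    }

    public Blackboard getBlackboardForStoragePoint(String storagePointId) {
        Blackboard existing = blackboards.get(storagePointId);
        if (existing != null) {
            return existing;
        }
        Blackboard created = new StoragePointBlackboard(storagePointId);
        // If another thread registered a Blackboard first, use that one.
        Blackboard raced = blackboards.putIfAbsent(storagePointId, created);
        return raced != null ? raced : created;
    }
}

/** Hypothetical listener code: pick the Blackboard, then hand it to the workflow processor. */
class ListenerSketch {
    static Id[] handle(BlackboardManager manager, WorkflowProcessor processor,
                       String workflowName, String storagePointId, Id[] recordIds)
            throws ProcessingException {
        Blackboard blackboard = storagePointId == null
            ? manager.getDefaultBlackboard()
            : manager.getBlackboardForStoragePoint(storagePointId);
        return processor.process(workflowName, blackboard, recordIds);
    }
}
```

<p>The sketch leaves open the difficulty mentioned above, namely how the chosen Blackboard reaches the pipelets behind the BPEL engine.</p>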
<a name="Partitions"></a><h3> <span class="mw-headline"> Partitions </span></h3>
<a name="Requirements_2"></a><h4> <span class="mw-headline"> Requirements </span></h4>
<p>The requirement for Partitions is that the XML storage and the binary storage should be able to work with ‘partitions’. This means that the storages should be able to store data in different internal partitions.
</p>
<a name="Changes_in_Storages_API"></a><h4> <span class="mw-headline"> Changes in Storages API </span></h4>
<p>Currently SMILA operates with two physical storages: the XML storage and the binary storage. The API of both storages should be extended to handle partitioning. It should provide methods for getting data from a specified partition and for saving data to a specified partition. The partitioning configuration should be passed as an additional parameter:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-java">_xssConnection.<span class="me1">getDocument</span><span class="br0">&#40;</span>Id, Partition<span class="br0">&#41;</span>;
_xssConnection.<span class="me1">addOrUpdateDocument</span><span class="br0">&#40;</span>Id, <span class="kw3">Document</span>, Partition<span class="br0">&#41;</span>;</pre></div>
<p>For the first implementation, the storages will work with data in partitions in the following way:
</p>
<ul><li> With the XML storage, each partition will contain its own copy of the record Document.
</li><li> With the binary storage, each partition will contain its own copy of the attachment.
</li></ul>
<p>This behavior can be improved for better performance in later versions. For more information, please see the section "Proposed further changes" below.
</p>
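<p>The "one copy per partition" behavior of the first implementation can be illustrated with a small in-memory sketch. <code>PartitionedDocumentStore</code> is a hypothetical class; record IDs, documents and partitions are simplified to strings, which is an assumption and not the actual storage API.</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory store in which every partition keeps its own copy of a document. */
class PartitionedDocumentStore {
    // partition name -> (record ID -> document content)
    private final Map<String, Map<String, String>> partitions =
        new HashMap<String, Map<String, String>>();

    /** Saves the document into the given partition only; other partitions are not touched. */
    void addOrUpdateDocument(String id, String document, String partition) {
        Map<String, String> documents = partitions.get(partition);
        if (documents == null) {
            documents = new HashMap<String, String>();
            partitions.put(partition, documents);
        }
        documents.put(id, document);
    }

    /** Returns the document stored in the given partition, or null if there is none. */
    String getDocument(String id, String partition) {
        Map<String, String> documents = partitions.get(partition);
        return documents == null ? null : documents.get(id);
    }
}
```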
<a name="Alternative_implenmentation_using_OSGi_services"></a><h4> <span class="mw-headline"> Alternative implementation using OSGi services </span></h4>
<p>If we implement storage points using OSGi service properties, as described above in the section "Alternative: Storage Point ID as OSGi service properties", we can use this to hide partitions completely from clients. In this case a storage service that wants to provide different partitions could register one "partition proxy" OSGi service for each partition. Each proxy has its own storage point ID and provides the correct storage interface (binary/record metadata/XML), but does not store data on its own; it just forwards requests to the "master" storage service, adding the partition name. The proxy services can be created programmatically and dynamically by the master service when a new partition is created (via service configuration or management console), so it is not necessary to create a DS component description for each partition.
</p><p>The following figure illustrates this setup:
</p><p><a href="http://wiki.eclipse.org/Image:SMILA-storagepoint-partition-proxies.png" class="image" title="Use of partition proxy services to hide partitioning of storages"><img alt="Use of partition proxy services to hide partitioning of storages" src="http://wiki.eclipse.org/images/d/d6/SMILA-storagepoint-partition-proxies.png" width="800" height="600" border="0" /></a>
</p><p>This way no client would ever need to use additional partition IDs when communicating with a storage service, and storages that cannot provide partitions do not need to implement methods with partition parameters that cannot be used anyway.
</p><p><br />
</p>
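<p>The proxy idea can be sketched in plain Java as follows. All names are hypothetical and the storage interface is reduced to two methods; registering each proxy as an OSGi service with its own storage point ID is left out, since it depends on the OSGi environment.</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Partition-free interface seen by clients (hypothetical, reduced to two methods). */
interface BinaryStorage {
    void store(String id, byte[] data);
    byte[] fetch(String id);
}

/** Hypothetical "master" storage that is the only component aware of partitions. */
class MasterBinaryStorage {
    // partition name -> (attachment ID -> content)
    private final Map<String, Map<String, byte[]>> partitions =
        new HashMap<String, Map<String, byte[]>>();

    void store(String id, byte[] data, String partition) {
        Map<String, byte[]> attachments = partitions.get(partition);
        if (attachments == null) {
            attachments = new HashMap<String, byte[]>();
            partitions.put(partition, attachments);
        }
        attachments.put(id, data.clone());
    }

    byte[] fetch(String id, String partition) {
        Map<String, byte[]> attachments = partitions.get(partition);
        byte[] data = attachments == null ? null : attachments.get(id);
        return data == null ? null : data.clone();
    }

    /** Creates a proxy for one partition; a real master would also register it as an OSGi service. */
    BinaryStorage createPartitionProxy(final String partition) {
        final MasterBinaryStorage master = this;
        return new BinaryStorage() {
            public void store(String id, byte[] data) {
                master.store(id, data, partition); // forward, adding the partition name
            }

            public byte[] fetch(String id) {
                return master.fetch(id, partition);
            }
        };
    }
}
```

<p>Clients only ever see <code>BinaryStorage</code>, so a storage that cannot provide partitions can implement the same interface directly without any partition parameters.</p>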
<a name="Proposed_further_changes"></a><h4> <span class="mw-headline"> Proposed further changes </span></h4>
<p>With binary storage, attachments can be large (for example, when crawling video files), so creating an actual copy for each partition can be inefficient and can cause serious performance issues. As a solution, the binary storage should not create an actual attachment copy for each partition, but rather keep a reference to the existing attachment as long as the attachment has not changed from one partition to another.
</p><p>However, this solution can cause some problems too:
</p>
<ol><li> Problems can occur if a backup job is run with an external tool that is not aware of references. This should not generally happen, because backups will normally be done with a properly configured tool.
</li><li> A pipelet can change Attachment1 in Partition 1, while Partition 2 should still keep the old version of the attachment. In this case there should be some service that monitors the consistency of references.
</li></ol>
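<p>One possible way to keep references consistent is reference counting with a new blob on every write, so that a change in one partition never affects another. The following is only a sketch of that idea, not a proposed design: <code>ReferencingBinaryStore</code> and its methods are hypothetical, and everything is kept in memory.</p>

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical binary store in which partitions share unchanged attachments by reference. */
class ReferencingBinaryStore {
    private final Map<Integer, byte[]> blobs = new HashMap<Integer, byte[]>();        // blob ID -> content
    private final Map<Integer, Integer> refCounts = new HashMap<Integer, Integer>();  // blob ID -> number of references
    private final Map<String, Integer> references = new HashMap<String, Integer>();   // partition + ID -> blob ID
    private int nextBlobId = 0;

    /** Writes new content for one partition; other partitions keep their old blob. */
    void store(String partition, String id, byte[] data) {
        release(partition, id);
        int blobId = nextBlobId++;
        blobs.put(blobId, data.clone());
        refCounts.put(blobId, 1);
        references.put(key(partition, id), blobId);
    }

    /** Makes the attachment visible in another partition without copying the content. */
    void share(String fromPartition, String toPartition, String id) {
        Integer blobId = references.get(key(fromPartition, id));
        if (blobId == null) {
            throw new IllegalArgumentException("no attachment " + id + " in " + fromPartition);
        }
        // Increment first so that sharing a partition with itself cannot free the blob.
        refCounts.put(blobId, refCounts.get(blobId) + 1);
        release(toPartition, id);
        references.put(key(toPartition, id), blobId);
    }

    /** Returns a copy of the content, or null if the partition has no such attachment. */
    byte[] fetch(String partition, String id) {
        Integer blobId = references.get(key(partition, id));
        return blobId == null ? null : blobs.get(blobId).clone();
    }

    /** Number of physically stored blobs. */
    int blobCount() {
        return blobs.size();
    }

    /** Drops one reference and deletes the blob when nobody refers to it any more. */
    private void release(String partition, String id) {
        Integer old = references.remove(key(partition, id));
        if (old == null) {
            return;
        }
        int remaining = refCounts.get(old) - 1;
        if (remaining == 0) {
            refCounts.remove(old);
            blobs.remove(old);
        } else {
            refCounts.put(old, remaining);
        }
    }

    // Assumes that partition names and IDs never contain the NUL character.
    private static String key(String partition, String id) {
        return partition + '\u0000' + id;
    }
}
```

<p>A backup tool that copies only the blobs would still miss the reference table, which is the first problem listed above.</p>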
<div class="printfooter">
Retrieved from "<a href="Partitioning_Storages.html">http://wiki.eclipse.org/SMILA/Specifications/Partitioning_Storages</a>"</div>
<div id="catlinks"><p class='catlinks'><a href="http://wiki.eclipse.org/Special:Categories" title="Special:Categories">Category</a>: <span dir='ltr'><a href="http://wiki.eclipse.org/Category:SMILA" title="Category:SMILA">SMILA</a></span></p></div> <!-- end content -->
<div class="visualClear"></div>
</div>
</div>
</div>
<!-- Yoink of toolbox for phoenix moved up -->
</div>
</div>
<div id="clearFooter"/>
<div id="footer" >
<span id="copyright">Copyright &copy; 2012 The Eclipse Foundation. All Rights Reserved</span>
<p id="footercredit">This page was last modified 12:06, 24 January 2012 by <a href="http://wiki.eclipse.org/index.php?title=User:Juergen.schumacher.attensity.com&amp;action=edit" class="new" title="User:Juergen.schumacher.attensity.com">Juergen Schumacher</a>. Based on work by <a href="http://wiki.eclipse.org/User:Juergen.schumacher.empolis.com" title="User:Juergen.schumacher.empolis.com">Juergen Schumacher</a>, <a href="http://wiki.eclipse.org/index.php?title=User:Svoigt.brox.de&amp;action=edit" class="new" title="User:Svoigt.brox.de">Sebastian Voigt</a> and <a href="http://wiki.eclipse.org/index.php?title=User:Dhazin.brox.de&amp;action=edit" class="new" title="User:Dhazin.brox.de">Dmitry Hazin</a> and <a href="http://wiki.eclipse.org/index.php?title=SMILA/Specifications/Partitioning_Storages&amp;action=credits" title="SMILA/Specifications/Partitioning Storages">others</a>.</p>
</div>
<script type="text/javascript">if (window.runOnloadHook) runOnloadHook();</script>
</div>
</body></html>