<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="keywords" content="SMILA/Documentation/ConnectivityFramework,SMILA/Documentation/Agent,SMILA/Documentation/AgentController,SMILA/Documentation/CompoundManagement,SMILA/Documentation/ConnectivityManager,SMILA/Documentation/Crawler,SMILA/Documentation/CrawlerController,SMILA/Documentation/DeltaIndexingManager" />
<title>SMILA/Documentation/ConnectivityFramework - Eclipsepedia</title>
<style type="text/css" media="screen,projection">/*<![CDATA[*/ @import "/skins/eclipsenova/novaWide.css?116"; /*]]>*/</style>
<link rel="stylesheet" type="text/css" media="print" href="http://wiki.eclipse.org/skins/eclipsenova/eclipsenovaPrint.css?116" />
<link rel="stylesheet" type="text/css" media="handheld" href="http://wiki.eclipse.org/skins/eclipsenova/handheld.css?116" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/header.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/tabs.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/visual.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/layout.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/footer.css" media="screen" />
</head>
<body class="mediawiki ns-0 ltr page-SMILA_Documentation_ConnectivityFramework">
<div id="globalWrapper">
<div id="column-one">
<div id="novaContent" class="faux">
<div id="column-content">
<div id="content">
<a name="top" id="top"></a>
<h1 class="firstHeading">SMILA/Documentation/ConnectivityFramework</h1>
<div id="bodyContent">
<div class="messagebox" style="background-color: #def3fe; border: 1px solid #c5d7e0; color: black; padding: 5px; margin: 1ex 0; min-height: 35px; padding-left: 45px;">
<div style="float: left; margin-left: -40px;"><a href="http://wiki.eclipse.org/Image:Note.png" class="image" title="Note.png"><img alt="" src="http://wiki.eclipse.org/images/c/cc/Note.png" width="35" height="35" border="0" /></a></div>
<div><b>This is deprecated for SMILA 1.0. The connectivity framework is still functional, but it is intended to be replaced by scalable import based on SMILA's job management.</b><br /></div>
</div>
<table id="toc" class="toc" summary="Contents"><tr><td><div id="toctitle"><h2>Contents</h2></div>
<ul>
<li class="toclevel-1"><a href="ConnectivityFramework.html#Overview"><span class="tocnumber">1</span> <span class="toctext">Overview</span></a></li>
<li class="toclevel-1"><a href="ConnectivityFramework.html#Architecture"><span class="tocnumber">2</span> <span class="toctext">Architecture</span></a></li>
<li class="toclevel-1"><a href="ConnectivityFramework.html#Configuration"><span class="tocnumber">3</span> <span class="toctext">Configuration</span></a></li>
<li class="toclevel-1"><a href="ConnectivityFramework.html#Performance_Counters"><span class="tocnumber">4</span> <span class="toctext">Performance Counters</span></a></li>
</ul>
</td></tr></table><script type="text/javascript"> if (window.showTocToggle) { var tocShowText = "show"; var tocHideText = "hide"; showTocToggle(); } </script>
<a name="Overview"></a><h2> <span class="mw-headline"> Overview </span></h2>
<p>The Connectivity framework, as the name suggests, provides a framework for integrating data from external systems into SMILA. Two kinds of components are supported for accessing external data: Agents and Crawlers. To integrate a new data source type into SMILA, only a new Agent or Crawler has to be implemented.
</p>
<a name="Architecture"></a><h2> <span class="mw-headline"> Architecture </span></h2>
<p>Here is a short overview of the components of the ConnectivityFramework:
</p>
<ul><li> <a href="AgentController.html" title="SMILA/Documentation/AgentController"><b>AgentController</b></a>: The AgentController implements the general processing logic common to all types of Agents. Its service interface is used by Agents to trigger add/update/delete actions. This component is not yet implemented.
</li><li> <a href="Agent.html" title="SMILA/Documentation/Agent"><b>Agents</b></a>: Agents monitor data sources for changes (add/update/delete) or are triggered by events (e.g. triggers in databases) and report those changes to the AgentController. Currently, no Agent implementation is provided.
</li><li> <a href="CrawlerController.html" title="SMILA/Documentation/CrawlerController"><b>CrawlerController</b></a>: The CrawlerController implements the general processing logic common to all types of Crawlers. Its service interface is used by clients (e.g. a JMX console) to start/stop crawls.
</li><li> <a href="Crawler.html" title="SMILA/Documentation/Crawler"><b>Crawlers</b></a>: A Crawler crawls a data source (e.g. a filesystem or a website) and returns all found data objects.
</li><li> <a href="CompoundManagement.html" title="SMILA/Documentation/CompoundManagement"><b>CompoundManagement</b></a>: Provides extractors for certain MimeTypes (e.g. zip, chm) and handles the processing of compound objects.
</li></ul>
<p>In addition there are three components that are not part of the ConnectivityFramework, but that interact with it:
</p>
<ul><li> <a href="ConnectivityManager.html" title="SMILA/Documentation/ConnectivityManager"><b>ConnectivityManager</b></a>: The ConnectivityManager is the single point of entry for data into SMILA. The AgentController and CrawlerController push the data through this component into the Queue.
</li><li> <a href="DeltaIndexingManager.html" title="SMILA/Documentation/DeltaIndexingManager"><b>DeltaIndexingManager</b></a>: The DeltaIndexingManager provides functionality to decide whether or not a record needs to be updated and sent to the ConnectivityManager.
</li><li><b>Configuration Management</b>: This component is not yet implemented. It is designed to manage configurations for all kinds of services, e.g. DataSources for crawlers. At the moment all configurations have to be provided locally in the SMILA configuration folder.
</li></ul>
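<p>The interplay described above (a controller asks delta indexing whether a record changed, and only then forwards it to the entry point) can be sketched as follows. This is a minimal illustration under stated assumptions, not the SMILA API: the names <code>DeltaIndex</code>, <code>RecordSink</code>, <code>InMemoryDeltaIndex</code>, <code>CountingSink</code> and <code>CrawlFlow</code> are hypothetical, and comparing a per-record hash is an assumed change criterion.
</p>

```java
import java.util.Properties;

/** Hypothetical stand-in for the delta indexing decision; not the SMILA interface. */
interface DeltaIndex {
    /** True if the record is unknown or its hash differs from the stored one. */
    boolean needsUpdate(String id, String hash);

    /** Remembers the hash of a record that has been forwarded. */
    void visit(String id, String hash);
}

/** Hypothetical stand-in for the single point of entry that receives records. */
interface RecordSink {
    void add(String id);
}

/** Keeps hashes in memory; Properties serves as a plain String-to-String map here. */
final class InMemoryDeltaIndex implements DeltaIndex {
    private final Properties hashes = new Properties();

    public boolean needsUpdate(String id, String hash) {
        // getProperty returns null for unknown ids, so new records always need an update.
        return !hash.equals(hashes.getProperty(id));
    }

    public void visit(String id, String hash) {
        hashes.setProperty(id, hash);
    }
}

/** Counts forwarded records instead of queueing them. */
final class CountingSink implements RecordSink {
    int added;

    public void add(String id) {
        added++;
    }
}

final class CrawlFlow {
    /** Forwards a crawled record only if delta indexing reports a change. */
    static void process(String id, String hash, DeltaIndex delta, RecordSink sink) {
        if (delta.needsUpdate(id, hash)) {
            sink.add(id);
            delta.visit(id, hash);
        }
    }

    public static void main(String[] args) {
        InMemoryDeltaIndex delta = new InMemoryDeltaIndex();
        CountingSink sink = new CountingSink();
        process("file:/a.txt", "h1", delta, sink); // new record: forwarded
        process("file:/a.txt", "h1", delta, sink); // unchanged: skipped
        process("file:/a.txt", "h2", delta, sink); // changed: forwarded
        System.out.println(sink.added); // prints 2
    }
}
```

<p>Keeping the change decision behind its own interface mirrors the separation named in the list: the controller logic does not need to know how changes are detected.
</p>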
<p><br />
The following chart shows the architecture of the Connectivity Framework with its pluggable components (Agents/Crawlers) and their relationship to the SMILA entry point, the Connectivity Module.
<a href="http://wiki.eclipse.org/Image:ConnectivityFrameworkArchitecture.png" class="image" title="Connectivity Framework Architecture"><img alt="Connectivity Framework Architecture" src="http://wiki.eclipse.org/images/7/71/ConnectivityFrameworkArchitecture.png" width="960" height="720" border="0" /></a>
</p><p>The components labeled in red are not yet implemented.
</p>
<a name="Configuration"></a><h2> <span class="mw-headline"> Configuration </span></h2>
<p>There is no overall configuration for the framework. See the documentation of each framework component for detailed information.
</p><p><br />
</p>
<a name="Performance_Counters"></a><h2> <span class="mw-headline"> Performance Counters </span></h2>
<p>The class <code>org.eclipse.smila.connectivity.framework.performancecounters.ConnectivityPerformanceAgent</code> defines many common performance counters for crawlers and agents. Crawler and agent implementations can extend this class to provide additional specific counters, or just use this class if the common counters are sufficient.
</p><p>The common counters are:
</p>
<ul><li> startDate: date/time when importer was started
</li><li> endDate: date/time when importer has finished or was stopped
</li><li> jobName: name of job to which records were submitted
</li><li> importRunId: ID of the importer run
</li><li> records: number of records created by importer
</li><li> deltaIndices: number of requests to delta indexing manager
</li><li> averageRecordsProcessingTime: time since start divided by "records" in milliseconds
</li><li> averageDeltaIndicesProcessingTime: time since start divided by "deltaIndices" in milliseconds
</li><li> attachmentBytesTransferred: complete size of attachments added to records
</li><li> attachmentsTransferRate: time since start divided by attachmentBytesTransferred
</li><li> exceptions: number of non-fatal errors during importing
</li><li> exceptionsCritical: number of fatal errors during importing
</li><li> errorBuffer: List of descriptions of the last 10 exceptions.
</li></ul>
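<p>The two averaging counters in the list above are both defined as "time since start divided by a count". A minimal sketch of that arithmetic follows. The class <code>ImportCounters</code> and its methods are hypothetical names chosen for illustration and do not reproduce the API of <code>ConnectivityPerformanceAgent</code>.
</p>

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical counter holder for illustration; not the ConnectivityPerformanceAgent API. */
final class ImportCounters {
    private final Instant startDate;
    private long records;
    private long deltaIndices;

    ImportCounters(Instant startDate) {
        this.startDate = startDate;
    }

    void addRecords(long count) {
        records += count;
    }

    void addDeltaIndexRequests(long count) {
        deltaIndices += count;
    }

    /** Time since start divided by count, in milliseconds; 0.0 while nothing has been counted. */
    static double averageMillis(Duration sinceStart, long count) {
        return count == 0 ? 0.0 : (double) sinceStart.toMillis() / count;
    }

    double averageRecordsProcessingTime(Instant now) {
        return averageMillis(Duration.between(startDate, now), records);
    }

    double averageDeltaIndicesProcessingTime(Instant now) {
        return averageMillis(Duration.between(startDate, now), deltaIndices);
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2012-01-24T09:00:00Z");
        ImportCounters counters = new ImportCounters(start);
        counters.addRecords(4);
        counters.addDeltaIndexRequests(8);
        Instant now = start.plusSeconds(2);
        // 2000 ms since start / 4 records
        System.out.println(counters.averageRecordsProcessingTime(now));
        // 2000 ms since start / 8 delta indexing requests
        System.out.println(counters.averageDeltaIndicesProcessingTime(now));
    }
}
```

<p>The guard against a zero count is a choice made in this sketch. The source does not say how the real class reports averages before the first record has been counted.
</p>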
<div id="catlinks"><p class='catlinks'><a href="http://wiki.eclipse.org/Special:Categories" title="Special:Categories">Category</a>: <span dir='ltr'><a href="http://wiki.eclipse.org/Category:SMILA" title="Category:SMILA">SMILA</a></span></p></div> <!-- end content -->
<div class="visualClear"></div>
</div>
</div>
</div>
<!-- Yoink of toolbox for phoenix moved up -->
</div>
</div>
<div id="clearFooter"/>
<div id="footer" >
<span id="copyright">Copyright &copy; 2012 The Eclipse Foundation. All Rights Reserved</span>
<p id="footercredit">This page was last modified 09:35, 24 January 2012 by <a href="http://wiki.eclipse.org/index.php?title=User:Juergen.schumacher.attensity.com&amp;action=edit" class="new" title="User:Juergen.schumacher.attensity.com">Juergen Schumacher</a>. Based on work by <a href="http://wiki.eclipse.org/User:Daniel.stucky.empolis.com" title="User:Daniel.stucky.empolis.com">Daniel Stucky</a> and <a href="http://wiki.eclipse.org/User:Igor.novakovic.empolis.com" title="User:Igor.novakovic.empolis.com">Igor Novakovic</a>.</p>
</div>
<script type="text/javascript">if (window.runOnloadHook) runOnloadHook();</script>
</div>
</body></html>