<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="keywords" content="SMILA/Specifications/LuceneIntegration" />
<link rel="shortcut icon" href="http://wiki.eclipse.org/SMILA/Specifications/favicon.ico" />
<link rel="search" type="application/opensearchdescription+xml" href="http://wiki.eclipse.org/opensearch_desc.php" title="Eclipsepedia (English)" />
<link rel="alternate" type="application/rss+xml" title="Eclipsepedia RSS Feed" href="http://wiki.eclipse.org/index.php?title=Special:Recentchanges&amp;feed=rss" />
<link rel="alternate" type="application/atom+xml" title="Eclipsepedia Atom Feed" href="http://wiki.eclipse.org/index.php?title=Special:Recentchanges&amp;feed=atom" />
<title>SMILA/Specifications/LuceneIntegration - Eclipsepedia</title>
<style type="text/css" media="screen,projection">/*<![CDATA[*/ @import "/skins/eclipsenova/novaWide.css?116"; /*]]>*/</style>
<link rel="stylesheet" type="text/css" media="print" href="http://wiki.eclipse.org/skins/eclipsenova/eclipsenovaPrint.css?116" />
<link rel="stylesheet" type="text/css" media="handheld" href="http://wiki.eclipse.org/skins/eclipsenova/handheld.css?116" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/header.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/tabs.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/visual.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/layout.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/footer.css" media="screen" />
<!--[if IE]><link rel="stylesheet" type="text/css" href="/skins/eclipsenova/IEpngfix.css" media="screen" /><![endif]-->
<!--[if lt IE 5.5000]><style type="text/css">@import "/skins/eclipsenova/IE50Fixes.css?116";</style> <![endif]-->
<!--[if IE 5.5000]><style type="text/css">@import "/skins/eclipsenova/IE55Fixes.css?116";</style><![endif]-->
<!--[if IE 6]><style type="text/css">@import "/skins/eclipsenova/IE60Fixes.css?116";</style><![endif]-->
<!--[if IE 7]><style type="text/css">@import "/skins/eclipsenova/IE70Fixes.css?116";</style><![endif]-->
<!--[if lt IE 7]><script type="text/javascript" src="/skins/common/IEFixes.js?116"></script>
<meta http-equiv="imagetoolbar" content="no" /><![endif]-->
<script type= "text/javascript">/*<![CDATA[*/
var skin = "eclipsenova";
var stylepath = "/skins";
var wgArticlePath = "/$1";
var wgScriptPath = "";
var wgScript = "/index.php";
var wgServer = "http://wiki.eclipse.org";
var wgCanonicalNamespace = "";
var wgCanonicalSpecialPageName = false;
var wgNamespaceNumber = 0;
var wgPageName = "SMILA/Specifications/LuceneIntegration";
var wgTitle = "SMILA/Specifications/LuceneIntegration";
var wgAction = "view";
var wgRestrictionEdit = [];
var wgRestrictionMove = [];
var wgArticleId = "18062";
var wgIsArticle = true;
var wgUserName = null;
var wgUserGroups = null;
var wgUserLanguage = "en";
var wgContentLanguage = "en";
var wgBreakFrames = false;
var wgCurRevisionId = "141504";
var wgVersion = "1.12.0";
var wgEnableAPI = true;
var wgEnableWriteAPI = false;
/*]]>*/</script>
<script type="text/javascript" src="http://wiki.eclipse.org/skins/common/wikibits.js?116"><!-- wikibits js --></script>
<!-- Performance mods similar to those for bug 166401 -->
<script type="text/javascript" src="http://wiki.eclipse.org/index.php?title=-&amp;action=raw&amp;gen=js&amp;useskin=eclipsenova"><!-- site js --></script>
<!-- Head Scripts -->
<script type="text/javascript" src="http://wiki.eclipse.org/skins/common/ajax.js?116"></script>
<style type="text/css">/*<![CDATA[*/
.source-xml {line-height: normal; font-size: medium;}
.source-xml li {line-height: normal;}
/**
* GeSHi Dynamically Generated Stylesheet
* --------------------------------------
* Dynamically generated stylesheet for xml
* CSS class: source-xml, CSS id:
* GeSHi (C) 2004 - 2007 Nigel McNie (http://qbnz.com/highlighter)
*/
.source-xml .de1, .source-xml .de2 {font-family: 'Courier New', Courier, monospace; font-weight: normal;}
.source-xml {}
.source-xml .head {}
.source-xml .foot {}
.source-xml .imp {font-weight: bold; color: red;}
.source-xml .ln-xtra {color: #cc0; background-color: #ffc;}
.source-xml li {font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;}
.source-xml li.li2 {font-weight: bold;}
.source-xml .coMULTI {color: #808080; font-style: italic;}
.source-xml .es0 {color: #000099; font-weight: bold;}
.source-xml .br0 {color: #66cc66;}
.source-xml .st0 {color: #ff0000;}
.source-xml .nu0 {color: #cc66cc;}
.source-xml .sc0 {color: #00bbdd;}
.source-xml .sc1 {color: #ddbb00;}
.source-xml .sc2 {color: #339933;}
.source-xml .sc3 {color: #009900;}
.source-xml .re0 {color: #000066;}
.source-xml .re1 {font-weight: bold; color: black;}
.source-xml .re2 {font-weight: bold; color: black;}
/*]]>*/
</style>
<style type="text/css">/*<![CDATA[*/
@import "/index.php?title=MediaWiki:Geshi.css&usemsgcache=yes&action=raw&ctype=text/css&smaxage=18000";
/*]]>*/
</style><link rel="stylesheet" type="text/css" href="LuceneIntegration.html" /> </head>
<body class="mediawiki ns-0 ltr page-SMILA_Specifications_LuceneIntegration">
<div id="globalWrapper">
<div id="column-one">
<div id="novaContent" class="faux">
<div id="column-content">
<div id="content">
<a name="top" id="top"></a>
<h1 class="firstHeading">SMILA/Specifications/LuceneIntegration</h1>
<div id="bodyContent">
<table id="toc" class="toc" summary="Contents"><tr><td><div id="toctitle"><h2>Contents</h2></div>
<ul>
<li class="toclevel-1"><a href="LuceneIntegration.html#Description"><span class="tocnumber">1</span> <span class="toctext">Description</span></a></li>
<li class="toclevel-1"><a href="LuceneIntegration.html#Discussion"><span class="tocnumber">2</span> <span class="toctext">Discussion</span></a></li>
<li class="toclevel-1"><a href="LuceneIntegration.html#Status_Quo"><span class="tocnumber">3</span> <span class="toctext">Status Quo</span></a></li>
<li class="toclevel-1"><a href="LuceneIntegration.html#Technical_proposal"><span class="tocnumber">4</span> <span class="toctext">Technical proposal</span></a>
<ul>
<li class="toclevel-2"><a href="LuceneIntegration.html#Features"><span class="tocnumber">4.1</span> <span class="toctext">Features</span></a></li>
<li class="toclevel-2"><a href="LuceneIntegration.html#Lucene_specific_vs._generic"><span class="tocnumber">4.2</span> <span class="toctext">Lucene specific vs. generic</span></a></li>
<li class="toclevel-2"><a href="LuceneIntegration.html#Configuration"><span class="tocnumber">4.3</span> <span class="toctext">Configuration</span></a></li>
<li class="toclevel-2"><a href="LuceneIntegration.html#Bundles.2C_Packages.2C_Extension_Points"><span class="tocnumber">4.4</span> <span class="toctext">Bundles, Packages, Extension Points</span></a></li>
</ul>
</li>
</ul>
</td></tr></table><script type="text/javascript"> if (window.showTocToggle) { var tocShowText = "show"; var tocHideText = "hide"; showTocToggle(); } </script>
<a name="Description"></a><h1> <span class="mw-headline">Description</span></h1>
<p>This page describes the integration of Lucene as a sample indexing and search engine in Smila.
</p>
<a name="Discussion"></a><h1> <span class="mw-headline">Discussion</span></h1>
<a name="Status_Quo"></a><h1> <span class="mw-headline">Status Quo</span></h1>
<p>At the moment we have two ProcessingServices for indexing and searching records in Lucene:
</p>
<ul><li>LuceneIndexService
</li><li>LuceneSearchService
</li></ul>
<p>Both services support multiple indexes, selectable via annotations. For now, the brox anyfinder classes serve as the integration layer between these services and the Lucene API. The services, the Lucene index, and some search properties are configured through a mixture of two files: a service-specific mapping of record attributes to index fields (mappings.xml) and anyfinder's own DataDictionary (DataDictionary.xml).
</p><p><br />
</p>
<a name="Technical_proposal"></a><h1> <span class="mw-headline">Technical proposal</span></h1>
<p>One of the goals of Smila was to create the framework from scratch without any legacy code. Therefore we should refactor the anyfinder Lucene integration to contain only the classes that are needed. Below are some thoughts about issues with the current implementation and what to reuse or refactor:
</p>
<a name="Features"></a><h2> <span class="mw-headline">Features</span></h2>
<p>The following features should be supported by the integration:
</p>
<ul><li> configuration of index fields (analyzers, indexation, tokenization)
</li><li> simple search (query over a dedicated text field)
</li><li> advanced search (query over multiple fields and filter support)
</li><li> simple highlighting (return a formatted HTML text)
</li><li> advanced highlighting (return highlight positions and weights)
</li></ul>
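<p>To make the difference between the two highlighting features concrete, here is a minimal sketch in plain Java. It is not anyfinder or Lucene code: the class <code>HighlightSketch</code> and both method names are hypothetical, matching is a naive case-insensitive substring search, and highlight weights are left out entirely. The "advanced" variant returns match offsets, and the "simple" variant builds formatted HTML on top of them.
</p>

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration only; not part of Smila, anyfinder or Lucene. */
final class HighlightSketch {

  /**
   * Advanced highlighting: returns {start, end} offsets (end exclusive)
   * of every non-overlapping, case-insensitive occurrence of term in text.
   */
  static List<int[]> findPositions(String text, String term) {
    List<int[]> positions = new ArrayList<>();
    int len = term.length();
    if (len == 0) {
      return positions; // an empty term never matches
    }
    int i = 0;
    while (i + len <= text.length()) {
      if (text.regionMatches(true, i, term, 0, len)) {
        positions.add(new int[] {i, i + len});
        i += len; // continue after the match
      } else {
        i++;
      }
    }
    return positions;
  }

  /** Simple highlighting: wraps every match in <b>...</b> and escapes the rest. */
  static String toHtml(String text, String term) {
    StringBuilder html = new StringBuilder();
    int last = 0;
    for (int[] p : findPositions(text, term)) {
      html.append(escape(text.substring(last, p[0])));
      html.append("<b>").append(escape(text.substring(p[0], p[1]))).append("</b>");
      last = p[1];
    }
    html.append(escape(text.substring(last)));
    return html.toString();
  }

  /** Escapes the characters that are significant in HTML text content. */
  private static String escape(String s) {
    return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
  }
}
```

<p>A caller that only needs a snippet would use <code>toHtml</code>, while a client that renders its own markup would consume the offsets from <code>findPositions</code>. A real implementation would work on analyzed tokens rather than raw substrings.
</p>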
<p><br />
</p>
<a name="Lucene_specific_vs._generic"></a><h2> <span class="mw-headline">Lucene specific vs. generic</span></h2>
<p>Anyfinder abstracts from concrete search engines, offering a generic API for search engine integration. Smila offers the same, using the BPEL Pipelet/ProcessingService approach. Therefore most abstract classes or interfaces of anyfinder can be removed or merged with concrete Lucene implementations. This will minimize the number of classes.
</p><p><br />
</p>
<a name="Configuration"></a><h2> <span class="mw-headline">Configuration</span></h2>
<p>The configuration files mappings.xml and DataDictionary.xml should be merged into one XML configuration. The configuration for result and highlighting attributes should be a default configuration that is used if the search process does not explicitly request other results. As it is not relevant for the LuceneIndexService, it could be moved into a separate config file. The defined mapping of record attributes/attachments to index fields should be reused by the LuceneSearchService (by having a reference to the LuceneIndexService and providing methods to access the mapping information in both directions).
</p><p>Here are my ideas for an index and search configuration, reusing anyfinder concepts:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml"><span class="sc3"><span class="re1">&lt;LuceneIndexConfig<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Index</span> <span class="re0">Name</span>=<span class="st0">&quot;test_index&quot;</span> <span class="re0">MaxConnections</span>=<span class="st0">&quot;5&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;IndexStructure<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Analyzer</span> <span class="re0">ClassName</span>=<span class="st0">&quot;org.apache.lucene.analysis.standard.StandardAnalyzer&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Attribute</span> <span class="re0">name</span>=<span class="st0">&quot;Title&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;IndexField</span> <span class="re0">Name</span>=<span class="st0">&quot;Title&quot;</span> <span class="re0">IndexValue</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">StoreText</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">Tokenize</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">Type</span>=<span class="st0">&quot;Text&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/Attribute<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Attribute</span> <span class="re0">name</span>=<span class="st0">&quot;Url&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;IndexField</span> <span class="re0">Name</span>=<span class="st0">&quot;Url&quot;</span> <span class="re0">IndexValue</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">StoreText</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">Tokenize</span>=<span class="st0">&quot;false&quot;</span> <span class="re0">Type</span>=<span class="st0">&quot;Text&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Analyzer</span> <span class="re0">ClassName</span>=<span class="st0">&quot;org.apache.lucene.analysis.WhitespaceAnalyzer&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/IndexField<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Attribute<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Attribute</span> <span class="re0">name</span>=<span class="st0">&quot;LastModifiedDate&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;IndexField</span> <span class="re0">Name</span>=<span class="st0">&quot;LastModifiedDate&quot;</span> <span class="re0">IndexValue</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">StoreText</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">Tokenize</span>=<span class="st0">&quot;false&quot;</span> <span class="re0">Type</span>=<span class="st0">&quot;Text&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/Attribute<span class="re2">&gt;</span></span></span>
...
<span class="sc3"><span class="re1">&lt;Attachment</span> <span class="re0">path</span>=<span class="st0">&quot;Content&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;IndexField</span> <span class="re0">Name</span>=<span class="st0">&quot;Content&quot;</span> <span class="re0">IndexValue</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">StoreText</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">Tokenize</span>=<span class="st0">&quot;true&quot;</span> <span class="re0">Type</span>=<span class="st0">&quot;Text&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;/Attachment<span class="re2">&gt;</span></span></span>
...
<span class="sc3"><span class="re1">&lt;/IndexStructure<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Index<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Index</span> <span class="re0">Name</span>=<span class="st0">&quot;another_index&quot;</span> <span class="re0">MaxConnections</span>=<span class="st0">&quot;5&quot;</span><span class="re2">&gt;</span></span>
...
<span class="sc3"><span class="re1">&lt;/Index<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/LuceneIndexConfig<span class="re2">&gt;</span></span></span></pre></div>
<div dir="ltr" style="text-align: left;"><pre class="source-xml"><span class="sc3"><span class="re1">&lt;LuceneSearchConfig<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Index</span> <span class="re0">Name</span>=<span class="st0">&quot;test_index&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Result<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Attribute</span> <span class="re0">name</span>=<span class="st0">&quot;MimeType&quot;</span><span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Attribute</span> <span class="re0">name</span>=<span class="st0">&quot;LastModifiedDate&quot;</span> <span class="re2">/&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Attribute</span> <span class="re0">name</span>=<span class="st0">&quot;Url&quot;</span> <span class="re2">/&gt;</span></span>
...
<span class="sc3"><span class="re1">&lt;Attachment</span> <span class="re0">name</span>=<span class="st0">&quot;Content&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;HighlightingTransformer</span> <span class="re0">Name</span>=<span class="st0">&quot;urn:Sentence&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;ParameterSet<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Parameter</span> <span class="re0">Name</span>=<span class="st0">&quot;MaxLength&quot;</span> <span class="re0">xsi:type</span>=<span class="st0">&quot;Integer&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Value<span class="re2">&gt;</span></span></span>300<span class="sc3"><span class="re1">&lt;/Value<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Parameter<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Parameter</span> <span class="re0">Name</span>=<span class="st0">&quot;MaxHLElements&quot;</span> <span class="re0">xsi:type</span>=<span class="st0">&quot;Integer&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Value<span class="re2">&gt;</span></span></span>999<span class="sc3"><span class="re1">&lt;/Value<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Parameter<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Parameter</span> <span class="re0">Name</span>=<span class="st0">&quot;MaxSucceedingCharacters&quot;</span> <span class="re0">xsi:type</span>=<span class="st0">&quot;Integer&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Value<span class="re2">&gt;</span></span></span>30<span class="sc3"><span class="re1">&lt;/Value<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Parameter<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Parameter</span> <span class="re0">Name</span>=<span class="st0">&quot;SucceedingCharacters&quot;</span> <span class="re0">xsi:type</span>=<span class="st0">&quot;String&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Value<span class="re2">&gt;</span></span></span>...<span class="sc3"><span class="re1">&lt;/Value<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Parameter<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Parameter</span> <span class="re0">Name</span>=<span class="st0">&quot;SortAlgorithm&quot;</span> <span class="re0">xsi:type</span>=<span class="st0">&quot;String&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Value<span class="re2">&gt;</span></span></span>Occurrence<span class="sc3"><span class="re1">&lt;/Value<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Parameter<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;Parameter</span> <span class="re0">Name</span>=<span class="st0">&quot;TextHandling&quot;</span> <span class="re0">xsi:type</span>=<span class="st0">&quot;String&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;Value<span class="re2">&gt;</span></span></span>ReturnSnipplet<span class="sc3"><span class="re1">&lt;/Value<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Parameter<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/ParameterSet<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/HighlightingTransformer<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Attachment<span class="re2">&gt;</span></span></span>
...
<span class="sc3"><span class="re1">&lt;/Result<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/Index<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/LuceneSearchConfig<span class="re2">&gt;</span></span></span></pre></div>
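<p>The requirement that the mapping be accessible "in both directions" could look roughly like the following sketch. It assumes the proposed <code>LuceneIndexConfig</code> layout shown above, with each <code>Attribute</code> element properly closed, and reads it with the JDK's DOM parser. The class <code>IndexMappingSketch</code> and its methods are hypothetical and not an existing Smila API. <code>Attachment</code> elements are ignored to keep the example short.
</p>

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical reader for the proposed LuceneIndexConfig; all names are illustrative. */
final class IndexMappingSketch {
  private final Map<String, String> attributeToField = new LinkedHashMap<>();
  private final Map<String, String> fieldToAttribute = new LinkedHashMap<>();

  /** Reads the Attribute to IndexField mapping of the index with the given name. */
  static IndexMappingSketch parse(String xml, String indexName) {
    Document doc;
    try {
      doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    } catch (Exception e) {
      throw new IllegalArgumentException("Cannot parse index configuration", e);
    }
    IndexMappingSketch mapping = new IndexMappingSketch();
    NodeList indexes = doc.getElementsByTagName("Index");
    for (int i = 0; i < indexes.getLength(); i++) {
      Element index = (Element) indexes.item(i);
      if (!indexName.equals(index.getAttribute("Name"))) {
        continue; // configuration of another index
      }
      NodeList attributes = index.getElementsByTagName("Attribute");
      for (int j = 0; j < attributes.getLength(); j++) {
        Element attribute = (Element) attributes.item(j);
        NodeList fields = attribute.getElementsByTagName("IndexField");
        if (fields.getLength() == 0) {
          continue; // attribute without an index field
        }
        String attributeName = attribute.getAttribute("name");
        String fieldName = ((Element) fields.item(0)).getAttribute("Name");
        mapping.attributeToField.put(attributeName, fieldName);
        mapping.fieldToAttribute.put(fieldName, attributeName);
      }
    }
    return mapping;
  }

  /** Indexing direction: record attribute name to index field name, or null if unmapped. */
  String fieldFor(String attributeName) {
    return attributeToField.get(attributeName);
  }

  /** Search direction: index field name back to record attribute name, or null if unmapped. */
  String attributeFor(String fieldName) {
    return fieldToAttribute.get(fieldName);
  }
}
```

<p>In this sketch the index service would call <code>fieldFor</code> while writing documents, and the search service would call <code>attributeFor</code> when turning hits back into records, so that both share one source of mapping information.
</p>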
<p><br />
In addition, a Lucene index needs two special IndexFields that are not configurable but fixed:
</p>
<ul><li> ID: a hashed version of the record Id, used to identify the record in the index
</li><li> XMLID: the XML representation of the record Id. It is only stored in the index and is part of every result, because Id objects are created from it
</li></ul>
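<p>How the ID value is derived is not specified here. As one possible illustration, the sketch below hashes the XML representation of a record Id with SHA-256 and returns it as a hex string. The choice of SHA-256, the class name <code>RecordIdHashSketch</code> and the example Id XML in the usage note are assumptions, not the actual Smila implementation.
</p>

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper; the hash function actually used by Smila may differ. */
final class RecordIdHashSketch {

  /** Returns the lowercase hex SHA-256 digest of the XML form of a record Id. */
  static String hashId(String xmlId) {
    try {
      byte[] digest = MessageDigest.getInstance("SHA-256")
          .digest(xmlId.getBytes(StandardCharsets.UTF_8));
      StringBuilder hex = new StringBuilder(digest.length * 2);
      for (byte b : digest) {
        hex.append(String.format("%02x", b)); // two hex digits per byte
      }
      return hex.toString();
    } catch (NoSuchAlgorithmException e) {
      // SHA-256 must be provided by every Java platform implementation
      throw new IllegalStateException("SHA-256 not available", e);
    }
  }
}
```

<p>For a made-up Id such as <code>&lt;Id&gt;&lt;Key&gt;http://example.org/a&lt;/Key&gt;&lt;/Id&gt;</code>, <code>hashId</code> yields a fixed-length 64-character value for the ID field, while the unhashed XML would go into XMLID so that the Id object can be rebuilt from a search result.
</p>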
<p><br />
</p>
<a name="Bundles.2C_Packages.2C_Extension_Points"></a><h2> <span class="mw-headline">Bundles, Packages, Extension Points</span></h2>
<p>All classes needed for Lucene integration should be in the bundle org.eclipse.smila.lucene or in bundles extending this package structure.
org.eclipse.smila.search should be reserved for Smila Search API and more generic stuff to come (perhaps the highlighting transformer could fit in there).
</p><p>There are some packages and many classes whose purpose is unclear to me:
</p>
<ul><li> org.eclipse.smila.transformation (except the Highlighting* classes)
</li><li> org.eclipse.smila.transformation.transformer
</li><li> org.eclipse.smila.search.datadictionary - should most of these classes be generated by JAXB?
</li><li> org.eclipse.smila.search.feature
</li><li> org.eclipse.smila.search.irm
</li><li> org.eclipse.smila.search.tools - why are such common classes as exceptions in here?
</li><li> org.eclipse.smila.search.tools.indexstructur (seems to be obsolete if merged with org.eclipse.smila.lucene)
</li><li> What are all those D-classes for? Why are there duplicate class names in different packages? They seem to be wrapper classes that could be replaced by the Lucene classes themselves.
</li><li> What are all those template classes about? I guess we don't need them anymore.
</li></ul>
<p>The anyfinder bundles also make use of extension points. What are they used for? I don't think they are needed for a concrete Lucene integration.
</p>
<div class="visualClear"></div>
</div>
</div>
</div>
<!-- Yoink of toolbox for phoenix moved up -->
</div>
</div>
<div id="clearFooter"/>
<div id="footer" >
<span id="copyright">Copyright &copy; 2012 The Eclipse Foundation. All Rights Reserved</span>
<p id="footercredit">This page was last modified 16:40, 25 February 2009 by <a href="http://wiki.eclipse.org/User:Daniel.stucky.empolis.com" title="User:Daniel.stucky.empolis.com">Daniel Stucky</a>. </p>
</div>
</div>
</body></html>