SMILA/Glossary - Eclipsepedia
<body class="mediawiki ns-0 ltr page-SMILA_Glossary">
<div id="globalWrapper">
<div id="column-content">
<div id="content">
<a name="top" id="top"></a>
<h1 class="firstHeading">SMILA/Glossary</h1>
<table class="wikitable">
<th> <a href="Glossary.html#A" title="">A</a> |
</th><th> <a href="Glossary.html#B" title="">B</a> |
</th><th> <a href="Glossary.html#C" title="">C</a> |
</th><th> <a href="Glossary.html#D" title="">D</a> |
</th><th> <a href="Glossary.html#E" title="">E</a> |
</th><th> <a href="Glossary.html#F" title="">F</a> |
</th><th> <a href="Glossary.html#G" title="">G</a> |
</th><th> <a href="Glossary.html#H" title="">H</a> |
</th><th> <a href="Glossary.html#I" title="">I</a> |
</th><th> <a href="Glossary.html#J" title="">J</a> |
</th><th> <a href="Glossary.html#K" title="">K</a> |
</th><th> <a href="Glossary.html#L" title="">L</a> |
</th><th> <a href="Glossary.html#M" title="">M</a> |
</th><th> <a href="Glossary.html#N" title="">N</a> |
</th><th> <a href="Glossary.html#O" title="">O</a> |
</th><th> <a href="Glossary.html#P" title="">P</a> |
</th><th> <a href="Glossary.html#Q" title="">Q</a> |
</th><th> <a href="Glossary.html#R" title="">R</a> |
</th><th> <a href="Glossary.html#S" title="">S</a> |
</th><th> <a href="Glossary.html#T" title="">T</a> |
</th><th> <a href="Glossary.html#U" title="">U</a> |
</th><th> <a href="Glossary.html#V" title="">V</a> |
</th><th> <a href="Glossary.html#W" title="">W</a> |
</th><th> <a href="Glossary.html#X" title="">X</a> |
</th><th> <a href="Glossary.html#Y" title="">Y</a> |
</th><th> <a href="Glossary.html#Z" title="">Z</a>
<p><br />
<a name="A"></a><h2> <span class="mw-headline"> A </span></h2>
<ul><li> <b>Action</b> - An action is one step in an <a href="Glossary.html#W" title="">asynchronous workflow</a> associated with a certain <a href="Glossary.html#W" title="">worker</a> that does the actual processing.
</li><li> <b>Attachment</b> - Attachments are parts of <a href="Glossary.html#R" title="">records</a> used to store large binary data such as document content.
</li><li> <b>Attribute</b> - Attributes are parts of <a href="Glossary.html#R" title="">records</a> and contain simple data objects that are easily represented in XML or JSON, such as <tt>String</tt>, <tt>Integer</tt>, <tt>Float</tt>, and <tt>Date</tt>.
<a name="B"></a><h2> <span class="mw-headline"> B </span></h2>
<ul><li> <b>Blackboard</b> or <b>blackboard service</b> - The blackboard service manages SMILA <a href="Glossary.html#R" title="">records</a> during processing in a SMILA component (connectivity, workflow processor). In addition it hides the handling of record persistence from these components. For a complete description see <a href="Documentation/Usage_of_Blackboard_Service.html" title="SMILA/Documentation/Usage of Blackboard Service">Usage of Blackboard Service</a>.
<ul><li> <b><a href="" class="external text" title="" rel="nofollow">BPEL</a></b> - BPEL is an XML-based language defining several constructs to write business processes. It defines a set of basic control structures like conditions or loops as well as elements to invoke web services and receive messages from services. It relies on <a href="Glossary.html#W" title="">WSDL</a> to express web services interfaces. Message structures can be manipulated, assigning parts or the whole of them to variables that can in turn be used to send other messages.
<ul><li> <b>Bucket</b> - Data container in an <a href="Glossary.html#W" title="">asynchronous workflow</a>, containing logically grouped <a href="Glossary.html#D" title="">data objects</a> of the same type. Can be <i>transient</i> for interim data, which means that data is not persisted and removal of data is under job management control, or <i>persistent</i>, which means that removal of data is not under job management control.
<ul><li> <b>Bulk</b> - A number of <a href="Glossary.html#R" title="">records</a> bundled in a single file to enhance throughput when processing records in <a href="Glossary.html#W" title="">asynchronous workflows</a>.
<ul><li> <b>Bulkbuilder</b> - An <a href="Glossary.html#W" title="">asynchronous workflow</a> <a href="Glossary.html#W" title="">worker</a> that accepts single <a href="Glossary.html#R" title="">records</a> and combines them into a <a href="Glossary.html#B" title="">bulk</a>. See <a href="Documentation/Bulkbuilder.html" title="SMILA/Documentation/Bulkbuilder">Bulkbuilder documentation</a>.
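<p>The Bulkbuilder entry above describes combining single records into bulks. The following is only a minimal Python sketch of that grouping idea, not SMILA's implementation: the function name <tt>build_bulks</tt> and the <tt>max_records</tt> limit are hypothetical, and the criteria the real Bulkbuilder uses to close a bulk are not stated here.</p>

```python
from typing import Iterable, Iterator, List


def build_bulks(records: Iterable[dict], max_records: int = 2) -> Iterator[List[dict]]:
    """Group single records into bulks of at most max_records each.

    Hypothetical helper for illustration; max_records is an assumed limit.
    """
    if max_records < 1:
        raise ValueError("max_records must be at least 1")
    bulk: List[dict] = []
    for record in records:
        bulk.append(record)
        if len(bulk) == max_records:
            # The bulk is full: hand it on and start a new one.
            yield bulk
            bulk = []
    if bulk:
        # Emit the remaining records as a final, smaller bulk.
        yield bulk


incoming = [{"_recordid": "id1"}, {"_recordid": "id2"}, {"_recordid": "id3"}]
bulks = list(build_bulks(incoming, max_records=2))
```

<p>With three incoming records and a limit of two, the sketch yields one full bulk and one bulk holding the remaining record.</p>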
<a name="C"></a><h2> <span class="mw-headline"> C </span></h2>
<ul><li> <b>Crawler</b> - A crawler is a special <a href="Glossary.html#W" title="">worker</a> in an <a href="Glossary.html#W" title="">asynchronous workflow</a> that imports data from a data source (e.g. file system, web, or database) into SMILA. It iterates over the data elements and creates <a href="Glossary.html#R" title="">records</a> for all elements that will be further processed in the workflow. In general, crawlers (or rather, crawl workflows) are used for the initial (bulk) import of data sources. See SMILA <a href="Documentation.1.html#Importing" title="SMILA/Documentation">Importing</a> for more details.
<a name="D"></a><h2> <span class="mw-headline"> D </span></h2>
<ul><li> <b>Data Object</b> - The smallest unit of data handled by an asynchronous workflow (e.g. a <a href="Glossary.html#R" title="">record bulk</a>).
<ul><li> <b>DeltaChecker</b> - The DeltaChecker is a <a href="Glossary.html#W" title="">worker</a> in an (asynchronous) import <a href="Glossary.html#W" title="">workflow</a> that handles <a href="Glossary.html#D" title="">delta indexing</a>.
<ul><li> <b>Delta indexing</b> - Delta indexing is also known as incremental or generation-based indexing. It is driven by the <tt>DeltaChecker</tt> <a href="Glossary.html#W" title="">worker</a>. See SMILA <a href="Documentation.1.html#Importing" title="SMILA/Documentation">Importing</a> for more details.
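<p>To illustrate the general idea of incremental indexing mentioned above, here is a small Python sketch that forwards only records that are new or have changed since the previous import. It is an assumption-laden illustration, not a description of how SMILA's <tt>DeltaChecker</tt> works: the hash-based comparison and the names <tt>fingerprint</tt>, <tt>delta_check</tt>, and <tt>known</tt> are all hypothetical.</p>

```python
import hashlib
import json
from typing import Dict, Iterable, List


def fingerprint(record: dict) -> str:
    """Compute a stable hash over a record's content (assumed comparison criterion)."""
    payload = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def delta_check(records: Iterable[dict], known: Dict[str, str]) -> List[dict]:
    """Return the records that are new or changed and remember their fingerprints.

    `known` maps record IDs to the fingerprint seen in the previous run.
    """
    changed = []
    for record in records:
        record_id = record["_recordid"]
        digest = fingerprint(record)
        if known.get(record_id) != digest:
            known[record_id] = digest
            changed.append(record)
    return changed


state: Dict[str, str] = {}
first_run = delta_check([{"_recordid": "id1", "title": "a"}], state)
second_run = delta_check([{"_recordid": "id1", "title": "a"}], state)
third_run = delta_check([{"_recordid": "id1", "title": "b"}], state)
```

<p>In this sketch the first run forwards the record, the second run skips it because nothing changed, and the third run forwards it again because its content differs.</p>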
<a name="E"></a><h2> <span class="mw-headline"> E </span></h2>
<ul><li> <b><a href="" class="external text" title="" rel="nofollow">Eclipse</a></b> - Eclipse is an open source community, whose projects are focused on building an open development platform comprised of extensible frameworks, tools and runtimes for building, deploying and managing software across the lifecycle.
</li><li> <b>EILF</b> - EILF (Enterprise Information Logistics Framework) was the original proposed name of SMILA. Since this abbreviation was difficult to pronounce, it was not accepted by the community and thus changed to SMILA.
</li><li> <b><a href="" class="external text" title="" rel="nofollow">Equinox</a></b> - Equinox is a base technology from <a href="" class="external text" title="" rel="nofollow">Eclipse</a> implementing the <a href="Glossary.html#O" title="">OSGi</a> specification. Besides delivering a high-performance class loading mechanism, Equinox provides an environment for managing component dependencies.
<a name="F"></a><h2> <span class="mw-headline"> F </span></h2>
<ul><li> <b>Fetcher</b> - A fetcher is a <a href="Glossary.html#W" title="">worker</a> in an (asynchronous) import <a href="Glossary.html#W" title="">workflow</a> that receives records containing a URL, file path, or similar reference from a <a href="Glossary.html#C" title="">crawler</a>, fetches the actual content (e.g. of files) from the data source, attaches it to the <a href="Glossary.html#R" title="">records</a>, and sends them to the <a href="Glossary.html#U" title="">UpdatePusher</a>. Examples are <tt>FileFetcherWorker</tt> and <tt>WebFetcherWorker</tt>. See SMILA <a href="Documentation.1.html#Importing" title="SMILA/Documentation">Importing</a> for more details.
<a name="G"></a><h2> <span class="mw-headline"> G </span></h2>
<a name="H"></a><h2> <span class="mw-headline"> H </span></h2>
<a name="I"></a><h2> <span class="mw-headline"> I </span></h2>
<ul><li> <b>ID</b> - An ID identifies a <a href="Glossary.html#R" title="">record</a> in SMILA and is part of a <a href="Glossary.html#R" title="">record's</a> metadata.
<a name="J"></a><h2> <span class="mw-headline"> J </span></h2>
<ul><li> <b><a href="" class="external text" title="" rel="nofollow">JMX</a></b> - Java Management Extensions (JMX) is a specification for administering and monitoring Java applications.
<ul><li> <b>Job</b> - A Job is a description of a distinct and repeatable working process that the system should accomplish. It references and parametrizes an <a href="Glossary.html#W" title="">asynchronous workflow</a>.
<ul><li> <b>Job Run</b> - A Job Run is an "instance" of a Job, for example one run of an import of a data source to an index. Only one active job run can exist per job. Statistics are accumulated for each job run. A job run is automatically stopped when SMILA shuts down.
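<p>The relationship between a job and its job runs, including the rule that only one job run can be active per job, can be sketched in Python as follows. This is an illustrative model only: the class <tt>Job</tt>, the exception <tt>JobRunError</tt>, the run ID format, and the example job and workflow names are hypothetical and do not reflect SMILA's job management API.</p>

```python
from typing import Dict, Optional


class JobRunError(Exception):
    """Raised when a second run is started while one is still active."""


class Job:
    """A job references an asynchronous workflow and parametrizes it."""

    def __init__(self, name: str, workflow: str, parameters: Optional[Dict[str, str]] = None):
        self.name = name
        self.workflow = workflow
        self.parameters = parameters or {}
        self.active_run_id: Optional[str] = None
        self._run_counter = 0

    def start_run(self) -> str:
        """Start a new job run; only one active run is allowed per job."""
        if self.active_run_id is not None:
            raise JobRunError(f"job {self.name!r} already has an active run")
        self._run_counter += 1
        self.active_run_id = f"{self.name}-run-{self._run_counter}"  # assumed ID format
        return self.active_run_id

    def finish_run(self) -> None:
        """Mark the active run as finished so that a new run can be started."""
        self.active_run_id = None


job = Job("importFiles", "fileCrawlingWorkflow", {"dataSource": "docs"})
first_run_id = job.start_run()
```

<p>Calling <tt>start_run()</tt> again before <tt>finish_run()</tt> raises <tt>JobRunError</tt> in this sketch, which mirrors the one-active-run rule.</p>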
<a name="K"></a><h2> <span class="mw-headline"> K </span></h2>
<a name="L"></a><h2> <span class="mw-headline"> L </span></h2>
<a name="M"></a><h2> <span class="mw-headline"> M </span></h2>
<ul><li> <b>Micro bulk</b> - A (small) bundle of <a href="Glossary.html#R" title="">records</a> in a single file that can be pushed into the system using the <a href="Glossary.html#B" title="">Bulkbuilder</a>. The micro bulk as a whole is not a JSON document; it is a file in which each line must consist of a single JSON representation of a <a href="Glossary.html#R" title="">record</a>. For example:
<pre>
{&quot;_recordid&quot;: &quot;id1&quot;, &quot;attribute1&quot;: &quot;attribute1&quot;, ...}
{&quot;_recordid&quot;: &quot;id2&quot;, &quot;attribute1&quot;: &quot;attribute2&quot;, ...}
{&quot;_recordid&quot;: &quot;id3&quot;, &quot;attribute1&quot;: &quot;attribute3&quot;, ...}
</pre>
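<p>The line-per-record layout of a micro bulk can be produced and read back with a few lines of Python. This is a minimal sketch: the helper names <tt>to_micro_bulk</tt> and <tt>parse_micro_bulk</tt> are hypothetical, and how the resulting text is submitted to the Bulkbuilder is not covered here.</p>

```python
import json
from typing import Iterable, List


def to_micro_bulk(records: Iterable[dict]) -> str:
    """Serialize records as one JSON object per line."""
    # json.dumps without an indent never emits raw line breaks,
    # so each record stays on exactly one line.
    return "\n".join(json.dumps(record) for record in records)


def parse_micro_bulk(text: str) -> List[dict]:
    """Parse a micro bulk back into records, ignoring blank lines."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


records = [
    {"_recordid": "id1", "attribute1": "attribute1"},
    {"_recordid": "id2", "attribute1": "attribute2"},
]
micro_bulk = to_micro_bulk(records)
```

<p>Note that the text as a whole is not valid JSON; only each individual line is, which is why the parser works line by line.</p>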
<a name="N"></a><h2> <span class="mw-headline"> N </span></h2>
<a name="O"></a><h2> <span class="mw-headline"> O </span></h2>
<ul><li> <b><a href="" class="external text" title="" rel="nofollow">ODE</a></b> - Apache ODE (Orchestration Director Engine) executes business processes following the <a href="Glossary.html#B" title="">BPEL</a>/<a href="Glossary.html#W" title="">WS-BPEL</a> standard. It talks to web services, sending and receiving messages, handling data manipulation and error recovery as described by your process definition. It supports both long and short living process executions to orchestrate all the services that are part of your application.
<ul><li> <b><a href="" class="external text" title="" rel="nofollow">OSGi</a></b> - The OSGi specification is about managing a component based software system. It defines an in-VM Service Oriented Architecture (SOA) for networked systems. An OSGi Service Platform provides a standardized, component-oriented computing environment for cooperating networked services. This architecture significantly reduces the overall complexity of building, maintaining, and deploying applications.
<a name="P"></a><h2> <span class="mw-headline"> P </span></h2>
<ul><li> <b>Pipelet</b> - A pipelet is a reusable component (POJO) in a <a href="Glossary.html#B" title="">BPEL</a> workflow used to process data contained in <a href="Glossary.html#R" title="">records</a>. See <a href="Documentation/Pipelets.html" title="SMILA/Documentation/Pipelets">Pipelets</a> for details.
</li><li> <b>Pipeline</b> - A pipeline is the definition of a <a href="Glossary.html#B" title="">BPEL</a> process (or workflow) that orchestrates pipelets and other BPEL services (e.g. web services).
<a name="Q"></a><h2> <span class="mw-headline"> Q </span></h2>
<a name="R"></a><h2> <span class="mw-headline"> R </span></h2>
<ul><li> <b>Record</b> - A record is the basic element in SMILA that contains the data to process (e.g. content and metadata of a document). The record consists of metadata elements; see <a href="Documentation/Data_Model_and_Serialization_Formats.html" title="SMILA/Documentation/Data Model and Serialization Formats">SMILA/Documentation/Data_Model_and_Serialization_Formats</a> for details.
<ul><li> <b>Record Bulk</b> - A <a href="Glossary.html#D" title="">Data Object</a> containing a sequence of <a href="Glossary.html#R" title="">Records</a>.
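<p>The glossary describes a record as carrying an ID, attributes (simple values), and attachments (large binary data). A rough Python model of that shape might look like the sketch below. The class and method names are hypothetical, the <tt>_recordid</tt> key is taken from the micro bulk example, and leaving attachments out of the JSON-friendly view is an assumption made for this illustration.</p>

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Record:
    """Illustrative record model: ID, simple attributes, binary attachments."""
    record_id: str
    attributes: Dict[str, Any] = field(default_factory=dict)
    attachments: Dict[str, bytes] = field(default_factory=dict)

    def metadata(self) -> Dict[str, Any]:
        """Return the ID and attributes as a JSON-friendly dictionary.

        Attachments are omitted here (an assumption of this sketch),
        since they hold binary data rather than simple values.
        """
        return {"_recordid": self.record_id, **self.attributes}


record = Record(
    record_id="id1",
    attributes={"title": "Annual report", "size": 1024},
    attachments={"content": b"%PDF-1.4"},
)
```

<p>Keeping attributes and attachments in separate fields reflects the distinction the glossary draws between simple data objects and large binary content.</p>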
<a name="S"></a><h2> <span class="mw-headline"> S </span></h2>
<ul><li> <b>Slot</b> - An (input/output) slot is a description of the input/output behaviour of a <a href="Glossary.html#W" title="">worker</a>. In a concrete <a href="Glossary.html#W" title="">asynchronous workflow</a>, slots are assigned to <a href="Glossary.html#B" title="">buckets</a>.
<ul><li> <b>SNMP</b> - Simple Network Management Protocol is a network protocol which controls the communication between supervised devices and the monitoring application (e.g. <a href="Glossary.html#J" title="">JMX</a>).
<a name="T"></a><h2> <span class="mw-headline"> T </span></h2>
<ul><li> <b>Task</b> - Description of a single unit of work to be processed by a <a href="Glossary.html#W" title="">Worker</a>. A task can contain worker specific properties.
</li><li> <b><a href="" class="external text" title="" rel="nofollow">Tika</a></b> - The Apache Tika™ toolkit detects and extracts metadata and structured text content from various documents using existing parser libraries.
<a name="U"></a><h2> <span class="mw-headline"> U </span></h2>
<ul><li> <b>UpdatePusher</b> - The UpdatePusher is a <a href="Glossary.html#W" title="">worker</a> in an (asynchronous) import <a href="Glossary.html#W" title="">workflow</a> that pushes the crawled records to the <a href="Glossary.html#B" title="">BulkBuilder</a> of a running import <a href="Glossary.html#J" title="">job</a>.
<a name="V"></a><h2> <span class="mw-headline"> V </span></h2>
<a name="W"></a><h2> <span class="mw-headline"> W </span></h2>
<ul><li> <b>Worker</b> - Single processing component in an asynchronous workflow. Pulls <a href="Glossary.html#T" title="">tasks</a> to process. Defined in a worker description.
<ul><li> <b>Worker Description</b> - Description of a worker, e.g. its input/output <a href="Glossary.html#S" title="">slots</a>.
<ul><li> <b>Workflow (asynchronous)</b> - Describes an asynchronously processed workflow by specifying a sequence of workers and associating their input/output <a href="Glossary.html#S" title="">slots</a> to <a href="Glossary.html#B" title="">buckets</a>.
<ul><li> <b>Workflow (synchronous/BPEL)</b> - see <a href="Glossary.html#P" title="">pipeline</a>
<ul><li> <b>Workflow run</b> - Single traversal of a workflow.
<ul><li> <b><a href="" class="external text" title="" rel="nofollow">WSDL</a></b> - WSDL is an XML format for describing network services as a set of endpoints operating on messages containing either document-oriented or procedure-oriented information. The operations and messages are described abstractly, and then bound to a concrete network protocol and message format to define an endpoint. Related concrete endpoints are combined into abstract endpoints (services). WSDL is extensible to allow description of endpoints and their messages regardless of what message formats or network protocols are used to communicate.
<ul><li> <b><a href="" class="external text" title="" rel="nofollow">WS-BPEL</a></b> - see <a href="Glossary.html#B" title="">BPEL</a>
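<p>The asynchronous workflow entry above says that a workflow associates the input/output slots of its workers with buckets. The Python sketch below models that wiring and derives which workers follow a given worker through shared buckets. It is an illustration under stated assumptions: the <tt>Action</tt> class, the <tt>successors</tt> function, and all worker, slot, and bucket names are hypothetical and not taken from SMILA's workflow definition format.</p>

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Action:
    """One workflow step: a worker plus its slot-to-bucket assignments."""
    worker: str
    inputs: Dict[str, str] = field(default_factory=dict)   # slot name -> bucket name
    outputs: Dict[str, str] = field(default_factory=dict)  # slot name -> bucket name


def successors(actions: List[Action], worker: str) -> List[str]:
    """Return the workers that read from any bucket the given worker writes to."""
    produced = set()
    for action in actions:
        if action.worker == worker:
            produced.update(action.outputs.values())
    return [a.worker for a in actions if produced & set(a.inputs.values())]


workflow = [
    Action("crawler", outputs={"crawledRecords": "crawledBucket"}),
    Action("fetcher",
           inputs={"linksToFetch": "crawledBucket"},
           outputs={"fetchedRecords": "fetchedBucket"}),
    Action("updatePusher", inputs={"recordsToPush": "fetchedBucket"}),
]
```

<p>Because workers are connected only through buckets, the order of processing follows from the slot assignments rather than from the order in which the actions are listed.</p>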
<a name="X"></a><h2> <span class="mw-headline"> X </span></h2>
<a name="Y"></a><h2> <span class="mw-headline"> Y </span></h2>
<a name="Z"></a><h2> <span class="mw-headline"> Z </span></h2>
