blob: b7a6e2f0f1a140cd088b427c71c3b9fad50c1749 [file] [log] [blame]
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<html>
<body bgcolor="white">
This package provides classes for processing bidirectional
text with a special structure to ensure its proper presentation.
<p>
There are various types of structured text. Each type should
be handled by a specific <i>type handler</i>. A number of
standard type handlers are supplied with this package.
<h2>Introduction to Structured Text</h2>
<p>
Bidirectional text offers interesting challenges to presentation systems.
For plain text, the Unicode Bidirectional Algorithm
(<a href="http://www.unicode.org/reports/tr9/">UBA</a>)
generally specifies satisfactorily how to reorder bidirectional text for
display. This algorithm is implemented in many presentation and operating
systems, like Java/Swing, Windows, Linux.
</p><p>
However, all bidirectional text is not necessarily plain text. There
are also instances of text structured to follow a given syntax, which
should be reflected in the display order. The general algorithm, which
has no awareness of these special cases, often gives incorrect results
when displaying such structured text.
</p><p>
The general idea in handling structured text in this package is to add
directional formatting characters at proper locations in the text to
supplement the standard algorithm, so that the final result is correctly
displayed using the UBA.
</p><p>
A class which handles structured text is thus essentially a
transformation engine which receives text without directional formatting
characters as input and produces as output the same text with added
directional formatting characters, hopefully in the minimum quantity
which is sufficient to ensure correct display, considering the type of
structured text involved.
</p><p>
In this package, text without directional formatting characters is
called <b><i>lean</i></b> text while the text with added directional
formatting characters is called <b><i>full</i></b> text.
</p><p>
The class
<a href="StructuredTextProcessor.html"><b>StructuredTextProcessor</b></a>
is the main tool for processing structured text. It facilitates
handling several types of structured text, each type being handled
by a specific
<a href="./custom/StructuredTextTypeHandler.html">
<b><i>type handler</i></b></a> :</p>
<ul>
<li>comma delimited list</li>
<li>e-mail address</li>
<li>directory and file path</li>
<li>Java code</li>
<li>regular expression</li>
<li>SQL statements</li>
<li>compound name (xxx_yy_zzzz)</li>
<li>URL</li>
<li>Xpath</li>
</ul>
<p>
For each of these types, an identifier is defined in
<a href="StructuredTextTypeHandlerFactory.html">
<b><i>StructuredTextTypeHandlerFactory</i></b></a> :</p>
<p>
These identifiers can be used as argument in some methods of
<b>StructuredTextProcessor</b> to specify the type of handler to apply.
</p><p>
The classes included in this package are intended for users who
need to process structured text in the most straightforward
manner, when the following conditions are satisfied:
<ul>
<li>There exists an appropriate handler for the type of the
structured text.</li>
<li>There is no need to specify non-default conditions related
to the <a href="advanced/StructuredTextEnvironment.html">
environment</a>.</li>
<li>The only operations needed are to transform <i>lean</i> text
into <i>full</i> text or vice versa.</li>
<li>There is no interdependence between the processing of a
given string and the processing of preceding or
succeeding strings.</li>
</ul>
<p>
When their needs go beyond the conditions above,
users can use classes in the
<a href="advanced/package-summary.html">
org.eclipse.equinox.bidi.advanced</a>}package.
</p><p>
Developers who want to develop new handlers to support types of
structured text not currently supported can use components
of the package <a href="custom/package-summary.html">
org.eclipse.equinox.bidi.custom</a>.
The source code of packages org.eclipse.equinox.bidi.* can serve as example of
how to develop processors for currently unsupported types of structured text.
</p><p>
However, users wishing to process the currently supported types of structured
text typically don't need to deal with the org.eclipse.equinox.bidi.custom
package.
</p>
<h2>Abbreviations used in the documentation of this package</h2>
<dl>
<dt><b>UBA</b><dd>Unicode Bidirectional Algorithm
<dt><b>Bidi</b><dd>Bidirectional
<dt><b>GUI</b><dd>Graphical User Interface
<dt><b>UI</b><dd>User Interface
<dt><b>LTR</b><dd>Left to Right
<dt><b>RTL</b><dd>Right to Left
<dt><b>LRM</b><dd>Left-to-Right Mark
<dt><b>RLM</b><dd>Right-to-Left Mark
<dt><b>LRE</b><dd>Left-to-Right Embedding
<dt><b>RLE</b><dd>Right-to-Left Embedding
<dt><b>PDF</b><dd>Pop Directional Formatting
</dl>
<p>&nbsp;</p>
<h2>Known Limitations</h2>
<p>The proposed solution is making extensive usage of LRM, RLM, LRE, RLE
and PDF directional controls which are invisible but affect the way bidi
text is displayed. The following related key points merit special
attention:</p>
<ul>
<li>Implementations of the UBA on various platforms (e.g., Windows and
Linux) are very similar but nevertheless have known differences. Those
differences are minor and will not have a visible effect in most cases.
However there might be cases in which the same bidi text on two
platforms will look different. These differences will surface in Java
applications when they use the platform visual components for their UI
(e.g., AWT, SWT).</li>
<li>It is assumed that the presentation engine supports LRE, RLE and
PDF directional formatting characters.</li>
<li>Because some presentation engines are not strictly conformant to the
UBA, the implementation of structured text in this package adds LRM
or RLM characters in association with LRE, RLE or PDF in cases where
this would not be needed if the presentation engine was fully conformant
to the UBA. Such added marks will not have harmful effects on
conformant presentation engines and will help less conformant engines to
achieve the desired presentation.</li>
</ul>
</body>
</html>