<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="keywords" content="SMILA/Project Concepts/MimeTypeIdentifier" />
<link rel="shortcut icon" href="http://wiki.eclipse.org/SMILA/Project_Concepts/favicon.ico" />
<link rel="search" type="application/opensearchdescription+xml" href="http://wiki.eclipse.org/opensearch_desc.php" title="Eclipsepedia (English)" />
<link rel="alternate" type="application/rss+xml" title="Eclipsepedia RSS Feed" href="http://wiki.eclipse.org/index.php?title=Special:Recentchanges&amp;feed=rss" />
<link rel="alternate" type="application/atom+xml" title="Eclipsepedia Atom Feed" href="http://wiki.eclipse.org/index.php?title=Special:Recentchanges&amp;feed=atom" />
<title>SMILA/Project Concepts/MimeTypeIdentifier - Eclipsepedia</title>
<style type="text/css" media="screen,projection">/*<![CDATA[*/ @import "http://wiki.eclipse.org/skins/eclipsenova/novaWide.css?116"; /*]]>*/</style>
<link rel="stylesheet" type="text/css" media="print" href="http://wiki.eclipse.org/skins/eclipsenova/eclipsenovaPrint.css?116" />
<link rel="stylesheet" type="text/css" media="handheld" href="http://wiki.eclipse.org/skins/eclipsenova/handheld.css?116" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/header.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/tabs.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/visual.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/layout.css" media="screen" />
<link rel="stylesheet" type="text/css" href="http://wiki.eclipse.org/skins/eclipsenova/Nova/css/footer.css" media="screen" />
<!--[if IE]><link rel="stylesheet" type="text/css" href="/skins/eclipsenova/IEpngfix.css" media="screen" /><![endif]-->
<!--[if lt IE 5.5000]><style type="text/css">@import "/skins/eclipsenova/IE50Fixes.css?116";</style> <![endif]-->
<!--[if IE 5.5000]><style type="text/css">@import "/skins/eclipsenova/IE55Fixes.css?116";</style><![endif]-->
<!--[if IE 6]><style type="text/css">@import "/skins/eclipsenova/IE60Fixes.css?116";</style><![endif]-->
<!--[if IE 7]><style type="text/css">@import "/skins/eclipsenova/IE70Fixes.css?116";</style><![endif]-->
<!--[if lt IE 7]><script type="text/javascript" src="/skins/common/IEFixes.js?116"></script>
<meta http-equiv="imagetoolbar" content="no" /><![endif]-->
<script type= "text/javascript">/*<![CDATA[*/
var skin = "eclipsenova";
var stylepath = "/skins";
var wgArticlePath = "/$1";
var wgScriptPath = "";
var wgScript = "/index.php";
var wgServer = "http://wiki.eclipse.org";
var wgCanonicalNamespace = "";
var wgCanonicalSpecialPageName = false;
var wgNamespaceNumber = 0;
var wgPageName = "SMILA/Project_Concepts/MimeTypeIdentifier";
var wgTitle = "SMILA/Project Concepts/MimeTypeIdentifier";
var wgAction = "view";
var wgRestrictionEdit = [];
var wgRestrictionMove = [];
var wgArticleId = "15231";
var wgIsArticle = true;
var wgUserName = null;
var wgUserGroups = null;
var wgUserLanguage = "en";
var wgContentLanguage = "en";
var wgBreakFrames = false;
var wgCurRevisionId = "113328";
var wgVersion = "1.12.0";
var wgEnableAPI = true;
var wgEnableWriteAPI = false;
/*]]>*/</script>
<script type="text/javascript" src="http://wiki.eclipse.org/skins/common/wikibits.js?116"><!-- wikibits js --></script>
<!-- Performance mods similar to those for bug 166401 -->
<script type="text/javascript" src="http://wiki.eclipse.org/index.php?title=-&amp;action=raw&amp;gen=js&amp;useskin=eclipsenova"><!-- site js --></script>
<!-- Head Scripts -->
<script type="text/javascript" src="http://wiki.eclipse.org/skins/common/ajax.js?116"></script>
<style type="text/css">/*<![CDATA[*/
.source-xml {line-height: normal; font-size: medium;}
.source-xml li {line-height: normal;}
/**
* GeSHi Dynamically Generated Stylesheet
* --------------------------------------
* Dynamically generated stylesheet for xml
* CSS class: source-xml, CSS id:
* GeSHi (C) 2004 - 2007 Nigel McNie (http://qbnz.com/highlighter)
*/
.source-xml .de1, .source-xml .de2 {font-family: 'Courier New', Courier, monospace; font-weight: normal;}
.source-xml {}
.source-xml .head {}
.source-xml .foot {}
.source-xml .imp {font-weight: bold; color: red;}
.source-xml .ln-xtra {color: #cc0; background-color: #ffc;}
.source-xml li {font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;}
.source-xml li.li2 {font-weight: bold;}
.source-xml .coMULTI {color: #808080; font-style: italic;}
.source-xml .es0 {color: #000099; font-weight: bold;}
.source-xml .br0 {color: #66cc66;}
.source-xml .st0 {color: #ff0000;}
.source-xml .nu0 {color: #cc66cc;}
.source-xml .sc0 {color: #00bbdd;}
.source-xml .sc1 {color: #ddbb00;}
.source-xml .sc2 {color: #339933;}
.source-xml .sc3 {color: #009900;}
.source-xml .re0 {color: #000066;}
.source-xml .re1 {font-weight: bold; color: black;}
.source-xml .re2 {font-weight: bold; color: black;}
/*]]>*/
</style>
<style type="text/css">/*<![CDATA[*/
@import "http://wiki.eclipse.org/index.php?title=MediaWiki:Geshi.css&usemsgcache=yes&action=raw&ctype=text/css&smaxage=18000";
/*]]>*/
</style><style type="text/css">/*<![CDATA[*/
.source-java {line-height: normal; font-size: medium;}
.source-java li {line-height: normal;}
/**
* GeSHi Dynamically Generated Stylesheet
* --------------------------------------
* Dynamically generated stylesheet for java
* CSS class: source-java, CSS id:
* GeSHi (C) 2004 - 2007 Nigel McNie (http://qbnz.com/highlighter)
*/
.source-java .de1, .source-java .de2 {font-family: 'Courier New', Courier, monospace; font-weight: normal;}
.source-java {}
.source-java .head {}
.source-java .foot {}
.source-java .imp {font-weight: bold; color: red;}
.source-java .ln-xtra {color: #cc0; background-color: #ffc;}
.source-java li {font-family: 'Courier New', Courier, monospace; color: black; font-weight: normal; font-style: normal;}
.source-java li.li2 {font-weight: bold;}
.source-java .kw1 {color: #7F0055; font-weight: bold;}
.source-java .kw2 {color: #7F0055; font-weight: bold;}
.source-java .kw3 {color: #000000; font-weight: normal}
.source-java .kw4 {color: #7F0055; font-weight: bold;}
.source-java .co1 {color: #3F7F5F; font-style: italic;}
.source-java .co2 {color: #3F7F5F;}
.source-java .co3 {color: #3F7F5F; font-style: italic; font-weight: bold;}
.source-java .coMULTI {color: #3F5FBF; font-style: italic;}
.source-java .es0 {color: #000000;}
.source-java .br0 {color: #000000;}
.source-java .st0 {color: #2A00ff;}
.source-java .nu0 {color: #000000;}
.source-java .me1 {color: #000000;}
.source-java .me2 {color: #000000;}
/*]]>*/
</style>
<style type="text/css">/*<![CDATA[*/
@import "http://wiki.eclipse.org/index.php?title=MediaWiki:Geshi.css&usemsgcache=yes&action=raw&ctype=text/css&smaxage=18000";
/*]]>*/
</style><link rel="stylesheet" type="text/css" href="MimeTypeIdentifier.html" /> </head>
<body class="mediawiki ns-0 ltr page-SMILA_Project_Concepts_MimeTypeIdentifier">
<div id="globalWrapper">
<div id="column-one">
<!-- Eclipse Additions for the Top Nav start here M. Ward-->
<div id="header">
<div id="header-graphic">
<img src="http://wiki.eclipse.org/skins/eclipsenova/eclipse.png" alt="Eclipse Wiki">
</div>
<!-- Pulled 101409 Mward -->
<div class="portlet" id="p-personal">
<div class="pBody">
<ul>
<li id="pt-login"><a href="http://wiki.eclipse.org/index.php?title=Special:Userlogin&amp;returnto=SMILA/Project_Concepts/MimeTypeIdentifier">Log in</a></li>
</ul>
</div>
</div>
<div id="header-icons">
<div id="sites">
<ul id="sitesUL">
<li><a href="http://www.eclipse.org"><img src="http://dev.eclipse.org/custom_icons/eclipseIcon.png" width="28" height="28" alt="Eclipse Foundation" title="Eclipse Foundation" /><div>Eclipse Foundation</div></a></li>
<li><a href="http://marketplace.eclipse.org"><img src="http://dev.eclipse.org/custom_icons/marketplace.png" width="28" height="28" alt="Eclipse Marketplace" title="Eclipse Marketplace" /><div>Eclipse Marketplace</div></a></li>
<li><a href="https://bugs.eclipse.org/bugs"><img src="http://dev.eclipse.org/custom_icons/system-search-bw.png" width="28" height="28" alt="Bugzilla" title="Bugzilla" /><div>Bugzilla</div></a></li>
<li><a href="http://live.eclipse.org"><img src="http://dev.eclipse.org/custom_icons/audio-input-microphone-bw.png" width="28" height="28" alt="Live" title="Live" /><div>Eclipse Live</div></a></li>
<li><a href="http://planeteclipse.org"><img src="http://dev.eclipse.org/large_icons/devices/audio-card.png" width="28" height="28" alt="PlanetEclipse" title="Planet" /><div>Planet Eclipse</div></a></li>
<li><a href="http://portal.eclipse.org"><img src="http://dev.eclipse.org/custom_icons/preferences-system-network-proxy-bw.png" width="28" height="28" alt="Portal" title="Portal" /><div>My Foundation Portal</div></a></li>
</ul>
</div>
</div>
</div>
<!-- NEW HEADER STUFF HERE -->
<div id="header-menu">
<div id="header-nav">
<ul> <li><a class="first_one" href="http://wiki.eclipse.org/" target="_self">Home</a></li> <li><a href="http://www.eclipse.org/downloads/" target="_self">Downloads</a></li>
<li><a href="http://www.eclipse.org/users/" target="_self">Users</a></li>
<li><a href="http://www.eclipse.org/membership/" target="_self">Members</a></li>
<li><a href="http://wiki.eclipse.org/index.php/Development_Resources" target="_self">Committers</a></li>
<li><a href="http://www.eclipse.org/resources/" target="_self">Resources</a></li>
<li><a href="http://www.eclipse.org/projects/" target="_self">Projects</a></li>
<li><a href="http://www.eclipse.org/org/" target="_self">About Us</a></li>
</ul>
</div>
<div id="header-utils">
<!-- moved the search window here -->
<form action="http://wiki.eclipse.org/Special:Search" >
<input class="input" name="search" type="text" accesskey="f" value="" />
<input type='submit' onclick="this.submit();" name="go" id="searchGoButton" class="button" title="Go to a page with this exact name if one exists" value="Go" />&nbsp;
<input type='submit' onclick="this.submit();" name="fulltext" class="button" id="mw-searchButton" title="Search Eclipsepedia for this text" value="Search" />
</form>
</div>
</div>
<!-- Eclipse Additions for the Header stop here -->
<!-- Additions and mods for leftside nav Start here -->
<!--Started nav rip here-->
<!-- these are the nav controls main page, changes etc -->
<div id="novaContent" class="faux">
<div id="leftcol">
<ul id="leftnav">
<!-- these are the page controls, edit history etc -->
<li class="separator"><a class="separator">Navigation &#160;&#160;</a></li>
<li id="n-mainpage"><a href="http://wiki.eclipse.org/Main_Page">Main Page</a></li>
<li id="n-portal"><a href="http://wiki.eclipse.org/Eclipsepedia:Community_Portal">Community portal</a></li>
<li id="n-currentevents"><a href="http://wiki.eclipse.org/Eclipsepedia:Current_events">Current events</a></li>
<li id="n-recentchanges"><a href="http://wiki.eclipse.org/Special:Recentchanges">Recent changes</a></li>
<li id="n-randompage"><a href="http://wiki.eclipse.org/Special:Random">Random page</a></li>
<li id="n-help"><a href="http://wiki.eclipse.org/Help:Contents">Help</a></li>
<li class="separator"><a class="separator">Toolbox &#160;&#160;</a></li>
<li id="t-whatlinkshere"><a href="http://wiki.eclipse.org/Special:Whatlinkshere/SMILA/Project_Concepts/MimeTypeIdentifier">What links here</a></li>
<li id="t-recentchangeslinked"><a href="http://wiki.eclipse.org/Special:Recentchangeslinked/SMILA/Project_Concepts/MimeTypeIdentifier">Related changes</a></li>
<!-- This is the toolbox section -->
<li id="t-upload"><a href="http://wiki.eclipse.org/Special:Upload">Upload file</a></li>
<li id="t-specialpages"><a href="http://wiki.eclipse.org/Special:Specialpages">Special pages</a></li>
<li id="t-print"><a href="http://wiki.eclipse.org/index.php?title=SMILA/Project_Concepts/MimeTypeIdentifier&amp;printable=yes">Printable version</a></li> <li id="t-permalink"><a href="http://wiki.eclipse.org/index.php?title=SMILA/Project_Concepts/MimeTypeIdentifier&amp;oldid=113328">Permanent link</a></li> </ul>
</div>
<!-- Additions and mods for leftside nav End here -->
<div id="column-content">
<div id="content">
<a name="top" id="top"></a>
<div id="tabs">
<ul class="primary">
<li class="active"><a href="MimeTypeIdentifier.html"><span class="tab">Page</span></a></li>
<li><a href="http://wiki.eclipse.org/index.php?title=Talk:SMILA/Project_Concepts/MimeTypeIdentifier&amp;action=edit"><span class="tab">Discussion</span></a></li>
<li><a href="http://wiki.eclipse.org/index.php?title=SMILA/Project_Concepts/MimeTypeIdentifier&amp;action=edit"><span class="tab">View source</span></a></li>
<li><a href="http://wiki.eclipse.org/index.php?title=SMILA/Project_Concepts/MimeTypeIdentifier&amp;action=history"><span class="tab">History</span></a></li>
<li><a href="http://wiki.eclipse.org/index.php?title=Special:Userlogin&amp;returnto=SMILA/Project&#32;Concepts/MimeTypeIdentifier"><span class="tab">Edit</span></a></li>
</ul>
</div>
<script type="text/javascript"> if (window.isMSIE55) fixalpha(); </script>
<h1 class="firstHeading">SMILA/Project Concepts/MimeTypeIdentifier</h1>
<div id="bodyContent">
<h3 id="siteSub">From Eclipsepedia</h3>
<div id="contentSub"><span class="subpages">&lt; <a href="../../SMILA.html" title="SMILA">SMILA</a> | <a href="../Project_Concepts.1.html" title="SMILA/Project Concepts">Project Concepts</a></span></div>
<div id="jump-to-nav">Jump to: <a href="MimeTypeIdentifier.html#column-one">navigation</a>, <a href="MimeTypeIdentifier.html#searchInput">search</a></div> <!-- start content -->
<table id="toc" class="toc" summary="Contents"><tr><td><div id="toctitle"><h2>Contents</h2></div>
<ul>
<li class="toclevel-1"><a href="MimeTypeIdentifier.html#Description"><span class="tocnumber">1</span> <span class="toctext">Description</span></a></li>
<li class="toclevel-1"><a href="MimeTypeIdentifier.html#Technical_proposal"><span class="tocnumber">2</span> <span class="toctext">Technical proposal</span></a>
<ul>
<li class="toclevel-2"><a href="MimeTypeIdentifier.html#Configuration"><span class="tocnumber">2.1</span> <span class="toctext">Configuration</span></a></li>
<li class="toclevel-2"><a href="MimeTypeIdentifier.html#Interface"><span class="tocnumber">2.2</span> <span class="toctext">Interface</span></a></li>
<li class="toclevel-2"><a href="MimeTypeIdentifier.html#Related_functionality"><span class="tocnumber">2.3</span> <span class="toctext">Related functionality</span></a></li>
</ul>
</li>
</ul>
</td></tr></table><script type="text/javascript"> if (window.showTocToggle) { var tocShowText = "show"; var tocHideText = "hide"; showTocToggle(); } </script>
<a name="Description"></a><h2> <span class="mw-headline"> Description </span></h2>
<p>We need the functionality to identify the mimetype of documents, e.g. for compound handling or to control data transformation in BPEL.
</p>
<a name="Technical_proposal"></a><h2> <span class="mw-headline"> Technical proposal </span></h2>
<p>The MimeTypeIdentifier has to identify the mimetype of a document, either from the document's content or by mapping its file extension to a mimetype.
The interface supports both approaches, since combining them may give better results. The implementation could proceed stepwise:
</p>
<ul><li> initial implementation: file extension to mime type mapping
</li><li> advanced implementation: magic bytes analysis
</li></ul>
<p>Useful information:
Mimetype identification is one of the core tasks of Aperture. Its approach combines magic bytes interpretation with a file extension to mimetype mapping.
Perhaps Aperture could contribute this functionality to SMILA? For details see <a href="http://aperture.wiki.sourceforge.net/MIMETypeIdentification" class="external autonumber" title="http://aperture.wiki.sourceforge.net/MIMETypeIdentification" rel="nofollow">[1]</a>.
</p><p>Aperture is also capable of identifying (and converting) Office Open XML formats like .docx, so these formats are not confused with plain zip archives containing XML files. See <a href="https://sourceforge.net/mailarchive/message.php?msg_id=14cc92570704020743h33c5685at97b9618a5c4c04e3%40mail.gmail.com" class="external autonumber" title="https://sourceforge.net/mailarchive/message.php?msg_id=14cc92570704020743h33c5685at97b9618a5c4c04e3%40mail.gmail.com" rel="nofollow">[2]</a>.
</p>
<a name="Configuration"></a><h3> <span class="mw-headline"> Configuration </span></h3>
<p>The file extension to mimetype mapping requires a configuration. Multiple extensions may be associated with a single mimetype; this is supported. In theory, a single extension may also be associated with multiple mimetypes. This may be a valid case, but the mapping used by MimeTypeIdentifier has to be unambiguous. The implementation has to ensure this and prevent such configurations (e.g. by simply overriding the previous entry).
A configuration could look like this:
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-xml"><span class="sc3"><span class="re1">&lt;mimetypes<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;mimetype</span> <span class="re0">id</span>=<span class="st0">&quot;application/rtf&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;ext<span class="re2">&gt;</span></span></span>rtf<span class="sc3"><span class="re1">&lt;/ext<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/mimetype<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;mimetype</span> <span class="re0">id</span>=<span class="st0">&quot;application/vnd.ms-excel&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;ext<span class="re2">&gt;</span></span></span>xls<span class="sc3"><span class="re1">&lt;/ext<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/mimetype<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;mimetype</span> <span class="re0">id</span>=<span class="st0">&quot;application/vnd.ms-powerpoint&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;ext<span class="re2">&gt;</span></span></span>ppt<span class="sc3"><span class="re1">&lt;/ext<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/mimetype<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;mimetype</span> <span class="re0">id</span>=<span class="st0">&quot;image/jpeg&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;ext<span class="re2">&gt;</span></span></span>jpe<span class="sc3"><span class="re1">&lt;/ext<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;ext<span class="re2">&gt;</span></span></span>jpeg<span class="sc3"><span class="re1">&lt;/ext<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;ext<span class="re2">&gt;</span></span></span>jpg<span class="sc3"><span class="re1">&lt;/ext<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/mimetype<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;mimetype</span> <span class="re0">id</span>=<span class="st0">&quot;text/html&quot;</span><span class="re2">&gt;</span></span>
<span class="sc3"><span class="re1">&lt;extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;ext<span class="re2">&gt;</span></span></span>htm<span class="sc3"><span class="re1">&lt;/ext<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;ext<span class="re2">&gt;</span></span></span>html<span class="sc3"><span class="re1">&lt;/ext<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/extensions<span class="re2">&gt;</span></span></span>
<span class="sc3"><span class="re1">&lt;/mimetype<span class="re2">&gt;</span></span></span>
...
<span class="sc3"><span class="re1">&lt;/mimetypes<span class="re2">&gt;</span></span></span></pre></div>
<p>At the moment it is unclear whether there is anything to be configured for magic bytes analysis. The implementation should be extensible so that magic bytes detection can be added for additional mimetypes.
</p><p><br />
</p>
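<p>As a rough illustration of extensible magic bytes detection, the sketch below keeps a registry of leading byte signatures per mimetype. The class <code>MagicByteDetector</code> and its methods are hypothetical names chosen for this example. Only prefix signatures are handled here; real formats may need signatures at other offsets or deeper inspection.
</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch (not part of SMILA): an extensible magic-byte registry.
final class MagicByteDetector {

    // Leading bytes per mimetype; insertion order is the match order.
    private final Map<String, byte[]> signatures = new LinkedHashMap<>();

    /** Registers the leading bytes that identify the given mimetype. */
    void register(String mimeType, int... leadingBytes) {
        byte[] signature = new byte[leadingBytes.length];
        for (int i = 0; i < leadingBytes.length; i++) {
            signature[i] = (byte) leadingBytes[i];
        }
        signatures.put(mimeType, signature);
    }

    /** Returns the first mimetype whose signature prefixes the document, or null. */
    String detect(byte[] document) {
        for (Map.Entry<String, byte[]> entry : signatures.entrySet()) {
            if (startsWith(document, entry.getValue())) {
                return entry.getKey();
            }
        }
        return null;
    }

    /** Length of the longest registered signature. */
    int getMinByteCount() {
        int max = 0;
        for (byte[] signature : signatures.values()) {
            max = Math.max(max, signature.length);
        }
        return max;
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        if (data == null || data.length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (data[i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}
```

<p>For example, JPEG files start with the bytes FF D8 FF and RTF files with the ASCII characters <code>{\rtf</code>, so they could be registered as <code>register("image/jpeg", 0xFF, 0xD8, 0xFF)</code> and <code>register("application/rtf", 0x7B, 0x5C, 0x72, 0x74, 0x66)</code>.
</p>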
<a name="Interface"></a><h3> <span class="mw-headline"> Interface </span></h3>
<div dir="ltr" style="text-align: left;"><pre class="source-java"><span class="kw1">interface</span> MimeTypeIdentifier
<span class="br0">&#123;</span>
<span class="coMULTI">/**
Identifies the mimetype of a document.
@param document - the document to identify the mimetype for. Note that the provided bytes do not need to be the whole document.
@param filename - the filename of the document. This could be a simple filename, a full path or even a complex URI
@return a String containing the mimetype or null if none could be identified
*/</span>
<span class="kw1">public</span> <span class="kw3">String</span> identify<span class="br0">&#40;</span> <span class="kw4">byte</span><span class="br0">&#91;</span><span class="br0">&#93;</span> document, <span class="kw3">String</span> filename <span class="br0">&#41;</span>;
&nbsp;
<span class="coMULTI">/**
Returns the minimum number of bytes needed to identify mimetypes. The size of the document parameter of the identify method must not be less than this value; otherwise identification cannot be done.
@return the minimum number of bytes needed to identify mimetypes
*/</span>
<span class="kw1">public</span> <span class="kw4">int</span> getMinByteCount<span class="br0">&#40;</span><span class="br0">&#41;</span>;
<span class="br0">&#125;</span></pre></div>
<p>This functionality may be needed at various stages in SMILA. Besides a simple POJO implementation that provides the core functionality, we should also consider wrapping the functionality in a BPEL service.
</p><p><br />
</p>
<a name="Related_functionality"></a><h3> <span class="mw-headline"> Related functionality </span></h3>
<ul><li> A utility component is needed to extract the file extension of a filename, path or URI.
</li></ul>
<div dir="ltr" style="text-align: left;"><pre class="source-java"><span class="kw1">interface</span> ExtensionExtractor
<span class="br0">&#123;</span>
<span class="coMULTI">/**
Extracts the file extension of a filename
@param filename - the filename of the document. This could be a simple filename, a full path or even a complex URI
@return a String containing the file extension or null if none could be identified
*/</span>
<span class="kw1">public</span> <span class="kw3">String</span> getExtension<span class="br0">&#40;</span><span class="kw3">String</span> filename <span class="br0">&#41;</span>;
<span class="br0">&#125;</span></pre></div>
<p><br />
</p>
<ul><li> Another component could be needed that extracts the encoding/charset of text documents (txt, html, xml, etc.).
</li></ul>
<p>This can be done by checking for a BOM (byte order mark) and/or by checking for encoding/charset tags or attributes in markup documents.
</p>
<div dir="ltr" style="text-align: left;"><pre class="source-java"><span class="kw1">interface</span> EncodingIdentifier
<span class="br0">&#123;</span>
<span class="coMULTI">/**
Identifies the charset of a text or markup document.
@param document - the document to identify the charset for. Note that the provided bytes do not need to be the whole document.
@return a String containing the charset or null if none could be identified
*/</span>
<span class="kw1">public</span> <span class="kw3">String</span> identify<span class="br0">&#40;</span> <span class="kw4">byte</span><span class="br0">&#91;</span><span class="br0">&#93;</span> document <span class="br0">&#41;</span>;
&nbsp;
<span class="coMULTI">/**
Returns the minimum number of bytes needed to identify the charset. The size of the document parameter of the identify method must not be less than this value; otherwise identification cannot be done.
@return the minimum number of bytes needed to identify the charset
*/</span>
<span class="kw1">public</span> <span class="kw4">int</span> getMinByteCount<span class="br0">&#40;</span><span class="br0">&#41;</span>;
<span class="br0">&#125;</span></pre></div>
<div class="printfooter">
Retrieved from "<a href="MimeTypeIdentifier.html">http://wiki.eclipse.org/SMILA/Project_Concepts/MimeTypeIdentifier</a>"</div>
<!-- end content -->
<div class="visualClear"></div>
</div>
</div>
</div>
<!-- Yoink of toolbox for phoenix moved up -->
</div>
</div>
<div id="clearFooter"/>
<div id="footer" >
<ul id="footernav">
<li class="first"><a href="http://www.eclipse.org/">Home</a></li>
<li><a href="http://www.eclipse.org/legal/privacy.php">Privacy Policy</a></li>
<li><a href="http://www.eclipse.org/legal/termsofuse.php">Terms of Use</a></li>
<li><a href="http://www.eclipse.org/legal/copyright.php">Copyright Agent</a></li>
<li><a href="http://www.eclipse.org/org/foundation/contact.php">Contact</a></li>
<li><a href="http://wiki.eclipse.org/Eclipsepedia:About" title="Eclipsepedia:About">About Eclipsepedia</a></li>
</ul>
<span id="copyright">Copyright &copy; 2012 The Eclipse Foundation. All Rights Reserved</span>
<p id="footercredit">This page was last modified 14:01, 13 August 2008 by <a href="http://wiki.eclipse.org/User:Daniel.stucky.empolis.com" title="User:Daniel.stucky.empolis.com">Daniel Stucky</a>. </p>
</div>
<!-- <div class="visualClear"></div> -->
<script type="text/javascript">if (window.runOnloadHook) runOnloadHook();</script>
</div>
<!-- Served in 0.108 secs. --></body></html>