 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
 <html>
 <head>
   <title>Non-uniform file encodings in the Eclipse Platform</title>
   <link rel="stylesheet" href="../default_style.css" type="text/css">
 </head>
 <body text="#000000" bgcolor="#ffffff">
 <h1>Non-uniform file encodings in the Eclipse Platform</h1>
 <p><font size="-1">Last modified: February 23, 2004</font> </p>
 <cite><strong>Plan item description:</strong> Eclipse 2.1 uses a single
 global file encoding setting for reading and writing files in the
 workspace. This is problematic; for example, when Java source files in
 the workspace use OS default file encoding while XML files in the
 workspace use UTF-8 file encoding. The Platform should support
 non-uniform file encodings. [Platform Core, Platform UI, Text, Search,
 Compare, JDT UI, JDT Core] [Theme: User experience] (bug <a
  href="http://bugs.eclipse.org/bugs/show_bug.cgi?id=37933">37933</a>, <a
  href="http://dev.eclipse.org/bugs/show_bug.cgi?id=5399">5399</a>) </cite>
 <p>The pre-M7 situation is as follows:</p>
 <ul>
   <li><code>ResourcesPlugin.getEncoding</code> returns the default
 encoding for the workspace (the <code>org.eclipse.core.resources.encoding</code>
 preference value if available, otherwise the value of the <code>file.encoding</code>
 Java system property).</li>
   <li><code>IFile.getContents</code>/<code>setContents</code> work with
 byte streams - no encoding can be applied.</li>
   <li><code>IFile.getEncoding</code> tries to guess the file encoding
 by looking for the <a
  href="http://www.unicode.org/unicode/faq/utf_bom.html">Byte Order Mark</a>,
 which is not sufficient on its own. This API also has no known clients
 so far and would be deprecated.</li>
   <li>the Java compiler supports non-uniform encodings for Java source
 files, but in Eclipse it relies on <code>ResourcesPlugin.getEncoding</code>
 (same value for all sources).</li>
   <li>the text editor framework supports setting the encoding for files
 being edited (setting a persistent property on the file resource), but
 there is no support for setting the encoding of multiple files
 simultaneously, and other components are not aware of the encoding
 settings.</li>
 </ul>
 <h2>Requirements</h2>
 <ul>
   <li>the encoding of a file should be automatically determined by
 considering the file's content and/or its name or extension.</li>
   <li>encoding information should be available not only for workspace
 resources (e.g. IFiles) but for external files too. This has become
 more important in light of the recent RCP work since use of the
 resource plugin has become optional.</li>
   <li>encoding information should be available for local history
 contents (IFileState) and archives (e.g. *.zip and *.jar).</li>
   <li>in the future it should be possible to use the content based
 encoding interpreter for determining more information about the file
 (e.g. content type) without duplicating the mechanism but rather by
 augmenting
 it.</li>
   <li>users should be able to set the default encoding for a project.</li>
   <li>users should be able to share the default encoding settings in a
 team
 repository (metadata should reside in the project content area).</li>
   <li>an encoding determined from the file contents takes precedence over
 the inherited encoding setting.</li>
   <li>users should be able to easily store a file in a different
 encoding (in order to change its encoding).</li>
 </ul>
 <h2>Proposed solution</h2>
 In addition to the existing approach of having a single global encoding for a
 workbench, we propose:
 <ol>
   <li>an extensible mechanism to determine the encoding of a stream by
 analyzing its contents or, if available, its file name,</li>
   <li>to add a default encoding property to projects. This default
 encoding is used if no encoding could be determined in the first step.</li>
 </ol>
 We do not (yet) propose a settable encoding attribute per file because
 <ul>
   <li>we do not see an immediate need for this fine level of
 granularity,</li>
   <li>Eclipse has no sharable file attributes, so files with different
 per-file encodings would be difficult to share.</li>
 </ul>
 The encoding for a stream or an IStorage (as returned by the two <code>getCharset</code>
 methods; see API changes) will be:
 <ol>
   <li>the encoding discovered by a <em>content interpreter</em>
     associated with the file extension (or file type), if one exists
     <em>and</em> can determine the encoding, or</li>
   <li>the default encoding defined for the enclosing project, if any, or</li>
   <li>the global workspace encoding (equivalent to
 ResourcesPlugin.getEncoding()).</li>
 </ol>
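 <p>The lookup order above can be sketched as a simple fallback chain. This is
 only an illustration: the class name <code>CharsetResolution</code> and the method
 <code>resolveCharset</code> are hypothetical and not part of the proposed API, and
 <code>null</code> is assumed to mean "no encoding available at this level".</p>

```java
public class CharsetResolution {

    /**
     * Picks the first available encoding in the proposed order:
     * content interpreter result, then project default, then workspace default.
     * A null argument means that no encoding is available at that level
     * (an assumption of this sketch).
     */
    static String resolveCharset(String detected, String projectDefault, String workspaceDefault) {
        if (detected != null) {
            return detected;          // 1. found by a content interpreter
        }
        if (projectDefault != null) {
            return projectDefault;    // 2. default of the enclosing project
        }
        return workspaceDefault;      // 3. global workspace encoding
    }

    public static void main(String[] args) {
        System.out.println(resolveCharset("UTF-8", "MS932", "Cp1252")); // UTF-8
        System.out.println(resolveCharset(null, "MS932", "Cp1252"));    // MS932
        System.out.println(resolveCharset(null, null, "Cp1252"));       // Cp1252
    }
}
```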
 <p>Regarding #1, an extension point would allow file format-aware
 encoding interpreters to register with the encoding discovery mechanism
 for specific file types (extensions), or allow plug-ins to associate
 existing encoding interpreters with their own file extensions. Users would
 be able to associate more file extensions with the known interpreters
 (preference).</p>
 <p>All clients that create character-based streams for
 reading or writing the contents of a file resource should pass along the
 charset string obtained from one of the <code>getCharset</code>
 methods instead of
 the one provided by <code>ResourcesPlugin.getEncoding</code>. Examples
 are: text editors, compiler, search, compare. </p>
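 <p>As a rough illustration of what this means for a client, the sketch below
 decodes a byte stream with an explicitly supplied charset name. The class
 <code>CharsetAwareRead</code> and the method <code>readAll</code> are hypothetical;
 the <code>charsetName</code> value stands in for whatever a <code>getCharset</code>
 method would return.</p>

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;

public class CharsetAwareRead {

    /** Decodes the whole stream using the given charset name. */
    static String readAll(InputStream in, String charsetName) throws IOException {
        StringBuilder text = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charsetName))) {
            int c;
            while ((c = reader.read()) != -1) {
                text.append((char) c);
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        // Four characters, two of them outside ASCII, stored as UTF-8 bytes.
        byte[] bytes = "gr\u00fc\u00df".getBytes("UTF-8");

        // Stand-in for the value obtained from a getCharset method.
        String charsetName = "UTF-8";
        System.out.println(readAll(new ByteArrayInputStream(bytes), charsetName).length()); // 4

        // Decoding the same bytes with the wrong charset yields a different text.
        System.out.println(readAll(new ByteArrayInputStream(bytes), "ISO-8859-1").length()); // 6
    }
}
```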
 <h3>API changes</h3>
 <h4>Added:</h4>
 To make the encoding support available for non-workspace based
 resources we propose to add the following method to
 org.eclipse.core.runtime.IPlatform:<br>
 <pre>public interface <span style="font-weight: bold;">IPlatform</span> {<br>    // ...<br>    public String <span
  style="font-weight: bold;">getCharset</span>(InputStream stream, String fileExtension) throws CoreException;<br>    // ...<br>}<br></pre>
 <p>The InputStream seems to be the most widely used and scalable
 mechanism to get
 access to any kind of byte content. InputStreams can be easily created
 for a
 java.io.File, an IStorage (which subsumes IFile and IFileState; see
 below), as
 well as for bytes in memory (ByteArrayInputStream).<br>
 The optional file extension argument can be used to quickly reject more
 expensive ways of inferring the encoding from the contents.<br>
 </p>
 A corresponding implementation (based on IContentInterpreters; see
 below) lives in org.eclipse.core.runtime.Platform.<br>
 <br>
 For the resource plugin we propose to add a new interface
 IEncodedStorage that adds the single method getCharset to the existing
 IStorage interface:<br>
 <pre>interface <span style="font-weight: bold;">IEncodedStorage</span> extends IStorage {<br>    public String <span
  style="font-weight: bold;">getCharset</span>() throws CoreException;<br>}<br></pre>
 <p>Its method getCharset returns the name of the encoding for an
 IStorage. It would make sense to add this method directly to the
 IStorage interface, since any InputStream can only be interpreted
 correctly if the used encoding is known. But because clients are
 allowed to implement IStorage this would be a breaking API change, so
 we decided to introduce a separate extension to IStorage. <br>
 </p>
 <p>Two existing interfaces will extend IEncodedStorage: IFile and
 IFileState. Two concrete classes will provide an implementation: File and
 FileState.<br>
 </p>
 <p>For both files and file states, the implementation of getCharset
 first uses IPlatform.getCharset(...) from above to find an encoding
 based on any registered IContentInterpreters. If no encoding can be
 determined, File.getCharset() locates the enclosing project of the file
 and queries its IProjectDescription for a default encoding. For this we
 need the following two new methods on IProjectDescription:<br>
 </p>
 <pre>interface <span style="font-weight: bold;">IProjectDescription</span> {<br>    // ...<br>    public String <span
  style="font-weight: bold;">getDefaultCharset</span>();<br>    public void <span
  style="font-weight: bold;">setDefaultCharset</span>(String charset);<br>    // ...<br>}<br></pre>
 If no default encoding has been defined for the project, the workspace's
 default encoding preference is returned (via the existing API).<br>
 <br>
 Other implementers of IStorage will have to decide whether they should
 base their implementation on IEncodedStorage.<br>
 <br>
 The implementation of Platform.getCharset will make use of content interpreters
 that implement the IContentInterpreter interface and are associated with file
 types through a new Core Runtime extension point &quot;org.eclipse.core.runtime.contentInterpreter&quot;.
 Users can associate additional file extensions via preferences.<br>
 <p>The method interpretContent does not return the detected encoding but stores
   it in a result object of type IContentInfo that is passed in as an argument.
   This approach makes it possible to collect additional information
   (like 'type'/'subtype') rather than just the encoding.</p>
 <pre>interface <span style="font-weight: bold;">IContentInterpreter</span> {
 	public void <span style="font-weight: bold;">interpretContent</span>(IContentInfo result, InputStream contents);
 }</pre>
 The IContentInfo is:
 <pre>public interface <span style="font-weight: bold;">IContentInfo</span> {
 	public void <span style="font-weight: bold;">setCharset</span>(String charset);
 	public String <span style="font-weight: bold;">getCharset</span>();
 }</pre>
 Since we would not allow clients to implement (or extend) IContentInfo,
 we will be able to extend the API with new setters and getters in the
 future without breaking API. <br>
 <p>The platform itself would provide implementations of
 IContentInterpreter for XML and other
 popular file formats.<br>
 </p>
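 <p>To give an idea of what such an interpreter might look like, here is a minimal
 sketch for XML that reads the <code>encoding</code> pseudo-attribute from the XML
 declaration. The nested interfaces are local stand-ins that mirror the proposed
 ones; <code>ContentInfo</code>, <code>XmlDeclarationInterpreter</code> and
 <code>detect</code> are hypothetical names. The sketch assumes an ASCII-compatible
 byte encoding and no whitespace around <code>=</code>; it ignores byte order marks
 and UTF-16 input, and it leaves the result untouched when no encoding is declared
 so that a caller can fall back to the project or workspace default.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class XmlInterpreterSketch {

    // Local stand-ins mirroring the interfaces proposed in this document.
    interface IContentInfo {
        void setCharset(String charset);
        String getCharset();
    }

    interface IContentInterpreter {
        void interpretContent(IContentInfo result, InputStream contents);
    }

    /** Hypothetical result holder. */
    static class ContentInfo implements IContentInfo {
        private String charset;
        public void setCharset(String charset) { this.charset = charset; }
        public String getCharset() { return charset; }
    }

    /** Hypothetical interpreter that looks for encoding="..." in the XML declaration. */
    static class XmlDeclarationInterpreter implements IContentInterpreter {
        public void interpretContent(IContentInfo result, InputStream contents) {
            byte[] head = new byte[128];
            int n = 0;
            try {
                // Fill the buffer; a single read may return fewer bytes than requested.
                while (n < head.length) {
                    int r = contents.read(head, n, head.length - n);
                    if (r == -1) break;
                    n += r;
                }
            } catch (IOException e) {
                return; // cannot read: leave the result untouched
            }
            // The declaration itself is ASCII, so a one-byte-per-char decoding is enough here.
            String text = new String(head, 0, n, StandardCharsets.ISO_8859_1);
            if (!text.startsWith("<?xml")) return;
            int end = text.indexOf("?>");
            if (end == -1) return;
            String decl = text.substring(0, end);
            int key = decl.indexOf("encoding=");
            if (key == -1) return;
            int quoteStart = key + "encoding=".length();
            if (quoteStart >= decl.length()) return;
            char quote = decl.charAt(quoteStart);
            if (quote != '"' && quote != '\'') return;
            int quoteEnd = decl.indexOf(quote, quoteStart + 1);
            if (quoteEnd == -1) return;
            result.setCharset(decl.substring(quoteStart + 1, quoteEnd));
        }
    }

    /** Convenience wrapper: returns the declared encoding, or null if none was found. */
    static String detect(byte[] bytes) {
        ContentInfo info = new ContentInfo();
        new XmlDeclarationInterpreter().interpretContent(info, new ByteArrayInputStream(bytes));
        return info.getCharset();
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(xml)); // ISO-8859-1
        System.out.println(detect("plain text".getBytes(StandardCharsets.US_ASCII))); // null
    }
}
```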
 <h4>Deprecated:</h4>
 <span style="font-family: monospace;">public int IFile.getEncoding()</span><br>
 <span style="font-family: monospace;">public int IFile.ENCODING_</span>*
 constants<br>
 <br>
 <code>public String ResourcesPlugin.getEncoding()</code>: Since all clients of this
 method will most likely have to adapt their code, we suggest
 deprecating getEncoding() and introducing a new method getDefaultCharset()
 that better reflects its real purpose (and brings it more in line with
 IProjectDescription.getDefaultCharset()).<br>
 <br>
 <h3>UI Changes<br>
 </h3>
 We need to add new UI for changing the default encoding for a project.
 A good place for this would be the Property dialog,
 since encoding can be considered a property of the project, similar to
 the read-only property etc. The property dialog for files
 would only show the current value for the encoding but would not allow
 changing it. <br>
 <br>
 We should provide a &quot;<span style="font-style: italic;">Convert Encoding</span>&quot;
 action that converts the contents of a file (or all files in a hierarchy) to a
 different encoding. This action would ask the user for two encodings: the first
 is used when reading all selected files and the second when writing these files
 back to the workspace. <br>
 The action would not
 change the encoding value returned by getCharset() but it would provide
 a means to make the encoding of multiple files consistent with the
 default encoding of the enclosing project.<br>
 (An alternative to this UI would be to provide something like a &quot;Save with
 encoding&quot; action for editors. But this UI seems to be less convenient if
 the encoding of multiple files needs to be changed).<br>
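 <p>At its core, the conversion described above amounts to decoding the bytes with
 the first encoding and encoding the resulting text with the second. The sketch
 below shows this for an in-memory byte array; the class
 <code>ConvertEncodingSketch</code> and the method <code>convert</code> are
 hypothetical, and a real action would additionally have to deal with workspace
 files, progress reporting and characters that cannot be represented in the target
 encoding (which this sketch silently replaces).</p>

```java
import java.nio.charset.Charset;

public class ConvertEncodingSketch {

    /** Re-encodes bytes: decode with the first charset, encode with the second. */
    static byte[] convert(byte[] input, String fromCharset, String toCharset) {
        String text = new String(input, Charset.forName(fromCharset));
        return text.getBytes(Charset.forName(toCharset));
    }

    public static void main(String[] args) {
        // U+00E9 is the single byte 0xE9 in ISO-8859-1.
        byte[] latin1 = { (byte) 0xE9 };
        byte[] utf8 = convert(latin1, "ISO-8859-1", "UTF-8");

        // In UTF-8 the same character takes two bytes: 0xC3 0xA9.
        System.out.println(utf8.length); // 2
    }
}
```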
 <br>
 To make sharing of files with heterogeneous encodings easier,
 we will have to enhance the compare/merge tools to work with
 heterogeneous encodings.<br>
 <br>
 To facilitate that, we try to automatically determine the encoding for
 the remote resource:<br>
 <ul>
   <li>by knowing the default encoding of the remote project if the
 .project file (containing the encoding attribute) has been synchronized
 first, or<br>
   </li>
   <li>by using the local IContentInterpreter mechanism for the remote
 resources (which are available as streams), or<br>
   </li>
   <li>by allowing the user to change the encoding for the remote
 resource on the fly until it displays correctly.</li>
 </ul>
 With these means it becomes possible to compare and merge files
 regardless of whether the same encodings are used on both
 sides.<br>
 <br>
 However, if we want to use the same encoding (that is, if we catch up with the remote
 .project file), we will have to convert our local files to
 the new encoding. For this we will provide the &quot;<span style="font-style: italic;">Convert
 Encoding</span>&quot; action in the Compare/Merge tools where required.<br>
 <br>
 <h3>Scenarios</h3>
 <ul>
   <li>The user opens text files whose contents were created using encoding &quot;MS932&quot;
     in a workspace whose default encoding is &quot;US-ASCII&quot;. It was not
     possible to guess the file encoding automatically, so what the user sees is
     gibberish. The user figures out the cause of the problem and explicitly sets
     the encoding for the project containing the files to &quot;MS932&quot;. He
     will have to reload all editors to see the contents correctly and will have
     to trigger a full build in the affected project. </li>
   <li>The user has a Java project with a few Java files, no explicitly specified
     project encoding, and a CP1252 workspace encoding. Now he wants to start using
     all kinds of Unicode characters in his Java files. He sets the default encoding
     of his project to UTF-16 and he converts all existing Java files to the UTF-16
     encoding. All newly created Java files will automatically have the correct
     encoding. Project metadata files like &quot;.project&quot; or &quot;plugin.xml&quot;
     files will still be read in their correct encoding since IContentInterpreters
     still apply to them.</li>
   <li>Determining the encoding to use for newly created files: Normally the encoding
     becomes relevant when the file is saved for the first time. Since the IFile already
     exists and knows its project, the encoding to use can be determined by the
     proposed API. A potential problem arises because a newly created
     file should use an encoding that is consistent with the encoding it would
     later get from an IContentInterpreter. Examples are *.properties and *.xml files:
     their format determines the encoding (ISO-8859-1 for Java *.properties files,
     UTF-8 by default for XML) even if the enclosing project uses a different
     encoding. The code that writes these files to disk (and defines the initial
     encoding) must understand this. It can use neither the encoding for the project
     nor the encoding for the file (because the file is still empty
     when the IContentInterpreter tries to determine its encoding).<br>
   </li>
 </ul>
 </body>
 </html>