| How to support a new Unicode version in the scanner: |
| |
| 1) Go to http://www.unicode.org/Public/ |
| 2) Select the folder that corresponds to the Unicode version for which you want to generate the scanner resource files. |
| 3) Select the ucdxml folder and download the file called ucd.all.flat.zip. |
| 4) Unzip that file on your disk. This creates a file called ucd.all.flat.xml. |
| 5) To generate the resource files for identifier starts, run |
| org.eclipse.jdt.core.internal.tools.unicode.GenerateIdentifierStartResources with the following arguments: |
| - first argument: the Unicode version |
| - second argument: the path to the ucd.all.flat.xml file |
| - third argument: the folder in which the resource files will be generated |
| For example: |
| 8.0 c:/unicode8.0.0/ucd.all.flat.xml c:/unicode8.0.0/res |
| |
| 6) To generate the resource files for identifier parts, run |
| org.eclipse.jdt.core.internal.tools.unicode.GenerateIdentifierPartResources with the same three arguments as in step 5. |
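| |
| Steps 5 and 6 can be sketched as two command lines that share the same three arguments. This is only a sketch: it assumes |
| both tools expose a standard main method, and the class GeneratorCommands and the classpath value "bin" are hypothetical. |
| |

```java
import java.util.ArrayList;
import java.util.List;

public class GeneratorCommands {
    // Generator class names as given in steps 5 and 6.
    static final String START_TOOL =
        "org.eclipse.jdt.core.internal.tools.unicode.GenerateIdentifierStartResources";
    static final String PART_TOOL =
        "org.eclipse.jdt.core.internal.tools.unicode.GenerateIdentifierPartResources";

    // Builds "java -cp <classpath> <tool> <version> <ucdXml> <outputFolder>".
    // The classpath is an assumption; it depends on where the tool classes are compiled.
    static List<String> command(String classpath, String tool,
            String version, String ucdXml, String outputFolder) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-cp");
        cmd.add(classpath);
        cmd.add(tool);
        cmd.add(version);
        cmd.add(ucdXml);
        cmd.add(outputFolder);
        return cmd;
    }

    public static void main(String[] args) {
        String classpath = "bin"; // hypothetical location of the compiled tool classes
        String version = "8.0";
        String ucdXml = "c:/unicode8.0.0/ucd.all.flat.xml";
        String outputFolder = "c:/unicode8.0.0/res";
        // Both generators receive exactly the same three arguments.
        for (String tool : new String[] { START_TOOL, PART_TOOL }) {
            System.out.println(String.join(" ",
                command(classpath, tool, version, ucdXml, outputFolder)));
        }
    }
}
```

| The printed lines can be pasted into a shell, or the same three arguments can be entered in an Eclipse launch configuration. |
| |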
| 7) Once this is done, edit org.eclipse.jdt.internal.compiler.parser.ScannerHelper to add a new table for the new Unicode version. |
| |
| For example: |
| - add the new method: |
|     static void initializeTable19() { |
|         Tables9 = initializeTables("unicode8"); //$NON-NLS-1$ |
|     } |
| - add the new static field Tables9. |
| - add a new folder unicode8 as a subfolder of org/eclipse/jdt/internal/compiler/parser/. |
| - put all resource files generated in steps 5 and 6 into this folder. |
| - modify |
|     org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierPart(long, int) |
|     org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierStart(long, int) |
|   to use the new Tables9 values based on the compliance value, by adding a new else if condition. |
| |
| For org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierPart(long, int), the last else becomes an else if |
| that handles the previous 1.8 compliance level, and a new final else handles the new table: |
| else if (complianceLevel <= ClassFileConstants.JDK1_8) { |
|     // java 8 supports Unicode 6.2 |
|     if (Tables8 == null) { |
|         initializeTable18(); |
|     } |
|     switch((codePoint & 0x1F0000) >> 16) { |
|         case 0 : |
|             return isBitSet(Tables8[PART_INDEX][0], codePoint & 0xFFFF); |
|         case 1 : |
|             return isBitSet(Tables8[PART_INDEX][1], codePoint & 0xFFFF); |
|         case 2 : |
|             return isBitSet(Tables8[PART_INDEX][2], codePoint & 0xFFFF); |
|         case 14 : |
|             return isBitSet(Tables8[PART_INDEX][3], codePoint & 0xFFFF); |
|     } |
| } else { |
|     // java 9 supports Unicode 8 |
|     if (Tables9 == null) { |
|         initializeTable19(); |
|     } |
|     switch((codePoint & 0x1F0000) >> 16) { |
|         case 0 : |
|             return isBitSet(Tables9[PART_INDEX][0], codePoint & 0xFFFF); |
|         case 1 : |
|             return isBitSet(Tables9[PART_INDEX][1], codePoint & 0xFFFF); |
|         case 2 : |
|             return isBitSet(Tables9[PART_INDEX][2], codePoint & 0xFFFF); |
|         case 14 : |
|             return isBitSet(Tables9[PART_INDEX][3], codePoint & 0xFFFF); |
|     } |
| } |
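| |
| The lookup in each branch splits the code point into a plane number (bits 16 to 20) and a 16-bit offset within that plane. |
| The following self-contained sketch illustrates that split with a synthetic table. The class PlaneLookupSketch, the helper |
| setBit and the bit layout (one bit per code point, 64 code points per long) are assumptions made for illustration; the real |
| isBitSet in ScannerHelper may differ. |
| |

```java
public class PlaneLookupSketch {
    // Maps a code point to the table index used by the switch in the excerpt:
    // planes 0, 1 and 2 use tables 0 to 2, plane 14 uses table 3, other planes have no table.
    static int tableIndex(int codePoint) {
        switch ((codePoint & 0x1F0000) >> 16) {
            case 0:
                return 0;
            case 1:
                return 1;
            case 2:
                return 2;
            case 14:
                return 3;
            default:
                return -1;
        }
    }

    // Assumed layout: bit (i % 64) of values[i / 64] represents offset i.
    static boolean isBitSet(long[] values, int i) {
        return (values[i / 64] & (1L << (i % 64))) != 0;
    }

    // Hypothetical helper used only to populate the synthetic table.
    static void setBit(long[] values, int i) {
        values[i / 64] |= 1L << (i % 64);
    }

    public static void main(String[] args) {
        // Four synthetic tables, each covering the 0x10000 offsets of one plane.
        long[][] part = new long[4][0x10000 / 64];
        int codePoint = 0x10400;          // plane 1, offset 0x0400
        int index = tableIndex(codePoint);
        int offset = codePoint & 0xFFFF;
        setBit(part[index], offset);
        System.out.println("table " + index + ", offset 0x" + Integer.toHexString(offset)
            + ", set: " + isBitSet(part[index], offset));
    }
}
```

| A code point in any other plane matches no case, so neither switch in the excerpt returns a value for it. |
| |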
| |
| 8) Make the same changes in org.eclipse.jdt.internal.compiler.parser.ScannerHelper.isJavaIdentifierStart(long, int). |
| 9) Add a regression test class in org.eclipse.jdt.core.tests.compiler.regression, similar to org.eclipse.jdt.core.tests.compiler.regression.Unicode18Test. |
| You can get a character value for the regression test by searching the ucd.all.flat.xml file for an entry whose age attribute equals the |
| Unicode version you want to check (e.g. for Unicode 8, age="8.0"). |
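| |
| The search for candidate characters can be sketched with the JDK's StAX parser. This is a minimal sketch, not part of the |
| tool: the class AgeFilterSketch is hypothetical, and it assumes each single character is a char element carrying cp and age |
| attributes, while range entries carry other attributes and are skipped. The sample string is hand-written, not real UCD data. |
| |

```java
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class AgeFilterSketch {
    // Collects the "cp" attribute of every <char> element whose "age" attribute equals the given version.
    static List<String> codePointsWithAge(Reader xml, String age) throws XMLStreamException {
        List<String> result = new ArrayList<>();
        XMLStreamReader reader = XMLInputFactory.newInstance().createXMLStreamReader(xml);
        try {
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "char".equals(reader.getLocalName())
                        && age.equals(reader.getAttributeValue(null, "age"))) {
                    String cp = reader.getAttributeValue(null, "cp");
                    if (cp != null) { // entries without a cp attribute (assumed to be ranges) are skipped
                        result.add(cp);
                    }
                }
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hand-written sample shaped like the assumed file layout.
        String sample = "<ucd xmlns=\"http://www.unicode.org/ns/2003/ucd/1.0\"><repertoire>"
            + "<char cp=\"0041\" age=\"1.1\"/>"
            + "<char cp=\"08B3\" age=\"8.0\"/>"
            + "</repertoire></ucd>";
        System.out.println(codePointsWithAge(new StringReader(sample), "8.0"));
    }
}
```

| For the real file, pass a java.io.FileReader for ucd.all.flat.xml instead of the sample string. A character found this way |
| still has to be a valid identifier start or part to be useful in the regression test. |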
| |
| If you have any questions regarding this tool, please comment on bug report 506870: https://bugs.eclipse.org/bugs/show_bug.cgi?id=506870 |