Converting an Adapt It Regular Project to Adapt It Unicode


Introduction


Adapt It has supported two forms of projects for several years. On Microsoft Windows, we have called the older of the two forms Adapt It Regular (sometimes Standard). This format uses the ASCII character set, which uses a single byte of data for each character. Eventually a new standard called Unicode was developed internationally; it can use two or more bytes for each character. This larger size allows many more characters to be represented, so Unicode can represent non-Roman scripts in a locale-independent way. On Windows, Unicode is supported in the version called Adapt It WX Unicode, or simply Adapt It Unicode. (The Linux and Mac OS X versions of Adapt It have always been for use with Unicode texts.)


Adapt It Unicode is the preferred application of the two; Adapt It Regular is supported only to accommodate legacy translation projects running on old Windows machines that cannot support Unicode. If you want to convert from the non-Unicode Adapt It to Adapt It Unicode, the discussion below may help.


Important Note: This discussion presents a way that works only if the following condition is fulfilled.

Essential Condition: the adaptation documents produced by your Regular version contain no characters other than those of English (ASCII). If that is not true, do not follow the steps below; they would not produce acceptably converted files. (Instead, you would need a technical person to do an "encoding conversion" of each document file first, using a Unicode conversion utility such as TECkit. Then, once the conversion has been done, you could transfer the documents following the steps below.)
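One way to check the Essential Condition before starting is to scan the document files for any byte above 0x7F, since every ASCII character fits in the range 0x00 to 0x7F. The following is only a minimal sketch in Python; the function names are hypothetical and are not part of Adapt It, and it assumes the documents can simply be read as raw bytes.

```python
from pathlib import Path


def first_non_ascii(data: bytes):
    """Return (offset, byte value) of the first byte above 0x7F, or None."""
    for offset, value in enumerate(data):
        if value > 0x7F:
            return offset, value
    return None


def scan_folder(folder):
    """Map each file name in `folder` to its first non-ASCII byte, or None."""
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file():
            results[path.name] = first_non_ascii(path.read_bytes())
    return results
```

Calling scan_folder on the Adaptations folder of the Regular project returns None for every file that contains only ASCII bytes. Any file that reports an offset probably needs an encoding conversion first.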


A full discussion of the technical issues will not be attempted here. We will just make a couple of points. The procedure works because the English ASCII character set is included within the Unicode character set: in the two-byte form of Unicode (UTF-16), adding a byte with value zero to each ASCII character forms the valid equivalent Unicode character. That's why it works just for the English letters. And here's the most important detail: Adapt It Regular stores documents as single-byte characters, and Adapt It Unicode stores characters as sequences of one or more bytes in an encoding standard called UTF-8. For the English letters only, those ASCII bytes are IDENTICAL to the UTF-8 bytes. That means that adaptation documents produced by the Regular version, if they contain only English characters, will be read in by Adapt It Unicode 100% correctly. That's what the procedure below relies on.
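The byte-level claims above can be checked with a few lines of Python. This is an illustrative sketch only: the helper names are made up for the example, and the byte 0xE9 (the letter é in the Windows-1252 code page) stands in for whatever legacy character a real project might contain.

```python
def same_bytes_in_utf8(text: str) -> bool:
    """True if `text` is pure ASCII and encodes to the same bytes in UTF-8."""
    try:
        return text.encode("ascii") == text.encode("utf-8")
    except UnicodeEncodeError:
        return False  # text contains a non-ASCII character


def utf16_adds_zero_bytes(text: str) -> bool:
    """True if little-endian UTF-16 is each ASCII byte followed by a zero byte."""
    padded = b"".join(bytes([b, 0]) for b in text.encode("ascii"))
    return text.encode("utf-16-le") == padded


def reads_as_utf8(data: bytes) -> bool:
    """True if `data` is valid UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


sample = "Adapt It"
print(same_bytes_in_utf8(sample))             # True
print(utf16_adds_zero_bytes(sample))          # True
print(reads_as_utf8(sample.encode("ascii")))  # True
print(reads_as_utf8(b"\xe9"))                 # False: a lone legacy byte is not UTF-8
```

The last line shows why the Essential Condition matters: a single-byte legacy character above 0x7F is not valid UTF-8 on its own, so a document containing one would not be read correctly.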


So this post is limited to the procedure of converting from Adapt It Regular (non-Unicode) to Adapt It Unicode, when only the English character set is used for the source and target languages.


File organization


Each type of Adapt It application maintains a Work folder. In the case of Adapt It Regular this folder is called Adapt It Work. The folder name for Adapt It Unicode is Adapt It Unicode Work. Both Work folders can be found in the My Documents folder of a Windows-based machine.


Each Adapt It project resides in a subfolder within the Work folder. It might be tempting to simply copy or move the project subfolder from Adapt It Work to Adapt It Unicode Work, but that will NOT work: Adapt It Unicode uses different configuration files than Adapt It Regular, so the project folder would contain an incompatible project configuration file that might make Adapt It Unicode crash. Thus, we need to go through several steps to make a correct transition.
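To see how the two Work folders relate, the layout described above can be sketched as a small Python helper that lists the project subfolders in each. The function names are hypothetical, the location of the My Documents folder is passed in by the caller, and the sketch assumes the Unicode project would be given the same folder name as the Regular one.

```python
from pathlib import Path

REGULAR_WORK = "Adapt It Work"
UNICODE_WORK = "Adapt It Unicode Work"


def list_projects(documents_folder, work_name):
    """Return the names of the project subfolders in one Work folder."""
    work = Path(documents_folder) / work_name
    if not work.is_dir():
        return []
    return sorted(p.name for p in work.iterdir() if p.is_dir())


def projects_missing_in_unicode(documents_folder):
    """Regular projects with no same-named folder in Adapt It Unicode Work."""
    regular = list_projects(documents_folder, REGULAR_WORK)
    unicode_names = set(list_projects(documents_folder, UNICODE_WORK))
    return [name for name in regular if name not in unicode_names]
```

A project reported by projects_missing_in_unicode still needs to be created from within Adapt It Unicode itself, so that it gets compatible configuration files.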


Procedure

  1. Adapt It Unicode must contain an appropriate project to hold the documents you wish to convert. If such a project already exists, move on to the next step. If not, you can create a new Unicode project using the standard process within the Adapt It Unicode application. Run the Adapt It Unicode application and create a <New Project> going through the project setup wizard, then Cancel the wizard at the <New Document> wizard page. (This produces configuration files compatible with the Unicode version. It's an important step.)

  2. Copy document files from the non-Unicode project to the Unicode project.

    1. Using Windows Explorer, locate the non-Unicode project adaptation folder within the old Adapt It Work folder. The folder will have a name following the model “Source language name to Target language name adaptations”. (Here “Source language name” and “Target language name” stand for the actual language names.)

    2. Open this folder and find the subfolder named “Adaptations”. Open the Adaptations folder, select and copy all of the files in the folder.

    3. Using Windows Explorer, locate the corresponding Unicode project folder within the Adapt It Unicode Work folder.

    4. Paste the files into the Adaptations subfolder of that Unicode project folder. (Make no changes to them.)

  3. Determine how you wish to obtain a populated knowledge base for use in the Unicode project. Just copying the Regular version's knowledge base file to the project folder in the Unicode version's work folder should be all you need to do. The knowledge base file has the same name as the project folder, but with .xml at its end. The XML within it is UTF-8, and Adapt It Unicode should be able to read it successfully for the same reason that it can read the Regular version's document files -- provided they contain only English characters.

    If there is any problem with getting a populated knowledge base in the Unicode version, two alternative ways are given below (1. and 2.); if there are no problems, you don't need to try either.

    1. A good alternative way to get the knowledge base populated is to use the Restore Knowledge Base… command under the File menu, in the running Adapt It Unicode. This process is easy to do and is discussed in the Adapt It Help files. If you do it, you must have copied across all the adaptation document files first. The command uses the information contained in the adaptation documents to reconstruct the knowledge base file.

    2. Alternatively, you could use the Export and Import commands found under the Export-Import menu. This process is also discussed in the Adapt It Help files.

      1. In Adapt It Regular, use the Export Knowledge Base… command to create a dictionary file and save it in a folder where you can find it. Use the default “Standard Format” export type, which produces a dictionary file with standard format markers \lx and \ge (and some others, but \lx and \ge are the important ones -- if you see those in the file, then you've got the right kind of knowledge base export).

      2. Check the knowledge base file in the project folder within the Adapt It Unicode Work folder. If the knowledge base file is larger than 1 KB, delete it (using Windows Explorer). Adapt It Unicode will then make an empty new one when you next launch the application. You need to start with an empty knowledge base file. (An "empty" one has a size of 1 KB -- there is a little XML within it, so it is never 0 bytes.)

      3. Close Adapt It Regular and open Adapt It Unicode.

      4. Use the Import Knowledge Base… command, again using the default “Standard Format” type. Click OK and, in the “Filename for KB Import” dialog, locate the dictionary file (created in step 1 above), then click “Open”, which will merge the entries into the Unicode version's empty knowledge base file.

Once these steps have been done, you are ready to continue the translation process using Adapt It Unicode.
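For anyone who prefers to script the file copying rather than use Windows Explorer, the copy in step 2 and the knowledge base size check could be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the knowledge base file is assumed to be named after the project folder with .xml added, and "empty" is taken to mean no larger than 1024 bytes, which is an approximation of the 1 KB that Windows Explorer displays.

```python
import shutil
from pathlib import Path


def copy_documents(regular_project, unicode_project):
    """Copy every file from the Regular Adaptations folder to the Unicode one."""
    source = Path(regular_project) / "Adaptations"
    target = Path(unicode_project) / "Adaptations"
    if not target.is_dir():
        # The Unicode project must already exist (step 1 of the procedure).
        raise FileNotFoundError(f"No Adaptations folder in {unicode_project}")
    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():
            shutil.copy2(item, target / item.name)  # file contents are not changed
            copied.append(item.name)
    return copied


def knowledge_base_path(project):
    """Assumed knowledge base location: the project folder name plus .xml."""
    project = Path(project)
    return project / (project.name + ".xml")


def knowledge_base_looks_populated(project, empty_limit=1024):
    """True if the knowledge base file exists and exceeds the assumed empty size."""
    kb = knowledge_base_path(project)
    return kb.is_file() and kb.stat().st_size > empty_limit
```

The copy deliberately refuses to run when the Unicode project has no Adaptations folder, because creating the project from within Adapt It Unicode is what produces the compatible configuration files.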