Note: this article was originally posted in January 2015
The Problem (from a user’s email message):
Typing back translations in which ‘we’ forms may be ‘inclusive’ (of the hearers) or ‘exclusive’, to differentiate the English, I have entered we(inc) and we(exc) . What I have noticed is that the final “)” is dropped out of the adapted auto-inserted text, so that the meanings appear in the document as we(inc and we(exc . What can you tell me about this? Is there some way to directly edit the list of words that perform the conversion to make revisions/changes?
There is a difference between a punctuation character that is in the source text at document parse-in time and a punctuation character that you introduce into the target text by typing within the phrase box. Directly editing the knowledge base forms will not work, because the problem is not caused by what is stored in that database.
While this answer deals just with ( and ) being included in the punctuation settings as punctuation characters, the answer applies equally well to any punctuation not originally in the source text, but which has been introduced by you explicitly typing it into the phrase box as you work.
The automatic restoration of a stripped-off punctuation character relies on that stripping having been done when the document was created from the supplied source text file. The stripped-off punctuation character is then stored separately in the document within memory, and so can be automatically restored from there.
However, when you add an (inc) substring to the end of the word by typing in the phrase box, the following happens:
a) The ( character, even though it is listed as a punctuation character, is word-medial. Punctuation stripping works from both ends of the word towards the middle and stops at the first non-punctuation character encountered, so the stripping algorithm never sees the ( character. It therefore remains in the target text form sent to the knowledge base for storage, and for later retrieval when auto-inserting at any later document location that has the same source text word.
b) The ) character, however, is at the word end, so the punctuation stripping algorithm recognises it as a word-final punctuation character and strips it off before the rest of the phrase box contents is sent to the knowledge base. So we(inc is what is sent to the knowledge base for storage there.
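The behaviour described in a) and b) can be sketched in Python. This is only an illustration of the "strip from both ends, stop at the first non-punctuation character" idea, not Adapt It's actual code: the function name strip_punctuation and the PUNCTUATION set are assumptions made for the sketch, and the real set comes from your punctuation settings.

```python
# Hypothetical punctuation set; in Adapt It the real set comes from the
# Punctuation page of the Preferences.
PUNCTUATION = set(".,;:?!()\"'")


def strip_punctuation(word, punctuation=PUNCTUATION):
    """Split a word into (leading punctuation, core, trailing punctuation).

    Scans inward from each end and stops at the first character that is
    not in `punctuation`, so word-medial punctuation is never examined.
    """
    start = 0
    while start < len(word) and word[start] in punctuation:
        start += 1
    end = len(word)
    while end > start and word[end - 1] in punctuation:
        end -= 1
    return word[:start], word[start:end], word[end:]


leading, core, trailing = strip_punctuation("we(inc)")
print(core)      # we(inc
print(trailing)  # )
```

The core is the form that would go to the knowledge base; the medial ( survives because the scan from the left stops at the w, while the final ) is split off.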
Recall that this location in the document had no ) punctuation in the source text; you introduced the word-final ) manually by typing it into the phrase box. This results in different auto-insertion behaviour: the ) will not be automatically restored in later identical locations.
The reason is that, while the knowledge base is getting the right form to store, there is no “)” already stored within the document at the subsequent locations where the word form with (inc at its end is going to be inserted. So automatic replacement of that user-introduced final ) into the target text cannot happen, unless one of the following three things is done.
1) Every time a word form with (inc at its end is auto-inserted into the document, you would need to halt auto-inserting temporarily, move the phrase box back to that insertion location, and manually type in the final ) parenthesis. That would be tedious and frustrating, and I don’t recommend it, though it would work. (But read to the end of this blog item, because for certain scenarios it would be the only reasonable way to proceed if a solution within Adapt It itself is wanted.)
2) Do Preferences > Punctuation
and in the Punctuation page of the preferences, remove the ) from at least the Target text column of the line which is
) -> )
or remove ) from both Source and Target columns.
Just removing ) from the Target column is sufficient, provided you never use ( word-initially.
The target text will then not treat ) as a punctuation character. When you first supply the substring (inc) at the end of the word in the phrase box, the final ) will not get stripped off, and so the word form with the ) included will go into the knowledge base. Then, at each subsequent location in the document where an auto-insert is done, the word form with final (inc) will be inserted into the document, which is what you want.
3) Use Consistent Changes (within Adapt It) to get the final ) character replaced automatically where needed
See the comments at the end of this blog item, for how to do this.
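The effect of option 2) can be sketched in Python as follows. This is a simplified illustration, not Adapt It's code: strip_trailing and both punctuation sets are hypothetical names and values chosen for the sketch, standing in for the Target column of your punctuation settings before and after the change.

```python
def strip_trailing(word, punctuation):
    """Remove word-final characters that are in `punctuation`."""
    end = len(word)
    while end > 0 and word[end - 1] in punctuation:
        end -= 1
    return word[:end]


# Hypothetical target punctuation sets, before and after removing ")".
with_paren = set(".,;:?!()")
without_paren = with_paren - {")"}

print(strip_trailing("we(inc)", with_paren))      # we(inc
print(strip_trailing("we(inc)", without_paren))   # we(inc)
print(strip_trailing("we(inc).", without_paren))  # we(inc)
```

With ) no longer treated as punctuation, the complete form we(inc) is what reaches the knowledge base, while other final punctuation such as a full stop is still stripped as usual.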
Is 2) the best advice in all situations?
Possibly not – but it may be done if you take appropriate care.
Removing a punctuation character from the punctuation settings will ‘work’ in the sense that from that time onwards, words which you type with that punctuation character (let’s assume it is a comma) will include the final comma in the word form stored in the knowledge base; and so those can be restored to the document with the comma included. But the danger in this is that if you finish working with the current document, and create a new one from a new source text, and you forget to restore the comma to the punctuation settings, then the parse-in of the new document will not have commas stripped out. Then you are likely to think Adapt It has a bug – “it’s not stripping out commas!” you will think. But in fact the fault lies with you – you forgot to restore a comma to the list of punctuation characters.
Can Adapt It be reprogrammed to handle user-added punctuation characters in the target text more conveniently, so that removing them from the punctuation settings is not required?
Unfortunately, it’s not realistic to reprogram Adapt It to work the way you might want without either the punctuation settings change in 2) above or the use of Consistent Changes. There’s no way that Adapt It could accurately predict what the user is going to do at other potential locations lower down in the document, and provide special handling at the right places only.
But manually going back and typing the missing punctuation character at each location that lacks it, will always work – though it’s tedious to do so.
How can I make Consistent Changes fix this problem?
The ability to use the Consistent Changes (CC) software is built into every released version of Adapt It. Adapt It’s knowledge base makes no use of it, so using it is entirely optional.
However, CC, if turned on for use, applies the changes table or tables you supply whenever a source text word has no translation in the knowledge base; the changes are done invisibly and the final result is shown, selected, in the phrase box. You can use this behaviour to effect a neat solution to the problem under discussion, provided that the place where the change needs to be made in the source text word can be 100% reliably searched for and found by the CC software. In our present case, (inc would be a sufficiently long string for a search to reliably find only the places where we want the change to (inc) to be made. So for this scenario, CC would be ideal.
The following table line would do it, for the example above.
"(inc" > "(inc)"
i) Create a CC table with that line.
ii) Give the table file a convenient name. (You can create and name the CC table within the dialog you see after clicking the Load Consistent Changes item in the Tools menu.)
iii) Select the CC table for use, in the same dialog, after you’ve created it.
iv) Click OK to accept the dialog’s settings.
Thereafter, as auto-insertions take place, the wanted form will be correctly created in the phrase box at each subsequent location with the same source text; and you will not need to go back and manually fix things up.
Note: This Consistent Changes solution would not be useful if there were no way to write a CC table line that always found only the right place to make the needed change.
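To see what the table line does to a form coming out of the knowledge base, here is a rough Python emulation. It is not CC itself and does not use CC syntax: apply_change and ADD_CLOSING_PAREN are hypothetical names, and the guard that skips forms already ending in (inc) is an extra precaution added for this sketch, not something the table line above does.

```python
import re

# Emulates the single change "(inc" > "(inc)" on a phrase box string.
# The negative lookahead (?!\)) is an addition for this sketch: it leaves
# alone any form that already has the closing parenthesis.
ADD_CLOSING_PAREN = re.compile(r"\(inc(?!\))")


def apply_change(text):
    """Return `text` with every bare "(inc" completed to "(inc)"."""
    return ADD_CLOSING_PAREN.sub("(inc)", text)


print(apply_change("we(inc"))   # we(inc)
print(apply_change("we(inc)"))  # we(inc)
print(apply_change("we(exc"))   # we(exc
```

The last line shows that we(exc is untouched by this one change, so a second, parallel change for (exc would be needed to cover the exclusive form as well.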