Automatic Translation – Personal Translator Technological Bases

For over 20 years now, a team of philologists, computer linguists and computer scientists at Linguatec has driven forward the development of automatic translation technology, consequently making Personal Translator one of the leading programs in this field.

For the development of the neural context network, Linguatec spent months reading in documents in German and other languages and analyzing them using the latest analysis programs.

The Linguatec corpus is a collection of texts with a scarcely conceivable volume of over 1.55 billion word forms (printed it would be a stack of paper 125 meters high), making it one of the largest text corpora in the world.

It is this gigantic text corpus that has enabled innovative use of a neural network in a translation program.

Automatic Translation: Grammar and method of working

Translate with the intelligence of neural networks

Personal Translator is the first PC-based translation system in the world which, in addition to strictly rule-based translation technology, contains a newly developed component that simulates the associative thinking processes of the human brain according to the principle of neural networks and makes them usable for translation. Whilst rule-based translation focuses entirely on an individual sentence, the neural network also analyzes the context and transfers it to a semantic network.

The neural network of Personal Translator comes into effect whenever the rule-based approach delivers no result or only an unclear one. This is particularly useful in the following functions:

  • The subject area recognition function automatically assigns the correct subject areas to a document and makes sure that instead of the general translation (for example, board = Brett) a suitable specialist term is used (e.g. Aufsichtsrat in a business context).
  • The name recognition function automatically identifies proper names of people, places or organizations in order that they may be used in the translation without any problems.
  • Thanks to context analysis, the neural transfer function can offer a correct translation in cases of ambiguity if the current sentence offers no starting point for the application of a rule; for example, He holds a bill in his hand (bill = Rechnung or Geldschein) or Er stand vor der Bank (Bank = bench or bank). The neural transfer developed by Linguatec has an international patent pending.

The intelligent hybrid method, which is used for the first time in Personal Translator 2006 and combines rule-based translation technology with a neural network, ensures a considerably improved translation of terms with several meanings, unclear expressions, proper names and incomplete sentences, and also noticeably reduces manual editing.
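The fallback idea described above can be illustrated with a small sketch. It is not Linguatec's method: a simple cue-word overlap stands in for the neural network, and the lexicon SENSES, its cue words, and the function names are all hypothetical. The sketch only shows the control flow: try the sentence alone first, and consult the wider context only when the sentence gives no clear answer.

```python
import re

# Hypothetical toy lexicon: candidate translations with illustrative cue words.
SENSES = {
    "bill": {
        "Rechnung": {"invoice", "pay", "restaurant", "waiter"},
        "Geldschein": {"wallet", "cash", "dollar", "banknote"},
    },
}

def tokens(text):
    """Lower-case the text and return its words as a set."""
    return set(re.findall(r"[a-z]+", text.lower()))

def pick_sense(word, text):
    """Return the candidate whose cues match `text` best, or None if unclear."""
    words = tokens(text)
    scores = {t: len(cues & words) for t, cues in SENSES[word].items()}
    best = max(scores.values())
    winners = [t for t, s in scores.items() if s == best]
    # "Unclear" here means: no cue matched at all, or several candidates tie.
    return winners[0] if best > 0 and len(winners) == 1 else None

def translate_word(word, sentence, context):
    """Try the sentence alone first; fall back to the surrounding context."""
    sense = pick_sense(word, sentence)
    if sense is None:
        sense = pick_sense(word, sentence + " " + context)
    return sense

print(translate_word("bill", "He holds a bill in his hand.",
                     "He opened his wallet and counted the cash."))
```

With a context such as "The waiter brought the invoice." the same sentence would resolve to Rechnung instead; with no context at all the sketch returns None, since nothing in the sentence itself settles the choice.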

Slot grammar

The rule-based translation technology that forms the basis of Personal Translator uses the principles of slot grammar, a grammatical description method developed originally by IBM. The basic idea of slot grammar is that every sentence or part of a sentence has a central element (its head) and modifiers. For each head, it is possible to determine what kind of slots are available for modifiers (fillers). The slots can be determined by the part of speech or by the word itself. Therefore, almost every noun can be modified by adjectives, but only particular nouns can be modified by complements with the preposition to, for example: message to the company. The slots dependent on individual words must be specified in the dictionary with the definition of the word, since otherwise source language sentences cannot be correctly analyzed and translated.

Slots

The central concept of the slot grammar system is the slot. Slot grammar is based on the premise that every word has its own particular slots. Here, every meaning of a word must be taken into account. The word house, for example, has different slots as a verb than as a noun. The concept of a slot is related to conventional concepts like complement, object and attribute.

The slots are filled by sentence parts (fillers), which can be individual words or even entire sentences. The German verb schenken, for example, specifies that the person giving the present appears as the subject in the sentence, the person receiving the present as the indirect object, and the actual present as the accusative object:

Er schenkt dem Kind ein Auto.

Slots that belong to a particular word are entered with that word in the dictionary. Since a word can have several slots, one talks of the slot frame of the word (or the case governed by the word). The slot frame of the verb schenken comprises subject, accusative object and indirect object.

Slots can be optional, like accusative and indirect object in the case of schenken, or they can be obligatory, like the accusative object of verursachen. Optional slots can remain empty; obligatory slots must always be filled to create a complete sentence.
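One way to picture a slot frame is as a small data structure attached to each dictionary entry. The following is a minimal sketch, not the format of the Personal Translator dictionary: the names Slot, SLOT_FRAMES and missing_obligatory are hypothetical, and treating the subject as obligatory is an assumption made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Slot:
    name: str
    obligatory: bool

# Hypothetical dictionary entries; marking the subject as obligatory is an
# assumption, the optional/obligatory objects follow the text above.
SLOT_FRAMES = {
    "schenken": [
        Slot("subject", True),
        Slot("accusative object", False),
        Slot("indirect object", False),
    ],
    "verursachen": [
        Slot("subject", True),
        Slot("accusative object", True),
    ],
}

def missing_obligatory(verb, filled):
    """List the obligatory slots of `verb` that are not in `filled`."""
    return [s.name for s in SLOT_FRAMES[verb]
            if s.obligatory and s.name not in filled]

print(missing_obligatory("schenken", {"subject"}))     # nothing missing
print(missing_obligatory("verursachen", {"subject"}))  # accusative object missing
```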

Fillers

Slots can be filled with different sentence parts. These are called fillers and must be specified for the individual slots for each word.

With a word like schenken, the accusative and indirect objects are nominal groups (sentence parts whose head is a noun). With the verb vergessen, on the other hand, the accusative object slot can be filled by clauses beginning with dass (that) and by infinitive clauses as well as by nominal groups:

Er hat das Prinzip vergessen.
Er hat vergessen, dass das Prinzip gilt.
Er hat vergessen, das Prinzip zu beachten.
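The filler restrictions in these examples could be recorded per word and per slot, roughly as sketched below. The table ALLOWED_FILLERS, the filler type labels and the function can_fill are hypothetical names chosen for illustration, not the actual dictionary format.

```python
# Hypothetical table: which filler types each slot of a verb accepts.
ALLOWED_FILLERS = {
    ("schenken", "accusative object"): {"nominal group"},
    ("schenken", "indirect object"): {"nominal group"},
    ("vergessen", "accusative object"): {
        "nominal group",      # Er hat das Prinzip vergessen.
        "dass-clause",        # Er hat vergessen, dass das Prinzip gilt.
        "infinitive clause",  # Er hat vergessen, das Prinzip zu beachten.
    },
}

def can_fill(verb, slot, filler_type):
    """Check whether `filler_type` is allowed in `slot` of `verb`."""
    return filler_type in ALLOWED_FILLERS.get((verb, slot), set())

print(can_fill("vergessen", "accusative object", "dass-clause"))
print(can_fill("schenken", "accusative object", "dass-clause"))
```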

The way Personal Translator works

Personal Translator breaks down a text into individual sentences (sometimes fragments of sentences) and then proceeds sentence by sentence. A sentence is first broken down into individual words. These words are reduced to their basic form and looked up in the dictionary. The grammatical properties and possible translations are assigned to the words. A syntactic analysis of the sentence then takes place, in which the sentence is broken down into its individual components (sentence parts). Actual translation then takes place in two phases: first, the lexical transfer, in which each word is assigned a valid translation for the specific context based on the translation conditions, and then the structural transfer, in which correct word order in the translation is ensured and other necessary structural changes are made (see also Correlation of source and target language complements and Transformations). Finally, the correct word forms are generated and the translation is produced in its final form.
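The first steps of this process (splitting into sentences, splitting into words, reducing to the basic form, dictionary lookup) can be sketched as follows. The tables LEMMAS and DICTIONARY are tiny hypothetical stand-ins for a real morphology component and dictionary, and the later stages (syntactic analysis, lexical and structural transfer, generation) are deliberately left out.

```python
import re

# Hypothetical toy data: inflected form -> basic form, and basic form ->
# (part of speech, possible translations).
LEMMAS = {"er": "er", "schenkt": "schenken", "dem": "der",
          "kind": "Kind", "ein": "ein", "auto": "Auto"}
DICTIONARY = {
    "er": ("pronoun", ["he"]),
    "schenken": ("verb", ["give"]),
    "der": ("determiner", ["the"]),
    "Kind": ("noun", ["child"]),
    "ein": ("determiner", ["a"]),
    "Auto": ("noun", ["car"]),
}

def split_sentences(text):
    """Split after sentence-final punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def analyze(sentence):
    """Return (word, basic form, part of speech, translations) per word."""
    result = []
    for word in re.findall(r"\w+", sentence):
        lemma = LEMMAS.get(word.lower(), word)
        pos, translations = DICTIONARY.get(lemma, ("unknown", []))
        result.append((word, lemma, pos, translations))
    return result

for sentence in split_sentences("Er schenkt dem Kind ein Auto. Er schenkt ein Auto."):
    print(analyze(sentence))
```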

Correlation of source and target language complements

To translate a sentence, not only must the equivalent words in the source and target language be known but also the respective slot frames. For the German and English slots (complements), standard equivalents are defined. Thus, the direct object in English is assigned to the German accusative object as the standard equivalent.

Example:

begleiten   –   subject, accusative object
accompany   –   subject, direct object

Note the difference in this example:

bedürfen   –   subject, genitive object
require   –   subject, direct object

For genitive objects, there is no standard equivalent.
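The correlation of slot frames can be pictured as a default mapping plus word-specific entries where no default exists. This is a sketch under assumptions: the subject-to-subject default is inferred from the examples above, and the table WORD_SPECIFIC, which supplies the mapping for the genitive object of bedürfen, is a hypothetical way of representing an entry stored with the word in the dictionary.

```python
# Standard equivalents (German slot -> English slot). The accusative/direct
# object pair is stated in the text; subject -> subject is an assumption.
STANDARD_EQUIVALENTS = {
    "subject": "subject",
    "accusative object": "direct object",
}

# Hypothetical entries: (English verb, German slot frame).
FRAMES = {
    "begleiten": ("accompany", ["subject", "accusative object"]),
    "bedürfen": ("require", ["subject", "genitive object"]),
}

# Hypothetical word-specific mappings for slots without a standard equivalent.
WORD_SPECIFIC = {("bedürfen", "genitive object"): "direct object"}

def map_slots(german_verb):
    """Return the English verb and a German-slot -> English-slot mapping."""
    english_verb, slots = FRAMES[german_verb]
    mapping = {}
    for slot in slots:
        target = WORD_SPECIFIC.get((german_verb, slot),
                                   STANDARD_EQUIVALENTS.get(slot))
        if target is None:
            raise LookupError(f"no equivalent for {slot!r} of {german_verb!r}")
        mapping[slot] = target
    return english_verb, mapping

print(map_slots("begleiten"))
print(map_slots("bedürfen"))
```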

Transformations

Personal Translator distinguishes two types of transformations:

  1. Lexical transformations, which are bound to specific lexical units. The respective transformations are entered in the Personal Translator dictionary with the relevant words or expressions.
  2. Structural transformations, which describe general structural differences between German and English. They are part of Personal Translator’s transfer component.

Examples of structural transformations:

End position of the verb in German subordinate clauses compared with normal word order in English:

English: It is good that he has come.
German: Es ist gut, dass er gekommen ist.

The do construction in English interrogative sentences:

English: Did you answer the letter?
German: Beantworteten Sie den Brief?

The do construction with negation:

English: I didn’t answer the letter.
German: Ich beantwortete den Brief nicht.

Objects placed in front in German:

German: Diesen Brief beantwortete ich nicht.
English: I didn’t answer this letter.

Constructions with modal verbs:

German: Er hatte das Buch nicht lesen wollen.
English: He hadn’t wanted to read the book.
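As an illustration of what a structural transformation does, the sketch below handles only the first example: moving the verbal elements to the end of a German subordinate clause. The role labels and the function name are hypothetical, and the rule is deliberately simplistic; it does not cover constructions with modal verbs or other exceptions.

```python
def german_subordinate_order(constituents):
    """Reorder (role, word) pairs so that verbal elements come last.

    Non-finite verb forms are placed before the finite verb, as in
    "dass er gekommen ist". Toy rule with hypothetical role labels.
    """
    others = [w for role, w in constituents if role not in ("finite", "nonfinite")]
    nonfinite = [w for role, w in constituents if role == "nonfinite"]
    finite = [w for role, w in constituents if role == "finite"]
    return others + nonfinite + finite

# Word-for-word transfer of "that he has come", still in English order.
clause = [("conjunction", "dass"), ("subject", "er"),
          ("finite", "ist"), ("nonfinite", "gekommen")]
print(" ".join(german_subordinate_order(clause)))
```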

Lexical transformations

When words or constructions are translated, it is not always possible for part of speech and slot frame to correspond exactly. Personal Translator contains lexical transformations, which can handle a large number of structural differences between German and English constructions.

Some verbs do not require an object in the source language, but do in the target language.

There are numerous examples:

English: He golfed.
German: Er spielte Golf.

English: I bank at Barclay’s.
German: Ich habe ein Konto bei Barclay.

English: I inconvenienced him.
German: Ich bereitete ihm Umstände.

English: He hiccuped.
German: Er hatte Schluckauf.

An English construction with a predicate adjective can often be translated with a German verb:

English: He is aware of the new situation.
German: Er weiß von der neuen Situation.

An English subject can be replaced by an indirect object in German, and the accusative object can become the subject; the word order changes with like but not with lack:

English: I like it that the vase is red.
German: Es gefällt mir, dass die Vase rot ist.

English: I lack money.
German: Mir fehlt Geld.
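A lexical transformation of this kind can be thought of as a role remapping stored with the word. The sketch below is hypothetical in its names and data layout (LEXICAL_TRANSFORMATIONS, transfer_roles); it only shows how the English subject ends up as the German indirect object and the English object as the German subject. Word order and inflection are not handled.

```python
# Hypothetical dictionary entries: German verb plus a mapping from English
# roles to German roles.
LEXICAL_TRANSFORMATIONS = {
    "like": {"verb": "gefallen",
             "roles": {"subject": "indirect object", "direct object": "subject"}},
    "lack": {"verb": "fehlen",
             "roles": {"subject": "indirect object", "direct object": "subject"}},
}

def transfer_roles(english_verb, fillers):
    """Return the German verb and the fillers keyed by their German roles."""
    entry = LEXICAL_TRANSFORMATIONS[english_verb]
    remapped = {entry["roles"][role]: filler for role, filler in fillers.items()}
    return entry["verb"], remapped

# "I lack money." -> fehlen, with "I" as indirect object and "money" as subject
print(transfer_roles("lack", {"subject": "I", "direct object": "money"}))
```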

Verbs that have a causative meaning when used transitively, such as drop, are replaced in German with lassen, for which there is no direct equivalent in the English sentence.

English: He dropped it.
German: Er ließ es fallen.

English: You should hear me out.
German: Sie sollten mich ausreden lassen.

Some English constructions with an infinitive are best translated with an adverb in German:

English: I like to read books.
German: Ich lese gerne Bücher.

English: She happened to find the book he lost last week.
German: Sie fand zufällig das Buch, das er letzte Woche verlor.

Some verbs in passive constructions are translated in German with reflexive verbs in the active voice.

English: He said that he was injured.
German: Er sagte, dass er sich verletzte.

The most common lexical transformation, which translates have with sein, is available when defining words in Personal Translator.

English: She has walked to the house.
German: Sie ist zum Haus gegangen.
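The choice between haben and sein can be sketched as a flag on the dictionary entry. The tables GERMAN_VERBS and AUX_FORMS and the field name perfect_aux are hypothetical; they illustrate the idea and are not the settings offered by Personal Translator when defining words.

```python
# Hypothetical dictionary entries keyed by the English verb.
GERMAN_VERBS = {
    "walk": {"participle": "gegangen", "perfect_aux": "sein"},
    "answer": {"participle": "beantwortet", "perfect_aux": "haben"},
}

# Present-tense forms of the two auxiliaries (only the persons used here).
AUX_FORMS = {
    "haben": {"1sg": "habe", "3sg": "hat"},
    "sein": {"1sg": "bin", "3sg": "ist"},
}

def perfect_auxiliary(english_verb, person="3sg"):
    """Return the German auxiliary form for the perfect of `english_verb`."""
    entry = GERMAN_VERBS[english_verb]
    return AUX_FORMS[entry["perfect_aux"]][person]

# "She has walked ..." -> "Sie ist ... gegangen."
print(perfect_auxiliary("walk"), GERMAN_VERBS["walk"]["participle"])
```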