Blog post written by our Polish team leader, Albin W, PhD., Poland
Both: Machine Translation and the ‘Twin Peaks’ TV series
What a strange title, you may think. Yes, it is a bit weird, but once we dive into the complications of Polish grammar you will see the connection.
In Polish you will find a number of words for ‘both’. To choose the proper one you need to know the genders of the two subjects. The form of ‘both’ also depends on the grammatical case used. I have counted eight basic forms of ‘both’ in Polish (obaj, obie, oboje, obu, obiema, oboma, obojgu, obojgiem) and even more of the longer synonyms (obydwie, obydwu, and so on).
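To make the gender dependence concrete, here is a minimal Python sketch. It is deliberately simplified: it covers only the nominative case and only pairs of people, and the function name and the 'm'/'f' labels are illustrative assumptions. The remaining case forms (obu, obiema, obojgu and so on) are left out.

```python
def both_nominative(gender_a: str, gender_b: str) -> str:
    """Pick the nominative form of 'both' for a pair of people.

    Simplified sketch: 'm' = man, 'f' = woman. Other grammatical
    cases and non-personal nouns are intentionally not handled.
    """
    genders = {gender_a, gender_b}
    if genders == {"m"}:
        return "obaj"   # two men
    if genders == {"f"}:
        return "obie"   # two women
    if genders == {"m", "f"}:
        return "oboje"  # a man and a woman
    raise ValueError("expected 'm' or 'f' for each person")


print(both_nominative("m", "m"))  # obaj
print(both_nominative("m", "f"))  # oboje
```

The point of the sketch is that the choice cannot be made from the English word ‘both’ alone: the function needs information about the two people that the source sentence may not state.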
Yet another difficulty is that in Polish the words ‘obaj’ and ‘obie’ are used as adjectives or pronouns but never as a conjunction. So if you had to translate this title into Polish, you would need to reword it, for example as ‘Two things:…’.
This is one of many illustrations of why accurate machine translation may remain only a dream for the near future. Machine translation can be an effective tool for accelerating a translator’s work, especially for schematic texts. I have personally written a macro in Microsoft Word which gives me suggestions for English phrases of 1 to 5 words (the longer the phrase, the higher its priority). This makes sense, because often a few words together constitute a meaningful phrase, while the single words on their own are meaningless. I use this macro while translating patent documents, where many words and short phrases repeat, and it allows me to work faster as well as to preserve consistency (the latter being especially important in patents). My dictionary gets larger with every new document. So, in my opinion there are great possibilities for improving machine translation, but I am still very skeptical about ever seeing high-quality output, even using my macro or any similar approach.
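The macro itself is not reproduced here, so the following is only a Python sketch of the idea described above, not the actual Word macro. It looks up phrases of 1 to 5 words in a glossary and tries the longest phrase first. The function name and the three glossary entries are hypothetical examples, and punctuation handling is omitted.

```python
def suggest(text: str, glossary: dict[str, str], max_len: int = 5):
    """Return (phrase, suggestion) pairs, preferring longer phrases."""
    words = text.lower().split()
    suggestions = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first (up to max_len words).
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in glossary:
                suggestions.append((phrase, glossary[phrase]))
                i += n  # skip the words covered by this phrase
                break
        else:
            i += 1  # no glossary entry starts at this word
    return suggestions


# Hypothetical glossary; a real one would grow with every document.
glossary = {
    "according to the invention": "według wynalazku",
    "prior art": "stan techniki",
    "art": "sztuka",
}

print(suggest("The device according to the invention differs from prior art", glossary))
# [('according to the invention', 'według wynalazku'), ('prior art', 'stan techniki')]
```

Because longer phrases win, ‘prior art’ is suggested as a unit and the single word ‘art’ is never offered on its own in that position, which is how such a lookup helps keep terminology consistent.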
If you remember the ‘Twin Peaks’ series, you may recall a scene (Episode 1, the first episode after the pilot) in which Bobby Briggs is having dinner with his parents. At the end of his monologue Major Briggs says: ‘To have his path made clear is the aspiration of every human being in our beclouded and tempestuous existence. Robert, you and I are going to work to make yours real clear.’ In the Polish version of the script the last sentence reads: ‘Obaj dołożymy starań, by twoje ścieżki były naprawdę proste’, which translates to: ‘We will both make efforts for your paths to be really clear.’ In this situation using ‘both’ in English could lead to ambiguity (maybe Major Briggs will make efforts together with his wife?), while in Polish it is unambiguous (because ‘obaj’ is used only when both subjects are masculine).
A translation in this form suits a movie better than a literal one would, because it is shorter and less complicated. But it is unlikely to be the result of machine translation, as it requires knowing the gender of both persons involved, and while this knowledge is readily available to a human, it may not be so easily accessible to a machine. If the computer does not understand all the context related to culture, human interactions, laws of physics and so on, it probably cannot produce superior results. Such an understanding requires the computer to think like a human, an idea put forward many years ago with essentially no real progress until now, despite the fact that we have lots of so-called artificial intelligence systems. We are still a long way from creating perfect machine translation, and we might never get there when a language like Polish has such a complicated grammatical structure.