Translating Medical Text: Why It’s Important to Understand Post-translation

October 21, 2021 by Essay Writer

Arabic is widely used by doctors and specialists, especially those working in Arab countries. However, most medical texts and reports are written in English, so translation from English to Arabic is required. This translation faces several challenges, such as a lack of accuracy and an inadequate grasp of the ideas behind medical terms. According to Aljlayl & Frieder (2001), most doctors and specialists working in the Arab world use English when writing medical texts or reports, even prescriptions. Expressing terms requires accuracy, knowledge and an understanding of the ideas behind them. Science and technology use language characterised by complex terminology, which makes such terms difficult to translate.

As stated by Al-Ma’ni (2000), post-translation has played a major part in mitigating the effects of communication and cultural barriers. Translating scientific concepts and content can prove challenging because it calls for greater accuracy and understanding to decode the underlying message. Understanding the terms in a text plays a major part in translating and expressing them; however, the target language is of equal importance. Scientific translation also aims for a high level of precision in the words used in both the “Target Language Text” (TLT) and the “Source Language Text” (SLT). This must be achieved without losing the authenticity of the message.

Gass & Selinker (2008) note that the problem of maintaining authenticity arises when translation equivalence (sameness or similarity) cannot be obtained. Unfortunately, perfect translation equivalence is not possible, because each language has its own lexical, textual and grammatical systems that distinguish it from other languages (Gey & Oard, 2001). Cultural and textual/semantic challenges are the major ones encountered in translating a text from English to Arabic. It is also important that the translator understand the medical terms or texts in both languages. Therefore, this essay recommends that a translator recognise the need to understand the medical texts presented for post-translation. Such an understanding is required in both the source and target languages. Most importantly, medical practitioners should be able to determine the sensitivity of the texts being translated.

Semantic challenges

Montalt and Gonzalez (2007) state, “there exists a wide cultural differences between English-speaking world from that of Arabic world.” Some of these differences are purely semantic. Cultural differences often lead to a term or a text being translated literally. In most cases, literal English-Arabic translation of terms sacrifices the accuracy of the message, so the original meaning can be lost. Transliteration and arabization from the SL (English) to the TL (Arabic) likewise lead, in most cases, to a loss of the accuracy of the original meaning. Transliteration and arabization allow Latin letters to be conveyed in Arabic letters, but they do not actually overcome the challenges presented by non-equivalence and neologism.

Montalt and Gonzalez (2007) also add, “The semantic relationship present within the medical compound elements proves hard to work on.” This is because the choice of words that accurately represent the meaning is limited. In addition, actual translations of such words cannot be found in bilingual dictionaries. Knowledge of medical terms in both the SL and the TL is therefore important. For instance, knowing what a suffix in a term means in both languages can be paramount in achieving accuracy. This requires the translator to have a sound understanding of Latin and Greek, which makes it easier to know the meaning of the prefixes and suffixes used and in turn can help boost accuracy. Nevertheless, understanding the meaning of the prefixes and suffixes does not guarantee the accuracy of post-translation.

Still, finding words with equivalent meaning in English and Arabic is not always guaranteed, because some terms or phrases have more than one meaning in one language, say English, but not in Arabic (Montgomery, 2000). Knowledge of the subject field of related texts is important in translation. Failure to recognise the textual level during translation may also undermine cohesion and attention, because the translator may end up ignoring the context of the message. Accuracy and precision in translating terms are paramount, as they play a critical role in the structure and context of the subject text. Post-translation may therefore require the translator to understand the meaning of the message carefully before translating. In practice, context reliably provides guidelines for post-translation, which can be used to determine the most applicable meaning as activated from the source code (Gonzalez, 2007). In addition, context gives an insight into the intended meaning of a text. Equally important to note is that words were generalised during the translation, which altered the meaning of the term for the target group. This is because of the availability of equivalent terms for words or phrases. The meaning of any medical text in the Source Language (SL) directly affects the way the text is translated in the Target Language (TL). Put simply, a lack of knowledge about the subject matter results in a loss of accuracy in the TL.

Cultural / Terminology Challenges

Cultural differences between the English-speaking world and the Arabic world create translation obstacles. Determining the right medical phrase requires the translator to have knowledge of both the “Source Language” (English) and the “Target Language” (Arabic). The translator should clearly determine the choice of words suited to the target audience. That is, post-translation should take into consideration “who” will use the text being translated. For example, “alhimaq” in Arabic should be translated to “Varicella” in English. If the text is meant for the patient, the right choice of term, “chickenpox”, should be used instead. In most cases where experience in translating medical texts is limited, a translator may render a drug name as what is referred to as a “Target Culture Equivalent” (TCI). However, this criterion may not work, simply because while a text may refer to a drug by the name under which it is known in English, at other times English uses brand names.

In addition, cultural differences affect linguistic competence. During translation, a translator should possess knowledge of the principles and rules governing the structure of the source language and the target language. Moreover, to achieve translation equivalence, comprehensive competence should be accounted for: the translator should be able to extract information from the source language, which enables the translator to analyse a text semantically and pragmatically. Accuracy in translation is also determined by encyclopedic competence, the translator’s general knowledge of both languages. The translator’s efficiency in working with both languages is paramount. That is, the translator should be able to express what he or she understands of a medical text in the “Target Language” without losing the authenticity of the original message.

Nevertheless, cultural differences affect how competent a translator is during the translation. The translator should be able to reconstruct the meaning of the “Source Language” into a “Target Language” text without inconveniences. Understanding the textual and cultural features is vital in translation. Also, understanding the culture of the target language is a requisite to translation competence and accuracy. To address the translation competence, the translator should proceed to analyse the language structure, elements and language patterns of the target language.

According to Gass & Selinker (2008, p. 449), lexical knowledge may also be of great importance during post-translation. In fact, this knowledge is regarded as the most important component in any translation stage. What matters here is that translators should be careful in their choice of words. Some texts, if translated directly or literally, will have their meaning distorted. Translators ought to comprehend a culture’s choice of words before translating a message from the source language to the target language. Most of the errors and problems experienced during post-translation result from non-equivalence between the “Source Language” and the “Target Language” (Baker, 1992). The translator must identify areas of emphasis that must be translated from the “Source Language” to the “Target Language.” The challenge is bigger when an emphasised word does not exist in the target language.

Challenges of State of Art

State of Art refers to the different medical practices adopted by medical specialists and practitioners in health and medical domains. These practices vary because of what is referred to as diglossia, a sociolinguistic phenomenon in which a language is used differently for different social purposes. For instance, the Arabic dialect is not used for teaching pharmacy, medicine and other related programmes in all Arab countries. For this reason, it would prove difficult to translate the English dialect into Arabic for teaching health-related programmes in Arab countries. Since most professional medical articles are written in English, it is hard to establish the dialect to be used. Dialect refers to the particular form of language that emerges from a certain social organization. A dialect may take considerable time to develop, simply because it is neither taught nor found in written text. This explains why it may take time to establish the dialect of Arabic for teaching in Arab universities.


Generally, translating medical texts from a source language to a target language demands a high level of accuracy and consistency. Despite the various challenges encountered in translation, some medical terms can be translated with ease. However, complex structures and medical compound terms, such as hypergammaglobulinaemia and videofluoroscopy, pose a challenge during translation. Translating medical terms can be quite challenging, especially for less experienced translators. Finding equivalent terms when translating from the Source Language to the Target Language may be affected by differences in culture and by semantic problems, among others. Translation of English-Arabic medical text has greatly contributed to the evolution and development of the medical field (Montgomery, 2000). As Schubet (1987) puts it, environment determines the accuracy and competence in translation. An interdisciplinary approach is paramount in solving issues related to non-equivalence in translation. Translators must train to deal with the challenges involved in translating technical terms. Along this line, Sanchez (2010, p. 186) recommends equipping all medical translators with the technical know-how required during translation. The rationale is that the lack of technical translators is the major setback in the post-translation stage.

Translators should have knowledge of the structure and culture of the target language. In the absence of an equivalent term, the translator should consult medical specialists to obtain knowledge of the medical text and its meaning in English before translating it into Arabic. In addition, translators need to validate the information before translating the medical text. However, translators experience difficulties in coping with the challenges involved in translation, since ambiguous and new terms in the source language lack equivalent terms in the target language.



A Corpus Based N-gram Hybrid Approach of Bengali to English Machine Translation

October 21, 2021 by Essay Writer


Machine translation means automatic translation performed by computer software. There are several approaches to machine translation; some require extensive linguistic knowledge, while others demand heavy statistical computation. This paper therefore introduces a hybrid methodology that integrates a corpus based approach and a statistical approach for translating Bengali sentences into English with the help of an N-gram language model. The corpus based approach finds the corresponding target translation by selecting the best matching text from the bilingual corpus to acquire knowledge, while the n-gram model rearranges the sentence constituents to obtain an accurate translation without employing any external linguistic rules. A variety of Bengali sentences with various structures and verb tenses are considered for translation. The performance of the proposed system is evaluated in terms of WER, BLEU and F-measure and compared with other conventional singleton approaches as well as Google Translate, a well-known machine translation service by Google. The experimental results show that this work achieves a BLEU score of 0.87, higher than Google Translate and the other methods.


Machine Translation, abbreviated as MT, is the application of computers to automate some or all of the process of transforming text between any pair of natural human languages while preserving the meaning and interpretation of both the source and target languages [1]. It belongs to the fields of Natural Language Processing (NLP) and Computational Linguistics (CL). Though much research has been conducted in this area, producing a completely automated translation machine is still a challenging job, because human languages are complex in practice and have varied characteristics. The major barriers to translating human languages by computer are: word order (different languages follow different orders of sentence constituents); word sense ambiguity (the same words and phrases have different meanings); syntactic complexity (sentences are often constructed with irregular grammar rules); lexical variance (a word in one language must be expressed by a group of words in another); and elliptical and ungrammatical sentence constructions. Research to overcome these shortcomings is ongoing.

At present, different types of methods are used for machine translation, such as direct, transfer, interlingua, corpus based and statistical approaches. In this paper, a new approach is proposed for Bengali to English automatic translation. The new method blends the ideas of the corpus based approach and the n-gram language model of IBM.

The rest of the paper is organized as follows: Section II reviews previous research on this topic. Some core machine translation approaches are discussed in Section III. Section IV describes the proposed hybrid approach with its complexity analysis, and Section V presents the experimental results, including the corpus and a comparative study. Finally, Section VI concludes the paper with some future directions.

Related research

Bengali, also known by its endonym Bangla (বাংলা), is the sixth most spoken language in the world by population, with approximately 250 million speakers worldwide. Unfortunately, only a few research works have explored machine translation software that uses Bengali as the source language and English as the target language. Most Bengali MT systems address English to Bengali translation, for which several rule based and statistical approaches have been explored. Moreover, researchers have been more concerned with verb tenses than with sentence types (simple, complex, compound). This research takes this fact into consideration and works on both sentence types and tenses.

One study introduces new parameters of statistical machine translation (SMT), alongside the existing ones, to translate complex Bengali sentences. Another initiates a rule based approach that considers the influence of verb and case in Bengali assertive and interrogative sentences. A later work develops a transfer based algorithm that accounts for the meaning and context of Bengali sentences. A further framework is designed using context sensitive grammar rules, and another empirical framework is modeled to translate imperative, optative and exclamatory Bengali texts. Meanwhile, some researchers have built systems that translate Bengali into languages other than English: one presents an architecture integrating the transfer method and statistical machine translation for Bengali to Hindi translation, and another describes a system that translates Bengali texts to Assamese using Moses (a tool for MT).

Core Machine Translation Approaches

The ideas and techniques of machine translation involve linguistics, computer science, artificial intelligence, automata theory, translation theory and statistics. Different approaches are applied to automate translation. Some of the core methods are discussed in this section.

MT methods can be classified into two main categories: 1) Rule based and 2) Example based. Rule based MT (RBMT) strategies are essentially knowledge based techniques. Linguistic knowledge in the form of rules is applied externally, separating sentences into possible linguistic units for both the source and the target language. RBMT methodologies require syntactic, semantic and morphological analysis in the context of grammar and lexicon. These approaches are: i) Direct, ii) Transfer and iii) Interlingua. Example based approaches are mostly data driven and analogy based, relying on a set of translated texts already stored in a bilingual database. Methods of this kind are: i) Corpus based and ii) Statistical machine translation.

Direct Approach

The most primitive MT method is direct translation, which is implemented between pairs of languages and is based on morphological analysis and glossaries. It relies heavily on dictionary look-up. In the direct translation approach, the SL text is analyzed morphologically with respect to the source and target language pair. The direct approach has five steps to translate:

Source sentence: তারা ফুটবল খেলছে।

Morphological Analysis:

তারা ফুটবল খেলছে PRESENT CONTINUOUS

Constituent Identification:


Dictionary Look up:


They are playing football

Target sentence: They are playing football.
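The visible steps of this example (morphological analysis, dictionary look-up and output) can be sketched in a few lines of Python. This is only a toy illustration, not the system described in the paper: the glossary, the list of present continuous endings, the auxiliary table and the function names are all assumptions made to cover this one sentence, and a three-word subject-object-verb input is assumed.

```python
# Toy direct-translation sketch. All tables below are hypothetical and cover
# only the example sentence; a real glossary would be far larger.
GLOSSARY = {"তারা": "they", "ফুটবল": "football", "খেল": "play"}
PRESENT_CONTINUOUS_ENDINGS = ("ছি", "ছে", "ছ")  # longer endings are checked first
AUXILIARY = {"they": "are"}


def analyze_verb(word):
    """Morphological analysis: split a verb into (stem, tense)."""
    for ending in PRESENT_CONTINUOUS_ENDINGS:
        if word.endswith(ending):
            return word[: -len(ending)], "PRESENT CONTINUOUS"
    return word, "UNKNOWN"


def translate_direct(sentence):
    """Translate a three-word Bengali S-O-V sentence word by word."""
    # Assumption: exactly subject, object, verb, ending with the danda sign.
    subject, obj, verb = sentence.rstrip("।").split()
    stem, tense = analyze_verb(verb)
    if tense != "PRESENT CONTINUOUS":
        raise ValueError("this sketch only handles the present continuous")
    # Dictionary look-up, then emit the words in English S-V-O order.
    subject_en = GLOSSARY[subject]
    verb_en = f"{AUXILIARY[subject_en]} {GLOSSARY[stem]}ing"
    return f"{subject_en.capitalize()} {verb_en} {GLOSSARY[obj]}."


print(translate_direct("তারা ফুটবল খেলছে।"))
```

The heavy reliance on the glossary is visible here: any word or ending missing from the tables makes the translation fail outright.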

Transfer Approach

This approach performs the translation task by considering the structural differences between the source and target languages. It requires knowledge of the syntactic structures of both languages.

The transfer model involves three stages: i) Analysis, ii) Transfer and iii) Generation. In the first stage, the source sentence is parsed and the sentence structure and the constituents are identified. In the next stage, transformations are applied to the source language parse tree to convert the structure to that of the target language. Finally, the translation is done on the basis of morphology of target language. In other words, this method can be summarized as: first parse, then reorder, finally translate. Figure 1 shows an illustration of this approach.

Source sentence: আমরা ফুটবল খেলি।


Sentence—আমরা [SUB] + ফুটবল [OBJ] + খেলি [VERB]





Target sentence: We play football.

Figure 1. Transfer approach (Step 1: Analysis, Step 2: Transfer, Step 3: Generation).
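The three stages (parse, reorder, translate) can be illustrated with a minimal Python sketch. It assumes a fixed three-word subject-object-verb input and a hypothetical three-entry lexicon; the function names `analyze`, `transfer` and `generate` simply mirror the stage names and do not come from the paper.

```python
# Hypothetical lexicon covering only the example sentence.
LEXICON = {"আমরা": "we", "ফুটবল": "football", "খেলি": "play"}


def analyze(sentence):
    """Stage 1: label the constituents of a three-word S-O-V sentence."""
    words = sentence.rstrip("।").split()
    if len(words) != 3:
        raise ValueError("this sketch expects exactly subject, object, verb")
    return list(zip(("SUB", "OBJ", "VERB"), words))


def transfer(parse):
    """Stage 2: reorder the source structure (S-O-V) into the target one (S-V-O)."""
    by_role = dict(parse)
    return [(role, by_role[role]) for role in ("SUB", "VERB", "OBJ")]


def generate(parse):
    """Stage 3: look up each word and produce the target sentence."""
    words = [LEXICON[word] for _, word in parse]
    return " ".join(words).capitalize() + "."


def translate(sentence):
    return generate(transfer(analyze(sentence)))


print(translate("আমরা ফুটবল খেলি।"))
```

Keeping the reordering in its own stage means that the lexicon can grow without touching the structural rule, which is the main difference from word-by-word direct translation.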

Interlingua Approach

The interlingua approach relies on a language-neutral analysis of the text. In this approach, the translation task comprises two phases. First, the Source Language (SL) text is converted into an intermediary form called Interlingua (IL), and then the IL drives the generation of text in the Target Language (TL). The IL provides an independent underlying representation from which translations can be generated into different TLs.

Source sentence: আমরা কলম দিয়ে লেখি।


Interlingua (IL) Representation:


ACTION write

INSTRUMENT pen number: singular

TENSE present


We write (with) pen.

Target sentence: We write with pen.
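A language-neutral frame of this kind can be modelled as a small record from which separate generators produce each target language. The sketch below is an assumption-laden illustration: the `Frame` fields follow the listing above, except that the `agent` field is added here as an assumption (it is not shown in the listing), and both word tables are hypothetical and cover only this sentence.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    """Language-neutral representation of one simple sentence."""
    agent: str               # assumption: not shown in the IL listing above
    action: str
    instrument: str
    instrument_number: str   # carried in the frame, unused by these generators
    tense: str


# Hypothetical Bengali word table for the concepts used in the frame.
BENGALI = {"we": "আমরা", "pen": "কলম", "write": "লেখি", "with": "দিয়ে"}


def generate_english(frame):
    """Generate English: agent, verb, then the instrument phrase."""
    if frame.tense != "present":
        raise NotImplementedError("this sketch only handles the present tense")
    return f"{frame.agent.capitalize()} {frame.action} with {frame.instrument}."


def generate_bengali(frame):
    """Generate Bengali: agent, instrument plus postposition, verb last."""
    words = [BENGALI[frame.agent], BENGALI[frame.instrument],
             BENGALI["with"], BENGALI[frame.action]]
    return " ".join(words) + "।"


frame = Frame(agent="we", action="write", instrument="pen",
              instrument_number="singular", tense="present")
print(generate_english(frame))
print(generate_bengali(frame))
```

Because both generators read the same frame, adding a further target language would only require another generator, not another analysis step.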

Corpus based Approach

The corpus-based machine translation (CBMT) approach is characterized by the use of a bilingual corpus at run time instead of human encoded linguistic knowledge. Previously translated texts are stored in a parallel training corpus, and new sentences to be translated are treated as the test set. This approach reduces the need for prior translation rules and encourages the reuse of examples to create knowledge.

The method first decomposes the source sentence into fragments, finds a translation for each of them in the parallel corpus and then recomposes them accordingly. A notable advantage of CBMT is that it can be applied to any language pair that has a parallel corpus; the only linguistic knowledge required is how to split text into sentences.

Source sentence: তারা বাগানে কাজ করছে।

Bilingual Corpus:

তারাপড়ছে — They arereading.

আমরাখামারেকাজ করছি — We arein the farmworking.

তুমিবাগানেখেলছ — You arein the gardenplaying.

Target sentence: They are working in the garden.
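The decompose, match and recompose cycle in this example can be sketched as follows. This is a minimal illustration rather than the paper's system: the fragment alignment of the toy corpus, the use of `difflib.SequenceMatcher` as the best-match measure, the pre-split input and the fixed recomposition order (subject, verb, remaining fragments) are all assumptions.

```python
from difflib import SequenceMatcher

# Toy fragment-aligned corpus built from the three example pairs above.
CORPUS = [
    (["তারা", "পড়ছে"], ["They are", "reading"]),
    (["আমরা", "খামারে", "কাজ করছি"], ["We are", "in the farm", "working"]),
    (["তুমি", "বাগানে", "খেলছ"], ["You are", "in the garden", "playing"]),
]

# Flatten the corpus into a fragment-to-fragment lookup table.
FRAGMENTS = {
    source: target
    for sources, targets in CORPUS
    for source, target in zip(sources, targets)
}


def best_match(fragment):
    """Return the known source fragment most similar to the given one."""
    return max(
        FRAGMENTS,
        key=lambda known: SequenceMatcher(None, fragment, known).ratio(),
    )


def translate(fragments):
    """Translate pre-split fragments given in Bengali order (subject ... verb)."""
    english = [FRAGMENTS[best_match(fragment)] for fragment in fragments]
    subject, *middle, verb = english
    # Recompose in English order: subject, verb, then the remaining fragments.
    return " ".join([subject, verb, *middle]) + "."


print(translate(["তারা", "বাগানে", "কাজ করছে"]))
```

The fragment কাজ করছে does not occur in the corpus, so the sketch falls back on its closest neighbour, কাজ করছি. The fixed reordering rule is only a stand-in here, since the abstract assigns the rearrangement of constituents to the n-gram model.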

Statistical Machine Translation

Statistical machine translation (SMT) is a data-oriented empirical translation framework based on probability distributions. It finds the most likely translation among all possible target sentences by maximising the probability in Eq. (1).

$$\hat{t}_1^I = \arg\max_{t_1^I} \{ \Pr(t_1^I \mid s_1^J) \}$$

$$\hat{t}_1^I = \arg\max_{t_1^I} \{ \Pr(t_1^I) \cdot \Pr(s_1^J \mid t_1^I) \} \qquad (1)$$

Here a source sentence $s_1^J = s_1 \ldots s_J$ is to be translated into a target sentence $t_1^I = t_1 \ldots t_I$, where $J$ and $I$ indicate the number of words in the source and target sentence, respectively. The second line follows from Bayes' rule, since $\Pr(s_1^J)$ does not depend on the target sentence. The argmax operation denotes the search that generates the output sentence, while $\Pr(t_1^I)$ is the language model of the target language and $\Pr(s_1^J \mid t_1^I)$ is the translation model. The translation model, trained on a bilingual corpus, assigns higher probability to target sentences that correspond to the source, while the language model, trained on a monolingual corpus, assigns higher probability to fluent, grammatically correct sentences. SMT also requires search techniques and alignments to produce the output.
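The interplay of the two models in Eq. (1) can be made concrete with a small Python sketch. Everything in it is illustrative: the monolingual corpus is a toy, the language model is a bigram model with add-one smoothing chosen for simplicity, the translation model probabilities are hypothetical fixed values, and the search is a plain maximum over two hand-written candidates rather than a real decoder.

```python
import math
from collections import Counter

# Toy monolingual English corpus for the language model.
MONOLINGUAL = [
    "they are reading",
    "we are working in the farm",
    "you are playing in the garden",
]


def train_bigram_counts(sentences):
    """Collect context counts, bigram counts and the vocabulary size."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(tokens[1:])
    return contexts, bigrams, len(vocab)


def log_language_model(sentence, model):
    """log Pr(t) under a bigram model with add-one smoothing."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(prev, word)] + 1) / (contexts[prev] + vocab_size))
        for prev, word in zip(tokens, tokens[1:])
    )


def decode(candidates, translation_model, language_model):
    """Pick the candidate t maximising log Pr(t) + log Pr(s | t)."""
    def score(target):
        return (math.log(translation_model[target])
                + log_language_model(target, language_model))
    return max(candidates, key=score)


# Hypothetical Pr(s | t) values: both candidates contain the same words, so a
# word-based translation model is assumed to score them equally.
TRANSLATION_MODEL = {
    "they are working in the garden": 0.1,
    "they are in the garden working": 0.1,
}

best = decode(list(TRANSLATION_MODEL), TRANSLATION_MODEL,
              train_bigram_counts(MONOLINGUAL))
print(best)
```

With equal translation model scores, the decision rests entirely on the language model, which prefers the word order whose bigrams it has seen before.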
