Commercial CAT Tool Performance in Translating Informative Texts from English into Bahasa Indonesia



Choirul Fuadi
Applied Linguistics, Yogyakarta State University
Choirulfuadi78@gmail.com


Abstract: Many studies have evaluated machine translation. Such evaluations aim to detect errors and improve machine translation performance. Machine translation tools, also known as Computer Assisted Translation (CAT) tools, differ from each other in performance. There are many CAT tools, but they generally fall into two types: free (such as Google Translate) and commercial (such as Memsource). Memsource, a commercial CAT tool, was built to provide cloud-based translation, so that multiple translators can work on the same file at the same time and see the progress. This article presents the performance of a commercial CAT tool in translating informative texts (journal articles) from English into Bahasa Indonesia by detecting the errors. Memsource was the commercial CAT tool used in this study. The data were taken randomly through stratified sampling and comprised 2019 words extracted from 14 documents. In the data analysis, the SAE J2450 metric by SAE was used to detect the errors. The findings show that, in translating texts from English into Bahasa Indonesia, Memsource produced 81 errors in 2019 words, with a weighted error score of 8.82%. The errors of Memsource in translating texts from English into Bahasa Indonesia are caused by two factors: the machine system, such as Memsource terminology and lack of information transfer, and language, such as affixes, terms, and grammar.

Keywords: Commercial CAT Tool, Informative Text, SAE J2450

Introduction

The history of machine translation began with Warren Weaver in 1947 (Arnold et al., 1994, p.12). Machine translation was first needed to translate languages during the Second World War, and translation tools were then developed. The emergence of technology has also shaped machine translation today. In the era of digital technology, translation software developers have also grown rapidly in providing machine translation (Weda, 2014, p.153).

Machine translation, also known as a Computer Assisted Translation (CAT) tool, aims to help translators and is one solution to the time-consuming and costly human translation process. Using a translation tool is intended not only to reduce the cost of the translation process but also to increase the quality of the translated material (Azer, 2015, p.226). In short, machine translation also answers the need for translation.

House (2013, pp.10-11) states three benefits of using machine translation. First, it helps translators solve difficult translation problems through workstations, such as grammatical words. Second, it assists translators in their attempt to retrieve highly routine and idiomatic target language structures. Third, it provides encyclopedic knowledge, for example on problems of terminology.

In fact, machine translation tools differ from each other in performance, and CAT tools provide services in many languages. Machine translation tools can be divided into two types: free (such as Google Translate) and commercial (such as Memsource).

Memsource, a commercial CAT tool, was founded by David Canek in Prague, the Czech Republic, in 2010. Memsource is one of the commercial CAT tools (Albanesi et al., 2015, p.85) and a commercial product (Sandrini, 2015, p.67). It was built to provide cloud-based translation, so that multiple translators can work on the same file at the same time and see the progress. In the market, the Memsource approach requires a business relationship in which the translation client trusts (based on the tool reports) that the language service provider is charging a fair amount for the post-editing effort (Teixeira, 2014, p.17). Rule-based machine translation (RBMT) systems were the first commercial machine translation systems (Jussa et al., 2012, p.248).

People use CAT tools to help translate many text types. A journal article is one of these text types (an informative text). As a source of information, a journal article is a text that students need to translate into many languages, and they may choose a CAT tool to do so.

With commercial CAT tools, problems may arise in the output, such as low-quality translation from the source text to the target text. Users therefore need to know the quality of the CAT tool they will use, as well as the weaknesses and strengths of each CAT tool.

This article aims to present the performance of a commercial CAT tool in translating informative texts (journal articles) from English into Bahasa Indonesia by detecting the errors. Memsource, as a commercial CAT tool, is one of the alternative translation tools. Moreover, the evaluation of CAT tool systems is an important field of research (Popovic, 2011, p.658; Azer, 2015, p.226).

Translation evaluation has traditionally been based on error detection (Conde, 2011, p.70). House (2015, p.2) stated that translation quality assessment means both retrospectively assessing the worth of a translation and prospectively ensuring quality in the production of a translation. Here, translation evaluation is used to assess the output of a commercial CAT tool in translating texts from English into Bahasa Indonesia by detecting the errors.

According to the Oxford Dictionary (2009, p.151), evaluation means “decide on the value or quality”. Machine translation evaluation can be divided into manual and automatic evaluation. Automatic machine translation evaluation refers to scoring the output of a machine translation system with respect to a small corpus of reference translations (Finch, Hwang, & Sumita, 2005, p.17). Examples of automatic evaluation methods are BLEU, NEVA, WAFT, Word Accuracy, and Meteor. Manual evaluation refers to the collection of human judgments on a translation output (Federmann, 2012, p.131). An example of a manual evaluation method is the SAE J2450 standard. The error categories, classifications, and weights (SAE, 2001, p.5) are presented in Table 1.
Table 1. Error Categories, Classifications, and Weights (SAE, 2001, p.5).

No | Category Name (Abbreviation) | Sub-Classification (Abbreviation) | Weight (Serious/Minor)
A | Wrong Term (WT) | Serious (s) / Minor (m) | 5/2
B | Syntactic Error (SE) | Serious (s) / Minor (m) | 4/2
C | Omission (OM) | Serious (s) / Minor (m) | 4/2
D | Wrong Structure and Agreement Error (SA) | Serious (s) / Minor (m) | 4/2
E | Misspelling (SP) | Serious (s) / Minor (m) | 3/1
F | Punctuation Error (PE) | Serious (s) / Minor (m) | 2/1
G | Miscellaneous Error (ME) | Serious (s) / Minor (m) | 3/1

Method

Research Type

This study examined CAT tool performance in translating informative texts (journal articles) from English into Bahasa Indonesia by detecting the errors. A qualitative descriptive approach was used. The process does not need manipulation; it looks at the real problem (Syahrina, 2011, p.6). The CAT tool was Memsource.

In the study, the data were analyzed using error analysis. Corder, as cited in Ellis (1999, p.51), argues that error analysis should be restricted to the study of errors. Stymne and Ahrenberg (2012, p.1785) define error analysis as a means of assessing machine translation output in qualitative terms.

As defined in SAE J2450 (2001, p.3), the severity of an error is expressed through the weight of its error type: serious or minor. An error receives a serious weight when it is clearly serious or, if not, when it affects the meaning of the translation. Otherwise, it should be classified as minor. In error analysis, Xie Fang and Jiang Xue-mei (2007, p.12), citing Burt (1975), distinguish between “global” and “local” errors. Global errors hinder communication by preventing comprehension of some aspects of the message. Local errors affect only a single element of a sentence and do not prevent the message from being understood.

Data Source

The data consisted of 2019 words of English text from 14 documents (journal articles). The text types were: 1) education, 2) politics, 3) science, 4) administration, 5) economy, and 6) other genres. The data were collected by purposive sampling and analyzed at the utterance unit level (words, phrases, clauses, and sentences). The data were translated by Memsource in June 2017. A reference translation was provided by a professional translator to evaluate the data.

Data Collection Technique

The process of this study consisted of three major parts. The first part was the translation experiment, in which diverse texts (journal articles) were translated using Memsource. Second, the translation result was assessed against the SAE J2450 evaluation standard and the normalized score was computed. Third, the nature of the errors produced by the CAT tool was assessed.

The procedure of the SAE J2450 metric, as written in SAE’s publication (2001, p.4), consists of five actions, as follows: (a) marking the location of the error in the target text with a circle, (b) indicating the primary category of the error (see Table 1), (c) indicating the sub-classification of the error as either “serious” or “minor”, (d) looking up the numeric value of the error, and (e) computing the normalized score.

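In the findings, the score was computed using the formula:

$$\text{Overall Score} = \frac{\sum_{c} \left( s_c \, w_c^{s} + m_c \, w_c^{m} \right)}{N}$$

s_c = number of serious errors in category c
m_c = number of minor errors in category c
w_c^s, w_c^m = serious and minor weights of category c (Table 1)
N = number of words in the source text
Overall Score = Sum Score : Sum Source Word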
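
The scoring rule can be sketched in a few lines of Python. This is a minimal illustration, assuming the weights listed in Table 1; the function name `normalized_score`, the dictionary layout, and the example counts are hypothetical and are not part of SAE J2450 or of the study data.

```python
# Serious/minor weights per category, as listed in Table 1 (SAE J2450).
WEIGHTS = {
    "WT": (5, 2),
    "SE": (4, 2),
    "OM": (4, 2),
    "SA": (4, 2),
    "SP": (3, 1),
    "PE": (2, 1),
    "ME": (3, 1),
}


def normalized_score(counts, source_words):
    """Return (weighted_sum, weighted_sum / source_words).

    counts maps a category abbreviation to (serious, minor) error counts.
    """
    if source_words <= 0:
        raise ValueError("source_words must be positive")
    weighted_sum = 0
    for category, (serious, minor) in counts.items():
        w_serious, w_minor = WEIGHTS[category]
        weighted_sum += serious * w_serious + minor * w_minor
    return weighted_sum, weighted_sum / source_words


# Hypothetical counts for a 500-word text (illustration only, not study data).
example_counts = {"WT": (1, 3), "SP": (0, 2), "PE": (0, 1)}
total, score = normalized_score(example_counts, 500)
print(total, f"{score:.2%}")  # 14 2.80%
```

Because the weighted sum is divided by the source word count, scores from texts of different lengths remain comparable.
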
Data Validity
Triangulation is one of the methods to ensure data validity (Golafshani, 2003, p.603). This study used two types of triangulation: data triangulation and investigator triangulation. First, for data triangulation, the study had more than one translation result (CAT tool output) from English into Bahasa Indonesia, and the data came from diverse texts (journal articles). Second, for investigator triangulation, the study invited a professional translator (English into Bahasa Indonesia and vice versa) to provide a reference translation and to score the error weights.

Data Analysis
After the texts were collected, data analysis was performed. The translation results of the CAT tool were presented in a data display. The data were then analyzed qualitatively by connecting the results with the nature of errors in the SAE J2450 model. The units of analysis were the errors of wrong term, misspelling, syntactic error, wrong structure and agreement error, omission, punctuation error, and miscellaneous error. Finally, conclusions were drawn about the commercial CAT tool's performance in translating informative texts from English into Bahasa Indonesia.

Findings and Discussion

Table 2 presents the errors made by Memsource in translating informative texts from English into Bahasa Indonesia, with the error categories and weights from the analysis.
Table 2. Memsource Analysis Results, English into Bahasa Indonesia

Category (c) | Sc | Mc | Score | Overall Score (errors / share of errors)
Wrong Term (WT) | 4 | 21 | 4×5 + 21×2 = 62 | 25 / 30.86%
Syntactic Error (SE) | 2 | 18 | 2×4 + 18×2 = 46 | 20 / 24.7%
Omission (OM) | | 3 | 3×2 = 6 | 3 / 3.7%
Wrong Structure and Agreement Error (SA) | 2 | 9 | 2×4 + 9×2 = 26 | 11 / 13.59%
Misspelling (SP) | 3 | 7 | 3×3 + 7×1 = 16 | 10 / 12.35%
Punctuation Error (PE) | | | |
Miscellaneous Error (ME) | 5 | 7 | 5×3 + 7×1 = 22 | 12 / 14.8%
Overall score | 16 | 65 | 178 : 2019 = 0.08816 = 8.82% | 81

Based on Table 2, the weighted error score of Memsource is 0.08816, or 8.82%, for the output of translating texts from English into Bahasa Indonesia. There are 81 errors in total across the seven categories. The Translation Error Rate (TER) is 81 errors divided by 2019 words, which is 4.01%. The maximum TER score is 30%, so both the weighted error score (8.82%) and the TER (4.01%) of Memsource are under that limit. Memsource has therefore shown good performance in its translation output.
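
As a cross-check, the weighted score and the TER can be recomputed from the counts in Table 2 with the Table 1 weights. The sketch below is only an illustration: the variable names are assumptions, and the blank cells of Table 2 (serious Omission errors and all Punctuation Error cells) are assumed to be zero. Under these assumptions the Syntactic Error row evaluates to 2×4 + 18×2 = 44 rather than the 46 printed in Table 2, so the recomputed weighted total is 176 and the normalized score is about 8.72%, slightly below the reported 8.82%; the TER of 4.01% is reproduced.

```python
# (serious, minor) weights per category from Table 1.
WEIGHTS = {"WT": (5, 2), "SE": (4, 2), "OM": (4, 2), "SA": (4, 2),
           "SP": (3, 1), "PE": (2, 1), "ME": (3, 1)}

# (serious, minor) counts from Table 2; blank cells are ASSUMED to be 0.
COUNTS = {"WT": (4, 21), "SE": (2, 18), "OM": (0, 3), "SA": (2, 9),
          "SP": (3, 7), "PE": (0, 0), "ME": (5, 7)}
SOURCE_WORDS = 2019

# Weighted score per category: serious * serious weight + minor * minor weight.
row_scores = {c: s * WEIGHTS[c][0] + m * WEIGHTS[c][1]
              for c, (s, m) in COUNTS.items()}
weighted_total = sum(row_scores.values())
error_total = sum(s + m for s, m in COUNTS.values())

normalized_score = weighted_total / SOURCE_WORDS  # weighted errors per word
ter = error_total / SOURCE_WORDS                  # unweighted errors per word

print(row_scores)
print(weighted_total, f"{normalized_score:.2%}")
print(error_total, f"{ter:.2%}")
```
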

Based on Table 2, Wrong Term is the most common error type, with 25 errors or 30.86% of all errors. Syntactic Error is the second most common error type, with 20 errors or 24.7%. Miscellaneous Error is the third most common error type, with 12 errors or 14.8%. Wrong Structure and Agreement Error is the fourth most common error type, with 11 errors or 13.59%.

Misspelling is the fifth most common error type, with 10 errors or 12.35%. The next error type is Omission, with 3 errors or 3.7%. No Punctuation Error (PE) was found in the texts translated from English into Bahasa Indonesia using Memsource. The following diagram presents the errors made by Memsource in translating diverse texts from English into Bahasa Indonesia. It shows the errors in the seven categories of SAE J2450 and their sub-classifications.


Figure 1. Errors made by Memsource (English into Bahasa Indonesia)

Figure 1 shows that the total number of errors made by Memsource in translating texts from English into Bahasa Indonesia is 81. In the serious and minor sub-classification, there are 16 serious or global errors in 2019 words, or 0.79 errors per 100 words, and 65 minor or local errors in 2019 words, or 3.22 errors per 100 words. Overall, there are 81 errors in 2019 words, or 4.01 errors per 100 words.
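
The per-100-word rates quoted above, and the share of each category in the 81 errors, follow directly from the counts. A minimal sketch, assuming the category totals reported in the text (the function and variable names are illustrative and not part of the study):

```python
SOURCE_WORDS = 2019

# Error totals per category as reported in the text (serious + minor).
CATEGORY_TOTALS = {"WT": 25, "SE": 20, "ME": 12, "SA": 11,
                   "SP": 10, "OM": 3, "PE": 0}
SERIOUS, MINOR = 16, 65


def per_100_words(errors, words=SOURCE_WORDS):
    """Errors per 100 source words, rounded to two decimals."""
    return round(errors / words * 100, 2)


def share_of_errors(totals):
    """Percentage share of each category in the total error count."""
    grand_total = sum(totals.values())
    return {c: round(n / grand_total * 100, 2) for c, n in totals.items()}


# Categories ordered from most to least frequent.
ranking = sorted(CATEGORY_TOTALS, key=CATEGORY_TOTALS.get, reverse=True)
shares = share_of_errors(CATEGORY_TOTALS)

print(per_100_words(SERIOUS), per_100_words(MINOR),
      per_100_words(SERIOUS + MINOR))  # 0.79 3.22 4.01
print(ranking)
print(shares)
```

The computed shares agree with the reported percentages to within rounding in the second decimal place.
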

Figure 1 shows the error categories ranked from the highest to the lowest. First, Wrong Term (WT) is the most common error type. There are 25 errors, with 4 serious or global errors and 21 minor or local errors. The Wrong Term errors are found in wrong terms, word inflection, abbreviations, wrong proper names, and multi-word inflection. The difference between English and Bahasa Indonesia may be the first factor that caused the errors: one word in English may have several translations in Bahasa Indonesia depending on the context and semantic meaning. The Memsource system was built on linguistic features and a sub-segment matching system, and its output then requires post-editing by the translator. Second, insufficient terms in the Memsource terminology (the term dictionary) may affect the fuzzy matching search.

Second, Syntactic Error (SE) is the second most common error type. There are 20 errors, with 2 serious or global errors and 18 minor or local errors. The common error is wrong linear order (misordering or word order). The difference in phrase or word order between Bahasa Indonesia and English may be the factor that caused the errors. In fact, Memsource uses sub-segment matching and predictive sub-segment matching, so the quality of the translation depends on the size and quality of the source. The quality of the source texts therefore also affects the output of Memsource.

Miscellaneous Error (ME) is the third most common error type. There are 12 errors, with 5 serious or global errors and 7 minor or local errors. The common errors are found in literal translation of terms, inconsistent translation, additional words, and confusing translation related to the culture of the target language. In fact, the system of a commercial CAT tool is more complex than word-for-word or literal translation. Memsource draws on linguistic information about the source and target languages retrieved from rules and grammars. Through sub-segment matching, Memsource keeps translations consistent, because a segment that has already been translated is suggested for the next segment.

Wrong Structure and Agreement Error (SA) is the fourth most common error type. There are 11 errors, with 2 serious or global errors and 9 minor or local errors. The common SA errors are found in verb or tense inflection, prepositions, active and passive voice, and connector inflection. The difference in affixes between English and Bahasa Indonesia may be the factor that caused the errors. Bahasa Indonesia has many affixes, such as “me” and “ber” in the present, “me” and “ter” in the perfect, “di” and “me” in the past tense, and the prefix “di” in the passive voice; they may give a different meaning depending on the context and semantic meaning. The connector “yang” in Bahasa Indonesia is quite difficult for a machine to translate.

The next error type is Misspelling (SP). There are 10 errors, with 3 serious or global errors and 7 minor or local errors. The errors mostly involve violating terms in the glossary, denoting the concept, and being inappropriate for the target language. This happens because translation involves not only the language but also the culture. The lack of information is one of the weaknesses of rule-based machine translation, which fails in sentence analysis.

The next error type is Omission (OM). There are 3 errors, all of them minor or local. The common errors are untranslated words, such as untranslated prepositions and verbs. Untranslated prepositions may result from a failure of sentence analysis, where the system's linguistic rules assign different meanings depending on the context. For untranslated verbs, the difference in verb tense between Bahasa Indonesia and English may be the factor that caused the errors produced by Memsource.

The last error type is Punctuation Error (PE). No PE was found in the texts translated from English into Bahasa Indonesia using Memsource. This may be because Memsource handles typography adequately in the Memsource terminology.


Conclusion
In summary, in translating texts from English into Bahasa Indonesia, Memsource produced 81 errors in 2019 words, with a weighted error score of 8.82%. The first error type is Wrong Term (WT) with 25 errors. The common errors are found in wrong proper names, abbreviations, word inflections, wrong terms, and multi-word inflections.

The second error type is Syntactic Error (SE) with 20 errors. The common errors are found in wrong linear order (word order). The third error type is Miscellaneous Error (ME) with 12 errors. The common errors are found in literal translations, additional words, and confusing translation related to the culture of the target language.

The fourth error type is Wrong Structure and Agreement Error (SA) with 11 errors. The common errors are found in verb or tense inflection, passive and active voice, prepositions, and connector inflection. The fifth error type is Misspelling (SP) with 10 errors. The common errors are found in violating terms in the glossary and being inappropriate for the target language. The next error type is Omission (OM) with 3 errors. The common errors are untranslated words, such as untranslated prepositions and verbs. No Punctuation Error (PE) was found in the analysis.

Finally, the errors of Memsource in translating texts from English into Bahasa Indonesia are caused by two factors: the machine system, such as Memsource terminology and lack of information transfer, and language, such as affixes, terms, and grammar.
References
Albanesi, D., Bellandi, A., Benotto, G., Segni, G.D., & Giovannetti, E. (2015). When translation requires interpretation: collaborative computer–assisted translation of ancient texts. Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities by LaTeCH 2015, 30 July 2015. China: Association for Computational Linguistics and the Asian Federation of Natural Language Processing.
Arnold, D., Balkan, L., Meijer, S., Humphreys, R. L., & Sadler, L. (1994). Machine translation: An introductory guide. London: NCC Blackwell Ltd.
Azer, H.S., & Aghayi, M.B. (2015). An evaluation of output quality of machine translation (padideh software vs. google translate). Journal of Advances in Language and Literary Studies, 6 (4), 226 – 237.
Conde, T. (2011). Translation evaluation on the surface of texts: A preliminary analysis. The Journal of Specialised Translation, 15, 69 – 86.
Ellis, R. (1999). The study of second language acquisition. Oxford : Oxford University Press.
Federmann, C. (2012). Appraise: an Open-Source Toolkit for Manual Evaluation of MT Output. Proceedings of the Poster and Demo Track of the 35th German Conference on Artificial Intelligence by KI-2012, 24-27 September 2012. Germany: KI-2012.
Finch, A., Hwang, Y. S., & Sumita, E. (2005). Using machine translation evaluation techniques to determine sentence-level semantic equivalence. Proceedings of the Third International Workshop on Paraphrasing (IWP2005) by Michigan University, USA, October 2005. USA: University of Michigan. 
Golafshani, N. (2003). Understanding reliability and validity in qualitative research. The Qualitative Report, 8 (4), 597 - 607.
Hornby, A.S. (2009). Oxford dictionary: Learner’s pocket dictionary (4th ed.). China: Oxford University Press.
House, J. (2013). Translation. China: Oxford University Press.
House, J. (2015). Translation quality assessment: Past and present. New York: Routledge.
Jussa, M.R.C, Farrus, M., Marino, J.B., & Fonollosa, J.A.R. (2012). Study and comparison of rule-based and statistical catalan-spanish machine translation systems. Journal of Computing and Informatics, 31 (2), 245–270.
Popovic, M., & Ney, H. (2011). Towards automatic error analysis of machine translation output. Computational Linguistics, 37 (4), 657–688.
SAE. (2001). Surface Vehicle Recommended Practice. Accessed on August 1, 2016, from APEX: http://www.apex-translations.com/documents/sae_j2450.pdf.
Sandrini, P. (2015). Openness in computing: the case of linux for translators. In Sandrini, P., & Garcia Gonzalez, M. (Eds.), Translation and Openness. Innsbruck, Austria: Innsbruck University Press.
Stymne, S., & Ahrenberg, L. (2012). On the practice of error analysis for machine translation evaluation. Proceedings of the Eighth International Conference on Language Resources and Evaluation by LREC’12, May 2012. Istanbul: European Language Resources Association (ELRA).
Syahrina, A. (2011). Online machine translator system and result comparison - statistical machine translation vs hybrid machine translation. Thesis, Unpublished. University of Boras.
Teixeira, C.S. C. (2014). The impact of metadata on translator performance: how translators work with translation memories and machine translation. Doctoral Thesis, Unpublished. Universitat Rovira I Virgili.
Weda, S. (2014). Transtool versus conventional translation in digital technology era. International Journal on Studies in English Language and Literature (IJSELL), 2 (8), 149-153.
Xie Fang & Jiang Xue-mei. (2007). Error analysis and the EFL classroom teaching. US-China Education Review, 4 (9), 10 - 14.

Author’s Brief CV
Choirul Fuadi was born in Pangkalan Bun on 31 August 1992. He graduated from the English Education Program at State Islamic College Palangka Raya in 2014. He continued his study in the Applied Linguistics program of Yogyakarta State University and graduated in 2017. He works as a freelancer and translator at tukangterjemah.com.

