Wednesday, December 26, 2018

"Canned loop": pre-editing and the odyssey of machine translation


I don't think I will ever forget one of the first times I used Google Translate and typed the name "James Bond" into the translator. The result was "Canned loop" (I must admit that it is a fair translation). That was the kind of translation we got in those days. We can all agree that it has come a long way since the early 2000s and since it was officially launched as Google's translator in April 2006. Although we are miles away from those early attempts at machine translation, the GMT system, more recently renamed GNMT (Google Neural Machine Translation), is not without limitations.
It seems that the system's artificial intelligence is not enough when it comes to consistency. For example, we often see the same term translated differently in different segments. And if you add an unreliable source file, such as a converted file that has not been pre-edited... well, we are in serious trouble.
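To make the consistency problem concrete, here is a minimal Python sketch of how one might flag a term that was rendered in more than one way across segments. Everything in it is an assumption for illustration: the function name find_inconsistent_terms, the sample segments, and the small glossary of candidate renderings are all hypothetical, and the matching is a simple case-insensitive substring search, not what any translation tool actually does.

```python
from collections import defaultdict


def find_inconsistent_terms(segments, candidate_renderings):
    """Return source terms that appear with more than one target rendering.

    segments: list of (source_text, target_text) pairs.
    candidate_renderings: dict mapping a source term to the target
        renderings to look for (hypothetical, supplied by the reviewer).
    """
    found = defaultdict(set)
    for source, target in segments:
        source_lower = source.lower()
        target_lower = target.lower()
        for term, renderings in candidate_renderings.items():
            # Only inspect segments whose source contains the term.
            if term.lower() not in source_lower:
                continue
            for rendering in renderings:
                if rendering.lower() in target_lower:
                    found[term].add(rendering)
    # Keep only the terms that were rendered in two or more ways.
    return {term: sorted(r) for term, r in found.items() if len(r) > 1}


# Hypothetical English-to-Spanish segments in which the same term drifts.
segments = [
    ("Open the source file.", "Abra el archivo de origen."),
    ("Save the source file.", "Guarde el fichero fuente."),
]
candidates = {"source file": ["archivo de origen", "fichero fuente"]}

print(find_inconsistent_terms(segments, candidates))
```

A check like this only finds renderings we already thought to list, so it supports the post-editor's review rather than replacing it.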
Machine translation projects, which usually involve post-editing, use it mainly to increase productivity (i.e., the number of words one can process per day) and, consequently, to get the work done faster. Today, machine translation is reliable enough to speed up the translation process, but this does not mean that the system will do the job on its own. As we have already mentioned, we are a long way from "Canned loop", but there are still problems that we must solve. In some cases post-editing even requires more work, since we have to pay attention to the inconsistencies of machine translation, and we often end up retranslating more than post-editing. But fear not, it is still a more agile process that increases the amount of work we can do in a day.
Not only is it necessary to pay attention to the inconsistencies generated by machine translation, but it is also of the utmost importance to know the contents of the source file before applying machine translation. This is where pre-editing comes into play. In a previous post, we mentioned that converted files and copied content can cause unnecessary problems, such as tags, when the file is uploaded to the translation tool if the pre-editing step is skipped. The same applies to projects done with machine translation. Tags are not the only problem. We also run into incomplete sentences and letters that have been converted into numbers or symbols, which the system does not recognize. And if it does recognize them, it interprets them as something that makes no sense in our context.
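As an illustration of the kind of checks a pre-editing pass might include, here is a rough Python sketch that flags the three problems mentioned above in a source segment: leftover tags, digits sitting inside a word, and a segment that does not end like a sentence. The function name pre_edit_issues, the issue labels, and the patterns are hypothetical heuristics chosen for this example. They only cover unaccented Latin letters and will produce false positives (a name like "H2O" is flagged, for instance), so they are meant to point a human pre-editor at suspicious segments, not to fix anything automatically.

```python
import re

# Heuristic patterns (assumptions for this sketch, not a tool's real rules).
TAG_RE = re.compile(r"<[^<>]+>|\{\d+\}")               # e.g. <b>, </span>, {1}
DIGIT_IN_WORD_RE = re.compile(r"[A-Za-z]\d+[A-Za-z]")  # e.g. "c0ntract"
SENTENCE_END = (".", "!", "?", ":", ";")


def pre_edit_issues(segment):
    """Return a list of labels for the problems found in one source segment."""
    text = segment.strip()
    issues = []
    if TAG_RE.search(text):
        issues.append("tag")
    if DIGIT_IN_WORD_RE.search(text):
        issues.append("digit_in_word")
    # A segment with no closing punctuation may be a sentence cut in half.
    if text and not text.endswith(SENTENCE_END):
        issues.append("no_end_punctuation")
    return issues


for segment in ["The c0ntract is <b>valid</b>", "The contract is valid."]:
    print(repr(segment), pre_edit_issues(segment))
```

Headings and list items legitimately end without punctuation, so the last check in particular should be read as a prompt to look at the segment, not as proof of a broken sentence.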
The point, therefore, is that whenever we have a post-editing project with files that need pre-editing, we must not skip this step. Our customers rely on our responsibility and professionalism when they agree to pay for the pre-editing service. They do not expect a Tarzan-like translation, because they can get that by themselves. They expect us to take all the necessary measures to ensure consistency and a good, fluent translation that does not read as machine-generated and that keeps the original meaning. And to achieve that, we need to start by improving the source file.