The World Library Book Beat Blog
Volume 2, Number 2
Wednesday, April 11, 2012
by John Guagliardo
Founder, World Public Library
World Public Library Machine Translation Editions
58 Versions of 2,000,000 eBooks Producing 116,000,000 eBooks
All of our 2,000,000 eBooks will soon be available in multiple language editions. Each eBook will be available to read in any of 58 different languages. The original eBook is automatically converted by our Literary Machine Translation System (LMTS), which produces a translated edition. Each translated edition presents the original page side by side with the machine-translated page, so that readers can cross-reference the two to clarify contextual meaning.
What are World Public Library Machine Translation Editions?
The World Public Library Literary Machine Translation System (LMTS) is an automated service that translates between 58 different languages. It produces translated editions of all of our 2,000,000 eBooks. Each eBook will be available for download in any of our supported languages. With World Public Library Machine Translation Editions, we hope to make information universally accessible and useful, no matter which language it is written in.
How does it work?
When World Public Library LMTS generates an eBook, it looks for patterns in hundreds of millions of documents to help decide on the best translation possible. By detecting patterns in documents that have already been translated by human translators, World Public Library LMTS can make intelligent guesses as to what the appropriate translation would be. This process of seeking patterns in large amounts of text is called "statistical machine translation". Since the translations are generated by machines, not all translations will be perfect. The more human-translated documents that World Public Library LMTS can analyze in a specific language, the better the translation quality will be. This is why translation accuracy will sometimes vary across languages.
Statistical Machine Translation (SMT) as a research area started in the late 1980s with the Candide project at IBM. IBM's original approach maps individual words to words and allows for deletion and insertion of words.
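To make the word-to-word idea concrete, here is a minimal sketch of IBM Model 1 style training, the simplest model in the family that grew out of that IBM work. It is an illustration only, not the LMTS implementation. The three-sentence English-German corpus, the function names, and the "NULL" placeholder (which lets a target word appear with no visible source word) are all assumptions made for this example.

```python
from collections import defaultdict

# Toy parallel corpus (hypothetical): (source words, target words).
CORPUS = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]

def train_ibm_model1(pairs, iterations=10):
    """Estimate t[(target_word, source_word)] with expectation-maximization."""
    # Prepend NULL so a target word may align to "nothing" (an inserted word).
    pairs = [(["NULL"] + src, tgt) for src, tgt in pairs]
    target_vocab = {w for _, tgt in pairs for w in tgt}
    uniform = 1.0 / len(target_vocab)
    t = defaultdict(lambda: uniform)  # start with no preference at all
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in pairs:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: renormalize the fractional counts into probabilities.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

def best_translation(t, source_word):
    """Return the target word with the highest probability for source_word."""
    candidates = [(p, f) for (f, e), p in t.items() if e == source_word]
    return max(candidates)[1]

model = train_ibm_model1(CORPUS)
for word in ["the", "house", "a"]:
    print(word, "->", best_translation(model, word))
```

Even on three sentences, the repeated pairing of "the" with "das" is enough for the model to prefer that translation, which is the same effect, at a tiny scale, as learning from large amounts of human-translated text.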
More recently, various researchers have shown better translation quality with the use of phrase translation. Phrase-based machine translation, which relies on a “phrase alignment” model, statistically computes translations of whole phrases rather than of individual words.
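The difference between word-by-word and phrase-based translation can be sketched with a small phrase table. The table below, its probabilities, and the function name are invented for illustration. A real system learns its table from aligned text and searches over many segmentations and reorderings, whereas this sketch simply takes the longest matching phrase from left to right.

```python
# Hypothetical English-to-French phrase table: phrase -> [(translation, probability)].
PHRASE_TABLE = {
    ("good", "morning"): [("bonjour", 0.9), ("bon matin", 0.1)],
    ("good",): [("bon", 0.8), ("bien", 0.2)],
    ("morning",): [("matin", 1.0)],
    ("my", "friend"): [("mon ami", 1.0)],
}

def translate_greedy(words, table, max_len=3):
    """Translate left to right, always taking the longest phrase found in the table."""
    output = []
    i = 0
    while i < len(words):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = tuple(words[i:i + n])
            if phrase in table:
                # Pick the most probable translation of this phrase.
                best = max(table[phrase], key=lambda cand: cand[1])[0]
                output.append(best)
                i += n
                break
        else:
            # Unknown word: pass it through unchanged.
            output.append(words[i])
            i += 1
    return " ".join(output)

sentence = "good morning my friend".split()
print(translate_greedy(sentence, PHRASE_TABLE))             # whole phrases
print(translate_greedy(sentence, PHRASE_TABLE, max_len=1))  # single words only
```

With whole phrases allowed, "good morning" is translated as a unit. Restricting the lookup to single words yields the literal "bon matin" and leaves "my" and "friend" untranslated, because this toy table only knows them as a pair.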
World Public Library’s LMTS uses a combination of syntax-based translation with a joint-probability model for phrase translation and transfer rules based on a rich translation lexicon. Phrases are built from syntax-based models that use either real syntax trees generated by syntactic parsers or tree transfer methods motivated by syntactic reordering patterns. Although the outcome is not always as elegant as the words written by an author, it is clear enough to convey a basic understanding of that author’s intended meaning.
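As a rough illustration of what a transfer rule motivated by a syntactic reordering pattern might look like, the sketch below swaps adjective-noun pairs before looking each word up in a lexicon. Everything here is assumed for the example: the hand-tagged input stands in for a real syntactic parser, the three-word English-to-Spanish lexicon is hypothetical, and the single rule is not taken from LMTS.

```python
# Hypothetical translation lexicon (English -> Spanish).
LEXICON = {"the": "la", "red": "roja", "house": "casa"}

def reorder_adj_noun(tagged):
    """Apply one transfer rule: ADJ NOUN becomes NOUN ADJ."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out

def transfer(tagged, lexicon):
    """Reorder the tagged words, then translate each word via the lexicon."""
    reordered = reorder_adj_noun(tagged)
    # Words missing from the lexicon are passed through unchanged.
    return " ".join(lexicon.get(word, word) for word, _tag in reordered)

# Part-of-speech tags are supplied by hand here; a real system would get them from a parser.
sentence = [("the", "DET"), ("red", "ADJ"), ("house", "NOUN")]
print(transfer(sentence, LEXICON))
```

The rule captures the fact that Spanish usually places the adjective after the noun, something a purely word-by-word lookup would get wrong.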
What languages does World Public Library Machine Translation Editions support?
World Public Library Machine Translation Editions currently publishes in 58 languages:
Current beta languages are:
● Haitian Creole
World Public Library LMTS also tests other languages, called "beta languages," which may have less reliable translation quality than our supported languages. We are always working to support other languages and will introduce them as soon as the translation quality meets our standards.
For more information regarding the LMTS technology, click here.
We hope you will enjoy reading the new World Public Library Machine Translation Editions.