Facebook spans more than 100 languages, and the platform is now bringing new features to many of them faster, thanks to multilingual embeddings powered by newly developed artificial intelligence. For Facebook users, the update will mean faster, more accurate translations, along with more accurate flagging of content that breaks the platform’s rules.
Natural language processing is a type of A.I. designed to process language, but bringing features powered by the tool to a new language takes almost as much work as building a completely new program. To solve this, Facebook engineers created multilingual embeddings. The technique groups words that have similar meanings, such as soccer and fútbol, placing them close to one another and to other related words inside a shared embedding space.
By grouping words with similar meanings together, a single program can work across multiple languages. Facebook developers say that this works even if a language wasn’t used while training the A.I. For example, if soccer and fútbol are classified together, the program will recognize fútbol even though it was trained only on the English word soccer.
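To make the idea concrete, here is a minimal sketch in Python of how a classifier trained only on English words could still handle a Spanish word once both languages share one embedding space. It is not Facebook's implementation: the three-number vectors are invented for illustration, and the names `EMBEDDINGS`, `train`, and `classify` are hypothetical. Real systems learn vectors with hundreds of dimensions from large amounts of text.

```python
import math

# Hypothetical toy vectors in a shared embedding space (values are made up).
# Words with similar meanings get similar vectors, whatever their language.
EMBEDDINGS = {
    "soccer": (0.90, 0.10, 0.05),
    "goal":   (0.85, 0.20, 0.00),
    "recipe": (0.10, 0.90, 0.10),
    "oven":   (0.05, 0.85, 0.20),
    "fútbol": (0.88, 0.12, 0.07),  # Spanish, close to "soccer"
    "receta": (0.12, 0.88, 0.08),  # Spanish, close to "recipe"
}

def cosine(a, b):
    """Cosine similarity: near 1.0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Average a list of equal-length vectors."""
    count = len(vectors)
    return tuple(sum(v[i] for v in vectors) / count for i in range(len(vectors[0])))

def train(labeled_words):
    """Build one average vector per label (a nearest-centroid classifier)."""
    by_label = {}
    for word, label in labeled_words:
        by_label.setdefault(label, []).append(EMBEDDINGS[word])
    return {label: centroid(vectors) for label, vectors in by_label.items()}

def classify(word, centroids):
    """Return the label whose centroid is most similar to the word's vector."""
    vector = EMBEDDINGS[word]
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))

# Training data contains English words only.
ENGLISH_TRAINING = [
    ("soccer", "sports"),
    ("goal", "sports"),
    ("recipe", "cooking"),
    ("oven", "cooking"),
]
model = train(ENGLISH_TRAINING)

# The Spanish words were never seen in training, but their vectors sit
# next to their English counterparts, so the classifier can still label them.
print(classify("fútbol", model))
print(classify("receta", model))
```

The classifier never sees a Spanish example. It works on fútbol only because that word's vector lies near soccer in the shared space, which is the property the multilingual embeddings are meant to provide.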
Before the multilingual embeddings, Facebook took one of two approaches to bringing features to different languages. One, they could create a separate set of training data for each language, but re-training for every language would mean delays in bringing the feature to a global audience. The second option previously used by
With the changes, Facebook users worldwide will see features in their language faster, including M Suggestions inside Messenger and Recommendations. Overall, using
But the update has implications even for English-speaking users who aren’t bilingual. The programs that detect posts that violate Facebook’s policies will become more accurate across more languages, Facebook says, supporting the platform’s ongoing work against hate speech, terrorism, and fake news.
Facebook’s engineers said that while the program shows improvements in English, German, French, and Spanish, there is more work to do to add data from additional languages. An expansion could help the software better understand idioms that may not have literal translations, like “raining cats and dogs.”