Welcome to the May 2020 installment of BASELINE, Novetta’s Machine Learning Newsletter, where we share thoughts on important advances in machine learning technologies likely to impact our customers. This month’s topics include:
- New open source translation models
- Synthetic audio generation
- Resources for learning about advances in machine learning
Open Source Machine Translation
To understand world events, analysts can’t rely solely on English-language news and social media. While the large cloud service providers (CSPs) such as Amazon Web Services, Google Cloud Platform, and Microsoft Azure offer high-quality, easy-to-use translation APIs, the cost of API calls can add up quickly. That is why we are excited about the 140 languages now supported in Hugging Face’s transformers library. Although translation quality is somewhat lower than that of the CSP offerings, open source machine translation will let us translate more data for customers and will support our ongoing NLP research. We will also be incorporating these translation models into AdaptNLP.
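As a rough illustration of what local translation with the transformers library can look like, here is a minimal sketch. It assumes the models in question are the Helsinki-NLP OPUS-MT (MarianMT) checkpoints on the Hugging Face Hub, which the newsletter does not name explicitly, and that `transformers`, `torch`, and `sentencepiece` are installed. The helper names `marian_model_name` and `translate` are our own, not part of the library.

```python
from typing import List


def marian_model_name(src: str, tgt: str) -> str:
    """Build a Hub model ID using the Helsinki-NLP OPUS-MT naming pattern.

    Assumption: a checkpoint exists for this language pair; not every
    source/target combination is published.
    """
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"


def translate(texts: List[str], src: str, tgt: str) -> List[str]:
    """Translate a batch of sentences with a locally run MarianMT model."""
    # Imported lazily so the naming helper works without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    name = marian_model_name(src, tgt)
    tokenizer = MarianTokenizer.from_pretrained(name)  # downloads on first use
    model = MarianMTModel.from_pretrained(name)

    # Tokenize the whole batch, padding to the longest sentence.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    generated = model.generate(**batch)
    return tokenizer.batch_decode(generated, skip_special_tokens=True)
```

Calling `translate(["Guten Morgen"], "de", "en")` would download the German-to-English checkpoint on first use and then run entirely on local hardware, which is where the cost savings over per-call API pricing would come from.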
Synthetic Speech and Style Transfer
Synthetic speech generation has taken another step forward with the release of NVIDIA’s Flowtron, an improved speech synthesis model that also enables style transfer for voices. NVIDIA’s new approach is designed to generate human voices with greater realism than prior open source methods. Flowtron can also apply style transfer, which lets users record one voice and convert it to sound like another. The downside is that Flowtron is yet another tool that makes it harder for people to trust what they see and hear online.
MixText
Data augmentation has played a big part in improving the accuracy of image models, so it’s natural that researchers have tried to apply analogous augmentations in the text domain. One of the most successful image augmentation techniques is mixup, which blends two images, and their labels, in a chosen proportion to create a new training example. MixText similarly interpolates the hidden representations of two pieces of text and uses the results to augment training data in a semi-supervised setup that draws on both labeled and unlabeled examples. What is impressive about the researchers’ work is that MixText proved particularly effective when only a few labeled examples were available. Improving accuracy with so few labels will lower the time and cost needed to train performant models.
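The mixing step itself is simple enough to sketch. The example below shows generic mixup on plain Python lists: two feature vectors and their one-hot labels are linearly interpolated with a weight `lam`, which is commonly drawn from a Beta distribution. It is not the MixText authors’ code. In MixText the vectors would be hidden states inside a language model, whereas here they are stand-in lists, and the function names and the `alpha` value are assumptions for illustration.

```python
import random
from typing import List, Tuple


def sample_lambda(alpha: float, rng: random.Random) -> float:
    """Draw a mixing weight from Beta(alpha, alpha), as is common for mixup."""
    return rng.betavariate(alpha, alpha)


def mixup(
    x1: List[float], y1: List[float],
    x2: List[float], y2: List[float],
    lam: float,
) -> Tuple[List[float], List[float]]:
    """Linearly interpolate two examples and their (one-hot) labels."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must be in [0, 1]")
    if len(x1) != len(x2) or len(y1) != len(y2):
        raise ValueError("examples must have matching shapes")
    # Weight the first example by lam and the second by (1 - lam).
    mixed_x = [lam * a + (1.0 - lam) * b for a, b in zip(x1, x2)]
    mixed_y = [lam * a + (1.0 - lam) * b for a, b in zip(y1, y2)]
    return mixed_x, mixed_y
```

With `lam = 0.75`, the mixed example is three quarters the first input and one quarter the second, and the soft label reflects the same proportions, so the model is trained to predict a blend of the two classes.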
Made with ML
One of the best ways to learn about advances in machine learning is to see how other researchers are approaching problems. The biggest challenge is often finding which examples are worth reviewing. Made with ML is a new site where researchers can share their work. Community members can vote on their favorite projects, making it easier for everyone to find high-quality examples. Made with ML is becoming one of our go-to resources for discovering new approaches in machine learning.
XTREME
Recent advances in transfer learning have aided the development of multilingual language models such as mBERT, M4, and XLM-R, but most evaluation methods focus on English-language capabilities. To help solve this problem, Google introduced XTREME, a benchmark of nine cross-lingual NLP tasks in areas such as sentence classification, structured prediction, and question answering. XTREME covers 40 typologically diverse languages spanning 12 language families, chosen to maximize language diversity for each task. Benchmarks like XTREME will help counter bias in models that treat some languages as second-class citizens. While the focus is typically on building better models, improving how we measure them is another route to better NLP task performance.