by Ben West, Brian Sacash
AWS PSS attendees interact with Novetta’s Machine Learning Prototype demo on the show floor.
Introduction
In our previous blog post, Sharpening the Edge with IoT, ML, and Homomorphic Encryption, we discussed our prototype application with the following functionality:
- Identify when a simulated license plate (alphanumeric characters on a white card) appeared in a live video stream
- Extract text from the license plate card using Amazon Rekognition
- Securely search an encrypted watchlist for this text
- Return any matches to the user
Novetta and Enveil demonstrated this prototype at AWS Public Sector Summit, June 11-12.
This blog post is an inside look at how we created a machine learning prototype, running on a Raspberry Pi and AWS Snowball Edge, that securely searched an encrypted watchlist without ever decrypting the data.
Training the Machine Learning Model
Collecting Training Data
We needed to train a machine learning model to detect when a license plate card appeared within the video stream. The first step in training any machine learning model is data collection and processing: gathering examples of what you want to identify so that the model can learn the patterns associated with them. In our case, that meant images in which a license plate card was visible.
The first challenge we faced was collecting sufficient data to train the model. Few examples of images that we wanted to detect (people holding up white cards in a conference setting) were available online. Additionally, creating a dataset from scratch – taking pictures of people holding cards in conference-like settings – would have been time consuming.
We came up with a creative solution: we created our training set synthetically by writing a script to search for and download images taken at conferences, then superimposing cards with random letters onto each image. This simulated the types of images we needed to detect in real life.
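The original script is not reproduced in this post, but a minimal sketch of the approach might look like the following, using Pillow to paste a randomly lettered white card onto each downloaded conference photo. The directory names, card size, position logic, and font are illustrative assumptions, not our production code.

```python
# Minimal sketch of the synthetic training-data approach (paths, card
# dimensions, and font are assumptions for illustration).
import random
import string
from pathlib import Path

from PIL import Image, ImageDraw, ImageFont

def add_card(background_path: Path, output_path: Path) -> None:
    """Paste a white card with random alphanumeric text onto a conference photo."""
    background = Image.open(background_path).convert("RGB")
    w, h = background.size

    # Draw a white "license plate" card with random characters.
    card = Image.new("RGB", (w // 4, h // 8), color="white")
    draw = ImageDraw.Draw(card)
    text = "".join(random.choices(string.ascii_uppercase + string.digits, k=6))
    font = ImageFont.truetype("DejaVuSans-Bold.ttf", size=card.height // 2)
    draw.text((10, card.height // 4), text, fill="black", font=font)

    # Paste the card at a random, plausible position, as if held up by an attendee.
    x = random.randint(0, w - card.width)
    y = random.randint(h // 4, h // 2)
    background.paste(card, (x, y))
    background.save(output_path)

for i, photo in enumerate(Path("conference_images").glob("*.jpg")):
    add_card(photo, Path("training_set") / f"card_{i}.jpg")
```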
Model Training
With our new dataset, we used Fast.ai to train an image classifier to detect the presence of white cards containing numbers and letters. To accelerate this process and reduce the amount of training data we needed to create, we leveraged a machine learning technique called transfer learning, in which a model trained for one task is repurposed for a different but related task. By starting with a pre-trained model – in this case, a ResNet-18 convolutional neural network – and just over 2,000 training images, we were able to train a model in about 20 minutes that correctly identified images containing a license plate card 97% of the time.
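Our exact training code is not shown here, but a minimal transfer-learning sketch in the fastai v1 API of the time would look roughly like this. The directory layout, image size, and epoch count are assumptions.

```python
# Hedged sketch of transfer learning with fastai v1 (layout and
# hyperparameters are illustrative assumptions).
from fastai.vision import *  # fastai v1 convention

# Expects training_set/card/ and training_set/no_card/ subfolders.
data = ImageDataBunch.from_folder(
    "training_set", valid_pct=0.2, ds_tfms=get_transforms(), size=224
)

# Start from an ImageNet-pretrained ResNet-18 and fine-tune the classifier head.
learn = cnn_learner(data, models.resnet18, metrics=accuracy)
learn.fit_one_cycle(4)

# Export the trained model for later deployment.
learn.export("card_detector.pkl")
```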
While the model performed well on our training data, we wanted to confirm that it wasn’t actually detecting some artifact of our synthetic images. To test this, we evaluated the model on a live video stream of Novetta developers holding physical cards. In our lab, it worked just as well as it did on our synthetic data.
However, once we arrived at the conference center and tested it at our booth, the results were not as consistent. To address this, we collected an additional set of approximately 200 training images at the conference center. To reduce false positives, this training set included people both holding and not holding cards, and we varied elements such as shirt color, the distance at which the card was held, and the angle of the card. Collecting the additional images and retraining the model took less than 30 minutes and significantly improved the accuracy and consistency of the model.
Deploying to a Raspberry Pi
Deploying our trained model “at the edge” to a Raspberry Pi 3 B+ was the second big challenge we faced.
Our initial model, which relied on PyTorch, could not reliably be pushed directly to the device. After multiple unsuccessful attempts to compile PyTorch and Fast.ai on the Raspberry Pi, we turned to a new AWS ML service, Amazon SageMaker Neo. Amazon SageMaker Neo allows models to be trained centrally and then cross-compiled for a range of cloud and edge devices, including the Raspberry Pi. Using the SageMaker Neo API, we retrained our card recognition model with the MXNet framework and deployed it to the Pi in just a few minutes. The new model delivered the same performance and generated inferences with sub-second latency. With the model deployed and a video camera hooked up to the Pi, our prototype could detect when a license plate card appeared in the video stream.
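To make the workflow concrete, a Neo compilation job targeting the Raspberry Pi 3 can be started with a boto3 call along these lines. The bucket paths, job name, role ARN, and input shape below are placeholder assumptions, not the values we used.

```python
import boto3

sm = boto3.client("sagemaker")

# Kick off a SageMaker Neo compilation job targeting the Raspberry Pi 3 ("rasp3b").
sm.create_compilation_job(
    CompilationJobName="card-detector-rasp3b",            # assumed name
    RoleArn="arn:aws:iam::123456789012:role/NeoRole",      # assumed role
    InputConfig={
        "S3Uri": "s3://example-bucket/card-detector/model.tar.gz",  # assumed path
        "DataInputConfig": '{"data": [1, 3, 224, 224]}',
        "Framework": "MXNET",
    },
    OutputConfig={
        "S3OutputLocation": "s3://example-bucket/card-detector/compiled/",
        "TargetDevice": "rasp3b",
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
```

On the device itself, the compiled artifact can be loaded with the open-source DLR runtime; the model path and dummy input here are again assumptions.

```python
# Runs on the Raspberry Pi: load the Neo-compiled model with the DLR runtime.
import dlr
import numpy as np

model = dlr.DLRModel("/home/pi/card_detector", dev_type="cpu")
frame = np.random.rand(1, 3, 224, 224).astype("float32")  # placeholder for a camera frame
scores = model.run({"data": frame})
```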
Extracting the Text
Extracting the alphanumeric strings printed on the license plate cards was a further challenge. While we could have trained a custom model to do this ourselves, we decided it would be easier to use Amazon Rekognition for the task. Since Rekognition has already been trained on millions of images, a single API call was all it took to extract the text from a license plate card.
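A call along these lines is enough to pull the text out of a captured frame with Rekognition's DetectText API; the file name and confidence threshold are illustrative.

```python
import boto3

rekognition = boto3.client("rekognition")

# Send a captured frame to Rekognition for text detection.
with open("card_frame.jpg", "rb") as f:
    response = rekognition.detect_text(Image={"Bytes": f.read()})

# Keep only whole-line detections above a confidence threshold (90 is an assumption).
plate_text = [
    d["DetectedText"]
    for d in response["TextDetections"]
    if d["Type"] == "LINE" and d["Confidence"] > 90
]
```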
Encrypted Search with Enveil
Once the text of the license plate card was extracted, we wanted to simulate searching against a watchlist hosted on the AWS Snowball Edge. In our scenario, AWS Snowball Edge users should be able to submit queries against the watchlist, but should not be able to view the (sensitive) contents of the watchlist. To meet this requirement, we utilized a homomorphic encryption capability provided by Enveil. Typically, for a computer to perform operations on encrypted data, the data must first be decrypted; the decrypted data is used in the computation, and the result is then re-encrypted. Fully homomorphic encryption, however, enables the computation to be completed without ever decrypting the data.
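Enveil's ZeroReveal™ protocol is proprietary, so it is not shown here. To make the general idea concrete, the toy example below uses the open-source python-paillier library, an additively homomorphic (not fully homomorphic) scheme: the party doing the computation operates only on ciphertexts and never sees the plaintext values.

```python
# Toy illustration of homomorphic computation. This is NOT Enveil's protocol;
# Paillier is only additively homomorphic, but it shows the core principle of
# computing on data without decrypting it.
from phe import paillier

public_key, private_key = paillier.generate_paillier_keypair()

# Client side: encrypt values before handing them to an untrusted server.
enc_a = public_key.encrypt(17)
enc_b = public_key.encrypt(25)

# Server side: operate directly on the ciphertexts.
enc_sum = enc_a + enc_b      # homomorphic addition
enc_scaled = enc_a * 3       # multiplication by a plaintext constant

# Client side: only the private-key holder can read the results.
assert private_key.decrypt(enc_sum) == 42
assert private_key.decrypt(enc_scaled) == 51
```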
Once the text of a license plate card has been extracted by Rekognition, it is returned to the Enveil ZeroReveal™ Server application running on the AWS Snowball Edge. The Enveil ZeroReveal™ Server compares the input against the "secure watchlist" – a list of mission-sensitive license plates – and processes the information without exposing the contents of the watchlist in the untrusted environment. Users in a separate secure environment can view results and manage the contents of the watchlist through the Enveil ZeroReveal™ Client. This enables appropriately credentialed individuals to analyze data collected at the edge in near real time, without risking exposure of sensitive information.
Putting it All Together
By using a combination of custom trained models, Amazon Web Services, and Enveil, we created a prototype of a secure application operating in an edge environment. End-to-end, the system performed the following tasks:
- Collect Data at the Edge Using an IoT Device (Raspberry Pi)
- Process Video Using a Custom Novetta Object Detection Model
- Extract Text Using Amazon Rekognition
- Check Watchlist through Enveil ZeroReveal™ running on AWS Snowball Edge
- Communicate and Display Results in a Secure Environment
This prototype illustrates the ways in which Novetta enables our customers to use machine learning at the edge in disconnected environments, leveraging the power of the cloud. Using technologies like Amazon SageMaker Neo, AWS Snowball Edge, and Enveil, we deliver the latest in sensor and machine learning technology to support critical missions.