CES: Mobileye/Intel Self-Driving Secret? 200PB of Data
Stored across AWS and on-premises systems, the world's largest automotive dataset comprises more than 200PB of driving footage – equivalent to 16 million 1-minute driving clips from 25 years of real-world driving
This is a Press Release edited by StorageNewsletter.com on January 14, 2022 at 2:01 pm.
Mobileye, an Intel Corp. company, is sitting on a virtual treasure trove of driving data – some 200PB worth.
When combined with the company’s computer vision technology and capable natural language understanding (NLU) models, the dataset can deliver thousands of results within seconds, even for incidents that fall into the ‘long tail’ of rare conditions and scenarios. This helps the AV and computer vision system handle edge cases and thereby achieve the high mean time between failure (MTBF) targeted for self-driving vehicles.
“Data and the infrastructure in place to harness it is the hidden complexity of autonomous driving. Mobileye has spent 25 years collecting and analyzing what we believe to be the largest database of real-world and simulated driving experience, setting Mobileye apart by enabling highly capable AV solutions that meet the high bar for mean time between failure,” said Prof. Amnon Shashua, president and CEO.
How it works:
Mobileye’s database – believed to be the world’s largest automotive dataset – comprises more than 200PB of driving footage, equivalent to 16 million 1-minute driving clips from 25 years of real-world driving. Those 200PB are stored across AWS and on-premises systems. The sheer size of this dataset makes Mobileye one of AWS’s largest customers globally by volume stored.
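The headline figures imply a per-clip storage footprint that can be sanity-checked with simple arithmetic (a back-of-the-envelope calculation, assuming decimal units, i.e. 1PB = 10^15 bytes):

```python
# Sanity-check the press release's figures: 200PB across 16 million 1-minute clips.
PB = 10**15  # decimal petabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes

total_bytes = 200 * PB
num_clips = 16_000_000

bytes_per_clip = total_bytes / num_clips
print(f"{bytes_per_clip / GB:.1f} GB per 1-minute clip")  # 12.5 GB
```

Roughly 12.5GB per minute of footage is plausible for multi-camera, high-resolution raw or lightly compressed capture, which supports the 200PB total.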
Large-scale data labeling is at the heart of building the computer vision engines needed for autonomous driving. The firm’s rich and relevant dataset is annotated both automatically and manually by a team of more than 2,500 specialized annotators. The compute engine relies on 500,000 peak CPU cores in the AWS cloud to crunch 50 million datasets monthly – the equivalent of 100PB processed every month, covering 500,000 hours of driving.
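The monthly processing figures can likewise be broken down per driving hour (again a back-of-the-envelope calculation, assuming decimal units):

```python
# Back-of-the-envelope check of the monthly processing figures
# (assumption: decimal units, 1 PB = 10**15 bytes, 1 GB = 10**9 bytes).
PB = 10**15
GB = 10**9

monthly_bytes = 100 * PB   # 100PB processed per month
driving_hours = 500_000    # hours of driving covered per month

gb_per_hour = monthly_bytes / driving_hours / GB
print(f"{gb_per_hour:.0f} GB processed per hour of driving")  # 200 GB
```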
Why it matters:
Data is only valuable if you can make sense of it and put it to use. This requires comprehension of natural language along with computer vision, Mobileye’s long-standing strength.
Every AV player faces the ‘long tail’ problem, in which a self-driving vehicle encounters something it has not seen or experienced before. Many companies hold large datasets covering this long tail but lack the tools to make sense of them. The company’s computer vision technology, combined with extremely capable NLU models, enables the firm to query the dataset and return thousands of long-tail results within seconds. Mobileye can then use these results to train its computer vision system and make it even more capable. This approach accelerates the development cycle.
What is included:
The firm’s team uses an in-house search engine database with millions of images, video clips and scenarios. They include anything from ‘tractor covered in snow’ to ‘traffic light in low sun,’ all collected by the company and feeding its algorithms. (See sample images).
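Mobileye does not disclose how its in-house search engine is implemented; a production system would presumably rely on learned NLU and vision embeddings. As a toy illustration only, the idea of querying an annotated clip index by scenario description can be sketched with a simple tag-overlap score (all clip IDs and tags below are invented):

```python
# Toy sketch (NOT Mobileye's actual system): retrieving annotated driving
# clips by a natural-language scenario description. A simple tag-overlap
# score stands in for a real learned NLU/vision relevance model.
from dataclasses import dataclass

@dataclass
class Clip:
    clip_id: str
    tags: set  # annotations from automatic and manual labeling

def search(clips, query, top_k=3):
    """Rank clips by how many query words match their annotation tags."""
    terms = set(query.lower().split())
    scored = [(len(terms & c.tags), c) for c in clips]
    scored = [(s, c) for s, c in scored if s > 0]  # keep only matches
    scored.sort(key=lambda sc: -sc[0])             # best match first
    return [c.clip_id for _, c in scored[:top_k]]

# Hypothetical index entries echoing the article's example queries.
clips = [
    Clip("clip-001", {"tractor", "snow", "rural"}),
    Clip("clip-002", {"traffic", "light", "low", "sun"}),
    Clip("clip-003", {"pedestrian", "night", "rain"}),
]

print(search(clips, "tractor covered in snow"))  # ['clip-001']
```

A real system at 200PB scale would replace the linear scan with an indexed vector or inverted-index search, but the interface – free-text scenario in, ranked clips out – is the same idea the article describes.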
More context:
With access to the industry’s high-quality data and the talent required to put it to use, the company’s driving policy can make sound, informed decisions deterministically – an approach that removes the uncertainty of AI-based decisions and yields a statistically high mean time between failure. At the same time, the dataset hastens the development cycle to bring the lifesaving promise of AV technology to reality more quickly.
Resources:
Blog: Prof. Shashua Takes Us ‘Under the Hood’ at CES 2022
Blog: Discover the Latest Vehicles Driven by Mobileye
Blog: Online at CES 2022