What You’ll Learn in this Article:
- Microsoft wants its competitors in the facial recognition sector to be subject to testing that measures accuracy and unfair bias.
- The company says it wants regulation because it is leading the industry, and recently touted high facial recognition algorithm accuracy levels reported in a globally-watched Commerce Department test.
- Results of these highly-regarded tests have been used as justification for face tech vendor choices by U.S. government agencies including the State Department.
- Corporations we’ve come to associate with face recognition including Amazon, Apple, Google and IBM did not participate in the voluntary test.
- Civil liberties, privacy and facial recognition fairness advocates worry that despite the industry’s massive accuracy improvement, facial recognition systems could be deployed in non-voluntary, undemocratic ways.
When a troupe of biometrics geeks convened in a conference room in October to hear a Commerce Department researcher deliver results of a test gauging facial recognition algorithm accuracy, they expected the usual incremental updates. Instead, some were stunned by what they heard. One wonk was so shocked she spilled her coffee.
Microsoft was the only U.S. company to garner top scores. Now the company wants a law that requires its competitors to undergo the same type of testing. Whether it is a state law or even a municipal law doesn’t matter to Microsoft’s longtime top legal counsel and president, Brad Smith, who declared his regulation mission during a presentation at the Brookings Institution on December 6.
“What we need to do is not only impose an obligation of transparency, we need to require under the law that the companies that are in this business of providing facial recognition technology in fact enable third parties to test these services for accuracy and for unfair bias.”
Reports of Microsoft’s support for government regulation of face recognition tech trickled out after Smith spoke last week. But most missed what could be the key reason why Microsoft is making its splashy push for rules: it’s on top of the U.S. face recognition leaderboard. Arguably, guiding regulation in the immediate future that would require possibly less-accurate competitors to show their cards could help Microsoft dominate the face recognition sector and win important contracts.
Guiding Rules while Ahead
“We believe in the importance of a law not because we are behind, but because we are ahead,” said Smith, standing before an audience gathered at the think tank for an afternoon presentation and Q&A. “We are at the forefront of this industry when it comes to the development of this technology.”
The software firm is banking on closely-watched test results published November 30 by the National Institute of Standards and Technology, a Commerce Department agency which measures the accuracy of facial recognition technologies. Turns out Microsoft’s algorithm ranked at the top with match accuracy just under 100%, making the stodgy tech firm from Redmond suddenly seem more powerful. The test is well-regarded by many in the industry; results were anticipated by engineers and IT professionals in government and corporate circles across the globe evaluating face recognition tech vendors.
Computing capabilities have increased drastically since the last test, and the NIST report concluded that the entire industry has improved not just incrementally, but “massively,” noting that at least 28 developers’ algorithms now outperform the most accurate algorithm from late 2013. By incorporating highly complex neural networks, developers have improved the ability of face recognition algorithms to detect identities even when poor quality images are employed, for example.
The unprecedented advancement of these technologies worries civil liberties and human rights advocates, not only because proof of high accuracy could spur wider use of face recognition for identification, criminal investigation and surveillance by federal, state and even municipal governments, but also because the tools could become commodified consumer products. Some believe a time when everyday people have face recognition apps on their phones that provide details about individuals may not be far off.
“Virtually across the board our algorithms came out at or very close to the top,” Smith said in reference to the NIST test.
Where were Amazon, Apple, Google, IBM and the other big names in AI? Nowhere to be found. In fact, the U.S. corporations we’ve come to associate with facial recognition did not even participate in this voluntary test, the first of its kind since 2014.
“As a society, we need legislation that will put impartial testing groups like Consumer Reports and their counterparts in a position where they can test facial recognition services for accuracy and unfair bias in a transparent and even-handed manner,” Smith wrote in a blog post published the day of his Brookings talk. He suggested that companies make their technologies accessible via APIs or application programming interfaces.
Smith implied that Microsoft should be among those guiding government regulations because the firm is already in discussions around issues related to privacy, fairness, democratic freedom and human rights as they relate to artificial intelligence. “Like any company, you don’t want to see the race run by some people who are taking the high road while others who may just not be thinking enough about these issues cause the market to tip.”
Given Smith’s call for a law requiring that facial recognition systems be tested for accuracy and fairness, it stands to reason that Microsoft believes mandatory testing could work to its advantage, since its system performed so well in the recent NIST test. Microsoft did not make a representative available to comment on the record for this story.
Gold Standard Rankings Beget Government Contracts
“Test results can inform policy,” Patrick Grother, a biometrics science researcher who led the NIST test, told RedTail. He added in a NIST press statement, “The implication that error rates have fallen this far is that end users will need to update their technology.”
One biometrics insider who asked to remain anonymous reaffirmed that notion, suggesting government entities using facial recognition may be compelled to change vendors and work with Microsoft.
French multinational tech firm Idemia was awarded a one-year contract by the State Department’s Consular Affairs office in June to enable facial recognition services for passport and visa application identity screening. At the time, the agency pointed to the company’s earlier NIST test accuracy scores to justify its choice, noting in a statement that Idemia “is the most accurate non-Russian or Chinese software according to the Department of Commerce – National Institute of Standards and Technology’s Face Recognition Vendor Test.”
Today, however, not so much. In the new NIST measurement gauging how well systems work in ageing scenarios — when attempting to match current photos of faces to photos taken several years prior — Microsoft’s algorithm performed better than all others including those of Yitu, VisionLabs and Idemia – which came in seventh place.
Arguably, as the most accurate facial recognition technology outside of China’s Yitu, Microsoft could be seen as an attractive vendor to government agencies that most likely would prefer to work with a U.S. company for security reasons. According to the DoS document providing justification for choosing to work with Idemia, Microsoft is among a list of “contractors that expressed interest, but have never managed a Face Recognition Gallery with over 50 million faces for a U.S. Government agency.” U.S. competitors like Google, IBM and Amazon were not on that list; Lockheed Martin was.
“Globally speaking the NIST benchmarks are treated as gold standards for the industry,” said Joy Buolamwini, an MIT researcher and founder of the Algorithmic Justice League who has exposed examples of poorly-trained facial recognition technologies that incorrectly label women as men and cannot decipher distinctions among people with dark skin. Buolamwini, who communicated via email with RedTail, has inspected previous NIST tests and found what she considered to be flaws in the methodology.
“Tech firms regardless of location that are interested in contracting with the U.S. government are incentivized to submit to these [NIST] benchmarks to show the readiness of their technology,” she added. “Favorable performance results also can serve marketing purposes.”
“The benchmark results of NIST are well-recognized as the golden standards of global industry for its strictness,” a spokesperson from Yitu told RedTail via email. “That’s why Yitu joined in the contest to measure its technology.”
Microsoft, ICE and Employee Backlash
Microsoft and other firms may want government entities to use their facial recognition technologies, but when well-known consumer brand names including Amazon and Google have courted these types of contracts, they have sparked backlash from consumers, civil liberties advocates and employees.
Microsoft’s work with Immigration and Customs Enforcement became the subject of uproar in July, during the height of ICE’s controversial actions separating families at U.S.-Mexico border crossings. The company provides ICE with its Azure cloud platform, which includes its facial recognition technology, though Microsoft stated in July that the contract covers email, messaging and document management, and that the platform was not being used for facial recognition.
Other government use of face recognition technology has spurred concern among civil liberty and privacy advocates for years. Just this month, ACLU Senior Policy Analyst Jay Stanley warned of potential abuse from a new plan to employ facial recognition in and around the White House, including on public streets. “Face recognition is one of the most dangerous biometrics from a privacy standpoint because it can so easily be expanded and abused — including by being deployed on a mass scale without people’s knowledge or permission. Unfortunately, there are good reasons to think that could happen.”
It may seem counterintuitive, but it’s the steady improvement of these technologies that has privacy and civil liberties advocates worried. Put simply, the higher the accuracy ratings of these systems, the more readily governments and commercial enterprises will implement them for purposes ranging from criminal investigation to cruise ship traveler identification.
“Even if these tools reach some accuracy thresholds, they can still be abused and enlisted to create a camera-ready surveillance state,” argued Buolamwini. She also suggested that even accurate technologies can be employed in ways that facilitate unfair treatment of innocent people and minorities. “Faulty systems that subject innocent people to undue scrutiny due to misidentifications or inaccuracies will most impact communities of color. Still accurate systems can be used to systematize racial profiling,” she told RedTail.
Proof in Numbers, or Not?
According to two accuracy-related metrics highlighted in the 2018 NIST report summary, when tested in June 2018, Microsoft’s algorithm performed best, missing just 0.15% and 0.23% of face matches respectively. Think of it this way: Microsoft’s face recognition algorithm was accurate at least 99.77% of the time according to those key measurements prominently featured in the report. Yitu did best when tested using another accuracy metric, missing 1.6% of the time.
It wasn’t just Microsoft or Yitu that got good marks. Overall, just 0.2 percent of searches by all algorithms tested failed to return matches in 2018, compared with a 4 percent failure rate in 2014 and 5 percent in 2010. “The technology from 2010 is almost certainly in the trash can,” said Grother.
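For readers who want to check the arithmetic behind those figures, here is a minimal sketch in Python. It uses only the percentages quoted above; the function names are illustrative assumptions and are not taken from the NIST report, and the report's own metrics may be defined more precisely than this simple conversion suggests.

```python
# Illustrative arithmetic only: figures come from the article, names are hypothetical.

def accuracy_from_miss_rate(miss_rate_pct):
    """Convert a miss rate (percent of matches missed) into an accuracy percent."""
    return round(100.0 - miss_rate_pct, 2)

def error_reduction_factor(old_miss_pct, new_miss_pct):
    """How many times smaller the new miss rate is than the old one."""
    return round(old_miss_pct / new_miss_pct, 1)

# Microsoft's miss rates on the two highlighted metrics (percent).
microsoft_miss_rates = [0.15, 0.23]

# "At least" accuracy is set by the worse of the two metrics.
worst_case_accuracy = min(accuracy_from_miss_rate(m) for m in microsoft_miss_rates)

# Industry-wide failure rates quoted for 2010, 2014 and 2018 (percent).
reduction_since_2014 = error_reduction_factor(4.0, 0.2)
reduction_since_2010 = error_reduction_factor(5.0, 0.2)

print(worst_case_accuracy, reduction_since_2014, reduction_since_2010)
```

Read this way, the quoted industry-wide failure rate fell by a factor of roughly 20 between 2014 and 2018, which helps explain why Grother described older technology as headed for the trash can.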
What does this mean in a practical sense? Grother explained using a criminal investigation framework. “If you were sitting on a pile of photos that didn’t prove useful four years ago, if you were to research them today, some might bear fruit and give you an investigative lead,” he suggested.
The “one-to-many” NIST test measured the ability of face recognition algorithms to match a person’s photo with a different one of the same person stored in a database featuring millions of image samples. NIST employed a dataset including 26.6 million portrait photos of 12.3 million individuals and additional smaller data sets featuring webcam, photojournalism, video surveillance and amateur photo images. In all, the one-to-many test gauged 127 face identification algorithms from 45 developers around the world.
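To make the idea of a one-to-many search concrete, the toy sketch below compares a "probe" photo against every entry in a small gallery and counts how often the right person is missed. It rests on assumptions the article does not confirm: that faces are represented as numeric vectors compared by cosine similarity, that a fixed threshold of 0.8 decides whether a match is returned, and that the three-number vectors and names are made up. It is not NIST's test procedure or any vendor's algorithm.

```python
import math

def cosine_similarity(a, b):
    """Similarity between two face vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search_gallery(probe, gallery, threshold=0.8):
    """Return the best-matching gallery identity, or None if no score reaches the threshold."""
    best_id, best_score = None, -1.0
    for identity, enrolled in gallery.items():
        score = cosine_similarity(probe, enrolled)
        if score > best_score:
            best_id, best_score = identity, score
    return best_id if best_score >= threshold else None

def miss_rate(probes, gallery, threshold=0.8):
    """Fraction of probes whose true identity was not returned by the search."""
    misses = sum(
        1 for true_id, vector in probes
        if search_gallery(vector, gallery, threshold) != true_id
    )
    return misses / len(probes)

# Hypothetical gallery of enrolled people and later photos of the same people.
gallery = {
    "alice": [1.0, 0.0, 0.0],
    "bob": [0.0, 1.0, 0.0],
    "carol": [0.0, 0.0, 1.0],
}
probes = [
    ("alice", [0.9, 0.1, 0.0]),
    ("bob", [0.1, 0.95, 0.05]),
    ("carol", [0.6, 0.0, 0.7]),   # poor-quality photo: falls below the threshold
    ("alice", [0.95, 0.0, 0.05]),
]

print(miss_rate(probes, gallery))
```

In this made-up example one of four searches fails, a 25 percent miss rate. The real test ran the same kind of comparison against a gallery of millions of photos.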
Technologists anticipate face recognition algorithms will continue to improve as a result of the constant influx of data and quickening pace of computing capacity. Ultimately, in order for Microsoft to take advantage of its recent NIST superiority, it would need to garner facial recognition contracts before competitors catch up in the speed and accuracy race.
“The facial recognition industry as a whole has been under increased scrutiny, so there is a need for industry players to attempt to prove the technical merits of their systems. However, improved technical performance does not prevent abuse of these technologies and increases concerns about mass surveillance and privacy issues,” said Buolamwini.
She also took issue with the NIST test methodology, noting the lack of detailed demographic and phenotypic information related to the benchmarks used. “This current report does not provide detailed information on demographic performance or phenotypic (skin type/facial hair) performance so governments that are considering deployments on multi-ethnic populations or in places like airports are missing vital information.”
Grother said NIST is working out the details of a companion report covering demographic information; the agency aims to publish it in the first quarter of 2019.
“Even if a follow-on report is planned to look at demographics when it comes to performance, we need to know if certain groups are being underrepresented in the benchmark and the extent to which results can be generalized,” contended Buolamwini.
Even Grother himself cautioned that despite the overall industry-wide improvement of face recognition algorithm accuracy, the report should be considered a comparison among algorithms as applied in specific testing scenarios, more so than a grand conclusion validating face recognition system perfection.
“This is a snapshot in time,” said Grother regarding the new test results. Already, NIST is repeating the test based on algorithms submitted in late October, he added. “The accuracy revolution is ongoing.”