What you’ll learn in this article:
- RedTail’s Kate Kaye, a Portland resident, investigates the city’s plans for use of Replica, software from the controversial Google-affiliate Sidewalk Labs.
- AT&T funded and supplied data for UC Berkeley research that appears to have led to what Replica is today.
- Portland’s 2016 application for a U.S. Department of Transportation Smart City Challenge grant included plans for partnerships with Sidewalk Labs for wi-fi kiosks similar to those employing surveillance cameras in NYC.
- Sidewalk is not alone. Firms including AirSage, Teralytics and Streetlight Data also use mobile location data analytics for municipal clients.
(Note: See December 28, 2019 story update below.)
In its short four-year existence, Sidewalk Labs, a subsidiary of Google owner Alphabet, has earned a reputation as an Orwellian tech contractor facilitating a surveillance state future in cities across North America. Sounds dire, right? Whether the firm deserves this perception is debatable; other companies provide some similar services to municipal clients. However, there are indeed a lot of questions swirling around the company, particularly about Replica, its city mobility software.
As city and state governments implement the technology to better understand how people move around their regions, people want to know more about how the software works, who is supplying the data used in Replica and whether it could be vulnerable to reidentification.
I do, too. I live in Portland, OR, where staff from three city agencies are eagerly awaiting access to a massive set of synthetic population data Sidewalk Labs has been building for them. Staff involved in the Replica project here expect to test the software soon, possibly starting in July. The launch of a year-long pilot of Replica, costing nearly half a million dollars in total, will likely follow.
In February I began what I surmise will be a continuing investigation of Sidewalk’s partnership here in my city. A story I wrote, published today in Geekwire, details the Portland project, discusses why city staffers want to work with the controversial firm, explains why some observers are not convinced that promises of deidentification are all they’re cracked up to be, and questions whether Portland residents are as aware of this partnership as they ought to be.
It’s been well-reported that Replica uses deidentified mobile location data gleaned from mobile app publishers, mobile location data aggregators and telcos. It’s been less-reported that, in order to help municipalities understand how different types of people travel, and gauge transportation impacts on specific demographics such as people of color or other marginalized groups, the company layers in demographic information from census data to build simulated personas. It might also add in credit bureau data to reveal income levels.
Please check out the Geekwire story if you want more details on synthetic population data and how Replica works.
Before I delve into some background, I want to highlight one of the key findings of my initial reporting: AT&T provided raw cellphone data employed in research that led to the development of what Replica is today. This tidbit is important because, as far as I can tell, this is the only reporting pointing to a possible Sidewalk Labs data provider by name that isn’t linked to Alphabet itself.
AT&T and the Replica Origin Story
Digging around to learn more about how Sidewalk retrieves its data, I was led to a research paper published in 2017 which appears to offer a detailed look into the Replica origin story. My Geekwire story provides additional detail on the research, conducted at UC Berkeley and partially funded by AT&T and the State of California Department of Transportation. The paper (download it here) details how researchers developed synthetic travel pattern models for regional mobility analysis from cellular data. (Sound familiar?)
One of the researchers involved in the UC Berkeley study, Alexei Pozdnoukhov, is now a director of research at Sidewalk Labs.
“One of our goals is to enable activity based travel demand models that use cellular data to create synthetic agent travel patterns without compromising the privacy of cell phone users,” states the paper.
December 28, 2019 update
The technical process featured in this academic paper indeed describes an early phase of Replica, according to CEO Nick Bowden. In a June 2018 email from Bowden to Eliot Rose, technology strategist at Portland’s Metro, which was obtained by RedTail’s Kate Kaye, Bowden stated, “As I have been attempting to compile this info, I realized that Alexei our head of data science and modeling published this peer-reviewed paper recently that is the most exhaustive description of our technical methods and process. We are continually evolving, but I think this is likely the most in-depth and best representation of the approach.”
Back to the original story:
It turns out the research wasn’t only funded by AT&T; it also employed data provided by the company, as confirmed by Sidewalk Labs spokesperson Dan Levitan and another source who asked to remain anonymous. Levitan cautioned against the notion that the research was the “foundation” of the Replica software. In an email exchange with me, he noted, “Foundation is a bit strong, certainly this paper speaks to the potential of using anonymized location data for transportation modeling.”
However, when I asked Eliot Rose, technology strategist at Portland’s Metro, to provide details on the sources of information used in the city’s Replica pilot, he pointed me to this paper, noting, “I’d suggest that you read the paper that SWL’s scientists published on their methodology.” Metro is one of the three Portland agencies paying for the software, and here was the agency’s tech strategist, someone who has been very closely involved in this project, stating that this paper describes Sidewalk’s methodology for Replica.
More from the paper: “To examine the generative power of the model, we synthesized travel plans for each agent with home and work locations sampled from census data. An agent-based microscopic traffic simulation was conducted to compare the resulting traffic with real traffic, and a reasonable fit accuracy was observed.”
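For readers who want a concrete picture of what “home and work locations sampled from census data” can mean, here is a minimal sketch in Python. It is an illustration only, not the researchers’ code and not Sidewalk’s. The tract names, the counts and the function names are all hypothetical, and the sketch covers only the sampling step, not the travel plans or the traffic simulation the paper goes on to describe.

```python
import random

# Hypothetical tract-level counts. These numbers are invented for
# illustration; they are not census figures and not from the paper.
RESIDENTS_BY_TRACT = {"tract_A": 4000, "tract_B": 2500, "tract_C": 1500}
JOBS_BY_TRACT = {"tract_A": 1000, "tract_B": 5000, "tract_C": 2000}


def sample_tract(counts, rng):
    """Pick one tract with probability proportional to its count."""
    tracts = list(counts)
    weights = [counts[t] for t in tracts]
    return rng.choices(tracts, weights=weights, k=1)[0]


def synthesize_agents(n, seed=0):
    """Build n synthetic agents, each with a sampled home and work tract."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    return [
        {
            "agent_id": i,
            "home": sample_tract(RESIDENTS_BY_TRACT, rng),
            "work": sample_tract(JOBS_BY_TRACT, rng),
        }
        for i in range(n)
    ]


if __name__ == "__main__":
    for agent in synthesize_agents(5):
        print(agent)
```

Each agent here is drawn from aggregate counts, not copied from any one person’s record. Whether that is enough to prevent reidentification in a real system is exactly the question privacy experts are raising.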
To be clear, I have been unable to determine whether AT&T provides data to Sidewalk Labs today, only that the telco supplied it for the earlier research. I reached out to multiple AT&T press contacts but have not received a response to requests to comment regarding the use of their data in the research, or whether AT&T currently supplies data to Sidewalk Labs in any capacity.
AT&T has made no secret of its smart city tech ambitions.
So, why all the hubbub about where the data comes from – isn’t it from Google? Sidewalk says “no,” Replica data does not come from Google, Google apps such as Maps, or its Android mobile operating system. However, it has been reported that data from the Google-owned Waze app could be used for Replica.
It’s worth noting that lots of companies gather and combine mobile location data over time to build models for ad targeting, marketing insights, or to help municipalities plan development and transit. A January story in The Intercept suggests that “packaging and selling location data on millions of cellphones” is Sidewalk’s business model for Replica.
Not only is the mere “packaging and selling” of location data commonplace; the description also oversimplifies what Sidewalk and its competitors actually do. Sidewalk employs sophisticated algorithmic techniques to build synthetic population models. Other firms including AirSage, Teralytics and Streetlight Data take their own approaches to transforming location data into analytics and insights to help municipal clients understand how people move around their locales.
As a staff reporter covering the data industry for Advertising Age from 2012-2017, I often wrote about how mobile location data was anonymized then made available to marketers. They use it to get more information about customers and target consumers, or to build audience profiles for ad targeting. In 2015, I exposed how Verizon, Sprint, Telefonica and other carriers had partnered with firms including SAP, IBM, HP and AirSage to manage, package and sell various levels of never-before-available data to marketers and other clients. At the time, AirSage, which had tight integrations with mobile operators, had recently signed data deals with Verizon Wireless and Sprint.
Earlier this year, Motherboard revealed how real-time location data from AT&T, T-Mobile, and Sprint has been sold on the black market.
Portland Wanted Sensors, Wi-fi Kiosks, and More
OK, back to Portland.
It seems to have started around three years ago or so, when the city threw its hat in the ring to score a $40 million US Department of Transportation Smart City Challenge grant. Portland had grand plans for the “Ubiquitous Mobility” project detailed in its contest application.
It featured lots of potential corporate partnerships:
- Sidewalk Labs would support data analytics.
- Daimler and GM would handle autonomous and connected vehicles.
- Intersection – the public wi-fi provider chaired by Sidewalk’s CEO Dan Doctoroff – would, in conjunction with Sidewalk, install 100 kiosk-based sensors tracking air quality and pedestrian traffic. (Intersection’s “free” wi-fi kiosks installed throughout New York City were exposed recently for collecting surveillance camera footage that led to the arrest of someone who vandalized kiosks.)
In March 2016, the U.S. DOT announced its plan to work with Sidewalk Labs and the seven finalist cities — Austin, Columbus, Denver, Kansas City, Pittsburgh, Portland and San Francisco — to develop a data platform they called Flow, which appears to be an early version of Replica. Along with Sidewalk, AT&T and Amazon were DOT partners in the contest. In the end, Columbus, OH got the grant.
Since Portland incorporated Sidewalk into its Smart City Challenge plans, the company has come under fire in relation to its controversial partnership with a Toronto group aiming to turn the Quayside waterfront neighborhood there into a “global hub” of “urban innovation.”
Portland’s use of Replica is nowhere near the tech takeover proposed in Toronto. But when I first realized Portland signed on in December to use Replica, it wasn’t mobile location data that came to mind. Instead, I immediately thought of the physical accoutrements of surveillance – cameras, sensors, wi-fi kiosks and other IoT devices, for instance. After I evaluated Portland’s Smart City Challenge application, it turned out my suspicions weren’t out of left field after all. Had Portland won that grant, we might now be talking about implementation of wi-fi kiosks and other tracking devices. The city and Sidewalk both confirmed that none of that sort of hardware is involved in the Replica project.
Yet, already, as part of a separate initiative involving GE, Intel and AT&T, Portland is testing traffic signal sensors in three of the city’s most dangerous intersections to identify how cars, pedestrians and bikes move throughout those areas. My sources at Smart City PDX, the umbrella group overseeing many of Portland’s emerging tech efforts, told me the data captured by those traffic sensors is analyzed immediately then destroyed directly at the sensor source before it’s uploaded to the cloud.
Peruse Portland’s Smart City Challenge application, and you’ll find a longer list of existing data sources used by Portland.
[The following was added 5.30.19]
Portland is part of Sidewalk’s “Charter Customer Program” and as such, the city, particularly lead agency Metro, has been asked to talk up the benefits of Replica’s data collection. A term sheet linked in an OPB story the day of the city council’s unanimous vote in December approving the Replica project states, “As part of the Charter Customer Program, Sidewalk Labs asks the lead agency to coordinate and champion data collection and retrieval efforts that assist in both improving Replica and facilitating acceptance Criteria Assessments.” As part of the ramp up to the city launching the official pilot, agencies here have determined specific criteria used to measure effectiveness of the software.
Here’s the official list from that document of agencies that can access Replica, including local colleges and cities in the surrounding area:
[End of 5.30.19 addition]
When Tech Drives Policy, Government Must Be Open to Robust Media Engagement
Simply because there are other products and services employing aggregated mobile location data for analytics, ad targeting or city planning, does that take Sidewalk Labs off the hook? The company has prompted concern for legitimate reasons. Data privacy and security experts wonder if data deidentification techniques might not work as well as the firm says they do. They worry about data being reidentified and employed by law enforcement, for instance.
As noted in my Geekwire story, a Portland Bureau of Transportation spokesperson told me the contract with Sidewalk Labs prevents law enforcement or any entities other than the three Portland agencies paying for the software from accessing Replica software or data.
When just starting my research into Sidewalk Labs and its work with Portland in February, I spoke to Ann Cavoukian, the former privacy commissioner of Ontario who abruptly quit her advisory role with Sidewalk Toronto in October after attending a company board meeting. Cavoukian said she resigned because Sidewalk failed to require that its infrastructure partner firms in Toronto – companies that would provide waste disposal or highway information, for example – deidentify the data they gathered as part of the project.
Instead, recalled Cavoukian, Sidewalk said it “will encourage others to deidentify.” She told me, “As soon as they said that, I knew I had to leave…. I knew this was not going to be a smart city of privacy anymore.”
Though similar grandiose talk is typical in the so-called smart city universe, Sidewalk’s brazen declarations and stated mission to transform the day-to-day interactions people have in urban environments are disturbing to some people. In a 2017 video introducing the Sidewalk Toronto project, Doctoroff was blunt: “We have an opportunity to fundamentally redefine what urban life can actually be.”
Privacy concerns aside, the prospect of technology companies creating government policy is no longer a theoretical concept. Sidewalk Toronto, the use of Replica software, and countless tech implementations around the world make it real. It’s up to city governments to ensure people are aware of and have a real say in decisions to implement new tech and partnerships. My Geekwire story details how Portland has addressed community engagement in relation to Replica and it’s not exactly what you’d call a splashy awareness campaign.
I’m not convinced the word is getting out to people, and I’d argue that more robust engagement with media would facilitate that. People read, watch and listen to local media coverage. Community meetings will only go so far to foster meaningful engagement. If governments really do want to get the word out, they must be more open to engaging with journalists, even when — no, particularly when — we ask tough questions.