What the SketchFactor app can tell us about the ethics of data use and crowdsourcing

The SketchFactor app identifies neighborhoods' "sketchiness" through publicly available data, as well as user updates that other users can vote on. The crowdsourcing aspect has led to controversy.

[Image: A screenshot of the SketchFactor app]

SketchFactor is an app that combines public crime data with mobile user reports to rate the "sketchiness" of different neighborhoods on the East Coast of the US. The iOS version went live earlier this summer, and the Android version was released on October 9, 2014. As SketchFactor, LLC cofounder Allison McGuire told me in an interview, the idea for the app came from her own experience.

"After living in Boston, London, Washington, DC, and now NY, I've developed a walking habit," she explained. "There are many times when I wished I knew what was on the next block. As I talked with friends all over the world, I noticed a common problem: people didn't have all the information they needed to explore cities on foot. Maybe they knew where a landmark was, but they had no information on the experience of city streets. I knew that information existed via public data, but didn't know how to access that myself. Plus, I saw people sharing their experiences socially. Then, I thought, what if we mapped both people's experiences and public data in a fun, engaging way? One night when I was living in DC, a woman standing on a street corner stopped me on my walk home from work. She asked if I lived in the neighborhood. She told me she was stopping every woman walking alone because the previous evening a woman was choked and mugged on that corner. I was horrified. She said that particular area was poorly lit and to be especially careful. She was a beacon. I knew then that I needed to quit my job to make SketchFactor a reality."

Frankly, when I first saw SketchFactor, I thought it looked a lot like the crime apps I've seen for years, with the added twist of user reporting. In particular, it called to mind Stumble Safely, a mobile app that used data from the District of Columbia to enable users to avoid risky routes home. At launch, Stumble Safely similarly combined open government data on crime reports with liquor licenses, bike lanes, and subway stations and mapped it out in a lovely mobile interface. Stumble Safely was a finalist in a city-sponsored apps contest in DC called Apps for Democracy, featured on the Knight Foundation blog, and written up favorably in Wired. (SketchFactor was a finalist in the city-sponsored apps contest in NYC.)

[Image: A screenshot of the SpotCrime app for comparison]

"The data we use varies," McGuire told me. "From crime to 311 reports, we implement publicly available data when it's accessible. That last part is key, as there are many places across the US that don't provide data in a digestible format. This stifles innovation. There are many cities that heatmap crime, for example, which paints a broad picture of where incidents occur, but that information isn't specific enough to actually help anyone traversing streets on foot. In addition, this can skew neighborhoods' reputations."

As with so many other apps produced by apps contests, Stumble Safely didn't endure in DC. The code, at least, was adapted by Chris Metcalfe, a developer at Socrata, and lives on as a demo application to which visitors or residents can send text messages about their location in the Baltimore area and receive a risk rating in return.

SketchFactor's path to sustainability isn't much clearer to me, though McGuire shared some possible directions, such as using personalization and updates to target ads or provide business intelligence to partners.

"We're not focused on monetizing at this moment — we are focused on user growth," she said. "We're strictly focused on optimizing a product our users love and can't stop using. We're well on our way, as within our first month of launch, our users spent a total of over 1 year reading stories in the app. We have planned revenue models, however, which will be implemented once we have a greater user base and understanding of their needs. These models include integrating advertising and data sales. For example, if someone walks a great deal, they may be served an ad for walking shoes. As SketchFactor gains real-time insights into city streets, we can segment that data for appropriate partners, such as energy companies. Many of these businesses don't have access to immediate light outages, relying on you or me to report a burned out bulb. If a number of users tell SketchFactor the area is poorly lit, we can share that with these partners. Energy companies make more money with additional lighting, and pedestrians benefit from a better lit street. That's a win-win."

When I explored hundreds of reports around DC, though, the vast majority of updates were political commentary, satire, or jokes. When I asked how sketchiness was determined and whether the company was vetting the reports, McGuire told me that it does not.

"There are two things we keep in mind here," she said. "One, truth is stranger than fiction and two, it's really important that we use the power of crowdsourcing to weigh stories. Many have asked us how SketchFactor vets reports. The simple answer is that we, the company, don't. It's not important what just [cofounder] Daniel [Harrington] and I think is sketchy. What is important is how other people receive reports. As we hear user feedback — via upvoting or downvoting stories — we learn from them as to what they think is sketchy and adjust our algorithms to fit their needs. For example, I always hear stories of sexual harassment, desolate areas, and poorly lit streets and think: sketch. For Daniel, he sees large groups of people (like popular tourist attractions) and thinks: sketch. When I use the app, stories of catcalling at a poorly lit gas station will be more heavily weighted, but when Daniel launches SketchFactor, experiences in Times Square are highlighted for him."

Where SketchFactor went beyond Stumble Safely is in the addition of these user-generated reports and the impact of the crowd rating them, and that's what has led to some controversy.
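
To make the mechanism McGuire describes more concrete, here is a minimal Python sketch of how crowd votes and a reader's own preferences might combine to order stories. It is an illustration under stated assumptions, not SketchFactor's actual algorithm, which this column does not document: the `Story` fields, the tags, the vote smoothing, and the per-reader interest weights are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Story:
    """A user-submitted report (hypothetical structure, not SketchFactor's schema)."""
    text: str
    tags: set[str]
    upvotes: int = 0
    downvotes: int = 0


def crowd_score(story: Story) -> float:
    """Smoothed share of upvotes; a story with no votes yet scores 0.5."""
    return (story.upvotes + 1) / (story.upvotes + story.downvotes + 2)


def personalized_score(story: Story, interests: dict[str, float]) -> float:
    """Boost the crowd score by the reader's (assumed) interest in each tag."""
    boost = 1.0 + sum(interests.get(tag, 0.0) for tag in story.tags)
    return crowd_score(story) * boost


def rank(stories: list[Story], interests: dict[str, float]) -> list[Story]:
    """Return stories ordered from most to least relevant for one reader."""
    return sorted(
        stories,
        key=lambda s: personalized_score(s, interests),
        reverse=True,
    )


# Two stories with identical crowd votes, echoing the examples in the quote.
stories = [
    Story("Catcalling at a poorly lit gas station", {"harassment", "lighting"}, 8, 2),
    Story("Shoulder-to-shoulder crowds in Times Square", {"crowds"}, 8, 2),
]

# Hypothetical interest weights for two different readers.
reader_one = {"harassment": 0.5, "lighting": 0.5}
reader_two = {"crowds": 1.0}

print(rank(stories, reader_one)[0].text)
print(rank(stories, reader_two)[0].text)
```

With identical crowd votes, the two readers see different stories first. That is the point of personalization, and also why an unvetted, self-reinforcing ranking deserves scrutiny: the system shows each reader more of what that reader already considers sketchy.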

What's behind the SketchFactor controversy?

First, it might be useful to add some context. When it comes to the economic or social impact of open data, it's not just how big the data set is, but whether and how an entity applies it.

As I've explored in past columns, entrepreneurs, government agencies, hospitals, media, and researchers can apply open data to increase resilience against climate change, deliver insights into healthcare costs and outcomes, improve consumer protection, or enable journalists to hold power accountable. Over the years, however, I've also seen growing concern about how data can be used to discriminate against the powerless in society on the basis of their race, creed, color, sex, or class. While advocates hope that releases of open data about crime, traffic, education, government, campaign finance, energy, the environment, or a host of other statistics will reduce information asymmetries in societies, the same data sets may also be used in ways that make people deeply uncomfortable.

In their initial media interviews, the founders of SketchFactor positioned the citizen reporting aspect of the app as a feature that would enable civic engagement and lead to social justice: people would report racism or discrimination as "sketchy behavior." Upon launch, however, SketchFactor was derided as "icky" and downright "racist" by Sam Biddle at Valleywag. In Biddle's view, SketchFactor was "yet another app for avoiding non-white areas of your town — and it's really taking off!" (According to McGuire, SketchFactor was downloaded 60,000 times in the first four days it was live on iTunes and received over 6,000 reports.)

"It's unfortunate that those who have come before us have done what many cities do: heatmap crime," said McGuire. "That isn't helpful information and affects the people who live in these communities. We're different because our users pinpoint specific experiences to draw attention to blocks, alleyways, and corners that are funny, interesting, and problematic. Anyone who lives in a city knows that there's no such thing as a 'good' or 'bad' neighborhood, but, there are streets that need work. There are alleyways worth exploring. There's nothing good about a sketchy corner, regardless of where that corner is."

I wasn't quite as quick as Biddle to jump to a conclusion about the app being racist, given the use case McGuire described and the ones I've encountered on my own travels. People unfamiliar with areas really do need better maps. The biggest issue with SketchFactor in DC is that I find it less useful than existing maps of crime, like Trulia's. Sometimes, it's not enough to be aware of how a neighborhood is changing to avoid getting off at the wrong stop or taking the wrong way back to a hotel. Violent crimes are concentrated in poor areas, which visitors and residents may wish to avoid, along with transit stations where thieves prey upon bewildered or tired travelers unfamiliar with an area and its social contours. Knowing what's safe and what's not if you're "not from around here" isn't easy, and if technology can give a traveler a better roadmap, that's valuable.

Given all of that, I expect there to continue to be a demand for services that give travelers or inebriated residents guidance on getting to their destination safely. That does not mean we shouldn't all be watching carefully to see how these services are actually being used and by whom. That's why algorithmic transparency and auditing matter. Crime data collected and published by law enforcement agencies can act as a source for people to make transit, dining, recreation, or hospitality decisions, just as education or housing data informs other decisions.

As always, these sorts of data-driven decisions, whether made by the individual or powered by algorithmic suggestions, are subject to the quality of the crime data. As anyone who has watched the fiction of "The Wire" or followed the facts of the Department of Veterans Affairs scandal knows, official statistics can and will be "juked" without sufficient oversight and auditing. One of the reasons data journalism has taken on new importance is the need for such independent auditing.

When we talk about big data, we also need to talk about redlining, racial profiling, and discrimination, or the reinforcement of existing biases in systems. Data also may be used by insurers to deny low-income drivers, by data brokers to target vulnerable people in communities, to filter out job applicants, to enable "dragnet surveillance," or by police to target people whom an algorithm deems likely to commit a crime but who have not done so. That last example is not theoretical: as the American Civil Liberties Union documented, the Chicago Police Department moved beyond "heatmapping" potential crime to sending officers to flag potential criminals, not suspects.

These kinds of risks are why civil rights groups are warning about big data, and why two reports from the White House weighed the opportunities of big data against its privacy risks and potential harms. A third important report on civil rights and big data, published in September 2014 by researchers David Robinson, Aaron Reike, and Harlan Yu, explored the same ground and rebutted analysts who hold that the various harms are potential rather than concrete.

"Big data can and should bring greater safety, economic opportunity, and convenience to all people. At their best, new data-driven tools can strengthen the values of equal opportunity and equal justice. They can shed light on inequality and discrimination, and bring more clarity and objectivity to the important decisions that shape people's lives," wrote Yu, Reike, and Robinson.

"But we also see some risks. For example, inaccuracies in databases can cause serious civil rights harms. The E-Verify program, the voluntary, government-run system that employers can use to check whether new employees are work-eligible, has been plagued by an error rate that is 20 times higher for foreign-born workers than for those born in the United States. E-Verify has been under development since it was first authorized in 1996, uses data only from one fairly homogenous source — the government — and is frequently audited. Yet after nearly 20 years, persistent errors remain. This experience provides an important lesson for existing commercial systems, which are fairly new and untested, use data from widely different sources, and operate with no transparency."

The power and the responsibility that come with new technologies

Unfortunately, there's no question that in the US, poverty remains indelibly associated with race, given the country's history of slavery and long, continuing struggle to ensure civil rights to all of its people. It's a bitter legacy, and an inescapable context for apps like SketchFactor. 151 years after President Lincoln's Emancipation Proclamation and 50 years after the Civil Rights Act, we can still see African-American communities struggling with that reality in big cities around the country, Hispanic communities with undocumented residents and vulnerable immigrants, and reservations with the descendants of the peoples who lived in North America long before Amerigo Vespucci first set sail. That shared history shows how minorities or indigenous people have been subject to discrimination and denied civil rights. New technologies and tools have been used by people in power to coordinate and enable those actions, from punch cards to the telegraph.

How entrepreneurs, government officials, and media use data is ultimately both an ethical and political question, not just a technological one. Data is about power, politics, and people. It's up to all of us to watch, learn, and speak up if we see that responsibility abused.

About Alex Howard

Alex Howard writes about how shifts in technology are changing government and society. A former fellow at Harvard and Columbia, he is the founder of "E Pluribus Unum," a blog focused on open government and technology.
