Is it wise to trust important decisions to an algorithm that might not have been validated by independent parties?
Data-driven algorithms are making decisions that affect many aspects of our lives, and that may be a problem.
"While there may be efficiency gains from these techniques, they can also harbor biases against disadvantaged groups or reinforce structural discrimination," writes Nicholas Diakopoulos, assistant professor of journalism at the University of Maryland, in his piece for The Conversation, "We need to know the algorithms the government uses to make important decisions about us." "The public needs to understand the bias and power of algorithms used in the public sphere."
The bias and potential for error Diakopoulos alludes to tend to slip under the radar until an algorithm-based decision negatively impacts individuals or organizations. That concerns Diakopoulos for the following reasons:
- Data-driven algorithms are used to massage massive amounts of data into usable information. However, the data is messy, and the processing even more so. That being the case, how does one know if the results are accurate and trustworthy?
- Individuals willing to take the time to validate the output of an algorithm-driven system often run into problems because of a lack of transparency: developers are unwilling to hand over what might be considered trade secrets and proprietary software to third parties.
A case study in algorithmic transparency
To give credence to his concerns, Diakopoulos looked into how law enforcement uses data-driven algorithms. "Last year (2015) the federal government began studying the pros and cons of using computerized data analysis to help determine prison inmates' likelihood of reoffending upon release," writes Diakopoulos. "Scoring individuals as low-, medium-, or high-risk can help with housing and treatment decisions, identifying people who can safely be sent to a minimum security prison or a halfway house, and those who would benefit from a particular type of psychological care."
The first step to determining an inmate's risk of recidivism, according to Diakopoulos, begins with filling out scoresheets. He says, "The form itself, as well as its scoring system, often discloses key features about the algorithm, like the variables under consideration and how they come together to form an overall risk score."
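To make concrete what such a scoresheet can disclose, here is a minimal sketch of a points-based form. Everything in it is an assumption for illustration: the item names, the point values, and the low/medium/high cutoffs are invented and are not taken from any real risk-assessment instrument.

```python
# Hypothetical scoresheet. The items, point values, and cutoffs below are
# invented for illustration; they do not come from any real instrument.
SCORESHEET = {
    "prior_offenses": 3,
    "missed_appointments": 2,
    "unstable_housing": 1,
}

# Inclusive upper bounds for each band, checked in order.
# Any score above the last bound is treated as "high".
BANDS = [(2, "low"), (4, "medium")]


def risk_score(answers):
    """Add up the points for every item answered True."""
    return sum(
        points for item, points in SCORESHEET.items() if answers.get(item)
    )


def risk_band(score):
    """Map a numeric score to a low, medium, or high label."""
    for upper_bound, label in BANDS:
        if score <= upper_bound:
            return label
    return "high"


if __name__ == "__main__":
    answers = {"prior_offenses": True, "unstable_housing": True}
    score = risk_score(answers)
    print(score, risk_band(score))
```

Reading a form like this reveals which variables count and how they are weighted. It does not reveal why those variables, weights, and cutoffs were chosen, or whether anyone checked them against real outcomes.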
However, that is not enough, according to Diakopoulos. Algorithmic transparency also requires information on how the forms were designed, developed, and evaluated. As to why this is important, he says, "Only then can the public know whether the factors and calculations involved in arriving at the score are fair and reasonable, or uninformed and biased."
Less than transparent
One of the reasons Diakopoulos decided to research criminal justice was the ability to use the Freedom of Information Act (FOIA) and similar state laws to get information about the forms and any supporting documentation. Diakopoulos, his colleague Sandy Banisky, and her media law class submitted FOIA requests in all 50 states. "We asked for documents, mathematical descriptions, data, validation assessments, contracts, and source code related to algorithms used in criminal justice: such as for parole and probation, bail, or sentencing decisions," writes Diakopoulos.
Getting the information was anything but easy; even figuring out whom to ask was difficult. To make matters worse, several states denied the researchers' requests, explaining that the algorithms are embedded in software and are therefore not subject to the FOIA statutes.
Interestingly, nine states refused to disclose any information about their criminal justice algorithms, stating the software tools were privately owned. One example, offered by Diakopoulos, was LSI-R, a recidivism risk questionnaire.
The list of refusals goes on, making it painfully apparent why Diakopoulos is concerned about transparency. So much so that he asks, "[G]iven the government routinely contracts with private companies, how do we balance these concerns against an explainable and indeed legitimate justice system?"
Even more to the point, Diakopoulos mentions that the research team did not receive any information on how the criminal justice risk-assessment forms were developed or evaluated.
The bottom line
Besides law enforcement, algorithms are making decisions related to search engine personalization, advertising systems, employee evaluations, banking/finance, and political campaigns, to name a few.
"These algorithms can make mistakes," suggests Diakopoulos on his site. "They sit in opaque black boxes, their inner workings hidden behind layers of complexity. We need to get inside that black box, to understand how they may be exerting power on us, and to understand where they might be making unjust mistakes."