When Cathy O’Neil’s book, “Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy,” lit up BlastPoint’s radar a few months ago, we decided to make it the topic for our next book discussion. Everyone at the company got their hands on a copy.
After all, the book is about algorithms—the foundational bricks that uphold the very structure of our business—and how they’re shaping society. As O’Neil, a Ph.D. mathematician, former professor, former Wall Street analyst, author and Bloomberg Opinion columnist, explains in “WMD,” math is, unfortunately, often used in nefarious ways.
“Whether or not a model works is…a matter of opinion,” she writes. “In each case, we must ask not only who designed the model but also what that person or company is trying to accomplish.” (p. 21)
With this premise in mind, our team has been reflecting on our collective—and sometimes personal—roles and responsibilities within the so-called big data revolution. While we agree with some of O’Neil’s arguments and not with others, we’re all truly riveted by this topic. One common view continues to dominate our discussions:
We pride ourselves on being part of the solution by using data to strengthen communities.
Below are some highlights from our ongoing conversations.
Approximating Information is Sometimes the Best, and Only, Option We Have
In Chapter 4, O’Neil shares the example of how US News & World Report gathered data to create a college ranking report years ago. As a starting place, they used a “series of hunches” that included “people wondering what matters most in education, then figuring out which of those variables they could count, and finally deciding how much weight to give each of them in the formula.” (p. 52)
That may be how some models start out—as hunches—because hunches are often the best, and only, options available for building a framework that can later be filled in and tweaked.
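The kind of “hunch” model O’Neil describes can be sketched in a few lines: pick the variables you can count, assign each a weight by judgment, and rank by the weighted sum. The variable names and weights below are purely illustrative assumptions, not US News’s actual formula.

```python
# A hypothetical hand-weighted ranking formula in the spirit of the
# US News approach: countable variables, weights chosen by hunch.
# These names and weights are invented for illustration only.

WEIGHTS = {
    "graduation_rate": 0.40,   # hunch: outcomes matter most
    "selectivity": 0.35,       # hunch: harder to get into = better
    "alumni_giving": 0.25,     # hunch: a proxy for alumni satisfaction
}

def score(college: dict) -> float:
    """Weighted sum of metrics already normalized to a 0-1 scale."""
    return sum(WEIGHTS[k] * college[k] for k in WEIGHTS)

colleges = {
    "College A": {"graduation_rate": 0.92, "selectivity": 0.80, "alumni_giving": 0.30},
    "College B": {"graduation_rate": 0.85, "selectivity": 0.95, "alumni_giving": 0.60},
}

# Rank highest score first; changing a weight can reorder the list,
# which is exactly O'Neil's point about whose opinion the model encodes.
ranked = sorted(colleges, key=lambda name: score(colleges[name]), reverse=True)
```

Notice that the ranking is entirely a product of the chosen weights: shift a few hundredths from one variable to another and a different college comes out on top.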
In 2018, BlastPoint helped Three Rivers Youth, a Pittsburgh-based nonprofit organization, determine the best location to open a new addiction recovery center. But because of the stigma associated with drug use, and because people don’t necessarily want to share private, sensitive information about it, gathering publicly recorded overdose-related death statistics from one neighborhood versus another was one of the only “hunches” we had to go on to explore viable options. To offer a correlating framework, we looked at the number of households where grandparents were the primary caregivers of their grandchildren in targeted neighborhoods.
Was this information limited? Of course. But was it extremely useful to point us and our customer in the right direction? Absolutely. In the end, Three Rivers Youth was able to take that framework to pinpoint where the community’s needs were most dire.
Data is Discriminatory if Used Without Intentional Safeguards
Hiring practices around the world rely heavily on automated personality tests to weed out candidates, yet such tests don’t necessarily reveal a potential employee’s ability to perform the actual work in question (p. 108).
While many tests illuminate what it might be like to work with a person, should they be hired, some personality assessments actually discriminate unintentionally. Why? Because they’re designed using proxies (e.g., ranking prestigious colleges that candidates list on their résumés higher than others, p. 119) which, the author says, are inexact and often unfair (p. 108). We don’t necessarily disagree.
For example, the way in which questions are worded can lead to biased results, based on how a candidate answers. O’Neil cites a Wall Street Journal article wherein industrial psychologist Tomas Chamorro-Premuzic was asked to evaluate statements that job candidates had to rate themselves on, such as these, asked in a McDonald’s questionnaire: “It is difficult to be cheerful when there are many problems to take care of,” and “Sometimes, I need a push to get started on my work.”
Chamorro-Premuzic explained that the first question would point to “individual differences in neuroticism and conscientiousness,” while the second would reveal “low ambition and drive” (p. 110).
A similar case arose several years ago with a screening tool, designed by an outside vendor, that CVS Pharmacy was using (p. 109). The ACLU claimed the test “could have the effect of discriminating against applicants with certain mental impairments or disorders, and go beyond merely measuring general personality traits” (Source: ACLU of Rhode Island).
Made aware of the adverse “filtering” effects these kinds of hiring practices create, Xerox took data about employee retention and decided to intentionally build anti-discrimination safeguards into its predictive churn model, a tool that can help determine employee longevity.
The company noticed that employees who had long commutes were more likely to leave sooner than those who had shorter commutes (p. 119). Where those long-range commuters lived became relevant: Xerox realized these employees were coming from poor neighborhoods.
And because Xerox does not want to discriminate “on the basis of race, color, religious belief, sex, age, national origin, citizenship status, marital status, union status, sexual orientation or gender identity” (source: Xerox.com), or, presumably, socioeconomics, it removed “commute time” from its churn model, so as not to exclude people who happen to live in poor neighborhoods from fairly applying for job openings.
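The safeguard Xerox applied can be sketched as a deliberate feature-exclusion step before any model is trained. The record fields and the `EXCLUDED_PROXIES` set below are hypothetical; the point is that the proxy variable is stripped out on purpose, not merely left unmentioned.

```python
# Sketch of an intentional anti-discrimination safeguard: drop features
# known to act as proxies for protected attributes before model training.
# Field names here are invented for illustration; "commute_time" stands in
# for the variable Xerox removed because it correlated with neighborhood.

EXCLUDED_PROXIES = {"commute_time"}

def safeguarded_features(record: dict) -> dict:
    """Return a copy of a training record with proxy features removed."""
    return {k: v for k, v in record.items() if k not in EXCLUDED_PROXIES}

employee = {
    "tenure_months": 18,
    "performance_score": 4.2,
    "commute_time": 55,   # dropped: proxies for where someone lives
}

clean = safeguarded_features(employee)
# 'clean' now carries only tenure_months and performance_score,
# so the downstream churn model never sees the proxy variable.
```

The design choice worth noting is that the exclusion list is explicit and reviewable: anyone auditing the pipeline can see which variables were deemed proxies and why.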
We think this kind of intentional decision-making allows data to be used for good, especially when companies are able to illuminate information that was previously hidden. Used conscientiously, data can provide more insight into the bigger picture, allowing us to see an issue from a wider vantage point and therefore make us better able to appreciate nuance.
“Color Blind” Data Can Still Lead to Bias
O’Neil’s book takes a disturbing look at racial bias with respect to technology across sectors, throughout history and in the present day. Discrimination’s legacy lives on where data is used as a WMD: in the justice system (p. 25), in education (p. 69), and in many other realms.
We know that racial discrimination—along with discrimination by age, gender, religion, physical ability, socioeconomic status, language and more—permeates everyday life, some of it driven by technology. Even when our datasets or formulas don’t include race as a factor, bias creeps in anyway, based on everything else that surrounds them.
Take the now-infamous automatic soap dispenser that didn’t recognize dark skin tones and therefore didn’t dispense soap. It turned out that the light-sensitive device was designed by a homogeneous team of engineers that lacked any designers of color, and the product was never tested on a person whose skin tone differed from theirs.
That’s why we here at BlastPoint prioritize diversity. We’ve built hiring practices that work to combat gender, class and cultural bias. When great minds from different backgrounds come together, unique perspectives flourish, new ideas and fresh solutions thrive, and bias can be overcome.
We believe that sharing where-I’m-coming-from with where-you’re-coming-from can only strengthen models. It’s what creates better access and brings transparency, whether we’re designing a software tool, planning a marketing campaign or helping a customer choose where to open a new store.
We’re committed to strengthening our community in the work that we do by making our platform more accessible to organizations that will use it for the common good. If you’d like to read more about that, check out this spotlight interview by Jamillia Kamara of the Forbes Funds covering BlastPoint’s 2017 UpPrize nomination.
We make decisions intentionally, working hard each day to safeguard our tools so that bias does not creep in. And we’re steadfast in putting our algorithms to work for humanity, not against it.