This syllabus is for a class designed to teach students to recognize where and understand why ethical issues can arise when applying data science to real world problems.
INFO 4270: ETHICS AND POLICY IN DATA SCIENCE
Mondays and Wednesdays 2:55-4:10PM Hollister Hall 162
Solon Barocas (Professor)
Gates Hall 211
Office hours: Mondays 4:30-6:30PM and by appointment
Brian McInnis (Teaching Assistant) firstname.lastname@example.org
Gates Hall G19
Office hours: Wednesdays 9:00-11:00AM
COURSE DESCRIPTION AND OBJECTIVES
This class will teach you to recognize where and understand why ethical issues and policy questions can arise when applying data science to real world problems. It will bring analytic and technical precision to normative debates about the role that data science, machine learning, and artificial intelligence play in consequential decision-making in commerce, employment, finance, healthcare, education, policing, and other areas. We will focus on ways to conceptualize, measure, and mitigate bias in data-driven decision-making, to audit and evaluate models, and to render these analytic tools more interpretable and their determinations more explainable. You will learn to think critically about how to plan, execute, and evaluate a project with these concerns in mind, and how to cope with novel challenges for which there are often no easy answers or established solutions.
To do so, you will develop fluency in the key technical, ethical, policy, and legal terms and concepts that are relevant to a normative assessment of data science; learn about some of the common approaches and emerging tools for mitigating or managing these ethical concerns; and gain exposure to legal scholarship and policy documents that will help you understand the current regulatory environment and anticipate future developments. Ultimately, the class will teach you how to reason through these problems in a systematic manner and how to justify and defend your approach to dealing with them.
All course materials will be available on Blackboard.
We will read critical commentary and thoughtful reflections by seasoned practitioners, important and illustrative research from computer scientists, an interesting mix of legal scholarship, moral philosophy, and policy analysis, and a host of government documents. All along the way, we will rely on case studies, recent controversies, and current events to ground our discussion.
The appropriate response to many of the problems that we will address in the course is far from settled. This is, consequently, a reading-heavy course. Even so, the assigned readings frequently do not present all sides of the debate. I have therefore selected materials that tend to offer a more critical—and sometimes less familiar—perspective with the goal of provoking productive debate during our class and strong reactions in your assignments. I expect you to stake out conflicting—informed and carefully reasoned—positions on the issues, and you should not shy away from doing so.
The lecture, discussion, and in-class activities will cover most, but not all, of the issues raised by the readings. Given the nature of the issues and material under consideration, I expect lively debate and plan to follow the natural flow of discussion as much as possible. As such, I am certain that class will cover some important ideas that do not appear in the readings. Active listening and participation are therefore crucial.
20% Participation (both in-class and on Blackboard)
15% Critical review of proposed data science project
25% Response to Consumer Financial Protection Bureau’s Request for Information
40% Final paper revisiting a recent controversy
I expect you to abide by Cornell’s Code of Academic Integrity at all times. Please note that the Code specifically states that a “Cornell student's submission of work for academic credit indicates that the work is the student's own. All outside assistance should be acknowledged, and the student's academic position truthfully reported at all times.”
Please contact me or the TA if you have any questions or concerns about appropriately acknowledging others’ work in your submitted assignments. You should expect that I will rigorously enforce the Code and may use software to check for plagiarism.
SCHEDULE AND READINGS
I expect you to complete all assigned readings prior to class. Unless I’ve noted particular parts, sections, or pages for you to read, you should read the assigned text in its entirety. For some classes, I have listed recommended readings that you may choose to complete, if you are so inclined. These are optional, and I will not expect that you have read them.
The schedule and readings are subject to change as we progress through the semester. Please always refer to the syllabus posted to Blackboard before you begin reading for the next class.
Background Reading [Optional]
August 23 — Welcome
August 28 — Data, the givens
August 30 — What problem are we solving?
September 4 — Labor Day — No class
September 6 — Cultivating a critical disposition
September 11 — Bias and exclusion
September 13 — The social science of discrimination
September 20 — Auditing algorithms
September 25 — Algorithms audited
September 27 — Formalizing and enforcing fairness
October 2 — Accounting for disparities in accuracy and error rates [Manish Raghavan, a doctoral student in computer science at Cornell and co-author of one of the assigned readings, will join us for this class]
October 4 — Competing notions of fairness
October 9 — Fall break — No class
October 11 — Feedback loops and fairness
October 16 — The fairness of different factors
October 23 — From allocative to representational harms
October 25 — Transparency and due process
October 30 — Interpretability in machine learning
November 1 — The value of explanation
November 6 — The future of scoring
November 8 — The privacy implications of inference
November 13 — Price discrimination
November 15 — Insurance
November 20 — Algorithmic persuasion and manipulation
November 22 — Thanksgiving — No class
November 27 — Algorithmic publics
November 29 — Rejecting certain applications of machine learning