Class 2: Does Cancer Cause Smoking?

Schedule

Project 1 is due before class (9:29am) on Tuesday, 22 January.

Office hours schedule:

  • Mondays, 10-11am, 254 Monroe Hall (Denis)
  • Wednesdays, 5-6:30pm, Rice 442 (Jonas)
  • Thursdays, 11am-noon (after class), Rice 507 (Dave)
  • Fridays, 10:30am-noon, Monroe Basement (Joe)

Slides


Download (full resolution) PDF

Links

Security vulnerabilities as externalities

Manos Antonakakis, Tim April, Michael Bailey, Matthew Bernhard, Elie Bursztein, Jaime Cochran, Zakir Durumeric, J. Alex Halderman, Luca Invernizzi, Michalis Kallitsis, Deepak Kumar, Chaz Lever, Zane Ma, Joshua Mason, Damian Menscher, Chad Seaman, Nick Sullivan, Kurt Thomas, and Yi Zhou. Understanding the Mirai Botnet. USENIX Security Symposium 2017.

According to one common view, information security comes down to technical measures. Given better access control policy models, formal proofs of cryptographic protocols, approved firewalls, better ways of detecting intrusions and malicious code, and better tools for system evaluation and assurance, the problems can be solved. In this note, I put forward a contrary view: information insecurity is at least as much due to perverse incentives. Many, if not most, of the problems can be explained more clearly and convincingly using the language of microeconomics: network externalities, asymmetric information, moral hazard, adverse selection, liability dumping and the tragedy of the commons.

Ross Anderson, Why Information Security is Hard – An Economic Perspective. Annual Computer Security Applications Conference (ACSAC) 2001.

Smoking and Cancer

Richard Doll and Bradford Hill. Smoking and Carcinoma of the Lung. British Medical Journal, September 1950. (This is the first paper on the hospital patients study that I mostly talked about in class, and the tables and figures in the slides were from this paper.)

Scientists in many fields have felt the need for canons of valid inference, and these have been becoming available in what are, properly, experimental sciences, by the rapid development of interest and teaching in “The Design of Experiments”.

Unfortunately, it has become obvious that many teaching departments, with mathematical but without scientific qualifications, have plunged into the task of teaching this new discipline, in spite of harbouring gravely confused notions of the logic of scientific research.

If, indeed, the statistical departments engaged in university teaching, were performing their appropriate task, of clarifying and confirming, in the future research workers who come within their influence, an understanding of the art of examining observational data, the fallacious conclusions drawn, from a simple association, about the danger of cigarettes, could scarcely have been made the basis of a terrifying propaganda.

For this reason I have thought that the fallacies must be attacked at both of two distinct levels; as an experimental scientist, and as a mathematical statistician.

Ronald A. Fisher, Alleged Dangers of Cigarette Smoking, British Medical Journal 1957.

Tobacco - A Vital U.S. Industry, pamphlet from Tobacco Institute, featuring drawing by Peter Jefferson, arguing against tobacco regulation.

Model of causes of Breast Cancer

Scott Alexander, Cancer Progress: much more than you wanted to know, Slate Star Codex, August 2018. A long, but very interesting and illuminating, post on whether or not we are “winning the war on cancer”.


Project 1: Housing Price Prediction

Project 1 is due Tuesday, 22 January, 9:29am.

You and your assigned partner (see the message in Slack) should work together on this assignment. Both team members should fully understand everything you submit. If there are parts you understand quickly but are new to your partner, it is your responsibility to explain them to your partner until everyone understands. If there are parts that your partner understands quickly but that are new to you, it is your responsibility to insist that your partner explain things to you until you understand them well.

To do Project 1, you will need to install Jupyter. Follow the directions at https://jupyter.org/install.html. (As in the directions there, we recommend using Anaconda, which will also install Python and many useful Python packages. The instructions in the project notebook assume you have installed Anaconda.)

Once you’ve installed Jupyter, download the Project 1 Notebook: project1.ipynb

Then run,

jupyter notebook project1.ipynb

to get started. This will start the local Jupyter server and open the notebook in your web browser.
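If the command above fails, a quick environment check can help narrow down the problem. The following is a minimal sketch, not part of the official project instructions: the helper names `missing_modules` and `jupyter_on_path` are hypothetical, and the package list (`notebook`, `numpy`, `pandas`) is only an assumption about what an Anaconda install typically provides, not a list taken from the project notebook.

```python
import importlib.util
import shutil


def missing_modules(names):
    """Return the names from `names` that cannot be found for import here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def jupyter_on_path():
    """Return True if a `jupyter` executable is found on the PATH."""
    return shutil.which("jupyter") is not None


if __name__ == "__main__":
    # Assumed package list; adjust it to whatever the notebook actually imports.
    wanted = ["notebook", "numpy", "pandas"]
    print("jupyter command found:", jupyter_on_path())
    print("missing packages:", missing_modules(wanted))
```

Run it with the same Python that Anaconda installed. If `jupyter` is not found or packages are reported missing, the terminal is probably using a different Python installation than the Anaconda one.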

The notebook contains the questions and starting code for Project 1. You will do your assignment by editing this file, and will submit your completed Jupyter notebook as your assignment. (We’ll provide details on how to submit later.)


Class 1: Markets, Mechanisms and Machines

Schedule

Everyone wanting to be in the class should:

  1. Submit the MMM Pre-Semester Survey
  2. Join the Course Slack

Slides

Download PDF


Updated Syllabus

The updated course syllabus is now posted: Course Syllabus.


Questions and Answers

Here are our (not yet completed) answers to the questions you asked on the Course Survey.

Getting into the course

What can I do to get into the course? Is this course going to be taught again?
Given the space constraints, we will at best be able to add a few more students to the class. If we are able to add students to the class, we will be inclined to favor students who have been keeping up with the class and showing that they are likely to make valuable contributions to it.

You mentioned it is unlikely you can expand the class size. Can I assume any changes will occur by the first day of class? I still need to finalize my schedule. Is this course likely to be offered again?
We hope to be able to offer the course again in Spring 2020, and should be able to have a larger number of students next year. (But, nothing is guaranteed about this yet.)

Do you guys have hopes of getting people off of the waitlist aside from the ones that end up dropping the class? Will you be offering this class in later semesters?
We will hopefully be able to add a few more students to the class, but we are not expecting to be able to get a bigger room - unfortunately classroom space at UVA is very limited - and the current limits are the maximum (or very close to it) we can have with the assigned room.

If I am still interested in enrolling in the course, should I attend lecture even if I am not in the class? I have a free period during the class, but do not want to be present if space in the classroom is limited.
We won’t kick anyone who wants to attend out of the classroom (unless we get complaints from the fire marshal). The room is fairly small, but everyone should feel free to come, and in the unlikely event that there is a space problem, we’ll try to sort out a solution.

Background expectations

The course syllabus mentions machine learning and econometrics, neither of which I am too familiar with. Would it be better to try to take this class after I’ve taken classes that have overlapping concepts?
If you’re in the CS section of the course, there are no required Economics prerequisites, so you are definitely not expected to have taken econometrics or machine learning. If you are in the Econ section, you need to satisfy the prerequisite (which does include an Econometrics course).

The lectures should be accessible to students without any background beyond the prerequisites (so, we won’t assume any CS or economics background that students in the other section wouldn’t be expected to have). For the projects, you’ll be working in teams that include students in both sections, and we hope you’ll be able to work together and learn from each other in ways that benefit from your different backgrounds in CS and Economics.

Why we’re teaching the course

What inspired you guys to teach this course?

What were your motivations for co-teaching a course like this?
See Class 2.

To what extent can computer science be used to solve problems in economics, and vice versa? Are there any limitations to how we can solve these problems?
At some level, all problems in science are computing problems, and all practical problems are economic ones.

More concretely, a set of practical areas has emerged in the past 10-15 years where Economics and Computer Science are equal partners. For instance, the tasks of optimal routing (one of the “classical” questions in Computer Science) are now proved to be impossible to solve without invoking the idea of a Nash equilibrium. On the other hand, market design (which is a large area within Economics) often reduces to a computational problem, and solving it is often impossible without using ideas from Computer Science such as computational complexity.

Grading

How is our grade broken down?
The updated syllabus has a rough grade breakdown. But, we don’t grade by a simple formula based on weighting each assignment. We’re looking for evidence to support the highest justifiable grade based on everything you’ve done in the class.

How will this class be structured? I’m intrigued by the idea of exploring a different structure compared to a typical class.
This is our first time teaching a class like this, so some things we’ll figure out as we go, and we definitely appreciate feedback and suggestions from students about how things are working or could work better. For the class meetings, we will mostly follow a fairly traditional (but hopefully engaging and enlightening!) lecture format, with both sections combined. If it seems useful to occasionally split into two groups, we might do that. For the projects, we will have students working in interdisciplinary teams, and hope that students will learn a lot from this experience and from working with teammates with different backgrounds to solve problems that require knowledge and skills in both Economics and Computing.

Honestly, I’m really worried about a final being worth 50% of the grade. Would love to know more about what that’s about and how that is going to work.
Sorry, this was a mistake in a preliminary version of the syllabus. We don’t actually plan to have any final exam in this class. (See the updated syllabus, which will be finalized before the first class, for more details.)

I’d like to gain experience with data analytics/practical business applications of programming – learning the hard skills. Will this class give me an opportunity to do that, or will I mostly focus on the Econ/theory side of things? I’m not sure if the CS content will be too high-level for me to grasp and just go over my head, especially because half of the class is majoring in CS. I’d appreciate your thoughts on this.
The course will have 5 assigned projects and 1 final project. All of them will be based on analyzing real datasets, and we require using Python for this analysis. The course does not have CS prerequisites for the Economics students, so we are not expecting you to necessarily have any previous programming experience. The course will include an overview of the concepts from Economics and Computer Science that will be used. We also expect that interaction between Economics and Computer Science students will help mutual learning. The material will combine Economic theory with concepts from Computer Science and use Econometric and Machine Learning methods for bringing this theory to data.

What does it take to do well in this course?
Being able to tackle open-ended problems and work creatively to find good solutions, working well in a team, being able to express yourself well in writing (and in code or mathematics), and being open to learning new things and going beyond what is provided. Active participation in the class as well as contributing well to successful projects will be a sufficient condition for getting an A in the class.