Project 2

Project 2 is now posted, and is due Tuesday, 4 February (9:29am).

For Project 2, you will develop your model from Project 1, with a goal of improving the “Zestimate” residual error using the data for 2017 in the Kaggle Zillow prize competition. The Kaggle competition has concluded, so you’ll have to settle for learning, wisdom, and glory in place of Zillow’s $1.2M prize.
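
As a reminder of what "residual error" means here, the following is a minimal sketch, assuming the Kaggle competition's definition of the target (log error = log(Zestimate) − log(SalePrice)) and a mean-absolute-error score. The function names `log_error` and `mean_absolute_error` are illustrative and not part of any provided starter code, and the in-class competition may score models differently.

```python
import math

def log_error(zestimate, sale_price):
    # Assumed target definition: natural log of the Zestimate
    # minus natural log of the actual sale price.
    return math.log(zestimate) - math.log(sale_price)

def mean_absolute_error(predicted, actual):
    # Average absolute difference between predicted and actual log errors.
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)
```

Computing this score on data your model did not see during training gives a more honest estimate of how it might do on the secret test set than the training score does.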

We will run a competition in class on February 4 to see which model works best on a testing data set (secret until then). You should follow the spirit of this project and not try to cheat. For an extreme example of the kind of overfitting we talked about last class, see How a Kaggle Grandmaster cheated in $25,000 AI contest with hidden code – and was fired from dream SV job. If you do discover a way to “cheat”, you can get “bonus points” for this, but please don’t use it in the model you submit for the competition.