Pro, Con, and Affinity Tagging of Product Reviews
Todd Sullivan
Stanford's Natural Language Processing Course
Final Project
Stanford Department of Computer Science
In this project I created several systems for tagging product reviews with the pros, cons, and affinities implied by the review text and by other information such as the review's rating and the author's location. Although I nominally had four weeks to complete the project, I effectively had only three because of overlap with the class's third project. I explored many techniques, including various preprocessing methods, a bag-of-words Naive Bayes baseline classifier, a maximum entropy classifier, and a combinatorial optimization algorithm for finding optimal tag sets. PowerReviews, which operates the product review portal Buzzillions.com, provided the product review data.

