Production-Capable Extraction of Pros and Cons from Product Reviews

Todd Sullivan
PowerReviews

In this project I created a pro/con tag extraction system whose precision (95%) and recall (44% to 90%, depending on the product category) are high enough for use in a real-world environment. The system departs drastically from my previous work on product reviews; I devised it because the PowerReviews data is inadequate for machine-learning-based methods of pro/con tag extraction. (Note that my previous system does work well when properly labeled training data is available.)

Technical Report