Quantifying Yelp Reviews

In this study, we examined how well machine learning can predict the rating of a Yelp review from its text. Evaluating a business on Yelp from a long list of reviews is difficult. We therefore set out to determine whether a machine learning algorithm could capture enough of the correlation between a review and its rating to convert reviews into a set of quantitative metrics that reflect customers' evaluations of the business. We conducted two experiments: the first measured how the number of reviews passed to a model affects its ability to predict ratings accurately, and the second identified which text feature extraction methods yield the highest prediction accuracy. We found that reviews are highly ambiguous and that there is no set standard for what a review of a given rating contains. Of the feature extraction methods we examined, sentiment analysis and word polarity analysis resulted in the lowest mean-squared error.
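
To make the evaluation concrete, the following is a minimal sketch of how a word polarity feature could be mapped to a rating and scored with mean-squared error. It is not the pipeline used in the study: the tiny lexicon, the toy reviews, and the helper names (`polarity_score`, `fit_line`, `mean_squared_error`) are all hypothetical, and a one-variable least-squares fit stands in for whatever model the experiments actually used.

```python
# Illustrative only: the lexicon and reviews below are made up for this sketch.
POSITIVE = {"great", "delicious", "friendly", "amazing"}
NEGATIVE = {"terrible", "rude", "cold", "slow"}


def polarity_score(review):
    """Count positive words minus negative words in a review."""
    words = review.lower().split()
    positives = sum(1 for w in words if w in POSITIVE)
    negatives = sum(1 for w in words if w in NEGATIVE)
    return positives - negatives


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        # All feature values identical: fall back to predicting the mean.
        return 0.0, mean_y
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


def mean_squared_error(y_true, y_pred):
    """Average squared difference between true and predicted ratings."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy (review text, rating) pairs; real data would come from Yelp reviews.
reviews = [
    ("great food and friendly staff", 5),
    ("delicious but slow service", 3),
    ("terrible meal and rude waiter", 1),
    ("amazing great delicious", 5),
    ("cold food", 2),
]

features = [polarity_score(text) for text, _ in reviews]
ratings = [rating for _, rating in reviews]

slope, intercept = fit_line(features, ratings)
predictions = [slope * x + intercept for x in features]
print("MSE:", mean_squared_error(ratings, predictions))
```

A lower mean-squared error means the predicted ratings sit closer to the ratings the reviewers actually gave, which is the sense in which one feature extraction method can be said to outperform another.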