11.12.2017

Binary Vs Star Rating Systems (Netflix's Paradigm)

Building on our earlier claims about binary rating systems, we use Netflix's recent pivot to a binary (like/dislike) rating system to share our take on why this shift was data-driven and how it could affect Netflix's recommendation engines.

When it comes to motion pictures, the battle of rating systems has been fought on a star-paved battlefield. Ten-star systems seem to have the upper hand over five-star ones, but no one can be certain whether there will ever be a winner.

In mid-April 2017, though, Netflix made a major change to the way people express their preferences. Following the example of YouTube and Reddit, as mentioned in our previous blog post "Analysing the Star Rating System. When it comes to knowing users’ behaviour is it really written in the stars?", it shifted from a star rating system to a binary Like/Dislike one.

As expected, the vast majority of users and websites tore this decision to shreds. Beyond the change aversion that predictably kicked in right after the pivot, Netflix is still being criticised for it.

The million-dollar question remains: will Netflix turn this aversion into delight, or at least reach a point where users are fine with the change (neutral, as seen on the chart above)?

Truth be told, pretty much any change, regardless of platform, is at first met with user aversion. However, introducing it by comparing a movie streaming platform with the swipe-left-and-right cornerstone feature of a dating app is certainly not the way to ease users' concerns.

This blog post is not about Netflix but about the benefits of binary rating systems, especially when the ratings feed a recommendation engine. Still, Netflix had every reason in the world to go ahead with something so “controversial”.

  • Netflix is not here to promote art; it is here to offer its users the most satisfying experience as soon as possible.

You are probably having a “duh” moment at how obvious this point is, but it is the one users tend to “forget” the most, even though they are reminded of it every time the Skip Intro button pops up. Netflix's main concern is to understand what users like, not just to keep them engaged. That understanding also fuels its decisions about new content planning.

  • Netflix wants to make its personalisation more transparent, in order to improve it.

As Netflix’s Cameron Johnson explained in a previous interview, Netflix’s star ratings were not an average score derived from the entire user base. On the contrary, they reflected a model’s prediction of how highly a given user would rate a movie, something that, I dare say, the majority of users did not know. To change that perception, star ratings have been replaced with percentages, numbers that fit much better with the message “this is how much we think you would like this”. Which takes us to the next point.

  • Recommendation systems need ratings.

Not just as a one-off, but as an iterative process. After pivoting to a like/dislike system, Netflix says it recorded a 200% increase in ratings.

  • Lastly, by changing its rating philosophy, Netflix “forced” its customers' hand, or better, their thumb.

Watching a movie, re-watching it, or stopping/pausing it and returning (or not) to watch the rest are clear and important indicators of user preference, and features to be used in a machine learning model. If you ask me, they are far more important than leaving a rating.

Sometimes, though, users behave in ways that baffle recommendation models, and stars encourage this kind of behaviour. Guilty-pleasure films are a great example. On the one hand, because they want sophisticated suggestions, users give these films relatively bad ratings; on the other hand, they keep watching them (blame it on nostalgia, I guess, but it still poses a problem for Netflix).
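To make the guilty-pleasure mismatch concrete, here is a minimal Python sketch of how an explicit thumbs rating could be set against viewing behaviour. Everything in it is our own assumption for illustration: the `ViewingRecord` fields, the 0.9 completion threshold, the helper names and the made-up records do not describe Netflix's actual features or models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ViewingRecord:
    """One user/title pair. All fields are hypothetical, for illustration only."""
    title: str
    fraction_watched: float    # 0.0 to 1.0, share of the runtime actually played
    rewatch_count: int         # how many times the user started the title again
    thumbs_up: Optional[bool]  # explicit rating; None if the user never rated


def implicit_like(record: ViewingRecord, completion_threshold: float = 0.9) -> bool:
    """Treat finishing or re-watching a title as an implicit positive signal."""
    return (
        record.fraction_watched >= completion_threshold
        or record.rewatch_count > 0
    )


def is_guilty_pleasure(record: ViewingRecord) -> bool:
    """Flag titles the user explicitly rated down but keeps watching anyway."""
    return record.thumbs_up is False and implicit_like(record)


# Made-up records, not real viewing data.
history = [
    ViewingRecord("Title A", fraction_watched=1.0, rewatch_count=3, thumbs_up=False),
    ViewingRecord("Title B", fraction_watched=0.2, rewatch_count=0, thumbs_up=False),
    ViewingRecord("Title C", fraction_watched=0.95, rewatch_count=0, thumbs_up=True),
    ViewingRecord("Title D", fraction_watched=1.0, rewatch_count=1, thumbs_up=None),
]

guilty = [record.title for record in history if is_guilty_pleasure(record)]
print(guilty)
```

In this toy setup only the title that was rated down yet watched to the end and re-watched gets flagged. A real model would more likely weigh such signals than apply a hard rule, but the sketch shows why behaviour and explicit ratings can point in opposite directions.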

Free text Vs Binary systems

Our take combines the best of binary and star systems. It takes into account the nature of e-commerce websites and relies on the undeniably strong influence of language and on the steadily growing accuracy and reliability of Sentiment Analysis tools. The feedback system we will present below is not as simple as a star or binary rating system, but it is definitely more informative. It is not a solution tailored to Netflix, but that does not mean it would not work for Netflix too, since Netflix used to have a considerable number of written reviews, as mentioned in the Where Have All the Netflix Reviews Gone? part of the linked article. Our goal is to offer a meaningful, insightful and accurate alternative to traditional rating systems.

To test our hypothesis we needed a dataset. We found what we were looking for at www.airlinequality.com, created by Skytrax. Skytrax is an independent customer forum that has exactly the data needed to test our assumption. Specifically, we needed:

  • Both written reviews and star ratings 
  • Reviews we could trust (most users have uploaded their ticket as proof they actually flew) 
  • A binary flag that will show us, regardless of a user’s written review or star rating, their overall experience with the airline during a trip 
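The three requirements above map onto a very simple record structure. The Python sketch below is only an illustration under our own assumptions: the field names `text`, `stars` and `recommended`, the threshold of 4 and the placeholder reviews are hypothetical and are not Skytrax's schema or data. It measures how often a star rating, cut at a threshold, agrees with the binary flag.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Review:
    """Hypothetical review record; field names are assumptions, not a real schema."""
    text: str          # free-text written review
    stars: int         # star rating given by the user
    recommended: bool  # binary flag for the overall experience


def star_flag_agreement(reviews: List[Review], threshold: int) -> float:
    """Share of reviews where 'stars >= threshold' matches the binary flag."""
    if not reviews:
        raise ValueError("need at least one review")
    matches = sum((r.stars >= threshold) == r.recommended for r in reviews)
    return matches / len(reviews)


# Placeholder reviews written for this example, not taken from any dataset.
sample = [
    Review("Placeholder: smooth flight, friendly crew.", stars=5, recommended=True),
    Review("Placeholder: long delay and lost luggage.", stars=1, recommended=False),
    Review("Placeholder: average seat, but I would fly again.", stars=3, recommended=True),
    Review("Placeholder: fine on paper, never again.", stars=4, recommended=False),
]

agreement = star_flag_agreement(sample, threshold=4)
print(f"Star rating agrees with the binary flag in {agreement:.0%} of reviews")
```

The same comparison could be repeated with a sentiment score derived from `text` in place of `stars`, which is the kind of check the binary flag makes possible.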
