Amazon seems to think I'll like "What To Expect When You're Expecting", and Netflix suggests I watch "The X Files", but neither is likely to appeal to me. Even though I've told Amazon which movies I own and have rated over 1300 movies on Netflix, both of their recommendation engines have a miserable success rate.
They're not alone. Every time I hear about a new movie recommendation website I go and try it out, and invariably I'm presented with a list of movies that suits me no better than a random selection.
The problem with Amazon and Netflix is that their recommendations are based on what other people like. If I liked "Gosford Park" then they assume I'll like every other movie favored by other people who liked "Gosford Park". That's just plain silly. My wife likes a lot of the same movies I do, but there are a significant number of them we disagree on - sometimes strongly. Statistical analysis of the preferences of huge numbers of people can help a bit, but then you only get presented with the popular movies - and if your tastes don't tend towards what the majority likes, you're screwed.
There are a couple of recommendation engines out there that try to improve on this by evaluating movies based on objective criteria, and then matching those criteria to the individual based on a survey and rated movies - Jinni is a good example of this sort. It was released with much fanfare and sounded like the movie equivalent of Pandora. I tried it out, spending a couple of hours rating movies, only to have it routinely recommend movies I've seen and hated.
What went wrong?
While I do tend to prefer certain kinds of films (action, sci-fi, fantasy) over others (horror, romance), genre isn't a good predictor of whether I'll like a movie, and neither are the other objective criteria that Jinni uses. There are a lot of movies that would seem to be a match for me that I absolutely despise, and an equal number (or more) which don't match that I love.
The odd thing is that many years back, when I was still renting movies from a video store, I routinely got recommendations with a phenomenal hit rate. These suggestions came from an unlikely source - one of the video store employees, now fondly remembered as "The Guy".
In some ways, The Guy was kind of scary. He'd seen almost every single movie in the store and remembered them all. That's not the scary part. He also remembered the major and minor actors and actresses in each film and the kind of roles they played in the story. That's not the scary part either. The scary part is that he also remembered the director, producer and cinematographer, and could come up with a list of other films that any of them had ever worked on, sorted into the phases each of them seemed to be going through at that stage of their life's work. The Guy seemed to have all of the IMDB inside his head, cross-indexed with every movie critic ever.
If you were standing and staring at the shelves long enough, The Guy would walk up and ask you what sort of film you were interested in. You'd answer with a genre or a movie you liked and he'd instantaneously dissect it into component parts and then recommend something based upon these parts.
He never recommended a movie to me that I didn't like.
Later I realized that this feat that is so incredible in a human being would actually be quite easy for a computer program. All that's needed is a reasonably complete database of movies - and, what do you know, there is one. IMDB has all sorts of data on movies. I thought about the algorithm a bit and then wrote a simple program to try it out.
The program takes as its input a movie the user likes. It then gets a list of every single person who worked on the film - I ignored jobs like gaffer and dolly grip, and stuck to those like actor, director, and writer. Since the user likes the movie, all of those people get a +1.
The program then gets a list of every film each of these people worked on, and gives each of those films a +1. Note that if a film has two people in common with the liked film, it receives the bonus from each of them.
When any of these films is rated, all the people involved have their rating changed accordingly, +1 for liked and -1 for disliked. The list of films is recalculated and re-presented.
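For anyone who wants to see the scoring spelled out, here is a minimal sketch of it in Python. It is not my original program, just an illustration of the same idea. The CREDITS table and all of the film and person names in it are made up, standing in for a real movie database, and leaving already-rated films out of the list is my own assumption.

```python
from collections import defaultdict

# Hypothetical credits table: film title -> people in the roles that count
# (actor, director, writer). A real version would load this from a database.
CREDITS = {
    "Film A": {"Actor 1", "Director 1", "Writer 1"},
    "Film B": {"Actor 1", "Director 1"},
    "Film C": {"Actor 1", "Writer 2"},
    "Film D": {"Actor 2", "Writer 2"},
}


def recommend(ratings, credits=CREDITS):
    """Rank unrated films. `ratings` maps film title -> +1 (liked) or -1 (disliked)."""
    # Everyone who worked on a rated film gets that film's rating added to their score.
    person_score = defaultdict(int)
    for film, rating in ratings.items():
        for person in credits[film]:
            person_score[person] += rating

    # A film's score is the sum of the scores of the people who worked on it,
    # so a film sharing two people with a liked film gets the bonus twice.
    film_score = {}
    for film, people in credits.items():
        if film in ratings:
            continue  # assumption: don't recommend what has already been rated
        if not (person_score.keys() & people):
            continue  # nobody in common with any rated film
        film_score[film] = sum(person_score.get(p, 0) for p in people)

    # Highest score first; ties are broken alphabetically by title.
    return sorted(film_score.items(), key=lambda item: (-item[1], item[0]))


print(recommend({"Film A": 1}))
print(recommend({"Film A": 1, "Film C": -1}))
```

Each call recalculates the whole list from the ratings so far, which is the "recalculated and re-presented" step. In the second call, disliking Film C cancels out the bonus Actor 1 earned from Film A and pushes Writer 2 into negative territory, which drags Film D down with it.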
The result is a list of recommendations that actually seems to fit reality. Actors, directors, and writers don't restrict themselves to a single genre or movie style, but they're likely to work on projects that suit their own particular preferences. This means that if I like a Nicolas Cage film I'm liable to like another one of his films, even if it's not in a genre I usually watch.
Just Do It
The program worked incredibly well. IMDB has all the data right there. If I can cobble together a passable version of this system using PHP, then they could have a full-blown, user-friendly version in a heartbeat (with an iPhone app and all). They could license the engine to Amazon and Netflix too.
I could do it, but that would require working out a deal for access to the data, and developing APIs and the like, and I'm just not that type of programmer.
Though if I did develop it I'd put a nice, simple pseudo-AI interface on it and call it "The Guy".