Generating Topic and Personal recommendations

Here is an overview of the features we wanted to use to compute the “score” of each paper; papers are then ranked by this score and shown to the user as recommendations.

[Figure: recommendation system overview]


Since we had already created these “topics” and were running the LDA inferencer on all new papers every day to classify them into topics, we decided to provide topic-based recommendations as well. That way, a new user browsing the topics could see the top papers in each one. Of course, in addition to having a high topic probability, these papers were ranked by recency of publication, the impact factor of their host journal and tweet counts (if any)!
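To make the idea concrete, here is a minimal sketch of how such a topic ranking could be put together. This is not our production formula: the `Paper` fields, the 30-day half-life, the log-based boost and the `min_prob` cutoff are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass
from datetime import date


@dataclass
class Paper:
    title: str
    topic_prob: float      # probability of the topic for this paper (from LDA inference)
    published: date
    impact_factor: float   # impact factor of the host journal
    tweets: int = 0        # tweet count, 0 if none


def topic_score(paper, today, half_life_days=30.0):
    """Hypothetical score: topic probability scaled by recency and popularity."""
    age_days = max((today - paper.published).days, 0)
    recency = 0.5 ** (age_days / half_life_days)   # halves every half_life_days
    # log1p dampens large impact factors and tweet counts (an assumed choice).
    boost = 1.0 + math.log1p(paper.impact_factor) + math.log1p(paper.tweets)
    return paper.topic_prob * recency * boost


def top_papers_for_topic(papers, today, k=10, min_prob=0.2):
    """Keep papers that belong strongly to the topic, then rank them by score."""
    candidates = [p for p in papers if p.topic_prob >= min_prob]
    ranked = sorted(candidates, key=lambda p: topic_score(p, today), reverse=True)
    return ranked[:k]
```

With a score like this, a recent, well-tweeted paper with moderate topic probability can outrank an older paper that fits the topic more tightly, which is roughly the behaviour we were after.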

For personalized recommendations, we decided we would first use topic similarity between the user's papers (or library) and the corpus of all recent papers to shortlist candidate papers, and then use word similarity to further refine the selection. The final ranking would use our special ‘sauce’, based on tweet counts, date of publication, author quality, etc., to order these papers and present them to the user!
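The two-stage filtering described above can be sketched roughly as follows. Everything here is an illustrative assumption rather than our actual code: papers are plain dicts with hypothetical `title`, `topics` (topic id to probability) and `text` keys, similarity is plain cosine similarity, the library is summarised by averaging its vectors, and the final ‘sauce’ re-ranking is left out.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_vector(vectors):
    """Average a list of sparse vectors into a single profile vector."""
    total = {}
    for vec in vectors:
        for key, weight in vec.items():
            total[key] = total.get(key, 0.0) + weight
    return {key: weight / len(vectors) for key, weight in total.items()}


def bag_of_words(text):
    """Very crude word counts: lowercase and split on whitespace."""
    return Counter(text.lower().split())


def recommend(library, candidates, topic_cutoff=0.5, k=10):
    """Shortlist candidates by topic similarity, then order them by word similarity."""
    topic_profile = mean_vector([p["topics"] for p in library])
    word_profile = mean_vector([bag_of_words(p["text"]) for p in library])

    # Stage 1: keep only candidates whose topic mix resembles the library's.
    shortlist = [c for c in candidates
                 if cosine(topic_profile, c["topics"]) >= topic_cutoff]

    # Stage 2: refine the shortlist using word-level similarity.
    scored = [(cosine(word_profile, bag_of_words(c["text"])), c) for c in shortlist]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c["title"] for _, c in scored[:k]]
```

The cheap topic comparison runs over the whole corpus of recent papers, while the more expensive word comparison only touches the shortlist; a real pipeline would also want proper tokenisation and stop-word removal before counting words.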

This involved connecting various pieces of the pipeline, and by September 2014 we had a working pipeline that generated and displayed topic recommendations and library recommendations (if a user had uploaded a personal library) on the website!


Here is a list of books/talks I found useful:
Introduction to Recommender Systems (Coursera)
Intro to Recommender Systems, a four-hour lecture by Xavier Amatriain
Coursera: Machine Learning class: Section on Recommender Systems