Thanks for signing up to receive this free case study, extracted from an MIT online course on data science.
Recommendation Engine For Movies: Using Data to Provide The Best User Experience
Instructor: Devavrat Shah
Course: Data Science and Big Data Analytics: Making Data-Driven Decisions
Ever wonder how companies like Netflix, Spotify and Pandora filter products based on each user’s unique preferences? Using data, of course! But how do they put that data to use? By building what is known as a recommendation engine: a feature that filters items by predicting how users will rate them. The goal is to connect users to the right items so that they will continue to use the product or service.
In this case study, we will focus on Netflix and how it uses recommendation engines to surface the best possible shows and movies for each individual user.
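To make "predicting how users will rate them" concrete, here is a minimal sketch of one common approach, user-based collaborative filtering. It is an illustration only and not Netflix's actual method; the user names, movie titles, ratings, and function names are all hypothetical. A missing rating is predicted as a similarity-weighted average of the ratings given by other users.

```python
from math import sqrt

# Hypothetical toy ratings (1-5 stars); users and titles are made up.
ratings = {
    "ana":   {"Movie A": 5, "Movie B": 4, "Movie C": 1},
    "ben":   {"Movie A": 4, "Movie B": 5, "Movie D": 5},
    "chloe": {"Movie A": 1, "Movie C": 5, "Movie D": 2},
}

def cosine_similarity(a, b):
    """Cosine similarity computed over the movies both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[m] * b[m] for m in shared)
    norm_a = sqrt(sum(a[m] ** 2 for m in shared))
    norm_b = sqrt(sum(b[m] ** 2 for m in shared))
    return dot / (norm_a * norm_b)

def predict_rating(user, movie):
    """Similarity-weighted average of other users' ratings for `movie`.

    Returns None when no other user has rated the movie.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for other, other_ratings in ratings.items():
        if other == user or movie not in other_ratings:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        weighted_sum += sim * other_ratings[movie]
        weight_total += sim
    return weighted_sum / weight_total if weight_total else None

def recommend(user, top_n=1):
    """Rank the movies `user` has not rated by predicted rating."""
    seen = set(ratings[user])
    candidates = {m for r in ratings.values() for m in r} - seen
    scored = [(predict_rating(user, m), m) for m in candidates]
    scored = [(s, m) for s, m in scored if s is not None]
    return [m for _, m in sorted(scored, reverse=True)[:top_n]]

print(predict_rating("ana", "Movie D"))
print(recommend("ana"))
```

In this toy data, ana's tastes are much closer to ben's than to chloe's, so her predicted rating for "Movie D" (about 4.15) lands nearer ben's 5 than chloe's 2. Production systems work with far larger and sparser data and typically use more sophisticated models.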
IMPORTANT: Don't get discouraged if some of the steps described seem too complicated! Remember, this is an extract from the online course, and the full course provides all the background necessary to successfully complete this activity.
Dr. Shah is a professor in the Department of Electrical Engineering and Computer Science. His current research is on the theory of large complex networks, which includes network algorithms, stochastic networks, network information theory and large-scale statistical inference. He is a member of the Laboratory for Information and Decision Systems (LIDS) and the Operations Research Center (ORC), and the Director of the newly formed Statistics and Data Science Center in IDSS.
Dr. Shah received his Bachelor of Technology in Computer Science and Engineering from the Indian Institute of Technology, Bombay, in 1999. He received the President of India Gold Medal, awarded to the best graduating student across all engineering disciplines. He received his Ph.D. in Computer Science from Stanford University. His doctoral thesis won the George B. Dantzig Award from INFORMS for best dissertation in 2005. After spending a year between Stanford, Berkeley and MSRI, he started teaching at MIT in 2005. In 2013, he co-founded Celect, Inc. to commercialize his research at MIT.
Dr. Rigollet works at the intersection of statistics, machine learning, and optimization, focusing primarily on the design and analysis of statistical methods for high-dimensional problems. His recent research focuses on the statistical limitations of learning under computational constraints.
At the University of Paris VI, Dr. Rigollet earned a B.S. in statistics in 2001, a B.S. in applied mathematics in 2002, and a Ph.D. in mathematical statistics in 2006. He has held positions as a visiting assistant professor at the Georgia Institute of Technology, and as an assistant professor at Princeton University.
As an Assistant Professor of Electrical Engineering and Computer Science and a member of the Laboratory for Information and Decision Systems (LIDS) and the Institute for Data, Systems, and Society (IDSS), Dr. Bresler applies engineering insight to practical problems by formulating and solving mathematical models. His work is focused on understanding the relationship between combinatorial structure and computational tractability of high-dimensional inference in the context of graphical models and other statistical models, recommendation systems, and biology.
He received his Ph.D. from the Department of Electrical Engineering and Computer Science at UC Berkeley, and was a postdoc at MIT.
Victor Chernozhukov works in econometrics, mathematical statistics, and machine learning, with much of his recent work focusing on the quantification of uncertainty in very high-dimensional models. He is a fellow of The Econometric Society and a recipient of The Alfred P. Sloan Research Fellowship and The Arnold Zellner Award. He was elected to the American Academy of Arts and Sciences in April 2016.
As an MIT faculty member and a principal investigator at the Computer Science and Artificial Intelligence Lab (CSAIL), Dr. Moitra focuses on a variety of research areas from statistical inference to optimization and approximation to codes and combinatorics. He is also interested in algorithmic problems with applications in machine learning and big data.
Dr. Moitra received his B.S. in electrical and computer engineering from Cornell in 2007. He completed his M.S. in 2009 and his Ph.D. in 2011 in computer science at MIT. Notably, he received a George M. Sprowls Award and a William A. Martin Award for best thesis for his doctoral and master’s dissertations, respectively. He then spent two years as an NSF CI Fellow at the Institute for Advanced Study while he was a senior postdoc in the computer science department at Princeton University.
Dr. Broderick is affiliated with the MIT Institute for Data, Systems, and Society (IDSS), the Computer Science and Artificial Intelligence Laboratory (CSAIL), and MachineLearning@MIT. Her recent research has focused on developing and analyzing models for scalable, unsupervised machine learning using Bayesian nonparametrics.
Prior to joining MIT, she earned her Ph.D. in Statistics at UC Berkeley, an AB in Mathematics from Princeton University, a Master of Advanced Study for completion of Part III of the Mathematical Tripos from the University of Cambridge, an MPhil by research in Physics from the University of Cambridge, and an MS in Computer Science from UC Berkeley. Dr. Broderick was awarded the Evelyn Fix Memorial Medal and Citation, the Berkeley Fellowship, an NSF Graduate Research Fellowship, a Marshall Scholarship, and the Phi Beta Kappa Prize.
Dr. Gamarnik is a professor of operations research. His interests include probability and stochastic processes with application to queuing theory, theory of random combinatorial structures and algorithms, scheduling, and applications to various business processes and health care. He has served as a research staff member at the Department of Mathematical Sciences, IBM Research, where he worked on a variety of projects with industrial applications, including disaster recovery, performance in business processes, call centers, and operational resilience.
Dr. Gamarnik is a member of the Institute of Mathematical Statistics, Bernoulli Society, INFORMS, and the American Mathematical Society. He serves on the editorial boards of both “Operations Research” and the “Annals of Applied Probability.” Notably, he was the recipient of the 2004 Erlang Prize from the INFORMS Applied Probability Society, as well as two National Science Foundation grants in 2007. He holds a B.A. in mathematics from New York University and a Ph.D. in operations research from MIT.
Dr. Kelner is a theoretical computer scientist whose research focuses on fundamental mathematical problems related to algorithms and complexity theory. In 2007, he was selected by the MIT School of Science to receive support from the NEC Corporation Fund for research in computers and communications. He received an Alfred P. Sloan Research Fellowship in 2010. In 2011, he was selected by MIT for the Harold E. Edgerton Faculty Achievement Award, given to a junior member of the MIT faculty, for distinction in teaching, research, and scholarship. In 2013, he received the School of Science’s Teaching Prize for Undergraduate Education.
Dr. Kelner received a B.A. in mathematics from Harvard in 2002 and the David Mumford Award as the top Harvard graduate in mathematics. He received his M.S. and Ph.D. degrees from MIT in Electrical Engineering and Computer Science in 2005 and 2006. Dr. Kelner was a member of the Institute for Advanced Study (IAS) in 2006-2007 before joining the MIT faculty in applied mathematics as an assistant professor in 2007. He was named associate professor in 2012. He is a member of the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL).
Dr. Uhler’s research interests include mathematical statistics, convex optimization, applied algebraic geometry, and mathematical biology. Currently on the faculty at MIT, Dr. Uhler earned her master’s degree in mathematics and a bachelor’s degree in biology at the University of Zurich and her Ph.D. in statistics from UC Berkeley. After postdoctoral appointments at the Institute for Mathematics and its Applications in Minneapolis and at ETH Zurich, Dr. Uhler joined IST Austria in 2012. In 2013, she participated in the semester program on Big Data at the Simons Institute at UC Berkeley.
Dr. Jegelka is the X-Consortium Career Development Assistant Professor at MIT EECS, and a member of CSAIL, IDSS, and Machine Learning at MIT. Her research is in algorithmic machine learning, and spans modeling, optimization algorithms, theory and applications. In particular, Dr. Jegelka has been working on exploiting mathematical structure for discrete and combinatorial machine learning problems.
Prior to joining MIT, she was a postdoc in the AMPlab and computer vision group at UC Berkeley, and a Ph.D. student at the Max Planck Institutes in Tuebingen and at ETH Zurich.
Dr. Kalyan Veeramachaneni is a principal research scientist at the Laboratory for Information and Decision Systems (LIDS) at MIT. He directs a research group called "Data to AI" in the new MIT Institute for Data, Systems, and Society (IDSS). The group is interested in big data science and machine learning, and is focused on solving foundational issues that prevent artificial intelligence and machine learning solutions from reaching their full potential for societal applications. His recent work focuses on making human interactions with data seamless and efficient.
Dr. Veeramachaneni has co-founded two startups: Feature Labs and PatternEx. Feature Labs helps organizations transform their raw, noisy data into intelligent representations using data science automation tools. PatternEx, a cybersecurity startup, is focused on developing the first active-learning-based solution for identifying new security threats and constantly evolving models that detect threats. His work on AI-driven solutions for data science and cybersecurity has been covered by major media outlets, including the Washington Post, CBS News, Wired, Forbes, and Newsweek. Dr. Veeramachaneni received his master’s degree in computer engineering and his Ph.D. in electrical engineering in 2009, both from Syracuse University. After completing his Ph.D., he joined MIT in 2009.