Thursday, Jan. 31, 11:45-12:45

Michael Stonebraker, Adjunct Professor of Computer Science, MIT

Title: Big Data is (at least) Four Different Problems

Abstract: Our ability to mine and analyze "Big Data" has yet to catch up to our ability to generate and collect it. Moreover, "Big Data" and "Big Analytics" mean different things to different people. This talk unpacks the Big Data problem into four distinct problems, comments on trends, and compares emerging software tools in each area.

Big volumes of data with SQL analytics. The traditional data warehouse vendors support SQL analytics on very large volumes of data. In this talk, I make a few comments on where I see this market going.

Big volumes of data with complex analytics. By complex analytics, I mean data clustering, multivariate statistics, machine learning, and similar techniques applied to very large amounts of data. I will explain how RDBMS vendors can be beaten by two orders of magnitude in this market by array DBMSs that provide real analytical scalability.

Big velocity. By this I mean being able to absorb and process a firehose of incoming data. In this market, the traditional SQL vendors are a non-starter. I will discuss alternatives including complex event processing (CEP), NoSQL and NewSQL systems.

Big diversity. Many enterprises are faced with integrating a larger and larger number of data sources with diverse data (spreadsheets, web sources, XML, traditional DBMSs). The traditional ETL products do not appear up to the challenges of this new world, and I talk about an alternate way to go.

Biography: Professor Mike Stonebraker has been a pioneer of database research and technology for more than a quarter of a century. He is widely recognized as one of the world's foremost experts in database technology and is noted for his insight in operating systems and expert systems.

Over his career, Stonebraker has been both a professor and leading architect for prototype development (starting with Ingres and Postgres) at leading institutions such as the University of California at Berkeley and M.I.T. Additionally, Stonebraker has authored scores of research papers on database technology, operating systems and the architecture of system software services.

Professor Stonebraker was awarded the prestigious ACM System Software Award in 1992 for his work on INGRES, the first annual Innovation award by the ACM SIGMOD special interest group in 1994, and the IEEE John von Neumann Medal in 2005. He was elected to the National Academy of Engineering in 1997 and to the American Academy of Arts and Sciences in 2010.

Stonebraker is presently an Adjunct Professor of Computer Science at M.I.T., where he is working on a variety of future-generation data-oriented projects. He is co-director of a new Intel-funded center at MIT CSAIL focusing on Big Data Analytics.

Mike is also a co-founder of eight software startups, including Streambase, Vertica Systems (sold to HP), VoltDB, Goby (sold to TeleNav), and Paradigm4.

Thursday, Jan. 31, 2:00-3:00

Ron Dror, Senior Research Scientist at D. E. Shaw Research, and deputy to Chief Scientist David E. Shaw

Title: How Drugs Bind and Control Their Targets: Characterizing GPCR Signaling Using Anton, a Special-purpose Supercomputer for Molecular Dynamics Simulations

Abstract: Roughly one-third of all drugs act by binding to G-protein-coupled receptors (GPCRs) and either triggering or preventing receptor activation, but the process by which they do so has proven difficult to determine using either experimental or computational approaches. We recently completed a special-purpose machine, named Anton, that uses a combination of novel algorithms and application-specific hardware to accelerate molecular dynamics simulations by orders of magnitude, enabling all-atom protein simulations as long as a millisecond. Anton has made possible simulations in which drugs spontaneously associate with GPCRs to achieve bound conformations that match experimental structures almost perfectly. Simulations on Anton have also captured transitions of a GPCR between its active and inactive structures, allowing us to characterize the mechanism of receptor activation. Our results, together with complementary experimental data, suggest opportunities for the design of drugs that achieve greater specificity and control receptor signaling more precisely.

Biography: Since joining D. E. Shaw Research as its first hire, Ron Dror has served as Senior Research Scientist and deputy to Chief Scientist David E. Shaw, overseeing various projects in computational structural biology and high-performance computing.

He earned a Ph.D. in Electrical Engineering and Computer Science at MIT, advised by Professors Alan Willsky and Edward Adelson. Previously, he completed an M.Phil. in Biological Sciences as a Churchill Scholar at the University of Cambridge as well as a B.A. in Mathematics and a B.S. in Electrical and Computer Engineering at Rice University.

Friday, Feb. 1, 10:00-11:00

Corinna Cortes, Head of Google Research, NY

Title: Accuracy at the Top

Abstract: The accuracy of the items placed near the top is crucial for modern information retrieval systems, such as search engines or recommendation systems, since most users of these systems browse only the first few items. Several algorithms have been introduced in the past to optimize related notions such as precision at k, but what learning guarantees do they have?

We introduce a new notion of classification accuracy based on the top $\tau$-quantile values of a scoring function and give a learning algorithm that optimizes this criterion and benefits from margin-based guarantees. We show how training can be done by solving a set of convex optimization problems and describe a very large-scale solution based on the ADMM consensus framework. We also report the results of several experiments in the bipartite setting and compare them to those of several other algorithms seeking high precision at the top.

(Joint work with Stephen Boyd, Mehryar Mohri, and Ana Radovanovic).
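For readers unfamiliar with precision at k, the existing measure the abstract mentions, the following is a minimal Python sketch of how it is commonly computed. It is not the speakers' algorithm or their new quantile-based criterion; the function name `precision_at_k` and the sample scores and labels are hypothetical and chosen only for illustration.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k highest-scored items whose label is positive (1).

    scores: list of real-valued scores, one per item (hypothetical data).
    labels: list of 0/1 relevance labels aligned with scores.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    # Rank item indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # Count the positives among the top k and normalize by k.
    return sum(labels[i] for i in ranked[:k]) / k


# Illustrative data only: five items with made-up scores and labels.
scores = [0.9, 0.2, 0.75, 0.4, 0.6]
labels = [1, 0, 0, 1, 1]
print(precision_at_k(scores, labels, 3))
```

The measure looks only at the first k ranked items, which is why it is a natural proxy for what a user who browses only the first few results actually sees.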

Biography: Corinna Cortes is the Head of Google Research, NY, where she is working on a broad range of theoretical and applied large-scale machine learning problems. Prior to Google, Corinna spent more than ten years at AT&T Labs - Research, formerly AT&T Bell Labs, where she held a distinguished research position. Corinna's research is well known, in particular for her contributions to the theoretical foundations of support vector machines (SVMs), for which she and Vladimir Vapnik jointly received the 2008 Paris Kanellakis Theory and Practice Award, and for her work on data mining in very large data sets, for which she was awarded the AT&T Science and Technology Medal in 2000. Corinna received her MS degree in Physics from the University of Copenhagen and joined AT&T Bell Labs as a researcher in 1989. She received her Ph.D. in computer science from the University of Rochester in 1993.

Corinna is also a competitive runner, placing third in the More Marathon in New York City in 2005, and a mother of two.

Friday, Feb. 1, 2:00-3:00

Sep Kamvar, LG Career Development Professor of Media Arts and Sciences, MIT & MIT Media Lab

Title: Organizing the World's Social Information

Abstract: In the past few years, we have seen tremendous growth in public human communication and self-expression, through blogs, microblogs, and social networks. In addition, we are beginning to see the emergence of a social technology stack on the web, where profile and relationship information gathered by some applications can be used by other applications. In this talk I will discuss how the changing web suggests new paradigms for search and discovery, and the challenges these new paradigms pose to research in numerical linear algebra, machine learning, human-computer interaction, and programming languages.

Biography: Sep Kamvar is the LG Associate Professor of Media Arts and Sciences at MIT, and the Director of the Social Computing Group at the MIT Media Lab. His research focuses on social computing and information management.

Prior to MIT, Sep was the head of personalization at Google and a consulting professor of Computational and Mathematical Engineering at Stanford University. Prior to that, he was founder and CEO of Kaltix, a personalized search company that was acquired by Google in 2003.

Sep is the author of two books and over 40 technical publications and patents in the fields of search and social computing. He is on the technical advisory boards of several companies, including Clever Sense and Etsy. His artwork has been exhibited at the Museum of Modern Art in New York, the Victoria and Albert Museum in London, and the National Museum of Contemporary Art in Athens.

Sep received his Ph.D. in Scientific Computing and Computational Mathematics from Stanford University and his A.B. in Chemistry from Princeton University.