How research on foundations, principles, and theory of data management can help with the challenges and opportunities arising from the “data explosion” that is re-defining personal life, business, and society itself in the 21st century.
Computerized data is ubiquitous in the modern world. It affects everyday people, in activities ranging from social media to ecommerce to online education. It affects businesses, ranging from managing finances, to intricate supply chain networks, to ever-expanding data sets about the economic landscape and its stakeholders. And it affects nations and their citizens, as public opinion is increasingly shaped by online sources, as science continues to use massive data, and as the security of individuals’ data is at risk.
Data management is central to most computer applications today, and has grown to be one of the most important building blocks in Computer Science, a trend that will continue in the coming decades. How should we manage data in the presence of the emerging application areas, information processing techniques, and ethical challenges? The database theory research community has made central contributions to the very foundations of data management over the past four decades, and has the opportunity to expand those contributions as the nature of data management continues to transform our society and our world.
In April 2016, a Dagstuhl Perspectives workshop was held with the goal of identifying and describing some of the most important research themes in data management where research on principles and foundational issues can play a central role. The first seed for this workshop was a letter sent in May 2012 by Pablo Barceló to the PODS Executive Committee, suggesting that it was time to reflect on the successes of the database theory community (as led by the PODS and ICDT conferences), and to explore opportunities for continued impact and growth going forward. This led to the Reflections on PODS workshop in 2013, co-led by Pablo and Wim Martens, and to a deliberate broadening in scope of the PODS and ICDT conferences. The Dagstuhl workshop itself was suggested by the ICDT Council, and its preparation was a joint initiative of the PODS Executive Committee and the ICDT Council. The workshop brought together a broad array of researchers from database theory and neighboring areas for a week of intensive discussions. The findings and recommendations of this group are now available in the report “Research Directions for Principles of Data Management”, published online on arXiv and soon to be published in the archival Dagstuhl Manifestos journal series. An abridged version of the report is appearing in the December 2016 issue of SIGMOD Record.
The workshop report is focused on seven core themes, namely, Managing Data at Scale, Multi-model Data, Uncertain Information, Knowledge-enriched Data, Data Management and Machine Learning, Process and Data, and Ethics and Data Management. Since new challenges in Principles of Data Management arise all the time, this list of themes is not intended to be exclusive. For each theme, the report highlights central research challenges, and describes emerging results that are providing some of the starting points for addressing them. The report is intended for a broad audience, ranging from researchers and scientists who are exploring the many issues that arise in modern data management, to funding agencies fostering the next generation of Computer Science research, and even to policy makers, sociologists, and philosophers, who are considering the societal and ethical implications of ubiquitous data.
Our heartfelt thanks go to all of the workshop participants, who were deeply engaged in the lively and thoughtful discussions at Dagstuhl. The report would not have been possible without their many contributions. We especially thank the researchers listed as authors on the report, who helped to synthesize the workshop discussions and bring them into a succinct summary form that can serve as both a call to action and a research survey for key emerging research challenges. And I especially want to thank my fellow workshop co-organizers – Marcelo Arenas, Pablo Barceló, Wim Martens, Tova Milo and Thomas Schwentick – who have worked so very hard over the past two years to make the workshop and report so inclusive, informative and insightful.