Workshop on Scholarly Big Data: Challenges and Ideas

Full-Day Workshop at 2013 IEEE International Conference on Big Data (IEEE BigData 2013), 6 October 2013, Santa Clara, USA

Scholarly Big Data (SBD2013)

Academics and researchers worldwide continue to produce large numbers of scholarly documents, including papers, books, and technical reports, along with associated data such as tutorials, proposals, lab notebooks, and course materials. For example, PubMed has over 20 million documents, 10 million unique names, and 70 million name mentions. Google Scholar is believed to have many millions more. The rapidly emerging field of Scholarly Big Data focuses on questions such as: how research ideas emerge, evolve, or disappear as topics at scale; what constitutes a good measure of the quality of published works; which areas of research are most promising; how authors connect and influence each other; who the experts in a field are; which works are similar; and who funds a particular research topic.

Digital libraries, repositories, databases, Wikipedia, funding agencies, and the web have become a medium for answering such questions. For example, citation analysis is used to mine large publication graphs in order to extract patterns in the data (e.g., citations per article) that can help measure the quality of a journal. Scientometrics is used to mine graphs that link together multiple types of entities (authors, publications, conference venues, journals, institutions, etc.) in order to assess the quality of science and answer complex questions such as those listed above. Tools such as maps of science allow different categories of users to satisfy various needs: they help researchers easily access research results, identify relevant funding opportunities, and find collaborators, and they help funding sources identify new directions of research and assess the impact of existing funding. Moreover, recent developments in data mining, machine learning, natural language processing, and information retrieval make it possible to transform the way we analyze research publications, funded proposals, patents, etc., on a web-wide scale.

The workshop aims to bring together researchers with diverse interdisciplinary backgrounds who are interested in mining, managing, and searching scholarly big data. See the call for papers for more information.


08h45-10h00 Keynote by Brewster Kahle, Internet Archive
10h00-10h20 Coffee break
10h20-10h45 The Microsoft Academic Search Challenges at KDD Cup 2013
Martine De Cock
Senjuti Basu Roy
Swapna Savvana
Vani Mandava
Brian Dalessandro
Claudia Perlich
William Cukierski
Ben Hamner
10h45-11h10 Bibliometric-enhanced Retrieval Models for Big Scholarly Information Systems
Philipp Mayr
Peter Mutschke
11h35-12h00 Academic Publishing as a Social Media Paradigm
Michael E. Payne
Linh B. Ngo
Amy W. Apon
12h00-13h30 Lunch provided
13h30-15h40 Discussion and breakout sessions
15h40-16h00 Coffee break
