This page describes my research activities, which currently take place primarily in the CORPNET group in Amsterdam.
Keywords: network science, corporate networks, social network analysis, computational social science, data science, algorithms
The overarching theme of my research is computational network science: network science approached from a computational point of view. The main idea behind network science is that by modeling the interactions between objects in a dataset as a network, we can better understand that data. Moreover, the network perspective often reveals patterns and phenomena that remain invisible when the individual objects are studied in isolation. One could see network science as a specialization of data science that focuses on network data; at the same time, it can be seen as a particular method in complexity research. My research has two main pillars: computational network science fundamentals and knowledge discovery in networks.
I first and foremost aim to provide fundamental contributions to the algorithms, methods and techniques used in network science, with a strong focus on contributions to computer science. This line of research continues that of my PhD thesis, titled Algorithms for Analyzing and Mining Real-World Graphs (text, presentation). Topics include: computational aspects of algorithms for analyzing large-scale network data, network visualization, high-performance network data processing and large-scale network data management.
The second pillar of my research deals with analyzing, visualizing and mining large-scale network data from a range of domains. The aim is to answer important research questions from these domains using cutting-edge network science algorithms. As of September 2015, I work in the CORPNET project at the University of Amsterdam, where I focus on network-related questions in corporate governance as seen from the domain of political economy. In particular, I analyze corporate and economic networks in which firms are connected through, for example, interlocking directorates and ownership relations.
Apart from papers, of which a list can be found on the Publications page, the output of my research is often a particular algorithm or piece of software. Some examples can be found below.
Visualizing network data allows patterns and outliers in the data to be found. However, visualizing networks with hundreds of thousands or even millions of nodes is computationally expensive. GPUs such as NVIDIA's Titan X can be used to speed up this process. Together with my colleague Kristian Rietveld and CS master's student Govert Brinkmann, I developed a GPU implementation of Gephi's ForceAtlas2, a popular force-directed layout algorithm.
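To give an idea of what such a layout algorithm computes, here is a minimal CPU sketch of a ForceAtlas2-style iteration: linear attraction along edges and degree-weighted pairwise repulsion. This is an illustrative simplification, not the GPU implementation itself; the parameters (`k_rep`, the movement cap) and the tiny example graph are my own assumptions. The O(n²) repulsion term is precisely the part that benefits from GPU parallelization.

```python
import numpy as np

def forceatlas2_step(pos, edges, deg, k_rep=2.0, max_move=0.05):
    """One simplified ForceAtlas2-style iteration (illustrative sketch).
    Repulsion magnitude between u and v: k_rep * (deg_u+1)(deg_v+1) / dist.
    Attraction along each edge grows linearly with distance."""
    diff = pos[:, None, :] - pos[None, :, :]          # pairwise difference vectors
    dist = np.linalg.norm(diff, axis=2) + 1e-9        # avoid division by zero
    w = k_rep * np.outer(deg + 1, deg + 1) / dist**2  # magnitude / dist (unit vector)
    np.fill_diagonal(w, 0.0)                          # no self-repulsion
    disp = (diff * w[:, :, None]).sum(axis=1)         # total repulsive displacement
    for u, v in edges:                                # attraction pulls endpoints together
        disp[u] -= pos[u] - pos[v]
        disp[v] -= pos[v] - pos[u]
    # Cap per-node movement to keep the iteration numerically stable
    length = np.linalg.norm(disp, axis=1, keepdims=True) + 1e-9
    return pos + disp / length * np.minimum(length, max_move)

# Tiny usage example: lay out a 4-node path graph
rng = np.random.default_rng(1)
pos = rng.random((4, 2))
edges = [(0, 1), (1, 2), (2, 3)]
deg = np.array([1, 2, 2, 1])
for _ in range(200):
    pos = forceatlas2_step(pos, edges, deg)
```

Because every node repels every other node, a naive implementation does n² distance computations per iteration; that is the workload a GPU can spread over thousands of threads.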
In 2011, together with my colleague Walter Kosters, I presented a paper introducing an algorithm for computing the exact diameter (maximum shortest-path distance) of small-world networks. Traditionally, computing the diameter of a graph with n nodes and m edges takes O(mn) time. However, if the graph has certain properties common to small-world networks, the computation can be done much more efficiently. Pruning strategies and smart lower and upper bounds together form the BoundingDiameters algorithm, which in practice finds the diameter in linear rather than quadratic time. For a network with 8 million nodes and 1 billion edges, this reduces the computation time from 1.5 years to 1 minute.
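The bound-and-prune idea can be sketched as follows. This is a simplified illustration of the approach described above, not the published implementation: repeatedly pick a candidate node, compute its eccentricity with one BFS, tighten the eccentricity bounds of all other nodes via the triangle inequality, and prune nodes that can no longer influence the diameter. The selection heuristic and graph representation here are my own assumptions, and a connected graph is assumed.

```python
from collections import deque

def bfs_distances(adj, src):
    """Distances from src to all nodes via breadth-first search."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def bounding_diameters(adj):
    """Simplified sketch of the bound-and-prune approach (assumes a
    connected graph given as {node: [neighbors]})."""
    W = set(adj)                                    # remaining candidate nodes
    ecc_lo = {v: float("-inf") for v in adj}
    ecc_hi = {v: float("inf") for v in adj}
    diam_lo, diam_hi = float("-inf"), float("inf")
    while diam_lo < diam_hi and W:
        # Selection heuristic: largest eccentricity upper bound, ties by degree
        v = max(W, key=lambda w: (ecc_hi[w], len(adj[w])))
        d = bfs_distances(adj, v)
        ecc = max(d.values())                       # eccentricity of v
        diam_lo = max(diam_lo, ecc)                 # ecc(v) witnesses a lower bound
        diam_hi = min(diam_hi, 2 * ecc)             # triangle inequality upper bound
        for w in list(W):
            ecc_lo[w] = max(ecc_lo[w], ecc - d[w], d[w])
            ecc_hi[w] = min(ecc_hi[w], ecc + d[w])
            # Prune w if it can neither raise the lower bound
            # nor lower the upper bound on the diameter
            if (ecc_hi[w] <= diam_lo and ecc_lo[w] >= diam_hi / 2) \
                    or ecc_lo[w] == ecc_hi[w]:
                W.discard(w)
    return diam_lo

# A path on 5 nodes has diameter 4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(bounding_diameters(path))  # 4
```

On small-world networks, a handful of well-chosen BFS runs typically suffices to make the bounds meet, which is where the speedup over the naive all-pairs approach comes from.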
teexGraph is a lightweight C++ library released in 2016 to efficiently analyze the structure of large real-world networks. It can analyze and summarize the network topology, but more importantly delivers high performance when computing distance-based metrics such as the distance distribution, the center/periphery structure, and various centrality measures such as closeness centrality, betweenness centrality and PageRank. Last but not least, teexGraph includes code for a number of graph algorithms that I have devised as part of my research.
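As a point of reference for what a distance-based metric involves, the sketch below computes the exact distance distribution of a small connected graph by running a BFS from every node. This is a plain Python illustration, not teexGraph's C++ code; the function name and graph representation are my own assumptions.

```python
from collections import Counter, deque

def distance_distribution(adj):
    """Exact distance distribution of a connected graph: one BFS per
    node, so O(n*m) time in total. Returns {distance: number of
    ordered node pairs at that distance}. Illustrative sketch only."""
    hist = Counter()
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        hist.update(d for d in dist.values() if d > 0)
    return dict(hist)

# Path 0-1-2: four ordered pairs at distance 1, two at distance 2
print(distance_distribution({0: [1], 1: [0, 2], 2: [1]}))  # {1: 4, 2: 2}
```

The per-source BFS runs are independent, so this loop parallelizes naturally across cores, and sampling a subset of source nodes gives an approximate distribution when exactness is not required.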
Community detection is a well-known technique to discover groups of tightly connected nodes in networks. However, when analyzing large network datasets, it can be challenging to interpret the meaning of the resulting communities. This piece of code automatically interprets the contents of communities based on some property of the nodes in those communities.
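One simple way to do such an interpretation, sketched below, is to characterize each community by the node label that is most overrepresented compared to the full network. The "lift" score, function names and example data here are my own illustrative assumptions, not necessarily the scoring used in the actual code.

```python
from collections import Counter, defaultdict

def interpret_communities(community, label):
    """Characterize each community by its most overrepresented node
    label, using a simple lift score (illustrative sketch).
    `community` maps node -> community id; `label` maps node -> label."""
    total = Counter(label.values())                 # label frequencies overall
    n = len(label)
    members = defaultdict(list)
    for node, c in community.items():
        members[c].append(label[node])
    summary = {}
    for c, labels in members.items():
        counts = Counter(labels)
        # lift = frequency inside the community / frequency in the whole network
        lift = {l: (counts[l] / len(labels)) / (total[l] / n) for l in counts}
        summary[c] = max(lift, key=lift.get)
    return summary

# Hypothetical example: the label is the country of a firm
community = {"a": 0, "b": 0, "c": 1, "d": 1}
label = {"a": "NL", "b": "NL", "c": "DE", "d": "NL"}
print(interpret_communities(community, label))  # {0: 'NL', 1: 'DE'}
```

Using lift rather than raw counts prevents a globally dominant label from being reported as characteristic of every community.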
Much of my work takes place as part of a particular research project.
This recently started research project investigates data-driven modelling techniques from both computer science and economics in order to devise new methods, techniques and algorithms for determining official economic statistics. The project is a collaboration between Centraal Bureau voor de Statistiek (CBS), the University of Amsterdam and Leiden University.
The main challenges addressed in this project deal with data management, availability and computation of large-scale network data. In particular, we are analyzing data on over 200 million corporations connected through hundreds of millions of links. This project's multi-core big memory server architecture forms the backbone of the data analysis infrastructure of the CORPNET project (see below).
At the CORPNET group of the University of Amsterdam, we use network science to understand the global economy. The project is a product of the 2015 ERC Starting Grant of Eelke Heemskerk, and focuses on data on corporate ownership and interlocking directorates, forming so-called networks of corporate control. Computer scientists, physicists, political scientists and sociologists work together in this interdisciplinary research group.
This project, in collaboration with the Dutch National Police and Utrecht University, had as its main goal to better assess the risks around soccer matches and other soccer-related activities. Using state-of-the-art visualization and data science methods, we devised a hands-on framework on an interactive itable. The resulting product could help police officers make better day-to-day decisions on the planning and allocation of resources around soccer matches, automatically reusing their data on previous soccer-related incidents.
The goal of this project, in collaboration with TU Eindhoven, was to develop stream mining techniques for complex patterns such as graphs. The aim was to extend existing state-of-the-art techniques in two orthogonal directions: on the one hand, mining more complex patterns in streams, such as sequential patterns and evolving graph patterns (for example, social networks); on the other hand, more natural stream support measures that take into account the temporal nature of most data streams.
If you are interested in finding out more about my research, then feel free to contact me!
Last modified: August 24, 2017 @ 11:03:40.