The release of the Neo4j GDS library version 1.5, with its built-in machine learning models, now gives a Data Scientist who needs to perform a machine learning task on a graph in Neo4j two possible routes to a solution.
If time is of the essence and a supported and tested model that works natively is needed, then a simple function call to the GDS library will get the job done. …
In October 2018, a team of 27 researchers from DeepMind/Google, MIT, and the University of Edinburgh published a paper entitled: “Relational inductive biases, deep learning, and graph networks”.
The paper is part ‘position paper’ and part practical implementation, because it includes a library for building graph neural networks in TensorFlow and Sonnet (Sonnet is DeepMind’s library for neural networks on top of TensorFlow). To date this paper has received almost 1,000 citations, hence it warrants some closer investigation.
In this note, Mark Needham and I will first summarize the key theoretical arguments which the paper…
Imagine that one morning you wake up and you say to yourself:
Well, would it not be awesome if I could compute the betweenness centrality scores on that graph of 100 million nodes we loaded into Neo4j yesterday, the one that represents the email traffic in our company? Gosh, that would certainly help us to identify our key employees from an information flow perspective.
Yes, let’s get to work!
In this article, Mark Needham and I hold a drag race between the Python NetworkX package and the Neo4j Graph Data Science library. In this race we let them compete to discover which one is the fastest at computing the Betweenness and Harmonic centrality measures on undirected graphs. We will also use the Neo4j Python driver so that we don’t even need to leave the cushy environment of our Python IDE.
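To make concrete what the race actually measures, here is a plain-Python sketch of harmonic centrality (the sum of reciprocal shortest-path distances to every other node) on a small undirected graph. This is an illustrative implementation, not the NetworkX or GDS code itself, and the toy adjacency dict is invented for the example:

```python
from collections import deque

def harmonic_centrality(adj):
    """Harmonic centrality for an unweighted, undirected graph.
    adj: dict mapping node -> list of neighbours (both directions present)."""
    scores = {}
    for s in adj:
        # BFS from s to get shortest-path distances to all reachable nodes
        dist = {s: 0}
        q = deque([s])
        while q:
            v = q.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    q.append(w)
        # sum of reciprocal distances; unreachable nodes contribute 0
        scores[s] = sum(1.0 / d for n, d in dist.items() if n != s)
    return scores

# A three-node path graph: A - B - C
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(harmonic_centrality(adj))  # B sits between A and C, so it scores highest
```

One BFS per node gives O(|V|(|V|+|E|)) overall, which is exactly why runtimes on large graphs become the interesting question.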
The NetworkX package is the old kid on the block and was released all the way back in 2008¹ (…in those days Neo4j was still having a hot…
A common challenge graph analysts face is the time complexity of many of the most important centrality metrics.
For instance, Cohen et al. illustrate in “Computing Classic Closeness Centrality, at Scale” that calculating the closeness centrality of a network of 24 million nodes takes an astonishing 44,222 hours, or around 5 years (assuming standard computing resources).
Indeed, betweenness centrality computed with Brandes’ algorithm has a time complexity of O(|V||E|) on unweighted graphs, and consequently large networks require an ever-increasing runtime for this centrality metric. …
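To see where the O(|V||E|) bound comes from, here is a sketch of Brandes’ algorithm for unweighted, undirected graphs: one BFS per source node (O(|E|) each), plus a reverse-order dependency accumulation. This is a didactic reimplementation, not the GDS or NetworkX source, and the example graph is invented:

```python
from collections import deque

def brandes_betweenness(adj):
    """Brandes' betweenness centrality, unweighted and undirected.
    adj: dict mapping node -> list of neighbours (both directions present)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                             # nodes in order of non-decreasing distance
        pred = {v: [] for v in adj}            # shortest-path predecessors
        sigma = {v: 0 for v in adj}            # number of shortest paths from s
        sigma[s] = 1
        dist = {v: -1 for v in adj}
        dist[s] = 0
        q = deque([s])
        while q:                               # BFS: O(|V| + |E|) per source
            v = q.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    q.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    pred[w].append(v)
        # accumulate pair dependencies in reverse BFS order
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in pred[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # undirected graphs: every pair is counted from both endpoints
    return {v: c / 2 for v, c in bc.items()}

# Path graph A - B - C: only B lies on a shortest path between other nodes
adj = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
print(brandes_betweenness(adj))
```

Running the full outer loop over all |V| sources is what yields the O(|V||E|) total, so doubling the node count on a sparse graph roughly quadruples the work.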
Developer relations at Neo4j, fascinated by anything with Graphs and Deep Learning. PhD student at Birkbeck, University of London