Path Finding Algorithms

Important - page not maintained

This page is no longer being maintained and its content may be out of date. For the latest guidance, please visit the Neo4j Graph Data Science Library .

Goals
In this guide, we will learn about path finding graph algorithms.
Prerequisites
Please have Neo4j (version 4.0 or later) and the Graph Data Science Library downloaded and installed to use path finding algorithms.

Beginner

What are path finding algorithms?

Pathfinding Algo Icon

Path finding algorithms build on top of graph search algorithms and explore routes between nodes, starting at one node and traversing through relationships until the destination has been reached. These algorithms find the cheapest path in terms of the number of hops or weight. Weights can be anything measured, such as time, distance, capacity, or cost.

The Neo4j Graph Data Science Library supports many different path finding algorithms.

Shortest Path

The Shortest Path algorithm calculates the shortest (weighted) path between a pair of nodes. Shortest path is considered to be one of the classical graph problems and has been researched as far back as the 19th century. It has the following use cases:

  • Finding directions between physical locations. This is the most common usage, and web mapping tools such as Google Maps use the shortest path algorithm, or a variant of it, to provide driving directions.

  • Social networks can use the algorithm to find the degrees of separation between people. For example, when you view someone’s profile on LinkedIn, it will indicate how many people separate you in the connections graph, as well as listing your mutual connections.

A*

The A* algorithm improves on the Dijkstra shortest path algorithm, by including extra information by way of a heuristic function that determines which paths to explore next. This optimization results in shortest paths being found more quickly. The A* algorithm can be used to find shortest paths between single pairs of locations, where GPS coordinates are known.

Yen’s k-Shortest Paths

The Yen’s algorithm uses the shortest path algorithm to find the shortest path, 2nd shortest path, 3rd shortest path, and so on up to k-1 deviations of shortest path. It has the following use cases:

All-Pairs Shortest Path

The All Pairs Shortest Path calculates the shortest (weighted) path between all pairs of nodes. This algorithm has optimizations that make it quicker than calling the Single Source Shortest Path algorithm for every pair of nodes in the graph. It has the following use cases:

  • Itis used in urban service system problems, such as the location of urban facilities or the distribution or delivery of goods. One example of this is determining the traffic load expected on different segments of a transportation grid. For more information, see Urban Operations Research.

  • It is used as part of the REWIRE data center design algorithm that finds a network with maximum bandwidth and minimal latency. There are more details about this approach in "REWIRE: An Optimization-based Framework for Data Center Network Design"

Single Source Shortest Path

The Single Source Shortest Path algorithm calculates the shortest (weighted) path from a node to all other nodes in the graph. Open Shortest Path First is a routing protocol for IP networks. It uses Single Source Shortest Path to detect changes in topology, such as link failures, and come up with a new routing structure in seconds.

Minimum Spanning Tree

The Minimum Weight Spanning Tree starts from a given node, and finds all its reachable nodes and the set of relationships that connect the nodes together with the minimum possible weight. It has the following use cases:

Random Walk

Random Walk is an algorithm that provides random paths in a graph. A random walk means that we start at one node, choose a neighbor to navigate to at random or based on a provided probability distribution, and then do the same from that node, keeping the resulting path in a list. It’s similar to how a drunk person traverses a city. It has the following use cases:

  • It has be shown to relate to Brownian motion and also to the movement and dispersal of animals in the study of Random walk models in biology.

  • It has been used to analyse ALSI index of the JSE stock exchange and show that the index followed the random walk hypothesis between years 2000 and 2011. This means the movement of stock prices was random and the ability of investors to perform relied more on luck than anything else. Find this study in The Random Walk Theory And Stock Prices: Evidence From Johannesburg Stock Exchange

  • It is part of the node2vec and graph2vec algorithms, that create node embeddings, as well as the Walktrap and Infomap community detection algorithms.

  • It can be used as part of the training process of machine learning model, as described in David Mack’s article Review prediction with Neo4j and TensorFlow.