author Henk P. Penning date Thu Aug 26 15:18:35 MEST 2004 regards PGP pathfinder
We often want to find shortest paths in a Web of Trust (WOT). Because a (strongly connected component of a) WOT often has the small world property, the number of points at distance k, increases rapidly with k. When searching a shortest path from A to B, it is cheaper to use a modified BreadthFirstSearch algorithm. Instead of only searching from A to B, a combined search from A to B and B to A is given.
Skip this if you know what PGP is.In the Pretty Good Privacy (PGP) public key system, users generate (one or more) keys, and label them with a name and email address. When Alice has a key A, and Bob has a key B, Alice can sign the key of Bob: key B then has signature A.It is the rule that before Alice signs a key of Bob, Alice must establish (by photo id) that
Other users, who observe that Alice signed key B of Bob, may (sceptically) accept that Alice trusts that key B realy belongs to someone named Bob.
- Bob is who he says he is, and
- key B really belongs to Bob.
A trust path from (the key of) Dave to (the key of) Eve is a list of keys
Dave -> X -> Y -> Z -> Evewhere Dave signed X, X signed Y, etc. Dave may trust that the key labeled 'Eve' actually belongs to Eve, if there is a trust path from Dave to Eve. A priori, shorter paths are better than longer paths, and having multiple, disjunct paths is better than having a single path, or no path at all.
A Web Of Trust (WOT) is a labeled, directed graph whereAssume that the datastructure for the WOT allows, for each point v, to find the points connected to v and the points v connects to.
- points represent PGP keys
- lines represent signatures ; A -> B if A has signed B
- the distance from point A to point B is the length of a shortest path from A to B
- the k-ring around a point A is the set of points at distance k from A.
in ( v ) = { x | x -> v } [ all signatures of v ] out ( v ) = { x | v -> x } [ all keys signed by v ]Finding a shortest path from A to B is straightforward. Use BreadthFirstSearch (BFS):
- phase 1: start at A, and find (using out) the k-rings for k=1,2,3,.. until B is found at some distance d (success), or some k-ring is empty (no path found). Mark each visited point with it's distance to A.
- phase 2: walk (using in) from B to A, visiting points with distance d-1, d-2, etc until A is found,
- result: the reverse of the walk in phase 2.
Note that finding a path from A to B, could also start at B, if we switch the roles of (A, B) and (in, out) in the algorithm.
We will use this symmetry to modify BreadthFirstSearch.
- phase 1: start at B, and find (using in) the k-rings for k=1,2,3,.. until A is found at some distance d (success), or some k-ring is empty (no path found). Mark each visited point with it's distance to B.
- phase 2: walk (using out) from A to B, visiting points with distance d-1, d-2, etc until B is found.
- result: the walk in phase 2.
A WOT usually has the small world property: the shortest path length (degrees of separation) is small in view of the size of the graph. This means that the number of points at distance 1, 2, 3, etc grows fast. The cost of computing the points at distance 2k, is much higher than twice the cost of computing the points at distance k. It is much cheaper to find a middle man M: a point on the the shortest path from A to B. The path from A to B isA -> ... -> M -> ... -> BAs it turns out, finding a middle man is easy.Instead of searching from A to B, we can alternate: compute the next ring around A, then the next ring around B, then A etc. Stop when a point in an 'opposite' ring is found (middle man!), or one of the rings becomes empty (no path exists).
Adapting the algorithm is simple:
Because my PGP pathfinder is written in an interpreted language (Perl) and runs on a rather slow server, this improvement was quite noticable.
- marking both A and B initially with '0' causes a problem in a special case : A <-> M <-> B. To avoid this, for marking use depth instead of distance, where depth=distance+1. This can be achieved by initially marking A as '+1' and B as '-1'.
- mark points reached from A as 2, 3, 4 etc
- mark points reached from B as -2, -3, -4 etc
- when computing the next ring around A,
- ignore points with a mark greater than 0,
- stop if a point is found with a mark smaller than 0 ; we found the middle man
- mark other points with the proper (positive) mark
- when computing the next ring around B,
- ignore points with a mark smaller than 0,
- stop if a point is found with a mark greater than 0 ; we found the middle man
- mark other points with the proper (negative) mark
- to keep the algorithm simple, treat these cases as special : A == B and A -> B.
Instead of simply alternating between A and B :the program should proceed with the smallest ring, be it around A or B. The adaptation of the algorithm is trivial.
- next ring around A
- next ring around B
- next ring around A
- ....
If we want to show multiple, disjunct paths, we must succesivelySuppose we start with a graph that is one strongly connected component (for every A and B, there is a path from A to B). When we delete points from the graph, it may become unconnected. Using straightforward BFS can be costly when there is no path. When A is very well connected, and B isn't, we have to search almost the entire graph to find out that B isn't reachable from A.
- find a shortest path, and show it
- delete the (interior of) the path from the graph
- repeat until there are no more paths, or enough paths are shown
BFS-I and BFS-II generally terminate much more quickly when there is no path.
|
|