Interface VertexProgram<M>
-
- All Superinterfaces:
Cloneable
- All Known Implementing Classes:
ConnectedComponentVertexProgram
,PageRankVertexProgram
,PeerPressureVertexProgram
,ShortestPathVertexProgram
,org.apache.tinkerpop.gremlin.process.computer.util.StaticVertexProgram
,TraversalVertexProgram
public interface VertexProgram<M> extends Cloneable
A VertexProgram represents one component of a distributed graph computation. Each vertex in the graph (logically) executes the VertexProgram instance in parallel. The collective behavior yields the computational result. In practice, a "worker" (i.e. task, thread, etc.) is responsible for executing the VertexProgram against each vertex in its vertex set (a subset of the full graph vertex set). At the finest granularity there is one "worker" for each vertex, though this is impractical, and GraphComputer implementations that leverage such a design are not expected to perform well due to the excess object creation. Any local state/fields in a VertexProgram are static to, i.e. shared by, the vertices within the same worker set. It is not safe to assume that the VertexProgram's "worker" state will remain stable between iterations. Hence the existence of workerIterationStart(org.apache.tinkerpop.gremlin.process.computer.Memory) and workerIterationEnd(org.apache.tinkerpop.gremlin.process.computer.Memory).
- Author:
- Marko A. Rodriguez (http://markorodriguez.com), Matthias Broecheler (me@matthiasb.com)
-
-
Nested Class Summary
Modifier and Type Interface Description
static interface
VertexProgram.Builder
static interface
VertexProgram.Features
-
Field Summary
Modifier and Type Field Description
static String
VERTEX_PROGRAM
-
Method Summary
Modifier and Type Method Description
VertexProgram<M>
clone()
When multiple workers on a single machine need VertexProgram instances, it is possible to use clone.
static <V extends VertexProgram> V
createVertexProgram(Graph graph, org.apache.commons.configuration2.Configuration configuration)
A helper method to construct a VertexProgram given the content of the supplied configuration.
void
execute(Vertex vertex, Messenger<M> messenger, Memory memory)
This method denotes the main body of the computation and is executed on each vertex in the graph.
default VertexProgram.Features
getFeatures()
default Set<MapReduce>
getMapReducers()
The set of MapReduce jobs that are associated with the VertexProgram.
default Set<MemoryComputeKey>
getMemoryComputeKeys()
The Memory keys that will be used during the computation.
default Optional<MessageCombiner<M>>
getMessageCombiner()
Combine the messages en route to a particular vertex.
Set<MessageScope>
getMessageScopes(Memory memory)
This method returns all the MessageScope possibilities for a particular iteration of the vertex program.
GraphComputer.Persist
getPreferredPersist()
GraphComputer.ResultGraph
getPreferredResultGraph()
default Set<org.apache.tinkerpop.gremlin.process.traversal.traverser.TraverserRequirement>
getTraverserRequirements()
The traverser requirements that are needed when this VP is used as part of a traversal.
default Set<VertexComputeKey>
getVertexComputeKeys()
The Element properties that will be mutated during the computation.
default void
loadState(Graph graph, org.apache.commons.configuration2.Configuration configuration)
When it is necessary to load the state of the VertexProgram, this method is called.
void
setup(Memory memory)
The method is called at the beginning of the computation.
default void
storeState(org.apache.commons.configuration2.Configuration configuration)
When it is necessary to store the state of the VertexProgram, this method is called.
boolean
terminate(Memory memory)
The method is called at the end of each iteration to determine if the computation is complete.
default void
workerIterationEnd(Memory memory)
This method is called at the end of each iteration of each "computational chunk." The set of vertices in the graph is typically not processed with full parallelism.
default void
workerIterationStart(Memory memory)
This method is called at the start of each iteration of each "computational chunk." The set of vertices in the graph is typically not processed with full parallelism.
-
-
-
Field Detail
-
VERTEX_PROGRAM
static final String VERTEX_PROGRAM
- See Also:
- Constant Field Values
-
-
Method Detail
-
storeState
default void storeState(org.apache.commons.configuration2.Configuration configuration)
When it is necessary to store the state of the VertexProgram, this method is called. This is typically required when the VertexProgram needs to be serialized to another machine. Note that what is stored is simply the instance/configuration state, not any processed data. The default implementation simply stores the VertexProgram class name for reflective reconstruction. When overriding, it is typically a good idea to call VertexProgram.super.storeState().
- Parameters:
configuration - the configuration to store the state of the VertexProgram in.
-
loadState
default void loadState(Graph graph, org.apache.commons.configuration2.Configuration configuration)
When it is necessary to load the state of the VertexProgram, this method is called. This is typically required when the VertexProgram needs to be serialized to another machine. Note that what is loaded is simply the instance state, not any processed data.
- Parameters:
graph - the graph that the VertexProgram will run against
configuration - the configuration to load the state of the VertexProgram from.
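The store/load round trip described above can be illustrated with a minimal sketch. It does not use TinkerPop or Commons Configuration types: a plain `Map` stands in for the `Configuration`, and `BaseProgram`, `DampedProgram`, and the `sketch.*` keys are hypothetical names chosen for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins: a Map plays the role of the Configuration; no TinkerPop types are used.
class StateRoundTripSketch {

    static final String PROGRAM_CLASS_KEY = "sketch.programClass"; // hypothetical key name

    static class BaseProgram {
        // Models the default described in the text: record only the class name.
        void storeState(Map<String, Object> configuration) {
            configuration.put(PROGRAM_CLASS_KEY, getClass().getName());
        }

        void loadState(Map<String, Object> configuration) {
            // nothing to restore by default
        }
    }

    static class DampedProgram extends BaseProgram {
        double alpha = 0.85; // instance state, not processed data

        @Override
        void storeState(Map<String, Object> configuration) {
            super.storeState(configuration);           // keep the class-name entry
            configuration.put("sketch.alpha", alpha);  // then add this program's own state
        }

        @Override
        void loadState(Map<String, Object> configuration) {
            alpha = (Double) configuration.getOrDefault("sketch.alpha", 0.85);
        }
    }

    // Stores the state of one instance and loads it into a fresh one.
    static double roundTrip(double alpha) {
        DampedProgram original = new DampedProgram();
        original.alpha = alpha;
        Map<String, Object> configuration = new HashMap<>();
        original.storeState(configuration);

        DampedProgram copy = new DampedProgram();
        copy.loadState(configuration);
        return copy.alpha;
    }

    public static void main(String[] args) {
        System.out.println("alpha after round trip: " + roundTrip(0.5));
    }
}
```

Only the instance settings travel through the configuration; anything computed during a run would have to be carried by other means.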
-
setup
void setup(Memory memory)
The method is called at the beginning of the computation. The method is global to the GraphComputer and as such, is not called for each vertex. During this stage, the Memory should be initialized to its "start state."
- Parameters:
memory - The global memory of the GraphComputer
-
execute
void execute(Vertex vertex, Messenger<M> messenger, Memory memory)
This method denotes the main body of the computation and is executed on each vertex in the graph. This method is logically executed in parallel on all vertices in the graph. When the Memory is read, it is according to the aggregated state yielded in the previous iteration. When the Memory is written, the data will be aggregated at the end of the iteration for reading in the next iteration.
- Parameters:
vertex - the Vertex to execute the VertexProgram on
messenger - the messenger that moves data between vertices
memory - the shared state between all vertices in the computation
-
terminate
boolean terminate(Memory memory)
The method is called at the end of each iteration to determine if the computation is complete. The method is global to the GraphComputer and as such, is not called for each Vertex. The Memory maintains the aggregated data from the last execute() iteration.
- Parameters:
memory - The global memory of the GraphComputer
- Returns:
- whether or not to halt the computation
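The interplay of setup, execute, and terminate can be sketched as a simple driver loop. This is a simplified model, not TinkerPop code: `SketchProgram`, the `Map`-based memory, and `run` are hypothetical stand-ins, and a real GraphComputer would distribute the vertices across workers rather than loop over them sequentially.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins: none of these types are TinkerPop classes.
class LifecycleSketch {

    // Mirrors the shape of setup/execute/terminate, with a Map as the "memory".
    interface SketchProgram {
        void setup(Map<String, Long> memory);
        void execute(String vertex, Map<String, Long> previous, Map<String, Long> pending);
        boolean terminate(Map<String, Long> memory);
    }

    // Calls setup once, then per iteration: execute on every vertex, followed by one terminate call.
    static int run(SketchProgram program, List<String> vertices) {
        Map<String, Long> memory = new HashMap<>();
        program.setup(memory); // global, not called per vertex
        int iterations = 0;
        while (true) {
            Map<String, Long> pending = new HashMap<>();
            for (String vertex : vertices) {
                // reads see the previous iteration's state; writes are collected separately
                program.execute(vertex, Collections.unmodifiableMap(memory), pending);
            }
            memory.putAll(pending); // writes become readable in the next iteration
            iterations++;
            if (program.terminate(memory)) {
                return iterations;
            }
        }
    }

    // Example program: counts the vertices seen per iteration and halts after three iterations.
    static class CountingProgram implements SketchProgram {
        public void setup(Map<String, Long> memory) {
            memory.put("seen", 0L);
            memory.put("iteration", 0L);
        }

        public void execute(String vertex, Map<String, Long> previous, Map<String, Long> pending) {
            pending.merge("seen", 1L, Long::sum);
        }

        public boolean terminate(Map<String, Long> memory) {
            memory.merge("iteration", 1L, Long::sum);
            return memory.get("iteration") >= 3;
        }
    }

    public static void main(String[] args) {
        int iterations = run(new CountingProgram(), List.of("a", "b", "c", "d"));
        System.out.println("halted after " + iterations + " iterations");
    }
}
```

Separating `previous` from `pending` is one way to model the rule that memory reads reflect the prior iteration while writes are aggregated for the next one.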
-
workerIterationStart
default void workerIterationStart(Memory memory)
This method is called at the start of each iteration of each "computational chunk." The set of vertices in the graph is typically not processed with full parallelism. The vertex set is split into subsets and a worker is assigned to call the execute(org.apache.tinkerpop.gremlin.structure.Vertex, org.apache.tinkerpop.gremlin.process.computer.Messenger<M>, org.apache.tinkerpop.gremlin.process.computer.Memory) method. The default implementation is a no-op.
- Parameters:
memory - The memory at the start of the iteration.
-
workerIterationEnd
default void workerIterationEnd(Memory memory)
This method is called at the end of each iteration of each "computational chunk." The set of vertices in the graph is typically not processed with full parallelism. The vertex set is split into subsets and a worker is assigned to call the execute(org.apache.tinkerpop.gremlin.structure.Vertex, org.apache.tinkerpop.gremlin.process.computer.Messenger<M>, org.apache.tinkerpop.gremlin.process.computer.Memory) method. The default implementation is a no-op.
- Parameters:
memory - The memory at the end of the iteration.
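The chunked execution that motivates the two worker hooks can be sketched as follows. The types are hypothetical stand-ins rather than TinkerPop classes, the chunks are processed sequentially for simplicity, and `BufferingProgram` is an invented example of a program whose worker-local state is only valid between the start and end hooks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins: none of these types are TinkerPop classes.
class WorkerHooksSketch {

    // A program with worker-local state that is rebuilt at the start of every iteration.
    static class BufferingProgram {
        private List<String> buffer;                 // worker-local, not stable across iterations
        final List<String> log = new ArrayList<>();

        void workerIterationStart() {
            buffer = new ArrayList<>();
            log.add("start");
        }

        void execute(String vertex) {
            buffer.add(vertex);
        }

        void workerIterationEnd() {
            log.add("end:" + buffer.size());
            buffer = null;                           // release the per-iteration state
        }
    }

    // Splits the vertex set into chunks and runs one worker per chunk for a single iteration.
    static List<String> runIteration(List<String> vertices, int chunkSize) {
        List<String> events = new ArrayList<>();
        for (int from = 0; from < vertices.size(); from += chunkSize) {
            int to = Math.min(from + chunkSize, vertices.size());
            BufferingProgram worker = new BufferingProgram(); // one instance per worker
            worker.workerIterationStart();
            for (String vertex : vertices.subList(from, to)) {
                worker.execute(vertex);
            }
            worker.workerIterationEnd();
            events.addAll(worker.log);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(runIteration(List.of("a", "b", "c", "d", "e"), 2));
    }
}
```

With five vertices and a chunk size of two, each of the three workers sees its own start and end call around the vertices it owns.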
-
getVertexComputeKeys
default Set<VertexComputeKey> getVertexComputeKeys()
The Element properties that will be mutated during the computation. All properties in the graph are readable, but only the keys specified here are writable. The default is an empty set.
- Returns:
- the set of element keys that will be mutated during the vertex program's execution
-
getMemoryComputeKeys
default Set<MemoryComputeKey> getMemoryComputeKeys()
The Memory keys that will be used during the computation. These are the only keys that can be read or written throughout the life of the GraphComputer. The default is an empty set.
- Returns:
- the set of memory keys that will be read/written
-
getMessageCombiner
default Optional<MessageCombiner<M>> getMessageCombiner()
Combine the messages en route to a particular vertex. Useful to reduce the amount of data transmitted over the wire. For example, instead of sending two objects that will ultimately be merged at the vertex destination, merge/combine them into one and send that object. If no message combiner is provided, then no messages will be combined. Furthermore, it is not guaranteed that all messages en route to the vertex will be combined, and thus combiner state should not be used. The result of the vertex program algorithm should be the same regardless of whether message combining is executed or not.
- Returns:
- An optional denoting whether or not there is a message combiner associated with the vertex program.
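The requirement that results be identical with or without combining can be illustrated with a small sketch. The `Combiner` interface, `deliver`, and `receive` below are hypothetical stand-ins and do not reproduce TinkerPop's transport; the example assumes a receiver that sums its incoming messages, for which addition is a safe combiner.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical stand-ins: none of these types are TinkerPop classes.
class CombinerSketch {

    // A two-argument merge function standing in for a message combiner.
    interface Combiner<M> {
        M combine(M messageA, M messageB);
    }

    // Transport step: merges all messages bound for one vertex if a combiner is present.
    static List<Long> deliver(List<Long> sent, Optional<Combiner<Long>> combiner) {
        if (!combiner.isPresent() || sent.isEmpty()) {
            return sent; // no combining: every message arrives individually
        }
        Long merged = sent.get(0);
        for (int i = 1; i < sent.size(); i++) {
            merged = combiner.get().combine(merged, sent.get(i));
        }
        return List.of(merged);
    }

    // Receiver-side logic: sums whatever arrives, combined or not.
    static long receive(List<Long> incoming) {
        long total = 0;
        for (long message : incoming) {
            total += message;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> sent = List.of(1L, 2L, 3L, 4L);
        Combiner<Long> sum = Long::sum;
        System.out.println("without combiner: " + receive(deliver(sent, Optional.empty())));
        System.out.println("with combiner:    " + receive(deliver(sent, Optional.of(sum))));
    }
}
```

Because the receiver only ever sums, it cannot tell whether one merged message or four separate ones arrived, which is the property the method description asks for.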
-
getMessageScopes
Set<MessageScope> getMessageScopes(Memory memory)
This method returns all the MessageScope possibilities for a particular iteration of the vertex program. The returned message scopes are the scopes that will be used to send messages during the stated iteration. It is not a requirement that all stated message scopes be used, just that it is possible that they be used during the iteration.
- Parameters:
memory - an immutable form of the Memory
- Returns:
- all possible message scopes during said vertex program iteration
-
getMapReducers
default Set<MapReduce> getMapReducers()
The set of MapReduce jobs that are associated with the VertexProgram. This is not necessarily the exhaustive list over the life of the GraphComputer. If MapReduce jobs are declared by GraphComputer.mapReduce(), they are not contained in this set. The default is an empty set.
- Returns:
- the set of MapReduce jobs associated with this VertexProgram
-
getTraverserRequirements
default Set<org.apache.tinkerpop.gremlin.process.traversal.traverser.TraverserRequirement> getTraverserRequirements()
The traverser requirements that are needed when this VP is used as part of a traversal. The default is an empty set.
- Returns:
- the traverser requirements
-
clone
VertexProgram<M> clone()
When multiple workers on a single machine need VertexProgram instances, it is possible to use clone. This will provide a speedier way of generating instances, over the storeState(org.apache.commons.configuration2.Configuration) and loadState(org.apache.tinkerpop.gremlin.structure.Graph, org.apache.commons.configuration2.Configuration) model. The default implementation simply returns the object as it assumes that the VertexProgram instance is a stateless singleton.
- Returns:
- A clone of the VertexProgram object
-
getPreferredResultGraph
GraphComputer.ResultGraph getPreferredResultGraph()
-
getPreferredPersist
GraphComputer.Persist getPreferredPersist()
-
createVertexProgram
static <V extends VertexProgram> V createVertexProgram(Graph graph, org.apache.commons.configuration2.Configuration configuration)
A helper method to construct a VertexProgram given the content of the supplied configuration. The class of the VertexProgram is read from the VERTEX_PROGRAM static configuration key. Once the VertexProgram is constructed, the loadState(org.apache.tinkerpop.gremlin.structure.Graph, org.apache.commons.configuration2.Configuration) method is called with the provided graph and configuration.
- Type Parameters:
V - The vertex program type
- Parameters:
graph - The graph that the vertex program will execute against
configuration - A configuration with requisite information to build a vertex program
- Returns:
- the newly constructed vertex program
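The construct-then-load sequence described for this helper can be sketched with plain reflection. This is an illustration under stated assumptions, not the library's implementation: a `Map` stands in for the `Configuration`, the graph argument is omitted, and `EchoProgram`, `create`, and the `sketch.*` keys are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins: a Map plays the role of the Configuration; no TinkerPop types are used.
class ReflectiveCreateSketch {

    static final String PROGRAM_CLASS_KEY = "sketch.programClass"; // hypothetical key name

    public static class EchoProgram {
        String greeting = "unset";

        public EchoProgram() {
            // a no-argument constructor is needed for reflective construction
        }

        void loadState(Map<String, String> configuration) {
            greeting = configuration.getOrDefault("sketch.greeting", "unset");
        }
    }

    // Reads the class name from the configuration, instantiates it, then loads its state.
    static EchoProgram create(Map<String, String> configuration) {
        try {
            Class<?> type = Class.forName(configuration.get(PROGRAM_CLASS_KEY));
            EchoProgram program = (EchoProgram) type.getDeclaredConstructor().newInstance();
            program.loadState(configuration);
            return program;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not construct program", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> configuration = new HashMap<>();
        configuration.put(PROGRAM_CLASS_KEY, EchoProgram.class.getName());
        configuration.put("sketch.greeting", "hello");
        System.out.println(create(configuration).greeting);
    }
}
```

Storing the class name alongside the settings is what lets a receiving machine rebuild the program without compile-time knowledge of its concrete type.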
-
getFeatures
default VertexProgram.Features getFeatures()
-
-