SparkGraphComputer (Apache TinkerPop 3.4.8-SNAPSHOT API)

java.lang.Object
- org.apache.tinkerpop.gremlin.hadoop.process.computer.AbstractHadoopGraphComputer
- - org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer

All Implemented Interfaces:

GraphComputer
```
public final class SparkGraphComputer
extends AbstractHadoopGraphComputer
```
GraphComputer implementation for Apache Spark.

Author:

Marko A. Rodriguez (http://markorodriguez.com)

Nested Class Summary
- Nested classes/interfaces inherited from class org.apache.tinkerpop.gremlin.hadoop.process.computer.AbstractHadoopGraphComputer
  AbstractHadoopGraphComputer.Features
- Nested classes/interfaces inherited from interface org.apache.tinkerpop.gremlin.process.computer.GraphComputer
  GraphComputer.Exceptions, GraphComputer.Persist, GraphComputer.ResultGraph

Field Summary
- Fields inherited from class org.apache.tinkerpop.gremlin.hadoop.process.computer.AbstractHadoopGraphComputer
  executed, graphFilter, hadoopGraph, logger, mapReducers, persist, resultGraph, vertexProgram, workers

Constructor Summary

Constructors
Constructor and Description
`SparkGraphComputer(HadoopGraph hadoopGraph)`

Method Summary

Modifier and Type	Method and Description
`SparkGraphComputer`	`configure(String key, Object value)` Set an arbitrary configuration key/value for the underlying `Configuration` in the `GraphComputer`.
`SparkGraphComputer`	`graphStorageLevel(StorageLevel storageLevel)` Specifies the storage level used to persist the graph created by the `VertexProgram`.
`SparkGraphComputer`	`kryoRegistrationRequired(boolean required)` Determines whether Kryo registration is required, in which case attempts to serialize unregistered classes result in an error.
`protected void`	`loadJar(Configuration hadoopConfiguration, File file, Object... params)`
`static void`	`main(String[] args)`
`SparkGraphComputer`	`master(String clusterManager)` Sets the `spark.master` configuration option, which names the cluster manager to connect to and may be any of the allowed master URLs.
`SparkGraphComputer`	`persistContext(boolean persist)` Determines whether the Spark context is left open, which prevents Spark from garbage collecting unreferenced RDDs.
`SparkGraphComputer`	`persistStorageLevel(StorageLevel storageLevel)`
`SparkGraphComputer`	`serializer(Class<? extends Serializer> serializer)` Specifies the `org.apache.spark.serializer.Serializer` implementation to use.
`SparkGraphComputer`	`skipGraphCache(boolean skip)` Determines if the graph RDD should be cached or not.
`SparkGraphComputer`	`skipPartitioner(boolean skip)` Determines if the graph RDD should be partitioned or not.
`SparkGraphComputer`	`sparkKryoRegistrator(Class<? extends KryoRegistrator> registrator)` Specifies the `org.apache.spark.serializer.KryoRegistrator` to use to install additional types.
`Future<ComputerResult>`	`submit()` Submit the `VertexProgram` and the set of `MapReduce` jobs for execution by the `GraphComputer`.
`SparkGraphComputer`	`workers(int workers)` Sets the number of workers.

Methods inherited from class org.apache.tinkerpop.gremlin.hadoop.process.computer.AbstractHadoopGraphComputer
copyDirectoryIfNonExistent, edges, features, loadJars, mapReduce, persist, program, result, toString, validateStatePriorToExecution, vertices

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait

- Constructor Detail
  - SparkGraphComputer
```
public SparkGraphComputer(HadoopGraph hadoopGraph)
```
- Method Detail
  - workers
```
public SparkGraphComputer workers(int workers)
```
    Sets the number of workers. If spark.master is set to "local", that setting is changed to use the specified number of worker threads.
    
    Specified by:
    
    workers in interface GraphComputer
    
    Overrides:
    
    workers in class AbstractHadoopGraphComputer
    
    Parameters:
    
    workers - the number of workers to execute the submission
    
    Returns:
    
    the updated GraphComputer with newly set worker count
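
    The interaction between workers and a local spark.master can be sketched with plain JDK types. This is a hypothetical model, not TinkerPop code: the class name LocalMasterSketch, the helper withWorkers, and the assumption that any value starting with "local" counts as a local master are illustrative only. The local[N] form itself is a standard Spark master URL.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model, not TinkerPop code: it only mimics the documented
// effect of workers(int) on a "local" spark.master setting.
final class LocalMasterSketch {

    static final String SPARK_MASTER = "spark.master";

    // Returns a copy of conf in which a local master uses `workers` threads.
    static Map<String, String> withWorkers(Map<String, String> conf, int workers) {
        if (workers < 1) {
            throw new IllegalArgumentException("workers must be positive");
        }
        Map<String, String> updated = new HashMap<>(conf);
        String master = updated.get(SPARK_MASTER);
        // Assumption: any value starting with "local" is treated as a local master.
        if (master != null && master.startsWith("local")) {
            updated.put(SPARK_MASTER, "local[" + workers + "]");
        }
        return updated;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(SPARK_MASTER, "local");
        System.out.println(withWorkers(conf, 4).get(SPARK_MASTER)); // prints local[4]
    }
}
```

    A non-local master such as "yarn" passes through this model unchanged, which matches the description above: only a local master is rewritten.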
  - configure
```
public SparkGraphComputer configure(String key,
                                    Object value)
```
    Description copied from interface: GraphComputer
    
    Set an arbitrary configuration key/value for the underlying Configuration in the GraphComputer. Typically, the other fluent methods in GraphComputer should be used to configure the computation. However, for some custom configuration in the underlying engine, this method should be used. Different GraphComputer implementations will have different key/values and thus, parameters placed here are generally not universal to all GraphComputer implementations. The default implementation simply does nothing and returns the GraphComputer unchanged.
    
    Parameters:
    
    key - the key of the configuration
    
    value - the value of the configuration
    
    Returns:
    
    the updated GraphComputer with newly set key/value configuration
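
    The relationship between the generic configure method and a dedicated fluent method can be sketched as follows. FluentConfigSketch is a hypothetical stand-in written against the JDK only, and treating master as a thin wrapper over configure is an assumption made for illustration, not a statement about the TinkerPop implementation. The property names spark.master and spark.executor.memory are ordinary Spark settings.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a fluent computer; not the TinkerPop class.
final class FluentConfigSketch {

    private final Map<String, Object> configuration = new LinkedHashMap<>();

    // Generic escape hatch: store any key/value pair and return this for chaining.
    FluentConfigSketch configure(String key, Object value) {
        configuration.put(key, value);
        return this;
    }

    // Dedicated fluent method, modeled here as a wrapper over configure(...).
    FluentConfigSketch master(String clusterManager) {
        return configure("spark.master", clusterManager);
    }

    Map<String, Object> asMap() {
        return configuration;
    }

    public static void main(String[] args) {
        FluentConfigSketch computer = new FluentConfigSketch()
                .master("local[2]")
                .configure("spark.executor.memory", "1g");
        System.out.println(computer.asMap());
    }
}
```

    Returning this from every method is what lets the calls chain, and it is why each method on this page declares SparkGraphComputer as its return type.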
  - master
```
public SparkGraphComputer master(String clusterManager)
```
    Sets the spark.master configuration option, which names the cluster manager to connect to. The value may be any of the allowed master URLs.
  - persistContext
```
public SparkGraphComputer persistContext(boolean persist)
```
    Determines whether the Spark context is left open, which prevents Spark from garbage collecting unreferenced RDDs.
  - graphStorageLevel
```
public SparkGraphComputer graphStorageLevel(StorageLevel storageLevel)
```
    Specifies the storage level used to persist the graph created by the VertexProgram. By default, it is configured to use StorageLevel#MEMORY_ONLY().
  - persistStorageLevel
```
public SparkGraphComputer persistStorageLevel(StorageLevel storageLevel)
```
  - skipPartitioner
```
public SparkGraphComputer skipPartitioner(boolean skip)
```
    Determines if the graph RDD should be partitioned or not. By default, this value is false.
  - skipGraphCache
```
public SparkGraphComputer skipGraphCache(boolean skip)
```
    Determines if the graph RDD should be cached or not. If true then graphStorageLevel(StorageLevel) is ignored. By default, this value is false.
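
    The precedence between skipGraphCache and graphStorageLevel can be modeled with a small JDK-only sketch. GraphCacheSketch and effectiveStorageLevel are hypothetical names, and storage levels are represented as plain strings rather than Spark StorageLevel objects. The MEMORY_ONLY default is taken from the graphStorageLevel description.

```java
import java.util.Optional;

// Hypothetical model of the documented precedence; not TinkerPop code.
final class GraphCacheSketch {

    // Returns the storage level that would apply, or empty when caching is skipped.
    static Optional<String> effectiveStorageLevel(boolean skipGraphCache, String graphStorageLevel) {
        if (skipGraphCache) {
            // Caching is skipped, so any configured storage level is ignored.
            return Optional.empty();
        }
        // Fall back to the documented default when no level was configured.
        return Optional.of(graphStorageLevel == null ? "MEMORY_ONLY" : graphStorageLevel);
    }

    public static void main(String[] args) {
        System.out.println(effectiveStorageLevel(false, null));           // Optional[MEMORY_ONLY]
        System.out.println(effectiveStorageLevel(false, "DISK_ONLY"));    // Optional[DISK_ONLY]
        System.out.println(effectiveStorageLevel(true, "DISK_ONLY"));     // Optional.empty
    }
}
```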
  - serializer
```
public SparkGraphComputer serializer(Class<? extends Serializer> serializer)
```
    Specifies the org.apache.spark.serializer.Serializer implementation to use. By default, this value is set to org.apache.spark.serializer.KryoSerializer.
  - sparkKryoRegistrator
```
public SparkGraphComputer sparkKryoRegistrator(Class<? extends KryoRegistrator> registrator)
```
    Specifies the org.apache.spark.serializer.KryoRegistrator to use to install additional types. By default this value is set to TinkerPop's GryoRegistrator.
  - kryoRegistrationRequired
```
public SparkGraphComputer kryoRegistrationRequired(boolean required)
```
    Determines whether Kryo registration is required, in which case attempts to serialize classes that are not registered result in an error. By default this value is false.
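
    The effect of requiring registration can be illustrated with a toy registry. RegistrationSketch is a hypothetical JDK-only model and does not use Kryo; the exception type and message are assumptions chosen for the sketch, not a statement about what Kryo or Spark throw.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy registry; it models the flag's effect and does not use Kryo.
final class RegistrationSketch {

    private final Set<Class<?>> registered = new HashSet<>();
    private final boolean registrationRequired;

    RegistrationSketch(boolean registrationRequired) {
        this.registrationRequired = registrationRequired;
    }

    // Registers a type and returns this for chaining.
    RegistrationSketch register(Class<?> type) {
        registered.add(type);
        return this;
    }

    // Throws when registration is required and the type was never registered.
    void checkSerializable(Class<?> type) {
        if (registrationRequired && !registered.contains(type)) {
            throw new IllegalArgumentException("Class is not registered: " + type.getName());
        }
    }

    public static void main(String[] args) {
        RegistrationSketch strict = new RegistrationSketch(true).register(String.class);
        strict.checkSerializable(String.class);      // accepted: registered
        try {
            strict.checkSerializable(Integer.class); // rejected: not registered
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        new RegistrationSketch(false).checkSerializable(Integer.class); // accepted: flag is off
    }
}
```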
  - submit
```
public Future<ComputerResult> submit()
```
    Description copied from interface: GraphComputer
    
    Submit the VertexProgram and the set of MapReduce jobs for execution by the GraphComputer.
    
    Returns:
    
    a Future denoting a reference to the asynchronous computation and where to get the DefaultComputerResult when it is complete.
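
    Because submit returns a Future, the caller decides when to block. The sketch below shows one way to wait with a timeout using only java.util.concurrent. SubmitSketch, its submit stand-in, and the String result are hypothetical; a real call would yield a ComputerResult instead.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical stand-in showing how a caller might consume the returned Future.
final class SubmitSketch {

    // Stand-in for submit(): starts work asynchronously and returns immediately.
    static Future<String> submit() {
        return CompletableFuture.supplyAsync(() -> "computer-result");
    }

    // Blocks for at most timeoutSeconds and converts checked failures to unchecked ones.
    static String awaitResult(Future<String> future, long timeoutSeconds) {
        try {
            return future.get(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // restore the interrupt flag
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException("computation failed or timed out", e);
        }
    }

    public static void main(String[] args) {
        Future<String> future = submit();
        System.out.println(awaitResult(future, 30)); // prints computer-result
    }
}
```

    Using the timed get rather than the untimed one is a design choice for the sketch: it keeps a stalled job from blocking the caller indefinitely.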
  - loadJar
```
protected void loadJar(Configuration hadoopConfiguration,
                       File file,
                       Object... params)
```
    Specified by:
    
    loadJar in class AbstractHadoopGraphComputer
  - main
```
public static void main(String[] args)
                 throws Exception
```
    Throws:
    
    Exception
