TinkerPop Upgrade Information
This document helps users of TinkerPop to understand the changes that come with each software release. It outlines new features, how to resolve breaking changes and other information specific to a release. This document is useful to end-users who are building applications on TinkerPop, but it is equally useful to TinkerPop providers, who build libraries and other systems on the core APIs and protocols that TinkerPop exposes.
These providers include:
- Graph System Provider
  - Graph Database Provider
  - Graph Processor Provider
- Graph Driver Provider
- Graph Language Provider
- Graph Plugin Provider
TinkerPop 3.2.0
Nine Inch Gremlins
TinkerPop 3.2.9
Release Date: May 8, 2018
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Lambda Construction
Shortly after the release of 3.2.8, a bug was discovered in the construction of Lambda instances:
gremlin> org.apache.tinkerpop.gremlin.util.function.Lambda.function("{ it.get() }")
(class: org/apache/tinkerpop/gremlin/util/function/Lambda$function, method: callStatic signature: (Ljava/lang/Class;[Ljava/lang/Object;)Ljava/lang/Object;) Illegal type in constant pool
Type ':help' or ':h' for help.
Display stack trace? [yN]n
The problem was related to a bug in Groovy 2.4.14 and was fixed in 2.4.15.
See: TINKERPOP-1953
TinkerPop 3.2.8
Release Date: April 2, 2018
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Improved Connection Monitoring
Gremlin Server now has two new settings: idleConnectionTimeout and keepAliveInterval. The keepAliveInterval tells Gremlin Server how long it should wait between writes to a client before it issues a "ping" to that client to see if it is still present. The idleConnectionTimeout represents how long Gremlin Server should wait between requests from a client before it closes the connection on the server side. By default, these two configurations are set to zero, meaning that they are both disabled.
This change should help to alleviate issues where connections are left open on the server longer than they should be by clients that might mysteriously disappear without properly closing their connections.
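The interplay of the two settings can be sketched as follows. This is a minimal illustration in Python, not Gremlin Server's implementation: the function name, its arguments, the shared time unit, and the choice to check the idle timeout before the keep-alive are all assumptions made for the example.

```python
def connection_action(now, last_write, last_request,
                      keep_alive_interval=0, idle_connection_timeout=0):
    """Decide what a server would do with one connection (hypothetical helper).

    All values share one unspecified time unit. A setting of 0 disables
    the corresponding check, mirroring the defaults described above.
    """
    # Idle check: too long since the client last sent a request.
    if idle_connection_timeout > 0 and now - last_request >= idle_connection_timeout:
        return "close"
    # Keep-alive check: too long since the server last wrote to the client.
    if keep_alive_interval > 0 and now - last_write >= keep_alive_interval:
        return "ping"
    return "none"


# With both settings at their default of zero, nothing ever happens.
print(connection_action(now=100, last_write=0, last_request=0))
```

With both settings left at zero the function always returns "none", which corresponds to the disabled-by-default behavior.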
See: TINKERPOP-1726
Gremlin.Net Lambdas
Gremlin.Net now has a Lambda class that can be used to construct Groovy or Java lambdas which will be evaluated on the server.
Gremlin.Net Tokens Improved
The various Gremlin tokens (e.g. T, Order, Operator, etc.) that were previously implemented as enums in Gremlin.Net are now implemented as classes. This mainly allows them to implement interfaces which their Java counterparts already did. T, for example, now implements the new interface IFunction, which simply mirrors its Java counterpart Function.
Steps that expect objects for those interfaces as arguments now explicitly use the interface. Before, they used just object as the type for these arguments, which made it hard for users to know what kind of object they could use.
However, usage of these tokens themselves shouldn’t change at all (e.g. T.Id is still T.Id).
See: TINKERPOP-1901
Gremlin.Net: Traversal Predicate Classes Merged
Gremlin.Net previously used two classes for traversal predicates: P and TraversalPredicate. Steps that worked with traversal predicates expected objects of type TraversalPredicate, but they were constructed from the P class (e.g. P.Gt(1) returned a TraversalPredicate). Merging these two classes into the P class should avoid unnecessary confusion. Most users should not notice this change as predicates can still be constructed exactly as before, e.g., P.Gt(1).And(P.Lt(3)) still works without any modifications.
Only users that implemented their own predicates and used TraversalPredicate as the base class need to change their implementation to now use P as the new base class.
See: TINKERPOP-1919
Upgrading for Providers
Graph System Providers
Kitchen Sink Test Graph
The "Kitchen Sink" test graph has been added to the gremlin-test module. It contains (or will contain) various disconnected subgraphs that offer unique structures (e.g. a self-loop) for specific test cases. Graph systems that use the test suite should not have to make any changes to account for this new graph unless that system performs some form of special pre-initialization in preparation for loading (e.g. requires a schema) or loads the graph test data outside of the standard method that TinkerPop provides.
See: TINKERPOP-1877
TinkerPop 3.2.7
Release Date: December 17, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin-Python Core Types
With the addition of UUID, Date, and Timestamp, Gremlin-Python now implements serializers for all core GraphSON types. Users that were using other types to represent this data can now use the Python classes datetime.datetime and uuid.UUID in GLV traversals.
Since Python does not support a native Timestamp object, Gremlin-Python now offers a dummy class Timestamp, which allows users to wrap a float and submit it to the Gremlin Server as a Timestamp GraphSON type. Timestamp can be found in gremlin_python.statics.
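The type mapping described above can be sketched with the standard library alone. This is an illustration rather than Gremlin-Python's serializer code: TimestampSketch is a hypothetical stand-in for the float-wrapping Timestamp class, graphson_type_of is an invented helper, and the "g:UUID", "g:Date" and "g:Timestamp" identifiers are assumed to be the GraphSON type names involved.

```python
import datetime
import uuid


class TimestampSketch(float):
    """Hypothetical stand-in for a class that wraps a float timestamp."""


# Assumed GraphSON type identifiers for the native Python classes.
GRAPHSON_TYPE_BY_PYTHON_TYPE = {
    uuid.UUID: "g:UUID",
    datetime.datetime: "g:Date",
}


def graphson_type_of(value):
    """Return the assumed GraphSON type identifier for a Python value."""
    # The wrapper is checked first because Python has no native timestamp type.
    if isinstance(value, TimestampSketch):
        return "g:Timestamp"
    for python_type, type_id in GRAPHSON_TYPE_BY_PYTHON_TYPE.items():
        if isinstance(value, python_type):
            return type_id
    raise TypeError("no mapping for %r" % type(value))


print(graphson_type_of(uuid.uuid4()))
print(graphson_type_of(datetime.datetime(2017, 12, 17)))
print(graphson_type_of(TimestampSketch(1513468800.0)))
```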
See: TINKERPOP-1807
EventStrategy Detachment
EventStrategy forced detachment of mutated elements prior to raising them in events. While this was a desired outcome, it may not have always fit every use case. For example, a user may have wanted a reference element or the actual element itself. As a result, EventStrategy has changed to allow it to be constructed with a detach() option, where it is possible to specify any of the following: null for no detachment, DetachedFactory for the original behavior, and ReferenceFactory for detachment that returns reference elements.
See: TINKERPOP-1829
Embedded Remote Connection
As Gremlin Language Variants (GLVs) expand their usage and use of withRemote() becomes more common, the need to mock the "remote" in unit tests increases. To simplify mocking in Java, the new EmbeddedRemoteConnection provides a simple way to provide a "remote" that is actually local to the same JVM.
See: TINKERPOP-1756
DSL Type Specification
Prior to this version, the Java annotation processor for Gremlin DSLs has tried to infer the appropriate type specifications when generating anonymous methods. It largely performed this inference on simple conventions in the DSL method’s template specification and there were times where it would fail. For example, a method like this:
public default GraphTraversal<S, E> person() {
    return hasLabel("person");
}
would generate an anonymous method like:
public static <S> SocialGraphTraversal<S, E> person() {
    return hasLabel("person");
}
and, of course, generate a compile error, as E was not recognized as a symbol. The preferred generation would likely be:
public static <S> SocialGraphTraversal<S, S> person() {
    return hasLabel("person");
}
To remedy this situation, a new annotation has been added which allows the user to control the type specifications more directly providing a way to avoid/override the inference system:
@GremlinDsl.AnonymousMethod(returnTypeParameters = {"A", "A"}, methodTypeParameters = {"A"})
public default GraphTraversal<S, E> person() {
    return hasLabel("person");
}
which will then generate:
public static <A> SocialGraphTraversal<A, A> person() {
    return hasLabel("person");
}
See: TINKERPOP-1791
Specify a Cluster Object
The :remote connect command can now take a pre-defined Cluster object as its argument as opposed to a YAML configuration file.
gremlin> cluster = Cluster.open()
==>localhost/127.0.0.1:8182
gremlin> :remote connect tinkerpop.server cluster
==>Configured localhost/127.0.0.1:8182
See: TINKERPOP-1787
Remote Traversal Timeout
There was limited support for "timeouts" with remote traversals (i.e. those traversals executed using the withRemote() option) prior to 3.2.7. Remote traversals will now interrupt on the server using the scriptEvaluationTimeout setting in the same way that normal script evaluations would. As a reminder, interruptions for traversals are always considered "attempts to interrupt" and may not always succeed (a graph database implementation might not respect the interruption, for example).
See: TINKERPOP-1770
Modifications to match()
The match()-step has been generalized to support the local scoping of all barrier steps, not just reducing barrier steps. Previously, the order().limit() clause would have worked globally, yielding:
gremlin> g.V().match(
......1> __.as('a').outE('created').order().by('weight',decr).limit(1).inV().as('b'),
......2> __.as('b').has('lang','java')
......3> ).select('a','b').by('name')
==>[a:marko,b:lop]
Now, however, order() (and all other barriers) is treated as a local computation within the pattern and thus the result set is:
gremlin> g.V().match(
......1> __.as('a').outE('created').order().by('weight',decr).limit(1).inV().as('b'),
......2> __.as('b').has('lang','java')
......3> ).select('a','b').by('name')
==>[a:marko,b:lop]
==>[a:josh,b:ripple]
==>[a:peter,b:lop]
Note that this is not a severe breaking change, as all of the reducing barriers already behaved in this manner. This includes steps like count(), min(), max(), sum(), group(), groupCount(), etc. This update has now generalized this behavior to all barriers and thus adds aggregate(), dedup(), range(), limit(), tail(), and order() to the list of locally computed clauses.
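The local scoping can be reproduced with a small simulation. The sketch below is plain Python rather than Gremlin, and it assumes the data is TinkerPop's "modern" toy graph (the vertex names in the console output suggest this, but the edge weights and the function name are assumptions). The ordering and limit are applied per starting vertex, which is what "local to the pattern" means here.

```python
# "created" edges as (person, software, weight); assumed "modern" toy graph data.
CREATED = [
    ("marko", "lop", 0.4),
    ("josh", "ripple", 1.0),
    ("josh", "lop", 0.4),
    ("peter", "lop", 0.2),
]
LANG = {"lop": "java", "ripple": "java"}
PEOPLE = ["marko", "vadas", "josh", "peter"]  # vadas has no "created" edges


def match_with_local_barriers():
    """Apply order-by-weight-descending and limit(1) separately for each 'a'."""
    results = []
    for person in PEOPLE:
        edges = [edge for edge in CREATED if edge[0] == person]
        edges.sort(key=lambda edge: edge[2], reverse=True)  # order().by('weight', decr)
        for _, software, _ in edges[:1]:                     # limit(1), scoped to this person
            if LANG.get(software) == "java":                 # second pattern: has('lang', 'java')
                results.append({"a": person, "b": software})
    return results


for row in match_with_local_barriers():
    print(row)
```

Each person contributes at most one row, which lines up with the three results shown for the new behavior.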
See: TINKERPOP-1764
Clone a Graph
In gremlin-test there is a new GraphHelper class that has a cloneElements() method. It will clone elements from the first graph to the second - GraphHelper.cloneElements(Graph original, Graph clone). This helper method is primarily intended for use in tests.
MutationListener Changes
The MutationListener has a method called vertexPropertyChanged which gathered callbacks when a property on a vertex was modified. The method had an incorrect signature, though, using Property instead of VertexProperty. The old method that used Property has now been deprecated and a new method added that uses VertexProperty. This new method has a default implementation that calls the old method, so this change should not cause breaks in compilation on upgrade. Internally, TinkerPop no longer calls the old method except by way of that proxy. Users who have MutationListener implementations can simply add the new method and override its behavior. The old method can thus be ignored completely.
See: TINKERPOP-1798
Upgrading for Providers
Direction.BOTH Requires Duplication of Self-Edges
Prior to this release, there was no semantic check to determine whether a self-edge (e.g. e[1][2-self→2]) would be returned twice on a BOTH. The semantics have now been specified in the test suite: the edge should be returned twice, as it is both an incoming edge and an outgoing edge.
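A minimal sketch of the expected semantics, written in plain Python rather than against the TinkerPop structure API. The edge tuples, the extra "knows" edge, and the edges_of helper are hypothetical and exist only to show the self-edge appearing twice for BOTH.

```python
# Edges as (edge_id, label, out_vertex, in_vertex); edge 1 is a self-edge on vertex 2.
EDGES = [
    (1, "self", 2, 2),
    (2, "knows", 2, 3),
]


def edges_of(vertex_id, direction):
    """Return the edges incident to a vertex for OUT, IN or BOTH."""
    out_edges = [edge for edge in EDGES if edge[2] == vertex_id]
    in_edges = [edge for edge in EDGES if edge[3] == vertex_id]
    if direction == "OUT":
        return out_edges
    if direction == "IN":
        return in_edges
    if direction == "BOTH":
        # No de-duplication: a self-edge is both outgoing and incoming.
        return out_edges + in_edges
    raise ValueError("unknown direction: %s" % direction)


print(edges_of(2, "BOTH"))
```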
See: TINKERPOP-1821
TinkerPop 3.2.6
Release Date: August 21, 2017
Upgrading for Users
Please see the changelog for a complete list of all the modifications that are part of this release.
Deprecated useMapperFromGraph
The useMapperFromGraph configuration option for the Gremlin Server serializers has been deprecated. Change configuration files to use the ioRegistries option instead. The ioRegistries option is not a new feature, but until now it has not been promoted as the primary way to add IoRegistry instances to serializers.
See: TINKERPOP-1694
WsAndHttpChannelizer
The WsAndHttpChannelizer has been added to allow for processing both WebSocket and HTTP requests on the same port and Gremlin Server. The SaslAndHttpBasicAuthenticationHandler has also been added to service authentication for both protocols in conjunction with the SimpleAuthenticator.
See: TINKERPOP-915
Upgrading for Providers
ReferenceVertex Label
ReferenceVertex.label() was hard coded to return EMPTY_STRING. At some point, ReferenceElements were supposed to return labels and ReferenceVertex was never updated as such. Note that ReferenceEdge and ReferenceVertexProperty work as expected. However, given a general change at ReferenceElement, the Gryo serialization of ReferenceXXX is different. If the vertex does not have a label, Vertex.DEFAULT_LABEL is assumed.
See: TINKERPOP-1789
TinkerPop 3.2.5
Release Date: June 12, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
GraphSON Path Serialization
Serialization of Path with GraphSON was inconsistent with Gryo in that all the properties on any elements of the Path were being included. With Gryo that, correctly, was not happening as that could be extraordinarily expensive. GraphSON serialization has now been modified to properly not include properties. That change can cause breaks in application code if that application code tries to access properties on elements in a Path as they will no longer be there. Applications that require the properties will need to alter their Gremlin to better restrict the data they want to retrieve.
See: TINKERPOP-1676
DSL Support
It has always been possible to construct Domain Specific Languages (DSLs) with Gremlin, but the approach has required a somewhat deep understanding of the TinkerPop code base and it is not something that has had a recommended method for implementation. With this release, TinkerPop simplifies DSL development and provides the best practices for their implementation.
// standard Gremlin
g.V().hasLabel('person').
where(outE("created").count().is(P.gte(2))).count()
// the same traversal as above written as a DSL
social.persons().where(createdAtLeast(2)).count()
Authentication Configuration
The server settings previously used authentication.className to set an authenticator for the two provided authentication handler and channelizer classes to use. This has been deprecated in favor of authentication.authenticator.
A class that extends AbstractAuthenticationHandler may also now be provided as authentication.authenticationHandler to be used in either of the provided channelizer classes to handle the provided authenticator.
See: TINKERPOP-1657
Default Maximum Parameters
It was learned that compilation for scripts with large numbers of parameters is more expensive than for those with fewer parameters. It therefore becomes possible to make some mistakes with how Gremlin Server is used. A new setting on the StandardOpProcessor and SessionOpProcessor called maxParameters controls the number of parameters that can be passed in on a request. This setting is defaulted to sixteen.
Users upgrading to this version may notice errors in their applications if they use more than sixteen parameters. To fix this problem simply reconfigure Gremlin Server with a configuration as follows:
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { maxParameters: 64 }}
- { className: org.apache.tinkerpop.gremlin.server.op.standard.StandardOpProcessor, config: { maxParameters: 64 }}
The above configuration allows sixty-four parameters to be passed on each request.
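The effect of the limit can be pictured with a small client-side check. This is a hypothetical helper, not part of any TinkerPop driver: the function name and the choice of ValueError are assumptions, and it treats the limit as inclusive (exactly max_parameters bindings are allowed), which is also an assumption.

```python
def check_parameter_count(bindings, max_parameters=16):
    """Reject a request whose bindings exceed the configured limit (hypothetical)."""
    if len(bindings) > max_parameters:
        raise ValueError(
            "request has %d parameters but at most %d are allowed"
            % (len(bindings), max_parameters)
        )
    return bindings


# Seventeen parameters fail with the default of sixteen but pass with a limit of 64.
many = {"p%d" % i: i for i in range(17)}
try:
    check_parameter_count(many)
except ValueError as error:
    print(error)
print(len(check_parameter_count(many, max_parameters=64)))
```

Running a check like this before submitting a script can surface the problem in application code rather than as a server-side error.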
See: TINKERPOP-1663
GremlinScriptEngine Metrics
The GremlinScriptEngine has a number of new metrics about its cache size and script compilation times which should be helpful in understanding usage problems. As GremlinScriptEngine instances are used in Gremlin Server these metrics are naturally exposed as part of the standard metrics set. Note that metrics are captured for both sessionless requests as well as for each individual session that is opened.
See: TINKERPOP-1644
Additional Error Information
Additional information on error responses from Gremlin Server should help make debugging errors easier. Error responses now have both the exception hierarchy and the stack trace that was generated on the server. In this way, receiving an error on a client doesn’t mean having to rifle through Gremlin Server logs to try to find the associated error.
This change has been applied to all Gremlin Server protocols. For the binary protocol and the Java driver this change means that the ResponseException thrown from calls to submit() requests to the server now has the following methods:
public Optional<String> getRemoteStackTrace()
public Optional<List<String>> getRemoteExceptionHierarchy()
The HTTP protocol has also been updated and returns both exceptions and stackTrace fields in the response:
{
"message": "Division by zero",
"Exception-Class": "java.lang.ArithmeticException",
"exceptions": ["java.lang.ArithmeticException"],
"stackTrace": "java.lang.ArithmeticException: Division by zero\n\tat java.math.BigDecimal.divide(BigDecimal.java:1742)\n\tat org.codehaus.groovy.runtime.typehandling.BigDecimalMath.divideImpl(BigDecimalMath.java:68)\n\tat org.codehaus.groovy.runtime.typehandling.IntegerMath.divideImpl(IntegerMath.java:49)\n\tat org.codehaus.groovy.runtime.dgmimpl.NumberNumberDiv$NumberNumber.invoke(NumberNumberDiv.java:323)\n\tat org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)\n\tat org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)\n\tat org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)\n\tat org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)\n\tat Script4.run(Script4.groovy:1)\n\tat org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:834)\n\tat org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:547)\n\tat javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)\n\tat org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120)\n\tat org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$2(GremlinExecutor.java:314)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\n"
}
Note that the Exception-Class field, which was added in a previous version, has been deprecated and replaced by these new fields.
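A client consuming the HTTP response might read the new fields along these lines. The sketch uses only Python's json module; the summarize_error helper is hypothetical, the response body is a shortened version of the example above, and falling back to the deprecated Exception-Class field is a design choice made for this illustration.

```python
import json

# Shortened version of the error body shown above, built with json.dumps
# so that the newline and tab escapes in the stack trace are encoded correctly.
RESPONSE_BODY = json.dumps({
    "message": "Division by zero",
    "Exception-Class": "java.lang.ArithmeticException",
    "exceptions": ["java.lang.ArithmeticException"],
    "stackTrace": "java.lang.ArithmeticException: Division by zero\n"
                  "\tat java.math.BigDecimal.divide(BigDecimal.java:1742)\n",
})


def summarize_error(body):
    """Extract the message, exception hierarchy and first stack trace line."""
    error = json.loads(body)
    exceptions = error.get("exceptions")
    if exceptions is None:
        # Older responses may only carry the deprecated single-class field.
        legacy = error.get("Exception-Class")
        exceptions = [legacy] if legacy else []
    lines = error.get("stackTrace", "").splitlines()
    return {
        "message": error.get("message"),
        "exceptions": exceptions,
        "first_stack_line": lines[0] if lines else "",
    }


print(summarize_error(RESPONSE_BODY))
```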
See: TINKERPOP-1044
Gremlin Console Scripting
The gremlin.sh command has two flags, -i and -e, which are used to pass a script and arguments into the Gremlin Console for execution. Those flags now allow multiple scripts and their related arguments to be supplied, which can yield greater flexibility in automation tasks.
$ bin/gremlin.sh -i y.groovy 1 2 3 -i x.groovy
$ bin/gremlin.sh -e y.groovy 1 2 3 -e x.groovy
See: TINKERPOP-1653
Path support for by()-, from()-, to()-modulation
It is now possible to extract and analyze sub-paths using from() and to() modulations with the respective path-based steps. Likewise, simplePath() and cyclicPath() now support, along with from() and to(), by()-modulation so the cyclicity is determined by projections of the path data. This extension is fully backwards compatible.
See: TINKERPOP-1387
GraphManager versus DefaultGraphManager
Gremlin Server previously implemented its own final GraphManager class. Now, the GraphManager has been changed to an interface, and users can supply their own GraphManager implementations in their YAML. The previous GraphManager class was meant to be used by classes internal to Gremlin Server, but it was public, so if it was used for some reason by users then a compile error can be expected. To correct this problem, which will likely manifest as a compile error when trying to create a new GraphManager() instance, simply change the code to new DefaultGraphManager(Settings).
In addition to the change mentioned above, several methods on GraphManager were deprecated:
- getGraphs() should be replaced by the combination of getGraphNames() and then getGraph(String)
- getTraversalSources() is similarly replaced and should instead use a combination of getTraversalSourceNames() and getTraversalSource(String)
See: TINKERPOP-1438
Gremlin-Python Driver
Gremlin-Python now offers a more complete driver implementation that uses connection pooling and the Python concurrent.futures module to provide asynchronous I/O using threading. The default underlying WebSocket client implementation is still provided by Tornado, but it is trivial to plug in another client by defining the Transport interface.
Using the DriverRemoteConnection class is exactly the same as in previous versions; however, DriverRemoteConnection now uses the new Client class to submit messages to the server.
The Client class implementation/interface is based on the Java Driver, with some restrictions. Most notably, Gremlin-Python does not yet implement the Cluster class. Instead, Client is instantiated directly. Usage is as follows:
from gremlin_python.driver import client
client = client.Client('ws://localhost:8182/gremlin', 'g')
result_set = client.submit('1 + 1')
future_results = result_set.all() # returns a concurrent.futures.Future
results = future_results.result() # returns a list
assert results == [2]
client.close() # don't forget to close underlying connections
See: TINKERPOP-1599
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
SimplePathStep and CyclicPathStep now PathFilterStep
The Gremlin traversal machine used to support two step instructions: SimplePathStep and CyclicPathStep. These have been replaced by a high-level instruction called PathFilterStep, which is configured with a boolean for simple or cyclic paths. Furthermore, PathFilterStep also supports from()-, to()-, and by()-modulation.
LazyBarrierStrategy No Longer End Appends Barriers
LazyBarrierStrategy was trying to do too much by considering Traverser effects on network I/O by appending a NoOpBarrierStrategy to the end of the root traversal. This should not be accomplished by LazyBarrierStrategy, but instead by RemoteStrategy. RemoteStrategy now tries to barrier-append. This may affect the reasoning logic in some ProviderStrategies. Most likely not, but just be aware.
See: TINKERPOP-1627
TinkerPop 3.2.4
Release Date: February 8, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
TinkerGraph Deserialization
A TinkerGraph deserialized from Gryo or GraphSON is now configured with multi-properties enabled. This change allows TinkerGraphs returned from Gremlin Server to properly return multi-properties, which was a problem seen when subgraphing a graph that contained properties with a setting other than Cardinality.single.
This change could be considered breaking in the odd chance that a TinkerGraph returned from Gremlin Server was later mutated, because calls to property(k,v) would default to Cardinality.list instead of Cardinality.single. In the event that this is a problem, simply change calls to property(k,v) to property(Cardinality.single,k,v) and explicitly set the Cardinality.
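The difference between the two cardinalities can be illustrated with a plain dictionary. This is a simulation in Python, not TinkerGraph code: set_property and its string-valued cardinality argument are hypothetical, and the default of "list" is chosen to mirror the new behavior for deserialized graphs described above.

```python
def set_property(properties, key, value, cardinality="list"):
    """Store a value under a key using list or single cardinality (hypothetical)."""
    if cardinality == "single":
        # Single cardinality replaces whatever values were stored before.
        properties[key] = [value]
    elif cardinality == "list":
        # List cardinality keeps earlier values and appends the new one.
        properties.setdefault(key, []).append(value)
    else:
        raise ValueError("unsupported cardinality: %s" % cardinality)
    return properties


vertex_properties = {}
set_property(vertex_properties, "name", "marko")
set_property(vertex_properties, "name", "marko a. rodriguez")
print(vertex_properties)  # two values are kept under list cardinality

set_property(vertex_properties, "name", "marko", cardinality="single")
print(vertex_properties)  # explicitly choosing single leaves one value
```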
See: TINKERPOP-1587
Traversal Promises
The Traversal API now has a new promise() method. This method returns a promise in the form of a CompletableFuture. Usage is as follows:
gremlin> promise = g.V().out().promise{it.next()}
==>java.util.concurrent.CompletableFuture@4aa3d36[Completed normally]
gremlin> promise.join()
==>v[3]
gremlin> promise.isDone()
==>true
gremlin> g.V().out().promise{it.toList()}.thenApply{it.size()}.get()
==>6
At this time, this method is only used for traversals that are configured using withRemote().
See: TINKERPOP-1490
If/Then-Semantics with Choose Step
Gremlin’s choose()-step supports if/then/else-semantics. Previously, to effect if/then-semantics, identity() was required as the "else" branch. Thus, the two traversals below are equivalent, with the latter being possible in this release.
g.V().choose(hasLabel('person'),out('created'),identity())
g.V().choose(hasLabel('person'),out('created'))
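The equivalence can be mimicked in plain Python by giving the "else" branch a default of identity. The sketch is a simulation, not Gremlin: the choose function, the tiny vertex tables, and the branch helpers are all hypothetical.

```python
# A tiny made-up graph: vertex labels and "created" adjacency.
LABELS = {"marko": "person", "vadas": "person", "lop": "software"}
CREATED = {"marko": ["lop"]}


def identity(vertex):
    """Pass the traverser through unchanged."""
    return [vertex]


def is_person(vertex):
    return LABELS[vertex] == "person"


def out_created(vertex):
    return CREATED.get(vertex, [])


def choose(vertices, predicate, true_branch, false_branch=identity):
    """If/then/else over each vertex; the else branch defaults to identity."""
    results = []
    for vertex in vertices:
        branch = true_branch if predicate(vertex) else false_branch
        results.extend(branch(vertex))
    return results


vertices = ["marko", "vadas", "lop"]
with_identity = choose(vertices, is_person, out_created, identity)
without_identity = choose(vertices, is_person, out_created)
print(with_identity == without_identity)
```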
See: TINKERPOP-1508
FastNoSuchElementException converted to regular NoSuchElementException
Previously, a call to Traversal.next() that did not have a result would throw a FastNoSuchElementException. This has been changed to a regular NoSuchElementException that includes the stack trace. Code that explicitly catches FastNoSuchElementException should be converted to check for the more general class of NoSuchElementException.
See: TINKERPOP-1330
ScriptEngine support in gremlin-core
ScriptEngine and GremlinPlugin infrastructure has been moved from gremlin-groovy to gremlin-core to allow for better re-use across different Gremlin Language Variants. At this point, this change is non-breaking as it was implemented through deprecation.
The basic concept of a ScriptEngine has been replaced by the notion of a GremlinScriptEngine (i.e. a "ScriptEngine" that is specifically tuned for executing Gremlin-related scripts). "ScriptEngine" infrastructure has been developed to help support this new interface, specifically GremlinScriptEngineFactory and GremlinScriptEngineManager. Prefer use of this infrastructure when instantiating a GremlinScriptEngine rather than trying to instantiate directly.
For example, rather than instantiate a GremlinGroovyScriptEngine with the constructor:
GremlinScriptEngine engine = new GremlinGroovyScriptEngine();
prefer to instantiate it as follows:
GremlinScriptEngineManager manager = new CachedGremlinScriptEngineManager();
GremlinScriptEngine engine = manager.getEngineByName("gremlin-groovy");
Related to the addition of GremlinScriptEngine, org.apache.tinkerpop.gremlin.groovy.plugin.GremlinPlugin in gremlin-groovy has been deprecated and replaced by org.apache.tinkerpop.gremlin.jsr223.GremlinPlugin. The new version of GremlinPlugin is similar but does carry some new methods to implement that involve the new Customizer interface. The Customizer interface is the way in which a GremlinScriptEngine instance can be configured with imports, initialization scripts, compiler options, etc.
Note that a GremlinPlugin can be applied to a GremlinScriptEngine by adding it to the GremlinScriptEngineManager that creates it.
GremlinScriptEngineManager manager = new CachedGremlinScriptEngineManager();
manager.addPlugin(ImportGremlinPlugin.build().classImports(java.awt.Color.class).create());
GremlinScriptEngine engine = manager.getEngineByName("gremlin-groovy");
All of this new infrastructure is currently optional on the 3.2.x line of code. More detailed documentation for these changes will be supplied as part of 3.3.0 when these features become mandatory and the deprecated code is removed.
See: TINKERPOP-1562
SSL Client Authentication
A new server configuration option, ssl.needClientAuth, has been added.
See: TINKERPOP-1602
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph Database Providers
CloseableIterator
Prior to TinkerPop 3.x, Blueprints had the notion of a CloseableIterable which exposed a way for Graph Providers to release resources that might have been opened when returning vertices and edges. That interface was never exposed in TinkerPop 3.x, but has now been made available via the new CloseableIterator. Providers may choose to use this interface or not when returning values from Graph.vertices() and Graph.edges().
It will be up to users to know whether or not they need to call close(). Of course, users should typically not be operating with the Graph Structure API, so it’s unlikely that they would be calling these methods directly in the first place. It is more likely that users will be calling Traversal.close(). This method will essentially iterate the steps of the Traversal and simply call close() on any steps that implement AutoCloseable. By default, GraphStep now implements AutoCloseable, which most Graph Providers will extend upon (as was done with TinkerGraph’s TinkerGraphStep), so the integration should largely come for free if the provider simply returns a CloseableIterator from Graph.vertices() and Graph.edges().
See: TINKERPOP-1589
HasContainer AndP Splitting
Previously, GraphTraversal made it easy for providers to analyze P-predicates in HasContainers by always splitting AndP predicates into their component parts. This helper behavior is no longer provided because 1.) AndP can be inserted into a XXXStep in other ways, 2.) the provider’s XXXStep should process AndP regardless of the GraphTraversal helper, and 3.) the GraphTraversal helper did not recursively split.
A simple way to split AndP in any custom XXXStep that implements HasContainerHolder is to use the following method:
@Override
public void addHasContainer(final HasContainer hasContainer) {
    if (hasContainer.getPredicate() instanceof AndP) {
        for (final P<?> predicate : ((AndP<?>) hasContainer.getPredicate()).getPredicates()) {
            this.addHasContainer(new HasContainer(hasContainer.getKey(), predicate));
        }
    } else
        this.hasContainers.add(hasContainer);
}
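The recursion in the method above can be seen in isolation with a language-neutral sketch. The Python classes P and AndP below are simplified stand-ins for the TinkerPop classes of the same name, and split_and is a hypothetical helper; the point is only that nested AndP predicates end up fully flattened.

```python
class P:
    """Stand-in for a simple predicate such as gt(1)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


class AndP:
    """Stand-in for a conjunction of predicates, which may itself be nested."""

    def __init__(self, *predicates):
        self.predicates = list(predicates)


def split_and(key, predicate):
    """Return a flat list of (key, predicate) pairs, recursing into AndP."""
    if isinstance(predicate, AndP):
        pairs = []
        for child in predicate.predicates:
            pairs.extend(split_and(key, child))  # recurse so nested AndP is split too
        return pairs
    return [(key, predicate)]


nested = AndP(P("gt", 1), AndP(P("lt", 10), P("neq", 5)))
for key, predicate in split_and("age", nested):
    print(key, predicate.name, predicate.value)
```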
See: TINKERPOP-1482, TINKERPOP-1502
Duplicate Multi-Properties
Added supportsDuplicateMultiProperties to VertexFeatures so that graph providers who only support unique values as multi-properties have more flexibility in describing their graph capabilities.
See: TINKERPOP-919
Deprecated OptIn
In 3.2.1, all junit-benchmark performance tests were deprecated. At that time, the OptIn representations of these tests should have been deprecated as well, but they were not. That omission has been remedied now. Specifically, the following fields were deprecated:
- OptIn.SUITE_GROOVY_ENVIRONMENT_PERFORMANCE
- OptIn.SUITE_PROCESS_PERFORMANCE
- OptIn.SUITE_STRUCTURE_PERFORMANCE
As of 3.2.4, the following test suites were also deprecated:
- OptIn.SUITE_GROOVY_PROCESS_STANDARD
- OptIn.SUITE_GROOVY_PROCESS_COMPUTER
- OptIn.SUITE_GROOVY_ENVIRONMENT
- OptIn.SUITE_GROOVY_ENVIRONMENT_INTEGRATE
Future testing of gremlin-groovy (and language variants in general) will be handled differently and will not require a Graph Provider to validate its operations with it. Graph Providers may now choose to remove these tests from their test suites, which should reduce the testing burden.
See: TINKERPOP-1610
Deprecated getInstance()
TinkerPop has generally preferred static instance() methods over getInstance(), but getInstance() was used in some cases nonetheless. As of this release, getInstance() methods have been deprecated in favor of instance(). Of specific note, custom IoRegistry (as related to IO in general) and Supplier<ClassResolver> (as related to Gryo serialization in general) now both prefer instance() over getInstance() given this deprecation.
See: TINKERPOP-1530
Driver Providers
Force Close
Closing a session will first attempt a proper close of any open transactions. A problem can occur, however, if there is a long-running job (e.g. an OLAP-based traversal) executing, as that job will block the calls to close the transactions. By exercising the option to do a "forced close", the session will skip trying to close the transactions and just attempt to interrupt the long-running job. By not closing transactions, the session leaves it up to the underlying graph database to sort out how it will deal with those orphaned transactions. On the positive side though (for those graphs which do that well), long-running jobs have the opportunity to be cancelled without waiting for a timeout of the job itself, which will allow resources to be released earlier.
The "force" argument is passed on the "close" message and is a boolean value. This is an optional argument to "close" and defaults to false.
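For driver authors, a "close" message carrying the "force" argument might be assembled as in the sketch below. Only the "close" operation and the boolean "force" argument come from the text above; the remaining field names ("requestId", "processor", "args", "session") reflect one reading of the Gremlin Server request format and should be treated as assumptions, as should the build_close_request helper itself.

```python
import json
import uuid


def build_close_request(session_id, force=False):
    """Build a session close request as a plain dictionary (hypothetical helper)."""
    return {
        "requestId": str(uuid.uuid4()),  # assumed field: unique id for the request
        "op": "close",
        "processor": "session",          # assumed field: session-based processor
        "args": {
            "session": session_id,       # assumed field: the session to close
            "force": force,              # optional boolean, false when omitted by callers
        },
    }


request = build_close_request("my-session", force=True)
print(json.dumps(request["args"], sort_keys=True))
```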
SASL Authentication
Gremlin Server supports SASL-based authentication. The server accepts either a byte array or Base64 encoded String in the sasl argument on the RequestMessage; however, it sends back a byte array only. Some serializers or serializer configurations don’t work well with that approach (specifically the "toString" configuration on the Gryo serializer) as the byte array is returned in the ResponseMessage result. In the case of the "toString" serializer the byte array gets "toString’d" and then can’t be read by the client.
In 3.2.4, the byte array is still returned in the ResponseMessage
result, but is also returned in the status
attributes under a sasl
key as a Base64 encoded string. In this way, the client has options on how it chooses to
process the authentication response and the change remains backward compatible. Drivers should upgrade to using the
Base64 encoded string however as the old approach will likely be removed in the future.
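On the driver side, reading the Base64 form amounts to a map lookup and a decode. The sketch below is a self-contained illustration, not driver code: the status attributes are modelled as a plain Map, and SaslAttributeSketch and decodeSaslChallenge are hypothetical names. Only the sasl key and its Base64 encoding come from the description above.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

final class SaslAttributeSketch {

    // Hypothetical helper: reads the Base64 "sasl" status attribute and decodes it to bytes.
    static byte[] decodeSaslChallenge(final Map<String, Object> statusAttributes) {
        final Object encoded = statusAttributes.get("sasl");
        if (!(encoded instanceof String))
            throw new IllegalStateException("no Base64 encoded 'sasl' attribute present");
        return Base64.getDecoder().decode((String) encoded);
    }

    public static void main(final String[] args) {
        // Simulate what a server response might carry in its status attributes.
        final Map<String, Object> attributes = new HashMap<>();
        attributes.put("sasl", Base64.getEncoder()
                .encodeToString("challenge".getBytes(StandardCharsets.UTF_8)));

        final byte[] challenge = decodeSaslChallenge(attributes);
        System.out.println(new String(challenge, StandardCharsets.UTF_8));
    }
}
```

Because the value travels as a string, it is not affected by serializer configurations that alter raw byte arrays.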
See: TINKERPOP-1600
TinkerPop 3.2.3
Release Date: October 17, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Renamed Null Result Preference
In 3.2.2, the Gremlin Console introduced a setting called empty.result.indicator
, which controlled the output that
was presented when no result was returned. For consistency, this setting has been renamed to result.indicator.null
and can be set as follows:
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.close()
==>null
gremlin> :set result.indicator.null nil
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.close()
==>nil
gremlin> :set result.indicator.null ""
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.close()
gremlin>
See: TINKERPOP-1409
Java Driver Keep-Alive
The Java Driver now has a keepAliveInterval
setting, which controls the amount of time in milliseconds it should wait
on an inactive connection before it sends a message to the server to keep the connection maintained. This should help
environments that use a load balancer in front of Gremlin Server by ensuring connections are actively maintained even
during periods of inactivity.
See: TINKERPOP-1249
Where Step Supports By-Modulation
It is now possible to use by()
with where()
predicate-based steps. Previously, without using match()
, if you wanted
to know who was older than their friend, the following traversal would be used.
gremlin> g.V().as('a').out('knows').as('b').
......1> filter(select('a','b').by('age').where('a', lt('b')))
==>v[4]
Now, with where().by()
support, the above traversal can be expressed more succinctly and more naturally as follows.
gremlin> g.V().as('a').out('knows').as('b').
......1> where('a', lt('b')).by('age')
==>v[4]
See: TINKERPOP-1330
Change In has() Method Signatures
The TinkerPop 3.2.2 release unintentionally introduced a breaking change for some has()
method overloads. In particular the
behavior for single item array arguments was changed:
gremlin> g.V().hasLabel(["software"] as String[]).count()
==>0
Prior to this change, single item arrays were treated as if there were only that single item:
gremlin> g.V().hasLabel(["software"] as String[]).count()
==>2
gremlin> g.V().hasLabel("software").count()
==>2
TinkerPop 3.2.3 fixes this misbehavior and all has()
method overloads behave like before, except that they no longer
support no arguments.
Deprecated reconnectInitialDelay
The reconnectInitialDelay
setting on the Cluster
builder has been deprecated. It no longer serves any purpose.
The value for the "initial delay" now comes from reconnectInterval
(there are no longer two separate settings to
control).
See: TINKERPOP-1460
TraversalSource.close()
TraversalSource
now implements AutoCloseable
, which means that the close()
method is now available. This new
method is important in cases where withRemote()
is used, as withRemote()
can open "expensive" resources that need
to be released.
In the case of TinkerPop’s DriverRemoteConnection
, close()
will destroy the Client
instance that is created
internally by withRemote()
as shown below:
gremlin> graph = EmptyGraph.instance()
==>emptygraph[empty]
gremlin> g = graph.traversal().withRemote('conf/remote-graph.properties')
==>graphtraversalsource[emptygraph[empty], standard]
gremlin> g.close()
gremlin>
Note that the withRemote()
method will call close()
on a RemoteConnection
passed directly to it as well, so
there is no need to do that manually.
See: TINKERPOP-790
IO Reference Documentation
There is new reference documentation for the various IO formats. The documentation provides more details and samples that should be helpful to users and providers who intend to work directly with the TinkerPop supported serialization formats: GraphML, GraphSON and Gryo.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
Default LazyBarrierStrategy
LazyBarrierStrategy
has been included as a default strategy. LazyBarrierStrategy
walks a traversal and looks for
"flatMaps" (out()
, in()
, both()
, values()
, V()
, etc.) and adds "lazy barriers" to dam up the stream so as to
increase the probability of bulking the traversers. One of the side-effects is that:
g.V().out().V().has(a)
is compiled to:
g.V().out().barrier().V().barrier().has(a)
Given that LazyBarrierStrategy
is an OptimizationStrategy
, it comes before ProviderOptimizationStrategies
.
Thus, if the provider’s XXXGraphStepStrategy simply walks from the second V() looking for has()-only, it will not be able to pull in the has() because the barrier() blocks it. Please see the updates to TinkerGraphStepStrategy and how it acknowledges NoOpBarrierSteps (i.e. barrier()), skipping over them and “left”-propagating labels to the previous step.
See: TINKERPOP-1488
Configurable Strategies
If the provider has non-configurable TraversalStrategy
classes, those classes should expose a static instance()
-method.
This is typical and thus, backwards compatible. However, if the provider has a TraversalStrategy
that can be configured
(e.g. via a Builder
), then it should expose a static create(Configuration)
-method, where the keys of the configuration
are the method names of the Builder
and the values are the method arguments. For instance, for Gremlin-Python to create
a SubgraphStrategy
, it does the following:
g = Graph().traversal().withRemote(connection).
withStrategies(SubgraphStrategy(vertices=__.hasLabel('person'),edges=__.has('weight',gt(0.5))))
The SubgraphStrategy.create(Configuration)
-method is defined as:
public static SubgraphStrategy create(final Configuration configuration) {
    final Builder builder = SubgraphStrategy.build();
    if (configuration.containsKey(VERTICES))
        builder.vertices((Traversal) configuration.getProperty(VERTICES));
    if (configuration.containsKey(EDGES))
        builder.edges((Traversal) configuration.getProperty(EDGES));
    if (configuration.containsKey(VERTEX_PROPERTIES))
        builder.vertexProperties((Traversal) configuration.getProperty(VERTEX_PROPERTIES));
    return builder.create();
}
Finally, in order to make serialization possible from JVM-based Gremlin language variants, all strategies have a TraversalStrategy.getConfiguration() method which returns a Configuration that can be used to create() the TraversalStrategy.
The SubgraphStrategy.getConfiguration()
-method is defined as:
@Override
public Configuration getConfiguration() {
    final Map<String, Object> map = new HashMap<>();
    map.put(STRATEGY, SubgraphStrategy.class.getCanonicalName());
    if (null != this.vertexCriterion)
        map.put(VERTICES, this.vertexCriterion);
    if (null != this.edgeCriterion)
        map.put(EDGES, this.edgeCriterion);
    if (null != this.vertexPropertyCriterion)
        map.put(VERTEX_PROPERTIES, this.vertexPropertyCriterion);
    return new MapConfiguration(map);
}
The default implementation of TraversalStrategy.getConfiguration()
is defined as:
public default Configuration getConfiguration() {
    return new BaseConfiguration();
}
Thus, if the provider does not have any "builder"-based strategies, then no updates to their strategies are required.
See: TINKERPOP-1455
Deprecated elementNotFound
Both Graph.Exceptions.elementNotFound()
methods have been deprecated. These exceptions were being asserted in the
test suite but were not being used anywhere in gremlin-core
itself. The assertions have been modified to simply
assert that NoSuchElementException
was thrown, which is precisely the behavior that was being indirectly asserted
when Graph.Exceptions.elementNotFound()
were being used.
Providers should not need to take any action in this case for their tests to pass, however, it would be wise to remove uses of these exception builders as they will be removed in the future.
See: TINKERPOP-944
Hidden Step Labels for Compilation Only
In order for SubgraphStrategy
to work, it was necessary to have multi-level children communicate with one another
via hidden step labels. It was decided that hidden step labels are for compilation purposes only and will be removed
prior to traversal evaluation. This is a valid decision given that hidden labels for graph system providers are
not allowed to be used by users. Likewise, hidden labels for steps should not be allowed to be used by
users either.
PropertyMapStep with Selection Traversal
PropertyMapStep
now supports selection of properties via child property traversal. If a provider was relying solely
on the provided property keys in a ProviderOptimizationStrategy
, they will need to check if there is a child traversal
and if so, use that in their introspection for respective strategies. This model was created to support SubgraphStrategy.vertexProperties()
filtering.
See: TINKERPOP-1456, TINKERPOP-844
ConnectiveP Nesting Inlined
There was a bug in ConnectiveP
(AndP
/OrP
), where eq(1).and(eq(2).and(eq(3)))
was AndP(eq(1),AndP(eq(2),eq(3)))
instead of unnested/inlined as AndP(eq(1),eq(2),eq(3))
. Likewise, for OrP
. If a provider was leveraging ConnectiveP
predicates for their custom steps (e.g. graph- or vertex-centric index lookups), then they should be aware of the inlining
and can simplify any and/or-tree walking code in their respective ProviderOptimizationStrategy
.
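The inlining behavior can be pictured with a small, self-contained sketch. The Pred, Eq and And types below are hypothetical stand-ins written for illustration only; they are not TinkerPop's P, AndP or ConnectiveP classes, and the sketch only mirrors the flattening described above.

```java
import java.util.ArrayList;
import java.util.List;

final class AndInliningSketch {

    // Hypothetical marker interface standing in for a predicate.
    interface Pred {}

    // Hypothetical equality predicate.
    static final class Eq implements Pred {
        final Object value;
        Eq(final Object value) { this.value = value; }
        @Override public String toString() { return "eq(" + value + ")"; }
    }

    // Hypothetical conjunction that inlines nested conjunctions on construction.
    static final class And implements Pred {
        final List<Pred> children = new ArrayList<>();

        And(final Pred... preds) {
            for (final Pred p : preds) {
                if (p instanceof And)
                    children.addAll(((And) p).children); // nested And: lift its children up one level
                else
                    children.add(p);
            }
        }

        @Override public String toString() { return "and" + children; }
    }

    public static void main(final String[] args) {
        final And nested = new And(new Eq(1), new And(new Eq(2), new Eq(3)));
        System.out.println(nested); // and[eq(1), eq(2), eq(3)]
    }
}
```

Since every And is already flat when it is built, lifting one level of children is sufficient, and code that consumes the predicate can iterate a single list instead of walking a tree.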
See: TINKERPOP-1470
TinkerPop 3.2.2
Release Date: September 6, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
GraphSON 2.0
GraphSON 2.0 has been introduced to improve and normalize the format of types embedded in GraphSON.
Log4j Dependencies
There were a number of changes to the Log4j dependencies in the various modules. Log4j was formerly included as part
of the slf4j-log4j12
in gremlin-core
, however that "forced" use of Log4j as a logger implementation when that
really wasn’t necessary or desired. If a project depended on gremlin-core
or other TinkerPop project to get its
Log4j implementation then those applications will need to now include the dependency themselves directly.
Note that Gremlin Server and Gremlin Console explicitly package Log4j in their respective binary distributions.
See: TINKERPOP-1151
Default for gremlinPool
The gremlinPool
setting in Gremlin Server is now defaulted to zero. When set to zero, Gremlin Server will use the
value provided by Runtime.availableProcessors()
to set the pool size. Note that the packaged YAML files no longer
contain the thread pool settings as all are now driven by sensible defaults. Obviously these values can be added
and overridden as needed.
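The zero-means-default rule can be sketched as follows. This is only an illustration of the behavior described above, not Gremlin Server's actual code; GremlinPoolSketch and resolvePoolSize are hypothetical names.

```java
final class GremlinPoolSketch {

    // Hypothetical helper: a configured value of zero falls back to the number of available processors.
    static int resolvePoolSize(final int configuredGremlinPool) {
        return configuredGremlinPool == 0
                ? Runtime.getRuntime().availableProcessors()
                : configuredGremlinPool;
    }

    public static void main(final String[] args) {
        System.out.println("default pool size: " + resolvePoolSize(0));
        System.out.println("explicit pool size: " + resolvePoolSize(16));
    }
}
```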
See: TINKERPOP-1373
New Console Features
The Gremlin Console can now have its text colorized. For example, you can set the color of the Gremlin ascii art to
the more natural color of green by using the :set
command:
gremlin> :set gremlin.color green
It is also possible to colorize results, like vertices, edges, and other common returns. Please see the reference documentation for more details on all the settings.
The console also now includes better multi-line support:
gremlin> g.V().out().
......1> has('name','josh').
......2> out('created')
==>v[5]
==>v[3]
This is a nice feature in that it can help you understand if a line is incomplete and unevaluated.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
Deprecated Io.Builder.registry()
The Io.Builder.registry()
has been deprecated in favor of Io.Builder.onMapper(Consumer<Mapper>)
. This change gives
the Graph
implementation greater flexibility over how to modify the Mapper
implementation. In most cases, the
implementation will simply add its IoRegistry
to allow the Mapper
access to custom serialization classes, but this
approach makes it possible to also set other specific settings that aren’t generalized across all IO implementations.
A good example of this type of usage would be to provide a custom ClassResolver
implementation to a GryoMapper
.
See: TINKERPOP-1402
Log4j Dependencies
There were a number of changes to the Log4j dependencies in the various modules. Log4j was formerly included as part
of the slf4j-log4j12
in gremlin-core
, however that "forced" use of log4j as a logger implementation when that
really wasn’t necessary or desired. The slf4j-log4j12
dependency is now in "test" scope for most of the modules. The
exception to that rule is gremlin-test
which prescribes it as "optional". That change means that developers
depending on gremlin-test
(or gremlin-groovy-test
) will need to explicitly specify it as a dependency in their
pom.xml
(or a different slf4j implementation if that better suits them).
See: TINKERPOP-1151
Drivers Providers
GraphSON 2.0
Drivers providers can exploit the new format of typed values JSON serialization offered by GraphSON 2.0. This format
has been created to allow easy and agnostic parsing of a GraphSON payload without type loss. Drivers of non-Java
languages can then implement their own mapping of the GraphSON’s language agnostic type IDs (e.g. UUID
, LocalDate
)
to the appropriate representation for the driver’s language.
Traversal Serialization
There was an "internal" serialization format in place for Traversal
which allowed one to be submitted to Gremlin
Server directly over RemoteGraph
. That format has been removed completely and is wholly replaced by the non-JVM
specific approach of serializing Bytecode
.
See: TINKERPOP-1392
TinkerPop 3.2.1
Release Date: July 18, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gephi Plugin
The Gephi Plugin has been updated to support Gephi 0.9.x. Please upgrade to this latest version to use the Gephi Plugin for Gremlin Console.
See: TINKERPOP-1297
GryoMapper Construction
It is now possible to override existing serializers with calls to addCustom
on the GryoMapper
builder. This option
allows complete control over the serializers used by Gryo. Of course, this also makes it possible to produce completely
non-compliant Gryo files. This feature should be used with caution.
TraversalVertexProgram
TraversalVertexProgram
always maintained a HALTED_TRAVERSERS
TraverserSet
for each vertex throughout the life
of the OLAP computation. However, if there are no halted traversers in the set, then there is no point in keeping that
compute property around as without it, time and space can be saved. Users that have VertexPrograms
that are chained off
of TraversalVertexProgram
and have previously assumed that HALTED_TRAVERSERS
always exists at each vertex, should no
longer assume that.
// bad code
TraverserSet haltedTraversers = vertex.value(TraversalVertexProgram.HALTED_TRAVERSERS);
// good code
TraverserSet haltedTraversers = vertex.property(TraversalVertexProgram.HALTED_TRAVERSERS).orElse(new TraverserSet());
Interrupting Traversals
Traversals now better respect calls to Thread.interrupt()
, which means that a running Traversal
can now be
cancelled. There are some limitations that remain, but most OLTP-based traversals should cancel without
issue. OLAP-based traversals for Spark will also cancel and clean up running jobs in Spark itself. Mileage may vary
on other process implementations and it is possible that graph providers could potentially write custom step
implementations that prevent interruption. If it is found that there are configurations or specific traversals that
do not respect interruption, please mention them on the mailing list.
See: TINKERPOP-946
Gremlin Console Flags
Gremlin Console had several methods for executing scripts from file at the start-up of bin/gremlin.sh
. There were
two options:
bin/gremlin.sh script.groovy //1
bin/gremlin.sh -e script.groovy //2
-
The script.groovy would be executed as a console initialization script, setting the console up for use and leaving it open when the script completed successfully or closing it if the script failed.
-
The script.groovy would be executed by the ScriptExecutor, which meant that commands for the Gremlin Console, such as :remote and :>, would not be respected.
Changes in this version of TinkerPop have added much more flexibility here and only a minor breaking change should be considered when using this version. First of all, recognize that these two lines are currently equivalent:
bin/gremlin.sh script.groovy
bin/gremlin.sh -i script.groovy
but users should start to explicitly specify the -i flag as TinkerPop will eventually remove the old syntax. Regardless of the one used, beware that neither will close the console on script failure anymore. In that sense, this behavior represents a breaking change to consider. To ensure the console closes on failure or success, a script will have to use the -e option.
The console also has a number of new features in addition to -e
and -i
:
-
View the available flags for the console with -h.
-
Control console output with -D, -Q and -V.
-
Get line numbers on script failures passed to -i and -e.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
VertexComputing API Change
The VertexComputing
API is used by steps that wrap a VertexProgram
. There is a method called VertexComputing.generateProgram() that has changed and now takes a second argument of Memory. To upgrade, simply
fix the method signature of your VertexComputing
implementations. The Memory
argument can be safely ignored to
effect the exact same semantics as prior. However, now previous OLAP job Memory
can be leveraged when constructing
the next VertexProgram
in an OLAP traversal chain.
Interrupting Traversals
Several tests have been added to the TinkerPop test suite to validate that a Traversal
can be cancelled with
Thread.interrupt()
. The test suite does not cover all possible traversal scenarios. When implementing custom steps,
providers should take care to not ignore an InterruptedException
that might be thrown in their code and to be sure
to check Thread.isInterrupted()
as needed to ensure that the step remains cancellation compliant.
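One way to keep a long-running loop inside a custom step cancellable is to poll the thread's interrupt flag between units of work. The sketch below is self-contained and does not use the TinkerPop step API: InterruptAwareStepSketch, StepCancelledException and drain are hypothetical names, and the unchecked exception is only a stand-in for whatever cancellation signal a real step would raise.

```java
import java.util.Arrays;
import java.util.Iterator;

final class InterruptAwareStepSketch {

    // Hypothetical unchecked exception used to signal cancellation to the caller.
    static final class StepCancelledException extends RuntimeException {
        StepCancelledException(final String message) { super(message); }
    }

    // Consumes the iterator, checking for interruption before each element is processed.
    static long drain(final Iterator<?> source) {
        long processed = 0;
        while (source.hasNext()) {
            // Thread.interrupted() reads and clears the current thread's interrupt flag.
            if (Thread.interrupted())
                throw new StepCancelledException("interrupted after " + processed + " elements");
            source.next();
            processed++;
        }
        return processed;
    }

    public static void main(final String[] args) {
        System.out.println(drain(Arrays.asList("a", "b", "c").iterator()));
    }
}
```

The important part is that the check happens inside the loop rather than once up front, so an interrupt that arrives mid-iteration is noticed promptly.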
See: TINKERPOP-946
Performance Tests
All "performance" tests have been deprecated. In the previous 3.2.0-incubating release, the ProcessPerformanceSuite
and TraversalPerformanceTest
were deprecated, but some other tests remained. It is the remaining tests that have
been deprecated in this release:
-
StructurePerformanceSuite
  -
  GraphReadPerformanceTest
  -
  GraphWriterPerformanceTest
-
GroovyEnvironmentPerformanceSuite
  -
  SugarLoaderPerformanceTest
  -
  GremlinExecutorPerformanceTest
-
Gremlin Server related performance tests
-
TinkerGraph related performance tests
Providers should implement their own performance tests and not rely on these deprecated tests as they will be removed in a future release along with the "JUnit Benchmarks" dependency.
See: TINKERPOP-1294
Graph Database Providers
Transaction Tests
Tests and assertions were added to the structure test suite to validate that transaction status was in the appropriate
state following calls to close the transaction with commit()
or rollback()
. It is unlikely that this change would
cause test breaks for providers, unless the transaction status was inherently disconnected from calls to close the
transaction somehow.
In addition, other tests were added to enforce the expected semantics for threaded transactions. Threaded transactions are expected to behave like manual transactions. They should be open automatically when they are created and once closed should no longer be used. This behavior is not new and is the typical expected method for working with these types of transactions. The test suite just requires that the provider implementation conform to these semantics.
See: TINKERPOP-947, TINKERPOP-1059
GraphFilter and GraphFilterStrategy
GraphFilter
has been significantly improved, such that the determination of edge direction/label legality is now more stringent.
Along with this, GraphFilter.getLegallyPositiveEdgeLabels()
has been added as a helper method to make it easier for GraphComputer
providers to know the space of labels being accessed by the traversal and thus, better enable provider-specific push-down predicates.
Note that GraphFilterStrategy
is now a default TraversalStrategy
registered with GraphComputer.
If GraphFilter
is
expensive for the underlying GraphComputer
implementation, it can be deactivated as is done for TinkerGraphComputer
.
static {
    TraversalStrategies.GlobalCache.registerStrategies(TinkerGraphComputer.class,
        TraversalStrategies.GlobalCache.getStrategies(GraphComputer.class).clone().removeStrategies(GraphFilterStrategy.class));
}
See: TINKERPOP-1293
Graph Language Providers
VertexTest Signatures
The method signatures of get_g_VXlistXv1_v2_v3XX_name
and get_g_VXlistX1_2_3XX_name
of VertexTest
were changed
to take arguments for the Traversal
to be constructed by extending classes.
TinkerPop 3.2.0
Release Date: April 8, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Hadoop FileSystem Variable
The HadoopGremlinPlugin
defines two variables: hdfs
and fs
. The first is a reference to the HDFS FileSystemStorage
and the latter is a reference to the local FileSystemStorage
. Prior to 3.2.x, fs
was called local
. However,
there was a variable name conflict with Scope.local
. As such local
is now fs
. This issue existed prior to 3.2.x,
but was not realized until this release. Finally, this only affects Gremlin Console users.
Hadoop Configurations
Note that gremlin.hadoop.graphInputFormat
, gremlin.hadoop.graphOutputFormat
, gremlin.spark.graphInputRDD
, and
gremlin.spark.graphOutputRDD
have all been deprecated. Using them still works, but moving forward, users only need to
leverage gremlin.hadoop.graphReader
and gremlin.hadoop.graphWriter
. An example properties file snippet is provided
below.
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
See: TINKERPOP-1082, TINKERPOP-1222
TraversalSideEffects Update
There were changes to TraversalSideEffects both at the semantic level and at the API level. Users that have traversals of the form sideEffect{…} that leverage global side-effects should read the following carefully. If the user’s traversals do not use lambda-based side-effect steps (e.g. they use only standard steps such as groupCount("m")), then the changes below will not affect them. Moreover, if the user’s traversal only uses sideEffect{…} with closure (non-TraversalSideEffects) data references, then the changes below will not affect them. If the user’s traversal uses sideEffects in OLTP only, the changes below will not affect them. Finally, providers should not be affected by the changes, save for any test cases.
TraversalSideEffects Get API Change
TraversalSideEffects
can now logically operate within a distributed OLAP environment. In order to make this possible,
it is necessary that each side-effect be registered with a reducing BinaryOperator
. This binary operator will combine
distributed updates into a single global side-effect at the master traversal. Many of the methods in TraversalSideEffects have been deprecated, but they are backwards compatible save that TraversalSideEffects.get() no longer returns an Optional. It now returns the side-effect value directly and throws an IllegalArgumentException if the key is unknown. While the Optional semantics could have remained, it was deemed best to directly return the side-effect value to reduce object creation costs and, because all side-effects must be registered a priori, there is never a reason why an unknown side-effect key would be used. In short:
// change
traversal.getSideEffects().get("m").get()
// to
traversal.getSideEffects().get("m")
TraversalSideEffects Registration Requirement
All TraversalSideEffects
must be registered upfront. This is because, in OLAP, side-effects map to Memory
compute keys
and as such, must be declared prior to the execution of the TraversalVertexProgram
. If a user’s traversal creates a
side-effect mid-traversal, it will fail. The traversal must use GraphTraversalSource.withSideEffect()
to declare
the side-effects it will use during its execution lifetime. If the user’s traversals use standard side-effect Gremlin
steps (e.g. group("m")
), then no changes are required.
See: TINKERPOP-1192
TraversalSideEffects Add Requirement
In a distributed environment, a side-effect can not be mutated and be expected to exist in the mutated form at the final,
aggregated, master traversal. For instance, if the side-effect "myCount" references a Long
, the Long
can not be updated
directly via sideEffects.set("myCount", sideEffects.get("myCount") + 1)
. Instead, it must rely on the registered reducer to do the merging and thus, the Step must do sideEffects.add("myCount",1), where the registered reducer is Operator.sum.
Thus, the below will increment "a". If no operator was provided, then the operator is assumed to be Operator.assign and the final result of "a" would be 1. Note that Traverser.sideEffects(key,value) uses TraversalSideEffects.add().
gremlin> traversal = g.withSideEffect('a',0,sum).V().out().sideEffect{it.sideEffects('a',1)}
==>v[3]
==>v[2]
==>v[4]
==>v[5]
==>v[3]
==>v[3]
gremlin> traversal.getSideEffects().get('a')
==>6
gremlin> traversal = g.withSideEffect('a',0).V().out().sideEffect{it.sideEffects('a',1)}
==>v[3]
==>v[2]
==>v[4]
==>v[5]
==>v[3]
==>v[3]
gremlin> traversal.getSideEffects().get('a')
==>1
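The difference between the two console runs above comes down to which reducer folds each added value into the current one. The following self-contained sketch reproduces that arithmetic with plain java.util.function.BinaryOperator instances. SideEffectReducerSketch, SUM, ASSIGN and reduce are hypothetical names that only mimic the roles of Operator.sum and Operator.assign; they are not the TinkerPop implementations.

```java
import java.util.function.BinaryOperator;

final class SideEffectReducerSketch {

    // Stand-in for a summing reducer: each added value is accumulated.
    static final BinaryOperator<Long> SUM = Long::sum;

    // Stand-in for an assigning reducer: each added value replaces the previous one.
    static final BinaryOperator<Long> ASSIGN = (oldValue, newValue) -> newValue;

    // Folds every update into the initial value using the registered reducer.
    static long reduce(final long initial, final BinaryOperator<Long> reducer, final long... updates) {
        long current = initial;
        for (final long update : updates)
            current = reducer.apply(current, update);
        return current;
    }

    public static void main(final String[] args) {
        // Six traversers each add 1 to a side-effect that starts at 0.
        System.out.println(reduce(0L, SUM, 1L, 1L, 1L, 1L, 1L, 1L));    // 6
        System.out.println(reduce(0L, ASSIGN, 1L, 1L, 1L, 1L, 1L, 1L)); // 1
    }
}
```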
See: TINKERPOP-1192, TINKERPOP-1166
ProfileStep Update and GraphTraversal API Change
The profile()
-step has been refactored into 2 steps — ProfileStep
and ProfileSideEffectStep
. Users who previously
used the profile()
in conjunction with cap(TraversalMetrics.METRICS_KEY)
can now simply omit the cap step. Users who
retrieved TraversalMetrics
from the side-effects after iteration can still do so, but will need to specify a side-effect
key when using the profile()
. For example, profile("myMetrics")
.
See: TINKERPOP-958
BranchStep Bug Fix
There was a bug in BranchStep
that also rears itself in subclass steps such as UnionStep
and ChooseStep
.
For traversals with branches that have barriers (e.g. count()
, max()
, groupCount()
, etc.), the traversal needs to be updated.
For instance, if a traversal is of the form g.V().union(out().count(),both().count())
, the result is now different
(the bug fix yields a different output). In order to yield the same result, the traversal should be rewritten as
g.V().local(union(out().count(),both().count()))
. Note that if a branch does not have a barrier, then no changes are required.
For instance, g.V().union(out(),both())
does not need to be updated. Moreover, if the user’s traversal already used the local()-form, then no changes are required either.
See: TINKERPOP-1188
MemoryComputeKey and VertexComputeKey
Users that have custom VertexProgram
implementations will need to change their implementations to support the new
VertexComputeKey
and MemoryComputeKey
classes. In the VertexPrograms
provided by TinkerPop, these changes were trivial,
taking less than 5 minutes to make all the requisite updates.
-
VertexProgram.getVertexComputeKeys() returns a Set<VertexComputeKey>. No longer a Set<String>. Use VertexComputeKey.of(String key, boolean transient) to generate a VertexComputeKey. Transient keys were not supported in the past, so to make the implementation semantically equivalent, the boolean transient should be false.
-
VertexProgram.getMemoryComputeKeys() returns a Set<MemoryComputeKey>. No longer a Set<String>. Use MemoryComputeKey.of(String key, BinaryOperator reducer, boolean broadcast, boolean transient) to generate a MemoryComputeKey. Broadcasting and transients were not supported in the past, so to make the implementation semantically equivalent, the boolean broadcast should be true and the boolean transient should be false.
An example migration looks as follows. What might currently look like:
public Set<String> getMemoryComputeKeys() {
    return new HashSet<>(Arrays.asList("a","b","c"));
}
Should now look like:
public Set<MemoryComputeKey> getMemoryComputeKeys() {
    return new HashSet<>(Arrays.asList(
        MemoryComputeKey.of("a", Operator.and, true, false),
        MemoryComputeKey.of("b", Operator.sum, true, false),
        MemoryComputeKey.of("c", Operator.or, true, false)));
}
A similar pattern should also be used for VertexProgram.getVertexComputeKeys()
.
See: TINKERPOP-1162
SparkGraphComputer and GiraphGraphComputer Persistence
The MapReduce
-based steps in TraversalVertexProgram
have been removed and replaced using a new Memory
-reduction model.
MapReduce
jobs always created a persistence footprint, e.g. in HDFS. Memory
data was never persisted to HDFS.
As such, there will be no data on the disk that is accessible. For instance, there is no more ~reducing
, ~traversers
,
and specially named side-effects such as m
from a groupCount('m')
. The data is still accessible via ComputerResult.memory()
,
it simply does not have a corresponding on-disk representation.
RemoteGraph
RemoteGraph
is a lightweight Graph
implementation that acts as a proxy for sending traversals to Gremlin Server for
remote execution. It is an interesting alternative to the other methods for connecting to Gremlin Server in that all
other methods involved construction of a String
representation of the Traversal
which is then submitted as a script
to Gremlin Server (via driver or REST).
gremlin> graph = RemoteGraph.open('conf/remote-graph.properties')
==>remotegraph[DriverServerConnection-localhost/127.0.0.1:8182 [graph='graph]]
gremlin> g = graph.traversal()
==>graphtraversalsource[remotegraph[DriverServerConnection-localhost/127.0.0.1:8182 [graph='graph]], standard]
gremlin> g.V().valueMap(true)
==>[name:[marko], label:person, id:1, age:[29]]
==>[name:[vadas], label:person, id:2, age:[27]]
==>[name:[lop], label:software, id:3, lang:[java]]
==>[name:[josh], label:person, id:4, age:[32]]
==>[name:[ripple], label:software, id:5, lang:[java]]
==>[name:[peter], label:person, id:6, age:[35]]
Note that g.V().valueMap(true)
is executing in Gremlin Server and not locally in the console.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
GraphStep Compilation Requirement
OLTP graph providers that have a custom GraphStep
implementation should ensure that g.V().hasId(x)
and g.V(x)
compile
to the same representation. This ensures a consistent user experience around random access of elements based on ids
(as opposed to potentially the former doing a linear scan). A static helper method called GraphStep.processHasContainerIds()
has been added. TinkerGraphStepStrategy
was updated as such:
((HasContainerHolder) currentStep).getHasContainers().forEach(tinkerGraphStep::addHasContainer);
is now
((HasContainerHolder) currentStep).getHasContainers().forEach(hasContainer -> {
  if (!GraphStep.processHasContainerIds(tinkerGraphStep, hasContainer))
    tinkerGraphStep.addHasContainer(hasContainer);
});
See: TINKERPOP-1219
Step API Update
The Step
interface is fundamental to Gremlin. Step.processNextStart()
and Step.next()
both returned Traverser<E>
.
We had so many Traverser.asAdmin()
and direct typecast calls throughout (especially in TraversalVertexProgram
) that
it was deemed prudent to have Step.processNextStart()
and Step.next()
return Traverser.Admin<E>
. Moreover it makes
sense as this is internal logic where Admins
are always needed. Providers with their own step definitions will simply
need to change the method signatures of Step.processNextStart()
and Step.next()
. No logic update is required — save
that asAdmin()
can be safely removed if used. Also, Step.addStart()
and Step.addStarts()
take Traverser.Admin<S>
and Iterator<Traverser.Admin<S>>
, respectively.
Traversal API Update
The way in which TraverserRequirements
are calculated has been changed (for the better). The ramification is that post
compilation requirement additions no longer make sense and should not be allowed. To enforce this, the
Traversal.addTraverserRequirement()
method has been removed from the interface. Moreover, providers/users should never be able
to add requirements manually (this should all be inferred from the end compilation). However, if need be, there is always
RequirementStrategy
which will allow the provider to add a requirement at strategy application time
(though again, there should not be a reason to do so).
ComparatorHolder API Change
Providers that either have their own ComparatorHolder
implementation or reason on OrderXXXStep
will need to update their code.
ComparatorHolder
now returns List<Pair<Traversal,Comparator>>
. This has greatly reduced the complexity of comparison-based
steps like OrderXXXStep
. However, it is a breaking API change. The update is trivial, but providers need to be aware of it.
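As a rough illustration of why an ordered list of pairs simplifies comparison-based steps, the sketch below folds such a list into a single comparator using only the JDK. The By class, the chain() method, and the use of Function in place of Traversal are assumptions made for this example; they are not TinkerPop classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch: JDK-only stand-ins, not TinkerPop classes.
final class ComparatorPairsSketch {

    // Stand-in for Pair<Traversal,Comparator>: a projection plus the comparator for its output.
    static final class By<T> {
        final Function<T, Object> projection;
        final Comparator<Object> comparator;

        By(Function<T, Object> projection, Comparator<Object> comparator) {
            this.projection = projection;
            this.comparator = comparator;
        }
    }

    // Fold the pairs into one comparator; earlier pairs win, later pairs break ties.
    static <T> Comparator<T> chain(List<By<T>> pairs) {
        Comparator<T> result = (a, b) -> 0;
        for (By<T> pair : pairs) {
            result = result.thenComparing(pair.projection, pair.comparator);
        }
        return result;
    }

    // Sort words by length (descending), then alphabetically.
    static List<String> demo() {
        List<By<String>> pairs = new ArrayList<>();
        pairs.add(new By<String>(s -> s.length(), (x, y) -> Integer.compare((Integer) y, (Integer) x)));
        pairs.add(new By<String>(s -> s, (x, y) -> ((String) x).compareTo((String) y)));

        List<String> words = new ArrayList<>(Arrays.asList("pear", "fig", "apple", "kiwi"));
        words.sort(chain(pairs));
        return words;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [apple, kiwi, pear, fig]
    }
}
```

A step that reasons on the list only needs to iterate it in order, which is the simplification the API change is after.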
See: TINKERPOP-1209
GraphComputer Semantics and API
Providers that have a custom GraphComputer
implementation will have a lot to handle. Note that if the graph system
simply uses SparkGraphComputer
or GiraphGraphComputer
provided by TinkerPop, then no updates are required. This
only affects providers that have their own custom GraphComputer
implementations.
Memory
updates:
- Any BinaryOperator can be used for reduction and is made explicit in the MemoryComputeKey.
- MemoryComputeKeys can be marked transient and must be removed from the resultant ComputerResult.memory().
- MemoryComputeKeys can be specified to not broadcast and thus must not be available to workers to read in VertexProgram.execute().
- The Memory API has been changed. There are no more incr(), and(), etc. Now it is just set() (setup/terminate) and add() (execute).
VertexProgram
updates:
- VertexComputeKeys can be marked transient and must be removed from the resultant ComputerResult.graph().
Operational semantic test cases have been added to GraphComputerTest
to ensure that all the above are implemented correctly.
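The Memory semantics described in this section can be sketched with plain JDK types. SketchMemory, register(), and result() are hypothetical names chosen for this illustration; a real implementation works against the Memory and MemoryComputeKey interfaces, and the broadcast flag is left out here for brevity.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;

// Hypothetical stand-in for a provider's Memory implementation; not TinkerPop code.
final class SketchMemory {
    private final Map<String, BinaryOperator<Object>> reducers = new HashMap<>();
    private final Set<String> transientKeys = new HashSet<>();
    private final Map<String, Object> values = new HashMap<>();

    // Register a key with its reduction operator (the role a MemoryComputeKey plays).
    void register(String key, BinaryOperator<Object> reducer, boolean isTransient) {
        reducers.put(key, reducer);
        if (isTransient) {
            transientKeys.add(key);
        }
    }

    // set(): overwrite the value, as done during setup and terminate.
    void set(String key, Object value) {
        requireRegistered(key);
        values.put(key, value);
    }

    // add(): merge a contribution using the key's reducer, as done during execute.
    void add(String key, Object value) {
        requireRegistered(key);
        values.merge(key, value, reducers.get(key));
    }

    // The final view handed back to the user must not contain transient keys.
    Map<String, Object> result() {
        Map<String, Object> out = new HashMap<>(values);
        out.keySet().removeAll(transientKeys);
        return out;
    }

    private void requireRegistered(String key) {
        if (!reducers.containsKey(key)) {
            throw new IllegalArgumentException("Unregistered memory key: " + key);
        }
    }

    public static void main(String[] args) {
        SketchMemory memory = new SketchMemory();
        memory.register("count", (a, b) -> (Long) a + (Long) b, false);
        memory.register("scratch", (a, b) -> b, true);
        memory.set("count", 0L);      // setup
        memory.add("count", 3L);      // execute, worker 1
        memory.add("count", 4L);      // execute, worker 2
        memory.add("scratch", "tmp"); // transient, dropped from the result
        System.out.println(memory.result()); // prints {count=7}
    }
}
```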
Barrier Step Updates
The Barrier
interface used to simply be a marker interface. Now it has methods and it is the primary means by which
distributed steps across an OLAP job are aggregated and distributed. It is unlikely that Barrier
was ever used
directly by a provider’s custom step. Instead, a provider most likely extended SupplyingBarrierStep
, CollectingBarrierStep
,
and/or ReducingBarrierStep
.
Providers that have custom extensions to these steps or that use Barrier
directly will need to adjust their implementation slightly to
accommodate a new API that reflects the Memory
updates above. This should be a simple change. Note that FinalGet
no longer exists and such post-reduction processing is handled by the reducing step (via the new Generating
interface).
See: TINKERPOP-1164
Performance Tests
The ProcessPerformanceSuite
and TraversalPerformanceTest
have been deprecated. They are still available, but going forward,
providers should implement their own performance tests and not rely on the built-in JUnit benchmark-based performance test suite.
Graph Processor Providers
GraphFilter and GraphComputer
The GraphComputer
API has changed with the addition of GraphComputer.vertices(Traversal)
and GraphComputer.edges(Traversal)
.
These methods construct a GraphFilter
object which is also new to TinkerPop 3.2.0. GraphFilter
is a "push-down predicate"
used to selectively retrieve subgraphs of the underlying graph to be OLAP processed.
- If the graph system provider relies on an existing GraphComputer implementation such as SparkGraphComputer and/or GiraphGraphComputer, then there is no immediate action required on their part to remain TinkerPop-compliant. However, they may wish to update their InputFormat or InputRDD implementation to be GraphFilterAware and handle the GraphFilter filtering at the disk/database level. It is advisable to do so in order to reduce OLAP load times and memory/GC usage.
- If the graph system provider has their own GraphComputer implementation, then they should implement the two new methods and ensure that GraphFilter is processed correctly. There is a new test case called GraphComputerTest.shouldSupportGraphFilter() which ensures the semantics of GraphFilter are handled correctly. For a "quick and easy" way to move forward, look to GraphFilterInputFormat as a way of wrapping an existing InputFormat to do filtering prior to VertexProgram or MapReduce execution.
Note
|
To quickly move forward, the GraphComputer implementation can simply set GraphComputer.Features.supportsGraphFilter()
to false and ensure that GraphComputer.vertices() and GraphComputer.edges() throw GraphComputer.Exceptions.graphFilterNotSupported().
This is not recommended, as it is best to support GraphFilter.
|
See: TINKERPOP-962
Job Chaining and GraphComputer
TinkerPop 3.2.0 has integrated VertexPrograms
into GraphTraversal
. This means that a single traversal can compile to multiple
GraphComputer
OLAP jobs. This requires that ComputerResults
be chainable. There were never any explicit tests to verify whether a
provider’s GraphComputer
could be chained, but now there are. Given a reasonable implementation, it is likely that no changes
are required of the provider. However, to ensure the implementation is "reasonable" GraphComputerTests
have been added.
- For providers that support their own GraphComputer implementation, note that there is a new GraphComputerTest.shouldSupportJobChaining(). This test verifies that the ComputerResult output of one job can be fed into the input of a subsequent job. Only linear chains are tested/required currently. In the future, branching DAGs may be required.
- For providers that support their own GraphComputer implementation, note that there is a new GraphComputerTest.shouldSupportPreExistingComputeKeys(). When chaining OLAP jobs together, if an OLAP job requires the compute keys of a previous OLAP job, then the existing compute keys must be accessible. A simple 2 line change to SparkGraphComputer and TinkerGraphComputer solved this for TinkerPop. GiraphGraphComputer did not need an update as this feature was already naturally supported.
See: TINKERPOP-570
Graph Language Providers
ScriptTraversal
For providers that have custom Gremlin language implementations (e.g. Gremlin-Scala), there is a new class called ScriptTraversal
which will handle script-based processing of traversals. The entire GroovyXXXTest
-suite was updated to use this new class.
The previous TraversalScriptHelper
class has been deprecated so immediate upgrading is not required, but do look into
ScriptTraversal
as TinkerPop will be using it as a way to serialize "String-based traversals" over the network moving forward.
See: TINKERPOP-1154
ByModulating and Custom Steps
If the provider has custom steps that leverage by()
-modulation, those will now need to implement ByModulating
.
Most of the methods in ByModulating
are default
and, for most situations, only ByModulating.modulateBy(Traversal)
needs to be implemented. Note that this method’s body will most likely be identical to the custom step’s already existing
TraversalParent.addLocalChild()
. It is recommended that the custom step not use TraversalParent.addLocalChild()
as this method may be deprecated in a future release. Instead, barring any complex usages, simply rename the
CustomStep.addLocalChild(Traversal)
to CustomStep.modulateBy(Traversal)
.
See: TINKERPOP-1153
TraversalEngine Deprecation and GraphProvider
The TraversalSource
infrastructure has been completely rewritten. Fortunately for users, their code is backwards compatible.
Unfortunately for graph system providers, a few tweaks to their implementation are in order.
- If the graph system supports more than Graph.compute(), then implement GraphProvider.getGraphComputer().
- For custom TraversalStrategy implementations, change traverser.getEngine().isGraphComputer() to TraversalHelper.onGraphComputer(Traversal).
- For custom Steps, change implements EngineDependent to implements GraphComputing.
See: TINKERPOP-971
TinkerPop 3.1.0
A 187 On The Undercover Gremlinz
TinkerPop 3.1.8
Release Date: August 21, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
TinkerPop 3.1.7
Release Date: June 12, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
GraphML XSLT
There were some inconsistencies in the GraphML format supported in TinkerPop 2.x. These issues were corrected on the
initial release of TinkerPop 3.0.0, but as a result, attempting to read GraphML from 2.x will end with an error. A
newly added XSLT file in gremlin-core
, called tp2-to-tp3-graphml.xslt
, transforms 2.x GraphML into 3.x GraphML,
making it possible to easily read legacy GraphML through a 3.x GraphMLReader
.
See: TINKERPOP-1608
TinkerPop 3.1.6
Release Date: February 3, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Providers
Driver Providers
Session Close Confirmation
When a session is closed, it now returns a confirmation in the form of a single NO CONTENT
message. When the message
arrives, it means that the server has already destroyed the session. Prior to this change, the request was somewhat
one-way, in that the client could send the request and the server would silently honor it. The confirmation makes it
a bit easier to ensure from the client perspective that the close did what it was supposed to do, allowing the client
to proceed only when the server was fully complete with its work.
See: TINKERPOP-1544
TinkerPop 3.1.5
Release Date: October 17, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Java Driver and close()
There were a few problems noted around the close()
of Cluster
and Client
instances, including issues that
presented as system hangs. These issues have been resolved; however, it is worth noting that an unchecked exception
that was thrown under a certain situation has changed as part of the bug fixes. When submitting an in-session request
on a Client
that was closed (or closing) an IllegalStateException
is thrown. This replaces older functionality
that threw a ConnectionException
and relied on logic far deeper in the driver to produce that error and had the
potential to open additional resources despite the intention of the user to "close".
See: TINKERPOP-1467
TinkerPop 3.1.4
Release Date: September 6, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin Server Workers
In release 3.1.3, a recommendation was made to
ensure that the threadPoolWorker
setting for Gremlin Server was no less than 2
in cases where Gremlin Server was
being used with sessions that accept parallel requests. In 3.1.4, that is no longer the case and a size of 1
remains
acceptable even in that specific case.
See: TINKERPOP-1350
TinkerPop 3.1.3
Release Date: July 18, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Reserved Gremlin Server Keys
Gremlin Server has always considered certain binding keys (request parameters) as reserved, but that list has now expanded to include all the static enums that are imported to the script engine. Those using Gremlin Server may have to rename their keys if they were successfully using some of the now-reserved terms in previous versions.
See: TINKERPOP-1354
Remote Timeout
Disabling the timeout for a :remote
to Gremlin Server was previously accomplished by setting the timeout to max
as
in:
:remote config timeout max
where max
would set the timeout to be Integer.MAX_VALUE
. While this feature is still supported, it has been
deprecated in favor of the new configuration option of none
, as in:
:remote config timeout none
The use of none
completely disables the timeout rather than just setting an arbitrarily high one. Note that it is
still possible to get a timeout on a request if the server timeout limits are reached. The console timeout value only
refers to how long the console will wait for a response from the server before giving up. By default, the timeout is
set to none
.
See: TINKERPOP-1267
Gremlin Server Workers
Past configuration recommendations for the threadPoolWorker
setting on Gremlin Server stated this value could be
safely set to 1
at the low end. A size of 1
is still valid for most cases; however, if Gremlin Server is being used
with sessions that accept parallel requests, then this value should be no less than 2
or else certain scripts (i.e.
those that block for an extended period of time) may cause Gremlin Server to lock up the session.
See: TINKERPOP-1350
Upgrading for Providers
Important
|
It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation. |
Graph Database Providers
Property Keys and Hyphens
Graph providers should no longer rely on the test suite to validate that hyphens work for labels and property keys.
Vertex and Edge Counts
A large number of asserts for vertex and edge counts in the test suite were not being applied. This problem has been rectified, but could manifest as test errors for different implementations. The chances of the new assertions identifying previously unrecognized bugs seem slim, however, as there are many other tests that validate these counts in other ways. If those were passing previously, then these new asserts should likely not pose a problem.
See: TINKERPOP-1300
Test Feature Annotations
A large number of gremlin-test
feature annotations were incorrect which caused test cases to run against graphs that
did not support those features. The annotations have been fixed, but this opened the possibility that more test cases
will run against the graph implementation. Providers should ensure that their graph features()
are consistent with
the capabilities of the graph implementation.
See: TINKERPOP-1319
Graph Language Providers
AndTest Renaming
The get_g_V_andXhasXage_gt_27XoutE_count_gt_2X_name
test in AndTest
was improperly named and did not match the
nature of the traversal it was providing. It has been renamed to: get_g_V_andXhasXage_gt_27XoutE_count_gte_2X_name
.
Driver Providers
SASL Mechanism
Note that the Gremlin Driver for Java now passes a new parameter for SASL authentication called saslMechanism
. This
is an optional argument and does not represent a breaking change, but it does make the overall implementation more
complete. While the default authentication implementations packaged with Gremlin Server don’t utilize this argument
other implementations might, so the drivers should be able to pass it as per the SASL specification.
See: TINKERPOP-1263
TinkerPop 3.1.2
Release Date: April 8, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Aliasing Sessions
Calls to SessionedClient.alias()
used to throw UnsupportedOperationException
and it was therefore not possible to
use that capability with a session. That method is now properly implemented and aliasing is allowed.
See: TINKERPOP-1096
Remote Console
The :remote console
command provides a way to avoid having to prefix the :>
command to scripts when remoting. This
mode of console usage can be convenient when working exclusively with a remote like Gremlin Server and there is only a
desire to view the returned data and not to actually work with it locally in any way.
Console Remote Sessions
The :remote tinkerpop.server
command now allows for a "session" argument to be passed to connect
. This argument
tells the remote to configure it with a Gremlin Server session. In this way, the console can act as a window to script
execution on the server and behave more like a standard "local" console.
See: TINKERPOP-1097
TinkerPop Archetypes
TinkerPop now offers Maven archetypes, which provide example project templates to quickly get started with TinkerPop. The available archetypes are as follows:
- gremlin-archetype-server - An example project that demonstrates the basic structure of a Gremlin Server project, how to connect with the Gremlin Driver, and how to embed Gremlin Server in a testing framework.
- gremlin-archetype-tinkergraph - A basic example of how to structure a TinkerPop project with Maven.
Session Transaction Management
When connecting to a session with gremlin-driver
, it is now possible to configure the Client
instance so as to
request that the server manage the transaction for each request.
Cluster cluster = Cluster.open();
Client client = cluster.connect("sessionName", true);
Specifying true
to the connect()
method signifies that the client
should make each request as one encapsulated
in a transaction. With this configuration of client
there is no need to close a transaction manually.
Session Timeout Setting
The gremlin-driver
now has a setting called maxWaitForSessionClose
that allows control of how long it will wait for
an in-session connection to respond to a close request before it simply times out and moves on. When that happens,
the server will eventually close the connection, either at session expiration or at the time of shutdown.
See: TINKERPOP-1160
Upgrading for Providers
Important
|
It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation. |
All Providers
Provider Documentation
Documentation related to the lower-level APIs used by a provider, that was formerly in the reference documentation, has been moved to its own documentation set that is now referred to as the Provider Documentation.
See: TINKERPOP-937
Graph System Providers
GraphProvider.clear() Semantics
The semantics of the various clear()
methods on GraphProvider
didn’t really change, but it would be worth reviewing
their implementations to ensure that implementations can be called successfully in an idempotent fashion. Multiple
calls to clear()
may occur for a single test on the same Graph
instance, as 3.1.1-incubating
introduced an
automated method for clearing graphs at the end of a test and some tests call clear()
manually.
See: TINKERPOP-1146
Driver Providers
Session Transaction Management
Up until now transaction management has been a feature of sessionless requests only, but the new manageTransaction
request argument for the Session OpProcessor
changes that. Session-based requests can now pass this boolean value on each request to signal to
Gremlin Server that it should attempt to commit (or rollback) the transaction at the end of the request. By default,
this value is false
, so there is no change to the protocol for this feature.
scriptEvalTimeout Override
The Gremlin Server protocol now allows the passing of scriptEvaluationTimeout
as an argument to the SessionOpProcessor
and the StandardOpProcessor
. This value will override the setting of the same name provided in the Gremlin Server
configuration file on a per request basis.
Plugin Providers
RemoteAcceptor allowRemoteConsole
The RemoteAcceptor
now has a new method called allowRemoteConsole()
. It has a default implementation that
returns false
and should thus be a non-breaking change for current implementations. This value should only be set
to true
if the implementation expects the user to always use :>
to interact with it. For example, the
tinkerpop.server
plugin expects all user interaction through :>
, where the line is sent to Gremlin Server. In
that case, that RemoteAcceptor
implementation can return true
. On the other hand, the tinkerpop.gephi
plugin,
expects that the user sometimes call :>
and sometimes work with local evaluation as well. It interacts with the
local variable bindings in the console itself. For tinkerpop.gephi
, this method returns false
.
TinkerPop 3.1.1
Release Date: February 8, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Storage I/O
The gremlin-core
io-package now has a Storage
interface. The methods that were available via hdfs
(e.g. rm()
, ls()
, head()
, etc.) are now part of Storage
. Both HDFS and Spark implement Storage
via
FileSystemStorage
and SparkContextStorage
, respectively. SparkContextStorage
adds support for interacting with
persisted RDDs in the Spark cache.
This update changed a few of the file handling methods. As it stands, these changes only affect manual Gremlin Console usage as HDFS support was previously provided via Groovy meta-programming. Thus, these are not "code-based" breaking changes.
- hdfs.rmr() no longer exists. hdfs.rm() is now recursive. Simply change all references to rmr() to rm() for identical behavior.
- hdfs.head(location,lines,writableClass) no longer exists.
  - For graph locations, use hdfs.head(location,writableClass,lines).
  - For memory locations, use hdfs.head(location,memoryKey,writableClass,lines).
- hdfs.head(…,ObjectWritable) no longer exists. Use SequenceFileInputFormat instead, as an input format is now the parsing class.
Given that HDFS (and now Spark) interactions are possible via Storage
and no longer via Groovy meta-programming,
developers can use these Storage
implementations in their Java code. In fact, Storage
has greatly simplified
complex file/RDD operations in both GiraphGraphComputer
and SparkGraphComputer
.
Finally, note that the following low-level/internal classes have been removed: HadoopLoader
and HDFSTools
.
See: TINKERPOP-1033, TINKERPOP-1023
Gremlin Server Transaction Management
Gremlin Server now has a setting called strictTransactionManagement
, which forces the user to pass
aliases
for all requests. The aliases are then used to determine which graphs will have their transactions closed
for that request. The alternative is to continue with default operations where the transactions of all configured
graphs will be closed. It is likely that strictTransactionManagement
(which is false
by default so as to be
backward compatible with previous versions) will become the future standard mode of operation for Gremlin Server as
it provides a more efficient method for transaction management.
Deprecated credentialsDbLocation
The credentialsDbLocation
setting was a TinkerGraph only configuration option to the SimpleAuthenticator
for
Gremlin Server. It provided the file system location to a "credentials graph" that TinkerGraph would read from a
Gryo file at that spot. This setting was only required because TinkerGraph did not support file persistence at the
time that SimpleAuthenticator
was created.
As of 3.1.0-incubating, TinkerGraph received a limited persistence feature that allowed the "credentials graph"
location to be specified in the TinkerGraph properties file via gremlin.tinkergraph.graphLocation
and as such the
need for credentialsDbLocation
was eliminated.
This deprecation is not a breaking change; however, users are encouraged to convert their configurations to use
the gremlin.tinkergraph.graphLocation
as soon as possible, as the deprecated setting will be removed in a future
release.
TinkerGraph Supports Any I/O
TinkerGraph’s 'gremlin.tinkergraph.graphLocation' configuration setting can now take a fully qualified class name
of a Io.Builder
implementation, which means that custom IO implementations can be used to read and write
TinkerGraph instances.
See: TINKERPOP-886
Authenticator Method Deprecation
For users who have a custom Authenticator
implementation for Gremlin Server, there will be a new method present:
public default SaslNegotiator newSaslNegotiator(final InetAddress remoteAddress)
Implementation of this method is now preferred over the old method with the same name that has no arguments. The old
method has been deprecated. This is a non-breaking change as the new method has a default implementation that simply
calls the old deprecated method. In this way, existing Authenticator
implementations will still work.
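The reason this change is non-breaking can be sketched with a default method that delegates to the deprecated one. The types below (SketchAuthenticator, Negotiator, and the two implementations) are hypothetical stand-ins written only for this illustration. They mirror the shape of the method described above but are not the Gremlin Server interfaces.

```java
import java.net.InetAddress;

// Hypothetical sketch of the delegation pattern; not the Gremlin Server types.
final class AuthenticatorSketch {

    // Stand-in for the negotiator returned by an authenticator.
    interface Negotiator {
        String describe();
    }

    interface SketchAuthenticator {
        // Old entry point, kept so existing implementations still compile.
        @Deprecated
        Negotiator newSaslNegotiator();

        // New entry point: by default it simply falls back to the old method.
        default Negotiator newSaslNegotiator(InetAddress remoteAddress) {
            return newSaslNegotiator();
        }
    }

    // Written before the overload existed: implements only the old method.
    static final class LegacyAuthenticator implements SketchAuthenticator {
        @Override
        public Negotiator newSaslNegotiator() {
            return () -> "legacy";
        }
    }

    // Updated implementation: overrides the new method to use the remote address.
    static final class AddressAware implements SketchAuthenticator {
        @Override
        public Negotiator newSaslNegotiator() {
            return () -> "no-address";
        }

        @Override
        public Negotiator newSaslNegotiator(InetAddress remoteAddress) {
            return () -> "from " + remoteAddress.getHostAddress();
        }
    }

    public static void main(String[] args) {
        InetAddress remote = InetAddress.getLoopbackAddress();
        // The legacy class works unchanged through the new overload.
        System.out.println(new LegacyAuthenticator().newSaslNegotiator(remote).describe());
        System.out.println(new AddressAware().newSaslNegotiator(remote).describe());
    }
}
```

Callers can switch to the address-taking overload immediately, while implementers migrate at their own pace.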
See: TINKERPOP-995
Spark Persistence Updates
Spark RDD persistence is now much safer with a "job server" system that ensures that persisted RDDs are not garbage
collected by Spark. With this, the user is provided a spark
object that enables them to manage persisted RDDs
much like the hdfs
object is used for managing files in HDFS.
Finally, InputRDD
instances no longer need a reduceByKey()
postfix as view merges happen prior to writing the
graphRDD
. Note that a reduceByKey()
postfix will not cause problems if continued, it is simply inefficient
and no longer required.
See: TINKERPOP-1023, TINKERPOP-1027
Logging
Logging to Gremlin Server and Gremlin Console can now be consistently controlled by the log4j-server.properties
and log4j-console.properties
which are in the respective conf/
directories of the packaged distributions.
See: TINKERPOP-859
Gremlin Server Sandboxing
A number of improvements were made to the sandboxing feature of Gremlin Server (more specifically the
GremlinGroovyScriptEngine
). A new base class for sandboxing was introduced with the AbstractSandboxExtension
,
which makes it a bit easier to build white list style sandboxes. A usable implementation of this was also supplied
with the FileSandboxExtension
, which takes a configuration file containing a white list of accessible methods and
variables that can be used in scripts. Note that the original SandboxExtension
has been deprecated in favor of
the AbstractSandboxExtension
or extending directly from Groovy’s TypeCheckingDSL
.
Deprecated supportsAddProperty()
It was realized that VertexPropertyFeatures.supportsAddProperty()
was effectively a duplicate of
VertexFeatures.supportsMetaProperties()
. As a result, supportsAddProperty()
was deprecated in favor of the other.
If using supportsAddProperty()
, simply modify that code to instead utilize supportsMetaProperties()
.
Upgrading for Providers
Important
|
It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation. |
Graph System Providers
Data Types in Tests
There were a number of fixes related to usage of appropriate types in the test suite. There were cases where tests were mixing types, such that a single property key might have two different values. This mixed typing caused problems for some graphs and wasn’t really something TinkerPop was looking to explicitly enforce as a rule of implementing the interfaces.
While the changes should not have been breaking, providers should be aware that improved consistencies in the tests may present opportunities for test failures.
Graph Database Providers
Custom ClassResolver
For providers who have built custom serializers in Gryo, there is a new feature open that can be considered. A
GryoMapper
can now take a custom Kryo ClassResolver
, which means that custom types can be coerced to other types
during serialization (e.g. a custom identifier could be serialized as a HashMap
). The advantage to taking this
approach is that users will not need to have the provider’s serializers on the client side. They will only need to
exist on the server (presuming that the type is coerced to a type available on the client, of course). The downside
is that serialization is then no longer a two way street. For example, a custom ClassResolver
that coerced a
custom identifier to HashMap
would let the client work with the identifier as a HashMap
, but the client would then
have to send that identifier back to the server as a HashMap
where it would be recognized as a HashMap
(not an
identifier).
See: TINKERPOP-1064
Feature Consistency
There were a number of corrections made around the consistency of Features
and how they were applied in tests.
Corrections fell into two groups of changes:
- Bugs in how Features were applied to certain tests.
- Refactoring around the realization that VertexFeatures.supportsMetaProperties() is really just a duplicate of features already exposed as VertexPropertyFeatures.supportsAddProperty(). VertexPropertyFeatures.supportsAddProperty() has been deprecated.
These changes related to "Feature Consistency" open up a number of previously non-executing tests for graphs that did not support meta-properties, so providers should be wary of potential test failure on previously non-executing tests.
Graph Processor Providers
InputRDD and OutputRDD Updates
There are two new methods on the Spark-Gremlin RDD interfaces.
- InputRDD.readMemoryRDD(): get a ComputerResult.memory() from an RDD.
- OutputRDD.writeMemoryRDD(): write a ComputerResult.memory() to an RDD.
Note that both these methods have default implementations which simply work with empty RDDs. Most providers will never
need to implement these methods as they are specific to file/RDD management for GraphComputer
. The four classes that
implement these methods are PersistedOutputRDD
, PersistedInputRDD
, InputFormatRDD
, and OutputFormatRDD
. For the
interested provider, study the implementations therein to see the purpose of these two new methods.
TinkerPop 3.1.0
Release Date: November 16, 2015
Please see the changelog for a complete list of all the modifications that are part of this release.
Additional upgrade information can be found here:
Upgrading for Users
Shading Jackson
The Jackson library is now shaded to gremlin-shaded
, which will allow Jackson to version independently without
breaking compatibility with dependent libraries or with those who depend on TinkerPop. The downside is that if a
library depends on TinkerPop and uses the Jackson classes, those classes will no longer exist with the standard
Jackson package naming. They will have to be shifted as follows:
- org.objenesis becomes org.apache.tinkerpop.shaded.objenesis
- com.esotericsoftware.minlog becomes org.apache.tinkerpop.shaded.minlog
- com.fasterxml.jackson becomes org.apache.tinkerpop.shaded.jackson
See: TINKERPOP-835
PartitionStrategy and VertexProperty
PartitionStrategy
now supports partitioning within VertexProperty
. The Graph
needs to be able to support
meta-properties for this feature to work.
See: TINKERPOP-333
Gremlin Server and Epoll
Gremlin Server provides a configuration option to turn on support for Netty native transport on Linux, which has been shown to help improve performance.
See: TINKERPOP-901
Rebindings Deprecated
The notion of "rebindings" has been deprecated in favor of the term "aliases". Alias is a better and more intuitive term than rebindings which should make it easier for newcomers to understand what they are for.
Configurable Driver Channelizer
The Gremlin Driver now allows the Channelizer
to be supplied as a configuration, which means that custom
implementations may be supplied.
See: TINKERPOP-680
GraphML and Strict Option
The GraphMLReader now has a strict option on the Builder. When strict is turned off and a data type for a value is
invalid in some way, GraphMLReader will simply skip that problem value. In that way, it is a bit more forgiving than
before, especially with empty data.
See: TINKERPOP-756
Transaction.close() Default Behavior
The default behavior of Transaction.close() is to roll back the transaction. This is in contrast to previous versions
where the default behavior was commit. Using rollback as the default should be thought of as a safer approach
to closing, where a user must now explicitly call commit() to persist their mutations.
See TINKERPOP-805 for more information.
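The effect of the new default can be modeled without a graph at all. The sketch below uses a toy ToyTransaction class (hypothetical, and not TinkerPop's Transaction interface) whose close() rolls back pending work, so that mutations survive only when commit() is called explicitly before the transaction is closed.

```java
import java.util.ArrayList;
import java.util.List;

final class RollbackOnCloseSketch {
    // Toy stand-in for a transaction; it only illustrates the close() semantics described above.
    static final class ToyTransaction implements AutoCloseable {
        private final List<String> store;
        private final List<String> pending = new ArrayList<>();

        ToyTransaction(List<String> store) { this.store = store; }

        void add(String value) { pending.add(value); }

        void commit() { store.addAll(pending); pending.clear(); }

        void rollback() { pending.clear(); }

        // Mirrors the 3.1.0 default: anything not committed is discarded on close.
        @Override
        public void close() { rollback(); }
    }

    // Runs one unit of work and returns what ended up in the store.
    static List<String> run(boolean commitExplicitly) {
        List<String> store = new ArrayList<>();
        try (ToyTransaction tx = new ToyTransaction(store)) {
            tx.add("marko");
            if (commitExplicitly) tx.commit();
        }
        return store;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints []
        System.out.println(run(true));  // prints [marko]
    }
}
```

Code that relied on the previous commit-on-close behavior would correspond to the first call and would silently lose its mutations, which is why an explicit commit() is now needed.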
ThreadLocal Transaction Settings
The Transaction.onReadWrite() and Transaction.onClose() settings now need to be set for each thread (if a behavior
other than the default is desired). This matters for Gremlin Server users who may be changing these settings via
scripts: if the settings are changed for a sessionless request, they will now only apply to that one request. If the
settings are changed for an in-session request, they will now only apply to future requests made in the scope of that
session.
See TINKERPOP-885
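The per-thread scoping can be illustrated with a plain java.lang.ThreadLocal. This is only a sketch of the general mechanism: the ON_CLOSE variable and its string values are hypothetical stand-ins, and the code does not use the TinkerPop Transaction API.

```java
final class ThreadLocalSettingSketch {
    // Hypothetical per-thread setting with a default value, standing in for onClose()/onReadWrite().
    private static final ThreadLocal<String> ON_CLOSE = ThreadLocal.withInitial(() -> "ROLLBACK");

    // Returns what two different threads observe: index 0 changed the setting, index 1 did not.
    static String[] observe() {
        String[] seen = new String[2];
        Thread configured = new Thread(() -> {
            ON_CLOSE.set("COMMIT");      // changed on this thread only
            seen[0] = ON_CLOSE.get();
        });
        Thread untouched = new Thread(() -> seen[1] = ON_CLOSE.get());
        try {
            configured.start();
            configured.join();
            untouched.start();
            untouched.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        String[] seen = observe();
        System.out.println(seen[0]); // prints COMMIT
        System.out.println(seen[1]); // prints ROLLBACK
    }
}
```

The second thread still sees the default even though it runs after the first thread changed the value, which is the behavior to expect when a setting has not been applied on the thread that ends up doing the work.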
Hadoop-Gremlin
- Hadoop1 is no longer supported. Hadoop2 is now the only supported Hadoop version in TinkerPop.
- Spark and Giraph have been split out of Hadoop-Gremlin into their own respective packages (Spark-Gremlin and Giraph-Gremlin).
- The directory where application jars are stored in HDFS is now hadoop-gremlin-3.2.9-libs.
  - This versioning is important so that cross-version TinkerPop use does not cause jar conflicts.
See: https://issues.apache.org/jira/browse/TINKERPOP-616
Spark-Gremlin
- Providers that wish to reuse a graphRDD can leverage the new PersistedInputRDD and PersistedOutputRDD.
  - This allows the graphRDD to avoid serialization into HDFS for reuse. Be sure to enable a persisted SparkContext (see documentation).
See: https://issues.apache.org/jira/browse/TINKERPOP-868, https://issues.apache.org/jira/browse/TINKERPOP-925
TinkerGraph Serialization
TinkerGraph is serializable over Gryo, which means that it can be shipped over the wire from Gremlin Server. This feature can be useful when working with remote subgraphs.
See: TINKERPOP-728
Deprecation in TinkerGraph
The public static String configurations have been renamed. The old public static variables have been deprecated.
If the deprecated variables were being used, then convert to the replacements as soon as possible.
See: TINKERPOP-926
Deprecation in Gremlin-Groovy
The closure wrapper classes GFunction, GSupplier, and GConsumer have been deprecated. In Groovy, a closure can be
specified using as Function and thus, these wrappers are not needed. Also, the GremlinExecutor.promoteBindings()
method, which was previously deprecated, has been removed.
See: TINKERPOP-879, TINKERPOP-897
Gephi Traversal Visualization
The process for visualizing a traversal has been simplified. There is no longer a need to "name" steps that will
represent visualization points for Gephi. It is possible to just "configure" a visualTraversal in the console:
gremlin> :remote config visualTraversal graph vg
which creates a special TraversalSource from graph called vg. The traversals created from vg can be used
to :submit to Gephi.
Alterations to GraphTraversal
There were a number of changes to GraphTraversal. Many of the changes came by way of deprecation, but some semantics
have changed as well:
- ConjunctionStrategy has been renamed to ConnectiveStrategy (no other behaviors changed).
- ConjunctionP has been renamed to ConnectiveP (no other behaviors changed).
- DedupBijectionStrategy has been renamed (and made more effective) as FilterRankingStrategy.
- The GraphTraversal mutation API has changed significantly with all previous methods being supported but deprecated.
  - The general pattern used now is addE('knows').from(select('a')).to(select('b')).property('weight',1.0).
- The GraphTraversal sack API has changed with all previous methods being supported but deprecated.
  - The old sack(mult,'weight') is now sack(mult).by('weight').
- GroupStep has been redesigned such that there is now only a key- and value-traversal. No more reduce-traversal.
  - The previous group()-methods have been renamed to groupV3d0(). To immediately upgrade, rename all your group()-calls to groupV3d0().
  - To migrate to the new group()-methods, what was group().by('age').by(outE()).by(sum(local)) is now group().by('age').by(outE().sum()).
- There was a bug in fold(), where if a bulked traverser was provided, the traverser was only represented once.
  - This bug fix might cause a breaking change to a user query if the non-bulk behavior was being counted on. If so, use dedup() prior to fold().
- Both GraphTraversal.mapKeys() and GraphTraversal.mapValues() have been deprecated.
  - Use select(keys) and select(values). However, note that select() will not unroll the keys/values. Thus, mapKeys() ⇒ select(keys).unfold().
- The data type of the result of Operator enums will now always be the highest common data type of the two given numbers, rather than the data type of the first number, as was the case before.
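The Operator change in the last point can be pictured as numeric widening. The sketch below is plain Java and is not TinkerPop's implementation: the HighestCommonTypeSketch class, its widest and sum methods, and the assumed ordering of number types (Byte, Short, Integer, Long, Float, Double) are illustrative assumptions, and the real rules may differ in detail.

```java
import java.util.Arrays;
import java.util.List;

final class HighestCommonTypeSketch {
    // Assumed ordering from narrowest to widest; chosen for illustration only.
    private static final List<Class<? extends Number>> RANK = Arrays.asList(
            Byte.class, Short.class, Integer.class, Long.class, Float.class, Double.class);

    // Picks the wider of the two operand types according to RANK.
    static Class<? extends Number> widest(Number a, Number b) {
        int ia = RANK.indexOf(a.getClass());
        int ib = RANK.indexOf(b.getClass());
        if (ia < 0 || ib < 0) throw new IllegalArgumentException("unsupported number type");
        return RANK.get(Math.max(ia, ib));
    }

    // Adds two numbers and returns the result in the widest operand type,
    // rather than in the type of the first operand.
    static Number sum(Number a, Number b) {
        Class<? extends Number> t = widest(a, b);
        if (t == Double.class) return a.doubleValue() + b.doubleValue();
        if (t == Float.class) return a.floatValue() + b.floatValue();
        if (t == Long.class) return a.longValue() + b.longValue();
        if (t == Integer.class) return a.intValue() + b.intValue();
        if (t == Short.class) return (short) (a.shortValue() + b.shortValue());
        return (byte) (a.byteValue() + b.byteValue());
    }

    public static void main(String[] args) {
        Number r1 = sum(1, 2L);    // Integer + Long
        Number r2 = sum(1, 2.5d);  // Integer + Double
        System.out.println(r1 + " " + r1.getClass().getSimpleName()); // prints 3 Long
        System.out.println(r2 + " " + r2.getClass().getSimpleName()); // prints 3.5 Double
    }
}
```

Under the earlier behavior described above, both results would have taken the type of the first operand, so the second sum would have lost its fractional part.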
Aliasing Remotes in the Console
The :remote command in Gremlin Console has a new alias configuration option. This alias option allows
specification of a set of key/value alias/binding pairs to apply to the remote. In this way, it becomes possible
to refer to a variable on the server as something other than what it is referred to for the purposes of the submitted
script. For example, once a :remote is created, this command:
:remote alias x g
would allow "g" on the server to be referred to as "x":
:> x.E().label().groupCount()
See: TINKERPOP-914
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
All providers should be aware that Jackson is now shaded to gremlin-shaded, which could represent a breaking change if
there was usage of the dependency by way of TinkerPop. In that case, a direct dependency on Jackson may be required on
the provider’s side.
Graph System Providers
GraphStep Alterations
- GraphStep is no longer in the sideEffect package, but is now in the map package, as traversals support mid-traversal V().
- Traversals now support mid-traversal V()-steps. Graph system providers should ensure that a mid-traversal V() can leverage any suitable index.
See: https://issues.apache.org/jira/browse/TINKERPOP-762
Decomposition of AbstractTransaction
The AbstractTransaction class has been abstracted into two different classes supporting two different modes of
operation: AbstractThreadLocalTransaction and AbstractThreadedTransaction, where the former should be used when
supporting ThreadLocal transactions and the latter for threaded transactions. Of course, providers may still
choose to build their own implementation on AbstractTransaction itself or simply implement the Transaction
interface.
The AbstractTransaction gains the following methods to potentially implement (though default implementations
are supplied in AbstractThreadLocalTransaction and AbstractThreadedTransaction):
- doReadWrite that should execute the read-write consumer.
- doClose that should execute the close consumer.
See: TINKERPOP-765, TINKERPOP-885
Transaction.close() Default Behavior
The default behavior for Transaction.close() is to roll back the transaction and is enforced by tests, which
previously asserted the opposite (i.e. commit on close). These tests have been renamed to suit the new semantics:
- shouldCommitOnCloseByDefault became shouldCommitOnCloseWhenConfigured
- shouldRollbackOnCloseWhenConfigured became shouldRollbackOnCloseByDefault
If these tests were referenced in an OptOut, then their names should be updated.
See: TINKERPOP-805
Graph Traversal Updates
There were numerous changes to the GraphTraversal API. Nearly all changes are backwards compatible with respective
"deprecated" annotations. Please review the respective updates specified in the "Alterations to GraphTraversal"
section under "Upgrading for Users".
- GraphStep is no longer in the sideEffect package. It is now in the map package.
- Make sure mid-traversal GraphStep calls are folding HasContainers in for index lookups.
- Think about copying TinkerGraphStepStrategyTest for your implementation so you know folding is happening correctly.
Element Removal
Element.Exceptions.elementAlreadyRemoved has been deprecated and test enforcement for consistency has been removed.
Providers are free to deal with deleted elements as they see fit.
See: TINKERPOP-297
VendorOptimizationStrategy Rename
The VendorOptimizationStrategy has been renamed to ProviderOptimizationStrategy. This renaming is consistent
with revised terminology for what were formerly referred to as "vendors".
See: TINKERPOP-876
GraphComputer Updates
GraphComputer.configure(String key, Object value) is now a method (with a default implementation).
This allows the user to specify engine-specific parameters to the underlying OLAP system. These parameters are not intended
to be supported across engines. Moreover, if there are no parameters that can be altered (beyond the standard GraphComputer
methods), then the provider’s GraphComputer implementation should simply return and do nothing.
Driver Providers
Aliases Parameter
The "rebindings" argument to the "standard" OpProcessor has been renamed to "aliases". While "rebindings" is still
supported, it is recommended that the upgrade to "aliases" be made as soon as possible as support will be removed in
the future. Gremlin Server will not accept both parameters at the same time: a request must contain either one
parameter or the other if either is supplied.
See: TINKERPOP-913
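A driver can guard against sending both arguments before a request ever leaves the client. The following is a minimal sketch of such a check over a request's argument map. The AliasArgsSketch class, the resolveAliases method, and the exception message are hypothetical; this document does not show how Gremlin Server itself validates or reports the conflict.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class AliasArgsSketch {
    // Hypothetical check: accepts "aliases" or the deprecated "rebindings", but never both.
    @SuppressWarnings("unchecked")
    static Map<String, String> resolveAliases(Map<String, Object> args) {
        boolean hasAliases = args.containsKey("aliases");
        boolean hasRebindings = args.containsKey("rebindings");
        if (hasAliases && hasRebindings) {
            throw new IllegalArgumentException("supply either 'aliases' or 'rebindings', not both");
        }
        Object value = hasAliases ? args.get("aliases") : args.get("rebindings");
        if (value == null) {
            return Collections.emptyMap();
        }
        return (Map<String, String>) value;
    }

    public static void main(String[] args) {
        Map<String, Object> request = new HashMap<>();
        // Refer to the server-side variable "g" as "x" for this request.
        request.put("aliases", Collections.singletonMap("x", "g"));
        System.out.println(resolveAliases(request)); // prints {x=g}
    }
}
```

Treating "rebindings" as a fallback keeps older callers working while new code moves to "aliases".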
ThreadLocal Transaction Settings
If a driver configures the Transaction.onReadWrite() or Transaction.onClose() settings, note that these settings no
longer apply to all future requests. If the settings are changed for a sessionless request they will only apply to
that one request. If the settings are changed from an in-session request they will only apply to future requests
made in the scope of that session.
See: TINKERPOP-885
TinkerPop 3.0.0
A Gremlin Rāga in 7/16 Time
TinkerPop 3.0.2
Release Date: October 19, 2015
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
BulkLoaderVertexProgram (BLVP)
BulkLoaderVertexProgram now supports arbitrary inputs (in addition to HadoopGraph, which was already supported in
version 3.0.1-incubating). It can now also read from any TP3 enabled graph, like TinkerGraph or Neo4jGraph.
TinkerGraph
TinkerGraph can now be configured to support persistence, where TinkerGraph will try to load a graph from a specified
location and calls to close() will save the graph data to that location.
Gremlin Driver and Server
There were a number of fixes to gremlin-driver that prevent protocol desynchronization when talking to Gremlin
Server.
On the Gremlin Server side, the WebSocket sub-protocol introduces a new "close" operation to explicitly close sessions. Prior to this change, sessions were closed in a more passive fashion (i.e. session timeout). There were also some bug fixes around the protocol as it pertained to third-party drivers (e.g. python) using JSON for authentication.
Upgrading for Providers
Graph Driver Providers
Gremlin Server close Operation
It is important to note that this feature of the sub-protocol applies to the SessionOpProcessor (i.e. for
session-based requests). Prior to this change, there was no way to explicitly close a session. Sessions would get
closed by the server after a period of inactivity. This new "op" gives drivers the ability to close the session
explicitly and as needed.
TinkerPop 3.0.1
Release Date: September 2, 2015
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin Server
Gremlin Server now supports a SASL-based (Simple Authentication and Security Layer) authentication model and a default
SimpleAuthenticator which implements the PLAIN SASL mechanism (i.e. plain text) to authenticate requests. This gives
Gremlin Server some basic security capabilities, especially when combined with its built-in SSL feature.
There have also been changes in how global variable bindings in Gremlin Server are established via initialization
scripts. The initialization scripts now allow for a Map of values that can be returned from those scripts.
That Map will be used to set global bindings for the server. See this sample script for an example.
See: TINKERPOP-576
Neo4j
Problems related to using :install to get the Neo4j plugin operating in Gremlin Console on Windows have been
resolved.
See: TINKERPOP-804
Upgrading for Providers
Graph System Providers
GraphFactoryClass Annotation
Providers can consider the use of the new GraphFactoryClass annotation to specify the factory class that GraphFactory
will use to open a new Graph instance. This is an optional feature and will generally help implementations that have
an interface extending Graph. If that is the case, then this annotation can be used in the following fashion:
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.util.GraphFactoryClass;
@GraphFactoryClass(MyGraphFactory.class)
public interface MyGraph extends Graph {
}
MyGraphFactory must contain the static open method that is normally expected by GraphFactory.
See: TINKERPOP-778
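To show why the factory class needs a static open method, the following sketch models an annotation-driven lookup using only the JDK. Everything in it is hypothetical and simplified: FactoryClass, ToyGraph, ToyGraphFactory, and the Map-based configuration are stand-ins and do not reproduce TinkerPop's GraphFactory, its annotation, or its configuration type.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

final class FactoryAnnotationSketch {
    // Toy annotation naming the class that knows how to open an instance.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface FactoryClass { Class<?> value(); }

    // An interface cannot be instantiated directly, so it points at a factory.
    @FactoryClass(ToyGraphFactory.class)
    interface ToyGraph { String name(); }

    static final class ToyGraphFactory {
        // The static open method that the lookup below expects to find.
        public static ToyGraph open(Map<String, Object> conf) {
            String name = String.valueOf(conf.get("name"));
            return () -> name;
        }
    }

    // Uses the annotated factory if present, otherwise the class itself, and calls its static open method.
    static Object open(Class<?> graphClass, Map<String, Object> conf) {
        FactoryClass fc = graphClass.getAnnotation(FactoryClass.class);
        Class<?> target = (fc == null) ? graphClass : fc.value();
        try {
            Method open = target.getMethod("open", Map.class);
            return open.invoke(null, conf);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable static open method on " + target.getName(), e);
        }
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("name", "toy");
        ToyGraph graph = (ToyGraph) open(ToyGraph.class, conf);
        System.out.println(graph.name()); // prints toy
    }
}
```

Without the annotation, the lookup would try to find open on the interface itself and fail, which is the situation the annotation is meant to address.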
GraphProvider.Descriptor Annotation
There was a change that affected providers who implemented GraphComputer related tests such as the
ProcessComputerSuite. If the provider runs those tests, then edit the GraphProvider implementation for those suites to
include the GraphProvider.Descriptor annotation as follows:
@GraphProvider.Descriptor(computer = GiraphGraphComputer.class)
public final class HadoopGiraphGraphProvider extends HadoopGraphProvider {
    public GraphTraversalSource traversal(final Graph graph) {
        return GraphTraversalSource.build().engine(ComputerTraversalEngine.build().computer(GiraphGraphComputer.class)).create(graph);
    }
}
See: TINKERPOP-690 for more information.
Semantics of Transaction.close()
There were some adjustments to the test suite with respect to how Transaction.close() was being validated. For most
providers, this will generally mean checking OptOut annotations for test renaming problems. The error that occurs when
running the test suite should make it apparent that a test name is incorrect in an OptOut if there are issues there.
See: TINKERPOP-764 for more information.
Graph Driver Providers
Authentication
Gremlin Server now supports SASL-based authentication. By default, Gremlin Server is not configured with authentication turned on and authentication is not required, so existing drivers should still work without any additional change. Drivers should however consider implementing this feature as it is likely that many users will want the security capabilities that it provides.