TinkerPop Upgrade Information
TinkerPop 3.7.0
Gremfir Master of the Pan Flute
TinkerPop 3.7.4
Release Date: NOT OFFICIALLY RELEASED YET
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Upgrading for Providers
Graph System Providers
Graph Driver Providers
TinkerPop 3.7.3
Release Date: October 23, 2024
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
GraphBinary Compatibility
When properties on elements were introduced and returned by default in 3.7.0, setting ReferenceElementStrategy on the
server provided a way to continue sending references for lightweight wire transfer and compatibility reasons. However,
an issue was discovered where, when using GraphBinary, the 3.7.x server was not serializing properties as null as per
the IO specification, but as empty lists instead. This caused deserialization failures in Python, JavaScript and Go
driver versions 3.6.x or below.
A fix was introduced to correct this error: Gremlin Server versions 3.7.3 and above return element properties as null
when ReferenceElementStrategy is applied, or when the tokens value is used with the materializeProperties option in
3.7.x drivers. However, this also changed 3.7.x driver behavior, as all non-Java drivers would then return null
instead of an empty list. As such, an additional change was introduced in these GLVs, where null properties from
reference elements are now deserialized into an empty list, to maintain the behavior of older 3.7.x drivers.
One caveat is that 3.7.0 to 3.7.2 drivers connecting to a 3.7.3 or later server do not contain the deserialization
change and return null as properties. In these cases, it is recommended to upgrade to 3.7.3 drivers.
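The driver-side change described above can be pictured with a small, library-free Python sketch. The function name `normalize_properties` and the dictionary layout are assumptions made for illustration only; they are not the actual GLV deserializer code.

```python
def normalize_properties(element):
    """Return a copy of a deserialized element whose null properties become an empty list.

    `element` is a hypothetical dict standing in for a reference vertex or edge;
    real drivers perform this step inside their GraphBinary deserializers.
    """
    normalized = dict(element)
    if normalized.get("properties") is None:
        normalized["properties"] = []
    return normalized


# A 3.7.3+ server sends null properties for reference elements.
reference_vertex = {"id": 1, "label": "person", "properties": None}
print(normalize_properties(reference_vertex))
```

A 3.7.0 to 3.7.2 driver lacks the equivalent of this step, which is why it surfaces null instead of an empty list.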
TinkerPop 3.7.2
Release Date: April 8, 2024
Please see the changelog for a complete list of all the modifications that are part of this release.
TinkerPop 3.7.1
Release Date: November 20, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
String Manipulation Steps
This version introduces the following new string manipulation steps: asString(), length(), toLower(), toUpper(),
trim(), lTrim(), rTrim(), reverse(), replace(), split(), substring(), and format(), as well as
modifications to the concat() step introduced in 3.7.0.
Updates to String Step concat():
Concat has been modified to take traversal varargs instead of a single traversal. Users no longer have to chain concat() steps together to concatenate multiple traversals:
gremlin> g.V(1).outE().as("a").V(1).values("name").concat(select("a").label(), select("a").inV().values("name"))
==>markocreatedlop
==>markoknowsvadas
==>markoknowsjosh
A notable breaking change from 3.7.0 is that the output order of inject() as a child of concat() has changed to be
consistent with other parent steps. Any 3.7.0 uses of concat(inject(X)) should change to concat(constant(X)) to
retain the old semantics.
// 3.7.0
gremlin> g.inject("a").concat(inject("b"))
==>ab
// 3.7.1
gremlin> g.inject("a").concat(inject())
==>aa
gremlin> g.inject("a").concat(inject("b"))
==>aa
gremlin> g.inject("a").concat(constant("b"))
==>ab
New String Steps asString(), length(), toLower(), toUpper():
The following example demonstrates the use of a closure to perform the above functions:
gremlin> g.V().hasLabel("person").values("age").map{it.get().toString()}
==>29
==>27
==>32
==>35
gremlin> g.V().values("name").map{it.get().length()}
==>5
==>5
==>3
==>4
==>6
==>5
gremlin> g.inject("TO", "LoWeR", "cAsE").map{it.get().toLowerCase()}
==>to
==>lower
==>case
gremlin> g.V().values("name").map{it.get().toUpperCase()}
==>MARKO
==>VADAS
==>LOP
==>JOSH
==>RIPPLE
==>PETER
With these additional steps this operation can be performed with standard Gremlin syntax:
gremlin> g.V().hasLabel("person").values("age").asString()
==>29
==>27
==>32
==>35
gremlin> g.V().values("name").length()
==>5
==>5
==>3
==>4
==>6
==>5
gremlin> g.inject("TO", "LoWeR", "cAsE").toLower()
==>to
==>lower
==>case
gremlin> g.V().values("name").toUpper()
==>MARKO
==>VADAS
==>LOP
==>JOSH
==>RIPPLE
==>PETER
Scopes are also enabled on these string functions. The global scope is synonymous with the parameterless function call and only accepts string traversers. The local scope also operates inside lists of strings.
gremlin> g.V().values("name").fold().toUpper(local)
==>[MARKO,VADAS,LOP,JOSH,RIPPLE,PETER]
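The difference between the two scopes can be modelled in a few lines of plain Python. This is only a sketch of the rule stated above: the function name `to_upper`, the string-valued `scope` parameter, and the choice of `TypeError` for rejected input are all assumptions and do not reflect the TinkerPop implementation.

```python
def to_upper(value, scope="global"):
    """Model of toUpper() scoping.

    Global scope accepts a single string only; local scope additionally
    maps over a list of strings.
    """
    if scope == "local" and isinstance(value, list):
        return [to_upper(item) for item in value]
    if not isinstance(value, str):
        # Hypothetical error type; the real step reports its own exception.
        raise TypeError("expected a string traverser, got " + type(value).__name__)
    return value.upper()


print(to_upper("marko"))                           # MARKO
print(to_upper(["marko", "vadas"], scope="local")) # ['MARKO', 'VADAS']
```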
New String Steps trim(), lTrim(), rTrim(), reverse():
The following example demonstrates the use of a closure to reverse and trim strings (concatenated with a string for demonstration):
gremlin> g.V().values("name").map{it.get().reverse()}
==>okram
==>sadav
==>pol
==>hsoj
==>elppir
==>retep
gremlin> g.inject(" hi ").map{it.get().trim() + "trim"}
==>hitrim
gremlin> g.inject(" hi ").map{it.get().replaceAll(/^\s+/, '') + "left_trim"}
==>hi left_trim
gremlin> g.inject(" hi ").map{it.get().replaceAll(/\s+$/, '') + "right_trim"}
==> hiright_trim
With these additional steps this operation can be performed with standard Gremlin syntax:
gremlin> g.V().values("name").reverse()
==>okram
==>sadav
==>pol
==>hsoj
==>elppir
==>retep
gremlin> g.inject(" hi ").trim().concat("trim")
==>hitrim
gremlin> g.inject(" hi ").lTrim().concat("left_trim")
==>hi left_trim
gremlin> g.inject(" hi ").rTrim().concat("right_trim")
==> hiright_trim
Scopes are enabled on trim(), lTrim(), and rTrim(). The global scope is synonymous with the parameterless function call and only accepts string traversers. The local scope also operates inside lists of strings. Because reverse() is overloaded as a list function, scope is not applied to it, as reversing lists inside of lists is not a practical use case.
gremlin> g.inject([" hello ", " world "]).trim(Scope.local)
==>[hello,world]
New String Steps replace(), split(), substring()
The following example demonstrates the use of a closure to perform replace() and split() functions:
gremlin> g.V().hasLabel("software").values("name").map{it.get().replace("p", "g")}
==>log
==>riggle
gremlin> g.V().hasLabel("person").values("name").map{it.get().split("a")}
==>[m, rko]
==>[v, d, s]
==>[josh]
==>[peter]
With these additional steps this operation can be performed with standard Gremlin syntax:
gremlin> g.V().hasLabel("software").values("name").replace("p", "g")
==>log
==>riggle
gremlin> g.V().hasLabel("person").values("name").split("a")
==>[m,rko]
==>[v,d,s]
==>[josh]
==>[peter]
For substring(), the new Gremlin step follows the Python standard, taking a start index and optionally an
end index as parameters. This enables certain operations that would be complex to achieve with closures:
gremlin> g.V().hasLabel("person").values("name").map{it.get().substring(1,4)}
==>ark
==>ada
==>osh
==>ete
gremlin> g.V().hasLabel("person").values("name").map{it.get().substring(1)}
==>arko
==>adas
==>osh
==>eter
gremlin> g.V().hasLabel("person").values("name").map{it.get().substring(-2)}
String index out of range: -2
Type ':help' or ':h' for help.
The substring()-step will return a substring with indices specified by the start and end indices, or from
the start index to the remainder of the string if an end index is not specified. Negative indices are allowed and will
count from the end of the string:
gremlin> g.V().hasLabel("person").values("name").substring(1,4)
==>ark
==>ada
==>osh
==>ete
gremlin> g.V().hasLabel("person").values("name").substring(1)
==>arko
==>adas
==>osh
==>eter
gremlin> g.V().hasLabel("person").values("name").substring(-2)
==>ko
==>as
==>sh
==>er
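Since the text states that substring() follows the Python standard, the same results can be reproduced with ordinary Python slicing. The helper below is a sketch for comparison only; the name `substring` is reused for readability, and edge cases beyond the examples shown are not claimed to match.

```python
names = ["marko", "vadas", "josh", "peter"]


def substring(text, start, end=None):
    """Slice-based model of substring(start) and substring(start, end)."""
    return text[start:end]


print([substring(n, 1, 4) for n in names])  # ['ark', 'ada', 'osh', 'ete']
print([substring(n, 1) for n in names])     # ['arko', 'adas', 'osh', 'eter']
print([substring(n, -2) for n in names])    # ['ko', 'as', 'sh', 'er']
```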
New String Step format()
This step is designed to simplify some string operations. In general, it is similar to the string formatting function available in many programming languages. Variable values can be picked up from Element properties, maps and scope variables.
gremlin> g.V().format("%{name} is %{age} years old")
==>marko is 29 years old
==>vadas is 27 years old
==>josh is 32 years old
==>peter is 35 years old
gremlin> g.V().hasLabel("person").as("a").values("name").as("p1").select("a").in("knows").format("%{p1} knows %{name}")
==>vadas knows marko
==>josh knows marko
gremlin> g.V(1).format("%{name} has %{_} connections").by(bothE().count())
==>marko has 3 connections
See: TINKERPOP-2334, format()-step
List Manipulation Steps
Additional List manipulation/filter steps have been added to replace the use of closures: any(), all(), product(),
merge(), intersect(), combine(), conjoin(), difference(), disjunct() and reverse().
The following example demonstrates usage of the newly introduced steps:
gremlin> g.V().values("age").fold().all(P.gt(10))
==>[29,27,32,35]
gremlin> g.V().values("age").fold().any(P.eq(32))
==>[29,27,32,35]
gremlin> g.V().values("age").fold().product(__.V().values("age").limit(2).fold())
==>[[29,29],[29,27],[27,29],[27,27],[32,29],[32,27],[35,29],[35,27]]
gremlin> g.V().values("age").fold().merge([32,30,50])
==>[32,50,35,27,29,30]
gremlin> g.V().values("age").fold().combine([32,30,50])
==>[29,27,32,35,32,30,50]
gremlin> g.V().values("age").fold().intersect([32,30,50])
==>[32]
gremlin> g.V().values("age").fold().disjunct([32,30,50])
==>[50,35,27,29,30]
gremlin> g.V().values("age").fold().difference([32,30,50])
==>[35,27,29]
gremlin> g.V().values("age").order().by(desc).fold().reverse()
==>[27,29,32,35]
gremlin> g.V().values("age").fold().conjoin("-")
==>29-27-32-35
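For readers more familiar with set operations in a general-purpose language, the results above line up with the following plain-Python analogy. It is a sketch only: it treats merge(), intersect(), disjunct() and difference() as set union, intersection, symmetric difference and difference, which agrees with the sample output but does not reproduce Gremlin's result ordering.

```python
from itertools import product

ages = [29, 27, 32, 35]
other = [32, 30, 50]

merged = set(ages) | set(other)        # merge(): union without duplicates
combined = ages + other                # combine(): concatenation, duplicates kept
intersection = set(ages) & set(other)  # intersect()
disjunction = set(ages) ^ set(other)   # disjunct(): values in exactly one list
difference = set(ages) - set(other)    # difference()
pairs = [list(p) for p in product(ages, [29, 27])]  # product()
joined = "-".join(str(a) for a in ages)             # conjoin("-")

print(sorted(merged))        # [27, 29, 30, 32, 35, 50]
print(combined)              # [29, 27, 32, 35, 32, 30, 50]
print(sorted(intersection))  # [32]
print(sorted(disjunction))   # [27, 29, 30, 35, 50]
print(sorted(difference))    # [27, 29, 35]
print(joined)                # 29-27-32-35
```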
Date Manipulation Steps
Date manipulations in Gremlin queries were only possible using closures, which may or may not be supported by
different providers. In 3.7.1, we introduce the asDate(), dateAdd() and dateDiff() steps, intended to replace the
use of closures.
The following example demonstrates usage of newly introduced steps:
gremlin> g.inject("2023-08-02T00:00:00Z").asDate().dateAdd(DT.day, 7).dateDiff(datetime("2023-08-02T00:00:00Z"))
==>604800
See: asDate()-step, dateAdd()-step, dateDiff()-step, TINKERPOP-2979
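The arithmetic behind the 604800 result in the example can be checked with Python's standard datetime module. This sketch only mirrors the calculation (seven days expressed in seconds); it does not use any TinkerPop API, and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Parse the same timestamp used in the Gremlin example.
start = datetime.strptime("2023-08-02T00:00:00Z", "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)

# Equivalent of dateAdd(DT.day, 7).
later = start + timedelta(days=7)

# Equivalent of dateDiff(...): the difference expressed in seconds.
diff_seconds = int((later - start).total_seconds())
print(diff_seconds)  # 604800
```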
datetime() for Current Server Time
The datetime() function has been extended to return the current server time when used without an argument.
gremlin> datetime().toGMTString()
==>13 Oct 2023 20:44:20 GMT
Upgrading for Providers
Graph System Providers
MultiProperty and MetaProperty Test Tags
The @MultiMetaProperties tag signified Gherkin feature tests that were using multi-properties and/or meta-properties.
The features were originally combined as a single tag because tests that had the tag used the crew graph for testing.
As time has gone on, some tests have used the empty graph and inserted their own test data that uses one or the other
feature. In an effort to better allow graphs to support one feature or the other and to test them, the single tag has
been split into two tags: @MultiProperties and @MetaProperties. The original @MultiMetaProperties tag has been
removed.
InsertionOrderingRequired Test Tag
Added a new @InsertionOrderingRequired tag which signifies Gherkin feature tests that are reliant on the graph system predictably returning results (vertices, edges, properties) in the same order in which they were inserted into the graph. These tests should be skipped by any graph which does not guarantee such ordering.
TinkerPop 3.7.0
Release Date: July 31, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
String concat() Step
String manipulations in Gremlin queries were only possible using closures, which may or may not be supported by
different providers. In 3.7.0, we introduce the concat()-step as the beginning of a series of string manipulation steps
intended to replace the use of closures.
The following example demonstrates the use of a closure to add a new vertex with a label like an existing vertex but with some prefix attached:
gremlin> g.V(1).map{"prefix_" + it.get().label}.as('a').addV(select('a'))
==>v[13]
gremlin> g.V(13).label()
==>prefix_person
With the concat() step this operation can be performed with standard Gremlin syntax:
gremlin> g.addV(constant("prefix_").concat(__.V(1).label()))
==>v[14]
gremlin> g.V(14).label()
==>prefix_person
See: TINKERPOP-2672
union() Start Step
The union()-step could only be used mid-traversal after a start step. The typical workaround for this issue was to
use inject() with a dummy value to start the traversal and then utilize union():
gremlin> g.inject(0).union(V().has('name','vadas'),
......1> V().has('software','name','lop').in('created')).
......2> values('name')
==>vadas
==>marko
==>josh
==>peter
As of this version, union() can be used more directly to avoid the workaround:
gremlin> g.union(V().has('name','vadas'),
......1> V().has('software','name','lop').in('created')).
......2> values('name')
==>vadas
==>marko
==>josh
==>peter
See: TINKERPOP-2873
Map and Cardinality
Relatively recent changes to the Gremlin language have allowed properties to be set by way of a Map. As it pertains
to vertices, a Map can be given to mergeV() and property() steps. The limitation was that setting Cardinality
with this syntax was not possible without reverting back to property() steps that took a Cardinality as an argument
in some way. The following paragraphs show how changes in 3.6.5 make this syntax much better for multi-properties.
The mergeV() step makes it much easier to write upsert-like traversals. Of course, if you had a graph that required
the use of multi-properties, some of the ease of mergeV() was lost. It typically meant falling back to traversals
using sideEffect() or similar direct uses of property() to allow it to work properly:
g.mergeV([(T.id): '1234']).
option(onMatch, sideEffect(property(single,'age', 20).
property(set,'city','miami')).constant([:]))
For this version, mergeV() gets two new bits of syntax. First, it is possible to individually define the cardinality
for each property value in the Map for onCreate or onMatch events. Therefore, the above example could be written
as:
gremlin> g.addV().property(id,1234).property('age',19).property(set, 'city', 'detroit')
==>v[1234]
gremlin> g.mergeV([(T.id): 1234]).
......1> option(onMatch, ['age': single(20), 'city': set('miami')])
==>v[1234]
gremlin> g.V(1234).valueMap()
==>[city:[detroit,miami],age:[20]]
The other option available is to provide a default Cardinality to the option() as follows, continuing from the
previous example:
gremlin> g.mergeV([(T.id): 1234]).
......1> option(onMatch, ['age': 21, 'city': set('orlando')], single)
==>v[1234]
gremlin> g.mergeV([(T.id): 1234]).
......1> option(onMatch, ['age': 22, 'city': set('boston')], single)
==>v[1234]
gremlin> g.V(1234).valueMap()
==>[city:[detroit,miami,orlando,boston],age:[22]]
In the above example, any property value that does not have its cardinality explicitly defined is assumed to have the cardinality given as the argument.
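The interaction between per-value cardinality and the default cardinality can be modelled with a short, library-free Python sketch. Everything here is hypothetical scaffolding: the `CardinalityValue` class only borrows its name from the text, `set_` stands in for Gremlin's `set`, vertices are plain dicts of lists, and only the single and set cardinalities are modelled.

```python
class CardinalityValue:
    """Hypothetical stand-in for a property value tagged with a cardinality."""

    def __init__(self, cardinality, value):
        self.cardinality = cardinality
        self.value = value


def single(value):
    return CardinalityValue("single", value)


def set_(value):
    return CardinalityValue("set", value)


def apply_properties(vertex, updates, default="single"):
    """Apply a property map; untagged values take the default cardinality."""
    for key, raw in updates.items():
        tagged = raw if isinstance(raw, CardinalityValue) else CardinalityValue(default, raw)
        if tagged.cardinality == "single":
            vertex[key] = [tagged.value]          # replace any existing values
        elif tagged.value not in vertex.setdefault(key, []):
            vertex[key].append(tagged.value)      # set: add only if absent
    return vertex


# Mirrors the console session above: start state, then three updates.
vertex = {"age": [19], "city": ["detroit"]}
apply_properties(vertex, {"age": single(20), "city": set_("miami")})
apply_properties(vertex, {"age": 21, "city": set_("orlando")}, default="single")
apply_properties(vertex, {"age": 22, "city": set_("boston")}, default="single")
print(vertex)  # {'age': [22], 'city': ['detroit', 'miami', 'orlando', 'boston']}
```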
For property(Map) the Cardinality could be set universally for the Map with property(Cardinality, Map) but
there was no mechanism to set that value individually. Using the same pattern above and constructing a
CardinalityValue now allows this possibility.
gremlin> g.addV().property(id,1234).property('age',19).property(set, 'city', 'detroit')
==>v[1234]
gremlin> g.V(1234).property(['age': 20, 'city': set('miami')])
==>v[1234]
gremlin> g.V(1234).property(['age': single(21), 'city': set('orlando')])
==>v[1234]
gremlin> g.V(1234).property(single, ['age': 21, 'city': set('boston')])
==>v[1234]
gremlin> g.V(1234).valueMap()
==>[city:[detroit,miami,orlando,boston],age:[21]]
See: TINKERPOP-2957
TinkerGraph Transactions
Previously, there was no reference implementation provided for the Transaction API as this feature wasn’t supported by
TinkerGraph. Users were instead directed towards the Neo4jGraph provided in neo4j-gremlin if they wanted to get access
to a Graph implementation that supported transactions. Unfortunately, the maintenance around this plugin has largely
been abandoned and is only compatible with Neo4j version 3.4, which reached end of life in March 2020.
As of this version, we are introducing the transactional TinkerGraph, TinkerTransactionGraph, which is TinkerGraph with
transaction capabilities. The TinkerTransactionGraph has read committed isolation level, which is the same as the
Neo4jGraph provided in neo4j-gremlin. Only ThreadLocal transactions are implemented, therefore embedded graph
transactions may not be fully supported. These transaction semantics may not fit the use case for some production
scenarios that require strict ACID-like transactions. Therefore, it is recommended that TinkerTransactionGraph be used
as a Graph for test environments where you still require support for transactions.
Usage examples
To use TinkerTransactionGraph remotely, start a Gremlin Server with the included gremlin-server-transaction.yaml
config file.
bin/gremlin-server.sh conf/gremlin-server-transaction.yaml
Then to connect with Java:
GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using("localhost", 8182, "g")); //1
GraphTraversalSource gtx = g.tx().begin(); //2
try {
    gtx.addV("test1").iterate(); //3
    gtx.addV("test2").iterate(); //3
    gtx.tx().commit(); //4
} catch (Exception ex) {
    gtx.tx().rollback(); //5
}
1. Create connection to Gremlin Server with transaction enabled graph.
2. Spawn a GraphTraversalSource with opened transaction.
3. Make some updates to graph.
4. Commit all changes.
5. Rollback all changes on error.
One can also use the remote TinkerTransactionGraph in Gremlin Console:
gremlin> :remote connect tinkerpop.server conf/remote.yaml session //1
==>Configured localhost/127.0.0.1:8182-[2e70bf11-12f7-4dfe-8a5e-a3d57f0df304]
gremlin> g = traversal().withRemote(DriverRemoteConnection.using("localhost",8182,"g"))
==>graphtraversalsource[emptygraph[empty], standard]
gremlin> gtx = g.tx().begin() //2
==>graphtraversalsource[emptygraph[empty], standard]
gremlin> gtx.addV('test').property('name', 'one')
==>v[0]
gremlin> gtx.V().valueMap()
==>[name:[one]]
gremlin> g.V().valueMap()
gremlin> gtx.tx().commit()
==>null
gremlin> g.V().valueMap() //3
==>[name:[one]]
gremlin> g.V()
==>v[0]
gremlin> gtx = g.tx().begin() //4
==>graphtraversalsource[emptygraph[empty], standard]
gremlin> gtx.addV('test').property('name', 'two')
==>v[2]
gremlin> gtx.V().valueMap()
==>[name:[one]]
==>[name:[two]]
gremlin> g.V().valueMap()
==>[name:[one]]
gremlin> gtx.tx().rollback()
==>null
gremlin> g.V().valueMap() //5
==>[name:[one]]
1. Open remote Console session and spawn remote graph traversal source for the empty TinkerTransactionGraph.
2. Spawn a GraphTraversalSource by opening a transaction.
3. The vertex is not added to the remote graph until we commit the transaction (which automatically closes the transaction).
4. Spawn another GraphTraversalSource by opening a new transaction.
5. The second vertex will not be added to the remote graph since we rolled back the change.
To use the embedded TinkerTransactionGraph in Gremlin Console:
gremlin> graph = TinkerTransactionGraph.open() //1
==>tinkertransactiongraph[vertices:0 edges:0]
gremlin> g = traversal().withEmbedded(graph) //2
==>graphtraversalsource[tinkertransactiongraph[vertices:0 edges:0], standard]
gremlin> g.addV('test').property('name','one')
==>v[0]
gremlin> g.tx().commit() //3
==>null
gremlin> g.V().valueMap()
==>[name:[one]]
gremlin> g.addV('test').property('name','two') //4
==>v[2]
gremlin> g.V().valueMap()
==>[name:[one]]
==>[name:[two]]
gremlin> g.tx().rollback() //5
==>null
gremlin> g.V().valueMap()
==>[name:[one]]
1. Open transactional graph.
2. Spawn a GraphTraversalSource with transactional graph.
3. Commit the add vertex operation.
4. Add a second vertex without committing.
5. Rollback the change.
Note that all embedded TinkerTransactionGraph transactions remain ThreadLocal transactions, meaning that all traversal
sources spawned from the graph will operate within the same transaction scope.
Important: TinkerTransactionGraph comes with performance and semantic limitations, where the former is expected to
be resolved in future versions. Since its primary recommended use case is for testing, these limitations should not be
an impediment. Production use cases for TinkerGraph should generally prefer the non-transactional implementation.
Properties on Elements
One of the peculiar aspects of using Gremlin remotely is that if you do something like v = g.V().next() you will
find that v, the Vertex object, does not have any properties associated with it, even if the database
associates some with it. It will be a "reference" only, in that it will only have an id and label. The reason and
history for this approach can be found on the dev list.
While this has been a long-standing way TinkerPop operates, it is a confusing point for new users and often forces
some inconvenience on folks by requiring them to alter queries to transform graph elements to other forms that can
carry the property data (e.g. elementMap()).
With this new release, properties are finally available on graph elements for all programming languages and are now returned by default for OLTP requests. Gremlin Server 3.5 and 3.6 can return properties only in some special cases.
Queries still won’t return properties on Elements for OLAP, which deals with references only, as it always has, irrespective of remote or local execution.
Consider the following example of this functionality with JavaScript:
const client = new Client('ws://localhost:8182/gremlin',{traversalSource: 'gmodern'});
await client.open();
const result = await client.submit('g.V(1)');
console.log(JSON.stringify(result.first()));
await client.close();
The result will be different depending on the version of Gremlin Server. For 3.5/3.6:
{"id":1,"label":"person"}
For 3.7:
{"id":1,"label":"person","properties":{"name":[{"id":0,"label":"name","value":"marko","key":"name"}],"age":[{"id":1,"label":"age","value":29,"key":"age"}]}}
Enabling the previous behavior
Note that drivers from earlier versions like 3.5 and 3.6 will not be able to retrieve properties on elements. Older drivers connecting to 3.7.x servers should disable this functionality server-side:
Configure Gremlin Server to not return properties by updating the Gremlin Server initialization script with
ReferenceElementStrategy. This configuration is essentially the one used in older versions of the server by default.
globals << [g : traversal().withEmbedded(graph).withStrategies(ReferenceElementStrategy)]
For 3.7 drivers, properties on elements can also be disabled per request using the tokens option with materializeProperties.
g.With("materializeProperties", "tokens").V(1).Next()
Possible issues
ReferenceElement-type objects are no longer returned by the server by default. When upgrading existing code to 3.7.0,
it is possible that this change could have some impact if you directly declared use of those classes. For example:
ReferenceVertex v = g.V().next();
would need to be changed to:
Vertex v = g.V().next();
In other words, it would be best to code to the various structural interfaces like Vertex and Edge rather than
specific implementations.
See: TINKERPOP-2824
Gremlin.NET: Nullable Annotations
Gremlin.NET now uses nullable annotations to state whether an argument or a return value can be null or not. This
should make it much less likely to get a NullReferenceException from Gremlin.NET.
This change required some breaking changes, but most users should not be affected, as the breaking changes are limited to APIs that are mostly intended for graph driver providers.
See: TINKERPOP-2348
Removed connectOnStartup javascript
Removed the connectOnStartup option for Gremlin Javascript API to resolve potential unhandledRejection and race
conditions. New DriverRemoteConnection objects no longer initiate connection by default at startup. Call open()
explicitly if one wishes to manually connect on startup.
For example:
const drc = new DriverRemoteConnection(url);
drc.open().catch(err => {
// Handle error upon open.
})
Creation of New gremlin-util Module
gremlin-driver has been refactored and several classes have been extracted to a new gremlin-util module. Any classes
which are utilized by both gremlin-driver and gremlin-server have been extracted to gremlin-util. This includes
the entire tinkerpop.gremlin.driver.ser and tinkerpop.gremlin.driver.message packages as well as
tinkerpop.gremlin.driver.MessageSerializer and tinkerpop.gremlin.driver.Tokens. For a full list of the migrated
classes, see: TINKERPOP-2819.
All migrated classes have had their packages updated to reflect this change. For these classes, packages have changed
from tinkerpop.gremlin.driver. to tinkerpop.gremlin.util. For example,
org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1 has been updated to
org.apache.tinkerpop.gremlin.util.ser.GraphBinaryMessageSerializerV1. All imports of these classes should be updated
to reflect this change. All server config files which declare a list of serializers should also be updated to
reflect the new location of serializer classes.
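When many imports or config entries need updating, the rename rule above is mechanical enough to script. The following Python sketch is one possible helper, not an official migration tool: the function name `migrate_class_name` is hypothetical, and the list of moved packages and classes covers only those named in this section (TINKERPOP-2819 has the full list).

```python
OLD_PREFIX = "org.apache.tinkerpop.gremlin.driver."
NEW_PREFIX = "org.apache.tinkerpop.gremlin.util."

# Only the packages and classes named in this section; see TINKERPOP-2819 for the rest.
MOVED_PACKAGES = ("ser.", "message.")
MOVED_CLASSES = ("MessageSerializer", "Tokens")


def migrate_class_name(name):
    """Rewrite a fully qualified class name if it was moved to gremlin-util."""
    if not name.startswith(OLD_PREFIX):
        return name
    remainder = name[len(OLD_PREFIX):]
    if remainder.startswith(MOVED_PACKAGES) or remainder in MOVED_CLASSES:
        return NEW_PREFIX + remainder
    return name  # classes that stayed in gremlin-driver are left untouched


print(migrate_class_name("org.apache.tinkerpop.gremlin.driver.ser.GraphBinaryMessageSerializerV1"))
# org.apache.tinkerpop.gremlin.util.ser.GraphBinaryMessageSerializerV1
```

Names outside the moved set, such as org.apache.tinkerpop.gremlin.driver.Client, pass through unchanged.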
See: TINKERPOP-2819
Removal of gremlin-driver from gremlin-server
gremlin-driver is no longer a dependency of gremlin-server and thus will no longer be packaged in server
distributions. Any app which makes use of both gremlin-driver and gremlin-server will now need to directly
include both modules.
Serializer Renaming
Serializers tended to have a standard suffix that denotes the version. It usually appears as something like "V1d0". The "d0" portion of this has always been a bit superfluous and was actually not used when GraphBinary was introduced, preferring a simple "V1". To bring greater consistency to the naming the "d0" has been dropped from all places where it was referenced that way.
There was a bit of a misnaming in the early days of TinkerPop 3.x where typed versus untyped json was mixed up among
the GraphSON MessageSerializer implementations. For GraphSON 1.0, untyped GraphSON was referred to as
GraphSONMessageSerializerV1d0 and typed as GraphSONMessageSerializerGremlinV1d0, but for version 2.0 of GraphSON,
the idea of untyped GraphSON was left behind and so typed GraphSON became GraphSONMessageSerializerV2d0 which
followed to version 3.0. With the return of typed and untyped GraphSON for 3.6.5, it seemed important to unify all
of this naming and given the previously mentioned removal of the "d0" we now have:
- GraphSONMessageSerializerV1 is now typed GraphSON 1.0
- GraphSONMessageSerializerGremlinV1d0 is removed.
- GraphSONUntypedMessageSerializerV1 is now untyped GraphSON 1.0
- GraphSONMessageSerializerV2 is now typed GraphSON 2.0
- GraphSONMessageSerializerGremlinV2d0 is removed - it was deprecated in 3.4.0 actually and served little purpose
- GraphSONUntypedMessageSerializerV2 is now untyped GraphSON 2.0
- GraphSONMessageSerializerV3 is typed GraphSON 3.0 as it always has been
- GraphSONUntypedMessageSerializerV3 is untyped GraphSON 3.0 which is newly added
Building and Running with JDK 17
You can now run TinkerPop with Java 17. Be advised that there are some issues with reflection and so you may need to either --add-opens or --add-exports certain modules to enable it to work with Java 17. This mostly affects the Kryo serialization library which is used with OLAP. If you use OLTP, then you may not need to add any of these options to the JVM. The following are only examples used by TinkerPop’s automated tests and are placed here for convenience.
--add-opens=java.base/java.io=ALL-UNNAMED
--add-opens=java.base/java.nio=ALL-UNNAMED
--add-opens=java.base/sun.nio.cs=ALL-UNNAMED
--add-opens=java.base/java.lang=ALL-UNNAMED
--add-opens=java.base/java.lang.invoke=ALL-UNNAMED
--add-opens=java.base/java.lang.reflect=ALL-UNNAMED
--add-opens=java.base/java.util=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent=ALL-UNNAMED
--add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED
--add-opens=java.base/java.net=ALL-UNNAMED
See: TINKERPOP-2703
Upgrading for Providers
Graph Driver Providers
Gremlin.NET: Nullable Reference Types
Enabling nullable reference types comes with some breaking changes in Gremlin.NET which can affect driver providers.
GraphBinary APIs changed to make better use of nullable reference types. Instead of one method WriteValueAsync and
one method ReadValueAsync, there are now methods WriteNullableValueAsync and ReadNullableValueAsync that allow
null values and methods WriteNonNullableValueAsync and ReadNonNullableValueAsync that do not allow null values.
Some set property accessors were removed from some pure data classes in the Structure and the Driver.Messages
namespaces to initialize these properties directly from the constructor, which ensures that they are really not null.
We also used this opportunity to convert some of these pure data classes into a record.
See: TINKERPOP-2348
Graph System Providers
Reworked Gremlin Socket Server
The SimpleSocketServer from gremlin-driver has been brought into a new module gremlin-tools/gremlin-socket-server
and it has been adapted to be usable by all drivers for testing. See more about creating gremlin socket server tests
here.
Mid-traversal E()
Traversals now support mid-traversal E()-steps.
Prior to this change you were limited to using the E()-step only at the start of a traversal, but now you can use this step in the middle. This improvement makes it easier for users to build certain types of queries. For example, get edges with the label knows and, if there are none, add a new one between josh and vadas.
g.inject(1).coalesce(E().hasLabel("knows"), addE("knows").from(V().has("name","josh")).to(V().has("name","vadas")))
Another reason is to make E() and V() steps equivalent in terms of use in the middle of traversal.
See TINKERPOP-2798
PBiPredicate interface
Custom predicates used in P should now implement the PBiPredicate interface.
This allows setting the name of the predicate that will be used for serialization by overriding getPredicateName.
In previous versions, toString was used for this.
In most cases it should be enough to replace BiPredicate with PBiPredicate in the predicate declaration.
See TINKERPOP-2949
TinkerPop 3.6.0
Tinkerheart
TinkerPop 3.6.8
Release Date: October 23, 2024
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Runtime Updates
gremlin-python has upgraded to Python 3.9 as Python 3.8 has passed end of life.
gremlin-go has upgraded to Go 1.22 as Go 1.21 has passed end of life.
TinkerPop 3.6.7
Release Date: April 8, 2024
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Runtime Version Upgrades
gremlin-javascript and gremlint have upgraded from Node 16 to Node 18 as Node 16 has passed end of life.
gremlin-go has upgraded from Go 1.20 to Go 1.21 as Go 1.20 has passed end of life.
Gremlin.Net: Fixed a bug in traversal enumeration on .NET 8
The behavior of IEnumerable was changed in .NET 8 when Current was accessed on it before starting the enumeration
via MoveNext().
The Gremlin.Net driver unfortunately did exactly that in some cases, which led to exceptions
(InvalidOperationException: Enumeration has not started. Call MoveNext.) on .NET 8 for some traversals.
Traversal enumeration has been changed for this version of Gremlin.Net to avoid this problem.
Platforms older than .NET 8 are not affected, as IEnumerable.Current returned null there, which is what the
Gremlin.Net driver expected.
See: TINKERPOP-3029
TinkerPop 3.6.6
Release Date: November 20, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Providers
The HttpGremlinRequestEncoder constructor has been deprecated in favor of one with an additional parameter boolean userAgentEnabled.
User agent HTTP headers can now be encoded if this flag is enabled.
TinkerPop 3.6.5
Release Date: July 31, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
HTTP Plain Text
A text/plain
MIME type has been added to the HTTP endpoint to return Gremlin Console formatted results in plain text.
This format can be helpful for a variety of reasons. Reading JSON formatted results can be difficult sometimes and
text/plain
is a simpler, more readable representation for those cases.
$ curl -H "Accept:text/plain" -X POST -d "{\"gremlin\":\"g.V()\"}" "http://localhost:8182"
==>v[1]
==>v[2]
==>v[3]
==>v[4]
==>v[5]
==>v[6]
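The same request can be sketched in Python using only the standard library. This is a minimal illustration that mirrors the curl call above; the helper name `build_gremlin_request` is hypothetical, and the URL assumes a Gremlin Server with its HTTP endpoint listening on localhost:8182.

```python
import json
import urllib.request


def build_gremlin_request(gremlin, url="http://localhost:8182", accept="text/plain"):
    """Build (but do not send) a POST request for the Gremlin Server HTTP endpoint.

    Hypothetical helper: it only mirrors the curl example, with the script in a
    JSON body under the "gremlin" key and the desired format in the Accept header.
    """
    body = json.dumps({"gremlin": gremlin}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Accept": accept},
        method="POST",
    )


request = build_gremlin_request("g.V()")

# Sending the request needs a running server, so it is left commented out:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

Passing a different `accept` value selects any other MIME type that the server has been configured to serve.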
See: TINKERPOP-2947
Untyped GraphSON
Prior to GraphSON 3.0, there were options to have GraphSON returned with and without types. The latter was helpful for getting a simpler JSON format that relied strictly on the JSON type system. This format was lossy in nature and not conducive to use with the language variants that were developing as the primary mechanism for working with Gremlin. However, simple HTTP never really went away, and forcing the type system on consumers makes that route of connecting harder than it should be.
With HTTP users in mind, this version brings back both versions of untyped GraphSON by adding the following MIME types:
- GraphSON 1.0 - application/vnd.gremlin-v1.0+json;types=false
- GraphSON 2.0 - application/vnd.gremlin-v2.0+json;types=false
Neither is configured by default in Gremlin Server, but can be added if desired by editing the Gremlin Server configuration file.
See: TINKERPOP-2965
TinkerPop 3.6.4
Release Date: May 12, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
TinkerPop 3.6.3
Release Date: May 1, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Deprecation Warning for Go 1.17
Release 3.6.3 will be one of the last versions of 3.6.x to use Go 1.17 for gremlin-go
as this runtime is no longer supported.
Upcoming releases of gremlin-go
will attempt to use the latest available version of Go.
Upgrading for Providers
Graph System Providers
Writing and Deleting Extension to Mutating Interface
Writing
and Deleting
are two new interfaces that extend Mutating
. These two interfaces can be used to distinguish
whether a Step
that implements Mutating
will cause a write or a delete. A mapping has also been added that maps a
Bytecode
instruction to possible steps. By using these two changes in conjunction, you can tell if a traversal will
change the underlying graph by calling BytecodeHelper.findPossibleTraversalSteps()
on each Bytecode
instruction of
the traversal. Be aware that there are a small number of special steps (e.g. io()
or call()
) that aren’t marked
with these interfaces even though they could potentially modify the graph as they can’t implement the current
Mutating
interface which brings in the Event
subsystem.
See: TINKERPOP-2929
TinkerPop 3.6.2
Release Date: January 16, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Changes to mergeV/E semantics
The mergeV()
and mergeE()
steps were added in version 3.6.0. Given some feedback on implementation and usage, some
additional changes were needed in order to improve the usability of these steps. These changes could not be made
without a breaking change to existing behavior introduced at 3.6.0. The main changes to consider are:
- onCreate Map will now inherit from main merge argument, and overrides of existence criteria (T.id/T.label and Direction.OUT/IN) will be prohibited.
- Direction.IN/OUT can be specified by additional options (Merge.inV/outV), which can take Map arguments, or a traversal which results in a Map or Vertex.
- mergeE() will no longer accept upstream Vertices as arguments for Direction.IN/OUT where not specified in the map arguments. Late binding of those arguments will come from Merge.inV/outV instead.
See: TINKERPOP-2850
Upgrading for Providers
Graph System Providers
Callbacks for GraphManager
The GraphManager
class now has several new methods that act as callbacks for various Gremlin Server operations
related to query processing. Overriding these methods in a GraphManager
implementation can help make it easier for
providers to get notification of a query starting and whether it ends in success or failure. The feature may even
be useful to Gremlin Server users who simply wish to develop more advanced logging capabilities and other custom
features without having to extend more complicated classes within the Gremlin Server structure.
See: TINKERPOP-2806
Gherkin Tests Moved to Resources
The Gherkin feature tests have been moved from gremlin-test/features
to actual resources on gremlin-test
. In this
way, these files can be more easily referenced from the classpath. Providers can now configure their CucumberOptions
in this fashion (taken from TinkerGraph):
@CucumberOptions(
tags = "not @RemoteOnly and not @GraphComputerOnly and not @AllowNullPropertyValues",
glue = { "org.apache.tinkerpop.gremlin.features" },
objectFactory = TinkerGraphFeatureTest.TinkerGraphGuiceFactory.class,
features = { "classpath:/org/apache/tinkerpop/gremlin/test/features" },
plugin = {"progress", "junit:target/cucumber.xml"})
See: TINKERPOP-2804
TinkerPop 3.6.1
Release Date: July 18, 2022
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
GraphBinary Default Serialization
Python and .NET have supported GraphBinary since at least 3.5.0, but kept GraphSON 3 as the default. It now seems safe to make GraphBinary the default in 3.6.x. With this change, all language variants now have GraphBinary as their default serialization format.
To continue using GraphSON, explicitly specify it as the serializer in the configuration.
See: TINKERPOP-2723
TinkerPop 3.6.0
Release Date: April 4, 2022
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
element() Step
The new element()
step provides a way to traverse from a Property
to the Element
that owns it:
gremlin> g = traversal().withEmbedded(TinkerFactory.createTheCrew())
==>graphtraversalsource[tinkergraph[vertices:6 edges:14], standard]
// VertexProperty -> Vertex
gremlin> g.V(1).properties().element().limit(1)
==>v[1]
// (Edge)Property -> Edge
gremlin> g.E(13).properties().element().limit(1)
==>e[13][1-develops->10]
// (Meta)Property -> VertexProperty
gremlin> g.V(1).properties().properties().element().limit(1)
==>vp[location->san diego]
mergeV() and mergeE()
One of the most commonly used patterns in Gremlin is the use of fold().coalesce(unfold(), …)
to perform upsert-like
functionality. While this pattern is quite flexible, it can be confusing to new users and, for certain use cases,
challenging to implement correctly. For providers, the pattern is difficult to optimize
because it can branch into complexity quickly, making it hard to identify the section of Gremlin that performs the upsert, and
therefore it may not be executed as efficiently as it otherwise could be.
The new mergeV()
and mergeE()
steps greatly simplify this pattern and as the pattern is condensed into a single
step it should be straightforward for providers to optimize as part of their implementations. The following example
demonstrates just how much easier implementing a basic upsert of a vertex has gotten:
// prior to 3.6.0, use fold().coalesce(unfold(), ...)
gremlin> g.V().
......1> has('person', 'name', 'vadas').has('age',27).
......2> fold().
......3> coalesce(unfold(),
......4> addV('person').property('name', 'vadas').property('age', 27)).
......5> elementMap()
==>[id:2,label:person,name:vadas,age:27]
// 3.6.0
gremlin> g.mergeV([(T.label): 'person', name:'vadas', age: 27]).
......1> elementMap()
==>[id:2,label:person,name:vadas,age:27]
In the more complex example below, if the vertex is found, it is updated with an "age" of 30; otherwise, it is created with an "age" of 27:
// prior to 3.6.0, use fold().coalesce(unfold(), ...)
gremlin> g.V().has('person','name','vadas').has('age', 27).
......1> fold().
......2> coalesce(unfold().property('age',30),
......3> addV('person').property('name','vadas').property('age',27)).
......4> elementMap()
==>[id:2,label:person,name:vadas,age:30]
// 3.6.0
gremlin> g.mergeV([(T.label): 'person', name:'vadas', age: 27]).
......1> option(onMatch, [age: 30]).
......2> elementMap()
==>[id:2,label:person,name:vadas,age:30]
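The match-or-create logic that mergeV() condenses into one step can be modeled in plain Python. This is only a toy sketch over a list of dictionaries, not TinkerPop code: the function `merge_v`, its parameters, and the id generation are all assumptions made for illustration, and it does not enforce any restrictions on overriding T.id or T.label.

```python
def merge_v(vertices, search, on_create=None, on_match=None):
    """Toy model of mergeV(): update vertices matching `search`, else create one."""
    matches = [
        v for v in vertices
        if all(v.get(key) == value for key, value in search.items())
    ]
    if matches:
        # Found: apply the onMatch properties to every matching vertex.
        for v in matches:
            v.update(on_match or {})
        return matches
    # Not found: create from the search map plus any onCreate properties.
    # The id assignment here is invented for the sketch.
    next_id = max((v["id"] for v in vertices), default=0) + 1
    created = {"id": next_id, **search, **(on_create or {})}
    vertices.append(created)
    return [created]


vertices = [
    {"id": 1, "label": "person", "name": "marko", "age": 29},
    {"id": 2, "label": "person", "name": "vadas", "age": 27},
]

# Existing vertex: matched, so "age" is updated to 30.
updated = merge_v(vertices, {"label": "person", "name": "vadas", "age": 27},
                  on_match={"age": 30})

# Unknown vertex: nothing matches, so it is created from the search map.
created = merge_v(vertices, {"label": "person", "name": "stephen"})
```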
The pattern was even more complicated for upserting edges, but the following example demonstrates how much easier
mergeE()
is to follow:
// prior to 3.6.0, use a form of coalesce()
gremlin> g.V().has('person','name','vadas').as('v').
......1> V().has('software','name','ripple').
......2> coalesce(__.inE('created').where(outV().as('v')),
......3> addE('created').from('v').property('weight',0.5)).
......4> elementMap()
==>[id:0,label:created,IN:[id:5,label:software],OUT:[id:2,label:person],weight:0.5]
// 3.6.0
gremlin> ripple = g.V().has('software','name','ripple').next()
==>v[5]
gremlin> g.V().has('person','name','vadas').
......1> mergeE([(T.label):'created',(to):ripple, weight: 0.5]).
......2> elementMap()
==>[id:0,label:created,IN:[id:5,label:software],OUT:[id:2,label:person],weight:0.5]
For those currently using the fold().coalesce(unfold(), …)
pattern, there is no need to be concerned with
incompatibility as a result of these new steps. That pattern is still perfectly usable and valid Gremlin, but whenever
possible it would be best to migrate away from it as graph providers ramp up on 3.6.0 support and introduce important
write optimizations that will make a big difference in performance.
Direction Aliases
Aliases have been added to Direction
to allow for OUT
to be referred to as from
and IN
to be referred to as
to
, which is a bit more friendly and matches more closely with existing Gremlin syntax.
Moved Pick
Pick
was formerly a nested class of TraversalOptionParent
, but has now been promoted to being a class on its own
in org.apache.tinkerpop.gremlin.process.traversal.Pick
.
Consistent by() Behavior
The by()
modulator is critical to the usage of Gremlin. When used in conjunction with a step that supports it, the
arguments to the by()
modulator shift the behavior of the internals of the step. The behavior that by()
introduces
has not always been consistent, with some overloads establishing null
traversers, others throwing exceptions that are
hard to digest, some filtering, etc.
In 3.6.0, the rules for the by()
modulator are made straightforward. If the by()
produces a result then it is
said to be "productive" and its value is propagated to the step for use. If the by()
does not produce a result then
the traverser to which it was to be applied is filtered.
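The rule can be illustrated with a small plain-Python model. This is a sketch of the filtering semantics only, not TinkerPop code: the sample data loosely mirrors the "modern" toy graph, and `by_key`, `aggregate`, and `group_count` are hypothetical helpers. Steps that keep the traverser but drop a key, such as project(), are not modeled here.

```python
# Sample elements loosely mirroring the "modern" toy graph.
MODERN = [
    {"id": 1, "name": "marko", "age": 29},
    {"id": 2, "name": "vadas", "age": 27},
    {"id": 3, "name": "lop", "lang": "java"},
    {"id": 4, "name": "josh", "age": 32},
    {"id": 5, "name": "ripple", "lang": "java"},
    {"id": 6, "name": "peter", "age": 35},
]

# Sentinel meaning "the by() modulator produced no result".
_UNPRODUCTIVE = object()


def by_key(key):
    """Model of by('key'): productive only when the element has that key."""
    return lambda element: element.get(key, _UNPRODUCTIVE)


def aggregate(elements, by):
    """Collect by() results, filtering traversers whose by() is unproductive."""
    results = []
    for element in elements:
        value = by(element)
        if value is _UNPRODUCTIVE:
            continue  # filtered, rather than propagating null
        results.append(value)
    return results


def group_count(elements, by):
    """Count by() results, again skipping unproductive traversers."""
    counts = {}
    for value in aggregate(elements, by):
        counts[value] = counts.get(value, 0) + 1
    return counts


ages = aggregate(MODERN, by_key("age"))
age_counts = group_count(MODERN, by_key("age"))
```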
The following sections demonstrate the behavior in 3.5.x alongside the new 3.6.0 behavior:
aggregate()
gremlin> g.V().aggregate('a').by('age').cap('a') // 3.5.x
==>[29,27,null,null,32,35]
gremlin> g.V().aggregate('a').by('age').cap('a') // 3.6.0
==>[29,27,32,35]
gremlin> g.V().aggregate('a').by(__.values('age')).cap('a') // 3.6.0
==>[29,27,32,35]
gremlin> g.V().aggregate('a').by(out()).cap('a') // 3.5.x
The provided traverser does not map to a value: v[2]->[VertexStep(OUT,vertex)]
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().aggregate('a').by(out()).cap('a') // 3.6.0
==>[v[3],v[3],v[5]]
gremlin> g.V().aggregate('a').by('age') // same for 3.5.x and future
==>v[1]
==>v[2]
==>v[3]
==>v[4]
==>v[5]
==>v[6]
cyclicPath()
gremlin> g.V().has('person','name','marko').both().both().cyclicPath().by('age') // 3.5.x
==>v[1]
java.lang.NullPointerException
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().has('person','name','marko').both().both().cyclicPath().by('age') // 3.6.0
==>v[1]
==>v[1]
dedup()
gremlin> g.V().both().dedup().by('age').elementMap() // 3.5.x
==>[id:3,label:software,name:lop,lang:java]
==>[id:2,label:person,name:vadas,age:27]
==>[id:4,label:person,name:josh,age:32]
==>[id:1,label:person,name:marko,age:29]
==>[id:6,label:person,name:peter,age:35]
gremlin> g.V().both().dedup().by('age').elementMap() // 3.6.0
==>[id:2,label:person,name:vadas,age:27]
==>[id:4,label:person,name:josh,age:32]
==>[id:1,label:person,name:marko,age:29]
==>[id:6,label:person,name:peter,age:35]
When using dedup()
over labels, all labels must be productive or the path will be filtered:
gremlin> g.V().as('a').both().as('b').both().as('c').dedup('a','b').by('age').select('a','b','c').by('name') // 3.5.x
The provided start does not map to a value: v[3]->value(age)
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().as('a').both().as('b').both().as('c').dedup('a','b').by('age').select('a','b','c').by('name') // 3.6.0
==>[a:marko,b:vadas,c:marko]
==>[a:marko,b:josh,c:ripple]
==>[a:vadas,b:marko,c:lop]
==>[a:josh,b:marko,c:lop]
group()
There are two by()
modulators that can be assigned to group()
. The first modulator is meant to identify the key to
group on and will filter values without that key out of the resulting Map
.
gremlin> g.V().group().by('age').by('name') // 3.5.x
==>[null:[lop,ripple],32:[josh],35:[peter],27:[vadas],29:[marko]]
gremlin> g.V().group().by('age').by('name') // 3.6.0
==>[32:[josh],35:[peter],27:[vadas],29:[marko]]
The second by()
is applied to the result as a reducing operation and will filter away entries in the List
value of
each key.
gremlin> g.V().group().by('name').by('age') // 3.5.x
==>[ripple:[null],peter:[35],vadas:[27],josh:[32],lop:[null],marko:[29]]
gremlin> g.V().group().by('name').by('age') // 3.6.0
==>[ripple:[],peter:[35],vadas:[27],josh:[32],lop:[],marko:[29]]
groupCount()
gremlin> g.V().groupCount().by('age') // 3.5.x
==>[null:2,32:1,35:1,27:1,29:1]
gremlin> g.V().groupCount().by('age') // 3.6.0
==>[32:1,35:1,27:1,29:1]
math()
The math()
step requires that the result of the by()
be a Number
, so a result of null
will still result in a
runtime exception. Filtering will eliminate such errors, though a runtime error may still be present should the
modulator produce a non-numeric value.
gremlin> g.V().math('_+1').by('age') // 3.5.x
==>30.0
==>28.0
The variable _ for math() step must resolve to a Number - it is instead of type null with value null
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().math('_+1').by('age') // 3.6.0
==>30.0
==>28.0
==>33.0
==>36.0
order()
gremlin> g.V().both().order().by('age').elementMap() // 3.5.x
==>[id:3,label:software,name:lop,lang:java]
==>[id:3,label:software,name:lop,lang:java]
==>[id:3,label:software,name:lop,lang:java]
==>[id:5,label:software,name:ripple,lang:java]
==>[id:2,label:person,name:vadas,age:27]
==>[id:1,label:person,name:marko,age:29]
==>[id:1,label:person,name:marko,age:29]
==>[id:1,label:person,name:marko,age:29]
==>[id:4,label:person,name:josh,age:32]
==>[id:4,label:person,name:josh,age:32]
==>[id:4,label:person,name:josh,age:32]
==>[id:6,label:person,name:peter,age:35]
gremlin> g.V().both().order().by('age').elementMap() // 3.6.0
==>[id:2,label:person,name:vadas,age:27]
==>[id:1,label:person,name:marko,age:29]
==>[id:1,label:person,name:marko,age:29]
==>[id:1,label:person,name:marko,age:29]
==>[id:4,label:person,name:josh,age:32]
==>[id:4,label:person,name:josh,age:32]
==>[id:4,label:person,name:josh,age:32]
==>[id:6,label:person,name:peter,age:35]
path()
All by()
modulators must be productive for the filter to be satisfied.
gremlin> g.V().both().path().by('age') // 3.5.x
==>[29,null]
==>[29,27]
==>[29,32]
==>[27,29]
==>[null,29]
==>[null,32]
==>[null,35]
==>[32,null]
==>[32,null]
==>[32,29]
==>[null,32]
==>[35,null]
gremlin> g.V().both().path().by('age') // 3.6.0
==>[29,27]
==>[29,32]
==>[27,29]
==>[32,29]
project()
The project()
step will produce an incomplete Map
by filtering away keys of unproductive by()
modulators.
gremlin> g.V().project('n','a').by('name').by('age') // 3.5.x
==>[n:marko,a:29]
==>[n:vadas,a:27]
==>[n:lop,a:null]
==>[n:josh,a:32]
==>[n:ripple,a:null]
==>[n:peter,a:35]
gremlin> g.V().project('n','a').by('name').by('age') // 3.6.0
==>[n:marko,a:29]
==>[n:vadas,a:27]
==>[n:lop]
==>[n:josh,a:32]
==>[n:ripple]
==>[n:peter,a:35]
propertyMap()
gremlin> g.V().propertyMap().by(is('x')) // 3.5.x
The provided start does not map to a value: [vp[name->marko]]->[IsStep(eq(x))]
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().propertyMap().by(is('x')) // 3.6.0
==>[name:[],age:[]]
==>[name:[],age:[]]
==>[name:[],lang:[]]
==>[name:[],age:[]]
==>[name:[],lang:[]]
==>[name:[],age:[]]
sack()
gremlin> g.V().sack(assign).by('age').sack() // 3.5.x
==>29
==>27
==>null
==>32
==>null
==>35
gremlin> g.V().sack(assign).by('age').sack() // 3.6.0
==>29
==>27
==>32
==>35
sample()
gremlin> g.V().both().sample(2).by('age') // 3.5.x
java.lang.NullPointerException
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().both().sample(2).by('age') // 3.6.0
==>v[1]
==>v[4]
select()
All by()
modulators must be productive for the filter to be satisfied.
gremlin> g.V().has('person','name','marko').as('a').both().as('b').select('a','b').by('age') // 3.5.x
==>[a:29,b:null]
==>[a:29,b:27]
==>[a:29,b:32]
gremlin> g.V().has('person','name','marko').as('a').both().as('b').select('a','b').by('age') // 3.6.0
==>[a:29,b:27]
==>[a:29,b:32]
simplePath()
gremlin> g.V().has('person','name','marko').both().both().simplePath().by('age') // 3.5.x
java.lang.NullPointerException
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().has('person','name','marko').both().both().simplePath().by('age') // 3.6.0
gremlin>
tree()
All by()
modulators must be productive for the filter to be satisfied.
gremlin> g.V().out().tree().by('age') // 3.5.x
==>[32:[null:[]],35:[null:[]],29:[null:[],32:[],27:[]]]
gremlin> g.V().out().tree().by('age') // 3.6.0
==>[32:[],35:[],29:[32:[],27:[]]]
valueMap()
gremlin> g.V().valueMap().by(is('x')) // 3.5.x
The provided start does not map to a value: [marko]->[IsStep(eq(x))]
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().valueMap().by(is('x')) // 3.6.0
==>[name:[],age:[]]
==>[name:[],age:[]]
==>[name:[],lang:[]]
==>[name:[],age:[]]
==>[name:[],lang:[]]
==>[name:[],age:[]]
where()
gremlin> g.V().as('a').both().both().as('b').where('a',eq('b')).by('age') // 3.5.x
==>v[1]
==>v[1]
==>v[1]
==>v[2]
==>v[3]
==>v[5]
==>v[3]
==>v[3]
==>v[4]
==>v[4]
==>v[4]
==>v[5]
==>v[3]
==>v[6]
gremlin> g.V().as('a').both().both().as('b').where('a',eq('b')).by('age') // 3.6.0
==>v[1]
==>v[1]
==>v[1]
==>v[2]
==>v[4]
==>v[4]
==>v[4]
==>v[6]
For the most part, this change removes runtime exceptions, and since most use cases are not likely to rely
on those for query execution, existing code should not be broken by this upgrade. However, users who relied on 3.5.x
behavior where by()
might propagate a null
will see a behavioral change. To temporarily restore the old
behavior, simply include g.withStrategies(ProductiveByStrategy)
in the traversal configuration, which will force the
null
to be produced. In the long term, it would be best not to rely on this strategy and instead to convert
Gremlin that requires it so that it behaves properly without it.
For example, if in 3.5.x there was a traversal like g.V().group().by('age')
and "age" is known to not always be a
valid key, the appropriate change would be to propagate null
explicitly as with:
g.V().group().by(coalesce(values('age'), constant(null)))
.
See: TINKERPOP-2635
TextP Regex
A number of graph databases have included support for regular expression text predicates and now TinkerPop includes
a regex()
option to TextP
:
gremlin> g.V().has('person', 'name', regex('peter')).values('name')
==>peter
gremlin> g.V().has('person', 'name', regex('r')).values('name')
==>marko
==>peter
gremlin> g.V().has('person', 'name', regex('r$')).values('name')
==>peter
gremlin> g.V().has('person', 'name', regex('a[rd]')).values('name')
==>marko
==>vadas
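The results above show that the pattern may match anywhere in the value rather than having to match the whole string. A rough Python analogue is `re.search`, sketched below. This is an illustration only: Python's `re` module and the JVM's regular expression engine differ in some syntax, so only simple patterns like these should be assumed to behave the same, and the helper `regex` is hypothetical.

```python
import re

PERSON_NAMES = ["marko", "vadas", "josh", "peter"]


def regex(pattern):
    """Return a predicate that is true when `pattern` is found anywhere in a value."""
    compiled = re.compile(pattern)
    return lambda value: compiled.search(value) is not None


def has_name(names, predicate):
    """Keep only the names that satisfy the predicate."""
    return [name for name in names if predicate(name)]


contains_r = has_name(PERSON_NAMES, regex("r"))
ends_with_r = has_name(PERSON_NAMES, regex("r$"))
a_then_r_or_d = has_name(PERSON_NAMES, regex("a[rd]"))
```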
See: TINKERPOP-2652
gremlin-annotations
There is a new module called gremlin-annotations
and it holds the annotations used to construct
Java-based Gremlin DSLs. These annotations
were formerly in gremlin-core
and therefore it will be necessary to modify dependencies accordingly when upgrading
to 3.6.0. Package and class names have remained the same and general usage is unchanged.
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>gremlin-annotations</artifactId>
<version>3.6.0</version>
</dependency>
It is worth noting that gremlin-groovy
utilized the DSL annotations to construct the
Credentials DSL so the gremlin-annotations
package is now explicitly associated to gremlin-groovy
but as an <optional>
dependency.
See: TINKERPOP-2411
fail() Step
The new fail()
step provides a way to immediately terminate a traversal with a runtime exception. In the Gremlin
Console, the exception will be rendered as follows which helps provide some context to the failure:
gremlin> g.V().fail("nope!")
fail() Step Triggered
=====================
Message > nope!
Traverser> v[1]
Bulk > 1
Traversal> V().fail()
Metadata > {}
=====================
Null for T
In 3.5.x, calling property()
with a key that is of type T
and a value that is null
or calling addV()
with a
null
label is processed as a valid traversal and default values are used. That approach allows opportunities for
unexpected problems should a variable being passed as a parameter end up accidentally shifting to null
without the
caller’s knowledge. Starting in 3.6.0, such traversals will generate an exception during construction of the traversal.
It is still possible to call addV()
with no arguments to assume a default label
and id
generation remains
implementation specific with some graphs accepting id
and others ignoring it to generate their own. Both values of
T
remain immutable.
See: TINKERPOP-2611
Logging Changes
In Gremlin Server and Gremlin Console distributions, the default logging implementation of log4j 1.2.x has been replaced by logback 1.2.x given CVE-2019-17571. While it was easy to replace log4j for users of the zip distributions, it was a little harder for users to change our packaged Docker containers which should work more cleanly out of the box.
See: TINKERPOP-2534
Short and Byte
Numeric operations around short
and byte
have not behaved quite like int
and long
. Here is an example of a
sum
operation with sack()
:
gremlin> g.withSack((short) 2).inject((short) 1, (int) 2).sack(sum).sack()
==>3
==>4
gremlin> g.withSack((short) 2).inject((short) 1, (int) 2).sack(sum).sack().collect{it.class}
==>class java.lang.Integer
==>class java.lang.Integer
gremlin> g.withSack((short) 2).inject((short) 1, (long) 2).sack(sum).sack().collect{it.class}
==>class java.lang.Integer
==>class java.lang.Long
gremlin> g.withSack((short) 2).inject((short) 1,(byte) 2).sack(sum).sack().collect{it.class}
==>class java.lang.Integer
==>class java.lang.Integer
Note that the type returned for the sum
should be the largest type encountered in the operation, thus a
long + int
would return long
or a byte + int
would return int
. The last example above shows inconsistency in
this rule when dealing with types short
and byte
which simply promote them to int
.
For 3.6.0, that inconsistency is resolved and may be a breaking change should code be relying on the integer promotion.
gremlin> g.withSack((short) 2).inject((short) 1,(byte) 2).sack(sum).sack().collect{it.class}
==>class java.lang.Short
==>class java.lang.Short
gremlin> g.withSack((byte) 2).inject((byte) 1,(byte) 2).sack(sum).sack().collect{it.class}
==>class java.lang.Byte
==>class java.lang.Byte
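The difference between the two behaviors can be summarized with a small model of the result type of a sum. This is a plain-Python sketch of the rule described in this section, not the TinkerPop implementation: the type names are plain strings, both functions are hypothetical, and overflow handling is deliberately ignored.

```python
# Relative width of the integral types discussed in this section.
WIDTH = {"byte": 0, "short": 1, "int": 2, "long": 3}


def sum_type_3_6(left, right):
    """3.6.0 rule: the result is the widest type encountered in the operation."""
    return left if WIDTH[left] >= WIDTH[right] else right


def sum_type_3_5(left, right):
    """Pre-3.6.0 behavior: as above, except byte and short are promoted to int."""
    widest = sum_type_3_6(left, right)
    return "int" if WIDTH[widest] < WIDTH["int"] else widest


# short + byte: Integer before 3.6.0, Short from 3.6.0 on.
before = sum_type_3_5("short", "byte")
after = sum_type_3_6("short", "byte")
```

Code that relied on the old promotion, for example by casting a result to Integer, is where the breaking change would surface.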
See: TINKERPOP-2610
Groovy in gremlin-driver
The gremlin-driver
module no longer depends on groovy
or groovy-json
. It became an <optional>
dependency in
3.5.0 and general deprecation of the serializers for the JsonBuilder
class from Groovy (which was the only reason the
dependency existed in the first place) occurred in 3.5.2. As they were made <optional>
it is likely that users who
required those packages have already adjusted their dependencies to explicitly include them. As for those who still
make use of JsonBuilder
serialization for some reason, the only recourse is to find the code in TinkerPop and
maintain it independently. The following classes were removed as of this change (links go to their 3.5.1 versions):
See: TINKERPOP-2593
Removed Gryo MessageSerializers
Gryo MessageSerializer
implementations were deprecated at 3.4.3 (GryoLite at 3.2.6) in favor of GraphBinary. As
GraphBinary has been the default implementation for some time now and given that Gryo is a JVM-only format which
reduces its usability within Gremlin Language Variants, it seemed like the right time to remove the Gryo
MessageSerializer
implementations from the code base. Gryo may still be used for file based applications.
See: TINKERPOP-2639
GroovyTranslator of gremlin-groovy
GroovyTranslator
has been removed from the gremlin-groovy
module. Please update any code using that class to
instead use org.apache.tinkerpop.gremlin.process.traversal.translator.GroovyTranslator
which is found in
gremlin-core
.
See: TINKERPOP-2657
gremlin-python Step Naming
When gremlin-python was first built, it followed the Gremlin step names perfectly and didn’t account well for Python
keywords that those steps conflicted with. As this conflict led to problems in usage, steps that matched keywords were
renamed to have an underscore suffix (e.g. sum()
to sum_()
) and the old step names were deprecated.
In 3.6.0, those original conflicting step names have simply been removed. Please change any of the following steps that may still be in use to instead prefer the underscore suffixed versions:
- filter
- id
- max
- min
- range
- sum
The full list of steps with the suffix naming can be found in the Reference Documentation.
In addition to removing the conflicting names, camel cased naming has been deprecated for all Gremlin steps and replaced with more Pythonic snake cased names. As this change was merely deprecation, this change is non-breaking at this time, but the camel cased steps will be removed in some future major version.
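The naming changes can be sketched with the Python standard library alone. The snippet below checks that the six removed names are also Python built-ins, and shows a hypothetical helper, `pythonic_name`, that maps a camel cased Gremlin step name to its snake cased form, adding the underscore suffix only for the six names listed above. It is an illustration of the convention, not part of gremlin-python, and it does not cover other suffixed steps documented in the Reference Documentation.

```python
import builtins
import re

# The step names removed in 3.6.0, taken from the list above.
CONFLICTING = ["filter", "id", "max", "min", "range", "sum"]

# Each of these names is also a Python built-in.
all_builtin = all(hasattr(builtins, name) for name in CONFLICTING)


def snake_case(step):
    """Convert a camel cased step name such as 'valueMap' to 'value_map'."""
    return re.sub(r"(?<=[a-z0-9])([A-Z])", lambda m: "_" + m.group(1).lower(), step)


def pythonic_name(step):
    """Hypothetical helper: snake case the name, suffixing the six listed names."""
    name = snake_case(step)
    return name + "_" if name in CONFLICTING else name


renamed = [pythonic_name(step) for step in ["sum", "valueMap", "hasLabel", "range"]]
```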
See: TINKERPOP-2650
property()
with Map
The property()
step has been extended to take a Map
of property key/value pairs as an argument with two new signatures:
property(Map)
property(Cardinality, Map)
When called, each individual key/value pair in the Map
is saved as a property to the element. When the Cardinality
is specified, that cardinality will be applied to all elements in the map as they are saved to the element.
If users need different cardinalities per property, then please use the existing pattern of stringing multiple
property()
calls together.
See: TINKERPOP-2665
Upgrading for Providers
Graph System Providers
Gherkin Tests
TinkerPop originally introduced Gherkin-based feature tests when GLVs were first introduced to help provide a language agnostic test capability. The Gherkin tests were a near one-to-one copy of the tests of the Gremlin Process Test Suite which focus on Gremlin semantics. Unfortunately, having both JVM tests and Gherkin tests meant maintaining two sets of tests which were testing identical things.
To simplify the ongoing maintenance of the test suite and to make it even easier to contribute to the enforcement of
Gremlin semantics, TinkerPop now provides infrastructure in the gremlin-test
module to run the Gherkin-based tests.
For 3.6.0, the old test suite remains intact and is not deprecated, but providers are encouraged to implement the
Gherkin tests as they will include newer tests that may not be in the old test suite and it would be good to gather
feedback on the new test suite’s usage so that when deprecation and removal of the old suite comes to pass the
transition will not carry as much friction.
Note that the 3.6.0 release includes a convenience zip distribution for gremlin-test
that packages both the data
files and Gherkin features files for a release. This new file can be found on the
Downloads page on the website.
Filters with Mixed Id Types
The requirement that "ids" passed to Graph.vertices
and Graph.edges
all be of a single type has been removed. This
requirement was a bit too prescriptive when there really wasn’t a need to enforce such a validation. It even conflicted
with TinkerGraph operations where mixed T.id
types is a feature. Graph providers may continue to support this
requirement if they wish, but it is no longer enforced by TinkerPop and the Graph.idArgsMustBeEitherIdOrElement
has
been removed so providers will need to construct their own exception.
See: TINKERPOP-2507
Comparability/Orderability Semantics
Prior to 3.6, comparability was not well defined and produced exceptions in a variety of cases. The 3.6 release
rationalizes the comparability semantics, defined in the Graph Provider Documentation. One feature of these semantics
is the introduction of Ternary Boolean Logic, where ERROR
cases are well defined, and errors are no longer
propagated back to the client as an exception. The ERROR
value is eventually reduced to false
, which results in
the solution being quietly filtered and allows the traversal to proceed normally. For example:
gremlin> g.inject(1, "foo").is(P.gt(0)).count() // 3.5.x
Cannot compare 'foo' (String) and '0' (Integer) as both need to be an instance of Number or Comparable (and of the same type)
Type ':help' or ':h' for help.
gremlin> g.inject(1, "foo").is(P.gt(0)).count() // 3.6.0
==>1
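The reduction of ERROR to false can be modeled in a few lines of plain Python. This is a simplified sketch rather than the TinkerPop implementation: the three-valued constants, the `gt` function, and its rule that only number-to-number or string-to-string comparisons are valid are assumptions made for the illustration.

```python
from numbers import Number

TRUE, FALSE, ERROR = "TRUE", "FALSE", "ERROR"


def gt(value, threshold):
    """Ternary greater-than: ERROR when the two values are not comparable."""
    both_numbers = isinstance(value, Number) and isinstance(threshold, Number)
    both_strings = isinstance(value, str) and isinstance(threshold, str)
    if not (both_numbers or both_strings):
        return ERROR
    return TRUE if value > threshold else FALSE


def is_(values, predicate):
    """Filter step: only TRUE passes, so ERROR is quietly reduced to false."""
    return [value for value in values if predicate(value) == TRUE]


# Model of g.inject(1, "foo").is(P.gt(0)).count()
passed = is_([1, "foo"], lambda value: gt(value, 0))
count = len(passed)
```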
Prior to 3.6, orderability (OrderGlobalStep) only applied to a single typespace and only to certain types. Attempts to order across types resulted in an exception. The 3.6 release introduces total orderability semantics, defined in the Graph Provider Documentation. Order now works on all types in the Gremlin language, including collections, structure elements (Vertex, Edge, VertexProperty, Property), paths, and all the allowed property value types. Additionally, ordering is possible across types, with the type priority defined in the orderability semantics section of the Provider Documentation.
gremlin> g = traversal().withEmbedded(TinkerFactory.createModern())
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
// Order across types
gremlin> g.V().values().order() // 3.5.x
java.lang.String cannot be cast to java.lang.Integer
Type ':help' or ':h' for help.
gremlin> g.V().values().order() // 3.6.0
==>27
==>29
==>32
==>35
==>java
==>java
==>josh
==>lop
==>marko
==>peter
==>ripple
==>vadas
// Order by Vertex
gremlin> g.V().order() // 3.5.x
org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerVertex cannot be cast to java.lang.Comparable
Type ':help' or ':h' for help.
Display stack trace? [yN]
gremlin> g.V().order() // 3.6.0
==>v[1]
==>v[2]
==>v[3]
==>v[4]
==>v[5]
==>v[6]
// Order by Map / Map.Entry
gremlin> g.V().valueMap().order() // 3.5.x
java.util.LinkedHashMap cannot be cast to java.lang.Comparable
Type ':help' or ':h' for help.
Display stack trace? [yN]
gremlin> g.V().valueMap().order() // 3.6.0
==>[name:[josh],age:[32]]
==>[name:[lop],lang:[java]]
==>[name:[marko],age:[29]]
==>[name:[peter],age:[35]]
==>[name:[ripple],lang:[java]]
==>[name:[vadas],age:[27]]
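Ordering across types can be pictured as sorting on a pair of type priority and value. The plain-Python sketch below models only the two types seen in the first example, with numbers ranked ahead of strings as that output shows. The full priority list is defined in the Provider Documentation and is not reproduced here, and the helper names are hypothetical.

```python
from numbers import Number


def type_rank(value):
    """Priority of a value's type; only numbers and strings are modeled here."""
    if isinstance(value, Number):
        return 0
    if isinstance(value, str):
        return 1
    raise TypeError("type not covered by this sketch: " + type(value).__name__)


def total_order(values):
    """Sort by type priority first, then by natural order within a type."""
    return sorted(values, key=lambda value: (type_rank(value), value))


# Property values of the "modern" graph, in no particular order.
values = ["marko", 29, "vadas", 27, "lop", "java", "josh", 32,
          "ripple", "java", "peter", 35]
ordered = total_order(values)
```

Because the rank is compared first, a number is never compared directly with a string, which is what avoids the cast exception seen in 3.5.x.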
Feature tags have been introduced for feature tests that stress these new semantics (see Committer Documentation). A new GraphFeature, "OrderabilitySemantics", has been added to signify compliance with the new comparability/orderability semantics.
See: Gremlin Semantics
Service Call API
3.6 introduces a call()
API that allows providers to provide custom service calls with their implementation. Providers
using the reference implementation for Traversal
execution will implement the ServiceFactory
and Service
interfaces for each named service they provide. Providers using their own query engines for traversal execution will need
to provide a call operation that can list the available services (directory service) and execute named services.
TinkerPop 3.5.0
The Sleeping Gremlin: No. 18 Entr’acte Symphonique
TinkerPop 3.5.8
Release Date: November 20, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
TinkerPop 3.5.7
Release Date: July 31, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Runtime Version Upgrades
gremlin-javascript and gremlint have upgraded from Node 10 to Node 16 as Node 10 has passed end of life.
gremlin-go has upgraded from Go 1.17 to Go 1.20 as Go 1.17 has passed end of life.
GraphSON max token lengths
Maximum number length (10000 chars), string length (20,000,000 chars), and nesting depth (1000) constraints have been introduced for GraphSON deserialization due to security vulnerabilities in earlier versions of Jackson Databind. The new constraints are not expected to impact most users, but they can be overridden via GraphSONMapper.Builder or through serializer configuration. Example:
serializers:
  - { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0,
      config: {
        maxNumberLength: 500,
        maxStringLength: 500,
        maxNestingDepth: 500
      }
    }
See: TINKERPOP-2948
Deprecation of Neo4j-Gremlin
Neo4j-Gremlin has been deprecated due to incompatibility with current versions of Neo4j.
See: reference docs and TINKERPOP-2977
TinkerPop 3.5.6
Release Date: May 1, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Deprecation Warning for Node 10 and Go 1.17
Release 3.5.6 will be the last version of 3.5.x to use Node 10 for gremlin-javascript and Go 1.17 for gremlin-go as
both of these runtimes are no longer supported. The following release (3.5.7) will likely contain Node 16 to align with
3.6.x and gremlin-go will use the latest available version of Go.
ARM64 Support for Gremlin Server Docker Image
The Gremlin Server Docker image now supports both AMD64 and ARM64. The multi-architecture image can now be found on Docker Hub.
See: TINKERPOP-2852
Performance Improvements
There are various performance improvements included in this release:
- Gremlin Console now uses ImportCustomizer to add imports, reducing the time spent resolving imports. This especially reduces the overhead for executing many simple lines of code. Processing for documentation (which runs thousands of simple lines) was reduced from 90 minutes to 25 minutes. Users should notice performance improvements when using the Gremlin Console.
- In some instances, a graph may contain elements whose Ids are sequentially increasing integers. When using the path() step with a CollectingBarrierStep in these graphs, one may observe a huge increase in processing time because of an inefficient hash algorithm. This hash algorithm has been updated and tests have shown a 40x improvement in these CollectingBarrierStep queries involving path().
- There is an important performance improvement to TraversalStrategy application which removes some unnecessary recursion when evaluating the Gremlin syntax tree and should improve traversal compilation times, particularly on traversals with many children. Tests have shown a 10x improvement in compilation time.
- FilterRankingStrategy saw a fairly specific performance improvement that should be particularly noticeable for traversals that have many children and some depth to their syntax tree. A particularly complex traversal that was used in testing this behavior improved its compilation time from 1 minute 48 seconds to just a few milliseconds.
Upgrading for Providers
Graph System Providers
TraversalStrategy Expectations
Given some important performance improvements in this version, providers may need to make alterations to their
TraversalStrategy implementations as certain assumptions about the data available to a strategy may have changed.
If a TraversalStrategy implementation requires access to the Graph implementation, side-effects, and similar data
stored on a Traversal, it is best to get that information from the root of the traversal hierarchy rather than from
the current traversal that the strategy is executing on, as the strategy application process no longer takes the
expensive step of propagating that data throughout the traversal in between each strategy application. Use the
TraversalHelper.getRootTraversal() helper function to get to the root traversal. Note also that Traversal.getGraph()
will now traverse through the parent traversals when trying to find a Graph.
See: TINKERPOP-2855, TINKERPOP-2888
Local steps should handle non-Iterables
The index step and the local steps count, dedup, max, mean, min, order, range, sample, sum and tail should
work correctly not only with Iterable input, but also with arrays and single values.
Examples of queries that should be supported:
g.inject(1).max(local)
g.inject(new Integer[] {1,2},3).max(Scope.local).toList()
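As a rough illustration of the requirement, the following Python sketch models how a local-scoped reducer might normalize its input before reducing it. It is not TinkerPop code: as_local_values and local_max are hypothetical names, and returning None for an empty collection is simply a choice made for this sketch.

```python
from collections.abc import Iterable


def as_local_values(obj):
    """Wrap single values so that local-scoped logic always sees a list."""
    # Strings are iterable in Python but count as single values here.
    if isinstance(obj, (str, bytes)):
        return [obj]
    # Lists, tuples, sets and other iterables are copied into a list.
    if isinstance(obj, Iterable):
        return list(obj)
    # Anything else is a single value and becomes a one-element list.
    return [obj]


def local_max(obj):
    """Model of max(local): reduce whatever a single traverser holds."""
    values = as_local_values(obj)
    # Assumption of this sketch: an empty collection yields None.
    return max(values) if values else None


if __name__ == "__main__":
    print(local_max(1))       # single value
    print(local_max([1, 2]))  # collection
    print(local_max((1, 2)))  # tuple standing in for an array
```

The point of the design is that the reducer never inspects the input type itself; all type handling lives in one normalization helper that every local step could share.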
TinkerPop 3.5.5
Release Date: January 16, 2023
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin.NET: Add logging
The Gremlin.NET driver can now be configured for logging to get more insights into its internal state, like when
a connection was closed and will therefore be replaced. It uses Microsoft.Extensions.Logging for this so all kinds
of different .NET logging implementations can be used.
The following example shows how to provide a LoggerFactory that is configured to log to the console to GremlinClient:
var loggerFactory = LoggerFactory.Create(builder =>
{
    builder.AddConsole();
});
var client = new GremlinClient(new GremlinServer("localhost", 8182), loggerFactory: loggerFactory);
See: TINKERPOP-2471
gremlin-driver Host Availability
The Java driver's approach to determining the health of the host to which it is connected is important to how it behaves. The driver has generally taken a pessimistic approach to making this determination, where a failure to borrow a connection from the pool would constitute enough evidence to mark the host as dead. Unfortunately, a failure to borrow from the pool is not the best indicator of a dead host. Intermittent network failure, an overly busy client, or other temporary issues could have been at play while the server was perfectly available.
The consequences for marking a host dead are fairly severe as the connection pool is shut down. While it is shut down
in an orderly fashion so as to allow existing requests to complete if at all possible, it immediately blocks new
requests, producing an immediate NoHostAvailableException (assuming there are actually no other hosts to route
requests to). There is then some delay to reconnect to the server and re-establish each of the connections in the pool,
which might take some time, especially if there was a large pool. If an application were sending hundreds of requests
a second, it would quickly equate to hundreds of exceptions a second, filling logs quickly and making it hard to
decipher the original root cause from that initial inability to simply borrow a connection.
For 3.5.5, the driver has dropped some of its pessimistic ways and has taken a more optimistic posture when it comes
to considering host availability. When the driver now has a problem borrowing a connection from the pool, it does not
assume the worst of a failed host. It instead checks if the other connections in the pool are still alive. If at least
one is alive it assumes the host is still available and the user simply gets a TimeoutException for waiting beyond
the maxWaitForConnection setting. In the event that there are no connections deemed alive, the driver will try to
issue an immediate reconnect to determine host health. If the reconnect succeeds then the host remains in an available
state for future requests and if it fails it is finally marked as dead and thus unavailable.
If a host is marked unavailable, the driver, like before, will try to route requests to other configured hosts that
are healthy. If there are no such other hosts and all are unhealthy, the driver will now make an immediate attempt to
route the request to a random unavailable host on the chance it will come back online and the connection pool thus
re-established. So, rather than a NoHostAvailableException flood, the user would instead see a TimeoutException
per request made.
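The decision the driver makes after a failed borrow, as described above, can be modelled roughly as follows. This Python sketch is not the Java driver's code: host_state_after_borrow_failure and its parameters are hypothetical names, and the pool is reduced to a single boolean for illustration.

```python
def host_state_after_borrow_failure(any_connection_alive, try_reconnect):
    """Return 'available' or 'dead' for a host after a failed connection borrow.

    any_connection_alive: True if at least one pooled connection is still alive.
    try_reconnect: zero-argument callable returning True if a reconnect succeeds.
    """
    # A live connection is taken as evidence that the host is still reachable,
    # so the caller only sees a timeout for this one request.
    if any_connection_alive:
        return "available"
    # No live connections: probe the host once with an immediate reconnect.
    if try_reconnect():
        return "available"
    # Only a failed reconnect marks the host as dead.
    return "dead"
```

Note that the reconnect probe is only attempted when no pooled connection is alive, which keeps the common case (a busy but healthy pool) free of extra network round trips.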
Since the driver was already throwing NoHostAvailableException and TimeoutException, there shouldn’t be too
many code changes required on upgrade to this version. The main difference to consider is that
NoHostAvailableException should appear less often, so any code that mitigated against a flood of these exceptions
might not be necessary anymore. One should also consider that if code was relying on the asynchronous
properties that Client.submitAsync() implies in its name, that code may now see more blocking behavior than
before. In doing this refactoring, it was noted that the connection borrowing behavior has never behaved in a truly
asynchronous manner and only the fast NoHostAvailableException aspect of that method behavior really contributed to
an asynchronous looking call for failed connection borrowing. With the fast NoHostAvailableException gone, it may be
more likely now to note blocking behavior when a connection cannot be obtained.
See: TINKERPOP-2813
Added User Agent to Gremlin drivers
Previously, a server could not distinguish among the different types of clients connecting to it. A user agent has now been added to the WebSocket handshake in all the drivers, each with its own configuration to enable or disable it. The user agent is enabled by default for all drivers.
- Java driver can be controlled by the enableUserAgentOnConnect configuration.
- .NET driver can be controlled by the EnableUserAgentOnConnect in ConnectionPoolSettings.
- Go driver can be controlled by the EnableUserAgentOnConnect setting.
- Python driver can be controlled by the enable_user_agent_on_connect setting.
- JavaScript driver can be controlled by the enableUserAgentOnConnect option.
See: TINKERPOP-2480
Update to SSL Handshake Timeout Configuration
Previously, the Java driver relied on the default 10 second SSL handshake timeout defined by Netty. That default
SSL handshake timeout has been removed. The SSL handshake timeout is instead capped by the connectionSetupTimeoutMillis setting.
See: TINKERPOP-2814
TinkerPop 3.5.4
Release Date: July 18, 2022
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
SparkIOUtil utility
A utility class SparkIOUtil has been introduced, which allows users to load a graph into a Spark RDD via the
loadVertices(org.apache.commons.configuration2.Configuration, JavaSparkContext) method.
Gremlin-Go
After introducing a number of release candidates over the past several months, the official release of Gremlin-Go is now available starting at 3.5.4. There are a few simple prerequisites required to get started:
- A Golang version of 1.20 or greater. Please see Go Downloads for more details on installing Golang.
- A basic understanding of Go Modules.
- A project set up using Go Modules where gremlin-go will be used as a dependency.
In the root directory of your project, containing the go.mod file, run the following command:
go get github.com/apache/tinkerpop/gremlin-go/v3@v3.5.4
Afterwards, we can quickly get started with writing Gremlin in Go:
package main
import (
	"fmt"
	gremlingo "github.com/apache/tinkerpop/gremlin-go/v3/driver"
)
func main() {
	// Creating the connection to the server
	driverRemoteConnection, err := gremlingo.NewDriverRemoteConnection("ws://localhost:8182/gremlin",
		func(settings *gremlingo.DriverRemoteConnectionSettings) {
			// Configure optional settings
			settings.LogVerbosity = gremlingo.Info
			settings.InitialConcurrentConnections = 2
		})
	// Handle error
	if err != nil {
		fmt.Println(err)
		return
	}
	// Cleanup
	defer driverRemoteConnection.Close()
	// Creating graph traversal
	g := gremlingo.Traversal_().WithRemote(driverRemoteConnection)
	// Perform traversal
	result, err := g.V().Count().ToList()
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(result[0].GetString())
}
Full documentation of the available API can be found in the Gremlin-Go reference documentation.
See: TINKERPOP-2692
GraphBinary support for gremlin-javascript
Gremlin JavaScript now also supports GraphBinary. GraphSON3 still remains the default serialization format.
To enable GraphBinary support, set the mime type in the connection options.
const options = { mimeType: 'application/vnd.graphbinary-v1.0' }
const g = traversal().withRemote(new DriverRemoteConnection('ws://localhost:8182/gremlin', options));
Deprecated connectOnStartup option in gremlin-javascript
The functionality of the connectOnStartup option for Gremlin Javascript has been deprecated and removed to resolve
potential unhandledRejection and race conditions. Setting connectOnStartup to true will only trigger a console warning.
New DriverRemoteConnection objects no longer initiate a connection by default at startup. Call open() explicitly if one
wishes to connect on startup. For example:
const drc = new DriverRemoteConnection(url);
drc.open().catch(err => {
// Handle error upon open.
})
See: TINKERPOP-2708
TinkerPop 3.5.3
Release Date: April 4, 2022
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
.NET WebSocket Compression Support
It should be noted, however, that compression might make an application susceptible to attacks like CRIME/BREACH.
Compression should therefore be turned off if the application sends sensitive data to the server as well as data that
could potentially be controlled by an untrusted user. Compression can be disabled via the disableCompression
parameter on the GremlinClient constructor.
See: TINKERPOP-2682
.NET: Translator to Groovy
A Translator can translate a Gremlin traversal from one programming language to another. Gremlin.NET now comes with
a GroovyTranslator which translates a Gremlin.NET traversal into a string that contains a Gremlin-Groovy traversal:
var g = ...;
var t = g.V().Has("person", "name", "marko").Where(In("knows")).Values<int>("age");
// Groovy
var translator = GroovyTranslator.Of("g");
Console.WriteLine(translator.Translate(t));
// OUTPUT: g.V().has('person', 'name', 'marko').where(__.in('knows')).values('age')
See: TINKERPOP-2367
TinkerPop 3.5.2
Release Date: January 10, 2022
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Tx() in .NET and Python
After Javascript, .NET and Python are now the second and third non-JVM variants of Gremlin to get support for remote transactions.
An example of the .NET Tx() syntax can be found in the .NET Transaction Section.
An example of the Python tx() syntax can be found in the Python Transaction Section.
See: TINKERPOP-2556 for .NET and TINKERPOP-2555 for Python
datetime()
Gremlin in native programming languages can all construct a native date and time object. In Java, that would probably
be a java.util.Date object while in Javascript it would likely be the Node.js Date. In any of these cases, these
native objects would be serialized to a millisecond-precision offset from the unix epoch to be sent over the wire to the
server (in embedded mode for Java, it would be up to the graph database to determine how the date is handled).
The gap is in Gremlin scripts, which do not have a way to natively construct dates and times other than by using Groovy
variants. As TinkerPop moves toward a more secure method of processing Gremlin scripts by way of the gremlin-language
model, it was clear that this gap needed to be filled. The new datetime() function can take an ISO-8601 formatted
datetime and internally produce a Date with a default time zone offset of UTC (+00:00).
This functionality, while part of the gremlin-language syntax, is also exposed as a component of gremlin-groovy so
that it can be used in the Gremlin Console and through the GremlinScriptEngine in Gremlin Server.
gremlin> datetime('2022-10-02').toGMTString()
==>2 Oct 2022 00:00:00 GMT
gremlin> datetime('2022-10-02T00:00:00Z').toGMTString()
==>2 Oct 2022 00:00:00 GMT
gremlin> datetime('2022-10-02T00:00:00-0400').toGMTString()
==>2 Oct 2022 04:00:00 GMT
The above examples use the Java Date method toGMTString() to properly format the date for demonstration purposes.
From a Gremlin language perspective there are no functions that can be called on the return value of datetime().
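To make the "default offset of UTC" rule concrete, here is a small Python model of the behavior shown in the console session above. It is only a sketch: parse_datetime is a hypothetical helper, it accepts just the three input shapes used in the examples, and it says nothing about how TinkerPop parses dates internally.

```python
from datetime import datetime, timezone

# Input shapes taken from the console examples: with offset, without offset, date only.
_FORMATS = ("%Y-%m-%dT%H:%M:%S%z", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d")


def parse_datetime(text):
    """Parse a subset of ISO-8601 and return a UTC-normalized datetime."""
    for fmt in _FORMATS:
        try:
            parsed = datetime.strptime(text, fmt)
        except ValueError:
            continue  # try the next shape
        # No offset in the input: assume UTC (+00:00), as the text describes.
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        # An explicit offset is converted to the equivalent UTC instant.
        return parsed.astimezone(timezone.utc)
    raise ValueError("unsupported datetime format: " + text)


if __name__ == "__main__":
    for value in ("2022-10-02", "2022-10-02T00:00:00Z", "2022-10-02T00:00:00-0400"):
        print(parse_datetime(value).isoformat())
```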
See: TINKERPOP-2596
Driver Defaults
Some default value settings were changed to be consistent across different languages.
- The default for PoolSize is now 8 instead of 4 in .NET.
- The default for MaxInProcessPerConnection is now 32 instead of 16 in .NET.
- The default for maxContentLength is now 10 MB in the Java builder instead of 65536.
- The default for pool_size is now 8 instead of 4 in Python.
- If a protocol_factory is not specified, it now adheres to the max_content_length specified instead of always using 65536 in Python.
See: TINKERPOP-2379
fold() Map
The fold() step has long been used for reducing the traversal stream to a List as in:
gremlin> g.V().values('age').fold()
==>[29,27,32,35]
gremlin> g.inject([a: 1],[b:2]).fold()
==>[[a:1],[b:2]]
This step has always had a lesser known second signature though which has two arguments of a "seed" and "foldFunction":
gremlin> g.inject([a: 1],[b:2]).fold([], addAll)
==>[[a:1],[b:2]]
There was a bit of incomplete (or perhaps incorrect) logic that did not allow the folding of Map. As of this
version, that issue has been resolved, which enables Map entry merges:
gremlin> g.inject([a: 1],[b:2]).fold([:], addAll)
==>[a:1,b:2]
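The difference between the two seeds can be modelled with a plain reduce. The Python sketch below mirrors only the console output shown above; add_all and fold are hypothetical stand-ins written for illustration and are not derived from the TinkerPop operator's source.

```python
from functools import reduce


def add_all(seed, item):
    """Hypothetical stand-in for the addAll fold function used above."""
    if isinstance(seed, dict):
        # Map seed: merge the entries of each incoming map.
        merged = dict(seed)
        merged.update(item)
        return merged
    # List seed: append each incoming object as one element.
    return list(seed) + [item]


def fold(stream, seed, fold_function):
    """Two-argument fold: reduce the stream into the seed."""
    return reduce(fold_function, stream, seed)


if __name__ == "__main__":
    stream = [{"a": 1}, {"b": 2}]
    print(fold(stream, [], add_all))  # list seed keeps the maps separate
    print(fold(stream, {}, add_all))  # map seed merges their entries
```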
See: TINKERPOP-2667
Refinements to null
Release 3.5.0 introduced the ability for traversers to contain a null value. Since that time it has
been noted that supplying a null value as an argument to certain Gremlin steps might cause exception behaviors that
were either not clear or perhaps even unexpected. The following changes were made to help make these behaviors more
consistent and to further solidify the semantics of null usage in the Gremlin language:
- hasKey(null), hasLabel(null) and has(T.label,null) - These steps used to throw general NullPointerException failures, but now filter results. As these structural elements of the graph can’t be null, it will effectively filter all elements.
- hasValue(null) - Formerly this step would throw a general NullPointerException, but now it will filter as expected.
- elementMap(), valueMap(), properties(), and values() - These steps used to throw general NullPointerException failures if a null was provided as an argument, but now null keys are ignored for purpose of the filter.
- withSideEffect() - This configuration step can now take a null for its value.
- sum(), mean(), max() and min() - These methods used to throw general NullPointerException failures, but now ignore null values when other numbers are present. If all of the values in the stream are null then hasNext() returns false.
See: TINKERPOP-2605, TINKERPOP-2620
ProductiveByStrategy
Gremlin steps that take a by() modulator have had varying behaviors for some time now and depending on the step and
context of the argument given to the by(), the traversal might return a result, throw an exception, produce a null
if no value is present, or filter results. While all of these behaviors cannot be reconciled in 3.5.x without breaking
changes, it is possible to start to deal with the exception behavior in a more consistent way.
When a traversal given to by() does not contain a result, the traversal is deemed "unproductive". As mentioned above,
unproductive arguments to by() lead to different results depending on the case:
gremlin> g.V().aggregate('x').by('age').cap('x')
==>[29,27,null,null,32,35]
gremlin> g.V().aggregate('x').by(values('age').is(gt(29))).cap('x')
The provided traverser does not map to a value: v[1]->[PropertiesStep([age],value), IsStep(gt(29))]
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().aggregate('x').by(out()).cap('x')
The provided traverser does not map to a value: v[2]->[VertexStep(OUT,vertex)]
Type ':help' or ':h' for help.
Display stack trace? [yN]
With ProductiveByStrategy now installed by default in 3.5.2, the exception behavior is changed to follow the null
behavior, forcing the unproductive traversal to return something:
gremlin> g.V().aggregate('x').by('age').cap('x')
==>[29,27,null,null,32,35]
gremlin> g.V().aggregate('x').by(values('age').is(gt(29))).cap('x')
==>[null,null,null,null,32,35]
gremlin> g.V().aggregate('x').by(out()).cap('x')
==>[v[3],v[3],null,null,null,v[5]]
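The change in behavior can be modelled outside of Gremlin as follows. This Python sketch is not the strategy's implementation: aggregate_by, age_over_29 and the productive flag are hypothetical names, and a by() traversal is represented as a function that returns a (possibly empty) list of results.

```python
def aggregate_by(items, by, productive=True):
    """Collect the first result of by(item) for each item.

    With productive=True an unproductive by() contributes None (the behavior
    described for 3.5.2); with productive=False it raises, as it did before.
    """
    collected = []
    for item in items:
        produced = by(item)  # an empty list means "unproductive"
        if produced:
            collected.append(produced[0])
        elif productive:
            collected.append(None)
        else:
            raise ValueError(
                "The provided traverser does not map to a value: %r" % (item,))
    return collected


# 'age' values of the six vertices, in the order shown in the console output above.
AGES = [29, 27, None, None, 32, 35]


def age_over_29(age):
    """Stand-in for values('age').is(gt(29))."""
    return [age] if age is not None and age > 29 else []


if __name__ == "__main__":
    print(aggregate_by(AGES, age_over_29))
```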
Note that the group() step has some special behaviors that made it virtually impossible for ProductiveByStrategy to
fully retain them, so ultimately the strategy ignores the modulators on that step except for the simple cases where the
key modulator is a basic ValueTraversal (e.g. by('name')). As a result, for group() the behavior should be
unchanged with exceptions retained:
gremlin> g.V().group().by(values('age'))
==>[null:[v[3],v[5]],32:[v[4]],35:[v[6]],27:[v[2]],29:[v[1]]]
gremlin> g.V().group().by(values('age').is(gt(29)))
The provided traverser does not map to a value: v[1]->[PropertiesStep([age],value), IsStep(gt(29))]
Type ':help' or ':h' for help.
Display stack trace? [yN]
See: TINKERPOP-2635
TinkerPop 3.5.1
Release Date: July 19, 2021
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
tx() in Javascript
Javascript is now the first non-JVM variant of Gremlin to get support for remote transactions. An example of the tx()
syntax can be found in the Javascript Transaction Section.
See: TINKERPOP-2557
Gremlint Library
The Gremlint website became an official part of the TinkerPop project for 3.5.0. In 3.5.1, the JavaScript library that contains the logic for the site has been made available as a code library that can be used independently and has been published to npm.
See: TINKERPOP-2551
SSL Exceptions
When connecting to an SSL enabled server using Java, errors with that connectivity would result in a RuntimeException
that wrapped an underlying NoHostAvailableException which did not surface information about the SSL problem. For
this sort of failure in 3.5.2, the RuntimeException has been replaced with the NoHostAvailableException which in
turn wraps a more specific SSLException. While this change represents a small break in exception handling, it seemed
a worthwhile change to make, so as to help make this error clearer when it arises.
See: TINKERPOP-2616
TinkerPop 3.5.0
Release Date: May 3, 2021
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Host Language Runtimes
TinkerPop implements Gremlin in a variety of different programming languages. For 3.5.0, there are a number of major upgrades to those programming language environments:
- Java - TinkerPop now builds and is compatible with Java 11.
- Python - Support for Python 2.x has been dropped completely. Users must use Python 3 going forward. For the most part, from a user’s perspective, there are no specific API changes to consider as a result of this change.
- Javascript - Upgraded to support Node version 10.
- Jython - Support for Jython has been removed and gremlin-python no longer produces a JVM-based artifact. This change means that the GremlinJythonScriptEngine no longer exists and there is no way to write native Python lambdas that can execute in Gremlin Server. All lambdas should be written using gremlin-groovy if they are needed.
- .NET - Gremlin.NET no longer targets .NET Standard 1.3, but only .NET Standard 2.0. Since .NET Core 2.0 and .NET Framework 4.6.1 already support this .NET Standard version, most users should not be impacted by this.
Gremlint
Gremlint, the JavaScript library powering gremlint.com, is now part of the TinkerPop project. It provides utilities for formatting Gremlin queries, taking parameters such as indentation, maximum line length and dot placement into account.
Until recently Gremlint has been an independent project maintained by Øyvind Sæbø, a developer at Ardoq. Ardoq has since donated Gremlint, as well as gremlint.com, to TinkerPop so that it can serve as TinkerPop’s canonical Gremlin code formatter.
See: TINKERPOP-2530
Translators
Translator implementations were moved from a mostly quiet and internal feature of TinkerPop to a documented and more
readily accessible form in 3.4.9. For 3.5.0, the functionality has been expanded significantly in a number of ways.
First, for Java, gremlin-core now has a JavascriptTranslator and a DotNetTranslator which completes the set of
Gremlin translation functions for the programming languages that TinkerPop supports. It is therefore now possible to
convert Gremlin bytecode to string representations that can compile in C#, Groovy, Javascript and Python.
The following example demonstrates what this functionality looks like in Javascript but it is quite similar for each
Translator implementation:
// gremlin-core module
import org.apache.tinkerpop.gremlin.process.traversal.translator.JavascriptTranslator;
GraphTraversalSource g = ...;
Traversal<Vertex,Integer> t = g.V().has("person","name","marko").
where(in("knows")).
values("age").
map(Lambda.function("it.get() + 1"));
Translator.ScriptTranslator javascriptTranslator = JavascriptTranslator.of("g");
System.out.println(javascriptTranslator.translate(t).getScript());
// OUTPUT: g.V().has("person","name","marko").where(__.in_("knows")).values("age").map(() => "it.get() + 1")
In addition, a native Python Translator implementation has been added that will generate a string of Gremlin that
will compile on the JVM.
from gremlin_python.process.translator import *
g = ...
t = (g.V().has('person','name','marko').
where(__.in_("knows")).
values("age"))
# Groovy
translator = Translator().of('g')
print(translator.translate(t.bytecode))
# OUTPUT: g.V().has('person','name','marko').where(__.in('knows')).values('age')
Versions and Dependencies
Apache Commons
There is a major breaking change in the use of Configuration objects. Prior to 3.5.0, Configuration objects were
from the Apache Commons commons-configuration library, but in this version, they are of commons-configuration2.
While this is a breaking change, the fix for most implementations will be quite simple and amounts to changing the
import statements from:
org.apache.commons.configuration.*
to
org.apache.commons.configuration2.*
It is also worth noting that default list handling in configurations is treated differently. TinkerPop disabled the
default list handling approach in Configuration 1.x, but if that functionality is still needed, it can be reclaimed
by setting the LegacyListDelimiterHandler. Details for taking this step and other relevant upgrade information
can be found in the 2.x Upgrade Documentation.
Neo4j
There were two key changes to the neo4j-gremlin module:
- The underlying Neo4j version moved from the 3.2.x line to the 3.4.x line. Please see the Neo4j Upgrade FAQ for more information as features and configuration options may have changed.
- Experimental support for multi/meta-properties in Neo4j, which was previously deprecated, has now been permanently removed.
Groovy Dependency in Java Driver
The gremlin-driver module made its dependency on Groovy optional as its only reason for inclusion was to support
JsonBuilder serialization and this feature is little known and perhaps even less used. Read more about this change
here in the Upgrade Documentation in the Serialization Section under the subsection title "GraphSON and JsonBuilder".
See: TINKERPOP-2185
Shaded Java Driver
The gremlin-driver
has an additional packaging which may make it easier to upgrade for some users who may have
extensive dependency chains.
<dependency>
<groupId>org.apache.tinkerpop</groupId>
<artifactId>gremlin-driver</artifactId>
<version>3.7.4-SNAPSHOT</version>
<classifier>shaded</classifier>
</dependency>
The above dependency with the shaded classifier shades all the non-optional dependencies of gremlin-driver and
includes gremlin-core and tinkergraph-gremlin in an unshaded form. The slf4j dependency was not included because
shading it can cause problems with its operations.
See: TINKERPOP-2476
Serialization
Java and Gryo
Since the first release of TinkerPop 3.x, Gryo has been the default serialization format for Gremlin Server and Java Driver. It was also used as the default serialization format for Gremlin Console remote connectivity to Gremlin Server. As of this release, Gryo has been replaced as the default by GraphBinary. All packaged configuration files and programmatic defaults have been modified as such.
It is still possible to utilize Gryo as a message serialization format by modifying Gremlin Server configuration files to include the appropriate Gryo configurations. If using Gryo, do not use earlier versions of the driver and server with 3.5.0. Use a 3.5.0 client to connect to a 3.5.0 server. Generally speaking, mixed version combinations will appear to work properly, but problems will likely occur during the general course of usage and it is therefore not advisable to take this approach.
For best compatibility between 3.4.x and 3.5.x, please use GraphBinary.
GraphSON and JsonBuilder
GraphSON serialization support for Groovy’s JsonBuilder has been present since the first version of GraphSON. That
approach to returning results has never materialized as a standardized way to use Gremlin as originally envisioned.
While support for this serialization form is still present, the dependency on Groovy in gremlin-driver has been
changed to "optional", which means that users who wish to continue to return JsonBuilder results for some
reason must explicitly include groovy and groovy-json dependencies in their applications. For Maven this would
mean adding the following dependencies:
<dependency>
<groupId>org.codehaus.groovy</groupId>
<artifactId>groovy</artifactId>
<version>${groovy.version}</version>
<classifier>indy</classifier>
</dependency>
<dependency>
<groupId>org.codehaus.groovy</groupId>
<artifactId>groovy-json</artifactId>
<version>${groovy.version}</version>
<classifier>indy</classifier>
<exclusions>
<!-- exclude non-indy type -->
<exclusion>
<groupId>org.codehaus.groovy</groupId>
<artifactId>groovy</artifactId>
</exclusion>
</exclusions>
</dependency>
The ${groovy.version} should match the version specified in TinkerPop’s root pom.xml.
.NET: GraphBinary
Gremlin.NET now also supports GraphBinary. GraphSON 3 however still remains the default serialization format as GraphBinary should be considered experimental for this version in .NET:
var client = new GremlinClient(new GremlinServer("localhost", 8182), new GraphBinaryMessageSerializer());
var g = Traversal().WithRemote(new DriverRemoteConnection(client));
.NET: New JSON Library
Gremlin.NET now uses System.Text.Json instead of Newtonsoft.Json as System.Text.Json is already included in .NET
Core 3.0 and higher which removes a dependency and offers better performance. Most users should not notice this change,
however users who have implemented their own GraphSON serializers or deserializers will need to modify them
accordingly. The same applies to users that let Gremlin.NET return data without deserializing it first as the returned
data types will change in this case, for example from Newtonsoft.Json’s JObject or JToken to JsonElement with
System.Text.Json.
Python dict Deserialization
In Gremlin it is common to return a dict or list as a key in another dict. The problem for Python is that
these values are not hashable and will result in an error. By introducing a HashableDict and Tuple for those keys
(respectively), it is now possible to return these types of results and not have to work around them:
>>> g.V().has('person', 'name', 'marko').elementMap("name").groupCount().next()
{{<T.id: 1>: 1, <T.label: 4>: 'person', 'name': 'marko'}: 1}
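The underlying idea can be shown with a few lines of plain Python. The sketch below is not gremlin-python's HashableDict; FrozenKeyDict and group_count are hypothetical names, and the hash shown assumes that all keys and values of the dictionary are themselves hashable.

```python
class FrozenKeyDict(dict):
    """Hypothetical dict subclass that can be used as a dictionary key."""

    def __hash__(self):
        # Order-independent hash over the entries; assumes hashable values.
        return hash(frozenset(self.items()))


def group_count(items):
    """Count occurrences of each item, using the items themselves as keys."""
    counts = {}
    for item in items:
        counts[item] = counts.get(item, 0) + 1
    return counts


if __name__ == "__main__":
    marko = FrozenKeyDict(name="marko", label="person")
    print(group_count([marko, FrozenKeyDict(name="marko", label="person")]))
```

As with any hashable wrapper around a mutable type, such an object should not be modified after it has been used as a key, since its hash would change.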
Transaction Improvements
The TinkerPop Transaction API and its related features have not changed much since TinkerPop 3.x was initially released. Transactions that extend beyond the scope of a single traversal (or request) have remained a feature for embedded use cases and script execution (where supported) even in the face of the rise of remote graph use cases. With the varying contexts that exist for how and when transactions can be used, it has led to a fair bit of confusion.
For 3.5.0, TinkerPop introduces a change in approach to transactions that has the goal of unifying the API and features for all use cases, while addressing the more current state of the graph ecosystem which has shifted heavily toward remote communication and more diverse programming language ecosystems beyond the JVM.
Note: The old transaction API remains intact in 3.5.0, so this version should be backward compatible with the old model for embedded transactions.
The new model for using a transaction looks like this:
g = traversal().withEmbedded(graph)
// or
g = traversal().withRemote(conn)
tx = g.tx() // create a Transaction object
gtx = tx.begin() // spawn a GraphTraversalSource from the Transaction
assert tx.isOpen() == true
gtx.addV('person').iterate()
gtx.addV('software').iterate()
tx.commit() // alternatively you could explicitly rollback()
assert tx.isOpen() == false
// it is still possible to use g, but gtx is "done" after it is closed so use
// tx.begin() to produce a new gtx instance for a fresh transaction
assert 2 == g.V().count().next()
The first important point to take away here is that the same transaction code will work for either embedded or remote use cases. The second important point is that for remote cases, transaction support with bytecode is a wholly new feature, which was implemented by providing support for bytecode in sessions. Until now, sessions could only support script-based requests, so bytecode in sessions is itself a new feature related to this one.
Important: g.tx() is only supported in Java at this time. Support for other languages will become available in future releases on the 3.5.x line.
Anonymous Child Traversals
The TinkerPop convention for child traversals is to spawn them anonymously from __, therefore:
g.addV('person').addE('self').to(__.V(1))
or more succinctly via static import as:
g.addV('person').addE('self').to(V(1))
Some users have chosen to instead write the above as:
g.addV('person').addE('self').to(g.V(1))
which spawns a child traversal from a GraphTraversalSource. When spawned this way, a traversal is bound to a "source" and therefore is not anonymous. While the above code worked, it is important that there be fewer ways to do things with Gremlin so as to avoid confusion in examples, documentation and mailing list answers.
As of 3.5.0, attempting to use a traversal spawned from a "source" will result in an exception. Users will need to modify their code if they use the unconventional syntax.
See: TINKERPOP-2361
Use of null
Gremlin has traditionally disallowed null
as a value in traversals and not always in consistent ways:
gremlin> g.inject(1, null, null, 2, null)
java.lang.NullPointerException
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V().has('person','name','marko').property('age', null)
The AddPropertyStep does not have a provided value: AddPropertyStep({key=[age]})
Type ':help' or ':h' for help.
Display stack trace? [yN]
gremlin> g.addV("person").property("name", 'stephen').property("age", null)
==>v[13]
gremlin> g.V().has('person','name','stephen').elementMap()
==>[id:13,label:person,name:stephen]
gremlin> g.V().constant(null)
gremlin>
Note how null
can produce exception behavior or act as a filter. For 3.5.0, TinkerPop has not only made null
usage
consistent, but has also made it an allowable value within a Traversal
:
gremlin> g.inject(1, null, null, 2, null)
==>1
==>null
==>null
==>2
==>null
gremlin> g.V().constant(null)
==>null
==>null
==>null
==>null
==>null
==>null
TinkerGraph can be configured to support null as a property value, but not all graphs support this feature (for example, Neo4j does not). Please be sure to check the new supportsNullPropertyValues() feature (or the documentation of the graph provider) to determine if the Graph implementation allows null as a property value.
With respect to null
in relation to properties, there was a bit of inconsistency in the handling of null
in calls
to property()
depending on the type of mutation being executed demonstrated as follows in earlier versions:
gremlin> g.V(1).property("x", 1).property("y", null).property("z", 2)
The AddPropertyStep does not have a provided value: AddPropertyStep({key=[y]})
Type ':help' or ':h' for help.
Display stack trace? [yN]N
gremlin> g.addV("test").property("x", 1).property("y", null).property("z", 2)
==>v[13]
gremlin> g.V(13).properties()
==>vp[x->1]
==>vp[z->2]
This behavior has been altered to become consistent. First, assuming null
is not supported as a property value, the
setting of a property to null
should have the behavior of removing the property in the same way in which you might
do g.V().properties().drop()
:
gremlin> g.V(1).property("x", 1).property("y", null).property("z", 2)
==>v[1]
gremlin> g.V(1).elementMap()
==>[id:1,label:person,name:marko,x:1,z:2,age:29]
gremlin> g.V().hasLabel('person').property('age',null).iterate()
gremlin> g.V().hasLabel('person').elementMap()
==>[id:1,label:person,name:marko]
==>[id:2,label:person,name:vadas]
==>[id:4,label:person,name:josh]
==>[id:6,label:person,name:peter]
Then, assuming null
is supported as a property value, it would simply store the null
for the key:
gremlin> g.addV("person").property("name", 'stephen').property("age", null)
==>v[13]
gremlin> g.V().has('person','name','stephen').elementMap()
==>[id:13,label:person,name:stephen,age:null]
gremlin> g.V().has('person','age',null)
==>v[13]
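The two branches described above (null removes the property when null values are unsupported, and is stored when they are supported) can be modelled in a few lines. This is only a sketch under stated assumptions: the `set_property` helper and the plain dicts standing in for vertices are hypothetical and are not part of any TinkerPop API.

```python
def set_property(properties, key, value, supports_null_property_values):
    """Apply property(key, value) to a plain dict of element properties.

    Illustrative model only; the flag mirrors the supportsNullPropertyValues()
    feature named in the text.
    """
    if value is None and not supports_null_property_values:
        # null cannot be stored, so it behaves like dropping the property
        properties.pop(key, None)
    else:
        # either a real value, or a null on a graph that can store nulls
        properties[key] = value
    return properties


# Graph that does not support null property values: 'age' is removed.
marko = {"name": "marko", "age": 29}
set_property(marko, "age", None, supports_null_property_values=False)

# Graph that supports null property values: 'age' is kept with a null value.
stephen = {"name": "stephen"}
set_property(stephen, "age", None, supports_null_property_values=True)
```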
The above described changes also have an effect on steps like group()
and groupCount()
which formerly produced
exceptions when keys could not be found:
gremlin> g.V().group().by('age')
The property does not exist as the key has no associated value for the provided element: v[3]:age
Type ':help' or ':h' for help.
Display stack trace? [yN]n
For situations where the key did not exist, the approach was to filter away vertices that did not have the available
key so that such steps would work properly or to write a more complex by()
modulator to better handle the possibility
of a missing key. With the latest changes however none of that is necessary unless desired:
gremlin> g.V().groupCount().by('age')
==>[null:2,32:1,35:1,27:1,29:1]
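The difference between the old workaround (filter out elements that lack the key) and the new behavior (a missing key groups under null) can be sketched with ordinary Python collections. The list of dicts below is a hypothetical stand-in for the vertices of the toy graph, not output from a real traversal.

```python
from collections import Counter

# Hypothetical stand-ins for six vertices; only the "person" ones have 'age'.
vertices = [
    {"name": "marko", "age": 29},
    {"name": "vadas", "age": 27},
    {"name": "lop"},
    {"name": "josh", "age": 32},
    {"name": "ripple"},
    {"name": "peter", "age": 35},
]

# 3.5.0-style semantics: a missing key contributes a null (None) group key.
counts = Counter(v.get("age") for v in vertices)

# Earlier workaround: filter away elements without the key before grouping.
filtered = Counter(v["age"] for v in vertices if "age" in v)
```

If the null group is not wanted, the filtering form remains available; it is simply no longer required to avoid an exception.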
In conclusion, this improved support of null may affect the behavior of existing traversals written in past versions of TinkerPop, as it is no longer possible to rely on null to filter traversers. Please review existing Gremlin carefully to ensure that there are no unintended consequences of this change and to identify any opportunities to improve existing logic by taking greater advantage of this expansion of null semantics.
See: TINKERPOP-2235, TINKERPOP-2099
ByModulatorOptimizationStrategy
The new ByModulatorOptimizationStrategy attempts to re-write by() modulator traversals to use their more optimized forms, which can provide a major performance improvement. As a simple example, a traversal like by(id()) would be replaced by by(id), thus replacing a step-based traversal with a token-based traversal.
See: TINKERPOP-1682
SeedStrategy
The new SeedStrategy allows the user to set a seed value for steps that make use of Random so that the traversal has the ability to return deterministic results. While this feature is useful for testing and debugging purposes, there are some practical applications as well.
gremlin> g.V().values('name').fold().order(local).by(shuffle)
==>[josh,marko,vadas,peter,ripple,lop]
gremlin> g.V().values('name').fold().order(local).by(shuffle)
==>[vadas,lop,marko,peter,josh,ripple]
gremlin> g.V().values('name').fold().order(local).by(shuffle)
==>[peter,ripple,josh,lop,marko,vadas]
gremlin> g.withStrategies(new SeedStrategy(22323)).V().values('name').fold().order(local).by(shuffle)
==>[lop,peter,josh,marko,vadas,ripple]
gremlin> g.withStrategies(new SeedStrategy(22323)).V().values('name').fold().order(local).by(shuffle)
==>[lop,peter,josh,marko,vadas,ripple]
gremlin> g.withStrategies(new SeedStrategy(22323)).V().values('name').fold().order(local).by(shuffle)
==>[lop,peter,josh,marko,vadas,ripple]
See: TINKERPOP-2014
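The general idea behind seeding can be sketched outside of Gremlin. The example below uses Python's own random module, so it is an analogy only: the `shuffled` helper is hypothetical, and the resulting order will not match the console output above, which comes from a different random number generator.

```python
import random

names = ["marko", "vadas", "lop", "josh", "ripple", "peter"]


def shuffled(values, seed=None):
    """Return a shuffled copy; a fixed seed makes the order repeatable."""
    rng = random.Random(seed)  # seed=None seeds from system entropy or time
    copy = list(values)
    rng.shuffle(copy)
    return copy


# The same seed yields the same order on every call.
first = shuffled(names, seed=22323)
second = shuffled(names, seed=22323)
```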
by(T) for Property
The Property
interface is not included in the hierarchy of Element
. This means that an edge property or a
meta-property are not considered elements the way that a VertexProperty
is. As a result, some usages of T
in
relation to properties do not work consistently. One such example is by(T)
, a token-based traversal, where the
following works for a VertexProperty
but will not for edge properties or meta-properties:
gremlin> g.V(1).properties().as('a').select('a').by(key)
==>name
==>age
For a Property
you would need to use key()
-step:
gremlin> g.E(11).properties().as('a').select(last,'a').by(key())
==>weight
Aside from the inconsistency, this issue also presents a situation where performance is impacted as token-based
traversals are inherently faster than step-based ones. In 3.5.0, this issue has been resolved in conjunction with the
introduction of ByModulatorOptimizationStrategy
which will optimize by(key())
and by(value())
to their
appropriate token versions automatically.
See: TINKERPOP-1682
match() Consistency
The match() step behavior might have seemed inconsistent to those first using it. While there are a number of examples that might demonstrate this issue, the easiest one to consume would be:
gremlin> g.V().match(__.as("a").out("knows").as("b"))
==>[a:v[1],b:v[2]]
==>[a:v[1],b:v[4]]
gremlin> g.V().match(__.as("a").out("knows").as("b")).unfold()
gremlin> g.V().match(__.as("a").out("knows").as("b")).identity()
==>[]
==>[]
The output is unexpected if there isn’t awareness of some underlying optimizations at play, where match()
as the
final step in the traversal implies that the user wants all of the labels as part of the output. With the addition
of the extra steps, unfold()
and identity()
in the above case, the implication is that the traversal must be
explicit in the labels to preserve from match, thus:
gremlin> g.V().match(__.as("a").out("knows").as("b")).select('a','b').unfold()
==>a=v[1]
==>b=v[2]
==>a=v[1]
==>b=v[4]
gremlin> g.V().match(__.as("a").out("knows").as("b")).select('a','b').identity()
==>[a:v[1],b:v[2]]
==>[a:v[1],b:v[4]]
Being explicit, as is preferred when writing Gremlin in good form, helps restrict the path history required to execute the traversal and therefore preserves memory. Of course, making match() a special form of end step is a confusing approach, as the behavior of the step changes simply because another step is in play. Furthermore, correct execution of the traversal relies on the execution of traversal strategies, when the same traversal should produce the same results irrespective of the strategies applied to it.
In 3.5.0, we look to better adhere to that guiding design principle and ensure a more consistent output for these types
of traversals. While the preferred method is to specify the labels to preserve from match()
with a following
select()
step as shown above, match()
will now consistently return all labels when they are not specified
explicitly.
gremlin> g.V().match(__.as("a").out("knows").as("b"))
==>[a:v[1],b:v[2]]
==>[a:v[1],b:v[4]]
gremlin> g.V().match(__.as("a").out("knows").as("b")).identity()
==>[a:v[1],b:v[2]]
==>[a:v[1],b:v[4]]
gremlin> g.V().match(__.as("a").out("knows").as("b")).unfold()
==>a=v[1]
==>b=v[2]
==>a=v[1]
==>b=v[4]
See: TINKERPOP-2481, TINKERPOP-2499
Gremlin Server
Remote SideEffects
Remote traversals no longer support the retrieval of remote side-effects. Users must therefore directly return
side-effects as part of their query if they need that data. Note that server settings for TraversalOpProcessor
, which
formerly held the cache for these side-effects, no longer have any effect and can be removed.
Audit Logging
The authentication.enableAuditlog
configuration property is deprecated and replaced by the enableAuditLog
property
to also make it available to Authorizer
implementations. With the new setting enabled, there are slight changes in the
formatting of audit log messages. In particular, the name of the authenticated user is included in every message.
Authorization
While Gremlin Server has long had authentication options to determine if a user can connect to the server, it now also
contains the ability to apply a level of authorization to better control what a particular authenticated user will
have access to. Authorization is controlled by the new Authorizer
interface, which can be implemented by users and
graph providers to provide this custom functionality.
UnifiedChannelizer
Gremlin Server uses a Channelizer
abstraction to configure different Netty pipelines which can then offer different
server behaviors. Most commonly, users configure the WebSocketChannelizer
to enable the websocket protocol to which
the various language drivers can connect.
TinkerPop 3.5.0 introduces a new Channelizer implementation called the UnifiedChannelizer. This channelizer is somewhat similar to the WsAndHttpChannelizer in that it combines websocket and standard HTTP protocols in the server, but it provides a new and improved thread management approach as well as a more streamlined execution model. The UnifiedChannelizer technically replaces all existing implementations, but is not yet configured by default in Gremlin Server. To use it, modify the channelizer setting in the server yaml file as follows:
channelizer: org.apache.tinkerpop.gremlin.server.channel.UnifiedChannelizer
As the UnifiedChannelizer is tested further, it will eventually become the default implementation. It may however be the preferred channelizer when using large numbers of short-lived sessions, as the threading model of the UnifiedChannelizer is better suited for such situations. If using this new channelizer, there are a few considerations to keep in mind:
- The UnifiedChannelizer does not use the OpProcessor infrastructure, therefore those configurations are no longer relevant and can be ignored.
- It is important to read about the gremlinPool setting in the Tuning Section of the reference documentation and to look into the new configurations available related to this channelizer: maxParameters, sessionLifeTimeout, useGlobalFunctionCacheForSessions, and useCommonEngineForSessions.
- Generally speaking, if current usage patterns involve mixing heavy loads of sessionless requests with arbitrary numbers of long-run sessions that have unpredictable end times, then the UnifiedChannelizer may not be the right choice for that situation. The long-run sessions will consume threads that would normally be available to sessionless requests and eventually slow their processing. On the other hand, if usage patterns involve mixing heavy loads of sessionless requests with short-lived sessions of similar execution time, or if usage patterns allow the ability to predict the size of the pool that will support the workload, then the UnifiedChannelizer will greatly improve performance over the old model.
Retry Conditions
Some error conditions are temporary in nature, and therefore an operation that ends in such a situation may be retried as-is with the potential for success. In embedded use cases, an exception that implements the TemporaryException interface implies that the failing operation can be retried. For remote use cases, a ResponseStatusCode of 596, which equates to SERVER_ERROR_TEMPORARY, is an indicator that a request may be retried.
With this more concrete and generalized approach to determining when retries should happen, the need to trap provider-specific exceptions or to examine the text of error messages is removed. Before replacing existing code that does these things, it may be best to add this sort of retry checking alongside current methods, as it may take time for providers to support these new options. Alternatively, if you can confirm that a provider does support this functionality, then feel free to proceed wholly with this generalized TinkerPop approach.
Finally, it is important to note that TinkerPop drivers do not automatically retry when these conditions are met. It is up to the application to determine if retry is desired and how best to do so.
See: TINKERPOP-2517
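Since drivers do not retry automatically, the application has to wrap its own requests. The following is a minimal sketch of such a wrapper, assuming only the 596 status code named above: the `ResponseError` exception, the `submit_with_retry` helper, and the `flaky` operation are hypothetical and do not correspond to any driver API.

```python
import time

SERVER_ERROR_TEMPORARY = 596  # status code named in the text above


class ResponseError(Exception):
    """Hypothetical error type carrying a response status code."""

    def __init__(self, status_code, message=""):
        super().__init__(message)
        self.status_code = status_code


def submit_with_retry(operation, max_attempts=3, backoff_seconds=0.0):
    """Call operation(), retrying only when the failure is marked temporary."""
    attempt = 0
    while True:
        attempt += 1
        try:
            return operation()
        except ResponseError as err:
            retryable = err.status_code == SERVER_ERROR_TEMPORARY
            if not retryable or attempt >= max_attempts:
                raise  # permanent failure, or retries exhausted
            time.sleep(backoff_seconds * attempt)  # simple linear backoff


# Hypothetical operation that fails twice with a temporary error, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ResponseError(SERVER_ERROR_TEMPORARY, "temporary failure")
    return "ok"


result = submit_with_retry(flaky)
```

Whether to retry at all, how many times, and with what delay remain application decisions; the values used here are arbitrary.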
Python Transport Layer
With the removal of Python 2.x support, the transport layer of gremlin-python has been rewritten to use a library that utilizes the asyncio event loop of Python 3. AIOHTTP utilizes Python 3’s event loop with a minimal HTTP abstraction and is now used for the transport layer. From a user’s perspective there is not much of a change, except that there are now new configuration options available through named parameters; see AIOHTTP ws_connect for more details. This change fixed a number of issues that were related to the IOLoop of the old Tornado transport layer, which has been completely removed from the library. An additional config which enables the driver to be used from within an event loop has been added and can be used by setting call_from_event_loop=True when connecting.
Python Kerberos Support
The Python Driver now supports Kerberos based authentication:
g = traversal().withRemote(DriverRemoteConnection(
'ws://localhost:8182/gremlin', 'g', kerberized_service='gremlin@hostname.your.org'))
Deprecation Removal
The following deprecated classes, methods or fields have been removed in this version:
- gremlin-core
  - org.apache.tinkerpop.gremlin.process.computer.bulkdumping.BulkDumperVertexProgram
  - org.apache.tinkerpop.gremlin.process.computer.bulkloading.BulkLoader
  - org.apache.tinkerpop.gremlin.process.computer.bulkloading.BulkLoaderVertexProgram
  - org.apache.tinkerpop.gremlin.process.computer.bulkloading.IncrementalBulkLoader
  - org.apache.tinkerpop.gremlin.process.computer.bulkloading.OneTimeBulkLoader
  - org.apache.tinkerpop.gremlin.process.computer.clustering.peerpressure.PeerPressureVertexProgram.Builder#traversal(*)
  - org.apache.tinkerpop.gremlin.process.computer.ranking.pagerank.PageRankVertexProgram.Builder#traversal(*)
  - org.apache.tinkerpop.gremlin.process.computer.ranking.pagerank.PageRankVertexProgram.Builder#vertexCount()
  - org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.PageRankVertexProgramStep.modulateBy(*)
  - org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.PageRankVertexProgramStep.modulateTimes()
  - org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.PeerPressureVertexProgramStep.modulateBy(*)
  - org.apache.tinkerpop.gremlin.process.computer.traversal.step.map.PeerPressureVertexProgramStep.modulateTimes()
  - org.apache.tinkerpop.gremlin.process.remote.traversal.AbstractRemoteTraversalSideEffects
  - org.apache.tinkerpop.gremlin.process.remote.traversal.EmbeddedRemoteTraversalSideEffects
  - org.apache.tinkerpop.gremlin.process.remote.traversal.RemoteTraversalSideEffects
  - org.apache.tinkerpop.gremlin.process.remote.traversal.RemoteTraversal#getSideEffects()
  - org.apache.tinkerpop.gremlin.process.traversal.Order.decr
  - org.apache.tinkerpop.gremlin.process.traversal.Order.incr
  - org.apache.tinkerpop.gremlin.process.traversal.TraversalSource#withRemote(*)
  - org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource#withRemote(*)
  - org.apache.tinkerpop.gremlin.process.traversal.step.map.PropertyMapStep(Traversal.Admin, boolean, PropertyType, String…)
  - org.apache.tinkerpop.gremlin.process.traversal.step.map.PropertyMapStep#isIncludeTokens()
  - org.apache.tinkerpop.gremlin.process.traversal.util.BytecodeUtil
  - org.apache.tinkerpop.gremlin.structure.util.star.StarGraph#builder()
  - org.apache.tinkerpop.gremlin.structure.util.star.StarGraph.Builder#create()
- gremlin-driver
  - org.apache.tinkerpop.gremlin.driver.Tokens#ARGS_SCRIPT_EVAL_TIMEOUT
  - org.apache.tinkerpop.gremlin.driver.Channelizer#createKeepAliveMessage()
  - org.apache.tinkerpop.gremlin.driver.Channelizer#supportsKeepAlive()
  - org.apache.tinkerpop.gremlin.driver.Cluster.Builder#keyCertChainFile(String)
  - org.apache.tinkerpop.gremlin.driver.Cluster.Builder#keyFile(String)
  - org.apache.tinkerpop.gremlin.driver.Cluster.Builder#keyPassword(String)
  - org.apache.tinkerpop.gremlin.driver.Cluster.Builder#maxWaitForSessionClose(Integer)
  - org.apache.tinkerpop.gremlin.driver.Cluster.Builder#trustCertificateChainFile(String)
  - org.apache.tinkerpop.gremlin.driver.handler.NioGremlinRequestEncoder
  - org.apache.tinkerpop.gremlin.driver.handler.NioGremlinResponseDecoder
  - org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversalSideEffects
  - org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteTraversal#getSideEffects()
  - org.apache.tinkerpop.gremlin.driver.simple.NioClient
- gremlin-python
  - org.apache.tinkerpop.gremlin.python.jsr223.*
- gremlin-server
  - org.apache.tinkerpop.gremlin.server.Settings.scriptEvaluationTimeout
  - org.apache.tinkerpop.gremlin.server.Settings.SslSettings.keyCertChainFile
  - org.apache.tinkerpop.gremlin.server.Settings.SslSettings.keyFile
  - org.apache.tinkerpop.gremlin.server.Settings.SslSettings.keyPassword
  - org.apache.tinkerpop.gremlin.server.Settings.SslSettings.trustCertificateChainFile
  - org.apache.tinkerpop.gremlin.server.ResponseHandlerContext
  - org.apache.tinkerpop.gremlin.server.channel.NioChannelizer
  - org.apache.tinkerpop.gremlin.server.handler.NioGremlinBinaryRequestDecoder
  - org.apache.tinkerpop.gremlin.server.handler.NioGremlinResponseFrameEncoder
  - org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor.evalOpInternal(ResponseHandlerContext, Supplier, BindingSupplier)
  - org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.generateMetaData(ChannelHandlerContext, RequestMessage, ResponseStatusCode, Iterator)
  - org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.handleIterator(ResponseHandlerContext, Iterator)
  - org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.makeFrame(ChannelHandlerContext, RequestMessage, MessageSerializer, boolean, List, ResponseStatusCode, Map)
  - org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.makeFrame(Context, RequestMessage, MessageSerializer, boolean, List, ResponseStatusCode, Map)
  - org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.makeFrame(ResponseHandlerContext, RequestMessage, MessageSerializer, boolean, List, ResponseStatusCode, Map)
  - org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor.makeFrame(ResponseHandlerContext, RequestMessage, MessageSerializer, boolean, List, ResponseStatusCode, Map, Map)
  - org.apache.tinkerpop.gremlin.server.op.traversal.TraversalOpProcessor.onSideEffectSuccess(Graph, Context)
  - org.apache.tinkerpop.gremlin.server.util.SideEffectIterator
- neo4j-gremlin
  - org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph#getTrait()
  - org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph#CONFIG_META_PROPERTIES
  - org.apache.tinkerpop.gremlin.neo4j.structure.Neo4jGraph#CONFIG_MULTI_PROPERTIES
  - org.apache.tinkerpop.gremlin.neo4j.structure.trait.MultiMetaNeo4jTrait
  - org.apache.tinkerpop.gremlin.neo4j.structure.trait.NoMultiNoMetaNeo4jTrait
  - org.apache.tinkerpop.gremlin.neo4j.structure.trait.Neo4jTrait
Certain elements of the API were not or could not be deprecated in prior versions and were simply renamed for this release:
- org.apache.tinkerpop.gremlin.driver.message.ResponseStatusCode#SERVER_ERROR_SCRIPT_EVALUATION became SERVER_ERROR_EVALUATION
Upgrading for Providers
Graph System Providers
Server Authorization
Gremlin Server now supports an extension model that enables authorization. Graph providers are not required to implement this functionality in any way, but it can be helpful for those graphs that wish to provide this functionality through Gremlin Server. Graph systems may still rely on their own native authorization functionality if they so choose.
ScalarMapStep
Previous versions of MapStep
had a single abstract method that needed to be implemented:
protected abstract E map(final Traverser.Admin<S> traverser);
This method made it easy to write new implementations because it hid certain processing logic and made it so that the implementer only had to reason about how to take the current object from the Traverser and transform it to a new value. As 3.5.0 changed semantics around how null is processed, this method became a bit of a hindrance to the more complex logic which those semantics entailed. Specifically, this method could not easily communicate to underlying processing what a null might mean: is the null the end of the traversal stream, or should the null be propagated down the stream as a value to be processed?
Interestingly, the method that enabled the handling of this more complex decision making already existed in
AbstractStep
:
protected Traverser.Admin<E> processNextStart()
It returns a whole Traverser
object and forces manual retrieval of the "next" Traverser
. At this level it becomes
possible to make choices on null
and return it if it should be propagated or dismiss it and return an
EmptyTraverser
. To better accommodate the MapStep
which provides the nice helper map(Traverser)
method as well
as the more flexible version that doesn’t need that infrastructure, ScalarMapStep
was added to extend MapStep
. The
map(Traverser)
was then moved to ScalarMapStep
and those steps that could rely on that helper method now extend
from it. All other steps of this sort still extend MapStep
and directly implement processNextStart()
.
Providers will get compile errors if they extended MapStep
. The easy solution will be to simply modify that code so
that their step instead extends ScalarMapStep
. As a secondary task, providers should then examine their step
implementation to ensure that null
semantics as presented in 3.5.0 apply properly. If they do not, then it is likely
that the step should simply implement MapStep
directly and former map(Traverser)
logic should be migrated to
processNextStart()
.
See: TINKERPOP-2235, TINKERPOP-2099
TraversalStrategy Application
The methodology for strategy application has been altered and the change is most easily described by example. Given a traversal with the structure:
a(b(),c(d()))
Strategies were formerly applied in the following order:
StrategyA on a
StrategyB on a
StrategyA on b
StrategyB on b
StrategyA on c
StrategyB on c
StrategyA on d
StrategyB on d
This approach has always prevented strategies from performing global operations across the traversal and all of its descendants effectively, as children will not have been processed by preceding strategies yet. As of this release, the approach has been altered to apply strategies as follows:
StrategyA on a
StrategyA on b
StrategyA on c
StrategyA on d
StrategyB on a
StrategyB on b
StrategyB on c
StrategyB on d
In this way, StrategyB can check if it is being applied to the root traversal and, if it is, it knows that StrategyA has been applied globally.
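The two orderings can be reproduced with a small model of a traversal tree. This is a sketch for illustration only: the `Traversal` class and the `walk`, `old_order`, and `new_order` helpers are hypothetical and do not mirror TinkerPop's internal classes.

```python
class Traversal:
    """Hypothetical stand-in for a traversal with child traversals."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def walk(traversal):
    """Yield a traversal and then all of its descendants, parent first."""
    yield traversal
    for child in traversal.children:
        yield from walk(child)


def old_order(root, strategies):
    # Former approach: all strategies run on one traversal before the next.
    return [(s, t.name) for t in walk(root) for s in strategies]


def new_order(root, strategies):
    # Revised approach: one strategy runs on every traversal before the next strategy.
    return [(s, t.name) for s in strategies for t in walk(root)]


# The structure a(b(),c(d())) from the text above.
root = Traversal("a", [Traversal("b"), Traversal("c", [Traversal("d")])])
strategies = ["StrategyA", "StrategyB"]

before = old_order(root, strategies)
after = new_order(root, strategies)
```

In `after`, every StrategyA entry precedes the first StrategyB entry, which is what allows a later strategy to assume earlier ones have already been applied to the whole tree.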
This revised methodology could represent a breaking change for TraversalStrategy
implementations if they somehow
relied on the old ordering of application. It may also present an opportunity to revise how a TraversalStrategy
is
written to gain some processing benefit to the new order. Please be sure to review any custom strategies carefully
when upgrading to this version.
As part of this change, there have been some adjustments to the Traversal and Traversal.Admin interfaces which have helped to clarify coding intent. There is now an isRoot() method which determines whether or not the traversal has a parent. Under revised semantics for 3.5.0, a root traversal’s parent must be an EmptyStep instance and should not be null. With this change, provider TraversalStrategy implementations should be reviewed to evaluate if isRoot() semantics cause any breaks in logic to existing code.
In addition, TraversalStrategies
now implements Iterable
and exposes an iterator()
method which may be preferred
over the old toList()
style construction for getting the list of configured strategies.
Null Semantics
Graph providers should take note of the changes to null semantics described in the "users" section of these upgrade notes. As null is now acceptable as a Traverser object, this change may affect custom steps. Further note that null now works more consistently with mutation steps, and graph providers may need to include additional logic to deal with those possible conditions. Please see the console sessions below, which use TinkerGraph to demonstrate the current behavioral expectations.
gremlin> g.getGraph().features().vertex().supportsNullPropertyValues()
==>false
gremlin> g.addV(null).property(id, null).property('name',null)
==>v[0]
gremlin> g.V().elementMap()
==>[id:0,label:vertex]
...
gremlin> g.getGraph().features().vertex().supportsNullPropertyValues()
==>true
gremlin> g.addV(null).property(id, null).property('name',null)
==>v[0]
gremlin> g.V().elementMap()
==>[id:0,label:vertex,name:null]
In the above example, addV() defaults to Vertex.DEFAULT_LABEL, the id is generated and, with the default configuration, setting the "name" property to null results in the value not being set. If the property value is set to an actual value and then set to null, TinkerGraph will remove the property key altogether:
gremlin> g.getGraph().features().vertex().supportsNullPropertyValues()
==>false
gremlin> g.addV().property('name','stephen')
==>v[0]
gremlin> g.V().elementMap()
==>[id:0,label:vertex,name:stephen]
gremlin> g.V().has('vertex','name','stephen').property('name',null)
==>v[0]
gremlin> g.V().elementMap()
==>[id:0,label:vertex]
...
gremlin> g.getGraph().features().vertex().supportsNullPropertyValues()
==>true
gremlin> g.addV().property('name','stephen')
==>v[2]
gremlin> g.V().has('vertex','name','stephen').property('name',null)
==>v[2]
gremlin> g.V().elementMap()
==>[id:2,label:vertex,name:null]
The above examples point out the default operations of TinkerGraph, but it can be configured to actually accept null as a property value, and it is up to graph providers to decide how they wish to treat a null property value. Providers should use the new supportsNullPropertyValues() feature to indicate to users how null is handled.
For edges, the label
still cannot be defaulted and must be specified, therefore:
gremlin> g.V(0L).as('a').addE(null).to('a')
Label can not be null
Type ':help' or ':h' for help.
Display stack trace? [yN]n
gremlin> g.V(0L).as('a').addE(constant(null)).to('a')
Label can not be null
Type ':help' or ':h' for help.
Display stack trace? [yN]
Also, edges have similar behavior to vertices when it comes to setting properties (again, the default configuration for TinkerGraph is being used here):
gremlin> g.getGraph().features().vertex().supportsNullPropertyValues()
==>false
gremlin> g.addV().property('name','stephen')
==>v[0]
gremlin> g.V().has('vertex','name','stephen').as('a').addE('knows').to('a').property(id,null).property('weight',null)
==>e[2][0-knows->0]
gremlin> g.E().elementMap()
==>[id:2,label:knows,IN:[id:0,label:vertex],OUT:[id:0,label:vertex]]
gremlin> g.E().property('weight',0.5)
==>e[2][0-knows->0]
gremlin> g.E().elementMap()
==>[id:2,label:knows,IN:[id:0,label:vertex],OUT:[id:0,label:vertex],weight:0.5]
gremlin> g.E().property('weight',null)
==>e[2][0-knows->0]
gremlin> g.E().elementMap()
==>[id:2,label:knows,IN:[id:0,label:vertex],OUT:[id:0,label:vertex]]
...
gremlin> g.getGraph().features().vertex().supportsNullPropertyValues()
==>true
gremlin> g.addV().property('name','stephen')
==>v[8]
gremlin> g.V().has('vertex','name','stephen').as('a').addE('knows').to('a').property(id,null).property('weight',null)
==>e[10][8-knows->8]
gremlin> g.E().elementMap()
==>[id:10,label:knows,IN:[id:8,label:vertex],OUT:[id:8,label:vertex],weight:null]
gremlin> g.E().property('weight',0.5)
==>e[10][8-knows->8]
gremlin> g.E().elementMap()
==>[id:10,label:knows,IN:[id:8,label:vertex],OUT:[id:8,label:vertex],weight:0.5]
gremlin> g.E().property('weight',null)
==>e[10][8-knows->8]
gremlin> g.E().elementMap()
==>[id:10,label:knows,IN:[id:8,label:vertex],OUT:[id:8,label:vertex],weight:null]
Graphs that support multi/meta-properties have some issues to consider as well, as demonstrated with TinkerGraph:
gremlin> g.getGraph().features().vertex().supportsNullPropertyValues()
==>false
gremlin> g.addV().property(list,'foo',"x").property(list,"foo", null).property(list,'foo','bar')
==>v[0]
gremlin> g.V().elementMap()
==>[id:0,label:vertex,foo:bar]
gremlin> g.V().valueMap()
==>[foo:[x,bar]]
gremlin> g.V().property('foo',null)
==>v[0]
gremlin> g.V().valueMap(true)
==>[id:0,label:vertex]
...
gremlin> g.addV().property(list,'foo','bar','x',1,'y',null)
==>v[0]
gremlin> g.V().properties('foo').valueMap(true)
==>[id:1,key:foo,value:bar,x:1]
gremlin> g.V().properties('foo').property('x',null)
==>vp[foo->bar]
gremlin> g.V().properties('foo').valueMap(true)
==>[id:1,key:foo,value:bar]
...
gremlin> g.getGraph().features().vertex().supportsNullPropertyValues()
==>true
gremlin> g.addV().property(list,'foo',"x").property(list,"foo", null).property(list,'foo','bar')
==>v[11]
gremlin> g.V().elementMap()
==>[id:11,label:vertex,foo:bar]
gremlin> g.V().valueMap()
==>[foo:[x,null,bar]]
...
gremlin> g.addV().property(list,'foo','bar','x',1,'y',null)
==>v[0]
gremlin> g.V().properties('foo').valueMap(true)
==>[id:1,key:foo,value:bar,x:1,y:null]
gremlin> g.V().properties('foo').property('x',null)
==>vp[foo->bar]
gremlin> g.V().properties('foo').valueMap(true)
==>[id:1,key:foo,value:bar,x:null,y:null]
See: TINKERPOP-2235, TINKERPOP-2099
AbstractOpProcessor API Change
The generateMetaData() method was removed as it was deprecated in a previous version. There already was a preferred method called generateResultMetaData() that took an extra Settings parameter. To fix compilation issues simply replace implementations of the generateMetaData() method with generateResultMetaData(). Gremlin Server has only been calling generateResultMetaData() since the deprecation, so this correction should be straightforward.
StoreStep and AggregateStep
Note that StoreStep has been renamed to AggregateLocalStep and AggregateStep has been renamed to AggregateGlobalStep. The renaming is important to consider if any custom TraversalStrategies have been written that rely on the old step names.
See: TINKERPOP-2254
Session Close
TinkerPop drivers no longer send the session "close" message to kill a session. The close of the connection itself is responsible for the close of the session. It is also expected that a session is bound to the client that created it. Closing the session by closing the connection acts as a force close where transactions are not explicitly rolled back by Gremlin Server. Such transactions are handled by the underlying graph system in whatever manner it provides.
See: TINKERPOP-2336
TemporaryException and SERVER_ERROR_TEMPORARY
The gremlin-core module now has a TemporaryException interface. This interface allows providers to throw an exception that will be considered by users to be generally retryable. In addition, the Gremlin Server protocol now also has a ResponseStatusCode.SERVER_ERROR_TEMPORARY status which indicates the same situation. Throwing an exception that implements TemporaryException will be recognized by Gremlin Server to return this error code. This notion of "temporary failure" is helpful to providers as it allows them to let users know that a failure is transient and related to the system state at the time of the request. Without this indicator, users are left to parse exception messages to determine when it is considered acceptable to retry an operation.
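From the client's point of view, a temporary failure is a signal that a retry is reasonable. The following is a minimal plain-Python sketch of that idea. `TemporaryServerError`, `submit_with_retry`, and `flaky_request` are hypothetical names used for illustration, not part of any TinkerPop driver. A real client would raise or detect such an error when the server responds with SERVER_ERROR_TEMPORARY.

```python
import time


class TemporaryServerError(Exception):
    """Hypothetical client-side error standing in for a temporary server failure."""


def submit_with_retry(submit, max_attempts=3, base_delay=0.0):
    """Call submit() and retry only when it signals a temporary failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return submit()
        except TemporaryServerError:
            if attempt == max_attempts:
                raise  # retries exhausted, surface the error to the caller
            # Exponential backoff between attempts (no wait when base_delay is 0).
            time.sleep(base_delay * 2 ** (attempt - 1))


# Usage: a fake request that fails twice with a temporary error, then succeeds.
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TemporaryServerError("transient failure")
    return "ok"


result = submit_with_retry(flaky_request)
print(result, calls["count"])
```

Any other exception type propagates immediately, which mirrors the intent of the status code: only failures explicitly marked as temporary are retried.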
See: TINKERPOP-2517
gremlin-language
The new gremlin-language module contains an ANTLR4 grammar for the Gremlin language along with the generated parser code. The grammar is still under development but covers most of the Gremlin language, with the idea that it will eventually drive the ongoing design of the language, as opposed to driving it from Java.
The grammar is currently tested against the Gremlin traversals in the entire Gherkin test suite, as well as a major portion of the Gremlin used for examples in the Reference Documentation. The grammar has the following limitations:
- It does not support lambdas or Groovy syntax
- The following steps are not yet fully supported:
  - withComputer()
  - io()
  - withoutStrategies()
  - program()
  - connectedComponent()
  - fill() terminator step
- Vertex and Edge instance definitions
See: TINKERPOP-2533
UnifiedChannelizer
The UnifiedChannelizer is a new Channelizer implementation. It exposes a new Session interface that allows this channelizer to be extended with custom functionality specific to a provider's environment. As of 3.5.0, this channelizer is not the default and only fits certain workload patterns. The interfaces should therefore be considered volatile and may change.
See: TINKERPOP-2245
Graph Driver Providers
TraversalOpProcessor Side-effects
TraversalOpProcessor no longer holds a cache of side-effects and, more generally, the entire side-effect protocol has been removed and is no longer supported in the server or drivers.
See: TINKERPOP-2269
Close Message
The functionality of the "close" message is no longer in place in Gremlin Server. Sending the message (from older drivers for example) will simply result in a no-op on the server and the expected return of the NO_CONTENT message.
From 3.5.0 forward, drivers need not send this message to close the session and can simply rely on the close of the connection to kill the session.
See: TINKERPOP-2336
TinkerPop 3.4.0
Avant-Gremlin Construction #3 for Theremin and Flowers
TinkerPop 3.4.13
Release Date: January 10, 2022
Please see the changelog for a complete list of all the modifications that are part of this release.
TinkerPop 3.4.12
Release Date: July 19, 2021
Please see the changelog for a complete list of all the modifications that are part of this release.
TinkerPop 3.4.11
Release Date: May 3, 2021
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
barrier() Control
Various TraversalStrategy implementations inject barrier() instances to improve traversal performance. When debugging or tuning traversal performance, it is sometimes necessary to prevent this automated barrier() injection. Formerly, that involved removing LazyBarrierStrategy, RepeatUnrollStrategy, and PathRetractionStrategy, as all three of those strategies had the potential to add a barrier(). While this approach would work, it may be hard to recall those three strategies easily and you may be removing optimizations that might have been useful in addition to the barrier().
In 3.4.11, it is only necessary to remove LazyBarrierStrategy. Doing so will disable barrier() additions in other strategies and, as this is the new model of execution, it can be expected to work as the approach in the future even if other strategies attempt to add barrier() steps for some reason. Note of course that this rule only applies to TinkerPop-enabled strategies. It is possible that provider-specific strategies could add barrier()-steps and thus still require direct removal.
gremlin> g.V().repeat(out()).times(2).in().explain()
==>Traversal Explanation
====================================================================================================================================================================================
Original Traversal [GraphStep(vertex,[]), RepeatStep([VertexStep(OUT,vertex), RepeatEndStep],until(loops(2)),emit(false)), VertexStep(IN,vertex)]
ConnectiveStrategy [D] [GraphStep(vertex,[]), RepeatStep([VertexStep(OUT,vertex), RepeatEndStep],until(loops(2)),emit(false)), VertexStep(IN,vertex)]
RepeatUnrollStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
IncidentToAdjacentStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
MatchPredicateStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
PathRetractionStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
FilterRankingStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
InlineFilterStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
AdjacentToIncidentStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
CountStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
EarlyLimitStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
LazyBarrierStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(IN,vertex)]
TinkerGraphCountStrategy [P] [GraphStep(vertex,[]), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(IN,vertex)]
TinkerGraphStepStrategy [P] [TinkerGraphStep(vertex,[]), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(IN,vertex)]
ProfileStrategy [F] [TinkerGraphStep(vertex,[]), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(IN,vertex)]
StandardVerificationStrategy [V] [TinkerGraphStep(vertex,[]), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(IN,vertex)]
Final Traversal [TinkerGraphStep(vertex,[]), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(OUT,vertex), NoOpBarrierStep(2500), VertexStep(IN,vertex)]
gremlin> g.withoutStrategies(LazyBarrierStrategy).V().repeat(out()).times(2).in().explain()
==>Traversal Explanation
=================================================================================================================================================================
Original Traversal [GraphStep(vertex,[]), RepeatStep([VertexStep(OUT,vertex), RepeatEndStep],until(loops(2)),emit(false)), VertexStep(IN,vertex)]
ConnectiveStrategy [D] [GraphStep(vertex,[]), RepeatStep([VertexStep(OUT,vertex), RepeatEndStep],until(loops(2)),emit(false)), VertexStep(IN,vertex)]
RepeatUnrollStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
CountStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
EarlyLimitStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
MatchPredicateStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
FilterRankingStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
IncidentToAdjacentStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
PathRetractionStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
InlineFilterStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
AdjacentToIncidentStrategy [O] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
TinkerGraphCountStrategy [P] [GraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
TinkerGraphStepStrategy [P] [TinkerGraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
ProfileStrategy [F] [TinkerGraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
StandardVerificationStrategy [V] [TinkerGraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
Final Traversal [TinkerGraphStep(vertex,[]), VertexStep(OUT,vertex), VertexStep(OUT,vertex), VertexStep(IN,vertex)]
See: TINKERPOP-1994
TinkerPop 3.4.10
Release Date: January 18, 2021
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Python Timeout Issue
In version 3.4.9, it became possible to set client-side read/write timeouts. Those timeouts were defaulted to thirty seconds, which could put the client at odds with other timeout settings and then cause unexpected failures. The client-side timeouts have now been defaulted to None, leaving it to users to set the timeout themselves if they so desire. In this way, 3.4.10 has the same timeout behavior as 3.4.8 without any additional workarounds.
The timeouts can be configured as shown in the below example:
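from gremlin_python.structure.graph import Graph
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.driver.tornado.transport import TornadoTransport

# endpoint is the Gremlin Server URL, for example 'ws://localhost:8182/gremlin'
graph = Graph()
connection = DriverRemoteConnection(endpoint, 'g',
    transport_factory=lambda: TornadoTransport(read_timeout=60, write_timeout=60))
g = graph.traversal().withRemote(connection)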
See: TINKERPOP-2505
TinkerPop 3.4.9
Release Date: December 7, 2020
Please see the changelog for a complete list of all the modifications that are part of this release.
Warning: Python users should prefer use of 3.4.10 to avoid an issue with a default timeout setting that requires direct configuration described in TINKERPOP-2505.
Upgrading for Users
Translator Implementations
One of the lesser-known features of Gremlin is the ScriptTranslator, or more specifically, the implementations of this interface, which convert a Traversal object (or Gremlin Bytecode) into a proper String representation that is syntactically correct for the implementation language.
gremlin> import org.apache.tinkerpop.gremlin.process.traversal.translator.*
==>org.apache.tinkerpop.gremlin.structure.*, org.apache.tinkerpop.gremlin.structure.util.*, ...
gremlin> translator = GroovyTranslator.of('g')
==>translator[g:gremlin-groovy]
gremlin> translator.translate(g.V().has("person","name","marko").has("age",gt(20)).where(outE("knows")))
==>g.V().has("person","name","marko").has("age",P.gt((int) 20)).where(__.outE("knows"))
gremlin> translator = PythonTranslator.of('g')
==>translator[g:gremlin-python]
gremlin> translator.translate(g.V().has("person","name","marko").has("age",gt(20)).where(outE("knows")))
==>g.V().has('person','name','marko').has('age',P.gt(20)).where(__.outE('knows'))
Some Gremlin users may already be aware of the implementations for Groovy and Python from previous versions. These classes have largely been used for testing purposes, but users have found helpful use cases for them and they have now been promoted to gremlin-core from their original locations. The old versions in gremlin-groovy and gremlin-python have been deprecated. There have been some improvements to the GroovyTranslator such that the Gremlin generated will not match character for character with the deprecated version. There may also be some potential for the newer version in 3.4.9 to generate scripts that will not work in earlier versions. It is therefore best to use 3.4.9 translators within environments where 3.4.9 is uniformly supported. If older versions are in place, it may be better to continue use of the deprecated versions.
See: TINKERPOP-2461
Bytecode Command Improvements
The :bytecode command in the Gremlin console includes two new options: reset and config. Both options provide ways to better control the GraphSONMapper used internally by the command. The reset option will replace the current GraphSONMapper with a new one with some basic defaults: GraphSON 3.0 with extension and TinkerIoRegistry if present. The config option provides a way to specify a custom GraphSONMapper or additional configurations to the default one previously described.
See: TINKERPOP-2479
withStrategies() Groovy Syntax
The withStrategies() configuration step accepts a variable number of TraversalStrategy instances. In Java, those instances are typically constructed with instance() if it is a singleton or by way of a builder pattern which provides a fluent, type safe method to create the object. For Groovy, which is highly applicable to those who use Gremlin scripts in their applications or work a lot within tools similar to the Gremlin Console, the builder syntax can work but doesn’t really match the nature of the Groovy language. Using a strategy in this script context would look something like:
g.withStrategies(ReadOnlyStrategy.instance(),
SubgraphStrategy.build().vertexProperties(hasNot('endTime')).create())
While this method will still work, it is now possible to use a more succinct syntax for Groovy scripts:
g.withStrategies(ReadOnlyStrategy,
new SubgraphStrategy(vertexProperties: __.hasNot('endTime')))
The rules are straightforward. If a strategy can be instantiated with instance() as a singleton then use just the class name as a shortcut. Interestingly, many users try this syntax when they first get started and obviously fail. With the syntax present, they will have one less error to contend with in their early days of Gremlin. For strategies that take configurations, these strategies will use named arguments in the constructor where the names match the expected builder methods. This named argument syntax is common to Groovy and not something special to Gremlin - it is just now exposed for this purpose.
See: TINKERPOP-2466
withEmbedded()
The AnonymousTraversalSource was introduced in 3.3.5 and is most typically used for constructing remote TraversalSource instances, but it also provides a way to construct a TraversalSource from an embedded Graph instance:
gremlin> g = traversal().withGraph(TinkerFactory.createModern())
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g = traversal().withRemote(DriverRemoteConnection.using('localhost',8182))
==>graphtraversalsource[emptygraph[empty], standard]
The withGraph(Graph) method is now deprecated in favor of the new withEmbedded(Graph) method that is more explicit about its intent:
gremlin> g = traversal().withEmbedded(TinkerFactory.createModern())
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
This change is mostly applicable to JVM languages where embedded Graph instances are available. For Gremlin Language Variants not on the JVM, the withGraph(Graph) method has simply been deprecated and not replaced (with the preference to use withRemote() variants).
See: TINKERPOP-2413
TraversalStrategy in Javascript
Using SubgraphStrategy, PartitionStrategy and other TraversalStrategy implementations is now possible in Javascript.
const sg = g.withStrategies(
new SubgraphStrategy({vertices:__.hasLabel("person"), edges:__.hasLabel("created")}));
See: TINKERPOP-2054
WebSocket Compression
Gremlin Server now supports standard WebSocket compression (per RFC 7692). Both the Java and Python drivers support this functionality from the client’s perspective. Compression is enabled by default and should be backward compatible, thus allowing older versions of the driver to connect to newer versions of the server and vice versa. Using the compression-enabled drivers with a server that also supports that functionality will greatly reduce network IO requirements.
See: TINKERPOP-2441, TINKERPOP-2453
Connection Management Improvements
The Java Driver was designed with the idea that a Cluster instance would be created once and then used for the life of the application. As a result, the cost of setup and termination of that instance was typically sunk into the general startup and shutdown of the application itself. In some environments, where applications were short-lived, this cost was quite apparent and undesirable given that it might take several seconds to initialize and then a similar amount of time for proper shutdown.
In 3.4.9, the initialization and shutdown of the Cluster object has been improved dramatically, which should be especially helpful to those aforementioned ephemeral environments. The following micro-benchmark results demonstrate the difference in performance between 3.4.8 and 3.4.9:
Benchmark | 3.4.8 | 3.4.9 |
---|---|---|
setup and close 100 connections | 2116 ms | 35 ms |
setup and close 32 connections | 2081 ms | 13 ms |
setup and close 1 connection | 2046 ms | 2 ms |
See: TINKERPOP-2445
Per Request Options
With Java it has been possible to pass per-request settings for both scripts and bytecode. While Javascript, Python, and .NET allowed this in various ways, it wasn’t quite as convenient as Java, nor was it well documented. In this release, the approach for making this sort of per-request configuration is now much more consistent across languages. This is most evident in bytecode-based requests:
Java
g.with(Tokens.ARGS_EVAL_TIMEOUT, 500L).V().out("knows");
C#
g.With(Tokens.ArgsEvalTimeout, 500).V().Out("knows").Count();
Javascript
g.with_('evaluationTimeout', 500).V().out('knows');
Python
g.with_('evaluationTimeout', 500).V().out('knows')
Please see the new "Per Request Settings" sections for each language in the Gremlin Drivers and Variants section for information on how to send scripts with specific request configurations.
GraphManager Extension
The org.apache.tinkerpop.gremlin.server.util.CheckedGraphManager can be used instead of org.apache.tinkerpop.gremlin.server.util.DefaultGraphManager in gremlin-server.yml to ensure that Gremlin Server fails to start if all graphs fail. This configuration option can be useful for a number of different situations (e.g. use CheckedGraphManager on a Kubernetes cluster to ensure that a pod will be restarted when it cannot properly handle requests). As a final note, DefaultGraphManager is no longer final and thus can be extended.
See: TINKERPOP-2436
Lambdas in gremlin-javascript
Lambda scripts can now be utilized in gremlin-javascript and follow roughly the same pattern as Python does:
g.V().has('person','name','marko').
values('name').map(() => "it.get()[1]")
See: TINKERPOP-2001
Upgrading for Providers
Graph System Providers
Custom TraverserSet
It is now possible to provide a custom TraverserSet to Step implementations that make use of those objects to introduce new logic for how they are populated and managed. Providers can take advantage of this capability by constructing their own Traversal implementation and overriding the getTraverserSetSupplier() method. When new TraverserSet instances are needed during traversal execution, steps will consult this method to get those instances.
See: TINKERPOP-2396
TinkerPop 3.4.8
Release Date: August 3, 2020
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin.NET: Automatic Reconnect
The Gremlin.NET driver now automatically tries to reconnect to a server when no open connection is available to submit a request. This should enable the driver to handle cases where the server is only temporarily unavailable or where the server has closed connections which some graph providers do when no requests were sent for some time.
See: TINKERPOP-2288
TinkerPop 3.4.7
Release Date: June 1, 2020
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Clear Screen Command
Gremlin Console now has the :cls command to clear the screen. This feature acts as an alternative to platform specific clear operations and provides a common way to perform that function.
TinkerPop 3.4.6
Release Date: February 20, 2020
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
drop() Properties
In 3.4.5 the equality of the Property object changed to allow language features like dedup() to work more consistently. An unnoticed side-effect of that change was that calling drop() on properties that had the same values would not properly remove all their instances. This problem affected edge and meta property instances but not the properties of vertices as their equality definitions had not changed.
This issue is now resolved, with the side-effect being that the inclusion of drop() will disable LazyBarrierStrategy, which prevents automatic bulking. In most common cases, the impact of that optimization loss should be minimal and could be added back manually with barrier() steps in the appropriate places.
See: TINKERPOP-2338
TinkerPop 3.4.5
Release Date: February 3, 2020
Please see the changelog for a complete list of all the modifications that are part of this release.
Warning: A bug was noted in 3.4.5 soon after release and was quickly patched. Users and providers should avoid version 3.4.5 and should instead prefer usage of 3.4.6.
Upgrading for Users
by(String) Modulator
It is quite common to use the by(String) modulator when doing some form of selection operation where the String is the key to the value of the current Traverser, demonstrated as follows:
gremlin> g.V().project('name').by('name')
==>[name:marko]
==>[name:vadas]
==>[name:lop]
==>[name:josh]
==>[name:ripple]
==>[name:peter]
gremlin> g.V().order().by('name').values('name')
==>josh
==>lop
==>marko
==>peter
==>ripple
==>vadas
Of course, this approach usually only works when the current Traverser is an Element. If it is not an element, the error is swift and severe:
gremlin> g.V().valueMap().project('x').by('name')
java.util.LinkedHashMap cannot be cast to org.apache.tinkerpop.gremlin.structure.Element
Type ':help' or ':h' for help.
Display stack trace? [yN]n
and while it is perhaps straightforward to see the problem in the above example, it is not always clear exactly where the mistake is. The above example is the typical misuse of by(String) and comes when one tries to treat a Map the same way as an Element (which is quite reasonable).
In 3.4.5, the issue of using by(String) on a Map and the error messaging have been resolved as follows:
gremlin> g.V().valueMap().project('x').by('name')
==>[x:[marko]]
==>[x:[vadas]]
==>[x:[lop]]
==>[x:[josh]]
==>[x:[ripple]]
==>[x:[peter]]
gremlin> g.V().elementMap().order().by('name')
==>[id:4,label:person,name:josh,age:32]
==>[id:3,label:software,name:lop,lang:java]
==>[id:1,label:person,name:marko,age:29]
==>[id:6,label:person,name:peter,age:35]
==>[id:5,label:software,name:ripple,lang:java]
==>[id:2,label:person,name:vadas,age:27]
gremlin> g.V().values('name').project('x').by('name')
The by("name") modulator can only be applied to a traverser that is an Element or a Map - it is being applied to [marko] a String class instead
Type ':help' or ':h' for help.
Display stack trace? [yN]n
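The key lookup that by(String) performs after this change can be modeled in a few lines. This is a plain-Python sketch under stated assumptions, not the TinkerPop implementation: the `Element` class and `by_key` function are hypothetical, a Python dict stands in for a Map, and the error text only loosely imitates the console message above.

```python
class Element:
    """Hypothetical stand-in for a graph element holding single-valued properties."""

    def __init__(self, **properties):
        self.properties = properties


def by_key(traverser, key):
    """Resolve a by(key) lookup against an Element or a Map, else fail clearly."""
    if isinstance(traverser, Element):
        return traverser.properties[key]   # property value of the element
    if isinstance(traverser, dict):
        return traverser[key]              # entry of the map under that key
    raise TypeError(
        f'The by("{key}") modulator can only be applied to a traverser that is '
        f"an Element or a Map - it is being applied to [{traverser}] "
        f"a {type(traverser).__name__} class instead"
    )


print(by_key(Element(name="marko"), "name"))    # element lookup
print(by_key({"name": ["marko"]}, "name"))      # map lookup, as with valueMap()
try:
    by_key("marko", "name")                     # neither Element nor Map
except TypeError as error:
    print(error)
```

Behavior for a key that is missing from the element or map is deliberately left out of this model.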
See: TINKERPOP-2314
hasKey() Step and hasValue() Step
Previously, hasKey()-step and hasValue()-step only applied to vertex properties. In order to support more generalized scenarios, the behavior of these steps was modified to allow them to be applied to both edge properties and meta-properties.
The original behavior is demonstrated as follows:
gremlin> graph = TinkerFactory.createTheCrew()
==>tinkergraph[vertices:6 edges:14]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:14], standard]
gremlin> g.E().properties().hasKey('since')
==>TinkerProperty cannot be cast to Element
gremlin> g.V().properties("location").properties().hasKey("startTime")
==>TinkerProperty cannot be cast to Element
gremlin> g.E().properties().hasValue(2010)
==>TinkerProperty cannot be cast to Element
gremlin> g.V().properties("location").properties().hasValue(2005)
==>TinkerProperty cannot be cast to Element
The new behavior of hasKey() with edge properties can be seen as:
gremlin> g.E().properties().hasKey('since')
==>p[since->2009]
==>p[since->2010]
==>p[since->2010]
==>p[since->2011]
==>p[since->2012]
Similar behavior for hasKey() can be seen with meta-properties:
gremlin> g.V().properties().hasKey('location').properties().hasKey('startTime')
==>p[startTime->1997]
==>p[startTime->2001]
==>p[startTime->2004]
==>p[startTime->2004]
==>p[startTime->2005]
==>p[startTime->2005]
==>p[startTime->1990]
==>p[startTime->2000]
==>p[startTime->2006]
==>p[startTime->2007]
==>p[startTime->2011]
==>p[startTime->2014]
==>p[startTime->1982]
==>p[startTime->2009]
The new behavior for hasValue() with edge properties is as follows:
gremlin> g.E().properties().hasValue(2010)
==>p[since->2010]
==>p[since->2010]
and similarly with hasValue() for meta-properties:
gremlin> g.V().properties().hasKey('location').properties().hasValue(2005)
==>p[endTime->2005]
==>p[endTime->2005]
==>p[startTime->2005]
==>p[startTime->2005]
Properties Equality
There was some inconsistency in terms of Property equality in earlier versions of Gremlin. Equality is now defined as follows: two properties are equal only if their key and value are equal, even if their parent elements are not equal. It makes sense when comparing properties regardless of parent elements to just focus on the property itself as it yields more uniform and concise results to reason about. The benefit of this change is that the behavior of property comparison and dedup()-step is predictable, and the result is not affected if the property is detached from the parent element.
Note: The "property" here refers to edge properties and meta-properties, thus excluding vertex properties.
The old behavior can be shown using "The Crew" toy graph as follows:
gremlin> g.E().properties().count()
==>13
gremlin> g.E().properties()
==>p[since->2009]
==>p[since->2010]
==>p[skill->4]
==>p[skill->5]
==>p[since->2010]
==>p[since->2011]
==>p[skill->5]
==>p[skill->4]
==>p[since->2012]
==>p[skill->3]
==>p[skill->3]
==>p[skill->5]
==>p[skill->3]
gremlin> g.E().properties().dedup().count()
==>13
gremlin> g.E().properties().dedup()
==>p[since->2009]
==>p[since->2010]
==>p[skill->4]
==>p[skill->5]
==>p[since->2010]
==>p[since->2011]
==>p[skill->5]
==>p[skill->4]
==>p[since->2012]
==>p[skill->3]
==>p[skill->3]
==>p[skill->5]
==>p[skill->3]
The new more consistent behavior is demonstrated below:
gremlin> g.E().properties().count()
==>13
gremlin> g.E().properties()
==>p[since->2009]
==>p[since->2010]
==>p[skill->4]
==>p[skill->5]
==>p[since->2010]
==>p[since->2011]
==>p[skill->5]
==>p[skill->4]
==>p[since->2012]
==>p[skill->3]
==>p[skill->3]
==>p[skill->5]
==>p[skill->3]
gremlin> g.E().properties().dedup().count()
==>7
gremlin> g.E().properties().dedup()
==>p[since->2009]
==>p[since->2010]
==>p[skill->4]
==>p[skill->5]
==>p[since->2011]
==>p[since->2012]
==>p[skill->3]
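The counts above follow directly from the equality rule. The sketch below models it in plain Python and is not TinkerPop code: `Prop` and `dedup` are hypothetical names, the key/value pairs are copied from the console output, and `parent_id` is a made-up placeholder for the owning edge that is deliberately excluded from comparison.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Prop:
    key: str
    value: int
    # Placeholder for the parent element; ignored by == and hash().
    parent_id: int = field(compare=False)


def dedup(items):
    """Keep the first occurrence of each distinct item, preserving order."""
    seen = set()
    unique = []
    for item in items:
        if item not in seen:
            seen.add(item)
            unique.append(item)
    return unique


# Key/value pairs as listed by g.E().properties() on "The Crew" graph.
pairs = [("since", 2009), ("since", 2010), ("skill", 4), ("skill", 5),
         ("since", 2010), ("since", 2011), ("skill", 5), ("skill", 4),
         ("since", 2012), ("skill", 3), ("skill", 3), ("skill", 5), ("skill", 3)]
edge_props = [Prop(k, v, parent_id=i) for i, (k, v) in enumerate(pairs)]

unique = dedup(edge_props)
print(len(edge_props), len(unique))
```

Because every `Prop` here has a different `parent_id`, an equality definition that included the parent would leave all 13 entries distinct, which corresponds to the old behavior.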
See: TINKERPOP-2318
Upgrading for Providers
Graph Driver Providers
GraphBinary API Change
In GraphBinary serialization, the Java GraphBinaryReader and GraphBinaryWriter, along with the TypeSerializer<T> interface, now take a Buffer instance instead of Netty’s ByteBuf. That way we avoid exposing Netty’s API in our own public API.
Using our own Buffer interface, wrapping Netty’s buffer API, allowed us to move the TypeSerializer<T> implementations, the reader and the writer to the org.apache.tinkerpop.gremlin.structure.io.binary package in gremlin-core.
Additionally, GraphBinaryReader and GraphBinaryWriter methods now throw Java’s IOException instead of our own SerializationException.
See: TINKERPOP-2305
TinkerPop 3.4.4
Release Date: October 14, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Python GraphBinary
There is now support for GraphBinary in Python. As with Java, it remains a working but experimental format that is still under evaluation. This new serializer can be used by first ensuring that it is available on the server and then configuring the connection as follows:
from gremlin_python.structure.graph import Graph
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
from gremlin_python.driver.serializer import GraphBinarySerializersV1

gremlin_server_url = "ws://172.17.0.2:45940/gremlin"
remote_conn = DriverRemoteConnection(gremlin_server_url, 'g',
                                     message_serializer=GraphBinarySerializersV1())
g = Graph().traversal().withRemote(remote_conn)
elementMap() Step
Since graph elements (i.e. Vertex, Edge, and VertexProperty) are returned from remote sources as references (i.e. without properties), one of the more common needs for Gremlin users is the ability to easily return a Map representation of the elements that they are querying. Typically, such transformations are handled by valueMap():
gremlin> g.V().has('person','name','marko').valueMap(true)
==>[id:1,label:person,name:[marko],age:[29]]
gremlin> g.V().has('person','name','marko').valueMap().by(unfold())
==>[name:marko,age:29]
or by way of project()
:
gremlin> g.V().has('person','name','marko').
......1> project('name','age','vid','vlabel').
......2> by('name').
......3> by('age').
......4> by(id).
......5> by(label)
==>[name:marko,age:29,vid:1,vlabel:person]
While valueMap() works reasonably well for Vertex and VertexProperty transformations, it does less well for Edge as it fails to include incident vertices:
gremlin> g.E(11).valueMap(true)
==>[id:11,label:created,weight:0.4]
This limitation forces a fairly verbose use of project()
for what is a reasonably common requirement:
gremlin> g.E(12).union(valueMap(true),
......1> project('inV','outV','inVLabel','outVLabel').
......2> by(inV().id()).
......3> by(outV().id()).
......4> by(inV().label()).
......5> by(outV().label())).unfold().
......6> group().
......7> by(keys).
......8> by(select(values))
==>[inV:3,id:12,inVLabel:software,weight:0.2,outVLabel:person,label:created,outV:6]
By introducing elementMap()
-step, there is now a single step that covers the most common transformation requirements
for all three graph elements:
gremlin> g.V().has('person','name','marko').elementMap()
==>[id:1,label:person,name:marko,age:29]
gremlin> g.V().has('person','name','marko').elementMap('name')
==>[id:1,label:person,name:marko]
gremlin> g.V().has('person','name','marko').properties('name').elementMap()
==>[id:0,key:name,value:marko]
gremlin> g.E(11).elementMap()
==>[id:11,label:created,IN:[id:3,label:software],OUT:[id:4,label:person],weight:0.4]
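The shape of the maps shown above can be sketched in plain Python. This is only a rough model and not TinkerPop code: the element_map function and the dictionary layout of the sample elements are assumptions made for illustration, and real results use Gremlin tokens rather than plain strings for keys such as id, label, IN and OUT.

```python
def element_map(element, *keys):
    """Flatten a vertex- or edge-like dict into one map (illustrative model only)."""
    result = {"id": element["id"], "label": element["label"]}
    # Edges additionally carry a reference (id and label) to each incident vertex.
    if "inV" in element:
        result["IN"] = {k: element["inV"][k] for k in ("id", "label")}
        result["OUT"] = {k: element["outV"][k] for k in ("id", "label")}
    # Property values are included as single values, optionally filtered by key.
    for key, value in element["properties"].items():
        if not keys or key in keys:
            result[key] = value
    return result


# Hypothetical in-memory stand-ins for elements of the "modern" toy graph.
marko = {"id": 1, "label": "person", "properties": {"name": "marko", "age": 29}}
josh = {"id": 4, "label": "person", "properties": {"name": "josh", "age": 32}}
lop = {"id": 3, "label": "software", "properties": {"name": "lop", "lang": "java"}}
created = {"id": 11, "label": "created", "inV": lop, "outV": josh,
           "properties": {"weight": 0.4}}

print(element_map(marko))
print(element_map(marko, "name"))
print(element_map(created))
```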
TinkerPop 3.4.3
Release Date: August 5, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Deprecated store()
The store()
-step and aggregate()
-step do the same thing in different ways, where the former is lazy and the latter
is eager in the side-effect collection of objects from the traversal. The different behaviors can be thought of as
differing applications of Scope
where global
is eager and local
is lazy. As a result, there is no need for both
steps when one will do.
As of 3.4.3, store(String)
is now deprecated in favor of aggregate(Scope, String)
where the Scope
should be set
to local
to ensure the same functionality as store()
. Note that aggregate('x')
is the same as
aggregate(global,'x')
.
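The difference between eager and lazy collection can be illustrated with Python generators. This is an analogy under stated assumptions rather than TinkerPop code: the aggregate function below is a hypothetical stand-in for the step, and the exact number of objects a real lazy traversal stores before a limit() takes effect may differ from this simplified model.

```python
from itertools import islice


def aggregate(traversers, side_effect, scope):
    """Collect traversers into side_effect, eagerly (global) or lazily (local)."""
    if scope == "global":
        # Eager: drain the whole upstream before passing anything downstream.
        items = list(traversers)
        side_effect.extend(items)
        yield from items
    else:
        # Lazy: record each traverser only as it is pulled through the step.
        for item in traversers:
            side_effect.append(item)
            yield item


vertices = ["v[1]", "v[2]", "v[3]", "v[4]", "v[5]", "v[6]"]
eager, lazy = [], []

# Roughly analogous to aggregate(global,'x').limit(1) and aggregate(local,'x').limit(1).
list(islice(aggregate(iter(vertices), eager, "global"), 1))
list(islice(aggregate(iter(vertices), lazy, "local"), 1))

print(len(eager), len(lazy))
```

In this model the eager variant has collected every vertex even though only one traverser was requested downstream, while the lazy variant has collected only what was actually pulled.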
See: TINKERPOP-1553
Deprecate Gryo in Gremlin Server
Gryo is now deprecated as a serialization format for Gremlin Server. However, it is still configured as a default option in the sample configuration files packaged with the server. The preferred serialization option should now be GraphBinary. Note that Gremlin Console is now configured to use GraphBinary by default.
See: TINKERPOP-2250
Upgrading for Providers
Graph Driver Providers
Gremlin Server Test Configuration
Gremlin Server has a test configuration built into its Maven build process which all integration tests and Gremlin Language Variants use to validate their operations. While this approach has worked really well for test automation within Maven, there are often times where it would be helpful to simply have Gremlin Server running with that configuration. This need is especially true when developing Gremlin Language Variants which is something that is done outside of the JVM.
This release introduces a Docker script that will start Gremlin Server with this test configuration. It can be started with:
docker/gremlin-server.sh
Once started, it is then possible to run GLV tests directly from a debugger against this instance which should hopefully reduce development friction.
TinkerPop 3.4.2
Release Date: May 28, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Per Request Options
In 3.4.0, the notion of RequestOptions was added so that users could have an easier way to configure settings on individual requests made through the Java driver. While that change provided a way to set those configurations for script-based requests, it didn’t include options to make those settings in a Traversal submitted via Bytecode. In this release, those settings become available via the with() step on the TraversalSource as follows:
GraphTraversalSource g = traversal().withRemote(conf);
List<Vertex> vertices = g.with(Tokens.ARGS_SCRIPT_EVAL_TIMEOUT, 500L).V().out("knows").toList();
See: TINKERPOP-2211
Gremlin Console Timeout
The Gremlin Console timeout that is set by :remote config timeout x was client-side only in prior versions, which meant that if the console timeout was less than the server timeout, the client would time out but the server might still be processing the request. Similarly, a longer timeout on the console would not change the server and the timeout would occur sooner than expected. These discrepancies often led to confusion.
As of 3.4.0, the Java Driver API allowed for timeout settings to be more easily passed per request, so the console was modified for this current version to pass the console timeout for each remote submission thus yielding more consistent and intuitive behavior.
See: TINKERPOP-2203
Upgrading for Providers
Graph System Providers
Warnings
It is now possible to pass warnings over the Gremlin Server protocol using a warnings
status attribute. The warnings
are expected to be a string value or a List
of string values which can be consumed by the user or tools that check
for that status attribute. Note that Gremlin Console is one such tool that will respond to this status attribute - it
will print the messages to the console as they are detected when doing remote script submissions.
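Because the attribute may hold either a single string or a list of strings, a tool that consumes it will probably want to normalize both forms. A minimal sketch, assuming the status attributes have already been decoded into a Python dictionary; read_warnings is a hypothetical helper and not part of any TinkerPop driver:

```python
def read_warnings(status_attributes):
    """Return the 'warnings' status attribute as a list of strings."""
    warnings = status_attributes.get("warnings")
    if warnings is None:
        return []                       # attribute absent: nothing to report
    if isinstance(warnings, str):
        return [warnings]               # single string form
    return [str(w) for w in warnings]   # list form


for message in read_warnings({"warnings": ["index not used", "result truncated"]}):
    print("WARNING:", message)
```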
See: TINKERPOP-2216
Graph Driver Providers
GraphBinary API Change
In GraphBinary serialization, the Java write() and writeValue() methods of the TypeSerializer<T> interface now take a Netty ByteBuf instance instead of a ByteBufAllocator, so that the same buffer instance gets reused during the write of a message. Additionally, we took the opportunity to remove the unused parameter from ResponseMessageSerializer.
See: TINKERPOP-2161
TinkerPop 3.4.1
Release Date: March 18, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Mix SPARQL and Gremlin
In the initial release of sparql-gremlin
it was only possible to execute a SPARQL query and have it translate to
Gremlin. Therefore, it was only possible to write a query like this:
gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }")
==>[name:marko,age:29]
==>[name:vadas,age:27]
==>[name:josh,age:32]
==>[name:peter,age:35]
gremlin> g.sparql("SELECT * WHERE { }")
==>v[1]
==>v[2]
==>v[3]
==>v[4]
==>v[5]
==>v[6]
In this release, however, it is now possible to further process that result with Gremlin steps:
gremlin> g.sparql("SELECT ?name ?age WHERE { ?person v:name ?name . ?person v:age ?age }").select("name")
==>marko
==>vadas
==>josh
==>peter
gremlin> g.sparql("SELECT * WHERE { }").out("knows").values("name")
==>vadas
==>josh
Upgrading for Providers
Graph Database Providers
GraphBinary Serializer Changes
In GraphBinary serialization, the Java write() and writeValue() methods of the TypeSerializer<T> interface now take a Netty ByteBuf instance instead of a ByteBufAllocator, so that the same buffer instance gets reused during the write of a message. Additionally, we took the opportunity to remove the unused parameter from ResponseMessageSerializer.
See: TINKERPOP-2161
TinkerPop 3.4.0
Release Date: January 2, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
sparql-gremlin
The sparql-gremlin
module is a SPARQL to Gremlin compiler, which allows
SPARQL to be executed over any TinkerPop-enabled graph system.
graph = TinkerFactory.createModern()
g = graph.traversal(SparqlTraversalSource)
g.sparql("""SELECT ?name ?age
WHERE { ?person v:name ?name . ?person v:age ?age }
ORDER BY ASC(?age)""")
Gremlin.NET Driver Improvements
The Gremlin.NET driver now uses request pipelining. This allows connections to be reused for different requests in
parallel which should lead to better utilization of connections. The ConnectionPool
now also has a fixed size
whereas it could previously create an unlimited number of connections. Each Connection
can handle up to
MaxInProcessPerConnection
requests in parallel. If this limit is reached for all connections, then a
NoConnectionAvailableException
is thrown which makes this a breaking change.
These settings can be set as properties on the ConnectionPoolSettings
instance that can be passed to the GremlinClient
.
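The interaction between the fixed pool size and the per-connection limit can be modeled in a few lines of Python. This is a simplified sketch and not the Gremlin.NET implementation: the Pool class, its method names and the "least busy connection" selection policy are assumptions made for illustration, and NoConnectionAvailableError merely stands in for NoConnectionAvailableException.

```python
class NoConnectionAvailableError(Exception):
    """Stand-in for the driver's NoConnectionAvailableException."""


class Pool:
    """Fixed-size pool where each connection accepts a bounded number of requests."""

    def __init__(self, pool_size, max_in_process_per_connection):
        self.max_in_process = max_in_process_per_connection
        self.in_flight = [0] * pool_size  # one in-flight counter per connection

    def borrow(self):
        # Assumed policy: pick the connection with the fewest in-flight requests.
        index = min(range(len(self.in_flight)), key=self.in_flight.__getitem__)
        if self.in_flight[index] >= self.max_in_process:
            # Every connection is at its limit, so the request is rejected.
            raise NoConnectionAvailableError("all connections are at their limit")
        self.in_flight[index] += 1
        return index

    def release(self, index):
        self.in_flight[index] -= 1
```

With a pool of two connections and a limit of two requests each, a fifth concurrent borrow() fails until one of the earlier requests is released, which is why callers that previously relied on unbounded connection creation may need to handle the new exception.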
Indexing of Collections
TinkerPop 3.4.0 adds a new index()-step, which allows users to transform simple collections into indexed collections or maps.
gremlin> g.V().hasLabel("software").values("name").fold().
......1> order(local).
......2> index().unfold()
==>[lop,0]
==>[ripple,1]
gremlin> g.V().hasLabel("person").values("name").fold().
......1> order(local).by(decr).
......2> index().
......3> with(WithOptions.indexer, WithOptions.map)
==>[0:vadas,1:peter,2:marko,3:josh]
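The two output shapes correspond closely to what enumerate() gives in Python. The following is a plain-Python analogy, not gremlin-python code; the index function and its indexer argument are hypothetical names chosen to mirror the step and its option.

```python
def index(values, indexer="list"):
    """Pair each value with its position, as a list of pairs or as a map."""
    if indexer == "map":
        return {i: v for i, v in enumerate(values)}
    return [[v, i] for i, v in enumerate(values)]


# Mirrors order(local).index().unfold() on the software names.
print(index(sorted(["lop", "ripple"])))

# Mirrors order(local).by(decr).index() with the map indexer on the person names.
print(index(sorted(["marko", "vadas", "josh", "peter"], reverse=True), indexer="map"))
```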
Modulation of valueMap()
The valueMap()
step now supports by
and with
modulation, which also led to the deprecation of valueMap(true)
overloads.
by() Modulation
With the help of the by()
modulator valueMap()
result values can now be adjusted, which is particularly useful to turn multi-/list-values into single values.
gremlin> g.V().hasLabel("person").valueMap()
==>[name:[marko],age:[29]]
==>[name:[vadas],age:[27]]
==>[name:[josh],age:[32]]
==>[name:[peter],age:[35]]
gremlin> g.V().hasLabel("person").valueMap().by(unfold())
==>[name:marko,age:29]
==>[name:vadas,age:27]
==>[name:josh,age:32]
==>[name:peter,age:35]
with() Modulation
The with()
modulator can be used to include certain tokens (id
, label
, key
and/or value
).
The old way (still valid, but deprecated):
gremlin> g.V().hasLabel("software").valueMap(true)
==>[id:10,label:software,name:[gremlin]]
==>[id:11,label:software,name:[tinkergraph]]
gremlin> g.V().has("person","name","marko").properties("location").valueMap(true)
==>[id:6,key:location,value:san diego,startTime:1997,endTime:2001]
==>[id:7,key:location,value:santa cruz,startTime:2001,endTime:2004]
==>[id:8,key:location,value:brussels,startTime:2004,endTime:2005]
==>[id:9,key:location,value:santa fe,startTime:2005]
The new way:
gremlin> g.V().hasLabel("software").valueMap().with(WithOptions.tokens)
==>[id:10,label:software,name:[gremlin]]
==>[id:11,label:software,name:[tinkergraph]]
gremlin> g.V().has("person","name","marko").properties("location").valueMap().with(WithOptions.tokens)
==>[id:6,key:location,value:san diego,startTime:1997,endTime:2001]
==>[id:7,key:location,value:santa cruz,startTime:2001,endTime:2004]
==>[id:8,key:location,value:brussels,startTime:2004,endTime:2005]
==>[id:9,key:location,value:santa fe,startTime:2005]
Furthermore, there is now finer control over which of the tokens should be included:
gremlin> g.V().hasLabel("software").valueMap().with(WithOptions.tokens, WithOptions.labels)
==>[label:software,name:[gremlin]]
==>[label:software,name:[tinkergraph]]
gremlin> g.V().has("person","name","marko").properties("location").valueMap().with(WithOptions.tokens, WithOptions.values)
==>[value:san diego,startTime:1997,endTime:2001]
==>[value:santa cruz,startTime:2001,endTime:2004]
==>[value:brussels,startTime:2004,endTime:2005]
==>[value:santa fe,startTime:2005]
As shown above, the support of the with() modulator for valueMap() makes the valueMap(boolean) overload superfluous, hence this overload is now deprecated. This is a breaking API change, since valueMap() will now always yield instances of type Map<Object, Object>. Prior to this change, only the valueMap(boolean) overload yielded Map<Object, Object> objects; valueMap() without the boolean parameter used to yield instances of type Map<String, Object>.
See: TINKERPOP-2059
Predicate Number Comparison
In previous versions within()
and without()
performed strict number comparisons; that means these predicates did
not only compare number values, but also the type. This was inconsistent with how other predicates (like eq
, gt
,
etc.) work. All predicates will now ignore the number type and instead compare numbers only based on their value.
Old behavior:
gremlin> g.V().has("age", eq(32L))
==>v[4]
gremlin> g.V().has("age", within(32L, 35L))
gremlin>
New behavior:
gremlin> g.V().has("age", eq(32L))
==>v[4]
gremlin> g.V().has("age", within(32L, 35L))
==>v[4]
==>v[6]
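Python does not distinguish Integer from Long, so the sketch below tags each number with a Java type name to make the difference visible. It is an illustrative model only: the tuple representation and the strict_within and value_within functions are assumptions, not TinkerPop code.

```python
def strict_within(stored, candidates):
    """Old within(): both the number type and the value must match."""
    return any(stored == candidate for candidate in candidates)


def value_within(stored, candidates):
    """New within(): only the numeric value is compared."""
    return any(stored[1] == candidate[1] for candidate in candidates)


# Ages are stored as Integer, while the query passes Long values (32L, 35L).
ages = {"v[1]": ("Integer", 29), "v[2]": ("Integer", 27),
        "v[4]": ("Integer", 32), "v[6]": ("Integer", 35)}
query = [("Long", 32), ("Long", 35)]

old = [v for v, age in ages.items() if strict_within(age, query)]
new = [v for v, age in ages.items() if value_within(age, query)]
print(old, new)
```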
See: TINKERPOP-2058
ReferenceElementStrategy
Gremlin Server has had some inconsistent behavior in the serialization of the results it returns. Remote traversals based on Gremlin bytecode always detach returned graph elements to "reference" (i.e. they remove properties and only include the id and label), but scripts would detach graph elements and include the properties. For 3.4.0, TinkerPop introduces the ReferenceElementStrategy which can be configured on a GraphTraversalSource to always detach to "reference".
gremlin> graph = TinkerFactory.createModern()
==>tinkergraph[vertices:6 edges:6]
gremlin> g = graph.traversal().withStrategies(ReferenceElementStrategy.instance())
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> v = g.V().has('person','name','marko').next()
==>v[1]
gremlin> v.class
==>class org.apache.tinkerpop.gremlin.structure.util.reference.ReferenceVertex
gremlin> v.properties()
gremlin>
The packaged initialization scripts that come with Gremlin Server now pre-configure the sample graphs with this strategy to ensure that both scripts and bytecode based requests over any protocol (HTTP, websocket, etc) and serialization format all return a "reference". To revert to the old form, simply remove the strategy in the initialization script.
It is recommended that users choose to configure their GraphTraversalSource
instances with ReferenceElementStrategy
as working with "references" only is the recommended method for developing applications with TinkerPop. In the future,
it is possible that ReferenceElementStrategy
will be configured by default for all graphs on or off Gremlin Server,
so it would be best to start utilizing it now and grooming existing Gremlin and related application code to account
for it.
See: TINKERPOP-2075
Text Predicates
Gremlin now supports simple text predicates on top of the existing P predicates. Both the new TextP text predicates and the old P predicates can be chained using and() and or().
gremlin> g.V().has("person","name", containing("o")).valueMap()
==>[name:[marko],age:[29]]
==>[name:[josh],age:[32]]
gremlin> g.V().has("person","name", containing("o").and(gte("j").and(endingWith("ko")))).valueMap()
==>[name:[marko],age:[29]]
See: TINKERPOP-2041
Changed Infix Behavior
The infix notation of and()
and or()
now supports an arbitrary number of traversals and ConnectiveStrategy
produces a traversal with proper AND and OR semantics.
Input: a.or.b.and.c.or.d.and.e.or.f.and.g.and.h.or.i
*BEFORE*
Output: or(a, or(and(b, c), or(and(d, e), or(and(and(f, g), h), i))))
*NOW*
Output: or(a, and(b, c), and(d, e), and(f, g, h), i)
Furthermore, previous versions failed to apply three or more and() steps using the infix notation. This is now fixed.
gremlin> g.V().has("name","marko").and().has("age", lt(30)).or().has("name","josh").and().has("age", gt(30)).and().out("created")
==>v[1]
==>v[4]
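The flattened output follows from treating AND as binding more tightly than OR and collecting each run of operands into a single n-ary step. A small Python sketch of that grouping, operating on the dotted notation used in the example above; the connective function is a hypothetical illustration and does not reproduce the actual ConnectiveStrategy code:

```python
def connective(expression):
    """Group a dotted infix chain into n-ary and()/or() calls."""
    tokens = expression.split(".")
    or_groups, current = [], [tokens[0]]
    # Tokens alternate: operand, operator, operand, operator, ...
    for operator, operand in zip(tokens[1::2], tokens[2::2]):
        if operator == "and":
            current.append(operand)      # extend the current AND group
        elif operator == "or":
            or_groups.append(current)    # close the group and start a new one
            current = [operand]
        else:
            raise ValueError(f"unknown connective: {operator}")
    or_groups.append(current)
    parts = [g[0] if len(g) == 1 else "and(" + ", ".join(g) + ")" for g in or_groups]
    return parts[0] if len(parts) == 1 else "or(" + ", ".join(parts) + ")"


print(connective("a.or.b.and.c.or.d.and.e.or.f.and.g.and.h.or.i"))
```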
See: TINKERPOP-2029
GraphBinary
GraphBinary is a new language agnostic, network serialization format designed to replace Gryo and GraphSON. At this time it is only available on the JVM, but support will be added for other languages in upcoming releases. The serializer has been configured in Gremlin Server’s packaged configuration files. The serializer can be configured using the Java driver as follows:
Cluster cluster = Cluster.build("localhost").port(8182).
serializer(Serializers.GRAPHBINARY_V1D0).create();
Client client = cluster.connect();
List<Result> r = client.submit("g.V().has('person','name','marko')").all().join();
Status Attributes
The Gremlin Server protocol allows for status attributes to be returned in responses. These attributes were typically for internal use, but were designed with extensibility in mind so that providers could return their own attributes to calling clients. Unfortunately, unless the client was being used with protocol level requests (which wasn’t convenient) those attributes were essentially hidden from view. As of this version however, status attributes are fully retrievable for both successful requests and exceptions.
See: TINKERPOP-1913
with() Step
This version of TinkerPop introduces the with()
-step to Gremlin. It isn’t really a step but is instead a step
modulator. This modulator allows the step it is modifying to accept configurations that can be used to alter the
behavior of the step itself. A good example of its usage is shown with the revised syntax of the pageRank()
-step
which now uses with()
to replace the old by()
options:
g.V().hasLabel('person').
pageRank().
with(PageRank.edges, __.outE('knows')).
with(PageRank.propertyName, 'friendRank').
order().
by('friendRank',desc).
valueMap('name','friendRank')
A similar change was made for peerPressure()
-step:
g.V().hasLabel('person').
peerPressure().
with(PeerPressure.propertyName, 'cluster').
group().
by('cluster').
by('name')
Note that the by()
modulators still work, but should be considered deprecated and open for removal in a future
release where breaking changes are allowed.
shortestPath() Step
Calculating the shortest path between vertices is a common graph use case. While the traversal to determine a shortest path can be expressed in Gremlin, this particular problem is common enough that the feature has been encapsulated into its own step, demonstrated as follows:
gremlin> g.withComputer().V().has('name','marko').
......1> shortestPath().with(ShortestPath.target, has('name','peter'))
==>[v[1],v[3],v[6]]
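For intuition about the result above, an unweighted shortest path can be found with a breadth-first search. The sketch below is a plain-Python stand-in and not how the step is implemented (the step runs on a GraphComputer); the EDGES list is a hand-copied version of the "modern" toy graph, and treating edges as undirected is an assumption that happens to reproduce the path shown.

```python
from collections import deque

# (outV, inV) pairs of the "modern" toy graph, used here as undirected edges.
EDGES = [(1, 2), (1, 4), (1, 3), (4, 5), (4, 3), (6, 3)]


def shortest_path(edges, source, target):
    """Breadth-first search returning one shortest path as a list of vertex ids."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)
    queue, seen = deque([[source]]), {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in neighbors.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # target is not reachable from source


print(shortest_path(EDGES, 1, 6))
```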
connectedComponent() Step
In prior versions of TinkerPop, it was recommended that the identification of Connected Component instances in a graph be computed by way of a reasonably complex bit of Gremlin that looked something like this:
g.V().emit(cyclicPath().or().not(both())).repeat(both()).until(cyclicPath()).
path().aggregate("p").
unfold().dedup().
map(__.as("v").select("p").unfold().
filter(unfold().where(eq("v"))).
unfold().dedup().order().by(id).fold()).
dedup()
The above approach had a number of drawbacks that included a large execution cost as well as incompatibilities in OLAP.
To simplify usage of this commonly used graph algorithm, TinkerPop 3.4.0 introduces the connectedComponent() step which reduces the above operation to:
g.withComputer().V().connectedComponent()
It is important to note that this step does require the use of a GraphComputer
to work, as it utilizes a
VertexProgram
behind the scenes.
io() Step
There have been some important changes to IO operations for reading and writing graph data. The use of Graph.io()
has been deprecated to further remove dependence on the Graph (Structure) API for users and to extend these basic
operations to GLV users by making these features available as part of the Gremlin language.
It is now possible to simply use Gremlin:
graph = ...
g = graph.traversal()
g.io(someInputFile).read().iterate()
g.io(someOutputFile).write().iterate()
While io()
-step is still single-threaded for OLTP style loading, it can be utilized in conjunction with OLAP which
internally uses CloneVertexProgram
and therefore any graph InputFormat
or OutputFormat
can be configured in
conjunction with this step for parallel loads of large datasets.
It is also worth noting that the io()
-step may be overridden by graph providers to utilize their native bulk-loading
features, so consult the documentation of the implementation being used to determine if there are any improved
efficiencies there.
Per Request Options
The Java driver now allows for various options to be set on a per-request basis via new overloads to submit() that accept RequestOptions instances. A good use case for this feature is to set a per-request override to the scriptEvaluationTimeout so that it only applies to the current request.
Cluster cluster = Cluster.open();
Client client = cluster.connect();
RequestOptions options = RequestOptions.build().timeout(500).create();
List<Result> result = client.submit("g.V()", options).all().get();
See: TINKERPOP-1342
min() max() and Comparable
Previously min()
and max()
were only working for numeric values. This has been changed and these steps can now
operate over any Comparable
value. The common workaround was the combination of order().by()
and limit()
as
shown here:
gremlin> g.V().values('name').order().by().limit(1) // workaround for min()
==>josh
gremlin> g.V().values('name').order().by(decr).limit(1) // workaround for max()
==>vadas
Any attempt to use min() or max() on non-numeric values led to an exception:
gremlin> g.V().values('name').min()
java.lang.String cannot be cast to java.lang.Number
Type ':help' or ':h' for help.
Display stack trace? [yN]
With the changes in this release, these kinds of queries become a lot easier:
gremlin> g.V().values('name').min()
==>josh
gremlin> g.V().values('name').max()
==>vadas
Nested Loop Support
Traversals now support nesting of repeat()
loops.
These can now be used to repeat another traversal while in a looped context, either inside the body of a repeat()
or
in its step modifiers (until()
or emit()
).
gremlin> g.V().repeat(__.in('traverses').repeat(__.in('develops')).emit()).emit().values('name')
==>stephen
==>matthias
==>marko
See: TINKERPOP-967
EventStrategy API
There were some minor modifications to how EventStrategy
is constructed and what can be expected from events raised
from the addition of new properties.
With respect to the change in terms of EventStrategy
construction, the detach()
builder method formerly took a
Class
as an argument and that Class
was meant to be one of the various "detachment factories" or null
. That
approach was a bit confusing, so that signature has changed to detach(EventStrategy.Detachment)
where the argument
is a more handy enum of detachment options.
As for the changes related to events themselves, it is first worth noting that the previously deprecated
vertexPropertyChanged(Vertex, Property, Object, Object…)
on MutationListener
has been removed for what should
have originally been the correct signature of vertexPropertyChanged(Vertex, VertexProperty, Object, Object…)
. In
prior versions when this method and its related edgePropertyChanged()
and vertexPropertyPropertyChanged()
were
triggered by way of the addition of a new property a "fake" property was included with a null
value for the
"oldValue" argument to these methods (as it did not exist prior to this event). That was a bit awkward to reason about
when dealing with that event. To make this easier, the event is now raised with a KeyedVertexProperty or KeyedProperty instance, which contains only a property key and no value.
Reducing Barrier Steps
The behavior of min()
, max()
, mean()
and sum()
has been modified to return no result if there’s no input.
Previously these steps yielded the internal seed value:
gremlin> g.V().values('foo').min()
==>NaN
gremlin> g.V().values('foo').max()
==>NaN
gremlin> g.V().values('foo').mean()
==>NaN
gremlin> g.V().values('foo').sum()
==>0
These traversals will no longer emit a result. Note that this also affects more complex scenarios, e.g. if these steps are used in by() modulators:
gremlin> g.V().group().
......1> by(label).
......2> by(outE().values("weight").sum())
==>[software:0,person:3.5]
Since software vertices have no outgoing edges and thus no weight values to sum, software
will no longer show up in
the result. In order to get the same result as before, one would have to add a coalesce()
-step:
gremlin> g.V().group().
......1> by(label).
......2> by(outE().values("weight").sum())
==>[person:3.5]
gremlin> g.V().group().
......1> by(label).
......2> by(coalesce(outE().values("weight"), constant(0)).sum())
==>[software:0,person:3.5]
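The effect on grouped results can be reproduced with a small Python model. This is only a sketch under stated assumptions: sum_step and group_sum are hypothetical names, the weights are copied by hand from the "modern" toy graph, and the default argument plays the role of coalesce(..., constant(0)).

```python
def sum_step(values, legacy=False):
    """Return a list holding the sum, or an empty list when there is no input."""
    values = list(values)
    if not values:
        return [0] if legacy else []   # legacy behavior emitted the seed value
    return [sum(values)]


# Outgoing edge weights per vertex, grouped by vertex label.
OUT_WEIGHTS = {
    "software": [[], []],                                 # lop, ripple
    "person": [[0.5, 1.0, 0.4], [], [1.0, 0.4], [0.2]],   # marko, vadas, josh, peter
}


def group_sum(legacy=False, default=None):
    result = {}
    for label, per_vertex in OUT_WEIGHTS.items():
        values = []
        for weights in per_vertex:
            # With a default, a vertex without weights contributes the constant instead.
            values.extend(weights if weights or default is None else [default])
        for total in sum_step(values, legacy):
            result[label] = total      # labels without a result never get a key
    return result


print(group_sum(legacy=True))   # old behavior: software is present
print(group_sum())              # new behavior: software is missing
print(group_sum(default=0))     # new behavior with a coalesce-style default
```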
See: TINKERPOP-1777
Order of select() Scopes
The order of select scopes has been changed to: maps, side-effects, paths. Previously the order was: side-effects, maps, paths - which made it almost impossible to select a specific map entry if a side-effect with the same name existed.
The following snippets illustrate the changed behavior:
gremlin> g.V(1).
......1> group("a").
......2> by(__.constant("a")).
......3> by(__.values("name")).
......4> select("a")
==>[a:marko]
gremlin> g.V(1).
......1> group("a").
......2> by(__.constant("a")).
......3> by(__.values("name")).
......4> select("a").select("a")
==>[a:marko]
Above is the old behavior: the second select("a") has no effect because it selects the side-effect a again, although one would expect to get the map entry a. What follows is the new behavior:
gremlin> g.V(1).
......1> group("a").
......2> by(__.constant("a")).
......3> by(__.values("name")).
......4> select("a")
==>[a:marko]
gremlin> g.V(1).
......1> group("a").
......2> by(__.constant("a")).
......3> by(__.values("name")).
......4> select("a").select("a")
==>marko
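The change amounts to a different lookup order across three scopes. A minimal Python model of that lookup, assuming the traverser's current object, the side-effects and the path labels can each be treated as a dictionary; the select and select_twice functions are hypothetical and greatly simplified compared with the real step:

```python
OLD_ORDER = ("side-effects", "maps", "paths")
NEW_ORDER = ("maps", "side-effects", "paths")


def select(key, current, side_effects, path, order):
    """Resolve key by checking each scope in the given order."""
    scopes = {
        "maps": current if isinstance(current, dict) else {},
        "side-effects": side_effects,
        "paths": path,
    }
    for scope in order:
        if key in scopes[scope]:
            return scopes[scope][key]
    raise KeyError(key)


# group("a") leaves the map [a:marko] in the side-effect named "a".
side_effects = {"a": {"a": "marko"}}


def select_twice(order):
    # The first select runs against a vertex, the second against the first result.
    first = select("a", "v[1]", side_effects, {}, order)
    return select("a", first, side_effects, {}, order)


print(select_twice(OLD_ORDER))
print(select_twice(NEW_ORDER))
```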
See: TINKERPOP-1522
GraphSON BulkSet
In earlier versions of TinkerPop, BulkSet
was coerced to a List
for GraphSON which was convenient in that it
didn’t add a new data type to support, but inconvenient in that it meant that certain process tests were not consistent
in terms of how they ran and the benefits of the BulkSet
were "lost" in that the "bulk" was being resolved server
side. With the addition of BulkSet
as a GraphSON type the "bulk" is now resolved on the client side by the language
variant. How that resolution occurs depends upon the language variant. For Java, there is a BulkSet object which maintains that structure sent from the server. For the other variants, the BulkSet is deserialized to a List form, which results in a much larger memory footprint than what is contained in the BulkSet.
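The memory trade-off can be seen by comparing a counted representation with its expanded list. The sketch below uses collections.Counter purely as an analogy for a bulked structure; it is not the gremlin-python deserializer, and to_bulk and expand are hypothetical helper names.

```python
from collections import Counter


def to_bulk(objects):
    """Keep one entry per distinct object together with its bulk (count)."""
    return Counter(objects)


def expand(bulk):
    """Resolve the bulk into a flat list with one entry per occurrence."""
    return list(bulk.elements())


bulk = to_bulk(["marko"] * 1000 + ["josh"] * 500)
flat = expand(bulk)

# The bulked form stores two entries; the expanded list stores every occurrence.
print(len(bulk), len(flat))
```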
See: TINKERPOP-2111
Python Bindings
Bindings were formerly created using a Python 2-tuple as a bit of syntactic sugar, but all other language variants
used an explicit Bindings
object which gremlin-python
already had in place. To make all variants behave consistently, the 2-tuple syntax has been removed in favor of the explicit Bindings.of() option.
g.V(Bindings.of('id',1)).out('created').map(lambda: ("it.get().value('name').length()", "gremlin-groovy")).sum()
See: TINKERPOP-2116
Deprecation and Removal
This section describes newly deprecated classes, methods, components and patterns of usage as well as which previously deprecated features have been officially removed or repurposed.
Moving of RemoteGraph
RemoteGraph
was long ago deprecated in favor of withRemote()
. It became even less useful with the introduction of
the AnonymousTraversalSource
concept in 3.3.5. Its only real use was for testing remote bytecode based traversals
in the test suite as the test suite requires an actual Graph
object to function properly. As such, RemoteGraph
has
been moved to gremlin-test
. It should no longer be used in any capacity besides that.
See: TINKERPOP-2079
Removal of Giraph Support
Support for Giraph has been removed as of this version. There were a number of reasons for this decision which were discussed in the community prior to taking this step. Users should switch to Spark for their OLAP based graph-computing needs.
See: TINKERPOP-1930
Removal of Rebindings Options
The "rebindings" option is no longer supported for clients. It was deprecated long ago at 3.1.0. The server will not respond to them on any channel - websockets, nio or HTTP. Use the "aliases" option instead.
gremlin-server.sh -i Removal
The -i
option for installing dependencies in Gremlin Server was long ago deprecated and has now been removed. Please
use install
as its replacement going forward.
Deprecation Removal
The following deprecated classes, methods or fields have been removed in this version:
- gremlin-core
  - org.apache.tinkerpop.gremlin.jsr223.ImportCustomizer#GREMLIN_CORE
  - org.apache.tinkerpop.gremlin.process.remote.RemoteGraph - moved to gremlin-test
  - org.apache.tinkerpop.gremlin.process.remote.RemoteConnection.submit(Traversal)
  - org.apache.tinkerpop.gremlin.process.remote.RemoteConnection.submit(Bytecode)
  - org.apache.tinkerpop.gremlin.process.remote.traversal.strategy.decoration.RemoteStrategy#identity()
  - org.apache.tinkerpop.gremlin.process.traversal.TraversalEngine
  - org.apache.tinkerpop.gremlin.process.traversal.engine.*
  - org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.PartitionStrategy.Builder#addReadPartition(String)
  - org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.SubgraphStrategy.Builder#edgeCriterion(Traversal)
  - org.apache.tinkerpop.gremlin.process.traversal.strategy.decoration.SubgraphStrategy.Builder#vertexCriterion(Traversal)
  - org.apache.tinkerpop.gremlin.process.traversal.step.map.LambdaCollectingBarrierStep.Consumers
  - org.apache.tinkerpop.gremlin.process.traversal.step.util.HasContainer#makeHasContainers(String, P)
  - org.apache.tinkerpop.gremlin.process.traversal.step.util.event.MutationListener#vertexPropertyChanged(Vertex, Property, Object, Object…)
  - org.apache.tinkerpop.gremlin.structure.Element.Exceptions#elementAlreadyRemoved(Class, Object)
  - org.apache.tinkerpop.gremlin.structure.Graph.Exceptions#elementNotFound(Class, Object)
  - org.apache.tinkerpop.gremlin.structure.Graph.Exceptions#elementNotFound(Class, Object, Exception)
- gremlin-driver
  - org.apache.tinkerpop.gremlin.driver.Client#rebind(String)
  - org.apache.tinkerpop.gremlin.driver.Client.ReboundClusteredClient
  - org.apache.tinkerpop.gremlin.driver.Tokens#ARGS_REBINDINGS
- gremlin-groovy
  - org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.close() - no longer implements AutoCloseable
- gremlin-server
  - org.apache.tinkerpop.gremlin.server.GraphManager#getGraphs()
  - org.apache.tinkerpop.gremlin.server.GraphManager#getTraversalSources()
  - org.apache.tinkerpop.gremlin.server.Settings#serializedResponseTimeout
  - org.apache.tinkerpop.gremlin.server.Settings.AuthenticationSettings#className
  - org.apache.tinkerpop.gremlin.server.handler.OpSelectorHandler(Settings, GraphManager, GremlinExecutor, ScheduledExecutorService)
  - org.apache.tinkerpop.gremlin.server.op.AbstractOpProcessor#makeFrame(ChannelHandlerContext, RequestMessage, MessageSerializer serializer, boolean, List, ResponseStatusCode code)
- hadoop-graph
  - org.apache.tinkerpop.gremlin.hadoop.structure.HadoopConfiguration#getGraphInputFormat()
  - org.apache.tinkerpop.gremlin.hadoop.structure.HadoopConfiguration#getGraphOutputFormat()
Please see the javadoc deprecation notes or upgrade documentation specific to when the deprecation took place to understand how to resolve this breaking change.
Deprecated GraphSONMessageSerializerGremlinV2d0
The GraphSONMessageSerializerGremlinV2d0
serializer is now analogous to GraphSONMessageSerializerV2d0
and therefore
redundant. It has technically always been equivalent in terms of functionality as both serialized to the same format
(i.e. GraphSON 2.0 with embedded types). It is no longer clear why these two classes were established this way, but
it does carry the negative effect where multiple serializer versions could not be bound to Gremlin Server’s HTTP
endpoint as the MIME types conflicted on application/json
. By simply making both message serializers support
application/json
and application/vnd.gremlin-v2.0+json
, it then became possible to overcome that limitation. In
short, prefer use of GraphSONMessageSerializerV2d0
when possible.
Note that this is a breaking change in the sense that GraphSONMessageSerializerV2d0 will no longer set the header of request messages to application/json. As a result, older versions of Gremlin Server not configured with GraphSONMessageSerializerGremlinV2d0 will not find a deserializer to match the request.
See: TINKERPOP-1984
Removed groovy-sql Dependency
Gremlin Console and Gremlin Server no longer include groovy-sql. If you depend on groovy-sql, you can install it in Gremlin Console or Gremlin Server using the plugin system.
Console:
:install org.codehaus.groovy groovy-sql 2.5.2
Server:
bin/gremlin-server.sh install org.codehaus.groovy groovy-sql 2.5.2
If your project depended on groovy-sql transitively, simply include it in your project’s build file (e.g. maven: pom.xml).
See: TINKERPOP-2037
Upgrading for Providers
Graph Database Providers
io() Step
The new io()
-step that was introduced provides some new changes to consider. Note that Graph.io()
has been
deprecated and users are no longer instructed to utilize that method. It is not yet decided when that method will be
removed completely, but given the public nature of it and the high chance of common usage, it should be hanging around
for some time.
As with any step in Gremlin, it is possible to replace it with a more provider specific implementation that could be
more efficient. Developing a TraversalStrategy
to do this is encouraged, especially for those graph providers who
might have special bulk loaders that could be abstracted by this step. Examples of this are already shown with
HadoopGraph
which replaces the simple single-threaded loader with CloneVertexProgram
. Graph providers are
encouraged to use the with()
step to capture any necessary configurations required for their underlying loader to
work. Graph providers should not feel restricted to graphson
, gryo
and graphml
formats either. If a graph
supports CSV or some custom graph specific format, it shouldn’t be difficult to gather the configurations necessary to
make that available to users.
See: TINKERPOP-1996
Caching Graph Features
For graph implementations that have expensive creation times, it can be time consuming to run the TinkerPop test suite
as each test run requires a Graph
instance even if the test is ultimately ignored because it doesn’t pass the feature
checks. To possibly help alleviate this problem, the GraphProvider
interface now includes this method:
public default Optional<Graph.Features> getStaticFeatures() {
    return Optional.empty();
}
This method can be implemented to return a cacheable set of features for a Graph
generated from that GraphProvider
.
Assuming this method is faster than the cost of creating a new Graph
instance, the test suite should execute
significantly faster depending on how many tests end up being ignored.
See: TINKERPOP-1518
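As a rough illustration of the caching idea, the sketch below uses hypothetical stand-in types (`Features`, `Provider`, `ExpensiveProvider`, `canSkipWithoutGraph`) rather than the real TinkerPop interfaces. Only the shape of `getStaticFeatures()` is taken from the text above; how the test suite actually consumes the result is an assumption.

```java
import java.util.Optional;

final class StaticFeaturesSketch {
    // Hypothetical helper: decide whether a transaction test can be skipped
    // without paying the cost of opening a graph.
    static boolean canSkipWithoutGraph(Provider provider) {
        return provider.getStaticFeatures()
                .map(f -> !f.supportsTransactions())
                .orElse(false); // no cached features: a graph must be created to decide
    }

    public static void main(String[] args) {
        System.out.println(canSkipWithoutGraph(new ExpensiveProvider()));
        System.out.println(canSkipWithoutGraph(new Provider() { }));
    }
}

// Hypothetical stand-in for Graph.Features; not the TinkerPop type.
interface Features {
    boolean supportsTransactions();
}

// Hypothetical stand-in for GraphProvider; not the TinkerPop type.
interface Provider {
    // Mirrors the default described above: nothing is cached unless overridden.
    default Optional<Features> getStaticFeatures() {
        return Optional.empty();
    }
}

final class ExpensiveProvider implements Provider {
    // Built once and shared, so feature checks need no Graph instance.
    private static final Features CACHED = () -> false;

    @Override
    public Optional<Features> getStaticFeatures() {
        return Optional.of(CACHED);
    }
}
```

The benefit only materializes when returning the cached features is cheaper than creating a graph, which is the same assumption the paragraph above makes.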
Configuring Interface
There were some changes to interfaces that were related to Step
. A new Configuring
interface was added that was
helpful in the implementation of the with()
-step modulator. This new interface extends the Parameterizing
interface
(which moved to the org.apache.tinkerpop.gremlin.process.traversal.step
package with the other step interfaces) and
in turn is extended by the Mutating
interface. Making this change meant that the Mutating.addPropertyMutations()
method could be removed in favor of the new Configuring.configure()
method.
All of the changes above basically mean that if the Mutating
interface was being used in prior versions, the
addPropertyMutations()
method simply needs to be changed to configure()
.
See: TINKERPOP-1975
OptionsStrategy
OptionsStrategy
is a TraversalStrategy
that makes it possible for users to set arbitrary configurations on a
Traversal
. These configurations can be used by graph providers to allow for traversal-level configurations to be
accessible to their custom steps. A user would write something like:
g.withStrategies(OptionsStrategy.build().with("specialLimit", 10000).create()).V();
The OptionsStrategy
is really only the carrier for the configurations and while users can choose to utilize that
more verbose method for constructing it shown above, it is more elegantly constructed as follows using with()
-step:
g.with("specialLimit", 10000).V();
The graph provider could then access that value of "specialLimit" in their custom step (or elsewhere) as follows:
OptionsStrategy strategy = this.getTraversal().asAdmin().getStrategies().getStrategy(OptionsStrategy.class).get();
int specialLimit = (int) strategy.getOptions().get("specialLimit");
See: TINKERPOP-2053
Removed hadoop-gremlin Test Artifact
The hadoop-gremlin
module no longer generates a test jar that can be used as a test dependency in other modules.
Generally speaking, that approach tends to be a bad practice and can cause build problems with Maven that aren’t always
obvious to troubleshoot. With the removal of giraph-gremlin
for 3.4.0, it seemed even less useful to have this
test artifact present. All tests are still present. The following provides a basic summary of how this refactoring
occurred:
- A new AbstractFileGraphProvider was created in gremlin-test which provided a lot of the features that HadoopGraphProvider was exposing. Both HadoopGraphProvider and SparkHadoopGraphProvider extend from that class now.
- ToyIoRegistry and related classes were moved to gremlin-test.
- The various tests that validated capabilities of Storage have been moved to spark-gremlin and are part of those tests now. Obviously, that makes those tests specific to Spark testing now. If that location creates a problem for some reason, that decision can be revisited at some point.
See: TINKERPOP-1410
TraversalEngine Moved
The TraversalEngine
interface was deprecated in 3.2.0 along with all related methods that used it and classes that
implemented it. It was replaced by the Computer
interface and provided a much nicer way to plug different
implementations of Computer
into a traversal. TraversalEngine
was never wholly removed however as it had some deep
dependencies in the inner workings of the test suite. That infrastructure has largely remained as is until now.
As of 3.4.0, TraversalEngine
is no longer in gremlin-core
and can instead be found in gremlin-test
as it is
effectively a "test-only" component and serves no other real function. As explained in the javadocs going back to
3.2.0, providers should implement the Computer
class and use that instead. At this point, graph providers should have
long ago moved to the Computer
infrastructure as methods for constructing a TraversalSource
with a
TraversalEngine
were long ago removed.
See: TINKERPOP-1143
Upsert Graph Feature
Some Graph
implementations may be able to offer upsert functionality for vertices and edges, which can help improve
usability and performance. To help make it clear to users that a graph operates in this fashion, the supportsUpsert()
feature has been added to both Graph.VertexFeatures
and Graph.EdgeFeatures
. By default, both of these methods will
return false
.
Should a provider wish to support this feature, the behavior of addV()
and/or addE()
should change such that when
a vertex or edge with the same identifier is provided, the respective step will insert the new element if that value
is not present or update an existing element if it is found. The method by which the provider "identifies" an element
is completely up to the capabilities of that provider. In the most simple fashion, a graph could simply check the
value of the supplied T.id
, however graphs that support some form of schema will likely have other methods for
determining whether or not an existing element is present.
The extent to which TinkerPop tests "upsert" is fairly narrow. Graph providers that choose to support this feature should consider their own test suites carefully to ensure appropriate coverage.
See: TINKERPOP-1685
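The insert-or-update behavior described above can be sketched with a plain in-memory map. `UpsertSketch` and its methods are hypothetical and are not part of TinkerPop; the sketch also assumes the simplest identification scheme mentioned above, a supplied id.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory store used only to illustrate upsert semantics.
final class UpsertSketch {
    private final Map<Object, Map<String, Object>> vertices = new HashMap<>();

    // Inserts a vertex when the id is unknown, otherwise updates the existing one.
    // Returns true when a new vertex was created.
    boolean addV(Object id, Map<String, Object> properties) {
        Map<String, Object> existing = vertices.get(id);
        if (existing == null) {
            vertices.put(id, new HashMap<>(properties));
            return true;
        }
        existing.putAll(properties);
        return false;
    }

    // Returns the property value, or null if the vertex or key is absent.
    Object value(Object id, String key) {
        Map<String, Object> vertex = vertices.get(id);
        return vertex == null ? null : vertex.get(key);
    }

    int size() {
        return vertices.size();
    }

    public static void main(String[] args) {
        UpsertSketch g = new UpsertSketch();
        g.addV(1, Map.of("name", "marko")); // id 1 is new: insert
        g.addV(1, Map.of("age", 29));       // id 1 exists: update in place
        System.out.println(g.size() + " vertex, age=" + g.value(1, "age"));
    }
}
```

A graph with a schema would likely replace the id lookup with its own notion of element identity, as noted above.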
TypeTranslator Changes
The TypeTranslator
experienced a change in its API and GroovyTranslator
a change in expectations.
TypeTranslator
now implements BiFunction
and takes the graph traversal source name as an argument along with the
object to translate. This interface is implemented by default for Groovy with GroovyTranslator.DefaultTypeTranslator
which encapsulates all the functionality of what GroovyTranslator
formerly did by default. To provide customized
translation, simply extend the DefaultTypeTranslator
and override the methods.
GroovyTranslator
now expects that the TypeTranslator
provided to it as part of its of()
static method overload
is "complete" - i.e. that it provides all the functionality to translate the types passed to it. Thus, a "complete"
TypeTranslator
is one that does everything that DefaultTypeTranslator
does as a minimum requirement. Therefore,
the extension model described above is the easiest way to get going with a custom TypeTranslator
approach.
See: TINKERPOP-2072
Graph Driver Providers
Deprecation Removal in RemoteConnection
The two deprecated synchronous submit()
methods on the RemoteConnection
interface have been removed, which means
that RemoteConnection
implementations will need to implement submitAsync(Bytecode)
if they have not already done
so.
See: TINKERPOP-2103
TinkerPop 3.3.0
Gremlin Symphony #40 in G Minor
TinkerPop 3.3.11
Release Date: June 1, 2020
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
GLV Sessions
While TinkerPop doesn’t recommend the use of sessions for most use cases, it does remain a feature that is available and exposed on the server. As such, providing support in all Gremlin Language Variants for this feature is useful in ensuring a consistent implementation for all programming languages.
const client = new Client('ws://localhost:8182/gremlin', { traversalSource: 'g', 'session': 'unique-string-id' });
client = Client('ws://localhost:8182/gremlin', 'g', session=str(uuid.uuid4()))
var gremlinServer = new GremlinServer("localhost", 8182);
var client = new GremlinClient(gremlinServer, sessionId: Guid.NewGuid().ToString());
Deprecate maxWaitForSessionClose
The maxWaitForSessionClose
setting for the Java driver has been deprecated and in some sense replaced by the
maxWaitForClose
setting. The two settings perform different functions, but expect maxWaitForSessionClose
to be
removed in future versions. The maxWaitForClose
performs a more useful function than maxWaitForSessionClose
in
the sense that it tells the driver how long it should wait for pending messages from the server before closing the
connection. The maxWaitForSessionClose
on the other hand is how long the driver should wait for the server to
respond to a session close message (i.e. an actual response from the server). Waiting for that specific response to the
session close message could result in the driver hanging on calls to Client.close()
if there is a long-running query
executing on the server and the close message is queued behind it.
Future versions will remove support for that particular message and simply close the session when the connection is closed. As a result that setting will no longer be useful. The old setting is really only useful for connecting to older versions of the server prior to 3.3.11 that do not have the session shutdown hook bound to the close of the connection.
See: TINKERPOP-2336
Upgrading for Providers
Gremlin Driver Providers
Session Close
The "close" message for the SessionOpProcessor
is deprecated, however the functionality to accept the message remains
in Gremlin Server and the functionality to send the message remains in the Java driver. The expectation is that
support for the message will be removed from the driver in a future release, likely at 3.5.0. Server implementations
starting at 3.3.11 should look to use the close of a connection to trigger the close of a session and its release of
resources.
See: TINKERPOP-2336
TinkerPop 3.3.10
Release Date: February 3, 2020
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Traversal Clone
Once a traversal has been executed (i.e. iterated) its internal state is such that it cannot be re-used:
gremlin> t = g.V().count()
==>6
gremlin> t
gremlin>
To re-use a traversal it must be copied with clone()
as follows:
gremlin> t = g.V().count()
==>6
gremlin> t.clone()
==>6
gremlin> t.clone()
==>6
The ability to clone()
has been exclusive to Java and was a missing component of other supported languages like
Python, Javascript and .NET. This feature has now been added for all the language variants making the ability to
re-use traversals consistent in all ecosystems.
See: TINKERPOP-2315
Deprecated Jython Support
Jython support in gremlin-python
has been deprecated to focus on native Python 3.x support for 3.5.0 where Jython
support will be wholly removed. This change does mean that TinkerPop will no longer support the ability to submit
lambdas as native Python scripts. They will need to be submitted as Groovy, and the library will default to doing so.
See: TINKERPOP-2322
TinkerPop 3.3.9
Release Date: October 14, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
ReservedKeysVerificationStrategy
ReservedKeysVerificationStrategy
is a new VerificationStrategy
that can be used to help prevent traversals from
adding property keys that are protected. They may be protected as a result of the key being a reserved keyword of the
underlying graph system or the key may simply violate some standard conventions.
gremlin> g.withStrategies(ReservedKeysVerificationStrategy.build().throwException().create()).addV('person').property("id",123)
The provided traversal contains a AddVertexStartStep that is setting a property key to a reserved word: id
Type ':help' or ':h' for help.
Display stack trace? [yN]
Javascript ResponseError
Gremlin Javascript now enables more robust error handling by way of a ResponseError
which provides access to more
information from the server. Specifically, it includes the statusMessage
and statusCode
which were formerly packed
into the Error.message
property, which meant that the error message string had to be parsed if there was a need to
take a specific action based on that information. The ResponseError
also includes the statusAttributes
which
is a Map
object that will typically incorporate server exceptions
and stackTrace
keys, but could also include
provider specific error information.
The original error messaging has remained unchanged and therefore users who were message parsing should not expect
changes in behavior, however, future versions will eliminate the "overuse" of the Error.message
property, so it is
advised that users update their code to take advantage of the ResponseError
features.
See: TINKERPOP-2285
Java Driver NoHostAvailableException
Expect a NoHostAvailableException
rather than a more generic TimeoutException
if the Java driver is unable to
connect to a Host
. This sort of failure can occur in a number of different scenarios, but can often occur when there
are configuration problems with authentication and SSL that prevent the connection to the Host
from being established.
Deprecated scriptEvaluationTimeout
The scriptEvaluationTimeout
controls the length of time a request is processed by Gremlin Server (or at its core,
the GremlinExecutor
). Of course, with the introduction of bytecode-based requests many versions ago, this naming
has failed to make complete sense in more recent times. Therefore, scriptEvaluationTimeout
has been deprecated and
replaced by evaluationTimeout
. Both configurations are still respected, but it is advised that users switch to
evaluationTimeout
as scriptEvaluationTimeout
will be removed in a later version.
Note that when configuring Gremlin Server’s evaluationTimeout
, the scriptEvaluationTimeout
should be set to
-1
(the default) or else it will use that value in its initialization of the server and ignore the
evaluationTimeout
.
See: TINKERPOP-2213
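The precedence between the two settings described above can be modelled in a few lines. `TimeoutSketch` and `effectiveTimeout` are hypothetical names; the real logic lives inside Gremlin Server and may differ in detail.

```java
// Hypothetical helper modelling the precedence described above.
final class TimeoutSketch {
    // Default value of the deprecated scriptEvaluationTimeout setting.
    static final long UNSET = -1L;

    static long effectiveTimeout(long scriptEvaluationTimeout, long evaluationTimeout) {
        // Any deprecated value other than -1 wins and evaluationTimeout is ignored.
        return scriptEvaluationTimeout != UNSET ? scriptEvaluationTimeout : evaluationTimeout;
    }

    public static void main(String[] args) {
        System.out.println(effectiveTimeout(UNSET, 30000L)); // deprecated setting left at default
        System.out.println(effectiveTimeout(5000L, 30000L)); // deprecated setting still configured
    }
}
```

In practice this means an old configuration file that still carries scriptEvaluationTimeout can silently override a newly added evaluationTimeout.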
TinkerPop 3.3.8
Release Date: August 5, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Branch Steps Accept Predicates and Traversals
Prior to this version, branch steps (in particular BranchStep
and ChooseStep
) could only handle constant values and Pick
tokens. Starting in this version, these steps will also accept
predicates and traversals as shown in the example below.
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V().hasLabel("person").
......1> group().
......2> by("name").
......3> by(branch(values("age")).
......4> option(29, constant("almost old")).
......5> option(__.is(32), constant("looks like josh")).
......6> option(lt(29), constant("pretty young")).
......7> option(lt(35), constant("younger than peter")).
......8> option(gte(30), constant("pretty old")).
......9> option(none, constant("mysterious")).fold()).
.....10> unfold()
==>peter=[pretty old]
==>vadas=[pretty young, younger than peter]
==>josh=[looks like josh, younger than peter, pretty old]
==>marko=[almost old, younger than peter]
See: TINKERPOP-1084
Python DateTime
With GraphSON, the g:Date
is meant to represent a millisecond-precision offset from the unix epoch, but earlier
versions of Gremlin Python were using the timezone of the local system to handle serialization and deserialization,
thus resulting in incorrect conversion. This issue is now resolved. It may be necessary to remove workarounds that have
been introduced to combat this problem.
See: TINKERPOP-2264
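The class of bug described above is easy to reproduce in any language. The sketch below is written in Java rather than Python and is not the Gremlin Python code; `EpochMillisSketch` is a hypothetical name. It shows how reading the same wall-clock value in a non-UTC zone shifts the resulting epoch offset.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

final class EpochMillisSketch {
    // Converts a wall-clock time to epoch milliseconds by reading it in the given zone.
    static long toEpochMillis(LocalDateTime wallClock, ZoneId zone) {
        return wallClock.atZone(zone).toInstant().toEpochMilli();
    }

    public static void main(String[] args) {
        LocalDateTime t = LocalDateTime.of(2019, 8, 5, 12, 0);
        long utc = toEpochMillis(t, ZoneOffset.UTC);
        long shifted = toEpochMillis(t, ZoneOffset.ofHours(-5));
        // The same wall-clock value maps to different offsets from the epoch.
        System.out.println((shifted - utc) / 3_600_000L + " hours apart");
    }
}
```

Code that added or subtracted a local offset to compensate for the old behavior would now double-correct, which is why such workarounds should be removed.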
JavaScript withComputer()
Gremlin-Javascript now supports withComputer()
syntax, which means that it is now possible in Javascript to utilize
Gremlin steps that require a GraphComputer
to execute (e.g. pageRank()
and peerPressure()
).
See: TINKERPOP-2251
JavaScript and .NET hasNext()
There is now greater consistency across Gremlin Language Variants with hasNext()
support added to both JavaScript
and .NET. Note that in .NET, the proper method name follows C# capitalization semantics and is referred to as
HasNext()
.
See: TINKERPOP-1921
Deprecated getSideEffects()
Traversal.getSideEffects()
has been deprecated for purposes of external calls by end-users. While the method is still
present TinkerPop no longer guarantees its existence in future versions or consistency of its behavior, especially for
Gremlin Language Variants and remote traversal execution. If this method is currently in use to gather side-effect
results after traversal execution, please change such code to use cap()
-step. For example, code like this:
gremlin> t = g.V().hasLabel('person').aggregate('p').out('created')
==>v[3]
==>v[5]
==>v[3]
==>v[3]
gremlin> t.getSideEffects().get('p')
==>v[1]
==>v[2]
==>v[4]
==>v[6]
should be converted to something like:
gremlin> g.V().hasLabel('person').aggregate('p').out('created').union(fold(),cap('p'))
==>[v[3],v[3],v[3],v[5]]
==>[v[1],v[2],v[4],v[6]]
EventStrategy
Prior to TinkerPop 3.3.6, EventStrategy
did not work with multi-properties. The EventStrategy
behavior for single-valued properties has not changed; if a property is added to a multi-valued
VertexProperty
, then a VertexPropertyChangedEvent
will now be fired. The arguments passed to the event depend on the cardinality type.
Cardinality | Old property passed to the change event
list | Since properties will always be added and never be overwritten, the old property passed to the change event will always be an empty property.
set | The old property passed to the change event will be empty if no other property with the same value exists; otherwise, it will be the existing property.
See: TINKERPOP-2159
TinkerPop 3.3.7
Release Date: May 28, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
JavaScript DSL Pattern
Gremlin DSL developers had to rely on monkey patching to build a Domain Specific Language in JavaScript. With this
latest version, the process is a bit more elegant and largely handled by extending existing core classes of
Gremlin-JavaScript
which is a bit more in-line with other language variant patterns for DSL development.
Gremlin Console Interrupt
There are occasions where a traversal (or other evaluation) takes longer than expected to complete. Common examples of these situations typically include:
- Accidentally executing g.V() (i.e. full graph scan) on a very large graph as an OLTP style traversal.
- Not terminating a repeat() properly which results in an infinite loop.
- Bad data might create a cycle in the graph where one wasn’t expected.
- Failing to properly bound a traversal’s path through one or more supernodes.
In earlier versions, the only option was to kill the Gremlin Console (typically with ctrl+c
), but as of this version
ctrl+c
will be intercepted and attempt to interrupt whatever process is currently executing.
See: TINKERPOP-2181
Removed gperfutils Dependency
Gremlin Console included references to:
<dependency>
    <groupId>org.gperfutils</groupId>
    <artifactId>gbench</artifactId>
</dependency>
<dependency>
    <groupId>org.gperfutils</groupId>
    <artifactId>gprof</artifactId>
</dependency>
to provide some benchmarking and profiling tools for the Utilities Plugin. Neither project is well maintained at this point and given that TinkerPop has moved past the versions of Groovy that are supported by these tools it doesn’t make sense to retain the dependencies. Users who wish to continue to use these tools can obviously do so still by manually adding the dependencies to the Gremlin Console or with the following command:
gremlin> :install org.gperfutils gbench <version>
gremlin> :install org.gperfutils gprof <version>
and then import the appropriate classes and methods to use.
See: TINKERPOP-2182
Upgrading for Providers
Graph Database Providers
Detection of Anti-Patterns
This release adds a strategy named EdgeLabelVerificationStrategy
. The strategy will not be added by default to the traversal source, however, providers can add it explicitly to encourage (or enforce)
users to always specify at least one edge label. EdgeLabelVerificationStrategy
can be configured to either throw an exception if no edge label was specified, log a warning or do both.
gremlin> // throw an exception if edge label was not specified
gremlin> g = TinkerFactory.createModern().traversal().withStrategies(EdgeLabelVerificationStrategy.build().throwException().create())
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).out('knows')
==>v[2]
==>v[4]
gremlin> g.V(1).out()
The provided traversal contains a vertex step without any specified edge label: VertexStep(OUT,vertex)
Type ':help' or ':h' for help.
Display stack trace? [yN]
gremlin> // log a warning if edge label was not specified (Log4j has to be configured properly)
gremlin> g = TinkerFactory.createModern().traversal().withStrategies(EdgeLabelVerificationStrategy.build().logWarning().create())
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.V(1).out('knows')
==>v[2]
==>v[4]
gremlin> g.V(1).out()
WARN org.apache.tinkerpop.gremlin.process.traversal.strategy.verification.EdgeLabelVerificationStrategy - The provided traversal contains a vertex step without any specified edge label: VertexStep(OUT,vertex)
==>v[3]
==>v[2]
==>v[4]
See: TINKERPOP-2191
TinkerPop 3.3.6
Release Date: March 18, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Anti-Patterns Documentation
The Gremlin Recipes documentation now has a new section which discusses anti-patterns in Gremlin. Its subsections detail commonly seen approaches to the Gremlin language which can lead to poorly written and/or under-performing traversals. The new Anti-Patterns Section can be found here.
See: TINKERPOP-2114
TinkerPop 3.3.5
Release Date: January 2, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
AnonymousTraversalSource
The AnonymousTraversalSource
provides for a more unified syntax for TraversalSource
construction by placing greater
user emphasis on the creation of the source rather than the Graph
it is connected to. It has a number of different
static traversal()
methods available and when imported as:
import static org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource.traversal;
allows TraversalSource
construction syntax such as:
gremlin> g = traversal().withGraph(TinkerFactory.createModern())
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g = traversal().withRemote('conf/remote-graph.properties')
==>graphtraversalsource[emptygraph[empty], standard]
gremlin> g = traversal().withRemote(DriverRemoteConnection.using('localhost',8182))
==>graphtraversalsource[emptygraph[empty], standard]
Typically, this syntax is used for "remote" traversal construction for bytecode based requests, but has the option to
bind a local Graph
instance to it as well. It doesn’t save much typing to do so obviously, so it may not be best
used in that situation. Python, Javascript and .NET have similar syntax.
See: TINKERPOP-2078
Bytecode Command
The Gremlin Console now has a new :bytecode
command to help users work more directly with Gremlin bytecode. The
command is more of a debugging tool than something that would be used for every day purposes. It is sometimes helpful
to look at Gremlin bytecode directly and the process for viewing it in human readable format is not a single step
process. It is also not immediately clear how to convert bytecode to a Gremlin string. The :bytecode
command aims to
help with both of these issues:
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> :bytecode from g.V().out('knows') //1
==>{"@type":"g:Bytecode","@value":{"step":[["V"],["out","knows"]]}}
gremlin> :bytecode translate g {"@type":"g:Bytecode","@value":{"step":[["V"],["out","knows"]]}} //2
==>g.V().out("knows")
Configurable Class Map Cache
The "class map" cache in Gremlin Server (specifically the GremlinGroovyScriptEngine
) that holds compiled scripts is
now fully configurable via the GroovyCompilerGremlinPlugin.classMapCacheSpecification
.
RangeStep Optimizing Strategy
A new strategy named EarlyLimitStrategy
was added. The strategy will try to find a better spot for any RangeStep
,
which is as early as possible in the traversal. If possible it will also merge multiple RangeStep instances into a single one
by recalculating the range for the first step and removing the second. If it turns out that the merge of two steps won’t
produce a valid range (an empty result), then the EarlyLimitStrategy
will remove the RangeStep
instances and
insert a NoneStep
instead.
This strategy is particularly useful when a provider implementation generates the queries to the underlying database. By making sure that the ranges are applied as early as possible, we can ensure that the underlying database is only asked for the least amount of data necessary to continue the traversal evaluation.
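The range recalculation can be illustrated with some simple arithmetic. This is a sketch of the idea only and is not the EarlyLimitStrategy implementation; `RangeMergeSketch`, `Range` and `merge` are hypothetical names, and the sketch assumes bounded, non-negative, half-open ranges (the unbounded case is not handled).

```java
import java.util.Optional;

// Illustrative arithmetic only; not the EarlyLimitStrategy implementation.
final class RangeMergeSketch {
    // Half-open range [low, high) over the positions of incoming traversers.
    static final class Range {
        final long low;
        final long high;

        Range(long low, long high) {
            this.low = low;
            this.high = high;
        }
    }

    // Merges range(first) followed by range(second) into one equivalent range,
    // or returns empty when no traverser could pass both.
    static Optional<Range> merge(Range first, Range second) {
        long low = first.low + second.low;
        long high = Math.min(first.high, first.low + second.high);
        return low < high ? Optional.of(new Range(low, high)) : Optional.empty();
    }

    public static void main(String[] args) {
        // range(5, 20) followed by range(2, 10) keeps original positions 7 to 14.
        merge(new Range(5, 20), new Range(2, 10))
                .ifPresent(r -> System.out.println("[" + r.low + ", " + r.high + ")"));
        // range(0, 3) followed by range(5, 10) can never emit anything.
        System.out.println(merge(new Range(0, 3), new Range(5, 10)).isPresent());
    }
}
```

The empty case corresponds to the situation above in which the strategy drops the RangeStep instances and inserts a NoneStep.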
Upgrading for Providers
Graph Database Providers
OptOut on GraphProvider
It is not uncommon for those utilizing the TinkerPop test suite to have multiple configurations of their graph under
test. The multiple configurations typically manifest as multiple GraphProvider
implementations which supply the
different configurations to test. It is sometimes the case, that a particular Graph
configuration cannot support all
of the tests in the suite, at which point the only available workarounds tend to be less than straightforward.
It has always been possible to apply an OptOut
annotation to a Graph
instance, to avoid a particular test
execution. It is now possible to apply that same OptOut
to a GraphProvider
instance for that same purpose.
Hopefully, this feature will make multiple configuration testing easier.
TinkerPop 3.3.4
Release Date: October 15, 2018
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Introducing Order.asc and Order.desc
The Order
enum originally introduced incr
for ascending order and decr
for descending order. It’s not clear why
they were named this way when common querying parlance would call for asc
and desc
for those respective cases. Note
that incr
and decr
have not been removed - just deprecated and thus marked for future removal. Prefer asc
and
desc
going forward when writing Gremlin and look to update existing code using the deprecated values.
See: TINKERPOP-1956
TimedInterrupt
In Gremlin Server, it is best not to use the timedInterrupt
option on GroovyCompilerGremlinPlugin
because it
can compete with the scriptEvaluationTimeout
setting and produce a different error path. Simply rely on
scriptEvaluationTimeout
as it covers both script evaluation time and result iteration time.
See: TINKERPOP-1778
TinkerPop 3.3.3
Release Date: May 8, 2018
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Credential DSL Changes
The Credential DSL has been modified to work as a standard Java-based Gremlin DSL. The now deprecated old approach used a "graph wrapping" style that was developed long before the recommended method for building DSLs was published. Under this new model, the DSL is initialized via traversal as follows:
CredentialTraversalSource credentials = graph.traversal(CredentialTraversalSource.class)
credentials.user("stephen","password").iterate()
credentials.users("stephen").valueMap().next()
credentials.users().count().next()
credentials.users("stephen").drop().iterate()
TinkerPop 3.3.2
Release Date: April 2, 2018
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin Python Sets
Graph traversals that return a Set
from Java are now coerced to a List
in Python. This change ensures that Python
results match Java results for the same traversal. It is possible to see this problem in prior versions of
gremlin-python where a Set
of numbers of different types are returned. In Java, a set of:
[1,1.0d,2,2.0d]
would be deserialized to the following in Python:
[1,2]
Now that the Java Set
is coerced to a List
in Gremlin Python, the Java Set
can be fully represented. Users who
require a Set
will need to manually convert their List
to a Set
.
See: TINKERPOP-1844
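The Java side of this mismatch can be checked directly. The sketch below uses only the Java standard library (`MixedNumberSetSketch` is a hypothetical name) and shows that a Java set keeps all four values from the example above, because an Integer and a Double are never equal to each other.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class MixedNumberSetSketch {
    // Builds the mixed-type set from the example above: [1, 1.0d, 2, 2.0d].
    static Set<Object> mixedNumbers() {
        Set<Object> numbers = new LinkedHashSet<>();
        numbers.add(1);
        numbers.add(1.0d);
        numbers.add(2);
        numbers.add(2.0d);
        return numbers;
    }

    public static void main(String[] args) {
        Set<Object> numbers = mixedNumbers();
        // Integer 1 and Double 1.0 are not equal in Java, so nothing is collapsed.
        List<Object> asList = new ArrayList<>(numbers);
        System.out.println(asList.size());
    }
}
```

Python, by contrast, treats 1 and 1.0 as equal and hashes them identically, so a Python set collapses these values into two members; a list preserves all four.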
TinkerPop 3.3.1
Release Date: December 17, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin Python path()
There was a bug in GraphSON 3.0 serialization that prevented proper handling of results containing Path
objects. As a
result, traversals that used and returned results from the path()
-step in Python would return unusable results,
but did not actually cause an exception condition. This problem is now resolved.
See: TINKERPOP-1799
Added math()-step for Scientific Traversal Computing
GraphTraversal.math(String)
was added. This step provides scientific calculator capabilities to a Gremlin traversal.
gremlin> g.V().as('a').out('knows').as('b').math('a + b').by('age')
==>56.0
==>61.0
gremlin> g.V().as('a').out('created').as('b').
......1> math('b + a').
......2> by(both().count().math('_ + 100')).
......3> by('age')
==>132.0
==>133.0
==>135.0
==>138.0
gremlin> g.withSack(1).V(1).repeat(sack(sum).by(constant(1))).times(10).emit().sack().math('sin _')
==>0.9092974268256817
==>0.1411200080598672
==>-0.7568024953079282
==>-0.9589242746631385
==>-0.27941549819892586
==>0.6569865987187891
==>0.9893582466233818
==>0.4121184852417566
==>-0.5440211108893698
==>-0.9999902065507035
See: TINKERPOP-1632
Changed Typing on from() and to()
The from()
and to()
-steps of GraphTraversal
have a Traversal<E,Vertex>
overload. The E
has been changed to ?
in order to reduce < >
-based coercion in strongly typed Gremlin language variants.
See: TINKERPOP-1793
addV(traversal) and addE(traversal)
The GraphTraversal
and GraphTraversalSource
methods of addV()
and addE()
have been extended to support dynamic
label determination upon element creation. Both these methods can take a Traversal<?, String>
where the first String
returned by the traversal is used as the label of the respective element.
gremlin> g = TinkerFactory.createModern().traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.addV(V().has('name','marko').label()).
property('name','stephen')
==>v[13]
gremlin> g.V().has('name','stephen').valueMap(true)
==>[name:[stephen],label:person,id:13]
gremlin> g.V().has('name','stephen').
addE(V().hasLabel('software').inE().label()).
to(V().has('name','lop'))
==>e[15][13-created->3]
gremlin> g.V().has('name','stephen').outE().valueMap(true)
==>[label:created,id:15]
gremlin>
See: TINKERPOP-1793
PageRankVertexProgram
There were two major bugs in the way in which PageRank was being calculated in PageRankVertexProgram
. First, teleportation
energy was not being distributed correctly amongst the vertices at each round. Second, terminal vertices (i.e. vertices
with no outgoing edges) did not have their full gathered energy distributed via teleportation.
For users upgrading, note that while the relative rank orders will remain "the same," the actual PageRank values will differ from prior TinkerPop versions.
VERTEX iGRAPH TINKERPOP
marko 0.1119788 0.11375485828040575
vadas 0.1370267 0.14598540145985406
lop 0.2665600 0.30472082661863686
josh 0.1620746 0.14598540145985406
ripple 0.2103812 0.1757986539008437
peter 0.1119788 0.11375485828040575
Normalization preserved through computation:
0.11375485828040575 +
0.14598540145985406 +
0.30472082661863686 +
0.14598540145985406 +
0.1757986539008437 +
0.11375485828040575
==>1.00000000000000018
Two other additions to PageRankVertexProgram
were provided as well.
-
It now calculates the vertex count and thus, no longer requires the user to specify the vertex count.
-
It now allows the user to leverage an epsilon-based convergence instead of having to specify the number of iterations to execute.
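The two fixes and the two additions can be illustrated with a textbook PageRank loop. This is a minimal sketch and not the code of PageRankVertexProgram: the class name, the adjacency-array representation, the damping factor of 0.85 and the epsilon value are all assumptions made for illustration, and the resulting numbers are not claimed to match either column of the table above.

```java
import java.util.Arrays;

/** Textbook PageRank sketch; not the code of PageRankVertexProgram. */
final class PageRankSketch {

    // Toy graph shaped like TinkerPop's "modern" graph:
    // 0=marko, 1=vadas, 2=lop, 3=josh, 4=ripple, 5=peter
    static final int[][] MODERN = {
        {1, 3, 2}, // marko -> vadas, josh, lop
        {},        // vadas (terminal)
        {},        // lop (terminal)
        {4, 2},    // josh -> ripple, lop
        {},        // ripple (terminal)
        {2}        // peter -> lop
    };

    /** alpha (damping factor) and epsilon are assumed parameter names. */
    static double[] pageRank(int[][] out, double alpha, double epsilon, int maxIterations) {
        int n = out.length; // vertex count is derived from the graph, not supplied by the caller
        double[] rank = new double[n];
        Arrays.fill(rank, 1.0 / n);
        for (int i = 0; i < maxIterations; i++) {
            double[] next = new double[n];
            for (int v = 0; v < n; v++) {
                // Only the fraction alpha of a vertex's energy follows its edges;
                // a terminal vertex has no edges, so none of its energy moves here.
                for (int target : out[v]) {
                    next[target] += alpha * rank[v] / out[v].length;
                }
            }
            // Everything not passed along an edge (the 1 - alpha share plus the full
            // energy of terminal vertices) is teleported evenly to all vertices.
            double teleport = (1.0 - Arrays.stream(next).sum()) / n;
            double delta = 0.0;
            for (int v = 0; v < n; v++) {
                next[v] += teleport;
                delta += Math.abs(next[v] - rank[v]);
            }
            rank = next;
            if (delta < epsilon) {
                break; // epsilon-based convergence instead of a fixed iteration count
            }
        }
        return rank;
    }

    public static void main(String[] args) {
        double[] rank = pageRank(MODERN, 0.85, 1e-12, 1000);
        System.out.println(Arrays.toString(rank));
        System.out.println("sum = " + Arrays.stream(rank).sum());
    }
}
```

Because all energy that does not travel along an edge is redistributed by teleportation, the ranks keep summing to 1 (up to floating-point error) after every round, which is the normalization property shown above.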
See: TINKERPOP-1783
IO Defaults
While 3.3.0 released Gryo 3.0 and GraphSON 3.0 and these versions were defaulted in a number of places, it seems that
some key defaults were missed. Specifically, calls to Graph.io(graphson())
and Graph.io(gryo())
were still using
the old versions. The defaults have now been changed to ensure 3.0 is properly referenced in those cases.
Upgrade Neo4j
See Neo4j’s 3.2 Upgrade FAQ for a complete guide on how to upgrade from the previous 2.3.3 version. Also note that many of the configuration settings have changed from Neo4j 2.x to 3.x.
In particular, these properties referenced in TinkerPop documentation and configuration were renamed:
Neo4j 2.3 (TinkerPop <= 3.3.0) | Neo4j 3.2 (TinkerPop 3.3.1) |
---|---|
node_auto_indexing | dbms.auto_index.nodes.enabled |
relationship_auto_indexing | dbms.auto_index.relationships.enabled |
ha.cluster_server | ha.host.coordination |
ha.server | ha.host.data |
Upgrading for Providers
Important
|
It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation. |
Graph Database Providers
IO Version Check
In the Graph.io()
method, providers are to bootstrap the Io
instance returned with their own custom serializers
typically provided through a custom IoRegistry
instance. Prior to this change it was not possible to easily determine
the version of Io
that was expected (nor was it especially necessary as TinkerPop didn’t have breaking format changes
between versions). As of 3.3.0 however, there could be IO test incompatibilities for some providers who need to
register a different IoRegistry
instance depending on the version the user wants.
To allow for that check, the Io
interface now has the following method:
public <V> boolean requiresVersion(final V version);
which allows the graph provider to check if a specific GryoVersion
or GraphSONVersion
is required. Using that
information, the provider could then assign the right IoRegistry
to match that.
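A provider-side check might look roughly like the following. This is a sketch under stated assumptions: FormatVersion and VersionAware are stand-ins declared only so that the example compiles without TinkerPop on the classpath (the real enums are GryoVersion and GraphSONVersion), and the registry names are hypothetical.

```java
final class RegistrySelectionSketch {

    /** Stand-in for the version enums (the real ones are GryoVersion and GraphSONVersion). */
    enum FormatVersion { V1_0, V3_0 }

    /** Stand-in exposing only the method quoted above. */
    interface VersionAware {
        <V> boolean requiresVersion(final V version);
    }

    /** Hypothetical provider logic: pick the registry that matches the requested version. */
    static String chooseRegistry(VersionAware io) {
        return io.requiresVersion(FormatVersion.V1_0) ? "MyIoRegistryV1d0" : "MyIoRegistryV3d0";
    }

    /** Test double that reports a single required version. */
    static VersionAware requiring(final Object wanted) {
        return new VersionAware() {
            @Override
            public <V> boolean requiresVersion(final V version) {
                return wanted.equals(version);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(chooseRegistry(requiring(FormatVersion.V1_0)));
        System.out.println(chooseRegistry(requiring(FormatVersion.V3_0)));
    }
}
```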
See: TINKERPOP-1767
TinkerPop 3.3.0
Release Date: August 21, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Packaged Data Files
TinkerPop has always packaged sample graphs with its zip distributions. As of 3.3.0, the distributions will only include Gryo 3.0, GraphSON 3.0 and GraphML (which is unversioned) files. Other versions are not included, but could obviously be generated using the IO API directly.
GraphTraversal Has-Methods Re-Organized
GraphTraversal.hasXXX()
, where XXX
is Id
, Label
, Key
, Value
, was faulty in that it relied on calling an
intermediate method for flattening Object[]
arguments and thus, yielding a non 1-to-1 correspondence between GraphTraversal
and Bytecode
. This has been remedied. Most users will not notice this change. Only users who apply
Java reflection to GraphTraversal
might encounter a minor problem.
See: TINKERPOP-1520
Changes to IO
Gryo 3.0
With Gryo, TinkerPop skips version 2.0 and goes right to 3.0 (to maintain better parity with GraphSON versioning). Gryo 3.0 fixes a number of inconsistencies with Gryo 1.0 and hopefully marks a point where Gryo is better versioned over time. Gryo 3.0 is not compatible with Gryo 1.0 and is now the default version of Gryo exposed by TinkerPop in Gremlin Server and IO.
It isn’t hard to switch back to use of Gryo 1.0 if necessary. Here is the approach for writing an entire graph:
Graph graph = TinkerFactory.createModern();
// Build a mapper pinned to Gryo 1.0 and reuse it for both writing and reading.
GryoMapper mapper = graph.io(IoCore.gryo()).mapper().version(GryoVersion.V1_0).create();
try (OutputStream os = new FileOutputStream("tinkerpop-modern.kryo")) {
    graph.io(IoCore.gryo()).writer().mapper(mapper).create().writeGraph(os, graph);
}
final Graph newGraph = TinkerGraph.open();
try (InputStream stream = new FileInputStream("tinkerpop-modern.kryo")) {
    newGraph.io(IoCore.gryo()).reader().mapper(mapper).create().readGraph(stream, newGraph);
}
Gremlin Server configurations don’t include Gryo 1.0 by default:
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }} # application/vnd.gremlin-v3.0+gryo
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }} # application/vnd.gremlin-v3.0+gryo-stringd
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }} # application/json
but adding an entry as follows will add it back:
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV1d0] }} # application/vnd.gremlin-v1.0+gryo
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }} # application/vnd.gremlin-v3.0+gryo
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }} # application/vnd.gremlin-v3.0+gryo-stringd
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }} # application/json
To use Gryo 1.0 with the Java driver, just specify the 1.0 serializer directly:
GryoMapper.Builder builder = GryoMapper.build().
version(GryoVersion.V1_0).
addRegistry(TinkerIoRegistryV1d0.instance());
Cluster cluster = Cluster.build().serializer(new GryoMessageSerializerV1d0(builder)).create();
See: TINKERPOP-1698
GraphSON 3.0
GraphSON 3.0 finishes what GraphSON 2.0 began by taking the extra step to include the following types: g:Map
,
g:List
and g:Set
. With these types it is now possible to get expected Gremlin results in GLVs just as one would
if using Java. This is especially true of the g:Map
type, which allows non-string key values, something not allowed
in regular JSON maps. This allows for common traversals like g.V().groupCount()
to work, where the traversal groups
on a Vertex
or some other complex object.
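The reason a typed map can carry non-string keys is that GraphSON 3.0 writes the entries of a g:Map as a flat list of alternating keys and values rather than as a JSON object. The sketch below shows only that flattening idea with plain Java collections. The class and method names are hypothetical, and a two-element list stands in for a complex key such as a Vertex.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class FlatMapSketch {

    /** Turns a map into [key1, value1, key2, value2, ...]; keys may be any object. */
    static List<Object> flatten(Map<?, ?> map) {
        List<Object> flat = new ArrayList<>();
        for (Map.Entry<?, ?> entry : map.entrySet()) {
            flat.add(entry.getKey());
            flat.add(entry.getValue());
        }
        return flat;
    }

    /** Rebuilds the map from the alternating key/value list. */
    static Map<Object, Object> unflatten(List<?> flat) {
        if (flat.size() % 2 != 0) {
            throw new IllegalArgumentException("expected alternating keys and values");
        }
        Map<Object, Object> map = new LinkedHashMap<>();
        for (int i = 0; i < flat.size(); i += 2) {
            map.put(flat.get(i), flat.get(i + 1));
        }
        return map;
    }

    public static void main(String[] args) {
        // Shaped like a groupCount() result keyed on complex objects.
        Map<Object, Object> counts = new LinkedHashMap<>();
        counts.put(Arrays.asList("vertex", 1), 3L);
        counts.put(Arrays.asList("vertex", 4), 2L);

        List<Object> wire = flatten(counts);
        System.out.println(wire);
        System.out.println(unflatten(wire).equals(counts));
    }
}
```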
Note that GraphSON 3.0 does not have an option to be without types. This was a feature of 1.0 and 2.0, but it is no longer supported. There is little point to such a feature as we see more movement toward GLVs, which require types, and less usage of scripts with custom parsing of results.
Both TinkerGraph and Gremlin Server have been defaulted to work with GraphSON 3.0. For TinkerGraph this means that the following commands:
Graph graph = TinkerFactory.createModern();
graph.io(IoCore.graphson()).writeGraph("tinkerpop-modern.json");
final Graph newGraph = TinkerGraph.open();
newGraph.io(IoCore.graphson()).readGraph("tinkerpop-modern.json");
will write and read GraphSON 3.0 format rather than 1.0. To use 1.0 (or 2.0 for that matter) format simply set the
version()
on the appropriate builder methods:
Graph graph = TinkerFactory.createModern();
GraphSONMapper mapper = graph.io(IoCore.graphson()).mapper().version(GraphSONVersion.V1_0).create();
try (OutputStream os = new FileOutputStream("tinkerpop-modern.json")) {
graph.io(IoCore.graphson()).writer().mapper(mapper).create().writeGraph(os, graph);
}
final Graph newGraph = TinkerGraph.open();
try (InputStream stream = new FileInputStream("tinkerpop-modern.json")) {
newGraph.io(IoCore.graphson()).reader().mapper(mapper).create().readGraph(stream, newGraph);
}
For Gremlin Server, this change means that the application/json
mime type no longer returns GraphSON 1.0 without
type embedding. Instead, Gremlin Server will return GraphSON 3.0 with partial types enabled (i.e. which is equivalent
to application/vnd.gremlin-v3.0+json
). The serializers
section of the sample Gremlin Server YAML files now typically
looks like this:
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }} # application/vnd.gremlin-v3.0+gryo
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }} # application/vnd.gremlin-v3.0+gryo-stringd
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }} # application/json
It is possible to bring back the original configuration for application/json
by changing the last entry as follows:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV3d0] }} # application/vnd.gremlin-v3.0+gryo
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0, config: { serializeResultToString: true }} # application/vnd.gremlin-v3.0+gryo-stringd
- { className: org.apache.tinkerpop.gremlin.driver.ser.GraphSONMessageSerializerV1d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV1d0] }} # application/json
Graphite and Ganglia
Graphite and Ganglia are no longer packaged with the Gremlin Server distribution. They are considered optional dependencies and therefore must be installed manually by the user.
SelectStep Defaults to Pop.last
SelectStep
and SelectOneStep
(select()
) were the only Scoping
steps that defaulted to Pop.mixed
as their labeled path
selection criteria. All other steps, like match()
, where()
and dedup()
, use Pop.last
. In order to better enable optimizations
around total Pop.last
traversals, the select()
-steps now default to Pop.last
. Most users will not notice a difference as
it is rare for repeated labels to be used in practice. However, formal backwards compatibility is possible as outlined below.
Assuming that x
is not a Pop
argument:
-
Change all
select(x,y,z)
calls to
selectV3d2(x,y,z)
calls.
-
Change all
select(x,y,z)
-step calls to
select(Pop.mixed,x,y,z)
.
If an explicit Pop
argument is provided, then no changes are required.
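The difference between the two defaults only shows up when a label is used more than once in a path. The following sketch models that with plain Java lists. The Pop enum here is a stand-in that mirrors the constant names of TinkerPop's Pop, while the select helper, its parallel-list representation of a path, and its null return for a missing label are simplifications invented for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PopSketch {

    /** Stand-in that mirrors the constant names of TinkerPop's Pop enum. */
    enum Pop { first, last, all, mixed }

    /**
     * Hypothetical helper: the path is modeled as two parallel lists, where
     * labels.get(i) is the single label attached to objects.get(i).
     */
    static Object select(Pop pop, String label, List<String> labels, List<Object> objects) {
        List<Object> hits = new ArrayList<>();
        for (int i = 0; i < labels.size(); i++) {
            if (label.equals(labels.get(i))) {
                hits.add(objects.get(i));
            }
        }
        if (hits.isEmpty()) {
            return null; // simplification for this sketch
        }
        switch (pop) {
            case first: return hits.get(0);
            case last:  return hits.get(hits.size() - 1);
            case all:   return hits;
            default:    return hits.size() == 1 ? hits.get(0) : hits; // mixed
        }
    }

    public static void main(String[] args) {
        // Models a path in which the label "a" was used twice and "b" once.
        List<String> labels = Arrays.asList("a", "a", "b");
        List<Object> objects = Arrays.<Object>asList("v[1]", "v[4]", "v[5]");

        System.out.println(select(Pop.mixed, "a", labels, objects)); // old default: both objects
        System.out.println(select(Pop.last, "a", labels, objects));  // new default: most recent only
        System.out.println(select(Pop.mixed, "b", labels, objects)); // single label: same either way
    }
}
```

When every label is used once, as in the last call, both settings return the same object, which is why most traversals are unaffected.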
See: TINKERPOP-1541
OptionalStep and Side-Effects
The optional()
-step was previously implemented using ChooseStep
. However, if the optional branch contained side-effects,
then unexpected behaviors can emerge. Thus, a potential backwards compatibility issue arises if side-effects were being
used in optional()
. However, the behavior would be unpredictable so this backwards incompatibility is desirable.
See: TINKERPOP-1506
Gremlin Console Initialization
It is no longer possible to initialize the Gremlin Console with a script without use of -i
. In other words, prior
versions allowed:
bin/gremlin.sh gremlin.groovy
Such a command must now be written as:
bin/gremlin.sh -i gremlin.groovy
See: TINKERPOP-1283, TINKERPOP-1651
GraphTraversal valueMap() Signature Updated
GraphTraversal.valueMap(includeTokens,propertyKeys…)
now returns a Map<Object,E>
to account for the presence of T.id
or T.label
if you pass true
to it.
See: TINKERPOP-1483
HADOOP_GREMLIN_LIBS and Spark
The TinkerPop reference documentation has always mentioned that the gremlin-spark
/lib
directory needed to be
added to the HADOOP_GREMLIN_LIBS
environment variable. In reality, that was not truly necessary. With Spark 1.x having
gremlin-spark
in HADOOP_GREMLIN_LIBS
hasn’t been a problem, but Spark 2.0 introduces a check for duplicate jars
on the path which will cause job initialization to fail. As a result, going forward with TinkerPop 3.3.0, the
gremlin-spark
lib
directory should not be included in HADOOP_GREMLIN_LIBS
.
Deprecation Removal
The following deprecated classes, methods or fields have been removed in this version:
-
giraph-gremlin
-
org.apache.tinkerpop.gremlin.giraph.groovy.plugin.GiraphGremlinPlugin
-
-
gremlin-console
-
org.apache.tinkerpop.gremlin.console.Console(String)
-
org.apache.tinkerpop.gremlin.console.ConsoleImportCustomizerProvider
-
org.apache.tinkerpop.gremlin.console.plugin.*
-
org.apache.tinkerpop.gremlin.console.groovy.plugin.DriverGremlinPlugin
-
org.apache.tinkerpop.gremlin.console.groovy.plugin.DriverRemoteAcceptor
-
org.apache.tinkerpop.gremlin.console.groovy.plugin.GephiGremlinPlugin
-
org.apache.tinkerpop.gremlin.console.groovy.plugin.UtilitiesGremlinPlugin
-
-
gremlin-core
-
org.apache.tinkerpop.gremlin.jsr223.CoreGremlinModule
-
org.apache.tinkerpop.gremlin.jsr223.CoreGremlinPlugin#INSTANCE
-
org.apache.tinkerpop.gremlin.jsr223.GremlinModule
-
org.apache.tinkerpop.gremlin.jsr223.SingleGremlinScriptEngineManager#getInstance()
-
org.apache.tinkerpop.gremlin.jsr223.GremlinScriptEngineManager#addModule(GremlinModule)
-
org.apache.tinkerpop.gremlin.jsr223.console.PluginAcceptor
-
org.apache.tinkerpop.gremlin.process.traversal.TraversalSource.Builder
-
org.apache.tinkerpop.gremlin.process.traversal.util.ConnectiveP(P…)
-
org.apache.tinkerpop.gremlin.process.traversal.util.AndP(P…)
-
org.apache.tinkerpop.gremlin.process.traversal.util.OrP(P…)
-
org.apache.tinkerpop.gremlin.process.traversal.util.TraversalScriptFunction
-
org.apache.tinkerpop.gremlin.process.traversal.util.TraversalScriptHelper
-
org.apache.tinkerpop.gremlin.process.traversal.Order.keyIncr
-
org.apache.tinkerpop.gremlin.process.traversal.Order.valueIncr
-
org.apache.tinkerpop.gremlin.process.traversal.Order.keyDecr
-
org.apache.tinkerpop.gremlin.process.traversal.Order.valueDecr
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.GraphTraversal.mapKeys()
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.GraphTraversal.mapValues()
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal#addV(Object…)
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal#addE(Direction, String, String, Object…)
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal#addOutE(String, String, Object…)
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal#addInV(String, String, Object…)
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal#selectV3d2()
-
org.apache.tinkerpop.gremlin.process.traversal.Bindings()
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource#withBindings(Bindings)
-
org.apache.tinkerpop.gremlin.structure.Transaction.submit(Function)
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal#sack(BiFunction,String)
-
org.apache.tinkerpop.gremlin.process.traversal.strategy.finalization.LazyBarrierStrategy
-
org.apache.tinkerpop.gremlin.process.traversal.TraversalSideEffects
(various methods) -
org.apache.tinkerpop.gremlin.process.computer.traversal.step.VertexComputing#generateComputer(Graph)
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal#groupV3d0(String)
-
org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversal#groupV3d0()
-
org.apache.tinkerpop.gremlin.structure.Graph.Features.VertexPropertyFeatures#supportsAddProperty()
-
org.apache.tinkerpop.gremlin.structure.Graph.Features.VertexPropertyFeatures#FEATURE_ADD_PROPERTY
-
org.apache.tinkerpop.gremlin.structure.Graph.OptIn#SUITE_GROOVY_PROCESS_STANDARD
-
org.apache.tinkerpop.gremlin.structure.Graph.OptIn#SUITE_GROOVY_PROCESS_COMPUTER
-
org.apache.tinkerpop.gremlin.structure.Graph.OptIn#SUITE_GROOVY_ENVIRONMENT
-
org.apache.tinkerpop.gremlin.structure.Graph.OptIn#SUITE_GROOVY_ENVIRONMENT_INTEGRATE
-
org.apache.tinkerpop.gremlin.structure.io.Io.Builder#registry(IoRegistry)
-
org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONMapper.Builder#embedTypes(boolean)
-
org.apache.tinkerpop.gremlin.structure.util.detached.DetachedEdge(Object,String,Map,Pair,Pair)
-
org.apache.tinkerpop.gremlin.util.CoreImports
-
org.apache.tinkerpop.gremlin.util.ScriptEngineCache
-
org.apache.tinkerpop.gremlin.process.computer.util.ConfigurationTraversal
-
-
gremlin-driver
-
org.apache.tinkerpop.gremlin.driver.Cluster$Builder#reconnectIntialDelay(int)
-
org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0(GryoMapper)
-
org.apache.tinkerpop.gremlin.driver.ser.AbstractGraphSONMessageSerializerV2d0#TOKEN_USE_MAPPER_FROM_GRAPH
-
org.apache.tinkerpop.gremlin.driver.ser.AbstractGryoSONMessageSerializerV2d0#TOKEN_USE_MAPPER_FROM_GRAPH
-
-
gremlin-groovy
-
org.apache.tinkerpop.gremlin.groovy.AbstractImportCustomizerProvider
-
org.apache.tinkerpop.gremlin.groovy.CompilerCustomizerProvider
-
org.apache.tinkerpop.gremlin.groovy.DefaultImportCustomizerProvider
-
org.apache.tinkerpop.gremlin.groovy.EmptyImportCustomizerProvider
-
org.apache.tinkerpop.gremlin.groovy.ImportCustomizerProvider
-
org.apache.tinkerpop.gremlin.groovy.NoImportCustomizerProvider
-
org.apache.tinkerpop.gremlin.groovy.engine.ConcurrentBindings
-
org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor#build(String,List,List,List,Map)
-
org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor#getScriptEngines()
-
org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor#getGlobalBindings()
-
org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.Builder#enabledPlugins(Set)
-
org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.Builder#addEngineSettings(String,List,List,List,Map)
-
org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.Builder#engineSettings(Map)
-
org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.Builder#use(List)
-
org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines
-
org.apache.tinkerpop.gremlin.groovy.function.*
-
org.apache.tinkerpop.gremlin.groovy.plugin.*
-
org.apache.tinkerpop.gremlin.groovy.plugin.credential.*
-
org.apache.tinkerpop.gremlin.groovy.jsr223.DependencyManager
-
org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine(ImportCustomizerProvider)
-
org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine(CompilerCustomizerProvider)
-
org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine#plugins()
-
org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptExecutor
-
org.apache.tinkerpop.gremlin.groovy.jsr223.ScriptEnginePluginAcceptor
-
org.apache.tinkerpop.gremlin.groovy.jsr223.customizer.SandboxExtension
-
org.apache.tinkerpop.gremlin.groovy.jsr223.customizer.*
-
org.apache.tinkerpop.gremlin.groovy.util.DependencyGrabber#deleteDependenciesFromPath(org.apache.tinkerpop.gremlin.groovy.plugin.Artifact)
-
org.apache.tinkerpop.gremlin.groovy.util.DependencyGrabber#copyDependenciesToPath(org.apache.tinkerpop.gremlin.groovy.plugin.Artifact)
-
-
gremlin-python
-
org.apache.tinkerpop.gremlin.python.jsr223.GremlinJythonScriptEngine#()
-
-
gremlin-server
-
org.apache.tinkerpop.gremlin.server.GremlinServer(ServerGremlinExecutor)
-
org.apache.tinkerpop.gremlin.server.Settings#plugins
-
org.apache.tinkerpop.gremlin.server.auth.AllowAllAuthenticator.newSaslNegotiator()
-
org.apache.tinkerpop.gremlin.server.auth.Authenticator.newSaslNegotiator()
-
org.apache.tinkerpop.gremlin.server.auth.Krb5Authenticator.newSaslNegotiator()
-
org.apache.tinkerpop.gremlin.server.auth.SimpleAuthenticator.newSaslNegotiator()
-
org.apache.tinkerpop.gremlin.server.handler.IteratorHandler
-
org.apache.tinkerpop.gremlin.server.handler.NioGremlinResponseEncoder
-
org.apache.tinkerpop.gremlin.server.handler.WsGremlinResponseEncoder
-
org.apache.tinkerpop.gremlin.server.handler.OpSelectorHandler.errorMeter
-
org.apache.tinkerpop.gremlin.server.op.control.*
-
org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor.errorMeter
-
org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor.validBindingName
-
org.apache.tinkerpop.gremlin.server.op.session.Session.kill()
-
org.apache.tinkerpop.gremlin.server.op.session.Session.manualkill()
-
-
hadoop-gremlin
-
org.apache.tinkerpop.gremlin.hadoop.Constants#GREMLIN_HADOOP_GRAPH_INPUT_FORMAT
-
org.apache.tinkerpop.gremlin.hadoop.Constants#GREMLIN_HADOOP_GRAPH_OUTPUT_FORMAT
-
org.apache.tinkerpop.gremlin.hadoop.Constants#GREMLIN_HADOOP_GRAPH_INPUT_FORMAT_HAS_EDGES
-
org.apache.tinkerpop.gremlin.hadoop.Constants#GREMLIN_HADOOP_GRAPH_OUTPUT_FORMAT_HAS_EDGES
-
org.apache.tinkerpop.gremlin.hadoop.Constants#GREMLIN_SPARK_GRAPH_INPUT_RDD
-
org.apache.tinkerpop.gremlin.hadoop.Constants#GREMLIN_SPARK_GRAPH_OUTPUT_RDD
-
-
spark-gremlin
-
org.apache.tinkerpop.gremlin.spark.groovy.plugin.SparkGremlinPlugin
-
-
tinkergraph-gremlin
-
org.apache.tinkerpop.gremlin.tinkergraph.groovy.plugin.TinkerGraphGremlinPlugin
-
org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerGraph#CONFIG_*
-
org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistry
-
org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV1d0#getInstance()
-
org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV2d0#getInstance()
-
Please see the javadoc deprecation notes or upgrade documentation specific to when the deprecation took place to understand how to resolve this breaking change.
See: TINKERPOP-832, TINKERPOP-833, TINKERPOP-834, TINKERPOP-999, TINKERPOP-1010, TINKERPOP-1028, TINKERPOP-1040, TINKERPOP-1046, TINKERPOP-1049, TINKERPOP-1142, TINKERPOP-1169, TINKERPOP-1171, TINKERPOP-1275, TINKERPOP-1283, TINKERPOP-1289, TINKERPOP-1291, TINKERPOP-1420, TINKERPOP-1421, TINKERPOP-1465, TINKERPOP-1481, TINKERPOP-1526, TINKERPOP-1603, TINKERPOP-1612, TINKERPOP-1622, TINKERPOP-1651, TINKERPOP-1694, TINKERPOP-1700, TINKERPOP-1706, TINKERPOP-1721, TINKERPOP-1719, TINKERPOP-1720, TINKERPOP-880, TINKERPOP-1170, TINKERPOP-1729
Gremlin-server.sh and Init Scripts
gremlin-server.sh
is now also an init script and can no longer be started without parameters. To start it in the
foreground with defaults like previous usage, please use the console
parameter. Also, gremlin-server.sh
will
continue to start in the foreground when provided a yaml configuration file.
Instructions for installing Gremlin Server as a service have been added to the Reference Documentation - As A Service.
The switch name has changed for installing dependencies. -i
has been deprecated and replaced by install
.
Removal of useMapperFromGraph
The useMapperFromGraph
serialization configuration option was used to allow the IO configurations of a specific
graph to be assigned to a specific serializer. This feature has been removed completely now. Please use the
ioRegistries
configuration option to add one or more specific Graph
serialization capabilities to a serializer.
serializers:
- { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV1d0, config: { ioRegistries: [org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerIoRegistryV1d0] }} # application/vnd.gremlin-v1.0+gryo
See: TINKERPOP-1699
Gremlin-server.bat
The switch name has changed for installing dependencies. -i
has been deprecated and replaced by install
.
SparkGraphComputer GryoRegistrator
Historically, SparkGraphComputer
has used GryoSerializer
to handle the serialization of objects in Spark. The reason
this exists is because TinkerPop uses a shaded version of Kryo and thus, couldn’t use the standard KryoSerializer
-model
provided by Spark. However, a "shim model" was created which allows for the shaded and unshaded versions of Kryo to
interact with one another. To this end, KryoSerializer
can now be used with a GryoRegistrator
. The properties file
for a SparkGraphComputer
now looks as follows:
spark.serializer=org.apache.spark.serializer.KryoSerializer
spark.kryo.registrator=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoRegistrator
If the old GryoSerializer
model is desired, then the properties file should simply look as before:
spark.serializer=org.apache.tinkerpop.gremlin.spark.structure.io.gryo.GryoSerializer
See: TINKERPOP-1389
ScriptInputFormat
The API for the script provided to a ScriptInputFormat
has changed slightly. The signature for parse(line, factory)
is now simply parse(line)
. The inclusion of factory
was deprecated in 3.1.2. Instead of using the factory to
get the StarGraph, there is a graph variable in the global context of the script. Simply use that directly in
the script.
Upgrading for Providers
Important
|
It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation. |
Graph System Providers
GremlinPlugin
The previously deprecated GremlinPlugin
system has been removed. The old GremlinPlugin
interface formerly resided
in the org.apache.tinkerpop.gremlin.groovy.plugin
package of gremlin-groovy
. This interface was replaced by an
interface of the same name in 3.2.4, which now resides in the org.apache.tinkerpop.gremlin.jsr223
package in
gremlin-core
. Obviously, existing plugins will need to be updated to use this new interface.
The plugin model has changed slightly to be more generic and not specifically bound to Groovy based script engines.
Under the new model, the plugin simply returns Customizer
instances that can be applied generically to any
ScriptEngine
or specifically to a particular ScriptEngine
. More details can be found in the
Provider Documentation
Graph Database Providers
Test Suite Removal
A number of test suites that were previously deprecated have been removed which should reduce the burden on graph
providers who are implementing TinkerPop. Test suites related to performance based on junit-benchmarks
have been
removed as have the suites in gremlin-groovy-test
(in fact, this entire module has been removed). Specifically,
providers should be concerned with breaking changes related to the removal of:
-
StructurePerformanceSuite
-
ProcessPerformanceSuite
-
GroovyEnvironmentPerformanceSuite
-
GroovyProcessStandardSuite
-
GroovyProcessComputerSuite
-
GroovyEnvironmentSuite
-
GroovyEnvironmentIntegrateSuite
Those graph providers who relied on these tests should simply remove them from their respective test suites. Beware of
OptOut
annotations that reference tests in these suites as test failure will occur if those references are not
removed.
See: TINKERPOP-1235, TINKERPOP-1612
TransactionException
The AbstractTransaction.TransactionException
class is now just TransactionException
which extends RuntimeException
rather than Exception
. Providers should consider using this exception to wrap their own on calls to
Transaction.commit()
or Transaction.rollback()
. By throwing this exception, the TinkerPop stack can better respond
to transaction problems and it allows for more common, generalized error handling for users.
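The wrapping pattern could look like the sketch below. So that it compiles without TinkerPop on the classpath, it declares its own stand-in TransactionException with an assumed (message, cause) constructor. The flushToStorage and commit methods are hypothetical provider code and do not belong to any TinkerPop interface.

```java
import java.io.IOException;

final class TransactionWrapSketch {

    /** Stand-in for TinkerPop's TransactionException; the constructor shape is an assumption. */
    static final class TransactionException extends RuntimeException {
        TransactionException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical storage call that fails with a provider-specific checked exception. */
    static void flushToStorage(boolean fail) throws IOException {
        if (fail) {
            throw new IOException("disk full");
        }
    }

    /** Hypothetical commit: the provider's own exception is wrapped, not leaked. */
    static void commit(boolean fail) {
        try {
            flushToStorage(fail);
        } catch (IOException e) {
            throw new TransactionException("commit failed", e);
        }
    }

    public static void main(String[] args) {
        try {
            commit(true);
        } catch (TransactionException e) {
            // Callers handle one generic type and can still inspect the original cause.
            System.out.println(e.getMessage() + " caused by: " + e.getCause().getMessage());
        }
    }
}
```

Since the exception is unchecked, callers are not forced to declare or catch it, but code that wants uniform handling across providers can catch this single type.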
See: TINKERPOP-1004
Driver Providers
SASL Byte Array
Gremlin Server no longer supports accepting a byte array for the value passed to the "sasl" parameter in authentication messages. It only accepts a Base64 encoded string.
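For driver authors, producing the string form is a small change. The sketch below assumes the PLAIN mechanism, whose raw message is an empty authorization identity, a NUL byte, the username, another NUL byte and the password. The class name, method name and credentials are placeholders.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class SaslPlainSketch {

    /** Builds the PLAIN message bytes and returns them as a Base64 string. */
    static String encodePlain(String username, String password) {
        byte[] raw = ("\0" + username + "\0" + password).getBytes(StandardCharsets.UTF_8);
        // Send this string, not the raw byte array, as the "sasl" value.
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        System.out.println(encodePlain("stephen", "password"));
    }
}
```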
See: TINKERPOP-1603
TinkerPop 3.2.0
Nine Inch Gremlins
TinkerPop 3.2.11
Release Date: January 2, 2019
Please see the changelog for a complete list of all the modifications that are part of this release.
TinkerPop 3.2.10
Release Date: October 15, 2018
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
SASL in Gremlin-Javascript
The Gremlin Javascript Driver now supports SASL Plain Text authentication against a Gremlin Server.
SSL Security
TinkerPop improves its security posture by removing insecure defaults and adding forward-looking standards support.
Gremlin Server no longer supports automatically creating self-signed certificates. Self-signed certificates can still be created manually outside of Gremlin Server. If ssl is enabled, a key store must be configured.
Cluster client no longer trusts all certs by default as this is an insecure configuration. Instead, if no trust store is configured, Cluster will use the default CA certs. To revert to the previous behavior and accept all certs, it must be explicitly configured.
This release introduces JKS and PKCS12 support. JKS is the legacy Java Key Store. PKCS12 has better cross-platform support and is gaining in adoption. Be aware that JKS is the default on Java 8. Java 9 and higher use PKCS12 as the default. Both Java keytool and OpenSSL tools can create, read, update PKCS12 files.
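To check which store type a particular JVM uses by default, the standard java.security.KeyStore API can be queried directly. This sketch concerns only JDK behavior and says nothing about Gremlin Server settings. The class and helper names are hypothetical, and the stores it creates are empty and held in memory.

```java
import java.security.KeyStore;

final class KeyStoreTypeSketch {

    /** Returns the JVM's default key store type. */
    static String defaultType() {
        return KeyStore.getDefaultType();
    }

    /** Creates an empty in-memory store of the given type and returns its entry count. */
    static int emptyStoreSize(String type) {
        try {
            KeyStore store = KeyStore.getInstance(type);
            store.load(null, null); // a null stream initializes an empty store
            return store.size();
        } catch (Exception e) {
            throw new IllegalStateException("could not create " + type + " key store", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("default type: " + defaultType());
        System.out.println("JKS entries: " + emptyStoreSize("JKS"));
        System.out.println("PKCS12 entries: " + emptyStoreSize("PKCS12"));
    }
}
```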
Other new features include specifying SSL protocols and cipher suites.
The packaged *-secure.yaml
files now restrict the protocol to TLSv1.2
by default.
PEM-based configurations are deprecated and may be removed in a future release.
Bulk Import and Export
TinkerPop has provided some general methods for importing and exporting data, but more and more graph providers are
producing their own bulk import/export facilities and they are more efficient and easier to use than TinkerPop’s
methods. As a result, TinkerPop will now refer users to the bulk import/export features of individual graph providers
and as such, has deprecated BulkLoaderVertexProgram
as of this release.
As part of this change, the BulkDumperVertexProgram
has been renamed to CloneVertexProgram
with the former being
deprecated. CloneVertexProgram
is more aptly named, as it essentially copies a graph from a graph InputFormat
to a graph OutputFormat
.
Docker Images
Docker images are now available on Docker Hub for Gremlin Console and Gremlin Server.
$ docker run -it tinkerpop/gremlin-console:3.7.4-SNAPSHOT
$ docker run tinkerpop/gremlin-server:3.7.4-SNAPSHOT
TimedInterruptCustomizerProvider
In Gremlin Server, it is best not to use 'TimedInterruptCustomizerProvider' because it can compete with the 'scriptEvaluationTimeout' setting and produce a different error path. Simply rely on 'scriptEvaluationTimeout' as it covers both script evaluation time and result iteration time.
See: TINKERPOP-1778
TinkerFactory.createGratefulDead()
The Grateful Dead dataset has been with TinkerPop since the early days of 1.x. It has always been available as a
packaged dataset that needed to be loaded through the various IO options available, while other toy graphs had the
benefit of TinkerFactory
to help get them bootstrapped. For 3.2.10, Grateful Dead is now more conveniently loaded
via the same method as the other toy graphs with TinkerFactory.createGratefulDead()
.
Gremlin Javascript Script Submission
Gremlin Javascript can now submit scripts, with optional bindings, using the Client
class:
const gremlin = require('gremlin');
const client = new gremlin.driver.Client('ws://localhost:8182/gremlin', { traversalSource: 'g' });

client.submit('g.V().tail()')
    .then(result => {
        console.log(result.length);
        console.log(result.toArray()[0]);
    });

client.submit('g.V(vid)', { vid: 1 })
    .then(result => {
        console.log(result.length);
        // Get the first item
        console.log(result.first());
    });
and also allows translation of bytecode steps into script:
const gremlin = require('gremlin');
const graph = new gremlin.process.Graph();
const client = new gremlin.driver.Client('ws://localhost:8182/gremlin', { traversalSource: 'g' });
const translator = new gremlin.process.Translator('g');

const g = graph.traversal();
const script = translator.translate(g.V().tail().getBytecode());

client.submit(script)
    .then(result => {
        console.log(result.length);
        console.log(result.first());
    });
Upgrading for Providers
Graph Database Providers
Bulk Import and Export
As noted in the user section, TinkerPop has deprecated its bulk loading feature in BulkLoaderVertexProgram
and will
refer TinkerPop users who need bulk import/export capabilities to the native tools of the graph database they have
chosen. If a graph database provider does not have any bulk loading tools it can choose to build graph InputFormat
and OutputFormat
implementations which can be used by CloneVertexProgram
(formerly BulkDumperVertexProgram
) as
an easy way to get such a feature.
TinkerPop 3.2.9
Release Date: May 8, 2018
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Lambda Construction
It was realized quite shortly after release of 3.2.8 that there was a bug in construction of Lambda
instances:
gremlin> org.apache.tinkerpop.gremlin.util.function.Lambda.function("{ it.get() }")
(class: org/apache/tinkerpop/gremlin/util/function/Lambda$function, method: callStatic signature: (Ljava/lang/Class;[Ljava/lang/Object;)Ljava/lang/Object;) Illegal type in constant pool
Type ':help' or ':h' for help.
Display stack trace? [yN]n
The problem was related to a bug in Groovy 2.4.14 and was fixed in 2.4.15.
See: TINKERPOP-1953
TinkerPop 3.2.8
Release Date: April 2, 2018
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Improved Connection Monitoring
Gremlin Server now has two new settings: idleConnectionTimeout
and keepAliveInterval
. The keepAliveInterval
tells
Gremlin Server how long it should wait between writes to a client before it issues a "ping" to that client to see if
it is still present. The idleConnectionTimeout
represents how long Gremlin Server should wait between requests from
a client before it closes the connection on the server side. By default, these two configurations are set to zero,
meaning that they are both disabled.
This change should help to alleviate issues where connections are left open on the server longer than they should be by clients that might mysteriously disappear without properly closing their connections.
See: TINKERPOP-1726
Gremlin.Net Lambdas
Gremlin.Net now has a Lambda
class that can be used to construct Groovy or Java lambdas which will be evaluated on the
server.
Gremlin.Net Tokens Improved
The various Gremlin tokens (e.g. T
, Order
, Operator
, etc.) that were implemented as Enums before in Gremlin.Net
are now implemented as classes. This mainly allows them to implement interfaces which their Java counterparts already
did. T
for example now implements the new interface IFunction
which simply mirrors its Java counterpart Function
.
Steps that expect objects for those interfaces as arguments now explicitly use the interface. Before, they used just
object
as the type for these arguments which made it hard for users to know what kind of object
they can use.
However, usage of these tokens themselves shouldn’t change at all (e.g. T.Id
is still T.Id
).
See: TINKERPOP-1901
Gremlin.Net: Traversal Predicate Classes Merged
Gremlin.Net used two classes for traversal predicates: P
and TraversalPredicate
. Steps that worked with traversal
predicates expected objects of type TraversalPredicate
, but they were constructed from the P
class
(e.g. P.Gt(1)
returned a TraversalPredicate
). Merging these two classes into the P
class should avoid unnecessary
confusion. Most users should not notice this change as predicates can still be constructed exactly as before, e.g.,
P.Gt(1).And(P.Lt(3))
still works without any modifications.
Only users that implemented their own predicates and used TraversalPredicate
as the base class need to change their
implementation to now use P
as the new base class.
See: TINKERPOP-1919
Upgrading for Providers
Graph System Providers
Kitchen Sink Test Graph
The "Kitchen Sink" test graph has been added to the gremlin-test
module. It contains (or will contain) various
disconnected subgraphs that offer unique structures (e.g. a self-loop) for specific test cases. Graph systems that
use the test suite should not have to make any changes to account for this new graph unless that system performs some
form of special pre-initialization in preparation for loading (e.g. requires a schema) or loads
the graph test data outside of the standard method that TinkerPop provides.
See: TINKERPOP-1877
TinkerPop 3.2.7
Release Date: December 17, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin-Python Core Types
With the addition of UUID
, Date
, and Timestamp
, Gremlin-Python now implements serializers for all core GraphSON types. Users
that were using other types to represent this data can now use the Python classes datetime.datetime
and uuid.UUID in GLV traversals.
Since Python does not support a native Timestamp
object, Gremlin-Python now offers a dummy class Timestamp
, which allows
users to wrap a float and submit it to the Gremlin Server as a Timestamp
GraphSON type. Timestamp
can be found in
gremlin_python.statics
.
See: TINKERPOP-1807
EventStrategy Detachment
EventStrategy
forced detachment of mutated elements prior to raising them in events. While this was a desired
outcome, it may not have always fit every use case. For example, a user may have wanted a reference element or the
actual element itself. As a result, EventStrategy
has changed to allow it to be constructed with a detach()
option, where it is possible to specify any of the following: null
for no detachment, DetachedFactory
for the
original behavior, and ReferenceFactory
for detachment that returns reference elements.
See: TINKERPOP-1829
Embedded Remote Connection
As Gremlin Language Variants (GLVs) expand their usage and use of withRemote()
becomes more common, the need to mock
the "remote" in unit tests increases. To simplify mocking in Java, the new EmbeddedRemoteConnection
provides a
simple way to provide a "remote" that is actually local to the same JVM.
See: TINKERPOP-1756
DSL Type Specification
Prior to this version, the Java annotation processor for Gremlin DSLs has tried to infer the appropriate type specifications when generating anonymous methods. It largely performed this inference on simple conventions in the DSL method’s template specification and there were times where it would fail. For example, a method like this:
public default GraphTraversal<S, E> person() {
return hasLabel("person");
}
would generate an anonymous method like:
public static <S> SocialGraphTraversal<S, E> person() {
return hasLabel("person");
}
and, of course, generates a compile error because E
is not recognized as a symbol. The preferred generation would likely
be:
public static <S> SocialGraphTraversal<S, S> person() {
return hasLabel("person");
}
To remedy this situation, a new annotation has been added which allows the user to control the type specifications more directly providing a way to avoid/override the inference system:
@GremlinDsl.AnonymousMethod(returnTypeParameters = {"A", "A"}, methodTypeParameters = {"A"})
public default GraphTraversal<S, E> person() {
return hasLabel("person");
}
which will then generate:
public static <A> SocialGraphTraversal<A, A> person() {
return hasLabel("person");
}
See: TINKERPOP-1791
Specify a Cluster Object
The :remote connect
command can now take a pre-defined Cluster
object as its argument as opposed to a YAML
configuration file.
gremlin> cluster = Cluster.open()
==>localhost/127.0.0.1:8182
gremlin> :remote connect tinkerpop.server cluster
==>Configured localhost/127.0.0.1:8182
See: TINKERPOP-1787
Remote Traversal Timeout
There was limited support for "timeouts" with remote traversals (i.e. those traversals executed using the withRemote()
option) prior to 3.2.7. Remote traversals will now interrupt on the server using the scriptEvaluationTimeout
setting in the same way that normal script evaluations would. As a reminder, interruptions for traversals are always
considered "attempts to interrupt" and may not always succeed (a graph database implementation might not respect the
interruption, for example).
See: TINKERPOP-1770
Modifications to match()
The match()
-step has been generalized to support the local scoping of all barrier steps, not just reducing barrier steps.
Previously, the order().limit()
clause would have worked globally yielding:
gremlin> g.V().match(
......1> __.as('a').outE('created').order().by('weight',decr).limit(1).inV().as('b'),
......2> __.as('b').has('lang','java')
......3> ).select('a','b').by('name')
==>[a:marko,b:lop]
However, now, order()
(and all other barriers) are treated as local computations to the pattern and thus, the result set is:
gremlin> g.V().match(
......1> __.as('a').outE('created').order().by('weight',decr).limit(1).inV().as('b'),
......2> __.as('b').has('lang','java')
......3> ).select('a','b').by('name')
==>[a:marko,b:lop]
==>[a:josh,b:ripple]
==>[a:peter,b:lop]
Note that this is not a severe breaking change, as all of the reducing barriers already behaved in this manner.
This includes steps like count()
, min()
, max()
, sum()
, group()
, groupCount()
, etc. This update has now
generalized this behavior to all barriers and thus, adds aggregate()
, dedup()
, range()
, limit()
, tail()
, and order()
to the list of locally computed clauses.
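The local scoping described above can be modeled outside of Gremlin. The following plain-Java sketch is not TinkerPop code: the Created class is a hypothetical stand-in, and the edge data is assumed to mirror the "created" edges of the "modern" toy graph. It keeps the heaviest "created" edge per person, which corresponds to evaluating order().by('weight',decr).limit(1) once for each 'a' rather than once over all edges.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LocalBarrierSketch {

    // Hypothetical stand-in for a "created" edge: creator, software, weight.
    static final class Created {
        final String person;
        final String software;
        final double weight;

        Created(String person, String software, double weight) {
            this.person = person;
            this.software = software;
            this.weight = weight;
        }
    }

    // Assumed to mirror the "created" edges of the "modern" toy graph.
    static List<Created> modernCreatedEdges() {
        List<Created> edges = new ArrayList<>();
        edges.add(new Created("marko", "lop", 0.4));
        edges.add(new Created("josh", "ripple", 1.0));
        edges.add(new Created("josh", "lop", 0.4));
        edges.add(new Created("peter", "lop", 0.2));
        return edges;
    }

    // Local scoping: the "order by weight descending, limit 1" logic is applied
    // separately for each person instead of once over the whole edge list.
    static Map<String, String> heaviestPerPerson(List<Created> edges) {
        Map<String, Created> best = new LinkedHashMap<>();
        for (Created edge : edges) {
            Created current = best.get(edge.person);
            if (current == null || edge.weight > current.weight) {
                best.put(edge.person, edge);
            }
        }
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, Created> entry : best.entrySet()) {
            result.put(entry.getKey(), entry.getValue().software);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {marko=lop, josh=ripple, peter=lop}
        System.out.println(heaviestPerPerson(modernCreatedEdges()));
    }
}
```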
See: TINKERPOP-1764
Clone a Graph
In gremlin-test
there is a new GraphHelper
class that has a cloneElements()
method. It will clone elements from
the first graph to the second - GraphHelper.cloneElements(Graph original, Graph clone)
. This helper method is
primarily intended for use in tests.
MutationListener Changes
The MutationListener
has a method called vertexPropertyChanged
which gathered callbacks when a property on a vertex
was modified. The method had an incorrect signature, though, using Property
instead of VertexProperty
. The old method
that used Property
has now been deprecated and a new method added that uses VertexProperty
. This new method has a
default implementation that calls the old method, so this change should not cause breaks in compilation on upgrade.
Internally, TinkerPop no longer calls the old method except by way of that proxy. Users who have MutationListener
implementations can simply add the new method and override its behavior. The old method can thus be ignored completely.
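The compatibility mechanism described above relies on a Java default method that forwards to the deprecated overload. The following self-contained sketch illustrates the pattern only; Property, VertexProperty, and Listener are heavily simplified, hypothetical stand-ins and not the real TinkerPop interfaces or signatures.

```java
import java.util.ArrayList;
import java.util.List;

class MutationListenerSketch {

    // Hypothetical, simplified stand-ins for the TinkerPop types.
    interface Property {
        String key();
    }

    interface VertexProperty extends Property {
    }

    interface Listener {
        // Old signature, now deprecated.
        @Deprecated
        default void vertexPropertyChanged(String vertexId, Property oldValue, Object newValue) {
        }

        // New signature: by default it forwards to the old one, so listeners
        // written before the upgrade still compile and still get notified.
        default void vertexPropertyChanged(String vertexId, VertexProperty oldValue, Object newValue) {
            vertexPropertyChanged(vertexId, (Property) oldValue, newValue);
        }
    }

    // A listener written before the upgrade: it only overrides the old method.
    static final class LegacyListener implements Listener {
        final List<String> seen = new ArrayList<>();

        @Override
        public void vertexPropertyChanged(String vertexId, Property oldValue, Object newValue) {
            seen.add(vertexId + ":" + oldValue.key() + "->" + newValue);
        }
    }

    // Invokes the new overload on a legacy listener and returns what it recorded.
    static List<String> notifyLegacyListener() {
        LegacyListener listener = new LegacyListener();
        VertexProperty name = () -> "name";
        listener.vertexPropertyChanged("v1", name, "marko"); // resolves to the new overload
        return listener.seen;
    }

    public static void main(String[] args) {
        System.out.println(notifyLegacyListener()); // prints [v1:name->marko]
    }
}
```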
See: TINKERPOP-1798
Upgrading for Providers
Direction.BOTH Requires Duplication of Self-Edges
Prior to this release, there was no semantic check to determine whether a self-edge (e.g. e[1][2-self→2]
) would be returned
twice on a BOTH
. The semantics have been specified now in the test suite where the edge should be returned twice as it
is both an incoming edge and an outgoing edge.
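As a rough illustration of that semantics, the sketch below models a graph as a flat list of edges. The Edge class and the bothEdgeIds helper are hypothetical and exist only for this example. An edge whose out-vertex and in-vertex are the same vertex is collected once as an outgoing edge and once as an incoming edge.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class SelfEdgeSketch {

    // Hypothetical minimal edge: an id plus out-vertex and in-vertex ids.
    static final class Edge {
        final int id;
        final int outV;
        final int inV;

        Edge(int id, int outV, int inV) {
            this.id = id;
            this.outV = outV;
            this.inV = inV;
        }
    }

    // Collects edge ids for Direction.BOTH: outgoing edges first, then incoming.
    // A self-edge matches both loops and therefore appears twice.
    static List<Integer> bothEdgeIds(List<Edge> edges, int vertexId) {
        List<Integer> ids = new ArrayList<>();
        for (Edge edge : edges) {
            if (edge.outV == vertexId) {
                ids.add(edge.id);
            }
        }
        for (Edge edge : edges) {
            if (edge.inV == vertexId) {
                ids.add(edge.id);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Edge> edges = Collections.singletonList(new Edge(1, 2, 2)); // e[1][2-self->2]
        System.out.println(bothEdgeIds(edges, 2)); // prints [1, 1]
    }
}
```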
See: TINKERPOP-1821
TinkerPop 3.2.6
Release Date: August 21, 2017
Upgrading for Users
Please see the changelog for a complete list of all the modifications that are part of this release.
Deprecated useMapperFromGraph
The useMapperFromGraph
configuration option for the Gremlin Server serializers has been deprecated. Change
configuration files to use the ioRegistries
option instead. The ioRegistries
option is not a new feature, but
it has not been promoted as the primary way to add IoRegistry
instances to serializers.
See: TINKERPOP-1694
WsAndHttpChannelizer
The WsAndHttpChannelizer
has been added to allow for processing both WebSocket and HTTP requests on the same
port and Gremlin Server. The SaslAndHttpBasicAuthenticationHandler
has also been added to service
authentication for both protocols in conjunction with the SimpleAuthenticator
.
See: TINKERPOP-915
Upgrading for Providers
ReferenceVertex Label
ReferenceVertex.label()
was hard coded to return EMPTY_STRING
. At some point, ReferenceElements
were supposed to
return labels and ReferenceVertex
was never updated as such. Note that ReferenceEdge
and ReferenceVertexProperty
work as expected. However, given a general change at ReferenceElement
, the Gryo serialization of ReferenceXXX
is
different. If the vertex does not have a label, Vertex.DEFAULT_LABEL
is assumed.
See: TINKERPOP-1789
TinkerPop 3.2.5
Release Date: June 12, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
DSL Support
It has always been possible to construct Domain Specific Languages (DSLs) with Gremlin, but the approach has required a somewhat deep understanding of the TinkerPop code base and it is not something that has had a recommended method for implementation. With this release, TinkerPop simplifies DSL development and provides the best practices for their implementation.
// standard Gremlin
g.V().hasLabel('person').
where(outE("created").count().is(P.gte(2))).count()
// the same traversal as above written as a DSL
social.persons().where(createdAtLeast(2)).count()
GraphSON Path Serialization
Serialization of Path
with GraphSON was inconsistent with Gryo in that all the properties on any elements of
the Path
were being included. With Gryo that, correctly, was not happening as that could be extraordinarily
expensive. GraphSON serialization has now been modified to properly not include properties. That change can cause
breaks in application code if that application code tries to access properties on elements in a Path
as they
will no longer be there. Applications that require the properties will need to alter their Gremlin to better
restrict the data they want to retrieve.
See: TINKERPOP-1676
Authentication Configuration
The server settings previously used authentication.className
to set an authenticator for the two provided
authentication handler and channelizer classes to use. This has been deprecated in favor of authentication.authenticator
.
A class that extends AbstractAuthenticationHandler
may also now be provided as authentication.authenticationHandler
to be used in either of the provided channelizer classes to handle the provided authenticator.
See: TINKERPOP-1657
Default Maximum Parameters
It was learned that compilation for scripts with large numbers of parameters is more expensive than for those with fewer
parameters. It therefore becomes possible to make some mistakes with how Gremlin Server is used. A new setting on
the StandardOpProcessor
and SessionOpProcessor
called maxParameters
controls the number of parameters that can
be passed in on a request. This setting is defaulted to sixteen.
Users upgrading to this version may notice errors in their applications if they use more than sixteen parameters. To fix this problem simply reconfigure Gremlin Server with a configuration as follows:
processors:
- { className: org.apache.tinkerpop.gremlin.server.op.session.SessionOpProcessor, config: { maxParameters: 64 }}
- { className: org.apache.tinkerpop.gremlin.server.op.standard.StandardOpProcessor, config: { maxParameters: 64 }}
The above configuration allows sixty-four parameters to be passed on each request.
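Applications that build bindings dynamically may want to check the parameter count before sending a request. The following client-side guard is a hypothetical sketch and not part of any TinkerPop driver; the class name, the constant, and the withinLimit helper are assumptions made for illustration, and the limit passed in must match whatever maxParameters is configured on the server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ParameterLimitSketch {

    // Mirrors the default described above; adjust if the server is reconfigured.
    static final int DEFAULT_MAX_PARAMETERS = 16;

    // True when the bindings fit within the given limit.
    static boolean withinLimit(Map<String, Object> bindings, int maxParameters) {
        return bindings.size() <= maxParameters;
    }

    public static void main(String[] args) {
        Map<String, Object> bindings = new LinkedHashMap<>();
        for (int i = 0; i < 17; i++) {
            bindings.put("p" + i, i);
        }
        System.out.println(withinLimit(bindings, DEFAULT_MAX_PARAMETERS)); // prints false
        System.out.println(withinLimit(bindings, 64));                     // prints true
    }
}
```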
See: TINKERPOP-1663
GremlinScriptEngine Metrics
The GremlinScriptEngine
has a number of new metrics about its cache size and script compilation times which should
be helpful in understanding usage problems. As GremlinScriptEngine
instances are used in Gremlin Server these metrics
are naturally exposed as part of the standard metrics
set. Note that metrics are captured for both sessionless requests as well as for each individual session that is opened.
See: TINKERPOP-1644
Additional Error Information
Additional information on error responses from Gremlin Server should help make debugging errors easier. Error responses now have both the exception hierarchy and the stack trace that was generated on the server. In this way, receiving an error on a client doesn’t mean having to rifle through Gremlin Server logs to try to find the associated error.
This change has been applied to all Gremlin Server protocols. For the binary protocol and the Java driver this change
means that the ResponseException
thrown from calls to submit()
requests to the server now has the following
methods:
public Optional<String> getRemoteStackTrace()
public Optional<List<String>> getRemoteExceptionHierarchy()
The HTTP protocol has also been updated and returns both exceptions
and stackTrace
fields in the response:
{
"message": "Division by zero",
"Exception-Class": "java.lang.ArithmeticException",
"exceptions": ["java.lang.ArithmeticException"],
"stackTrace": "java.lang.ArithmeticException: Division by zero\n\tat java.math.BigDecimal.divide(BigDecimal.java:1742)\n\tat org.codehaus.groovy.runtime.typehandling.BigDecimalMath.divideImpl(BigDecimalMath.java:68)\n\tat org.codehaus.groovy.runtime.typehandling.IntegerMath.divideImpl(IntegerMath.java:49)\n\tat org.codehaus.groovy.runtime.dgmimpl.NumberNumberDiv$NumberNumber.invoke(NumberNumberDiv.java:323)\n\tat org.codehaus.groovy.runtime.callsite.PojoMetaMethodSite.call(PojoMetaMethodSite.java:56)\n\tat org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)\n\tat org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)\n\tat org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)\n\tat Script4.run(Script4.groovy:1)\n\tat org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:834)\n\tat org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:547)\n\tat javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)\n\tat org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120)\n\tat org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$2(GremlinExecutor.java:314)\n\tat java.util.concurrent.FutureTask.run(FutureTask.java:266)\n\tat java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)\n\tat java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)\n\tat java.lang.Thread.run(Thread.java:745)\n"
}
Note that the Exception-Class
which was added in a previous version has been deprecated and replaced by these new
fields.
See: TINKERPOP-1044
Gremlin Console Scripting
The gremlin.sh
command has two flags, -i
and -e
, which are used to pass a script and arguments into the Gremlin
Console for execution. Those flags now allow multiple scripts and related arguments to be supplied, which
can yield greater flexibility in automation tasks.
$ bin/gremlin.sh -i y.groovy 1 2 3 -i x.groovy
$ bin/gremlin.sh -e y.groovy 1 2 3 -e x.groovy
See: TINKERPOP-1653
Path support for by()-, from()-, to()-modulation
It is now possible to extract and analyze sub-paths using from()
and to()
modulations with respective, path-based steps.
Likewise, simplePath()
and cyclicPath()
now support, along with from()
and to()
, by()
-modulation so the cyclicity
is determined by projections of the path data. This extension is fully backwards compatible.
See: TINKERPOP-1387
GraphManager versus DefaultGraphManager
Gremlin Server previously implemented its own final GraphManager
class. Now, the GraphManager
has been changed to
an interface, and users can supply their own GraphManager
implementations in their YAML. The previous GraphManager
class was meant to be used by classes internal to Gremlin Server, but it was public, so if it was used for some reason by
users then a compile error can be expected. To correct this problem, which will likely manifest as a compile error
when trying to create a new GraphManager()
instance, simply change the code to new DefaultGraphManager(Settings)
.
In addition to the change mentioned above, several methods on GraphManager
were deprecated:
-
getGraphs()
should be replaced by the combination of getGraphNames()
and then getGraph(String)
-
getTraversalSources()
is similarly replaced and should instead use a combination of getTraversalSourceNames()
and getTraversalSource(String)
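Code that previously iterated the result of getGraphs() can rebuild the same view from the two replacement methods. The sketch below uses a hypothetical GraphNameLookup interface as a stand-in that exposes only those two methods, with Object in place of the Graph type and an assumed Set<String> return type for the names, so it runs without Gremlin Server on the classpath. The same pattern would apply to getTraversalSourceNames() and getTraversalSource(String).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class GraphLookupSketch {

    // Hypothetical stand-in exposing only the two replacement methods;
    // Object stands in for the Graph type.
    interface GraphNameLookup {
        Set<String> getGraphNames();

        Object getGraph(String graphName);
    }

    // Rebuilds a name-to-graph map by combining getGraphNames() and getGraph(String).
    static Map<String, Object> collectGraphs(GraphNameLookup manager) {
        Map<String, Object> graphs = new LinkedHashMap<>();
        for (String name : manager.getGraphNames()) {
            graphs.put(name, manager.getGraph(name));
        }
        return graphs;
    }

    // Simple map-backed implementation used only to exercise the sketch.
    static GraphNameLookup lookupOver(final Map<String, Object> graphs) {
        return new GraphNameLookup() {
            @Override
            public Set<String> getGraphNames() {
                return graphs.keySet();
            }

            @Override
            public Object getGraph(String graphName) {
                return graphs.get(graphName);
            }
        };
    }

    public static void main(String[] args) {
        Map<String, Object> source = new LinkedHashMap<>();
        source.put("graph", "graph-stand-in");
        System.out.println(collectGraphs(lookupOver(source)).keySet()); // prints [graph]
    }
}
```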
See: TINKERPOP-1438
Gremlin-Python Driver
Gremlin-Python now offers a more complete driver implementation that uses connection pooling and
the Python concurrent.futures
module to provide asynchronous I/O using threading. The default underlying
WebSocket client implementation is still provided by Tornado, but it is trivial to plug in another client by
defining the Transport
interface.
Using the DriverRemoteConnection
class is the exact same as in previous versions; however,
DriverRemoteConnection
now uses the new Client
class to submit messages to the server.
The Client
class implementation/interface is based on the Java Driver, with some restrictions.
Most notably, Gremlin-Python does not yet implement the Cluster
class. Instead, Client
is
instantiated directly. Usage is as follows:
from gremlin_python.driver import client
client = client.Client('ws://localhost:8182/gremlin', 'g')
result_set = client.submit('1 + 1')
future_results = result_set.all() # returns a concurrent.futures.Future
results = future_results.result() # returns a list
assert results == [2]
client.close() # don't forget to close underlying connections
See: TINKERPOP-1599
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph Database Providers
SimplePathStep and CyclicPathStep now PathFilterStep
The Gremlin traversal machine used to support two step instructions: SimplePathStep
and CyclicPathStep
. These have
been replaced by a high-level instruction called PathFilterStep
which is boolean configured for simple or cyclic paths.
Furthermore, PathFilterStep
also supports from()
-, to()
-, and by()
-modulation.
LazyBarrierStrategy No Longer End Appends Barriers
LazyBarrierStrategy
was trying to do too much by considering Traverser
effects on network I/O by appending a
NoOpBarrierStep
to the end of the root traversal. This should not be accomplished by LazyBarrierStrategy
,
but instead by RemoteStrategy
. RemoteStrategy
now tries to barrier-append. This may affect the reasoning logic in
some ProviderStrategies
. Most likely not, but just be aware.
See: TINKERPOP-1627
TinkerPop 3.2.4
Release Date: February 8, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
TinkerGraph Deserialization
A TinkerGraph deserialized from Gryo or GraphSON is now configured with multi-properties enabled. This change allows
TinkerGraphs returned from Gremlin Server to properly return multi-properties, which was a problem seen when
subgraphing a graph that contained properties with a setting other than Cardinality.single
.
This change could be considered breaking in the odd chance that a TinkerGraph returned from Gremlin Server was later
mutated, because calls to property(k,v)
would default to Cardinality.list
instead of Cardinality.single
. In the
event that this is a problem, simply change calls to property(k,v)
to property(Cardinality.single,k,v)
and
explicitly set the Cardinality
.
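The effect of the changed default can be modeled with plain collections. This sketch is not TinkerGraph code: VertexModel and its Cardinality enum are hypothetical stand-ins that only imitate how a default of list keeps earlier values while an explicit single replaces them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CardinalitySketch {

    enum Cardinality { single, list }

    // Hypothetical stand-in for a vertex whose graph has a default cardinality.
    static final class VertexModel {
        final Cardinality defaultCardinality;
        final Map<String, List<Object>> properties = new LinkedHashMap<>();

        VertexModel(Cardinality defaultCardinality) {
            this.defaultCardinality = defaultCardinality;
        }

        // Like property(k,v): falls back to the graph's default cardinality.
        void property(String key, Object value) {
            property(defaultCardinality, key, value);
        }

        // Like property(Cardinality.single,k,v): the cardinality is explicit.
        void property(Cardinality cardinality, String key, Object value) {
            List<Object> values = properties.computeIfAbsent(key, k -> new ArrayList<>());
            if (cardinality == Cardinality.single) {
                values.clear(); // single replaces whatever was there
            }
            values.add(value);
        }
    }

    public static void main(String[] args) {
        VertexModel v = new VertexModel(Cardinality.list);
        v.property("name", "marko");
        v.property("name", "marko a. rodriguez");
        System.out.println(v.properties); // prints {name=[marko, marko a. rodriguez]}

        v.property(Cardinality.single, "name", "marko");
        System.out.println(v.properties); // prints {name=[marko]}
    }
}
```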
See: TINKERPOP-1587
Traversal Promises
The Traversal
API now has a new promise()
method. This method returns a promise in the form of a
CompletableFuture
. Usage is as follows:
gremlin> promise = g.V().out().promise{it.next()}
==>java.util.concurrent.CompletableFuture@4aa3d36[Completed normally]
gremlin> promise.join()
==>v[3]
gremlin> promise.isDone()
==>true
gremlin> g.V().out().promise{it.toList()}.thenApply{it.size()}.get()
==>6
At this time, this method is only used for traversals that are configured using withRemote()
.
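For readers less familiar with CompletableFuture, the following JDK-only sketch imitates the shape of the calls shown above. The promise helper here is a hypothetical stand-in that applies a terminal function to a plain Iterator on another thread; it is not the TinkerPop implementation, and the string results are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

class PromiseSketch {

    // Hypothetical stand-in for promise(): applies the terminal function asynchronously.
    static <T> CompletableFuture<T> promise(Iterator<String> traversal,
                                            Function<Iterator<String>, T> terminal) {
        return CompletableFuture.supplyAsync(() -> terminal.apply(traversal));
    }

    // Drains an iterator into a list, similar in spirit to toList().
    static List<String> toList(Iterator<String> iterator) {
        List<String> results = new ArrayList<>();
        while (iterator.hasNext()) {
            results.add(iterator.next());
        }
        return results;
    }

    // Equivalent in shape to promise{it.next()} followed by join().
    static String firstResult() {
        Iterator<String> results = Arrays.asList("v[3]", "v[2]", "v[4]").iterator();
        CompletableFuture<String> future = promise(results, it -> it.next());
        return future.join();
    }

    // Equivalent in shape to promise{it.toList()}.thenApply{it.size()}.
    static int resultCount() {
        Iterator<String> results = Arrays.asList("v[3]", "v[2]", "v[4]").iterator();
        CompletableFuture<List<String>> all = promise(results, PromiseSketch::toList);
        CompletableFuture<Integer> size = all.thenApply(list -> list.size());
        return size.join();
    }

    public static void main(String[] args) {
        System.out.println(firstResult()); // prints v[3]
        System.out.println(resultCount()); // prints 3
    }
}
```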
See: TINKERPOP-1490
If/Then-Semantics with Choose Step
Gremlin’s choose()
-step supports if/then/else-semantics. Thus, to effect if/then-semantics, identity()
was required.
The following two traversals are therefore equivalent, with the latter being possible as of this release.
g.V().choose(hasLabel('person'),out('created'),identity())
g.V().choose(hasLabel('person'),out('created'))
See: TINKERPOP-1508
FastNoSuchElementException converted to regular NoSuchElementException
Previously, a call to Traversal.next()
that did not have a result would throw a FastNoSuchElementException
.
This has been changed to a regular NoSuchElementException
that includes the stack trace. Code that explicitly catches
FastNoSuchElementException
should be converted to check for the more general class of NoSuchElementException
.
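A catch block written against the general exception type might look like the following sketch. It uses a plain java.util.Iterator in place of a Traversal so that it runs with the JDK alone; the nextOrDefault helper and its fallback argument are assumptions for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.NoSuchElementException;

class EmptyResultSketch {

    // Returns the next result, or the fallback when there is none.
    static <T> T nextOrDefault(Iterator<T> results, T fallback) {
        try {
            return results.next();
        } catch (NoSuchElementException e) {
            // Catching the general class also covers any subclass of it.
            return fallback;
        }
    }

    public static void main(String[] args) {
        Iterator<String> empty = Collections.emptyIterator();
        System.out.println(nextOrDefault(empty, "no result"));                       // prints no result
        System.out.println(nextOrDefault(Arrays.asList("v[1]").iterator(), "none")); // prints v[1]
    }
}
```

Where the API allows it, checking hasNext() before calling next() avoids the exception altogether.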
See: TINKERPOP-1330
ScriptEngine support in gremlin-core
ScriptEngine
and GremlinPlugin
infrastructure has been moved from gremlin-groovy to gremlin-core to allow for
better re-use across different Gremlin Language Variants. At this point, this change is non-breaking as it was
implemented through deprecation.
The basic concept of a ScriptEngine
has been replaced by the notion of a GremlinScriptEngine
(i.e. a
"ScriptEngine" that is specifically tuned for executing Gremlin-related scripts). "ScriptEngine" infrastructure has
been developed to help support this new interface, specifically GremlinScriptEngineFactory
and
GremlinScriptEngineManager
. Prefer use of this infrastructure when instantiating a GremlinScriptEngine
rather
than trying to instantiate directly.
For example, rather than instantiate a GremlinGroovyScriptEngine
with the constructor:
GremlinScriptEngine engine = new GremlinGroovyScriptEngine();
prefer to instantiate it as follows:
GremlinScriptEngineManager manager = new CachedGremlinScriptEngineManager();
GremlinScriptEngine engine = manager.getEngineByName("gremlin-groovy");
Related to the addition of GremlinScriptEngine
, org.apache.tinkerpop.gremlin.groovy.plugin.GremlinPlugin
in
gremlin-groovy has been deprecated and then replaced by org.apache.tinkerpop.gremlin.jsr223.GremlinPlugin
. The new
version of GremlinPlugin
is similar but does carry some new methods to implement that involve the new Customizer
interface. The Customizer
interface is the way in which a GremlinScriptEngine
instance can be configured with
imports, initialization scripts, compiler options, etc.
Note that a GremlinPlugin
can be applied to a GremlinScriptEngine
by adding it to the GremlinScriptEngineManager
that creates it.
GremlinScriptEngineManager manager = new CachedGremlinScriptEngineManager();
manager.addPlugin(ImportGremlinPlugin.build().classImports(java.awt.Color.class).create());
GremlinScriptEngine engine = manager.getEngineByName("gremlin-groovy");
All of this new infrastructure is currently optional on the 3.2.x line of code. More detailed documentation for these changes will be supplied as part of 3.3.0 when these features become mandatory and the deprecated code is removed.
See: TINKERPOP-1562
SSL Client Authentication
Added new server configuration option ssl.needClientAuth
.
See: TINKERPOP-1602
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph Database Providers
CloseableIterator
Prior to TinkerPop 3.x, Blueprints had the notion of a CloseableIterable
which exposed a way for Graph Providers
to offer a way to release resources that might have been opened when returning vertices and edges. That interface was
never exposed in TinkerPop 3.x, but has now been made available via the new CloseableIterator
. Providers may choose
to use this interface or not when returning values from Graph.vertices()
and Graph.edges()
.
It will be up to users to know whether or not they need to call close()
. Of course, users should typically not be
operating with the Graph Structure API, so it’s unlikely that they would be calling these methods directly in the
first place. It is more likely that users will be calling Traversal.close()
. This method will essentially iterate
the steps of the Traversal
and simply call close()
on any steps that implement AutoCloseable
. By default,
GraphStep
now implements AutoCloseable
which most Graph Providers will extend upon (as was done with TinkerGraph’s
TinkerGraphStep
), so the integration should largely come for free if the provider simply returns a
CloseableIterator
from Graph.vertices()
and Graph.edges()
.
See: TINKERPOP-1589
HasContainer AndP Splitting
Previously, GraphTraversal
made it easy for providers to analyze P
-predicates in HasContainers
, by always
splitting AndP
predicates into their component parts. This helper behavior is no longer provided because,
1.) AndP
can be inserted into a XXXStep
in other ways, 2.) the provider’s XXXStep
should process AndP
regardless of GraphTraversal
helper, and 3.) the GraphTraversal
helper did not recursively split.
A simple way to split AndP
in any custom XXXStep
that implements HasContainerHolder
is to use the following method:
@Override
public void addHasContainer(final HasContainer hasContainer) {
    if (hasContainer.getPredicate() instanceof AndP) {
        for (final P<?> predicate : ((AndP<?>) hasContainer.getPredicate()).getPredicates()) {
            this.addHasContainer(new HasContainer(hasContainer.getKey(), predicate));
        }
    } else {
        this.hasContainers.add(hasContainer);
    }
}
See: TINKERPOP-1482, TINKERPOP-1502
Duplicate Multi-Properties
Added supportsDuplicateMultiProperties
to VertexFeatures
so that graph providers who only support unique values as
multi-properties have more flexibility in describing their graph capabilities.
See: TINKERPOP-919
Deprecated OptIn
In 3.2.1, all junit-benchmark
performance tests were deprecated. At that time, the OptIn
representations of these
tests should have been deprecated as well, but they were not. That omission has been remedied now. Specifically, the
following fields were deprecated:
-
OptIn.SUITE_GROOVY_ENVIRONMENT_PERFORMANCE
-
OptIn.SUITE_PROCESS_PERFORMANCE
-
OptIn.SUITE_STRUCTURE_PERFORMANCE
As of 3.2.4, the following test suites were also deprecated:
-
OptIn.SUITE_GROOVY_PROCESS_STANDARD
-
OptIn.SUITE_GROOVY_PROCESS_COMPUTER
-
OptIn.SUITE_GROOVY_ENVIRONMENT
-
OptIn.SUITE_GROOVY_ENVIRONMENT_INTEGRATE
Future testing of gremlin-groovy
(and language variants in general) will be handled differently and will not require
a Graph Provider to validate its operations with it. Graph Providers may now choose to remove these tests from their
test suites, which should reduce the testing burden.
See: TINKERPOP-1610
Deprecated getInstance()
TinkerPop has generally preferred static instance()
methods over getInstance()
, but getInstance()
was used in
some cases nonetheless. As of this release, getInstance()
methods have been deprecated in favor of instance()
.
Of specific note, custom IoRegistry
(as related to IO in general) and Supplier<ClassResolver>
(as related to
Gryo serialization in general) now both prefer instance()
over getInstance()
given this deprecation.
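For a provider's own singleton classes, following the convention can be as simple as the sketch below. MyRegistry is a hypothetical class used only to show the naming; it does not implement IoRegistry or any other TinkerPop interface.

```java
class RegistrySketch {

    // Hypothetical singleton that follows the preferred naming convention.
    static final class MyRegistry {
        private static final MyRegistry INSTANCE = new MyRegistry();

        private MyRegistry() {
        }

        // Preferred accessor.
        public static MyRegistry instance() {
            return INSTANCE;
        }

        // Kept only for callers that have not migrated yet.
        @Deprecated
        public static MyRegistry getInstance() {
            return instance();
        }
    }

    public static void main(String[] args) {
        System.out.println(MyRegistry.instance() == MyRegistry.getInstance()); // prints true
    }
}
```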
See: TINKERPOP-1530
Drivers Providers
Force Close
Closing a session will first attempt a proper close of any open transactions. A problem can occur, however, if there is a long-running job (e.g. an OLAP-based traversal) executing, as that job will block the calls to close the transactions. By exercising the option to do a "forced close", the session will skip trying to close the transactions and just attempt to interrupt the long-running job. By not closing transactions, the session leaves it up to the underlying graph database to sort out how it will deal with those orphaned transactions. On the positive side though (for those graphs which do that well), long-running jobs have the opportunity to be cancelled without waiting for a timeout of the job itself, which will allow resources to be released earlier.
The "force" argument is passed on the "close" message and is a boolean value. This is an optional argument to "close"
and defaults to false
.
SASL Authentication
Gremlin Server supports SASL-based authentication. The server accepts either a byte array or a Base64 encoded String in the sasl argument on the RequestMessage, however it sends back a byte array only. Some serializers or serializer configurations don’t work well with that approach (specifically the "toString" configuration on the Gryo serializer) as the byte array is returned in the ResponseMessage result. In the case of the "toString" serializer the byte array gets "toString’d" and then can’t be read by the client.
In 3.2.4, the byte array is still returned in the ResponseMessage
result, but is also returned in the status
attributes under a sasl
key as a Base64 encoded string. In this way, the client has options on how it chooses to
process the authentication response and the change remains backward compatible. Drivers should upgrade to using the
Base64 encoded string however as the old approach will likely be removed in the future.
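A driver could handle both response shapes roughly as follows. This is only a sketch: the class and method names are hypothetical, and the two parameters stand in for the status attributes and the result of a ResponseMessage rather than reproducing any actual driver API.

```java
import java.util.Base64;
import java.util.Map;

final class SaslResponseReader {
    // Hypothetical helper: 'statusAttributes' and 'result' model the two places
    // where the server may put the SASL payload; they are not driver classes.
    static byte[] readChallenge(Map<String, Object> statusAttributes, Object result) {
        Object encoded = statusAttributes.get("sasl");
        if (encoded instanceof String) {
            // Preferred: Base64 encoded string under the "sasl" status attribute.
            return Base64.getDecoder().decode((String) encoded);
        }
        if (result instanceof byte[]) {
            // Fallback for servers that only return the raw byte array.
            return (byte[]) result;
        }
        throw new IllegalStateException("No SASL payload found in response");
    }
}
```

Checking the status attribute first means the same client code keeps working if the byte array result is eventually removed.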
See: TINKERPOP-1600
TinkerPop 3.2.3
Release Date: October 17, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Renamed Null Result Preference
In 3.2.2, the Gremlin Console introduced a setting called empty.result.indicator
, which controlled the output that
was presented when no result was returned. For consistency, this setting has been renamed to result.indicator.null
and can be set as follows:
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.close()
==>null
gremlin> :set result.indicator.null nil
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.close()
==>nil
gremlin> :set result.indicator.null ""
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> graph.close()
gremlin>
See: TINKERPOP-1409
Java Driver Keep-Alive
The Java Driver now has a keepAliveInterval
setting, which controls the amount of time in milliseconds it should wait
on an inactive connection before it sends a message to the server to keep the connection maintained. This should help
environments that use a load balancer in front of Gremlin Server by ensuring connections are actively maintained even
during periods of inactivity.
See: TINKERPOP-1249
Where Step Supports By-Modulation
It is now possible to use by()
with where()
predicate-based steps. Previously, without using match()
, if you wanted
to know who was older than their friend, the following traversal would be used.
gremlin> g.V().as('a').out('knows').as('b').
......1> filter(select('a','b').by('age').where('a', lt('b')))
==>v[4]
Now, with where().by()
support, the above traversal can be expressed more succinctly and more naturally as follows.
gremlin> g.V().as('a').out('knows').as('b').
......1> where('a', lt('b')).by('age')
==>v[4]
See: TINKERPOP-1330
Change In has() Method Signatures
The TinkerPop 3.2.2 release unintentionally introduced a breaking change for some has()
method overloads. In particular the
behavior for single item array arguments was changed:
gremlin> g.V().hasLabel(["software"] as String[]).count()
==>0
Prior to this change, single item arrays were treated as if only that single item had been passed:
gremlin> g.V().hasLabel(["software"] as String[]).count()
==>2
gremlin> g.V().hasLabel("software").count()
==>2
TinkerPop 3.2.3 fixes this misbehavior and all has()
method overloads behave like before, except that they no longer
support no arguments.
Deprecated reconnectInitialDelay
The reconnectInitialDelay
setting on the Cluster
builder has been deprecated. It no longer serves any purpose.
The value for the "initial delay" now comes from reconnectInterval
(there are no longer two separate settings to
control).
See: TINKERPOP-1460
TraversalSource.close()
TraversalSource
now implements AutoCloseable
, which means that the close()
method is now available. This new
method is important in cases where withRemote()
is used, as withRemote()
can open "expensive" resources that need
to be released.
In the case of TinkerPop’s DriverRemoteConnection
, close()
will destroy the Client
instance that is created
internally by withRemote()
as shown below:
gremlin> graph = EmptyGraph.instance()
==>emptygraph[empty]
gremlin> g = graph.traversal().withRemote('conf/remote-graph.properties')
==>graphtraversalsource[emptygraph[empty], standard]
gremlin> g.close()
gremlin>
Note that the withRemote()
method will call close()
on a RemoteConnection
passed directly to it as well, so
there is no need to do that manually.
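Because close() comes from AutoCloseable, Java callers can rely on try-with-resources to release the resources. The sketch below uses a hypothetical stand-in class (FakeSource) instead of a real TraversalSource so that it runs without any TinkerPop dependency.

```java
final class CloseableSourceSketch {
    // Hypothetical stand-in for a TraversalSource backed by a remote connection.
    static final class FakeSource implements AutoCloseable {
        boolean closed = false;

        @Override
        public void close() {
            closed = true; // a real source would release its connection here
        }
    }

    static FakeSource useAndClose() {
        FakeSource source = new FakeSource();
        try (FakeSource g = source) {
            // traversals would be executed with 'g' here
        }
        // close() has been called by the time the try block exits
        return source;
    }
}
```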
See: TINKERPOP-790
IO Reference Documentation
There is new reference documentation for the various IO formats. The documentation provides more details and samples that should be helpful to users and providers who intend to work directly with the TinkerPop supported serialization formats: GraphML, GraphSON and Gryo.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
Default LazyBarrierStrategy
LazyBarrierStrategy
has been included as a default strategy. LazyBarrierStrategy
walks a traversal and looks for
"flatMaps" (out()
, in()
, both()
, values()
, V()
, etc.) and adds "lazy barriers" to dam up the stream so as to
increase the probability of bulking the traversers. One of the side-effects is that:
g.V().out().V().has(a)
is compiled to:
g.V().out().barrier().V().barrier().has(a)
Given that LazyBarrierStrategy
is an OptimizationStrategy
, it comes before ProviderOptimizationStrategies
.
Thus, if the provider’s XXXGraphStepStrategy simply walks from the second V() looking for has()-only, it will not be able to pull in the has() because the barrier() blocks it. Please see the updates to TinkerGraphStepStrategy and how it acknowledges NoOpBarrierSteps (i.e. barrier()), skipping over them and “left”-propagating labels to the previous step.
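The walk described above can be sketched as follows. Steps are modeled as plain strings purely for illustration; the class and method names are hypothetical and this is not TinkerPop's Step API, nor the actual TinkerGraphStepStrategy code.

```java
import java.util.ArrayList;
import java.util.List;

final class BarrierSkippingSketch {
    // Returns the has() filters that can be folded into the preceding V() step
    // and removes them from 'stepsAfterGraphStep'. Barriers are left in place.
    static List<String> foldHasSteps(List<String> stepsAfterGraphStep) {
        List<String> folded = new ArrayList<>();
        int i = 0;
        while (i < stepsAfterGraphStep.size()) {
            String step = stepsAfterGraphStep.get(i);
            if (step.startsWith("has(")) {
                folded.add(step);
                stepsAfterGraphStep.remove(i); // folded into the graph step
            } else if (step.equals("barrier()")) {
                i++; // skip over the no-op barrier and keep looking
            } else {
                break; // any other step ends the foldable run
            }
        }
        return folded;
    }
}
```

A strategy that stops at the first non-has() step would fold nothing in this situation, which is the regression the paragraph warns about.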
See: TINKERPOP-1488
Configurable Strategies
If the provider has non-configurable TraversalStrategy
classes, those classes should expose a static instance()
-method.
This is typical and thus, backwards compatible. However, if the provider has a TraversalStrategy
that can be configured
(e.g. via a Builder
), then it should expose a static create(Configuration)
-method, where the keys of the configuration
are the method names of the Builder
and the values are the method arguments. For instance, for Gremlin-Python to create
a SubgraphStrategy
, it does the following:
g = Graph().traversal().withRemote(connection).
withStrategies(SubgraphStrategy(vertices=__.hasLabel('person'),edges=__.has('weight',gt(0.5))))
The SubgraphStrategy.create(Configuration)
-method is defined as:
public static SubgraphStrategy create(final Configuration configuration) {
    final Builder builder = SubgraphStrategy.build();
    if (configuration.containsKey(VERTICES))
        builder.vertices((Traversal) configuration.getProperty(VERTICES));
    if (configuration.containsKey(EDGES))
        builder.edges((Traversal) configuration.getProperty(EDGES));
    if (configuration.containsKey(VERTEX_PROPERTIES))
        builder.vertexProperties((Traversal) configuration.getProperty(VERTEX_PROPERTIES));
    return builder.create();
}
Finally, in order to make serialization possible from JVM-based Gremlin language variants, all strategies have a
TraversalStrategy.getConfiguration()
method which returns a Configuration
that can be used to create()
the
TraversalStrategy
.
The SubgraphStrategy.getConfiguration()
-method is defined as:
@Override
public Configuration getConfiguration() {
    final Map<String, Object> map = new HashMap<>();
    map.put(STRATEGY, SubgraphStrategy.class.getCanonicalName());
    if (null != this.vertexCriterion)
        map.put(VERTICES, this.vertexCriterion);
    if (null != this.edgeCriterion)
        map.put(EDGES, this.edgeCriterion);
    if (null != this.vertexPropertyCriterion)
        map.put(VERTEX_PROPERTIES, this.vertexPropertyCriterion);
    return new MapConfiguration(map);
}
The default implementation of TraversalStrategy.getConfiguration()
is defined as:
public default Configuration getConfiguration() {
    return new BaseConfiguration();
}
Thus, if the provider does not have any "builder"-based strategies, then no updates to their strategies are required.
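For a provider that does have a builder-based strategy, the create()/getConfiguration() pair should round-trip. The following sketch illustrates the pattern with a made-up strategy (LimitStrategySketch) and a plain Map standing in for the Configuration type, so every name in it is an assumption rather than TinkerPop API.

```java
import java.util.HashMap;
import java.util.Map;

final class LimitStrategySketch {
    static final String STRATEGY = "strategy";
    static final String LIMIT = "limit"; // key mirrors the builder method name

    private final long limit;

    private LimitStrategySketch(long limit) {
        this.limit = limit;
    }

    static Builder build() {
        return new Builder();
    }

    static final class Builder {
        private long limit = Long.MAX_VALUE;

        Builder limit(long limit) {
            this.limit = limit;
            return this;
        }

        LimitStrategySketch create() {
            return new LimitStrategySketch(limit);
        }
    }

    // Rebuilds the strategy from a configuration, as a language variant would.
    static LimitStrategySketch create(Map<String, Object> configuration) {
        Builder builder = build();
        if (configuration.containsKey(LIMIT))
            builder.limit((Long) configuration.get(LIMIT));
        return builder.create();
    }

    // Exposes the state needed to recreate an equivalent strategy.
    Map<String, Object> getConfiguration() {
        Map<String, Object> map = new HashMap<>();
        map.put(STRATEGY, LimitStrategySketch.class.getCanonicalName());
        map.put(LIMIT, limit);
        return map;
    }

    long limit() {
        return limit;
    }
}
```

Keeping the configuration keys identical to the builder method names is what lets a non-JVM client construct the strategy from keyword-style arguments.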
See: TINKERPOP-1455
Deprecated elementNotFound
Both Graph.Exceptions.elementNotFound()
methods have been deprecated. These exceptions were being asserted in the
test suite but were not being used anywhere in gremlin-core
itself. The assertions have been modified to simply
assert that NoSuchElementException
was thrown, which is precisely the behavior that was being indirectly asserted
when Graph.Exceptions.elementNotFound()
was being used.
Providers should not need to take any action in this case for their tests to pass. However, it would be wise to remove uses of these exception builders as they will be removed in the future.
See: TINKERPOP-944
Hidden Step Labels for Compilation Only
In order for SubgraphStrategy
to work, it was necessary to have multi-level children communicate with one another
via hidden step labels. It was decided that hidden step labels are for compilation purposes only and will be removed
prior to traversal evaluation. This is a valid decision given that hidden labels for graph system providers are
not allowed to be used by users. Likewise, hidden labels for steps should not be allowed to be used by
users either.
PropertyMapStep with Selection Traversal
PropertyMapStep
now supports selection of properties via child property traversal. If a provider was relying solely
on the provided property keys in a ProviderOptimizationStrategy
, they will need to check if there is a child traversal
and if so, use that in their introspection for respective strategies. This model was created to support SubgraphStrategy.vertexProperties()
filtering.
See: TINKERPOP-1456, TINKERPOP-844
ConnectiveP Nesting Inlined
There was a bug in ConnectiveP
(AndP
/OrP
), where eq(1).and(eq(2).and(eq(3)))
was AndP(eq(1),AndP(eq(2),eq(3)))
instead of unnested/inlined as AndP(eq(1),eq(2),eq(3))
. Likewise, for OrP
. If a provider was leveraging ConnectiveP
predicates for their custom steps (e.g. graph- or vertex-centric index lookups), then they should be aware of the inlining
and can simplify any and/or-tree walking code in their respective ProviderOptimizationStrategy
.
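To illustrate the difference, the sketch below models a nested AND tree with hypothetical stand-in classes (Pred and And, not TinkerPop's P and AndP) and shows the recursive walk that nested predicates used to require. With the inlined form, reading a single level of children yields the same list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class AndFlatteningSketch {
    // Minimal stand-in for a leaf predicate such as eq(1).
    static class Pred {
        final String name;

        Pred(String name) {
            this.name = name;
        }
    }

    // Minimal stand-in for a connective predicate holding child predicates.
    static final class And extends Pred {
        final List<Pred> children;

        And(Pred... children) {
            super("and");
            this.children = Arrays.asList(children);
        }
    }

    // Recursively collects the leaves of an arbitrarily nested AND tree.
    static List<String> flatten(Pred p) {
        List<String> out = new ArrayList<>();
        collect(p, out);
        return out;
    }

    private static void collect(Pred p, List<String> out) {
        if (p instanceof And) {
            for (Pred child : ((And) p).children)
                collect(child, out);
        } else {
            out.add(p.name);
        }
    }
}
```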
See: TINKERPOP-1470
TinkerPop 3.2.2
Release Date: September 6, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
GraphSON 2.0
GraphSON 2.0 has been introduced to improve and normalize the format of types embedded in GraphSON.
Log4j Dependencies
There were a number of changes to the Log4j dependencies in the various modules. Log4j was formerly included as part
of the slf4j-log4j12
in gremlin-core
, however that "forced" use of Log4j as a logger implementation when that
really wasn’t necessary or desired. If a project depended on gremlin-core
or other TinkerPop project to get its
Log4j implementation then those applications will need to now include the dependency themselves directly.
Note that Gremlin Server and Gremlin Console explicitly package Log4j in their respective binary distributions.
See: TINKERPOP-1151
Default for gremlinPool
The gremlinPool
setting in Gremlin Server is now defaulted to zero. When set to zero, Gremlin Server will use the
value provided by Runtime.availableProcessors()
to set the pool size. Note that the packaged YAML files no longer
contain the thread pool settings as all are now driven by sensible defaults. Obviously these values can be added
and overridden as needed.
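The defaulting rule can be pictured with a few lines of Java. This is a sketch of the described behavior, not Gremlin Server's code; the class and method names are hypothetical, and the processor count is obtained through the JDK call Runtime.getRuntime().availableProcessors().

```java
final class PoolSizeSketch {
    // 'configured' stands in for the gremlinPool value read from the YAML file.
    static int effectivePoolSize(int configured) {
        // Zero means "not set": fall back to the number of available processors.
        return configured == 0
                ? Runtime.getRuntime().availableProcessors()
                : configured;
    }
}
```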
See: TINKERPOP-1373
New Console Features
The Gremlin Console can now have its text colorized. For example, you can set the color of the Gremlin ascii art to
the more natural color of green by using the :set
command:
gremlin> :set gremlin.color green
It is also possible to colorize results, like vertices, edges, and other common returns. Please see the reference documentation for more details on all the settings.
The console also now includes better multi-line support:
gremlin> g.V().out().
......1> has('name','josh').
......2> out('created')
==>v[5]
==>v[3]
This is a nice feature in that it can help you understand if a line is incomplete and unevaluated.
REST API Renamed to HTTP API
This is only a rename to clarify the design of the API. There is no change to the API itself.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
Deprecated Io.Builder.registry()
The Io.Builder.registry()
has been deprecated in favor of Io.Builder.onMapper(Consumer<Mapper>)
. This change gives
the Graph
implementation greater flexibility over how to modify the Mapper
implementation. In most cases, the
implementation will simply add its IoRegistry
to allow the Mapper
access to custom serialization classes, but this
approach makes it possible to also set other specific settings that aren’t generalized across all IO implementations.
A good example of this type of usage would be to provide a custom ClassResolver
implementation to a GryoMapper
.
See: TINKERPOP-1402
Log4j Dependencies
There were a number of changes to the Log4j dependencies in the various modules. Log4j was formerly included as part
of the slf4j-log4j12
in gremlin-core
, however that "forced" use of log4j as a logger implementation when that
really wasn’t necessary or desired. The slf4j-log4j12
dependency is now in "test" scope for most of the modules. The
exception to that rule is gremlin-test
which prescribes it as "optional". That change means that developers
depending on gremlin-test
(or gremlin-groovy-test
) will need to explicitly specify it as a dependency in their
pom.xml
(or a different slf4j implementation if that better suits them).
See: TINKERPOP-1151
Drivers Providers
GraphSON 2.0
Driver providers can exploit the new typed-value JSON serialization format offered by GraphSON 2.0. This format
has been created to allow easy and agnostic parsing of a GraphSON payload without type loss. Drivers of non-Java
languages can then implement their own mapping of the GraphSON’s language agnostic type IDs (e.g. UUID
, LocalDate
)
to the appropriate representation for the driver’s language.
Traversal Serialization
There was an "internal" serialization format in place for Traversal
which allowed one to be submitted to Gremlin
Server directly over RemoteGraph
. That format has been removed completely and is wholly replaced by the non-JVM
specific approach of serializing Bytecode
.
See: TINKERPOP-1392
TinkerPop 3.2.1
Release Date: July 18, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gephi Plugin
The Gephi Plugin has been updated to support Gephi 0.9.x. Please upgrade to this latest version to use the Gephi Plugin for Gremlin Console.
See: TINKERPOP-1297
GryoMapper Construction
It is now possible to override existing serializers with calls to addCustom
on the GryoMapper
builder. This option
allows complete control over the serializers used by Gryo. Of course, this also makes it possible to produce completely
non-compliant Gryo files. This feature should be used with caution.
TraversalVertexProgram
TraversalVertexProgram
always maintained a HALTED_TRAVERSERS
TraverserSet
for each vertex throughout the life
of the OLAP computation. However, if there are no halted traversers in the set, then there is no point in keeping that
compute property around as without it, time and space can be saved. Users that have VertexPrograms
that are chained off
of TraversalVertexProgram
and have previously assumed that HALTED_TRAVERSERS
always exists at each vertex, should no
longer assume that.
// bad code
TraverserSet haltedTraversers = vertex.value(TraversalVertexProgram.HALTED_TRAVERSERS);
// good code
TraverserSet haltedTraversers = vertex.property(TraversalVertexProgram.HALTED_TRAVERSERS).orElse(new TraverserSet());
Interrupting Traversals
Traversals now better respect calls to Thread.interrupt()
, which means that a running Traversal
can now be
cancelled. There are some limitations that remain, but most OLTP-based traversals should cancel without
issue. OLAP-based traversals for Spark will also cancel and clean up running jobs in Spark itself. Mileage may vary
on other process implementations and it is possible that graph providers could potentially write custom step
implementations that prevent interruption. If it is found that there are configurations or specific traversals that
do not respect interruption, please mention them on the mailing list.
See: TINKERPOP-946
Gremlin Console Flags
Gremlin Console had several methods for executing scripts from file at the start-up of bin/gremlin.sh
. There were
two options:
bin/gremlin.sh script.groovy //1
bin/gremlin.sh -e script.groovy //2
1. The script.groovy would be executed as a console initialization script, setting the console up for use and leaving it open when the script completed successfully or closing it if the script failed.
2. The script.groovy would be executed by the ScriptExecutor, which meant that commands for the Gremlin Console, such as :remote and :>, would not be respected.
Changes in this version of TinkerPop have added much more flexibility here and only a minor breaking change should be considered when using this version. First of all, recognize that these two lines are currently equivalent:
bin/gremlin.sh script.groovy
bin/gremlin.sh -i script.groovy
but users should start to explicitly specify the -i
flag as TinkerPop will eventually remove the old syntax. Regardless of
which one is used, be aware that neither will close the console on script failure anymore. In that sense, this
behavior represents a breaking change to consider. To ensure the console closes on failure or success, a script will
have to use the -e
option.
The console also has a number of new features in addition to -e
and -i
:
- View the available flags for the console with -h.
- Control console output with -D, -Q and -V.
- Get line numbers on script failures passed to -i and -e.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
VertexComputing API Change
The VertexComputing
API is used by steps that wrap a VertexProgram
. There is a method called
VertexComputing.generateProgram()
that has changed which now takes a second argument of Memory
. To upgrade, simply
fix the method signature of your VertexComputing
implementations. The Memory
argument can be safely ignored to
effect the exact same semantics as prior. However, now previous OLAP job Memory
can be leveraged when constructing
the next VertexProgram
in an OLAP traversal chain.
Interrupting Traversals
Several tests have been added to the TinkerPop test suite to validate that a Traversal
can be cancelled with
Thread.interrupt()
. The test suite does not cover all possible traversal scenarios. When implementing custom steps,
providers should take care to not ignore an InterruptedException
that might be thrown in their code and to be sure
to check Thread.isInterrupted()
as needed to ensure that the step remains cancellation compliant.
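A custom step that loops over many items might cooperate with cancellation along these lines. The sketch is illustrative only: the class, the method and the StepCancelledException type are hypothetical, and it uses the static Thread.interrupted() check, which also clears the interrupt flag.

```java
import java.util.Iterator;

final class InterruptibleStepSketch {
    // Hypothetical unchecked exception used to signal cancellation.
    static final class StepCancelledException extends RuntimeException {
        StepCancelledException(String message) {
            super(message);
        }
    }

    // Counts the items, checking for interruption before each one is consumed.
    static long countAll(Iterator<?> items) {
        long count = 0;
        while (items.hasNext()) {
            // Thread.interrupted() reads and clears the interrupt flag.
            if (Thread.interrupted()) {
                throw new StepCancelledException("traversal was cancelled");
            }
            items.next();
            count++;
        }
        return count;
    }
}
```

Checking inside the loop, rather than once up front, is what keeps a long-running step responsive to Thread.interrupt().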
See: TINKERPOP-946
Performance Tests
All "performance" tests have been deprecated. In the previous 3.2.0-incubating release, the ProcessPerformanceSuite
and TraversalPerformanceTest
were deprecated, but some other tests remained. It is the remaining tests that have
been deprecated in this release:
- StructurePerformanceSuite
  - GraphReadPerformanceTest
  - GraphWriterPerformanceTest
- GroovyEnvironmentPerformanceSuite
  - SugarLoaderPerformanceTest
  - GremlinExecutorPerformanceTest
- Gremlin Server related performance tests
- TinkerGraph related performance tests
Providers should implement their own performance tests and not rely on these deprecated tests as they will be removed in a future release along with the "JUnit Benchmarks" dependency.
See: TINKERPOP-1294
Graph Database Providers
Transaction Tests
Tests and assertions were added to the structure test suite to validate that transaction status was in the appropriate
state following calls to close the transaction with commit()
or rollback()
. It is unlikely that this change would
cause test breaks for providers, unless the transaction status was inherently disconnected from calls to close the
transaction somehow.
In addition, other tests were added to enforce the expected semantics for threaded transactions. Threaded transactions are expected to behave like manual transactions. They should be open automatically when they are created and once closed should no longer be used. This behavior is not new and is the typical expected method for working with these types of transactions. The test suite just requires that the provider implementation conform to these semantics.
See: TINKERPOP-947, TINKERPOP-1059
GraphFilter and GraphFilterStrategy
GraphFilter
has been significantly advanced where the determination of an edge direction/label legality is more stringent.
Along with this, GraphFilter.getLegallyPositiveEdgeLabels()
has been added as a helper method to make it easier for GraphComputer
providers to know the space of labels being accessed by the traversal and thus, better enable provider-specific push-down predicates.
Note that GraphFilterStrategy
is now a default TraversalStrategy
registered with GraphComputer.
If GraphFilter
is
expensive for the underlying GraphComputer
implementation, it can be deactivated as is done for TinkerGraphComputer
.
static {
    TraversalStrategies.GlobalCache.registerStrategies(TinkerGraphComputer.class,
        TraversalStrategies.GlobalCache.getStrategies(GraphComputer.class).clone().removeStrategies(GraphFilterStrategy.class));
}
See: TINKERPOP-1293
Graph Language Providers
VertexTest Signatures
The method signatures of get_g_VXlistXv1_v2_v3XX_name
and get_g_VXlistX1_2_3XX_name
of VertexTest
were changed
to take arguments for the Traversal
to be constructed by extending classes.
TinkerPop 3.2.0
Release Date: April 8, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Hadoop FileSystem Variable
The HadoopGremlinPlugin
defines two variables: hdfs
and fs
. The first is a reference to the HDFS FileSystemStorage
and the latter is a reference to the local FileSystemStorage
. Prior to 3.2.x, fs
was called local
. However,
there was a variable name conflict with Scope.local
. As such local
is now fs
. This issue existed prior to 3.2.x,
but was not realized until this release. Finally, this only affects Gremlin Console users.
Hadoop Configurations
Note that gremlin.hadoop.graphInputFormat
, gremlin.hadoop.graphOutputFormat
, gremlin.spark.graphInputRDD
, and
gremlin.spark.graphOutputRDD
have all been deprecated. Using them still works, but moving forward, users only need to
leverage gremlin.hadoop.graphReader
and gremlin.hadoop.graphWriter
. An example properties file snippet is provided
below.
gremlin.graph=org.apache.tinkerpop.gremlin.hadoop.structure.HadoopGraph
gremlin.hadoop.graphReader=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoInputFormat
gremlin.hadoop.graphWriter=org.apache.tinkerpop.gremlin.hadoop.structure.io.gryo.GryoOutputFormat
gremlin.hadoop.jarsInDistributedCache=true
gremlin.hadoop.defaultGraphComputer=org.apache.tinkerpop.gremlin.spark.process.computer.SparkGraphComputer
See: TINKERPOP-1082, TINKERPOP-1222
TraversalSideEffects Update
There were changes to TraversalSideEffects both at the semantic level and at the API level. Users that have traversals of the form sideEffect{…} that leverage global side-effects should read the following carefully. If the user’s traversals do not use lambda-based side-effect steps (e.g. they only use steps such as groupCount("m")), then the changes below will not affect them. Moreover, if the user’s traversal only uses sideEffect{…} with closure (non-TraversalSideEffects) data references, then the changes below will not affect them. If the user’s traversal uses side-effects in OLTP only, the changes below will not affect them. Finally, providers should not be affected by the changes, save for any test cases.
TraversalSideEffects Get API Change
TraversalSideEffects
can now logically operate within a distributed OLAP environment. In order to make this possible,
it is necessary that each side-effect be registered with a reducing BinaryOperator
. This binary operator will combine
distributed updates into a single global side-effect at the master traversal. Many of the methods in TraversalSideEffects have been deprecated, but they are backwards compatible save that TraversalSideEffects.get() no longer returns an Optional, but instead throws an IllegalArgumentException if the key is unknown. While the Optional semantics could have remained, it was deemed best to directly return the side-effect value to reduce object creation costs and, because all side-effects must be registered a priori, there is never a reason why an unknown side-effect key would be used. In short:
// change
traversal.getSideEffects().get("m").get()
// to
traversal.getSideEffects().get("m")
TraversalSideEffects Registration Requirement
All TraversalSideEffects
must be registered upfront. This is because, in OLAP, side-effects map to Memory
compute keys
and as such, must be declared prior to the execution of the TraversalVertexProgram
. If a user’s traversal creates a
side-effect mid-traversal, it will fail. The traversal must use GraphTraversalSource.withSideEffect()
to declare
the side-effects it will use during its execution lifetime. If the user’s traversals use standard side-effect Gremlin
steps (e.g. group("m")
), then no changes are required.
See: TINKERPOP-1192
TraversalSideEffects Add Requirement
In a distributed environment, a side-effect cannot be mutated and be expected to exist in the mutated form at the final, aggregated, master traversal. For instance, if the side-effect "myCount" references a Long, the Long cannot be updated directly via sideEffects.set("myCount", sideEffects.get("myCount") + 1). Instead, it must rely on the registered reducer to do the merging and thus, the Step must do sideEffects.add("myCount",1), where the registered reducer is Operator.sum.
Thus, the first traversal below will increment "a". If no operator is provided, as in the second traversal, then the operator is assumed to be Operator.assign and the final result of "a" would be 1. Note that Traverser.sideEffects(key,value) uses TraversalSideEffects.add().
gremlin> traversal = g.withSideEffect('a',0,sum).V().out().sideEffect{it.sideEffects('a',1)}
==>v[3]
==>v[2]
==>v[4]
==>v[5]
==>v[3]
==>v[3]
gremlin> traversal.getSideEffects().get('a')
==>6
gremlin> traversal = g.withSideEffect('a',0).V().out().sideEffect{it.sideEffects('a',1)}
==>v[3]
==>v[2]
==>v[4]
==>v[5]
==>v[3]
==>v[3]
gremlin> traversal.getSideEffects().get('a')
==>1
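The two outcomes above follow directly from how a registered reducer merges each update. The sketch below mimics that behavior with a small hypothetical class built on java.util.function.BinaryOperator; it is not the TraversalSideEffects implementation, and the sum and assign reducers are written as lambdas standing in for Operator.sum and Operator.assign.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

final class SideEffectReducerSketch {
    private final Map<String, Object> values = new HashMap<>();
    private final Map<String, BinaryOperator<Object>> reducers = new HashMap<>();

    // Registers a side-effect up front with its initial value and reducer.
    void register(String key, Object initial, BinaryOperator<Object> reducer) {
        values.put(key, initial);
        reducers.put(key, reducer);
    }

    // Merges an update through the registered reducer instead of overwriting.
    void add(String key, Object update) {
        if (!reducers.containsKey(key))
            throw new IllegalArgumentException("Unregistered side-effect: " + key);
        values.put(key, reducers.get(key).apply(values.get(key), update));
    }

    // Returns the value directly and rejects unknown keys.
    Object get(String key) {
        if (!values.containsKey(key))
            throw new IllegalArgumentException("Unregistered side-effect: " + key);
        return values.get(key);
    }
}
```

With a summing reducer, six add("a", 1) calls accumulate to 6; with an assigning reducer, each call replaces the previous value, so the result stays at 1.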
See: TINKERPOP-1192, TINKERPOP-1166
ProfileStep Update and GraphTraversal API Change
The profile()
-step has been refactored into 2 steps — ProfileStep
and ProfileSideEffectStep
. Users who previously
used the profile()
in conjunction with cap(TraversalMetrics.METRICS_KEY)
can now simply omit the cap step. Users who
retrieved TraversalMetrics
from the side-effects after iteration can still do so, but will need to specify a side-effect
key when using the profile()
. For example, profile("myMetrics")
.
See: TINKERPOP-958
BranchStep Bug Fix
There was a bug in BranchStep
that also surfaces in subclass steps such as UnionStep
and ChooseStep
.
For traversals with branches that have barriers (e.g. count()
, max()
, groupCount()
, etc.), the traversal needs to be updated.
For instance, if a traversal is of the form g.V().union(out().count(),both().count())
, the result is now different
(the bug fix yields a different output). In order to yield the same result, the traversal should be rewritten as
g.V().local(union(out().count(),both().count()))
. Note that if a branch does not have a barrier, then no changes are required.
For instance, g.V().union(out(),both())
does not need to be updated. Moreover, if the user’s traversal already used
the local()
-form, then no changes are required either.
See: TINKERPOP-1188
MemoryComputeKey and VertexComputeKey
Users that have custom VertexProgram
implementations will need to change their implementations to support the new
VertexComputeKey
and MemoryComputeKey
classes. In the VertexPrograms
provided by TinkerPop, these changes were trivial,
taking less than 5 minutes to make all the requisite updates.
- VertexProgram.getVertexComputeKeys() returns a Set<VertexComputeKey>. No longer a Set<String>. Use VertexComputeKey.of(String key, boolean transient) to generate a VertexComputeKey. Transient keys were not supported in the past, so to make the implementation semantically equivalent, the boolean transient should be false.
- VertexProgram.getMemoryComputeKeys() returns a Set<MemoryComputeKey>. No longer a Set<String>. Use MemoryComputeKey.of(String key, BinaryOperator reducer, boolean broadcast, boolean transient) to generate a MemoryComputeKey. Broadcasting and transients were not supported in the past, so to make the implementation semantically equivalent, the boolean broadcast should be true and the boolean transient should be false.
An example migration looks as follows. What might currently look like:
public Set<String> getMemoryComputeKeys() {
    return new HashSet<>(Arrays.asList("a","b","c"));
}
Should now look like:
public Set<MemoryComputeKey> getMemoryComputeKeys() {
    return new HashSet<>(Arrays.asList(
        MemoryComputeKey.of("a", Operator.and, true, false),
        MemoryComputeKey.of("b", Operator.sum, true, false),
        MemoryComputeKey.of("c", Operator.or, true, false)));
}
A similar pattern should also be used for VertexProgram.getVertexComputeKeys()
.
See: TINKERPOP-1162
SparkGraphComputer and GiraphGraphComputer Persistence
The MapReduce
-based steps in TraversalVertexProgram
have been removed and replaced using a new Memory
-reduction model.
MapReduce
jobs always created a persistence footprint, e.g. in HDFS. Memory
data was never persisted to HDFS.
As such, there will be no data on the disk that is accessible. For instance, there is no more ~reducing
, ~traversers
,
and specially named side-effects such as m
from a groupCount('m')
. The data is still accessible via ComputerResult.memory()
,
it simply does not have a corresponding on-disk representation.
RemoteGraph
RemoteGraph
is a lightweight Graph
implementation that acts as a proxy for sending traversals to Gremlin Server for
remote execution. It is an interesting alternative to the other methods for connecting to Gremlin Server in that all
other methods involved construction of a String
representation of the Traversal
which is then submitted as a script
to Gremlin Server (via driver or HTTP).
gremlin> graph = RemoteGraph.open('conf/remote-graph.properties')
==>remotegraph[DriverServerConnection-localhost/127.0.0.1:8182 [graph='graph]]
gremlin> g = graph.traversal()
==>graphtraversalsource[remotegraph[DriverServerConnection-localhost/127.0.0.1:8182 [graph='graph]], standard]
gremlin> g.V().valueMap(true)
==>[name:[marko], label:person, id:1, age:[29]]
==>[name:[vadas], label:person, id:2, age:[27]]
==>[name:[lop], label:software, id:3, lang:[java]]
==>[name:[josh], label:person, id:4, age:[32]]
==>[name:[ripple], label:software, id:5, lang:[java]]
==>[name:[peter], label:person, id:6, age:[35]]
Note that g.V().valueMap(true)
is executing in Gremlin Server and not locally in the console.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
GraphStep Compilation Requirement
OLTP graph providers that have a custom GraphStep
implementation should ensure that g.V().hasId(x)
and g.V(x)
compile
to the same representation. This ensures a consistent user experience around random access of elements based on ids
(as opposed to potentially the former doing a linear scan). A static helper method called GraphStep.processHasContainerIds()
has been added. TinkerGraphStepStrategy
was updated as such:
((HasContainerHolder) currentStep).getHasContainers().forEach(tinkerGraphStep::addHasContainer);
is now
((HasContainerHolder) currentStep).getHasContainers().forEach(hasContainer -> {
if (!GraphStep.processHasContainerIds(tinkerGraphStep, hasContainer))
tinkerGraphStep.addHasContainer(hasContainer);
});
See: TINKERPOP-1219
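The effect of folding id-based filters into the step can be pictured with a small, self-contained model. Everything below (IdFoldingSketch, SketchHas, SketchGraphStep, foldIds, compile) is a hypothetical stand-in written for illustration. It is not the TinkerPop API, it does not reproduce the logic of GraphStep.processHasContainerIds(), and it ignores the case where a step already carries ids and also receives an id filter.

```java
import java.util.ArrayList;
import java.util.List;

final class IdFoldingSketch {

    /** Hypothetical stand-in for a has()-filter: a key and the value it must equal. */
    static final class SketchHas {
        final String key;
        final Object value;
        SketchHas(String key, Object value) { this.key = key; this.value = value; }
    }

    /** Hypothetical stand-in for a provider's graph step. */
    static final class SketchGraphStep {
        final List<Object> ids = new ArrayList<>();         // resolved by random access
        final List<SketchHas> filters = new ArrayList<>();  // applied as ordinary filters
    }

    /** Returns true if the filter was an id filter and was folded into the step's ids. */
    static boolean foldIds(SketchGraphStep step, SketchHas has) {
        if (!"id".equals(has.key)) return false;
        step.ids.add(has.value);
        return true;
    }

    /** Same shape as the strategy loop shown above: fold id filters, keep everything else. */
    static SketchGraphStep compile(List<Object> startIds, List<SketchHas> hasContainers) {
        SketchGraphStep step = new SketchGraphStep();
        step.ids.addAll(startIds);
        for (SketchHas has : hasContainers) {
            if (!foldIds(step, has)) step.filters.add(has);
        }
        return step;
    }
}
```

In this model, a step built from an explicit id and a step built from an equivalent id filter end up with the same id list, which is the property the requirement asks providers to preserve.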
Step API Update
The Step
interface is fundamental to Gremlin. Step.processNextStart()
and Step.next()
both returned Traverser<E>
.
We had so many Traverser.asAdmin()
and direct typecast calls throughout (especially in TraversalVertexProgram
) that
it was deemed prudent to have Step.processNextStart()
and Step.next()
return Traverser.Admin<E>
. Moreover it makes
sense as this is internal logic where Admins
are always needed. Providers with their own step definitions will simply
need to change the method signatures of Step.processNextStart()
and Step.next()
. No logic update is required — save
that asAdmin()
can be safely removed if used. Also, Step.addStart()
and Step.addStarts()
take Traverser.Admin<S>
and Iterator<Traverser.Admin<S>>
, respectively.
Traversal API Update
The way in which TraverserRequirements
are calculated has been changed (for the better). The ramification is that post
compilation requirement additions no longer make sense and should not be allowed. To enforce this, the
Traversal.addTraverserRequirement()
method has been removed from the interface. Moreover, providers/users should never be able
to add requirements manually (this should all be inferred from the end compilation). However, if need be, there is always
RequirementStrategy
which will allow the provider to add a requirement at strategy application time
(though again, there should not be a reason to do so).
ComparatorHolder API Change
Providers that either have their own ComparatorHolder
implementation or reason on OrderXXXStep
will need to update their code.
ComparatorHolder
now returns List<Pair<Traversal,Comparator>>
. This has greatly reduced the complexity of comparison-based
steps like OrderXXXStep
. However, it’s a breaking API change; updating is trivial, but some awareness is required.
See: TINKERPOP-1209
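To see why a single ordered list of (traversal, comparator) pairs keeps ordering logic simple, here is a minimal model in plain Java. A java.util.function.Function stands in for the Traversal that extracts the value to compare, and Map.Entry stands in for Pair. OrderPairsSketch and its methods are hypothetical names, not TinkerPop classes.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class OrderPairsSketch {

    /** Folds the pairs, in order, into one comparator; the first pair is the primary sort key. */
    static <T> Comparator<T> combine(List<Map.Entry<Function<T, Object>, Comparator<Object>>> pairs) {
        Comparator<T> combined = (a, b) -> 0; // no pairs: every element compares as equal
        for (Map.Entry<Function<T, Object>, Comparator<Object>> pair : pairs) {
            Function<T, Object> extract = pair.getKey();
            Comparator<Object> cmp = pair.getValue();
            Comparator<T> next = (a, b) -> cmp.compare(extract.apply(a), extract.apply(b));
            combined = combined.thenComparing(next);
        }
        return combined;
    }

    /** Example use: order strings by length first, then alphabetically. */
    static List<String> byLengthThenAlpha(List<String> input) {
        Function<String, Object> length = s -> s.length();
        Comparator<Object> asInts = (a, b) -> Integer.compare((Integer) a, (Integer) b);
        Function<String, Object> self = s -> s;
        Comparator<Object> asStrings = (a, b) -> ((String) a).compareTo((String) b);

        List<Map.Entry<Function<String, Object>, Comparator<Object>>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<>(length, asInts));
        pairs.add(new SimpleEntry<>(self, asStrings));

        List<String> sorted = new ArrayList<>(input);
        sorted.sort(combine(pairs));
        return sorted;
    }
}
```

A step that consumes such a list needs only one loop over the pairs, rather than separate handling for each kind of comparator.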
GraphComputer Semantics and API
Providers that have a custom GraphComputer
implementation will have a lot to handle. Note that if the graph system
simply uses SparkGraphComputer
or GiraphGraphComputer
provided by TinkerPop, then no updates are required. This
only affects providers that have their own custom GraphComputer
implementations.
Memory
updates:
- Any BinaryOperator can be used for reduction and is made explicit in the MemoryComputeKey.
- MemoryComputeKeys can be marked transient and must be removed from the resultant ComputerResult.memory().
- MemoryComputeKeys can be specified to not broadcast and thus must not be available to workers to read in VertexProgram.execute().
- The Memory API has been changed. No more incr(), and(), etc. Now it’s just set() (setup/terminate) and add() (execute).
VertexProgram
updates:
- VertexComputeKeys can be marked transient and must be removed from the resultant ComputerResult.graph().
Operational semantic test cases have been added to GraphComputerTest
to ensure that all the above are implemented correctly.
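The Memory rules above can be modeled in a few lines of plain Java. This is a sketch of the semantics only: SketchMemory, declare, and result are hypothetical names, the broadcast flag is left out, and the real Memory and MemoryComputeKey classes have their own signatures.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.BinaryOperator;

final class SketchMemory {
    private final Map<String, Object> values = new HashMap<>();
    private final Map<String, BinaryOperator<Object>> reducers = new HashMap<>();
    private final Set<String> transientKeys = new HashSet<>();

    /** Declares a key together with its reducer and whether the key is transient. */
    void declare(String key, BinaryOperator<Object> reducer, boolean isTransient) {
        reducers.put(key, reducer);
        if (isTransient) transientKeys.add(key);
    }

    /** set(): overwrite the value, as done during setup and terminate. */
    void set(String key, Object value) {
        values.put(key, value);
    }

    /** add(): merge a contribution through the key's reducer, as done during execute. */
    void add(String key, Object value) {
        values.merge(key, value, reducers.get(key));
    }

    /** The final view of the memory drops every key that was marked transient. */
    Map<String, Object> result() {
        Map<String, Object> out = new HashMap<>(values);
        out.keySet().removeAll(transientKeys);
        return out;
    }
}
```

Because the reducer is attached to the key rather than to the call site, the same add() call covers what incr(), and(), and similar methods used to do separately.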
Barrier Step Updates
The Barrier
interface used to simply be a marker interface. Now it has methods and it is the primary means by which
distributed steps across an OLAP job are aggregated and distributed. It is unlikely that Barrier
was ever used
directly by a provider’s custom step. Instead, a provider most likely extended SupplyingBarrierStep
, CollectingBarrierStep
,
and/or ReducingBarrierStep
.
Providers that have custom extensions to these steps or that use Barrier
directly will need to adjust their implementation slightly to
accommodate a new API that reflects the Memory
updates above. This should be a simple change. Note that FinalGet
no longer exists and such post-reduction processing is handled by the reducing step (via the new Generating
interface).
See: TINKERPOP-1164
Performance Tests
The ProcessPerformanceSuite
and TraversalPerformanceTest
have been deprecated. They are still available, but going forward,
providers should implement their own performance tests and not rely on the built-in JUnit benchmark-based performance test suite.
Graph Processor Providers
GraphFilter and GraphComputer
The GraphComputer
API has changed with the addition of GraphComputer.vertices(Traversal)
and GraphComputer.edges(Traversal)
.
These methods construct a GraphFilter
object which is also new to TinkerPop 3.2.0. GraphFilter
is a "push-down predicate"
used to selectively retrieve subgraphs of the underlying graph to be OLAP processed.
- If the graph system provider relies on an existing GraphComputer implementation such as SparkGraphComputer and/or GiraphGraphComputer, then there is no immediate action required on their part to remain TinkerPop-compliant. However, they may wish to update their InputFormat or InputRDD implementation to be GraphFilterAware and handle the GraphFilter filtering at the disk/database level. It is advisable to do so in order to reduce OLAP load times and memory/GC usage.
- If the graph system provider has their own GraphComputer implementation, then they should implement the two new methods and ensure that GraphFilter is processed correctly. There is a new test case called GraphComputerTest.shouldSupportGraphFilter() which ensures the semantics of GraphFilter are handled correctly. For a "quick and easy" way to move forward, look to GraphFilterInputFormat as a way of wrapping an existing InputFormat to do filtering prior to VertexProgram or MapReduce execution.
Note: To quickly move forward, the GraphComputer implementation can simply set GraphComputer.Features.supportsGraphFilter() to false and ensure that GraphComputer.vertices() and GraphComputer.edges() throw GraphComputer.Exceptions.graphFilterNotSupported(). This is not recommended, as it’s best to support GraphFilter.
See: TINKERPOP-962
Job Chaining and GraphComputer
TinkerPop 3.2.0 has integrated VertexPrograms
into GraphTraversal
. This means that a single traversal can compile to multiple
GraphComputer
OLAP jobs. This requires that ComputerResults
be chainable. There were never any explicit tests to verify if a
provider’s GraphComputer
could be chained, but now there are. Given a reasonable implementation, it is likely that no changes
are required of the provider. However, to ensure the implementation is "reasonable" GraphComputerTests
have been added.
- For providers that support their own GraphComputer implementation, note that there is a new GraphComputerTest.shouldSupportJobChaining(). This test verifies that the ComputerResult output of one job can be fed into the input of a subsequent job. Only linear chains are tested/required currently. In the future, branching DAGs may be required.
- For providers that support their own GraphComputer implementation, note that there is a new GraphComputerTest.shouldSupportPreExistingComputeKeys(). When chaining OLAP jobs together, if an OLAP job requires the compute keys of a previous OLAP job, then the existing compute keys must be accessible. A simple 2 line change to SparkGraphComputer and TinkerGraphComputer solved this for TinkerPop. GiraphGraphComputer did not need an update as this feature was already naturally supported.
See: TINKERPOP-570
Graph Language Providers
ScriptTraversal
For providers that have custom Gremlin language implementations (e.g. Gremlin-Scala), there is a new class called ScriptTraversal
which will handle script-based processing of traversals. The entire GroovyXXXTest
-suite was updated to use this new class.
The previous TraversalScriptHelper
class has been deprecated so immediate upgrading is not required, but do look into
ScriptTraversal
as TinkerPop will be using it as a way to serialize "String-based traversals" over the network moving forward.
See: TINKERPOP-1154
ByModulating and Custom Steps
If the provider has custom steps that leverage by()
-modulation, those will now need to implement ByModulating
.
Most of the methods in ByModulating
are default
and, for most situations, only ByModulating.modulateBy(Traversal)
needs to be implemented. Note that this method’s body will most likely be identical to the custom step’s already existing
TraversalParent.addLocalChild()
. It is recommended that the custom step not use TraversalParent.addLocalChild()
as this method may be deprecated in a future release. Instead, barring any complex usages, simply rename the
CustomStep.addLocalChild(Traversal)
to CustomStep.modulateBy(Traversal)
.
See: TINKERPOP-1153
TraversalEngine Deprecation and GraphProvider
The TraversalSource
infrastructure has been completely rewritten. Fortunately for users, their code is backwards compatible.
Unfortunately for graph system providers, a few tweaks to their implementation are in order.
- If the graph system supports more than Graph.compute(), then implement GraphProvider.getGraphComputer().
- For custom TraversalStrategy implementations, change traverser.getEngine().isGraphComputer() to TraversalHelper.onGraphComputer(Traversal).
- For custom Steps, change implements EngineDependent to implements GraphComputing.
See: TINKERPOP-971
TinkerPop 3.1.0
A 187 On The Undercover Gremlinz
TinkerPop 3.1.8
Release Date: August 21, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
TinkerPop 3.1.7
Release Date: June 12, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
GraphML XSLT
There were some inconsistencies in the GraphML format supported in TinkerPop 2.x. These issues were corrected on the
initial release of TinkerPop 3.0.0, but as a result, attempting to read GraphML from 2.x will end with an error. A
newly added XSLT file in gremlin-core
, called tp2-to-tp3-graphml.xslt
, transforms 2.x GraphML into 3.x GraphML,
making it possible to easily read in legacy GraphML through a 3.x GraphMLReader
.
See: TINKERPOP-1608
TinkerPop 3.1.6
Release Date: February 3, 2017
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Providers
Driver Providers
Session Close Confirmation
When a session is closed, it now returns a confirmation in the form of a single NO CONTENT
message. When the message
arrives, it means that the server has already destroyed the session. Prior to this change, the request was somewhat
one-way, in that the client could send the request and the server would silently honor it. The confirmation makes it
a bit easier to ensure from the client perspective that the close did what it was supposed to do, allowing the client
to proceed only when the server was fully complete with its work.
See: TINKERPOP-1544
TinkerPop 3.1.5
Release Date: October 17, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Java Driver and close()
There were a few problems noted around the close()
of Cluster
and Client
instances, including issues that
presented as system hangs. These issues have been resolved; however, it is worth noting that an unchecked exception
that was thrown under a certain situation has changed as part of the bug fixes. When submitting an in-session request
on a Client
that was closed (or closing) an IllegalStateException
is thrown. This replaces older functionality
that threw a ConnectionException
and relied on logic far deeper in the driver to produce that error and had the
potential to open additional resources despite the intention of the user to "close".
See: TINKERPOP-1467
TinkerPop 3.1.4
Release Date: September 6, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin Server Workers
In release 3.1.3, a recommendation was made to
ensure that the threadPoolWorker
setting for Gremlin Server was no less than 2
in cases where Gremlin Server was
being used with sessions that accept parallel requests. In 3.1.4, that is no longer the case and a size of 1
remains
acceptable even in that specific case.
See: TINKERPOP-1350
TinkerPop 3.1.3
Release Date: July 18, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Reserved Gremlin Server Keys
Gremlin Server has always considered certain binding keys (request parameters) as reserved, but that list has now expanded to be more inclusive of all the static enums that are imported to the script engine. It is possible that those using Gremlin Server may have to rename their keys if they somehow successfully were using some of the now reserved terms in previous versions.
See: TINKERPOP-1354
Remote Timeout
Disabling the timeout for a :remote
to Gremlin Server was previously accomplished by setting the timeout to max
as
in:
:remote config timeout max
where max
would set the timeout to be Integer.MAX_VALUE
. While this feature is still supported, it has been
deprecated in favor of the new configuration option of none
, as in:
:remote config timeout none
The use of none
completely disables the timeout rather than just setting an arbitrarily high one. Note that it is
still possible to get a timeout on a request if the server timeout limits are reached. The console timeout value only
refers to how long the console will wait for a response from the server before giving up. By default, the timeout is
set to none
.
See: TINKERPOP-1267
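The difference between max and none can be made concrete with a small parser. This is a hypothetical model of the setting, not the console's actual implementation: RemoteTimeoutSketch and parse are made-up names, and representing "no timeout" as an empty OptionalInt is an assumption of the sketch.

```java
import java.util.OptionalInt;

final class RemoteTimeoutSketch {

    /** Parses the value given to ":remote config timeout" into an optional wait limit. */
    static OptionalInt parse(String value) {
        switch (value) {
            case "none":
                return OptionalInt.empty();                 // no limit at all: wait indefinitely
            case "max":
                return OptionalInt.of(Integer.MAX_VALUE);   // deprecated: very long, but still a limit
            default:
                return OptionalInt.of(Integer.parseInt(value));
        }
    }
}
```

The model makes the distinction visible: max still yields a number, whereas none yields no number to wait against.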
Gremlin Server Workers
Past configuration recommendations for the threadPoolWorker
setting on Gremlin Server stated this value could be
safely set to 1
at the low end. A size of 1
is still valid for most cases; however, if Gremlin Server is being used
with sessions that accept parallel requests, then this value should be no less than 2
or else certain scripts (i.e.
those that block for an extended period of time) may cause Gremlin Server to lock up the session.
See: TINKERPOP-1350
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph Database Providers
Property Keys and Hyphens
Graph providers should no longer rely on the test suite to validate that hyphens work for labels and property keys.
Vertex and Edge Counts
A large number of asserts for vertex and edge counts in the test suite were not being applied. This problem has been rectified, but could manifest as test errors for different implementations. The chances of the new assertions identifying previously unrecognized bugs seem slim, however, as there are many other tests that validate these counts in other ways. If those were passing previously, then these new asserts should likely not pose a problem.
See: TINKERPOP-1300
Test Feature Annotations
A large number of gremlin-test
feature annotations were incorrect which caused test cases to run against graphs that
did not support those features. The annotations have been fixed, but this opened the possibility that more test cases
will run against the graph implementation. Providers should ensure that their graph features()
are consistent with
the capabilities of the graph implementation.
See: TINKERPOP-1319
Graph Language Providers
AndTest Renaming
The get_g_V_andXhasXage_gt_27XoutE_count_gt_2X_name
test in AndTest
was improperly named and did not match the
nature of the traversal it was providing. It has been renamed to: get_g_V_andXhasXage_gt_27XoutE_count_gte_2X_name
.
Driver Providers
SASL Mechanism
Note that the Gremlin Driver for Java now passes a new parameter for SASL authentication called saslMechanism
. This
is an optional argument and does not represent a breaking change, but it does make the overall implementation more
complete. While the default authentication implementations packaged with Gremlin Server don’t utilize this argument,
other implementations might, so the drivers should be able to pass it as per the SASL specification.
See: TINKERPOP-1263
TinkerPop 3.1.2
Release Date: April 8, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Aliasing Sessions
Calls to SessionedClient.alias()
used to throw UnsupportedOperationException
and it was therefore not possible to
use that capability with a session. That method is now properly implemented and aliasing is allowed.
See: TINKERPOP-1096
Remote Console
The :remote console
command provides a way to avoid having to prefix the :>
command to scripts when remoting. This
mode of console usage can be convenient when working exclusively with a remote like Gremlin Server and there is only a
desire to view the returned data and not to actually work with it locally in any way.
Console Remote Sessions
The :remote tinkerpop.server
command now allows for a "session" argument to be passed to connect
. This argument
tells the remote to configure itself with a Gremlin Server session. In this way, the console can act as a window to script
execution on the server and behave more like a standard "local" console when it comes to script execution.
See: TINKERPOP-1097
TinkerPop Archetypes
TinkerPop now offers Maven archetypes, which provide example project templates to quickly get started with TinkerPop. The available archetypes are as follows:
- gremlin-archetype-server - An example project that demonstrates the basic structure of a Gremlin Server project, how to connect with the Gremlin Driver, and how to embed Gremlin Server in a testing framework.
- gremlin-archetype-tinkergraph - A basic example of how to structure a TinkerPop project with Maven.
Session Transaction Management
When connecting to a session with gremlin-driver
, it is now possible to configure the Client
instance so as to
request that the server manage the transaction for each request.
Cluster cluster = Cluster.open();
Client client = cluster.connect("sessionName", true);
Specifying true
to the connect()
method signifies that the client
should make each request as one encapsulated
in a transaction. With this configuration of client
there is no need to close a transaction manually.
Session Timeout Setting
The gremlin-driver
now has a setting called maxWaitForSessionClose
that allows control of how long it will wait for
an in-session connection to respond to a close request before it simply times out and moves on. When that happens,
the server will eventually close the connection, either at session expiration or at the time of shutdown.
See: TINKERPOP-1160
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
All Providers
Provider Documentation
Documentation related to the lower-level APIs used by a provider, that was formerly in the reference documentation, has been moved to its own documentation set that is now referred to as the Provider Documentation.
See: TINKERPOP-937
Graph System Providers
GraphProvider.clear() Semantics
The semantics of the various clear()
methods on GraphProvider
didn’t really change, but it would be worth reviewing
their implementations to ensure that implementations can be called successfully in an idempotent fashion. Multiple
calls to clear()
may occur for a single test on the same Graph
instance, as 3.1.1-incubating
introduced an
automated method for clearing graphs at the end of a test and some tests call clear()
manually.
See: TINKERPOP-1146
Driver Providers
Session Transaction Management
Up until now transaction management has been a feature of sessionless requests only, but the new manageTransaction
request argument for the Session OpProcessor
changes that. Session-based requests can now pass this boolean value on each request to signal to
Gremlin Server that it should attempt to commit (or rollback) the transaction at the end of the request. By default,
this value is false
, so there is no change to the protocol for this feature.
scriptEvalTimeout Override
The Gremlin Server protocol now allows the passing of scriptEvaluationTimeout
as an argument to the SessionOpProcessor
and the StandardOpProcessor
. This value will override the setting of the same name provided in the Gremlin Server
configuration file on a per request basis.
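The override rule can be sketched as a simple lookup: use the request's value when one is present, otherwise fall back to the configured one. TimeoutOverrideSketch and effectiveTimeout are hypothetical names, and treating the request arguments as a plain Map with a numeric value is an assumption of this sketch rather than a description of the server's internals.

```java
import java.util.Map;

final class TimeoutOverrideSketch {

    /** Returns the per-request timeout if the request carries one, else the configured value. */
    static long effectiveTimeout(Map<String, Object> requestArgs, long configuredTimeout) {
        Object override = requestArgs.get("scriptEvaluationTimeout");
        return override instanceof Number
                ? ((Number) override).longValue()   // the request wins
                : configuredTimeout;                // no override supplied
    }
}
```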
Plugin Providers
RemoteAcceptor allowRemoteConsole
The RemoteAcceptor
now has a new method called allowRemoteConsole()
. It has a default implementation that
returns false
and should thus be a non-breaking change for current implementations. This value should only be set
to true
if the implementation expects the user to always use :>
to interact with it. For example, the
tinkerpop.server
plugin expects all user interaction through :>
, where the line is sent to Gremlin Server. In
that case, that RemoteAcceptor
implementation can return true
. On the other hand, the tinkerpop.gephi
plugin
expects that the user will sometimes call :>
and sometimes work with local evaluation as well. It interacts with the
local variable bindings in the console itself. For tinkerpop.gephi
, this method returns false
.
TinkerPop 3.1.1
Release Date: February 8, 2016
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Storage I/O
The gremlin-core
io-package now has a Storage
interface. The methods that were available via hdfs
(e.g. rm()
, ls()
, head()
, etc.) are now part of Storage
. Both HDFS and Spark implement Storage
via
FileSystemStorage
and SparkContextStorage
, respectively. SparkContextStorage
adds support for interacting with
persisted RDDs in the Spark cache.
This update changed a few of the file handling methods. As it stands, these changes only affect manual Gremlin Console usage as HDFS support was previously provided via Groovy meta-programming. Thus, these are not "code-based" breaking changes.
- hdfs.rmr() no longer exists. hdfs.rm() is now recursive. Simply change all references to rmr() to rm() for identical behavior.
- hdfs.head(location,lines,writableClass) no longer exists.
  - For graph locations, use hdfs.head(location,writableClass,lines).
  - For memory locations, use hdfs.head(location,memoryKey,writableClass,lines).
- hdfs.head(…,ObjectWritable) no longer exists. Use SequenceFileInputFormat, as an input format is now the parsing class.
Given that HDFS (and now Spark) interactions are possible via Storage
and no longer via Groovy meta-programming,
developers can use these Storage
implementations in their Java code. In fact, Storage
has greatly simplified
complex file/RDD operations in both GiraphGraphComputer
and SparkGraphComputer
.
Finally, note that the following low-level/internal classes have been removed: HadoopLoader
and HDFSTools
.
See: TINKERPOP-1033, TINKERPOP-1023
Gremlin Server Transaction Management
Gremlin Server now has a setting called strictTransactionManagement
, which forces the user to pass
aliases
for all requests. The aliases are then used to determine which graphs will have their transactions closed
for that request. The alternative is to continue with default operations where the transactions of all configured
graphs will be closed. It is likely that strictTransactionManagement
(which is false
by default so as to be
backward compatible with previous versions) will become the future standard mode of operation for Gremlin Server as
it provides a more efficient method for transaction management.
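The two modes can be contrasted with a small model of which graphs have their transactions closed after a request. StrictTxSketch and graphsToClose are hypothetical names, and the assumption that aliases map an alias name to a configured graph name is made for the purpose of this sketch. It is not Gremlin Server code.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class StrictTxSketch {

    /** Picks the graphs whose transactions are closed at the end of a request. */
    static Set<String> graphsToClose(boolean strict, Set<String> configuredGraphs, Map<String, String> aliases) {
        if (!strict) {
            return new TreeSet<>(configuredGraphs);      // default: close on every configured graph
        }
        if (aliases == null || aliases.isEmpty()) {
            throw new IllegalArgumentException("aliases are required in strict mode");
        }
        return new TreeSet<>(aliases.values());          // strict: only the graphs the request named
    }
}
```

With many configured graphs and a request that touches one of them, the strict branch has far less work to do, which is the efficiency argument made above.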
Deprecated credentialsDbLocation
The credentialsDbLocation
setting was a TinkerGraph only configuration option to the SimpleAuthenticator
for
Gremlin Server. It provided the file system location to a "credentials graph" that TinkerGraph would read from a
Gryo file at that spot. This setting was only required because TinkerGraph did not support file persistence at the
time that SimpleAuthenticator
was created.
As of 3.1.0-incubating, TinkerGraph received a limited persistence feature that allowed the "credentials graph"
location to be specified in the TinkerGraph properties file via gremlin.tinkergraph.graphLocation
and as such the
need for credentialsDbLocation
was eliminated.
This deprecation is not a breaking change; however, users are encouraged to convert their configurations to use
the gremlin.tinkergraph.graphLocation
as soon as possible, as the deprecated setting will be removed in a future
release.
TinkerGraph Supports Any I/O
TinkerGraph’s 'gremlin.tinkergraph.graphFormat' configuration setting can now take a fully qualified class name
of a Io.Builder
implementation, which means that custom IO implementations can be used to read and write
TinkerGraph instances.
See: TINKERPOP-886
Authenticator Method Deprecation
For users who have a custom Authenticator
implementation for Gremlin Server, there will be a new method present:
public default SaslNegotiator newSaslNegotiator(final InetAddress remoteAddress)
Implementation of this method is now preferred over the old method with the same name that has no arguments. The old
method has been deprecated. This is a non-breaking change as the new method has a default implementation that simply
calls the old deprecated method. In this way, existing Authenticator
implementations will still work.
See: TINKERPOP-995
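The reason this deprecation does not break existing implementations is the default-method delegation pattern, sketched below. The types here (AuthenticatorSketch, SketchAuthenticator, SketchNegotiator, LegacyAuthenticator) are hypothetical stand-ins that only mirror the shape described above. They are not the Gremlin Server interfaces.

```java
import java.net.InetAddress;

final class AuthenticatorSketch {

    /** Hypothetical negotiator type; only its identity matters in this sketch. */
    interface SketchNegotiator {}

    /** Hypothetical interface modeled on the pattern described above. */
    interface SketchAuthenticator {
        /** Old method, kept for compatibility. */
        @Deprecated
        SketchNegotiator newSaslNegotiator();

        /** New preferred method; by default it simply delegates to the old one. */
        default SketchNegotiator newSaslNegotiator(InetAddress remoteAddress) {
            return newSaslNegotiator();
        }
    }

    /** A legacy implementation that only knows about the no-argument method. */
    static final class LegacyAuthenticator implements SketchAuthenticator {
        final SketchNegotiator negotiator = new SketchNegotiator() {};

        @Override
        public SketchNegotiator newSaslNegotiator() {
            return negotiator;
        }
    }
}
```

A caller can invoke the address-taking method on LegacyAuthenticator even though that class never declared it. New implementations would override the address-taking method directly.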
Spark Persistence Updates
Spark RDD persistence is now much safer with a "job server" system that ensures that persisted RDDs are not garbage
collected by Spark. With this, the user is provided a spark
object that enables them to manage persisted RDDs
much like the hdfs
object is used for managing files in HDFS.
Finally, InputRDD
instances no longer need a reduceByKey()
postfix as view merges happen prior to writing the
graphRDD
. Note that a reduceByKey()
postfix will not cause problems if continued, it is simply inefficient
and no longer required.
See: TINKERPOP-1023, TINKERPOP-1027
Logging
Logging to Gremlin Server and Gremlin Console can now be consistently controlled by the log4j-server.properties
and log4j-console.properties
which are in the respective conf/
directories of the packaged distributions.
See: TINKERPOP-859
Gremlin Server Sandboxing
A number of improvements were made to the sandboxing feature of Gremlin Server (more specifically the
GremlinGroovyScriptEngine
). A new base class for sandboxing was introduced with the AbstractSandboxExtension
,
which makes it a bit easier to build white list style sandboxes. A usable implementation of this was also supplied
with the FileSandboxExtension
, which takes a configuration file containing a white list of accessible methods and
variables that can be used in scripts. Note that the original SandboxExtension
has been deprecated in favor of
the AbstractSandboxExtension
or extending directly from Groovy’s TypeCheckingDSL
.
Deprecated supportsAddProperty()
It was realized that VertexPropertyFeatures.supportsAddProperty()
was effectively a duplicate of
VertexFeatures.supportsMetaProperties()
. As a result, supportsAddProperty()
was deprecated in favor of the other.
If using supportsAddProperty()
, simply modify that code to instead utilize supportsMetaProperties()
.
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
Graph System Providers
Data Types in Tests
There were a number of fixes related to usage of appropriate types in the test suite. There were cases where tests were mixing types, such that a single property key might have two different values. This mixed typing caused problems for some graphs and wasn’t really something TinkerPop was looking to explicitly enforce as a rule of implementing the interfaces.
While the changes should not have been breaking, providers should be aware that improved consistencies in the tests may present opportunities for test failures.
Graph Database Providers
Custom ClassResolver
For providers who have built custom serializers in Gryo, there is a new feature to consider. A
GryoMapper
can now take a custom Kryo ClassResolver
, which means that custom types can be coerced to other types
during serialization (e.g. a custom identifier could be serialized as a HashMap
). The advantage to taking this
approach is that users will not need to have the provider’s serializers on the client side. They will only need to
exist on the server (presuming that the type is coerced to a type available on the client, of course). The downside
is that serialization is then no longer a two way street. For example, a custom ClassResolver
that coerced a
custom identifier to HashMap
would let the client work with the identifier as a HashMap
, but the client would then
have to send that identifier back to the server as a HashMap
where it would be recognized as a HashMap
(not an
identifier).
See: TINKERPOP-1064
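The "no longer a two way street" caveat can be illustrated without Kryo at all. In the sketch below, CoercionSketch, CustomId, coerce, and roundTrip are hypothetical names. The code models only the type coercion, not a ClassResolver or any serialization.

```java
import java.util.HashMap;
import java.util.Map;

final class CoercionSketch {

    /** Hypothetical provider-specific identifier that exists only on the server. */
    static final class CustomId {
        final String cluster;
        final long value;
        CustomId(String cluster, long value) { this.cluster = cluster; this.value = value; }
    }

    /** Server side: present the custom id as a plain map so clients need no custom serializer. */
    static Map<String, Object> coerce(CustomId id) {
        Map<String, Object> asMap = new HashMap<>();
        asMap.put("cluster", id.cluster);
        asMap.put("value", id.value);
        return asMap;
    }

    /** What the server gets back when the client returns the identifier: still just a map. */
    static Object roundTrip(CustomId id) {
        Map<String, Object> onClient = coerce(id);      // the client only ever sees a map
        return new HashMap<String, Object>(onClient);   // so a map is all it can send back
    }
}
```

A provider taking this approach therefore needs server-side logic that accepts the map form wherever an identifier is expected.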
Feature Consistency
There were a number of corrections made around the consistency of Features
and how they were applied in tests.
Corrections fell into two groups of changes:
- Bugs in how Features were applied to certain tests.
- Refactoring around the realization that VertexFeatures.supportsMetaProperties() is really just a duplicate of features already exposed as VertexPropertyFeatures.supportsAddProperty(). VertexPropertyFeatures.supportsAddProperty() has been deprecated.
These changes related to "Feature Consistency" open up a number of previously non-executing tests for graphs that did not support meta-properties, so providers should be wary of potential test failure on previously non-executing tests.
Graph Processor Providers
InputRDD and OutputRDD Updates
There are two new methods on the Spark-Gremlin RDD interfaces.
- InputRDD.readMemoryRDD(): get a ComputerResult.memory() from an RDD.
- OutputRDD.writeMemoryRDD(): write a ComputerResult.memory() to an RDD.
Note that both these methods have default implementations which simply work with empty RDDs. Most providers will never
need to implement these methods as they are specific to file/RDD management for GraphComputer
. The four classes that
implement these methods are PersistedOutputRDD
, PersistedInputRDD
, InputFormatRDD
, and OutputFormatRDD
. For the
interested provider, study the implementations therein to see the purpose of these two new methods.
TinkerPop 3.1.0
Release Date: November 16, 2015
Please see the changelog for a complete list of all the modifications that are part of this release.
Additional upgrade information can be found here:
Upgrading for Users
Shading Jackson
The Jackson library is now shaded to gremlin-shaded
, which will allow Jackson to version independently without
breaking compatibility with dependent libraries or with those who depend on TinkerPop. The downside is that if a
library depends on TinkerPop and uses the Jackson classes, those classes will no longer exist with the standard
Jackson package naming. They will have to be shifted as follows:
- org.objenesis becomes org.apache.tinkerpop.shaded.objenesis
- com.esotericsoftware.minlog becomes org.apache.tinkerpop.shaded.minlog
- com.fasterxml.jackson becomes org.apache.tinkerpop.shaded.jackson
See: TINKERPOP-835
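When updating imports by hand or with a script, the relocations listed above amount to a prefix rewrite. The helper below is a hypothetical illustration of that rewrite (ShadedNameSketch and relocate are made-up names). It is not part of TinkerPop or of any build tool.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ShadedNameSketch {

    /** Old package prefix mapped to its shaded replacement, as listed above. */
    private static final Map<String, String> RELOCATIONS = new LinkedHashMap<>();
    static {
        RELOCATIONS.put("org.objenesis", "org.apache.tinkerpop.shaded.objenesis");
        RELOCATIONS.put("com.esotericsoftware.minlog", "org.apache.tinkerpop.shaded.minlog");
        RELOCATIONS.put("com.fasterxml.jackson", "org.apache.tinkerpop.shaded.jackson");
    }

    /** Rewrites a fully qualified class name if it falls under a relocated package. */
    static String relocate(String className) {
        for (Map.Entry<String, String> relocation : RELOCATIONS.entrySet()) {
            String oldPrefix = relocation.getKey();
            if (className.startsWith(oldPrefix + ".")) {
                return relocation.getValue() + className.substring(oldPrefix.length());
            }
        }
        return className; // not a shaded package: leave the name untouched
    }
}
```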
PartitionStrategy and VertexProperty
PartitionStrategy
now supports partitioning within VertexProperty
. The Graph
needs to be able to support
meta-properties for this feature to work.
See: TINKERPOP-333
Gremlin Server and Epoll
Gremlin Server provides a configuration option to turn on support for Netty native transport on Linux, which has been shown to help improve performance.
See: TINKERPOP-901
Rebindings Deprecated
The notion of "rebindings" has been deprecated in favor of the term "aliases". "Alias" is a more intuitive term than "rebinding", which should make it easier for newcomers to understand what aliases are for.
Configurable Driver Channelizer
The Gremlin Driver now allows the Channelizer to be supplied as a configuration, which means that custom implementations may be supplied.
See: TINKERPOP-680
GraphML and Strict Option
The GraphMLReader now has a strict option on the Builder so that if a data type for a value is invalid in some way, GraphMLReader will simply skip that problem value. In that way, it is a bit more forgiving than before, especially with empty data.
See: TINKERPOP-756
Transaction.close() Default Behavior
The default behavior of Transaction.close() is to rollback the transaction. This is in contrast to previous versions where the default behavior was commit. Using rollback as the default should be thought of as a safer approach to closing, where a user must now explicitly call commit() to persist their mutations.
See TINKERPOP-805 for more information.
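The practical effect of the new default can be seen in a toy model. The sketch below uses only the JDK; ToyTx is a hypothetical class written for illustration and is not TinkerPop's Transaction. It assumes only what is stated above: closing without an explicit commit rolls back.

```java
// Toy model only: ToyTx is hypothetical and is not TinkerPop's Transaction.
final class ToyTx implements AutoCloseable {
    private final StringBuilder log = new StringBuilder();
    private boolean open = true;

    void commit() {
        if (open) { log.append("commit"); open = false; }
    }

    void rollback() {
        if (open) { log.append("rollback"); open = false; }
    }

    // Mirrors the default described above: closing an open transaction rolls back.
    @Override
    public void close() {
        rollback();
    }

    String outcome() {
        return log.toString();
    }

    // Runs a unit of work and reports how the transaction ended.
    static String run(final boolean commitExplicitly) {
        final ToyTx tx = new ToyTx();
        try (ToyTx t = tx) {
            if (commitExplicitly) {
                t.commit();
            }
        }
        return tx.outcome();
    }

    public static void main(final String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```

Code that previously relied on close() to persist mutations would, under this model, need an explicit commit() before the transaction is closed.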
ThreadLocal Transaction Settings
The Transaction.onReadWrite() and Transaction.onClose() settings now need to be set for each thread (if a behavior other than the default is desired). This matters for Gremlin Server users who change these settings via scripts. If the settings are changed for a sessionless request, they will now only apply to that one request. If the settings are changed for an in-session request, they will now only apply to future requests made in the scope of that session.
See TINKERPOP-885
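The per-thread nature of these settings follows from how ThreadLocal values behave in Java. The following JDK-only sketch is not TinkerPop code; PerThreadSetting and its string values are hypothetical stand-ins for a transaction setting. It shows that a value changed on one thread is not seen by another thread, which still observes the default.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for a per-thread setting; not TinkerPop code.
final class PerThreadSetting {
    // Every thread starts with the default value "AUTO".
    private static final ThreadLocal<String> SETTING = ThreadLocal.withInitial(() -> "AUTO");

    // Changes the setting on the calling thread, then reports what a
    // freshly started thread sees.
    static String seenByOtherThread() {
        SETTING.set("MANUAL");
        final AtomicReference<String> seen = new AtomicReference<>();
        final Thread other = new Thread(() -> seen.set(SETTING.get()));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.get();
    }

    // Reports what the calling thread itself sees.
    static String seenByCallingThread() {
        return SETTING.get();
    }

    public static void main(final String[] args) {
        System.out.println("other thread:   " + seenByOtherThread());
        System.out.println("calling thread: " + seenByCallingThread());
    }
}
```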
Hadoop-Gremlin
- Hadoop1 is no longer supported. Hadoop2 is now the only supported Hadoop version in TinkerPop.
- Spark and Giraph have been split out of Hadoop-Gremlin into their own respective packages (Spark-Gremlin and Giraph-Gremlin).
- The directory where application jars are stored in HDFS is now hadoop-gremlin-x.y.z-libs, where x.y.z is the TinkerPop version in use (for example, hadoop-gremlin-3.7.4-SNAPSHOT-libs).
  - This versioning is important so that cross-version TinkerPop use does not cause jar conflicts.
See: https://issues.apache.org/jira/browse/TINKERPOP-616
Spark-Gremlin
- Providers that wish to reuse a graphRDD can leverage the new PersistedInputRDD and PersistedOutputRDD.
  - This allows the graphRDD to avoid serialization into HDFS for reuse. Be sure to enable a persisted SparkContext (see documentation).
See: https://issues.apache.org/jira/browse/TINKERPOP-868, https://issues.apache.org/jira/browse/TINKERPOP-925
TinkerGraph Serialization
TinkerGraph is serializable over Gryo, which means that it can be shipped over the wire from Gremlin Server. This feature can be useful when working with remote subgraphs.
See: TINKERPOP-728
Deprecation in TinkerGraph
The public static String configurations have been renamed. The old public static variables have been deprecated. If the deprecated variables were being used, then convert to the replacements as soon as possible.
See: TINKERPOP-926
Deprecation in Gremlin-Groovy
The closure wrapper classes GFunction, GSupplier, and GConsumer have been deprecated. In Groovy, a closure can be specified using as Function and thus, these wrappers are not needed. Also, the GremlinExecutor.promoteBindings() method which was previously deprecated has been removed.
See: TINKERPOP-879, TINKERPOP-897
Gephi Traversal Visualization
The process for visualizing a traversal has been simplified. There is no longer a need to "name" steps that will represent visualization points for Gephi. It is possible to just "configure" a visualTraversal in the console:
gremlin> :remote config visualTraversal graph vg
which creates a special TraversalSource from graph called vg. The traversals created from vg can be used to :submit to Gephi.
Alterations to GraphTraversal
There were a number of changes to GraphTraversal. Many of the changes came by way of deprecation, but some semantics have changed as well:
- ConjunctionStrategy has been renamed to ConnectiveStrategy (no other behaviors changed).
- ConjunctionP has been renamed to ConnectiveP (no other behaviors changed).
- DedupBijectionStrategy has been renamed (and made more effective) as FilterRankingStrategy.
- The GraphTraversal mutation API has changed significantly with all previous methods being supported but deprecated.
  - The general pattern used now is addE('knows').from(select('a')).to(select('b')).property('weight',1.0).
- The GraphTraversal sack API has changed with all previous methods being supported but deprecated.
  - The old sack(mult,'weight') is now sack(mult).by('weight').
- GroupStep has been redesigned such that there is now only a key- and value-traversal. No more reduce-traversal.
  - The previous group()-methods have been renamed to groupV3d0(). To immediately upgrade, rename all your group()-calls to groupV3d0().
  - To migrate to the new group()-methods, what was group().by('age').by(outE()).by(sum(local)) is now group().by('age').by(outE().sum()).
- There was a bug in fold(), where if a bulked traverser was provided, the traverser was only represented once.
  - This bug fix might cause a breaking change to a user query if the non-bulk behavior was being counted on. If so, use dedup() prior to fold().
- Both GraphTraversal.mapKeys() and GraphTraversal.mapValues() have been deprecated.
  - Use select(keys) and select(values). However, note that select() will not unroll the keys/values. Thus, mapKeys() ⇒ select(keys).unfold().
- The data type of Operator enums will now always be the highest common data type of the two given numbers, rather than the data type of the first number, as was the case before.
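The Operator change in the last item can be illustrated with a small JDK-only sketch. WidestSum is a hypothetical helper, not TinkerPop's implementation, and it is deliberately limited to Integer, Long, and Double; the actual promotion rules in TinkerPop cover more types and may differ in detail. The point is only that the result type no longer depends on which operand comes first.

```java
// Hypothetical helper, not TinkerPop's implementation; handles only
// Integer, Long and Double to keep the idea visible.
final class WidestSum {
    // Adds two numbers and returns the result in the wider of the two types.
    static Number sum(final Number a, final Number b) {
        if (a instanceof Double || b instanceof Double) {
            return Double.valueOf(a.doubleValue() + b.doubleValue());
        }
        if (a instanceof Long || b instanceof Long) {
            return Long.valueOf(a.longValue() + b.longValue());
        }
        return Integer.valueOf(a.intValue() + b.intValue());
    }

    public static void main(final String[] args) {
        // The result class is the same regardless of operand order.
        System.out.println(sum(1, 2L).getClass().getSimpleName());
        System.out.println(sum(2L, 1).getClass().getSimpleName());
    }
}
```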
Aliasing Remotes in the Console
The :remote command in Gremlin Console has a new alias configuration option. This alias option allows specification of a set of key/value alias/binding pairs to apply to the remote. In this way, it becomes possible to refer to a variable on the server as something other than what it is referred to for purpose of the submitted script. For example, once a :remote is created, this command:
:remote alias x g
would allow "g" on the server to be referred to as "x".
:> x.E().label().groupCount()
See: TINKERPOP-914
Upgrading for Providers
Important: It is recommended that providers also review all the upgrade instructions specified for users. Many of the changes there may prove important for the provider’s implementation.
All providers should be aware that Jackson is now shaded to gremlin-shaded, which could represent a breaking change if the dependency was being used by way of TinkerPop. In that case, a direct dependency on Jackson may be required on the provider’s side.
Graph System Providers
GraphStep Alterations
- GraphStep is no longer in the sideEffect-package, but now in the map-package as traversals support mid-traversal V().
- Traversals now support mid-traversal V()-steps. Graph system providers should ensure that a mid-traversal V() can leverage any suitable index.
See: https://issues.apache.org/jira/browse/TINKERPOP-762
Decomposition of AbstractTransaction
The AbstractTransaction class has been abstracted into two different classes supporting two different modes of operation: AbstractThreadLocalTransaction and AbstractThreadedTransaction, where the former should be used when supporting ThreadLocal transactions and the latter for threaded transactions. Of course, providers may still choose to build their own implementation on AbstractTransaction itself or simply implement the Transaction interface.
The AbstractTransaction gains the following methods to potentially implement (though default implementations are supplied in AbstractThreadLocalTransaction and AbstractThreadedTransaction):
- doReadWrite that should execute the read-write consumer.
- doClose that should execute the close consumer.
See: TINKERPOP-765, TINKERPOP-885
Transaction.close() Default Behavior
The default behavior for Transaction.close() is to rollback the transaction and is enforced by tests, which previously asserted the opposite (i.e. commit on close). These tests have been renamed to suit the new semantics:
- shouldCommitOnCloseByDefault became shouldCommitOnCloseWhenConfigured
- shouldRollbackOnCloseWhenConfigured became shouldRollbackOnCloseByDefault
If these tests were referenced in an OptOut, then their names should be updated.
See: TINKERPOP-805
Graph Traversal Updates
There were numerous changes to the GraphTraversal API. Nearly all changes are backwards compatible with respective "deprecated" annotations. Please review the respective updates specified in the "Upgrading for Users" section (see "Alterations to GraphTraversal").
- GraphStep is no longer in the sideEffect package. It is now in the map package.
- Make sure mid-traversal GraphStep calls are folding HasContainers in for index-lookups.
- Think about copying TinkerGraphStepStrategyTest for your implementation so you know folding is happening correctly.
Element Removal
Element.Exceptions.elementAlreadyRemoved has been deprecated and test enforcement for consistency has been removed. Providers are free to deal with deleted elements as they see fit.
See: TINKERPOP-297
VendorOptimizationStrategy Rename
The VendorOptimizationStrategy has been renamed to ProviderOptimizationStrategy. This renaming is consistent with revised terminology for what were formerly referred to as "vendors".
See: TINKERPOP-876
GraphComputer Updates
GraphComputer.configure(String key, Object value) is now a method (with default implementation). This allows the user to specify engine-specific parameters to the underlying OLAP system. These parameters are not intended to be supported across engines. Moreover, if there are no parameters that can be altered (beyond the standard GraphComputer methods), then the provider’s GraphComputer implementation should simply return and do nothing.
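The "return and do nothing" guidance maps naturally onto a Java default method. The sketch below is a toy model using only the JDK: ToyComputer, FixedEngine, TunableEngine, and the key "toy.workers" are all hypothetical and only mimic the shape of the idea, not TinkerPop's GraphComputer interface.

```java
import java.util.HashMap;
import java.util.Map;

// Toy stand-in for the idea described above; not TinkerPop's GraphComputer.
interface ToyComputer {
    // Default: ignore the parameter and return the computer for chaining.
    default ToyComputer configure(final String key, final Object value) {
        return this;
    }
}

// An engine with no tunable parameters simply inherits the no-op default.
final class FixedEngine implements ToyComputer { }

// An engine with engine-specific parameters overrides configure() and records them.
final class TunableEngine implements ToyComputer {
    final Map<String, Object> settings = new HashMap<>();

    @Override
    public ToyComputer configure(final String key, final Object value) {
        settings.put(key, value);
        return this;
    }
}
```

In this model, a caller can pass engine-specific keys to any implementation without checking its type first; engines that do not understand a key are unaffected.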
Driver Providers
Aliases Parameter
The "rebindings" argument to the "standard" OpProcessor has been renamed to "aliases". While "rebindings" is still supported, it is recommended that the upgrade to "aliases" be made as soon as possible as support will be removed in the future. Gremlin Server will not accept both parameters at the same time - a request must contain either one parameter or the other if either is supplied.
See: TINKERPOP-913
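For driver authors, the either/or rule can be checked on the client side before a request is sent. The following is a minimal JDK-only sketch; AliasArgs and resolve() are hypothetical names and this is not Gremlin Server's actual validation code. It assumes request arguments are held in a map keyed by parameter name.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical client-side check; not Gremlin Server's actual code.
final class AliasArgs {
    // Returns the alias map from the request arguments, preferring "aliases",
    // falling back to the deprecated "rebindings", and rejecting requests
    // that supply both.
    static Map<String, String> resolve(final Map<String, Map<String, String>> args) {
        final boolean hasAliases = args.containsKey("aliases");
        final boolean hasRebindings = args.containsKey("rebindings");
        if (hasAliases && hasRebindings) {
            throw new IllegalArgumentException("Supply either 'aliases' or 'rebindings', not both");
        }
        if (hasAliases) {
            return args.get("aliases");
        }
        if (hasRebindings) {
            return args.get("rebindings");
        }
        return Collections.emptyMap();
    }
}
```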
ThreadLocal Transaction Settings
If a driver configures the Transaction.onReadWrite() or Transaction.onClose() settings, note that these settings no longer apply to all future requests. If the settings are changed for a sessionless request, they will only apply to that one request. If the settings are changed from an in-session request, they will only apply to future requests made in the scope of that session.
See: TINKERPOP-885
TinkerPop 3.0.0
A Gremlin Rāga in 7/16 Time
TinkerPop 3.0.2
Release Date: October 19, 2015
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
BulkLoaderVertexProgram (BLVP)
BulkLoaderVertexProgram now supports arbitrary inputs (in addition to HadoopGraph, which was already supported in version 3.0.1-incubating). It can now also read from any TP3 enabled graph, like TinkerGraph or Neo4jGraph.
TinkerGraph
TinkerGraph can now be configured to support persistence, where TinkerGraph will try to load a graph from a specified location and calls to close() will save the graph data to that location.
Gremlin Driver and Server
There were a number of fixes to gremlin-driver that prevent protocol desynchronization when talking to Gremlin Server.
On the Gremlin Server side, the Websocket sub-protocol introduces a new "close" operation to explicitly close sessions. Prior to this change, sessions were closed in a more passive fashion (i.e. session timeout). There were also some bug fixes around the protocol as it pertained to third-party drivers (e.g. python) using JSON for authentication.
Upgrading for Providers
Graph Driver Providers
Gremlin Server close Operation
It is important to note that this feature of the sub-protocol applies to the SessionOpProcessor (i.e. for session-based requests). Prior to this change, there was no way to explicitly close a session. Sessions would get closed by the server after timeout of activity. This new "op" gives drivers the ability to close the session explicitly and as needed.
TinkerPop 3.0.1
Release Date: September 2, 2015
Please see the changelog for a complete list of all the modifications that are part of this release.
Upgrading for Users
Gremlin Server
Gremlin Server now supports a SASL-based (Simple Authentication and Security Layer) authentication model and a default SimpleAuthenticator which implements the PLAIN SASL mechanism (i.e. plain text) to authenticate requests. This gives Gremlin Server some basic security capabilities, especially when combined with its built-in SSL feature.
There have also been changes in how global variable bindings in Gremlin Server are established via initialization scripts. The initialization scripts now allow for a Map of values that can be returned from those scripts. That Map will be used to set global bindings for the server. See this sample script for an example.
See: TINKERPOP-576
Neo4j
Problems related to using :install to get the Neo4j plugin operating in Gremlin Console on Windows have been resolved.
See: TINKERPOP-804
Upgrading for Providers
Graph System Providers
GraphFactoryClass Annotation
Providers can consider the use of the new GraphFactoryClass annotation to specify the factory class that GraphFactory will use to open a new Graph instance. This is an optional feature and will generally help implementations that have an interface extending Graph. If that is the case, then this annotation can be used in the following fashion:
@GraphFactoryClass(MyGraphFactory.class)
public interface MyGraph extends Graph {
}
MyGraphFactory must contain the static open method that is normally expected by GraphFactory.
See: TINKERPOP-778
GraphProvider.Descriptor Annotation
There was a change that affected providers who implemented GraphComputer related tests such as the ProcessComputerSuite. If the provider runs those tests, then edit the GraphProvider implementation for those suites to include the GraphProvider.Descriptor annotation as follows:
@GraphProvider.Descriptor(computer = GiraphGraphComputer.class)
public final class HadoopGiraphGraphProvider extends HadoopGraphProvider {
    public GraphTraversalSource traversal(final Graph graph) {
        return GraphTraversalSource.build().engine(ComputerTraversalEngine.build().computer(GiraphGraphComputer.class)).create(graph);
    }
}
See: TINKERPOP-690 for more information.
Semantics of Transaction.close()
There were some adjustments to the test suite with respect to how Transaction.close() was being validated. For most providers, this will generally mean checking OptOut annotations for test renaming problems. The error that occurs when running the test suite should make it apparent that a test name is incorrect in an OptOut if there are issues there.
See: TINKERPOP-764 for more information.
Graph Driver Providers
Authentication
Gremlin Server now supports SASL-based authentication. By default, Gremlin Server is not configured with authentication turned on and authentication is not required, so existing drivers should still work without any additional change. Drivers should however consider implementing this feature as it is likely that many users will want the security capabilities that it provides.
Appendix
TinkerPop 2.x
This section contains a few notes that reference differences between TinkerPop 2.x and 3.x.
One of the major differences between TinkerPop 2.x and TinkerPop 3.x is that in TinkerPop 3.x, the Java convention of using setters and getters was abandoned in favor of a syntax that is more aligned with the syntax of Gremlin-Groovy in TinkerPop2. Given that Gremlin-Java and Gremlin-Groovy are nearly identical due to the inclusion of lambdas from Java 8, a big effort was made to ensure that both languages were as similar as possible.
In addition, TinkerPop2 and below made a sharp distinction between the various TinkerPop projects: Blueprints, Pipes, Gremlin, Frames, Furnace, and Rexster. With TinkerPop 3.x, all of these projects have been merged and are generally known as Gremlin. Blueprints → Gremlin Structure API : Pipes → GraphTraversal : Frames → Traversal : Furnace → GraphComputer and VertexProgram : Rexster → GremlinServer.
GraphML Format
GraphML was a supported format in TinkerPop 2.x, but there were several issues that made it inconsistent with the specification that were corrected for 3.x. As a result, attempting to read a GraphML file generated by 2.x with the 3.x GraphMLReader will result in error. To help with this problem, an XSLT file is provided as a resource in gremlin-core which will transform 2.x GraphML to 3.x GraphML. It can be used as follows:
import java.io.File;
import java.io.FileWriter;
import java.io.InputStream;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Load the 2.x-to-3.x stylesheet that ships as a resource in gremlin-core
InputStream stylesheet = Thread.currentThread().getContextClassLoader().getResourceAsStream("tp2-to-tp3-graphml.xslt");
// Input (TinkerPop 2.x GraphML) and output (TinkerPop 3.x GraphML) files
File datafile = new File("/tmp/tp2-graphml.xml");
File outfile = new File("/tmp/tp3-graphml.xml");
// Build a transformer from the stylesheet
TransformerFactory tFactory = TransformerFactory.newInstance();
StreamSource stylesource = new StreamSource(stylesheet);
Transformer transformer = tFactory.newTransformer(stylesource);
// Apply the transformation
StreamSource source = new StreamSource(datafile);
StreamResult result = new StreamResult(new FileWriter(outfile));
transformer.transform(source, result);
TinkerPop2 Data Migration
For those using TinkerPop 2.x, migrating to TinkerPop 3.x will mean a number of programming changes, but may also require a migration of the data depending on the graph implementation. For example, trying to open TinkerGraph data from TinkerPop 2.x with TinkerPop 3.x code will not work; however, opening a TinkerPop2 Neo4jGraph with a TinkerPop 3.x Neo4jGraph should work provided there aren’t Neo4j version compatibility mismatches preventing the read.
If a particular TinkerPop 2.x Graph cannot be read by TinkerPop 3.x, a "legacy" data migration approach exists. The migration involves writing the TinkerPop2 Graph to GraphSON, then reading it into TinkerPop 3.x with the LegacyGraphSONReader (a limited implementation of the GraphReader interface).
The following represents an example migration of the "classic" toy graph. In this example, the "classic" graph is saved to GraphSON using TinkerPop 2.x.
gremlin> Gremlin.version()
==>2.5.z
gremlin> graph = TinkerGraphFactory.createTinkerGraph()
==>tinkergraph[vertices:6 edges:6]
gremlin> GraphSONWriter.outputGraph(graph,'/tmp/tp2.json',GraphSONMode.EXTENDED)
==>null
The above console session uses the gremlin-groovy distribution from TinkerPop2. It is important to generate the tp2.json file using the EXTENDED mode as it will include data types when necessary which will help limit "lossiness" on the TinkerPop 3.x side when imported. Once tp2.json is created, it can then be imported to a TinkerPop 3.x Graph.
gremlin> Gremlin.version()
==>3.7.4-SNAPSHOT
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> r = LegacyGraphSONReader.build().create()
==>org.apache.tinkerpop.gremlin.structure.io.graphson.LegacyGraphSONReader@64337702
gremlin> r.readGraph(new FileInputStream('/tmp/tp2.json'), graph)
==>null
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:6 edges:6], standard]
gremlin> g.E()
==>e[11][4-created->3]
==>e[12][6-created->3]
==>e[7][1-knows->2]
==>e[8][1-knows->4]
==>e[9][1-created->3]
==>e[10][4-created->5]