TinkerPop Future
This document offers a rough view at what features can be expected from future releases and catalogs proposals for changes that might be better written and understood in a document form as opposed to a dev list post.
This document is meant to change and restructure frequently and should serve more as a guide rather than directions set
in stone. As this document represents a future look, the current version is always on the master
branch and therefore
the only one that need be maintained.
Roadmap
This section provides some expectations as to what features might be provided in future major versions. It is not a guarantee of a feature landing in a particular version, but it yields some sense of what core developers have some focus on. The most up-to-date version of this document will always be in the git repository.
The sub-sections that follow only outline those release lines that expect to include significant new features. Other release lines may continue to be active with maintenance work and bug fixes, but won’t be listed here. The items listed in each release line represent unreleased changes only. Once an official release is made for a line, the roadmap items are removed with new items taking their place as they are planned. The release line is removed from the roadmap completely when it is no longer maintained.
TinkerPop 4.x
TinkerPop 4 marks the beginning of the move into semantic versioning, as discussed in the DISCUSS thread. Development has begun with the switch from WebSocket to HTTP/1.1 for the underlying transport of Gremlin Server, along with many new features that have been proposed. Here is a rough outline of where new features are expected to land, with major breaking features lined for 4.0, and additional features lined up for minor versions. Additional details on each feature can be found in the Appendix.
4.0 Milestone Release (2024Q4)
As discussed in the DISCUSS thread, we are looking to release a milestone build of what is done so far to gather feedback.
This milestone would roughly offer the opportunity to try Gremlin Server HTTP with the revised GraphSON and GraphBinary serialization formats for TinkerPop 4.0, using Java and Python drivers. These drivers will initially continue to support GraphSON as it can help with debugging (given its human-readable format) and offers wider serialization options for users in these early days, we should expect to just have GraphBinary support for drivers in the future for simplicity.
This milestone release would include the following list of features for preview:
-
-
Replacement of WebSocket with HTTP/1.1 (DISCUSS thread)
-
-
HTTP support - GLVs - Switching from WebSocket to HTTP in GLVs relies on community contribution in each language
-
gremlin-java
-
gremlin-python
-
4.0 (2025H1)
Full GA release of TinkerPop 4.0.
-
-
Replacement of WebSocket with HTTP/1.1 (DISCUSS thread)
-
-
HTTP support - GLVs - Switching from WebSocket to HTTP in GLVs relies on community contribution in each language
-
gremlin-java
-
gremlin-python
-
gremlin-javascript
-
gremlin-dotnet
-
gremlin-go
-
-
Switch default from
GremlinGroovyScriptEngine
toGremlinLangScriptEngine
-
Transaction redesign - API
-
The design of the API with server/core implementations would be viable for the initial release
-
Adding support for these APIs in GLVs sets it up for the implementation in the next iteration
-
4.1…4.x
-
Transaction redesign - Implementation
-
Full API implementation in TinkerGraph
-
-
New Gremlin language elements for geospatial and vector computation
TinkerPop 5.x
Features originally planned for 3.7.x.
-
Add support for traversals as parameters for
V()
,is()
, andhas()
(includesTraversal
arguments toP
) -
Add subgraph/tree structure in all GLVs
-
Define semantics for query federation across Gremlin servers (depends on
call()
step) -
Gremlin debug support
-
Case-insensitive search (TINKERPOP-2673)
-
Mutation steps for
clone()
of anElement
and formoveE()
for edges. -
Add a language element to merge
Map
objects more easily.
Proposals
This section tracks and details future ideas. While the dev list is the primary place to talk about new ideas, complex
topics can be initiated from and/or promoted to this space. While it is fine to include smaller bits of content directly
in future/index.asciidoc
, longer, more developed proposals and ideas would be better added as individual asciidoc
files which would then be included as links to the GitHub repository where they will be viewable in a formatted state.
In this way, this section is more just a list of links to proposals rather than an expansion of text. Proposals should
be named according to this pattern "proposal-<name>-<number>" where the "name" is just a logical title to help identify
the proposal and the "number" is the incremented proposal count.
The general structure of a proposal is fairly open but should include an initial "Status" section which would describe the current state of the proposal. A new proposal would likely hae a status like "Open for discussion". From there, the proposal should include something about the "motivation" for the change which describes a bit about what the issue is and why a change is needed. Finally, it should explain the details of the change itself.
At this stage, the proposal can then be submitted as a pull request for comment. As part of that pull request, the
proposal should be added to the table below. Proposals always target the master
branch.
The table below lists various proposals and their disposition. The Targets column identifies the release or releases to which the proposal applies and the Resolved column helps clarify the state of the proposal itself. Generally speaking, the proposal is "resolved" when the core tenants of its contents are established. For some proposals that might mean "fully implemented", but it might also mean "scheduled and scoped with open issues set aside". In that sense, the meaning is somewhat subjective. Consulting the "Status" section of the proposal itself will provide the complete story.
Proposal | Description | Targets | Resolved |
---|---|---|---|
Equality, Equivalence, Comparability and Orderability Semantics - Documents existing Gremlin semantics along with clarifications for ambiguous behaviors and recommendations for consistency. |
3.6.0 |
N |
|
Gremlin Arrow Flight. |
4.0.0 |
N |
|
Removing the Need for Closures/Lambda in Gremlin |
3.7.0 |
N |
Appendix
TinkerPop 4.x Feature Details
HTTP support - Server
Currently under development in the master-http
branch. This body of work aims to replace the WebSocket protocol in Gremlin Server
with HTTP/1.1 (DISCUSS thread).
For API design, see TINKERPOP-3065
Implement a new HTTP API.
HTTP support - GLVs
As server will no longer support WebSocket, each GLVs will also switch to HTTP protocol. Connection options should be simplified with HTTP compared to WebSocket, and should be unified across all GLVs to the best of each language’s library availability. This will also include implementing interface for pluggable request interceptor for authentication, as raised in the DISCUSS thread.
IO serialization updates
TinkerPop’s serialization IO has not been updated for quite a long time, there are serializers that can and should likely be removed, and definition updated to make the IO overall more simple and maintainable. (DISCUSS thread)
Switch default from GremlinGroovyScriptEngine
to GremlinLangScriptEngine
Switching the default script processing from GremlinGroovyScriptEngine
to GremlinLangScriptEngine
is a step towards removing
dependency on Groovy in the Gremlin Server. Currently, the TinkerPop testing system make heavy use of the Groovy script engine, and
a major portion of the work will involve updating the tests.
Gremlin Console rework
As a result of sessions removal and switch to gremlin-lang
, the Gremlin Console remote mode will be affected, and users
may notice a difference in the interactive experience on the Console. Additional discussions may be needed on the impact and acceptable changes.
(DISCUSS thread)
Transaction redesign
As transaction will have to be implemented over HTTP, this is an opportunity to improve the usability of the transaction APIs. This potentially mean redesigning the transaction model so that it is better suited for all graphs, align remote and embedded transaction usages, and ensure transaction support in GLVs. Such API redesign will be a breaking change that needs to be introduced in the initial release of TP4, which can include stub implementations only, with full implementation added iteratively in minor releases.
Bytecode removal
One of the purposes that bytecode served was to provide a universal way to translate a Traversal. However, with the introduction of
the gremlin-lang
parser this need can be fulfilled differently. Any Gremlin script can be converted into a Traversal in a uniform way which reduces the
need for bytecode. Now, we are left with two systems that serve a similar purpose, it is probably time to remove one of them during a major
version upgrade, see (DISCUSS thread).
Before the full removal can be implemented, a few updates will be needed in gremlin-lang
to ensure appropriate types are covered.
Each GLV will also have to be updated to switch from bytecode based to string based traversal construction. A proposed plan includes:
-
Extract interface from Bytecode, and implement string based traversals and request options
-
Add support for missing types, such as UUID, Set, Edge, ByteBuffer, etc. in
gremlin-lang
(TINKERPOP-3023) -
Add missing types to GLVs and rework traversal generation
-
Ensure Feature tests work properly
I/O serialization update needed
One important note for this proposed plan is that currently gremlin-lang
does not cover all types supported via Bytecode,
which means either all missing types need to be fully defined and implemented in the gremlin-lang
parser for parity
(related to Type System), or consensus have to be reached in the community on if reduced type support
is acceptable, and if so, which types can be omitted at this point.
Groovy removal in Gremlin Server
Removing Groovy from Gremlin Server implies:
-
Revising the configuration system to avoid the init script through Groovy. This is also an opportunity to simply server set-up.
-
Deprecate
GremlinGroovyScriptEngine
forGremlinLangScriptEngine
for script processing -
Remove/replace all the Groovy based plugin infrastructure from the server
One main impact of how Groovy allows arbitrary code to be executed on the server is security vulnerabilities. However, the removal of this system itself has overreaching affects in the community that should be discussed.
Schema support
Schema support relies on a well-defined type system.
Multi-label, no label, mutable label support
TinkerPop only support single, immutable labels for its Elements. Various providers have implemented their own mechanisms for multi-label, no label, and/or mutable label support. Many popular non-TinkerPop graph models also allow for multiple labels. It is time to consider bringing these functionalities into parity.
Multi/meta properties on edges
Currently, meta-properties only exists on vertices, this extends to allowing meta-properties on edges.
Pluggable System for explain/profile()
While TinkerPop provides explain() and profile() steps, switching to a pluggable architecture would increase flexibility for providers who wish to customize the amount and format of information they return. (DISCUSS Thread)
An extension of this is for explain() to work in remote fashion, see TINKERPOP-2128
Improve local()
step
The concept and application of the local()
step has been somewhat confusing to users, and the addition of the string and list
manipulation steps in 3.7 further blurred some definitions of local execution in a traversal. It is a good time to start considering
a redesign or improved design of the local()
step.
Type conversion with cast()
step
We have introduced asString()
and asDate()
in 3.7, this would be to introduce additional casting steps like toInt()
, which
should rely on a well-defined type system.
New Gremlin language elements for geospatial and vector computation
Similar to how string and list manipulation steps were introduced, there is room for creating first-class steps for vector computation and geospatial steps (DISCUSS Thread).
Rework match()
step
The match()
step has been an attempt to introduce a way of declarative form of querying in TinkerPop based on pattern matching.
There exists various issues with the step, and rework is due for improvements.
Unresolved issues related to current match()
:
has()
accepting Traversal
This is a body of work that was in the roadmap for 3.7.x, which is to add support of traversals as parameters to has()
,
which should expand the usability of the Gremlin language.
Query status/query cancellation
These are useful features for debugging and improved resource management that have been implemented by providers, but would now be a good time to bring parity into TinkerPop.
Related issue: TINKERPOP-2210 Support cancellation of remote traversals.
Unify algorithm steps
Moving the algorithm steps into call()
step or generify them in some way.
Modernize IO for OLAP
As name suggests, we should remove old file serialization formats, and introduce more modernized format for IO. One possible candidate is GraphAR, which is a standard data file format for graph data storage and retrieval, currently an incubating Apache project.
A potential large extension of this work, which may not be included for this version yet, is revisiting OLAP in general to resolve open JIRA issues.
Remove neo4j-gremlin
As discussed inside (DISCUSS thread), neo4j-gremlin
was deprecated in 3.7
with the introduction of native transaction in TinkerGraph. TP4 would be the place to remove the module.
Documentation reorganization
In addition to the necessary documentation updates needed for new TP4 feature implementations, this entails more major rework to the documentation structure.
The current documentation is very thorough in certain areas, but lacking in many others. The accumulation of the features and functionalities over the past years likely mean that certain information are outdated, and/or should be reworded for clarity. While we have a generous amount of reference material, there tend to lack implementation guidelines for contributors and providers. TP4 is an opportunity to rework the documentations to be more thorough, concise, clear, and easy to update when new features are implemented.
Another implication of this is to revisit the current documentation generation process. We have a very complex scripting structure that we use to orchestrate the generation of documentations, combined with Maven plugins for language specific docs. This process maybe affected by any major alterations to documentation structure, which would need some effort to revise.
Deprecate sparql-gremlin
This module of TinkerPop has been largely unmaintained and likely unused for many years. Unless we receive fresh interest and contribution, it would be the time to deprecate and remove in a future version.
Proxy implementation
Implementing a proxy for Gremlin Server might be a viable alternative to implementing clustering in the client, for orchestrating multiple Gremlin Server instances, and/or rerouting WebSocket/HTTP requests for compatibility.
io()
step improvements
Simply io()
for data ingestion and export in both embedded and remote usage in some way, and add support for CSV format.
Matrix testing
This aims to create an automated testing set up, which helps to ensure compatibility between drivers and server across minor releases, and to make sure API contracts are not broken unintentionally.
Improved telemetry in driver/server
This is a less well-defined area, aimed at improved metrics collection that can better aid debugging for users and providers. Work may include adding the ability to debug queries and traversals, adding OpenTelemetry support, etc.
Type System
TinkerPop has not had one’s own type system defined and has been relying on the JVM types, which becomes a problem especially in GLVs that doesn’t have corresponding types defined in their language. (DISCUSS thread)
4.x Branching Methodology
Development of 4.x occurs on the 4.0-dev
branch. This branch was created as an orphan branch and therefore has no
history tied to any other branch in the repo including master. As such, there is no need to merge/rebase 4.0-dev
. When
it comes time to promote 4.0-dev
to master
the procedure for doing so will be to:
-
Create a
3.x-master
branch frommaster
-
Delete all content from
master
in one commit -
Rebase
4.0-dev
onmaster
-
Merge
4.0-dev
tomaster
and push
From this point 3.x development will occur on 3.x-master
and 4.x development occurs on master
(with the same version
branching as we have now, e.g 3.3-dev
, 4.1-dev
, etc.) The 3.x-master
branch changes will likely still merge to
master
, but will all merge as no-op changes.