I am trying to fetch a substring of a field that is stored as an attribute on an edge of the graph, specifically on the company-company edges. I have attached the graph creation query.
g.addV('company').property(id,'SHRTST01').property('name','Alphabet').next()
g.addV('company').property(id,'SHRTST02').property('name','Google').next()
g.addV('company').property(id,'SHRTST03').property('name','Youtube').next()
g.addV('company').property(id,'SHRTST05').property('name','YoutubeKids').next()
g.addV('person').property(id,'SHRTST01_1900-01-01_1_1').property('bu_id', 'SHRTST01_1900-01-01_1_1').property('name', 'W Karl David Laxton').next()
g.addV('person').property(id,'SHRTST02_1900-01-01_1_1').property('bu_id', 'SHRTST02_1900-01-01_1_1').property('name', 'Steven H Strong').next()
g.addE('HAS_SHRHLDING_PC_TO').from(__.V('SHRTST01')).to(__.V('SHRTST01_1900-01-01_1_1')).property(id,'SHRTST01_HAS_SHRHLDING_PC_TO_SHRTST01_1900-01-01_1_1').property('perc_value', 30).next()
g.addE('HAS_VOTING_PC_TO').from(__.V('SHRTST01')).to(__.V('SHRTST01_1900-01-01_1_1')).property(id,'SHRTST01_HAS_VOTING_PC_TO_SHRTST01_1900-01-01_1_1').property('perc_value', 50).next()
g.addE('HAS_SHRHLDING_PC_TO').from(__.V('SHRTST01')).to(__.V('SHRTST02')).property(id,'SHRTST01_HAS_SHRHLDING_PC_TO_SHRTST02_2002-01-01_2_2').property('perc_value', 75).next()
g.addE('HAS_VOTING_PC_TO').from(__.V('SHRTST01')).to(__.V('SHRTST02')).property(id,'SHRTST01_HAS_VOTING_PC_TO_SHRTST02_2002-01-01_2_2').property('perc_value', 50).next()
g.addE('HAS_SHRHLDING_PC_TO').from(__.V('SHRTST02')).to(__.V('SHRTST02_1900-01-01_1_1')).property(id,'SHRTST02_HAS_SHRHLDING_PC_TO_SHRTST02_1900-01-01_1_1').property('perc_value', 25).next()
g.addE('HAS_VOTING_PC_TO').from(__.V('SHRTST02')).to(__.V('SHRTST02_1900-01-01_1_1')).property(id,'SHRTST02_HAS_VOTING_PC_TO_SHRTST02_1900-01-01_1_1').property('perc_value', 23).next()
g.addE('HAS_SHRHLDING_PC_TO').from(__.V('SHRTST02')).to(__.V('SHRTST03')).property(id,'SHRTST02_HAS_SHRHLDING_PC_TO_SHRTST03_2002-01-01_2_2').property('perc_value', 80).next()
g.addE('HAS_VOTING_PC_TO').from(__.V('SHRTST02')).to(__.V('SHRTST03')).property(id,'SHRTST03_HAS_VOTING_PC_TO_SHRTST02_2003-01-01_2_2').property('perc_value', 20).next()
g.addE('HAS_SHRHLDING_PC_TO').from(__.V('SHRTST03')).to(__.V('SHRTST05')).property(id,'SHRTST03_HAS_SHRHLDING_PC_TO_SHRTST05_2002-01-01_2_2').property('perc_value', 75).next()
g.addE('HAS_VOTING_PC_TO').from(__.V('SHRTST03')).to(__.V('SHRTST05')).property(id,'SHRTST03_HAS_VOTING_PC_TO_SHRTST05_2002-01-01_2_2').property('perc_value', 30).next()
The query below gives me the following output, but I need to refine it a bit more to get the expected output.
g.V('SHRTST01').as('crn')
  .repeat(
    outE('HAS_SHRHLDING_PC_TO').as('edge_field')
    .inV()
    .simplePath())
  .until(not(outE()))
  .emit()
  .hasLabel('company')
  .select('crn','edge_field')
  .project('crn', 'edge_field')
    .by(select(keys).select('crn'))
    .by(select(keys).select('edge_field'))
Actual output:
{'crn': v[SHRTST01], 'edge_field': e[SHRTST01_HAS_SHRHLDING_PC_TO_SHRTST02_2002-01-01_2_2][SHRTST01-HAS_SHRHLDING_PC_TO->SHRTST02]}
{'crn': v[SHRTST01], 'edge_field': e[SHRTST02_HAS_SHRHLDING_PC_TO_SHRTST03_2002-01-01_2_2][SHRTST02-HAS_SHRHLDING_PC_TO->SHRTST03]}
{'crn': v[SHRTST01], 'edge_field': e[SHRTST03_HAS_SHRHLDING_PC_TO_SHRTST05_2002-01-01_2_2][SHRTST03-HAS_SHRHLDING_PC_TO->SHRTST05]}
Expected output:
{'crn': [SHRTST01], 'edge_field': [SHRTST01_HAS_SHRHLDING_PC_TO_SHRTST02_2002-01-01_2_2], 'shr_id':[SHRTST02_2002-01-01_2_2]}
{'crn': [SHRTST01], 'edge_field': [SHRTST02_HAS_SHRHLDING_PC_TO_SHRTST03_2002-01-01_2_2], 'shr_id':[SHRTST03_2002-01-01_2_2]}
{'crn': [SHRTST01], 'edge_field': [SHRTST03_HAS_SHRHLDING_PC_TO_SHRTST05_2002-01-01_2_2], 'shr_id':[SHRTST05_2002-01-01_2_2]}
I am not sure how I can derive the shr_id. As you can see, the shr_id is a substring of edge_field. Any leads on this would be very helpful.
I am also interested to know if there is a better way to handle this use case.
Thanks.
The very latest release of Apache TinkerPop (3.7.1) added many new string and list operations to the Gremlin language. Once Amazon Neptune has moved up to that level of Gremlin, you will easily be able to perform substring-like operations. Until then, it's probably easiest to do this in the application.
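As a rough illustration of doing this in the application, here is a minimal Python sketch. It assumes the traversal has been adjusted so that each result row carries the vertex id and edge id as plain strings rather than vertex and edge objects, and that shr_id is always whatever follows the edge label inside the edge id. The helper names shr_id_from_edge_id and add_shr_id are hypothetical, and the rows are hard-coded from the sample data in the question instead of being read from a live Neptune connection.

```python
# Sketch: derive shr_id on the client side from the edge id string.
# Assumption: each row holds plain string ids (not vertex/edge objects).

EDGE_LABEL = "HAS_SHRHLDING_PC_TO"


def shr_id_from_edge_id(edge_id, label=EDGE_LABEL):
    """Return the part of the edge id that follows '_<label>_'."""
    _, sep, tail = edge_id.partition("_" + label + "_")
    if not sep:
        # The id does not follow the '<from>_<label>_<shr_id>' pattern.
        raise ValueError("edge id does not contain label: " + edge_id)
    return tail


def add_shr_id(rows):
    """Return new row dicts with an extra 'shr_id' key."""
    return [
        dict(row, shr_id=shr_id_from_edge_id(row["edge_field"]))
        for row in rows
    ]


# Rows shaped like the query output in the question, with ids as strings.
rows = [
    {"crn": "SHRTST01",
     "edge_field": "SHRTST01_HAS_SHRHLDING_PC_TO_SHRTST02_2002-01-01_2_2"},
    {"crn": "SHRTST01",
     "edge_field": "SHRTST02_HAS_SHRHLDING_PC_TO_SHRTST03_2002-01-01_2_2"},
    {"crn": "SHRTST01",
     "edge_field": "SHRTST03_HAS_SHRHLDING_PC_TO_SHRTST05_2002-01-01_2_2"},
]

for row in add_shr_id(rows):
    print(row)
```

Splitting on the edge label rather than on a fixed character offset keeps this working even if the company ids vary in length. If an edge id does not follow the naming pattern, the helper raises a ValueError instead of silently returning a wrong value.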
This has been a gap in Gremlin for a long time (for many historical reasons), but it is great to see it now being addressed.
Those new steps are discussed here: https://github.com/apache/tinkerpop/blob/3.7.1/CHANGELOG.asciidoc#release-3-7-1