Am new to Neo4j and trying to do a POC by implementing a graph DB for Enterprise Reference / Integration Architecture (Architecture showing all enterprise applications as Nodes, Underlying Tables / APIs - logically grouped as Nodes, integrations between Apps as Relationships.
Objective is to achieve seamlessly 'Impact Analysis' using the strength of Graph DB (Note: I understand this may be an incorrect approach to achieve whatever am trying to achieve, so suggestions are welcome)
Let me come brief my question now,
There are four Apps - A1, A2, A3, A4; A1 has set of Tables (represented by a node A1TS1) that's updated by Integration 1 (relationship in this case) and the same set of tables are read by Integration 2. So the Data model looks like below
(A1TS1)<-[:INT1]-(A1)<-[:INT1]-(A2)
(A1TS1)-[:INT2]->(A1)-[:INT2]->(A4)
I have the underlying application table names captured as a List property in A1TS1 node.
Let's say one of the app table is altered for a new column or Data type and I wanted to understand all impacted Integrations and Applications. Now am trying to write a query as below to retrieve all nodes & relationships that are associated/impacted because of this table alteration but am not able to achieve this
Expected Result is - all impacted nodes (A1TS1, A1, A2, A4) and relationships (INT1, INT2)
Option 1 (Using APOC)
MATCH (a {TCName:'A1TS1',AppName:'A1'})-[r]-(b)
WITH a as STRTND, Collect(type(r)) as allr
CALL apoc.path.subgraphAll(STRTND, {relationshipFilter:allr}) YIELD nodes, relationships
RETURN nodes, relationships
This faile with error Failed to invoke procedure 'apoc.path.subgraphAll': Caused by: java.lang.ClassCastException: java.util.ArrayList cannot be cast to java.lang.String
Option 2 (Using with, unwind, collect clause)
MATCH (a {TCName:'A1TS1',AppName:'A1'})-[r]-(b)
WITH a as STRTND, Collect(r) as allr
UNWIND allr as rels
MATCH p=()-[rels]-()-[rels]-()
RETURN p
This fails with error "Cannot use the same relationship variable 'rels' for multiple patterns" but if I use the [rels] once like p=()-[rels]=()
it works but not yielding me all nodes
Any help/suggestion/lead is appreciated. Thanks in advance
Update Trying to give more context Showing the Underlying Data
MATCH (TC:TBLCON) RETURN TC
"TC"
{"Tables":["TBL1","TBL2","TBL3"],"TCName":"A1TS1","AppName":"A1"}
{"Tables":["TBL4","TBL1"],"TCName":"A2TS1","AppName":"A2"}
MATCH (A:App) RETURN A
"A"
{"Sponsor":"XY","Platform":"Oracle","TechOwnr":"VV","Version":"12","Tags":["ERP","OracleEBS","FinanceSystem"],"AppName":"A1"}
{"Sponsor":"CC","Platform":"Teradata","TechOwnr":"RZ","Tags":["EDW","DataWarehouse"],"AppName":"A2"}
MATCH ()-[r]-() RETURN distinct r.relname
"r.relname"
"FINREP" │ (runs between A1 to other apps)
"UPFRNT" │ (runs between A2 to different Salesforce App)
"INVOICE" │ (runs between A1 to other apps)
With this, here is what am trying to achieve Assume "TBL3" is getting altered in App A1, I wanted to write a query specifying the table "TBL3" in match pattern, get all associated relationships and connected nodes (upstream)
May be I need to achieve in 3 steps, Step 1 - Write a match pattern to find the start node and associated relationship(s) Step 2 - Store that relationship(s) from step 1 in a Array variable / parameter Step 3 - Pass the start node from step 1 & parameter from step 2 to apoc.path.subgraphAll to see all the impacted nodes
This may conceptually sound valid but how to do that technically in neo4j Cypher query is the question.
Hope this helps
WIth some lead from cybersam's response, the below query gets me what I want. Only constraint is, this result is limited to 3 layers (3rd layer through Optional Match)
MATCH (TC:TBLCON) WHERE 'TBL3' IN TC.Tables
CALL apoc.path.subgraphAll(TC, {maxLevel:1}) YIELD nodes AS invN, relationships AS invR
WITH TC, REDUCE (tmpL=[], tmpr IN invR | tmpL+type(tmpr)) AS impR
MATCH FLP=(TC)-[]-()-[FLR]-(SL) WHERE type(FLR) IN impR
WITH FLP, TC, SL,impR
OPTIONAL MATCH SLP=(SL)-[SLR]-() WHERE type(SLR) IN impR RETURN FLP,SLP
This works for my needs, hope this might also help someone.
Thanks everyone for the responses and suggestions
****Update****
Enhanced the query to get rid of Optional Match criteria and other given limitations
MATCH (initTC:TBLCON) WHERE $TL IN initTC.Tables
WITH Reduce(O="",OO in Reduce (I=[], II in collect(apoc.node.relationship.types(initTC)) | I+II) | O+OO+"|") as RF
MATCH (TC:TBLCON) WHERE $TL IN TC.Tables
CALL apoc.path.subgraphAll(TC,{relationshipFilter:RF}) YIELD nodes, relationships
RETURN nodes, relationships
Thanks all (especially cybersam)