powershell neo4j

Use wildcard characters with Neo4j admin


Does neo4j-admin support use of wildcard characters in filenames?

The following works just fine:

.\bin\neo4j-admin.ps1 database import full --overwrite-destination neo4j `
--nodes="import\node1.txt" `
--nodes="import\node2.txt" `
--relationships="import/edge1.txt" `
--relationships="import/edge2.txt" `
--delimiter "\t"

But the following does not:

.\bin\neo4j-admin.ps1 database import full --overwrite-destination neo4j `
--nodes="import\node1.txt" `
--nodes="import\node2.txt" `
--relationships="import/edge*.txt" `
--delimiter "\t"

java.lang.IllegalArgumentException: File './import/edge*.txt' doesn't exist

I tried every combination of single quotes, double quotes, and no quotes, with both forward slashes and backslashes. None of them worked.

.\bin\neo4j-admin.ps1 --version
5.26.0

Update 1

Following the suggestion in this answer, I get unexpected behavior when referencing multiple files with a regex.

Consider these files:

> cat .\import\node1.txt
PersonID:ID(Person)     Name:string     :LABEL
a1      name1   N
a2      name2   N
> cat .\import\node2.txt
PersonID:ID(Person)     Name:string     :LABEL
a3      name3   N
a4      name4   N
> cat .\import\edge1.txt
:START_ID(Person)       :END_ID(Person) :TYPE
a1      a3      t
a1      a2      t
> cat .\import\edge2.txt
:START_ID(Person)       :END_ID(Person) :TYPE
a3      a4      t
a1      a1      t

When each file is referenced separately, as follows, the import works without issue:

.\bin\neo4j-admin.ps1 database import full --overwrite-destination neo4j --nodes="import\node1.txt" --nodes="import\node2.txt" --relationships=import\edge1.txt --relationships=import\edge2.txt --delimiter "\t" --array-delimiter ";" --verbose

But it fails when the edge files are specified with a regex, as follows:

.\bin\neo4j-admin.ps1 database import full --overwrite-destination neo4j --nodes="import\node1.txt" --nodes="import\node2.txt" --relationships='./import/edge.*txt' --delimiter "\t" --array-delimiter ";" --verbose

Caused by: org.neo4j.importer.FileImporter$CsvImportException: java.io.IOException: java.util.concurrent.ExecutionException: org.neo4j.internal.batchimport.input.InputException: ERROR in input data source: BufferedCharSeeker[source:...1fda5dc62ead.\import\edge2.txt, position:41, line:1] in field: :TYPE:3 for header: [:START_ID(Person), :END_ID(Person), :TYPE] raw field value: :TYPE original error: :START_ID(Person) (Person)-[:TYPE]->:END_ID(Person) (Person) referring to missing node :TYPE

The command above works if I remove the header line from the second edges file (the same does not apply to nodes!). This is confusing, because I would also have to know the order in which the regex-matched files are read in order to tell which file should keep its header.
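One plausible reading of this error, assuming the behavior described in the neo4j-admin import documentation (all files given to a single --relationships option are treated as one logical file whose first line is the header), is that the header line of edge2.txt is being read as a data row. The Python sketch below is not neo4j-admin code; read_group and known_nodes are hypothetical names used only to illustrate that assumption with the file contents shown above.

```python
import csv
import io

# Contents of the two relationship files from the question (tab-separated).
edge1 = ":START_ID(Person)\t:END_ID(Person)\t:TYPE\na1\ta3\tt\na1\ta2\tt\n"
edge2 = ":START_ID(Person)\t:END_ID(Person)\t:TYPE\na3\ta4\tt\na1\ta1\tt\n"


def read_group(*file_contents):
    """Read several files as one logical file: only the very first line is a header."""
    combined = "".join(file_contents)
    rows = list(csv.reader(io.StringIO(combined), delimiter="\t"))
    return rows[0], rows[1:]


header, data = read_group(edge1, edge2)

# Node IDs defined in node1.txt and node2.txt.
known_nodes = {"a1", "a2", "a3", "a4"}

# Rows whose start or end ID does not refer to a known node.
bad_rows = [row for row in data if row[0] not in known_nodes or row[1] not in known_nodes]
print(bad_rows)  # [[':START_ID(Person)', ':END_ID(Person)', ':TYPE']]
```

Under that assumption, the only offending row is the second file's header, which matches the "referring to missing node :TYPE" message and explains why removing that header makes the import succeed. The node files are unaffected in this command because each one is passed through its own --nodes option and so keeps its own header.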


Solution

  • For filenames, neo4j-admin accepts regular expressions, not glob patterns, so a glob's * has to be written as .* (any character, repeated). See the docs here.

    For your case, that would be:

    .\bin\neo4j-admin.ps1 database import full --overwrite-destination neo4j `
    --nodes="import\node1.txt" `
    --nodes="import\node2.txt" `
    --relationships="import/edge.*txt" `
    --delimiter "\t"
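To see why the glob-style pattern finds no file while the regex does, the following Python sketch compares the two readings on the file names from the question. It is only an illustration: it assumes the pattern has to match the whole file name, and it uses Python's re module as a stand-in for the Java regex engine that neo4j-admin runs on (the two agree for patterns this simple). The helper names glob_matches and regex_matches are hypothetical.

```python
import fnmatch
import re

# File names taken from the question's import directory.
files = ["node1.txt", "node2.txt", "edge1.txt", "edge2.txt"]


def glob_matches(pattern, names):
    """Names that a shell-style wildcard pattern would select."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


def regex_matches(pattern, names):
    """Names that a regular expression selects when it must match the whole name."""
    compiled = re.compile(pattern)
    return [n for n in names if compiled.fullmatch(n)]


print(glob_matches("edge*.txt", files))   # ['edge1.txt', 'edge2.txt']
print(regex_matches("edge*.txt", files))  # []
print(regex_matches("edge.*txt", files))  # ['edge1.txt', 'edge2.txt']
```

Read as a regex, edge*.txt means "edg", then zero or more "e", then exactly one arbitrary character, then "txt", which none of the files satisfy. A slightly stricter variant of the working pattern would be edge.*\.txt, which also requires a literal dot before the extension.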