I'm working on a project where I need to run some Linux commands (a sqoop command) from my Scala application. Below is a sample command I tried running against MySQL on my VM.
import sys.process._
"sqoop eval --connect jdbc:mysql://localhost:3306/retail_db --username root --password cloudera --query 'select * from categories'".!
I got the following error:
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
20/06/24 15:25:27 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.0
20/06/24 15:25:27 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure.
Consider using -P instead.
20/06/24 15:25:27 ERROR tool.BaseSqoopTool: Error parsing arguments for eval:
20/06/24 15:25:27 ERROR tool.BaseSqoopTool: Unrecognized argument: *
20/06/24 15:25:27 ERROR tool.BaseSqoopTool: Unrecognized argument: from
20/06/24 15:25:27 ERROR tool.BaseSqoopTool: Unrecognized argument: categories
I also tried this variant and got the same error message:
"sqoop eval --connect jdbc:mysql://localhost:3306/retail_db --username root --password cloudera --query 'select * from categories'".!<
Can someone help me figure out what's causing the error? I've tried both single and double quotes, all to no avail. I searched all over SO but could not find a solution, which is why I'm posting here. NOTE: The same command executes successfully in pyspark, as seen below:
>>> import os
>>> import sys
>>> query = "sqoop eval --connect jdbc:mysql://localhost:3306/retail_db --username root --password cloudera --query 'select * from categories'"
>>> os.system(query)
Warning: /usr/lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
20/06/24 15:28:56 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.13.0
20/06/24 15:28:56 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure.
Consider using -P instead.
20/06/24 15:28:58 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
----------------------------------------------------
| category_id | category_department_id | category_name |
----------------------------------------------------
| 1 | 2 | Football |
| 2 | 2 | Soccer |
| 3 | 2 | Baseball & Softball |
| 4 | 2 | Basketball |
| 5 | 2 | Lacrosse |
| 6 | 2 | Tennis & Racquet |
It looks like sqoop is receiving *, from, and categories as separate arguments and doesn't recognize them. The reason it works when invoked from the command line is that the shell interprets the quote marks and presents the quoted text as a single select * from categories argument. In other words, the shell does some pre-processing before handing everything off to the sqoop program.
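To make the difference concrete, here is a rough sketch of the two argument lists. It assumes the command string is simply split on whitespace with no quote handling, which is consistent with the error above but is an assumption about how the string was tokenized, not something confirmed here. The object name NaiveSplitDemo is made up for illustration.

```scala
// Illustration only: assumes a quote-unaware split on whitespace.
// NaiveSplitDemo is a hypothetical name, not part of any library.
object NaiveSplitDemo {
  val command: String =
    "sqoop eval --connect jdbc:mysql://localhost:3306/retail_db " +
    "--username root --password cloudera --query 'select * from categories'"

  // What a launcher that ignores quotes would pass to sqoop.
  val naiveArgs: Seq[String] = command.split("\\s+").toSeq

  // What a shell passes instead: the quoted text becomes one argument,
  // and the quote characters themselves are removed.
  val shellArgs: Seq[String] = Seq(
    "sqoop", "eval",
    "--connect", "jdbc:mysql://localhost:3306/retail_db",
    "--username", "root",
    "--password", "cloudera",
    "--query", "select * from categories")

  def main(args: Array[String]): Unit = {
    // Show only what follows --query in each case.
    println(naiveArgs.drop(9).mkString(" | ")) // 'select | * | from | categories'
    println(shellArgs.drop(9).mkString(" | ")) // select * from categories
  }
}
```

Under that assumption, --query receives only 'select, and the remaining pieces arrive as stray arguments, which matches the "Unrecognized argument" lines in the log.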
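The .! method (i.e. the Scala ProcessBuilder) launches processes directly, which means that the command elements are not passed to a shell for pre-processing. There are two ways to get around this problem: launch a shell as the process and hand it the whole command string, or do the pre-processing yourself by splitting the command into its individual arguments.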
Here's an example of the 2nd option.
import sys.process._

// Each element of the Seq reaches sqoop as exactly one argument.
Seq("sqoop"
,"eval"
,"--connect"
,"jdbc:mysql://localhost:3306/retail_db"
,"--username"
,"root"
,"--password"
,"cloudera"
,"--query"
,"select * from categories").!
As you can see, each argument is passed as its own element of the Seq, including the final query, which needs no extra quoting because no shell is involved.
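For comparison, the shell-based route could be sketched as follows. This is a minimal sketch that assumes a POSIX sh is available on the PATH. The names ShellRoute and viaShell are hypothetical helpers introduced here for illustration.

```scala
import scala.sys.process._

// Hypothetical helper object; assumes a POSIX `sh` is on the PATH.
object ShellRoute {
  // Hand the whole command line to `sh -c`, so the shell interprets the
  // quotes before the target program is started.
  def viaShell(commandLine: String): ProcessBuilder =
    Process(Seq("sh", "-c", commandLine))

  def main(args: Array[String]): Unit = {
    val commandLine =
      "sqoop eval --connect jdbc:mysql://localhost:3306/retail_db " +
      "--username root --password cloudera --query 'select * from categories'"

    // .! runs the process, streams its output to the console,
    // and returns the exit code.
    val exitCode = viaShell(commandLine).!
    println(s"sqoop exited with code $exitCode")
  }
}
```

The Seq form is usually the safer choice when any part of the command comes from user input, since nothing is re-interpreted by a shell. The shell form is convenient when you already have a working command line and want it to behave exactly as it does in a terminal.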