I have an Azure Synapse notebook (written in PySpark) that needs to create a table in a specific schema on our database. The Synapse service principal has read/write access on the DB and full CONTROL permission on the schema, so in principle it should be able to create the table. The code executing this is as follows:
# Retrieve connection details from the linked service using Token Library
server = "{sql_server_name}.database.windows.net"
port = 1433
database = "sqldb"
jdbcUrl = f"jdbc:sqlserver://{server}:{port};databaseName={database};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"
token = TokenLibrary.getConnectionString("LS_ASQL_DB")
conn_Prop = {
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "accessToken": token
}
table_name = "raw.floodrisk"  # Specify the schema and table name
selected_df.write.jdbc(url=jdbcUrl, table=table_name, mode="overwrite", properties=conn_Prop)
While running this, I get the following error: CREATE TABLE permission denied in database 'sqldb'.
It seems to me that the code is trying to execute the CREATE TABLE statement at the scope of the DB (for which it doesn't have permission) and only then route it to the schema (for which it does), instead of executing it at the scope of the schema (if that makes sense...). When I give Synapse db_owner access, the statement works, so the connection itself is fine.
I have tried to find out how to change the operating scope to the schema. I looked at possible properties in the conn_Prop object and at the arguments passed to df.write.jdbc(), and also tried to find a connection string that would scope the connection to just the schema, but neither approach turned up anything that worked. The goal is to avoid granting the service principal owner rights on the full database.
If you don't want to give the service principal full access to the database, then instead of adding the db_owner role, change the authorization of the required schema from its default owner to the Synapse service principal user. Use the command below for it:
ALTER AUTHORIZATION ON SCHEMA::<schemaName> TO <synapseSPNUser>;
After that, grant the required permissions using the commands below (note that CREATE TABLE is a database-level permission; creating tables in the schema additionally requires ALTER on the schema or, as above, ownership of it):
GRANT SELECT ON SCHEMA::<schemaName> TO <synapseSPNUser>;
GRANT CREATE TABLE, CREATE VIEW, CREATE FUNCTION, CREATE PROCEDURE TO <synapseSPNUser>;
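As a sanity check (a sketch; fn_my_permissions is a built-in SQL Server/Azure SQL function, and raw stands in for your schema name), you can confirm the effective permissions while connected as the service principal:

-- Effective schema-level permissions; after the ALTER AUTHORIZATION and
-- GRANT statements above you should see entries such as SELECT and ALTER.
SELECT permission_name
FROM fn_my_permissions('raw', 'SCHEMA');

-- Database-level permissions (CREATE TABLE is granted at database scope):
SELECT permission_name
FROM fn_my_permissions(NULL, 'DATABASE')
WHERE permission_name = 'CREATE TABLE';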
Then you will be able to create the table in the Azure SQL database using the code below:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, IntegerType

schema = StructType([
    StructField("ID", IntegerType(), True),
    StructField("Name", StringType(), True),
    StructField("RiskLevel", StringType(), True)
])

data = [
    (1, "Location A", "High"),
    (2, "Location B", "Medium"),
    (3, "Location C", "Low")
]

selected_df = spark.createDataFrame(data, schema)

server = "<serverName>.database.windows.net"
port = 1433
database = "<dbName>"
jdbcUrl = f"jdbc:sqlserver://{server}:{port};databaseName={database};encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30"
token = TokenLibrary.getConnectionString("AzureSqlDatabase1")
conn_Prop = {
    "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
    "accessToken": token
}
table_name = "raw.floodrisk"  # Specify the schema and table name
selected_df.write.jdbc(url=jdbcUrl, table=table_name, mode="overwrite", properties=conn_Prop)
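If granting CREATE TABLE at all is undesirable, another option is Spark's JDBC truncate option: when the target table already exists, mode="overwrite" combined with truncate=true truncates the table instead of dropping and re-creating it, so the writer only needs ALTER permission on the table. A minimal sketch (build_jdbc_url and overwrite_existing_table are hypothetical helper names; the truncate option itself is standard Spark JDBC):

```python
# Sketch of an alternative that avoids CREATE TABLE permission entirely,
# assuming the table raw.floodrisk already exists (created once by an admin)
# and that the DataFrame matches its column layout.

def build_jdbc_url(server: str, port: int, database: str) -> str:
    # Same Azure SQL JDBC URL shape as in the answer above.
    return (
        f"jdbc:sqlserver://{server}:{port};databaseName={database};"
        "encrypt=true;trustServerCertificate=false;"
        "hostNameInCertificate=*.database.windows.net;loginTimeout=30"
    )

def overwrite_existing_table(df, jdbc_url: str, token: str, table: str = "raw.floodrisk"):
    # With truncate=true, mode="overwrite" issues TRUNCATE TABLE against the
    # existing table instead of DROP + CREATE, so only ALTER permission on
    # the table is needed rather than CREATE TABLE on the database.
    (df.write
       .format("jdbc")
       .option("url", jdbc_url)
       .option("dbtable", table)
       .option("driver", "com.microsoft.sqlserver.jdbc.SQLServerDriver")
       .option("accessToken", token)
       .option("truncate", "true")
       .mode("overwrite")
       .save())

# Usage in the notebook (token from TokenLibrary as above):
# jdbc_url = build_jdbc_url("<serverName>.database.windows.net", 1433, "<dbName>")
# overwrite_existing_table(selected_df, jdbc_url, token)
```

Note that TRUNCATE resets the table rather than redefining it, so schema changes in the DataFrame will not propagate to the database table with this approach.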
Created table (screenshot omitted):
For more information you can refer to this.