amazon-web-services pyspark aws-glue amazon-athena aws-lake-formation

AWS Glue table missing - PySpark error Py4JJavaError (error while saving table)


I'm seeing unusual behavior with a particular Glue table (something I have never seen before). The table is created by a Spark job scheduled with Airflow.

Basically, the job ingests a single table from a data warehouse and writes it into a table in S3/Glue, overwriting the existing partition (save mode is overwrite). For some reason, the job failed today with the following exception:

py4j.protocol.Py4JJavaError: An error occurred while calling o108.saveAsTable.
java.lang.AssertionError: assertion failed: Expect the table customer_cdr has been dropped when the save mode is Overwrite
at scala.Predef$.assert(Predef.scala:170)
at org.apache.spark.sql.execution.command.CreateDataSourceTableAsSelectCommand.run(createDataSourceTables.scala:155)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:104)
at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:102)

At first, my colleagues and I thought it was just a Spark error on the EMR cluster and that resetting the cluster would solve it. But then we saw something even weirder.

The table had disappeared from the catalog after the incident (it was no longer visible in the Glue console or in Athena). But here is the catch: the table was still there, just hidden. We cannot find it with the search tool in the Glue UI, but we can open it in the console by putting the table name directly in the URL, we can query the data from Athena, and we can even retrieve the table from the CLI with the get-table command.
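For anyone who wants to reproduce the check described above, here is a minimal sketch using boto3. It compares a direct `get_table` lookup with the paginated `get_tables` listing for the same database. The function name `check_table_visibility` and the database name in the usage comment are placeholders of mine, not something from our job, and the result depends on the permissions of whichever role runs it.

```python
def check_table_visibility(glue_client, database, table):
    """Compare a direct GetTable lookup with the GetTables listing.

    glue_client is expected to behave like boto3.client("glue").
    """
    # Direct lookup by name (the equivalent of `aws glue get-table`).
    try:
        glue_client.get_table(DatabaseName=database, Name=table)
        direct_lookup = True
    except glue_client.exceptions.EntityNotFoundException:
        direct_lookup = False

    # Listing of the whole database, page by page.
    listed = False
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database):
        if any(t["Name"] == table for t in page.get("TableList", [])):
            listed = True
            break

    return {"direct_lookup": direct_lookup, "listed": listed}


# Usage (assumes AWS credentials are configured; "my_database" is a placeholder):
#   import boto3
#   glue = boto3.client("glue")
#   print(check_table_visibility(glue, "my_database", "customer_cdr"))
```

A result where `direct_lookup` is true but `listed` is false would match the "hidden" state we observed. Running it under different roles may show whether the difference is tied to permissions.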

We tried to delete the table (console or cli) but we faced the following issue:

An error occurred (EntityNotFoundException) when calling the DeleteTable operation: Table (v_ntfm_merchantlogstatus) not found

It is almost as if the table was removed from Lake Formation. Now, the question: has anyone faced an issue like this, and what was the debugging process? Thanks!
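One way to probe the Lake Formation hypothesis is to list the permissions granted on the table. The sketch below assumes a client that behaves like `boto3.client("lakeformation")` and a caller allowed to call `list_permissions`. The helper name `list_table_grants` and the database name in the usage comment are hypothetical.

```python
def list_table_grants(lf_client, database, table):
    """Return (principal, permissions) pairs granted on a table.

    lf_client is expected to behave like boto3.client("lakeformation").
    """
    grants = []
    kwargs = {
        "ResourceType": "TABLE",
        "Resource": {"Table": {"DatabaseName": database, "Name": table}},
    }
    while True:
        resp = lf_client.list_permissions(**kwargs)
        for entry in resp.get("PrincipalResourcePermissions", []):
            principal = entry["Principal"]["DataLakePrincipalIdentifier"]
            grants.append((principal, sorted(entry.get("Permissions", []))))
        # Keep following NextToken until the service stops returning one.
        token = resp.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token
    return grants


# Usage (assumes AWS credentials are configured; "my_database" is a placeholder):
#   import boto3
#   lf = boto3.client("lakeformation")
#   for principal, perms in list_table_grants(lf, "my_database", "customer_cdr"):
#       print(principal, perms)
```

If the roles used by the job or by the team do not appear in the output, that would be consistent with the table being hidden from them while still existing in the catalog.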


Solution

  • In case anyone else runs into a bizarre issue like this: AWS reported that there was unusual behavior in Glue which caused the table to "disappear" (in quotes because the table was still there; it was just not visible to our group of users with our roles, being visible only to an admin or root user of the account).

    So, what should be done in that scenario? In this particular case, it came down to contacting AWS to resolve the incident (the only possible route).

    cheers,