I am new to Scala and Spark. I am exploring the Amazon Deequ library for data profiling.
How do I get count of rows having a particular value while using ColumnProfilerRunner()?
The AnalysisRunner has an option of "compliance" I am looking for a similar option to filter rows that comply with the given column constraint.
I have multiple columns hence I want to check dynamically instead of using column names.
Appreciate any help.
Thanks
Deequ's column profiler computes a fixed set of statistics. If you want to compute custom statistics of your data, you should use the VerificationSuite. Checkout the examples on deequ's github page.