sql postgresql data-profiling

Generate PostgreSQL stats / data profiling


I would like to automate data profiling on PostgreSQL with a free tool: one that inspects data content through a column profile or the percentage distribution of values, such as max, min, and avg.


Solution

  • The pg_stats view (https://www.postgresql.org/docs/current/static/view-pg-stats.html) will give you an idea of the data distribution for each column. It is populated by ANALYZE, which autovacuum runs automatically based on your settings, or which you can run manually.

    You can also run queries like select max(c), min(c), avg(c) from tname to get the exact figures that are of interest to you.

    To do that I would recommend using psql: it is free and extremely handy for querying Postgres. You can also easily schedule psql -c "your select here" with cron to produce whatever report you need.

    You can save profiles and data either to files or to a database. psql can be used interactively or from scripts, and it works with local and remote databases. You can easily mix SQL with variables from bash or any other scripting language.

    You will find all of these features (and many more) in psql; see the psql documentation for details. You don't need to download it separately if you already have the Postgres client: it is part of the package.
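
As a rough illustration of the approach above, here is a minimal shell sketch that builds the two kinds of query mentioned (planner statistics from pg_stats, and exact aggregates) and leaves the actual psql call to you. The table tname and column c come from the answer; the function names stats_query and profile_query and the report path are hypothetical, and the identifiers are pasted into the SQL unquoted, so this assumes trusted input only.

```shell
#!/bin/sh
# Sketch only: stats_query and profile_query are made-up helper names.
# Identifiers are interpolated as-is, so pass trusted table/column names only.

# Print a query that reads the collected statistics for one table from pg_stats.
# The numbers are estimates gathered by ANALYZE, not exact values.
stats_query() {
  printf "SELECT attname, null_frac, n_distinct, most_common_vals, most_common_freqs FROM pg_stats WHERE tablename = '%s';\n" "$1"
}

# Print a query that computes exact aggregates for one numeric column.
# $1 = table name, $2 = column name.
profile_query() {
  printf 'SELECT min(%s), max(%s), avg(%s), count(*) FROM %s;\n' "$2" "$2" "$2" "$1"
}

# Usage (needs a reachable database; connection settings are taken from
# the usual PG* environment variables or psql options):
#   psql -X -A -c "$(stats_query tname)"
#   psql -X -A -c "$(profile_query tname c)" >> "$HOME/profile_report.txt"
```

Here -X skips your psqlrc and -A switches to unaligned output, which tends to be easier to post-process. Saved as a script, the same psql call can be scheduled from cron to append a report at regular intervals.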