database-designpowerbidaxreverse-engineeringdatabase-diagram

Power BI Chord visual: link tables with same column name


Someone migrated a DB2 database to SQL Server and during the migration all PK and FK were lost. there is no way to have them back.

But thanks to this query I'm now able to Reverse Engineer the database diagram based on the datatype and column name. (I know, is a Reverse Engineer based on assumptions).

I now would like to create a Power BI dashboard with the Chord visual and create a link between tables that have the same column name. (the image is just an example to give you an idea)

enter image description here

But I cannot use From and To because I'm not tracking a change.

Maybe I'm using the wrong visual?

How to track things that are in common?

EDIT FOR BOUNTY:

If I run the query against Adventure Works:

WITH ColumnCount AS(
    SELECT s.name AS SchemaName,
           t.name AS TableName,
           c.name AS ColumnName,
           ct.[name] AS DataType,
           c.max_length,
           c.precision,
           c.scale,
           COUNT(c.column_id) OVER (PARTITION BY c.[name], ct.[name], c.max_length, c.precision, c.scale) AS Duplicates
    FROM sys.schemas s
         JOIN sys.tables t ON s.schema_id = t.schema_id
         JOIN sys.columns c ON t.object_id = c.object_id
         JOIN sys.types ct ON c.user_type_id = ct.user_type_id)
SELECT *
FROM ColumnCount CC
WHERE CC.Duplicates > 1
ORDER BY CC.ColumnName,
         CC.SchemaName,
         CC.TableName;

The query can group columns that have the same: ColumnName, DataType, max_length, precision, scale.

enter image description here

But how to render this in a Power BI Chord visual?

The goal is ti find link between tables.

Chord seems the best visual to archive it but if you have better visual to suggest I'm open to your tips.


Solution

  • To relate the Power BI Chord visual, you need a minimum of two data points and a measure;

    Depending on the number of tables, you may find the Chord chart a little overwhelmed by data, however if you bring those three values through form Adventure Works, you can produce

    Power BI Chord Chart

    The dataset you are starting with has a lot more info, and rows than you need, but doesn't naturally contain the relationship. By simplifying the dataset with the following, you can create your Chord data point

    WITH ColumnCount AS(
        SELECT  t.name AS TableName,
                c.name AS ColumnName,
                COUNT(c.column_id) OVER (PARTITION BY c.[name],  c.max_length, c.precision, c.scale) AS Duplicates
        FROM  sys.tables t 
             JOIN  sys.columns c ON t.object_id = c.object_id
        WHERE   c.name  like '%Id'
        AND     c.name != 'rowguid'
        
    )
    
    SELECT      TableName ForeignTableName,
                LEFT(ColumnName, LEN(ColumnName)-2) PrimaryTableName,
                1 Relationship
    FROM        ColumnCount CC
    WHERE       CC.Duplicates > 1
    AND         LEFT(ColumnName, LEN(ColumnName)-2) != TableName 
    ORDER BY    PrimaryTableName, 
                CC.ColumnName,
                CC.TableName
    

    I have made a few assumptions in the above for simplicity.

    The query will give you a result set like follows; Adventure Works table relationships

    Then to create the chart, simply add the From, To and Values count with your PrimaryTableName, ForeignTableName and Relationship (count) values as follows

    Chord chart configuration

    Giving you the required Power BI Chord relationship chart

    Power BI Chord by Field Id

    As per your comments, if you want to do the same, but use the field names to match, (not assuming name[id]) to denote your fk / pk, then the following would work. I have included a section to force the primary table, based on the Primary Key Check Constraint, that you can comment out if you want to show all relationships, but be warned you will have thousands and they will be bi directional if you don't have a way to identify a primary key/table!

        WITH ColumnCount AS(
        SELECT s.name AS SchemaName,
               t.name AS TableName,
               c.name AS ColumnName,
               ct.[name] AS DataType,
               c.max_length,
               c.precision,
               c.scale,
               COUNT(c.column_id) OVER (PARTITION BY c.[name], ct.[name], c.max_length, c.precision, c.scale) AS Duplicates
        FROM sys.schemas s
             JOIN sys.tables t ON s.schema_id = t.schema_id
             JOIN sys.columns c ON t.object_id = c.object_id
             JOIN sys.types ct ON c.user_type_id = ct.user_type_id
             
             )
    
    SELECT cc.TableName + '.' + cc.ColumnName ForeignTableName, cd.TableName + '.' + cd.ColumnName PrimaryTableName, 1 Relationship --,cc.ColumnName, cc.DataType,  cc.Duplicates
    FROM ColumnCount CC 
        /*pk only joins - take ths out if you want all joins between fields, not just pk to fk */
        INNER JOIN  INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE Col on col.COLUMN_NAME = cc.ColumnName
        AND         col.TABLE_NAME = cc.TableName
        INNER JOIN  INFORMATION_SCHEMA.TABLE_CONSTRAINTS Tab on     Col.Constraint_Name = Tab.Constraint_Name
        AND Col.Table_Name = Tab.Table_Name
        AND Constraint_Type = 'PRIMARY KEY'
        /*end of pk only joins */
    
    LEFT JOIN ColumnCount Cd on cc.ColumnName = cd.ColumnName
    and cc.DataType= cd.DataType
    and cc.TableName != cd.TableName
    WHERE CC.Duplicates > 1
    ORDER BY CC.ColumnName,
             CC.SchemaName,
             CC.TableName;
    

    Plugged into the same structure, this provides a Chord diagram of Chord relationship from Primary Key Id