database-design powerbi dax reverse-engineering database-diagram

Power BI Chord visual: link tables with same column name

Someone migrated a DB2 database to SQL Server and during the migration all PK and FK were lost. there is no way to have them back.

But thanks to this query I'm now able to Reverse Engineer the database diagram based on the datatype and column name. (I know, is a Reverse Engineer based on assumptions).

I now would like to create a Power BI dashboard with the Chord visual and create a link between tables that have the same column name. (the image is just an example to give you an idea)

But I cannot use From and To because I'm not tracking a change.

Maybe I'm using the wrong visual?

How to track things that are in common?

EDIT FOR BOUNTY:

If I run the query against Adventure Works:

WITH ColumnCount AS(
    SELECT s.name AS SchemaName,
           t.name AS TableName,
           c.name AS ColumnName,
           ct.[name] AS DataType,
           c.max_length,
           c.precision,
           c.scale,
           COUNT(c.column_id) OVER (PARTITION BY c.[name], ct.[name], c.max_length, c.precision, c.scale) AS Duplicates
    FROM sys.schemas s
         JOIN sys.tables t ON s.schema_id = t.schema_id
         JOIN sys.columns c ON t.object_id = c.object_id
         JOIN sys.types ct ON c.user_type_id = ct.user_type_id)
SELECT *
FROM ColumnCount CC
WHERE CC.Duplicates > 1
ORDER BY CC.ColumnName,
         CC.SchemaName,
         CC.TableName;

The query can group columns that have the same: ColumnName, DataType, max_length, precision, scale.

But how to render this in a Power BI Chord visual?

The goal is ti find link between tables.

Chord seems the best visual to archive it but if you have better visual to suggest I'm open to your tips.

Solution

To relate the Power BI Chord visual, you need a minimum of two data points and a measure;

From (foreign key table)
To (primary key table)
Relationship Count

Depending on the number of tables, you may find the Chord chart a little overwhelmed by data, however if you bring those three values through form Adventure Works, you can produce

The dataset you are starting with has a lot more info, and rows than you need, but doesn't naturally contain the relationship. By simplifying the dataset with the following, you can create your Chord data point

WITH ColumnCount AS(
    SELECT  t.name AS TableName,
            c.name AS ColumnName,
            COUNT(c.column_id) OVER (PARTITION BY c.[name],  c.max_length, c.precision, c.scale) AS Duplicates
    FROM  sys.tables t 
         JOIN  sys.columns c ON t.object_id = c.object_id
    WHERE   c.name  like '%Id'
    AND     c.name != 'rowguid'
    
)

SELECT      TableName ForeignTableName,
            LEFT(ColumnName, LEN(ColumnName)-2) PrimaryTableName,
            1 Relationship
FROM        ColumnCount CC
WHERE       CC.Duplicates > 1
AND         LEFT(ColumnName, LEN(ColumnName)-2) != TableName 
ORDER BY    PrimaryTableName, 
            CC.ColumnName,
            CC.TableName

I have made a few assumptions in the above for simplicity.

All foreign keys end in ID (to stop incorrect key matches like ActualCost) with

WHERE c.name like '%Id'
Rowguid is not a relationship column and is therefore excluded

AND c.name != 'rowguid'
We don't want a primary key table relationship to itself

LEFT(ColumnName, LEN(ColumnName)-2) != TableName
Scheme is not important (and therefore removed)

The query will give you a result set like follows;

Then to create the chart, simply add the From, To and Values count with your PrimaryTableName, ForeignTableName and Relationship (count) values as follows

Giving you the required Power BI Chord relationship chart

As per your comments, if you want to do the same, but use the field names to match, (not assuming name[id]) to denote your fk / pk, then the following would work. I have included a section to force the primary table, based on the Primary Key Check Constraint, that you can comment out if you want to show all relationships, but be warned you will have thousands and they will be bi directional if you don't have a way to identify a primary key/table!

    WITH ColumnCount AS(
    SELECT s.name AS SchemaName,
           t.name AS TableName,
           c.name AS ColumnName,
           ct.[name] AS DataType,
           c.max_length,
           c.precision,
           c.scale,
           COUNT(c.column_id) OVER (PARTITION BY c.[name], ct.[name], c.max_length, c.precision, c.scale) AS Duplicates
    FROM sys.schemas s
         JOIN sys.tables t ON s.schema_id = t.schema_id
         JOIN sys.columns c ON t.object_id = c.object_id
         JOIN sys.types ct ON c.user_type_id = ct.user_type_id
         
         )

SELECT cc.TableName + '.' + cc.ColumnName ForeignTableName, cd.TableName + '.' + cd.ColumnName PrimaryTableName, 1 Relationship --,cc.ColumnName, cc.DataType,  cc.Duplicates
FROM ColumnCount CC 
    /*pk only joins - take ths out if you want all joins between fields, not just pk to fk */
    INNER JOIN  INFORMATION_SCHEMA.CONSTRAINT_COLUMN_USAGE Col on col.COLUMN_NAME = cc.ColumnName
    AND         col.TABLE_NAME = cc.TableName
    INNER JOIN  INFORMATION_SCHEMA.TABLE_CONSTRAINTS Tab on     Col.Constraint_Name = Tab.Constraint_Name
    AND Col.Table_Name = Tab.Table_Name
    AND Constraint_Type = 'PRIMARY KEY'
    /*end of pk only joins */

LEFT JOIN ColumnCount Cd on cc.ColumnName = cd.ColumnName
and cc.DataType= cd.DataType
and cc.TableName != cd.TableName
WHERE CC.Duplicates > 1
ORDER BY CC.ColumnName,
         CC.SchemaName,
         CC.TableName;

Plugged into the same structure, this provides a Chord diagram of