sqlsql-serverpivotalways-encrypted

Query pivot with encryption


I need to make a query with a large dataset. The data is spread over multiple rows in a table, the goal is to have them in a single column.

Some columns are encrypted (Always Encrypted, in this example the 'Topic' and 'Rating' column).

On some other examples I tried (f.e. MAX() Function) I got the error message:

Encryption scheme mismatch for columns/variables

Example data - table #1 (Person):

ID Person
1 Testi McTest
2 Max Mustermann

Table #2 (Data):

ID PersonID Index Topic (Encrypted) Comment (Encrypted) Rating (Encrypted)
1 1 1 Text 1 Testvalue 1 1
2 1 2 Text 2 Testvalue 2 1
3 2 1 Text 3 Testvalue 3 2
4 2 2 Text 4 Testvalue 4 2

I need to query and get the results in a single row:

Person Index 1 Topic Index 1 Rating Index 2 Topic Index 2 Rating
Testi McTest Text 1 1 Text 2 1
Max Mustermann Text 3 2 Text 4 2

I tried it with joining the rows, but its way too slow.. This is the only solution I got that worked:

WITH tbl AS
(
    SELECT  
        PersonID, Index, Topic, Rating
    FROM            
        Data
)
SELECT
    Person
    i1.Topic,
    i1.Rating,
    i2.Topic,
    i2.Rating
FROM 
    Data
LEFT JOIN 
    Person ON Data.PersonId = Person.Id
INNER JOIN 
    tbl AS i1 on Data.PersonId = i1.Id AND i1.Index = 1
INNER JOIN 
    tbl AS i2 on Data.PersonId = i2.Id AND i2.Index = 2

Other example with encyption error:

SELECT
    p.Person,
    MAX(CASE WHEN d.Index = 1 THEN d.Topic END) AS "Index 1 Topic",
    MAX(CASE WHEN d.Index = 1 THEN d.Rating END) AS "Index 1 Rating",
    MAX(CASE WHEN d.Index = 2 THEN d.Topic END) AS "Index 2 Topic",
    MAX(CASE WHEN d.Index = 2 THEN d.Rating END) AS "Index 2 Rating"
FROM
    Person p
LEFT JOIN
    Data d ON p.ID = d.PersonID
GROUP BY
    p.ID, p.Person

Solution

  • Always Encrypted means you can't use any function that orders the data, such as MAX. You are using Deterministic Encryption, so you can still do point lookups using = in a join.

    I tried it with joining the rows, but its way too slow.

    Almost certainly that's an issue with indexing. For the given query, you can use the following index.

    Data (Index, PersonId) INCLUDE (Topic, Comment, Rating)
    

    I'm assuming Person has a primary key on Id.

    And you don't need to query Data three times, only twice.

    SELECT
        p.Person,
        i1.Topic AS Topic1,
        i1.Rating AS Rating1,
        i2.Topic AS Topic2,
        i2.Rating AS Rating2
    FROM 
        Person AS p
    INNER JOIN 
        Data AS i1 on p.Id = i1.PersonID AND i1.Index = 1
    INNER JOIN 
        Data AS i2 on p.Id = i2.PersonID AND i2.Index = 2;
    

    If it's still too slow then your other option is to just pivot it on the client side, which is the only place you can actually read the data.