I have to flatten out this XML source into one SQL server table. There are multiple ResponseID per source file and multiple TextAnalyticsItem per ResponseID. I want to limit the records so that I only get the TextAnalyticsItems related to the ResponseID. However, the cross apply method gives me all ResponseId with all TextAnalyticsItem. How do I prevent the additional records?
DECLARE @XMLToParse XML;
SET @XMLToParse = '
<Responses>
<Response>
<ResponseId>7662934</ResponseId>
<SurveyId>123</SurveyId>
<RespondentId>234909</RespondentId>
<QuestionId>141757</QuestionId>
<ScaleId>3401</ScaleId>
<AnswerId>17130</AnswerId>
<ResponseMemo>Useful</ResponseMemo>
<ResponseRank>0</ResponseRank>
<ResponseState>0</ResponseState>
<CompletedDate>2020-07-06T09:07:40</CompletedDate>
<ModifiedDate>2020-07-06T09:07:41</ModifiedDate>
<LanguageId>220</LanguageId>
<ResponseNum>0</ResponseNum>
<ResponseDate />
<TextAnalyticsData>
<TextAnalyticsItem>
<Level1>Values Standards</Level1>
<Level2>Better/Best/Brilliant</Level2>
<Level3>Positive</Level3>
<Sentiment>1</Sentiment>
</TextAnalyticsItem>
<TextAnalyticsItem>
<Level1>All-Behaviors</Level1>
<Level2>Positive_</Level2>
<Sentiment>1</Sentiment>
</TextAnalyticsItem>
</TextAnalyticsData>
</Response>
<Response>
<ResponseId>7662078</ResponseId>
<SurveyId>123</SurveyId>
<RespondentId>234826</RespondentId>
<QuestionId>141756</QuestionId>
<ScaleId>3400</ScaleId>
<AnswerId>17129</AnswerId>
<ResponseMemo>Ghjlkk</ResponseMemo>
<ResponseRank>0</ResponseRank>
<ResponseState>0</ResponseState>
<CompletedDate>2020-07-03T07:17:31</CompletedDate>
<ModifiedDate>2020-07-03T07:17:31</ModifiedDate>
<LanguageId>220</LanguageId>
<ResponseNum>0</ResponseNum>
<ResponseDate />
<TextAnalyticsData>
<TextAnalyticsItem>
<Level1>Nonactionable</Level1>
<Sentiment>0</Sentiment>
</TextAnalyticsItem>
</TextAnalyticsData>
</Response>
</Responses>'
SELECT xmlData.A.value('ResponseId[1]', 'VARCHAR(100)') AS ResponseId,
xmlData.A.value('SurveyId[1]', 'VARCHAR(100)') AS SurveyId,
xmlData.A.value('RespondentId[1]', 'VARCHAR(100)') AS RespondentId,
xmlData.A.value('TextAnalyticsData[1]', 'VARCHAR(100)') AS TextAnalyticsData,
tbl1.TxtItems.value('Level1[1]', 'VARCHAR(100)') AS Level1,
tbl1.TxtItems.value('Level2[1]', 'VARCHAR(100)') AS Level2,
tbl1.TxtItems.value('Level3[1]', 'VARCHAR(100)') AS Level3,
tbl1.TxtItems.value('Sentiment[1]', 'VARCHAR(100)') AS Sentiment
FROM @XMLToParse.nodes('Responses/Response/TextAnalyticsData/TextAnalyticsItem') tbl1(TxtItems)
cross apply @XMLToParse.nodes('Responses/Response') xmlData(A)
ORDER BY ResponseId,
RespondentId;
So instead of
ResponseId SurveyId RespondentId TextAnalyticsData Level1 Level2 Level3 Sentiment
---------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------
7662078 123 234826 Nonactionable0 Values Standards Better/Best/BrilliPositive 1
7662078 123 234826 Nonactionable0 All-Behaviors Positive_ NULL 1
7662078 123 234826 Nonactionable0 Nonactionable NULL NULL 0
7662934 123 234909 Values StandardsBetter/Best/BrilliantPositive1All-BehaviorsPositive_1 Values Standards Better/Best/BrilliPositive 1
7662934 123 234909 Values StandardsBetter/Best/BrilliantPositive1All-BehaviorsPositive_1 Nonactionable NULL NULL 0
7662934 123 234909 Values StandardsBetter/Best/BrilliantPositive1All-BehaviorsPositive_1 All-Behaviors Positive_ NULL 1
I want to get
ResponseId SurveyId RespondentId TextAnalyticsData Level1 Level2 Level3 Sentiment
---------------------------------------------------------------------------------------------------- -------------------------------------------------------------------------------
7662078 123 234826 Nonactionable0 Nonactionable NULL NULL 0
7662934 123 234909 Values StandardsBetter/Best/BrilliantPositive1All-BehaviorsPositive_1 Values Standards Better/Best/BrilliPositive 1
7662934 123 234909 Values StandardsBetter/Best/BrilliantPositive1All-BehaviorsPositive_1 All-Behaviors Positive_ NULL 1
Any suggestions? Thanks
Check it out how to achieve what you need. It simulates one-to-many relationship.
I am not sure what was the intent for the TextAnalyticsData column. That's why I commented it out.
SQL
DECLARE @XMLToParse XML =
N'<Responses>
<Response>
<ResponseId>7662934</ResponseId>
<SurveyId>123</SurveyId>
<RespondentId>234909</RespondentId>
<QuestionId>141757</QuestionId>
<ScaleId>3401</ScaleId>
<AnswerId>17130</AnswerId>
<ResponseMemo>Useful</ResponseMemo>
<ResponseRank>0</ResponseRank>
<ResponseState>0</ResponseState>
<CompletedDate>2020-07-06T09:07:40</CompletedDate>
<ModifiedDate>2020-07-06T09:07:41</ModifiedDate>
<LanguageId>220</LanguageId>
<ResponseNum>0</ResponseNum>
<ResponseDate />
<TextAnalyticsData>
<TextAnalyticsItem>
<Level1>Values Standards</Level1>
<Level2>Better/Best/Brilliant</Level2>
<Level3>Positive</Level3>
<Sentiment>1</Sentiment>
</TextAnalyticsItem>
<TextAnalyticsItem>
<Level1>All-Behaviors</Level1>
<Level2>Positive_</Level2>
<Sentiment>1</Sentiment>
</TextAnalyticsItem>
</TextAnalyticsData>
</Response>
<Response>
<ResponseId>7662078</ResponseId>
<SurveyId>123</SurveyId>
<RespondentId>234826</RespondentId>
<QuestionId>141756</QuestionId>
<ScaleId>3400</ScaleId>
<AnswerId>17129</AnswerId>
<ResponseMemo>Ghjlkk</ResponseMemo>
<ResponseRank>0</ResponseRank>
<ResponseState>0</ResponseState>
<CompletedDate>2020-07-03T07:17:31</CompletedDate>
<ModifiedDate>2020-07-03T07:17:31</ModifiedDate>
<LanguageId>220</LanguageId>
<ResponseNum>0</ResponseNum>
<ResponseDate />
<TextAnalyticsData>
<TextAnalyticsItem>
<Level1>Nonactionable</Level1>
<Sentiment>0</Sentiment>
</TextAnalyticsItem>
</TextAnalyticsData>
</Response>
</Responses>';
SELECT r.value('(ResponseId/text())[1]', 'VARCHAR(100)') AS ResponseId
, r.value('(SurveyId/text())[1]', 'VARCHAR(100)') AS SurveyId
, r.value('(RespondentId/text())[1]', 'VARCHAR(100)') AS RespondentId
--xmlData.A.value('TextAnalyticsData[1]', 'VARCHAR(100)') AS TextAnalyticsData,
, a.value('(Level1/text())[1]', 'VARCHAR(100)') AS Level1
, a.value('(Level2/text())[1]', 'VARCHAR(100)') AS Level2
, a.value('(Level3/text())[1]', 'VARCHAR(100)') AS Level3
, a.value('(Sentiment/text())[1]', 'VARCHAR(100)') AS Sentiment
FROM @XMLToParse.nodes('/Responses/Response') t1(r)
CROSS APPLY t1.r.nodes('TextAnalyticsData/TextAnalyticsItem[Level1/text()]') t2(a)
--ORDER BY ResponseId,
-- RespondentId;
Output
+------------+----------+--------------+------------------+-----------------------+----------+-----------+
| ResponseId | SurveyId | RespondentId | Level1 | Level2 | Level3 | Sentiment |
+------------+----------+--------------+------------------+-----------------------+----------+-----------+
| 7662934 | 123 | 234909 | Values Standards | Better/Best/Brilliant | Positive | 1 |
| 7662934 | 123 | 234909 | All-Behaviors | Positive_ | NULL | 1 |
| 7662078 | 123 | 234826 | Nonactionable | NULL | NULL | 0 |
+------------+----------+--------------+------------------+-----------------------+----------+-----------+