I'm trying to get the element with the newest date from various JSON arrays. But I can't get it to work. It returns all the elements from the arrays. I have tried several things with MAX, TOP(1), ORDER BY but with no luck. Can someone point me in the right direction.
This is what the query returns
And this is what I would like - the element where 'gyldigTil' is the newest date
This is the query I'm using
DECLARE @json NVARCHAR(MAX) =
N'{
"_index": "cvr-v-20220630",
"_type": "_doc",
"_id": "4006567262",
"_score": 1.0,
"_source": {
"Vrvirksomhed": {
"virksomhedsform": [
{
"virksomhedsformkode": 80,
"langBeskrivelse": "Anpartsselskab",
"kortBeskrivelse": "APS",
"sidstOpdateret": "2020-02-26T12:02:30+01:00",
"ansvarligDataleverandoer": "E&S",
"periode": {
"gyldigFra": "2016-06-10",
"gyldigTil": "2020-01-22"
}
},
{
"virksomhedsformkode": 10,
"langBeskrivelse": "Enkeltmandsvirksomhed",
"kortBeskrivelse": "ENK",
"sidstOpdateret": "2020-03-11T12:08:14+01:00",
"ansvarligDataleverandoer": "T&S",
"periode": {
"gyldigFra": "2020-01-23",
"gyldigTil": "2022-12-31"
}
}
],
"virksomhedsstatus": [
{
"sidstOpdateret": "2018-10-12T23:46:13+02:00",
"status": "NORMAL",
"periode": {
"gyldigFra": "2016-06-10",
"gyldigTil": "2018-10-08"
}
},
{
"sidstOpdateret": "2020-02-26T12:02:30+01:00",
"status": "UNDER KONKURS",
"periode": {
"gyldigFra": "2018-10-09",
"gyldigTil": "2020-01-21"
}
},
{
"sidstOpdateret": "2020-02-26T12:02:30+01:00",
"status": "OPLØST EFTER KONKURS",
"periode": {
"gyldigFra": "2020-01-22",
"gyldigTil": "2020-01-22"
}
}
],
"cvrNummer": 37803472,
"virksomhedMetadata": {
"nyesteHovedbranche": {
"sidstOpdateret": "2020-02-26T12:02:30+01:00",
"branchetekst": "Vejgodstransport",
"branchekode": "494100",
"periode": {
"gyldigFra": "2016-06-10",
"gyldigTil": "2020-01-22"
}
}
}
}
}
}';
SELECT cvrNummer,
branchetekst,
branchekode,
[Status],
VF.kortBeskrivelse,
VF.gyldigTil AS VF_gyldigTil,
VS.gyldigTil AS VS_gyldigTil
FROM OPENJSON(@JSON)
WITH (
cvrNummer NVARCHAR(50) '$."_source"."Vrvirksomhed"."cvrNummer"',
branchetekst NVARCHAR(255) '$."_source"."Vrvirksomhed"."virksomhedMetadata"."nyesteHovedbranche"."branchetekst"',
branchekode NVARCHAR(255) '$."_source"."Vrvirksomhed"."virksomhedMetadata"."nyesteHovedbranche"."branchekode"',
virksomhedsform NVARCHAR(MAX) '$."_source"."Vrvirksomhed"."virksomhedsform"' AS JSON,
virksomhedsstatus NVARCHAR(MAX) '$."_source"."Vrvirksomhed"."virksomhedsstatus"' AS JSON
)
CROSS APPLY OPENJSON(virksomhedsform)
WITH (
kortBeskrivelse NVARCHAR(50) '$.kortBeskrivelse',
gyldigTil date '$.periode.gyldigTil'
) VF
CROSS APPLY OPENJSON(virksomhedsstatus)
WITH (
[Status] NVARCHAR(50) '$.status',
gyldigTil date '$.periode.gyldigTil'
) VS
One approach is to wrap the query in a CTE with ROW_NUMBER() OVER(ORDER BY gyldigTil DESC)
and filter the result with number 1.
WITH cte_all AS (
SELECT cvrNummer,
branchetekst,
branchekode,
Status,
gyldigTil,
ROW_NUMBER() OVER(ORDER BY gyldigTil DESC) AS row_num
FROM OPENJSON(@JSON)
WITH (
cvrNummer NVARCHAR(50) '$."_source"."Vrvirksomhed"."cvrNummer"',
branchetekst NVARCHAR(255) '$."_source"."Vrvirksomhed"."virksomhedMetadata"."nyesteHovedbranche"."branchetekst"',
branchekode NVARCHAR(255) '$."_source"."Vrvirksomhed"."virksomhedMetadata"."nyesteHovedbranche"."branchekode"',
virksomhedsstatus NVARCHAR(MAX) '$."_source"."Vrvirksomhed"."virksomhedsstatus"' AS JSON
)
CROSS APPLY OPENJSON(virksomhedsstatus)
WITH (
[Status] NVARCHAR(50) '$.status',
gyldigTil date '$.periode.gyldigTil'
)
)
SELECT cvrNummer,
branchetekst,
branchekode,
Status,
gyldigTil
FROM cte_all
WHERE row_num = 1;
EDIT:
The query above handles the single array use case by returning the singleton attributes plus the virksomhedsstatus attributes from the latest gyldigTil array entry.
A similar approach may be used for additional arrays. However, since the multiple CROSS APPLY
clauses will return a cartesian product of entries from both arrays (6 rows) and the desired entry from each array may be from different rows, it is necessary to select attributes from each array independently for a single-row from each array and singe-row result. One method to accomplish this is a CTE for each array, each returning the latest row:
WITH cte_all AS (
SELECT Vrvirksomhed.cvrNummer,
Vrvirksomhed.branchetekst,
Vrvirksomhed.branchekode,
VS.Status,
VF.kortBeskrivelse,
VF.gyldigTil AS VF_gyldigTil,
VS.gyldigTil AS VS_gyldigTil,
ROW_NUMBER() OVER(ORDER BY VF.gyldigTil DESC) AS VF_row_num,
ROW_NUMBER() OVER(ORDER BY VS.gyldigTil DESC) AS VS_row_num
FROM OPENJSON(@JSON)
WITH (
cvrNummer NVARCHAR(50) '$."_source"."Vrvirksomhed"."cvrNummer"',
branchetekst NVARCHAR(255) '$."_source"."Vrvirksomhed"."virksomhedMetadata"."nyesteHovedbranche"."branchetekst"',
branchekode NVARCHAR(255) '$."_source"."Vrvirksomhed"."virksomhedMetadata"."nyesteHovedbranche"."branchekode"',
virksomhedsform NVARCHAR(MAX) '$."_source"."Vrvirksomhed"."virksomhedsform"' AS JSON,
virksomhedsstatus NVARCHAR(MAX) '$."_source"."Vrvirksomhed"."virksomhedsstatus"' AS JSON
) AS Vrvirksomhed
CROSS APPLY OPENJSON(virksomhedsform)
WITH (
kortBeskrivelse NVARCHAR(50) '$.kortBeskrivelse',
gyldigTil date '$.periode.gyldigTil'
) AS VF
CROSS APPLY OPENJSON(virksomhedsstatus)
WITH (
Status NVARCHAR(50) '$.status',
gyldigTil date '$.periode.gyldigTil'
) AS VS
)
,cte_VF AS (
SELECT cvrNummer,
branchetekst,
branchekode,
kortBeskrivelse,
VF_gyldigTil
FROM cte_all
WHERE VF_row_num = 1
)
,cte_VS AS (
SELECT VS_gyldigTil,
Status
FROM cte_all
WHERE VS_row_num = 1
)
SELECT cte_VF.cvrNummer,
cte_VF.branchetekst,
cte_VF.branchekode,
cte_VS.Status,
cte_VF.kortBeskrivelse,
cte_VF.VF_gyldigTil,
cte_VS.VS_gyldigTil
FROM cte_VS
CROSS JOIN cte_VF;
Be aware the results are not deterministic (arbitrary) if multiple entries exists with the same latest gyldigTil value.