Test data:
create table lines (id number(38,0),
details1 varchar2(10),
details2 varchar2(10),
details3 varchar2(10),
shape sdo_geometry);
begin
insert into lines (id, details1, details2, details3, shape) values (1, 'a', 'b', 'c', sdo_geometry(2002, 26917, null, sdo_elem_info_array(1, 2, 1), sdo_ordinate_array(574360, 4767080, 574200, 4766980)));
insert into lines (id, details1, details2, details3, shape) values (2, 'a', 'b', 'c', sdo_geometry(2002, 26917, null, sdo_elem_info_array(1, 2, 1), sdo_ordinate_array(573650, 4769050, 573580, 4768870)));
insert into lines (id, details1, details2, details3, shape) values (3, 'a', 'b', 'c', sdo_geometry(2002, 26917, null, sdo_elem_info_array(1, 2, 1), sdo_ordinate_array(574290, 4767090, 574200, 4767070)));
insert into lines (id, details1, details2, details3, shape) values (4, 'a', 'b', 'c', sdo_geometry(2002, 26917, null, sdo_elem_info_array(1, 2, 1), sdo_ordinate_array(571430, 4768160, 571260, 4768040)));
...
end;
/
Full test data here: db<>fiddle
I want to pre-compute calculated columns via a function-based index.
Steps:
(1) Create functions to get the startpoint X & startpoint Y coordinates (numbers) from the SDO_GEOMETRY column :
create function startpoint_x(shape in sdo_geometry) return number
deterministic is
begin
return shape.sdo_ordinates(1);
end;
create function startpoint_y(shape in sdo_geometry) return number
deterministic is
begin
return shape.sdo_ordinates(2);
end;
select
id,
details1,
details2,
details3,
startpoint_x(shape) as startpoint_x,
startpoint_y(shape) as startpoint_y
from
lines
ID DETAILS1 DETAILS2 DETAILS3 STARTPOINT_X STARTPOINT_Y
---------- ---------- ---------- ---------- ------------ ------------
177 a b c 574660 4766400
178 a b c 574840 4765370
179 a b c 573410 4768570
180 a b c 573000 4767330
...
[full table scan]
(2) Create a FBI that stores the ID, startpoint X, and startpoint Y in a composite index:
create index lines_fbi_idx on lines (id, startpoint_x(shape), startpoint_y(shape))
(3) When I only select the indexed columns, the FBI gets invoked (no full table scan), which is good:
select
id,
startpoint_x(shape) as startpoint_x,
startpoint_y(shape) as startpoint_y
from
lines
where --https://stackoverflow.com/a/59581129/5576771
id is not null
and startpoint_x(shape) is not null
and startpoint_y(shape) is not null
--------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 117 | 4 (0)| 00:00:01 |
|* 1 | INDEX FAST FULL SCAN| LINES_FBI_IDX | 3 | 117 | 4 (0)| 00:00:01 |
--------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
PLAN_TABLE_OUTPUT --------------
1 - filter("ID" IS NOT NULL AND "INFRASTR"."STARTPOINT_X"("SHAPE") IS NOT
NULL AND "INFRASTR"."STARTPOINT_Y"("SHAPE") IS NOT NULL)
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
That works as expected.
Note: That's just a bare-bones example for the purposes of this post. In reality, the custom functions would have more complicated logic and be a lot slower — hence the need for precomputing in an index.
Question:
In addition to selecting the indexed columns (ID
, startpoint_x
, and startpoint_y
), I also want to select the non-indexed columns: details1
,details2
, and details3
.
How can I utilize the function-based index for the precomputed columns, while also selecting the non-indexed columns?
I think I can do it by simply:
select
a.id,
a.startpoint_x,
a.startpoint_y,
b.details1,
b.details2,
b.details3
from
(
select
id,
startpoint_x(shape) as startpoint_x,
startpoint_y(shape) as startpoint_y
from
lines
where --https://stackoverflow.com/a/59581129/5576771
id is not null
and startpoint_x(shape) is not null
and startpoint_y(shape) is not null
) a
left join
lines b
on a.id = b.id
------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 219 | 10 (0)| 00:00:01 |
| 1 | NESTED LOOPS OUTER | | 3 | 219 | 10 (0)| 00:00:01 |
|* 2 | INDEX FAST FULL SCAN | LINES_FBI_IDX | 3 | 117 | 4 (0)| 00:00:01 |
| 3 | TABLE ACCESS BY INDEX ROWID BATCHED| LINES | 1 | 34 | 2 (0)| 00:00:01 |
|* 4 | INDEX RANGE SCAN | LINES_FBI_IDX | 1 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------------
PLAN_TABLE_OUTPUT --------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("INFRASTR"."STARTPOINT_X"("SHAPE") IS NOT NULL AND
"INFRASTR"."STARTPOINT_Y"("SHAPE") IS NOT NULL AND "ID" IS NOT NULL)
4 - access("ID"="B"."ID"(+))
Note
-----
- dynamic statistics used: dynamic sampling (level=2)
I had incorrectly assumed that it would be more complicated than that. I thought I might need to force Oracle to invoke the FBI when selecting the indexed columns, while also doing a table scan to get the unindexed columns. I wondered if some sort of materialized hint would be needed. But it looks like the join did what I want without needing a hint.
In general, the benefit of this overall non-spatial FBI solution is:
It works around a limitation with function-based spatial indexes:
From @DavidLapp of Oracle
A spatial index is invoked only by the WHERE clause, not the SELECT list.
Whereas my non-spatial FBI doesn’t have that problem. It can be invoked by the SELECT list, as long as there are IS NOT NULLs
in the where clause.
Next step:
In theory, maybe the resulting numeric XY columns could be "spatialized" after-the-fact:
A) Convert the XY to SDO_GEOMETRY on-the-fly / after-the-fact in a query.
or B) Use GIS software to display the XYs as points in the map — using a dynamic "layer" in the application. For example, use ArcGIS Pro's "XY Event Layer" functionality (which is surprisingly fast: https://gis.stackexchange.com/questions/380176/are-xy-event-layers-optimized-indexed-on-the-fly).
Source: “Regarding the idea of indexing/returning a non-spatial datatype…”