sqloracle-databaseplsql

Read data from a table in a PL/SQL procedure and insert into another table while one hot encoding


I am trying to read data from a table in PL/SQL through a cursor where I am trying insert the data back to another table but one hot encoded.

So the data is like this:

ORIG_TABLE

ID Fruit
1 Orange
2 Apple
3 Mango
3 Orange
4 Apple

And what I am trying to achieve is the following table

ONE_HOT

ID Orange Apple Mango
1 1 0 0
2 0 1 0
3 1 0 1
4 0 1 0

I already have the new table created with the necessary columns, because the range of the values (in this case {Orange, Apple, Mango}) is already known.

CREATE TABLE ONE_HOT
(   
ID        VARCHAR(11 BYTE),
ORANGE    NUMBER,
APPLE     NUMBER,
MANGO     NUMBER
)

I tried reading the cursor and tried inserting through case expression. Code is below, but is not working. I was expecting the code to read a single line at a time and based on the fruit that the current line hold corresponding to the id the current line hold, and then insert 1 or 0 in the appropriate column. And expected the code to keep doing so until there are values in the table.

CURSOR FRUIT_CUR IS SELECT * FROM ORIG_TABLE;
FRUIT_REC FRUIT_CUR%ROWTYPE

OPEN FRUIT_CUR

LOOP

FETCH FRUIT_CUR INTO FRUIT_REC;
EXIT WHEN FRUIT_CUR%NOTFOUND;

INSERT INTO ONE_HOT
VALUES(
FRUIT_REC.ID,
CASE WHEN FRUIT_REC.FRUIT = 'Orange' THEN 1 ELSE 0 END AS ORANGE,
CASE WHEN FRUIT_REC.FRUIT = 'Apple' THEN 1 ELSE 0 END AS APPLE,
CASE WHEN FRUIT_REC.FRUIT = 'Mango' THEN 1 ELSE 0 END AS MANGO

);

  
END LOOP;

CLOSE FRUIT_CUR;

Solution

  • It is always helpful to post the error(s) you are getting when you ask a question so that we don't have to guess. If I correct a bunch of syntax errors, this code works.

    declare  -- declare missing
      CURSOR FRUIT_CUR IS SELECT * FROM ORIG_TABLE;
      FRUIT_REC FRUIT_CUR%ROWTYPE; -- closing semicolon missing
    begin -- begin missing
      OPEN FRUIT_CUR; -- closing semicolon missing
      LOOP
        FETCH FRUIT_CUR INTO FRUIT_REC;
        EXIT WHEN FRUIT_CUR%NOTFOUND;
    
        INSERT INTO ONE_HOT
          VALUES(
            FRUIT_REC.ID,
            CASE WHEN FRUIT_REC.FRUIT = 'Orange' THEN 1 ELSE 0 END,  -- no AS
            CASE WHEN FRUIT_REC.FRUIT = 'Apple' THEN 1 ELSE 0 END, -- no AS
            CASE WHEN FRUIT_REC.FRUIT = 'Mango' THEN 1 ELSE 0 END -- no AS
        );
      END LOOP;
    
      CLOSE FRUIT_CUR;  -- missing semicolon
    end;  -- missing end
    /
    

    Of course, it is likely simpler to just use an implicit cursor

    begin
      for fruit in (select * from orig_table)
      loop
        insert into one_hot( id, orange, apple, mango )
          values( fruit.id,
                  case when fruit.fruit = 'Orange' then 1 else 0 end,
                  case when fruit.fruit = 'Apple' then 1 else 0 end,
                  case when fruit.fruit = 'Mango' then 1 else 0 end );
      end loop;
    end;
    /