
Serializing code in VHDL


I'm attempting to create a (very basic) GPU on a Spartan-6 FPGA using VHDL.

The big problem I have hit is that my understanding of HDL is quite limited: I've been writing my code using nested for loops for ray tracing/scanline rasterization algorithms without considering that these enormous loops consume more than 100% of the DSP slices when they are unrolled during synthesis.

My question is: if I use a clock-triggered counter in place of a for loop (using the counter as the index and resetting it to 0 at its maximum), would this mean the logic is only generated once? Taking ray tracing on a 600x800 screen with a 200 MHz clock as an example, and assuming one pixel is processed per clock cycle, the refresh rate of the entire screen would drop to about 417 Hz (200 MHz / 480,000 pixels), but that should still be quick enough in theory..?

Thanks very much!


Solution

  • If you implement a for loop, then the functionality in the for loop is executed at the same time for all the values that the for loop goes through. To achieve this, the synthesis tool must implement the functionality once for each value in the for loop, so you will still have the massive hardware implementation.

    For example, this code unrolls to parallel hardware for the functionality (one AND gate per index in this case), but without any hardware overhead from the for loop itself:

    process (clk_i) is
    begin
      if rising_edge(clk_i) then
        for idx_par in z_par_o'range loop
          z_par_o(idx_par) <= a_i(idx_par) and b_i(idx_par);  -- Functionality
        end loop;
      end if;
    end process;
    

    Interleaving of processing for different data values must be implemented with explicit handling in the VHDL: keep the index in a signal, and increment and wrap that signal each time the functionality has calculated the result for the current value.

    In contrast, this code makes serial hardware for the functionality, but with hardware overhead for the index counter and for selecting which element is processed in each cycle:

    -- Assumes a declaration such as: signal idx_ser : natural range 0 to LEN - 1;
    process (clk_i) is
    begin
      if rising_edge(clk_i) then
        if rst_i = '1' then  -- Reset
          idx_ser <= 0;
        else  -- Operation
          z_par_o(idx_ser) <= a_i(idx_ser) and b_i(idx_ser);  -- Functionality
          if idx_ser /= LEN - 1 then  -- Not at end of range
            idx_ser <= idx_ser + 1;  -- Increment
          else  -- At end of range
            idx_ser <= 0;  -- Wrap
          end if;
        end if;
      end if;
    end process;
    

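    Ordinary VHDL synthesis tools cannot turn a for loop into hardware that executes its iterations over successive clock cycles; that sequencing has to be written explicitly, as in the counter-based process above.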
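
    Since the question is about nested loops over a screen, the same idea can be extended to two counters, one per loop level. The following is only a minimal sketch, not a tested design: the entity name pixel_scan, the generics WIDTH and HEIGHT, and the ports x_o, y_o and frame_done_o are hypothetical names chosen for illustration. It steps through one pixel coordinate per clock cycle, so the per-pixel functionality would be instantiated only once and fed with x_o and y_o.

    ```vhdl
    library ieee;
    use ieee.std_logic_1164.all;

    -- Hypothetical example entity; all names and defaults are assumptions.
    entity pixel_scan is
      generic (
        WIDTH  : positive := 800;   -- Pixels per line (assumed)
        HEIGHT : positive := 600);  -- Lines per frame (assumed)
      port (
        clk_i        : in  std_logic;
        rst_i        : in  std_logic;  -- Synchronous reset, active high
        x_o          : out natural range 0 to WIDTH - 1;
        y_o          : out natural range 0 to HEIGHT - 1;
        frame_done_o : out std_logic);  -- High while the last pixel is addressed
    end entity;

    architecture rtl of pixel_scan is
      signal x : natural range 0 to WIDTH - 1  := 0;
      signal y : natural range 0 to HEIGHT - 1 := 0;
    begin

      process (clk_i) is
      begin
        if rising_edge(clk_i) then
          if rst_i = '1' then          -- Reset both counters
            x <= 0;
            y <= 0;
          elsif x /= WIDTH - 1 then    -- Inner "loop": next pixel in the line
            x <= x + 1;
          else                         -- End of line: wrap x, step outer "loop"
            x <= 0;
            if y /= HEIGHT - 1 then
              y <= y + 1;
            else
              y <= 0;                  -- End of frame: wrap y
            end if;
          end if;
        end if;
      end process;

      x_o <= x;
      y_o <= y;
      frame_done_o <= '1' when x = WIDTH - 1 and y = HEIGHT - 1 else '0';

    end architecture;
    ```

    With the assumed defaults, one frame takes 800 * 600 = 480,000 clock cycles, which is roughly 417 frames per second at 200 MHz if the per-pixel logic really finishes in a single cycle. If the per-pixel calculation needs several cycles, it would have to be pipelined, or the counters would need an enable so that they only advance when a result is ready.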