verilog xilinx synthesize

How can I improve my code to reduce the synthesis time?


I have written some Verilog code for a median filter that uses a cumulative histogram method. When I try to synthesize it with the Xilinx tools, synthesis runs for up to an hour and finally fails with the error "program ran out of memory".

My code is:

//***** MEDIAN FILTER BY USING CUMULATIVE HISTOGRAM METHOD******//

module medianfilter(median_out,clk,a1,a2,a3,a4,a5,a6,a7,a8,a9);
output [7:0] median_out; // width must match the 8-bit reg declaration below
input [7:0]a1,a2,a3,a4,a5,a6,a7,a8,a9;
integer i,j;
reg[7:0]b[255:0];
reg [7:0]buff[0:8];
input clk;
reg [7:0]median_out;
always@(negedge clk)
begin 
    //**************************************************************************//
    for(i=0;i<256;i=i+1) // initialize the memory bins with zeros
    b[i]=0;
    //*************************************************************************//

    buff[0]=a1;
    buff[1]=a2;
    buff[2]=a3;
    buff[3]=a4;
    buff[4]=a5;
    buff[5]=a6;
    buff[6]=a7;
    buff[7]=a8;
    buff[8]=a9;
    for(i=0;i<9;i=i+1)  // this loop is for cumulative histogram method
    begin
        b[buff[i]]=b[buff[i]]+1;   // incrementing the bin addressed by buff[i]
            for(j=0;j<256;j=j+1)
                if(j>buff[i])
                    b[j]=b[j]+1; // incrementing every bin above the buff[i]th bin


    end
//**************************************************************************//
    for(i=0;i<256;i=i+1) // loop for finding the median 
    begin
        if(b[i]>4)  ///////// condition for checking median
        begin
            b[i]=1;
            median_out=i;
            i=256; // loop breaks here
        end
    end
//*************************************************************************//
end

endmodule

How can I make the code synthesizable?


Solution

  • How many adders are generated by your code? I see at least 2,100 8-bit adders which are working in the same cycle.

    You should rethink your algorithm: a median filter needs an ordered list of pixel values, so you should first think about how to sort numbers efficiently on an FPGA.

    A good approach is to use a sorting network.

    Sorting 9 numbers can't practically be done in one cycle, so you need pipelining. (A single-cycle implementation is possible, but only at a very low clock speed.)

    Our PoC-Library contains pipelined sorting networks, but I have never tested these networks with a non-power-of-two input size!
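To illustrate the pipelining idea, here is a minimal sketch of a 3-stage pipelined median-of-9. It is not the PoC-Library implementation and not a full sorting network: it assumes the well-known 3x3 median shortcut (sort each group of three, then take the median of "max of the minimums", "median of the medians" and "min of the maximums"). The module name `median9_pipelined`, the helper functions, and the use of the rising clock edge are assumptions made for this example; the port names follow the question's code.

```verilog
// Hypothetical sketch: pipelined median of nine 8-bit values.
// Three register stages; a result appears on median_out three rising
// clock edges after its inputs were applied (counting the sampling edge).
module median9_pipelined (
    input  wire       clk,
    input  wire [7:0] a1, a2, a3, a4, a5, a6, a7, a8, a9,
    output reg  [7:0] median_out
);
    // Two-input compare helpers (one comparator + mux each).
    function [7:0] min2(input [7:0] x, input [7:0] y);
        min2 = (x < y) ? x : y;
    endfunction

    function [7:0] max2(input [7:0] x, input [7:0] y);
        max2 = (x > y) ? x : y;
    endfunction

    // Three-input helpers built from the two-input ones.
    function [7:0] min3(input [7:0] x, input [7:0] y, input [7:0] z);
        min3 = min2(min2(x, y), z);
    endfunction

    function [7:0] max3(input [7:0] x, input [7:0] y, input [7:0] z);
        max3 = max2(max2(x, y), z);
    endfunction

    function [7:0] med3(input [7:0] x, input [7:0] y, input [7:0] z);
        med3 = max2(min2(x, y), min2(max2(x, y), z));
    endfunction

    // Stage 1 registers: each group of three, sorted into lo/mid/hi.
    reg [7:0] lo0, mid0, hi0, lo1, mid1, hi1, lo2, mid2, hi2;
    // Stage 2 registers: the three remaining median candidates.
    reg [7:0] max_of_lo, med_of_mid, min_of_hi;

    always @(posedge clk) begin
        // Stage 1: sort (a1,a2,a3), (a4,a5,a6), (a7,a8,a9).
        lo0 <= min3(a1, a2, a3); mid0 <= med3(a1, a2, a3); hi0 <= max3(a1, a2, a3);
        lo1 <= min3(a4, a5, a6); mid1 <= med3(a4, a5, a6); hi1 <= max3(a4, a5, a6);
        lo2 <= min3(a7, a8, a9); mid2 <= med3(a7, a8, a9); hi2 <= max3(a7, a8, a9);

        // Stage 2: reduce the nine sorted values to three candidates.
        max_of_lo  <= max3(lo0, lo1, lo2);
        med_of_mid <= med3(mid0, mid1, mid2);
        min_of_hi  <= min3(hi0, hi1, hi2);

        // Stage 3: the median of the three candidates is the median of all nine.
        median_out <= med3(max_of_lo, med_of_mid, min_of_hi);
    end
endmodule
```

Compared with the histogram version, this needs only a few dozen 8-bit comparators instead of thousands of adders, and a new set of nine pixels can be applied every clock cycle. If a fully sorted list is required rather than just the median, a real sorting network such as the ones mentioned above would be the better fit.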