I am currently working on a VHDL module that first loads the input data in parallel into a shift register and then shifts the stored data out one bit per clock cycle. For this I implemented a state machine that generates an SR_SHIFT_ENABLE strobe on the rising edge in every clock cycle, which then activates the shift register and puts the next bit on the output. On reviewing my design, however, I had doubts about how it will behave once implemented on the FPGA.
In my current implementation, the shift register evaluates the generated SR_SHIFT_ENABLE strobe on the rising clock edge.
[Figure: Enable pulse is evaluated at the rising edge]
However, I'm wondering whether the approach shown in the simulation above could cause timing problems in the FPGA implementation. Specifically, my concern is the SR_SHIFT_ENABLE pulse. According to the simulation, the shift register evaluates SR_SHIFT_ENABLE right at the end of the pulse. My worry is that the pulse, which is deasserted on that same rising edge, might not stay valid at the shift register long enough after the edge (hold time violation?).
One possibility would be to evaluate the SR_SHIFT_ENABLE pulse "in the middle" of the clock period, i.e. at the falling clock edge. In this way, I could ensure that no setup or hold timing violations occur (see simulation).
[Figure: Enable pulse is evaluated at the falling edge]
Since both alternatives work in the simulation, I would therefore like to know which of the two variants is preferable.
Thank you in advance for your help :)
I essentially tried two different implementations: evaluating SR_SHIFT_ENABLE on the rising clock edge and evaluating it on the falling clock edge.
Case 1. If these signals are internal and your design is synchronous, you don't need to worry about setup/hold time, because the FPGA tools will take care of it during timing analysis.
Case 2. If these signals are external (on the I/O pins of the FPGA), you can set input/output delay constraints on them during implementation to meet the timing requirements.
Case 3. If you don't know how to constrain input/output delays, it's a good idea to toggle outputs on the falling clock edge and sample inputs on the rising clock edge.
However, case 3 sacrifices setup-time margin for better hold-time margin, which limits the maximum frequency of the I/O interface. That is fine for low-speed applications.
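To illustrate case 1, here is a minimal sketch of a parallel-in/serial-out shift register whose enable is sampled on the rising edge of the same clock that generates it. The entity name `piso_shift_reg`, the `load` strobe, the `data_in`/`serial_out` ports and the `WIDTH` generic are assumptions made for illustration and are not taken from the original design; only `SR_SHIFT_ENABLE` comes from the question. Because the state machine drives the enable from a flip-flop on the same clock, the enable only changes one clock-to-output delay after the edge, so the value sampled at the edge is the one that was stable for the whole previous cycle.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical entity: names, ports and WIDTH are illustrative assumptions.
entity piso_shift_reg is
  generic (
    WIDTH : positive := 8  -- assumed to be at least 2
  );
  port (
    clk             : in  std_logic;
    load            : in  std_logic;  -- assumed one-cycle strobe: capture parallel data
    sr_shift_enable : in  std_logic;  -- one-cycle strobe from the state machine
    data_in         : in  std_logic_vector(WIDTH - 1 downto 0);
    serial_out      : out std_logic
  );
end entity piso_shift_reg;

architecture rtl of piso_shift_reg is
  signal sr : std_logic_vector(WIDTH - 1 downto 0) := (others => '0');
begin

  -- Everything is clocked on the rising edge of the single clock.
  -- The enable acts as a synchronous clock enable, not as a clock.
  process (clk)
  begin
    if rising_edge(clk) then
      if load = '1' then
        sr <= data_in;                        -- parallel load
      elsif sr_shift_enable = '1' then
        sr <= sr(WIDTH - 2 downto 0) & '0';   -- shift towards the MSB, fill with '0'
      end if;
    end if;
  end process;

  -- MSB is sent first in this sketch (an assumption).
  serial_out <= sr(WIDTH - 1);

end architecture rtl;
```

With this structure the enable maps onto the clock-enable input of the flip-flops (or onto logic in front of their data inputs), and the timing analysis in the FPGA tools checks setup and hold on that path like on any other register-to-register path. Moving the evaluation to the falling edge is not needed for internal signals and would halve the time available for the path.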