Is there a quickstart guide for programmers writing DSP-accelerated applications for the TMS320C64x?
I have a program with a custom algorithm (not an FFT or the usual filtering), and I want to accelerate it using a multi-DSP coprocessor. How should I modify the source to move computation from the main CPU to the DSPs? What limitations apply to code running on the DSPs?
I have some experience with CUDA. In CUDA, I have to mark every function as host, device, or a device entry point (kernel). There are also functions to launch kernels and to upload/download data to/from the GPU. Device code has some limitations, which are described in the CUDA reference manual. I hope there is a similar interface and similar documentation for the DSP.
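To make the comparison concrete, here is a minimal sketch of the CUDA pattern I mean. The names `square` and `square_all` and the array size are made up for illustration, and error checking is left out for brevity.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Device-only helper: callable from kernels, not from host code.
__device__ float square(float x) { return x * x; }

// Kernel: the entry point that the host launches on the GPU.
__global__ void square_all(const float *in, float *out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) out[i] = square(in[i]);
}

int main() {
    const int n = 8;  // illustrative size only
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i) h_in[i] = (float)i;

    // Allocate device buffers.
    float *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));

    // Upload input, launch the kernel, download the result.
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);
    square_all<<<1, n>>>(d_in, d_out, n);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    for (int i = 0; i < n; ++i) printf("%g\n", h_out[i]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

What I am looking for is the DSP equivalent of each piece: how to mark or build the code that runs on the DSP, how to start it from the main CPU, and how to move the data back and forth.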
Try searching for "TMS320C64x programmer's guide". Here is what I think is the most appropriate link:
focus.ti.com/lit/ug/spru565b/spru565b.pdf
Also check this book to help you get started (it uses the previous generation):
Embedded Image Processing on the TMS320C6000 DSP: Examples in Code Composer Studio and MATLAB