video signal-processing esp32 wireless hdmi

Best approach for implementing wireless video transmission using ESP32 microcontrollers and HDMI


I am working on a project to create a wireless video transmission system using two ESP32 microcontrollers. The first ESP32 will receive video from a camera via HDMI and wirelessly transmit it to the second ESP32. The second ESP32 will then output the video signal to a monitor via HDMI. I am considering C or Python as the language, whichever is better suited.

I am looking for advice on the best approach to set up the wireless transmission and process the HDMI signals on the ESP32 microcontrollers. Specifically, I am interested in how to connect the HDMI cable to the microcontroller and process the signal. Additionally, I would appreciate any tips or best practices for optimizing the performance of the system.

What I've tried: I have done some research on wireless transmission and HDMI signal processing on the ESP32, but I have not found a clear path forward. I am not new to microcontrollers, but I have never done much with C.


Solution

  • I hope I can give you enough context to understand what kind of challenge you're taking on, possibly without fully realizing it.

    Specifically, I am interested in how to connect the hdmi cable to the microcontroller and process it.

    Which camera? At what resolution? In the absence of information, I'll assume the camera is something similar to a webcam, with a 720p or 1080p live video stream.

    If that's the case, it's not possible with just an ESP32. You need a much more powerful SoC (or a single-board computer built around such an SoC) to do this kind of video signal processing. I believe it's possible with a Raspberry Pi, but definitely not with an ESP32.

    To transmit or receive an HDMI signal, you must use a specialized HDMI controller or transceiver, either one that is already built into the SoC you're using or an external one. But the ESP32 does not contain any HDMI controller or transceiver. Unlike a 115.2 kbps serial port, a 720p HDMI video signal running at 742.5 Mbps per lane is not something that you can bit-bang! So it's not possible, at least not without external controllers.
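To show where the 742.5 Mbps figure comes from, here is a small sketch of the arithmetic. It assumes the standard CEA-861 total timings (active pixels plus blanking) for 60 Hz modes and the fact that TMDS sends each 8-bit value as a 10-bit character; the function and constant names are made up for illustration.

```python
# Total frame timings (active + blanking) at 60 Hz.
# Assumption: standard CEA-861 timings for these two modes.
TIMINGS = {
    "720p60": (1650, 750, 60),
    "1080p60": (2200, 1125, 60),
}

# TMDS encodes every 8-bit value as a 10-bit character on the wire.
TMDS_BITS_PER_CHAR = 10


def tmds_lane_rate_bps(h_total, v_total, refresh_hz):
    """Return the serial bit rate of one TMDS data lane in bits per second."""
    pixel_clock_hz = h_total * v_total * refresh_hz
    return pixel_clock_hz * TMDS_BITS_PER_CHAR


for name, timing in TIMINGS.items():
    rate = tmds_lane_rate_bps(*timing)
    print(f"{name}: {rate / 1e6:.1f} Mbps per lane")
```

For comparison, a GPIO pin toggled from software on a microcontroller typically manages a few MHz at best, which is why the serial-port comparison above is apt.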

    Even if I assume the ESP32 could handle HDMI, it's still impractical to transmit an uncompressed video stream over a wireless network. The bandwidth requirement is enormous, while the ESP32 only has 802.11n at best. This means you'll need video compression, such as H.264. Even on a desktop computer this can be a computationally intensive task, and it's not something a microcontroller or SoC can handle without hardware acceleration. But the ESP32 does not have any hardware video encoder or decoder.
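A rough back-of-the-envelope calculation makes the bandwidth gap concrete. This is only a sketch: the 150 Mbps figure is an assumed best-case single-stream 802.11n PHY rate (real application throughput is considerably lower), and 24 bits per pixel at 60 frames per second is assumed for the video.

```python
def raw_video_bps(width, height, bits_per_pixel, fps):
    """Bit rate of an uncompressed video stream, active pixels only."""
    return width * height * bits_per_pixel * fps


# Assumption: best-case single-stream 802.11n PHY rate, not real throughput.
ASSUMED_WIFI_PHY_BPS = 150_000_000

raw_720p = raw_video_bps(1280, 720, 24, 60)
min_compression_ratio = raw_720p / ASSUMED_WIFI_PHY_BPS

print(f"Uncompressed 720p60: {raw_720p / 1e6:.0f} Mbps")
print(f"Compression needed just to fit the PHY rate: {min_compression_ratio:.1f}x")
```

Even under this optimistic assumption the stream has to shrink by almost an order of magnitude, which is only realistic with a proper video codec.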

    Meanwhile, if you do use external controllers, the ESP32 would be somewhat pointless, since it's the external controllers that are doing the actual work. The ESP32 could perhaps be used to control their power switch and handle other housekeeping tasks like watchdog timers, but it does not have the computing power to process the video.

    Comments

    If you're thinking that it's possible to generate or process an HDMI signal (at perhaps 720p) directly with an ESP32, I suggest you familiarize yourself with some basic electronics first. It's important to realize that modern data links, such as HDMI, USB, SATA, or Gigabit Ethernet, were made possible and became commonplace only because of the tremendous engineering effort spent on high-speed electronics.

    First, at the data link layer, these kinds of interfaces run at extremely fast data rates. An HDMI link carrying a 720p signal at 60 Hz, using 8-bit color, runs at 742.5 Mbps per lane. A 1080p signal pushes the data rate to 1.485 Gbps per lane. So these kinds of electronics operate at frequencies around 1 GHz. Meanwhile, an ESP32 has a CPU clocked at no more than 240 MHz with 320 KiB of SRAM; it is not even able to hold a single frame of the image, let alone generate it. A proper DSP would have megabytes to gigabytes of DRAM. If you want to "process" the video, even more computing power is needed, and it likely requires hardware acceleration.
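The claim that a single frame does not fit in memory can be checked with a few lines. This sketch uses the 320 KiB SRAM figure quoted above and assumes 3 bytes per pixel (8 bits per color channel, no padding); the names are illustrative.

```python
# SRAM size as quoted in the text above.
SRAM_BYTES = 320 * 1024


def frame_bytes(width, height, bytes_per_pixel=3):
    """Size of one uncompressed frame, assuming 8 bits per color channel."""
    return width * height * bytes_per_pixel


for name, (w, h) in {"720p": (1280, 720), "1080p": (1920, 1080)}.items():
    size = frame_bytes(w, h)
    print(f"{name}: {size / 1024:.0f} KiB per frame, "
          f"{size / SRAM_BYTES:.1f}x the available SRAM")
```

A 720p frame alone is several times larger than the whole SRAM, before counting the program's own stack, heap, and Wi-Fi buffers.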

    Next, at the physical layer (PHY), high-speed signal transmission requires specialized transmitters and receivers, capable of doing Low-Voltage Differential Signaling (LVDS), Current-Mode Logic (CML), or in the case of HDMI, Transition-Minimized Differential Signaling (TMDS). They often also perform clock and data recovery, and possibly equalization.

    The end result is that an HDMI transceiver is an advanced piece of technology that not everyone can build, and in particular it's not something you can "bit-bang" using a microcontroller.

    The standard solutions are basically either using some kind of external HDMI transceiver or controller that can handle HDMI signals by itself, or using an SoC that already comes with a built-in HDMI transceiver, or something in between.

    1. Use an external DSP or FPGA chip. They can be controlled by a much simpler microcontroller, but the actual heavy processing is performed on these DSP/FPGA chips.

    2. Use a DSP, FPGA or SoC with the necessary hardware already built in. Many DSP chips already embed both video processing blocks and ARM microcontroller cores. For example, take a look at the block diagram of this Texas Instruments TMS320DM8127 DaVinci™ Video Processor:

    Figure 1-1. TMS320DM8127 DaVinci Digital Media Processors Functional Block Diagram

    3. Use an SoC that contains some kind of video signal transceiver or processor, just not HDMI, then use an external converter chip to convert that signal to or from HDMI. For example, if USB is supported, it's possible to use an existing HDMI USB capture card. Another curious example is Raspberry Pi's SoC, which is equipped with a CSI-2 port, basically a fast data interface that is more than enough to carry an HD video signal from a camera. In such a case, only a relatively simple converter chip is needed to allow HDMI capture, and there are many adapter boards available for purchase.