Application design of real-time image processing system based on OMAP5910 dual-core processor

The defining feature of a real-time image processing system is its large data volume, so processing and transferring image data efficiently is the key to building one. TI's OMAP5910 is a high-performance multimedia dual-core processor that integrates a high-performance, low-power TMS320C55x DSP and an ARM925 microprocessor with strong control capabilities on a single chip. This paper focuses on how to exploit the dual-core architecture, make sensible use of the OMAP5910's various memory configurations, and configure the DMA controller to transfer large volumes of image data efficiently in real time.


1 Memory management of OMAP5910

Because the OMAP5910 supports several kinds of memory, designing a DMA transfer scheme requires a detailed understanding of its memory management.

On chip, the MPU subsystem integrates 192 KB of SRAM, while the DSP subsystem integrates 64 KB of dual-access RAM (DARAM), 96 KB of single-access RAM (SARAM), and 32 KB of program ROM (PDROM). The memory maps of the MPU and DSP subsystems are shown in Figure 1. The OMAP5910 can also access off-chip memory through the EMIFF and EMIFS interfaces, but off-chip accesses are much slower than on-chip accesses.

Access to OMAP5910 memory is managed mainly by the traffic controller (TC). The TC arbitrates accesses by the MPU, DSP, DMA, and local bus to the system's storage resources (SRAM, SDRAM, Flash, ROM, and so on). Its main job is to let the processors access external memory efficiently and to avoid bottlenecks that would reduce on-chip processing speed. The TC lets a processor or DMA unit reach memory through three different interfaces: EMIFS, EMIFF, and IMIF. EMIFS provides access to Flash, SRAM, or ROM; EMIFF provides access to SDRAM; and IMIF provides access to the 192 KB of on-chip SRAM. The three interfaces are completely independent, so any processor or DMA unit can access them concurrently.
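The interface assignment described above can be captured in a small lookup model. This is only an illustrative sketch: the table contents come from the text, but the names `TC_INTERFACES`, `interface_for`, and `can_run_concurrently` are hypothetical and do not correspond to any TI API.

```python
# Interface assignments as stated in the text; all identifiers are illustrative.
TC_INTERFACES = {
    "EMIFS": {"Flash", "external SRAM", "ROM"},  # slow external memory interface
    "EMIFF": {"SDRAM"},                          # fast external memory interface
    "IMIF": {"on-chip SRAM"},                    # 192 KB internal SRAM
}


def interface_for(memory_type):
    """Return the name of the TC interface that serves the given memory type."""
    for name, kinds in TC_INTERFACES.items():
        if memory_type in kinds:
            return name
    raise ValueError(f"no TC interface serves {memory_type!r}")


def can_run_concurrently(mem_a, mem_b):
    """Two accesses can overlap when they use different, independent interfaces."""
    return interface_for(mem_a) != interface_for(mem_b)


print(interface_for("SDRAM"))                         # EMIFF
print(can_run_concurrently("SDRAM", "on-chip SRAM"))  # True
print(can_run_concurrently("Flash", "ROM"))           # False
```

The practical consequence for a transfer scheme is that a DMA move between SDRAM and on-chip SRAM need not compete with a Flash access, because the two use different interfaces.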


2 DMA controller of OMAP5910

The DMA controller is central to a real-time image processing system built on the OMAP5910. It moves data within the mapped memory space without CPU involvement, and flexible use of it can greatly improve data transfer efficiency.

For general-purpose transfers, the OMAP5910 DMA controller has the following characteristics:

1) Single-channel split operation, with general-purpose and dedicated channels and separate ports for the different hardware resources. All data exchanges use a handshake based on Request, Ready, and Abort signals. The DMA channels are time-division multiplexed; the basic transfer flow is shown in Figure 2.


2) Multi-frame transfer. Each block transfer can contain multiple data frames. Elements can be 8, 16, or 32 bits wide, data can be packed and unpacked at byte granularity, and the number of bytes transferred can be counted. The whole memory address space (physical address map and I/O space) is accessible.

3) Interrupts. DMA read, write, and frame operations can trigger interrupts. Each physical DMA channel can raise an interrupt so that the processor can respond to the status of the transfer. All DMA interrupts are level-triggered.

4) Background transfer with high throughput. The DMA works independently of the CPU and can move data at the CPU clock rate.
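To make the block/frame/element terminology of item 2 concrete, the sketch below computes how many bytes one block transfer moves. It is a plain arithmetic model, not controller code. The helper name `block_bytes` is hypothetical, and mapping one image row to one DMA frame, with each 14-bit pixel held in a 16-bit element, is an assumption made for illustration.

```python
# Element widths named in the text, mapped to their size in bytes.
ELEMENT_BYTES = {8: 1, 16: 2, 32: 4}


def block_bytes(frames, elements_per_frame, element_bits):
    """Total bytes moved by one block made of several equally sized frames."""
    if element_bits not in ELEMENT_BYTES:
        raise ValueError("element size must be 8, 16 or 32 bits")
    return frames * elements_per_frame * ELEMENT_BYTES[element_bits]


# Assumption: 8 image rows sent as 8 frames of 320 sixteen-bit elements each.
rows_block = block_bytes(frames=8, elements_per_frame=320, element_bits=16)
print(rows_block)  # 5120
```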

Image data in a real-time image processing system is very large, and processing generates a large amount of intermediate data as well. The on-chip resources of the OMAP5910 are limited and cannot hold a full frame of image data together with its intermediate data, so most image data must be stored in off-chip memory. To preserve real-time performance, the DMA moves data between the different memory spaces without consuming CPU clock cycles, which keeps the CPU from spending most of its time blocked on external memory accesses. The DMA's data rearrangement capability can also optimize how image data is laid out in memory, which improves both the utilization of internal memory and the data transfer rate.

3 OMAP5910 internal and external memory data exchange analysis

A complete real-time image processing system must both acquire and process images in real time. The system consists mainly of an image sensor, an A/D converter, a complex programmable logic device (FPGA), the OMAP5910 dual-core processor, and image display equipment. In operation, the FPGA receives the 14-bit video signal output by the infrared focal plane array sensor in real time. After frequency reduction, the DSP core of the OMAP5910 runs the image processing algorithms while the ARM core executes complex control instructions. The result is then buffered in the FPGA, and a 10-bit video signal is synthesized and output through D/A conversion. In addition, the ARM core receives control commands from a computer through an interface.

According to the visual requirements of the human eye, the imaging system must acquire and process at least 25 frames per second to avoid visible flicker in the real-time display. For a 320×240 image digitized at 14 bits, each frame contains 320×240×14 bits ≈ 1 Mbit. Meeting the real-time requirement therefore means processing and displaying data at 320×240×14 bits × 25 frames/s ≈ 25 Mbit/s, or about 3.125 MB/s. In other words, reading one line (320 pixels) and writing one line (346 pixels) must complete within 64 μs; only then does the image remain continuous.
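The data-rate arithmetic can be checked with a few lines of Python. The exact product is 1,075,200 bits per frame; rounding that to 1 Mbit gives the 3.125 MB/s quoted above, while the unrounded figure is 3.36 MB/s (decimal megabytes). The 64 μs line time equals the line period of a 625-line, 25 frame/s raster; that this is where the figure comes from is an assumption, since the source does not say.

```python
WIDTH, HEIGHT = 320, 240
BITS_PER_PIXEL = 14
FRAMES_PER_SECOND = 25

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL       # exact bit count per frame
bytes_per_frame = bits_per_frame // 8                  # whole bytes per frame
bits_per_second = bits_per_frame * FRAMES_PER_SECOND   # exact bit rate

# Rate obtained when a frame is rounded to 1 Mbit, as in the text.
rounded_mb_per_s = 1_000_000 * FRAMES_PER_SECOND / 8 / 1e6

# Assumption: 64 us is the line period of a 625-line, 25 frame/s raster.
line_period_us = 1e6 / (625 * FRAMES_PER_SECOND)

print(bits_per_frame)          # 1075200
print(bytes_per_frame)         # 134400
print(bits_per_second / 8e6)   # 3.36
print(rounded_mb_per_s)        # 3.125
print(line_period_us)          # 64.0
```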

To guarantee real-time processing and display, the data transfer channels of the OMAP5910 should be fully used. Figure 3 shows all the channels that carry data between on-chip and external memory while the OMAP5910 processes images in real time; their transfer rates depend on the type of memory involved. To characterize these channels, the author ran a series of experiments with the system clock set to 150 MHz, the operating mode set to fully synchronous, and the cache enabled, and measured in detail the time each channel takes to transfer one frame of data. The results are listed in Table 1 and serve as the basis for optimizing data transfer.


4 DMA mode data transmission optimization scheme

Based on this analysis of the transfer rate of each channel, this paper proposes a data transfer optimization scheme in DMA mode. The scheme divides each frame into multiple blocks, so that the image data being processed resides entirely in the on-chip data memory of the OMAP5910. This reduces the number of interactions with external memory and makes full use of the fast on-chip storage. Data transfers between internal and external memory are carried out by the DMA in the background, which greatly improves the efficiency of the OMAP5910.

4.1 Data flow

Data sampled by the A/D converter is first stored in an external buffer. Once a certain amount of data has been collected, the complex programmable logic device triggers the ARM DMA to read it; two frames of images are input in sequence and saved in SDRAM. The parameters A and B required for image processing are fetched from external Flash and also stored in SDRAM. The frame is handled in blocks of 8 lines: the DSP DMA is triggered to transfer each block from SDRAM, the external buffer of the OMAP5910, to DARAM, the dual-access internal buffer of the DSP core, where the DSP core performs its calculations. Because DMA transfers run in the background, the DSP DMA writes the previous block of image data (8 rows) back to SDRAM while the DSP core is computing. After the ARM core receives the output row data, it triggers the ARM DMA to move the data to the external memory area controlled by the FPGA. The data flow is shown in Figure 4.
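A quick sizing check shows why 8-line blocks are workable. The sketch below splits a 240-row frame into blocks and compares the sizes with the 64 KB of DARAM mentioned in section 1. Storing each 14-bit sample in a 16-bit word is an assumption, and `split_into_blocks` is a hypothetical helper, not code from the original system.

```python
WIDTH, HEIGHT = 320, 240
BYTES_PER_PIXEL = 2          # assumption: 14-bit sample kept in a 16-bit word
BLOCK_ROWS = 8               # block height given in the text
DARAM_BYTES = 64 * 1024      # DSP dual-access RAM size from section 1


def split_into_blocks(height, block_rows):
    """Return (first_row, row_count) pairs that together cover the frame."""
    return [(row, min(block_rows, height - row))
            for row in range(0, height, block_rows)]


blocks = split_into_blocks(HEIGHT, BLOCK_ROWS)
frame_bytes = WIDTH * HEIGHT * BYTES_PER_PIXEL
block_size = WIDTH * BLOCK_ROWS * BYTES_PER_PIXEL

print(len(blocks))                     # 30
print(frame_bytes > DARAM_BYTES)       # True: a whole frame does not fit
print(2 * block_size <= DARAM_BYTES)   # True: two blocks fit side by side
```

Under these assumptions a full frame needs 153,600 bytes, more than twice the DARAM capacity, whereas two 5,120-byte blocks leave ample room for intermediate results.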


Because the OMAP5910 uses double buffering both internally and externally, the DMA transfers handled by the ARM core and the DSP core can move the previous frame's data without interfering with the DMA transfer of the current frame. As a result, A/D data acquisition, DMA data transfer, and CPU computation run with a high degree of parallelism across the whole system.
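The double-buffering idea can be illustrated with a small sequential model of a ping-pong schedule. This is a sketch only: on real hardware the fetch of the next block runs on the DMA concurrently with the computation, whereas here the two steps are simply logged as a pair. The functions `process` and `run_pipeline` are hypothetical stand-ins.

```python
def process(block):
    """Stand-in for the DSP algorithm; here it just doubles each sample."""
    return [2 * x for x in block]


def run_pipeline(blocks):
    """Ping-pong schedule: fetch block i+1 while block i is being processed."""
    buffers = [None, None]        # two on-chip buffers ("ping" and "pong")
    results, log = [], []
    if not blocks:
        return results, log
    buffers[0] = blocks[0]        # the initial fetch fills the first buffer
    for i in range(len(blocks)):
        active = i % 2
        if i + 1 < len(blocks):
            # On hardware this fetch would overlap the computation below.
            buffers[1 - active] = blocks[i + 1]
            log.append((f"compute {i}", f"fetch {i + 1}"))
        else:
            log.append((f"compute {i}", "idle"))
        results.append(process(buffers[active]))
    return results, log


results, log = run_pipeline([[1, 2], [3, 4], [5, 6]])
print(results)   # [[2, 4], [6, 8], [10, 12]]
print(log[0])    # ('compute 0', 'fetch 1')
print(log[-1])   # ('compute 2', 'idle')
```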

4.2 Operation sequence

The timing of the transfer operations shows another advantage of this scheme: it merges several of the original input operations (each of which input one row of pixels) into a single input operation (which inputs several rows at once), and it spreads the output operation, originally completed in one go, across the gaps between input operations, which further improves performance.

The specific configuration is as follows. Each ARM DMA read in the OMAP5910 inputs 12 rows of data, so 20 reads bring one frame into SDRAM. The ARM DMA write process starts partway through the read process: in the interrupt of the first DMA read of a frame, the line and frame counters are updated and a DMA write is started that writes only 2 lines. For the 2nd through 20th DMA reads of the frame, a write is started in the interrupt at the end of each read and writes 15 lines. The operation sequence is shown in Figure 5.
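The read/write schedule can be tabulated to confirm that the row counts add up. The sketch below uses only the numbers given in the text; the generator `frame_schedule` is a hypothetical name. The total of 287 output lines is what those numbers imply, since the source does not state the height of the output raster.

```python
READS_PER_FRAME = 20
ROWS_PER_READ = 12
FIRST_WRITE_ROWS = 2
LATER_WRITE_ROWS = 15


def frame_schedule():
    """Yield (read_index, rows_read, rows_written) for one frame."""
    for n in range(1, READS_PER_FRAME + 1):
        rows_written = FIRST_WRITE_ROWS if n == 1 else LATER_WRITE_ROWS
        yield n, ROWS_PER_READ, rows_written


schedule = list(frame_schedule())
rows_in = sum(rows for _, rows, _ in schedule)
rows_out = sum(rows for _, _, rows in schedule)

print(schedule[0])   # (1, 12, 2)
print(schedule[1])   # (2, 12, 15)
print(rows_in)       # 240
print(rows_out)      # 287
```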


The timing works out as follows. Reading 12 rows takes 17.07 μs × 12 = 204.84 μs, and writing 15 rows takes 27.68 μs × 15 = 415.2 μs. The two transfers therefore take 204.84 μs + 415.2 μs = 620.04 μs, or roughly 700 μs once interrupt handling is included. The time available for one read/write cycle and its interrupt is 68 μs × 12 = 816 μs. Since 700 μs < 816 μs, the real-time performance of the system is guaranteed.
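The same budget check can be written out in code. The per-row times and the 700 μs estimate are taken from the text as given. The 68 μs per-line allowance is also taken as given, although section 3 quotes 64 μs; with 64 μs the budget would be 768 μs, which still exceeds 700 μs.

```python
READ_ROW_US = 17.07          # time to read one row, from the text
WRITE_ROW_US = 27.68         # time to write one row, from the text
LINE_BUDGET_US = 68.0        # per-line allowance used in this paragraph
ESTIMATED_TOTAL_US = 700.0   # transfers plus interrupt handling, from the text

read_us = READ_ROW_US * 12
write_us = WRITE_ROW_US * 15
transfer_us = read_us + write_us
budget_us = LINE_BUDGET_US * 12
slack_us = budget_us - ESTIMATED_TOTAL_US

print(f"{read_us:.2f}")       # 204.84
print(f"{write_us:.2f}")      # 415.20
print(f"{transfer_us:.2f}")   # 620.04
print(f"{budget_us:.2f}")     # 816.00
print(f"{slack_us:.2f}")      # 116.00
print(ESTIMATED_TOTAL_US < budget_us)  # True
```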

5 Code design and implementation in the optimization scheme

5.1 The main program on the ARM side

The ARM core is mainly responsible for configuring the OMAP5910 system, handling data input and output on the FPGA interface, storing data in SDRAM, controlling and switching the frame mode, and implementing the other parts of the optimization scheme.

5.2 The main program on the DSP side

The DSP core mainly implements DSP DMA data input and output between ISRAM and DARAM, the two-point correction and defect removal algorithms for infrared images, and functions such as computing the brightness and contrast parameters of the infrared image.
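The source does not give the correction formulas, so the following is a minimal sketch of a textbook two-point non-uniformity correction, not the author's implementation. It assumes that the parameters A and B mentioned in section 4.1 are a per-pixel gain and offset applied as y = A·x + B, and that they are derived from two uniform reference frames (`low` and `high`). All function and variable names are hypothetical.

```python
def two_point_coefficients(low, high):
    """Per-pixel gain A and offset B from two uniform reference frames."""
    mean_low = sum(low) / len(low)
    mean_high = sum(high) / len(high)
    gains, offsets = [], []
    for lo, hi in zip(low, high):
        # Assumes hi != lo, i.e. the pixel responds to the scene change.
        a = (mean_high - mean_low) / (hi - lo)
        gains.append(a)
        offsets.append(mean_low - a * lo)
    return gains, offsets


def correct(raw, gains, offsets):
    """Apply y = A * x + B to every pixel of a flattened frame."""
    return [a * x + b for x, a, b in zip(raw, gains, offsets)]


low = [100.0, 110.0, 90.0]      # toy responses to a cold uniform scene
high = [200.0, 230.0, 170.0]    # toy responses to a hot uniform scene
gains, offsets = two_point_coefficients(low, high)

# After correction every pixel reports the frame mean for both references.
print([round(v, 6) for v in correct(low, gains, offsets)])
print([round(v, 6) for v in correct(high, gains, offsets)])
```

A pixel whose two reference responses are equal would cause a division by zero here; in practice such a pixel would be treated as defective and handled by the defect removal step.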

5.3 Summary of experience in debugging and configuring DMA channels

1) Test whether DMA channel transfers are correct. Write an internal DMA test program that transfers data from SDRAM to SDRAM to check, as a first step, that the channel's initial settings and data transfers are correct.

2) Test the data transfer rate of the DMA and FPGA interface. When measuring the read or write signal on an oscilloscope, also check that the number of read and write pulses equals the number of data items to be transferred.

3) Test the DMA external interrupt. The external interrupt pin is multiplexed, so its function must be configured in advance.

4) Test the coordination of DMA read and write operations to check that the output transfer is correct.

5) Load different data into the designated memory to test whether the video image output is correct.

6) The DMA interrupt is triggered on the rising edge of the event signal.

7) To ensure the integrity of DMA data, set the DMA priority.
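The loopback test in item 1 follows a simple pattern: fill a source region with known data, copy it, and compare. The sketch below models that pattern in software, with a `bytearray` standing in for SDRAM and a slice copy standing in for the DMA transfer. `dma_copy` and `loopback_test` are hypothetical names; on the target the copy would be a configured DMA channel rather than a Python assignment.

```python
def dma_copy(memory, src, dst, count):
    """Software stand-in for a DMA block move inside one address space."""
    memory[dst:dst + count] = memory[src:src + count]


def loopback_test(size=1024):
    """Fill a source region with a pattern, copy it, and list any mismatches."""
    memory = bytearray(2 * size)   # stand-in for an SDRAM region
    pattern = bytes((i * 7 + 3) & 0xFF for i in range(size))
    memory[0:size] = pattern
    dma_copy(memory, src=0, dst=size, count=size)
    return [i for i in range(size) if memory[size + i] != pattern[i]]


print(loopback_test())  # [] when every byte arrived intact
```

Using a non-constant pattern, as suggested by item 5, helps expose addressing or element-size mistakes that an all-zero buffer would hide.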

Concluding remarks

The real-time image processing system implements this data transfer optimization scheme. Images are displayed in real time at 25 frames/s with good visual quality. Flexible control of the DMA not only improves the transfer efficiency of image data but also brings out the high-speed performance of the OMAP5910.
