# Stereo Vision IP Core Data Sheet

(v1.1) July 15, 2015

Dr. Konstantin Schauwecker
Nerian Vision Technologies
Gotenstr. 9
70771 Leinfelden-Echterdingen
Germany
Email: service@nerian.com

# Contents

- 1 Introduction
- 2 Features
- 3 Stereo Vision Core Functionality
  - 3.1 Rectification
  - 3.2 Image Pre-Processing
  - 3.3 Stereo Matching
  - 3.4 Subpixel Optimization
  - 3.5 Uniqueness Check
  - 3.6 Consistency Check
  - 3.7 Gap Interpolation
  - 3.8 Noise Reduction
- 4 DMA Core Functionality
  - 4.1 Ports Connected to SVC
  - 4.2 Interface Ports
  - 4.3 Output Conversion
    - 4.3.1 Split Disparity Map
    - 4.3.2 Extended Disparities
- 5 Parameterization
- 6 Supported Devices
- 7 Timing
- 8 Resource Usage
- 9 IO Signals
- 10 Reference Design
- 11 DMA Core Registers
  - 11.1 Read Registers
    - 11.1.1 0x00: Status
    - 11.1.2 0x04: Output Bytes Available
    - 11.1.3 0x08: Input FIFO Info
    - 11.1.4 0x0C: Output FIFO Info
    - 11.1.5 0x10: Buffer FIFO Info
  - 11.2 Write Registers
    - 11.2.1 0x00: Control
    - 11.2.2 0x04: Output Address
    - 11.2.3 0x08: Left Input Address
    - 11.2.4 0x0C: Left Input Bytes Available
    - 11.2.5 0x10: Right Input Address
    - 11.2.6 0x14: Right Input Bytes Available
    - 11.2.7 0x18: Rectification Map Address
    - 11.2.8 0x1C: Buffer Address
    - 11.2.9 0x20: Algorithm Parameters
    - 11.2.10 0x24: License Key Part 1
    - 11.2.11 0x28: License Key Part 2
- 12 Control Flow
  - 12.1 One-Time Initializations
  - 12.2 Per-Frame Control Flow
  - 12.3 Result Retrieval

# 1 Introduction

The Stereo Vision Core (SVC) performs stereo matching on two grayscale input images. The images are first rectified to compensate for lens distortions and camera alignment errors. Stereo matching is then performed by applying a variation of the Semi-Global Matching (SGM) algorithm as introduced by Hirschmüller (2005). Various post-processing methods are applied to improve the processing results. The output of the SVC is a subpixel accurate and dense disparity map, which is streamed over an AXI4-Stream interface.

To simplify use of the SVC on devices with a shared system memory, such as the Zynq SoC, an auxiliary core for direct memory access (DMA) is provided. This DMA core reads input data from memory through AXI3 and converts it into a data stream that is suitable for the SVC. Likewise, the DMA core collects the processing results from the SVC and writes them back to memory.

The SVC is provided as a non-encrypted (but obfuscated) netlist. Source code for the DMA core can be provided on request. An IP block for both cores is available for Xilinx Vivado IP Integrator.
# 2 Features

The SVC and DMA core comprise the following features:

- Processing of grayscale images with a bit depth of 8 bits per pixel
- Stream-based processing of input images using either AXI4-Stream or AXI3
- Output of the disparity map starts before the last pixel of the input images has been received
- Pre-processing of input images for improved robustness against illumination variations and occlusions
- Image rectification
  - Rectification using a pre-computed compressed rectification map
  - Bi-linear interpolation of pixels for subpixel accurate rectification
- Stereo matching
  - Stereo matching through a variation of the Semi-Global Matching (SGM) algorithm
  - Configurable penalties $P_1$ and $P_2$ for small and large disparity variations
- Post-processing
  - Subpixel optimization
  - Consistency check with configurable threshold
  - Uniqueness check with configurable threshold
  - Filling of small gaps through interpolation
  - Noise reduction

# 3 Stereo Vision Core Functionality

The overall functionality of the SVC is depicted in Figure 1. The displayed input and output ports implement a simplified subset of the AXI4-Stream protocol (ARM, 2010). Processing inside the SVC is divided into several sub-modules. Not all of these sub-modules are mandatory. Some can be deactivated by setting the appropriate device registers, or they can be removed from the IP core altogether if desired.

## 3.1 Rectification

The core simultaneously reads the two input images from *left\_input* and *right\_input*. The first processing step that is applied to the input data is image rectification. During this step, image pixels are displaced in order to compensate for lens distortions and camera alignment errors. Bi-linear interpolation is used to allow for pixel displacements at subpixel levels. To perform the image rectification, a pre-computed rectification map is required, which is read from the dedicated input port *rectification\_map*.
The rectification map contains a subpixel accurate x- and y-offset for each pixel of the left and right input images. The offsets are interleaved such that reading from a single data stream is sufficient for finding the displacement vector for each pixel in both images. To save bandwidth, the rectification map is stored in a compressed form. On average, one byte is required for encoding the displacement vector of a single pixel. Hence, the overall size of the rectification map equals the size of two input images. Source code is provided for generating the rectification map from typical camera calibration parameters.

Rectification is a window-based operation. Hence, the pixel offsets are limited by the employed window size. In our reference parameterization, a window size of $63 \times 63$ pixels is used. This allows for pixel offsets in the range of -31 to +31. If desired, the window size can be adjusted to allow for larger pixel displacements.

The left rectified image is always written to *left\_output*. If desired, the right rectified image can be written to *disparity\_output* by setting the appropriate device register as explained in Section 11. When outputting the right rectified image, stereo matching results are not available. Further, as the *disparity\_output* port is intended for delivering subpixel accurate disparity maps, it has a larger than needed data width. When delivering the right rectified image over this output, the least significant bits, which otherwise correspond to the subpixel component of a disparity value, are set to 0.

## 3.2 Image Pre-Processing

An image pre-processing method is applied to both input images. This causes the subsequent processing steps to be more robust towards illumination variations and occlusions.

## 3.3 Stereo Matching

Stereo matching is performed by applying a variation of the SGM algorithm by Hirschmüller (2005). The left input image is selected as reference image and matched against the right image data.
The penalties $P_1$ and $P_2$ that are employed by SGM for small and large disparity variations can be configured at runtime through the device registers.

For storing intermediate processing results, the SGM sub-module requires writing to an external buffer through the port *buffer\_output*. This buffer can be located in external memory or, if desired, in the FPGA's block RAM. The total size $s_b$ of the buffer in bytes can be computed as follows:

$$s_b = 3 \cdot (d_{max} + 1) \cdot w , \tag{1}$$

where $d_{max}$ is the maximum disparity and $w$ is the image width. For the reference parameterization in Section 5 ($d_{max} = 111$, $w = 640$), this gives $3 \cdot 112 \cdot 640 = 215{,}040$ bytes, i.e. a total size of 210 KB.

Data is written to the buffer linearly, starting from byte offset 0 all the way through to the last byte in the buffer. Once the last byte has been written, the SVC sends out a rewind signal. Afterwards, writing starts again at byte offset 0. Similarly, the content of the same buffer is read back linearly through the port *buffer\_input*. It is ensured that reading and writing never happen simultaneously on the same buffer data.

## 3.4 Subpixel Optimization

To increase the accuracy of depth measurements, a subpixel optimization is applied. In this step, the costs to the left and to the right of the minimum matching cost are examined. A curve is fitted to the matching costs and its minimum is determined with subpixel accuracy.

The resulting disparities after subpixel optimization are encoded as fixed point numbers. Currently the SVC supports 4 decimal bits for the subpixel optimized disparity. Hence, it is possible to measure disparities with a resolution of 1/16 pixel. It is thus required to divide each disparity value by 16 when interpreting the final disparity map.

## 3.5 Uniqueness Check

A range of post-processing methods is applied to the data from stereo matching. First, we discard matches with a high matching uncertainty by imposing a uniqueness constraint.
For a stereo match to be considered unique, the minimum matching cost $c_{min}$ times a uniqueness factor $q \in [1, \infty)$ must be smaller than the cost for the next best match. This relation can be expressed in the following formula, where $C$ is the set of matching costs for all feature pairs and $c^* = c_{min}$ is the cost for the best match:

$$c^* \cdot q < \min \{ C \setminus \{ c^* \} \}. \tag{2}$$

Stereo matches that are discarded through the uniqueness check are assigned a disparity label of 0xFFF, which corresponds to the decimal value 255.9375 (4095/16) and is the maximum value that can be transmitted through *disparity\_output*.

## 3.6 Consistency Check

A consistency check is employed for filtering further matches with a high matching uncertainty. The common approach to this post-processing technique is to repeat stereo matching in the opposite matching direction (in our case from the right image to the left image), and to retain only matches for which

$$|d_l - d_r| \le t_c, \tag{3}$$

where $d_l$ is the disparity from left-to-right matching, $d_r$ the disparity from right-to-left matching, and $t_c$ is the consistency check threshold.

In order to save FPGA resources, we refrain from running stereo matching a second time. Rather, the right camera disparity map is inferred from the matching costs that have been gathered during the initial left-to-right stereo matching. Pixels that do not pass this consistency check are again labeled with 0xFFF.

## 3.7 Gap Interpolation

The uniqueness and consistency checks both remove pixels from the computed disparity map, which leaves gaps with no valid disparity data. If such a gap is small, it can be filled with valid disparities by interpolating the disparities from its edges. Interpolation is only performed for gaps whose horizontal and vertical extents $l_h$ and $l_v$ fulfill the condition

$$\min\left\{l_h, l_v\right\} \le l_{max}, \tag{4}$$

where $l_{max}$ is the maximum gap width.
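The rules of Sections 3.4 to 3.7 can be illustrated with a minimal Python sketch. It is not the core's implementation, which is delivered as a netlist. All function and constant names are hypothetical, and treating tied minimum costs as non-unique is an assumption that goes beyond formula (2).

```python
# Illustrative sketch only; names are hypothetical and not part of the IP core.

INVALID_LABEL = 0xFFF   # label assigned to discarded matches (Sections 3.5, 3.6)
SUBPIXEL_BITS = 4       # decimal bits of the fixed-point disparity (Section 3.4)


def decode_disparity(raw):
    """Convert a raw fixed-point disparity to pixels; None marks an invalid match."""
    if raw == INVALID_LABEL:
        return None
    return raw / (1 << SUBPIXEL_BITS)  # divide by 16


def is_unique(costs, q):
    """Uniqueness constraint (2): c* times q must undercut the next best cost.

    Assumption: 'costs' holds one cost per candidate match, so a tie for the
    minimum makes the runner-up equal to c* and the match is rejected.
    """
    ordered = sorted(costs)
    best, next_best = ordered[0], ordered[1]
    return best * q < next_best


def is_consistent(d_left, d_right, t_c):
    """Consistency constraint (3): |d_l - d_r| <= t_c."""
    return abs(d_left - d_right) <= t_c


def gap_is_interpolated(l_h, l_v, l_max):
    """Gap condition (4): min(l_h, l_v) <= l_max."""
    return min(l_h, l_v) <= l_max
```

For example, a raw value of 24 decodes to a disparity of 1.5 pixels, and with the reference value $l_{max} = 5$ a gap that is 3 pixels wide and 20 pixels high still qualifies for interpolation because its smaller extent does not exceed the limit.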
## 3.8 Noise Reduction

Finally, a noise reduction filter is applied to the generated disparity map. This filter performs a smoothing of the disparity map while being aware of discontinuities and invalid disparities.

# 4 DMA Core Functionality

When using the SVC, it is the responsibility of the developer to provide all required data on the input ports and to collect the data from the output ports in time. In a typical setting, the input data is read from off-chip memory and the processing results are written back to memory. For systems with a shared system memory, such as the Zynq SoC, we provide the DMA core for fetching and writing data. The functionality of this core is depicted in the block diagram of Figure 2.

Figure 2: Block diagram of DMA core functionality.

## 4.1 Ports Connected to SVC

Except for the clock signal, the DMA core connects to all input and output ports of the SVC. In Figure 2, those ports are depicted on the right. The ports match the ones shown in Figure 1, plus two further outputs and two inputs that were previously omitted for simplicity.

The first new SVC-specific output port is *write\_registers*. This output delivers the contents of all writable device registers, which are held by the DMA core. The SVC only requires a subset of the available write registers. The majority of the registers are used by the DMA core for controlling memory access. The registers that are needed by the SVC are registers 0x00, 0x24, 0x28 and 0x2C. A detailed description of all write registers can be found in Section 11.2.

Another new output is *system\_reset*, which is an active low reset signal. The reset signal is set to 0 if either the DMA core itself is reset, or a soft reset is triggered by writing register 0x00. It is recommended that the SVC's reset input is connected to this output. The SVC will otherwise not respect the soft-reset bit.

The new SVC-specific input ports are *buffer\_input\_rewind* and *buffer\_output\_rewind*.
These binary signals are sent by the SVC when reading or writing of the buffer shall start from the beginning. It is important that reading does not start before this signal is asserted, as the relevant content might not yet have been written.

## 4.2 Interface Ports

All further ports that are not connected to the SVC are displayed on the left-hand side of Figure 2. The port *register\_io* provides read and write access to all device registers, which are held by the DMA core. This port complies with the AXI4-Lite standard (ARM, 2013) and acts as a communication slave.

The remaining ports follow the AXI3 standard (ARM, 2013) and act as communication masters. The *left\_dma* port fetches the left input image and is also used for delivering the processing results. Similarly, the *right\_dma* port is used for fetching the right input image. The port *buffer\_dma* serves for reading and writing to a buffer memory, and the *rectification\_map\_dma* port fetches the rectification map.

All fetch and write operations of the AXI3 ports are controlled through the device registers. They contain the input and output memory addresses and can trigger reading or writing when set to a new value. More details on the device registers can be found in Section 11.

Figure 3: Example of interpreting the merged output as an image when splitting the disparity map.

## 4.3 Output Conversion

The rectified left image and the disparity map that are output by the SVC are merged into a single data stream by the DMA core. This data stream is then written out over the *left\_dma* port. Because the disparity map is output with a significantly higher delay than the rectified left image, the SVC core contains a sufficiently sized FIFO buffer for buffering the rectified left image data. For merging the two incoming data streams, the DMA core provides two possible options.
### 4.3.1 Split Disparity Map

In the first option, the disparities are split into an integer component (extended to 8 bits) and a 4-bit subpixel component, which consists of the disparity decimal bits (see Section 3.4). We hence receive two new maps, which are the *integer disparity map* and the *subpixel component map*. An image row of the left rectified image, a row of the integer disparity map and a row of the subpixel component map are then output consecutively over the *left\_dma* port. This interleaved output is repeated until there are no more remaining rows to be delivered.

It has to be considered that an element of the subpixel component map only has a size of 4 bits. Hence, two consecutive values are combined into a single byte. For this operation, the less significant 4 bits are always written first, and the more significant 4 bits last.

The merged output data can be interpreted as a row-wise sampled image with dimensions $2.5w \times h$, where $w$ and $h$ are the width and height of the input image. Each output row consists of $w$ bytes of the left rectified image, $w$ bytes of integer disparities and $w/2$ bytes of packed subpixel components. An example for the output of the DMA core when interpreted this way is shown in Figure 3. As can be seen, the output image is a horizontal arrangement of the left rectified image, the integer disparity map and the subpixel component map.

### 4.3.2 Extended Disparities

The alternative method for merging the SVC output does not split the disparities into integer and subpixel components. Rather, the disparities are extended to 16 bits. This happens by introducing additional high-significant bits, which are set to 0. The output is then again a row-wise interleaving of the left rectified image and the 16-bit extended disparity map. Compared to the first output option, this method produces more data due to the disparity extension. In total, the data that is delivered over the *left\_output* port is equivalent to a $3w \times h$ sized image ($w$ bytes of image data plus $2w$ bytes of disparities per row).

# 5 Parameterization

The SVC can be configured through several parameters, which are listed in Table 1. The table further includes our reference parameterization, which we recommend and use in our own products. All performance indicators that are provided in this document have been obtained with this reference parameterization.

| Parameter | Description | Reference parameterization |
|-----------|-------------|----------------------------|
| Width $w$ | Width of an input image. | 640 |
| Height $h$ | Height of an input image. | 480 |
| Input FIFO size | Number of data items that are FIFO-buffered for each AXI input. This value does not apply to the buffer input. | 8 |
| Output FIFO size | Number of data items that are FIFO-buffered for each AXI output. This value does not apply to the buffer output. | 4 |
| Buffer input FIFO size | Number of data items that are FIFO-buffered for the buffer AXI input. | 8 |
| Buffer output FIFO size | Number of data items that are FIFO-buffered for the buffer AXI output. | 8 |
| Rectification window size | Window size that is used for image rectification. See Section 3.1. | 63 |
| Maximum disparity $d_{max}$ | The maximum disparity label that is considered for stereo matching. | 111 |
| Pixel cycles | Number of clock cycles that SGM uses for processing a single pixel. | 7 |
| Maximum gap width $l_{max}$ | The maximum extent in horizontal or vertical direction for a gap in the disparity map, such that it is still considered for gap interpolation. See Section 3.7. | 5 |
| Split output disparity map | The disparities will be split into integer and subpixel component during output merging. See Section 4.3. | true |

Table 1: Available parameters and reference parameterization.

Table 2: SVC processing delays for reference parameterization.
| Delay | Clock Cycles | Time |
|--------------------------|--------------|----------|
| Delay until first output | 112,000 | 1.12 ms |
| Delay until last output | 2,606,000 | 26.06 ms |

Because the IP core is provided as a netlist, the parameterization cannot be changed by the user. Please contact us if you require a different parameter set, and we will provide you with one or more netlists. Be aware that some parameters can have a great impact on the required FPGA resources and can influence the design timings. We will assist you in finding an adequate parameterization for your application.

# 6 Supported Devices

The SVC has been field-tested on the Zynq 7000 SoC. We thus recommend using this FPGA. However, our IP core is also compatible with other Xilinx 7-Series FPGAs. Please contact us to find out if your device is supported.

# 7 Timing

When synthesized for the Zynq 7000 SoC, the maximum supported clock speed for the SVC and DMA core is 100 MHz. This is also the clock speed of the Zynq's Static Memory Controller (SMC). Hence, the core can be connected to the Zynq's memory interfaces.

The expected SVC processing delays when operated at 100 MHz with the discussed reference parameterization are listed in Table 2. The delays are measured from the moment at which the first data item arrives at the SVC. As first output we consider the first byte of the computed disparity map. Consequently, the last output is the last byte of the disparity map, after which processing is complete.

The measurements were determined under the assumption that new data is always available at the SVC's inputs. If the data to be processed is read from system memory, higher delays might occur due to the additional memory delays.

# 8 Resource Usage

The total resource usage of the SVC and DMA core is listed in Table 3. The table further contains resource usage information for the individual sub-modules that have been identified in Figure 1.
This provides an overview of the gains that can be achieved when removing one of the sub-modules from the core.

| Description | Slice LUTs | Registers | Memory | DSPs |
|-----------------------|------------|-----------|--------|------|
| SVC total usage | 22,686 | 26,164 | 85 | 12 |
| DMA core total usage | 3,556 | 3,379 | 8 | 0 |
| Image rectification | 2,498 | 1,952 | 34 | 4 |
| Image pre-processing | 334 | 772 | 3 | 0 |
| SGM stereo | 12,506 | 13,948 | 42 | 2 |
| Subpixel optimization | 338 | 622 | 0 | 0 |
| Uniqueness check | 1,535 | 1,240 | 0 | 1 |
| Consistency check | 3,879 | 5,880 | 1 | 0 |
| Gap interpolation | 696 | 700 | 4 | 5 |
| Noise suppression | 821 | 1,036 | 1 | 0 |
| Others | 79 | 14 | 0 | 0 |

Table 3: Resource usage of SVC and individual sub-modules.

# 9 IO Signals

Figures 4a and 4b show the SVC and DMA core as they appear in IP Integrator, which is part of Xilinx Vivado. Most of the shown ports have already been described in Sections 3 and 4. A detailed list of all input and output signals, including a breakdown of the AXI ports, is provided in Table 4 for the SVC. Likewise, Table 4 contains an equivalent list for the DMA core.

Figure 4: Interfaces of (a) SVC and (b) DMA core as shown by IP Integrator.

Table 4: List of SVC input and output signals.

| Signal name | i/o | Bits | Description |
|-------------|-----|------|-------------|
| clk | i | 1 | Main clock source |
| reset | i | 1 | Global reset signal. Active low. |
| left\_input\_tready | o | 1 | Ready to receive left image |
| left\_input\_tvalid | i | 1 | Left image data is valid |
| left\_input\_tdata | i | 8 | Left image data |
| right\_input\_tready | o | 1 | Ready to receive right image |
| right\_input\_tvalid | i | 1 | Right image data is valid |
| right\_input\_tdata | i | 8 | Right image data |
| rectification\_map\_tready | o | 1 | Ready to receive rectification map |
| rectification\_map\_tvalid | i | 1 | Rectification map data is valid |
| rectification\_map\_tdata | i | 32 | Rectification map data |
| left\_output\_tready | i | 1 | Ready to deliver left output |
| left\_output\_tvalid | o | 1 | Left output data is valid |
| left\_output\_tdata | o | 8 | Left output data |
| disparity\_output\_tready | i | 1 | Ready to deliver disparity output |
| disparity\_output\_tvalid | o | 1 | Disparity output data is valid |
| disparity\_output\_tdata | o | 11* | Disparity output data. |
| buffer\_output\_rewind | o | 1 | Writing to buffer shall start from 0. See Sections 3.3 and 4.1. |
buffer_input_tready buffer_input_tvalid buffer_input_tdata i 384* Buffer output data See Sections 3.3 and 4.1. Buffer input data is valid buffer_input_tdata i 384* Buffer input data write_registers i 384 Writable device registers, forwarded by | buffer_output_tready | i | 1 | Ready to deliver buffer output | | buffer_input_rewind o 1 Reading from buffer shall start from 0. See Sections 3.3 and 4.1. buffer_input_tready o 1 Ready to recive buffer input buffer_input_tvalid i 1 Buffer input data is valid buffer_input_tdata i 384* Buffer input data write_registers i 384 Writable device registers, forwarded by | $buffer\_output\_tvalid$ | О | 1 | Buffer output data is valid | | See Sections 3.3 and 4.1. buffer_input_tready buffer_input_tvalid buffer_input_tdata i 384* Writable device registers, forwarded by | $buffer\_output\_tdata$ | О | 384* | Buffer output data | | buffer_input_tready buffer_input_tvalid buffer_input_tdata write_registers o 1 Ready to recive buffer input Buffer input data is valid Buffer input data Writable device registers, forwarded by | buffer_input_rewind | О | 1 | Reading from buffer shall start from 0. | | buffer_input_tvalid i 1 Buffer input data is valid buffer_input_tdata i 384* Buffer input data write_registers i 384 Writable device registers, forwarded by | | | | See Sections 3.3 and 4.1. | | buffer_input_tdata i 384* Buffer input data write_registers i 384 Writable device registers, forwarded by | buffer_input_tready | О | 1 | Ready to recive buffer input | | write_registers i 384 Writable device registers, forwarded by | buffer_input_tvalid | i | 1 | Buffer input data is valid | | | buffer_input_tdata | i | 384* | Buffer input data | | DMA core. See Section 4.1 | write_registers | i | 384 | Writable device registers, forwarded by | | | | | | DMA core. See Section 4.1 | $<sup>^{\</sup>ast}$ Size given for reference parameterization. Table 6: List of DMA core input and output signals. 
| Signal name | i/o | Bits | Description |
|-------------|-----|------|-------------|
| clk | i | 1 | Main clock source |
| reset | i | 1 | Global reset signal. Active low |
| **register_io signals** | | | |
| register_io_araddr | i | 32 | Read address |
| register_io_arprot | i | 3 | Protection type. Ignored! |
| register_io_arready | o | 1 | Read address ready. Always 1! |
| register_io_arvalid | i | 1 | Read address valid |
| register_io_awaddr | i | 32 | Write address |
| register_io_awprot | i | 3 | Protection type. Ignored! |
| register_io_awready | o | 1 | Write address ready. Always 1! |
| register_io_awvalid | i | 1 | Write address valid |
| register_io_bready | i | 1 | Response ready |
| register_io_bresp | o | 2 | Write response. Always 00<sub>2</sub> (OK)! |
| register_io_bvalid | o | 1 | Write response valid |
| register_io_rdata | o | 32 | Read data |
| register_io_rready | i | 1 | Read ready |
| register_io_rresp | o | 2 | Read response. Always 00<sub>2</sub> (OK)! |
| register_io_rvalid | o | 1 | Read valid |
| register_io_wdata | i | 32 | Write data |
| register_io_wready | o | 1 | Write ready |
| register_io_wstrb | i | 4 | Write strobes. Ignored! |
| register_io_wvalid | i | 1 | Write valid |
| **Buffer / left / right / rectification map DMA port signals** | | | |
| \*_araddr | o | 32 | Read address |
| \*_arburst | o | 2 | Read burst type. Always 01<sub>2</sub> (INCR)! |
| \*_arcache | o | 4 | Read memory type. Always 0011<sub>2</sub> (Normal, non-cacheable, bufferable). |
| \*_arid | o | 6 | Read address ID. Always 000000<sub>2</sub>! |
| \*_arlen | o | 4 | Read burst length. Always 1111<sub>2</sub> (16 transfers)! |
| \*_arlock | o | 2 | Read lock type. Always 00<sub>2</sub> (Normal access)! |
| \*_arprot | o | 3 | Read protection type. Always 000<sub>2</sub> (Unprivileged secure data)! |
| \*_arqos | o | 4 | Read quality of service. Always 0000<sub>2</sub>! |
| \*_arready | i | 1 | Read address ready |
| \*_arsize | o | 3 | Read burst size. Always 011<sub>2</sub> (8 bytes)! |
| \*_arvalid | o | 1 | Read address valid |
| \*_awaddr | o | 32 | Write address |
| \*_awburst | o | 2 | Write burst type. Always 01<sub>2</sub> (INCR)! |
| \*_awcache | o | 4 | Write memory type. Always 0011<sub>2</sub> (Normal, non-cacheable, bufferable). |
| \*_awid | o | 6 | Write address ID. Always 000000<sub>2</sub>! |
| \*_awlen | o | 4 | Write burst length. Always 1111<sub>2</sub> (16 transfers)! |
| \*_awlock | o | 2 | Write lock type. Always 00<sub>2</sub> (Normal access)! |
| \*_awprot | o | 3 | Write protection type. Always 000<sub>2</sub> (Unprivileged secure data)! |
| \*_awqos | o | 4 | Write quality of service. Always 0000<sub>2</sub>! |
| \*_awready | i | 1 | Write address ready |
| \*_awsize | o | 3 | Write burst size. Always 011<sub>2</sub> (8 bytes)! |
| \*_awvalid | o | 1 | Write address valid |
| \*_bid | i | 6 | Write response ID. Ignored! |
| \*_bready | o | 1 | Write response ready. Always 1! |
| \*_bresp | i | 2 | Write response. Ignored! |
| \*_bvalid | i | 1 | Write response valid |
| \*_rdata | i | 64 | Read data |
| \*_rid | i | 6 | Read ID. Ignored! |
| \*_rlast | i | 1 | Read last |
| \*_rready | o | 1 | Read ready |
| \*_rresp | i | 2 | Read response. Ignored! |
| \*_rvalid | i | 1 | Read valid |
| \*_wdata | o | 64 | Write data |
| \*_wid | o | 6 | Write ID. Always 000000<sub>2</sub>! |
| \*_wlast | o | 1 | Write last |
| \*_wready | i | 1 | Write ready |
| \*_wstrb | o | 8 | Write strobes. Always 11111111<sub>2</sub>! |
| \*_wvalid | o | 1 | Write valid |
| **left_input / right_input / rectification_map / buffer_input signals** | | | |
| \*_ready | i | 1 | Ready to deliver input data |
| \*_valid | o | 1 | Input data valid |
| \*_data | o | varied | Data directed to SVC. |
| **left_output / disparity_output / buffer_output signals** | | | |
| \*_ready | o | 1 | Ready to receive output data |
| \*_valid | i | 1 | Output data is valid |
| \*_data | i | varied | Data received from SVC. |
| **Other signals directed to SVC** | | | |
| system_reset | o | 1 | Active-low reset signal for SVC. See Section 4.1. |
| write_registers | o | 384 | Writable device registers forwarded to SVC. See Section 4.1. |

# 10 Reference Design

When using the SVC in combination with the DMA core, all inputs and outputs of the SVC shall be connected to the DMA core. The SVC's reset input shall be connected to the DMA core's system\_reset output, so that the SVC is also reset when a soft reset is triggered through the device registers.

When using the provided IP cores on a Zynq SoC, it is usually desirable to control processing from software running on the Zynq's CPU cores. This requires that the device registers can be read and written from software. To this end, the register\_io port of the DMA core needs to be connected to one of the Zynq's general purpose AXI master interfaces. Because register\_io complies with the AXI4-Lite standard, an AXI interconnect block is necessary to make this connection.

All remaining \*\_dma ports can be connected to the Zynq's high performance AXI slave interfaces. This allows reading input data from system memory and writing the processing results back to memory. This reference design is illustrated in Figure 5.

# 11 DMA Core Registers

The DMA core keeps several registers that control the device behavior and provide information about the internal device state. Please note that read and write registers each have a separate address space. Hence, writing to a given address has no effect on the data that is returned when reading from the same address.

The DMA core forwards all write registers to the SVC through the output port write\_registers. The SVC only requires a subset of the write registers (see Section 4.1). When the SVC is used without the DMA core, the appropriate register values can be input directly to the SVC.

Each register has a size of 32 bits.

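Because the read and write registers occupy separate address spaces, the same offset names a different register depending on the direction of the access. The following is a minimal sketch of that lookup, using the offsets listed in Sections 11.1 and 11.2; the dictionary names and the `register_name` helper are illustrative and not part of any driver delivered with the cores.

```python
# Register offsets as listed in Sections 11.1 and 11.2.
# The names READ_REGISTERS / WRITE_REGISTERS are hypothetical.
READ_REGISTERS = {
    0x00: "Status",
    0x04: "Output Bytes Available",
    0x08: "Input FIFO Info",
    0x0C: "Output FIFO Info",
    0x10: "Buffer FIFO Info",
}

WRITE_REGISTERS = {
    0x00: "Control",
    0x04: "Output Address",
    0x08: "Left Input Address",
    0x0C: "Left Input Bytes Available",
    0x10: "Right Input Address",
    0x14: "Right Input Bytes Available",
    0x18: "Rectification Map Address",
    0x1C: "Buffer Address",
    0x20: "Algorithm Parameters",
    0x24: "License Key Part 1",
    0x28: "License Key Part 2",
}


def register_name(offset, is_write):
    """Return the name of the register at `offset` in the chosen address space."""
    # Register addresses are always multiples of four.
    if offset % 4 != 0:
        raise ValueError("register offsets must be multiples of four")
    table = WRITE_REGISTERS if is_write else READ_REGISTERS
    return table[offset]
```

For example, offset 0x04 resolves to the Output Address register for a write access and to the Output Bytes Available register for a read access.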
To simplify access from a CPU core, the register addresses are always multiples of four. Read and write operations must always be aligned to a 4-byte boundary. Reading from or writing to an address that is not a multiple of four is disallowed and has an undefined outcome.

In the following, all read and write registers are described, sorted by register address.

### 11.1 Read Registers

#### 11.1.1 0x00: Status

General device status information.

- **R** If 0 then the device is currently performing a soft or hard reset.
- **LW** If 1 then writing to left\_dma has finished.
- **LR** If 1 then reading from left\_dma has finished.
- **RR** If 1 then reading from right\_dma has finished.
- **M** If 1 then reading from rectification\_map\_dma has finished.
- **BW** If 1 then writing to buffer\_dma has finished.
- **BR** If 1 then reading from buffer\_dma has finished.

Figure 5: Reference design for Zynq SoC in IP Integrator.

#### 11.1.2 0x04: Output Bytes Available

The number of bytes that have successfully been written to left\_dma since the start of the current frame.

| Bits | Field |
|------|-------|
| 31:0 | Available bytes |

#### 11.1.3 0x08: Input FIFO Info

Statistics for the input FIFO buffers that are attached to left\_dma and right\_dma. Counters are reset with every new frame.

#### 11.1.4 0x0C: Output FIFO Info

Statistics for the output FIFO buffer that is attached to left\_dma. Counters are reset with every new frame.

Fields, from bit 31 down to bit 0: Reserved, Output FIFO overruns.

#### 11.1.5 0x10: Buffer FIFO Info

Statistics for the FIFO buffers that are attached to buffer\_dma. Counters are reset with every new frame.

Fields, from bit 31 down to bit 0: Input FIFO underruns, Output FIFO overruns.

### 11.2 Write Registers

#### 11.2.1 0x00: Control

General parameters that control the behavior of the SVC.

- **R** If set to 1 then the device performs a soft reset.
- **OP** Operation mode. Possible values are:
  - **00** Pass through. The SVC's left input is passed directly to the left output, and the right input is passed to the disparity output.
  - **01** Rectify. The rectification results are passed directly to the SVC's left and right output.
  - **10** Stereo matching. Stereo matching results are written to the SVC's disparity output, and the left rectified image is written to the left output.
  - **11** Reserved

#### 11.2.2 0x04: Output Address

Write address for left\_dma. Writing ends once one full frame has been written to memory.

#### 11.2.3 0x08: Left Input Address

Read address for left\_dma. Reading from this address begins immediately after this register has been written. Reading continues until one full frame has been read from memory.

#### 11.2.4 0x0C: Left Input Bytes Available

The number of bytes that can currently be read from left\_dma, starting at the left input address. If this number is smaller than the frame size, reading pauses once the specified number of bytes has been read. Reading continues once a higher value is written to this register.

| Bits | Field |
|------|-------|
| 31:0 | Bytes available |

#### 11.2.5 0x10: Right Input Address

Read address for right\_dma. Reading from this address begins immediately after this register has been written. Reading continues until one full frame has been read from memory.

#### 11.2.6 0x14: Right Input Bytes Available

The number of bytes that can currently be read from right\_dma, starting at the right input address. If this number is smaller than the frame size, reading pauses once the specified number of bytes has been read. Reading continues once a higher value is written to this register.

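The pause-and-resume behavior of the two Bytes Available registers can be illustrated with a small software model. This is only a sketch of the described behavior, not the hardware implementation: the class `InputChannelModel`, the helper `run_until_paused`, and the assumption that a final partial burst is read are all hypothetical. The burst size of 128 bytes follows from the fixed burst settings in Table 6 (16 transfers of 8 bytes).

```python
BURST_BYTES = 16 * 8  # 16 transfers of 8 bytes each, per Table 6


class InputChannelModel:
    """Toy model of one input DMA channel (left or right); illustrative only."""

    def __init__(self, frame_size):
        self.frame_size = frame_size   # bytes in one full frame
        self.bytes_available = 0       # value last written to the register
        self.bytes_read = 0            # bytes read from memory so far

    def write_bytes_available(self, value):
        """Model a write to the Bytes Available register."""
        self.bytes_available = value

    def step(self):
        """Read up to one burst; return the number of bytes actually read."""
        # Reading never goes past the announced bytes or past one full frame.
        limit = min(self.bytes_available, self.frame_size)
        count = max(0, min(BURST_BYTES, limit - self.bytes_read))
        self.bytes_read += count
        return count


def run_until_paused(channel):
    """Step the model until reading pauses; return the total bytes read."""
    while channel.step() > 0:
        pass
    return channel.bytes_read
```

In this model, announcing fewer bytes than the frame size makes reading stop at that count, and writing a larger value later lets reading resume until the full frame has been consumed. This is the mechanism that would allow software to feed an image into the core while it is still being captured.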
| Bits | Field |
|------|-------|
| 31:0 | Bytes available |

#### 11.2.7 0x18: Rectification Map Address

Read address for rectification\_map\_dma. Reading from this address begins immediately after this register has been written. Reading continues until the full rectification map has been read from memory.

#### 11.2.8 0x1C: Buffer Address

Memory address for buffer\_io. This address is used for both reading and writing data.

#### 11.2.9 0x20: Algorithm Parameters

Algorithmic parameters that can be changed at run-time.

| Bits | Field |
|-------|-------|
| 31:28 | Consist. |
| 27 | N |
| 26 | G |
| 25 | C |
| 24:16 | Uniqueness Factor |
| 15:8 | P<sub>2</sub> |
| 7:0 | P<sub>1</sub> |

- **P<sub>1</sub>** SGM penalty for small disparity variations. See Section 3.3.
- **P<sub>2</sub>** SGM penalty for large disparity variations. See Section 3.3.
- **Uniqueness Factor** Uniqueness factor q times 256. A value of 0 disables the uniqueness check. See Section 3.5.
- **C** If set to 1 then the consistency check is disabled. See Section 3.6.
- **G** If set to 1 then the gap interpolation is disabled. See Section 3.7.
- **N** If set to 1 then the noise reduction is disabled. See Section 3.8.
- **Consist.** Consistency check threshold $t_c$. See Section 3.6.

#### 11.2.10 0x24: License Key Part 1

The most significant 32 bits of the device-specific license key.

#### 11.2.11 0x28: License Key Part 2

The least significant 32 bits of the device-specific license key.

# 12 Control Flow

When using the DMA core, several device registers must be written in order to process an input stereo frame. As writing to some of these registers triggers certain actions, it is important to access them in a defined order.

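One way to keep the access order explicit in driver code is to express the per-frame sequence of Section 12.2 as an ordered list of register writes. The sketch below does this with the write-register offsets from Section 11.2. The function names and the `write_reg` callback are hypothetical, and the actual register access (for example through a memory-mapped region) is left to the caller. A plausible reading of the sequence is that both Bytes Available registers are cleared first, so that writing the input addresses, which starts reading immediately, cannot consume data before the byte counts have been set.

```python
# Write-register offsets from Section 11.2; the constant names are illustrative.
REG_OUTPUT_ADDR = 0x04
REG_LEFT_ADDR = 0x08
REG_LEFT_AVAIL = 0x0C
REG_RIGHT_ADDR = 0x10
REG_RIGHT_AVAIL = 0x14
REG_RECT_MAP_ADDR = 0x18


def per_frame_writes(output_addr, left_addr, right_addr, rect_map_addr,
                     left_bytes, right_bytes):
    """Return the ordered (offset, value) writes for one frame (Section 12.2)."""
    return [
        (REG_LEFT_AVAIL, 0),                 # step 1: clear left byte count
        (REG_RIGHT_AVAIL, 0),                # step 2: clear right byte count
        (REG_OUTPUT_ADDR, output_addr),      # step 3
        (REG_LEFT_ADDR, left_addr),          # step 4
        (REG_RIGHT_ADDR, right_addr),        # step 5
        (REG_RECT_MAP_ADDR, rect_map_addr),  # step 6
        (REG_LEFT_AVAIL, left_bytes),        # step 7: announce left data
        (REG_RIGHT_AVAIL, right_bytes),      # step 8: announce right data
    ]


def process_frame(write_reg, **frame):
    """Issue the per-frame writes in order through a caller-supplied callback."""
    for offset, value in per_frame_writes(**frame):
        write_reg(offset, value)
```

Keeping the sequence in a single list makes it easy to log or to replay against a simulation model before running on hardware.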
While many different access patterns lead to the desired result, we recommend using the reference control flow detailed in this section.

### 12.1 One-Time Initializations

After a hard or a soft reset, the following registers should be written:

1. A value of 0 to write register 0x0C.
2. A value of 0 to write register 0x14.
3. Buffer memory address to write register 0x1C.
4. Algorithm parameters to write register 0x20.
5. Operation mode to write register 0x00.

### 12.2 Per-Frame Control Flow

For each frame that should be processed, the following registers have to be written:

1. A value of 0 to write register 0x0C.
2. A value of 0 to write register 0x14.
3. Output address to write register 0x04.
4. Left input address to write register 0x08.
5. Right input address to write register 0x10.
6. Rectification map input address to write register 0x18.
7. Available left input bytes to write register 0x0C.
8. Available right input bytes to write register 0x14.

### 12.3 Result Retrieval

When using the DMA core, the processing results can be retrieved directly from the memory location that has been written to write register 0x04. The number of valid output bytes can be read from read register 0x04. Processing is complete once this counter is equal to the expected output size (see Section 4.3). Alternatively, one can monitor the status bits in read register 0x00 to determine when processing has finished.

# **Revision History**

| Revision | Date | Author(s) | Description |
|----------|------|-----------|-------------|
| v1.1 | July 15, 2015 | KS | Updated timing, resource usage and reference parameterization for optimized SVC. |
| v1.0 | June 20, 2015 | KS | Simplification of Section 2; minor rewording. |
| v0.2 | June 4, 2015 | KS | Split IP core into SVC and DMA core; added output merging; added subpixel optimization; updated resource usage, timing and registers to current version. |
| v0.1 | April 10, 2015 | KS | Initial revision |

# References

ARM (2010). AMBA 4 AXI4-Stream Protocol. ARM IHI 0051A (ID030510).

ARM (2013). AMBA AXI and ACE Protocol Specification. ARM IHI 0022E (ID022613).

Hirschmüller, H. (2005). Accurate and Efficient Stereo Processing by Semi-Global Matching and Mutual Information. In *IEEE Conference on Computer Vision and Pattern Recognition (CVPR)*, volume 2, pages 807–814.

Funded on the basis of a decision of the German Bundestag.