FAQ
Frequently Asked Questions

Here we present an overview of frequently asked questions about our products and our company.

But that is not enough! We would also like to make sure that no questions from related fields – such as stereo vision, 3D cameras, 3D depth measurement or machine vision – remain unanswered.

If your question is not answered here, please contact us.

1. Basic questions about stereo vision

In the human sciences, stereo vision (or stereoscopic vision) is the ability to obtain a spatial visual impression by comparing the images seen by both eyes. When viewing an object, each eye sees it from a slightly different angle. Each eye filters the information and sends it to the brain, where both visual impressions are combined into a single image. This is what creates our three-dimensional depth perception.

Computer stereo vision mimics human depth perception. A stereo camera records two images synchronously, which are then compared. A single image taken with a conventional camera loses the distance between the camera and the observed object, but this depth information can be recovered by comparing several images taken from different, known camera positions.

Stereo matching is the applied process in computer vision for stereo image comparison. The operation is based on finding all the pixels in the stereo images that correspond to the same 3D point in the captured scene.
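
The correspondence search can be illustrated with a deliberately simple block matcher for a single rectified image row. This is only a minimal sketch and not the algorithm that runs on the SceneScan FPGA: the function name `match_row`, the window size and the sum-of-absolute-differences cost are assumptions chosen for illustration.

```python
def match_row(left, right, max_disparity, half_window=1):
    """Return one disparity per pixel of `left` (None where no match is possible).

    `left` and `right` are lists of grey values from the same rectified image row.
    """
    width = len(left)
    disparities = [None] * width
    for x in range(width):
        # Skip pixels whose comparison window would leave the row.
        if x - half_window < 0 or x + half_window >= width:
            continue
        best_cost, best_d = None, None
        for d in range(max_disparity + 1):
            # A point at column x in the left image appears at column x - d
            # in the right image.
            if x - d - half_window < 0:
                break
            cost = sum(
                abs(left[x + k] - right[x - d + k])
                for k in range(-half_window, half_window + 1)
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_d = cost, d
        disparities[x] = best_d
    return disparities


# Synthetic row: the right image is the left image shifted by 2 pixels.
left_row = [10, 10, 10, 10, 50, 90, 50, 10, 10, 10]
right_row = left_row[2:] + [10, 10]
row_disparities = match_row(left_row, right_row, max_disparity=3)
```

Around the bright feature in the middle of the row, the matcher recovers the 2-pixel shift. In the flat regions the cost is ambiguous, which is one reason why real systems use more robust matching methods.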

To obtain the 3-dimensional points once all correspondences have been found, a triangulation has to be calculated that takes into account the intrinsic and extrinsic geometry of the cameras and their calibration.
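
For an ideal, rectified stereo camera the triangulation reduces to the well-known relation Z = f · b / d, with focal length f in pixels, baseline b and disparity d in pixels. The following sketch shows this relation; the function name and the camera values are hypothetical and only serve as an example.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Distance along the optical axis for an ideal rectified stereo pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical camera: 1000 px focal length, 0.25 m baseline.
depth_m = depth_from_disparity(50.0, focal_length_px=1000.0, baseline_m=0.25)
print(f"A disparity of 50 px corresponds to a distance of {depth_m} m")
```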

With Nerian’s SceneScan this process is performed on an FPGA, which speeds up the calculation by orders of magnitude and provides real-time 3D measurement!

In the context of stereo vision and depth image evaluation, disparity is the offset between the positions that an object occupies on two different image planes.

If we stick to the example of human vision, our eyes are located at different lateral positions, which results in two slightly different views. With the left eye we see more of the left side of an object, while with the right eye we see more of its right side. This positional difference is called disparity. You can check this yourself: close one eye and look at an object, then close that eye and open the other one. You will see that the object appears at a slightly different position. The nearer the object, the larger the difference.

To determine the disparity in machine vision, stereo matching compares the column positions of corresponding pixels in the two images.

The disparity range in stereo vision specifies the overlapping image region that is searched for pixel correspondences. The larger the selected disparity range, the more accurate the measurement results become. However, a large disparity range causes a high computational load and thus lowers the achievable frame rate.

A small disparity range is suited for high-speed measurement applications, while a large disparity range is appropriate for measurements at closer range and where higher accuracy is required. Note, however, that a larger disparity range also widens the invalid left border, where the camera images cannot be compared.
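
The effect of the disparity range can be estimated with a few lines of code. This is a rough sketch under stated assumptions: it treats the disparity range as the largest disparity that can be matched, it assumes that the invalid left border is as wide as the disparity range, and the camera values are hypothetical.

```python
def closest_distance_m(disparity_range_px, focal_length_px, baseline_m):
    """Closest distance whose disparity still fits into the search range."""
    return focal_length_px * baseline_m / disparity_range_px


def invalid_border_fraction(disparity_range_px, image_width_px):
    """Share of the image width lost to the invalid left border (assumed model)."""
    return min(1.0, disparity_range_px / image_width_px)


# Hypothetical camera: 1000 px focal length, 0.25 m baseline, 1024 px wide images.
for disparity_range in (32, 64, 128, 256):
    nearest = closest_distance_m(disparity_range, 1000.0, 0.25)
    border = invalid_border_fraction(disparity_range, 1024)
    print(f"{disparity_range:3d} px: closest distance {nearest:.2f} m, "
          f"invalid border {border:.1%} of the image width")
```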

With SceneScan we offer a configurable disparity range from 32 to 256 pixels. We recommend the following combinations of resolution, disparity range and frame rate:

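| Model | Disparity range | 640 × 480 | 800 × 592 | 1024 × 768 | 1600 × 1200 | 2016 × 1536 |
|---|---|---|---|---|---|---|
| SceneScan monochrome | 64 pixels | 45 fps | 30 fps | n/a | n/a | n/a |
| SceneScan monochrome | 128 pixels | 30 fps | 20 fps | n/a | n/a | n/a |
| SceneScan Pro monochrome | 128 pixels | 135 fps | 90 fps | 55 fps | 22 fps | 13 fps |
| SceneScan Pro monochrome | 256 pixels | 75 fps | 53 fps | 34 fps | 12 fps | 7 fps |
| SceneScan Pro color | 128 pixels | 80 fps | 53 fps | 32 fps | 13 fps | 8 fps |
| SceneScan Pro color | 256 pixels | 72 fps | 49 fps | 32 fps | 12 fps | 7 fps |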

For Scarlet we have the following combinations:

| Disparity range | 832 × 608 | 1024 × 768 | 1216 × 1024 | 2432 × 2048 |
|---|---|---|---|---|
| 256 pixels | 120 fps | 84 fps | 55 fps | 15 fps |
| 512 pixels | n/a | n/a | 38 fps | 11 fps |

Disparity is the apparent displacement of pixels in a stereo image pair (see What is disparity?). The disparity map is computed from this image pair and contains the depth information of each recorded pixel. Structurally it is a 2D image, so many 2D image processing algorithms are applicable.
In contrast, a 3D point cloud provides 3D coordinates for each pixel, which allows actual 3D measurements. However, 3D output requires more complex algorithms and more data than disparity maps, which slows down image processing.
For faster results, working with disparity maps should therefore be preferred over 3D point clouds.
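
The difference between the two representations can be sketched as follows. The code converts a tiny disparity map into a list of 3D points with a plain pinhole model. The function name, the principal point (`cx`, `cy`) and the camera values are assumptions for illustration, and disparities of zero or below are treated as invalid here.

```python
def disparity_map_to_points(disparity_map, focal_length_px, baseline_m, cx, cy):
    """Convert a 2D disparity map (list of rows) into a list of (x, y, z) points."""
    points = []
    for v, row in enumerate(disparity_map):
        for u, d in enumerate(row):
            if d <= 0:
                continue  # no valid measurement for this pixel
            z = focal_length_px * baseline_m / d
            x = (u - cx) * z / focal_length_px
            y = (v - cy) * z / focal_length_px
            points.append((x, y, z))
    return points


# A 2 x 2 disparity map with one invalid pixel.
tiny_map = [[0.0, 50.0],
            [100.0, 50.0]]
cloud = disparity_map_to_points(tiny_map, focal_length_px=1000.0,
                                baseline_m=0.25, cx=0.5, cy=0.5)
```

Each valid pixel turns one disparity value into three coordinates, which illustrates why point clouds carry more data than the disparity map they are derived from.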

2. Questions about our Scarlet 3D depth camera and the SceneScan 3D depth sensor

SceneScan and SceneScan Pro are embedded image processing systems for real-time stereo matching. SceneScan connects to a dedicated stereo camera or two industrial USB cameras, which are mounted at different viewing positions.

SceneScan sends out trigger signals in order to synchronously capture stereoscopic image pairs from the connected cameras. By correlating the image data from both cameras on the integrated FPGA, SceneScan can infer the depth of the observed scene. The computed depth map is transmitted through Gigabit Ethernet to a connected computer or another embedded system. SceneScan captures up to 100 frames per second, providing depth measurement in real time.

Our Scarlet 3D depth camera works on the same principle, but combines a 3D stereo camera and image processing in one device. A particularly powerful FPGA allows a processing performance of up to 120 fps and over 70 million 3D points per second, at a resolution of up to 5 megapixels.
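
The point throughput can be checked against the recommended Scarlet configurations for a disparity range of 256 pixels listed in this FAQ. The sketch below simply multiplies image size and frame rate. It assumes that every pixel yields a valid 3D point, so the results are upper bounds.

```python
# (width, height) -> frame rate at a disparity range of 256 pixels,
# taken from the Scarlet frame rate table in this FAQ.
scarlet_modes = {
    (832, 608): 120,
    (1024, 768): 84,
    (1216, 1024): 55,
    (2432, 2048): 15,
}

points_per_second = {
    resolution: resolution[0] * resolution[1] * fps
    for resolution, fps in scarlet_modes.items()
}

for (width, height), rate in points_per_second.items():
    print(f"{width} x {height}: {rate / 1e6:.1f} million points per second")
```

The highest throughput is reached at the full resolution of 2432 × 2048 pixels, which is about 5 megapixels.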

In contrast to conventional 3D sensing solutions, our sensors work passively. This means that no light needs to be emitted to perform measurements. This makes our stereo vision solutions particularly robust to the illumination conditions, and it facilitates long-range measurements, the use of multiple sensors with overlapping fields of view, and a flexible reconfiguration of the system for different measurement ranges.
Furthermore, we provide faster, higher-resolution and higher-quality depth maps, because we can harness the high computational capabilities of an FPGA and use high-quality image sensors. Because all image processing is done on SceneScan or Scarlet, there is also no computational load on the host PC.

Our big advantage compared to LiDARs is a much higher vertical resolution (typical LiDARs only measure in up to 64 rows). In some applications, the low vertical resolution of a LiDAR can result in objects with a low height not being detected.

  • Lower power consumption
  • Smaller size
  • Faster results and more reliable timing, because no operating system or other software is competing for resources

SceneScan and Scarlet deliver the stereo matching results in the form of a disparity map from the perspective of the left camera. The disparity map associates each pixel in the left camera image with a corresponding pixel in the right camera image. Because both images were previously rectified to match an ideal stereo camera geometry, corresponding pixels should only differ in their horizontal coordinates. The disparity map thus only encodes a horizontal coordinate difference. In addition, the left camera image is output. 3D point clouds are available as well.

Two different models of this image processing system exist: SceneScan and SceneScan Pro. Both models provide the same functionality. However, SceneScan Pro has significantly more computational power than SceneScan. This means that SceneScan Pro can process a given input stereo image much faster than SceneScan.

Thanks to the additional processing power, SceneScan Pro is also capable of processing higher image resolutions, color images and larger disparity ranges. Due to these benefits, SceneScan Pro can achieve a higher measurement accuracy than SceneScan.

You can find a brief comparison between both 3D sensor devices on the SceneScan product page.

A detailed comparison of the achievable frame rates at different image resolutions and disparity ranges can be found in the frame rate tables in this FAQ.

Yes, you can purchase SceneScan without our stereoscopic camera. However, you still need a stereo camera setup to use it, so you will have to acquire third-party cameras. For camera recommendations, please see “Can I use third party cameras?”

The maximum frame rate that can be achieved depends on the image size and the configured disparity range. The following table provides a list of recommended configurations. This is only a subset of the available configuration space. Differing image resolutions and disparity ranges can be used to meet specific application requirements. Here is an overview for SceneScan:

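| Model | Disparity range | 640 × 480 | 800 × 592 | 1024 × 768 | 1600 × 1200 | 2016 × 1536 |
|---|---|---|---|---|---|---|
| SceneScan monochrome | 64 pixels | 45 fps | 30 fps | n/a | n/a | n/a |
| SceneScan monochrome | 128 pixels | 30 fps | 20 fps | n/a | n/a | n/a |
| SceneScan Pro monochrome | 128 pixels | 135 fps | 90 fps | 55 fps | 22 fps | 13 fps |
| SceneScan Pro monochrome | 256 pixels | 75 fps | 53 fps | 34 fps | 12 fps | 7 fps |
| SceneScan Pro color | 128 pixels | 80 fps | 53 fps | 32 fps | 13 fps | 8 fps |
| SceneScan Pro color | 256 pixels | 72 fps | 49 fps | 32 fps | 12 fps | 7 fps |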

Our recommendations for Scarlet are as follows:

| Disparity range | 832 × 608 | 1024 × 768 | 1216 × 1024 | 2432 × 2048 |
|---|---|---|---|---|
| 256 pixels | 120 fps | 84 fps | 55 fps | 15 fps |
| 512 pixels | n/a | n/a | 38 fps | 11 fps |

3. Before use: features, feasibility, usage

SceneScan and Scarlet are configured through a web interface, which can be reached by entering the system’s IP address into your browser. If the device has just been plugged in, it will take several seconds before the web interface is accessible. To use the web interface, you need a browser with HTML5 support. Please use a recent version of one of the major browsers, such as Chrome, Firefox, Safari or Edge.

As an alternative, many parameters can be adjusted in software through the supplied API. Please refer to the API documentation and provided examples.

With SceneScan and Scarlet, the depth map is output from the perspective of the left camera. The grey area, or invalid left border, of the depth map is therefore the part where no image details from the right camera are available for stereo matching. This stripe grows wider with a greater disparity range.

Different color-coding schemes can be selected through the drop-down list below the preview area. A color scale is shown to the right, which provides information on the mapping between colors and disparity values. The possible color schemes are:

  • Red / blue: A gradient from red to blue, with red hues corresponding to high disparities and blue hues corresponding to low disparities. Invalid disparities are depicted in black.
  • Rainbow: A rainbow color scheme with short wavelengths corresponding to high disparities and long wavelengths corresponding to low disparities. Invalid disparities are depicted in gray.
  • Raw data: The raw disparity data without color-coding. The pixel intensity matches the integer component of the measured disparity. Invalid disparities are displayed in white.

Theoretically there is no upper range limit for stereo vision and depth measurement. However, the measurement accuracy decreases approximately quadratically with the measured distance. How quickly this accuracy reduction happens depends on the baseline distance and the lenses used. Thus the right camera setup in accordance with your individual application is the most important issue for optimal measurement performance.
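
The quadratic relationship follows from the triangulation formula Z = f · b / d: a small disparity error δd causes a depth error of approximately δZ ≈ Z² / (f · b) · δd. The following sketch evaluates this approximation. The camera values and the assumed matching error of 0.25 pixels are hypothetical.

```python
def depth_error_m(distance_m, focal_length_px, baseline_m, disparity_error_px=0.25):
    """First-order depth error caused by a given disparity error."""
    return distance_m ** 2 * disparity_error_px / (focal_length_px * baseline_m)


# Hypothetical camera: 1000 px focal length, 0.25 m baseline.
for distance in (2.5, 5.0, 10.0, 20.0):
    error = depth_error_m(distance, focal_length_px=1000.0, baseline_m=0.25)
    print(f"distance {distance:5.1f} m: depth error approx. {error * 100:.1f} cm")
```

Doubling the distance quadruples the expected error, while a longer baseline or a longer focal length reduces it.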

The SceneScan 3D-depth sensor can only be connected with one Karmin camera at a time. For measuring with two stereoscopic cameras you need a SceneScan device for each. Furthermore each system should be connected to its own Ethernet interface, in order to avoid network throughput problems.

SceneScan supports greyscale and color USB3 Vision cameras.

We recommend using our Karmin3 camera or cameras by one of our verified manufacturers, which currently are The Imaging Source, FLIR (formerly PointGrey) and Basler.

Further recommendations on the camera selection:

  • Sony Pregius sensors are preferred

  • Global Shutter

  • Monochrome

We also offer customized systems for your individual application – please contact us.

Yes! You can download our open source C++ API from our vision software release website. The API is interoperable with OpenCV and with PCL (Point Cloud Library), but can also be used on its own. The API and its documentation are available as a free download on our website. We also support ROS (http://wiki.ros.org/nerian_stereo).

Furthermore we offer an API for Python in our software downloads.

We support OpenCV, ROS, PCL, Matrox MIL, Halcon and more.

Windows, Linux x86 and ARM are supported.

We provide a GenTL module that can be used with Matlab’s Image Acquisition Toolbox. A short Matlab Example is included with the software download.

We are not able to change the FPGA processor implemented in our products, but we allow licensing of our IP core to third parties. This allows you to choose any compatible FPGA hardware. However, integrating the IP core into a complete system requires FPGA development know-how.

It’s not about the speed, but about mounting stability of the system. For good results, the image data captured by the cameras should contain little motion blur. Motion blur can be reduced by reducing the exposure time of the camera. You have full control over the exposure time through the camera settings. In bright environments you can set very short exposure times without compromising image quality.

Our Karmin3 stereo camera and SceneScan stereo vision sensor are not waterproof. However, we can offer additional camera housings for two individual USB3 Vision cameras. This way we can deliver cameras with at least IP66 or IP67 protection.

Our Scarlet 3D depth camera offers the IP67 protection class as long as the lens windows are mounted.

We have had customers who used our systems for underwater applications. In these cases, the enclosures were built by the customers. As an example, this picture shows an autonomous underwater vehicle that was built by Michigan Tech Research Institute:

https://twitter.com/nerian_vision/status/1031496344005947393/photo/1

It is possible to use the system at night but you will need some light source. This can also be an infrared light, if visible light is not an option.

4. Stereo vision in use: technical answers

You can save the 3D point cloud as a PLY file, which can be viewed with e.g. MeshLab or CloudCompare. You can do this in our NVCom application by checking the “3D” icon. You can also use our API to write PLY files with the method Reconstruct3D::writePlyFile().
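
To give an idea of what such a file contains, the following sketch writes a minimal ASCII PLY point cloud by hand. It is a generic illustration of the PLY format and not the implementation used by NVCom or by Reconstruct3D::writePlyFile(). The function name `write_ascii_ply` and the example points are made up.

```python
import os
import tempfile


def write_ascii_ply(path, points):
    """Write an iterable of (x, y, z) tuples as an ASCII PLY point cloud."""
    points = list(points)
    with open(path, "w", encoding="ascii", newline="\n") as ply:
        # The header declares the number of vertices and their properties.
        ply.write("ply\n")
        ply.write("format ascii 1.0\n")
        ply.write(f"element vertex {len(points)}\n")
        ply.write("property float x\n")
        ply.write("property float y\n")
        ply.write("property float z\n")
        ply.write("end_header\n")
        # One line per 3D point follows the header.
        for x, y, z in points:
            ply.write(f"{x} {y} {z}\n")


example_path = os.path.join(tempfile.gettempdir(), "example_cloud.ply")
write_ascii_ply(example_path, [(0.0, 0.0, 1.0), (0.1, -0.2, 2.5)])
```

The resulting file can be opened with viewers such as MeshLab or CloudCompare.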

For the white balance, you can select different presets, or an auto mode.
Fixed-focus lenses should be used, because with any lens a change of focus slightly changes the focal length. Since the cameras are precisely calibrated, the focal length must not be changed afterwards.

The latency depends on the chosen configuration. Typically it is the time between two frames plus approx. 9 ms until a processed frame is fully received by the host computer.
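
As a worked example, the typical latency can be estimated from the configured frame rate. The sketch below only applies the rule of thumb given above. The function name is hypothetical, and the 9 ms overhead is the approximate value stated in this answer.

```python
def typical_latency_ms(frame_rate_fps, overhead_ms=9.0):
    """Time between two frames plus the approximate processing overhead."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    return 1000.0 / frame_rate_fps + overhead_ms


for fps in (25, 50, 100):
    print(f"{fps:3d} fps: approx. {typical_latency_ms(fps):.0f} ms latency")
```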

Each pair of rectified left camera image and disparity map that is transmitted by SceneScan also includes a timestamp and a sequence number. The timestamp is measured with microsecond accuracy and is set to either the time at which a camera trigger signal was generated or the time at which a frame was received from the cameras. Additionally, we offer various options for clock synchronization. The preferred option is PTP (Precision Time Protocol).

You can increase the frame rate if you reduce the resolution. This can be done by cropping the image in an arbitrary aspect ratio. When doing so please always keep in mind the following recommendations on frame rates and resolution for SceneScan:

| Model | Disparity range | 640 × 480 | 800 × 592 | 1024 × 768 | 1600 × 1200 | 2016 × 1536 |
|---|---|---|---|---|---|---|
| SceneScan monochrome | 64 pixels | 45 fps | 30 fps | n/a | n/a | n/a |
| SceneScan monochrome | 128 pixels | 30 fps | 20 fps | n/a | n/a | n/a |
| SceneScan Pro monochrome | 128 pixels | 135 fps | 90 fps | 55 fps | 22 fps | 13 fps |
| SceneScan Pro monochrome | 256 pixels | 75 fps | 53 fps | 34 fps | 12 fps | 7 fps |
| SceneScan Pro color | 128 pixels | 80 fps | 53 fps | 32 fps | 13 fps | 8 fps |
| SceneScan Pro color | 256 pixels | 72 fps | 49 fps | 32 fps | 12 fps | 7 fps |

For Scarlet:

| Disparity range | 832 × 608 | 1024 × 768 | 1216 × 1024 | 2432 × 2048 |
|---|---|---|---|---|
| 256 pixels | 120 fps | 84 fps | 55 fps | 15 fps |
| 512 pixels | n/a | n/a | 38 fps | 11 fps |

There is no trade-off between frame rate and maximum measurement range. The real trade-off is between image resolution and frame rate. The image resolution does not affect the depth resolution (if the disparity range is kept equal).

For stereo vision it is very important that the cameras are precisely calibrated and that there will be no mechanical movements after the calibration is performed. Tiny deformations can otherwise severely affect the calibration and disrupt the image processing results.

This is the reason why we implemented the auto re-calibration feature. It tracks the camera calibration at runtime and continuously updates it. For this to work, the software monitors natural landmark features that the camera identifies. There is an older YouTube video (from our predecessor system SP1) on our channel that demonstrates this technology:

https://www.youtube.com/watch?v=2QGnOwfQKYo

The auto re-calibration only adjusts the most critical calibration parameters. It is thus still necessary to perform a full manual calibration. In SceneScan’s default configuration the auto re-calibration adjusts the calibration parameters approx. every 2 minutes, but it can be configured more aggressively.

If you want to use our IP core, note that the auto re-calibration is not implemented in the FPGA. We perform this step in software on an ARM CPU (which is part of the Zynq SoC that we use, but you could also use a separate CPU). The source code for the auto re-calibration is provided with the IP core license.

With our system it is possible to reduce the effects of flickering by setting the frame rate to the power grid frequency (e.g. 50 Hz) or an integer fraction of it. So 50 or 25 fps should work without flickering. With LEDs we found that some lights flicker and others don’t.
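
Following the 50 and 25 fps example, a candidate frame rate can be checked with a simple divisibility test. This is a simplified sketch that assumes integer frame rates and ignores the exposure time. The function name and the default grid frequency of 50 Hz are assumptions.

```python
def is_flicker_safe(frame_rate_fps, grid_frequency_hz=50):
    """True if every frame interval spans a whole number of mains cycles."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    return grid_frequency_hz % frame_rate_fps == 0


for fps in (10, 25, 30, 50):
    verdict = "ok" if is_flicker_safe(fps) else "may flicker"
    print(f"{fps} fps on a 50 Hz grid: {verdict}")
```

For a 60 Hz power grid the same check can be used by passing `grid_frequency_hz=60`.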

We have an auto exposure algorithm implemented to quickly adapt the exposure and gain settings to varying lighting conditions. So this is not a problem.

5. Questions about the Karmin 3D camera series

Our Karmin 3D cameras imitate human depth perception (see stereo vision). Two individual camera sensors record two partial images synchronously in accordance with the trigger signal SceneScan emits. In order to give reliable results it is important that the cameras are always calibrated appropriately.

SceneScan is our main product and our Karmin cameras were developed exclusively for this system’s sensor requirements. Because the two are technically coordinated, we do not support the individual use of Karmin stereo cameras without SceneScan. The stereo camera is intended as a product accessory and is not for individual sale.

The baseline does influence the depth range that you can observe with a stereo camera, and also your depth resolution. The same also applies to the focal length of the lenses that you use.
Assuming that you process a disparity range with a constant size, then the following rules apply:

  • Increasing the baseline will increase your depth resolution, but will also increase the minimum distance to the camera
  • Increasing the focal length will also increase the depth resolution and minimum distance to the camera, but also reduce the field of view.

You can study this relationship in our online camera configurator.
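
The rules above can also be reproduced with a simple pinhole camera model. The sketch below is not the online camera configurator. The function name, the image width, the disparity range and the camera values are all assumptions, and the focal length is expressed in pixels.

```python
import math


def stereo_properties(focal_length_px, baseline_m, image_width_px=1024,
                      disparity_range_px=256, distance_m=10.0):
    """Pinhole-model estimates for an ideal rectified stereo camera."""
    fb = focal_length_px * baseline_m
    return {
        # Closest distance whose disparity still fits into the disparity range.
        "min_distance_m": fb / disparity_range_px,
        # Depth change caused by a one-pixel disparity change at distance_m;
        # a smaller step means a higher depth resolution.
        "depth_step_m": distance_m ** 2 / fb,
        # Horizontal field of view of a single camera.
        "fov_deg": math.degrees(2 * math.atan(image_width_px / (2 * focal_length_px))),
    }


reference = stereo_properties(focal_length_px=1000.0, baseline_m=0.25)
wider_baseline = stereo_properties(focal_length_px=1000.0, baseline_m=0.50)
longer_lens = stereo_properties(focal_length_px=2000.0, baseline_m=0.25)

for name, props in (("reference", reference),
                    ("double baseline", wider_baseline),
                    ("double focal length", longer_lens)):
    print(f"{name}: min. distance {props['min_distance_m']:.2f} m, "
          f"depth step at 10 m {props['depth_step_m']:.2f} m, "
          f"FoV {props['fov_deg']:.1f} deg")
```

In this model, doubling the baseline and doubling the focal length have the same effect on the minimum distance and the depth step, but only the longer focal length narrows the field of view.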

Cameras with a baseline of 1 m are possible. We have some customers who use similar baseline distances. For this purpose we would use two individual cameras and mount them on an aluminum profile. With two individual cameras, even an IP67 protection class is no problem, as protective camera housings can be used. Here is an example of such a complete system: https://twitter.com/nerian_vision/status/798097909208190976/photo/1

While SceneScan is limited to grayscale cameras, SceneScan Pro also supports color cameras. We recommend using monochrome cameras unless you absolutely need color information. The reason is that the processing performance is reduced when using color cameras (typically less than half of the frame rate). Additionally monochrome cameras provide a better image quality, which improves the stereo results.

For color cameras the recommended resolution is half the sensor resolution, as we aim to avoid Bayer pattern artifacts. That means for a sensor with a native resolution of 2048 × 1536 we recommend a resolution of 1024 × 768 for use with SceneScan.

You can change the settings in our software interface. However, image quality suffers when color output is converted to monochrome. We therefore recommend using a monochrome camera from the start if monochrome images are needed.

The widest FoV that we can offer is 90° diagonally. Beyond that it is unfortunately not possible to find low-distortion lenses. When choosing your own lenses, please select lenses with a low radial distortion (max. 2.5% TV distortion for 2 MP resolution or 5% for 0.5 MP).

6. General questions about the company and processes

Yes, we are available to support you with the installation of our system.

We sell and ship worldwide.

Our products come with a 2-year warranty.

The delivery time is up to 4 weeks, depending on stock and configuration.