Help for MARSECORR
PURPOSE:
Marsecorr, which stands for Mars Epipolar Correlation, is a correlation
program that accounts for all camera types and acquisition geometries between
two images to perform first-stage disparity map computation. This program is
similar to the use of marsjplstereo prior to using affine correlator marscor3.
One of the limitation of marsjplstereo is the linearization of the images prior
to correlation which can be highly distorded if the epipoles are close or within
the image frames. marsjplstereo also does not support of PSPH PIG format at the
time of writing.
Instead of a pre-linearization and looking for matching pixel along the line,
this program follows (for each left-image pixel) the corresponding epipolar
curve in the right image, and locally pseudo-linearize (homography transform)
w.r.t. the right image prior to correlation with the left image.
EXECUTION:
Here are the main expected calls:
The simplest call:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img
Adding a post-processing filter to remove outliers:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img -STAT_FILTER
For non-standard stereo, long-baseline, gonio-type:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img -STAT_FILTER
-MULTI_PLANE
This will be a longer process, but more robust to complex topography.
If topography is mostly flat, don't use this. It is meant for scene where an
approximation of the topography by a single plane orientation would be highly
incorrect.
If a prior topography is available, a mesh for instance, it can be used to
narrow down the range bracket search and estimate the surface normal. A call
with an input mesh would be:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img IN_MESH =
\(mesh.obj,file.xy\)
file.xyz is a standard vicar file whose CS corresponds to the CS of the mesh.
This is a (hopefully) temporary hack to provide the CS of the mesh as, as of
now, the mesh ancillary does not contain enough information to reconstruct a
full CS.
A busier call:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img out_coefs=coefs.vic
out_quality=qual.vic PYRLEVEL=1 MAX_RANGE=50 -STAT_FILTER
Default Options.
Here are some of the most significant options that are activated by default:
- Pyramidal approach:
Images are downsampled to a small size and results are used to reduce range
search at next pyramid level. This improve speed and quality. See RUN_PYR
- Perpendicular epipolar offset correction:
At each pyramid level, the offset found between the match location and the
epipolar curve (in theory should be 0, but is not because of imperfection and
camera relative position error) is accounted for in the next pyramid level.
This improves speed by reducing the need for a large SEARCH. See SHIFT_EPI.
- Horizontal plane:
Because most in-situ images display level-ish flat ground, the level plane is
automatically added to the list of planes used for projecting the images.
See LEVEL_PLANE.
PROCESSING:
The goal is to get, for each pixel of the left image, the corresponding pixel
in the right image.
The idea is to project each pixel of the left image out to the 3D world at some
given range and on a given surface (plane usually) and backproject them to the
right image. A matching score is computed for each range backprojection. The
best score indicates the matching between left and right pixels (winner-takes-all
strategy). However, doing this on a pixel scale leads to very long processing
time. Instead, a plane-sweep + local "linearization" of the camera is used to
process the image by "tiles", i.e., by small chunk at the scale of which the
transform between L and R can reasonably be approximated by an homography.
The basic approach is as follow:
- The left image is sliced in tiles (TILE_SIZE) and each tile is processed
independently (multithreading over the tiles). At the tile level, the camera
model is assumed to be a linear (pinhole) model. This allows to use a
homography transform between L and R which speeds up the process.
- For each tile:
-- A list of ranges is defined such that the tile corners, once projected out
to the 3D world at the range locations and backprojected to the right
image are spaced by about EPI_STEP pixels along the epipolar curve. There
are three ways to provide a range of "ranges" to consider (see 3D World
Input Prior).
Along with the list of range, the projection plane orientation (or several
planes orientations - see MULTI_PLANE) is defined (see 3D World Input Prior).
-- For each of the planes orientations in the list:
-- For each of the ranges in the list:
* A subgrid of pixels in the tile (see GRID) of the left image are
projected on the current plane at the current range, and backprojected
onto the right image.
* An homography transform is defined from these "tiepoints" using a Least
Square approach.
* The right image is then projected on the Left tile using the homography.
* The tiles are then correlated for each pixels, using the Pearson
correlation coefficient. Pixel-wise, no subpixel precision.
-- For each pixel of the left tile, the best correlation score (for each range
location and plane orientation) is kept, along with the position in the
right image (the disparity) and the corresponding affine transform. Note that
altough the process is done with an homography, an affine transform is also
defined from the same points and is saved instead. This is due to marscor3
not supporting homography as of writing.
Once all tiles are processed, the output is filtered out if SCORE_MIN is used
and outlier filetered (STAT_FILTER) if used.
3D WORLD INPUT PRIOR.
In the process of scanning the epipolar line in the Right image to find a match
with a given Left image pixel, the Right image pixels are projected out on a
surface plane (plane defined by a range distance from the Left camera and an
orientation) and back-projected to the left image for comparison using
correlation. Both the range bracket (the list of ranges) and the orientation(s)
are important to maximize the chance of finding the corresponding Right pixel.
Depending on the situation we might have a prior knowledge (or not) of the
surface which could be used to guide the projection plane definition to maximize
the chance of finding the correct match and speed up the process:
The default situation assumes no prior knowledge of the surface. In that case,
all the range between MIN_RANGE and MAX_RANGE are tested and the surface plane
orientations (its normal) default to the plane perpendicular to the look
direction of each tile center pixel (in addition to level plane LEVEL_PLANE
which is activated by default), unless the user sets it with PLANENORM or uses
MULTI_PLANE. The large number of ranges/planes needed to make sure the matching
pixel is not missed can lead to long processing time.
There are situations where a prior knowledge of the topography, even if
low-resolution compared to the L/R images, could be used to "guide" the depth
search and surface plane orientation. For instance, the topography has been
estimated from a NavCam pair and it could be used to process some mastcam
images of the same area. In that case, instead of scanning the entire search
space between MIN_RANGE/MAX_RANGE, a smaller bracket centered around the
topography prior is scanned, which can dramatically speed up the process. The
local surface plane normal can also be estimated from that topographic prior.
There are two formats available to supply topography priors:
- IN_RANGE/IN_NORMAL: These files, which can be supplied independently of each
other, are a range file and a surface normal file. They must be in the Left
image frame, which means that it might require a pre-processing step, consisting
in *projecting* the range and/or surface normal dataset into the Left image
frame. That is, same dimensions, expressed in the Left image camera CS. Pixels
for which the topography is unknown (gaps, out of fov, etc) should be set to 0.
- IN_MESH: A mesh (.OBJ) of the topography, along with a vicar file whose CS
corresponds to the CS of the mesh. Usually is will be the XYZ vicar file from
which the mesh was computed, but it doesn't have to. The need of a vicar file
comes from the unablility to construct a full CS from the limited CS information
stored in the mesh ancillary. No other information is used from the XYZ vicar
file. Just the CS. This might change in the futur. The advantage of IN_MESH over
IN_RANGE/IN_NORMAL, is that there is no need to project the topo prior into the
Left image. One mesh can be used on several images directly.
Note that the mesh input has priority over the IN_RANGE and IN_NORMAL. If all
are supplied, IN_RANGE/IN_NORMAL are not used.
How does the topography prior is accounted for?
For each tile, the min/max ranges and average surface normal are computed. The
average surface normal defines the orientation of the projection plane for that
tile. A small range of depths centered around the min/max ranges is defined,
which [min range - step, max range + step], with step defined from SAMP_RANGE,
LINEAR_STEP, POWER_STEP. That depth bracket will be the depth search space for
that tile.
Depending on the coverage of the topographic prior, some tiles might be not
covered at all by the topography prior (large gaps in the topography, part
of the image out of FOV w.r.t the topography prior, etc). In that case, there
are two strategies (GAP_INPUT) to fill in the range and surface normal
for these tiles:
- EMPTY: don't do anything. Essentially these tiles won't be processed and the
disparity output for all pixels of that tile will be 0 (no matching)
- FULL_SWEEP: Run the full depth scan between MIN_RANGE/MAX_RANGE with the set
number of plane orientation (PLANENORM, MULTI_PLANE, LEVEL_PLANE).
PROCESSING TIME AND MULTI-THREADING
The main process is multi-threaded using openMP. The parallelism is done on the
tile list. The default tile size is defined from the TEMPLATE size and is
3xTEMPLATE size. Tile size cannot be smaller than TEMPLATE. They can be larger,
but if one wants to have optimal processing time, the tiles have to be sized in
such a way that all threads get involved. So setting a tile size of 512x512 on a
1024x1024 image will only use 4 threads.
ELLIPSOID WEIGHTED AVERAGE (EWA) resampler
The combination of the two image geometries and projection plane result in
an affine transform between the left and right image that can have strong
distortion. Such affine transform are the combination of a rotation, a shear
in X and Y, and another rotation. The shear represents the minification or
magnification of one image w.r.t. the other in directions given by the
rotation. If one wants to use the affine tranform to project one image to the
other without introducing distortion (aliasing), that directional shear must be
accounted for. To do so, we use the EWA resampler, which is a Gaussian kernel
based resampler that accounts for the affine directional shear. Essentially,
a symetric gaussian kernel interpolator in the destination image is "deformed"
according to the affine transform to define the kernel in the source image.
Usually, the destination is the Left image and the source the Right image.
The objective is to get the left image and the projected right image to have
the same frequency content. For the right image, it is dealt with the EWA
resampler which simultaneously low-pass the image in the necessary directions
and interpolate the pixel values. The left image is the destination, so it
doesn't need interpolation. However it might need a low pass filtering as some
of the left frequencies may not be present in the right image. This is also
defined from the affine transform (its inverse actually). So, in practice, the
EWA resampler is used on both the left and right image.
In a naive approach, one might uses a simple interpolator like the bicubic to
reproject the right image and just take as-is the left image. This would
provide good results if the affine transform contains a negligible shear. Which
is actually the case for a lot of situations (standard stereo). And a much
faster approach to EWA.
The two approaches are implemented in the program and can be combined/accessed
by with EWA parameter. EWA activates the use of the EWA filter if it's needed
(depends on affine transform) on the Left image only, or the Right image only,
or both, or none (see EWA). If that filter is activated for a given image (left,
righ or both) and it is not needed given the affine transform, then the bicubic
interpolator is used. If it is not activated for a given image, then the bicubic
(for right image) or as-is (for left image) are used no matter what the affine
transform is. If EWA is activated, the threshold at which EWA is used vs bicubic
is controlled by EWA_THRESHOLD. In theory EWA should be activated as soon as the
shear is > 1, but in practice, that threshold can go a bit higher without too
much effect on the result, but with decreased processing time.
HISTORY:
2018-12 ayoubfra Initial implementation.
2019-05 ayoubfra Use tiling and plane sweep for efficiency.
2019-09 ayoubfra Add IN_NORMAL and IN_RANGE.
2019-10 ayoubfra Add IN_MESH.
2020-02 ayoubfra Add PROGRESSIVE
2020-07 ayoubfra remove PROGRESSIVE, add MULTI_PLANE, SHIFT_EPI, RUN_PYR
COGNIZANT PROGRAMMER: F. Ayoub
PARAMETERS:
INP
Input images.
IN_NORMAL
Input low-res surface normal
IN_RANGE
Input low-res surface range
IN_MESH
Input mesh file
OUT
Output disparity map.
OUT_COEFS
Output affine coefficients
map.
OUT_QUALITY
Output pearson correlation
coefficient map.
BANDS
Band id to process
PYRLEVEL
Pyramid level.
MIN_RANGE
minimum range.
MAX_RANGE
maximum range.
RIGHT_MAX_DIST
plane minimum distance
from Right camera
SAMP_RANGE
range sampling
strategy.
LINEAR_STEP
range sampling step
with linear strategy.
POWER_STEP
range sampling step
with power law strategy.
EPI_STEP
stepping length -in pixel-
along epipolar curve.
TEMPLATE
Correlation window
size.
SEARCH
Search range in epipolar
perpendicular direction.
SCORE_MIN
Minimum correlation score
acceptable.
FILTER
pre low-pass images.
FILTER_SIZE
Low-pass filter
intensity.
FILTER_CONST
Scale low-pass filter
intensity.
SEP_ANGLE
maximum relative angle between
left and rigth cameras
PLANENORM
plane normal orientation
HIT_MIN_ANGLE
Minimum ray hit angle with
the surface plane
GAP_INPUT
strategy for projection plane
selection
GRID_TILE
subgrid to define homography
RES_RATIO_MAX
maximum allowed L/R resolution
ratio
TILE_SIZE
Tile size for applying
pinhole model
CHECK
whether or not run an overlap
check only
OVERLAP_CHECK
output variable for potential
overlap percentage
STAT_FILTER
Activate post-process outlier
filteting
STAT_EXTENT
Number of pixel to extend the
filter neighborood window
STAT_STHRESHOLD
Scaler for similarity threshold
STAT_NTHRESHOLD
percentage of similar pixel
in neighborood to validate pixel
OUT_DRAW
Draw epipolar curve.
DRAW_COORD
pixel coordinates to
draw epipolar curve.
TILING
Activate image tiling
considertion
MULTI_PLANE
Activate multi-plane
orientations
MULTI_NUM
Control number of
planes
LEVEL_PLANE
Add horizontal plane
SHIFT_EPI
Activate perp epipolar
offset correction
EWA
Activate ewa resampler
EWA_THRESHOLD
trigger ewa
KEWA_SIGMA
ewa gauss kernel sigma
KEWA_NSIGMA
ewa number of sigmas
KEWA_NSAMPLES
ewa kernel number of
samples
NAVTABLE
Corrected navigation filename
CONFIG_PATH
Path used to find
configuration/calibration
files.
MATCH_METHOD
Specifies a method
for pointing corrections.
MATCH_TOL
Tolerance value for
matching pointing params
in pointing corrections file.
POINT_METHOD
Specifies a mission-
specific pointing
method to use
NOSITE
Disables coordinate
system sites.
OMP_ON
Turns on or off parallel
processing (default: on)
DATA_SET_NAME
Specifies the full name given
to a data set or a data product.
DATA_SET_ID
Specifies a unique alphanumeric
identifier for a data set or data
product.
RELEASE_ID
Specifies the unique identifier
associated with the release to the
public of all or part of a data set.
The release number is associated with
the data set, not the mission.
PRODUCT_ID
Specifies a permanent, unique
identifier assigned to a data
product by its producer.
PRODUCER_ID
Specifies the unique identifier
of an entity associated with the
production a data set.
PRODUCER_INST
Specifies the full name of the
identity of an entity associated
with the production of a data set.
TARGET_NAME
Specifies a target.
TARGET_TYPE
Specifies the type of a named target.
RSF
Rover State File(s) to use.
DEBUG_RSF
Turns on debugging of RSF
parameter.
COORD
Coordinate system to use
COORD_INDEX
Coordinate system index for
some COORD/mission combos.
FIXED_SITE
Which site is FIXED for
rover missions.
SOLUTION_ID
Solution ID to use for
pointing correction.
See Examples:
Cognizant Programmer: