Help for MARSECORR

PURPOSE:
Marsecorr, which stands for Mars Epipolar Correlation, is a correlation 
program that accounts for all camera types and acquisition geometries between 
two images to perform first-stage disparity map computation. This program is 
similar to the use of marsjplstereo prior to using affine correlator marscor3.
One of the limitation of marsjplstereo is the linearization of the images prior
to correlation which can be highly distorded if the epipoles are close or within
the image frames. marsjplstereo also does not support of PSPH PIG format at the
time of writing.

Instead of a pre-linearization and looking for matching pixel along the line, 
this program follows (for each left-image pixel) the corresponding epipolar 
curve in the right image, and locally pseudo-linearize (homography transform) 
w.r.t. the right image prior to correlation with the left image.



EXECUTION:
Here are the main expected calls:

The simplest call:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img

Adding a post-processing filter to remove outliers:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img -STAT_FILTER

For non-standard stereo, long-baseline, gonio-type:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img -STAT_FILTER
-MULTI_PLANE
This will be a longer process, but more robust to complex topography.
If topography is mostly flat, don't use this. It is meant for scene where an
approximation of the topography by a single plane orientation would be highly 
incorrect. 

If a prior topography is available, a mesh for instance, it can be used to 
narrow down the range bracket search and estimate the surface normal. A call
with an input mesh would be:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img IN_MESH =
\(mesh.obj,file.xy\)
file.xyz is a standard vicar file whose CS corresponds to the CS of the mesh. 
This is a (hopefully) temporary hack to provide the CS of the mesh as, as of 
now, the mesh ancillary does not contain enough information to reconstruct a 
full CS. 

A busier call:
marsecorr inp=\(leftImage.img, rightImage\) out=disparity.img out_coefs=coefs.vic
out_quality=qual.vic PYRLEVEL=1 MAX_RANGE=50 -STAT_FILTER

Default Options.
Here are some of the most significant options that are activated by default:
- Pyramidal approach:
Images are downsampled to a small size and results are used to reduce range 
search at next pyramid level. This improve speed and quality. See RUN_PYR
- Perpendicular epipolar offset correction:
At each pyramid level, the offset found between the match location and the
epipolar curve (in theory should be 0, but is not because of imperfection and
camera relative position error) is accounted for in the next pyramid level. 
This improves speed by reducing the need for a large SEARCH. See SHIFT_EPI.
- Horizontal plane:
Because most in-situ images display level-ish flat ground, the level plane is
automatically added to the list of planes used for projecting the images.
See LEVEL_PLANE.




PROCESSING:
The goal is to get, for each pixel of the left image, the corresponding pixel
in the right image.

The idea is to project each pixel of the left image out to the 3D world at some
given range and on a given surface (plane usually) and backproject them to the 
right image. A matching score is computed for each range backprojection. The 
best score indicates the matching between left and right pixels (winner-takes-all
strategy). However, doing this on a pixel scale leads to very long processing 
time. Instead, a plane-sweep + local "linearization" of the camera is used to 
process the image by "tiles", i.e., by small chunk at the scale of which the 
transform between L and R can reasonably be approximated by an homography.

The basic approach is as follow:
- The left image is sliced in tiles (TILE_SIZE) and each tile is processed 
independently (multithreading over the tiles). At the tile level, the camera 
model is assumed to be a linear (pinhole) model. This allows to use a 
homography transform between L and R which speeds up the process.
- For each tile:
  -- A list of ranges is defined such that the tile corners, once projected out
     to the 3D world at the range locations and backprojected to the right
     image are spaced by about EPI_STEP pixels along the epipolar curve. There
     are three ways to provide a range of "ranges" to consider (see 3D World 
     Input Prior).
     Along with the list of range, the projection plane orientation (or several
     planes orientations - see MULTI_PLANE) is defined (see 3D World Input Prior). 
  -- For each of the planes orientations in the list:
     -- For each of the ranges in the list:
        * A subgrid of pixels in the tile (see GRID) of the left image are 
        projected on the current plane at the current range, and backprojected 
        onto the right image. 
        * An homography transform is defined from these "tiepoints" using a Least 
        Square approach.
        * The right image is then projected on the Left tile using the homography.
        * The tiles are then correlated for each pixels, using the Pearson 
        correlation coefficient. Pixel-wise, no subpixel precision.
  -- For each pixel of the left tile, the best correlation score (for each range
     location and plane orientation) is kept, along with the position in the 
     right image (the disparity) and the corresponding affine transform. Note that
     altough the process is done with an homography, an affine transform is also
     defined from the same points and is saved instead. This is due to marscor3 
     not supporting homography as of writing.

Once all tiles are processed, the output is filtered out if SCORE_MIN is used
and outlier filetered (STAT_FILTER) if used.




3D WORLD INPUT PRIOR.
In the process of scanning the epipolar line in the Right image to find a match
with a given Left image pixel, the Right image pixels are projected out on a
surface plane (plane defined by a range distance from the Left camera and an
orientation) and back-projected to the left image for comparison using 
correlation. Both the range bracket (the list of ranges) and the orientation(s) 
are important to maximize the chance of finding the corresponding Right pixel. 
Depending on the situation we might have a prior knowledge (or not) of the 
surface which could be used to guide the projection plane definition to maximize
the chance of finding the correct match and speed up the process: 

The default situation assumes no prior knowledge of the surface. In that case,
all the range between MIN_RANGE and MAX_RANGE are tested and the surface plane 
orientations (its normal) default to the plane perpendicular to the look 
direction of each tile center pixel (in addition to level plane LEVEL_PLANE
which is activated by default), unless the user sets it with PLANENORM or uses
MULTI_PLANE. The large number of ranges/planes needed to make sure the matching 
pixel is not missed can lead to long processing time.

There are situations where a prior knowledge of the topography, even if 
low-resolution compared to the L/R images, could be used to "guide" the depth
search and surface plane orientation. For instance, the topography has been
estimated from a NavCam pair and it could be used to process some mastcam 
images of the same area. In that case, instead of scanning the entire search 
space between MIN_RANGE/MAX_RANGE, a smaller bracket centered around the 
topography prior is scanned, which can dramatically speed up the process. The
local surface plane normal can also be estimated from that topographic prior. 

There are two formats available to supply topography priors:
- IN_RANGE/IN_NORMAL: These files, which can be supplied independently of each
other, are a range file and a surface normal file. They must be in the Left
image frame, which means that it might require a pre-processing step, consisting
in *projecting* the range and/or surface normal dataset into the Left image 
frame. That is, same dimensions, expressed in the Left image camera CS. Pixels 
for which the topography is unknown (gaps, out of fov, etc) should be set to 0.
- IN_MESH: A mesh (.OBJ) of the topography, along with a vicar file whose CS 
corresponds to the CS of the mesh. Usually is will be the XYZ vicar file from 
which the mesh was computed, but it doesn't have to. The need of a vicar file
comes from the unablility to construct a full CS from the limited CS information
stored in the mesh ancillary. No other information is used from the XYZ vicar
file. Just the CS. This might change in the futur. The advantage of IN_MESH over
IN_RANGE/IN_NORMAL, is that there is no need to project the topo prior into the 
Left image. One mesh can be used on several images directly.

Note that the mesh input has priority over the IN_RANGE and IN_NORMAL. If all 
are supplied, IN_RANGE/IN_NORMAL are not used.

How does the topography prior is accounted for?
For each tile, the min/max ranges and average surface normal are computed. The
average surface normal defines the orientation of the projection plane for that
tile. A small range of depths centered around the min/max ranges is defined, 
which [min range - step, max range + step], with step defined from SAMP_RANGE, 
LINEAR_STEP, POWER_STEP. That depth bracket will be the depth search space for 
that tile. 
Depending on the coverage of the topographic prior, some tiles might be not 
covered at all by the topography prior (large gaps in the topography, part
of the image out of FOV w.r.t the topography prior, etc). In that case, there
are two strategies (GAP_INPUT) to fill in the range and surface normal 
for these tiles:
- EMPTY: don't do anything. Essentially these tiles won't be processed and the
disparity output for all pixels of that tile will be 0 (no matching)
- FULL_SWEEP: Run the full depth scan between MIN_RANGE/MAX_RANGE with the set
number of plane orientation (PLANENORM, MULTI_PLANE, LEVEL_PLANE).


PROCESSING TIME AND MULTI-THREADING
The main process is multi-threaded using openMP. The parallelism is done on the
tile list. The default tile size is defined from the TEMPLATE size and is
3xTEMPLATE size. Tile size cannot be smaller than TEMPLATE. They can be larger, 
but if one wants to have optimal processing time, the tiles have to be sized in
such a way that all threads get involved. So setting a tile size of 512x512 on a
1024x1024 image will only use 4 threads.



ELLIPSOID WEIGHTED AVERAGE (EWA) resampler
The combination of the two image geometries and projection plane result in 
an affine transform between the left and right image that can have strong
distortion. Such affine transform are the combination of a rotation, a shear
in X and Y, and another rotation. The shear represents the minification or 
magnification of one image w.r.t. the other in directions given by the 
rotation. If one wants to use the affine tranform to project one image to the
other without introducing distortion (aliasing), that directional shear must be
accounted for. To do so, we use the EWA resampler, which is a Gaussian kernel
based resampler that accounts for the affine directional shear. Essentially,
a symetric gaussian kernel interpolator in the destination image is "deformed"
according to the affine transform to define the kernel in the source image.
Usually, the destination is the Left image and the source the Right image.
The objective is to get the left image and the projected right image to have
the same frequency content. For the right image, it is dealt with the EWA
resampler which simultaneously low-pass the image in the necessary directions
and interpolate the pixel values. The left image is the destination, so it
doesn't need interpolation. However it might need a low pass filtering as some
of the left frequencies may not be present in the right image. This is also 
defined from the affine transform (its inverse actually). So, in practice, the 
EWA resampler is used on both the left and right image.

In a naive approach, one might uses a simple interpolator like the bicubic to
reproject the right image and just take as-is the left image. This would 
provide good results if the affine transform contains a negligible shear. Which
is actually the case for a lot of situations (standard stereo). And a much
faster approach to EWA.

The two approaches are implemented in the program and can be combined/accessed
by with EWA parameter. EWA activates the use of the EWA filter if it's needed 
(depends on affine transform) on the Left image only, or the Right image only, 
or both, or none (see EWA). If that filter is activated for a given image (left,
righ or both) and it is not needed given the affine transform, then the bicubic 
interpolator is used. If it is not activated for a given image, then the bicubic 
(for right image) or as-is (for left image) are used no matter what the affine 
transform is. If EWA is activated, the threshold at which EWA is used vs bicubic 
is controlled by EWA_THRESHOLD. In theory EWA should be activated as soon as the 
shear is > 1, but in practice, that threshold can go a bit higher without too
much effect on the result, but with decreased processing time.  





HISTORY:
2018-12 ayoubfra  Initial implementation. 
2019-05 ayoubfra  Use tiling and plane sweep for efficiency. 
2019-09 ayoubfra  Add IN_NORMAL and IN_RANGE. 
2019-10 ayoubfra  Add IN_MESH. 
2020-02 ayoubfra  Add PROGRESSIVE
2020-07 ayoubfra  remove PROGRESSIVE, add MULTI_PLANE, SHIFT_EPI, RUN_PYR

COGNIZANT PROGRAMMER:  F. Ayoub


PARAMETERS:


INP

Input images.

IN_NORMAL

Input low-res surface normal

IN_RANGE

Input low-res surface range

IN_MESH

Input mesh file

OUT

Output disparity map.

OUT_COEFS

Output affine coefficients map.

OUT_QUALITY

Output pearson correlation coefficient map.

BANDS

Band id to process

PYRLEVEL

Pyramid level.

MIN_RANGE

minimum range.

MAX_RANGE

maximum range.

RIGHT_MAX_DIST

plane minimum distance from Right camera

SAMP_RANGE

range sampling strategy.

LINEAR_STEP

range sampling step with linear strategy.

POWER_STEP

range sampling step with power law strategy.

EPI_STEP

stepping length -in pixel- along epipolar curve.

TEMPLATE

Correlation window size.

SEARCH

Search range in epipolar perpendicular direction.

SCORE_MIN

Minimum correlation score acceptable.

FILTER

pre low-pass images.

FILTER_SIZE

Low-pass filter intensity.

FILTER_CONST

Scale low-pass filter intensity.

SEP_ANGLE

maximum relative angle between left and rigth cameras

PLANENORM

plane normal orientation

HIT_MIN_ANGLE

Minimum ray hit angle with the surface plane

GAP_INPUT

strategy for projection plane selection

GRID_TILE

subgrid to define homography

RES_RATIO_MAX

maximum allowed L/R resolution ratio

TILE_SIZE

Tile size for applying pinhole model

CHECK

whether or not run an overlap check only

OVERLAP_CHECK

output variable for potential overlap percentage

STAT_FILTER

Activate post-process outlier filteting

STAT_EXTENT

Number of pixel to extend the filter neighborood window

STAT_STHRESHOLD

Scaler for similarity threshold

STAT_NTHRESHOLD

percentage of similar pixel in neighborood to validate pixel

OUT_DRAW

Draw epipolar curve.

DRAW_COORD

pixel coordinates to draw epipolar curve.

TILING

Activate image tiling considertion

MULTI_PLANE

Activate multi-plane orientations

MULTI_NUM

Control number of planes

LEVEL_PLANE

Add horizontal plane

SHIFT_EPI

Activate perp epipolar offset correction

EWA

Activate ewa resampler

EWA_THRESHOLD

trigger ewa

KEWA_SIGMA

ewa gauss kernel sigma

KEWA_NSIGMA

ewa number of sigmas

KEWA_NSAMPLES

ewa kernel number of samples

NAVTABLE

Corrected navigation filename

CONFIG_PATH

Path used to find configuration/calibration files.

MATCH_METHOD

Specifies a method for pointing corrections.

MATCH_TOL

Tolerance value for matching pointing params in pointing corrections file.

POINT_METHOD

Specifies a mission- specific pointing method to use

NOSITE

Disables coordinate system sites.

OMP_ON

Turns on or off parallel processing (default: on)

DATA_SET_NAME

Specifies the full name given to a data set or a data product.

DATA_SET_ID

Specifies a unique alphanumeric identifier for a data set or data product.

RELEASE_ID

Specifies the unique identifier associated with the release to the public of all or part of a data set. The release number is associated with the data set, not the mission.

PRODUCT_ID

Specifies a permanent, unique identifier assigned to a data product by its producer.

PRODUCER_ID

Specifies the unique identifier of an entity associated with the production a data set.

PRODUCER_INST

Specifies the full name of the identity of an entity associated with the production of a data set.

TARGET_NAME

Specifies a target.

TARGET_TYPE

Specifies the type of a named target.

RSF

Rover State File(s) to use.

DEBUG_RSF

Turns on debugging of RSF parameter.

COORD

Coordinate system to use

COORD_INDEX

Coordinate system index for some COORD/mission combos.

FIXED_SITE

Which site is FIXED for rover missions.

SOLUTION_ID

Solution ID to use for pointing correction.

See Examples:


Cognizant Programmer: