Help for PATCH

PURPOSE:
PATCH uses radiometric data to derive probable range in parts of an image
that lack good stereophotometric data ("holes"). It takes XYZ and RAD data,
returning XYZ data with dense depth. Areas that previously had XYZ data are
typically not changed much, if at all.

EXECUTION:
patch INP=\(xyz_data.IMG rad_data.IMG\) OUT=patched_xyz.IMG
where:
xyz_data.IMG is an input 3-band image of type REAL with the X,Y and Z values
at that pixel in meters.
rad_data.IMG is an input 1 band image of type REAL with the radiometric
brightness at each pixel, ie, an image.
patched_xyz.IMG is the output 3-band image in the same format.

BACKGROUND:

Stereophotoclinometry depends on there being sufficiently high contrast
between adjacent features at the local pixel scale. In practice this means that,
for an image with about 1 mrad per pixel, the landscape must have substantial
entropy at the mm scale 1m away, and the 10cm scale 100m away. Fortunately,
weathered rocks have fractal shape properties so this method is quite reliable
on rocky surfaces. Unfortunately, sand is both low-entropy and self-similar
across a wide range of scales. That is, the noise power spectrum is weighted
towards low spatial frequencies, which is to say that stereo correlation is
noisy at best, erroneous at norm, and non-existent at worst.

Sand is also not always avoidable for rover navigation, so a method for patching
the point cloud is desired.

Photoclinometry steps up.

Photoclinometry, or "shape from shading" is a way of recovering range information
even (or especially) in the absense of stereo-correlatable features in the landscape.

It must be emphasized that photoclinometric solutions are not "unique" in the
same sense that stereocorrelation solutions can be. The very existence of graphic
art and programmable computer displays demonstrate that the properties of matter
permit infinite ways to create an illusion of depth. Even humans, with highly
developed stereo and image processing neural circuitry, can be predictably fooled
by a wide variety of optical illusions, to say nothing of forced perspective,
optical mirages, and featureless terrain. An extreme example can exist on the Antarctic
ice sheet where fata morgana and a total lack of trees can lead to hilarious levels of
disorientation, such as mistaking a pocket knife for a skidoo or a mountain,
or vice versa.

With this in mind, photoclinometric solutions represent a "best guess", as
encapsulated by a multivariate "energy function" to be minimized over a very
high dimensionality parameter space. This energy function encodes our "prior" or
innate knowledge about the environment, which fulfills the dual purpose of both
drastically reducing the allowable space of solutions, and navigating that
reduced space to a good local, if not global, optimum.

As an example, such prior assumptions could include:
- Sand has a relatively uniform albedo, so radiance variations are due to angles.
- A scene is lit by a combination of direct sunlight, sky glow, and surface scatter.
- Sand's physical properties preclude slopes beyond a certain steepness, or tunnels.
- The surface of a planet is at least C0 continuous.
- The pixel-scale slope of a given surface is related to its position relative to
surrounding pixels.
- Where it exists, stereo data will be pretty good, or at least have rigorously
bounded error.

The energy function is composed of a series of sub functions that measure
fitness on these and potentially other priors, then combine them with respect to
carefully tuned weighting parameters.

The relative value of the weights can alter the outcome, and this gives some measure
of how robust a given solution might be.

METHOD:

The method is based loosely on Jiang et al. 2016.
http://dx.doi.org/10.1016/j.isprsjprs.2017.06.010

Substantial modifications are made to account for the oblique nature of the
navcam images. A comprehensive discussion follows:

Equations 1-3 discuss how Jiang et al. develop their radiance model to account for sky glow,
scattering, and other factors. PATCH greatly simplifies this radiance model but could,
in principle, add the complexity back in later if it was desired. In my experiments
it was not necessary to do so. Furthermore, while it is tempting to add complexity to the
radiance estimation system, such complexity more often delivers the capability to fool
a human evaluator rather than actually delivering physical insight. Forcing the optimizer
to work with an unphysically simple radiance model, in some sense, makes errors obvious
enough to notice, understand, and account for.

Equation 4 describes the "image irradiance equation" which is the basis of the method.
The modeled reflectance has to match the observed image.

Equation 5 is inconsistent with later expressions - it may be corrected by adding a
square to the last parenthetical term, thus ensuring that this component of the energy
function is also positive definite.

Equation 6's method does not work in the oblique case as these discrete gradient
convolution kernels require an evenly spaced x-y grid to be valid. Instead, PATCH
calculates the normal vector for each point using an inbuilt VICAR function. In
prototype (python) PATCH computed normals by taking cross products of pointcloud
translation vectors and combining them with weights equal to the inverse of their
lengths to account for, and compensate for, much greater baselines and thus error in
the direction of depth.

There is a compact analytic bijective function between representations of the surface
in terms of some Cartesian unit normal vector and discrete x and y gradients. This can
be used to convert a normal vector to a gradient and back, though in the final implementation
all computations were expressed vectorially.

Equations 7-10 give prescriptions for updating the independent slope functions (p and q) in
terms of derivatives. Jiang et al. was able to use a partly analytic expression due to the
compact representation of their energy function. In the oblique, 3D case, this is cumbersome
at best and is more vulnerable to certain forms of numerical instability due to the widely
varying ranges within one image. In PATCH, these derivatives are calculated by perturbing the
variable (such as one component of a normal vector field) and recalculating the energy function.

Equations 11-12 exploits the Gauss-Seidel method to find the energy-minimizing range surface
of Equation 5's associated Euler-Lagrange equation. I investigated PATCH/oblique variations of
this approach in exhaustive depth and found them to be both cumbersome and intractably prone
to numerical instability. These instabilities typically manifest as the development of some
"spike" or line-of-sight oriented delta function that grows. PATCH incorporates an
energy-penalty term for this sort of unphysical noise, but there is no neat way to
add this to the Euler-Lagrange equation. Instead, PATCH uses the same perturbative approach
described above. Unlike the normal vector field, however, range is locally continuous and
thus a global perturbation in range does not result in useful gradients. That is, shifting
range by a few millimeters is equivalent to repositioning the camera by a few millimeters,
resulting in no difference to the resulting image. I experimented with different approaches
and patterns of perturbations, including using random noise and iterating a few times.
PATCH currently uses an alternating set of horizontal and vertical pixel-scale stripes as this
approach gives predictable gradients and minimizes instability.

Equation 13 solves beta (the global albedo parameter) in closed form. Herein lies a big hack
in all photoclinometric solvers. As far as I know, there is no rigorous way to uniquely
recover albedo information from a single image. There are, broadly, two approaches.

The first is to allow the albedo of each pixel to vary freely, or within certain bounds,
as yet another free parameter to be solved for globally. Very quickly one finds the
optimizer painting the landscape with the desired scene without deriving any depth information.
This is unsatisfactory, though of course permitted by the laws of nature and exploited
in photography and LCD displays every day!

The second is to process the image to "assign" an albedo to the various parts of the image
to begin with, then use the beta parameter to globally tweak the overall level of lighting.
There are complex ways to guess the albedo of rocky surfaces, such as picking the brightest
pixel in a given region as representative of what other adjacent pixels would look like
under better light. Such approaches are both error prone, lossy, and not needed for PATCH.
PATCH exists to solve depth for sand, and sand has a mostly uniform albedo. Therefore, PATCH
simply segments the image by whether or not stereo correlation has failed. The albedo
of the sand is statistically estimated and applied to the sand segment. Ditto the rocks.
For particularly pathological images with the sky in the wrong place, this method may fail
but it doesn't matter all that much since the global beta parameter can tweak this value
for regions that the energy function is sensitive to.

At the end of the day, very few of Jiang et al.'s equations are used in any recognizable form.

So...

How does the algorithm work?

1) Load the xyz point cloud, and the image, and geometric info.
2) Produce the mask, identifying which parts of the image lack stereo data.
In general this also includes the margin where the stereo images don't overlap.
3) Patch the patches by fitting the landscape to a plane.
4) Segment the image into sand, rocks using the mask, and thus create an "albedo"
image.
5) Initialize energy function parameters and history.
6) Iterate initial patch by performing energy gradient descent for each parameter.
7) Track and print progress.
8) Return optimized patch.

As implemented, PATCH has a couple of dozen helper functions called from within
the main() routine.

There are three substantial non-trivialities to the program.
- Generate the initial condition.
- Generate images from pointcloud data.
- Optimize with stability and convergence.

The initial condition is generated by fitting a plane to the inverse range array.
Inverting the range removes the problem of overfitting to distant points with
high ranges and low relevance. It also turns a hyperbolic function into a
linear one. This plane fit is fully implemented in PATCH, using Kramer's method
to solve the fit.

Image generation is performed by combining the albedo prior with a radiance function.
Radiance is calculated geometrically from surface normal vectors uvw, and
geometric information about the location of the sun and the camera. This is
substantially less sophisticated than, say, the Unity graphics engine.
Normal vectors uvw are calculated from xyz using the xyz_to_uvw.h subroutine,
which may or may not work well for this purpose.

Stability is highly non-trivial. Jiang et al.'s original method called for
simultaneous solving of range and gradient vectors using the Euler-Lagrange
formalism. While non-trivial, it is possible to generalize this method for the
fully 3D case using normal vectors instead of gradients. It does, however, lead
to insurmountable numerical instabilities due to occlusions and spikes, no matter
how aggressively they are filtered. Therefore, I opted to include the dislocation
between range and the surface normal vectors as part of the energy function to be
minimized, then vary each. Because these are all locally defined, a global delta
range does not actually help find the local gradient. Therefore, a set of four
high spatial frequency (pin striped!) perturbations are used sequentially. This
introduces noise but that noise converges away in the long run.

The algorithm employs a number of parameters, not all of which are currently able
to be set at the command line. A good number of these parameters are carefully
tuned and probably shouldn't be changed.

TESTING:

Run ./patch INP=\(NLB_577089270XYZLF0691828NCAM00379M1.IMG NLB_577089270RADLF0691828NCAM00379M1.IMG\) OUT=patched_xyz.IMG

Generating the initial condition takes about a minute while gaps are filled with a good surface guess.

Then the "big loop" starts. Global energy starts at about 2.5 and reduces to about
0.02 after ~30 steps. As this optimization runs more and more detail is adduced by
matching the radiance model to the actual image. For coarse accuracy, fewer steps
are needed.

Suggestions for refinements are included in the .cc file. In particular, the radiance
model is very very basic and could get additional bells and whistles. Image segmentation
and albedo estimation can also be tweaked depending on requirements.

After copious debugging etc it turns out the c++ version is no faster than the numpy-
accelerated Python implementation. Shucks.

HISTORY:
2020-07-24 rgd Initial version
2020-09-24 Updated to include specific differences to prior implementation
COGNIZANT PROGRAMMER: Casey Handmer

Help for PATCH

PARAMETERS:

INP

OUT

NAVTABLE

ORIGIN

CAST

CONFIG_PATH

POINT_METHOD

NOSITE

RSF

DEBUG_RSF

COORD

COORD_INDEX

FIXED_SITE

SOLUTION_ID

DATA_SET_NAME

DATA_SET_ID

RELEASE_ID

PRODUCT_ID

PRODUCER_ID

PRODUCER_INST

TARGET_NAME

TARGET_TYPE

See Examples:

Cognizant Programmer: