1 Introduction

Shortest paths in position and orientation space are central in this paper. Dubins describes in [21] the problem of finding shortest paths for a car in the plane between initial and final points and direction, with a penalization on the radius of curvature, for a car that has no reverse gear. Reeds and Shepp consider in [50] the same problem, but then for a car that does have the possibility for backward motion. In both papers, the focus lies on describing and proving the general shape of the optimal paths, without giving explicit solutions for the shortest paths.

This can be considered a curve optimization problem in the space \({\mathbb {R}}^2 \times (\mathbb {R}/2 \pi \mathbb {Z})\), equipped with the natural Euclidean metric but only among curves \(\gamma (t) = (x(t),y(t),\theta (t))\) subject to the constraint that \((\dot{x}(t), \dot{y}(t))\) is proportional to \((\cos \theta (t), \sin \theta (t))\). Formulating the problem this way, it becomes one of the simplest examples of sub-Riemannian (SR) geometry: the tangent vector \(\dot{\gamma }(t)\) is constrained to remain in the span of \((\cos \theta (t),\sin \theta (t),0)\) and (0, 0, 1), see Fig. 1. The SR curve optimization problem and the properties of its geodesics in \({\mathbb {R}}^2 \times \mathbb {S}^1\) have been studied and applied in image analysis by [2, 11, 16, 22, 37, 48], and in particular for modeling the Reeds–Shepp car in [10, 44, 52], whereas the latter presented a complete and optimal synthesis for the geometric control problem on \({\mathbb {R}}^{2}\times \mathbb {S}^{1}\) with uniform cost. Properties of SR geodesics in \({\mathbb {R}}^d \times \mathbb {S}^{d-1}\) with \(d=3\) have been studied in [25] and for general d in [24]. Apart from the Reeds–Shepp car problem, there are other examples relating optimal control theory and SR geometry, see for example the books by Agrachev and Sachkov [2] and Montgomery [45]. Applications in robotics and visual modeling of SR geometry and control theory can be found in, e.g., [56].

Fig. 1
figure 1

Top: a car can only move in its current orientation or change its current orientation. In other words, when the path \(\gamma (t) = (x(t),y(t),\theta (t))\) is considered as indicated in the left figure, the tangent \(\dot{\gamma }(t)\) is restricted to the span of \((\cos \theta (t), \sin \theta (t),0)\) and (0, 0, 1), of which the green plane on the right is an example. Bottom: the meaning of shortest path between points in an image is determined by a combination of a cost computed from the data, the restriction above and a curvature penalization. The path optimization problem is formulated on the position-orientation domain such as in the image on the right. The cost for moving through the orange parts is lower than elsewhere (Color figure online)

On the left in Fig. 2, we show an example of an optimal path between two points in \({\mathbb {R}}^2 \times \mathbb {S}^1\). The projection on \({\mathbb {R}}^2\) of this curve has two parts where the car moves in reverse (the red parts of the line), resulting in two cusps. From the perspective of image analysis applications this is undesirable and it is a valid question what the optimal paths are if cusps and reverse gear are not allowed. In this paper, similar to the difference between the Dubins car and the Reeds–Shepp car, we also consider this variant: it can be accounted for by requiring that the spatial propagation is forward. This variant falls outside the SR framework and requires asymmetric Finsler geometry instead.

Furthermore, we would like to extend the Finsler metric using two data-driven factors that can vary with position and orientation. This can be used to compute shortest paths for a car, where for example road conditions and obstacles are taken into account. In [8] it is shown this approach is useful for tracking vessels in retinal images. Likewise, the 3D variant of the problem provides a basis for algorithms for blood vessel detection in 3D magnetic resonance angiography (MRA) data or detection of shortest paths and quantification of structural connectivity in 5D diffusion-weighted magnetic resonance imaging (MRI) data of the brain.

Fig. 2
figure 2

Top: example of a shortest path with (left) and without (right) reverse gear in \({\mathbb {R}}^2 \times S\) and its projection on \({\mathbb {R}}^2\). The black arrows indicate the begin and end condition in the plane, corresponding to the blue dots in \({\mathbb {R}}^2 \times S\). The paths in the lifted space are smooth, but vertical tangents appear in both cases. In the left figure, the projection of the path has two cusps, and the first and last part of the path is traversed backward (the red parts). On the right, backward motion is not possible. Instead, according to our model, the shortest path is a concatenation of an in-place rotation (green), a SR geodesic and again an in-place rotation. Bottom: corresponding control sets as defined in (7) for the allowed velocities at each position and orientation, with \(B_{\mathcal {F}_0}\) on the left and \(B_{\mathcal {F}_0^+}\) on the right (Color figure online)

1.1 A Distance Function and the Corresponding Shortest Paths on \(\mathbb {R}^d \times \mathbb {S}^{d-1}\)

We fix the dimension \(d\in \{2,3\}\), and let \({\mathbb {M}} := \mathbb {R}^d\times \mathbb {S}^{d-1}\) be the \(2d-1\)-dimensional manifold of positions and orientations. We use a Finsler metric on the tangent bundle of \({\mathbb {M}}\), \(\mathcal {F}: T({\mathbb {M}}) \rightarrow [0,+\infty ]\), of which specific properties are discussed later, to define a geometry on \({\mathbb {M}}\). Any such Finsler metric \(\mathcal {F}\) induces a measure of length \({{\mathrm{Length}}}_\mathcal {F}\) on the class of paths with Lipschitz regularity, defined asFootnote 1

$$\begin{aligned} {{\mathrm{Length}}}_\mathcal {F}(\gamma ) := \int _0^1 \mathcal {F}(\gamma (t), \dot{\gamma }(t)) \, \mathrm{d}t, \end{aligned}$$

with the convention \(\dot{\gamma }(t) := \frac{d}{dt} \gamma (t)\). The path is said to be normalized w.r.t. \(\mathcal {F}\) iff \(\mathcal {F}(\gamma (t), \dot{\gamma }(t)) = {{\mathrm{Length}}}_\mathcal {F}(\gamma )\) for all \(t \in [0,1]\). Any Lipschitz continuous path of finite length can be normalized by a suitable reparameterization. Finally, the quasi-distance \(d_\mathcal {F}: {\mathbb {M}} \times {\mathbb {M}} \rightarrow [0,+\infty ]\) is defined for all \({\mathbf {p}},{\mathbf {q}}\in {\mathbb {M}}\) by

$$\begin{aligned} \begin{aligned} d_\mathcal {F}({\mathbf {p}}, {\mathbf {q}}) := \inf \{ {{\mathrm{Length}}}_\mathcal {F}(\gamma ) \; |\; \gamma \in \varGamma ,\,&\gamma (0)={\mathbf {p}}, \\&\gamma (1)={\mathbf {q}}\}, \end{aligned} \end{aligned}$$
(1)

with \(\varGamma :={{\mathrm{Lip}}}([0,1], {\mathbb {M}})\). Normalized minimizers of (1) are called minimizing geodesics from \({\mathbf {p}}\) to \({\mathbf {q}}\) w.r.t. \(\mathcal {F}\). For certain pairs \(({\mathbf {p}},{\mathbf {q}})\) these minimizers may not be unique, and these points are often of interest, see for example [9, 44].

Definition 1

(Maxwell point) Let \({\mathbf {p}}_S \in {\mathbb {M}}\) be a fixed point source and \(\gamma \in \varGamma \) a geodesic connecting \({\mathbf {p}}_S\) with \({\mathbf {q}}\in {\mathbb {M}}\), \({\mathbf {q}}\ne {\mathbf {p}}_S\). Then \({\mathbf {q}}\) is a Maxwell point if there exists another extremal path \(\tilde{\gamma } \in \varGamma \) connecting \({\mathbf {p}}_S\) and \({\mathbf {q}}\), with \({{\mathrm{Length}}}_\mathcal {F}(\gamma ) = {{\mathrm{Length}}}_\mathcal {F}(\tilde{\gamma })\). If \({\mathbf {q}}\) is the first point (distinct from \({\mathbf {p}}_S\)) on \(\gamma \) where such \(\tilde{\gamma }\) exists, then \({\mathbf {q}}\) is called the first Maxwell point. The curves \(\gamma , \tilde{\gamma }\) lose global optimality after the first Maxwell point.

Remark 1

(Terminology) We use the common terminology of ‘Finsler metric’ for \(\mathcal {F}\), although it is also called ‘Finsler function’, ‘Finsler norm’ or ‘Finsler structure’, and despite the fact that \(\mathcal {F}\) is not a metric (distance) in the classical sense. The Finsler metric \(\mathcal {F}\) induces the quasi-distance \(d_{\mathcal {F}}\) as defined in (1). If \(\mathcal {F}({\mathbf {p}},\dot{{\mathbf {p}}}) = \mathcal {F}({\mathbf {p}},-\dot{{\mathbf {p}}})\) for all \({\mathbf {p}}\in {\mathbb {M}}\) and tangent vectors \(\dot{{\mathbf {p}}} \in T_{{\mathbf {p}}}({\mathbb {M}})\), then \(d_{\mathcal {F}}\) is a true metric, satisfying \(d_{\mathcal {F}}({\mathbf {p}},{\mathbf {q}}) = d_{\mathcal {F}}({\mathbf {q}},{\mathbf {p}})\) for all \({\mathbf {p}},{\mathbf {q}}\in {\mathbb {M}}\). However, to avoid confusion of the word metric, we will only refer to \(d_{\mathcal {F}}\) as a distance or quasi-distance. If the ‘Finsler metric’ \(\mathcal {F}\) is induced by a metric tensor field \(\mathcal {G}\) on Riemannian manifold \(({\mathbb {M}},{\mathcal {G}})\), then one has \({{\mathcal {F}}}(\mathbf {p},\dot{\mathbf {p}})=\sqrt{\left. \mathcal {G}\right| _{\mathbf {p}}(\dot{\mathbf {p}},\dot{\mathbf {p}})}\).

Throughout the document, we use the words path and curve synonymously. When we consider formal curve optimization problem (1), we speak of geodesics for the stationary curves. Such stationary curves are locally minimizing. A global minimizer of (1) is referred to as minimizing geodesic or minimizer.

1.2 Geometry of the Reeds–Shepp Model

We introduce the Finsler metric \(\mathcal {F}_0\) underlying the Reeds–Shepp car model and the Finsler metric \(\mathcal {F}_0^+\) corresponding to the variant without reverse gear. Let \(({\mathbf {p}}, \dot{{\mathbf {p}}}) \in T({\mathbb {M}})\) be a pair consisting of a point \({\mathbf {p}}\in {\mathbb {M}}\) and a tangent vector \(\dot{{\mathbf {p}}} \in T_{\mathbf {p}}({\mathbb {M}})\) at this point. The physical and angular components of a point \({\mathbf {p}}\in {\mathbb {M}}\) are denoted by \({{\mathbf {x}}}\in \mathbb {R}^d\) and \(\mathbf {n}\in \mathbb {S}^{d-1}\), and this convention carries over to the tangent:

$$\begin{aligned} {\mathbf {p}}&= ({{\mathbf {x}}},\mathbf {n}),&\dot{{\mathbf {p}}}&= (\dot{{{\mathbf {x}}}}, \dot{\mathbf {n}}) \in T_{\mathbf {p}}( {\mathbb {M}}). \end{aligned}$$

We say that \(\dot{{{\mathbf {x}}}}\) is proportional to \(\mathbf {n}\), that we write as \(\dot{{{\mathbf {x}}}} \propto \mathbf {n}\), iff there exists a \(\lambda \in \mathbb {R}\) such that \(\dot{{{\mathbf {x}}}} = \lambda \mathbf {n}\). Define

$$\begin{aligned} \mathcal {F}_0({\mathbf {p}}, \dot{{\mathbf {p}}})^2&:= {\left\{ \begin{array}{ll} \mathcal {C}^{2}_1({\mathbf {p}}) |\dot{{{\mathbf {x}}}} \cdot \mathbf {n}|^2 +\mathcal {C}_{2}^2({\mathbf {p}}) \Vert \dot{\mathbf {n}}\Vert ^2 &{} \text {if }\dot{{{\mathbf {x}}}} \propto \mathbf {n}, \\ +\infty &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(2)
$$\begin{aligned} \mathcal {F}^+_0({\mathbf {p}}, \dot{{\mathbf {p}}})^2&:= {\left\{ \begin{array}{ll} \mathcal {C}^{2}_1({\mathbf {p}}) |\dot{{{\mathbf {x}}}} \cdot \mathbf {n}|^2 + \mathcal {C}_{2}^2({\mathbf {p}}) \Vert \dot{\mathbf {n}}\Vert ^2 &{} \hbox { if } \dot{{{\mathbf {x}}}} \propto \mathbf {n}\hbox { and } \\ \dot{{{\mathbf {x}}}} \cdot \mathbf {n}\ge 0, \\ +\infty &{} \text {otherwise.} \end{array}\right. } \end{aligned}$$
(3)

Here \(\Vert {\cdot } \Vert \) denotes the norm and ‘\(\cdot \)’ the usual inner product on the Euclidean space \({\mathbb {R}}^d\). The functions \(\mathcal {C}_{1}\) and \(\mathcal {C}_{2}\) are assumed to be continuous on \({\mathbb {M}}\) and uniformly bounded from below by a positive constant \(\delta >0\). In applications, \(\mathcal {C}_{1}\) and \(\mathcal {C}_{2}\) are chosen so as to favor paths which remain close to regions of interest, e.g., along blood vessels in retinal images, see Fig. 1. Note that their physical units are distinct: if one wishes \(d_\mathcal {F}\) to have the dimension [T] of a travel time, then \(\mathcal {C}_1^{-1}\) is a physical, (strictly) spatial velocity \([\mathrm {Length}][T]^{-1}\), and \( \mathcal {C}_2^{-1}\) is an angular velocity \([\mathrm {Rad}][T]^{-1}\). For simplicity one often sets \(\mathcal {C}_{1} = \xi \mathcal {C}_{2}\), where \(\xi ^{-1}>0\) is a unit of spatial length. The special case \(\mathcal {C}_1({\mathbf {p}}) = \xi \mathcal {C}_2({\mathbf {p}}) = \xi \) for all \({\mathbf {p}}\in {\mathbb {M}}\) is referred to as the uniform cost case.

1.3 The Eikonal Equation and the Fast-Marching Algorithm

We compute the distance map to a point source on a volume using the relation to eikonal equations. Let \({\mathbf {p}}_\mathrm{S}\in {\mathbb {M}}\) be an arbitrary source point, and let U be the associated distance function

$$\begin{aligned} U({\mathbf {p}}) := d_\mathcal {F}({\mathbf {p}}_\mathrm{S},{\mathbf {p}}). \end{aligned}$$
(4)

Then U is the unique viscosity solution [18, 19] to the eikonal PDE:

$$\begin{aligned} \left\{ \begin{aligned}&\mathcal {F}^*({\mathbf {p}},\mathrm{d}U({\mathbf {p}})) = 1 \qquad \text { for all } {\mathbf {p}}\in {\mathbb {M}}{\setminus }\{{\mathbf {p}}_\mathrm{S}\}, \\&U({\mathbf {p}}_\mathrm{S}) = 0. \end{aligned} \right. \end{aligned}$$
(5)

Here \(\mathcal {F}^*\) is the dual metric of \(\mathcal {F}\) and \({\mathrm {d}}U\) is the differential of the distance map U. However, for these relations to hold, and for numerical discretization to be practical, \(\mathcal {F}\) should be at least continuous.Footnote 2 We therefore propose in Sect. 2.3 for both \(\mathcal {F}_0\) and \(\mathcal {F}_0^+\) an approximating metric that we denote by \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\), respectively, that are continuous and converge to \(\mathcal {F}_0\) and \(\mathcal {F}_0^+\) as \(\varepsilon \rightarrow 0\). The approximating metrics correspond to a highly anisotropic Riemannian and Finslerian metric, rather than a sub-Riemannian or sub-Finslerian metric. The metric \(\mathcal {F}_{\epsilon }\) is in line with previous approximations [8, 16, 53] for the case \(d=2\).

We design a monotone and causal discretization scheme for static Hamilton–Jacobi PDE (5), which allows to apply an efficient, single-pass fast-marching algorithm [59]. Let us emphasize that designing a causal discretization scheme for (5) is non-trivial, because its local connectivity needs to obey an acuteness property [55, 61] depending on the geometry defined by \(\mathcal {F}\). We provide constructions for the metrics \(\mathcal {F}_\varepsilon \) or \(\mathcal {F}_\varepsilon ^+\) of interest, based on the earlier works [40, 41].

Fig. 3
figure 3

Challenges and applications. Top row: the case \(d = 2\), with a toy problem for finding the shortest way with or without reverse gear (blue and red, respectively) to the exit in Centre Pompidou (top left) and a vessel tracking problem in a retinal image. Bottom row: the case \(d = 3\), connectivity in (simulated) dMRI data. Left: visualization of a dataset with two crossing bundles without torsion, with a glyph visualization of the data in \({\mathbb {R}}^3 \times \mathbb {S}^2\) and a magnification of one such glyph, indicating two main fiber directions. Right: the spatial configuration in \({\mathbb {R}}^{3}\) of bundles with torsion in an artificial dataset on \({\mathbb {R}}^3 \times \mathbb {S}^2\) (Color figure online)

1.4 Shortest Paths and Minimal Distances in Medical Images

The application of the Hamilton–Jacobi framework for finding shortest paths has been shown to be useful for vessel tracking in retinal images [8], see Fig. 3 (top, right). The computational advantage of the fast-marching solver over the numerical method in [8] in this setting was demonstrated by Sanguinetti et al. [53]. A related approach using fast marching with elastica functionals can be found in [14, 15]. The sub-Riemannian approach by Bekkers et al. [8] concerns the two-dimensional Reeds–Shepp car model with reverse gear, where 2D grayscale images are first lifted to an orientation score [23] defined on the higher-dimensional manifold \(\mathbb {R}^2 \times \mathbb {S}^1\). There, the combination of the sub-Riemannian metric, the cost function derived from the orientation score, and the numerical anisotropic fast-marching solver, provided a solid approach to accurately track vessels in challenging sets of images.

In the previous works [8, 9] the clear advantage of sub-Riemannian geometrical models over isotropic Riemannian models on \({\mathbb {R}}^{2}\times \mathbb {S}^{1}\) has been shown with many experiments.Footnote 3

In this work we will show similar benefits for our sub-Riemannian tracking in \({\mathbb {R}}^{3}\times \mathbb {S}^{2}\). In general, regardless of the choice of image dimension \(d \in \{2,3\}\), one has that our extension of the Hamilton–Jacobi framework from the conventional base manifold of position space only (i.e., \({\mathbb {R}}^d\)) to the base manifold of positions and orientations (i.e., \({\mathbb {R}}^d \times \mathbb {S}^{d\!-\!1}\)) generically deals with the ‘leakage problem’ where wavefronts leak at crossings in the conventional eikonal frameworks acting directly in the image domain. See Fig. 4 where our solution to the ‘leakage problem’ is illustrated for \(d=2\).

Fig. 4
figure 4

Top: an orientation score [23, 32] provides a complete overview of how the image is decomposed out of local orientations. It is a method that enlarges the image domain from \({\mathbb {R}}^{d}\) to \({\mathbb {R}}^{d} \times \mathbb {S}^{d\!-\!1}\) (here \(d=2\)). Bottom: conventional geodesic wavefront propagation in images (in red) typically leaks at crossings, whereas wavefront propagation in orientation scores (in green) does not suffer from this complication. A minimum intensity projection over orientation gives optimal fronts in the image. The cost for moving through the orange parts is lower than elsewhere and is computed from the orientation score, see, e.g., [8]. The ‘leakage problem’ is gone both for propagating symmetric sub-Riemannian spheres (left), and it is also gone for propagation of asymmetric Finsler spheres (right) (Color figure online)

Regarding image analysis applications, we propose to use the same strategy of sub-Riemannian and Finslerian tracking above the extended base manifold \({\mathbb {R}}^{3} \times \mathbb {S}^{2}\) of positions and orientations for fiber tracking and structural connectivity in brain white matter in diffusion-weighted MRI data. For diffusion-weighted MRI images, a signal related to the amount of diffusion of water molecules is measured, which in the case of neuroimages is considered to reflect the structural connectivity in brain white matter. The images can in a natural way be considered to have domain \(\varOmega \subset \mathbb {R}^3 \times \mathbb {S}^2\). Figure 3 (bottom) illustrates such images. On the left we use a glyph visualization that shows a surface for each grid point, where the distance from the surface to the corresponding grid point \(\mathbf {x}\) is proportional to the data value \(U(\mathbf {x},\mathbf {n})\) and the coloring is related to the orientation \(\mathbf {n} \in \mathbb {S}^{2}\). As such the dMRI data already provide a distribution on \({\mathbb {R}}^{3}\times \mathbb {S}^2\) and do not require an ‘orientation score’ as depicted in Figs. 1 and 4.

A large number of tractography methods exist that are designed to estimate/approximate the fiber paths in the brain based on dMRI data. Most of these methods construct tracks that locally follow the structure of the data, see, e.g., [20, 58] or references in [34]. More related to our approach are geodesic methods that have the advantage that they minimize a functional, and thereby are less sensitive to noise and provide a certain measure of connectivity between regions. These methods can be based on diffusion tensors in combination with Riemannian geometry on position space, e.g., [30, 33, 35]. One can also make use of the more general Finsler geodesic tracking to include directionality [38, 39] and use high angular resolution data (HARDI), examples of which can be found in [5, 54]. Recently, a promising method has been proposed, based on geodesics on the full position and orientation space using a data-adaptive Riemannian metric [47]. We also work on this joint space of positions and orientations, but use either Riemannian or asymmetric Finsler metrics that are highly anisotropic that we solve by a numerical fast-marching method that is able to deal with this high anisotropy. We show on artificial datasets how our method can be employed to give shortest paths between two regions w.r.t the imposed Finsler metric and that these paths correctly follow the bundle structure.

1.5 Contributions and Outline

The extension to 3D of the Reeds–Shepp car model and the adaptation to model shortest paths for cars that cannot move backward are new and provide an interesting collection of new theoretical and practical results:

  • In Theorem 1 we show that the Reeds–Shepp model is globally and locally controllable, and that the Reeds–Shepp model without reverse gear is globally but not locally controllable. Hence, the distance map loses continuity.

  • We introduce regularizations \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\) of the Finsler metrics \(\mathcal {F}_0\) and \(\mathcal {F}_0^+\), which make our numerical discretization possible. We show that both the corresponding distances converge to \(d_{\mathcal {F}_0}\) and \(d_{\mathcal {F}_0^+}\) as \(\varepsilon \rightarrow 0\) and the minimizing curves converge to the ones for \(\varepsilon = 0\), see Theorem 2.

  • We present and prove for \(d =2\) and uniform cost a theorem that describes the occurrence of cusps for the sub-Riemannian model using \(\mathcal {F}_0\), and that using \(\mathcal {F}_0^+\) leads to geodesics that are a concatenation of purely angular motion, a sub-Riemannian geodesic without cusps and again a purely angular motion. We call the positions where in-place rotation (or purely angular motion) takes place keypoints. For uniform cost, we show that the only possible keypoints are the begin and endpoints, and for many end conditions we can describe how this happens. The precise theoretical statement and proof are found in Theorem 3.

  • Furthermore, we show in Theorem 4 how the geodesics can be obtained from the distance map, for a general Finsler metric, and in the more specific cases that we use in this paper. For our cases of interest, we show that backtracking of geodesics is either done via a single intrinsic gradient descent (for the models with reverse gear) or via two intrinsic gradient descents (for the model without reverse gear).

  • For our numerical experiments we make use of a fast-marching implementation, for \(d = 2\) introduced in [41]. In Section 6 we give a summary of the numerical approach for \(d = 3\), but a detailed discussion of the implementation and an evaluation of the accuracy of the method are beyond the scope of this paper and will follow in future work. For \(d = 2\), we show an extensive comparison between the models with and without reverse gear for uniform cost, to illustrate the useful principle of the keypoints and to show the qualitative difference between the two models. In examples with non-uniform cost, see for example the top row of Fig. 3, we show that the model places the keypoints optimally at corners/bifurcations in the data, where the in-place rotation forms a natural, automatic ‘re-initialization’ of the tracking.

    For \(d = 3\), we give several examples to show the influence of the model parameters, in particular the cost parameter. The examples indicate that the method adequately deals with crossing or kissing structures.

Outline In Sect. 2, we give a detailed overview of the theoretical results of the paper. Theorems 13 and 4 are discussed and proven in Sects. 34 and 5, respectively. The reader who is primarily interested in the application of the methods may choose to skip these three sections. The proof of Theorem 2 is given in ‘Appendix A’ section. We discuss the numerics briefly in Sect. 6. Section 7 contains all experimental results. Conclusion and discussion follow in Sect. 8. For an overview of notations, ‘Appendix D’ section may be helpful.

2 Main Results

In this section, we state formally the mathematical results announced in Sect. 1. Some preliminaries regarding the distance function are introduced in the section below. Results regarding the exact Reeds–Shepp car models are gathered in Sect. 2.2. The description of the approximate models and the related convergence results appears in Sect. 2.3. Analysis of special interest points (cusps and keypoints) is done in Sect. 2.4. Results on the eikonal equation and subsequent backtracking of minimizing geodesics via intrinsic gradients are presented in Sect. 2.5.

2.1 Preliminaries on the (Quasi-)Distance Function and Underlying Geometry

Geometries on the manifold of states \({\mathbb {M}} = \mathbb {R}^d \times \mathbb {S}^{d-1}\) are defined by means of Finsler metrics which are functions \(\mathcal {F}: T({\mathbb {M}}) \rightarrow [0,+\infty ]\). On each tangent space, the metric should be 1-homogeneous, convex and quantitatively non-degenerate with a uniform constant \(\delta >0\): for all \({\mathbf {p}}=(\mathbf {x},\mathbf {n})\in {\mathbb {M}}\), \(\dot{{\mathbf {p}}}, \dot{{\mathbf {p}}}_0,\dot{{\mathbf {p}}}_1\in T_{\mathbf {p}}({\mathbb {M}})\) and \(\lambda \ge 0\):

$$\begin{aligned} \mathcal {F}({\mathbf {p}}, \lambda \dot{{\mathbf {p}}})&= \lambda \mathcal {F}({\mathbf {p}}, \dot{{\mathbf {p}}}), \nonumber \\ \mathcal {F}({\mathbf {p}}, \dot{{\mathbf {p}}}_0+\dot{{\mathbf {p}}}_1)&\le \mathcal {F}({\mathbf {p}}, \dot{{\mathbf {p}}}_0) + \mathcal {F}({\mathbf {p}}, \dot{{\mathbf {p}}}_1), \nonumber \\ \mathcal {F}({\mathbf {p}},\dot{{\mathbf {p}}})&\ge \delta \sqrt{\Vert \dot{{{\mathbf {x}}}}\Vert ^2 + \Vert \dot{\mathbf {n}}\Vert ^2}. \end{aligned}$$
(6)

A weak regularity property is required as well, see the next remark. The induced distance \(d_\mathcal {F}\), defined in (1), obeys \(d_\mathcal {F}({\mathbf {p}}, {\mathbf {q}}) = 0\) iff \({\mathbf {p}}= {\mathbf {q}}\) and obeys the triangle inequality. However, unlike a regular distance, \(d_\mathcal {F}\) needs not be finite, or continuous, or symmetric in its arguments. Note that \(\mathcal {F}_0\) and \(\mathcal {F}_0^+\) as defined in (2) and (3), respectively, indeed satisfy the properties in (6).

Remark 2

In contrast to the more common definition of Finsler metrics, we will not assume the Finsler metric to be smooth on \(T(\mathbb {M}) \setminus (\mathbb {M} \times \{\mathbf {0}\})\) but use a weaker condition instead. Following [13], we require that the sets

$$\begin{aligned} \mathcal {B}_\mathcal {F}({\mathbf {p}}) := \{ \dot{{\mathbf {p}}} \in T_{\mathbf {p}}{\mathbb {M}} \, | \, \mathcal {F}({\mathbf {p}}, \dot{{\mathbf {p}}}) \le 1\} \end{aligned}$$
(7)

are closed and vary continuously with respect to the point \({\mathbf {p}}\in {\mathbb {M}}\) in the sense of the Hausdorff distance. The sets \(\mathcal {B}_\mathcal {F}({\mathbf {p}})\) are illustrated in Fig. 2 for the models of interest. The condition implies that a shortest path exists from \({\mathbf {p}}\) to \({\mathbf {q}}\in {\mathbb {M}}\) whenever \(d_\mathcal {F}({\mathbf {p}}, {\mathbf {q}})\) is finite, and is used to prove convergence results in ‘Appendix A’ section.

A common technique in optimal control theory is to reformulate the shortest path problem defining the distance \(d_\mathcal {F}({\mathbf {p}}, {\mathbf {q}})\) into a time optimal control problem. That is, for \(p \in [1,\infty ]\) one has by Hölder’s (in)equality, time reparameterization, and by 1-homogeneity of \(\mathcal {F}\) in its 2nd entry, that:

$$\begin{aligned}&d_{\mathcal {F}}({\mathbf {p}},{\mathbf {q}}) \end{aligned}$$
(8)
$$\begin{aligned}&\quad \textstyle = \inf \left\{ \int \limits _{0}^{1} \mathcal {F}(\gamma (t),\dot{\gamma }(t))\, \mathrm{d}t \;|\; \gamma \in \varGamma ,\; \nonumber \gamma (0)={\mathbf {p}}, \gamma (1)={\mathbf {q}}\right\} \nonumber \\&\quad = \inf \left\{ (\textstyle \int \limits _{0}^{1} |\mathcal {F}(\gamma (t),\dot{\gamma }(t))|^{p}\, \mathrm{d}t) ^{\frac{1}{p}} \! | \gamma \in \varGamma , \gamma (0)={\mathbf {p}}, \gamma (1)={\mathbf {q}}\right\} \nonumber \\&\quad = \inf \left\{ T \ge 0 \;\;|\;\; \exists \gamma \in \varGamma _T, \; \gamma (0)={\mathbf {p}}, \right. \nonumber \\&\quad \quad \left. \gamma (T)={\mathbf {q}}, \forall _{t \in [0,T]}\,\dot{\gamma }(t) \in \mathcal {B}_{\mathcal {F}}(\gamma (t)) \right\} , \end{aligned}$$
(9)

where \(\varGamma _T :={{\mathrm{Lip}}}([0,T],{\mathbb {M}})\), and with \(\mathcal {B}_\mathcal {F}({\mathbf {p}})\) as defined in (7). The latter reformulation is used in ‘Appendix A’ section to prove convergence results via closedness of controllable paths and Arzela–Ascoli’s theorem, based on a general result originally applied to Euler elastica curves in [13].

In the special case \(\mathcal {F}=\mathcal {F}_{0}\) the geodesics are SR geodesics, where \(\mathcal {F}_0\) is obtained by the square root of quadratic form associated with a SR metric \(\left. \mathcal {G}_0\right| _{{\mathbf {p}}}(\cdot ,\cdot )=\mathcal {F}_{0}({\mathbf {p}},\cdot )^2\) on a SR manifold \((\mathbb {M}, \Delta , \mathcal {G}_0)\), where \(\Delta \subset T(\mathbb {M})\) is a strict subset of allowable tangent vectors that comes along with the horizontality constraint

$$\begin{aligned} \dot{\mathbf {x}}(t) = (\dot{{{\mathbf {x}}}}(t)\cdot \mathbf {n}(t)) \mathbf {n}(t), \qquad \forall t \in [0,1], \end{aligned}$$
(10)

that arises from (2). For details on the case \(d = 2\) see [11, 52], for \(d =3\) see [25].

Finally, we note that for the uniform cost case (\(\xi ^{-1}\mathcal {C}_{1}=\mathcal {C}_{2}=1\)), the problem is covariant with respect to rotations and translations. For the data-driven case, such covariance is only obtained when simultaneously rotating the data-driven cost factors \(\mathcal {C}_{1}, \mathcal {C}_{2}\). Therefore, only in the uniform cost case, for \(d=2,3\), we shall use a reference point (‘the origin’) \(\mathbf {e} \in {\mathbb {R}}^{d} \times \mathbb {S}^{d-1}\). To adhere to common conventions we use

$$\begin{aligned} \begin{aligned} \mathbf {e}=(\mathbf {0},\mathbf {a})&\in {\mathbb {R}}^{d}\times \mathbb {S}^{d-1}, \text { with } \\&\mathbf {a}:=(1,0)^T \qquad \text { if } d=2 \text { and } \\&\mathbf {a}:=(0,0,1)^T \quad \text { if }d=3. \end{aligned} \end{aligned}$$
(11)

2.2 Controllability of the Reeds–Shepp Model

A model \(({\mathbb {M}}, d_\mathcal {F})\) is globally controllable if the distance \(d_\mathcal {F}\) takes finite values on \({\mathbb {M}}\times {\mathbb {M}}\); in other words, a car can go from any place on the manifold to any other place in finite time. In Theorem 1 we show that this is indeed the case for \(\mathcal {F}= \mathcal {F}_0\) and \(\mathcal {F}= \mathcal {F}_0^+\), given in (2) and (3). Local controllability is satisfied when \(d_\mathcal {F}\) satisfies a certain continuity requirement: if \({\mathbf {p}}\rightarrow {\mathbf {q}}\in (\mathbb {M},\Vert {\cdot }\Vert )\), with \(\Vert {\cdot }\Vert \) denoting the standard (flat) Euclidean norm on \(\mathbb {M}={\mathbb {R}}^{d} \times \mathbb {S}^{d-1}\), we must have \(d_{\mathcal {F}}({\mathbf {p}},{\mathbf {q}}) \rightarrow 0\). We prove in Theorem 1 that the metric space \(({\mathbb {M}}, d_{\mathcal {F}_0})\) is locally controllable, but the quasi-metric space \(({\mathbb {M}},d_{\mathcal {F}_0^+})\) is not. Indeed the SR Reeds–Shepp car can achieve sideways motions by alternating the forward and reverse gear with slight direction changes, whereas the model without reverse gear lacks this possibility. For completeness, the theorem contains a standard (rough) estimate of the distance near the source (due to well-known estimates [16, 31, 49, 57]).

Furthermore, we prove existence of minimizers for the Reeds–Shepp model without reverse gear. Existence results of minimizers of the model with reverse gear (the SR model) already exist, by the Chow–Rashevsky theorem and Filippov theorems [2].

Theorem 1

((Local) controllability properties) Minimizers exist for both the classical Reeds–Shepp model and for the Reeds–Shepp model without reverse gear. Both models are globally controllable.

  • The Reeds–Shepp model without reverse gear is not locally controllable, since

    $$\begin{aligned} \limsup _{{\mathbf {p}}' \rightarrow {\mathbf {p}}} d_{\mathcal {F}^+_0}({\mathbf {p}},{\mathbf {p}}')\ge 2 \pi \delta , \text { for all }{\mathbf {p}}\in \mathbb {M}. \end{aligned}$$
    (12)

    If the cost \(\mathcal {C}_{2}=\delta \) is constant on \({\mathbb {M}}\), then this inequality is sharp:

    $$\begin{aligned} \limsup _{{\mathbf {p}}' \rightarrow {\mathbf {p}}} d_{\mathcal {F}^+_0}({\mathbf {p}},{\mathbf {p}}') = \lim _{\mu \downarrow 0} d_{\mathcal {F}_0^+} (({{\mathbf {x}}},\mathbf {n}),\, ({{\mathbf {x}}}-\mu \mathbf {n},\mathbf {n})) = 2 \pi \delta . \end{aligned}$$
    (13)
  • The sub-Riemannian Reeds–Shepp model is locally controllable, since

    $$\begin{aligned} d_{\mathcal {F}_0}({\mathbf {p}},{\mathbf {p}}')&= \mathcal {O}\left( \mathcal {C}_{2}({\mathbf {p}}) \Vert \mathbf {n}-\mathbf {n}'\Vert + \sqrt{\mathcal {C}_{2}({\mathbf {p}})\mathcal {C}_{1}({\mathbf {p}})\Vert {{\mathbf {x}}}-{{\mathbf {x}}}'\Vert }\right) \ \nonumber \\&\text { as } {\mathbf {p}}'=({{\mathbf {x}}}',\mathbf {n}') \rightarrow {\mathbf {p}}=({{\mathbf {x}}},\mathbf {n}). \end{aligned}$$
    (14)

For a proof see Sect. 3.

2.3 A Continuous Approximation for the Reeds–Shepp Geometry

We introduce approximations \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\) of the Finsler metrics \(\mathcal {F}_0\) and \(\mathcal {F}_0^+\), depending on a small parameter \(0< \varepsilon \le 1\), which are continuous and in particular take only finite values. This is a prerequisite for our numerical methods. Both approximations penalize the deviation from the constraints of collinearity \(\dot{{{\mathbf {x}}}} \propto \mathbf {n}\), and in addition, \(\mathcal {F}_\varepsilon ^+\) penalizes negativity of the scalar product \(\dot{{{\mathbf {x}}}} \cdot \mathbf {n}\), appearing in (2) and (3). For that purpose, we introduce some additional notation: for \(\dot{{{\mathbf {x}}}} \in \mathbb {R}^d\) and \(\mathbf {n}\in \mathbb {S}^{d-1}\) we define

$$\begin{aligned} \Vert \dot{{{\mathbf {x}}}} \wedge \mathbf {n}\Vert ^2&:= \Vert \dot{{{\mathbf {x}}}} \Vert ^2 - |\dot{{{\mathbf {x}}}} \cdot \mathbf {n}|^2, \nonumber \\ (\dot{{{\mathbf {x}}}} \cdot \mathbf {n})_-&:= \min \{0, \dot{{{\mathbf {x}}}} \cdot \mathbf {n}\}, \quad (\dot{{{\mathbf {x}}}} \cdot \mathbf {n})_+ := \max \{\dot{{{\mathbf {x}}}} \cdot \mathbf {n},0\}. \end{aligned}$$
(15)

These are, respectively, the norm of the orthogonal projectionFootnote 4 of \(\dot{{{\mathbf {x}}}}\) onto the plane orthogonal to \(\mathbf {n}\) and the negative and positive parts of their scalar product. The two metrics \(\mathcal {F}_\varepsilon ,\mathcal {F}_\varepsilon ^+ : T({\mathbb {M}}) \rightarrow \mathbb {R}_+\) are defined for each \(0<\varepsilon \le 1\), as follows: for \(({\mathbf {p}}, \dot{{\mathbf {p}}}) \in T({\mathbb {M}})\) with components \({\mathbf {p}}= ({{\mathbf {x}}},\mathbf {n})\) and \(\dot{{\mathbf {p}}} = (\dot{{{\mathbf {x}}}},\dot{\mathbf {n}})\) we define

$$\begin{aligned} \mathcal {F}_\varepsilon ({\mathbf {p}}, \dot{{\mathbf {p}}})^2:= & {} \mathcal {C}_{1}({\mathbf {p}})^2 ( |\dot{{{\mathbf {x}}}} \cdot \mathbf {n}|^2 + \varepsilon ^{-2}\Vert \dot{{{\mathbf {x}}}} \wedge \mathbf {n}\Vert ^2) + \nonumber \\&\mathcal {C}_{2}({\mathbf {p}})^2 \Vert \dot{\mathbf {n}}\Vert ^2, \end{aligned}$$
(16)
$$\begin{aligned} \mathcal {F}^+_\varepsilon ({\mathbf {p}}, \dot{{\mathbf {p}}})^2:= & {} \, \mathcal {C}_{1}({\mathbf {p}})^2 ( |\dot{{{\mathbf {x}}}} \cdot \mathbf {n}|^2 + \varepsilon ^{-2}\Vert \dot{{{\mathbf {x}}}} \wedge \mathbf {n}\Vert ^2 + \nonumber \\&(\varepsilon ^{-2}-1) (\dot{{{\mathbf {x}}}} \cdot \mathbf {n})_-^2) +\, \mathcal {C}_{2}({\mathbf {p}})^2 \Vert \dot{\mathbf {n}}\Vert ^2 \end{aligned}$$
(17)
$$\begin{aligned}= & {} \mathcal {C}_{1}({\mathbf {p}})^2 ( (\dot{{{\mathbf {x}}}} \cdot \mathbf {n})_+^2 + \varepsilon ^{-2}\Vert \dot{{{\mathbf {x}}}} \wedge \mathbf {n}\Vert ^2 + \nonumber \\&\varepsilon ^{-2} (\dot{{{\mathbf {x}}}} \cdot \mathbf {n})_-^2) \, +\, \mathcal {C}_{2}({\mathbf {p}})^2 \Vert \dot{\mathbf {n}}\Vert ^2. \end{aligned}$$
(18)

See Fig. 5 for a visualization of a level set of both metrics in \({\mathbb {R}}^2 \times \mathbb {S}^1\). Note that \(\mathcal {F}_\varepsilon \) is a Riemannian metric on \({\mathbb {M}}\) (with the same smoothness as the cost functions \(\mathcal {C}_{2}, \mathcal {C}_{1}\)), and that \(\mathcal {F}_\varepsilon ^+\) is neither Riemannian nor smooth due to the term \((\dot{{{\mathbf {x}}}} \cdot \mathbf {n})_-\). One clearly has the pointwise convergence \(\mathcal {F}_\varepsilon ({\mathbf {p}}, \dot{{\mathbf {p}}}) \rightarrow \mathcal {F}_0({\mathbf {p}}, \dot{{\mathbf {p}}})\) as \(\varepsilon \rightarrow 0\), and likewise \(\mathcal {F}^+_\varepsilon ({\mathbf {p}}, \dot{{\mathbf {p}}}) \rightarrow \mathcal {F}^+_0({\mathbf {p}}, \dot{{\mathbf {p}}})\). The use of \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\) is further justified by the following convergence result.

Fig. 5
figure 5

Level sets for \(d = 2\) of the (approximating) metrics \(\mathcal {F}_\varepsilon (\mathbf {0},(\dot{x},\dot{y},\dot{\theta })) = 1\) (left) and \(\mathcal {F}^+_\varepsilon (\mathbf {0},(\dot{x},\dot{y},\dot{\theta })) = 1\) (right), with \(\varepsilon = 0.2\) (top) and \(\varepsilon = 0\) (bottom). In this example, \(\mathcal {C}_2(\mathbf {0}) = 2 \mathcal {C}_1(\mathbf {0})\)

Theorem 2

(Convergence of the approximative models to the exact models) One has the pointwise convergence: for any \({\mathbf {p}},{\mathbf {q}}\in {\mathbb {M}}\)

$$\begin{aligned} \begin{aligned} d_{\mathcal {F}_\varepsilon }({\mathbf {p}},{\mathbf {q}})&\rightarrow d_{\mathcal {F}_0}({\mathbf {p}},{\mathbf {q}}), \\ d_{\mathcal {F}_\varepsilon ^+}({\mathbf {p}},{\mathbf {q}})&\rightarrow d_{\mathcal {F}_0^+}({\mathbf {p}},{\mathbf {q}}), \end{aligned} \quad \text {as } \varepsilon \rightarrow 0. \end{aligned}$$

Consider for each \(\varepsilon >0\) a minimizing path \(\gamma _\varepsilon ^*\) from \({\mathbf {p}}\) to \({\mathbf {q}}\), with respect to the metric \(\mathcal {F}_\varepsilon \), parameterized at constant speed

$$\begin{aligned} \mathcal {F}_{\varepsilon }(\gamma ^*_{\varepsilon }(t),\dot{\gamma }_{\varepsilon }^*(t))= d_{\mathcal {F}_{\varepsilon }}({\mathbf {p}},{\mathbf {q}}), \qquad \forall t \in [0,1]. \end{aligned}$$

Assume that there is a unique shortest path \(\gamma ^*\) from \({\mathbf {p}}\) to \({\mathbf {q}}\) with respect to the sub-Riemannian distance \(d_{\mathcal {F}_{0}}\) (in other words \({\mathbf {q}}\) is not within the cut locus of \({\mathbf {p}}\)), parameterized at constant speed:

$$\begin{aligned} \mathcal {F}_{0}(\gamma ^*(t),\dot{\gamma }^*(t))= d_{\mathcal {F}_{0}}({\mathbf {p}},{\mathbf {q}}), \qquad \forall t \in [0,1]. \end{aligned}$$

Then \(\gamma _{\varepsilon }^* \rightarrow \gamma ^*\) as \(\varepsilon \rightarrow 0\), uniformly on [0, 1]. Likewise replacing \(\mathcal {F}_{\varepsilon }\) with \(\mathcal {F}_{\varepsilon }^+\) for all \(\varepsilon \ge 0\).

The proof, presented in ‘Appendix A’ section, is based on a general result originally applied to the Euler elastica curves in [13]. Combining Theorem 2 with the local controllability properties established in Theorem 1, one obtains that \(d_{\mathcal {F}_\varepsilon } \rightarrow d_{\mathcal {F}_0}\) locally uniformly on \({\mathbb {M}} \times {\mathbb {M}}\), and that the convergence \(d_{\mathcal {F}_\varepsilon ^+} \rightarrow d_{\mathcal {F}_0^+}\) is only pointwise.

Remark 3

If there exists a family of minimizing geodesics \((\gamma _i^*)_{i \in I}\) from \({\mathbf {p}}\) to \({\mathbf {q}}\) with respect to \(\mathcal {F}_0\) (resp. \(\mathcal {F}_0^+\)), then one can show that for any sequence \(\varepsilon _n \rightarrow 0\) one can find a subsequence and an index \(i \in I\) such that \(\gamma _{\varepsilon _{\varphi (n)}}^* \rightarrow \gamma _i^*\) uniformly as \(n\rightarrow \infty \).

Fig. 6
figure 6

Illustration of cusps in SR (\(\varepsilon =0\)) geodesics (possibly non-optimal) in \(\mathbb {M}=\mathbb {R}^{d} \times \mathbb {S}^{d-1}\). Left: cusps in spatial projections \(\mathbf {x}(\cdot )\) of SR geodesics \({\varvec{\gamma }}(\cdot )=(\mathbf {x}(\cdot ),\mathbf {n}(\cdot ))\) for \(d=2\), right: cusps (red dots) appearing in spatial projections of SR geodesics for \(d=3\). In the 3D case we indicate the corresponding rotations \(\mathbf {R}_{\mathbf {n}_1}\) via a local 3D frame (Color figure online)

2.4 Points of Interest in Spatial Projections of Geodesics for the Uniform Cost Case: Cusps Versus Keypoints

Next we provide a theorem that tells us in each of the models/metric spaces \((\mathbb {M},d_{\mathcal {F}_{0}})\), \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) and \((\mathbb {M},d_{\mathcal {F}_{0}^+})\), \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }^+})\), with \(\mathcal {C}_1=\mathcal {C}_2=1\) and \(d=2\) where cusps occur in spatial projections of geodesics or where keypoints with in-place rotations take place. Recall Fig. 2 for a geometric illustration of the specific behavior of the path at such points. In Theorem 3, we provide an analysis of the occurrence of these points for the uniform cost case.

Note that for vessel tracking (or fiber tracking) applications, cusps are not wanted as they are unnatural for vessels (or fibers), whereas keypoints are only desirable at bifurcations of vessels. In the data-driven case, the practical advantage of the forward-only model resulting in keypoints instead of cusps can indeed be observed (see, e.g., Figs. 1314).

Definition 2

(Cusp) A cusp point \(\mathbf {x}(t_0)\) on a spatial projection of a (SR) geodesic \( t \mapsto (\mathbf {x}(t),\mathbf {n}(t))\) in \(\mathbb {M}\) is a point where

$$\begin{aligned} \begin{array}{l} \tilde{u} (t_0)=0, \text { and }\ \dot{\tilde{u}}(t_0) \ne 0, \\ \text {where } \tilde{u}(t):= \mathbf {n}(t) \cdot \dot{\mathbf {x}}(t) \text { for all }t. \end{array} \end{aligned}$$
(19)

That is, a cusp point is a point where the spatial control aligned with \(\mathbf {n}(t_0)\) vanishes and switches sign locally.

Although this definition explains the notion of a cusp geometrically (as can be observed in Figs. 26), it contains a redundant part for the relevant case of interest: the second condition automatically follows when considering the SR geodesics in \((\mathbb {M}, d_{\mathcal {F}_{0}})\). The following lemma gives a characterization of a cusp point in terms of the distance function along a curve.

Lemma 1

Consider a SR geodesic \(\gamma =(\mathbf {x},\mathbf {n}) : [0,1] \rightarrow (\mathbb {M}, d_{\mathcal {F}_{0}})\), parameterized at constant speed, and which physical position \({{\mathbf {x}}}(\cdot )\) is not identically constant. Denote \({\mathbf {p}}_S := \gamma (0)\) and \(U(\cdot ):= d_{\mathcal {F}_{0}}({\mathbf {p}}_{S},\cdot )\). Let \(t_0\in (0,1)\) be such that U is differentiable at \(\gamma (t_0) =(\mathbf {x}(t_0),\mathbf {n}(t_0))\). Then

$$\begin{aligned} \begin{aligned} {{\mathbf {x}}}(t_0) \text { is a cusp point }\Leftrightarrow \mathbf {n}(t_0) \cdot \dot{\mathbf {x}}(t_0) =0 \\ \Leftrightarrow \mathbf {n}(t_0) \cdot \nabla _{{\mathbb {R}}^{d}} U(\mathbf {x}(t_0),\mathbf {n}(t_0))=0. \end{aligned} \end{aligned}$$
(20)

The proof can be found in ‘Appendix C’ section.

Definition 3

(Keypoint) A point \(\tilde{{{\mathbf {x}}}}\) on the spatial projection of a geodesic \(\gamma (\cdot ) = ({{\mathbf {x}}}(\cdot ),\mathbf {n}(\cdot ))\) in \({\mathbb {M}}\) is a keypoint of \(\gamma \) if there exist \(t_0 < t_1\), such that \({{\mathbf {x}}}(t) = \tilde{{{\mathbf {x}}}}\) and \(\dot{\mathbf {n}}(t) \ne 0\) for all \(t \in [t_0,t_1]\), i.e., a point where an in-place rotation takes place.

Definition 4

We define the set to be all endpoints that can be reached with a geodesic \(\gamma ^*:[0,1] \rightarrow \mathbb {M}\) in \((\mathbb {M},d_{\mathcal {F}_0})\) whose spatial control \(\tilde{u}(t)\) stays positive for all \(t \in [0,1]\).

Remark 4

The word ‘geodesic’ in this definition can (in the case \(d = 2\)) be replaced by ‘globally minimizing geodesic’ [11]. For a definition in terms of the exponential map of a geometrical control problem \(\mathbf {P}_{curve}\), see, e.g., [22, 24], in which the same positivity condition for \(\tilde{u}\) is imposed. Figure 7 shows more precisely what this set looks like for \(d = 2\) [22], in particular, that it is contained in the half-space \(\mathbf {a} \cdot \mathbf {x} \ge 0\), and for \(d = 3\) [24]. We extend these results with the following theorem.

Fig. 7
figure 7

The set of endpoints reachable from the origin \(\mathbf {e}\) [recall (11)] via SR geodesics whose spatial projections do not exhibit cusps has been studied for the case \(d=2\) (left) and for the case \(d=3\) (right). For \(d=2\) it is contained in \(x \ge 0\), and for \(d=3\) it is contained in \(z\ge 0\). The boundary of this set contains endpoints of geodesics departing at a cusp (in red) or endpoints of geodesics ending in a cusp (in blue). If an endpoint \((\mathbf {x},\mathbf {n})\) is placed outside (e.g., the green points above), then following the approach in Theorem 4, depending on its initial spatial location it first connects to a blue point \((\mathbf {x},\mathbf {n}_{new})\) via a spherical geodesic end and then connects to the origin \(\mathbf {e}\) via a SR geodesic. Then it has a keypoint at the endpoint. For other locations of spatial locations (orange points), the geodesic has the keypoint in the origin, or even at both boundaries, cf. Fig. 8 (Color figure online)

Theorem 3

(Cusps and Keypoints) Let \(\varepsilon >0\), \(d=2\), \(\mathcal {C}_{1}=\mathcal {C}_{2}=1\). Then,

  • in \((\mathbb {M},d_{\mathcal {F}_{0}})\) cusps are present in spatial projections of almost every optimal SR geodesics when their times t are extended on the real line (until they lose optimality). The straight lines connecting specific boundary points \(\mathbf {p}=(\mathbf {x},\mathbf {n})\) and \(\mathbf {q}=(\mathbf {x}+ \lambda \mathbf {n},\mathbf {n})\) with \(\lambda \in {\mathbb {R}}\) are the only exceptions.

  • in \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }^+})\) and \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) and \((\mathbb {M},d_{\mathcal {F}_{0}^+})\) no cusps appear in spatial projections of geodesics.

Furthermore,

  • in \((\mathbb {M},d_{\mathcal {F}_{0}})\), \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) and \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }^+})\) keypoints only occur with vertical geodesics (moving only angularly).

  • in \((\mathbb {M},d_{\mathcal {F}_{0}^+})\) keypoints only occur at the endpoints of shortest paths.

A minimizing geodesic \(\gamma _{+}\) in \((\mathbb {M},d_{\mathcal {F}_{0}^+})\) departing from \(\mathbf {e}=(0,0,0)\) and ending in \({\mathbf {p}}=(x,y,\theta )\) has

  1. (A)

    no keypoint if ,

  2. (B)

    a keypoint in (0, 0) if \(x<0\),

  3. (C)

    a keypoint only in (xy) ifFootnote 5

    1. (C1)

      and \(x \ge 2\),

    2. (C2)

      and \(0 \le x < 2\) and

      \(|y| \le -i x \, E \left( i \text {arcsinh}\left( \frac{x}{\sqrt{4-x^2}}\right) , \frac{x^2-4}{x^2} \right) \), where E(zm) denotes the Elliptic integral of the second kind.

Fig. 8
figure 8

Shortest paths for \(d=2\) using the Finsler metrics \(\mathcal {F}_{0}\) (blue) and \(\mathcal {F}_{0}^+\) (red), with point source \({\mathbf {p}}_S = (0,0,0)\) and varying end conditions. Row A: \({\mathbf {p}}= (0,0.8,\pi n /4)\). Row B: \({\mathbf {p}}= (0.8,0.8,\pi n/4)\). Row C: \({\mathbf {p}}= (-0.8,0,\pi n/4)\). Here \(n = 1, \ldots , 8\), corresponding to the columns. When there are two minimizing geodesics, both are drawn. Circles around the begin or endpoint indicate in-place rotation of the red curve at that point. We see that whenever the blue geodesic has a cusp, the red geodesic has at least one in-place rotation (keypoint). This numerically supports our statements in Theorem 3 considering cusps and keypoints. For high accuracy we applied the relatively slow iterative PDE approach [8] on a \(101 \times 101 \times 64\)-grid in \(\mathbb {M}\) to compute \(d_{\mathcal {F}_{0}}({\mathbf {p}},{\mathbf {p}}_S)\) and \(d_{\mathcal {F}_{0}^+}({\mathbf {p}},{\mathbf {p}}_S)\), see [27, Appendix B] (Color figure online)

Remark 5

In case A, \(\gamma _{+}\) is a minimizing geodesic in \((\mathbb {M},d_{\mathcal {F}_0})\) as well. In case B, \(\gamma _{+}\) departs from a cusp. In case C, \(\gamma ^+\) is a concatenation of a minimizing geodesic in \((\mathbb {M},d_{\mathcal {F}_0})\) and an in-place rotation. For other endpoints \((x,y,\theta )\) for geodesics departing from \(\mathbf {e}\) with \(0\le x < 2\), other than the ones reported in C2 it is not immediately clear what happens, due to [22, Theorem 9]. Also points with \(x<0\) may have keypoints at the end as well. See Fig. 8 where various cases of minimizing geodesics in \((\mathbb {M},d_{\mathcal {F}_0^+})\) are depicted.

Remark 6

See [27, Fig. 6] to see the smoothing effect of taking \(\varepsilon \) small but nonzero on the cusps of non-optimal geodesics in \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) and keypoints in \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }^{+}})\).

2.5 The Eikonal PDE Formalism

As briefly discussed in Sect. 1.3, continuous metrics like \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\) for any \(\varepsilon >0\) allow to use the standard theory of viscosity solutions of eikonal PDEs and thus to design provable and efficient numerical schemes for the computation of distance maps and minimizing geodesics. More precisely, consider a continuous Finsler metric \(\mathcal {F}\in C^0(T({\mathbb {M}}),{\mathbb {R}}^+)\), and define the dual \(\mathcal {F}^*\) on the co-tangent bundle as follows: for all \(({\mathbf {p}}, \hat{{\mathbf {p}}}) \in T^*({\mathbb {M}})\)

$$\begin{aligned} \mathcal {F}^*({\mathbf {p}}, \hat{{\mathbf {p}}}) := \sup _{\dot{{\mathbf {p}}} \in T_{\mathbf {p}}{\mathbb {M}}{\setminus }\{0\}} \frac{\langle \hat{{\mathbf {p}}}, \dot{{\mathbf {p}}}\rangle }{\mathcal {F}({\mathbf {p}}, \dot{{\mathbf {p}}})}. \end{aligned}$$
(21)

The distance map \(U = d_\mathcal {F}({\mathbf {p}}_\mathrm{S}, \cdot )\) from a given source point \({\mathbf {p}}_\mathrm{S}\in {\mathbb {M}}\) is the unique solution, in the sense of viscosity solutions, of the static Hamilton Jacobi equation: \(U({\mathbf {p}}_\mathrm{S})=0\), and for all \(p \in {\mathbb {M}}\)

$$\begin{aligned} \mathcal {F}^*({\mathbf {p}}, \mathrm{d}U({\mathbf {p}}))=1. \end{aligned}$$
(22)

Furthermore, if \(\gamma \) is a minimizing geodesic from \({\mathbf {p}}_\mathrm{S}\) to some \({\mathbf {p}}\in {\mathbb {M}}\), then it obeys the ordinary differential equation (ODE):

$$\begin{aligned} \left\{ \begin{aligned}&\dot{\gamma }(t) = L \ \mathrm{d}_{\hat{{\mathbf {p}}}} \mathcal {F}^*(\gamma (t), \mathrm{d}U(\gamma (t))), \; L := d_\mathcal {F}({\mathbf {p}}_\mathrm{S}, {\mathbf {p}}) \\&\gamma (0) = {\mathbf {p}}_S, \quad \gamma (1) = {\mathbf {p}}. \end{aligned}\right. \end{aligned}$$
(23)

for any \(t\in [0,1]\) such that the differentiability of U and \(\mathcal {F}^*\) holds at the required points. The proof of ODE (23) is for completeness derived in Proposition 4 of ‘Appendix B’ section, where we also discuss in Remark 14 the common alternative formalism based on the Hamiltonian. We denoted by \(\mathrm{d}_{\hat{{\mathbf {p}}}} \mathcal {F}^*\) the differential of the dual Finsler metric \(\mathcal {F}^*\) with respect to the second variable \(\hat{{\mathbf {p}}}\); hence, \(\mathrm{d}_{\hat{{\mathbf {p}}}} \mathcal {F}^*({\mathbf {p}}, \hat{{\mathbf {p}}}) \in T^{**}_{\mathbf {p}}({\mathbb {M}}) \cong T_{\mathbf {p}}({\mathbb {M}})\) is indeed a tangent vector to \({\mathbb {M}}\), for all \(({\mathbf {p}},\hat{{\mathbf {p}}})\in T^*{\mathbb {M}}\).

In the rest of this section, we specialize (22) and (23) to the Finsler metrics \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\). Our first result provides explicit expressions for the dual Finsler metrics (required for the eikonal equation).

Proposition 1

For any \(0<\varepsilon \le 1\), the duals to the approximating Finsler metrics \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\) are: for all \(({\mathbf {p}},\hat{{\mathbf {p}}}) \in T^*({\mathbb {M}})\), with \({\mathbf {p}}= ({{\mathbf {x}}},\mathbf {n})\) and \(\hat{{\mathbf {p}}}= (\hat{{{\mathbf {x}}}}, \hat{\mathbf {n}})\)

$$\begin{aligned} \begin{aligned} \mathcal {F}_\varepsilon ^*({\mathbf {p}}, \hat{{\mathbf {p}}})^2&= (\mathcal {C}_{2}({\mathbf {p}}))^{-2} \Vert \hat{\mathbf {n}}\Vert ^2 + (\mathcal {C}_{1}({\mathbf {p}}))^{-2} (|\hat{{{\mathbf {x}}}}\cdot \mathbf {n}|^2 \\& + \varepsilon ^2 \Vert \hat{{{\mathbf {x}}}}\wedge \mathbf {n}\Vert ^2) \\ \mathcal {F}_\varepsilon ^{+*}({\mathbf {p}}, \hat{{\mathbf {p}}})^2&= (\mathcal {C}_{2}({\mathbf {p}}))^{-2} \Vert \hat{\mathbf {n}}\Vert ^2 + (\mathcal {C}_{1}({\mathbf {p}}))^{-2} (|\hat{{{\mathbf {x}}}}\cdot \mathbf {n}|^2 \\& + \varepsilon ^2 \Vert \hat{{{\mathbf {x}}}}\wedge \mathbf {n}\Vert ^2 - (1-\varepsilon ^2) (\hat{{{\mathbf {x}}}}\cdot \mathbf {n})_-^2) \\&= (\mathcal {C}_{2}({\mathbf {p}}))^{-2} \Vert \hat{\mathbf {n}}\Vert ^2 + (\mathcal {C}_{1}({\mathbf {p}}))^{-2} ((\hat{{{\mathbf {x}}}}\cdot \mathbf {n})_+^2 \\& + \varepsilon ^2(\hat{{{\mathbf {x}}}}\cdot \mathbf {n})_-^2 + \varepsilon ^2 \Vert \hat{{{\mathbf {x}}}}\wedge \mathbf {n}\Vert ^2) \end{aligned} \end{aligned}$$
(24)

In order to relate Finslerian HJB equation (22) and backtracking equation (23) to some more classical Riemannian counterparts, we introduce two Riemannian metric tensor fields on \({\mathbb {M}}\). The first is defined as the polarization of the norm \(\mathcal {F}_\varepsilon ({\mathbf {p}},\cdot )\)

$$\begin{aligned} \begin{aligned} \mathcal {G}_{{\mathbf {p}}; \varepsilon }(\dot{{\mathbf {p}}},\dot{{\mathbf {p}}})&= |\mathcal {F}_{\varepsilon }({\mathbf {p}},\dot{{\mathbf {p}}})|^2 \\ {}&= \mathcal {C}_{1}^{2}({\mathbf {p}}) ((\dot{{{\mathbf {x}}}} \cdot \mathbf {n})^2+\varepsilon ^{-2}\Vert \dot{{{\mathbf {x}}}} \wedge \mathbf {n}\Vert ^2) \\&+ \mathcal {C}_{2}^2({\mathbf {p}}) \Vert \dot{\mathbf {n}}\Vert ^2, \end{aligned} \end{aligned}$$
(25)

where \(\dot{{\mathbf {p}}}=(\dot{\mathbf {x}},\dot{\mathbf {n}})\), and then one can also rely on gradient fields \({\mathbf {p}}\mapsto \mathcal {G}_{{\mathbf {p}}; \varepsilon }^{-1}\mathrm{d}U({\mathbf {p}})\) relative to this metric tensor. This has benefits if it comes to geometric understanding of the eikonal equation and its tracking. Even in the analysis of the non-symmetric case—where one does not have a single metric tensor—this notion plays a role, as we will see in the next main theorem. To this end, in the non-symmetric case, we shall rely on a second spatially isotropic metric tensor given by:

$$\begin{aligned} \widetilde{\mathcal {G}}_{{\mathbf {p}}; \varepsilon }(\dot{{\mathbf {p}}},\dot{{\mathbf {p}}}):= \mathcal {C}_{1}^{2}({\mathbf {p}}) \, \varepsilon ^{-2}\, \Vert \dot{\mathbf {x}}\Vert ^2 + \mathcal {C}_{2}^2({\mathbf {p}}) \Vert \dot{\mathbf {n}}\Vert ^2. \end{aligned}$$
(26)

We denote by \(\nabla _{\mathbb {S}^{d-1}}\) the gradient operator on \(\mathbb {S}^{d-1}\) with respect to the inner product induced by the embedding \(\mathbb {S}^{d-1} \subset \mathbb {R}^d\) and by \(\nabla _{\mathbb {R}^d}\) the canonical gradient operator on \(\mathbb {R}^d\).

Corollary 1

Let \(\varepsilon \ge 0\). Then eikonal PDE (5) for the case \((\mathbb {M},\mathcal {F}_\varepsilon )\) takes the form

$$\begin{aligned} \begin{array}{c} \sqrt{ \textstyle {\frac{\Vert \nabla _{\mathbb {S}^{d\!-\!1}}U({\mathbf {p}})\Vert ^2}{\mathcal {C}^{2}_{2}({\mathbf {p}})}} + \textstyle {\frac{\varepsilon ^{2}\Vert \nabla _{{\mathbb {R}}^{d}}U({\mathbf {p}})\Vert ^2 + (1-\varepsilon ^2) |\, \mathbf {n} \cdot \nabla _{{\mathbb {R}}^{d}}U({\mathbf {p}}) \,|^2}{\mathcal {C}^{2}_{1}({\mathbf {p}})}} } = 1, \\ \Leftrightarrow \\ \left. \mathcal {G}_{{\mathbf {p}}; \varepsilon }\right| _{{\mathbf {p}}}\left( \,\mathcal {G}_{{\mathbf {p}}; \varepsilon }^{-1} \mathrm{d}U({\mathbf {p}})\,, \mathcal {G}_{{\mathbf {p}}; \varepsilon }^{-1} \mathrm{d}U({\mathbf {p}})\, \right) =1. \end{array} \end{aligned}$$

Eikonal PDE (5) for the case \((\mathbb {M},\mathcal {F}_\varepsilon ^+)\) now takes the explicit form:

$$\begin{aligned} \begin{array}{c} \sqrt{ \begin{aligned} &{}\textstyle {\frac{\Vert \nabla _{\mathbb {S}^{d\!-\!1}}U^+({\mathbf {p}})\Vert ^2}{\mathcal {C}^{2}_{2}({\mathbf {p}})}} + \\ &{}\textstyle {\frac{\varepsilon ^{2}\Vert \nabla _{{\mathbb {R}}^{d}}U^+({\mathbf {p}})\Vert ^2 + (1-\varepsilon ^2) |\, (\,\mathbf {n} \cdot \nabla _{{\mathbb {R}}^{d}}U^+({\mathbf {p}})\,)_+\, \,|^2}{\mathcal {C}^{2}_{1}({\mathbf {p}})}} \end{aligned} }=1 \\ \Leftrightarrow \\ \left\{ \begin{array}{l} \left. \mathcal {G}_{{\mathbf {p}}; \varepsilon }\right| _{{\mathbf {p}}}\left( \,\mathcal {G}_{{\mathbf {p}}; \varepsilon }^{-1} \mathrm{d}U^+({\mathbf {p}})\,, \mathcal {G}_{{\mathbf {p}}; \varepsilon }^{-1} \mathrm{d}U^+({\mathbf {p}})\, \right) =1, \\ \text { if }{\mathbf {p}}\in \mathbb {M}_{+}:=\{{\mathbf {p}}\in \mathbb {M} \;|\; \mathbf {n} \cdot \nabla _{\mathbb {R}^{d}}U^+(\mathbf {p}) > 0 \}, \\ \left. \widetilde{\mathcal {G}}_{{\mathbf {p}}; \varepsilon }\right| _{{\mathbf {p}}}\left( \,\widetilde{\mathcal {G}}_{{\mathbf {p}}; \varepsilon }^{-1} \mathrm{d}U^+({\mathbf {p}})\,, \widetilde{\mathcal {G}}_{{\mathbf {p}}; \varepsilon }^{-1} \mathrm{d}U^+({\mathbf {p}}) \, \right) =1, \\ \text { if }{\mathbf {p}}\in \mathbb {M}_- := \{{\mathbf {p}}\in \mathbb {M} \;|\; \mathbf {n} \cdot \nabla _{\mathbb {R}^{d}}U^+(\mathbf {p}) < 0 \}. \end{array} \right. \end{array} \end{aligned}$$

for those \(\mathbf {p} \in \mathbb {M}_+ \cup \mathbb {M}_{-}\) where \(U^{+}\) is differentiable.Footnote 6

The proof of Proposition 1 and Corollary 1 can be found in Sect. 5.

We finally specialize geodesic ODE (23) to the models of interest. Note that for the model \(({\mathbb {M}},d_{\mathcal {F}_\varepsilon ^+})\), the backtracking switches between qualitatively distinct modes, respectively, almost sub-Riemannian and almost purely angular, in the spirit of Theorem 3. Given \(\varepsilon > 0\) and \(\mathbf {n}\in \mathbb {S}^{d-1}\) let \(D_\mathbf {n}^\varepsilon \) denote the \(d \times d\) symmetric positive definite matrix with eigenvalue 1 in the direction \(\mathbf {n}\) and eigenvalue \(\varepsilon ^2\) in the orthogonal directions :

$$\begin{aligned} D_\mathbf {n}^\varepsilon := \mathbf {n}\otimes \mathbf {n}+ \varepsilon ^2 ({{\mathrm{Id}}}-\mathbf {n}\otimes \mathbf {n}). \end{aligned}$$
(27)

Theorem 4

(Backtracking) Let \(0< \varepsilon <1\). Let \({\mathbf {p}}_\mathrm{S}\in {\mathbb {M}}\) be a source point. Let \(U({\mathbf {p}}):= d_{\mathcal {F}_{\varepsilon }}({\mathbf {p}},{\mathbf {p}}_s)\), \(U^+({\mathbf {p}}):=d_{\mathcal {F}_{\varepsilon }^+}({\mathbf {p}},{\mathbf {p}}_s)\) be distance maps from \({\mathbf {p}}_{s}\), w.r.t. the Finsler metric \(\mathcal {F}_{\varepsilon }\), and \(\mathcal {F}_{\varepsilon }^+\). Let \(\gamma , \gamma ^+:[0,1] \rightarrow \mathbb {M}\) be normalized geodesics of length L starting at \({\mathbf {p}}_s\) in \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) resp. \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }^+})\). Let time \(t \in [0,1]\).

For the Riemannian approximation paths of the Reeds–Shepp car we have, provided that U is differentiable at \(\gamma (t)=(\mathbf {x}(t),\mathbf {n}(t))\), that

$$\begin{aligned} \begin{array}{c} \dot{\gamma }(t) = L \, \mathcal {G}^{-1}_{\gamma (t); \varepsilon } \mathrm{d}U(\gamma (t)) \\ \Leftrightarrow \\ \left\{ \begin{array}{ll} \dot{\mathbf {n}}(t) &{}= L\, \mathcal {C}_{2}(\gamma (t))^{-1}\; \nabla _{\mathbb {S}^{d-1}} U(\gamma (t)), \\ \dot{{{\mathbf {x}}}}(t) &{}= L\, \mathcal {C}_{1}(\gamma (t))^{-1}\; D_{\mathbf {n}(t)}^\varepsilon \nabla _{\mathbb {R}^d} U(\gamma (t)). \end{array} \right. \end{array} \end{aligned}$$
(28)

For the approximation paths of the car without reverse gear we have, provided that \(U^+\) is differentiable at \(\gamma ^+(t) = ({{\mathbf {x}}}^+(t),\mathbf {n}^+(t))\), that

$$\begin{aligned} \dot{\gamma }^+(t) = L \left\{ \begin{array}{ll} \mathcal {G}^{-1}_{\gamma ^+(t); \varepsilon } \mathrm{d}U^+(\gamma ^+(t)) &{}\text {if }\gamma ^+(t) \in \mathbb {M}_+,\\ \widetilde{\mathcal {G}}^{-1}_{\gamma ^+(t); \varepsilon } \mathrm{d}U^+(\gamma ^+(t)) &{}\text {if }\gamma ^+(t) \in \mathbb {M}_-, \end{array} \right. \end{aligned}$$
(29)

with \(\widetilde{\mathcal {G}}_{{\mathbf {p}}; \varepsilon }(\dot{{\mathbf {p}}},\dot{{\mathbf {p}}})\) given by (26), with disjoint Riemannian manifold splitting \(\mathbb {M}=\mathbb {M}_{+} \cup \mathbb {M}_{-} \cup \partial \mathbb {M}_{\pm }\). Manifold \(\mathbb {M}_{+}\) is equipped with metric tensor \(\mathcal {G}_{\varepsilon }\), \(\mathbb {M}_{-}\) is equipped with metric tensor \(\widetilde{\mathcal {G}}_{\varepsilon }\) and

$$\begin{aligned} \partial \mathbb {M}_{\pm } := \overline{{\mathbb {M}}_+} {\setminus } {\mathbb {M}}_+ = \overline{{\mathbb {M}}_-} {\setminus } {\mathbb {M}}_- \end{aligned}$$
(30)

denotes the transition surface (surface of keypoints).

Remark 7

General abstract formula (29) reflects that the backtracking in \((\mathbb {M},\mathcal {F}^{+})\) is a combined gradient descent flow on the distance map \(U^+\) on a splitting of \(\mathbb {M}\) into two (symmetric) Riemannian manifolds. Its explicit form (likewise (28)) is

$$\begin{aligned} \left\{ \begin{array}{ll} \dot{\mathbf {n}}^+(t) &{}= L\, \mathcal {C}_{2}(\gamma ^+(t))^{-1}\; \nabla _{\mathbb {S}^{d-1}} U^+(\gamma ^+(t)), \\ \dot{{{\mathbf {x}}}}^+(t) &{}= L \left\{ \begin{array}{lll} \mathcal {C}_{1}(\gamma ^+(t))^{-1}\; D_{\mathbf {n}(t)}^\varepsilon \nabla _{\mathbb {R}^d} U^+(\gamma ^+(t)) \\ \qquad \qquad \text { if }\gamma ^+(t) \in \mathbb {M}_+, \\ \varepsilon ^2 \, \mathcal {C}_{1}(\gamma ^+(t))^{-1}\; \nabla _{\mathbb {R}^d} U^+(\gamma ^+(t)) \\ \qquad \qquad \text { if }\gamma ^+(t) \in \mathbb {M}_-,\\ \end{array} \right. \end{array} \right. \end{aligned}$$
(31)

Note that for the (less useful) isotropic case \(\varepsilon =1\), \(\mathcal {F}_{1}\) and \(\mathcal {F}_{1}^+\) coincide and geodesics consist of straight lines \(\mathbf {x}(\cdot )\) in \({\mathbb {R}}^{d}\) and great circles \(\mathbf {n}(\cdot )\) in \(\mathbb {S}^{d}\) that do not influence each other.

Remark 8

In Theorem 4, we assumed distance maps U and \(U^+\) to be differentiable along the path, which is not always the case. In points where the distance map is not differentiable, one can take any subgradient in the subdifferential \(\partial U(\mathbf {p})\) in order to identify Maxwell points (and Maxwell strata). In particular, in SR geometry, the set of points where the squared distance function \((d_{\mathcal {F}_0}(\cdot ,e))^2\) is smooth is open and dense in any compact subset of \({\mathbb {M}}\), see [1, Theorem 11.15]. The points where it is non-smooth are rare and meaningful: they are either first Maxwell points, conjugate points or abnormal points. The last type does not appear here, because we have a 2-bracket generating distribution, see, e.g., [25, Remark 4] and [1, Chap. 20.5.1.]. At points in the closure of the first Maxwell set, two geodesically equidistant wavefronts collide for the first time, see for example [8, Fig.3, Theorem 3.2] for the case \(d=2\) and \(\mathcal {C}=\mathcal {C}_{1}=\mathcal {C}_{2}=1\). See also Fig. 8, where for some end conditions 2 optimally backtracked geodesics end with the same length in such a first Maxwell point. The conjugate points are points where local optimality is lost, for a precise definition see, e.g., [1, Definition 8.43].

Remark 9

Recall the convergence result from Theorem 2, and the non-local-controllability for the model \((\mathbb {M},d_{\mathcal {F}_{0}^+})\). From this we see that the convergence holds pointwise but not uniformly. (Otherwise the limit distance \(d_{\mathcal {F}_0^+}\) was continuous.) Nevertheless the shortest paths converge strongly as \(\varepsilon \downarrow 0\), and we see that the spatial velocity tends to 0 in (31) if \(\varepsilon \downarrow 0\) if \(\gamma _{\varepsilon }^*(t) \in \mathbb {M}_{-}\). In the SR case \(\varepsilon =0\), the gradient flows themselves fit continuously and the interface \(\partial {\mathbb {M}}_{\pm }\) is reached with \(\dot{{{\mathbf {x}}}} \cdot \mathbf {n}= 0\) (and \(\dot{{{\mathbf {x}}}} = 0\)).

Theorem 4 can be extended to the SR case:

Corollary 2

(SR backtracking) Let the cost \(\mathcal {C}_1, \mathcal {C}_2\) be smooth, let the source \({\mathbf {p}}_S \in {\mathbb {M}}\) and \({\mathbf {p}}\ne {\mathbf {p}}_S \in {\mathbb {M}}\) be such that they can be connected by a unique smooth minimizer \(\gamma _\varepsilon ^*\) in \(({\mathbb {M}},\mathcal {F}_\varepsilon )\) and \(\gamma _0^*\) in \(({\mathbb {M}}, \mathcal {F}_0)\), such that \(\gamma _\varepsilon ^*(t)\) is not a conjugate point for all \(t \in [0,1]\) and all sufficiently small \(\varepsilon >0\), say \(\varepsilon < \varepsilon _0\), for some \(\varepsilon _0 > 0\). Then defining \(U_0: {\mathbf {q}}\in {\mathbb {M}} \mapsto d_{\mathcal {F}_\varepsilon }({\mathbf {p}}_s,{\mathbf {q}})\) one has

$$\begin{aligned} \dot{\gamma }^*_0(t) = U_0({\mathbf {p}})\mathcal {G}_{\gamma ^*_0(t) ; 0}^{-1} \, \mathrm{d}U_{0}(\gamma _0^*(t)), \qquad t \in [0,1], \end{aligned}$$

assuming \(U_0\) is differentiable at \(\gamma _0^*(t)\). In addition \(U_0\) satisfies the SR eikonal equation:

$$\begin{aligned} \sqrt{\mathcal {G}_{{\mathbf {p}};0}\left( \mathcal {G}_{{\mathbf {p}};0}^{-1} \mathrm{d}U_0({\mathbf {p}}), \mathcal {G}_{{\mathbf {p}};0}^{-1} \mathrm{d}U_0({\mathbf {p}})\right) } =1. \end{aligned}$$

Proof

From our assumptions on \({\mathbf {p}}\) and \(\gamma _\varepsilon ^*(t)\) for \(\varepsilon < \varepsilon _0\), we have, recall Remark 8, that \((U_\varepsilon (\cdot ))^2\) is differentiable at \(\gamma _\varepsilon ^*(t)\) for all \(0 \le t \le 1\) and \(0 \le \varepsilon < \varepsilon _0\). This implies that \(U_\varepsilon \) is differentiable at \(\{\gamma _\varepsilon ^*(t) \; | \; 0 < t \le 1 \}\), for all \(0<\varepsilon < \varepsilon _0\).

From Theorem 2 we have pointwise convergence \(U_\varepsilon ({\mathbf {p}})\rightarrow U_0({\mathbf {p}})\) and uniform convergence \(\gamma _\varepsilon ^* \rightarrow \gamma _0^*\) as \(\varepsilon \downarrow 0\). Moreover, as \(\gamma _\varepsilon ^*\) and \(\gamma _0^*\) are solutions of the canonical ODEs of Pontryagin’s maximum principle, the trajectories are continuously depending on \(\varepsilon >0\) and so are the derivatives \(\dot{\gamma }_\varepsilon ^*\). As a result, we can apply backtracking Theorem 4 for \(\varepsilon > 0\) and take the limits:

$$\begin{aligned} \begin{aligned} \dot{\gamma }_0^*(t)&= \lim _{\varepsilon \downarrow 0} \dot{\gamma }_\varepsilon ^*(t) \\&\mathop {=}\limits ^{\text {Theorem}~4} \lim _{\varepsilon \downarrow 0} U_\varepsilon ({\mathbf {p}})\,({\mathcal {G}}_{\gamma _\varepsilon ^*(t);\varepsilon }^{-1} \mathrm{d} U_\varepsilon )(\gamma _\varepsilon ^*(t)) \\&= U_0({\mathbf {p}}) \, \left( \lim _{\varepsilon \downarrow 0} {\mathcal {G}}_{\gamma _\varepsilon ^*(t);\varepsilon }^{-1}\right) \left( \lim _{\varepsilon \downarrow 0} (\mathrm{d} U_\varepsilon (\gamma _\varepsilon ^*(t)))\right) \\&\mathop {=}\limits ^{\text {Theorem}~2} U_0({\mathbf {p}}) \, {\mathcal {G}}_{\gamma _0^*(t);0}^{-1}(\mathrm{d} U_0)(\gamma _0^*(t)). \end{aligned} \end{aligned}$$
(32)

Furthermore,

$$\begin{aligned} \begin{aligned} 1&= \lim \limits _{\varepsilon \downarrow 0} \sqrt{\mathcal {G}_{{\mathbf {p}};\varepsilon }\left( \mathcal {G}_{{\mathbf {p}};\varepsilon }^{-1} \mathrm{d}U_{\varepsilon }({\mathbf {p}}), \mathcal {G}_{{\mathbf {p}};\varepsilon }^{-1} \mathrm{d}U_{\varepsilon }({\mathbf {p}})\right) } \\&=\sqrt{\mathcal {G}_{{\mathbf {p}};0}\left( \mathcal {G}_{{\mathbf {p}};0}^{-1} \mathrm{d}U_0({\mathbf {p}}), \mathcal {G}_{{\mathbf {p}};0}^{-1} \mathrm{d}U_0({\mathbf {p}})\right) } \end{aligned} \end{aligned}$$

where we recall Corollary 1. Here due to our assumptions, \(U_\varepsilon \) and \(U_0\) are both differentiable at \({\mathbf {p}}\). Note that the limit for the inverse metric \({\mathcal {G}}_{{\mathbf {p}},\varepsilon }^{-1}\) as \(\varepsilon \downarrow 0\) exists, recall Corollary 1. \(\square \)

Now that we stated our 4 main theoretical results we will prove them in the subsequent sections (and ‘Appendix A’ section).

3 Controllability Properties: Proof of Theorem 1 and Maxwell Points in \((\mathbb {M},d_{\mathcal {F}_0^+})\)

(Global controllability) The two considered Reeds–Shepp models \((\mathbb {M},d_{\mathcal {F}_0})\) and \((\mathbb {M},d_{\mathcal {F}_0^+})\) are globally controllable, in the sense that the distances \(d_{\mathcal {F}_0}\) and \(d_{\mathcal {F}_0^+}\) take finite values on \({\mathbb {M}}\times {\mathbb {M}}\). This easily follows from the observation that any path \({{\mathbf {x}}}: [0,1] \rightarrow \mathbb {R}^d\), which time derivative \(\dot{{{\mathbf {x}}}} := \frac{\mathrm{d}{{\mathbf {x}}}}{\mathrm{d}t}\) is Lipschitz and non-vanishing, can be lifted into a path \(\gamma : [0,1]\rightarrow {\mathbb {M}}\) of finite length w.r.t. \(\mathcal {F}_0\) and \(\mathcal {F}_0^+\), defined by \(\gamma (t):=({{\mathbf {x}}}(t),\dot{{{\mathbf {x}}}}(t)/\Vert \dot{{{\mathbf {x}}}}(t)\Vert )\) for all \(t\in [0,1]\). The fact that the infimum in (1) is actually a minimum for \(\mathcal {F}=\mathcal {F}_{0}^+\) follows by Corollary 3 in ‘Appendix A’ section and (9), and the fact that the quasi-distances take finite values.

(Local controllability) In order to show that the model \(({\mathbb {M}},d_{\mathcal {F}_0^+})\) is not locally controllable, we need the following lemma.

Lemma 2

Let \(\mathbf {n}:[0,\pi ] \rightarrow \mathbb {S}^{d-1}\) be strictly 1-Lipschitz. Then \(\int _{0}^{\pi } \mathbf {n}(0) \cdot \mathbf {n}(t)\, \mathrm{d}t>0\). Let \(\mathbf {n}:{\mathbb {R}}\rightarrow \mathbb {S}^{d-1}\) be strictly 1-Lipschitz and \(2\pi \)-periodic. Then all points \(\mathbf {n}(t)\) lay in a common strict hemisphere. In particular \(\mathbf {0} \notin \text {Hull}\{\mathbf {n}(t)\;|\; t \in [0,2\pi ] \}\).

Proof

The Lipschitzness assumption implies \(\mathbf {n}(0) \cdot \mathbf {n}(t) > \cos (t)\) for all \(t \in (0,\pi ]\) so \(\int _{0}^{\pi } \mathbf {n}(0) \cdot \mathbf {n}(t)\, \mathrm{d}t>0\).

Let \(\mathbf {n}:{\mathbb {R}}\rightarrow \mathbb {S}^{d-1}\) be strictly 1-Lipschitz and \(2\pi \)-periodic. Set \(\mathbf {M}:= \int _{0}^{2\pi }\mathbf {n}(t)\, \mathrm{d}t\). Then for any \(t_0 \in [0,2\pi ]\) one has by the two assumptions

$$\begin{aligned} \mathbf {n}(t_0) \cdot \mathbf {M}= & {} \int \limits _{0}^{\pi } \mathbf {n}(t_0) \cdot \mathbf {n}(t_0+t) \,\mathrm{d}t\\&+ \int \limits _{0}^{\pi } \mathbf {n}(t_0) \cdot \mathbf {n}(t_0-t) \,\mathrm{d}t >0, \end{aligned}$$

so for all \(t_0\), we have \(\mathbf {n}(t_0) \in \{\mathbf {n} \in \mathbb {S}^{d-1} \;|\; \mathbf {n} \cdot \mathbf {M} >0\}\). \(\square \)

Now statements (12) and (13) on the non-local-controllability of \(({\mathbb {M}}, d_{\mathcal {F}^+_0})\) are shown in two steps.

Step 1: we show in the case of a constant cost function \(\mathcal {C}_{2}=\delta \) one has \(\limsup \limits _{{\mathbf {p}}' \rightarrow {\mathbf {p}}} d_{\mathcal {F}^+_0}({\mathbf {p}},{\mathbf {p}}') \le 2\pi \delta \), for any \({\mathbf {p}}\in {\mathbb {M}}\). Indeed, one can design an admissible curve in \((\mathbb {M}, \mathcal {F}_{0}^+)\) as the concatenation of an in-place rotation, a straight line and an in-place rotation. The length of the straight line is \(\mathcal {O}(\Vert {\mathbf {p}}'-{\mathbf {p}}\Vert )\) and vanishes when \({\mathbf {p}}' \rightarrow {\mathbf {p}}\), and the in-place rotations each have maximum cost \(\pi \delta \).

Step 2: we prove the lower bound \(\lim \limits _{\mu \downarrow 0} d_{\mathcal {F}_0^+}(({{\mathbf {x}}},\mathbf {n}),({{\mathbf {x}}}-\mu \mathbf {n},\mathbf {n})) \ge 2 \pi \delta \), for any \(({{\mathbf {x}}},\mathbf {n})\in {\mathbb {M}}\). This and the above-established upper bound implies the required result. As \(\mathcal {C}_{1}, \mathcal {C}_{2} \ge \delta \), we can restrict ourselves to the case of uniform cost \(\mathcal {C}_{1}=\mathcal {C}_{2}=\delta =1\) and just show equality (13), as estimate (12) follows by scaling with \(\delta \).

Consider a Lipschitz regular path \(\gamma (t) = ({{\mathbf {x}}}(t), \mathbf {n}(t))\), with \(\dot{{{\mathbf {x}}}} \propto \mathbf {n}\text { and } \dot{{{\mathbf {x}}}} \cdot \mathbf {n}\ge 0\), from \(({{\mathbf {x}}},\mathbf {n})\) to \(({{\mathbf {x}}}-\mu \mathbf {n},\mathbf {n})\). Then

$$\begin{aligned} \mathbf {0}= \mu \mathbf {n}+ \int _0^1 \dot{{{\mathbf {x}}}}(t) \mathrm{d}t = \mu \mathbf {n}(0) +\int _0^1 \Vert \dot{{{\mathbf {x}}}}(t)\Vert \mathbf {n}(t) \mathrm{d}t, \end{aligned}$$

so \(\mathbf {0} \in {{\mathrm{Hull}}}\{\mathbf {n}(t);\, 0 \le t \le 1\}\). Let \(\mathbf {m}:[0,1] \rightarrow \mathbb {S}^{d-1}\) be a constant speed parameterization of \(\mathbf {n}\). Let \(\tilde{\mathbf {m}}: {\mathbb {R}}\rightarrow \mathbb {S}^{d-1}\) be defined by \(\tilde{\mathbf {m}}(2\pi t)=\mathbf {m}(t)\) for all \(t \in [0,2\pi ]\) and extended by \(2\pi \)-periodicity. If \(\tilde{\mathbf {m}}(\cdot )\) were strictly 1-Lipschitz, then by Lemma 2 we would get \(\mathbf {0} \notin \text {Hull}\{\tilde{\mathbf {m}}(t)\;|\; t \in [0,2\pi ] \}=\text {Hull}\{\mathbf {n}(t)\;|\; t \in [0,1]\}\) and a contradiction. Hence there exists a \(t_0 \in {\mathbb {R}}\) such that \(\Vert \dot{\tilde{\mathbf {m}}}(t_0)\Vert \ge 1\) and via the constant speed parameterization assumption we get the required coercivity:

$$\begin{aligned} \begin{aligned}&1 \le \Vert \dot{\tilde{\mathbf {m}}}(t_0)\Vert = \frac{1}{2\pi } \int _0^1 \Vert \dot{\mathbf {n}}(t)\Vert \, \mathrm{d}t \Rightarrow \\&\int _{0}^{1} \mathcal {F}_{0}^{+}(\gamma (t),\dot{\gamma }(t))\, \mathrm{d}t \ge \int _{0}^{1} \mathcal {C}_{2}(\gamma (t))\, \Vert \dot{\mathbf {n}}(t)\Vert \, \mathrm{d}t \ge 2\pi \delta . \end{aligned} \end{aligned}$$
Fig. 9
figure 9

The development of spheres centered around \(\mathbf {e}=(0,0,0)\) with increasing radius R. A The normal SR spheres on \(\mathbb {M}\) given by \(\{\mathbf {p} \in \mathbb {M} \;|\; d_{\mathcal {F}_0}(\mathbf {p},\mathbf {e})=R\}\) where the folds reflect the 1st Maxwell sets [8, 52]. B The SR spheres with identification of antipodal points given by \(\left\{ \mathbf {p} \in \mathbb {M} \;|\; \min \{\; d_{\mathcal {F}_0}(\mathbf {p},\mathbf {e}), d_{\mathcal {F}_0}(\mathbf {p}+(0,0,\pi ),\mathbf {e})\; \}=R\right\} \) with additional folds (1st Maxwell sets) due to \(\pi \)-symmetry. C The asymmetric Finsler norm spheres given by \(\{\mathbf {p} \in \mathbb {M} \;|\; d_{\mathcal {F}_0^+}(\mathbf {p},\mathbf {e})=R\}\) visualized from two perspectives with extra folds (1st Maxwell sets) at the back \((-\mu ,0,0)\). The black dots indicate points with two folds. In the case of B, this is a Maxwell point with 4 geodesics merging. In the case of C, this is just the origin itself reached from behind at \(R=2\pi \), recall Lemma 3. Although not depicted here, if the radius \(R>2\pi \) the origin becomes an interior point of the corresponding ball

To prove local controllability of the model \(({\mathbb {M}}, d_{\mathcal {F}_0})\), we apply the logarithmic approximation for weighted subcoercive operators on Lie groups, cf. [57] applied to the Lie group \(SE(d)={\mathbb {R}}^{d} \rtimes SO(d)\), in which the space of positions and orientations is placed via a Lie group quotient \(SE(d)/(\{0\} \times SO(d-1))\). One obtains a sharp estimate,Footnote 7 where the weights of allowable (horizontal) vector fields are 1, whereas the remaining spatial vector fields orthogonal to \(\mathbf {n} \cdot \nabla _{{\mathbb {R}}^d}\) get weight 2, as they follow by a single commutator of allowable vector fields, see, e.g., [24, 25]. Relaxing all spatial weights to 2 and continuity of costs \(\mathcal {C}_{1}, \mathcal {C}_{2}\), yields (14). \(\square \)

Remark 10

In view of the above one might expect that the point \((\mathbf {x} - \mu \mathbf {n}, \mathbf {n})\) is reached by a geodesic that consists of a concatenation of 1. an in-place rotation by \(\pi \), 2. a straight line, 3. an in-place rotation by \(\pi \). However, this is not the case as can be observed in the very lower left corner in Fig. 8, where the two minimizing red curves show a very different behavior. This is explained by the next lemma.

Lemma 3

Let \(\mu >0\), and \(\mathcal {C}_{1}=\mathcal {C}_{2}=\delta \). Let \({{\mathbf {R}}}_\theta \) denote the (counter-clockwise) rotation matrix about the origin by angle \(\theta \). The endpoint \(({{\mathbf {x}}}-\mu \mathbf {n},\mathbf {n})\) for each \(\mu \ge 0\) is a Maxwell point w.r.t. \(({{\mathbf {x}}},\mathbf {n})\), since there are two minimizing geodesics in \((\mathbb {M},d_{\mathcal {F}_{0}^+})\) that are a concatenation

  1. 1.

    an in-place rotation from \(({{\mathbf {x}}},\mathbf {n})\) to \(({{\mathbf {x}}},{{\mathbf {R}}}_{\pm \frac{\pi }{2}}\mathbf {n})\),

  2. 2.

    a full U-curve, see [44], departing from and ending in a cusp from \(({{\mathbf {x}}},{{\mathbf {R}}}_{\pm \frac{\pi }{2}}\mathbf {n})\) to \(({{\mathbf {x}}}- \mu \mathbf {n},{{\mathbf {R}}}_{\mp \frac{\pi }{2}}\mathbf {n})\),

  3. 3.

    an in-place rotation from \(({{\mathbf {x}}}- \mu \mathbf {n},{{\mathbf {R}}}_{\mp \frac{\pi }{2}}\mathbf {n})\) to \(({{\mathbf {x}}}-\mu \mathbf {n},\mathbf {n})\).

We have the limit \(\lim \limits _{\mu \downarrow 0} d_{\mathcal {F}_{0}^+}(({{\mathbf {x}}},\mathbf {n}),({{\mathbf {x}}}-\mu \mathbf {n},\mathbf {n}))=2\pi \delta \).

Proof

See [27]. \(\square \)

Remark 11

Consider the case \(d=2\), \(\mathcal {C}_{1}=\mathcal {C}_{2}=\delta \), and source point \(\mathbf {p}_S=(\mathbf {x},\mathbf {n})=\mathbf {e}=(0,0,\theta =0)\). The endpoints \(({{\mathbf {x}}}-\mu \mathbf {n},\mathbf {n})=(-\mu ,0,0)\), with \(\mu >0\) sufficiently small, are 1st Maxwell points in \((\mathbb {M},d_{\mathcal {F}_{0}^+})\) where geodesically equidistant wavefronts departing from the source point collide for the first time, see Fig. 9C. The distance mapping \(d_{\mathcal {F}_{0}}^{+}({\mathbf {p}}_{S}, \cdot )\) is not continuous, but the asymmetric distance spheres

$$\begin{aligned} \mathcal {S}_R:=\{\mathbf {p} \in \mathbb {M} \;|\; d_{\mathcal {F}_{0}}^{+}({\mathbf {p}}_{S}, {\mathbf {p}})=R\} \end{aligned}$$

are connected and compact, and they collide at \(R=2\pi \) in such a way that the origin \({\mathbf {p}}_{s}\) becomes an interior point in the asymmetric balls of radius \(R> 2\pi \).

4 Cusps and Keypoints: Proof of Theorem 3

In this section we provide a proof of Theorem 3 on the occurrence of cusps and keypoints. For the uniform cost case \(\mathcal {C}_{1}=\mathcal {C}_{2}=1\) for \(d=2\), our curve optimization problem (1) \((\mathbb {M},d_{\mathcal {F}_{0}})\) in consideration boils down to a standard left-invariant curve optimization in the roto-translation group \(SE(2)={\mathbb {R}}^{2} \rtimes SO(2)\). As we will apply tools from previous works [10, 11, 22, 52], we will make use of the following notations for expansionFootnote 8 of velocity and momentum in the left-invariant (co)-frame:

$$\begin{aligned} \begin{array}{c} \left\{ \begin{array}{l} \mathcal {A}_{1}:= \cos \theta \, \partial _{x} + \sin \theta \, \partial _y, \\ \mathcal {A}_{2}:= -\sin \theta \, \partial _x + \cos \theta \, \partial _y, \\ \mathcal {A}_{3}:=\partial _{\theta }, \end{array} \right. \\ \left\{ \begin{array}{l} \omega ^{1} := \cos \theta \, \mathrm{d}x + \sin \theta \, \mathrm{d}y, \\ \omega ^{2} := -\sin \theta \, \mathrm{d}x + \cos \theta \, \mathrm{d}y, \\ \omega ^{3} := \mathrm{d}\theta , \end{array} \right. \\ \dot{\gamma }(t)= \sum \limits _{i=1}^{3} u^{i}(t) \, \left. \mathcal {A}_{i}\right| _{\gamma (t)} \in T_{\gamma (t)}(\mathbb {M}), \\ \hat{{\mathbf {p}}}(t)= \sum \limits _{i=1}^{3} \hat{p}_{i}(t) \, \left. \omega ^{i}\right| _{\gamma (t)} \in T_{\gamma (t)}^*(\mathbb {M}), \end{array} \end{aligned}$$
(33)

where the indexing of the left-invariant frame is different here, in order to stick to the ordering \((x,y,\theta )\) applied in this article. Note that for the case \(\varepsilon =0\) admissible smooth curves \(\gamma \) in \((\mathbb {M}, d_{\mathcal {F}_{0}})\) satisfy the horizontality constraint \(\dot{\gamma }(t) \in \text {Span}\{\left. \mathcal {A}_{1}\right| _{\gamma (t)},\left. \mathcal {A}_{3}\right| _{\gamma (t)}\}\).

Proof of the statements regarding cusps

  • We can describe our curve optimization problem (1) using a Hamiltonian formalism, with Hamiltonian \(H(\hat{{\mathbf {p}}})= \frac{1}{2}\left( \hat{p}_{1}^2 + \hat{p}_{3}^2 \right) = \frac{1}{2}\) [44]. By Pontryagin’s maximum principle, geodesics adhere to the following Hamilton equations:

    $$\begin{aligned} \left\{ \begin{array}{l} \dot{p}_{1}=u^1= \hat{p}_{1}, \\ \dot{p}_{2}=u^2= 0, \\ \dot{p}_{3}=u^3=\hat{p}_{3}, \end{array} \right. , \ \left\{ \begin{array}{l} \frac{d \hat{p}_{1}}{dt}= \hat{p}_{2} \hat{p}_{3}, \\ \frac{d \hat{p}_{2}}{dt}= -\hat{p}_{1} \hat{p}_{3}, \\ \frac{d \hat{p}_{3}}{dt}= -\hat{p}_{1} \hat{p}_{2}. \end{array} \right. \end{aligned}$$
    (34)

For fixed initial momentum \(\hat{{\mathbf {p}}}(0)\), this uniquely determines a SR geodesic. Moreover, SR geodesics are contained within the (co-adjoint) orbits

$$\begin{aligned} (\hat{p}_{1}(t))^2 + (\hat{p}_{2}(t))^2=(\hat{p}_{1}(0))^2 + (\hat{p}_{2}(0))^2. \end{aligned}$$
(35)

The parameter t in system (34) is SR arc length, but by reparameterizing (possible as long as \(u^1\) does not change sign) to spatial arc length parameter s, with \(\frac{ds}{dt} = \hat{p}_1\), we get a partially linear system. Combining (34) and (35), we find orbits in the (hyperbolic) phase portrait induced by

$$\begin{aligned} \begin{array}{l} \left\{ \! \begin{aligned} \hat{p}_{2}'(s)= -\hat{p}_{3} \\ \hat{p}_{3}'(s)= -\hat{p}_{2} \end{aligned} \right. \Rightarrow \! \left\{ \!\begin{aligned} \hat{p}_2(s) &{}= \hat{p}_2(0) \cosh {s} - \hat{p}_3(0) \sinh {s}\\ \hat{p}_3(s) &{}= -\hat{p}_2(0) \sinh {s} + \hat{p}_3(0) \cosh {s}. \end{aligned}\right. \end{array} \end{aligned}$$

Hence \(|\hat{p}_3(s)| = 1\) always has a solution for some finite (possibly negative) s, except when \(\hat{p}^{2}(0)=\hat{p}_{3}(0)=0\), in which case the solutions are straight lines. Preservation of the Hamiltonian then implies \(\hat{p}_1(s) = u^1(s) = \tilde{u}(s)=0\). We conclude that every SR geodesic (with unconstrained time \(t \in {\mathbb {R}}\)) in \((\mathbb {M},d_{\mathcal {F}_0})\) which is not a straight line admits a cusp.

  • We now consider \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\), \(\varepsilon > 0\). To have a cusp, we need \(\hat{p}_1(t) = \hat{p}_2(t) = 0\) for some \(t \in {\mathbb {R}}\). Co-adjoint orbit condition (35) then implies that \(\hat{p}_1(t) = \hat{p}_2(t) = 0\) for all t, corresponding to a vertical geodesic that has purely angular momentum and no cusp. The same argument holds for \(({\mathbb {M}},d_{\mathcal {F}_\varepsilon ^+})\). In \(({\mathbb {M}},d_{\mathcal {F}_0^+})\) we have the condition that \(u^1 \ge 0\); hence, by definition it can never switch sign and all geodesics are cuspless.

Proof of the statements regarding keypoints

  • For the cases \(({\mathbb {M}},d_{\mathcal {F}_\varepsilon })\) and \(({\mathbb {M}},d_{\mathcal {F}_\varepsilon ^+})\) with \(\varepsilon > 0\) we can use the same line of arguments as above. Also here both spatial controls have to vanish, resulting in vertical geodesics. The spatial projection of such curves is a single keypoint. For \(({\mathbb {M}},d_{\mathcal {F}_0})\) we rely on the result that SR geodesics are analytical, and therefore if the control \(u^1(t) = 0\) for some open time interval \((t_0,t_1)\), then \(u^1(t) = 0\) for all \(t \in {\mathbb {R}}\), again corresponding to purely angular motion.

  • Geodesics in \((\mathbb {M},d_{\mathcal {F}_0^+})\) can have keypoints only at the boundaries. Suppose a geodesic \(\gamma :[0,1] \rightarrow \mathbb {M}\) in \((\mathbb {M},d_{\mathcal {F}_0^+})\) has an internal keypoint, with a corner of angle \(\delta >0\), at internal time \(T_{1} \in (0,1)\). Then one can create a local shortcut with a straight line segment connecting two sufficiently close points before and after the corner with two in-place rotations whose angles add up to \(\delta \). With a suitable mollifier this shortcut can be approximated by a curve in \(\varGamma \). For details see similar arguments in [11].

Next we explain the cases (A), (B) and (C), where we fix initial point \(\gamma (0)=\mathbf {e}=(0,0,0)\).

  1. (A)

    Suppose that the endpoint and \(x \ge 0\). Then \(\mathbf {p}\) can already be reached by a geodesic in \((\mathbb {M},d_{\mathcal {F}_0})\), and the positivity constraint (i.e., no reverse gear), which can only increase length, becomes obsolete.

  2. (B)

    Now suppose the endpoint \(\mathbf {p}=(x,y,\theta )\) lays in the half-space \(x<0\). Then by the half-space property of geodesics in \((\mathbb {M},d_{\mathcal {F}_0})\), cf. [22, Theorem 7], the geodesic in \((\mathbb {M},d_{\mathcal {F}_0^+})\) must have a keypoint. By the preceding keypoints can only be located at the boundaries. If it takes place at the endpoint only, then still the constraint \(x<0\) is not satisfied; thereby, it must take place at the origin.

  3. (C)

    In those cases the endpoint \(\mathbf {p}\) lays outside the connected cone of reachable angles, which are by [22, Theorem 9] bounded (for those endpoints) by geodesics ending in a cusp (so not endpoints of geodesics starting at a cusp). So for those points, minimizing geodesics will first move by an in-place rotation (along a spherical geodesic) until it hits the cusp surface , after which it is traced back to the origin by a regular geodesic with strictly positive spatial control inside the volume .

5 Eikonal Equations and Backtracking: Proof of Proposition 1, Corollary 1 and Theorem 4

First we shall prove Proposition 1, regarding the duals of \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\), and Corollary 1, providing explicit expressions for the corresponding eikonal equations. To this end we need a basic lemma on computing dual norms on \({\mathbb {R}}^{n}\), where later we will set \(n = 2d - 1 = \mathrm {dim}({\mathbb {M}})\).

Lemma 4

Let \(\mathbf {w} \in {\mathbb {R}}^{n}\) and let \(M \in {\mathbb {R}}^{n\times n}\) be symmetric, positive definite. Define the norm \(F_{M, \mathbf {w}}:{\mathbb {R}}^{n} \rightarrow {\mathbb {R}}^+\) by

$$\begin{aligned} F_{M, \mathbf {w}}(\mathbf {v})= \sqrt{(M\mathbf {v},\mathbf {v}) + (\mathbf {w},\mathbf {v})_{-}^2}. \end{aligned}$$

Then its dual norm \(F_{M, \mathbf {w}}^*: ({\mathbb {R}}^{n})^* \rightarrow {\mathbb {R}}^+\) equals

$$\begin{aligned} F_{M, \mathbf {w}}^*(\hat{\mathbf {v}})= \sqrt{(\hat{\mathbf {v}},\hat{M}\hat{\mathbf {v}}) + (\hat{\mathbf {v}},\hat{\mathbf {w}})_{+}^2}, \end{aligned}$$
(36)

with \(\hat{M}=(M + \mathbf {w} \otimes \mathbf {w})^{-1}\) and \(\hat{\mathbf {w}}=\frac{M^{-1}\mathbf {w}}{\sqrt{1+ (\mathbf {w}, M^{-1}\mathbf {w})}}\).

Proof

For \(n=1\) the result is readily verified, and for \(\mathbf {w}=\mathbf {0}\) the result is classical. We next turn to the special case \(M= {{\mathrm{Id}}}\), and \({\mathbf {w}}= (w_1, \mathbf {0}_{\mathbb {R}^{n-1}})\) is zero except maybe for its first coordinate \(w_1\). Thus for any \({\mathbf {v}}= (v_1,{\mathbf {v}}_2) \in \mathbb {R}^n = \mathbb {R}\times \mathbb {R}^{n-1}\) one has the splitting

$$\begin{aligned} \begin{aligned} F_{M,{\mathbf {w}}}(v_1, {\mathbf {v}}_2)^2 =&\left( |v_1|^2 + ( w_1 v_1)_-^2\right) + \Vert {\mathbf {v}}_2\Vert ^2 \\:=&F_1(v_1)^2+F_2({\mathbf {v}}_2)^2. \end{aligned} \end{aligned}$$
(37)

Using the compatibility of norm duality with such splittings, and the special cases \(n=1\) and \({\mathbf {w}}=0\) mentioned above, we obtain

$$\begin{aligned} \begin{aligned} (F_{M, {\mathbf {w}}}^*(\hat{v}_1, \hat{{\mathbf {v}}}_2))^2&= (F_1^*(\hat{v}_1))^2+ (F_2^*(\hat{{\mathbf {v}}}_2))^2 \\&=\frac{|\hat{v}_1|^2+ (w_1\hat{v}_1)_+^2}{1+|w_1|^2} + \Vert \hat{{\mathbf {v}}}_2\Vert ^2, \end{aligned} \end{aligned}$$

which is exactly of form (36). The general case for arbitrary \({\mathbf {w}}\) and symmetric positive definite M follows from affine invariance. Indeed let A be an invertible \(n \times n\) matrix, and let \(M' = A^{\mathrm T}M A\) and \({\mathbf {w}}' = A^{\mathrm T}{\mathbf {w}}\). Let \(F = F_{M,{\mathbf {w}}}\) and \(F'=F_{M',{\mathbf {w}}'}\), so that \(F'({\mathbf {v}})=F(A{\mathbf {v}})\) for all \({\mathbf {v}}\in \mathbb {R}^n\). Let \(F^*\), \(\hat{M}\), \(\hat{\mathbf {w}}\) and \(F'^*\), \(\hat{M}'\), \(\hat{\mathbf {w}}'\), be, respectively, the dual norms and the matrices defined by the explicit formulas above. Then denoting \(B := (A^{\mathrm T})^{-1}\) one has by the definition of dual norms that \(F'^*(\hat{{\mathbf {v}}}) = F^*(B \hat{{\mathbf {v}}})\) for all \(\hat{{\mathbf {v}}}\in \mathbb {R}^n\) and by the explicit formulas \(\hat{M}' = B^{\mathrm T}\hat{M} B\), \({\mathbf {w}}' = B^{\mathrm T}{\mathbf {w}}\). Thus, \(F^*=F^*_{M, {\mathbf {w}}}\) holds if and only if \(F'^* = F^*_{M', {\mathbf {w}}'}\). Since for any \(M, {\mathbf {w}}\), there exists a linear change of variables A such that \(M'={{\mathrm{Id}}}\) and \({\mathbf {w}}'\) is zero except maybe for its first coordinate, the proof is complete. \(\square \)

Now Proposition 1 follows from Lemma 4 by writing out the dual norm, using for each \({\mathbf {p}}\in {\mathbb {M}}\):

$$\begin{aligned} \begin{aligned} M_{{\mathbf {p}}}&= (\mathcal {C}_{1}({\mathbf {p}}))^2 (D_{\mathbf {n}}^{\varepsilon })^{-1} \oplus (\mathcal {C}_{2}({\mathbf {p}}))^2 I_{d} \quad \text { and } \\ \mathbf {w}_{{\mathbf {p}}}&= \left\{ \begin{array}{ll} \mathcal {C}_1({\mathbf {p}}) \sqrt{\varepsilon ^{-2}-1} \; (\mathbf {n},\mathbf {0}), \quad &{} \text {for } \mathcal {F}_\varepsilon ^+, \\ \mathbf {0}, \quad &{}\text {for } \mathcal {F}_\varepsilon , \end{array}\right. \end{aligned} \end{aligned}$$
(38)

with \(D_\mathbf {n}^\varepsilon \) as in (27). Corollary 1 then follows by setting the momentum covector \(\hat{{\mathbf {p}}}=\mathrm{d}U({\mathbf {p}})\) equal to the derivative of the value function evaluated at \({\mathbf {p}}\).

Now that we have derived the eikonal equations, we obtain backtracking Theorem 4 by Proposition 4 in ‘Appendix B’ section, which shows us that level sets of solutions of the eikonal equations are geodesically equidistant surfaces and that geodesics are found by an intrinsic gradient descent.

However, to obtain the explicit backtracking formulas we differentiate the Hamiltonian, rather than the dual metric, which is equivalent thanks to (57) (in Remark 14 in ‘Appendix B’ section). We focus below on the model \(({\mathbb {M}},d_{\mathcal {F}_\varepsilon ^+})\) without reverse gear, since the other case is similar. Let \({\mathbf {p}}\in {\mathbb {M}}\), let \(F := \mathcal {F}_\varepsilon ^+({\mathbf {p}}, \cdot )\), and let \(\hat{{\mathbf {p}}}= (\hat{{{\mathbf {x}}}}, \hat{\mathbf {n}}) \in T^*_{\mathbf {p}}({\mathbb {M}})\). Then differentiating w.r.t. \(\hat{\mathbf {n}}\) we obtain

$$\begin{aligned} \mathrm{d}_{\hat{\mathbf {n}}} F^*(\hat{{{\mathbf {x}}}}, \hat{\mathbf {n}})^2 = \mathcal {C}_2({\mathbf {p}})^{-2} \, \mathrm{d}_{\hat{\mathbf {n}}} \Vert \hat{\mathbf {n}}\Vert ^2 = 2 \, \mathcal {C}_2({\mathbf {p}})^{-2} \hat{\mathbf {n}}, \end{aligned}$$

where \(\Vert {\cdot }\Vert \) is the Riemannian metric induced by the embedding \(\mathbb {S}^{d-1} \subset \mathbb {R}^d\). Differentiating w.r.t. \(\hat{{{\mathbf {x}}}}\) we obtain

$$\begin{aligned} \mathrm{d}_{\hat{{{\mathbf {x}}}}} F^*(\hat{{{\mathbf {x}}}}, \hat{\mathbf {n}})^2&= \mathcal {C}_1({\mathbf {p}})^{-2} \, \mathrm{d}_{\hat{{{\mathbf {x}}}}} ( \hat{{{\mathbf {x}}}}\cdot D_\mathbf {n}^\varepsilon \hat{{{\mathbf {x}}}}-(1-\varepsilon ^2) (\hat{{{\mathbf {x}}}}\cdot \mathbf {n})^2_-) \nonumber \\&= 2 \, \mathcal {C}_1({\mathbf {p}})^{-2} {\left\{ \begin{array}{ll} D_\mathbf {n}^\varepsilon \hat{{{\mathbf {x}}}}&{} \text {if } \hat{{{\mathbf {x}}}}\cdot \mathbf {n}\ge 0,\\ \varepsilon ^2 {{\mathrm{Id}}}\hat{{{\mathbf {x}}}}&{} \text {if } \hat{{{\mathbf {x}}}}\cdot \mathbf {n}\le 0. \end{array}\right. } \end{aligned}$$
(39)

Announced result (31), which is equivalent to its more concise abstract form (28), follows by choosing \(\hat{{{\mathbf {x}}}}:= \nabla _{\mathbb {R}^d} U(\gamma (t))\) and \(\hat{\mathbf {n}}:= \nabla _{\mathbb {S}^{d-1}} U(\gamma (t))\) and a basic re-scaling \([0,L]\in t \mapsto t/L \in [0,1]\). \(\square \)

Remark 12

The computation of the dual norms can be simplified by expressing velocity (entering the Finsler metric) and momentum (entering the dual metric) in a (left-invariant) local, orthogonal, moving frame of reference, attached to the point \({\mathbf {p}}= ({{\mathbf {x}}},\mathbf {n}) \in {\mathbb {M}}\):

$$\begin{aligned} \dot{\mathbf {p}}= \sum \limits _{i=1}^{2d-1} u^{i} \left. \mathcal {A}_{i} \right| _{{\mathbf {p}}}, \ \ \hat{\mathbf {p}}= \sum \limits _{i=1}^{2d-1} \hat{p}_{i} \, \left. \omega ^{i}\right| _{\mathbf {p}} \end{aligned}$$
(40)

where a moving frame of reference is chosen such that

$$\begin{aligned} \left\{ \begin{array}{l} u^{d}= \tilde{u} = \mathbf {n}\cdot \dot{\mathbf {x}}, \\ \sum \limits _{i=1}^{d-1} (u^{i})^2 = \Vert \dot{\mathbf {x}}\Vert ^{2}- (\mathbf {n} \cdot \dot{\mathbf {x}})^2, \\ \sum \limits _{i=1}^{d-1} (u^{d+i})^2 = \Vert \dot{\mathbf {n}}\Vert ^{2}, \end{array} \right. \end{aligned}$$

inducing a corresponding dual frame \(\{\left. \omega ^{i}\right| _{\mathbf {p}}\}\) via

$$\begin{aligned} \langle \left. \omega ^{i}\right| _{{\mathbf {p}}}, \left. \mathcal {A}_{j}\right| _{{\mathbf {p}}} \rangle =\delta ^{i}_{j}, \text { for all } i,j=1,\ldots ,2d-1. \end{aligned}$$
(41)

w.r.t. the left-invariant frame the matrices \(D_\mathbf {n}^\varepsilon \), \(M_{\mathbf {p}}\) as in (38) and \(\hat{M_{\mathbf {p}}}\) all become diagonal matrices, and the dual can be computed straightforwardly. Furthermore, in this formulation we can see from the expression for the dual \((\mathcal {F}_0^+)^*\), i.e., in the limit \(\varepsilon \downarrow 0\), that the positive spatial control \(u^d\) constraint results in a positive momentum \(\hat{p}_d\) constraint:

$$\begin{aligned} (\mathcal {F}_0^+)^*({\mathbf {p}},\hat{{\mathbf {p}}}) = \sqrt{\frac{(\hat{p}_d)_+^2}{\mathcal {C}_1^{2}({\mathbf {p}})} + \frac{1}{\mathcal {C}_2^{2}({\mathbf {p}})} \sum _{i = d+1}^{2d-1} (\hat{p}_i)^2}. \end{aligned}$$
(42)

Therefore the eikonal equation in the positive control model \(({\mathbb {M}}, d_{\mathcal {F}_0^+})\) is simply given by

$$\begin{aligned} \sqrt{ \frac{\Vert \nabla _{\mathbb {S}^{d\!-\!1}}U({\mathbf {p}})\Vert ^2}{\mathcal {C}^{2}_{2}({\mathbf {p}})} + \frac{ ((\mathbf {n} \cdot \nabla _{{\mathbb {R}}^{d}}U({\mathbf {p}}))_+)^2}{\mathcal {C}^{2}_{1}({\mathbf {p}})} } = 1 \end{aligned}$$
(43)

6 Discretization of the Eikonal PDEs

6.1 Causal Operators and the Fast-Marching Algorithm

The fast-marching algorithm is an efficient numerical method [59] for numerically solving static first-order Hamilton–Jacobi–Bellman (or simply eikonal) PDE (5) which characterizes the distance map U to a fixed source point \({\mathbf {p}}_\mathrm{S}\). Fast marching is tightly connected with Dijkstra’s algorithm on graphs, and in particular it shares the \(\mathcal {O}(K N \ln N)\) complexity, where \(N = \#(X)\) is the cardinality of the discrete domain \(X \subset {\mathbb {M}}\), \(X\ni {\mathbf {p}}_\mathrm{S}\), and K is the average number of neighbors for each point. Both fast-marching and Dijkstra’s algorithms can be regarded as specialized solvers of nonlinear fixed point systems of equations \(\varLambda u = u\), where the unknown \(u\in \mathbb {R}^X\) is a discrete map representing the front arrival times, which rely on the a-priori assumption that the operator \(\varLambda :\mathbb {R}^X \rightarrow \mathbb {R}^X\) is causal (and monotone, but this second assumption is not discussed here). Causality informally means that the estimated front arrival time \(\varLambda u({\mathbf {p}})\) at a point \({\mathbf {p}}\in X\) depends on the given arrival times \(u({\mathbf {q}})\), \({\mathbf {q}}\in X\), prior to \(\varLambda u({\mathbf {p}})\), but not on the simultaneous or the future ones. Formally, one requires that for any \(u,v\in \mathbb {R}^X\), \(t\in \mathbb {R}\):

$$\begin{aligned} \begin{aligned}&\text {If } u^{<t}=v^{<t} \text { then } (\varLambda u)^{\le t} = (\varLambda v)^{\le t}, \\&\quad \text {where } u^{<t}({\mathbf {p}}) := {\left\{ \begin{array}{ll} u({\mathbf {p}}) &{} \text {if } u({\mathbf {p}})<t,\\ +\infty &{} \text {otherwise}, \end{array}\right. } \end{aligned} \end{aligned}$$
(44)

and \(v^{<t}\), \((\varLambda u)^{\le t}\) and \((\varLambda v)^{\le t}\) are defined similarly.

A semi-Lagrangian scheme We implemented two discretizations of eikonal equation (5) which benefit from the causality property. The first one is a semi-Lagrangian scheme, inspired by Bellman’s optimality principle which informally states that any subpolicy of an optimal policy is an optimal policy. Formally, let \(\mathcal {F}\) be a Finsler metric, and let \(U := d_\mathcal {F}(\cdot , {\mathbf {p}}_\mathrm{S})\) be defined as the distance to a given source point \({\mathbf {p}}_\mathrm{S}\). Then for any \({\mathbf {p}}\in {\mathbb {M}}\) and any neighborhood V of \({\mathbf {p}}\) not containing \({\mathbf {p}}_\mathrm{S}\) one has the property

$$\begin{aligned} U({\mathbf {p}}) := \min _{{\mathbf {q}}\in \partial V} d_\mathcal {F}({\mathbf {p}},{\mathbf {q}}) + U({\mathbf {q}}). \end{aligned}$$
(45)

In the spirit of [55, 59] we discretize (45) by introducing for each interior \({\mathbf {p}}\in X {\setminus }\{{\mathbf {p}}_\mathrm{S}\}\) a small polygonal neighborhood \(V({\mathbf {p}})\), which vertices belong to the discrete point set X. The nonlinear operator \(\varLambda \) is defined as

$$\begin{aligned} \begin{aligned} \varLambda u({\mathbf {p}}) := \min _{\begin{array}{c} \{{\mathbf {q}}_1, \ldots , {\mathbf {q}}_n\}\\ \text {facet of } \partial V({\mathbf {p}}) \end{array}} \min _{\xi \in \varXi } \mathcal {F}\left( {\mathbf {p}}, \sum _{i=1}^n \xi _i {\mathbf {q}}_i - {\mathbf {p}}\right) \\ + \sum _{i=1}^n \xi _i u({\mathbf {q}}_i), \end{aligned} \end{aligned}$$
(46)

where \(\varXi = \{ \xi \in \mathbb {R}^n_+; \, \sum _{i=1}^n \xi _i=1\}\). In other words, the boundary point \({\mathbf {q}}\in \partial V({\mathbf {p}})\) in (45) is represented in (46) by the barycentric sum \({\mathbf {q}}= \sum _{i=1}^n \xi _i {\mathbf {q}}_i\), the distance \(d_\mathcal {F}({\mathbf {p}}, {\mathbf {q}})\) is approximated with the norm \(\mathcal {F}({\mathbf {p}}, {\mathbf {q}}-{\mathbf {p}})\), and the value \(U({\mathbf {q}})\) is approximated with the interpolation \(\sum _{i=1}^n \xi _i u({\mathbf {q}}_i)\).

Fig. 10
figure 10

Left: stencil used for the metric \(\mathcal {F}_\varepsilon \) on \(\mathbb {R}^2 \times \mathbb {S}^1\), \(\varepsilon =0.1\), obeying the generalized acuteness property required for Bellman-type discretization (46). See also the control sets in Fig. 2. Center: likewise with \(\mathcal {F}_\varepsilon ^+\), \(\varepsilon =0.1\). Right: coarse discretization of \(\mathbb {S}^2\) with 162 vertices, used in some experiments posed on \(\mathbb {R}^3 \times \mathbb {S}^2\). Some acute stencils (in the classical Euclidean sense) shown in color (Color figure online)

Fig. 11
figure 11

Left: slice in \(\mathbb {R}^3\) of control sets (7) for \(\mathcal {F}_\varepsilon \) on \(\mathbb {R}^3\times \mathbb {S}^2\), \(\varepsilon =0.2\), for different orientations of \(\mathbf {n}\). Stencils obeying the generalized acuteness property required for Bellman-type discretizations (46). Right: slice in \(\mathbb {R}^3\) of the control sets for \(\mathcal {F}_\varepsilon ^+\), \(\varepsilon =0.2\). Offsets used for finite differences discretization (49), for four distinct orientations \(\mathbf {n}\)

We refer to [55, 61] for proofs of convergence and for the following essential property: operator (46) obeys causality property (44) iff the chosen stencil \(V({\mathbf {p}})\) obeys the following generalized acuteness property: for any \({\mathbf {q}}, {\mathbf {q}}'\) in a common facet of \(V({\mathbf {p}})\), one has

$$\begin{aligned} \langle d_{\hat{{\mathbf {p}}}} \mathcal {F}({\mathbf {p}}, {\mathbf {q}}-{\mathbf {p}}), {\mathbf {q}}'-{\mathbf {p}}\rangle \ge 0. \end{aligned}$$

For the construction of such stencils \(V({\mathbf {p}})\), \({\mathbf {p}}\in X\), we rely on the previous works [40, 41] and on the following observation: the metrics \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\) associated with the Reeds–Shepp car models can be decomposed as

$$\begin{aligned} \mathcal {F}({\mathbf {p}}, (\dot{{{\mathbf {x}}}}, \dot{\mathbf {n}}))^2 = \mathcal {F}_1({\mathbf {p}}, \dot{{{\mathbf {x}}}})^2 +\mathcal {F}_2({\mathbf {p}}, \dot{\mathbf {n}})^2, \end{aligned}$$
(47)

which allows to build the stencils \(V({\mathbf {p}})\) for \(\mathcal {F}\) by combining, as discussed in [41, p. 9], some lower-dimensional stencils \(V_1({\mathbf {p}})\) and \(V_2({\mathbf {p}})\) built independently for the spatial \({{\mathbf {x}}}\in \mathbb {R}^d\) and spherical \(\mathbf {n}\in \mathbb {S}^{d-1}\) variables.

We discretize \(\mathbb {S}^1\) uniformly, with the standard choice of stencil. We discretize \(\mathbb {S}^2\) by refining uniformly the faces of an icosahedron and projecting their vertices onto the sphere (as performed by the Mathematica\(^{\textregistered }\) Geodesate function). The resulting triangulation only features acute interior angles, in the classical Euclidean sense, and thus provides adequate stencils since in our applications \(\mathcal {F}_2({\mathbf {p}}, \dot{\mathbf {n}}) = \mathcal {C}_2({\mathbf {p}}) \Vert \dot{\mathbf {n}}\Vert \) is proportional to the Euclidean norm, see Fig. 10. We typically use 60 discretization points for \(\mathbb {S}^1\) and from 200 to 2000 points for \(\mathbb {S}^2\).

We discretize \(\mathbb {R}^d\) using the Cartesian grid \(h \mathbb {Z}^d\), where \(h >0\) is the discretization scale. The norm \(\mathcal {F}_{\varepsilon ,1}({\mathbf {p}}, \dot{{{\mathbf {x}}}}) = \mathcal {C}_1({\mathbf {p}}) \sqrt{\dot{{{\mathbf {x}}}}^T (D_\mathbf {n}^{\varepsilon })^{-1} \dot{{{\mathbf {x}}}}}\), recall the notation in (47), that is induced by the approximate Finsler metric \(\mathcal {F}_\varepsilon \) on the physical variables in \(\mathbb {R}^d\), is of Riemannian type and is strongly anisotropic. In dimension \(d \le 3\), this is the adequate setting for the adaptive stencils of [41], built using discrete geometry tools known as lattice basis reduction. The norm \(\mathcal {F}_{\varepsilon ,1}^+({\mathbf {p}}, \dot{{{\mathbf {x}}}}) = \mathcal {C}_1({\mathbf {p}}) \sqrt{\dot{{{\mathbf {x}}}}^T (D_\mathbf {n}^{\varepsilon })^{-1} \dot{{{\mathbf {x}}}}+ (\varepsilon ^{-2}-1)(\mathbf {n},{{\mathbf {x}}})_-^2}\) induced by \(\mathcal {F}_\varepsilon ^+\) on \(\mathbb {R}^d\) is Finslerian (i.e., non-Riemannian) and strongly anisotropic. In dimension \(d=2\), this is the adequate setting for the adaptive stencils of [40], built using an arithmetic object known as the Stern–Brocot tree.

Direct approximation of the Hamiltonian A new approach, not semi-Lagrangian, had to be developed for the Finsler metric \(\mathcal {F}_\varepsilon ^+\) in dimension \(d=3\) due to our failure to construct viable (i.e., with a reasonably small number of reasonably small vertices) stencils obeying the generalized acuteness property in this case, see Fig. 11. For manuscript size reasons, we only describe it informally and postpone proofs of convergence for future work.

Let \(\mathbf {n}\in \mathbb {S}^2\) and let \(\varepsilon >0\) be fixed. Then one can find nonnegative weights and integral vectors \((\rho _i,{\mathbf {w}}_i) \in (\mathbb {R}_+\times \mathbb {Z}^3)^6\), such that for all \({\mathbf {v}}\in \mathbb {R}^3\)

$$\begin{aligned} \sum _{1 \le i \le 6} \rho _i ({\mathbf {w}}_i\cdot {\mathbf {v}})^2 = (\mathbf {n}\cdot {\mathbf {v}})^2 + \varepsilon ^2 \Vert \mathbf {n}\times {\mathbf {v}}\Vert ^2. \end{aligned}$$
(48)

A simple and efficient construction of \((\rho _i, {\mathbf {w}}_i)_{i=1}^6\), relying on the concept of obtuse superbase of a lattice, is in [29] described and used to discretize anisotropic diffusion PDEs. One may furthermore assume that \((\mathbf {n}, {\mathbf {w}}_i) \ge 0\) for all \(1 \le i \le 6\), up to replacing \({\mathbf {w}}_i\) with its opposite. Then

$$\begin{aligned} \begin{array}{l} \sum _{1 \le i \le 6} \rho _i ({\mathbf {w}}_i\cdot {\mathbf {v}})_+^2 \approx (\mathbf {n}\cdot {\mathbf {v}})_+^2,\\ (\mathbf {n}\cdot \nabla _{\mathbb {R}^3} U({\mathbf {p}}))_+^2 \approx \\ \frac{1}{h^2}\sum _{i=1}^6 \rho _i (U({{\mathbf {x}}},\mathbf {n})-U({{\mathbf {x}}}-h{\mathbf {w}}_i,\mathbf {n}))_+^2, \end{array} \end{aligned}$$
(49)

up to, respectively, an \(\mathcal {O}(\varepsilon ^2)\Vert {\mathbf {v}}\Vert ^2\) and \(\mathcal {O}(\varepsilon ^2+h)\) error. Following [51], we design a similar upwind discretization of the angular part of the metric

$$\begin{aligned} \Vert \nabla _{\mathbb {S}^2} U({\mathbf {p}})\Vert ^2 \approx (\delta _\theta U({\mathbf {p}}))^2 + \frac{1}{\sin ^2 \theta } (\delta _\varphi U({\mathbf {p}}))^2, \end{aligned}$$
(50)

where \(\delta _\theta U({\mathbf {p}})\), and likewise \(\delta _\varphi U({\mathbf {p}})\), is defined as

$$\begin{aligned} \delta _\theta U({\mathbf {p}}) := \frac{1}{h} \max \{0,&U({{\mathbf {x}}},\mathbf {n})- U({{\mathbf {x}}},\mathbf {n}(\theta +h, \varphi )),\\&U({{\mathbf {x}}},\mathbf {n})-U({{\mathbf {x}}},\mathbf {n}(\theta -h, \varphi ))\}. \end{aligned}$$

We denoted by \(\mathbf {n}(\theta , \varphi ) := (\sin \theta \cos \varphi , \sin \theta \sin \varphi , \cos \theta )\) the parameterization of \(\mathbb {S}^2\) by Euler angles \((\theta , \varphi ) \in [0,\pi ] \times [0, 2 \pi ]\). Combining (49) and (50), one obtains an approximation of \(\mathcal {F}_0^{+*}({\mathbf {p}}, \mathrm{d}U({\mathbf {p}}))^2\), within \(\mathcal {O}(\varepsilon ^2+r(\varepsilon )h)\) error for smooth U, denoted . We denoted by \(r(\varepsilon ) := \max _{i=1}^6 |{\mathbf {w}}_i|\) the norm of the largest offset appearing in (48), since these clearly depend on \(\varepsilon \). Importantly, only depends on positive parts of finite differences \((U({\mathbf {p}})-U({\mathbf {q}}))_+\); hence, the system can be solved using the fast-marching algorithm, as shown in [51]. The convergence analysis of this discretization, as the grid scale h and tolerance \(\varepsilon \) tend to zero suitably, is postponed for future work, see [42, 43].

Fig. 12
figure 12

Comparison of spatial projections on \({\mathbb {R}}^3\) of exact sub-Riemannian geodesics in \(({\mathbb {R}}^{3}\times \mathbb {S}^{2}, d_{\mathcal {F}_0})\) (black curves) [25] and our numerical approximation (colored curves), with \(\xi = 1/64\) and \(\varepsilon =.1\), for five different end conditions (\(a = ((0,0,60),(0,0,1))\), \(b = ((6.4,6.4,60),(0,0,1))\), \(c = ((-60,0,60),(-1,0,0))\), \(d = ((0,60,60),1/\sqrt{6}(-1,2,1))\), \(e = ((60,60,10),(0,0,-1))\). The color indicates the error with the exact sub-Riemannian geodesics (Color figure online)

Note that this approach could also be applied in dimension \(d=2\), and to the symmetric model \(({\mathbb {M}}, d_{\mathcal {F}_{\varepsilon }})\) featuring a reverse gear. We present only a single assessment of the numerical performance of our method, see Fig. 12. We compare numerically obtained shortest paths with exact SR geodesics for a small number of endpoints that correspond to various types of curves. For fair end conditions (a, b, c) the numerical curves are close to the exact curves. For very challenging end conditions inducing torsion (d) or extreme curvature (e) the curves are further from the exact SR geodesics. An extensive evaluation of the performance of the numerics is left for future work.

Fig. 13
figure 13

Comparison between the shortest paths from end points (black) to one of the exits (green) in a model map of Centre Pompidou, for cars with (left, blue lines) and without (right, red lines) reverse gear. The yellow arrows indicate the orientation of the curve. The background colors show the distances at each position, minimized over the orientation. White points left indicate the cusps, and white points right indicate the (automatically placed) keypoints where in-place rotations take place (Color figure online)

7 Applications

To show the potential of anisotropic fast marching for path tracing in 2D and 3D (medical) images we performed experiments on each of the datasets in Fig. 3:

  • a 2D toy example using a map of Centre Pompidou,

  • a 2D retinal image,

  • two synthetic diffusion-weighted magnetic resonance imaging (dMRI) datasets, with different bundle configurations.

We use the 2D datasets to point out the difference in results when using the metric \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\) and to explain the role of the keypoints when using \(\mathcal {F}_\varepsilon ^+\) that occur instead of (possibly unwanted) cusps.

On the synthetic dMRI datasets we present the first application of our methods to this type of data. We present how a cost function can be extracted from the data, and how this leads to correct tracking of bundles, similar to the 2D case. The benefits of anisotropic metrics compared to isotropic metrics are demonstrated by performing backtracking for various model parameter variations.

The experiments were performed using an anisotropic FM implementation written in C++, for \(d = 2\) described in [41]. Implementation details for \(d = 3\) will be described in future work. Mathematica 11.0 (Wolfram Research, Inc., Champaign, IL) was used for further data analysis, applying Wolfram LibraryLink (Wolfram Research, Inc., Champaign, IL) to interface with the FM library.

7.1 Applications in 2D

7.1.1 Shortest Path to the Exit in Centre Pompidou

To illustrate the difference between the models with and without reverse gear and to show the role of the keypoints for non-uniform cost, we use a map of Centre Pompidou as a 2D image, see Fig. 13. The walls (in black) have infinite cost, everywhere else the cost is 1. We place endpoints (black dots) in various places of the museum and look for the shortest path from those points to one of the two exits, regardless of the end orientation. Since there are now two exits, say at \({\mathbf {p}}_0\) and \({\mathbf {p}}_1\), the distance \(U({\mathbf {p}})\) of any point \({\mathbf {p}}\in {\mathbb {M}}\) to one of the exits is given by

$$\begin{aligned} U_{\mathcal {F}}({\mathbf {p}}) = \min \{ d_{\mathcal {F}}({\mathbf {p}}_0,{\mathbf {p}}), d_{\mathcal {F}}({\mathbf {p}}_1,{\mathbf {p}})\}. \end{aligned}$$
(51)

We use a resolution of \(N_x \times N_y \times N_o = 706 \times 441 \times 60\). The cost in this example is only dependent on position, but constant in the orientation. Moreover, we use \(\mathcal {C}_1 = \mathcal {C}_2\) and \(\varepsilon = 1/10\).

On the left of Fig. 13 we see optimal paths (in blue) obtained using the Finsler metric \(\mathcal {F}= \mathcal {F}_\varepsilon \). The fast-marching algorithm successfully connects all endpoints to one of the exits. Some of the geodesics have cusps, indicated with white points, resulting in backward motion on (a part of) the curve. The colors show the distance \(U_{\mathcal {F}_\varepsilon }\) as above, at each position minimized over the orientations.

On the right, the optimal paths using the asymmetric Finsler metric \(\mathcal {F}= \mathcal {F}_\varepsilon ^+\) are shown in red. The curves no longer exhibit cusps, but have in-place rotations (white dots) instead. These keypoints occur in this example on corners of walls. (Due to the fact that \(\varepsilon \) is small but nonzero, there can still be small sideways motion.) The shortest paths for this model are successions of sub-Riemannian geodesics and of in-place rotations, which can be regarded as reinitializations of the former: the orientation is adapted until an orientation is found from which the path can continue in an optimal sub-Riemannian way.

We stress that the fast-marching algorithm has no special treatment for keypoints, which are only detected in a postprocessing step. We observe that keypoints are automatically positioned at positions where it makes sense to have an in-place rotation. Small differences in the distance maps between \(U_{\mathcal {F}_\varepsilon }\) left and \(U_{\mathcal {F}_\varepsilon ^+}\) right can be observed: the constrained model usually has a slightly higher cost right around corners.

Fig. 14
figure 14

Left: SR geodesics (in blue) in \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) with given boundary conditions (both forward and backward). Right: SR geodesics (in red) in \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }^+})\) with the same boundary conditions. We recognize one end-condition case where on the left we get a cusp, whereas on the right we have a keypoint (with in-place rotation) precisely at the bifurcation (Color figure online)

7.1.2 Vessel Tracking in Retinal Images

Another application is vessel tracking in retinal images, for which the model with reverse gear and the fast-marching algorithm have shown to be useful in [8, 53]. Although the algorithm works fast and led to successful vessel segmentation in many cases, in some cases, in particular bifurcations of vessels, cusps occur. Figure 14 shows one such example on the left. The image has resolution \(N_x \times N_y \times N_o = 121 \times 114 \times 64\). The cost is constructed as in [8]: the image is first lifted using cake wavelets [23], resulting in an image on \(\mathbb {R}^2 \times \mathbb {S}^1\). For the lifting and for the computation of the cost function from the lifted image, we rely on their parameter settings. We use \(\mathcal {C}_1 = \xi \mathcal {C}_2\), with \(\xi = 0.02\) (top) and \(\xi = 0.04\), and \(\varepsilon = 0.1\). The orientations of the end conditions A, B and C (white arrows) are chosen tangent to the vessel, where we considered both the forward case and the backward case. The vessel with end condition C is particularly challenging, since it comes across a bifurcation. For the tracking of this vessel, we indicated the orientation with yellow arrows.

The unconstrained model \(({\mathbb {M}},d_{\mathcal {F}_\varepsilon })\), corresponding to the blue tracks on the left half of Fig. 14, gives a correct vessel tracking for the forward end conditions of A and B, for both values of \(\xi \). This is obviously the better choice than the backward cases. However, for end condition C, neither the forward or backward with neither values of \(\xi \) gives a vessel tracking without cusps. On the other hand, if we use the constrained model \(({\mathbb {M}},d_{\mathcal {F}^+_\varepsilon })\), we obtain an in-place rotation or keypoint in the neighborhood of the bifurcation. Typically a higher value of \(\xi \) brings these points closer to the bifurcation. Taking the backward end conditions in combination with this model, we see in some cases that end locations are first passed by the vessel tracking algorithm, until it reaches a point where in-place rotation is cheaper, and then returns to the end position.

Fig. 15
figure 15

Comparison of the results of backtracking on a 2D plane in a synthetic dMRI dataset on \(\mathbb {M}={\mathbb {R}}^{3}\times \mathbb {S}^2\). In case A the default parameters for \(\sigma \), \(\xi \) and \(\varepsilon \) are applied resulting in a global minimizing geodesic (left) and its corresponding distance map (right). Case B reflects the influence of the data term \(\sigma \). Case C reflects the isotropic Riemannian case. Case D reflects a high cost for moving spatially and results in curves that resemble a piecewise linear curve. The distance map is illustrated using a glyph visualization in which the size of the glyph corresponds to \(\exp (-d_{\mathcal {F}_\varepsilon }({\mathbf {p}}_s,{\mathbf {p}}_e)/s)^p\), with \(p = s = 3\), where \({\mathbf {p}}_s\) is the seed location, \({\mathbf {p}}_e\) is a location on a glyph, and s and p are chosen based on visualization clarity

7.2 Application to Diffusion-Weighted MRI Data

DW-MRI is a magnetic resonance technique for noninvasive measurement of water diffusion in fibrous tissues [46]. In the brain, diffusion is less constrained parallel to white matter fibers (or axons) than perpendicular to them, allowing us to infer the paths of these fibers. The diffusion measurements are distributions \((\mathbf {y}, \mathbf {n}) \mapsto U(\mathbf {y},\mathbf {n})\) within the manifold \({\mathbb {M}}\) for \(d=3\). From these measurements a fiber orientation distribution (FOD) can be created, yielding a probability of finding a fiber at a certain position and orientation [60].

Backtracking is performed through forward Euler integration of the backtracking PDE involving the intrinsic gradient, following Theorem 4 and Eqs. (28) and (31). The spatial derivative was implemented as a first-order Gaussian derivative. The angular derivatives are implemented by a first-order spherical harmonic derivative. The latter has the key advantage that in a spherical harmonic basis exact analytic computations can be done. Here, one must rely on two-fold recursions in [28, Lemma 2 & 4], so that the poles due to a standard Euler angle parameterization of \(\mathbb {S}^{2}\) do not appear in exact recursions of Legendre polynomials!

If data-driven factors \(\mathcal {C}_{1}\) and \(\mathcal {C}_{2}\) come in a spherical sampling or if one wants to work in a spherical sampling (e.g., higher-order tessellation of the icosahedron) in a fast-marching method, then one can easily perform the pseudo-inverse of the discrete inverse spherical harmonic transform, where one typically keeps the number of spherical harmonics very close to the number of spherical sampling points, so that maximum accuracy order is maintained for computing angular derivatives in the intrinsic gradient descent of Theorem 4.

7.2.1 Construction of the Cost Function

The synthetic dMRI data are created by generating/simulating a fiber orientation density (FOD) of a desired structure. There are sophisticated methods for this, e.g., [12, 17], but evaluation on phantom data constructed with these tools is left for future work. Here we use a basic but practical method on two simple configurations of bundles in \(\mathbb {R}^3\), the ones on the bottom row in Fig. 3. In each voxel inside a bundle, we place a spherical \(\delta \)-distribution, with the peak in the orientation of the bundle. We convolve each \(\delta \)-distribution with an FOD kernel that was extracted from real dMRI data and is related to the dMRI signal measured in a voxel with just a single orientation of fibers. Spherical rotation of the FOD kernel is done in the spherical harmonics domain by use of the Wigner D-matrix to prevent interpolation issues. We compose from all distributions an FOD function \(W: {\mathbb {M}}\rightarrow {\mathbb {R}}^+\). This function evaluates to high values in positions/orientations that are inside and aligned with the bundle structure.

We use the FOD W to define the cost function \(\frac{1}{1+\sigma } \le \mathcal {C}\le 1\) via

$$\begin{aligned} \mathcal {C}({\mathbf {p}})&= \frac{1}{1 + \sigma \left| \frac{W_+({\mathbf {p}})}{\Vert W_+ \Vert _\infty } \right| ^p} \end{aligned}$$

where \(\sigma \ge 0\), \(p \in \mathbb {N}\), with \(\Vert {\cdot } \Vert _\infty \) the sup-norm and \(W_+({\mathbf {p}}) = {\text {max}}\{0,W({\mathbf {p}})\}\). The cost function \(\mathcal {C}\) induces the following spatial and angular cost functions \((\mathcal {C}_1, \mathcal {C}_2)\):

$$\begin{aligned} \mathcal {C}_1({\mathbf {p}}) = \xi \mathcal {C}({\mathbf {p}}), \qquad \mathcal {C}_2({\mathbf {p}}) = \mathcal {C}({\mathbf {p}}). \end{aligned}$$

The implementation of non-uniform cost is comparable to the application of vessel tracking in retinal images in \(d=2\) by Bekkers et al. [8].

Fig. 16
figure 16

Backtracking of minimizing geodesics of the model \((\mathbb {M},d_{\mathcal {F}_\varepsilon ^+})\) without reverse gear (top) and the model with reverse gear \((\mathbb {M},d_{\mathcal {F}_\varepsilon })\) (bottom) using the model parameters of configuration A (\(\sigma =3.0\), \(\xi =0.1\) and \(\varepsilon =0.1\)) for various end conditions

7.2.2 Influence of Model Parameters

The first synthetic dataset consists of a curved and a straight bundle (tube), which cross at two locations as shown in Fig. 15. The experiments using metric \(\mathcal {F}_\varepsilon \) demonstrate the effect of the model parameters on the geodesic backtraced from the bottom-left to the seed location at the bottom-right of the curved bundle. A distance map is computed for parameter configuration A (Fig. 15, right) in which suitable values are used for the data term \(\sigma \), and the fast-marching parameters \(\xi \) and \(\varepsilon \). Furthermore, fixed values are used for data sharpening \(p=3\), spatial smoothing \(\sigma s=0.5\), forward Euler integration step size \(\delta t=0.04\) and a gridscale of 1. By use of these parameters the global minimizing geodesic (Fig. 15A, left) is shown to take the longer, curved route. In parameter configuration B the data term \(\sigma \) is lowered, which creates a geodesic that is primarily steered by internal curve-dependent costs and is shown to take the shortcut route (Fig. 15B). Setting \(\varepsilon =1\) in configuration C leads to a Riemannian case where the geodesic resembles a piecewise linear curve. In configuration D the relative cost of spatial movement relative to angular movement is high, leading to geodesics with shortcuts.

Fig. 17
figure 17

Left: 3D configuration of bundles and a visualization of part of the synthetic dMRI data. Middle: backtracking of geodesics in \((\mathbb {M}, d_{\mathcal {F}_{\varepsilon }})\) from several points inside the curves to endpoints of the bundle is successful when using \(\varepsilon = 0.1\). Right: when using \(\varepsilon = 1\), the dominant red bundle can cause the paths from the green bundle to deviate from the correct structure

We conclude that configuration \(\mathbf A \) with a relatively strong data term, large bending stiffness (\(\xi ^{-1}=10\)) and a nearly SR geometry (\(\varepsilon =0.1\)) avoids unwanted shortcuts.

7.2.3 Positive Control Constraint

For the application of FM in dMRI data it is desirable that the resulting geodesic is not overly sensitive to the boundary conditions, i.e., the placement and orientation of the geodesic tip. Furthermore, since neural fibers do not form cusps, these are undesirable in the backtracking results. In Fig. 16 the backtracking results are shown for the cases without reverse gear \(\mathcal {F}_\varepsilon ^+\) (top) and the model with reverse gear \(\mathcal {F}_\varepsilon \) (bottom). The distance map for \(\mathcal {F}_\varepsilon ^+\) was computed by the iterative method implementing the forward Reeds–Shepp car, while for \(\mathcal {F}_\varepsilon \) the FM method was used.

We conclude that without the positive control constraint, small changes in tip orientation cause large variations in the traced geodesic in the metric space \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\), whereas the traced geodesic in the quasi-metric space \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }^+})\) is both more stable and more reasonable.

7.2.4 Robustness to Neighboring Structures

A pitfall of methods that provide globally minimizing curves using a data term is that dominant structures in the data attract many of the curves, much like the highway usually has the preference for cars rather than local roads. This phenomenon is to a certain extent unwanted in our applications, and we illustrate with the following example that it can be circumvented using a sub-Riemannian instead of Riemannian metric. We use the dataset as introduced in Fig. 3. It consists of one bundle that has torsion (green) and that crosses with another bundle (blue), and a third bundle (red) that is parallel with the first in one part. The cost in these bundles is constructed in the same way as above, but now the cost in the red bundle is twice as low as in the other bundles. A small part of the data is visualized on the left of Fig. 17. These data are used to construct the cost function as explained above.

The resolution of the data is \(N_x \times N_y \times N_z \times N_o = 32 \times 32 \times 32 \times 162\). Again we use \(\mathcal {C}_1 = \xi \mathcal {C}_2 = \mathcal {C}\), with \(\xi = 0.1\). To have comparable parameters as in the previous experiment, despite increasing the amplitude in one of the bundles by a factor 2, we choose to construct the cost using parameter \(p = 3\), and \(\sigma = 3 \cdot 2^p = 24\). From various positions inside the green, blue and red bundle, the shortest paths to the end of the bundles computed by the FM algorithm nicely follow the shape of the actual bundles, when we choose \(\varepsilon =.1\) small, corresponding to an almost SR geodesic. This is precisely what prevents the geodesic in the green bundle to drift into the (much cheaper) red bundle. We show on the right in Fig. 17 that choosing \(\varepsilon = 1\), corresponding to having an isotropic Riemannian metric, this unwanted behavior can easily occur.

We conclude that the SR geodesics in \(({\mathbb {M}}= {\mathbb {R}}^3 \times \mathbb {S}^2, d_{\mathcal {F}_\varepsilon })\) with \(\varepsilon \ll 1\) are less attracted to parallel, dominant structures than isotropic Riemannian geodesics.

8 Conclusion and Discussion

We have extended the existing methodology for modeling and solving the problem of finding optimal paths for a Reeds–Shepp car to 3D and to a case without reverse gear. We have shown that the use of the constrained model leads to more meaningful shortest paths in some cases and that the extension to 3D has opened up the possibility for tractography in dMRI data.

Instead of using a hard constraint on the curvature as in the original paper by Reeds and Shepp [50], we used symmetric and asymmetric Finsler metrics. We have introduced these metrics, \(\mathcal {F}_0\) and \(\mathcal {F}_0^+\), for \(d = 2, 3\), such that they allow for curves that have a spatial displacement proportional to the orientation, with a positive proportionality constant in the case of \(\mathcal {F}_0^+\).

We have captured theoretically some of the nature of the distance maps and geodesics following from the new constrained model. We have shown in Theorem 1 that both models are globally controllable, but only the unconstrained model is also locally controllable.

The sub-Riemannian and sub-Finslerian nature is difficult to capture numerically. To this end, we introduced approximating Finsler metrics \(\mathcal {F}_\varepsilon \) and \(\mathcal {F}_\varepsilon ^+\) that do allow for numerical approaches. We have shown in Theorem 2 that as \(\varepsilon \rightarrow 0\), the distance map converges pointwise and the geodesics converge uniformly, implying that for sufficiently small \(\varepsilon \) we indeed have a reasonable approximation of the \(\varepsilon = 0\) case.

We have analyzed cusps in the metric space \((\mathbb {M},d_{\mathcal {F}_{0}})\) and keypoints in the quasi-metric space \((\mathbb {M},d_{\mathcal {F}_{0}^+})\) which occur on the interface surface \(\partial \mathbb {M}_{\pm }\) given by (30). The analysis, for uniform costs, is summarized in Theorem 3. We have shown that cusps are absent in \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) for \(\varepsilon >0\) and that keypoints in \((\mathbb {M},d_{\mathcal {F}_{0}^+})\) occur only on the boundary, and we provided analysis on how this happens. In Theorem 4 we have shown how minimizing geodesics in \(({\mathbb {M}}, d_{\mathcal {F}_\varepsilon })\) and \(({\mathbb {M}}, d_{\mathcal {F}^+_\varepsilon })\) can be obtained from the distance maps with an intrinsic gradient descent method.

To obtain solutions for the distance maps and optimal paths, we used a fast-marching method. By formulating an equivalent problem to the minimization problem for optimal paths in the form of an eikonal equation, the FM method can be used using specific discretization schemes. We briefly compared the numerical solutions using \(\mathcal {F}_\varepsilon \) with \(\varepsilon \ll 1\) with the exact sub-Riemannian geodesics in SE(2) with uniform cost, which showed sufficient accuracy for not too extreme begin and end conditions.

To show the use of our method in image analysis, we have tested it on two 2D problems and two 3D problems. All four experiments confirm that the combination of the eikonal PDE formulation, the fast-marching method and the construction of the non-uniform cost from the images, results in geodesics that follow the desired paths. From the experiment on an image of Centre Pompidou, with constant, finite cost everywhere except for the walls, it followed that instead of having cusps when using the Finsler metric \(\mathcal {F}_\varepsilon \), we get keypoints (in-place rotations) when using \(\mathcal {F}_\varepsilon ^+\). These keypoints turn out to be located on logical places in the image. On the 2D retinal image we showed that the Finsler metric \(\mathcal {F}_\varepsilon ^+\) gives a new tool for tackling vessel tracking through bifurcations. We see that keypoints appear close to the bifurcation, leading to paths that more correctly follow the data.

The basic experiments on 3D show advantages of the model \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) with \(0 < \varepsilon \ll 1\) over the model \((\mathbb {M},d_{\mathcal {F}_{1}})\) in the sense that the minimizing geodesics better follow the curvilinear structure and deal with crossings and nearby parallel bundles (even if torsion is present). Furthermore, we have shown the advantage of model \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }^+})\) with \(0 < \varepsilon \ll 1\), compared to \((\mathbb {M},d_{\mathcal {F}_{\varepsilon }})\) in terms of stability, with keypoints instead of cusps.

The strong performance of the Reeds–Shepp car model in 2D vessel tracking and positive first results on artificial dMRI data, encourages us to pursue a more quantitative assessment of the performance in both 3D vessel tracking problems and in actual dMRI data. Such 3D vessel tracking problems are encountered in, for example magnetic resonance angiography. In future work we will elaborate on the implementation and evaluation of the fast-marching and the iterative PDE implementation of [27, Appendix B]. Furthermore, we aim to integrate locally adaptive frames [26] into the Finsler metrics \(\mathcal {F}_{\varepsilon }\), \(\mathcal {F}_{\varepsilon }^+\), for a more adaptive vessel/fiber tracking.