Arc Length Parameterization
A fast and accurate algorithm for re-parameterizing arbitrary three-dimensional curves by their arc length.
Introduction
The previous posts covered a set of different three-dimensional curves, namely B-splines, NURBS, arcs, béziers and splines of degree 1, 3 and 5. All of these curves can be generalized as a function \(f : [0,1] \to \mathbb{R}^3\). As the parameter \(t\) in \(f(t)\) increases from \(0\) to \(1\), the point \((x(t), y(t), z(t))\) moves along the curve from its beginning to its end.
Now imagine being at a parameter value \(t_0\); moving by a small amount \(\Delta t\) gives a new point on the curve, and the distance traveled along the curve itself is \(\Delta l\), where \(l\) is arc length. It is critical to realize that in general \(\Delta t \neq k \Delta l\). In other words, changing the parameter \(t\) by a small amount \(\Delta t\) does not yield a proportional change in the distance \(\Delta l\) traveled along the curve itself. For linear splines and arcs, however, \(\Delta t = k \Delta l\) does hold, simply because these curves are naturally arc length parameterized.
The implication of \(\Delta t \neq k\Delta l\) is that while moving along the curve \(f(t)\) with the parameter \(t\), the speed along the curve is not constant. If the speed is not constant it might be unsuitably slow or fast on certain sections of the curve. Most importantly, it means that the speed cannot be controlled to a desired value. This is especially important in robotics, where control is crucial. It matters even when just moving an arm from point A to point B, where the actual speed in between does not: too high a speed and you might break the arm itself.
The goal of arc length parameterization is to reparameterize the curve \(f(t)\) to \(f(t(l))\) such that instead of being parameterized by \(t\) it is parameterized by \(l\in[0,1]\). The parameter \(t\) is now given by a function which transforms \(l\) to the corresponding \(t\) value. Thus, the new curve \(f(t(l))\) has the desired property of having constant speed.
The algorithm presented in this post is a heavily modified and improved version of the algorithm in the paper “Arc Length Parameterization of Spline Curves” by John W. Peterson. As always, the code presented in this post is ANSI C; not all of the sub-routines and/or macros are explained or shown here. All of the relevant source code is available at github.com/fullnitrous/arm in src/core.
Overview
The generalized trajectory \(f(t)\) is, explicitly,
\[ f(t) = \begin{bmatrix} x(t) \\ y(t) \\ z(t) \end{bmatrix}, \quad t \in [0,1] \]
where the speed (not velocity) is
\[ \begin{aligned} v(t) &= \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2} \\ &= \sqrt{f'(t) \cdot f'(t)} \end{aligned} \]
and the goal of re-parameterization by arc length is to achieve
\[ \left | \frac{d}{dl}f(t(l)) \right | = v(t(l))t'(l) = 1 \]
The arc length function \(L(t)\) gives the length along the curve \(f(t)\) at \(t\) and is given by integrating the speed from \(0\) to \(t\).
\[ \begin{aligned} L(t) &= \int_0^t v(\tau)\,d\tau \\ &= \int_0^t \sqrt{x'(\tau)^2 + y'(\tau)^2 + z'(\tau)^2}\,d\tau \end{aligned} \]
Problem
In order to re-parameterize \(f(t)\) by its arc length one must define a function which maps the length \(l\) along the curve to the corresponding parameter value \(t\). Since the function \(L(t)\) gives length given \(t\), the inverse \([L(t)]^{-1}\) gives \(t\) given length. This is exactly the function, written \(t(l)\), required to map length to \(t\). However, it is in general analytically impossible to find the inverse of the integral which defines \(L(t)\); in the vast majority of cases even the integral itself cannot be solved algebraically.
Intuitively, since \(L(t)\) describes length along a curve, it must be monotonically increasing: it does not make sense to move along a curve and not have increased the total distance moved. That \(L(t)\) is monotonically increasing is also easily proven mathematically. The sum of three squared non-zero real numbers is always positive, and so is its square root. Integrating something from \(0\) to \(t\) that is always positive yields a function that is monotonically increasing as \(t\) increases.
Since \(L(t)\) is monotonically increasing, every possible \(t\in[0,1]\) has a unique length associated with it, so \(L(t)\) must always be invertible. Therefore the inverse exists and can be numerically solved for any \(f(t)\). The goal of the algorithm is essentially the following.
\[ \left | \frac{d}{dl}f(t(l)) \right | = v(t(l))t'(l) \approx 1 \]
The challenge of designing an algorithm to do this is that the numerical approximation of \(t(l)\) has to be precise, memory efficient and fast to compute, all while the function \(t(l)\) is particularly hard to approximate. This is due to the fact that on certain regions of particular \(f(t)\), the speed may come close to \(0\). The slope of \(L(t)\) then becomes near horizontal, and if the slope of \(L(t)\) is near horizontal it is near vertical for \([L(t)]^{-1}\). A function which misbehaves in this sense is hard to approximate precisely. Essentially, the main reason why \(t(l)\) is hard to approximate is that its derivative may temporarily become near infinite.
Algorithm Design
The method for arc length parameterization in the aforementioned paper “Arc Length Parameterization of Spline Curves” essentially works in the following way, for a differentiable curve \(f(t)\) to be re-parameterized. The arc length integral can be efficiently computed in \(\mathcal{O}(1)\) complexity to several digits of precision using Gauss-Legendre quadrature. With that, any \(L(t)\) is numerically defined. The re-parameterization function is the inverse of \(L(t)\); therefore, the paper uses Newton’s method for root finding to solve the equation \(L(t) = l\). Thus, the procedure to compute \(t(l)\) given \(l\) is to solve the equation \(L(t)=l\). However, it would be too slow to solve an equation every time a single \(t\) is needed, which is why this procedure is instead used to approximate \(t(l)\) with Chebyshev approximation. Essentially, Gauss-Legendre quadrature and Newton’s method make it possible to compute \(t(l)\) at all, very precisely but slowly, which is why it is then approximated with Chebyshev polynomials. The algorithm is also adaptive in the sense that it dynamically subdivides intervals of \(t\) for \(f(t)\), each to be approximated by a single Chebyshev polynomial of degree \(K\). These Chebyshev polynomials are later converted to a spline function which becomes the final re-parameterization function. Note that the paper re-parameterizes two-dimensional splines; in this post all curves to be re-parameterized are three-dimensional.
I tested the algorithm presented in the paper on several curves and discovered that it may completely fail in some cases: failing in the sense of requiring a very high subdivision depth in order to achieve the desired accuracy, or completely running out of memory trying to achieve it. Note that this happened in particularly pathological cases; nevertheless, these cases occur frequently enough to cause a problem.
Therefore I decided to improve on the algorithm with the goal of achieving higher accuracy with less memory and computational power. The baseline algorithm struggles to approximate \([L(t)]^{-1}\) where the derivative becomes near infinite. Regardless of the number of subdivisions or the degree of the Chebyshev polynomial, it still cannot be properly approximated.
One thing I tried was to impose a low limit on how much the curve could be subdivided. If the Chebyshev approximation was still failing after the subdivision limit, then rational function approximation was applied on the failing intervals. The idea behind this is that a rational function should be better at approximating functions with singularities. When tested, however, it did not improve anything.
Another attempt was to simply use Taylor series approximation specifically where the singularities appeared, by bracketing the pathological intervals as tightly as possible and applying the Taylor approximation on the smallest possible interval containing the problem. This did not work either. Other types of approximation were also tested on these bracketed intervals, but that still did not solve the problem. Out of all of them, Chebyshev approximation still yielded the best results; however, dynamically increasing its degree on pathological intervals did not fix the issue either.
Then finally, after months of banging my head against the wall and testing different things, I thought: since the difficulty in approximating the re-parameterization curve lies in the derivative, why not approximate the derivative of the re-parameterization function, then integrate the approximation, which in turn gives the re-parameterization function? Approximating the derivative should better capture its sometimes erratic behaviour.
This method of approximating the derivative of the re-parameterization function and then integrating it significantly increased the accuracy and reliability of the algorithm, so much so that the degree of the Chebyshev approximation could be set to a relatively low constant value. The algorithm is then only adaptive in the sense of subdividing the intervals until a high enough accuracy is obtained.
I am not completely sure whether my explanation for why this solution works is reasonable; it just intuitively makes sense. I do not really care either, it seems to work, and that is good enough.
Prerequisite Algorithms
The routines for approximation were implemented with the help of the book “Numerical Recipes in C: The Art of Scientific Computing” by William H. Press et al. These were Chebyshev approximation, Newton’s method and the bisection method for root finding, and Gauss-Legendre quadrature.
The prerequisite algorithms will just be covered in terms of the types they use and the prototypes of the routines / functions. Again, for the exact source code, visit the GitHub repository.
Two types of functions to be approximated are defined as function pointers with scalar and vec2_t return values. The parameters to these two types of functions are a general anything-goes input void* and a variable / parameter of type double.
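As a rough sketch (the actual macro definitions live in src/core and may differ), the two prototypes could look like this:

typedef struct { double x, y; } vec2_t;

/* scalar-valued function of a context pointer and a parameter */
#define APPROX_FUNC1 double (*f)(void*, double)
/* vec2_t-valued function of a context pointer and a parameter */
#define APPROX_FUNC2 vec2_t (*f)(void*, double)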
The Chebyshev approximation is given by the type chebyshev_t, consisting of the interval a to b on which the approximation is defined, the coefficients c, the coefficients for the integral cint, and the number of coefficients n, in other words the degree of the Chebyshev polynomial.
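Based on that description, a hypothetical layout of chebyshev_t would be the following (see src/core for the real definition):

typedef struct chebyshev_approximation {
    double a, b;  /* interval [a, b] the approximation is defined on */
    double* c;    /* Chebyshev coefficients                          */
    double* cint; /* Chebyshev coefficients of the integral          */
    int n;        /* number of coefficients                          */
} chebyshev_t;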
With this type, the following routines for Chebyshev approximation are defined. One just to get an initialized chebyshev_t, called approx_chebyshev_init. The routine approx_chebyshev, which given an APPROX_FUNC1 approximates it and saves the coefficients to apx. The routine approx_chebyshev_int integrates the existing approximation in apx stored in the array c and saves the coefficients for the integral to cint in apx. Lastly, approx_eval_chebyshev evaluates the Chebyshev approximation apx at x. If the parameter der is set to -1 the integral of the approximation is evaluated, otherwise just the approximation itself.
chebyshev_t approx_chebyshev_init(double* m, double a, double b, int n);
void approx_chebyshev(APPROX_FUNC1, void* f_, chebyshev_t* apx);
void approx_chebyshev_int(chebyshev_t* apx);
double approx_eval_chebyshev(chebyshev_t* apx, int der, double x);
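To make the calling convention concrete, here is a hypothetical usage sketch that approximates sin(x) on [0, 1]. The memory-size macro APPROX_CHEBYSHEV_MEMSIZE_WITH_INT appears later in this post; the function fsin and the choice of 16 coefficients are made up for the example:

#include <math.h>

double fsin(void* f_, double x) {
    (void)f_; /* context unused */
    return sin(x);
}

void chebyshev_example(void) {
    double mem[APPROX_CHEBYSHEV_MEMSIZE_WITH_INT(16)];
    chebyshev_t apx;
    double v, vint;
    apx = approx_chebyshev_init(mem, 0.0, 1.0, 16);
    approx_chebyshev(&fsin, NULL, &apx);  /* fit the coefficients     */
    approx_chebyshev_int(&apx);           /* integrate them into cint */
    v    = approx_eval_chebyshev(&apx, 0, 0.5);  /* ~ sin(0.5)        */
    vint = approx_eval_chebyshev(&apx, -1, 0.5); /* ~ 1 - cos(0.5),
        assuming the integral vanishes at a                           */
}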
In order to solve an equation, a root finding algorithm is required. The type solvecfg_t is the configuration for solving an equation using a hybrid root finding algorithm. The configuration requires the interval a to b in which the root is known to lie, and the tolerance tol for the answer. The member bi_guess_n is the number of bisection passes performed before Newton’s method when the algorithm is set to be hybrid with only_bisect = 0, bi_nmax is the maximum number of bisection iterations if Newton’s method fails or if only_bisect = 1, and nr_n is the maximum number of iterations using Newton’s method.
typedef struct approx_solve_config {
double a, b, tol;
int bi_guess_n, bi_nmax, only_bisect, nr_n;
} solvecfg_t;
Once a configuration is set, a function f of prototype APPROX_FUNC2 can be solved against a right-hand side y. The reason why f is of prototype APPROX_FUNC2 is that Newton’s method requires both the value and the derivative of the function being solved; the x component of the returned vector is the value and the y component is the derivative. The hybrid algorithm works by first doing bi_guess_n bisection iterations to narrow the interval in which the root lies, then doing nr_n iterations of Newton’s method. If the algorithm is set to only use bisection it will do at most bi_nmax bisection iterations; this is also used as a fallback if Newton’s method fails. Note that the interval defined by a and b is assumed to contain exactly one root, otherwise the algorithm does not work.
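The prototype of the solver itself is not shown in this post; inferred from how it is called in ltt further down, it is presumably along these lines:

/* solve f(x) = y for x in [cfg->a, cfg->b]; prototype inferred from usage */
double approx_solve(APPROX_FUNC2, void* f_, solvecfg_t* cfg, double y);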
The last prerequisite is Gauss-Legendre quadrature, which is simply a single function.
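Its prototype is not reproduced here either; judging from how alp_integrator is invoked later, it presumably looks like this (the name approx_gauss_legendre is a placeholder, see src/core for the real one):

/* integral of f over [a, b] via 32-point Gauss-Legendre quadrature */
double approx_gauss_legendre(APPROX_FUNC1, void* f_, double a, double b);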
It calculates the integral of f, with prototype APPROX_FUNC1, between a and b using 32 weights and abscissas for the quadrature. I chose 32 because it is precise enough while not being too slow.
Algorithm
Firstly some of the root finding configuration parameters are defined.
#define ALP_SOLVE_TOL     1e-10 /* root finding tolerance            */
#define ALP_SOLVE_GUESS_N 5     /* bisection passes before Newton's  */
#define ALP_SOLVE_NMAX    50    /* maximum bisection iterations      */
#define ALP_SOLVE_NR_N    10    /* maximum Newton iterations         */
The value ALP_CHK is the degree of the Chebyshev polynomials used. ALP_CHECKRES is the resolution (number of points) at which the result of the algorithm is checked. The guess factor (explained later) is given by ALP_GUESS_FAC, and ALP_MIN_INTERVAL gives the smallest allowed subdivided interval length.
A global function pointer alp_integrator defines which function to use for integration; this level of modularity is not really required but was useful for testing. It is set to the Gauss-Legendre quadrature routine.
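As a sketch of what that assignment might look like, reusing the placeholder name from above:

/* hypothetical: point the integrator at the Gauss-Legendre routine */
double (*alp_integrator)(double (*)(void*, double), void*, double, double)
    = &approx_gauss_legendre;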
A function context alpfctx_t defines an algorithm context which is used to pass information around to helper functions; its use case will become clear later.
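The struct itself is not listed in this post; a hypothetical layout, inferred from how the context is used in the routines below, would be:

typedef struct arc_length_parameterization_function_context {
    e3d_t* f;        /* the curve f(t) being re-parameterized  */
    solvecfg_t* cfg; /* root finding configuration             */
    double a, b;     /* current parameter interval [a, b]      */
    double l;        /* arc length of the current interval     */
} alpfctx_t;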
The type alp_t is the re-parameterization function \(t(l)\). It consists of a parameter array u, since the re-parameterization function is defined piecewise by Chebyshev polynomials, and the coefficients a. Note that the coefficients are within a union, since for some curves only a scaling factor s is required to translate length \(l\) to parameter \(t\). The member n is simply the length of a.
typedef struct arc_length_parameterization {
    double *u;       /* piecewise interval boundaries          */
    union {
        double* a;   /* Chebyshev coefficients per interval    */
        double s;    /* scaling factor for the trivial cases   */
    } c;
    int n;           /* length of c.a                          */
} alp_t;
The sub-routine alp_speed simply returns the magnitude of the derivative of \(f(t)\). Note that the function \(f(t)\) is evaluated using eval_e3d, a generalized evaluation function for all the different types of curves; depending on the curve type, it calls the appropriate evaluation function, for example eval_nurbs.
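A minimal sketch of alp_speed, assuming eval_e3d returns a value whose m member holds the magnitude of the derivative (consistent with how it is used in the edge cases of alp_fit below):

double alp_speed(void* f_, double t) {
    /* |f'(t)|, i.e. the speed v(t) */
    return eval_e3d((e3d_t*)f_, t).m;
}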
The sub-routine ttls, which stands for “time to length, speed”, calculates \(l\) for a given \(t\) together with the speed at \(t\).
vec2_t ttls(void* data, double t) {
    vec2_t o;
    alpfctx_t* ctx;
    ctx = (alpfctx_t*)data;
    /* x: arc length from ctx->a up to t, y: speed at t */
    o.x = alp_integrator(&alp_speed, ctx->f, ctx->a, t);
    o.y = alp_speed(ctx->f, t);
    return o;
}
This is then used by ltt, which stands for “length to time”, and is simply the function \(t(l)\).
double ltt(void* data, double l) {
    alpfctx_t* ctx;
    ctx = (alpfctx_t*)data;
    /* known endpoint values, skip the solve */
    if(l == 0.0) { return ctx->a; }
    if(l == 1.0) { return ctx->b; }
    ctx->cfg->a = ctx->a;
    ctx->cfg->b = ctx->b;
    /* solve L(t) = l for t with the hybrid root finder */
    return approx_solve(&ttls, data, ctx->cfg, l);
}
The routine lttder computes the derivative of the ltt function, which is what is ultimately approximated.
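Since \(L(t(l)) = l\), the chain rule gives \(t'(l) = 1/v(t(l))\), so a sketch of lttder (the actual routine is in src/core) could be:

double lttder(void* data, double l) {
    alpfctx_t* ctx;
    double t;
    ctx = (alpfctx_t*)data;
    t = ltt(data, l);                  /* parameter value at length l */
    return 1.0 / alp_speed(ctx->f, t); /* t'(l) = 1 / v(t(l))         */
}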
The routine alp_init is used to initialize the re-parameterization function type alp_t. The required memory size is given by macros.
alp_t alp_init(double* mem, int n_upper) {
    alp_t alp;
    if(mem == NULL) {
        alp.u = alp.c.a = NULL;
        return alp;
    }
    /* partition the given memory into the u and coefficient arrays */
    alp.u = mem; mem += n_upper + 1;
    alp.c.a = mem;
    return alp;
}
The routine alp_eval evaluates \(t(l)\) given \(l\) and the re-parameterization function alp itself.
double alp_eval(alp_t* alp, double l) {
    int k;
    chebyshev_t ch;
    /* trivial case: uniform scaling only */
    if(alp->u == NULL) { return l / alp->c.s; }
    /* find which piecewise interval l falls into */
    k = eval_findinterval(alp->u, l, alp->n);
    /* reconstruct the Chebyshev polynomial for that interval */
    ch.n = ALP_CHK;
    ch.a = 0.0;
    ch.b = alp->u[k+1] - alp->u[k];
    ch.c = alp->c.a + ALP_CHK * k;
    return approx_eval_chebyshev(&ch, 0, l - alp->u[k]);
}
Finally, the algorithm itself: alp_fit. The function takes in a general three-dimensional curve f and saves the re-parameterization function to an initialized alp. It has an upper limit n_upper for how many piecewise Chebyshev functions can fit in alp, and a specified tolerance tolerance.
double alp_fit(e3d_t* f, alp_t* alp, int n_upper, double tolerance) {
int i, j;
double l, t, tder, vdev, arc_length, maxdev, guess;
alpfctx_t fctx;
solvecfg_t solvecfg;
chebyshev_t ch;
double chmem[APPROX_CHEBYSHEV_MEMSIZE_WITH_INT(ALP_CHK)];
ch = approx_chebyshev_init(chmem, 0.0, 0.0, ALP_CHK);
arc_length = 0.0;
The algorithm works by first checking three edge cases. The first is if the curve is a linear spline, in which case it just computes the exact arc length without Gauss-Legendre quadrature and sets the scaling factor to the arc length, such that the total arc length maps to \(1\).
if(f->type == E3D_PPOLY1) {
j = eval_e3dnintervals(f);
for(i = 0; i < j-1; i++) {
f->k = i;
arc_length += (eval_e3dbound(f, i+1)-eval_e3dbound(f, i))*__eval_e3d(f, 0.0).m;
}
alp->u = NULL;
alp->c.s = arc_length;
return arc_length;
}
The second edge case is if the curve is an arc, in which case the arc length is simply the speed evaluated at any point of the arc, since the speed along an arc is constant. The scaling factor is then set to this.
if(f->type == E3D_ARC) {
arc_length = eval_e3d(f, 0.5).m;
alp->u = NULL;
alp->c.s = arc_length;
return arc_length;
}
The last edge case is if the curve is a bézier curve; then the guess is set to 1.0. Otherwise the guess is set to the guessing factor ALP_GUESS_FAC divided by the number of sub-functions of the piecewise defined function f, i.e. the guessing factor times their average parameter length. The reasoning behind this guessing is the following. The algorithm works by approximating an interval of the function and subdividing it until the approximation is within tolerance. Now, it is highly unlikely that an approximation of the interval covering the entire function will be within tolerance, so it makes sense to skip that computation altogether to save resources. It is highly likely that an approximation covering an interval with the average length of a sub-function in \(f(t)\) will be within tolerance, but it is also somewhat likely that the approximation will be good enough covering double the average length. Therefore, a guessing factor is multiplied with the average length so that the algorithm slightly overshoots the maximum interval length. This saves computing resources by avoiding unnecessary computations while not wasting memory on too many small intervals.
if(f->type == E3D_BEZIER) {
guess = 1.0;
} else {
guess = ALP_GUESS_FAC / eval_e3dnintervals(f);
guess = guess < 1.0 ? guess : 1.0;
}
solvecfg.tol = ALP_SOLVE_TOL;
solvecfg.bi_guess_n = ALP_SOLVE_GUESS_N;
solvecfg.bi_nmax = ALP_SOLVE_NMAX;
solvecfg.only_bisect = 0;
solvecfg.nr_n = ALP_SOLVE_NR_N;
fctx.cfg = &solvecfg;
fctx.f = f;
*(alp->u) = 0.0;
j = 0;
fctx.a = 0.0;
fctx.b = guess;
The main loop of the algorithm continues to run while the entire function \(f(t)\) has not yet been covered.
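The loop header itself is omitted from the listings here; presumably it has the form:

/* hypothetical loop condition: run until the accepted intervals
   cover the whole parameter domain */
while(fctx.a < 1.0) {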
The first thing the algorithm does is calculate the arc length of the current interval. It then approximates the derivative of the re-parameterization function \(t(l)\) with Chebyshev approximation and integrates it to get \(t(l)\).
maxdev = 0.0;
fctx.l = alp_integrator(&alp_speed, f, fctx.a, fctx.b);
ch.b = fctx.l;
approx_chebyshev(&lttder, (void*)&fctx, &ch);
approx_chebyshev_int(&ch);
The algorithm then proceeds to calculate the maximum velocity deviation (percentage), which is saved to maxdev.
for(i = 0; i < ALP_CHECKRES; i++) {
    l = range(ch.a, ch.b, i, ALP_CHECKRES);
    /* t(l) via the integrated approximation, t'(l) via the original */
    t = fctx.a + approx_eval_chebyshev(&ch, -1, l);
    tder = approx_eval_chebyshev(&ch, 0, l);
    /* deviation from the ideal v(t(l)) t'(l) = 1 */
    vdev = fabs(alp_speed(f, t) * tder - 1.0);
    if(vdev > maxdev) { maxdev = vdev; }
}
If the current interval is too small, the algorithm went too far subdividing it, so it just quits, because further subdivision probably will not fix anything. Note that the subdivision happened in the previous iteration of the main loop.
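A hypothetical sketch of this check (the exact condition is in src/core):

if(fctx.b - fctx.a < ALP_MIN_INTERVAL) {
    return -1.0;
}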
If the maximum velocity deviation is out of tolerance, the interval is subdivided.
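Again as a hypothetical sketch, halving the interval so that the next iteration of the main loop retries on the smaller one (the branch continues into the listing below):

else if(maxdev > tolerance) {
    fctx.b = 0.5 * (fctx.a + fctx.b);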
Otherwise, if it is within tolerance and there is space left in the re-parameterization function alp, the current approximation is saved and the algorithm moves on to the next interval.
} else if(j < n_upper) {
    /* record the accumulated arc length at the end of this interval */
    *(alp->u+1+j) = arc_length + fctx.l;
    for(i = 0; i < ALP_CHK; i++) {
        *(alp->c.a+j*ALP_CHK + i) = *(ch.cint + i);
    }
    /* offset by fctx.a; the constant coefficient is halved when
       evaluated, hence the factor of two */
    *(alp->c.a+j*ALP_CHK) += 2.0 * fctx.a;
    arc_length += fctx.l;
    /* advance to the next interval, clamped to the end of the curve */
    fctx.a = fctx.b;
    fctx.b = fctx.b + guess < 1.0 ? fctx.b + guess : 1.0;
    j++;
} else {
    return -1.0;
}
}
Once everything is done, the total (non-normalized) arc length is returned.
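Presumably the function then simply ends with:

    return arc_length;
}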