A Complete Derivation of Quasi-constant Step Size BDF Method

2021-04-23

The backward differentiation formula (BDF) method is the most widely used method to solve stiff ordinary differential equations (ODEs) and differential-algebraic equations (DAEs). Famous stiff solvers like ode15s, ode15i, LSODE, LSODA, CVODE, IDA, and DASSL all use the BDF method. Unfortunately, the BDF derivations available online are incomplete, in the sense that one still needs to derive some formulas by hand to fully understand the algorithm and to be able to write a robust, production-ready variable step size BDF implementation. This blog post aims to bridge that gap by providing a complete derivation of all the formulas needed for a production-ready quasi-constant step size BDF method.

Recall the definitions of ODEs $u' = f(u, p, t)$ and DAEs $F(u', u, p, t) = 0$. The idea behind BDFs is to use a polynomial $p(t)$ to interpolate $u(t)$ through the points $(t_{n+1}, u_{n+1}), (t_{n}, u_{n}), (t_{n-1}, u_{n-1}), ...$, and to solve for the next step $u_{n+1}$ that makes $p'(t_{n}+h) = u'_{n+1}$, where $h$ is the step size.

Constant step size BDF

Let's first solve the simple problem: deriving BDFs assuming the step size $h$ is constant. An order-$s$ BDF interpolates $s+1$ points to form a degree-$s$ polynomial $p_{s,n+1}(t)$ when computing the approximation $u_{n+1}$. Since we want to compute the next step, we can center the polynomial at the current time $t_{n}$ and parametrize $t$ by the step size $h$. Thus, we define $q_{s,n+1}(c) = p_{s,n+1}(t) = p_{s,n+1}(t_{n}+ch)$, where $c \in \mathbb{R}$ is the new independent variable after the change of variables. Also note that we have

$$t - t_{n-i} = (t_{n} + ch) - (t_{n} - ih) = (c+i)h.$$

We can then explicitly construct the $s$-th order interpolant using Newton polynomial interpolation:

$$\begin{aligned} q_{s,n+1}(c) &= p_{s,n+1}(t) = p_{s,n+1}(t_{n}+ch) \\ &= u_{n+1} + [u_{n+1},u_{n}](t-t_{n+1}) + ... + [u_{n+1},...,u_{n+1-s}](t-t_{n+1})\cdots (t-t_{n+2-s}) \\ &= u_{n+1} + \sum_{j=1}^s [u_{n+1},...,u_{n+1-j}](t-t_{n+1})\cdots(t-t_{n-(j-2)}) \\ &= u_{n+1} + \sum_{j=1}^s \frac{(c-1) c \cdots (c+j-2)}{j!} h^j j![u_{n+1},...,u_{n+1-j}] \\ &= u_{n+1} + \sum_{j=1}^s \frac{1}{j!} \left( \prod_{i=1}^j (c+i-2) \right) \nabla_{h}^{j} u_{n+1}, \end{aligned}$$

where $[...]$ denotes Newton's divided difference, and $\nabla_{h}$ is the backward difference operator with step size $h$. We used the relation $h^j j! [u_{n+1},...,u_{n+1-j}] = \nabla_{h}^{j} u_{n+1}$ in the last step.
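This relation between divided differences and backward differences is easy to spot-check numerically. Below is a small sketch in Python using exact rational arithmetic (the helper names `divided_diff` and `nabla` are mine, not from any library); it compares the recursive divided difference against the binomial expansion $\nabla_h^j u_{n+1} = \sum_{i=0}^j (-1)^i \binom{j}{i} u_{n+1-i}$:

```python
from fractions import Fraction
from math import comb, factorial

def divided_diff(ts, us):
    # Recursive Newton divided difference [u_0, ..., u_k]
    if len(us) == 1:
        return us[0]
    return (divided_diff(ts[1:], us[1:]) - divided_diff(ts[:-1], us[:-1])) / (ts[-1] - ts[0])

def nabla(hist, j):
    # ∇^j of the newest entry via the binomial expansion; hist = [u_{n+1}, u_n, ...]
    return sum((-1) ** i * comb(j, i) * hist[i] for i in range(j + 1))

h = Fraction(1, 3)
hist = [Fraction(v) for v in (7, 2, -5, 1, 4)]  # u_{n+1}, u_n, ...: arbitrary history
for j in range(1, len(hist)):
    ts = [-i * h for i in range(j + 1)]         # nodes t_{n+1}, t_n, ..., with t_{n+1} = 0
    assert factorial(j) * h ** j * divided_diff(ts, hist[: j + 1]) == nabla(hist, j)
```

The check holds for any spacing $h$ and any history values, since both sides are the same linear combination of the $u_{n+1-i}$.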

Now, we want to set $p'_{s,n+1}(t_{n}+h) = u'_{n+1}$. By the chain rule, $q'_{s,n+1}(1) = h\, p'_{s,n+1}(t_{n}+h) = h u'_{n+1}$. Therefore, we need to solve

$$\sum_{j=1}^s \delta_{j} \nabla^{j} u_{n+1} = h u'_{n+1},$$

where

$$\begin{aligned} \delta_{j} &= \frac{1}{j!} \frac{\mathop{}\!\mathrm{d}}{\mathop{}\!\mathrm{d} c} \prod_{i=1}^j (c+i-2) \Big|_{c=1} \\ &= \frac{1}{j!} \sum_{i=1}^j \prod_{\substack{k=1 \\ k\ne i}}^j (c+k-2) \Big|_{c=1} \\ &= \frac{1}{j!} \sum_{i=1}^j \prod_{\substack{k=1 \\ k\ne i}}^j (k-1) \quad \text{note that the product vanishes when $i\ne 1$, since the $k=1$ factor is $0$} \\ &= \frac{1}{j!} \prod_{k=2}^j (k-1) = \frac{1}{j!} (j-1)! = \frac{1}{j}. \end{aligned}$$

Together, the constant step size BDF is

$$\sum_{j=1}^s \frac{1}{j} \nabla^{j} u_{n+1} = h u'_{n+1}.$$
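Expanding the backward differences via $\nabla^j u_{n+1} = \sum_{i=0}^j (-1)^i \binom{j}{i} u_{n+1-i}$ recovers the familiar fixed-coefficient BDF formulas. A small Python sketch with exact rationals (`bdf_coeffs` is a name I made up for illustration):

```python
from fractions import Fraction
from math import comb

def bdf_coeffs(s):
    # Coefficients of u_{n+1-i}, i = 0..s, in sum_{j=1}^s (1/j) ∇^j u_{n+1}
    c = [Fraction(0)] * (s + 1)
    for j in range(1, s + 1):
        for i in range(j + 1):
            c[i] += Fraction((-1) ** i * comb(j, i), j)
    return c

print(bdf_coeffs(1))  # backward Euler: u_{n+1} - u_n = h u'_{n+1}
print(bdf_coeffs(2))  # BDF2: (3/2) u_{n+1} - 2 u_n + (1/2) u_{n-1} = h u'_{n+1}
```

For example, `bdf_coeffs(2)` returns `[3/2, -2, 1/2]`, the classical BDF2 coefficients.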

Varying the step size by transplanting to an equidistant grid

A simple strategy to vary the step size is to evaluate the old polynomial $p$ at the equidistant grid $t_{n} - i\tilde{h}$ for $i = 0, 1, 2, ..., s$ to obtain an interpolated "constant step size history" when changing to a new step size $\tilde{h}$. During the step changing process, we do not need to compute $u_{n+1}$; hence, we only need the polynomial $p_{s,n}(t_{n}+ch)$. This strategy is simple because evaluating and re-interpolating a polynomial interpolant are linear transformations of the history, so we can solve for matrices that perform these transformations instead of working with polynomials explicitly.

Note that the only step size dependent term is $\nabla_{h}^{j} u_{n}$. Let's define $r = \tilde{h}/h$,

$$D = [\nabla_{h} u_{n}, \nabla_{h}^2 u_{n}, ..., \nabla_{h}^s u_{n}] \in \mathbb{R}^{m \times s}, \quad \text{and} \quad \tilde{D} = [\nabla_{\tilde{h}} u_{n}, \nabla_{\tilde{h}}^2 u_{n}, ..., \nabla_{\tilde{h}}^s u_{n}] \in \mathbb{R}^{m \times s}.$$

Constructing the polynomial $p_{s,n}(t_{n}+ch)$ is simple, as we only need to shift the indices in $p_{s,n+1}(t_{n}+ch)$ down by 1 appropriately:

$$\begin{aligned} p_{s,n}(t_{n}+ch) = q_{s,n}(c) = \;& u_{n} + \sum_{j=1}^s [u_{n},...,u_{n-j}](t-t_{n})\cdots(t-t_{n-(j-1)}) \\ = \;& u_{n} + \sum_{j=1}^s \frac{1}{j!} \left( \prod_{i=1}^j (c+i-1) \right) \nabla_{h}^{j} u_{n} \\ = \;& u_{n} + \sum_{j=1}^s \frac{1}{j!} \left( \prod_{i=1}^j (c+i-1) \right) D_{j} \\ = \;& u_{n} + \sum_{j=1}^s \frac{1}{j!} \left( \prod_{i=0}^{j-1} (c+i) \right) D_{j}. \end{aligned}$$

The interpolating polynomial with the new step size $\tilde{h} = rh$ is then

$$\tilde{p}_{s,n}(t_{n}+c\tilde{h}) = \tilde{q}_{s,n}(c) = u_{n} + \sum_{j=1}^s \frac{1}{j!} \left( \prod_{i=0}^{j-1} (c+i) \right) \tilde{D}_{j}.$$

With the interpolation condition, we force $p_{s,n}(t_{n}+c\tilde{h}) = p_{s,n}(t_{n}+crh) = \tilde{p}_{s,n}(t_{n}+c\tilde{h})$ for $c = 0, -1, ..., -s$. Note that when $c=0$, the interpolation condition simplifies to $\tilde{p}_{s,n}(t_{n}) = u_{n} = p_{s,n}(t_{n})$, which always holds. Therefore, we only need to solve

$$\sum_{j=1}^s D_{l,j} \frac{1}{j!} \left( \prod_{i=0}^{j-1} (cr+i) \right) = \sum_{j=1}^s \tilde{D}_{l,j} \frac{1}{j!} \left( \prod_{i=0}^{j-1} (c+i) \right)$$

for $\tilde{D}$ with $c = -1, -2, ..., -s$ and $l = 1, ..., m$. The system of equations can be expressed in terms of matrices:

$$D R = \tilde{D} U,$$

where

$$R_{jk} = \frac{1}{j!} \left( \prod_{i=0}^{j-1} (i-kr) \right), \quad\text{and}\quad U_{jk} = \frac{1}{j!} \left( \prod_{i=0}^{j-1} (i-k) \right).$$

Note that $c = -k$ for $k = 1, 2, ..., s$. We end this section by remarking that $U^2 = I$:

julia> using LinearAlgebra

julia> U(s) = [prod(i->i-k, 0:j-1)//factorial(j) for j in 1:s, k in 1:s]
U (generic function with 1 method)

julia> U(5)^2 == I
true

Therefore, the backward differences with $\tilde{h}$ are simply

$$\tilde{D} = D (R U).$$
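The whole step-change procedure can be checked end to end with exact rational arithmetic. The Python sketch below (all helper names are mine) transplants the backward differences of $u(t) = t^3$ from step size $h$ to $\tilde{h} = rh$ via $\tilde{D} = D(RU)$, and compares against differences computed directly on the $\tilde{h}$ grid; since degree-$s$ interpolation is exact for a cubic when $s = 3$, the two agree exactly:

```python
from fractions import Fraction
from math import comb, factorial, prod

def backward_diffs(u, tn, h, s):
    # ∇_h^j u(tn) for j = 1..s, via the binomial expansion of ∇^j
    return [sum((-1) ** i * comb(j, i) * u(tn - i * h) for i in range(j + 1))
            for j in range(1, s + 1)]

def change_matrix(s, r):
    # (R U), where R_{jk} = (1/j!) prod_{i=0}^{j-1} (i - k r); U is R at r = 1
    def mat(rho):
        return [[Fraction(1, factorial(j)) * prod(i - k * rho for i in range(j))
                 for k in range(1, s + 1)] for j in range(1, s + 1)]
    R, U = mat(r), mat(Fraction(1))
    return [[sum(R[a][b] * U[b][c] for b in range(s)) for c in range(s)] for a in range(s)]

u = lambda t: t ** 3                                  # a cubic: degree <= s, so exact
s, r = 3, Fraction(1, 3)
D = backward_diffs(u, Fraction(0), Fraction(1), s)    # differences at step h = 1
RU = change_matrix(s, r)
D_tilde = [sum(D[j] * RU[j][k] for j in range(s)) for k in range(s)]
assert D_tilde == backward_diffs(u, Fraction(0), r, s)  # differences at step h̃ = r h
```

For non-polynomial $u$ the two sides would differ by the interpolation error, which is exactly the approximation a quasi-constant step size method accepts when it rescales its history.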

Updating backward differences

A natural choice for the predictor is then $u^{(0)}_{n+1} = p_{s,n}(t_{n}+h) = u_{n} + \sum_{j=1}^s D_{j}$. From the definition of the backward difference, we have

$$\begin{aligned} \nabla_{h}^{s+1} u_{n+1} &= \nabla_{h}^{s} u_{n+1} - \nabla_{h}^{s} u_{n} \\ &= \nabla_{h}^{s-1} u_{n+1} - \nabla_{h}^{s-1} u_{n} - \nabla_{h}^{s} u_{n} \\ &= \nabla_{h}^{s-2} u_{n+1} - \nabla_{h}^{s-2} u_{n} - \nabla_{h}^{s-1} u_{n} - \nabla_{h}^{s} u_{n} \\ &= u_{n+1} - u_{n} - ... - \nabla_{h}^{s-2} u_{n} - \nabla_{h}^{s-1} u_{n} - \nabla_{h}^{s} u_{n} \\ &= u_{n+1} - u^{(0)}_{n+1}. \end{aligned}$$

Therefore, the $(s+1)$-th order backward difference of the next step is simply the difference between the corrector $u_{n+1}$ and the predictor $u^{(0)}_{n+1}$. Also, note that from

$$\nabla_{h}^{j} u_{n+1} = \nabla_{h}^{j+1} u_{n+1} + \nabla_{h}^{j} u_{n},$$

where $j \in \mathbb{N}$, we can compute all the lower order backward differences of the next step, and

$$\nabla_{h}^{s+2} u_{n+1} = \nabla_{h}^{s+1} u_{n+1} - \nabla_{h}^{s+1} u_{n}.$$

We can use the above updating relations to simplify the backward differences:

$$\begin{aligned} \nabla^s u_{n+1} &= \nabla^{s+1} u_{n+1} + \nabla^{s} u_{n} = (u_{n+1}-u_{n+1}^{(0)}) + \nabla^{s} u_{n} \\ \nabla^{s-1} u_{n+1} &= \nabla^{s} u_{n+1} + \nabla^{s-1} u_{n} = (u_{n+1}-u_{n+1}^{(0)}) + \nabla^{s} u_{n} + \nabla^{s-1} u_{n} \\ &\vdots \\ \nabla^{j} u_{n+1} &= \nabla^{j+1} u_{n+1} + \nabla^{j} u_{n} = (u_{n+1}-u_{n+1}^{(0)}) + \sum_{k=j}^{s}\nabla^{k} u_{n}. \end{aligned}$$
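In an implementation, these relations give a cheap update of the whole difference table after each accepted step: form $d = u_{n+1} - u^{(0)}_{n+1}$ and add running suffix sums of the old differences. A Python sketch with made-up names, checked against the binomial definition of $\nabla^j$ on exact rationals:

```python
from fractions import Fraction
from math import comb

def nabla(hist, j):
    # ∇^j of the newest entry; hist = [newest, ..., oldest]
    return sum((-1) ** i * comb(j, i) * hist[i] for i in range(j + 1))

s = 3
hist = [Fraction(v) for v in (5, 2, -1, 4)]     # u_n, u_{n-1}, ..., u_{n-s}
D = [nabla(hist, j) for j in range(1, s + 1)]   # ∇^j u_n
u_pred = hist[0] + sum(D)                       # predictor u^{(0)}_{n+1}
u_next = Fraction(7)                            # corrector (from the nonlinear solve)
d = u_next - u_pred                             # = ∇^{s+1} u_{n+1}

# ∇^j u_{n+1} = d + sum_{k=j}^s ∇^k u_n, computed with one backward running sum
D_next, tail = [Fraction(0)] * s, Fraction(0)
for j in range(s, 0, -1):
    tail += D[j - 1]
    D_next[j - 1] = d + tail

new_hist = [u_next] + hist
assert D_next == [nabla(new_hist, j) for j in range(1, s + 1)]
```

The backward running sum makes the update linear in $s$, which is the same trick the reordered summation below exploits for the BDF right-hand side.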

Therefore, BDF becomes

$$\begin{aligned} \sum_{j=1}^s \frac{1}{j} \nabla^{j} u_{n+1} &= \sum_{j=1}^s \left[\frac{1}{j}\left( (u_{n+1}-u_{n+1}^{(0)}) + \sum_{k=j}^{s}\nabla^{k} u_{n}\right)\right] \\ &= \gamma_{s} (u_{n+1}-u_{n+1}^{(0)}) + \sum_{j=1}^s \frac{1}{j}\sum_{k=j}^{s} \nabla^{k} u_{n}, \end{aligned}$$

where $\gamma_{j} = \sum_{k=1}^j \frac{1}{k}$. This naïve approach contains a lot of redundant computation, and the computational complexity is $O(s^2 m)$ for $u_{n} \in \mathbb{R}^m$. We can do a lot better if we reorder the summation:

$$\begin{aligned} \sum_{j=1}^s \frac{1}{j}\sum_{k=j}^{s} \nabla^{k} u_{n} &= \sum_{j=1}^s\sum_{k=j}^{s} \frac{1}{j} \nabla^{k} u_{n} = \sum_{j=1}^s\sum_{k=1 \;\land\; k\ge j}^{s} \frac{1}{j} \nabla^{k} u_{n} \\ &= \sum_{k=1}^s\sum_{j=1 \;\land\; j\le k}^{s} \frac{1}{j} \nabla^{k} u_{n} \quad\text{note the interchange of summation} \\ &= \sum_{k=1}^s\sum_{j=1}^{k} \frac{1}{j} \nabla^{k} u_{n} = \sum_{k=1}^s\left(\sum_{j=1}^{k} \frac{1}{j}\right) \nabla^{k} u_{n} = \sum_{k=1}^s \gamma_{k} \nabla^{k} u_{n}. \end{aligned}$$

This expression can be evaluated in $O(sm)$ time.

Putting everything together, the simplified BDF is

$$\gamma_{s} (u_{n+1}-u_{n+1}^{(0)}) + \sum_{k=1}^s \gamma_{k} \nabla^{k} u_{n} = h u'_{n+1},$$

and we can use a nonlinear equation solver to solve for $u_{n+1}$. We will omit the details of writing a stable and efficient nonlinear solver here, as they are independent of the BDF method itself.
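Before moving on, we can verify that the simplified form agrees with the original left-hand side $\sum_{j=1}^s \frac{1}{j}\nabla^j u_{n+1}$ on arbitrary rational history (a quick Python consistency check; the helper names are mine):

```python
from fractions import Fraction
from math import comb
from random import Random

def nabla(hist, j):
    # ∇^j of the newest entry; hist = [newest, ..., oldest]
    return sum((-1) ** i * comb(j, i) * hist[i] for i in range(j + 1))

rng = Random(0)
s = 4
hist = [Fraction(rng.randint(-9, 9)) for _ in range(s + 2)]  # u_{n+1}, u_n, ..., u_{n-s}
gamma = [sum(Fraction(1, j) for j in range(1, k + 1)) for k in range(s + 1)]  # γ_0..γ_s

lhs = sum(Fraction(1, j) * nabla(hist, j) for j in range(1, s + 1))
u_pred = hist[1] + sum(nabla(hist[1:], j) for j in range(1, s + 1))  # u^{(0)}_{n+1}
rhs = gamma[s] * (hist[0] - u_pred) + sum(gamma[k] * nabla(hist[1:], k) for k in range(1, s + 1))
assert lhs == rhs
```

Because the identity is purely algebraic in the history values, it holds for every choice of the random entries, not just this seed.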

Local truncation error

By the standard result on polynomial interpolation, we know that

$$u(t) - p_{s,n+1}(t) = \frac{u^{(s+1)}(\xi_{t})}{(s+1)!} w(t),$$

where $w(t) = \prod_{j=-1}^{s-1}(t - t_{n-j})$. We assume that all the history is completely accurate and compute the error of $p'_{s,n+1}(t_{n+1})$:

$$u'(t_{n+1}) = p'_{s,n+1}(t_{n+1}) + \frac{u^{(s+1)}(\xi_{t})}{(s+1)!} w'(t_{n+1}).$$

Note that we don't have the term with $w(t_{n+1})$ because it goes to $0$ by the construction of $w$. Expanding $w'(t_{n+1})$:

$$\begin{aligned} w'(t_{n+1}) &= \sum_{j=-1}^{s-1} \left[(t_{n+1}-t_{n-j})' \prod_{\substack{k=-1 \\ k\ne j}}^{s-1} (t_{n+1}-t_{n-k}) \right] \\ &= \sum_{j=-1}^{s-1} \prod_{\substack{k=-1 \\ k\ne j}}^{s-1} (t_{n+1}-t_{n-k}) \quad \text{the product vanishes unless $j=-1$} \\ &= \prod_{k=0}^{s-1} (t_{n+1}-t_{n-k}). \end{aligned}$$

Together, BDF with the leading error term is then

$$\begin{aligned} hu'(t_{n+1}) &= hp'_{s,n+1}(t_{n+1}) + h\frac{u^{(s+1)}(\xi_{t})}{(s+1)!} \prod_{k=0}^{s-1} (t_{n+1}-t_{n-k}) \\ &= q'_{s,n+1}(1) + h\frac{u^{(s+1)}(\xi_{t})}{(s+1)!} (h\cdot 2h \cdot 3h \cdots sh) \\ &= q'_{s,n+1}(1) + h\frac{u^{(s+1)}(\xi_{t})}{(s+1)!} s!h^s \\ &= q'_{s,n+1}(1) + \frac{u^{(s+1)}(\xi_{t})}{s+1} h^{s+1} \\ &= \sum_{j=1}^s \frac{1}{j} \nabla^{j} u_{n+1} + \frac{1}{s+1}\nabla_{h}^{s+1} u_{n+1} + O(h^{s+2}). \end{aligned}$$
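Since $u^{(s+2)} \equiv 0$ for $u(t) = t^{s+1}$, the $O(h^{s+2})$ tail vanishes and the expansion above should hold exactly for that choice of $u$, which gives a sharp test of the leading error term (Python with exact rationals; names are mine):

```python
from fractions import Fraction
from math import comb

def nabla(hist, j):
    # ∇^j of the newest entry; hist = [newest, ..., oldest]
    return sum((-1) ** i * comb(j, i) * hist[i] for i in range(j + 1))

s, h, t1 = 3, Fraction(1, 2), Fraction(2)       # t1 plays the role of t_{n+1}
u = lambda t: t ** (s + 1)                      # u^{(s+2)} = 0, so no O(h^{s+2}) tail
hist = [u(t1 - i * h) for i in range(s + 2)]    # u_{n+1}, u_n, ..., u_{n-s}

lhs = h * (s + 1) * t1 ** s                     # h u'(t_{n+1}) for u = t^{s+1}
rhs = sum(Fraction(1, j) * nabla(hist, j) for j in range(1, s + 1)) \
      + Fraction(1, s + 1) * nabla(hist, s + 1)
assert lhs == rhs
```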

Hence, the error estimate for the ss-th order BDF is

$$\frac{1}{s+1}\nabla_{h}^{s+1} u_{n+1},$$

and this quantity can easily be computed from the updated backward differences. With an appropriate controller for the step size and order, a quasi-constant step size BDF method can be implemented in its totality.
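To close, here is a minimal fixed-order sketch that wires the pieces together for the scalar linear test problem $u' = \lambda u$: the predictor from the difference table, the simplified BDF "solve" (closed form for a linear ODE, so no nonlinear solver is needed), the local error estimate, and the difference-table update. It bootstraps with one backward Euler step and then runs BDF2 at a constant step; all names are invented for illustration, and there is no order or step size controller:

```python
from math import exp

lam, h, T = -1.0, 0.01, 1.0
gamma = [0.0, 1.0, 1.5]              # γ_0, γ_1, γ_2 with γ_k = sum_{j=1}^k 1/j

u = 1.0                              # initial condition u(0)
D = [0.0, 0.0]                       # [∇u_n, ∇²u_n]; empty history at startup
for step in range(round(T / h)):
    s = 1 if step == 0 else 2        # bootstrap with backward Euler, then BDF2
    u_pred = u + sum(D[:s])          # predictor u^{(0)}_{n+1} = u_n + Σ ∇^j u_n
    c = sum(gamma[j] * D[j - 1] for j in range(1, s + 1))
    # γ_s (u_{n+1} - u^{(0)}) + Σ γ_k ∇^k u_n = h λ u_{n+1}, solved for u_{n+1}
    u_next = (gamma[s] * u_pred - c) / (gamma[s] - h * lam)
    d = u_next - u_pred              # = ∇^{s+1} u_{n+1}
    err = d / (s + 1)                # local error estimate (would drive a controller)
    D = [d + sum(D[j - 1:s]) for j in (1, 2)]  # ∇^j u_{n+1} = d + Σ_{k=j}^s ∇^k u_n
    u = u_next

print(u, exp(lam * T))               # numerical vs exact solution at T
```

With $h = 0.01$ the result agrees with $e^{-1}$ to a few digits, consistent with the second-order accuracy of BDF2; a real implementation would additionally use `err` and the step-change matrices above to adapt $h$ and $s$.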