Theory

Ray expansion

The geodesic (ray) property of interest is the coordinate deflection integral with respect to the radial coordinate

\[\psi(\alpha,R) = b\mathop{\int}_{R}^{\infty}\frac{dr}{r^{2}}\left[1 - \frac{b^{2}}{r^{2}}\left(1-\frac{r_{s}}{r}\right)\right]^{-1/2},\]

where: \(R\) is the radial coordinate at some spacetime event along the ray; \(r_{s}\) is the Schwarzschild radius; and the impact parameter, \(b=b(\alpha,R)\), is affine-invariant. The impact parameter is

\[b(\alpha, R) = \frac{R\sin\alpha}{\sqrt{1-\frac{r_{s}}{R}}},\]

where \(\alpha\) is the angle subtended by the ray tangent four-vector to the radial direction with respect to the orthonormal tetrad of a local Eulerian frame. We are only interested in angles \(\alpha\leq\pi/2\) for the purpose of approximation design.

The deflection integral has no known closed-form solution. Exact solutions are approached via numerical quadrature with a variable transformation to make the integral domain compact and remove the integrand divergence as \(r\to r_{c}^{+}\), where \(r_{c}>3r_{s}/2\) is the turning point \(dr/d\lambda=0\) of a scattering ray, and is \(r_{c}=r_{c}(b)\).[1]

However, Beloborodov (2002) showed that if one lets \(y=1-\cos\psi\) and \(x=1-\cos\alpha\), then the \(\mathcal{O}(x^{2})\) term of the series expansion \(y(x;u)\) vanishes, leading to a useful linear relation

\[y(x;u) = \frac{x}{1-u}+\mathcal{O}(x^{3}),\]

where \(u\mathrel{:=}r_{s}/R\). Note that in the integrand \(x\) appears via \(\sin\alpha\), and this function of \(x\) has range equal to the unit interval.[2] Also note that due to consideration of \(\cos\psi\), the exact function \(y(x;u)\), for domain \(x\in[0,2]\), is non-injective. The extrema in \(y(x;u)\) denote transitions, with increasing \(x\), to subdomains that add[3] an image—i.e., another ray bundle connecting neighbourhoods of two points in space with principal angular separation \(\psi\in[0,\pi]\). The integral given above is valid for \(\alpha\leq\pi/2\), but the Beloborodov (2002) series truncation is valid for the set \(\{x\colon\;y(x)\leq2\}\). The smoothness of the exact relation \(y(x;u)\), for fixed \(u\), across the transition \(\alpha=\pi/2\) means that a series expansion with some sufficiently small truncation error for \(\alpha\to\pi/2^{-}\) can form an adequate approximation for \(\alpha>\pi/2\), for which the exact function is given by

\[\begin{split}\begin{aligned} \psi(\alpha,R) &= 2b\Theta\left(\alpha - \frac{\pi}{2}\right)\mathop{\int}_{r_{c}}^{\infty}\frac{dr}{r^{2}}\left[1 - \frac{b^{2}}{r^{2}}\left(1-\frac{r_{s}}{r}\right)\right]^{-1/2}\\ &\qquad + (-1)^{\Theta\left(\alpha - \frac{\pi}{2}\right)}b\mathop{\int}_{R}^{\infty}\frac{dr}{r^{2}}\left[1 - \frac{b^{2}}{r^{2}}\left(1-\frac{r_{s}}{r}\right)\right]^{-1/2}, \end{aligned}\end{split}\]

where \(\Theta(\ldots)\) denotes the Heaviside function.

_images/exact_relations.png

Exact deflection relations computed using numerical integration routines from the xpsi package.

However, series truncation results in a lack of curvature, and for a subset of the nominal domain \(\alpha\in[0,\pi/2]\), values \(y>2\), which invalidates the definition \(y=1-\cos\psi\). The approximation is valid only for primary images, defined by deflections \(\psi<\pi\), and the truncation error grows as \(\psi\to\pi^{-}\).

In the context of many astrophysical problems in which the Schwarzschild solution is applicable, a model defines the spatio-temporal structure of radiating material in the near-vicinity of a compact object; the computational domain is then, by virtue of the symmetries of the Schwarzschild solution, discretised for integration over the high-energy image subtended by the system on the sky of a distant observer. The simulator then knows the coordinates of points from whose neighbourhoods radiation originates, and thus knows the coordinate angular separation \(\psi\) between each point and the distant observer, in a coordinate chart with polar axis defined as coincident with the direction from the observer to the origin of any Schwarzschild chart. Due to symmetry, the tangent three-vector to the ray at any spacetime event along it is coplanar with the tangent three-vector at any other point along the ray.

It follows that for integration over the image of the radiating system, a relation

\[x(y;u) = (1-u)y+\mathcal{O}(y^{3})\]

is desired, together with the convergence of ray bundles, which is encoded in the derivative

\[(1-u)\mathcal{D}= \frac{\partial\cos\alpha}{\partial\cos\psi} = \frac{\partial x}{\partial y} = 1-u + \mathcal{O}(y^{2});\]

the derivative in the Beloborodov (2002) approximation is constant and performs poorly in effectively all contexts. Hereafter we simply call this derivative \(\mathcal{D}\) the convergence.

These truncated series have the desirable property that they are entirely inexpensive to evaluate and do not require numerical integration nor interpolation. However, more recents studies have attempted to improve these analytic expressions, with the justification being that the truncation error is undesirably large in many applications. These efforts are summarised by Poutanen (2019), who offers an new \(x(y;u)\) relation—that we will label \(x_{P19}(y;u)\)—that uses the expansion reported by Beloborov (2002), truncated at \(\mathcal{O}(x^{4})\), and extends it with a logarithmic term to increase the curvature of the \(x_{P19}(y;u)\) relation in the near-vicinity of \(y\to2^{-}\).

The intention of rayXpanda is to push the notion of series expansion to a practical limit, and provide functions that can be compiled and executed somewhat inexpensively. The \(x_{P19}(y;u)\) relation is fast, argued to be sufficiently accurate in the majority of statistical contexts involving primary images and (real or simulated) observations, and simple to duplicate. However, rayXpanda is motivated by the pursuit of an accurate implementation that does not depend on numerical quadrature, integration, or interpolation, and does not add any phenomenological terms that are not generated by the integral equation under expansion in a small parameter. By doing so, rayXpanda achieves superior accuracy over most of the domain of interest, at slightly higher computational expense. The orthogonality however, means that rayXpanda can at the least be used as an independent check of exact implementations that call numerical integrators and interpolators.

To achieve this aim, the extension modules supplied by rayXpanda were generated as follows. First we pursued the higher-order expansion terms given by Beloborodov (2002) and verified them. We then proceeded symbolically[4] to generate terms up to \(\mathcal{O}(x^{31})\). The resulting polynomial in \(x\) has clear pattern in the structure of the coefficients, which are themselves polynomials in \(u\), where the order scales with the power of \(x\). For example, truncating at \(\mathcal{O}(x^{8})\) yields

\[\begin{split}\begin{aligned} \frac{y(x;u)}{u^{2}} = \frac{z}{u^{2}} &- \frac{z^{3}}{112}\\ &+\frac{z^{4}(3u - 5)}{672}\\ &-\frac{z^{5}(2673u^2 - 8008u + 6461)}{1345344}\\ &+\frac{z^{6}(9372u^3 - 39347u^2 + 57876u - 30324)}{10762752}\\ &-\frac{z^{7}(3995442u^4 - 21328659u^3 + 44260348u^2-\ldots)}{10429106688}\\ &+\mathcal{O}(x^{8}), \end{aligned}\end{split}\]

where \(z\mathrel{:=}x/(1-u)\).

The polynomial in \(x\) requires almost \(10^{3}\) floating point operations to evaluate. We generated the Cython deflection extension module using a Python script, organising the evaluation in a nested[5] Horner scheme; we did not make any attempt to optimise the evaluation beyond this. Furthermore, we obtain the polynomial partial derivative \(\partial y/\partial x\) simultaneously for the (inverse) convergence, making the overall scheme efficient.

To generate the Cython inversion extension module, it was necessary to reverse the polynomial to obtain a polynomial for \(x(y;u)\). Series reversion requires a larger number of terms in powers of \(y\) to recover the accuracy of the \(y(x;u)\) polynomial truncated at \(\mathcal{O}(x^{31})\). We push the computation to \(\mathcal{O}(y^{61})\).

Both extension modules are statically typed at double precision: the coefficients of the polynomials in \(u\) are truncated at this precision but are represented symbolically as a ratio of very large integers that require many more bits to represent. The function that we automatically generated for the reversed series \(x(y;u)\) is \(\sim\!1700\) lines long, with an average of a little less than two floating point operations per line. Very roughly, on a GHz processor, this amounts to \(\mathcal{O}(10^{3})\) ns execution time.

Performance

We now compare the truncation error to that exhibited by \(x_{P19}(y;u)\). We call routines from the xpsi package to compute the ray properties via numerical quadrature.

_images/primary_image_performance.png

Truncation error comparison. The behaviour and error exhibited by invert() is delineated by the solid lines. The error exhibited by the deflect() is delineated by the dotted lines (not labelled in the legend); these lines follow the exact relations very closely. The behavior and error of the \(x_{P19}(y;u)\) relation is delineated by the dash-dot lines. The exact relations are delineated in the top panels by the dashed lines. The error \(|\varepsilon|\) is the fractional error.

The fractional error \(\varepsilon\) is defined according to the relation whose truncation error we are interested in. The exact ray properties computed via numerical quadrature were generated for an array of \(\cos\alpha\) values, yielding deflections. For approximative inverse relations such as \(x_{P19}(y;u)\), we calculate \(\tilde{\alpha}\) given those exact deflections, and define \(\varepsilon=|\alpha - \tilde{\alpha}|/\alpha\). For the deflect() module we compute \(\tilde{\psi}\) and define \(\varepsilon=|\psi - \tilde{\psi}|/\psi\).

However, for the derivative \(\mathcal{D}\), the error is \(\varepsilon=|\mathcal{D} - \tilde{\mathcal{D}}|/\mathcal{D}\). The rayXpanda error curves—one for deflect(), and one for invert()—pertaining to \(\mathcal{D}\) thus visibly converge with increasing \(\cos\alpha\).

The addition of the logarithmic term by Poutanen (2019) has the effect that in the limit \(y\to2^{-}\), \(x_{P19}\to\infty\), forcing the relation \(x_{P19}(y;u)\) to cross the exact relation. The convergence also diverges. The rayXpanda relation is more accurate in this limit, and for most of the deflection domain \(\cos\psi\in[-1,1]\). However, the \(x_{P19}(y;u)\) performs better for \(\cos\psi\lesssim-0.9\), until the \(\cos\psi\to-1\) limit. The accuracy of rayXpanda relative to \(x_{P19}(y;u)\) is a function of \(u\).

_images/spherical_star_performance.png

Truncation error comparison, for a spherical star. The behaviour and error exhibited by invert() is delineated by the solid lines. The error exhibited by deflect() is delineated by the dotted lines in the lower panels. The behavior and error of the \(x_{P19}(y;u)\) relation is delineated by the dash-dot lines. The exact relations are delineated in the top panels by the dashed lines. The error \(|\varepsilon|\) is the fractional error.

For a spherical star, rayXpanda is effectively always more accurate (still considering primary images only) apart from where the \(x_{P19}(y;u)\) relation cross the exact relation. The accuracies become very comparable for more compact stars when \(\cos\psi\to-1\) for \(\cos\alpha>0\).

An important consideration when benchmarking performance is evaluation time. A call from a compiled program to a compiled function \(x_{P19}(y;u)\) is estimated to require \(\mathcal{O}(10^{1})\) ns. A call to the compiled shared objects of rayXpanda is estimated to require \(\mathcal{O}(10^{3})\) ns. However, if one is calling such functions via a Python extension module, the overhead dominates the processor time to evaluated \(x_{P19}(y;u)\), resulting in an evaluation time of \(\mathcal{O}(10^{3})\) ns. Therefore, when called from a Python program, the expense is almost commensurate per ray. Ideally, however, one would make far fewer calls through the Python/C API than ray evaluations—for instance, by using vectorised functions such as deflect_vec() and invert_vec(). A loop over rays in an extension is a likely context, and in this case rayXpanda is more expensive than a low-level implementation of \(x_{P19}(y;u)\).

Future development

The current version of rayXpanda only treats the ray deflection as a function of the local ray angle (including the derivative for the convergence property). Possible extensions include treatment of the lag—relative to a radial ray—via an expansion. However, for various applications such as neutron star pulse-profile modelling, the gravitational delay makes a much smaller contribution to signal calculation, and the importance decays with spin frequency.

One could also devise an improvement for rays characterised by \(\cos\alpha<0\). The integral equation given above involving Heaviside function is suggestive of the same expansion being applicable. However, one difficulty is additional dependence on \(\cos\alpha\) because the lower limit of the integral is \(r_{c}(b)\) instead of \(R\).

rayXpanda also only considers primary images, and it remains unclear if there is any viable expansion approach to approximate the properties of higher-order images.

Finally, an obvious way to improve accuracy is to push the expansion and reversion orders higher, whilst paying attention to precision loss due to number representation and operation inadequacies, and to the total number of floating point operations.

Footnotes

[1]In units of the gravitational radius: \(r_{c}(b)=\frac{2b}{\sqrt{3}}\cos\left[\frac{1}{3}\tan^{-1}\sqrt{\frac{b^{2}}{27} - 1} - \frac{\pi}{3}\right]\), where \(b\) is also in units of the gravitational radius.
[2]Note that for \(\cos\alpha<0\), \(x\in[1,2]\), but the combination \(\sin\alpha=\sqrt{x(2-x)}\in[0,1]\) remains. The expansion with respect to \(x\) would thus be possible in principle due to the small function of \(x\). The integral must however be rewritten in an alternate form.
[3]In the absence of opaque surfaces that obscure images, as is the case for a neutron star.
[4]We applied sympy, both for series expansion and reversion, and integration.
[5]Meaning that the coefficients of the polynomial in \(x\), which are polynomials themselves in \(u\) are also evaluated using a single-variable Horner scheme. Subsequent coefficient evaluations are interleaved with the topmost Horner scheme that organises the evaluation with respect to \(x\).