Description
This RFC proposes adding a special function extension to the array API specification.
Overview
Several array libraries have some support for "special" functions (e.g. gamma
), that is, mathematical functions that are broadly applicable but not considered to be "elementary" (e.g. sin
). We1 propose adding a special
sub-namespace to the array API specification, which would contain a number of special functions that are already implemented by many array libraries.
Prior Art
We begin with 25 particularly important special functions that are either already available for NumPy, PyTorch, CuPy, and JAX arrays or are easily implemented. Partial information about their signatures in these libraries is included in the table below; parameters that are less commonly supported/used are omitted.
With the exception of log-sum-exp functions, which reduces along an axis, all work elementwise, producing an output that is the broadcasted shape of the arguments. The variable names shown are not necessarily those used by the referenced library; instead they are standardized with x
/z
/n
denoting an arguments of real/complex/integer dtype.
Further information about these functions in other languages (C++, Julia, Mathematica, Matlab, and R) is available in this spreadsheet.
Proposal
The Array API specification would include the following functions in a special
sub-namespace.
Function | Array API (proposed) | Name/interface change notes |
---|---|---|
log-sum-exp | log_sum_exp(z, /, *, axis=-1, weights=None) |
Enforce naming consistency |
logit | logit(x, /) |
Unchanged (standard) |
expit | expit(x, /) |
Unchanged (standard) |
log of normal CDF | log_normcdf(a, b=None, /) |
Existing name is cryptic |
normal CDF | normcdf(a, b=None, /) |
Existing name is cryptic |
normal CDF inverse | normcdf_inv(p, /, *, a=None, b=None) |
Existing name is cryptic |
digamma | digamma(z, /) |
Unchanged (standard) |
polygamma | polygamma(n, x) |
Unchanged (standard) |
multigammaln | log_multigamma(x, n) |
Enforce naming consistency |
log of absolute value of gamma | log_abs_gamma(z, /, *, a=None, b=None, regularized=None) |
Existing name imprecise |
gamma | gamma(z, /, *, a=None, b=None, regularized=None) |
Interface generalized |
log of absolute value of beta | log_abs_beta(x1, x2, /, *, a=None, b=None) |
Existing name imprecise |
beta | beta(x1, x2, /, *, a=None, b=None) |
Interface generalized |
erf | erf(a, b=None, /) |
Interface generalized |
erv inverse | erf_inv(p, /, *, a=None, b=None) |
Interface generalized |
zeta | zeta(x1, x2=None, /) |
Unchanged (standard) |
binomial coefficient | binom(x1, x2, /) |
New to most libraries |
exponential integral | expinti(x, /) |
Existing name is cryptic |
generalized exponential integral | expintv(n, x) |
Existing name is cryptic |
softmax | softmax(z, /) |
Unchanged (standard) |
log of softmax | log_softmax(z, /) |
Unchanged (standard) |
A few notes about the interface:
- The names of some special functions are not universal across all libraries, and some are relics of the conventions of their underlying implementations rather than meaningful names. In these cases, we recommend taking the uncomfortable step of given these functions human-readable names rather than propagating the legacy tokens. For example, the CDF of the normal distribution is commonly implemented as
ndtr
; we call itnormcdf
(as it is named in Matlab). - Many special functions compute the logarithm of another special function; this avoids intermediate over/underflow compared to evaluating the function before taking its logarithm. However, the naming scheme of such functions is often inconsistent, even within a given library. For example, SciPy includes
loggamma
to compute the log of thegamma
function,betaln
to compute the log of thebeta
function, andlog_ndtr
to compute the log of thendtr
function. For consistency, we form the name of the log-version of a function by prependinglog_
to the original function name. - Many special functions compute the complement of another special function; this avoids loss of precision compared to evaluating the function before taking its complement. For example, SciPy includes
erf
to evaluate a particular definite integral from-oo
tox
anderfc
to evaluate the integral fromy
to+oo
without the potential for catastrophic cancellation associated with1 - erf(x)
. A related, unmet need is the ability to evaluate such integrals fromx
toy
without subtraction, e.g.erf(y) - erf(x)
. To better meet this need - and to avoid the need for a separate "complementary" functions - we provide arguments that allow specification ofa
andb
limits of integration. - Many special functions compute the inverse of another special function with respect to one of the arguments, but the naming scheme of such functions is often inconsistent even within a given library. For instance, SciPy includes
ndtri
to compute the inverse ofndtr
anderfinv
to compute the inverse oferf
. For consistency, we form the name of the inverse of a function by appending_inv
to the name. - In some cases, the argument with respect to which a function is inverted is ambiguous. In these cases, we resolve the inconsistency by requiring some arguments to be passed as keywords; the function would produce the inverse with respect to the argument that is not passed.
- Some special functions have "regularized", "normalized", or "scaled" variants. For instance, SciPy has
gamma
for the (unregularized) gamma function andgammainc
for the regularized incomplete gamma function. In these cases, we have only one function with a keyword argument (e.g.regularized
). In some cases, this helps to reduce duplication of similar function names and signatures; in others, it allows developers to be more explicit about which variant is being used.
Where applicable, we find that these conventions generalize well to other special functions that might be added in the future.
Other notes about function selection:
- The binomial coefficient function (
binom
) does not seem to be implemented for PyTorch, CuPy, or JAX arrays, but the need is so fundamental that we wish to include it in the standard. A moderately robust version of the function can be implemented in terms of the log of the gamma function until a more robust, custom implementation is available. - The PyTorch version of the log-sum-exp function does not support "weights" or "scale factors". Nonetheless, we propose that such a feature should be part of the standard.
Questions / Points of Discussion:
- Some array libraries implement special functions only for real arguments, but many applications require the the extension of these functions to complex arguments. We invite discussion about how to approach this. Can the standard specify that complex arguments should be accepted (with corresponding keyword names involving
z
rather thanx
) even if some libraries are not compliant initially? - For functions with both
log_
and_inv
components in the name, the order of operations is ambiguous. For example, wouldlog_normcdf_inv
(which would be useful in statistics) be the logarithm of the inverse ofnormcdf
or the inverse of the logarithm ofnormcdf
?- One proposal for resolving the ambiguity would be to use attributes rather than function names to organize logarithms and inverse functions. For example, instead of separate functions
normcdf
,normcdf_inv
,log_normcdf
, andlog_normcdf_inv
,normcdf
would have attributesnormcdf.log
andnormcdf.inv
, andnormcdf.log
would have an attributenormcdf.log.inv
. - Another possibility is for
log_
andinv_
to both be prefixes. However,_inv
typically appears as a suffix in existing special function names, perhaps because the superscript$-1$ that denotes inversion often appears after the function symbol, e.g.$f^{-1}(x)$ . - The natural alternative is for both
_log
and_inv
to be suffixes. However,log
typically appears as a prefix in existing function names, perhaps because this is how the function appears when typeset mathematically, e.g.$log(f(x))$ .
- One proposal for resolving the ambiguity would be to use attributes rather than function names to organize logarithms and inverse functions. For example, instead of separate functions
- Consider the Python
range
function: it is natural forrange(y)
to denote a range with an upper limit ofy
and forrange(x, y)
to generate a range betweenx
andy
. However, if the arguments were allowed to be specified as keywords, it would be unclear how they should be named. The userange(y)
suggests that the name of the first argument might bestop
, butrange(x, y)
suggests that the name of the first argument should bestart
; assigning either name and allowing both positional and keyword specification leads to confusion. To avoid this ambiguity,range
requires that the arguments be passed as positional-only. We run into a similar situation with oura
andb
arguments. After carefully considering many possibilities, we have suggested the following above:- Functions for which the first two arguments are
a
/b
require that these arguments are positional-only. - Functions with other arguments before
a
/b
require that these arguments are keyword-only.
- Functions for which the first two arguments are
- The downside of these conventions for
a
/b
is that they are somewhat restrictive. Users cannot callnormcdf(a=x, b=y)
with keywords to be explicit, nor can they be callgamma(z, x, y)
without keywords to be concise. A compromise would be to accept separate positional-only and keyword-only versions of the same argument, and implement logic to resolve the intended use. While this is anticipated to allow for both natural and flexible use, it would be somewhat more cumbersome to document and implement. - The default value of the
regularized
argument ofgamma
is challenging to choose.- The incomplete gamma function (e.g.
gamma(z, upper=y)
) will typically be regularized, suggesting that aregularized=True
default is more appropriate for this use case. - The regularized complete gamma function (e.g.
gamma(z)
) is identically 1, suggesting thatregularized=False
is more appropriate for this use case. - We can get the desired behavior by default in both cases by using
regularized=None
. Whengamma
is used as the complete gamma function (withouta/b
),regularized
would be set toFalse
, and whengamma
is used as the incomplete gamma function (witha/b
,regularized
would be set toTrue
. However, this is more complex to document than choosing eitherTrue
orFalse
as the default.
- The incomplete gamma function (e.g.
- Many of the argument names - and especially whether they should be positional-only or not - are up for debate. For instance, the two arguments of
binom
are not interchangeable, suggesting that some users might prefer to pass arguments by keyword. On one hand,n
andk
would be reasonable names, since the binomial coefficient is often needed in situations that call for "n choose k". On the other hand, the namesn
andk
are not entirely universal, and the function is extended for real arguments, whereas namesn
andk
are suggestive of integer dtypes. Also, whilea
andb
are concise names that are commonly used for lower and upper limits of integration, they are not as descriptive aslower
/upper
, and might be confused with the symbols commonly used for different arguments of the same function (e.g.beta
).low
/high
,lo/hi
,ll
/ul
,c
/d
have also been proposed.