I can't seem to find a three-parameter t-distribution in numpy/scipy and I know JAGS has one. It turns out it comes from the "jrmath" library which is a part of R.
And from modules/bugs/distributions/DT.cc:
Which gets you the three-parameter version, and suggests that the following NumPy code should work nicely:
...and in fact generating a large sample from this, as well as from R, followed by a quantile-quantile plot gives some confirmation that this works.
This could be considered the historical-materialist approach to coding statistical functions. I'm sure there is a reference text for this sort of thing somewhere but it's not the sort of thing which jumps to mind for a biologist.
It helps to know that numpy.random is coded expecting that everything is parameterized by scale parameters, whereas JAGS is coded expecting that (almost?) everything is parameterized with rate parameters.
/*
* Mathlib : A C Library of Special Functions
* Copyright (C) 1998 Ross Ihaka
* Copyright (C) 2000-2008 The R Development Core Team
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, a copy is available at
* http://www.r-project.org/Licenses/
*
* DESCRIPTION
*
* Pseudo-random variates from a t distribution.
*
* NOTES
*
* This function calls rchisq and rnorm to do the real work.
*/
#include "nmath.h"
double rt(double df, RNG *rng)
{
if (ISNAN(df) || df <= 0.0) ML_ERR_return_NAN;
if(!R_FINITE(df))
return norm_rand(rng);
else {
/* Some compilers (including MW6) evaluated this from right to left
return norm_rand(rng) / sqrt(rchisq(df) / df); */
double num = norm_rand(rng);
return num / sqrt(rchisq(df, rng) / df);
}
}
And from modules/bugs/distributions/DT.cc:
double DT::r(vector const &par, RNG *rng) const
{
return rt(DF(par), rng) / sqrt(TAU(par)) + MU(par);
}
Which gets you the three-parameter version, and suggests that the following NumPy code should work nicely:
def student_t_3(loc, df, scale, size):
tau = 1/(scale**2)
t_draw = np.random.standard_t(df = df, size = size)
scaled_t_draw = t_draw / np.sqrt(tau)
return(scaled_t_draw + loc)
...and in fact generating a large sample from this, as well as from R, followed by a quantile-quantile plot gives some confirmation that this works.
This could be considered the historical-materialist approach to coding statistical functions. I'm sure there is a reference text for this sort of thing somewhere but it's not the sort of thing which jumps to mind for a biologist.
It helps to know that numpy.random is coded expecting that everything is parameterized by scale parameters, whereas JAGS is coded expecting that (almost?) everything is parameterized with rate parameters.
No comments:
Post a Comment