## Sunday, September 11, 2011

### Three parameter t distribution (in JAGS, but not numpy.random)

I can't seem to find a three-parameter t-distribution in numpy/scipy and I know JAGS has one.  It turns out it comes from the "jrmath" library which is a part of R.

/* *  Mathlib : A C Library of Special Functions *  Copyright (C) 1998 Ross Ihaka *  Copyright (C) 2000-2008 The R Development Core Team * *  This program is free software; you can redistribute it and/or modify *  it under the terms of the GNU General Public License as published by *  the Free Software Foundation; either version 2 of the License, or *  (at your option) any later version. * *  This program is distributed in the hope that it will be useful, *  but WITHOUT ANY WARRANTY; without even the implied warranty of *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the *  GNU General Public License for more details. * *  You should have received a copy of the GNU General Public License *  along with this program; if not, a copy is available at *  http://www.r-project.org/Licenses/ * *  DESCRIPTION * *    Pseudo-random variates from a t distribution. * *  NOTES * *    This function calls rchisq and rnorm to do the real work. */#include "nmath.h"double rt(double df, RNG *rng){    if (ISNAN(df) || df <= 0.0) ML_ERR_return_NAN;    if(!R_FINITE(df))  return norm_rand(rng);    else {/* Some compilers (including MW6) evaluated this from right to left  return norm_rand(rng) / sqrt(rchisq(df) / df); */  double num = norm_rand(rng);  return num / sqrt(rchisq(df, rng) / df);    }}

And from modules/bugs/distributions/DT.cc:

double DT::r(vector const &par, RNG *rng) const{    return rt(DF(par), rng) / sqrt(TAU(par)) + MU(par);}

Which gets you the three-parameter version, and suggests that the following NumPy code should work nicely:

def student_t_3(loc, df, scale, size):    tau = 1/(scale**2)    t_draw = np.random.standard_t(df = df, size = size)    scaled_t_draw = t_draw / np.sqrt(tau)    return(scaled_t_draw + loc)

...and in fact generating a large sample from this, as well as from R, followed by a quantile-quantile plot gives some confirmation that this works.

This could be considered the historical-materialist approach to coding statistical functions.  I'm sure there is a reference text for this sort of thing somewhere but it's not the sort of thing which jumps to mind for a biologist.

It helps to know that numpy.random is coded expecting that everything is parameterized by scale parameters, whereas JAGS is coded expecting that (almost?) everything is parameterized with rate parameters.