In this model setup one considers latent (random) variables
$ Y_i $ described by:
$$
\begin{array}{ccccc}
Y_1 & = & \sum_k M_k a_{1,k} & + \sqrt{1-\sum_k a_{1,k}^2}\, Z_1 & \sim \Phi_{Y_1} \\
... & = & ... & ... & \\
Y_i & = & \sum_k M_k a_{i,k} & + \sqrt{1-\sum_k a_{i,k}^2}\, Z_i & \sim \Phi_{Y_i} \\
... & = & ... & ... & \\
Y_N & = & \sum_k M_k a_{N,k} & + \sqrt{1-\sum_k a_{N,k}^2}\, Z_N & \sim \Phi_{Y_N}
\end{array}
$$
where the systemic $ M_k $ and idiosyncratic $ Z_i $ (the latter
known as the error term in some contexts) random variables are
independent, with zero mean and unit variance. A restriction of the
model implemented here is that the N idiosyncratic variables all follow
the same probability law $ \Phi_Z(z) $ (but they are still
independent random variables). The model is also normalized
so that $-1\leq a_{i,k} \leq 1$ (technically the $Y_i$ are
convex linear combinations); the square root above further requires
$\sum_k a_{i,k}^2 \leq 1$. Each $Y_i$ then has unit variance, and the
correlation between $Y_i$ and $Y_j$ for $i \neq j$ is
$\sum_k a_{i,k} a_{j,k}$.
$\Phi_{Y_i}$ denotes the cumulative distribution function of
$Y_i$, which in general differs for each latent variable.
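
The construction above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `idiosyncraticWeight`, `latentValue` and `latentCorrelation` are hypothetical and do not belong to the class template, and the loadings are passed as plain vectors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers for illustration; not members of the class template.

// Weight of the idiosyncratic term Z_i: sqrt(1 - sum_k a_{i,k}^2).
double idiosyncraticWeight(const std::vector<double>& a) {
    double sumSq = 0.0;
    for (double x : a) sumSq += x * x;
    assert(sumSq <= 1.0);  // otherwise the square root is not real
    return std::sqrt(1.0 - sumSq);
}

// Latent variable Y_i for one draw of the systemic factors m and of z.
double latentValue(const std::vector<double>& a,
                   const std::vector<double>& m, double z) {
    assert(a.size() == m.size());
    double y = idiosyncraticWeight(a) * z;
    for (std::size_t k = 0; k < a.size(); ++k) y += a[k] * m[k];
    return y;
}

// Correlation between Y_i and Y_j (i != j): sum_k a_{i,k} a_{j,k}.
double latentCorrelation(const std::vector<double>& ai,
                         const std::vector<double>& aj) {
    assert(ai.size() == aj.size());
    double rho = 0.0;
    for (std::size_t k = 0; k < ai.size(); ++k) rho += ai[k] * aj[k];
    return rho;
}
```

For example, two variables with loadings (0.6, 0) and (0.3, 0.4) on two factors have correlation 0.6 × 0.3 + 0 × 0.4 = 0.18.
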

In its single-factor setup this model is usually employed in derivative
pricing, and it is best used through integration of the desired
statistical properties of the model. In its multifactor version (with
typically around a dozen factors) it is used in the context of portfolio
risk metrics; because of the number of variables, simulation is the
better choice for computing model properties/magnitudes.
For this reason this class template provides a random factor sample
interface and an integration interface that will be instantiated by
derived concrete models as needed. The class is neutral on the
integration and random generation algorithms.

The latent variables are typically treated as unobservable magnitudes,
and they serve to model one or several magnitudes related to them
through some function
$$
\begin{array}{ccc}
F_i(Y_i) & = & F_i\left(\sum_k M_k a_{i,k} + \sqrt{1-\sum_k a_{i,k}^2}\, Z_i \right) \\
& = & F_i(M_1, ..., M_k, ..., M_K, Z_i)
\end{array}
$$
The transfer function can have a more generic form,
$F_i(Y_1,...,Y_N)$, but here the model is restricted to a one-to-one
relation between the latent variables and the modelled ones. It is
also assumed that $F_i(y_i; \tau)$ is monotonic in $y_i$; it
can then be inverted, and the relation between the cumulative
probabilities of $F_i$ and $Y_i$ is simple:
$$
\int_{-\infty}^b \phi_{F_i} \, df =
\int_{-\infty}^{F_i^{-1}(b)} \phi_{Y_i} \, dy
$$
If $t$ is some value of the functional or modelled variable,
$y$ is mapped to $t$ such that percentiles match, i.e.
$F_Y(y)=Q_i(t)$ or $y=F_Y^{-1}(Q_i(t))$. Here $F_Y$ is the
cumulative distribution function of $Y$ (written $\Phi_{Y_i}$ above),
not the transfer function, and $Q_i$ is the cumulative distribution
function of the modelled variable.
The class provides an integration facility for arbitrary functions
dependent on the model states. It also provides random number generation
interfaces for use of the model in Monte Carlo simulations.
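
As a hedged illustration of the percentile matching described above, suppose the modelled variable is a default time with a constant hazard rate $\lambda$, so that $Q_i(t) = 1 - e^{-\lambda t}$, and that all factors are Gaussian, so that $F_Y$ is the standard normal cumulative $\Phi$. Both the exponential law and the numbers are assumptions chosen for the example; the class does not prescribe them. Then
$$
y(t) = \Phi^{-1}\left(1 - e^{-\lambda t}\right), \qquad
\lambda = 0.02,\; t = 5 \;\Rightarrow\;
Q_i(5) = 1 - e^{-0.1} \approx 0.0952, \quad y(5) \approx -1.31
$$
so a default before $t = 5$ corresponds to the latent variable falling below roughly $-1.31$.
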

Now let $\Phi_Z(z)$ be the cumulative distribution function of the
$Z_i$ (all equal, as mentioned). For a given realization of $M_k$,
this determines the distribution of $y$:
$$
Prob\,(Y_i < y|M_k) = \Phi_Z \left( \frac{y-\sum_k a_{i,k}\,M_k}
{\sqrt{1-\sum_k a_{i,k}^2}}\right)
\qquad \mbox{or} \qquad
Prob\,(t_i < t|M) = \Phi_Z \left( \frac
{F_{Y_{i}}^{-1}(Q_i(t))-\sum_k a_{i,k}\,M_k}
{\sqrt{1-\sum_k a_{i,k}^2}}
\right)
$$
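
The first conditional probability above translates almost literally into code. The sketch below assumes a standard normal law for $Z_i$, which is only one possible choice, and the names `normalCdf` and `conditionalCumulative` are hypothetical rather than part of the documented interface.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Standard normal cumulative distribution, written with std::erfc.
// Assumption: Z_i is Gaussian; the model itself allows other laws.
double normalCdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Prob(Y_i < y | M = m) for loadings a and a realization m of the factors.
double conditionalCumulative(double y, const std::vector<double>& a,
                             const std::vector<double>& m) {
    assert(a.size() == m.size());
    double systemic = 0.0;  // sum_k a_{i,k} m_k
    double sumSq = 0.0;     // sum_k a_{i,k}^2
    for (std::size_t k = 0; k < a.size(); ++k) {
        systemic += a[k] * m[k];
        sumSq += a[k] * a[k];
    }
    assert(sumSq < 1.0);  // the idiosyncratic weight must be positive
    return normalCdf((y - systemic) / std::sqrt(1.0 - sumSq));
}
```

With a positive loading, a higher factor realization shifts $Y_i$ upwards and therefore lowers the probability of falling below a fixed $y$.
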
The distribution functions of $ M_k, Z_i $ are specified in
specific copula template classes. The distribution function
of $ Y_i $ is then given by the convolution
$$
F_{Y_{i}}(y) = Prob\,(Y_i<y) =
\int_{-\infty}^\infty\,\cdots\,\int_{-\infty}^{\infty}\:
D_Z(z)\,\prod_k D_{M_{k}}(m_k) \;
\Theta \left(y - \sum_k a_{i,k}m_k -
\sqrt{1-\sum_k a_{i,k}^2}\,z\right)\,d\bar{m}\,dz,
\qquad
\Theta (x) = \left\{
\begin{array}{ll}
1 & x \geq 0 \\
0 & x < 0
\end{array}\right.
$$
where $ D_Z(z) $ and $ D_{M_k}(m_k) $ are the probability
densities of $ Z $ and $ M_k $, respectively.

This convolution can also be written
$$
F_{Y_{i}}(y) = Prob\,(Y_i < y) =
\int_{-\infty}^\infty\,\cdots\,\int_{-\infty}^{\infty}
\prod_k D_{M_{k}}(m_k)\,dm_k\:
\int_{-\infty}^{g(y,\vec{a},\vec{m})} D_Z(z)\,dz,
\qquad
g(y,\vec{a},\vec{m}) = \frac{y - \sum_k a_{i,k}m_k}
{\sqrt{1-\sum_k a_{i,k}^2}}, \qquad \sum_k a_{i,k}^2 < 1
$$
In general, $ F_{Y_{i}}(y) $ needs to be computed numerically.
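
A minimal sketch of such a numerical computation follows, under strong simplifying assumptions: a single factor, Gaussian laws for both $M$ and $Z$, and a plain trapezoidal rule on a truncated range. None of this is the integration scheme of the class, which is deliberately neutral on the algorithm; the names are hypothetical. The Gaussian case is convenient as a check because $Y_i$ is then itself standard normal.

```cpp
#include <cassert>
#include <cmath>

// Standard normal density and cumulative (assumed laws for M and Z).
double normalPdf(double x) {
    const double pi = 3.14159265358979323846;
    return std::exp(-0.5 * x * x) / std::sqrt(2.0 * pi);
}

double normalCdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// F_Y(y) for one factor with loading a: the integral over m of
// D_M(m) * Phi_Z(g(y, a, m)), by the trapezoidal rule on [-8, 8].
double unconditionalCdf(double y, double a, int steps = 2000) {
    assert(a * a < 1.0 && steps > 0);
    const double lo = -8.0, hi = 8.0;  // tails beyond are negligible here
    const double h = (hi - lo) / steps;
    const double scale = std::sqrt(1.0 - a * a);
    double sum = 0.0;
    for (int j = 0; j <= steps; ++j) {
        const double m = lo + j * h;
        const double w = (j == 0 || j == steps) ? 0.5 : 1.0;
        sum += w * normalPdf(m) * normalCdf((y - a * m) / scale);
    }
    return sum * h;
}
```

In this Gaussian setting `unconditionalCdf(y, a)` should agree with `normalCdf(y)` for any admissible loading, which gives a simple sanity check before moving to laws without a closed form.
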

The policy class template separates the copula function (the
distributions involved) from the functionality (i.e. what the latent
model represents: a default probability, a recovery...). Since the
copula methods for the
probabilities are called repeatedly from an integration or a Monte Carlo
simulation, virtual tables are avoided and template parameter mechanics
are preferred.
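
The design choice can be illustrated with a small policy-based sketch. Every name in it (`GaussianPolicySketch`, `LogisticPolicySketch`, `LatentModelSketch`, `cumulativeZ`, `conditionalCumulative`) is hypothetical and chosen for the example; it is not the interface of the actual class template, and it covers only the idiosyncratic law.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// All names below are hypothetical; this is not the library's interface.

// Copula policy with a standard normal idiosyncratic law.
struct GaussianPolicySketch {
    double cumulativeZ(double z) const {
        return 0.5 * std::erfc(-z / std::sqrt(2.0));
    }
};

// Copula policy with a logistic idiosyncratic law scaled to unit variance.
struct LogisticPolicySketch {
    double cumulativeZ(double z) const {
        const double scale = std::sqrt(3.0) / 3.14159265358979323846;
        return 1.0 / (1.0 + std::exp(-z / scale));
    }
};

// The policy is a template parameter, so cumulativeZ is resolved at
// compile time and can be inlined; no virtual dispatch is involved.
template <class CopulaPolicy>
class LatentModelSketch {
  public:
    explicit LatentModelSketch(std::vector<double> loadings,
                               CopulaPolicy policy = CopulaPolicy())
    : a_(std::move(loadings)), policy_(policy) {
        double sumSq = 0.0;
        for (double x : a_) sumSq += x * x;
        assert(sumSq < 1.0);
        idiosyncraticWeight_ = std::sqrt(1.0 - sumSq);
    }

    // Prob(Y_i < y | M = m) for the variable with these loadings.
    double conditionalCumulative(double y,
                                 const std::vector<double>& m) const {
        assert(m.size() == a_.size());
        double systemic = 0.0;
        for (std::size_t k = 0; k < a_.size(); ++k) systemic += a_[k] * m[k];
        return policy_.cumulativeZ((y - systemic) / idiosyncraticWeight_);
    }

  private:
    std::vector<double> a_;
    CopulaPolicy policy_;
    double idiosyncraticWeight_ = 1.0;
};
```

Replacing `LatentModelSketch<GaussianPolicySketch>` with `LatentModelSketch<LogisticPolicySketch>` changes the law of $Z_i$ without touching the model code, and the call to `cumulativeZ` inside an integration or simulation loop remains a direct, inlinable call.
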

Nothing at this level enforces the requirement that the factor
distributions have zero mean and unit variance. That is
the user's responsibility, and the model fails to behave correctly if it
is not the case.

Derived classes should implement a modelled magnitude (default time,
etc.) and will provide probability distributions and conditional values.
They could also provide functionality for the parameter inversion
problem, e.g. the time at which the modelled variable first takes a
given value. Whether this problem has a solution, or makes sense,
depends on the characteristics of the transfer function $F_i(Y_i)$.

To make direct integration and simulation time-efficient, virtual
functions have been avoided in accessing methods in the copula policy
and in the sampling of the random factors.
Generic multifactor latent variable model.