treeple.experimental.simulate.simulate_helix

treeple.experimental.simulate.simulate_helix(radius_a=0, radius_b=1, obs_noise_func=None, nature_noise_func=None, alpha=0.005, n_samples=1000, return_mi_lb=False, random_seed=None)

Simulate data from a helix.

Parameters:
radius_a : int, optional

The value of the smallest radius, by default 0.0.

radius_b : int, optional

The value of the largest radius, by default 1.0.

obs_noise_func : callable, optional

By default None, in which case the noise is drawn from a uniform distribution on (-0.005, 0.005). If provided, it must be a callable that returns a random number, used as the noise term, each time it is called.

nature_noise_func : callable, optional

By default None, which adds no noise. The nature noise is an independent noise term added to P before P is used to generate the X, Y, and Z terms.

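alpha : float, optional

The magnitude of the noise, by default 0.005. This matches the bounds of the default observational noise, which is uniform on (-0.005, 0.005).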

n_samples : int, optional

Number of samples to generate, by default 1000.

return_mi_lb : bool, optional

Whether to return the mutual information lower bound, by default False.

random_seed : int, optional

The seed for the random number generator, by default None.

Returns:
P : array_like of shape (n_samples,)

The sampled P.

X : array_like of shape (n_samples,)

The X dimension.

Y : array_like of shape (n_samples,)

The Y dimension.

Z : array_like of shape (n_samples,)

The Z dimension.

lb : float

The mutual information lower bound. Only returned if return_mi_lb is True.
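As an illustration of the two noise parameters, the sketch below defines a pair of noise callables and passes them to the function. The helper names `gaussian_obs_noise` and `small_nature_noise`, the noise scales, and the use of the standard-library `random` module are assumptions made for this example. The call itself is guarded so the snippet still runs when treeple is not installed.

```python
import random

# Hypothetical noise sources for illustration; they use their own seeded
# generator so the noise is reproducible on its own.
_noise_rng = random.Random(0)


def gaussian_obs_noise():
    """Return one draw of observational noise (assumed Gaussian, std 0.01)."""
    return _noise_rng.gauss(0.0, 0.01)


def small_nature_noise():
    """Return one draw of nature noise (assumed uniform on [-0.1, 0.1])."""
    return _noise_rng.uniform(-0.1, 0.1)


try:
    from treeple.experimental.simulate import simulate_helix
except ImportError:
    simulate_helix = None  # treeple is not installed; skip the call below

if simulate_helix is not None:
    # With return_mi_lb left at False, four arrays are expected back.
    P, X, Y, Z = simulate_helix(
        obs_noise_func=gaussian_obs_noise,
        nature_noise_func=small_nature_noise,
        n_samples=500,
        random_seed=0,
    )
```

Leaving both callables as None falls back to the documented defaults: uniform observational noise and no nature noise.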

Notes

Data are generated as follows. We first sample a radius that defines the helix, \(R \sim \mathrm{Unif}(\mathrm{radius\_a}, \mathrm{radius\_b})\). We then generate one sample as follows:

P = 5\pi + 3\pi R
X = \frac{(P + \epsilon_1) \cos(P + \epsilon_1)}{8\pi} + N_1
Y = \frac{(P + \epsilon_2) \sin(P + \epsilon_2)}{8\pi} + N_2
Z = \frac{P + \epsilon_3}{8\pi} + N_3

where \(N_1, N_2, N_3\) are observational noise variables that are sampled independently for each sample point, and \(\epsilon_1, \epsilon_2, \epsilon_3\) are “nature noise” terms, which are off by default. This process is repeated n_samples times.
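The sampling procedure described above can be sketched in plain Python as follows. This is an illustrative reading of the equations, not the library's implementation. The function name `sketch_helix`, its `seed` argument, the use of the standard-library `random` module, and the choice to call each noise function afresh for every term are all assumptions made for this example.

```python
import math
import random


def sketch_helix(radius_a=0.0, radius_b=1.0, alpha=0.005, n_samples=1000,
                 obs_noise_func=None, nature_noise_func=None, seed=None):
    """Hypothetical re-implementation of the equations in the Notes."""
    rng = random.Random(seed)
    if obs_noise_func is None:
        # Assumed default: observational noise uniform on [-alpha, alpha].
        obs_noise_func = lambda: rng.uniform(-alpha, alpha)
    if nature_noise_func is None:
        # Nature noise is off by default.
        nature_noise_func = lambda: 0.0

    P, X, Y, Z = [], [], [], []
    scale = 8 * math.pi
    for _ in range(n_samples):
        # R ~ Unif(radius_a, radius_b), then P = 5*pi + 3*pi*R.
        r = rng.uniform(radius_a, radius_b)
        p = 5 * math.pi + 3 * math.pi * r
        # One nature-noise draw per coordinate (epsilon_1..epsilon_3).
        e1, e2, e3 = (nature_noise_func() for _ in range(3))
        P.append(p)
        X.append((p + e1) * math.cos(p + e1) / scale + obs_noise_func())
        Y.append((p + e2) * math.sin(p + e2) / scale + obs_noise_func())
        Z.append((p + e3) / scale + obs_noise_func())
    return P, X, Y, Z


P, X, Y, Z = sketch_helix(n_samples=5, seed=0)
```

With the default radii, R lies in [0, 1], so P lies between 5π and 8π and, before noise is added, Z lies between 0.625 and 1.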

Note that this forms the graphical model:

R \rightarrow P

P \rightarrow X
P \rightarrow Y
P \rightarrow Z

such that P is a confounder among X, Y and Z. This implies that X, Y and Z are conditionally dependent on P, whereas