sktree.experimental.simulate.simulate_helix
- sktree.experimental.simulate.simulate_helix(radius_a=0, radius_b=1, obs_noise_func=None, nature_noise_func=None, alpha=0.005, n_samples=1000, return_mi_lb=False, random_seed=None)
Simulate data from a helix.
- Parameters:
- radius_a
int, optional The value of the smallest radius, by default 0.0.
- radius_b
int, optional The value of the largest radius, by default 1.0.
- obs_noise_func
callable, optional By default None, which defaults to a Uniform distribution on (-0.005, 0.005). If passed in, it must be a callable that returns a random number denoting the noise each time it is called.
- nature_noise_func
callable, optional By default None, which adds no noise. The nature noise function is an independent noise term added to P before it is passed to the generation of the X, Y, and Z terms.
- alpha
float, optional The value of the noise, by default 0.005.
- n_samples
int, optional Number of samples to generate, by default 1000.
- return_mi_lb
bool, optional Whether to return the mutual information lower bound, by default False.
- random_seed
int, optional The random seed.
- Returns:
- P
array_like of shape (n_samples,) The sampled P.
- X
array_like of shape (n_samples,) The X dimension.
- Y
array_like of shape (n_samples,) The Y dimension.
- Z
array_like of shape (n_samples,) The Z dimension.
- lb
float The mutual information lower bound.
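The generating process described in the Notes can be sketched in plain NumPy. This is an illustration, not the library's implementation: the helper name helix_sketch is hypothetical, the observational noise is assumed to be Uniform(-alpha, alpha), the division is read as applying to the whole product (i.e. dividing by 8π), and nature noise is left off, as it is by default.

```python
import numpy as np


def helix_sketch(radius_a=0.0, radius_b=1.0, alpha=0.005, n_samples=1000, seed=None):
    """Hypothetical re-implementation of the documented helix process."""
    rng = np.random.default_rng(seed)

    # R ~ Unif(radius_a, radius_b), one draw per sample.
    R = rng.uniform(radius_a, radius_b, size=n_samples)

    # P = 5*pi + 3*pi*R
    P = 5 * np.pi + 3 * np.pi * R

    # Observational noise N_1, N_2, N_3, drawn independently per sample.
    # Assumption: Uniform(-alpha, alpha).
    N = rng.uniform(-alpha, alpha, size=(3, n_samples))

    # Nature noise (epsilon terms) is zero here, matching the default.
    X = P * np.cos(P) / (8 * np.pi) + N[0]
    Y = P * np.sin(P) / (8 * np.pi) + N[1]
    Z = P / (8 * np.pi) + N[2]
    return P, X, Y, Z


P, X, Y, Z = helix_sketch(n_samples=200, seed=0)
```

With the default radii, P lies in [5π, 8π], so Z stays close to the interval [5/8, 1] and the (X, Y) pair traces a spiral whose radius grows with P.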
Notes
Data is generated as follows: We first sample a radius that defines the helix, \(R \sim \mathrm{Unif}(\mathrm{radius\_a}, \mathrm{radius\_b})\). Afterwards, we generate one sample as follows:
\[
\begin{aligned}
P &= 5\pi + 3\pi R \\
X &= \frac{(P + \epsilon_1) \cos(P + \epsilon_1)}{8\pi} + N_1 \\
Y &= \frac{(P + \epsilon_2) \sin(P + \epsilon_2)}{8\pi} + N_2 \\
Z &= \frac{P + \epsilon_3}{8\pi} + N_3
\end{aligned}
\]
where \(N_1, N_2, N_3\) are noise variables that are independently sampled for each sample point, and \(\epsilon_1, \epsilon_2, \epsilon_3\) are "nature noise" terms, which are off by default. This process is repeated n_samples times.
Note that this forms the graphical model:
\[
R \rightarrow P, \quad P \rightarrow X, \quad P \rightarrow Y, \quad P \rightarrow Z
\]
such that P is a confounder among X, Y and Z. This implies that X, Y and Z are conditionally dependent on P, whereas