treeple.datasets.make_joint_factor_model
- treeple.datasets.make_joint_factor_model(n_views, n_features, n_samples=100, joint_rank=1, noise_std=1, m=1.5, random_state=None, return_decomp=False)
Joint factor model data generator.
Samples from a low rank, joint factor model where there is one set of shared scores.
- Parameters:
- n_views
int
Number of views to sample. This corresponds to
B
in the notes.
- n_features
int, or list of int
Number of features in each view. A list specifies a different number of features for each view.
- n_samples
int
Number of samples in each view.
- joint_rank
int
(default 1) Rank of the common signal across views.
- noise_std
float
(default 1) Scale of noise distribution.
- m
float
(default 1.5) Signal strength.
- random_state
int or RandomState instance, optional (default=None)
Controls random orthonormal matrix sampling and random noise generation. Set for reproducible results.
- return_decomp
bool, default=False
If True, also returns the joint scores U and the view_loadings.
- Returns:
- Xs
list of array-likes
List of sample data matrices with the following attributes.
Xs length: n_views
Xs[i] shape: (n_samples, n_features_i).
- U: (n_samples, joint_rank)
The true orthonormal joint scores matrix. Returned if
return_decomp
is True.
- view_loadings:
list of numpy.ndarray
The true view loadings matrices. Returned if
return_decomp
is True.
Notes
The data is generated as follows, where:
- \(b\) indexes the different views.
- \(U\) is a (n_samples, joint_rank) orthonormal matrix of joint scores.
- svals are the sampled singular values.
- \(W_b\) are (n_features_b, joint_rank) view loadings matrices, which are orthonormal matrices to linearly transform the data, while preserving inner products (i.e. a unitary transformation).
- For b = 1, ..., B:
X_b = U @ diag(svals) @ W_b^T + noise_std * E_b
where U and each W_b are orthonormal matrices and E_b is the noise matrix for view b. The singular values are linearly increasing, following [1], section 2.2.3.
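The generative model in these notes can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation: the helper name `sample_joint_factor_model`, the QR-based sampling of orthonormal matrices, the singular value scaling `m * [1, ..., joint_rank]`, and the standard normal noise are all assumptions made for the example.

```python
import numpy as np


def sample_joint_factor_model(n_features, n_samples=100, joint_rank=1,
                              noise_std=1.0, m=1.5, seed=None):
    """Hypothetical sampler for X_b = U @ diag(svals) @ W_b.T + noise_std * E_b."""
    rng = np.random.default_rng(seed)

    def random_orthonormal(n_rows, n_cols):
        # Reduced QR of a Gaussian matrix yields orthonormal columns
        # (requires n_rows >= n_cols).
        q, _ = np.linalg.qr(rng.normal(size=(n_rows, n_cols)))
        return q

    # Shared joint scores U with shape (n_samples, joint_rank).
    U = random_orthonormal(n_samples, joint_rank)

    # ASSUMPTION: linearly increasing singular values scaled by m.
    # The library's exact scaling may differ.
    svals = m * np.arange(1, joint_rank + 1, dtype=float)

    Xs, view_loadings = [], []
    for d in n_features:
        # View loadings W_b with shape (n_features_b, joint_rank).
        W = random_orthonormal(d, joint_rank)
        # ASSUMPTION: noise entries are i.i.d. standard normal.
        E = rng.normal(size=(n_samples, d))
        Xs.append(U @ np.diag(svals) @ W.T + noise_std * E)
        view_loadings.append(W)
    return Xs, U, view_loadings


# Two views with 5 and 8 features, sharing rank-2 joint scores.
Xs, U, view_loadings = sample_joint_factor_model(
    [5, 8], n_samples=50, joint_rank=2, seed=0
)
```

The outputs mirror the shapes listed under Returns: each `Xs[i]` is (n_samples, n_features_i), `U` is (n_samples, joint_rank), and each loading matrix is (n_features_i, joint_rank).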
References