sweights
sWeights implementation.
sWeights are used to statistically separate signal and background contributions in a data sample when the distributions overlap.
- cal_sweights(pdfs, N=None)
Calculate sWeights for a set of PDF components.
Given normalized PDFs for each component (signal, background, etc.), computes per-event weights whose sum over all events reproduces the yield of each component.
- Parameters:
pdfs (list of array-like) –
List of PDF values evaluated at each data point. Each element should be an array of shape (n_events,).
Two calling conventions are supported:
Normalized PDFs with N parameter:
>>> import numpy as np
>>> from scipy.stats import norm
>>> np.random.seed(42)
>>> x_sig = np.random.normal(5.0, 0.5, 50)
>>> x_bkg = np.random.uniform(0, 10, 100)
>>> x = np.concatenate([x_sig, x_bkg])
>>> f_sig = norm.pdf(x, 5.0, 0.5)
>>> f_bkg = np.ones_like(x) / 10.0
>>> sw = cal_sweights([f_sig, f_bkg], N=[50, 100])
>>> bool(np.isclose(np.sum(sw[0]), 50))
True
Unnormalized PDFs without N parameter:
>>> sw2 = cal_sweights([f_sig * 50, f_bkg * 100])
>>> bool(np.allclose(np.sum(sw2[0]), 50, rtol=0.1))
True
N (array-like, optional) – Expected yields for each component. If None, each element of pdfs is assumed to be already scaled by its yield, and the yields are estimated from those scaled PDFs.
- Returns:
sweights – Array of shape (n_components, n_events) containing sWeights for each component at each event.
- Return type:
ndarray
- Raises:
LinAlgError – If the covariance matrix is singular and cannot be inverted.
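A degenerate matrix typically arises when two components have the same (or nearly the same) shape over the data. The following is a minimal sketch of a pre-check that uses NumPy only and does not call cal_sweights; the inputs, the yields and the 1e12 threshold are assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical inputs: two components with exactly the same (flat) shape.
f_flat = np.full(100, 0.1)
pdfs = np.vstack([f_flat, f_flat])     # shape (n_components, n_events)
yields = np.array([30.0, 70.0])        # assumed yields, for illustration only

p = yields @ pdfs                      # total density at each event
v_inv = (pdfs / p**2) @ pdfs.T         # matrix that has to be inverted

# A huge (or infinite) condition number signals a degenerate matrix.
# The 1e12 threshold is an arbitrary choice for this sketch.
is_degenerate = bool(np.linalg.cond(v_inv) > 1e12)
print(is_degenerate)
```

Checking the condition number rather than waiting for the exception also catches matrices that are only numerically close to singular, where an inversion may succeed but return unreliable values.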
Notes
The sWeights are computed using Cowan’s formula:
\[p(x) = \sum_i N_i f_i(x)\]
\[(V^{-1})_{ij} = \sum_k \frac{f_i(x_k) f_j(x_k)}{p(x_k)^2}\]
\[w_i(x_k) = \frac{\sum_j V_{ij} f_j(x_k)}{p(x_k)}\]
where:
\(N_i\) is the yield for component i
\(f_i(x)\) is the normalized PDF for component i
\(p(x)\) is the total PDF
\(V\) is the covariance matrix of the yields
Reference: sPlot paper
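The three formulas above can be transcribed into NumPy almost line by line. The sketch below is an illustration only: sweights_sketch is a hypothetical helper, not the code behind cal_sweights, and the library function may differ in details such as how it treats the yields.

```python
import numpy as np
from scipy.stats import norm


def sweights_sketch(pdfs, yields):
    """Hypothetical helper: evaluate the formulas from the Notes directly."""
    f = np.asarray(pdfs, dtype=float)      # shape (n_components, n_events)
    n = np.asarray(yields, dtype=float)    # shape (n_components,)
    p = n @ f                              # p(x_k) = sum_i N_i f_i(x_k)
    v_inv = (f / p**2) @ f.T               # sum_k f_i(x_k) f_j(x_k) / p(x_k)^2
    v = np.linalg.inv(v_inv)               # raises LinAlgError if exactly singular
    return (v @ f) / p                     # w_i(x_k) = sum_j V_ij f_j(x_k) / p(x_k)


# Toy data in the spirit of the docstring example (assumed, not from a fit).
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(5.0, 0.5, 50), rng.uniform(0, 10, 100)])
f_sig = norm.pdf(x, 5.0, 0.5)
f_bkg = np.full_like(x, 0.1)

w = sweights_sketch([f_sig, f_bkg], [50, 100])
print(w.shape)
```

With this direct transcription, the weights of a component sum to its yield only when the yields are the ones that maximize the extended likelihood for the given data. For other yield values, such as the generated values used here, the sums are only approximately equal to the yields.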