Methodology

Methodology described in this vignette is adapted from the article “BIGL: Biochemically Intuitive Generalized Loewe null model for prediction of the expected combined effect compatible with partial agonism and antagonism” (2017) by K. Van der Borght, A. Tourny, R. Bagdziunas, O. Thas, M. Nazarov, H. Turner, B. Verbist and H. Ceulemans (doi:10.1038/s41598-017-18068-5) as well as its technical supplement. We advise the reader to consult it for a deeper understanding of the procedure described next.

Further chapters were added as extensions on top of the original article regarding variance heterogeneity, Bliss independence and alternative Loewe generalization.

Marginal monotherapy curves

First, a monotherapy model is described by the following equation.

$$ y\left(d\right) = b + \dfrac{m - b}{1 + \left(\frac{\operatorname{EC50}}{d}\right)^{|h|}} $$

where y is the response (or effect), d is the dose (or concentration) of the compound, h is the Hill’s coefficient and b and m are respectively baseline and maximum response for that compound. Lastly, EC50 stands for the dose level of the compound needed to attain the midpoint effect, i.e. $$y\left(\textrm{EC50}\right) = b + \frac{m - b}{2}$$

Note that m > b if and only if the response is increasing with the dose of the compound. If the response is decreasing, then m < b.

This monotherapy equation is estimated for both compounds with the constraint that b, the baseline level, is shared across compounds. This baseline level is denoted by b in the parameter vector. Additionally, m1 and m2 in the parameter vector stand for estimates of maximal responses m1 and m2, respectively, whereas h1 and h2 are Hill’s coefficients (slope) of the monotherapy curve for each compound. Lastly, e1 and e2 are log-transformed inflection points, i.e. e1  = log (EC501) and e2  = log (EC502).

Null models of no synergy

Occupancy

Define the occupancy level occup, i.e. the fractional (enzymatic) effect or observed effect relative to maximal effect, for both compounds at given dose levels as $$ \textrm{occup}_{1}\left(d_{1}\right) = \frac{1}{1 + \left(\frac{\operatorname{EC50}_{1}}{d_{1}}\right)^{h_{1}}} $$ $$ \textrm{occup}_{2}\left(d_{2}\right) = \frac{1}{1 + \left(\frac{\operatorname{EC50}_{2}}{d_{2}}\right)^{h_{2}}} $$ Alternatively, the above equations can be rearranged to express dose in terms of occupancy so that $$ d_{1} = \operatorname{EC50}_{1} \left(\frac{1}{\operatorname{occup_{1}}} - 1 \right)^{-1/h_{1}} $$ $$ d_{2} = \operatorname{EC50}_{2} \left(\frac{1}{\operatorname{occup_{2}}} - 1 \right)^{-1/h_{2}} $$ Although the occupancy was considered here in the marginal case, it is equally well-defined when compounds are combined and is understood as the fraction of enzyme bound to any compound. It can thus be used to re-express classical Loewe additivity equations.

Classical Loewe model

In the classical Loewe model where both marginal models share upper (m) and lower (b) asymptotes, occupancy is defined as the solution to this additivity equation for each dose combination (d1, d2), namely $$\frac{d_1\left(\textrm{occup}^{-1} - 1\right)^{1/h_{1}}}{\textrm{EC50}_{1}}+ \frac{d_2\left(\textrm{occup}^{-1} - 1\right)^{1/h_{2}}}{\textrm{EC50}_{2}} = 1$$

Once occupancy is computed, in the classical Loewe model the predicted response at dose combination (d1, d2) can be calculated to be

Generalized Loewe model

Generalized Loewe model extends the classical Loewe model by allowing compounds to have different upper asymptotes so that when adjusted, the above predicted response is written instead as $$ y = b + \textrm{occup} \times \left[\frac{\left(m_{1} - b\right) d_{1}\left(\textrm{occup}^{-1} - 1\right)^{1/h_{1}}}{\textrm{EC50}_{1}} + \frac{\left(m_{2} - b\right) d_{2}\left(\textrm{occup}^{-1} - 1\right)^{1/h_{2}}}{\textrm{EC50}_{2}}\right]$$

In particular, if m1 = m2, then generalized Loewe is equivalent to the classical Loewe.

Highest Single Agent

A null model based on the Highest Single Agent (HSA) model does not attempt to model interaction effects at all and the predicted effect of a combination is either the minimum (if marginal curves are decreasing) or the maximum (if marginal curves are increasing) of both monotherapy curves.

Bliss independence model

Bliss independence implies that two agents do not cooperate, i.e. act independently of each other. Additionally, the assumption is that a decreasing monotherapy curves express the fractions of unaffected control populations, while increasing curves express the fractions of affected control populations.

Bliss independence model is formulated for the fractional responses f (“fraction affected”), where the predicted response f12 at dose combination (d1, d2) is defined as:

f12(d1, d2) = f1(d1) + f2(d2) − f1(d1)f2(d2),

with $$f_1(d_1) = \frac{y\left(d_1\right) - b}{m_1 - b} = \frac{1}{1 + \left(\frac{\operatorname{EC50}_{1}}{d_{1}}\right)^{|h_{1}|}}$$ $$f_2(d_2) = \frac{y\left(d_2\right) - b}{m_2 - b} = \frac{1}{1 + \left(\frac{\operatorname{EC50}_{2}}{d_{2}}\right)^{|h_{2}|}}$$ In the classical Bliss independence model, marginal models share baseline and maximum response.

To allow the compounds to have different maximal responses, the fractional responses are rescaled to the maximum range (i.e. absolute difference between baseline and maximal response). Then the predicted response is defined as:
$$ y = b + (m_{max}-b) \left[ \tilde{f_1}(d_1) + \tilde{f_2}(d_2) - \tilde{f_1}(d_1)\tilde{f_2}(d_2) \right], $$ where mmax is one of m1 or m2, for which the value of |mi − b| is larger, and $$ \tilde{f_i} = f_i\frac{m_i-b}{m_{max}-b}~~\text{for}~~i = 1, 2.$$

This implementation of Bliss independence supports both compounds with decreasing and increasing monotherapy profiles. However using one compound with a decreasing profile and another with an increasing profile in combination is not supported.

Alternative Loewe Generalization

An alternative generalization of Loewe Additivity for the case of different asymptotes can be defined as a combination of Loewe and HSA approaches as follows:

In a classical Loewe equation, predicted response y at a given dose combination (d1, d2) can be found by solving the equation:

$$ \frac{d_1}{D_1(y)} + \frac{d_2}{D_2(y)} = 1, $$

where $D_i(y) = \operatorname{EC50}_{i}\left(\frac{y-b}{m_i-y}\right)^{\frac{1}{|h_i|}}$, for i = 1, 2 is the dose of the i-th compound that gives response y. Note that here Di is properly defined only if y is between b and mi.

For the case of different asymptotes, say when y > m1 (increasing curve) or y < m1 (decreasing curve), we set D1(y) = +∞, so that the y is determined from the equation d2 = D2(y), replicating what is done in the HSA approach.

Calculation procedure

In order to evaluate any of the null models described above, the fitSurface function will use the monotherapy parameter estimates from the previous step. The idea is if there are synergistic or antagonistic effects, then administration of both compounds will lead to important deviations from what combined monotherapy data would suggest according to the null model. Routines within fitSurface function do essentially the following.

  1. Find occupancy for each combination of doses by solving the additivity equation of the classical Loewe model. This step does not require knowledge of the baseline or maximal response for either of the compounds. Occupancy solution is also reported in the HSA model case although occupancy plays no role in such a model.
  2. Compute the predicted response based on the above described response equations and the previously computed occupancy rate for each dose combination.
  3. If desired, the function will then calculate the selected statistic to evaluate the deviation of the predictions from the desired null model.

Synergy evaluation assuming equal variances for on- and off-axis points

Synergy is evaluated for all off-axis dose combinations, i.e. dose combinations that are not used in the monotherapy curve estimation. Synergy evaluation depends on the underlying null model and any of the above models, i.e. generalized or classical Loewe or Highest Single Agent, can be used for this purpose. We provide here a brief summary of both statistical tests. Technical derivations and further details are available in the article cited at the beginning of the document.

To define test statistics, the following notations are used.

  • Let yij be the observed effect for replicate j of dose combination i, so that y11, y12, y13, y21, ..., yknk is a set of observed effects. We assume k different dose combinations and nk replicates for each combination. The number of different off-axis dose combinations is denoted as n1.
  • p1, ..., pk are the predicted responses for the k off-axis dose combinations.
  • σ2 is the variance of the replicate observations, assumed to be constant over all dose combinations, and estimated by taking MSE of the null model.
  • df0 is the number of degrees of freedom from the marginal model estimation.

We construct a vector R = (r1, ..., rk) which represents mean deviation from the predicted effect. In particular, $$ r_{k} = \frac{1}{n_{k}} \sum_{i = 1}^{n_{k}} y_{ki} - p_{k} $$

With the help of bootstrapping, the covariance matrix of R can be estimated under the null hypothesis of no synergy so that Var (R) = σ2(D + Cp) where D is a diagonal matrix with 1/ni in the i-th position and Cp is the covariance matrix obtained from bootstrap.

meanR

The meanR test will evaluate whether the null model globally fits well the observed data. It is derived using a lack-of-fit sum of squares technique. In particular, the test statistic is given by $$ \operatorname{TestStat} = \frac{R^{T}\left(D + C_{p}\right)^{-1}R}{n_{1}\sigma^{2}} $$ Assuming that residuals from the generalized Loewe model are normally distributed, it can be shown that this statistic follows an Fn1, df0 distribution under the null. If these assumptions are not satisfied, the null distribution can be approximated by bootstrapping.

maxR

The maxR test evaluates whether the null model locally fits the observed data. In particular, it provides a test score for each off-axis combination. Based on the sign of this score, it can be determined whether synergy or antagonism is more likely and a formal test can be constructed. Under the null hypothesis of no lack-of-fit and normally distributed effects, max |RT(D + Cp)−1/2|/σ ∼ max |Z1, …, Zk| where Zj ∼ N(0, 1). More particularly, the test statistic for the k-th off-axis dose combination (d1, d2) is computed as TestStat (d1, d2) = [|RT(D + Cp)−1/2|/σ]k where [⋅]k indicates the k-th coordinate. This test statistic is then compared either to the null distribution based on normal approximation or a bootstrapped approximation.

Synergy evaluation in case of variance heterogeneity

In the methodology described above one important assumption is made regarding the variance of the on- and off-axis dose combinations. It is considered to be equal across all points. This assumption is also mentioned in the original article and its technical supplement.

In reality it is often seen that the variance of the monotherapies is not equal to the variance of the off-axis combinations. The assumption of equal variances is thus not always valid. That is why the meanR and maxR test-statistics can also estimate the variances for on-axis (monotherapies) and off-axis dose-combinations separately. Two extra methods are described below: the unequal method (Separated variance) and the model method (Modeled variance). For both methods replicates are required and no variance-stabilizing transformations are required. The latter is often necessary when assuming equal variances.

Adapted meanR

The adapted meanR test uses two separate variance estimates for (a) the monotherapies (= σ02) and (b) the dose combinations (= Σ1, a diagonal matrix). The notation for both unequal as model will be the same, but the estimation of Σ1 will be different. The variance of the monotherapies σ02 is estimated as σ2 above by taking the MSE of the null model. The test statistic is given by:

$$ \operatorname{TestStat} = \frac{R^{T}\left(\Sigma_{1}D + \sigma^{2}_{0}C_{p}\right)^{-1}R}{n_{1}} $$

  1. unequal method: The variance for the dose combinations is estimated by taking the variance in each dose combination and then taking the mean of all these variances, thus Σ1 = σ12In1. The downside of this method is that the variance for all combinations is assumed to be equal. In reality the variance often depends on the mean effect. This is taken into account in the model method.

  2. model method: In this method the diagonal elements of Σ1 are no longer estimated as single number but rather as a vector of variances. Each off-axis point has now its own variance. A linear model is fitted on the original dataset, modeling the variance of each off-axis point as a function of its mean effect. The estimated model parameters are then used to predict the variance for the corresponding mean effect measured for that dose combination. These predicted variances are placed in the diagonal of Σ1. Modelling the variance with a linear model may require a transformation, to achieve a better fit and to avoid negative variances being modelled. A log-transformation often makes a good impression.

Adapted maxR

The same approach is taken for the adapted maxR test statistic. Instead of using one estimated variance for both on- and off-axis points, two separate estimates are used. The estimates for Σ1 are different depending on the method used (unequal or model). The methodology of estimating the variance is the same as was described in the “Adapted meanR” section above.

The maxR test becomes

max |RT(Σ1D + σ02Cp)−1/2| ∼ max |Z1, …, Zk|

where Zj ∼ N(0, 1). In particular, the test statistic for the k-th off-axis dose combination (d1, d2) is computed as TestStat (d1, d2) = [|RT(Σ1D + σ02Cp)−1/2|]k where [⋅]k indicates the k-th coordinate.

Bootstrapping under unequal variances

In case of the unequal variance assumption, the bootstrap proceeds as before, with the off-axis residuals being pooled and resampled. With the model assumption, the resampling is more complicated, as the residuals are no longer exchangeable. One option is to rescale the observed residuals according to the mean-variance model (i.e. dividing them by their standard deviations), resample from this pool of standardized residuals, and then scale back to the true variance (by multiplying by the standard deviation). Yet this approach has proven to be unstable as it leads to extreme observations. An alternative (the default) is to generate zero-mean normal data with the modelled variances (see the rescaleResids argument in fitSurface()).

Advantages of unequal and model methods compared to assumption of equal variances

The assumption of equal variances between monotherapies and off-axis dose-combinations fails to control the type I error rate around pre-specified level, when the variance of off-axis points increases (natural variance or outliers). This results in false positive synergy calls when in reality there were none. Both the unequal and the model methods control the type I error rate far better, with slightly better results obtained by the model method.

Furthermore, the sensitivity and specificity of the maxR test statistics are higher with the methods assuming variance heterogeneity compared to the methods where equal variances are assumed.

Effect size for off-axis points

As with many statistical tests, the researcher may not only be interested in a measure of significance (e.g. a p-value), but also in a measure of effect size, and a measure of the imprecision of this estimated effect size. Here we develop confidence intervals for two types of effect sizes. The first is a pointwise effect size, which is defined at every off axis point as the difference between the true mean response and the expected response under additivity. It is estimated as $E_{i} = \frac{1}{n_i}\sum_{j=1}^{n_i}(R_{ij}-\hat{R}_i)$ with j = 1, ..., ni, for every off-axis point i = 1, ..., n1, whereby we strive to achieve a simultaneous coverage for all off-axis points of 95%.

Confidence interval

Let ei denote the true effect size on off axis point i, and call Ei its estimate based on the data. Relying on the asymptotic normality of the estimator, an approximate (asymptotic) confidence interval would be formed as the set:

with i the estimated standard error of Ei, and zα/2 the 1 − α/2 quantile of the standard normal distribution.

We know, however, from the meanR and maxR tests that the asymptotic distributions provide poor approximations. Therefore, we use the bootstrap here too to build the confidence intervals.

For every bootstrap instance, bootstrap observations are sampled for on- as well as off axis points. For the on-axis points, a parametric bootstrap based on the estimated monothereapy curves is used, as for the calculation of the meanR and maxR statistics. Based on these on-axis bootstrap samples, new monotherapy curves are fitted with resulting residual variances bootstrap variance MSE0b, and corresponding response surfaces with expected outcomes ib are derived. For the off-axis points, with ni replicates at a given point, Aij = Rij − i are the pointwise residuals j = 1, ..., ni. Here Rij is the observed outcome and i, the average outcome at point i, serves as an unbiased estimator of the true response. Note that these residuals are different from the residuals Ei = i − i used to construct the test statistics; for Aij the departure with respect to the mean outcome at that off-axis point is used. These residuals are resampled with replacement from the observed residuals, possibly using rescaling as explained below. The resampled residuals are then added to the estimated effect sizes to obtain bootstrapped observations Rijb = i + Aijb, with b = 1, ..., B denoting the bootstrap instance. This leads to the bootstrap effect sizes Eib = ib − ib and test statistics $\left\vert \frac{E^b_i -E_i}{\hat{s}^b_i} \right\vert$, using the standard deviations ib from the bootstrap.

Over all bootstrap instances, we then find the distribution of

Call tα the threshold such that

Finally, we find for every off-axis point the confidence interval as the collection:

Standard error

As an estimate of the standard error we use

with Fb equal to MSE0bD, MSE1bD or Sb, depending on which variance model is used. This standard error resembles the variance estimators of the meanR and maxR statistics discussed there.

By using only the diagonal elements of Fb, we ignore the covariance between test statistics. We will use the studentised-range concept (which is also central to Tukey’s method for multiple testing) for controlling the family-wise error rate (FWER). For the covariance matrix Cp we use the one estimated for the observed data. This matrix turns out to be quite stable over the bootstrap runs; moreover the calculation of the matrix for each bootstrap sample would imply a time-consuming nested bootstrap procedure. On the other hand, MSE0b is re-estimated with each bootstrap sample. Also Fb is re-estimated based on the bootstrapped data.

Single effect measure

Researchers may also want to have a single measure for the strength of the synergy for a single experiment, e.g. in view of ranking compound combinations according to their synergistic effect. For this we calculate the average of the pointwise off-axis effect sizes. This is estimated as $\bar{E} = \frac{1}{\sum_{i= 1}^{n_1}n_i }\sum_{i=1}^{n_1}\sum_{j= 1}^{n_i}(R_{ij}-\hat{R}_i)$. It may be considered as a measure of the “volume” between the expected and observed response surfaces, similar to the “integrated synergy” of . Note that this effect size may cause synergistic and antagonistic points to cancel out against one another, but we believe this scenario is unlikely. As before we would like a measure of imprecision for this effect size.

Confidence interval

The construction of the bootstrap confidence interval for the single effect size follows the general procedure for bootstrap-t confidence intervals . Let F denote the estimated covariance matrix of the raw residuals as before.

Using Equation in the main text and the fact that

the standard error of is $se(\bar{E}) = \frac{1}{n_1}\sqrt{\sum_{f \in \boldsymbol{F}} f}$. We obtain bootstrapped data for on- and off-axis points for bootstrap instances b = 1, ..., B, and calculate the corresponding mean residual and standard error b and se(b). Then define the statistic

Note that se(b) relies on the bootstrap covariance matrix Fb. Call (qα/2, q1 − α/2) the quantiles of the bootstrap distribution such that

Finally, we find the confidence interval as

Of course, this procedure can easily be adapted to find the sum of all raw residuals, but the mean may be better comparable across experiments with different designs.

On resampling residuals

Depending on the mean-variance structure in the data, residuals are resampled differently. The same strategies are used for resampling 1) residuals with respect to the expected response surface Rij under the null hypothesis for the meanR and maxR statistics as for 2) residuals Aij with respect to the average at the off-axis point under the alternative hypothesis for constructing the confidence intervals. In the description below, we use Uij as generic notation for either Rij or Aij.

In case of constant variability within the off-axis points, the residuals Uij can simply be pooled and resampled with replacement.

When a linear mean-variance structure is assumed, one option is to rescale the pointwise residuals first to $b_{ij} = \frac{U_{ij}}{\sqrt{v^{-1}(\beta_0+\beta_1\mu_i)}}$, then pool and resample them, and then scale them back to $b_{ij}\sqrt{v^{-1}(\beta_0+\beta_1\mu_i)}$ according to their new position i. Yet in practice this leads to extreme observations, which destabilizes the fitting procedure. A second option is to use random draws from a zero-mean normal distribution with the modelled variance v−1(β0 + β1μi) to generate new Uij’s. This latter option was found to be more stable and is used as a default in the BIGL package.