###### Stepped wedge study design

Given the further delay in starting up the study, as described in the Design section, the sample size requirements for this study were revisited again.

The previous target of starting baseline data collection in April 2016 is no longer tenable because no resources are available to start the baseline. The main implication of this delay is that the implementation window of the study is narrowing and the number of data collection rounds may shrink, which has major knock-on effects on sample size. It has been decided to drop the baseline altogether and keep the start of interventions in May 2016. We would start incidence data collection in the first week of May 2016 for 2 weeks, continue with the second round of incidence data collection in June, and conduct the first round of stepped wedge data collection by the end of June 2016. Keeping the number of steps at 4, the final stepped wedge data collection would fall in the last 2 weeks of December 2016. This option roughly maintains the amount of time previously allocated for data analysis and will ensure that we deliver our outputs to 3ie by the March 2016 deadline.

However, dropping the baseline has a sample size implication. A baseline round has two benefits for the study: it reduces both the overall and the per-cluster sample size requirement, and it increases the power of the study to detect differences. In general, a baseline makes for a stronger study. Losing it therefore requires an increase in sample size to make up for the information lost. The sample size increase is reflected in the following calculations:

$k = 4 \ \text {steps}$
$b = 0 \ \text {baseline measurement}$
$t = 1 \ \text {measurement after each step}$
$p = 0.034 \ \text {intra-cluster correlation coefficient}$
$n = 192 \ \text {cluster size}$
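
For reference, the design effect of Woertman et al. (endnote 1) that these parameters enter can be written symbolically as below. The symbol $N_u$ is notation assumed here for the unadjusted sample size (the figure 1804 in the calculation that follows plays this role), and $n$ is taken to be the cluster size:

$$n_{\text {stepped wedge}} = N_u \times \frac {1 + p(ktn + bn - 1)}{1 + p \left (\frac {1}{2} ktn + bn - 1 \right )} \times \frac {3(1 - p)}{2t \left (k - \frac {1}{k} \right )}$$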

Given these parameters, we arrive at the following sample size:

\begin{align}
n_{\text {stepped wedge}} &= 1804 \times \frac {1 + 0.034(4 \times 1 \times 192 + 0 \times 192 - 1)}{1 + 0.034 \left (\frac {1}{2} \times 4 \times 1 \times 192 + 0 \times 192 - 1 \right )} \times \frac {3(1-0.034)}{2 \times 1 \left (4 - \frac {1}{4} \right )} \\ \\
&= 1804 \times \frac {1 + 0.034(767)}{1 + 0.034(383)} \times \frac {3(0.966)}{2 \left (\frac{15}{4} \right )} \\ \\
&= 1804 \times \frac {27.078}{14.022} \times \frac {2.898}{7.5} \\ \\
&= 1804 \times 1.93110826 \times 0.3864 \\ \\
&\approx 1346
\end{align}

This sample size is 206 higher than in the original design. It requires about 7 clusters of 192 each (1346 / 192 ≈ 7). It is still possible to keep the 6-cluster study structure, but we would then need a minimum of about 224 samples within each study cluster (1346 / 6 ≈ 224).
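
The design-effect arithmetic above can be reproduced with a short script. This is a minimal sketch rather than the study's actual tooling: the function name is hypothetical, and treating 1804 as the unadjusted sample size is an assumption based on how it is used in the calculation.

```python
def stepped_wedge_design_effect(k, b, t, n, icc):
    """Design effect in the form used in the calculation above.

    k: number of steps, b: baseline measurements,
    t: measurements after each step, n: cluster size,
    icc: intra-cluster correlation coefficient (p in the text).
    """
    correlation_term = (1 + icc * (k * t * n + b * n - 1)) / (
        1 + icc * (0.5 * k * t * n + b * n - 1)
    )
    steps_term = 3 * (1 - icc) / (2 * t * (k - 1 / k))
    return correlation_term * steps_term


# 1804 is the multiplier used in the text; treating it as the unadjusted
# sample size is an assumption of this sketch.
unadjusted = 1804
design_effect = stepped_wedge_design_effect(k=4, b=0, t=1, n=192, icc=0.034)
stepped_wedge_n = unadjusted * design_effect

print(f"design effect: {design_effect:.4f}")
print(f"stepped wedge sample size: {stepped_wedge_n:.0f}")
```

Re-running the function with `b=1` and the other inputs unchanged gives roughly 1140, which is consistent with the difference of 206 stated above if the original design differed only in having one baseline measurement.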

This sample size increase is modest and can be accommodated with minor adjustments in design. This option is also less disruptive than the previous option: it can be implemented without changing the study design considerably and without needing to negotiate yet another extension of the deadline.

###### Incidence sub-study

For the incidence sub-study, we apply the sample size calculation, expressed in person-years $y_{\text {person-years}}$, proposed by Hayes and Bennett⁸ for an individually-randomised cluster controlled trial as follows:

$$y_{\text {person-years}} = \left (Z_{\frac {\alpha}{2}} + Z_\beta \right) ^ 2 \times \frac {\lambda_0 + \lambda_1}{(\lambda_0 - \lambda_1) ^ 2}$$

where

$\lambda_0 = \text {incidence rate in control group}$
$\lambda_1 = \text {incidence rate in intervention group}$

We use a value of $\lambda_0 = 0.32$ (assuming a prevalence rate of 20% in the control group) and a value of $\lambda_1 = 0.24$ (assuming a prevalence rate of 15% in the intervention group). This gives us a sample size for one arm of the incidence study of:

$$y_{\text {person-years}} = (1.96 + 0.84) ^ 2 \times \frac {0.32 + 0.24}{(0.32 - 0.24) ^ 2} \approx 686$$
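
As a quick check of the person-years figure, the formula can be evaluated directly. The sketch below is illustrative only: the function name is hypothetical, and the defaults 1.96 and 0.84 are simply the $Z$ values used in the calculation above.

```python
def person_years_per_arm(rate_control, rate_intervention,
                         z_alpha=1.96, z_beta=0.84):
    """Person-years needed in one arm to detect a difference in rates."""
    numerator = rate_control + rate_intervention
    denominator = (rate_control - rate_intervention) ** 2
    return (z_alpha + z_beta) ** 2 * numerator / denominator


# Incidence rates taken from the text: 0.32 (control), 0.24 (intervention).
per_arm = person_years_per_arm(0.32, 0.24)
both_arms = 2 * per_arm

print(f"per arm: {per_arm:.0f}, both arms: {both_arms:.0f}")
```

Because the difference in rates is squared in the denominator, a small change in the assumed $\lambda_1$ moves the requirement considerably, so it may be worth evaluating a few alternative rates before fixing the target.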

For both arms, we would therefore need a total sample size of 1372 (2 × 686). To calculate the number of clusters needed based on this sample size, we use the following formula:

$$n_{\text {clusters}} = 1 + \left (Z_{\frac {\alpha}{2}} + Z_\beta \right ) ^ 2 \times \frac {\frac {\lambda_0 + \lambda_1}{y_{\text {person-years}} + k ^ 2 (\lambda_0 ^ 2 + \lambda_1 ^ 2)}}{(\lambda_0 - \lambda_1) ^ 2}$$

where

$k = \text {intra-cluster correlation coefficient which we set at 0.034}$

The formula gives us:

$$n_{\text {clusters}} = 1 + (1.96 + 0.84) ^ 2 \times \frac {\frac {0.32 + 0.24}{1372 + 0.034 ^ 2 (0.32 ^ 2 + 0.24 ^ 2)}}{(0.32 - 0.24) ^ 2} \approx 2$$

So, we will need a sample of 1372 (686 per arm) from 2 clusters (one in each study arm).
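
The cluster count can be checked in the same way. The sketch below evaluates the expression exactly as it is written in this section, with $k$ set to 0.034 and $y$ set to the two-arm total of 1372; the function name is hypothetical, and rounding the result up to a whole cluster is an assumption of this sketch.

```python
import math


def clusters_as_written(y, k, rate_control, rate_intervention,
                        z_alpha=1.96, z_beta=0.84):
    """Evaluate the cluster-count expression as written in this section."""
    inner = (rate_control + rate_intervention) / (
        y + k ** 2 * (rate_control ** 2 + rate_intervention ** 2)
    )
    rate_gap_squared = (rate_control - rate_intervention) ** 2
    return 1 + (z_alpha + z_beta) ** 2 * inner / rate_gap_squared


raw_clusters = clusters_as_written(1372, 0.034, 0.32, 0.24)
# Assumption: a fractional cluster is rounded up to the next whole cluster.
n_clusters = math.ceil(raw_clusters)

print(f"raw value: {raw_clusters:.2f}, clusters needed: {n_clusters}")
```

The unrounded value is close to 1.5, so the figure of 2 clusters depends on rounding up.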

###### Endnotes

1 See Woertman, Willem, Esther de Hoop, Mirjam Moerbeek, Sytse U Zuidema, Debby L Gerritsen, and Steven Teerenstra. “Stepped Wedge Designs Could Reduce the Required Sample Size in Cluster Randomized Trials.” Journal of Clinical Epidemiology 66, no. 7 (July 1, 2013): 752–58. doi:10.1016/j.jclinepi.2013.01.009.

2 As recommended by Hayes, R J, and S Bennett. “Simple Sample Size Calculation for Cluster-Randomized Trials.” International Journal of Epidemiology 28, no. 2 (April 1999): 319–26. doi:10.1093/ije/28.2.319.

3 Based on ICC recommendations in Kaiser, Reinhard, Bradley A Woodruff, Oleg Bilukha, Paul B Spiegel, and Peter Salama. “Using Design Effects From Previous Cluster Surveys to Guide Sample Size Calculation in Emergency Settings.” Disasters 30, no. 2 (May 31, 2006): 199–211. doi:10.1111/j.0361-3666.2006.00315.x.

4 As recommended by Prudhon, Claudine, and Paul B Spiegel. “A Review of Methodology and Analysis of Nutrition and Mortality Surveys Conducted in Humanitarian Emergencies From October 1993 to April 2004.” Emerging Themes in Epidemiology 4, no. 1 (2007): 10. doi:10.1186/1742-7622-4-10.

5 This is based on sample size simulations for a PROBIT estimator for GAM prevalence conducted by Brixton Health and Valid International (documentation available on request).

6 This will be the same number of samples needed for each of the target groups for each of the outcome measures to be assessed directly from the main study. These target groups are 1) PLW (for measurement of prevalence among PLW); 2) MAM children 6-59 months old (for measurement of MAM treatment coverage); 3) children 6-23 months (for measurement of eBSFP or blanket FBPM coverage); 4) children 6-23 months at risk (for measurement of targeted FBPM coverage).

7 The number of steps and the number of measurements per step affect the sample size in a stepped wedge design.

8 This will be the same number of samples needed for each of the target groups for each of the outcome measures to be assessed directly from the main study. These target groups are 1) PLW (for measurement of prevalence among PLW); 2) MAM children 6-59 months old (for measurement of MAM treatment coverage); 3) children 6-23 months (for measurement of eBSFP or blanket FBPM coverage); 4) children 6-23 months at risk (for measurement of targeted FBPM coverage).

9 See Hayes, R J, and S Bennett. “Simple Sample Size Calculation for Cluster-Randomized Trials.” International Journal of Epidemiology 28, no. 2 (April 1999): 319–26. doi:10.1093/ije/28.2.319.