The aim of this post is to provide a basic idea of sample size estimation in clinical trials. Every clinical trial must be planned. This plan must include the objective of the trial, the primary and secondary endpoints, the method of collecting data, the sample to be included, the sample size with scientific justification, the method of handling data, and the statistical methods and assumptions. This plan is termed the clinical trial protocol. One of the key aspects of the protocol is sample size estimation. The sample size must be planned carefully to ensure that the research time, patient effort and support costs invested in any clinical trial are not wasted.
Ideally, clinical trials should be large enough to reliably detect the smallest difference in the primary outcome that is considered clinically worthwhile. It is not uncommon for studies to be underpowered, failing to detect even large treatment effects because of inadequate sample size. It may also be considered unethical to recruit patients into a study whose sample size is too small for the trial to deliver meaningful information on the tested intervention.
What is the Basic Requirement for Estimation of Sample Size in Clinical Trials?
Estimation of sample size, along with other study-related parameters, depends on the Type I error, the Type II error and the power.
Type I error:
Suppose we want to test two drugs to find out whether brand A is better than brand B in curing subjects suffering from fever. Since both brands contain the same drug, we can expect their effects to be similar. The statistical hypotheses can be expressed as follows:
Null Hypothesis (H0): Brand A is equal to Brand B
Alternate Hypothesis (H1): Brand A is not equal to Brand B, i.e. the effect of brand A can be inferior to the effect of brand B or vice versa.
Let us find out the possible errors that can happen in this situation.
Error 1:
Based on statistical analysis it is concluded that Brand A is better than Brand B; in other words, we reject H0. Knowing that Brand A is in fact equal to Brand B (H0 is true), we are making an error here by rejecting H0. This is called a Type I error. Statistically it is defined as Type I error = P(Reject H0 | H0 is true).
The probability of a Type I error is called the level of significance and is denoted by α (alpha).
Type II error:
Consider that we are testing an intervention against placebo to evaluate whether the intervention is better than placebo in curing subjects suffering from a disease. (It is expected that the effect of the intervention is better than placebo.) Let us try to build a statistical hypothesis around this:
Null Hypothesis (H0): the intervention is equal to placebo
Alternate Hypothesis (H1): the intervention is better than placebo. Let us try to evaluate the error that can happen.
Based on statistical analysis it is concluded that the intervention is equal to placebo; in other words, we accept H0. Knowing that the intervention is in fact better than placebo (H1 is true), we are making an error here by accepting H0. This is called a Type II error.
Statistically it is defined as Type II error = P(Accept H0 | H1 is true).
The probability of a Type II error is denoted by β (beta).
In the above case, if the analysis concludes that the intervention is better than placebo, we reject H0, which is the correct decision. The probability of such a decision is called the "power of the test".
Power = P(Reject H0 | H1 is true) = 1 − β
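The interplay between α, β and power described above can be sketched numerically. Below is a minimal Python illustration (not from the original post) using the two-sample normal approximation with a known standard deviation; the function name and the numbers chosen are illustrative assumptions.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(delta, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-sample z-test.

    delta: true difference in means under H1
    sd:    common standard deviation (assumed known)
    n_per_group: number of subjects in each arm
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)              # critical value for two-sided alpha
    effect = abs(delta) / (sd * sqrt(2 / n_per_group))
    return z.cdf(effect - z_alpha)                  # power = 1 - beta


# With 63 subjects per arm, a true difference of 0.5 SD and alpha = 5%:
power = two_sample_power(delta=0.5, sd=1.0, n_per_group=63)
print(round(power, 2))  # roughly 0.80, i.e. beta is about 0.20
```

Doubling the arm size raises the power towards 1, mirroring the trade-off between sample size and the Type II error.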
Considerations for estimating sample size in clinical trials:
Study Design:
Many statistical designs can be used to achieve the trial objectives. The most commonly used are the parallel-group design and the crossover design. For calculating sample size, the study design should be explicitly defined in the objective of the trial. Each design has a different approach and formula for estimating sample size.
Alternative hypothesis, either one-sided or two-sided:
This is another important parameter needed for sample size estimation, as it reflects the objective of the study. The objective can be equality, non-inferiority, superiority or equivalence. Equality and equivalence trials are two-sided, whereas non-inferiority and superiority trials are one-sided. Superiority or non-inferiority trials can be conducted only if prior information is available about the test drug on a specific endpoint.
Primary endpoint of study:
The sample size calculation depends on the primary endpoint of the study. The description of the primary endpoint should state whether it is discrete, continuous or time-to-event, since sample size is estimated differently for each of these. The sample size must also be adjusted if the primary endpoint involves multiple comparisons.
Expected Response of treatment:
The information about the expected response is usually obtained from previous trials of the test drug. If this information is not available, it can be obtained from previously published literature.
Clinically Important / Meaningful Difference:
This is one of the most critical and most challenging parameters. The challenge is to define a difference between test and reference that can be considered clinically meaningful. The selection of the difference might take account of the severity of the illness being treated (a treatment effect that reduces mortality by one percent might be clinically important, while a treatment effect that reduces transient asthma by 20% may be of little interest). It might take account of the existence of alternative treatments, and also of the treatment's cost and side effects.
Level of Significance:
This is commonly set at 5% or less. The sample size increases as the chosen level of significance (the acceptable Type I error) decreases.
Power:
As per ICH E9, power should not be less than 80%. The sample size increases as the desired power increases, i.e. as the acceptable Type II error decreases.
Withdrawals, missing data and losses to follow up:
Any sample size calculation is based on the total number of subjects who are needed in the final study. In practice, eligible subjects will not always be willing to take part, and it will be necessary to approach more subjects than are needed in the first instance. Subjects may fail or refuse to give valid responses to particular questions, physical measurements may suffer from technical problems, and in studies involving follow-up (e.g. trials or cohort studies) there will always be some degree of attrition. It may therefore be necessary to calculate the number of subjects that need to be approached in order to achieve the final desired sample size. More formally, suppose a total of N subjects are required in the final study but a proportion q is expected to refuse to participate or to drop out before the study ends. In this case N / (1 − q) subjects would have to be approached at the outset to ensure that the final sample size is achieved.
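The dropout adjustment described above amounts to dividing the required sample size by (1 − q) and rounding up. A short Python sketch (the function name and figures are illustrative):

```python
from math import ceil


def adjusted_sample_size(n_required, dropout_rate):
    """Inflate the final required sample size N to allow for a proportion q
    of refusals/dropouts: approach N / (1 - q) subjects at the outset."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout rate must be in [0, 1)")
    return ceil(n_required / (1 - dropout_rate))


# If the final analysis needs 200 subjects and 10% attrition is expected:
print(adjusted_sample_size(200, 0.10))  # 223 subjects must be approached
```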
Components of sample size calculation:
The minimum information needed to calculate sample size for a randomized controlled trial in which a specific event is being counted includes the power, the level of significance, the underlying event rate in the population under investigation and the size of the treatment effect sought. The calculated sample size should then be adjusted for other factors, including expected compliance rates and, less commonly, an unequal allocation ratio.
Power:
The power of a study is its ability to detect a true difference in outcome between the standard or control arm and the intervention arm. This is usually chosen to be 80%. By definition, a study power set at 80% accepts a likelihood of one in five (that is, 20%) of missing such a real difference. Thus, the power for large trials is occasionally set at 90% to reduce to 10% the possibility of a so-called "false-negative" result. Sample size increases as power increases: the higher the power, the lower the chance of missing a real treatment effect.
Level of significance:
The chosen level of significance sets the likelihood of detecting a treatment effect when no effect exists (leading to a so-called "false-positive" result) and defines the threshold "P value". Results with a P value above the threshold lead to the conclusion that an observed difference may be due to chance alone, while those with a P value below the threshold lead to rejecting chance and concluding that the intervention has a real effect. The level of significance is most commonly set at 5% (that is, P = 0.05) or 1% (P = 0.01). This means the investigator is prepared to accept a 5% (or 1%) chance of erroneously reporting a significant effect. The sample size is inversely related to the level of significance, i.e. the sample size increases as the level of significance decreases.
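The effect of tightening the level of significance can be made concrete with the standard per-group formula for comparing two means under a normal approximation, n = 2·σ²·(z₁₋α/₂ + z₁₋β)² / δ². The sketch below is illustrative (the effect size and SD are assumed values, not from the post):

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_means(delta, sd, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of two means
    (normal approximation): n = 2 * sd^2 * (z_{1-a/2} + z_{1-b})^2 / delta^2."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    return ceil(2 * (sd * (z_alpha + z_beta) / delta) ** 2)


# Lowering alpha from 5% to 1% noticeably inflates the required sample size:
print(n_per_group_two_means(delta=0.5, sd=1.0, alpha=0.05))  # 63 per group
print(n_per_group_two_means(delta=0.5, sd=1.0, alpha=0.01))  # 94 per group
```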
Underlying population event rate:
Unlike the statistical power and level of significance, which are generally chosen by convention, the underlying expected event rate (in the standard or control group) must be established by other means, usually from previous studies, including observational cohorts. These often provide the best information available, but may overestimate event rates, as they can be from a different time or place, and thus subject to changing and differing background practices. Additionally, trial participants are often "healthy volunteers", or at least people with stable conditions without other comorbidities, which may further erode the study event rate compared with observed rates in the population. Great care is required in specifying the event rate and, even then, during ongoing trials it is wise to have allowed for sample size adjustment, which may become necessary if the overall event rate proves to be unexpectedly low.
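For a trial counting events, the control-group event rate discussed above enters the calculation directly. A minimal sketch using the common unpooled two-proportion approximation (the rates chosen are assumptions for illustration, not figures from the post):

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_proportions(p_control, p_treatment, alpha=0.05, power=0.80):
    """Per-group sample size to detect a difference between two event rates
    (two-sided test, unpooled normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2)


# Control event rate 30%, hoped-for reduction to 20% with the treatment:
print(n_per_group_two_proportions(0.30, 0.20))  # 291 per group
```

If the true control rate turns out lower than assumed, the detectable difference shrinks and the trial becomes underpowered, which is why mid-trial sample size re-estimation is sometimes planned.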
The sample size required to demonstrate equivalence is the highest, and that required to demonstrate equality is the lowest.
Sample size estimation is challenging for complex designs such as non-inferiority trials or time-to-event endpoints. The sample size estimate also needs adjustment to accommodate:
1. unplanned interim analysis
2. planned interim analysis
3. adjustment for covariates.
Sample size calculation for time to failure:
The time to an event may be of special interest, for example the time to death or the time to recurrence. Methods for the analysis of this type of outcome are generally referred to as life-table or survival analysis methods.
Sample size determination for one arm survival time:
The single-arm trial has been the most frequently used approach to efficacy evaluation at the phase II level in oncology. The formula is based on the assumptions of uniform accrual over time, no loss to follow-up, exponentially distributed death times, and use of the exponential MLE test. A cube-root transformation of the hazard rate is used in the calculations to obtain good sampling properties. The key normal approximation used in the sample size calculation is equation 3.2.7 of Lawless. To calculate the sample size we need the following information:
1. Length of accrual period
2. T, the length of the follow-up period (time from end of accrual to analysis)
3. α, the significance level
4. One-sided or two-sided test
5. 1 − β, the desired power
6. M0 and Ma, the median survival times under the null and alternative hypotheses. The outcome is assumed to be exponentially distributed.
Note: For exponential data there is a simple connection between median survival (M) and hazard rate (λ). The hazard rate is computed as follows:
λ0 = −log(0.5)/M0 and λa = −log(0.5)/Ma, i.e. λ = log(2)/M.
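The median-to-hazard conversion for exponential survival follows from S(M) = exp(−λM) = 0.5. A short sketch (the 12- and 18-month medians are assumed values for illustration):

```python
from math import log


def hazard_from_median(median_survival):
    """Exponential survival: S(M) = exp(-lambda * M) = 0.5,
    so lambda = -log(0.5) / M = log(2) / M."""
    return log(2) / median_survival


# Null median 12 months versus alternative median 18 months:
lam0 = hazard_from_median(12)   # about 0.0578 events per month
lam_a = hazard_from_median(18)  # about 0.0385 events per month
print(round(lam0, 4), round(lam_a, 4))
```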
Retrospective Power Analysis:
In sample size calculations, appropriate values for the smallest meaningful difference and the estimated SD are often difficult to obtain. Therefore, the formulas are sometimes applied after the study is completed, when the difference and SD actually observed in the study can be substituted into the appropriate sample size formula. Since the sample size is also known after the study is completed, the formula then yields the statistical power. In this case, power refers to the sensitivity of the study to detect a statistically significant difference of the magnitude observed in the study. This activity, known as retrospective power analysis, is sometimes performed to aid in the interpretation of the statistical results of a study. If the results were not statistically significant, the investigator might explain the result as being due to low power.
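Retrospective power analysis amounts to plugging the observed difference, SD and achieved sample size back into the prospective power formula. A minimal sketch under the same two-sample normal approximation (the observed figures are assumed for illustration):

```python
from math import sqrt
from statistics import NormalDist


def retrospective_power(observed_diff, observed_sd, n_per_group, alpha=0.05):
    """Post-hoc power: substitute the observed difference and SD into the
    two-sided two-sample normal-approximation power formula."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    effect = abs(observed_diff) / (observed_sd * sqrt(2 / n_per_group))
    return z.cdf(effect - z_alpha)


# A non-significant trial with 30 subjects per arm and an observed 0.3 SD difference:
print(round(retrospective_power(0.3, 1.0, 30), 2))  # about 0.21: underpowered
```

Note that because this power is computed from the same data as the P value, it adds no information beyond the P value itself, which is a well-known criticism of post-hoc power.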
The sample size calculation is one of the critical steps in planning a clinical trial, and any negligence in its estimation may lead to rejection of an efficacious drug or to approval of an ineffective one. As the calculation of sample size depends on statistical concepts, it is desirable to consult an experienced statistician when estimating this vital study parameter.