Sunday, March 3, 2013

More or Fewer Treatments in Experiments?


Person A:
If you have two different treatments and a fixed number of participants (say M), which is the better experimental design…?


a) M/3 control, M/3 Treatment 1, and M/3 Treatment 2
b) M/4 control, M/4 Treatment 1, M/4 Treatment 2, and M/4 Treatment 1+2

Person B: 
Generally, we do version (b), since then for each treatment you’d have both a larger experimental set and a larger control set (M/2 each versus M/3 each). With version (b), some reviewers might object that the make-up of the Treatment 1 experimental group (half of which also receives Treatment 2) might not reflect the real distribution “in the wild.” That may be true, but the samples are still balanced for Treatment 1, and the split acts like a stratification, so it shouldn’t matter. The crucial caveat: if the variance introduced by either treatment is very large relative to the other’s effect size, it becomes harder to find a statistically significant difference.

Of course, this all holds only when Treatments 1 and 2 are not incompatible.
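
To put rough numbers on the M/2 versus M/3 argument, here is a minimal sketch. It rests on assumptions made only for illustration, not taken from the thread: a common outcome standard deviation `sigma`, no interaction between the two treatments, and pooling that is not inflated by the other treatment's effect. The value `M = 1200` and the helper `se_diff` are hypothetical.

```python
import math

def se_diff(sigma, n_treated, n_control):
    """Standard error of a difference between two independent sample means."""
    return sigma * math.sqrt(1.0 / n_treated + 1.0 / n_control)

M = 1200      # hypothetical total number of participants
sigma = 1.0   # assumed common standard deviation of the outcome

# Design (a): the Treatment 1 arm (M/3) against the control arm (M/3).
se_a = se_diff(sigma, M // 3, M // 3)

# Design (b): everyone who received Treatment 1 (M/2)
# against everyone who did not (M/2).
se_b = se_diff(sigma, M // 2, M // 2)

print("design (a) standard error:", se_a)
print("design (b) standard error:", se_b)
print("ratio (a)/(b):", se_a / se_b)
```

Under these assumptions the standard error in design (a) is about 22% larger than in design (b), which is the sense in which (b) gets more out of the same M.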

Person A: 
I'm still missing the punch line. If I'm comparing two means, say of Treatment 1 and Treatment 2 using:

z = (x1 - x2) / sqrt(sigma1^2/n1 + sigma2^2/n2)

How would adding in Treatment 3 help uncover a statistically significant difference between the means of T1 and T2 (or T1 and C, or T2 and C)? n1 and n2 go down, and the variance within each treatment (sigma1^2, sigma2^2) shouldn't change when I add in a third treatment.
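
Person A's concern can be checked directly with the z formula above. The sketch below is only illustrative: the observed difference (0.2), the unit variances, and the arm sizes (400 versus 300, i.e. M = 1200 split three or four ways) are made-up numbers, and `z_stat` is a hypothetical helper name.

```python
import math

def z_stat(x1, x2, var1, var2, n1, n2):
    """Two-sample z statistic for a difference in means."""
    return (x1 - x2) / math.sqrt(var1 / n1 + var2 / n2)

# Same observed means and variances, compared one arm against one arm.
z_three_arms = z_stat(0.2, 0.0, 1.0, 1.0, 400, 400)  # M/3 per arm
z_four_arms = z_stat(0.2, 0.0, 1.0, 1.0, 300, 300)   # M/4 per arm

print("z with three arms:", z_three_arms)
print("z with four arms:", z_four_arms)
```

If only single arms are compared, the fourth arm does lower z, exactly as Person A says. Any gain from the extra arm has to come from pooling arms.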

Person B:
The idea is T3 = T1 + T2 (in the sense of both T1 and T2 being run together). Then, you can combine the data from T1 and T3 and think of it as a T1 treatment on top of the “combined” control, C and T2. Similarly, you can combine the data from T2 and T3 and think of it as a T2 treatment on top of the combined control, C and T1.

Of course, if T2, say, produces a strong effect, then T1 and T3 will have much greater variance as a combined dataset than T1 by itself. That would make it harder to show a statistically significant result for T1 over the control. However, I think even that could be managed by subtracting out the mean T2 effect before combining the data, though you might have to make some statistical adjustment to merge the two sets, as they might not have identical variances.
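
The pooling idea, and the suggestion to subtract out the mean T2 effect first, can be sketched with a small simulation. Everything numeric here is an assumption chosen for illustration: 250 subjects per cell, unit-variance Gaussian noise, additive effects of 0.2 for T1 and 2.0 for T2, and no interaction. The names `simulate_cell`, `diff_and_se`, and `remove_t2` are hypothetical helpers.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

N_PER_CELL = 250                  # hypothetical: M = 1000 split four ways
EFFECT_T1, EFFECT_T2 = 0.2, 2.0   # assumed additive effects, no interaction

def simulate_cell(t1, t2):
    """Outcomes for one cell: unit-variance noise plus additive effects."""
    return [random.gauss(0.0, 1.0) + EFFECT_T1 * t1 + EFFECT_T2 * t2
            for _ in range(N_PER_CELL)]

# Keys are (got T1?, got T2?): C = (0, 0), T1 = (1, 0), T2 = (0, 1), T3 = (1, 1).
cells = {(t1, t2): simulate_cell(t1, t2) for t1 in (0, 1) for t2 in (0, 1)}

def diff_and_se(treated, control):
    """Difference in means and its usual two-sample standard error."""
    diff = statistics.mean(treated) - statistics.mean(control)
    se = (statistics.variance(treated) / len(treated)
          + statistics.variance(control) / len(control)) ** 0.5
    return diff, se

# Naive pooling: (T1 and T3) versus (C and T2), ignoring the T2 effect.
naive = diff_and_se(cells[1, 0] + cells[1, 1], cells[0, 0] + cells[0, 1])

# Estimate the mean T2 effect within each T1 level, then average.
t2_effect = 0.5 * (
    (statistics.mean(cells[0, 1]) - statistics.mean(cells[0, 0]))
    + (statistics.mean(cells[1, 1]) - statistics.mean(cells[1, 0]))
)

def remove_t2(values):
    """Subtract the estimated mean T2 effect from a cell that received T2."""
    return [v - t2_effect for v in values]

# Adjusted pooling: same groups, with the T2 effect removed first.
adjusted = diff_and_se(cells[1, 0] + remove_t2(cells[1, 1]),
                       cells[0, 0] + remove_t2(cells[0, 1]))

print("naive    diff, se:", naive)
print("adjusted diff, se:", adjusted)
```

Because the cells are balanced, the estimated T1 effect is the same either way. What changes is the estimated standard error, which is inflated in the naive version by the spread that T2 adds to each pooled group. This sketch does not attempt the adjustment for unequal variances that Person B mentions.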

Person C's input:
As a first reaction, and before I look at the above in more detail, this might be a reference to the following:

Assuming a between-subjects design, each new treatment splits the sample in each of the other treatments, so the comparison groups get smaller and smaller. In the extreme, you have so many treatments that each person represents a unique intersection of treatments. This gives you tremendous heterogeneity, in the way we usually think about it with respect to non-experimental data, and not a big enough sample to meaningfully compare treatments.

For example, suppose you have 100 subjects, and treatment 1 (T1) and Control (C) groups. Each group has 50 subjects. If you want to add another treatment, you will stratify randomization by T1 and C, so that you have an equal representation of people exposed to T1 and C in the new treatment (T2) and its control (C2). Now the sample is effectively split into 4 groups, with 25 people per group (like in a 2x2 table). Add another (cross-cutting) treatment, and numbers get too small to do anything.
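
The arithmetic in this example generalizes simply: with fully crossed on/off treatments, the number of cells doubles each time. A tiny sketch, where `per_cell` is a hypothetical helper and equal allocation across cells is assumed:

```python
def per_cell(n_subjects, n_binary_treatments):
    """Subjects per cell when every on/off treatment is fully crossed."""
    return n_subjects / 2 ** n_binary_treatments

for k in range(1, 6):
    print(k, "cross-cutting treatments:", per_cell(100, k), "subjects per cell")
```

Note that this counts subjects per cell. The marginal comparison for any one treatment still uses half the sample on each side, which is the point Person B makes about pooling.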

Person A:
Yes, the thread is about that, but a bit more.

1. There's an assumption that we can use non-control groups as part of the control group.
So if we have:
scenario 1
C T1 T2
and then
scenario 2 (add T3)
C T1 T2 T3, (where T3 is a combo of T1 & T2)

then we want to ask, what's the effect of T1?

The assumption above is that we can compare:
scenario 1
T1 compared to C & T2
scenario 2
T1 compared to C&T2&T3

In scenario 2, the data will have more variation (because there's another treatment in there, I guess?), which may make finding a T1 effect more difficult, I think.

My issues:
1. Do we only make comparisons to control groups?
2. If we compare to more than the control group, then we must account for that with some dummy variable.
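
One way to see why a dummy is needed in scenario 1 (C, T1, T2) is a toy regression. The sketch below uses made-up outcomes (two subjects per arm, with true effects of +1 for T1 and +5 for T2) and fits y = b0 + b1*T1 + b2*T2 by ordinary least squares. The helpers `solve` and `ols` are hypothetical, written out by hand only to keep the example free of external libraries.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Made-up data: each row is (intercept, T1 dummy, T2 dummy).
rows = [(1, 0, 0), (1, 0, 0),   # control
        (1, 1, 0), (1, 1, 0),   # T1
        (1, 0, 1), (1, 0, 1)]   # T2
y = [1.0, 3.0, 2.0, 4.0, 6.0, 8.0]

# Naive comparison: T1 against C and T2 lumped together, no dummy for T2.
t1_mean = sum(yi for r, yi in zip(rows, y) if r[1] == 1) / 2
rest_mean = sum(yi for r, yi in zip(rows, y) if r[1] == 0) / 4
naive_t1_effect = t1_mean - rest_mean

# Regression with a dummy for each treatment.
b0, b1, b2 = ols(rows, y)

print("naive T1 effect:", naive_t1_effect)
print("T1 coefficient with T2 dummy:", b1)
print("T2 coefficient:", b2)
```

In this toy data the naive comparison gets the sign of the T1 effect wrong, because the lumped "control" contains T2 subjects with much higher outcomes, while the regression with both dummies recovers the effects built into the data.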