Small samples and large samples
The behavior of the Student's t distribution is such that for n>30, the distribution
is practically indistinguishable from the standard normal distribution. Thus, for samples
larger than 30 elements when the population variance is unknown, you can use
the same confidence interval as when the population variance is known, but
replacing $\sigma$ with S. Samples for which n>30 are typically referred to as large
samples; otherwise they are small samples.
Confidence interval for a proportion
A discrete random variable X follows a Bernoulli distribution if X can take only
two values, X = 0 (failure), and X = 1 (success). Let X ~ Bernoulli(p), where p
is the probability of success, then the mean value, or expectation, of X is E[X] =
p, and its variance is Var[X] = p(1-p).
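As a quick check of these two formulas, the following Python sketch computes the mean and variance directly from the definition of expectation over the two outcomes. The helper name bernoulli_moments and the value p = 3/10 are illustrative assumptions, not part of the original text.

```python
from fractions import Fraction

def bernoulli_moments(p):
    """Return (E[X], Var[X]) for X ~ Bernoulli(p), from the definition.

    Hypothetical helper for illustration only.
    """
    outcomes = {0: 1 - p, 1: p}  # value -> probability
    mean = sum(x * prob for x, prob in outcomes.items())
    second_moment = sum(x * x * prob for x, prob in outcomes.items())
    # Var[X] = E[X^2] - (E[X])^2
    return mean, second_moment - mean ** 2

p = Fraction(3, 10)  # assumed example value
mean, var = bernoulli_moments(p)
print(mean, var)  # 3/10 21/100
```

Exact fractions are used so that the result can be compared with p and p(1-p) without rounding error.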
If an experiment involving X is repeated n times, and k successful outcomes are
recorded, then an estimate of p is given by p' = k/n, while the standard error of
p' is $\sigma_{p'} = \sqrt{p(1-p)/n}$. In practice, the sample estimate for p, i.e., p', replaces
p in the standard error formula.
For a large sample size, n>30, and n p > 5 and n (1-p) > 5, the sampling
distribution is very nearly normal. Therefore, the 100(1-$\alpha$)% central two-sided
confidence interval for the population proportion p is $(p' - z_{\alpha/2}\,\sigma_{p'},\ p' + z_{\alpha/2}\,\sigma_{p'})$.
For a small sample (n<30), the interval can be estimated as $(p' - t_{n-1,\alpha/2}\,\sigma_{p'},\ p' + t_{n-1,\alpha/2}\,\sigma_{p'})$.
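The large-sample interval can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name proportion_ci and the example counts (k = 60 successes in n = 150 trials) are assumptions, and the normal quantile is taken from the standard library's statistics.NormalDist. The small-sample case is not covered here, because the standard library provides no Student's t quantile function.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(k, n, alpha=0.05):
    """Large-sample two-sided 100(1-alpha)% confidence interval for p.

    Hypothetical helper: k successes observed in n trials.
    """
    p_hat = k / n                            # point estimate p' = k/n
    se = sqrt(p_hat * (1 - p_hat) / n)       # standard error, with p' in place of p
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 point of the standard normal
    return p_hat - z * se, p_hat + z * se

# Assumed example data: n > 30, n*p' = 60 > 5, n*(1-p') = 90 > 5
low, high = proportion_ci(k=60, n=150)
print(round(low, 4), round(high, 4))
```

With these counts, p' = 0.4 and the standard error is 0.04, so the 95% interval is roughly 0.4 plus or minus 0.078.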
Sampling distribution of differences and sums of statistics
Let $S_1$ and $S_2$ be independent statistics from two populations based on samples
of sizes $n_1$ and $n_2$, respectively. Also, let the respective means and standard
errors of the sampling distributions of those statistics be $\mu_{S_1}$ and $\mu_{S_2}$, and
$\sigma_{S_1}$ and $\sigma_{S_2}$, respectively. The difference between the statistics from the two
populations, $S_1 - S_2$, has a sampling distribution with mean
$\mu_{S_1-S_2} = \mu_{S_1} - \mu_{S_2}$ and standard error
$\sigma_{S_1-S_2} = (\sigma_{S_1}^2 + \sigma_{S_2}^2)^{1/2}$. Also, the sum of the statistics,
$S_1 + S_2$, has mean $\mu_{S_1+S_2} = \mu_{S_1} + \mu_{S_2}$ and standard error
$\sigma_{S_1+S_2} = (\sigma_{S_1}^2 + \sigma_{S_2}^2)^{1/2}$.
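The rules for the difference and the sum of two independent statistics can be illustrated with a short Python sketch. The function name combine_independent and the numeric inputs are assumptions chosen only so that the arithmetic is easy to follow.

```python
from math import sqrt

def combine_independent(mean1, se1, mean2, se2):
    """Mean and standard error of S1 - S2 and S1 + S2 for independent statistics.

    Hypothetical helper for illustration only.
    """
    # The standard error is the same for the difference and the sum,
    # because the variances add in both cases.
    se = sqrt(se1 ** 2 + se2 ** 2)
    return {
        "difference": (mean1 - mean2, se),
        "sum": (mean1 + mean2, se),
    }

# Assumed example values: means 10 and 6, standard errors 3 and 4
result = combine_independent(10.0, 3.0, 6.0, 4.0)
print(result["difference"])  # (4.0, 5.0)
print(result["sum"])         # (16.0, 5.0)
```

Note that the standard error of the difference is not the difference of the standard errors; for independent statistics the variances add in both cases.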