probability

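given a sample space \(S\) and an associated sigma-algebra \(\mathcal{B}\), a probability function is a function \(P\) with domain \(\mathcal{B}\) that satisfies

1. \(P(A) \ge 0\) for all \(A \in \mathcal{B}\).
2. \(P(S) = 1\).
3. if \(A_1, A_2, \ldots \in \mathcal{B}\) are pairwise disjoint, then \(P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i)\).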

any function \(P\) that satisfies the axioms of probability is called a probability function. the axiomatic definition makes no attempt to tell what particular function \(P\) to choose; it merely requires \(P\) to satisfy the axioms. for any sample space many different probability functions can be defined. which one(s) reflect what is likely to be observed in a particular experiment is still to be discussed.

we need general methods of defining probability functions that we know will always satisfy Kolmogorov's Axioms. we do not want to have to check the axioms for each new probability function. the following gives a common method of defining a legitimate probability function.

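let \(S = \{s_1, \ldots, s_n\}\) be a finite set. let \(\mathcal{B}\) be any sigma algebra of subsets of \(S\). let \(p_1, \ldots, p_n\) be nonnegative numbers that sum to 1. for any \(A \in \mathcal{B}\), define \(P(A)\) by \(P(A) = \sum_{\{i : s_i \in A\}} p_i\). (the sum over an empty set is defined to be 0.) then \(P\) is a probability function on \(\mathcal{B}\). this remains true if \(S = \{s_1, s_2, \ldots\}\) is a countable set.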
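
the construction above can be sketched in code. this is a minimal illustration, not part of the source text: the helper names =make_probability= and =powerset=, and the three-outcome weights, are hypothetical. it takes the sigma algebra to be the full power set of a small finite sample space and checks the three axioms by brute force (for a finite space, countable additivity reduces to additivity over pairs of disjoint events).

#+begin_src python
from fractions import Fraction
from itertools import combinations


def make_probability(weights):
    """Return P with P(A) = sum of the weights p_i of the outcomes s_i in A."""
    if any(p < 0 for p in weights.values()) or sum(weights.values()) != 1:
        raise ValueError("weights must be nonnegative and sum to 1")

    def P(event):
        # The sum over an empty event is 0 by the start value.
        return sum((weights[s] for s in event), Fraction(0))

    return P


def powerset(outcomes):
    """All subsets of a finite set, used here as the sigma algebra."""
    items = list(outcomes)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


# Hypothetical example: three outcomes with unequal weights.
weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
P = make_probability(weights)
S = frozenset(weights)
events = powerset(S)

# Axiom 1: nonnegativity.
assert all(P(A) >= 0 for A in events)
# Axiom 2: the whole sample space has probability 1.
assert P(S) == 1
# Axiom 3, finite case: additivity over disjoint events.
assert all(P(A | B) == P(A) + P(B)
           for A in events for B in events if not (A & B))
#+end_src

exact rational arithmetic (=Fraction=) is used so that the equality checks are not disturbed by floating-point rounding.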

(refer to George Casella, Roger L. Berger, 2002 chapter 1 basics of probability theory example 1.2.7)

before we leave the axiomatic development of probability, there is one further point to consider. axiom 3 of def-prob, which is commonly known as the Axiom of Countable Additivity, is not universally accepted among statisticians. indeed, it can be argued that axioms should be simple, self-evident statements. comparing axiom 3 to the other axioms, which are simple and self-evident, may lead us to doubt whether it is reasonable to assume the truth of axiom 3.

the Axiom of Countable Additivity is rejected by a school of statisticians led by de Finetti (1972), who chooses to replace this axiom with the Axiom of Finite Additivity.

if \(A \in \mathcal{B}\) and \(B \in \mathcal{B}\) are disjoint, then \(P(A \cup B) = P(A) + P(B)\).

while this axiom may not be entirely self-evident, it is certainly simpler than the Axiom of Countable Additivity (and is implied by it).

assuming only finite additivity, while perhaps more plausible, can lead to unexpected complications in statistical theory - complications that, at this level, do not necessarily enhance understanding of the subject. we therefore proceed under the assumption that the Axiom of Countable Additivity holds.

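if \(P\) is a probability function and \(A\) is any set in \(\mathcal{B}\), then

- (a) \(P(\emptyset) = 0\), where \(\emptyset\) is the empty set;
- (b) \(P(A) \le 1\);
- (c) \(P(A^c) = 1 - P(A)\).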

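if \(P\) is a probability function and \(A\) and \(B\) are any sets in \(\mathcal{B}\), then

- (a) \(P(B \cap A^c) = P(B) - P(A \cap B)\);
- (b) \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\);
- (c) if \(A \subset B\), then \(P(A) \le P(B)\).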
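
the set identities in the two theorems above can be checked numerically on a small example. the sketch below assumes a fair six-sided die with the uniform probability function; the function =P= and the events =A=, =B=, =C= are hypothetical choices made only for illustration, and a check on one example is of course not a proof.

#+begin_src python
from fractions import Fraction

# Assumed sample space: faces of a fair six-sided die.
S = frozenset(range(1, 7))


def P(event):
    """Uniform probability: number of outcomes in the event over |S|."""
    return Fraction(len(event & S), len(S))


A = frozenset({2, 4, 6})   # even face
B = frozenset({4, 5, 6})   # face of at least 4
C = frozenset({4, 6})      # a subset of B

# Complement rule: P(A^c) = 1 - P(A).
assert P(S - A) == 1 - P(A)
# P(B intersect A^c) = P(B) - P(A intersect B).
assert P(B - A) == P(B) - P(A & B)
# P(A union B) = P(A) + P(B) - P(A intersect B).
assert P(A | B) == P(A) + P(B) - P(A & B)
# Monotonicity: C subset of B implies P(C) <= P(B).
assert C <= B and P(C) <= P(B)
#+end_src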

the following theorem gives some useful results for dealing with a collection of sets.

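if \(P\) is a probability function, then

- (a) \(P(A) = \sum_{i=1}^{\infty} P(A \cap C_i)\) for any partition \(C_1, C_2, \ldots\);
- (b) \(P\left(\bigcup_{i=1}^{\infty} A_i\right) \le \sum_{i=1}^{\infty} P(A_i)\) for any sets \(A_1, A_2, \ldots\) (Boole's inequality).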
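
both parts of this theorem can be illustrated on a finite example. the sketch below again assumes a fair six-sided die with uniform probabilities; the partition and the overlapping sets are hypothetical. it shows that splitting an event along a partition recovers its probability exactly, while summing the probabilities of overlapping sets only gives an upper bound on the probability of their union.

#+begin_src python
from fractions import Fraction

# Assumed sample space: faces of a fair six-sided die.
S = frozenset(range(1, 7))


def P(event):
    """Uniform probability: number of outcomes in the event over |S|."""
    return Fraction(len(event & S), len(S))


# Part (a): split A along a partition C_1, C_2, C_3 of S.
A = frozenset({1, 2, 3, 4})
partition = [frozenset({1, 2}), frozenset({3, 4, 5}), frozenset({6})]
total_probability = sum(P(A & C) for C in partition)
assert total_probability == P(A)

# Part (b), Boole's inequality: the sets overlap, so the sum overcounts.
sets = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
union = frozenset().union(*sets)
boole_bound = sum(P(A_i) for A_i in sets)
assert P(union) <= boole_bound
#+end_src

in this example the bound is strict because outcomes 2 and 3 are each counted twice in the sum.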

#+begin_dummy :title Bonferroni inequality
formula (b) of the-prob-3 gives a useful inequality for the probability of an intersection. since \(P(A \cup B) \le 1\), we have from the-prob-2, after some rearranging,

\[ P(A \cap B) \ge P(A) + P(B) - 1. \]

this inequality is a special case of what is known as Bonferroni's inequality. Bonferroni's inequality allows us to bound the probability of a simultaneous event (the intersection) in terms of the probabilities of the individual events.

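there is a similarity between Boole's inequality and Bonferroni's inequality. in fact, they are essentially the same thing. we could have used Boole's inequality to derive the-prob-3. if we apply Boole's inequality to \(A_1^c, \ldots, A_n^c\), we have

\[ P\left(\bigcup_{i=1}^{n} A_i^c\right) \le \sum_{i=1}^{n} P(A_i^c), \]

and using the facts that \(\bigcup_{i=1}^{n} A_i^c = \left(\bigcap_{i=1}^{n} A_i\right)^c\) and \(P(A_i^c) = 1 - P(A_i)\), we obtain

\[ 1 - P\left(\bigcap_{i=1}^{n} A_i\right) \le n - \sum_{i=1}^{n} P(A_i). \]

this becomes, on rearranging terms,

\[ P\left(\bigcap_{i=1}^{n} A_i\right) \ge \sum_{i=1}^{n} P(A_i) - (n - 1), \]

which is a more general version of the Bonferroni inequality of eq-prob-1.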
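
as a small numerical illustration, the general Bonferroni bound, the sum of the individual probabilities minus \(n - 1\), can be computed directly. the function name =bonferroni_lower_bound= and the input values are hypothetical. clamping the result at 0 is an addition made here, since a negative lower bound on a probability carries no information.

#+begin_src python
from fractions import Fraction


def bonferroni_lower_bound(probabilities):
    """Lower bound on P(A_1 and ... and A_n): sum of P(A_i) minus (n - 1).

    The result is clamped at 0 (an addition in this sketch), because
    probabilities are nonnegative anyway.
    """
    n = len(probabilities)
    return max(Fraction(0), sum(probabilities) - (n - 1))


# Two events, each with (assumed) probability 0.95.
two_events = bonferroni_lower_bound([Fraction(95, 100), Fraction(95, 100)])

# Check against an actual intersection on a fair six-sided die.
S = frozenset(range(1, 7))
A = frozenset({1, 2, 3, 4, 5})
B = frozenset({2, 3, 4, 5, 6})
p_A = Fraction(len(A), len(S))
p_B = Fraction(len(B), len(S))
p_both = Fraction(len(A & B), len(S))
assert p_both >= bonferroni_lower_bound([p_A, p_B])
#+end_src

the bound is only informative when the individual probabilities are large: for two events of probability one half each, it reduces to the trivial statement that the intersection has probability at least 0.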

(taken from George Casella, Roger L. Berger, 2002)
#+end_dummy