All Probabilities

Handling Probabilities in R

Purpose

This page lists the distributions we cover in class, shows how to use R with them, and points to additional information. It is up to you to know which distribution to use and whether you need to calculate a probability, a cumulative probability, or a quantile, or to draw a random number.

To download a printable PDF version, use the link below. The handout includes additional information about these distributions:

Probability Handout

If you would like to practice identifying distributions from their graphs, use the link below.

Distribution Practice

Discrete Distributions

A discrete random variable is only able to take specific values. These values may be integers or decimals, and there may be finitely many or (countably) infinitely many of them. Ultimately, a discrete random variable allows for the concept of a “next” value. These are the discrete distributions discussed on this page: Binomial, Poisson, Hypergeometric, and Geometric.

 

Binomial Distribution

X ~ Bin(n,p)

A Binomial random variable models the number of successes in a specific number of trials. There are five requirements for a random variable to follow a Binomial distribution:

  1. The number of trials, n, is known.
  2. Each trial results in either a success or a failure.
  3. The probability of a success is constant across the trials.
  4. The trials are independent (of each other).
  5. The random variable is the number of successes.

If your random variable follows a Binomial distribution, then it has two parameters that define it. These are the number of trials and the success probability. In class, these are symbolized as n and p. In R, they are symbolized as size and prob.
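
As a minimal sketch of how these parameters are used: base R names its distribution functions with a one-letter prefix, d for a probability P[X = x], p for a cumulative probability P[X ≤ x], q for a quantile, and r for random draws. The values of 10 trials and a success probability of 0.3 are made up for illustration and are not from the course.

```r
# Hypothetical values for illustration only: 10 trials, success probability 0.3
n_trials  <- 10
p_success <- 0.3

# Probability of exactly 3 successes: P[X = 3]
prob_exactly_3 <- dbinom(3, size = n_trials, prob = p_success)

# Cumulative probability of at most 3 successes: P[X <= 3]
prob_at_most_3 <- pbinom(3, size = n_trials, prob = p_success)

# Quantile: the smallest x such that P[X <= x] >= 0.50 (the median)
median_successes <- qbinom(0.50, size = n_trials, prob = p_success)

# Random numbers: five draws from Bin(10, 0.3)
set.seed(1)  # arbitrary seed so the draws are reproducible
draws <- rbinom(5, size = n_trials, prob = p_success)
```

The same four prefixes apply to every distribution on this page; only the root name (here binom) and the parameter names change.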

 

Poisson Distribution

X ~ Pois(λ)

A Poisson random variable models the number of successes in an area or a time period — not in a given number of trials.

If your random variable follows a Poisson distribution, then it needs just one parameter to define it. It is the average rate. In class, this was symbolized as λ. In R, it is lambda.
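
The sketch below shows one way the Poisson functions might be called in R. The rate of 4 events per period is a made-up value for illustration only.

```r
# Hypothetical rate for illustration only: an average of 4 events per period
avg_rate <- 4

# P[X = 2]: exactly 2 events in the period
prob_exactly_2 <- dpois(2, lambda = avg_rate)

# P[X <= 2]: at most 2 events
prob_at_most_2 <- ppois(2, lambda = avg_rate)

# P[X > 2] = 1 - P[X <= 2], computed directly as an upper tail
prob_more_than_2 <- ppois(2, lambda = avg_rate, lower.tail = FALSE)

# Quantile: the smallest x such that P[X <= x] >= 0.90
upper_90 <- qpois(0.90, lambda = avg_rate)

# Five random draws
set.seed(1)  # arbitrary seed so the draws are reproducible
draws <- rpois(5, lambda = avg_rate)
```

The rate has to match the area or time period in the question. If the rate is 4 per hour and the question covers two hours, lambda would be 8.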

 

Hypergeometric Distribution

X ~ Hyper(N, k, n)

A Hypergeometric random variable models the number of successes in a specific number of trials in which the population is finite and repetition is not allowed (the same element can be selected at most once). There are five requirements for a random variable to follow a Hypergeometric distribution:

  1. The number of trials, k, is known.
  2. Each trial results in either a success or a failure.
  3. The population is finite and repetition is not allowed.
  4. The trials are not independent: because repetition is not allowed, each selection changes the success probability for the next.
  5. The random variable is the number of successes in the trials.

If your random variable follows a Hypergeometric distribution, then it needs three parameters to define it. These are the number of successes in the population, the number of failures in the population, and the sample size. In R, they are m, n, and k. Note that R's n is the number of failures in the population, not the number of trials.
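
As a rough sketch of the R calls, the example below uses a made-up population of 4 successes and 6 failures from which 3 elements are selected without repetition. The descriptive variable names are only there to show which quantity goes into which R argument.

```r
# Hypothetical population for illustration only:
# 4 successes and 6 failures (10 elements in all), 3 selected without repetition
n_success <- 4  # passed to R's argument m
n_failure <- 6  # passed to R's argument n
n_drawn   <- 3  # passed to R's argument k

# P[X = 2]: exactly 2 successes among the 3 selected
prob_exactly_2 <- dhyper(2, m = n_success, n = n_failure, k = n_drawn)

# P[X <= 1]: at most 1 success among the 3 selected
prob_at_most_1 <- phyper(1, m = n_success, n = n_failure, k = n_drawn)

# Quantile: the smallest x such that P[X <= x] >= 0.50 (the median)
median_successes <- qhyper(0.50, m = n_success, n = n_failure, k = n_drawn)

# Five random draws
set.seed(1)  # arbitrary seed so the draws are reproducible
draws <- rhyper(5, m = n_success, n = n_failure, k = n_drawn)
```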

 

Geometric Distribution

X ~ Geom(p)

A Geometric random variable models the number of failures until the first success. There are four requirements for a random variable to follow a Geometric distribution:

  1. Each trial results in either a success or a failure.
  2. The probability of a success is constant across the trials.
  3. The trials are independent (of each other).
  4. The random variable is the number of failures until the first success.

If your random variable follows a Geometric distribution, then it has just one parameter that defines it. It is the success probability. In class, this is symbolized as p. In R, it is symbolized as prob.
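
A minimal sketch of the Geometric functions follows. The success probability of 0.2 is a made-up value for illustration. R's functions count failures before the first success, which matches the definition above.

```r
# Hypothetical success probability for illustration only
p_success <- 0.2

# P[X = 3]: exactly 3 failures, then the first success
prob_3_failures <- dgeom(3, prob = p_success)

# P[X <= 3]: at most 3 failures before the first success
prob_at_most_3 <- pgeom(3, prob = p_success)

# If Y counts trials up to and including the first success, then Y = X + 1,
# so P[Y = 4] is the same as P[X = 3]
prob_first_success_on_trial_4 <- dgeom(4 - 1, prob = p_success)

# Five random draws
set.seed(1)  # arbitrary seed so the draws are reproducible
draws <- rgeom(5, prob = p_success)
```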

 

 

Continuous Distributions

A continuous random variable can take on any value in an interval. These are the continuous distributions discussed on this page: Uniform, Exponential, and Normal (Gaussian).

 

Probability Statements

Before we get started, let us look at four possible probability statements and how to calculate them in general. Remember that F(x) = P[X ≤ x] is the cumulative distribution function (CDF).

P[ X ≤ a ] = F(a)
P[ a < X ] = 1 − P[X ≤ a] = 1 − F(a)
P[ a < X ≤ b ] = P[X ≤ b] − P[X ≤ a] = F(b) − F(a)
P[ X ≤ a or b < X ] = 1 − (P[ a < X ≤ b ]) = 1 − ( F(b) − F(a) )
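
The four statements above can be sketched in R with any p-function playing the role of F. The choice of a standard Normal distribution and the values a = -1 and b = 2 are assumptions made only for this illustration.

```r
# Assumption for illustration only: X is standard Normal, so its CDF is pnorm()
cdf <- function(x) pnorm(x, mean = 0, sd = 1)
a <- -1
b <- 2

p_at_most_a <- cdf(a)                 # P[X <= a] = F(a)
p_above_a   <- 1 - cdf(a)             # P[a < X] = 1 - F(a)
p_between   <- cdf(b) - cdf(a)        # P[a < X <= b] = F(b) - F(a)
p_outside   <- 1 - (cdf(b) - cdf(a))  # P[X <= a or b < X] = 1 - (F(b) - F(a))
```

To use a different distribution, replace pnorm() inside cdf with the matching p-function, such as punif() or pexp(). The upper tail can also be requested directly, as in pnorm(a, lower.tail = FALSE).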

 

Uniform Distribution

X ~ Unif(a, b)

A Uniform random variable has a constant density between two specified values, so intervals of the same length within that range are equally probable. If your random variable follows a Uniform distribution, then it needs two parameters to define it: the lowest possible value and the highest possible value. In class, these were symbolized as a and b. In R, they are min and max.
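
A short sketch of the Uniform functions in R follows. The endpoints 2 and 10 are made-up values for illustration. For a continuous distribution, the d-function returns the density at a point, not a probability.

```r
# Hypothetical endpoints for illustration only
lowest  <- 2
highest <- 10

# Density at x = 5 (constant everywhere between the endpoints)
density_at_5 <- dunif(5, min = lowest, max = highest)

# P[X <= 5]
prob_at_most_5 <- punif(5, min = lowest, max = highest)

# P[4 < X <= 7] = F(7) - F(4)
prob_between <- punif(7, min = lowest, max = highest) -
  punif(4, min = lowest, max = highest)

# Quantile: the value x with P[X <= x] = 0.25
first_quartile <- qunif(0.25, min = lowest, max = highest)

# Five random draws
set.seed(1)  # arbitrary seed so the draws are reproducible
draws <- runif(5, min = lowest, max = highest)
```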

 

Exponential Distribution

X ~ Exp(λ)

An Exponential random variable models the time until a success. If your random variable follows an Exponential distribution, then it needs just one parameter to define it. It is the average rate, so the average waiting time is 1/λ. In class, this was symbolized as λ. In R, it is rate.
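
The following sketch shows how the Exponential functions might be called. The rate of 0.5 successes per time unit is a made-up value for illustration.

```r
# Hypothetical rate for illustration only: 0.5 successes per time unit
avg_rate <- 0.5

# P[X <= 3]: the first success arrives within 3 time units
prob_within_3 <- pexp(3, rate = avg_rate)

# P[3 < X] = 1 - F(3): the wait is longer than 3 time units
prob_longer_than_3 <- pexp(3, rate = avg_rate, lower.tail = FALSE)

# Quantile: the median waiting time
median_wait <- qexp(0.50, rate = avg_rate)

# Five random waiting times
set.seed(1)  # arbitrary seed so the draws are reproducible
draws <- rexp(5, rate = avg_rate)
```

R expects the rate here, not the average waiting time. If a problem gives the average waiting time instead, its reciprocal is the rate.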

 

Normal (Gaussian) Distribution

X ~ Norm(μ, σ)

A Normal (or Gaussian) random variable is useful for modeling measures of center, such as the sample mean. It turns out that it can also be used to model sums of many types of random variables (thanks to the Central Limit Theorem, or CLT).

If your random variable follows a Normal distribution, then it needs two parameters to define it. The two parameters are the expected value and the standard deviation. In class, we symbolized them as μ and σ. In R, they are mean and sd.
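
A minimal sketch of the Normal functions in R follows. The expected value of 100 and standard deviation of 15 are made-up values for illustration.

```r
# Hypothetical parameters for illustration only
mu    <- 100
sigma <- 15  # standard deviation, not variance

# P[X <= 115]
prob_at_most_115 <- pnorm(115, mean = mu, sd = sigma)

# P[85 < X <= 115] = F(115) - F(85)
prob_between <- pnorm(115, mean = mu, sd = sigma) -
  pnorm(85, mean = mu, sd = sigma)

# Quantile: the value x with P[X <= x] = 0.975
upper_975 <- qnorm(0.975, mean = mu, sd = sigma)

# Five random draws
set.seed(1)  # arbitrary seed so the draws are reproducible
draws <- rnorm(5, mean = mu, sd = sigma)
```

If a problem states the variance instead of the standard deviation, take its square root before passing it to sd.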

This page was last modified on 2 January 2024.
All rights reserved by Ole J. Forsberg, PhD, ©2008–2024. No reproduction of any of this material is allowed without explicit written permission of the copyright holder.