Understanding rnorm, dnorm, pnorm and qnorm

Xinchen Pan · 2018/05/01

I use the normal distribution only as an example in the title. The d, p, q and r prefixes form a family of functions that R provides for many different distributions. Though we can understand what they mean after taking an introductory mathematical statistics course, they can still look confusing, especially p and q.
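As a quick illustration that the same four prefixes exist for other distributions, here is a minimal sketch using the binomial distribution. The parameters size = 10 and prob = 0.5 are arbitrary values picked for this example.

```r
# The same four prefixes applied to a binomial distribution
# (size = 10 and prob = 0.5 are arbitrary illustrative values)
dbinom(3, size = 10, prob = 0.5)    # P(X = 3), the probability mass function
## [1] 0.1171875
pbinom(3, size = 10, prob = 0.5)    # P(X <= 3), the cumulative distribution function
## [1] 0.171875
qbinom(0.5, size = 10, prob = 0.5)  # smallest x with P(X <= x) >= 0.5
## [1] 5
rbinom(5, size = 10, prob = 0.5)    # five random draws (output varies from run to run)
```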

Let’s start with dnorm and rnorm.

dnorm

The dnorm(x, mean = 0, sd = 1, log = FALSE) function simply evaluates the probability density function (pdf) at the value x. For a discrete distribution, the corresponding d function evaluates the probability mass function instead.

So for the normal distribution with \(mean=0, sd=1\), we have

\[ \frac{1}{\sqrt{2\pi}}e^{\frac{-x^2}{2}} \]

If we plug \(x=2\) inside the pdf, we have

1 / sqrt(2 * pi) * exp(-2^2 / 2)
## [1] 0.05399097

which is the same as

dnorm(x = 2, mean = 0, sd = 1)
## [1] 0.05399097
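The same check works for a normal distribution that is not standard. The sketch below assumes mean = 1 and sd = 2 (values chosen only for illustration) and compares the general normal density formula with dnorm. It also shows that log = TRUE returns the logarithm of the density.

```r
# Density of N(mean = 1, sd = 2) at x = 2, by hand and with dnorm
# (mu and sigma are arbitrary illustrative values)
x <- 2
mu <- 1
sigma <- 2
by_hand <- 1 / (sigma * sqrt(2 * pi)) * exp(-(x - mu)^2 / (2 * sigma^2))
built_in <- dnorm(x, mean = mu, sd = sigma)
all.equal(by_hand, built_in)
## [1] TRUE

# log = TRUE returns the log of the density
all.equal(dnorm(x, mean = mu, sd = sigma, log = TRUE), log(built_in))
## [1] TRUE
```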

rnorm

rnorm(n, mean = 0, sd = 1) returns n random values drawn from the normal distribution with \(mean=0\) and \(sd=1\).

For example, if we generate 100 values from \(N(0,1)\), it is very unlikely that any of them will be as large as 10000. In fact, we can compute the probability of exceeding 10000 under \(N(0,1)\) by using pnorm. Most of the values will not be far from 0, and how far they spread depends on the standard deviation. By the law of large numbers, the mean and sd computed from the randomly generated values get close to the theoretical values as n gets larger.

rnorm(100, mean = 0, sd = 1)
##   [1]  0.44397253  2.26974278  0.96213017 -0.98114321 -0.81673697
##   [6] -1.27911926  0.43560479  0.63864735  0.70865044  1.91058328
##  [11] -0.79346382 -0.74380923  0.19057270 -1.91290216  0.65753297
##  [16]  0.64780687  0.85601248  0.42054690  1.59931574  1.85609449
##  [21]  0.92548581 -0.63423536  0.40053807  1.11895633  0.86968372
##  [26] -0.88021520  0.69891915 -1.13410683  0.42440412 -1.54164780
##  [31] -1.41371545 -0.77129951  0.58247868 -0.60981978  1.61671347
##  [36] -0.19234311 -0.43230939 -1.69311707  1.28331089 -0.43960770
##  [41] -1.26880188 -1.03024181 -0.09301054  0.09630727  0.09567935
##  [46] -0.95457462  0.15968128  1.59552431  0.70149448  0.59702470
##  [51] -0.79018483 -0.46857261  1.33755335 -0.99504568  0.05257650
##  [56] -1.56017586 -1.09044670  3.44503337 -0.67710208 -0.65193628
##  [61]  0.51748999 -0.64310700 -0.98015442  2.08505345 -0.03036714
##  [66] -0.74714762 -0.56065081  1.69428481  0.87185800 -0.24940924
##  [71]  2.89343687 -1.29225632 -0.07762765  0.78040052  0.54147203
##  [76]  0.77056421  1.56432169  3.31402743  0.01087223 -0.54794083
##  [81]  1.23263952  0.88385819 -0.05748334  0.80355828  1.25799155
##  [86] -1.75506811 -0.35114983 -1.24856268  0.58143097 -0.16829024
##  [91]  0.19160874  1.31997751  0.77578134  2.62336213 -0.22477977
##  [96]  1.61008297 -1.08468341 -2.25972128  1.11632542 -1.15402003
mean(rnorm(100, mean = 0, sd = 1))
## [1] 0.001469718
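To see the sample mean and sd approach the theoretical values as n grows, a small sketch like the following can be run. The seed and the three sample sizes are arbitrary choices made for this example, and the exact numbers depend on the seed.

```r
# Sample mean and sd of N(0, 1) draws for increasing n
set.seed(1)  # arbitrary seed, only so the run is reproducible
sizes <- c(10, 1000, 100000)
results <- t(sapply(sizes, function(n) {
  x <- rnorm(n, mean = 0, sd = 1)
  c(n = n, mean = mean(x), sd = sd(x))
}))
results  # one row per sample size; mean should drift toward 0 and sd toward 1
```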

pnorm

pnorm(q, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE) returns the probability \(P(X \le x)\) by default. If we set lower.tail = FALSE, then it returns \(P(X>x)=1-P(X \le x)\).

Let’s look at an extreme example, the one I mentioned above. What is \(P(X<10000)\) for \(N(0,1)\)? It is almost certainly 1. In other words, \(P(X>10000)\) is essentially 0. You can imagine the chance of meeting a human being whose height is 40 m (Ultraman).

It is important to remember that the function returns a probability.

pnorm(0, mean = 0, sd = 1)
## [1] 0.5
pnorm(10000, mean = 0, sd = 1)
## [1] 1
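A short sketch of the upper tail and of an interval probability, using the standard normal defaults. The cut-off values 1.96 and ±1 are just familiar illustrative choices.

```r
# Upper tail: lower.tail = FALSE gives P(X > x), the same as 1 - pnorm(x)
upper <- pnorm(1.96, lower.tail = FALSE)
all.equal(upper, 1 - pnorm(1.96))
## [1] TRUE

# Probability of an interval: P(-1 < X < 1) = P(X <= 1) - P(X <= -1)
pnorm(1) - pnorm(-1)
## [1] 0.6826895

# The extreme example: P(X > 10000) is too small to represent and prints as 0
pnorm(10000, lower.tail = FALSE)
## [1] 0
```

For very small tail probabilities, lower.tail = FALSE is generally more accurate than subtracting from 1, because the subtraction loses precision once pnorm(x) rounds to 1.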

qnorm

qnorm(p, mean = 0, sd = 1, lower.tail = TRUE, log.p = FALSE) is the inverse of pnorm: given a probability p, it returns the value x such that \(P(X \le x)=p\). The argument p therefore needs to be within \([0,1]\).

qnorm(0.999, mean=0, sd=1, lower.tail = TRUE)
## [1] 3.090232
pnorm(3.090232, mean=0, sd=1, lower.tail = TRUE)
## [1] 0.999
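A few more calls can make the inverse relationship concrete. This is only a sketch with the standard normal defaults, and the probabilities in p are arbitrary illustrative values.

```r
# The familiar 97.5% quantile of N(0, 1)
qnorm(0.975)
## [1] 1.959964

# The endpoints of [0, 1] map to the ends of the real line
qnorm(c(0, 1))
## [1] -Inf  Inf

# Round trip: pnorm undoes qnorm
p <- c(0.01, 0.5, 0.999)
all.equal(pnorm(qnorm(p)), p)
## [1] TRUE

# With lower.tail = FALSE, p is read as the upper-tail probability P(X > x)
qnorm(0.001, lower.tail = FALSE)
## [1] 3.090232
```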