Probabilistic stuffs

How to simulate a random number that follows a given probability distribution

  1. Inverse transform method
  2. Acceptance rejection method

Check here
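The two methods listed above can be sketched roughly as follows. The target distributions (an exponential for the inverse transform, the density f(x) = 2x on [0, 1] for acceptance-rejection) and the function names are illustrative assumptions, not something the text prescribes.

```python
import math
import random


def sample_exponential(rate, rng):
    """Inverse transform: if U ~ Uniform(0, 1), then -ln(1 - U) / rate ~ Exp(rate)."""
    u = rng.random()  # u lies in [0, 1), so 1 - u lies in (0, 1]
    return -math.log(1.0 - u) / rate


def sample_linear_density(rng):
    """Acceptance-rejection for the target density f(x) = 2x on [0, 1].

    The proposal g is Uniform(0, 1) and the bound is c = 2, so f(x) <= c * g(x).
    """
    while True:
        x = rng.random()  # candidate drawn from the proposal
        u = rng.random()  # uniform number for the accept/reject test
        if u <= (2.0 * x) / 2.0:  # accept with probability f(x) / (c * g(x)) = x
            return x


rng = random.Random(0)  # fixed seed so the run is reproducible
exp_samples = [sample_exponential(2.0, rng) for _ in range(50_000)]
lin_samples = [sample_linear_density(rng) for _ in range(50_000)]

exp_mean = sum(exp_samples) / len(exp_samples)  # theoretical mean: 1 / rate = 0.5
lin_mean = sum(lin_samples) / len(lin_samples)  # theoretical mean: 2 / 3
```

The inverse transform needs a closed-form inverse of the cumulative distribution function; acceptance-rejection only needs the density and a bound c, at the cost of discarding some candidates.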

Confidence Interval and variance

In statistics, a confidence interval is a type of estimate computed from the statistics of the observed data. It gives a range of values for an unknown parameter (e.g. the mean). The interval has an associated confidence level (e.g. 95%): if the sampling were repeated many times, that fraction of the resulting intervals would contain the true parameter.

Variance is the expectation of the squared deviation of a random variable from its mean, $\mathrm{Var}(X) = E[(X - E[X])^2]$. Informally, it measures how far a set of (random) numbers is spread out from its average value.
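A minimal sketch of both quantities on a small sample. The data values, the helper name `mean_confidence_interval`, and the normal approximation with z = 1.96 (roughly a 95% level) are assumptions made for illustration; for a sample this small, a Student t quantile would usually be the more careful choice.

```python
import math
import statistics


def mean_confidence_interval(data, z=1.96):
    """Return (mean, sample variance, approximate confidence interval for the mean).

    Uses the normal approximation: mean +/- z * sqrt(variance / n).
    """
    n = len(data)
    mean = statistics.fmean(data)
    var = statistics.variance(data)  # unbiased sample variance (divides by n - 1)
    half_width = z * math.sqrt(var / n)
    return mean, var, (mean - half_width, mean + half_width)


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up observations
mean, var, (lo, hi) = mean_confidence_interval(data)
```

A larger variance or a smaller sample widens the interval, which is how the two notions in this section relate.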

How to shuffle an array

Given an array $a$ of $n$ elements, design an algorithm that shuffles it uniformly at random. Here are two candidate algorithms. Which one is correct?

 
for i=1 to n do swap(a[i], a[random(1,n)]);
for i=1 to n do swap(a[i], a[random(i,n)]);

The second one is correct.

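The second algorithm picks, at step $i$, an element uniformly at random from positions $i$ to $n$ and puts it at the $i$-th position. There are $n \times (n-1) \times \cdots \times 1 = n!$ equally likely sequences of choices in total, and each one produces a different permutation, so every permutation appears with probability $1/n!$.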

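In contrast, the first algorithm has $n^n$ equally likely sequences of choices in total. Since $n^n / n!$ is not an integer for $n \ge 3$, these sequences cannot be divided evenly among the $n!$ permutations, so some permutations are more likely to appear than others.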
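The counting argument can be checked by brute force for a small array. The sketch below enumerates every possible sequence of random choices instead of sampling, using 0-based indices; the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def naive_outcomes(n):
    """Enumerate all n**n equally likely runs of: swap(a[i], a[random(0, n-1)])."""
    counts = Counter()
    for choices in product(range(n), repeat=n):
        a = list(range(n))
        for i, j in enumerate(choices):
            a[i], a[j] = a[j], a[i]
        counts[tuple(a)] += 1
    return counts


def fisher_yates_outcomes(n):
    """Enumerate all n! equally likely runs of: swap(a[i], a[random(i, n-1)])."""
    counts = Counter()
    index_ranges = [range(i, n) for i in range(n)]  # step i may only pick j >= i
    for choices in product(*index_ranges):
        a = list(range(n))
        for i, j in enumerate(choices):
            a[i], a[j] = a[j], a[i]
        counts[tuple(a)] += 1
    return counts


naive = naive_outcomes(3)         # 27 runs spread over 6 permutations
fair = fisher_yates_outcomes(3)   # 6 runs, one per permutation
```

For n = 3, the first algorithm reaches three permutations in 4 of its 27 runs and the other three in 5 runs, while the second algorithm reaches each permutation exactly once.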

How to calculate a probability density function

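Given a random variable $X$ and its probability density function $f_X(x)$, if another random variable $Y = g(X)$, what is the probability density function $f_Y(y)$ of $Y$? The common conclusion is: if $g$ is monotonic and differentiable, then $Y$ will satisfy $f_Y(y) = f_X(g^{-1}(y)) \left| \frac{d}{dy} g^{-1}(y) \right|$. The derivation is given below.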

A common mistake is to substitute $x = g^{-1}(y)$ into $f_X(x)$ and take the result, $f_X(g^{-1}(y))$, as the density of $Y$.

This is correct when $f_X$ is an ordinary function mapping.

However, it is wrong for a probability density function.

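The probability density function represents how likely a random variable is to fall in an interval $[a, b]$:

$$P(a \le X \le b) = \int_a^b f_X(x)\,dx = F_X(b) - F_X(a),$$

where $F_X$ is the primitive function (antiderivative) of $f_X$, that is, the cumulative distribution function $F_X(x) = P(X \le x)$.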

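To calculate the probability density function of $Y$, you should start from its primitive function: write $F_Y(y) = P(Y \le y)$ in terms of $X$, then differentiate with respect to $y$ to obtain $f_Y(y)$.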
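As a small worked illustration of starting from the primitive function, take the hypothetical case $X \sim U(0, 1)$ and $Y = 2X$. This example is an assumption chosen for illustration, not one taken from the text above.

```latex
\begin{aligned}
F_Y(y) &= P(Y \le y) = P(2X \le y) = P\!\left(X \le \tfrac{y}{2}\right)
        = F_X\!\left(\tfrac{y}{2}\right) = \frac{y}{2}, && 0 \le y \le 2, \\
f_Y(y) &= \frac{d}{dy} F_Y(y) = \frac{1}{2}, && 0 < y < 2.
\end{aligned}
```

Direct substitution would instead give $f_X(y/2) = 1$ on $(0, 2)$, which integrates to 2 rather than 1, so it cannot be a probability density function.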

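The derivation is listed below.

We need a general, universally applicable method for computing the probability density function of a new random variable. One possible approach can be found here.

An answer on Zhihu provides a general method that uses the Dirac delta function to compute the probability density function of a new random variable. For an introduction to the Dirac delta function and its properties, see Dirac delta function.

Specifically: a random variable $X$ follows a probability distribution with density $f_X(x)$, and applying a transformation $g$ to $X$ gives a new random variable $Y$, i.e. $Y = g(X)$. The probability density function of $Y$ can then be computed as

$$f_Y(y) = \int_{-\infty}^{\infty} f_X(x)\,\delta(y - g(x))\,dx.$$

Here $\delta$ is the Dirac delta function, which is nonzero only at $0$ and satisfies $\int_{-\infty}^{\infty} \delta(x)\,dx = 1$. The integral says that the density of $Y$ at $y$ is obtained by summing the density $f_X(x)$ over all $x$ that satisfy $g(x) = y$, which matches probabilistic intuition.

The key is how to evaluate an integral that contains the Dirac delta function. The $Y = g(X)$ above, with $g$ monotonic and differentiable, serves as the example here; for an example with multivariate probability density functions, see here. From the formula introduced earlier, we have

$$f_Y(y) = \int_{-\infty}^{\infty} f_X(x)\,\delta(y - g(x))\,dx.$$

Clearly, when $y$ is outside the range of $g$, $f_Y(y) = 0$. Since $\delta(y - g(x))$ is nonzero only at $x = g^{-1}(y)$, we can treat $f_X(x)$ as a constant in the integral, namely $f_X(g^{-1}(y))$. Let $x_0 = g^{-1}(y)$; then $y - g(x) \approx -g'(x_0)\,(x - x_0)$ near $x_0$, so

$$f_Y(y) = f_X(x_0) \int_{-\infty}^{\infty} \delta(y - g(x))\,dx = f_X(x_0) \int_{-\infty}^{\infty} \frac{\delta(x - x_0)}{|g'(x_0)|}\,dx = \frac{f_X(x_0)}{|g'(x_0)|}.$$

Here the second equality changes $\delta(y - g(x))$ into $\delta(x - x_0) / |g'(x_0)|$. The reason is a geometric property of the integral of the Dirac delta function: the coordinate axis is rescaled, so to keep the area under the integral equal to 1, the height must be rescaled by a factor of $1 / |g'(x_0)|$. See the link in the side note above for details.

Clearly, since $\frac{d}{dy} g^{-1}(y) = \frac{1}{g'(x_0)}$, we get

$$f_Y(y) = f_X(g^{-1}(y)) \left| \frac{d}{dy} g^{-1}(y) \right|.$$

This is the same as the answer above.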
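The delta-function integral can also be checked numerically by replacing the Dirac delta with a narrow Gaussian. In the sketch below, the choice $X \sim \mathrm{Exp}(1)$ with $Y = X^2$, the width `eps`, and all function names are assumptions made for illustration only.

```python
import math


def f_x(x):
    """Density of X ~ Exp(1) (an illustrative choice)."""
    return math.exp(-x) if x >= 0 else 0.0


def g(x):
    """Illustrative transform Y = X**2, monotonic on x >= 0."""
    return x * x


def nascent_delta(t, eps):
    """Narrow Gaussian that approximates the Dirac delta as eps -> 0."""
    return math.exp(-0.5 * (t / eps) ** 2) / (eps * math.sqrt(2.0 * math.pi))


def f_y_delta(y, eps=1e-2, x_max=10.0, steps=100_000):
    """Midpoint-rule approximation of the integral of f_X(x) * delta(y - g(x)) dx."""
    h = x_max / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += f_x(x) * nascent_delta(y - g(x), eps)
    return total * h


def f_y_closed(y):
    """Closed form f_X(g^{-1}(y)) * |d g^{-1}(y) / dy| with g^{-1}(y) = sqrt(y)."""
    return math.exp(-math.sqrt(y)) / (2.0 * math.sqrt(y))


approx = f_y_delta(1.0)   # numerical delta-function integral at y = 1
exact = f_y_closed(1.0)   # exp(-1) / 2 from the change-of-variables formula
```

The two values agree to within the error introduced by the finite width of the Gaussian, which is consistent with the delta-function derivation and the formula obtained from the primitive function giving the same density.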

Probabilistic Stuffs - Canyu Le