In probability theory, the central limit theorem (CLT) states that, under certain conditions, the arithmetic mean of a sufficiently large number of independent draws of a random variable with a well-defined expected value (mean) and finite variance will be approximately normally distributed, regardless of the underlying distribution.
So if samples of size n are drawn randomly from a population that has a mean µ and a standard deviation σ, the sample means x̄ are approximately normally distributed for sufficiently large sample sizes (a common rule of thumb is n > 30), regardless of the shape of the population distribution.
The mean of the sample means is the same as the population mean µ, and their standard deviation is σ/√n. This altered mean and standard deviation are then used in calculating normal probabilities. The image above is a visual representation of the concept under discussion. As the number of observations n in each sample increases, the distribution of the sample means comes closer and closer to a bell-shaped curve.
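The claim that the sample means center on µ with standard deviation σ/√n can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the exponential population (mean 1, standard deviation 1), the sample sizes, the number of trials, and the helper name `sample_means` are all chosen for this example and are not part of the original discussion.

```python
import random
import statistics


def sample_means(n, trials, rng):
    """Draw `trials` samples of size n from an exponential population
    (rate 1, so mean 1 and standard deviation 1) and return their means."""
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
for n in (2, 30, 200):
    means = sample_means(n, 5000, rng)
    observed_mean = statistics.fmean(means)
    observed_sd = statistics.stdev(means)
    predicted_sd = 1.0 / n ** 0.5  # sigma / sqrt(n) with sigma = 1
    print(n, round(observed_mean, 3), round(observed_sd, 3), round(predicted_sd, 3))
```

The population here is strongly right-skewed, yet the observed mean of the sample means should stay close to 1 and their observed standard deviation should track the predicted value as n grows.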
About Normal Distribution: The normal distribution is described or characterized by two parameters: the mean µ and the standard deviation σ. Each pair of values of µ and σ produces a different normal distribution. The density function of the normal distribution is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$
Using integral calculus to determine areas under the normal curve from this function is difficult and time-consuming, so virtually all researchers use table values to analyze normal distribution problems rather than this formula. A mechanism was developed by which all normal distributions can be converted into a single distribution: the z distribution. This process yields the standardized normal distribution scores z, also known as Gaussian scores.
The conversion formula for these Gaussian scores is
$$z = \frac{x - \mu}{\sigma}$$
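As a rough illustration of the conversion, the sketch below standardizes a value x to z = (x − µ)/σ and then looks up the area under the standard normal curve to the left of z. It uses the error function from Python's standard library in place of a printed table. The function names and the numbers (µ = 100, σ = 15, x = 130) are hypothetical, picked only for the example.

```python
import math


def z_score(x, mu, sigma):
    """Convert a raw value x to a standardized z score."""
    return (x - mu) / sigma


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z,
    computed from the error function instead of a lookup table."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma = 100.0, 15.0  # hypothetical population parameters
z = z_score(130.0, mu, sigma)
p = standard_normal_cdf(z)
print(z, round(p, 4))  # 2.0 0.9772
```

The printed probability matches the usual z table entry for z = 2.00, which is the point of standardizing: one table (or one function) serves every normal distribution.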
The normal distribution is also popularly known as the bell-shaped distribution. It is symmetric around its mean and very robust in nature, and it finds applications across many situations in research, the social sciences, biostatistics, business and other fields.
Significance of CLT:
As a result of the central limit theorem, we can use the following altered formula for z scores, which lets us use the normal distribution for prediction in analytics:
$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$
One of the essential conditions for doing so is a large sample size.
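A short worked example may help here. For a sample mean, the z score divides by the standard error σ/√n rather than by σ. The sketch below assumes a hypothetical population with µ = 50 and σ = 12, a sample of n = 36, and asks how likely a sample mean above 53 would be. The function names are illustrative, and the normal approximation is only as good as the sample size allows.

```python
import math


def sample_mean_z(x_bar, mu, sigma, n):
    """z score of a sample mean: distance from mu in units of sigma / sqrt(n)."""
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu) / standard_error


def standard_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mu, sigma, n = 50.0, 12.0, 36  # hypothetical population and sample size
z = sample_mean_z(53.0, mu, sigma, n)  # standard error is 12 / 6 = 2
p_above = 1.0 - standard_normal_cdf(z)  # chance of a sample mean above 53
print(round(z, 2), round(p_above, 4))
```

Under these assumptions the standard error is 2, so a sample mean of 53 sits 1.5 standard errors above µ, and a result that high or higher is expected in roughly 7% of samples.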
In short, the central limit theorem creates the potential for applying the normal distribution to many problems when the sample size is sufficiently large. My take on this central limit theorem is a little philosophical. I call it The Midas Touch.