Statistics for Economics and Business, Agus Salim

Session 1: Statistics for Economics and Business


  • Statistics for Economics and Business
    Agus Salim

  • Session One: Introduction and Frequency Distributions
    Meaning and uses of statistics
    Types of data: quantitative and qualitative
    Definition of population and sample
    Measures of central tendency and dispersion
    Summary of central values
    Standard deviation of data
    Coefficient of variation of data
    Computing quartiles and percentiles

  • Before We Begin
    Elect a class representative
    Create a class email account
    Teaching method to be used: SCL
    Form groups
    Group assignments

  • The Meaning and Uses of Statistics
    What is statistics? Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting numerical data to assist in making more effective decisions.
    What is statistics used for? Statistical techniques are used extensively in marketing, accounting, and quality control, and by consumers, professional sports people, hospital administrators, educators, politicians, physicians, and others.

  • Types of Data: Qualitative and Quantitative

    A. Qualitative or attribute data (variable): the characteristic being studied is nonnumeric. Examples: gender, religious affiliation, type of automobile owned, state of birth, eye color.

    B. Quantitative data (variable): information is reported numerically. Examples: balance in your checking account, minutes remaining in class, number of children in a family.

  • Summary of Types of Data

  • Definition of Population and Sample

  • Central Tendency
    Central tendency is the middle or typical value of a distribution.
    It can be assessed using a dot plot or histogram, or more precisely with numerical statistics.

  • Central Tendency: Six Measures of Central Tendency

    Statistic: Mean
    Formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$
    Excel formula: =AVERAGE(Data)
    Pro: Familiar and uses all the sample information.
    Con: Influenced by extreme values.

    Statistic: Median
    Formula: middle value in the sorted array
    Excel formula: =MEDIAN(Data)
    Pro: Robust when extreme data values exist.
    Con: Ignores extremes and can be affected by gaps in data values.

  • Central Tendency: Six Measures of Central Tendency (continued)

    Statistic: Mode
    Formula: most frequently occurring data value
    Excel formula: =MODE(Data)
    Pro: Useful for attribute data or discrete data with a small range.
    Con: May not be unique, and is not helpful for continuous data.

    Statistic: Midrange
    Formula: $(x_{\min} + x_{\max})/2$
    Excel formula: =0.5*(MIN(Data)+MAX(Data))
    Pro: Easy to understand and calculate.
    Con: Influenced by extreme values and ignores most data values.

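  • Central Tendency: Six Measures of Central Tendency (continued)

    Statistic: Geometric mean (G)
    Formula: $G = \sqrt[n]{x_1 x_2 \cdots x_n}$
    Excel formula: =GEOMEAN(Data)
    Pro: Useful for growth rates and mitigates high extremes.
    Con: Less familiar and requires positive data.

    Statistic: Trimmed mean
    Formula: same as the mean, except that the highest and lowest k% of data values (e.g., 5%) are omitted
    Excel formula: =TRIMMEAN(Data, Percent), where Percent is the total fraction trimmed from both ends combined (e.g., 0.10 for k = 5%)
    Pro: Mitigates effects of extreme values.
    Con: Excludes some data values that could be relevant.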

  • Central Tendency: Mean
    A familiar measure of central tendency.
    In Excel, use the function =AVERAGE(Data), where Data is an array of data values.

    Population formula: $\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$
    Sample formula: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$

  • Central Tendency: Mean
    For the sample of n = 37 car brands, the mean quality rating is $\bar{x} = 125.38$.

  • Central Tendency: Characteristics of the Mean
    The arithmetic mean is the most familiar average.
    It is affected by every sample item.
    It is the balancing point, or fulcrum, for the data.

  • Central Tendency: Characteristics of the Mean
    Regardless of the shape of the distribution, the signed deviations of the data points from the mean always sum to zero: the total distance of the points below the mean equals the total distance of the points above it.
    Consider the following asymmetric distribution of quiz scores whose mean = 65.
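    The zero-sum property of deviations follows directly from the definition of the sample mean. As a brief derivation, with $\bar{x}$ denoting the mean of the observations $x_1, \dots, x_n$:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0
```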

  • Central Tendency: Median
    The median (M) is the 50th percentile, or midpoint, of the sorted sample data.
    M separates the upper and lower halves of the sorted observations.
    If n is odd, the median is the middle observation in the data array.
    If n is even, the median is the average of the middle two observations in the data array.

  • Central Tendency: Median
    For n = 8, the median is between the fourth and fifth observations in the data array.
    For n = 9, the median is the fifth observation in the data array.

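  • Central Tendency: Median
    Consider the following n = 6 data values: 11 12 15 17 21 32. What is the median?
    Because n is even, the median lies between the observations at positions n/2 = 6/2 = 3 and n/2 + 1 = 6/2 + 1 = 4:
    M = (x_3 + x_4)/2 = (15 + 17)/2 = 16
    The median 16 falls between 15 and 17 in the sorted data: 11 12 15 [16] 17 21 32.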

  • Central Tendency: Median
    Consider the following n = 7 data values: 12 23 23 25 27 34 41. What is the median?
    Because n is odd, the median is the observation at position (n+1)/2 = (7+1)/2 = 8/2 = 4:
    M = x_4 = 25
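    The odd/even rule for the median can be sketched in Python as follows. This is a minimal illustration, not part of the original slides; the function name `median` and the variable names are chosen here for the example, and the two samples are the ones from the slides above.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    data = sorted(values)
    n = len(data)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: the middle observation, position (n + 1) / 2 in 1-based terms.
        return data[mid]
    # Even n: average of the observations at positions n/2 and n/2 + 1.
    return (data[mid - 1] + data[mid]) / 2


even_sample = [11, 12, 15, 17, 21, 32]      # n = 6
odd_sample = [12, 23, 23, 25, 27, 34, 41]   # n = 7

print(median(even_sample))  # 16.0
print(median(odd_sample))   # 25
```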

  • Central Tendency: Median
    Use Excel's function =MEDIAN(Data), where Data is an array of data values.
    For the 37 vehicle quality ratings (odd n), the position of the median is (n+1)/2 = (37+1)/2 = 19. So, the median is x_19 = 121.
    When there are several duplicate data values, the median does not provide a clean 50-50 split in the data.

  • Central Tendency: Characteristics of the Median
    The median is insensitive to extreme data values.
    For example, consider the following quiz scores for 3 students:
    Tom's scores: 20, 40, 70, 75, 80. Mean = 57, Median = 70, Total = 285
    Jake's scores: 60, 65, 70, 90, 95. Mean = 76, Median = 70, Total = 380
    Mary's scores: 50, 65, 70, 75, 90. Mean = 70, Median = 70, Total = 350
    What does the median for each student tell you?
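    The quiz-score comparison can be reproduced with Python's standard `statistics` module. This is an illustrative sketch; the dictionary layout and variable names are assumptions made for the example. It shows that the three students share a median while their means and totals differ.

```python
from statistics import mean, median

# Quiz scores from the slide above.
scores = {
    "Tom": [20, 40, 70, 75, 80],
    "Jake": [60, 65, 70, 90, 95],
    "Mary": [50, 65, 70, 75, 90],
}

# For each student: (mean, median, total).
summary = {name: (mean(s), median(s), sum(s)) for name, s in scores.items()}

for name, (avg, med, total) in summary.items():
    print(f"{name}: mean={avg}, median={med}, total={total}")
```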

  • Central Tendency: Mode
    The mode is the most frequently occurring data value.
    It is similar to the mean and median if data values occur often near the center of the sorted data.
    A data set may have multiple modes or no mode.

  • Central Tendency: Mode
    For example, consider the following quiz scores for 4 students:
    Lee's scores: 60, 70, 70, 70, 80. Mean = 70, Median = 70, Mode = 70
    Pat's scores: 45, 45, 70, 90, 100. Mean = 70, Median = 70, Mode = 45
    Sam's scores: 50, 60, 70, 80, 90. Mean = 70, Median = 70, Mode = none
    Xiao's scores: 50, 50, 70, 90, 90. Mean = 70, Median = 70, Modes = 50, 90
    What does the mode for each student tell you?
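    The four cases above (one mode, a mode far from the center, no mode, two modes) can be checked with a small helper. This is a sketch under the convention used on the slide, where a data set in which no value repeats is reported as having no mode; the function name `modes` is hypothetical.

```python
from collections import Counter


def modes(values):
    """Return all most-frequent values, or [] when no value repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        # Every value occurs once: treat as "no mode", as on the slide.
        return []
    return sorted(v for v, c in counts.items() if c == top)


quiz = {
    "Lee": [60, 70, 70, 70, 80],
    "Pat": [45, 45, 70, 90, 100],
    "Sam": [50, 60, 70, 80, 90],
    "Xiao": [50, 50, 70, 90, 90],
}

for name, s in quiz.items():
    print(name, modes(s))
```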

  • Central Tendency: Mode
    Easy to define, but not easy to calculate in large samples.
    Use Excel's function =MODE(Array):
    it returns #N/A if there is no mode;
    it returns the first mode found if the data are multimodal.
    The mode may be far from the middle of the distribution and not at all typical.

  • Central Tendency: Mode
    The mode generally isn't useful for continuous data, since data values rarely repeat.
    It is best for attribute data or a discrete variable with a small range (e.g., a Likert scale).

  • Central Tendency: Example: Price/Earnings Ratios and Mode
    Consider the following P/E ratios for a random sample of 68 Standard & Poor's 500 stocks. What is the mode?

    7 8 8 10 10 10 10 12 13 13 13 13 13 13 13 14 14
    14 15 15 15 15 15 16 16 16 17 18 18 18 18 19 19 19
    19 19 20 20 20 21 21 21 22 22 23 23 23 24 25 26 26
    26 26 27 29 29 30 31 34 36 37 40 41 45 48 55 68 91

  • Central Tendency: Example: Price/Earnings Ratios and Mode
    Excel's descriptive statistics results are shown below. The mode 13 occurs 7 times, but what does the dot plot show?

    Mean: 22.7206
    Median: 19
    Mode: 13
    Range: 84
    Minimum: 7
    Maximum: 91
    Sum: 1545
    Count: 68
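    These descriptive statistics can be reproduced in Python. The sketch below is illustrative only: it assumes the 68 P/E ratios are the values read from the flattened data row on the previous slide, stored here as a value-to-frequency table (the name `pe_counts` is chosen for the example).

```python
from statistics import mean, median, mode

# P/E ratios from the slide, as value -> frequency (assumed reading of the data row).
pe_counts = {
    7: 1, 8: 2, 10: 4, 12: 1, 13: 7, 14: 3, 15: 5, 16: 3, 17: 1, 18: 4,
    19: 5, 20: 3, 21: 3, 22: 2, 23: 3, 24: 1, 25: 1, 26: 4, 27: 1, 29: 2,
    30: 1, 31: 1, 34: 1, 36: 1, 37: 1, 40: 1, 41: 1, 45: 1, 48: 1, 55: 1,
    68: 1, 91: 1,
}
# Expand the frequency table into the sorted list of observations.
pe_ratios = [v for v, c in sorted(pe_counts.items()) for _ in range(c)]

summary = {
    "Mean": mean(pe_ratios),
    "Median": median(pe_ratios),
    "Mode": mode(pe_ratios),
    "Range": max(pe_ratios) - min(pe_ratios),
    "Minimum": min(pe_ratios),
    "Maximum": max(pe_ratios),
    "Sum": sum(pe_ratios),
    "Count": len(pe_ratios),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```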

  • Central Tendency: Example: Price/Earnings Ratios and Mode
    The dot plot shows local modes (a peak with valleys on either side) at 10, 13, 15, 19, 23, 26, and 29.
    These multiple modes suggest that the mode is not a stable measure of central tendency.

  • Central Tendency: Example: Rose Bowl Winners' Points
    Points scored by the winning NCAA football team tend to have modes at multiples of 7, because a touchdown with a successful extra point yields 7 points.
    Consider the dot plot of the points scored by the winning team in the first 87 Rose Bowl games. What is the mode?

  • Central Tendency: Mode
    A bimodal distribution refers to the shape of the histogram rather than the mode of the raw data.
    It occurs when dissimilar populations are combined in one sample.
    [Example figure not available]

  • Central Tendency: Skewness
    Compare the mean and median, or look at the histogram, to determine the degree of skewness.

  • Central Tendency: Symptoms of Skewness

    Distribution's shape: Skewed left (negative skewness)
    Histogram appearance: Long tail of histogram points left (a few low values but most data on right)
    Statistics: Mean < Median

    Distribution's shape: Symmetric
    Histogram appearance: Tails of histogram are balanced (low/high values offset)
    Statistics: Mean ≈ Median

    Distribution's shape: Skewed right (positive skewness)
    Histogram appearance: Long tail of histogram points right (most data on left but a few high values)
    Statistics: Mean > Median

  • Central Tendency: Skewness
    For the sample of J.D. Power quality ratings, the mean (125.38) exceeds the median (121). What does this suggest?
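    The mean-versus-median rule of thumb for skewness can be written as a small check. This is a rough sketch, not a formal skewness test: the helper name `describe_skew` and its `tolerance` parameter are hypothetical, and in practice one would also inspect the histogram.

```python
def describe_skew(mean, median, tolerance=0.0):
    """Classify skewness from the mean/median comparison (rule of thumb only)."""
    if mean - median > tolerance:
        return "skewed right"             # Mean > Median
    if median - mean > tolerance:
        return "skewed left"              # Mean < Median
    return "approximately symmetric"      # Mean close to Median


# J.D. Power quality ratings: mean 125.38, median 121 (values from the slide).
jd_power = describe_skew(125.38, 121)
print(jd_power)
```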

  • Central Tendency: Geometric Mean
    The geometric mean (G) is a multiplicative average: $G = \sqrt[n]{x_1 x_2 \cdots x_n}$.
    For the J.D. Power quality data (n = 37), G can be computed in Excel with =GEOMEAN(Array).
    The geometric mean tends to mitigate the effects of high outliers.

  • Central Tendency: Growth Rates
    A variation on the geometric mean is used to find the average growth rate for a time series.
    For example, from 1998 to 2002, Spirit Airlines' revenues are:

    Year    Revenue (mil)
    1998    131
    1999    227
    2000    311
    2001    354
    2002    403

  • Central Tendency: Growth Rates
    The average growth rate is given by taking the geometric mean of the ratios of each year's revenue to the preceding year's.
    Due to cancellations, only the first and last years are relevant. With revenue of 131 in 1998 and 403 in 2002 (four annual growth periods):
    $\sqrt[4]{403/131} - 1 \approx 1.3244 - 1 = 0.324$, or about 32.4% per year.
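    The cancellation argument can be checked numerically. The sketch below is illustrative: it computes the geometric mean of the year-over-year revenue ratios from the table above and compares it with the shortcut that uses only the first and last years; the variable names are chosen for this example.

```python
import math

# Spirit Airlines revenue (millions), from the table on the slide.
revenue = {1998: 131, 1999: 227, 2000: 311, 2001: 354, 2002: 403}
years = sorted(revenue)

# Ratio of each year's revenue to the preceding year's revenue.
ratios = [revenue[b] / revenue[a] for a, b in zip(years, years[1:])]

# Geometric mean of the ratios.
geo_mean_ratio = math.prod(ratios) ** (1 / len(ratios))

# Shortcut: intermediate years cancel, leaving (last / first) ** (1 / periods).
periods = len(years) - 1
shortcut = (revenue[years[-1]] / revenue[years[0]]) ** (1 / periods)

print(f"geometric mean of ratios: {geo_mean_ratio:.4f}")
print(f"shortcut formula:         {shortcut:.4f}")
print(f"average growth rate:      {100 * (shortcut - 1):.1f}% per year")
```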

  • Central Tendency: Midrange
    The midrange is the point halfway between the lowest and highest values of X: $(x_{\min} + x_{\max})/2$.
    It is easy to use but sensitive to extreme data values.
    For the J.D. Power quality data (n = 37), the midrange (130) is higher than the mean (125.38) or median (121).

  • Central Tendency: Trimmed Mean
    To calculate the trimmed mean, first remove the highest and lowest k percent of the observations.
    For example, for the n = 68 P/E ratios, we want a 5 percent trimmed mean (i.e., k = 0.05).
    To determine how many observations to trim, multiply k × n = 0.05 × 68 = 3.4, which is rounded down to 3 observations. So, we would remove the three smallest and three largest observations before averaging the remaining values.

  • Central Tendency: Trimmed Mean
    Here is a summary of all the measures of central tendency for the n = 68 P/E values. The trimmed mean mitigates the effects of very high values, but still exceeds the median.

    Mean:              22.72    =AVERAGE(PERatio)
    Median:            19.00    =MEDIAN(PERatio)
    Mode:              13.00    =MODE(PERatio)
    Geometric mean:    19.85    =GEOMEAN(PERatio)
    Midrange:          49.00    =(MIN(PERatio)+MAX(PERatio))/2
    5% trimmed mean:   21.10    =TRIMMEAN(PERatio,0.1)
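    The less familiar measures in this summary (geometric mean, midrange, trimmed mean) can be sketched in Python. This is an illustration under stated assumptions: the P/E values are those read from the flattened data row of the earlier slide, the helper names are hypothetical, and `trimmed_mean` takes k as the fraction removed from each end (whereas Excel's TRIMMEAN takes the total fraction for both ends, hence 0.1 above).

```python
import math
from statistics import mean

# P/E ratios as value -> frequency (assumed reading of the slide's data row).
pe_counts = {
    7: 1, 8: 2, 10: 4, 12: 1, 13: 7, 14: 3, 15: 5, 16: 3, 17: 1, 18: 4,
    19: 5, 20: 3, 21: 3, 22: 2, 23: 3, 24: 1, 25: 1, 26: 4, 27: 1, 29: 2,
    30: 1, 31: 1, 34: 1, 36: 1, 37: 1, 40: 1, 41: 1, 45: 1, 48: 1, 55: 1,
    68: 1, 91: 1,
}
pe_ratios = [v for v, c in sorted(pe_counts.items()) for _ in range(c)]


def geometric_mean(values):
    """n-th root of the product, computed via logs; requires positive data."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def midrange(values):
    """Point halfway between the smallest and largest value."""
    return (min(values) + max(values)) / 2


def trimmed_mean(values, k):
    """Mean after dropping the floor(k * n) smallest and largest values."""
    data = sorted(values)
    cut = int(k * len(data))  # e.g. 0.05 * 68 = 3.4 -> 3 from each end
    return mean(data[cut:len(data) - cut])


print(f"Geometric mean:  {geometric_mean(pe_ratios):.2f}")
print(f"Midrange:        {midrange(pe_ratios):.2f}")
print(f"5% trimmed mean: {trimmed_mean(pe_ratios, 0.05):.2f}")
```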

  • Central Tendency: Trimmed Mean
    The Federal Reserve uses a 16% trimmed mean to mitigate the effects of extremes in its analysis of the Consumer Price Index.

  • Dispersion: Measures of Variation
    Variation is the spread of data points about the center of the distribution in a sample. Consider the following measures of dispersion:

    Statistic: Range
    Formula: $x_{\max} - x_{\min}$
    Excel: =MAX(Data)-MIN(Data)
    Pro: Easy to calculate.
    Con: Sensitive to extreme data values.

    Statistic: Variance ($s^2$)
    Formula: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
    Excel: =VAR(Data)
    Pro: Plays a key role in mathematical statistics.
    Con: Non-intuitive meaning.

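  • Dispersion: Measures of Variation (continued)

    Statistic: Standard deviation (s)
    Formula: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$
    Excel: =STDEV(Data)
    Pro: Most common measure. Uses the same units as the raw data (e.g., $).
    Con: Non-intuitive meaning.

    Statistic: Coefficient of variation (CV)
    Formula: $CV = 100 \times s / \bar{x}$
    Excel: None
    Pro: Measures relative variation in percent, so data sets can be compared.
    Con: Requires non-negative data.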

  • Dispersion: Measures of Variation (continued)

    Statistic: Mean absolute deviation (MAD)
    Formula: $MAD = \frac{1}{n}\sum_{i=1}^{n} |x_i - \bar{x}|$
    Excel: =AVEDEV(Data)
    Pro: Easy to understand.
    Con: Lacks nice theoretical properties.

  • Dispersion: Range
    The range is the difference between the largest and smallest observation:
    Range = $x_{\max} - x_{\min}$
    For example, for the n = 68 P/E ratios, Range = 91 - 7 = 84.

  • Dispersion: Variance
    The population variance ($\sigma^2$) is defined as the sum of squared deviations around the mean $\mu$, divided by the population size.
    For the sample variance ($s^2$), we divide by n - 1 instead of n; otherwise $s^2$ would tend to underestimate the unknown population variance $\sigma^2$.
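    Written out, the two definitions read as follows. The notation is an assumption consistent with the slide: $N$ is the population size, $n$ the sample size, $\mu$ the population mean, and $\bar{x}$ the sample mean.

```latex
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2
\qquad\text{(population variance)}

s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
\qquad\text{(sample variance)}
```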

  • Dispersion: Standard Deviation
    The standard deviation is the square root of the variance.
    Its units of measure are the same as those of X.
    It explains how individual values in a data set vary from the mean.

  • Dispersion: Standard Deviation
    Excel's built-in functions are:

    Statistic             Excel population formula    Excel sample formula
    Variance              =VARP(Array)                =VAR(Array)
    Standard deviation    =STDEVP(Array)              =STDEV(Array)

  • Dispersion: Calculating a Standard Deviation
    Consider the following five quiz scores for Stephanie.

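  • Dispersion: Calculating a Standard Deviation
    Now, calculate the sample standard deviation: $s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}}$
    Somewhat easier, the two-sum formula can also be used: $s = \sqrt{\frac{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2 / n}{n-1}}$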
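    Both ways of computing the sample standard deviation can be compared in Python. Since Stephanie's scores are not reproduced in this transcript, the sketch below uses Tom's quiz scores from the earlier median slide as stand-in data; the function names are hypothetical.

```python
import math
from statistics import stdev


def stdev_definitional(values):
    """Sample standard deviation from squared deviations around the mean."""
    n = len(values)
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))


def stdev_two_sum(values):
    """Sample standard deviation from the sum of x and the sum of x squared."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))


tom_scores = [20, 40, 70, 75, 80]  # stand-in data from the median slide

print(round(stdev_definitional(tom_scores), 2))
print(round(stdev_two_sum(tom_scores), 2))
print(round(stdev(tom_scores), 2))  # library result for comparison
```

    All three results agree. The two-sum form avoids computing each deviation, but it can lose precision when the values are large relative to their spread.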

  • Dispersion: Calculating a Standard Deviation
    The standard deviation is nonnegative because deviations around the mean are squared.
    When every observation is exactly equal to the mean, the standard deviation is zero.
    Standard deviations can be large or small, depending on the units of measure.
    Compare standard deviations only for data sets measured in the same units, and only if the means do not differ substantially.

  • Dispersion: Coefficient of Variation
    Useful for comparing variables measured in different units or with different means.
    A unit-free measure of dispersion.
    Expressed as a percent of the mean.
    Only appropriate for nonnegative data. It is undefined if the mean is zero or negative.

  • Dispersion: Coefficient of Variation
    For example:

    Defect rates (n = 37): s = 22.89, $\bar{x}$ = 125.38 gives CV = 100 × (22.89)/(125.38) = 18%
    ATM deposits (n = 100): s = 280.80, $\bar{x}$ = 233.89 gives CV = 100 × (280.80)/(233.89) = 120%
    P/E ratios (n = 68): s = 14.08, $\bar{x}$ = 22.72 gives CV = 100 × (14.08)/(22.72) = 62%
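    The three coefficients of variation can be recomputed with a short helper. This is a minimal sketch using the summary figures quoted on the slide; the function name is hypothetical, and the guard reflects the slide's statement that the CV is undefined when the mean is zero or negative.

```python
def coefficient_of_variation(s, x_bar):
    """Return the CV in percent: 100 * s / x_bar (requires a positive mean)."""
    if x_bar <= 0:
        raise ValueError("CV is undefined when the mean is zero or negative")
    return 100 * s / x_bar


# (standard deviation, mean) pairs quoted on the slide.
examples = {
    "Defect rates": (22.89, 125.38),
    "ATM deposits": (280.80, 233.89),
    "P/E ratios": (14.08, 22.72),
}

cv = {name: coefficient_of_variation(s, m) for name, (s, m) in examples.items()}

for name, value in cv.items():
    print(f"{name}: CV = {value:.0f}%")
```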

  • Dispersion: Mean Absolute Deviation
    The mean absolute deviation (MAD) reveals the average distance from an individual data point to the mean (the center of the distribution).
    It uses the absolute values of the deviations around the mean.
    Excel's function is =AVEDEV(Array).
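    As a brief illustration, the MAD can be computed directly from its definition. The sketch below assumes the same convention as Excel's AVEDEV (dividing by n) and reuses Tom's quiz scores from the earlier median slide as example data; the function name is hypothetical.

```python
def mean_absolute_deviation(values):
    """Average absolute distance of the data points from their mean."""
    n = len(values)
    x_bar = sum(values) / n
    return sum(abs(x - x_bar) for x in values) / n


tom_scores = [20, 40, 70, 75, 80]  # mean = 57
mad = mean_absolute_deviation(tom_scores)
print(mad)  # absolute deviations 37, 17, 13, 18, 23 average to 21.6
```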

  • Dispersion: Central Tendency vs. Dispersion: Manufacturing
    Consider the histograms of hole diameters drilled in a steel plate during manufacturing.
    The desired distribution is outlined in red.

  • Dispersion: Central Tendency vs. Dispersion: Manufacturing
    Desired mean (5 mm) but too much variation.
    Acceptable variation but mean is less than 5 mm.
    Take frequent samples to monitor quality.

  • Dispersion: Central Tendency vs. Dispersion: Job Performance
    Consider student ratings of four professors on eight teaching attributes (10-point scale).

  • Dispersion: Central Tendency vs. Dispersion: Job Performance
    Jones and Wu have identical means but different standard deviations.

  • Dispersion: Central Tendency vs. Dispersion: Job Performance
    Smith and Gopal have different means but identical standard deviations.

  • Dispersion: Central Tendency vs. Dispersion: Job Performance
    A high mean (better rating) and low standard deviation (more consistency) is preferred. Which professor do you think is best?

  • Happy studying!