abhishek-jain
8/6/2019 MB0040 - AJ
MBA-1 | Subject Code: Statistics For Management
MBA SEMESTER 1
MB0040 STATISTICS FOR MANAGEMENT
Assignment Set - 1
Q1. Elucidate the functions of statistics.
A1.:
(1) Statistics helps in providing a better understanding and exact description of a phenomenon of nature.
(2) Statistics helps in the proper and efficient planning of a statistical inquiry in any field of study.
(3) Statistics helps in collecting appropriate quantitative data.
(4) Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for an easy and clear comprehension of the data.
(5) Statistics helps in understanding the nature and pattern of variability of a phenomenon through quantitative observations.
(6) Statistics helps in drawing valid inferences, along with a measure of their reliability, about the population parameters from the sample data.
Q2. What are the methods of statistical survey? Explain briefly.
A2.: Statistical surveys are used to collect quantitative information about items in a population. Surveys of human populations and institutions are common in political polling and government, health, social science and marketing research. A survey may focus on opinions or factual information depending on its purpose, and many surveys involve administering questions to individuals. When the questions are administered by a researcher, the survey is called a structured interview or a researcher-administered survey. When the questions are administered by the respondent, the survey is referred to as a questionnaire or a self-administered survey.
Structure and standardization
The questions are usually structured and standardized. The structure is intended to reduce bias. For example, questions should be ordered in such a way that a question does not influence the response to subsequent questions. Surveys are standardized to ensure reliability, generalizability, and validity. Every respondent should be presented with the same questions and in the same order as other respondents.
In organizational development (OD), carefully constructed survey instruments are often used as the basis for data gathering, organizational diagnosis, and subsequent action planning. Some OD practitioners (e.g. Fred Nickols) even consider survey-guided development as the sine qua non of OD.
Serial surveys
Serial surveys are those which repeat the same questions at different points in time, producing repeated measures data. There are three basic designs for a study with more than one measurement occasion: cross-sectional design, longitudinal design, and time-series design.
- Cross-sectional surveys use different units (respondents) at each of the measurement occasions, by drawing a new sample each time. The time intervals may be different between measurement occasions, but they are the same for all units (respondents). A study in which a survey is administered once is also considered to be cross-sectional.
- Longitudinal surveys use the same units (respondents) at each of the measurement occasions, by recontacting the same sample from the initial survey for the following measurement occasion(s), and asking the same questions at every occasion. The time intervals may be different between measurement occasions, but they are the same for all units (respondents).
- Time-series surveys also use the same units (respondents) at each of the measurement occasions, but the difference from longitudinal study designs is that in time-series designs both the number of measurement occasions and the time intervals between occasions may be different between units (respondents).
Modes of Data Collection
There are several ways of administering a survey, including:
Telephone
- use of interviewers encourages sample persons to respond, leading to higher response rates
- interviewers can increase comprehension of questions by answering respondents' questions
- fairly cost efficient, depending on local call charge structure
- good for large national (or international) sampling frames
- some potential for interviewer bias (e.g. some people may be more willing to discuss a sensitive issue with a female interviewer than with a male one)
- cannot be used for non-audio information (graphics, demonstrations, taste/smell samples)
- unreliable for consumer surveys in rural areas where telephone penetration is low
- three types:
  o traditional telephone interviews
  o computer assisted telephone dialing
  o computer assisted telephone interviewing (CATI)
Mail
- the questionnaire may be handed to the respondents or mailed to them, but in all cases they are returned to the researcher via mail
- cost is very low, since bulk postage is cheap in most countries
- long time delays, often several months, before the surveys are returned and statistical analysis can begin
- not suitable for issues that may require clarification
- respondents can answer at their own convenience (allowing them to break up long surveys; also useful if they need to check records to answer a question)
- no interviewer bias introduced
- large amount of information can be obtained: some mail surveys are as long as 50 pages
- response rates can be improved by using mail panels
  o members of the panel have agreed to participate
  o panels can be used in longitudinal designs where the same respondents are surveyed several times
Online surveys
- can use web or e-mail
- web is preferred over e-mail because interactive HTML forms can be used
- often inexpensive to administer
- very fast results
- easy to modify
- response rates can be improved by using online panels - members of the panel have agreed to participate
- if not password-protected, easy to manipulate by completing multiple times to skew results
- data creation, manipulation and reporting can be automated and/or easily exported into a format which can be read by PSPP, DAP or other statistical analysis software
- data sets created in real time
- some are incentive based (such as Survey Vault or YouGov)
- may skew sample towards a younger demographic compared with CATI
- often difficult to determine/control selection probabilities, hindering quantitative analysis of data
- use in large scale industries
Personal in-home survey
- respondents are interviewed in person, in their homes (or at the front door)
- very high cost
- suitable when graphic representations, smells, or demonstrations are involved
- often suitable for long surveys (but some respondents object to allowing strangers into their home for extended periods)
- suitable for locations where telephone or mail are not developed
- skilled interviewers can persuade respondents to cooperate, improving response rates
- potential for interviewer bias
Personal mall intercept survey
- shoppers at malls are intercepted - they are either interviewed on the spot, taken to a room and interviewed, or taken to a room and given a self-administered questionnaire
- socially acceptable - people feel that a mall is a more appropriate place to do research than their home
- potential for interviewer bias
- fast
- easy to manipulate by completing multiple times to skew results
Sampling
Sample selection is critical to the validity of the information that represents the populations that are being studied. The sampling approach helps to determine the focus of the study and allows better acceptance of the generalizations that are being made. Biased sampling can be used with care if it is justified and as long as it is noted that the resulting sample may not be a true representation of the population of the study. There are two different approaches to sampling in survey research:
- The nonprobability sampling approach. In this approach the researcher does not know each element's probability of selection in the sample. The most commonly used nonprobability sampling method is convenience sampling, which samples only those who are available and willing to participate in the survey. This approach is convenient for the researcher, but data validity may be lost because the sample is not representative.
- The probability sampling approach gives each element a known chance of being included in the sample. This method is closer to a true representation of the population. It can be difficult to use due to the cost of a rigorous sampling method and the difficulty of obtaining full coverage of the target population, but the generalizations that come from it are more likely to be closer to a true representation of the population. Different forms of probability sampling are designed to achieve various benefits - e.g. theoretical simplicity, operational simplicity, detailed information on subpopulations, or minimal cost. Some common forms:
  o Equal probability of selection designs (EPS), in which each element of the population has an equal chance of being included in the sample. This uniformity makes EPS surveys relatively simple to interpret. Forms of EPS include simple random sampling (SRS) and systematic sampling.
  o Probability-proportional-to-size designs (PPS), in which 'larger' elements (according to some known measure of size) have a higher chance of selection. This approach is common in business surveys where the object is to determine sector totals (e.g. "total employment in manufacturing sectors"); compared to EPS, concentrating on larger elements may produce better accuracy for the same cost/sample size.
  o Stratified random sampling, in which the population is divided into subpopulations (called strata) and random samples are then drawn separately from each of these strata, using any probability sampling method (sometimes including further sub-stratification). This may be done to provide better control over the sample size (and hence, accuracy) within each subpopulation; when the variable(s) of interest are correlated with subpopulation, it can also improve overall accuracy. Another use for stratification is when different subpopulations require different sampling methods - for instance, a business survey might use EPS for businesses whose 'size' is not known and PPS elsewhere.
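The three probability designs described above can be illustrated with a minimal Python sketch. The population of 20 firms, their employee counts, and the helper names are hypothetical and chosen only for illustration. The PPS helper draws with replacement, which is a simplification of practical PPS schemes.

```python
import random

def simple_random_sample(population, n, rng):
    """EPS design: every element has the same chance of selection."""
    return rng.sample(population, n)

def pps_sample(population, sizes, n, rng):
    """PPS (with replacement): selection chance proportional to a size measure."""
    return rng.choices(population, weights=sizes, k=n)

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample separately within each stratum."""
    return {name: rng.sample(units, n_per_stratum[name])
            for name, units in strata.items()}

rng = random.Random(0)                      # fixed seed so runs are repeatable
firms = [f"firm{i}" for i in range(20)]     # hypothetical sampling frame
employees = [5] * 15 + [500] * 5            # hypothetical size measure

srs = simple_random_sample(firms, 4, rng)
pps = pps_sample(firms, employees, 4, rng)
strata = {"small": firms[:15], "large": firms[15:]}
strat = stratified_sample(strata, {"small": 2, "large": 2}, rng)
```

In this sketch the stratified design guarantees two large firms in the sample, whereas the simple random sample may contain none.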
Q3. Tabulate the following data:
Age: 20-40; 40-60; 60 and above
Departments: English, Hindi, Political Science, History, Sociology
Degree level: Graduates; Postgraduates; PhD. Total students in each age group and at each degree level.
A3.:
Age          Degree level     Departments                                    Total
                              Eng   Hin   Pol. Sci   History   Sociology
20-40        Graduates
             Postgraduates
             PhD
40-60        Graduates
             Postgraduates
             PhD
60 & above   Graduates
             Postgraduates
             PhD
Total
Q4. The data given below is the distribution of employees of a business according to their efficiency. Find the mean deviation and coefficient of mean deviation from the mean and the median:
Efficiency Index   22-26   26-30   30-34   34-38   38-42
Employees          25      35      15      5       2
A4.: Calculation of Mean Deviation from Mean:
EI       f     x     fx     |D| = |x - 28.29|   f|D|
22-26    25    24    600    4.29                107.25
26-30    35    28    980    0.29                10.15
30-34    15    32    480    3.71                55.65
34-38    5     36    180    7.71                38.55
38-42    2     40    80     11.71               23.42
Total    82          2320                       235.02
$N = 82$, $\sum fx = 2320$, $\sum f|D| = 235.02$
$\bar{X} = \sum fx / N = 2320/82 = 28.29$
$D = x - \bar{X}$
$MD = \sum f|D| / N = 235.02/82 = 2.866$
Coefficient of MD from mean $= MD/\bar{X} = 2.866/28.29 = 0.1013$
Calculation of MD from Median:
EI       f     Cf    x     |D| = |x - Me|   f|D|
22-26    25    25    24    3.83             95.75
26-30    35    60    28    0.17             5.95
30-34    15    75    32    4.17             62.55
34-38    5     80    36    8.17             40.85
38-42    2     82    40    12.17            24.34
Total    82                                 229.44
$N = 82$, $\sum f|D| = 229.44$
Median class = class containing the $(N/2)$th item = $82/2$ = 41st item
Median class = 26-30
$Me = l + \frac{N/2 - Cf}{f} \times i = 26 + \frac{41 - 25}{35} \times 4 = 27.83$
$MD = \sum f|D| / N = 229.44/82 = 2.798$
Coefficient of MD from median $= MD/Me = 2.798/27.83 = 0.1005$
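The two mean-deviation calculations can be cross-checked with a short Python sketch. The function names are illustrative, and class midpoints are assumed to represent each class. Because the code does not round intermediate values, its results may differ from the hand calculation in the third decimal place.

```python
classes = [(22, 26), (26, 30), (30, 34), (34, 38), (38, 42)]
freqs = [25, 35, 15, 5, 2]

def midpoints(classes):
    return [(lo + hi) / 2 for lo, hi in classes]

def grouped_mean(classes, freqs):
    """Mean of a grouped distribution using class midpoints."""
    return sum(f * x for f, x in zip(freqs, midpoints(classes))) / sum(freqs)

def grouped_median(classes, freqs):
    """Median by linear interpolation inside the median class."""
    half = sum(freqs) / 2
    cum = 0  # cumulative frequency before the current class
    for (lo, hi), f in zip(classes, freqs):
        if cum + f >= half:
            return lo + (half - cum) / f * (hi - lo)
        cum += f
    raise ValueError("empty distribution")

def mean_deviation(classes, freqs, centre):
    """Frequency-weighted mean absolute deviation about `centre`."""
    total = sum(f * abs(x - centre) for f, x in zip(freqs, midpoints(classes)))
    return total / sum(freqs)

mean = grouped_mean(classes, freqs)
median = grouped_median(classes, freqs)
md_mean = mean_deviation(classes, freqs, mean)
md_median = mean_deviation(classes, freqs, median)
coef_mean = md_mean / mean
coef_median = md_median / median
```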
Q5. What is conditional probability? Explain with an example.
A5.: Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written P(A | B), and is read "the (conditional) probability of A, given B" or "the probability of A under the condition B". When in a random experiment the event B is known to have occurred, the possible outcomes of the experiment are reduced to B, and hence the probability of the occurrence of A is changed from the unconditional probability into the conditional probability given B.
Joint probability is the probability of two events in conjunction. That is, it is the probability of both events together. The joint probability of A and B is written $P(A \cap B)$ or $P(A, B)$.
Marginal probability is then the unconditional probability P(A) of the event A; that is, the probability of A, regardless of whether event B did or did not occur. If B can be thought of as the event of a random variable X having a given outcome, the marginal probability of A can be obtained by summing (or integrating, more generally) the joint probabilities over all outcomes for X. For example, if there are two possible outcomes for X with corresponding events B and B', this means that $P(A) = P(A \cap B) + P(A \cap B')$. This is called marginalization.
In these definitions, note that there need not be a causal or temporal relation between A and B. A may precede B or vice versa or they may happen at the same time. A may cause B or vice versa or they may have no causal relation at all. Notice, however, that causal and temporal relations are informal notions, not belonging to the probabilistic framework. They may apply in some examples, depending on the interpretation given to events.
Conditioning of probabilities, i.e. updating them to take account of (possibly new) information, may be achieved through Bayes' theorem. In such conditioning, the probability of A given only initial information I, P(A | I), is known as the prior probability. The updated conditional probability of A, given I and the outcome of the event B, is known as the posterior probability, P(A | B, I).
Introduction
Consider the simple scenario of rolling two fair six-sided dice, labelled die 1 and die 2. Define the following three events (not assumed to occur simultaneously):
A: Die 1 lands on 3.
B: Die 2 lands on 1.
C: The dice sum to 8.
The prior probability of each event describes how likely the outcome is before the dice are rolled, without any knowledge of the roll's outcome. For example, die 1 is equally likely to fall on each of its 6 sides, so P(A) = 1/6. Similarly P(B) = 1/6. Likewise, of the 6 × 6 = 36 possible ways that a pair of dice can land, just 5 result in a sum of 8 (namely 2 and 6, 3 and 5, 4 and 4, 5 and 3, and 6 and 2), so P(C) = 5/36.
Some of these events can both occur at the same time; for example events A and C can happen at the same time, in the case where die 1 lands on 3 and die 2 lands on 5. This is the only one of the 36 outcomes where both A and C occur, so its probability is 1/36. The probability of both A and C occurring is called the joint probability of A and C and is written $P(A \cap C)$, so $P(A \cap C) = 1/36$. On the other hand, if die 2 lands on 1, the dice cannot sum to 8, so $P(B \cap C) = 0$.
Now suppose we roll the dice and cover up die 2, so we can only see die 1, and observe that die 1 landed on 3. Given this partial information, the probability that the dice sum to 8 is no longer 5/36; instead it is 1/6, since die 2 must land on 5 to achieve this result. This is called the conditional probability, because it is the probability of C under the condition that A is observed, and is written P(C | A), which is read "the probability of C given A." Similarly, P(C | B) = 0, since if we observe die 2 landed on 1, we already know the dice can't sum to 8, regardless of what the other die landed on.
On the other hand, if we roll the dice and cover up die 2, and observe die 1, this has no impact on the probability of event B, which only depends on die 2. We say events A and B are statistically independent or just independent and in this case
$P(B \mid A) = P(B)$
In other words, the probability of B occurring after observing that die 1 landed on 3 is the same as before we observed die 1.
Intersection events and conditional events are related by the formula:
$P(C \mid A) = \frac{P(A \cap C)}{P(A)}$
In this example, we have:
$\frac{1}{6} = \frac{1/36}{1/6}$
As noted above, $P(B \mid A) = P(B)$, so by this formula:
$P(B) = P(B \mid A) = \frac{P(A \cap B)}{P(A)}$
On multiplying across by P(A),
$P(A) P(B) = P(A \cap B)$
In other words, if two events are independent, their joint probability is the product of the prior probabilities of each event occurring by itself.
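The dice probabilities in this example can be checked by enumerating all 36 equally likely outcomes. The following is a minimal Python sketch; the helper names `prob` and `cond_prob` are illustrative and not part of any library.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes as (die 1, die 2).
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event given as a predicate on an outcome."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

def A(o): return o[0] == 3        # die 1 lands on 3
def B(o): return o[1] == 1        # die 2 lands on 1
def C(o): return sum(o) == 8      # the dice sum to 8

p_c = prob(C)
p_c_given_a = cond_prob(C, A)
p_c_given_b = cond_prob(C, B)
# Independence of A and B: the joint probability equals the product.
a_b_independent = prob(lambda o: A(o) and B(o)) == prob(A) * prob(B)
```

Exact fractions are used instead of floating-point numbers so that the comparisons with 5/36 and 1/6 are exact.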
Definition
Given a probability space $(\Omega, F, P)$ and two events $A, B \in F$ with P(B) > 0, the conditional probability of A given B is defined by
$P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
If P(B) = 0 then P(A | B) is undefined (see the Borel–Kolmogorov paradox for an explanation). However, it is possible to define a conditional probability with respect to a $\sigma$-algebra of such events (such as those arising from a continuous random variable).
For example, if X and Y are non-degenerate and jointly continuous random variables with density $f_{X,Y}(x, y)$ then, if B has positive measure,
$P(X \in A \mid Y \in B) = \frac{\int_{y \in B} \int_{x \in A} f_{X,Y}(x, y) \, dx \, dy}{\int_{y \in B} \int_{x \in \Omega} f_{X,Y}(x, y) \, dx \, dy}$
The case where B has zero measure can only be dealt with directly in the case that $B = \{y_0\}$, representing a single point, in which case
$P(X \in A \mid Y = y_0) = \frac{\int_{x \in A} f_{X,Y}(x, y_0) \, dx}{\int_{x \in \Omega} f_{X,Y}(x, y_0) \, dx}$
If A has measure zero then the conditional probability is zero. An indication of why the more general case of zero measure cannot be dealt with in a similar way can be seen by noting that the limit, as all $\delta y_i$ approach zero, of
$P(X \in A \mid Y \in \bigcup_i [y_i, y_i + \delta y_i]) \approx \frac{\sum_i \int_{x \in A} f_{X,Y}(x, y_i) \, dx \, \delta y_i}{\sum_i \int_{x \in \Omega} f_{X,Y}(x, y_i) \, dx \, \delta y_i}$
depends on their relationship as they approach zero. See conditional expectation for more information.
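Derivation
The following derivation is taken from Grinstead and Snell's Introduction to Probability.
Let $\Omega$ be a sample space with the probability P. Suppose the event $E \subset \Omega$ has occurred and an altered probability $P(\{\omega\} \mid E)$ is to be assigned to the elementary events $\{\omega\}$ to reflect the fact that E has occurred. (In the following we will omit the curly brackets.)
For all $\omega \notin E$ we want to make sure that the intuitive result $P(\omega \mid E) = 0$ is true.
Also, without further information provided, we can be certain that the relative magnitude of probabilities is conserved:
$\omega_1, \omega_2 \in E \Rightarrow \frac{P(\omega_1 \mid E)}{P(\omega_2 \mid E)} = \frac{P(\omega_1)}{P(\omega_2)}$.
This requirement leads us to state:
$\omega \in E \Rightarrow P(\omega \mid E) = \alpha P(\omega)$
where $\alpha$ is a positive real constant or scaling factor to reflect the above requirement.
Since we know E has occurred, we can state P(E) > 0 and:
$1 = \sum_{\omega \in E} P(\omega \mid E) = \alpha \sum_{\omega \in E} P(\omega) = \alpha P(E)$
Hence
$\alpha = \frac{1}{P(E)}$ and $P(\omega \mid E) = \frac{P(\omega)}{P(E)}$ for $\omega \in E$.
For another event F this leads to:
$P(F \mid E) = \sum_{\omega \in F \cap E} P(\omega \mid E) = \sum_{\omega \in F \cap E} \frac{P(\omega)}{P(E)} = \frac{P(F \cap E)}{P(E)}$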
Statistical Independence
Two random events A and B are statistically independent if and only if
$P(A \cap B) = P(A) P(B)$
Thus, if A and B are independent, then their joint probability can be expressed as a simple product of their individual probabilities.
Equivalently, for two independent events A and B with non-zero probabilities,
$P(A \mid B) = P(A)$
and
$P(B \mid A) = P(B)$
In other words, if A and B are independent, then the conditional probability of A, given B, is simply the individual probability of A alone; likewise, the probability of B given A is simply the probability of B alone.
Mutual Exclusivity
Two events A and B are mutually exclusive if and only if $A \cap B = \emptyset$. Then $P(A \cap B) = 0$.
Therefore, if P(B) > 0 then $P(A \mid B)$ is defined and equal to 0.
The Conditional Probability Fallacy
The conditional probability fallacy is the assumption that P(A | B) is approximately equal to P(B | A). The mathematician John Allen Paulos discusses this in his book Innumeracy, where he points out that it is a mistake often made even by doctors, lawyers, and other highly educated non-statisticians. It can be overcome by describing the data in actual numbers rather than probabilities.
The relation between P(A | B) and P(B | A) is given by Bayes' theorem:
$P(B \mid A) = \frac{P(A \mid B) P(B)}{P(A)}$
In other words, one can only assume that P(A | B) is approximately equal to P(B | A) if the prior probabilities P(A) and P(B) are also approximately equal.
An Example
In the following constructed but realistic situation, the difference between P(A | B) and P(B | A) may be surprising, but is at the same time obvious.
In order to identify individuals having a serious disease in an early curable form, one may consider screening a large group of people. While the benefits are obvious, an argument against such screenings is the disturbance caused by false positive screening results: If a person not having the disease is incorrectly found to have it by the initial test, they will most likely be quite distressed until a more careful test shows that they do not have the disease. Even after being told they are well, their lives may be affected negatively.
The magnitude of this problem is best understood in terms of conditional probabilities.
Suppose 1% of the group suffer from the disease, and the rest are well. Choosing an individual at random,
P(ill) = 1% = 0.01 and P(well) = 99% = 0.99.
Suppose that when the screening test is applied to a person not having the disease, there is a 1% chance of getting a false positive result and 99% chance of getting a true negative result, i.e.
P(positive | well) = 1%, and P(negative | well) = 99%.
Finally, suppose that when the test is applied to a person having the disease, there is a 1% chance of a false negative result and 99% chance of getting a true positive result, i.e.
P(negative | ill) = 1% and P(positive | ill) = 99%.
Now, one may calculate the following:
The fraction of individuals in the whole group who are well and test negative (true negative):
$P(\text{well} \cap \text{negative}) = P(\text{well}) \times P(\text{negative} \mid \text{well}) = 99\% \times 99\% = 98.01\%$
The fraction of individuals in the whole group who are ill and test positive (true positive):
$P(\text{ill} \cap \text{positive}) = P(\text{ill}) \times P(\text{positive} \mid \text{ill}) = 1\% \times 99\% = 0.99\%$
The fraction of individuals in the whole group who have false positive results:
$P(\text{well} \cap \text{positive}) = P(\text{well}) \times P(\text{positive} \mid \text{well}) = 99\% \times 1\% = 0.99\%$
The fraction of individuals in the whole group who have false negative results:
$P(\text{ill} \cap \text{negative}) = P(\text{ill}) \times P(\text{negative} \mid \text{ill}) = 1\% \times 1\% = 0.01\%$
Furthermore, the fraction of individuals in the whole group who test positive:
$P(\text{positive}) = P(\text{well} \cap \text{positive}) + P(\text{ill} \cap \text{positive}) = 0.99\% + 0.99\% = 1.98\%$
Finally, the probability that an individual actually has the disease, given that the test result is positive:
$P(\text{ill} \mid \text{positive}) = \frac{P(\text{ill} \cap \text{positive})}{P(\text{positive})} = \frac{0.99\%}{1.98\%} = 50\%$
In this example, it should be easy to relate to the difference between the conditional probabilities P(positive | ill) (which is 99%) and P(ill | positive) (which is 50%): the first is the probability that an individual who has the disease tests positive; the second is the probability that an individual who tests positive actually has the disease. With the numbers chosen here, the last result is likely to be deemed unacceptable: half the people testing positive are actually false positives.
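The screening arithmetic can be reproduced with a few lines of Python. This is a sketch only: the function name and parameter names (`prevalence`, `sensitivity`, `false_positive_rate`) are labels introduced here for P(ill), P(positive | ill) and P(positive | well).

```python
def posterior_ill(prevalence, sensitivity, false_positive_rate):
    """P(ill | positive) by Bayes' theorem."""
    true_pos = prevalence * sensitivity                  # P(ill and positive)
    false_pos = (1 - prevalence) * false_positive_rate   # P(well and positive)
    return true_pos / (true_pos + false_pos)             # divide by P(positive)

# Numbers from the example: 1% prevalence, 99% sensitivity, 1% false positives.
p_ill_given_positive = posterior_ill(0.01, 0.99, 0.01)

# With the same test but 10% prevalence, a positive result is far more informative.
p_higher_prevalence = posterior_ill(0.10, 0.99, 0.01)
```

Varying `prevalence` shows why P(ill | positive) and P(positive | ill) can differ so much: the posterior depends on the prior P(ill), not only on the accuracy of the test.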
Second Type of Conditional Probability Fallacy
Another type of fallacy is interpreting conditional probabilities of events (or a series of events) as (unconditional) probabilities, or seeing them as being in the same order of magnitude. A conditional probability of an event and its (total) probability are linked with each other through the formula of total probability, but without additional information one of them says little about the other. The fallacy of viewing P(A | B) as P(A), or as being close to P(A), is often related to some forms of statistical bias, but it can be subtle.
Here is an example: One of the conditions for the legendary wild-west hero Wyatt Earp to have become a legend was having survived all the duels he survived. Indeed, it is reported that he was never wounded, not even scratched by a bullet. The probability of this happening is very small, contributing to his fame because events of very small probabilities attract attention. However, the point is that the degree of attention depends very much on the observer. Somebody impressed by a specific event (here seeing a "hero") is prone to view effects of randomness differently from others who are less impressed.
In general it does not make much sense to ask, after observation of a remarkable series of events, "What is the probability of this?"; this is a conditional probability based upon observation. The distinction between conditional and unconditional probabilities can be intricate if the observer who asks "What is the probability?" is himself/herself an outcome of a random selection. The name "Wyatt Earp effect" was coined in an article "Der Wyatt Earp Effekt" (in German) showing through several examples its subtlety and impact in various scientific domains.
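Q6. The probability that a football player will play at Eden Garden is 0.6 and at Ambedkar Stadium is 0.4. The probability that he will get a knee injury when playing at Eden is 0.07 and that at Ambedkar Stadium is 0.04. What is the probability that he would get a knee injury if he played at Eden?
A6.:
Let A = he plays at Eden Garden, B = he plays at Ambedkar Stadium, C = he gets a knee injury.
P(A) = 0.6, P(B) = 0.4, P(C | A) = 0.07, P(C | B) = 0.04
$P(A \cap C) = P(A) \times P(C \mid A) = 0.6 \times 0.07 = 0.042$
Here 0.042 is the joint probability that he plays at Eden and gets a knee injury; the conditional probability of injury given that he plays at Eden is the stated P(C | A) = 0.07.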
MBA SEMESTER 1
MB0040 STATISTICS FOR MANAGEMENT
Assignment Set - 2
Q1. A random sample of 6 sachets of mustard oil was examined and two were found to be leaking. A wholesaler receives seven hundred twenty six packs, each containing 6 sachets. Find the expected number of packets to contain exactly one sachet leaking.
A1.:
n = 6 (each pack contains 6 sachets)
N = 726 (number of packs)
p = 2/6 = 1/3 (probability that a sachet leaks, estimated from the sample), q = 1 - p = 2/3
The number of leaking sachets in a pack follows a binomial distribution with parameters n and p, so
$P(x = 1) = \binom{6}{1} p q^{5} = 6 \times \frac{1}{3} \times \left(\frac{2}{3}\right)^{5} = \frac{64}{243} = 0.2634$
Expected no. of packs to contain exactly 1 sachet leaking:
E(A) = N × P(x = 1) = 726 × 0.2634 = 191.2, i.e. about 191 packs
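One way to model this question is with a binomial distribution, assuming that sachets leak independently and that the leak probability is p = 2/6 as estimated from the examined sample. Under those assumptions, a minimal Python sketch of the expected number of packs is:

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_leak = 2 / 6          # assumed leak probability, estimated from the sample
sachets_per_pack = 6
packs = 726

p_exactly_one = binomial_pmf(1, sachets_per_pack, p_leak)
expected_packs = packs * p_exactly_one
```

The independence assumption is the usual textbook simplification; if leaks cluster within packs, the binomial model would not apply.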
Q2. What do you mean by errors in statistics? Mention the measures to do so.
A2.: In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its theoretical value. The error of a sample is the deviation of the sample from the (unobservable) true function value, while the residual of a sample is the difference between the sample and the estimated function value.
The distinction is most important in regression analysis, where it leads to the concept of studentized residuals.
Suppose there is a series of observations from a univariate distribution and we want to estimate the mean of that distribution (the so-called location model). In this case the errors are the deviations of the observations from the population mean, while the residuals are the deviations of the observations from the sample mean.
A statistical error is the amount by which an observation differs from its expected value, the latter being based on the whole population from which the statistical unit was chosen randomly. For example, if the mean height in a population of 21-year-old men is 1.75 meters, and one randomly chosen man is 1.80 meters tall, then the error is 0.05 meters; if the randomly chosen man is 1.70 meters tall, then the error is -0.05 meters. The expected value, being the mean of the entire population, is typically unobservable, and hence the statistical error cannot be observed either.
The nomenclature arose from random measurement errors in astronomy. It is as if the measurement of the man's height were an attempt to measure the population mean, so that any difference between the man's height and the mean would be a measurement error.
A residual (or fitting error), on the other hand, is an observable estimate of the unobservable statistical error. Consider the previous example with men's heights and suppose we have a random sample of n people. The sample mean could serve as a good estimator of the population mean. Then we have:
- The difference between the height of each man in the sample and the unobservable population mean is a statistical error, whereas
- The difference between the height of each man in the sample and the observable sample mean is a residual.
Note that the sum of the residuals within a random sample is necessarily zero, and thus the residuals are necessarily not independent. The statistical errors, on the other hand, are independent, and their sum within the random sample is almost surely not zero.
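The difference between errors and residuals can be illustrated with simulated heights. In this sketch the population mean of 1.75 meters comes from the example above, while the standard deviation of 0.07 meters, the sample size of 10 and the random seed are assumptions chosen only for illustration. In practice the population mean is unknown, so the errors could not be computed.

```python
import random
import statistics

rng = random.Random(1)                 # fixed seed for repeatability
population_mean = 1.75                 # known here only because we simulate
heights = [rng.gauss(population_mean, 0.07) for _ in range(10)]

sample_mean = statistics.mean(heights)

# Statistical errors: deviations from the (normally unobservable) population mean.
errors = [h - population_mean for h in heights]
# Residuals: deviations from the observable sample mean.
residuals = [h - sample_mean for h in heights]

sum_of_residuals = sum(residuals)      # zero up to floating-point rounding
sum_of_errors = sum(errors)            # equals n * (sample_mean - population_mean)
```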
One can standardize statistical errors (especially of a normal distribution) in a z-score (or standard score), and standardize residuals in a t-statistic, or more generally studentized residuals.
Standard error of the mean
The standard error of the mean (SEM) is the standard deviation of the sample mean estimate of a population mean. (It can also be viewed as the standard deviation of the error in the sample mean relative to the true mean, since the sample mean is an unbiased estimator.) SEM is usually estimated by the sample estimate of the population standard deviation (sample standard deviation) divided by the square root of the sample size (assuming statistical independence of the values in the sample):
$SE_{\bar{x}} = \frac{s}{\sqrt{n}}$
where
s is the sample standard deviation (i.e., the sample-based estimate of the standard deviation of the population), and
n is the size (number of observations) of the sample.
This estimate may be compared with the formula for the true standard deviation of the mean:
$SD_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$
where
$\sigma$ is the standard deviation of the population.
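The estimate s divided by the square root of n can be computed directly with Python's standard library. The sample values below are made up for illustration. `statistics.stdev` uses the n - 1 divisor, which matches the sample standard deviation s described above.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimate the SEM as s / sqrt(n), with s the sample standard deviation."""
    s = statistics.stdev(sample)       # n - 1 in the denominator
    return s / math.sqrt(len(sample))

sample = [2, 4, 4, 4, 5, 5, 7, 9]      # hypothetical observations
sem = standard_error_of_mean(sample)
```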
Note 1: Standard error may also be defined as the standard deviation of the residual error term.
Note 2: Both the standard error and the standard deviation of small samples tend to systematically underestimate the population standard error and deviations: the standard error of the mean is a biased estimator of the population standard error. With n = 2 the underestimate is about 25%, but for n = 6 the underestimate is only 5%. Gurland and Tripathi (1971) provide a correction and equation for this effect. Sokal and Rohlf (1981) give an equation of the correction factor for small samples of n
[...] (-1.96, 1.96) critical values, H0 is rejected.
Conclusion: The mean is not 300 ml, that is, the machine is not functioning properly.
Q5. Out of 2000 people surveyed, 1200 belong to urban areas and the rest to semi-urban areas. Among 1000 who visited other regions, 800 belonged to urban areas. Test at 5% level of significance whether area and visiting other states are dependent.
A5.:
N = 2000
n = 1000 (people who visited other regions)
P = 800/1000 = 0.8
$\alpha$ = 0.05
P0 = 1200/2000 = 0.6
Q0 = 1 - P0 = 0.4
H0: P = 0.6 (area and visiting other states are independent)
H1: P ≠ 0.6 (area and visiting other states are dependent)
$z = \frac{P - P_0}{\sqrt{P_0 Q_0 / n}} \sim N(0, 1)$
$= \frac{0.8 - 0.6}{\sqrt{0.6 \times 0.4 / 1000}}$
= 12.91
|z| = 12.91; the value is not in the interval (-1.96, 1.96)
H0 is rejected
Conclusion: Area and visiting other states are dependent.
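The test statistic for this one-sample proportion z-test can be checked with a short Python sketch. The function name is illustrative, and 1.96 is the two-sided critical value of the standard normal distribution at the 5% level.

```python
import math

def one_sample_proportion_z(successes, n, p0):
    """z statistic for H0: population proportion equals p0."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error

p0 = 1200 / 2000                         # urban share among all surveyed
z = one_sample_proportion_z(800, 1000, p0)
reject_h0 = abs(z) > 1.96                # two-sided test at the 5% level
```

A chi-square test of independence on the 2 × 2 table of area by visiting status is a common alternative formulation of the same question.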
Q6. How is sta tistics useful for mo de rn m an ag ers? Give exam p les and explain.
A6.: Modern managers often join agencies because they seek to serve and help their communities
and country. Not surprisingly, some managers are puzzled by the suggestion of engaging in research
and statistics: research appears boring in comparison with developing and implementing new
programs, and statistics seems, well, impossibly challenging with little payoff in sight.
In fact, analytical techniques involving research and statistics are increasingly in demand. Many
decisions that modern managers make involve data and analysis, one way or another. Consider the
following common uses of analysis and data:
First, data and objective analysis often are used to describe and analyze problems, such as the
magnitude of environmental disasters (for example, oil spills), the extent of social and public health
problems (such as homelessness or the AIDS epidemic), the extent of lawlessness, the level of
economic prosperity or stagnation, or the impact of weather-related problems such as those brought on
by hurricanes and snowstorms. For example, it matters whether the illiteracy rate among 12-year-olds
is 3 percent or 30 percent, or somewhere in between. By describing the extent of these
problems and their underlying causes accurately, managers are able to better formulate effective
strategies for dealing with them. Policy analysis often begins by describing the extent and
characteristics of problems, and the factors associated with them.
Second, data are used to describe policies and programs. What are programs and policies
expected to achieve? How many services are programs expected to provide? What are some
milestones of achievement? How much will a program cost? These questions involve quantifiable
answers, such as the number of national guardsmen that are brought in to assist with search and
rescue efforts after a major hurricane, or the number of evacuees for whom officials expect to
provide refuge. Policies and programs can be described in quite detailed ways, involving distinct
program activities, the duration and geographic scope of activities, and staffing levels and area
program budget data.
Third, programs produce much routine, administrative data that are used to monitor progress and
prevent fraud. For example, hospitals produce a large amount of data about patient visits, who
attended them, their diagnosis, billing codes, and so on. Schools produce vast amounts of data
about student achievement, student conduct, extracurricular activities, support and administrative
services, and so on. Regulatory programs produce data about inspections and compliance. In
many states, gaming devices (such as slot machines) are monitored electronically to ensure that
taxes are collected and that they are not tampered with. Managers are expected to be familiar
with the administrative data in their lines of business.
Fourth, analysis is used to guide and improve program operations.
Data can be brought to bear on problems to help managers choose among competing strategies.
For example, what-if analysis might be used to determine the cost-effectiveness of alternative
courses of action. Such analysis often is tailored to unique situations and problems. In addition, client
and citizen surveys might be used to inform program priorities by assessing population needs and
service satisfaction. Systematic surveys provide valid and objective assessments of citizen and client
needs, priorities, and perceptions of programs and services. Systematic surveys of citizens and clients
are used increasingly and are considered a valuable tool of modern management.
Fifth, data are used to evaluate outcomes. Legislatures and citizens want to know what return they
are getting from their tax dollars. Did programs and policies achieve their aims? Did they produce
any unexpected results?
Most grant applications require modern managers to be accountable for program outcomes.
Modern managers must demonstrate that their programs are producing effective outcomes and
that they are doing so in cost-effective ways. This demand for outcome evaluation and monitoring
far exceeds any requirement of proper funds management. Analysis can also be used to determine
the impact of different conditions on program effectiveness, leading to suggestions for improving
programs.
Data and analysis are omnipresent in programs and policies. They are there at every stage, from the
inception of programs and policies, to their very end. Of course, decisions are also based on
personal observation, political consensus, anecdotal and impressionistic descriptions, and the
ideologies of leaders. Yet data and analysis often are present, too, one way or another. This is
because analysis is useful. Specifically, quantitative analysis aids in providing an objective, factual
underpinning of situations and responses. Analysis, along with data, helps quantify the extent of
problems and solutions in ways that other information seldom can. Analysis can help quantify the
actual or likely impact of proposed strategies, for example, helping to determine their adequacy. At
the very least, a focus on facts and objective analysis might reduce judgment errors stemming from
overly impressionistic or subjective perceptions that are factually incorrect. So managers are
expected to bring data and analysis to the decision-making table.