MB0040 - AJ


  • 8/6/2019 MB0040 - AJ


    MBA-1 | Subject Code: Statistics For Management Page 1 of 15

    MBA SEMESTER 1

    MB0040 STATISTICS FOR MANAGEMENT

    Assignment Set - 1

    Q1. Elucidate the functions of statistics.

    A1.:

    (1) Statistics helps in providing a better understanding and exact description of a phenomenon of nature.

    (2) Statistics helps in proper and efficient planning of a statistical inquiry in any field of study.

    (3) Statistics helps in collecting appropriate quantitative data.

    (4) Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for an easy and clear comprehension of the data.

    (5) Statistics helps in understanding the nature and pattern of variability of a phenomenon through quantitative observations.

    (6) Statistics helps in drawing valid inferences, along with a measure of their reliability, about the population parameters from the sample data.

    Q2. What are the methods of statistical survey? Explain briefly.

    A2.: Statistical surveys are used to collect quantitative information about items in a population. Surveys of human populations and institutions are common in political polling and government, health, social science and marketing research. A survey may focus on opinions or factual information depending on its purpose, and many surveys involve administering questions to individuals. When the questions are administered by a researcher, the survey is called a structured interview or a researcher-administered survey. When the questions are administered by the respondent, the survey is referred to as a questionnaire or a self-administered survey.

    Structure and standardization

    The questions are usually structured and standardized. The structure is intended to reduce bias. For example, questions should be ordered in such a way that a question does not influence the response to subsequent questions. Surveys are standardized to ensure reliability, generalizability, and validity. Every respondent should be presented with the same questions and in the same order as other respondents.

    In organizational development (OD), carefully constructed survey instruments are often used as the basis for data gathering, organizational diagnosis, and subsequent action planning. Some OD practitioners (e.g. Fred Nickols) even consider survey guided development as the sine qua non of OD.

    Serial surveys

    Serial surveys are those which repeat the same questions at different points in time, producing repeated measures data. There are three basic designs for a study with more than one measurement occasion: cross-sectional design, longitudinal design, and time-series design.

    Cross-sectional surveys use different units (respondents) at each of the measurement occasions, by drawing a new sample each time. The time intervals may be different between measurement occasions, but they are the same for all units (respondents). A study in which a survey is administered once is also considered to be cross-sectional.

    Longitudinal surveys use the same units (respondents) at each of the measurement occasions, by recontacting the same sample from the initial survey for the following measurement occasion(s), and asking the same questions at every occasion. The time intervals may be different between measurement occasions, but they are the same for all units (respondents).

    Time-series surveys also use the same units (respondents) at each of the measurement occasions, but the difference with longitudinal study designs is that in time-series designs both the number of measurement occasions and the time intervals between occasions may be different between units (respondents).

    Modes of Data Collection

    There are several ways of administering a survey, including:

    Telephone

    • use of interviewers encourages sample persons to respond, leading to higher response rates
    • interviewers can increase comprehension of questions by answering respondents' questions
    • fairly cost efficient, depending on local call charge structure
    • good for large national (or international) sampling frames
    • some potential for interviewer bias (e.g. some people may be more willing to discuss a sensitive issue with a female interviewer than with a male one)
    • cannot be used for non-audio information (graphics, demonstrations, taste/smell samples)
    • unreliable for consumer surveys in rural areas where telephone penetration is low
    • three types:
      o traditional telephone interviews
      o computer assisted telephone dialing
      o computer assisted telephone interviewing (CATI)

    Mail

    • the questionnaire may be handed to the respondents or mailed to them, but in all cases they are returned to the researcher via mail
    • cost is very low, since bulk postage is cheap in most countries
    • long time delays, often several months, before the surveys are returned and statistical analysis can begin
    • not suitable for issues that may require clarification
    • respondents can answer at their own convenience (allowing them to break up long surveys; also useful if they need to check records to answer a question)
    • no interviewer bias introduced
    • large amount of information can be obtained: some mail surveys are as long as 50 pages
    • response rates can be improved by using mail panels
      o members of the panel have agreed to participate
      o panels can be used in longitudinal designs where the same respondents are surveyed several times

    Online surveys

    • can use web or e-mail
    • web is preferred over e-mail because interactive HTML forms can be used
    • often inexpensive to administer
    • very fast results
    • easy to modify
    • response rates can be improved by using online panels - members of the panel have agreed to participate
    • if not password-protected, easy to manipulate by completing multiple times to skew results
    • data creation, manipulation and reporting can be automated and/or easily exported into a format which can be read by PSPP, DAP or other statistical analysis software
    • data sets created in real time
    • some are incentive based (such as Survey Vault or YouGov)
    • may skew sample towards a younger demographic compared with CATI
    • often difficult to determine/control selection probabilities, hindering quantitative analysis of data
    • use in large scale industries

    Personal in-home survey

    • respondents are interviewed in person, in their homes (or at the front door)
    • very high cost
    • suitable when graphic representations, smells, or demonstrations are involved
    • often suitable for long surveys (but some respondents object to allowing strangers into their home for extended periods)
    • suitable for locations where telephone or mail are not developed
    • skilled interviewers can persuade respondents to cooperate, improving response rates
    • potential for interviewer bias

    Personal mall intercept survey

    • shoppers at malls are intercepted - they are either interviewed on the spot, taken to a room and interviewed, or taken to a room and given a self-administered questionnaire
    • socially acceptable - people feel that a mall is a more appropriate place to do research than their home
    • potential for interviewer bias
    • fast
    • easy to manipulate by completing multiple times to skew results

    Sampling

    Sample selection is critical to the validity of the information that represents the populations that are being studied. The approach of the sampling helps to determine the focus of the study and allows better acceptance of the generalizations that are being made. Biased sampling can be used with care if it is justified and as long as it is noted that the resulting sample may not be a true representation of the population of the study. There are two different approaches to sampling in survey research:

    The nonprobability sampling approach. In this approach the researcher does not know each element's probability of selection in the sample. The most commonly used nonprobability sampling method is the convenience sampling approach. This method samples only those who are available and willing to participate in the survey. The use of this approach allows for convenience for the researcher while possibly losing data validity due to the lack of representation.

    The probability sampling approach gives each element a known chance of being included in the sample. This method is closer to a true representation of the population. It can be difficult to use due to the cost of a rigorous sampling method and the difficulty of obtaining full coverage of the target population, but the generalizations that come from it are more likely to be closer to a true representation of the population. Different forms of probability sampling are designed to achieve various benefits - e.g. theoretical simplicity, operational simplicity, detailed information on subpopulations, or minimal cost. Some common forms:

    o Equal probability of selection designs (EPS), in which each element of the population has an equal chance of being included in the sample. This uniformity makes EPS surveys relatively simple to interpret. Forms of EPS include simple random sampling (SRS) and systematic sampling.

    o Probability-proportional-to-size designs (PPS), in which 'larger' elements (according to some known measure of size) have a higher chance of selection. This approach is common in business surveys where the object is to determine sector totals (e.g. "total employment in manufacturing sectors"); compared to EPS, concentrating on larger elements may produce better accuracy for the same cost/sample size.

    o Stratified random sampling approach, in which the population is divided into subpopulations (called strata) and random samples are then drawn separately from each of these strata, using any probability sampling method (sometimes including further sub-stratification). This may be done to provide better control over the sample size (and hence, accuracy) within each subpopulation; when the variable/s of interest are correlated with subpopulation, it can also improve overall accuracy. Another use for stratification is when different subpopulations require different sampling methods - for instance, a business survey might use EPS for businesses whose 'size' is not known and PPS elsewhere.
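The probability sampling designs described above can be sketched in a few lines of Python. This is only an illustrative sketch: the population of 100 numbered units, the two strata named "small" and "large", and the function names are all assumptions made for the example, not part of the assignment.

```python
import random

def simple_random_sample(population, n, rng):
    """Equal-probability sample of n units without replacement (SRS)."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """Take every k-th unit after a random start, with k = len(population) // n."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw an independent simple random sample of fixed size from each stratum."""
    return {name: rng.sample(units, n_per_stratum) for name, units in strata.items()}

rng = random.Random(0)                      # fixed seed so the run is repeatable
population = list(range(1, 101))            # hypothetical units numbered 1..100
srs = simple_random_sample(population, 10, rng)
sys_sample = systematic_sample(population, 10, rng)
strata = {"small": list(range(1, 61)), "large": list(range(61, 101))}  # hypothetical strata
strat = stratified_sample(strata, 5, rng)
```

In the stratified draw every stratum contributes the same number of units, which is one way to control the sample size within each subpopulation; proportional allocation would be another choice.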

    Q3. Tabulate the following data:

    Age: 20-40; 40-60; 60 and above

    Departments: English, Hindi, Political Science, History, Sociology

    Degree level: Graduates; Post graduates; PhD. Total students in age group and in degree level.

    A3.:

    Age          Degree Level   DEPARTMENTS                                        Total
                                Eng   Hin   Pol. Sci   History   Sociology
    20-40
    40-60
    60 & Above
    Total

    Q4. The data given below is the distribution of employees of a business according to their efficiency. Find the mean deviation and coefficient of mean deviation from Mean and Median:

    Efficiency Index   22-26   26-30   30-34   34-38   38-42
    Employees          25      35      15      5       2

    A4.: Calculation of Mean Deviation from Mean:

    EI      Employees (f)   x    fx     |D| = |x - 28.29|   f|D|
    22-26   25              24   600    4.29                107.25
    26-30   35              28   980    0.29                10.15
    30-34   15              32   480    3.71                55.65
    34-38   5               36   180    7.71                38.55
    38-42   2               40   80     11.71               23.42
    Total   N = 82               Σfx = 2320                 Σf|D| = 235.02

    MD = (Σ f|D|) / N

    Mean X̄ = Σfx / N = 2320 / 82 = 28.29

    D = x - X̄

    MD = 235.02 / 82 = 2.866

    Coefficient of MD from mean = MD / X̄ = 2.866 / 28.29 = 0.1013


    Calculation of MD from Median:

    EI      f    Cf   x    |D| = |x - Me|   f|D|
    22-26   25   25   24   3.83             95.75
    26-30   35   60   28   0.17             5.95
    30-34   15   75   32   4.17             62.55
    34-38   5    80   36   8.17             40.85
    38-42   2    82   40   12.17            24.34
    Total   N = 82                          Σf|D| = 229.44

    Median class = class containing the (N/2)th item = 82/2 = 41st item

    Median class = 26-30

    Me = l + [(N/2 - Cf) / f] * i = 26 + [(41 - 25) / 35] * 4 = 27.83

    where l is the lower limit of the median class, Cf the cumulative frequency of the preceding class, f the frequency of the median class and i the class width.

    MD = Σf|D| / N = 229.44 / 82 = 2.798

    Coefficient of MD = MD / Me = 2.798 / 27.83 = 0.1005
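The two calculations above can be checked with a short Python sketch. The function name and argument layout are assumptions chosen for this illustration. Because the code keeps the mean and median unrounded, its results can differ from the hand-worked figures in the third decimal place.

```python
def grouped_mean_deviation(classes, freqs):
    """Return (mean, MD about mean, median, MD about median) for grouped data."""
    n = sum(freqs)
    mids = [(lo + hi) / 2 for lo, hi in classes]          # class mid-points x
    mean = sum(f * x for f, x in zip(freqs, mids)) / n

    # Median class: the first class whose cumulative frequency reaches N/2.
    cum = 0
    for (lo, hi), f in zip(classes, freqs):
        if cum + f >= n / 2:
            median = lo + (n / 2 - cum) / f * (hi - lo)   # l + [(N/2 - Cf)/f] * i
            break
        cum += f

    def md(center):
        # Mean deviation: frequency-weighted average of |x - center|.
        return sum(f * abs(x - center) for f, x in zip(freqs, mids)) / n

    return mean, md(mean), median, md(median)

classes = [(22, 26), (26, 30), (30, 34), (34, 38), (38, 42)]
freqs = [25, 35, 15, 5, 2]
mean, md_mean, median, md_median = grouped_mean_deviation(classes, freqs)
coef_mean = md_mean / mean        # coefficient of MD from mean
coef_median = md_median / median  # coefficient of MD from median
```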

    Q5. What is conditional probability? Explain with an example.

    A5.: Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written P(A | B), and is read "the (conditional) probability of A, given B" or "the probability of A under the condition B". When in a random experiment the event B is known to have occurred, the possible outcomes of the experiment are reduced to B, and hence the probability of the occurrence of A is changed from the unconditional probability into the conditional probability given B.

    Joint probability is the probability of two events in conjunction. That is, it is the probability of both events together. The joint probability of A and B is written $P(A \cap B)$ or $P(A, B)$.

    Marginal probability is then the unconditional probability P(A) of the event A; that is, the probability of A, regardless of whether event B did or did not occur. If B can be thought of as the event of a random variable X having a given outcome, the marginal probability of A can be obtained by summing (or integrating, more generally) the joint probabilities over all outcomes for X. For example, if there are two possible outcomes for X with corresponding events B and B', this means that $P(A) = P(A \cap B) + P(A \cap B')$. This is called marginalization.

    In these definitions, note that there need not be a causal or temporal relation between A and B. A may precede B or vice versa or they may happen at the same time. A may cause B or vice versa or they may have no causal relation at all. Notice, however, that causal and temporal relations are informal notions, not belonging to the probabilistic framework. They may apply in some examples, depending on the interpretation given to events.

    Conditioning of probabilities, i.e. updating them to take account of (possibly new) information, may be achieved through Bayes' theorem. In such conditioning, the probability of A given only initial information I, P(A | I), is known as the prior probability. The updated conditional probability of A, given I and the outcome of the event B, is known as the posterior probability, P(A | B, I).

    Introduction

    Consider the simple scenario of rolling two fair six-sided dice, labelled die 1 and die 2. Define the following three events (not assumed to occur simultaneously):

    A: Die 1 lands on 3.

    B: Die 2 lands on 1.

    C: The dice sum to 8.

    The prior probability of each event describes how likely the outcome is before the dice are rolled, without any knowledge of the roll's outcome. For example, die 1 is equally likely to fall on each of its 6 sides, so P(A) = 1/6. Similarly P(B) = 1/6. Likewise, of the 6 × 6 = 36 possible ways that a pair of dice can land, just 5 result in a sum of 8 (namely 2 and 6, 3 and 5, 4 and 4, 5 and 3, and 6 and 2), so P(C) = 5/36.

    Some of these events can both occur at the same time; for example events A and C can happen at the same time, in the case where die 1 lands on 3 and die 2 lands on 5. This is the only one of the 36 outcomes where both A and C occur, so its probability is 1/36. The probability of both A and C occurring is called the joint probability of A and C and is written $P(A \cap C)$, so $P(A \cap C) = 1/36$. On the other hand, if die 2 lands on 1, the dice cannot sum to 8, so $P(B \cap C) = 0$.

    Now suppose we roll the dice and cover up die 2, so we can only see die 1, and observe that die 1 landed on 3. Given this partial information, the probability that the dice sum to 8 is no longer 5/36; instead it is 1/6, since die 2 must land on 5 to achieve this result. This is called the conditional probability, because it is the probability of C under the condition that A is observed, and is written P(C | A), which is read "the probability of C given A." Similarly, P(C | B) = 0, since if we observe die 2 landed on 1, we already know the dice can't sum to 8, regardless of what the other die landed on.

    On the other hand, if we roll the dice and cover up die 2, and observe die 1, this has no impact on the probability of event B, which only depends on die 2. We say events A and B are statistically independent or just independent and in this case

    $P(B \mid A) = P(B).$

    In other words, the probability of B occurring after observing that die 1 landed on 3 is the same as before we observed die 1.

    Intersection events and conditional events are related by the formula:

    $P(C \mid A) = \frac{P(A \cap C)}{P(A)}.$

    In this example, we have:

    $\frac{1}{6} = \frac{1/36}{1/6}.$

    As noted above, $P(B \mid A) = P(B)$, so by this formula:

    $P(B) = P(B \mid A) = \frac{P(A \cap B)}{P(A)}.$

    On multiplying across by P(A),

    $P(A)\,P(B) = P(A \cap B).$

    In other words, if two events are independent, their joint probability is the product of the prior probabilities of each event occurring by itself.
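The dice example can be verified by enumerating all 36 outcomes. The following Python sketch does this with exact fractions; the helper names `prob` and `cond_prob` are assumptions introduced for the illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes, as (die 1, die 2) pairs.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event, given as a predicate on an outcome."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def cond_prob(event, given):
    """P(event | given) = P(event and given) / P(given)."""
    return prob(lambda o: event(o) and given(o)) / prob(given)

A = lambda o: o[0] == 3      # die 1 lands on 3
B = lambda o: o[1] == 1      # die 2 lands on 1
C = lambda o: sum(o) == 8    # the dice sum to 8

p_c = prob(C)                              # prior probability of C
p_c_given_a = cond_prob(C, A)              # after observing die 1 = 3
p_c_given_b = cond_prob(C, B)              # after observing die 2 = 1
p_a_and_b = prob(lambda o: A(o) and B(o))  # joint probability of A and B
```

Here `p_c` is 5/36, `p_c_given_a` is 1/6 and `p_c_given_b` is 0, matching the text, and `p_a_and_b` equals `prob(A) * prob(B)`, which is the independence property of A and B.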

    Definition

    Given a probability space (Ω, F, P) and two events A, B ∈ F with P(B) > 0, the conditional probability of A given B is defined by

    $P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$

    If P(B) = 0 then P(A | B) is undefined (see Borel–Kolmogorov paradox for an explanation). However it is possible to define a conditional probability with respect to a σ-algebra of such events (such as those arising from a continuous random variable).

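    For example, if X and Y are non-degenerate and jointly continuous random variables with density $f_{X,Y}(x, y)$ then, if B has positive measure,

    $P(X \in A \mid Y \in B) = \frac{\int_{y \in B} \int_{x \in A} f_{X,Y}(x, y)\,dx\,dy}{\int_{y \in B} \int_{x \in \Omega} f_{X,Y}(x, y)\,dx\,dy}.$

    The case where B has zero measure can only be dealt with directly in the case that B = {y0}, representing a single point, in which case

    $P(X \in A \mid Y = y_0) = \frac{\int_{x \in A} f_{X,Y}(x, y_0)\,dx}{\int_{x \in \Omega} f_{X,Y}(x, y_0)\,dx}.$

    If A has measure zero then the conditional probability is zero. An indication of why the more general case of zero measure cannot be dealt with in a similar way can be seen by noting that the limit, as all $\delta y_i$ approach zero, of

    $P\left(X \in A \mid Y \in \bigcup_i [y_i, y_i + \delta y_i]\right) \approx \frac{\sum_i \int_{x \in A} f_{X,Y}(x, y_i)\,dx\,\delta y_i}{\sum_i \int_{x \in \Omega} f_{X,Y}(x, y_i)\,dx\,\delta y_i}$

    depends on their relationship as they approach zero. See conditional expectation for more information.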

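    Derivation

    The following derivation is taken from Grinstead and Snell's Introduction to Probability.

    Let $\Omega$ be a sample space with the probability P. Suppose the event $E \subseteq \Omega$ has occurred and an altered probability $P(\{\omega\} \mid E)$ is to be assigned to the elementary events $\{\omega\}$ to reflect the fact that E has occurred. (In the following we will omit the curled brackets.)

    For all $\omega \notin E$ we want to make sure that the intuitive result $P(\omega \mid E) = 0$ is true.

    Also, without further information provided, we can be certain that the relative magnitude of probabilities is conserved:

    $\omega_1, \omega_2 \in E: \quad \frac{P(\omega_1 \mid E)}{P(\omega_2 \mid E)} = \frac{P(\omega_1)}{P(\omega_2)}.$

    This requirement leads us to state:

    $\omega \in E: \quad P(\omega \mid E) = \alpha\,P(\omega)$

    where $\alpha$ is a positive real constant or scaling factor to reflect the above requirement.

    Since we know E has occurred, we can state P(E) > 0 and:

    $1 = \sum_{\omega \in \Omega} P(\omega \mid E) = \sum_{\omega \in E} P(\omega \mid E) = \alpha \sum_{\omega \in E} P(\omega) = \alpha\,P(E).$

    Hence

    $\alpha = \frac{1}{P(E)} \quad\Rightarrow\quad P(\omega \mid E) = \frac{P(\omega)}{P(E)} \text{ for } \omega \in E.$

    For another event F this leads to:

    $P(F \mid E) = \sum_{\omega \in F \cap E} P(\omega \mid E) = \sum_{\omega \in F \cap E} \frac{P(\omega)}{P(E)} = \frac{P(F \cap E)}{P(E)}.$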

    Statistical Independence

    Two random events A and B are statistically independent if and only if

    $P(A \cap B) = P(A)\,P(B).$

    Thus, if A and B are independent, then their joint probability can be expressed as a simple product of their individual probabilities.

    Equivalently, for two independent events A and B with non-zero probabilities,

    $P(A \mid B) = P(A)$

    and

    $P(B \mid A) = P(B).$

    In other words, if A and B are independent, then the conditional probability of A, given B, is simply the individual probability of A alone; likewise, the probability of B given A is simply the probability of B alone.

    Mutual Exclusivity

    Two events A and B are mutually exclusive if and only if $A \cap B = \emptyset$. Then $P(A \cap B) = 0$.

    Therefore, if P(B) > 0 then $P(A \mid B)$ is defined and equal to 0.

    The Conditional Probability Fallacy

    The conditional probability fallacy is the assumption that P(A | B) is approximately equal to P(B | A). The mathematician John Allen Paulos discusses this in his book Innumeracy, where he points out that it is a mistake often made even by doctors, lawyers, and other highly educated non-statisticians. It can be overcome by describing the data in actual numbers rather than probabilities.

    The relation between P(A | B) and P(B | A) is given by Bayes' theorem:

    $P(B \mid A) = P(A \mid B)\,\frac{P(B)}{P(A)}.$

    In other words, one can only assume that P(A | B) is approximately equal to P(B | A) if the prior probabilities P(A) and P(B) are also approximately equal.

    An Example

    In the following constructed but realistic situation, the difference between P(A | B) and P(B | A) may be surprising, but is at the same time obvious.

    In order to identify individuals having a serious disease in an early curable form, one may consider screening a large group of people. While the benefits are obvious, an argument against such screenings is the disturbance caused by false positive screening results: If a person not having the disease is incorrectly found to have it by the initial test, they will most likely be quite distressed until a more careful test shows that they do not have the disease. Even after being told they are well, their lives may be affected negatively.

    The magnitude of this problem is best understood in terms of conditional probabilities.

    Suppose 1% of the group suffer from the disease, and the rest are well. Choosing an individual at random,

    P(ill) = 1% = 0.01 and P(well) = 99% = 0.99.

    Suppose that when the screening test is applied to a person not having the disease, there is a 1% chance of getting a false positive result and 99% chance of getting a true negative result, i.e.

    P(positive | well) = 1%, and P(negative | well) = 99%.

    Finally, suppose that when the test is applied to a person having the disease, there is a 1% chance of a false negative result and 99% chance of getting a true positive result, i.e.

    P(negative | ill) = 1% and P(positive | ill) = 99%.

    Now, one may calculate the following:

    The fraction of individuals in the whole group who are well and test negative (true negative):

    $P(\text{well} \cap \text{negative}) = P(\text{well}) \times P(\text{negative} \mid \text{well}) = 99\% \times 99\% = 98.01\%.$

    The fraction of individuals in the whole group who are ill and test positive (true positive):

    $P(\text{ill} \cap \text{positive}) = P(\text{ill}) \times P(\text{positive} \mid \text{ill}) = 1\% \times 99\% = 0.99\%.$

    The fraction of individuals in the whole group who have false positive results:

    $P(\text{well} \cap \text{positive}) = P(\text{well}) \times P(\text{positive} \mid \text{well}) = 99\% \times 1\% = 0.99\%.$

    The fraction of individuals in the whole group who have false negative results:

    $P(\text{ill} \cap \text{negative}) = P(\text{ill}) \times P(\text{negative} \mid \text{ill}) = 1\% \times 1\% = 0.01\%.$

    Furthermore, the fraction of individuals in the whole group who test positive:

    $P(\text{positive}) = P(\text{well} \cap \text{positive}) + P(\text{ill} \cap \text{positive}) = 0.99\% + 0.99\% = 1.98\%.$

    Finally, the probability that an individual actually has the disease, given that the test result is positive:

    $P(\text{ill} \mid \text{positive}) = \frac{P(\text{ill} \cap \text{positive})}{P(\text{positive})} = \frac{0.99\%}{1.98\%} = 50\%.$

    In this example, it should be easy to relate to the difference between the conditional probabilities P(positive | ill) (which is 99%) and P(ill | positive) (which is 50%): the first is the probability that an individual who has the disease tests positive; the second is the probability that an individual who tests positive actually has the disease. With the numbers chosen here, the last result is likely to be deemed unacceptable: half the people testing positive are actually false positives.
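The screening calculation is a direct application of Bayes' theorem, and a small Python sketch makes it easy to try other prevalence or error rates. The function and parameter names below are assumptions chosen for the illustration.

```python
def posterior_ill_given_positive(prior, sensitivity, false_positive_rate):
    """P(ill | positive) from P(ill), P(positive | ill) and P(positive | well)."""
    true_pos = prior * sensitivity                  # P(ill and positive)
    false_pos = (1 - prior) * false_positive_rate   # P(well and positive)
    return true_pos / (true_pos + false_pos)        # divide by P(positive)

# Numbers from the example: 1% prevalence, 99% true positive, 1% false positive.
p_ill_given_positive = posterior_ill_given_positive(
    prior=0.01, sensitivity=0.99, false_positive_rate=0.01
)
```

With the example's numbers the result is 0.5. Raising the prior to 10% while keeping the same test gives about 0.917, which shows how strongly P(ill | positive) depends on the prevalence.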

    Second Type of Conditional Probability Fallacy

    Another type of fallacy is interpreting conditional probabilities of events (or a series of events) as (unconditional) probabilities, or seeing them as being in the same order of magnitude. A conditional probability of an event and its (total) probability are linked with each other through the formula of total probability, but without additional information one of them says little about the other. The fallacy of viewing P(A | B) as P(A), or as being close to P(A), is often related to some forms of statistical bias, but it can be subtle.

    Here is an example: One of the conditions for the legendary wild-west hero Wyatt Earp to have become a legend was having survived all the duels he survived. Indeed, it is reported that he was never wounded, not even scratched by a bullet. The probability of this happening is very small, contributing to his fame because events of very small probabilities attract attention. However, the point is that the degree of attention depends very much on the observer. Somebody impressed by a specific event (here seeing a "hero") is prone to view effects of randomness differently from others who are less impressed.

    In general it does not make much sense to ask after observation of a remarkable series of events "What is the probability of this?"; this is a conditional probability based upon observation. The distinction between conditional and unconditional probabilities can be intricate if the observer who asks "What is the probability?" is himself/herself an outcome of a random selection. The name "Wyatt Earp effect" was coined in an article "Der Wyatt Earp Effekt" (in German) showing through several examples its subtlety and impact in various scientific domains.

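    Q6. The probability that a football player will play at Eden Gardens is 0.6 and at Ambedkar Stadium is 0.4. The probability that he will get a knee injury when playing at Eden is 0.07 and that at Ambedkar Stadium is 0.04. What is the probability that he would get a knee injury if he played at Eden?

    A6.:

    Let A = he plays at Eden, B = he plays at Ambedkar Stadium, K = he gets a knee injury.

    P(A) = 0.6 P(B) = 0.4 P(K | A) = 0.07 P(K | B) = 0.04

    The probability that he plays at Eden and gets a knee injury is

    P(A ∩ K) = P(A) * P(K | A) = 0.6 * 0.07

    = 0.042

    Given that he plays at Eden, the probability of a knee injury is the stated conditional probability P(K | A) = 0.07.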

    MBA SEMESTER 1

    MB0040 STATISTICS FOR MANAGEMENT

    Assignment Set - 2

    Q1. A random sample of 6 sachets of mustard oil was examined and two were found to be leaking. A wholesaler receives seven hundred twenty six packs, each containing 6 sachets. Find the expected number of packets to contain exactly one sachet leaking?

    A1.:

    p = probability that a sachet leaks = 2/6 = 1/3, q = 1 - p = 2/3

    n = 6 (each pack contains 6 sachets)

    N = 726

    The number of leaking sachets in a pack follows a binomial distribution with parameters n and p, so

    P(x = 1) = 6C1 * p * q^5 = 6 * (1/3) * (2/3)^5 = 64/243 = 0.2634

    Expected no. of packs to contain exactly 1 sachet leaking

    E(A) = N * P(x = 1) = 726 * 0.2634 = 191.2, i.e. about 191 packs
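One standard way to model this question is with a binomial distribution: each of the 6 sachets in a pack leaks independently with probability p = 2/6, estimated from the sample. The Python sketch below works under that assumption; the function name `binomial_pmf` is chosen for the illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_leak = 2 / 6                         # estimated from the sample: 2 of 6 sachets leaked
p_one = binomial_pmf(1, 6, p_leak)     # P(exactly one leaking sachet in a pack of 6)
expected_packs = 726 * p_one           # expected number of such packs among 726
```

Under this model `p_one` equals 64/243, about 0.263, and `expected_packs` is about 191.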

    Q2. What do you mean by errors in statistics? Mention the measures to do so.

    A2.: In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its theoretical value. The error of a sample is the deviation of the sample from the (unobservable) true function value; while the residual of a sample is the difference between the sample and the estimated function value.

    The distinction is most important in regression analysis, where it leads to the concept of studentized residuals.

    Suppose there is a series of observations from a univariate distribution and we want to estimate the mean of that distribution (the so-called location model). In this case the errors are the deviations of the observations from the population mean, while the residuals are the deviations of the observations from the sample mean.

    A statistical error is the amount by which an observation differs from its expected value; the latter being based on the whole population from which the statistical unit was chosen randomly. For example, if the mean height in a population of 21-year-old men is 1.75 meters, and one randomly chosen man is 1.80 meters tall, then the error is 0.05 meters; if the randomly chosen man is 1.70 meters tall, then the error is -0.05 meters. The expected value, being the mean of the entire population, is typically unobservable, and hence the statistical error cannot be observed either.

    The nomenclature arose from random measurement errors in astronomy. It is as if the measurement of the man's height were an attempt to measure the population mean, so that any difference between the man's height and the mean would be a measurement error.

    A residual (or fitting error), on the other hand, is an observable estimate of the unobservable statistical error. Consider the previous example with men's heights and suppose we have a random sample of n people. The sample mean could serve as a good estimator of the population mean. Then we have:

    • The difference between the height of each man in the sample and the unobservable population mean is a statistical error, whereas

    • The difference between the height of each man in the sample and the observable sample mean is a residual.

    Note that the sum of the residuals within a random sample is necessarily zero, and thus the residuals are necessarily not independent. The statistical errors on the other hand are independent, and their sum within the random sample is almost surely not zero.
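The difference between errors and residuals can be seen numerically. In the Python sketch below the five heights are hypothetical values invented for the illustration, and the population mean of 1.75 meters is taken from the example above and treated as known, which it would not be in practice.

```python
population_mean = 1.75                        # treated as known for the illustration only
heights = [1.80, 1.70, 1.78, 1.69, 1.83]      # hypothetical sample of men's heights

sample_mean = sum(heights) / len(heights)

# Statistical errors: deviations from the (normally unobservable) population mean.
errors = [h - population_mean for h in heights]

# Residuals: deviations from the observable sample mean.
residuals = [h - sample_mean for h in heights]

sum_errors = sum(errors)
sum_residuals = sum(residuals)
```

The residuals sum to zero up to floating-point rounding, whereas the errors sum to about 0.05 for this sample.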

    One can standardize statistical errors (especially of a normal distribution) in a z-score (or standard score), and standardize residuals in a t-statistic, or more generally studentized residuals.

    Standard error of the m ean

    The standard error of the mean (SEM) is the standa rd d eviation of the sa mple m ea n estimat e of a

    population mean. (It can also be viewed as the standard deviation of the error in the sample mean

    relative to the true mea n, since the sample m ea n is an unb iased estima to r.) SEM is usually estima te d

    by the sam ple estimate of the p opulation stand ard d eviation (samp le stand a rd de viation) divide d

    by the squa re root of the sam ple size (a ssuming stat istica l indep end enc e o f the values in the

    sample):

  • 8/6/2019 MB0040 - AJ

    12/15

    where

    s is the sample standard deviation (i.e., the sample-based estimate of the standard deviation
    of the population), and

    n is the size (number of observations) of the sample.

    This estimate may be compared with the formula for the true standard deviation of the mean:

    SD(x̄) = σ / √n

    where

    σ is the standard deviation of the population.
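
    As a minimal sketch of the estimate described above (sample standard deviation divided by the
    square root of the sample size), the following uses Python's standard library. The function name
    and the fill-volume readings are hypothetical and serve only as an illustration.

```python
import math
import statistics


def standard_error_of_mean(values):
    """Estimate the SEM as s / sqrt(n), assuming independent observations."""
    n = len(values)
    # statistics.stdev uses the n - 1 denominator (sample standard deviation).
    s = statistics.stdev(values)
    return s / math.sqrt(n)


# Hypothetical sample of bottle fill volumes in ml (illustrative values only).
sample = [298.2, 301.5, 299.8, 300.9, 297.6, 302.1]
print("estimated SEM:", standard_error_of_mean(sample))
```

    Quadrupling the sample size halves the standard error, since n enters through its square root.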

    Note 1: Standard error may also be defined as the standard deviation of the residual error term.

    Note 2: Both the standard error and the standard deviation of small samples tend to systematically
    underestimate the population standard error and standard deviation: the standard error of the mean
    is a biased estimator of the population standard error. With n = 2 the underestimate is about 25%,
    but for n = 6 the underestimate is only 5%. Gurland and Tripathi (1971) provide a correction and
    equation for this effect. Sokal and Rohlf (1981) give an equation of the correction factor for
    small samples of n [...]

    [...] (-1.96, 1.96) critical values, H0 is rejected.

    Conclusion: The mean is not 300 ml; that is, the machine is not functioning properly.

    Q5. Out of 2000 people surveyed, 1200 belong to urban areas and the rest to semi-urban areas. Among
    1000 who visited other regions, 800 belonged to urban areas. Test at the 5% level of significance
    whether area and visiting other states are dependent.

    A5.: N = 2000, n = 1000

    P = 800/1000 = 0.8

    α = 0.05

    P0 = 1200/2000 = 0.6

    Q0 = 1 - P0 = 0.4

    If area and visiting other states were independent, the urban share among visitors would equal
    the overall urban share, 0.6.

    H0 : P = 0.6 (area and visiting other states are independent)

    H1 : P ≠ 0.6 (area and visiting other states are dependent)

    z = (P - P0) / √(P0 Q0 / n) ~ N(0, 1)

    = (0.8 - 0.6) / √(0.6 × 0.4 / 1000)

    = 12.91

    |z| = 12.91, which is not in the interval (-1.96, 1.96)

    H0 is rejected

    Conclusion: Area and visiting other states are dependent.
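
    The arithmetic in A5 can be checked with a short sketch of a one-sample z-test for a proportion.
    The function name is hypothetical; the counts, the null proportion 0.6, and the critical value
    1.96 are taken from the answer above.

```python
import math


def one_sample_proportion_z(successes, n, p0):
    """Return the z statistic for testing H0: P = p0 against H1: P != p0."""
    p_hat = successes / n
    # Standard error of the sample proportion under the null hypothesis.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# 800 of the 1000 visitors are urban; the overall urban share is 1200/2000 = 0.6.
z = one_sample_proportion_z(800, 1000, 1200 / 2000)

critical_value = 1.96  # two-sided test at the 5% level of significance
reject_h0 = abs(z) > critical_value

print(f"z = {z:.2f}, reject H0: {reject_h0}")  # prints "z = 12.91, reject H0: True"
```

    A chi-square test on the 2 x 2 table of area by visiting status is a common alternative for the
    same question; the z-test is used here only because it mirrors the worked answer.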

    Q6. How is statistics useful for modern managers? Give examples and explain.

    A6.: Modern managers often join agencies because they seek to serve and help their communities
    and country. Not surprisingly, some managers are puzzled by the suggestion of engaging in research
    and statistics: research appears boring in comparison with developing and implementing new
    programs, and statistics seems, well, impossibly challenging with little payoff in sight.

    In fact, analytical techniques involving research and statistics are increasingly in demand. Many
    decisions that modern managers make involve data and analysis, one way or another. Consider the
    following common uses of analysis and data:

    First, data and objective analysis often are used to describe and analyze problems, such as the
    magnitude of environmental disasters (for example, oil spills), the extent of social and public health
    problems (such as homelessness or the AIDS epidemic), the extent of lawlessness, the level of
    economic prosperity or stagnation, or the impact of weather-related problems such as those brought on
    by hurricanes and snowstorms. For example, it matters whether the illiteracy rate among 12-year-olds
    is 3 percent or 30 percent, or somewhere in between. By describing the extent of these
    problems and their underlying causes accurately, managers are able to better formulate effective
    strategies for dealing with them. Policy analysis often begins by describing the extent and
    characteristics of problems, and the factors associated with them.

    Second, data are used to describe policies and programs. What are programs and policies
    expected to achieve? How many services are programs expected to provide? What are some
    milestones of achievement? How much will a program cost? These questions involve quantifiable
    answers, such as the number of national guardsmen that are brought in to assist with search and
    rescue efforts after a major hurricane, or the number of evacuees for whom officials expect to
    provide refuge. Policies and programs can be described in quite detailed ways, involving distinct
    program activities, the duration and geographic scope of activities, and staffing levels and area
    program budget data.

    Third, programs produce much routine, administrative data that are used to monitor progress and
    prevent fraud. For example, hospitals produce a large amount of data about patient visits, who
    attended them, their diagnosis, billing codes, and so on. Schools produce vast amounts of data
    about student achievement, student conduct, extracurricular activities, support and administrative
    services, and so on. Regulatory programs produce data about inspections and compliance. In
    many states, gaming devices (such as slot machines) are monitored electronically to ensure that
    taxes are collected and that they are not tampered with. Managers are expected to be familiar
    with the administrative data in their lines of business.

    Fourth, analysis is used to guide and improve program operations.
    Data can be brought to bear on problems that help managers choose among competing strategies.
    For example, what-if analysis might be used to determine the cost-effectiveness of alternative
    courses of action. Such analysis often is tailored to unique situations and problems. In addition, client
    and citizen surveys might be used to inform program priorities by assessing population needs and
    service satisfaction. Systematic surveys provide valid and objective assessments of citizen and client
    needs, priorities, and perceptions of programs and services. Systematic surveys of citizens and clients
    are used increasingly and are considered a valuable tool of modern management.

    Fifth, data are used to evaluate outcomes. Legislatures and citizens want to know what return they
    are getting from their tax dollars. Did programs and policies achieve their aims? Did they produce
    any unexpected results?

    Most grant applications require modern managers to be accountable for program outcomes.
    Modern managers must demonstrate that their programs are producing effective outcomes and
    that they are doing so in cost-effective ways. This demand for outcome evaluation and monitoring
    far exceeds any requirement of proper funds management. Analysis can also be used to determine
    the impact of different conditions on program effectiveness, leading to suggestions for improving
    programs.

    Data and analysis are omnipresent in programs and policies. They are there at every stage, from the
    inception of programs and policies, to their very end. Of course, decisions are also based on
    personal observation, political consensus, anecdotal and impressionistic descriptions, and the
    ideologies of leaders. Yet data and analysis often are present, too, one way or another. This is
    because analysis is useful. Specifically, quantitative analysis aids in providing an objective, factual
    underpinning of situations and responses. Analysis, along with data, helps quantify the extent of
    problems and solutions in ways that other information seldom can. Analysis can help quantify the
    actual or likely impact of proposed strategies, for example, helping to determine their adequacy. At
    the very least, a focus on facts and objective analysis might reduce judgment errors stemming from
    overly impressionistic or subjective perceptions that are factually incorrect. So managers are
    expected to bring data and analysis to the decision-making table.