MB0040 - AJ

8/6/2019 MB0040 - AJ


MBA-1 | Subject Code: Statistics For Management

MBA SEMESTER 1

MB0040 STATISTICS FOR MANAGEMENT

Assignment Set - 1

Q1. Elucidate the functions of statistics.

A1.:

(1) Statistics helps in providing a better understanding and exact description of a phenomenon of nature.

(2) Statistics helps in proper and efficient planning of a statistical inquiry in any field of study.

(3) Statistics helps in collecting appropriate quantitative data.

(4) Statistics helps in presenting complex data in a suitable tabular, diagrammatic and graphic form for an easy and clear comprehension of the data.

(5) Statistics helps in understanding the nature and pattern of variability of a phenomenon through quantitative observations.

(6) Statistics helps in drawing valid inferences, along with a measure of their reliability, about the population parameters from the sample data.

Q2. What are the methods of statistical survey? Explain briefly.

A2.: Statistical surveys are used to collect quantitative information about items in a population. Surveys of human populations and institutions are common in political polling and government, health, social science and marketing research. A survey may focus on opinions or factual information depending on its purpose, and many surveys involve administering questions to individuals. When the questions are administered by a researcher, the survey is called a structured interview or a researcher-administered survey. When the questions are administered by the respondent, the survey is referred to as a questionnaire or a self-administered survey.

Structure and standardization

The questions are usually structured and standardized. The structure is intended to reduce bias. For example, questions should be ordered in such a way that a question does not influence the response to subsequent questions. Surveys are standardized to ensure reliability, generalizability, and validity. Every respondent should be presented with the same questions and in the same order as other respondents.

In organizational development (OD), carefully constructed survey instruments are often used as the basis for data gathering, organizational diagnosis, and subsequent action planning. Some OD practitioners (e.g. Fred Nickols) even consider survey guided development as the sine qua non of OD.

Serial surveys

Serial surveys are those which repeat the same questions at different points in time, producing repeated measures data. There are three basic designs for a study with more than one measurement occasion: cross-sectional design, longitudinal design, and time-series design.

- Cross-sectional surveys use different units (respondents) at each of the measurement occasions, by drawing a new sample each time. The time intervals may be different between measurement occasions, but they are the same for all units (respondents). A study in which a survey is administered once is also considered to be cross-sectional.

- Longitudinal surveys use the same units (respondents) at each of the measurement occasions, by recontacting the same sample from the initial survey for the following measurement occasion(s), and asking the same questions at every occasion. The time intervals may be different between measurement occasions, but they are the same for all units (respondents).

- Time-series surveys also use the same units (respondents) at each of the measurement occasions, but the difference from longitudinal study designs is that in time-series designs both the number of measurement occasions and the time intervals between occasions may be different between units (respondents).

Modes of Data Collection

There are several ways of administering a survey, including:

Telephone

- use of interviewers encourages sample persons to respond, leading to higher response rates
- interviewers can increase comprehension of questions by answering respondents' questions
- fairly cost efficient, depending on local call charge structure
- good for large national (or international) sampling frames
- some potential for interviewer bias (e.g. some people may be more willing to discuss a sensitive issue with a female interviewer than with a male one)
- cannot be used for non-audio information (graphics, demonstrations, taste/smell samples)
- unreliable for consumer surveys in rural areas where telephone penetration is low
- three types:
    o traditional telephone interviews
    o computer assisted telephone dialing
    o computer assisted telephone interviewing (CATI)

Mail

- the questionnaire may be handed to the respondents or mailed to them, but in all cases they are returned to the researcher via mail
- cost is very low, since bulk postage is cheap in most countries
- long time delays, often several months, before the surveys are returned and statistical analysis can begin
- not suitable for issues that may require clarification
- respondents can answer at their own convenience (allowing them to break up long surveys; also useful if they need to check records to answer a question)
- no interviewer bias introduced
- large amount of information can be obtained: some mail surveys are as long as 50 pages
- response rates can be improved by using mail panels
    o members of the panel have agreed to participate
    o panels can be used in longitudinal designs where the same respondents are surveyed several times

Online surveys

- can use web or e-mail
- web is preferred over e-mail because interactive HTML forms can be used
- often inexpensive to administer
- very fast results
- easy to modify
- response rates can be improved by using online panels - members of the panel have agreed to participate
- if not password-protected, easy to manipulate by completing multiple times to skew results
- data creation, manipulation and reporting can be automated and/or easily exported into a format which can be read by PSPP, DAP or other statistical analysis software
- data sets created in real time
- some are incentive based (such as Survey Vault or YouGov)
- may skew sample towards a younger demographic compared with CATI
- often difficult to determine/control selection probabilities, hindering quantitative analysis of data
- used in large-scale industries

Personal in-home survey

- respondents are interviewed in person, in their homes (or at the front door)
- very high cost
- suitable when graphic representations, smells, or demonstrations are involved
- often suitable for long surveys (but some respondents object to allowing strangers into their home for extended periods)
- suitable for locations where telephone or mail are not developed
- skilled interviewers can persuade respondents to cooperate, improving response rates
- potential for interviewer bias

Personal mall intercept survey

- shoppers at malls are intercepted - they are either interviewed on the spot, taken to a room and interviewed, or taken to a room and given a self-administered questionnaire
- socially acceptable - people feel that a mall is a more appropriate place to do research than their home
- potential for interviewer bias
- fast
- easy to manipulate by completing multiple times to skew results

Sampling

Sample selection is critical to the validity of the information that represents the populations being studied. The sampling approach helps to determine the focus of the study and allows better acceptance of the generalizations that are being made. Biased sampling can be used with care if it is justified and as long as it is noted that the resulting sample may not be a true representation of the population of the study. There are two different approaches to sampling in survey research:

- The nonprobability sampling approach. In this approach the researcher does not know each element's probability of selection in the sample. The most commonly used nonprobability sampling method is convenience sampling, which samples only those who are available and willing to participate in the survey. This approach is convenient for the researcher but may cost data validity because the sample may not be representative.

- The probability sampling approach gives each element a known chance of being included in the sample. It can be difficult to use because of the cost of a rigorous sampling method and the difficulty of obtaining full coverage of the target population, but the generalizations that come from it are more likely to be close to a true representation of the population. Different forms of probability sampling are designed to achieve various benefits - e.g. theoretical simplicity, operational simplicity, detailed information on subpopulations, or minimal cost. Some common forms:

    o Equal probability of selection designs (EPS), in which each element of the population has an equal chance of being included in the sample. This uniformity makes EPS surveys relatively simple to interpret. Forms of EPS include simple random sampling (SRS) and systematic sampling.

    o Probability-proportional-to-size designs (PPS), in which 'larger' elements (according to some known measure of size) have a higher chance of selection. This approach is common in business surveys where the object is to determine sector totals (e.g. "total employment in manufacturing sectors"); compared to EPS, concentrating on larger elements may produce better accuracy for the same cost/sample size.

    o Stratified random sampling approach, in which the population is divided into subpopulations (called strata) and random samples are then drawn separately from each of these strata, using any probability sampling method (sometimes including further sub-stratification). This may be done to provide better control over the sample size (and hence, accuracy) within each subpopulation; when the variable/s of interest are correlated with subpopulation, it can also improve overall accuracy. Another use for stratification is when different subpopulations require different sampling methods - for instance, a business survey might use EPS for businesses whose 'size' is not known and PPS elsewhere.
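The sampling designs described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 100 numbered units, the two strata, the sample sizes and the function names are all hypothetical and do not come from the assignment.

```python
import random

def simple_random_sample(population, n, rng):
    """EPS design: every element has the same chance n/N of selection."""
    return rng.sample(population, n)

def systematic_sample(population, n, rng):
    """EPS design: random start, then every k-th element (k = N // n)."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a separate simple random sample from each stratum."""
    return {name: rng.sample(units, n_per_stratum[name])
            for name, units in strata.items()}

rng = random.Random(0)                      # fixed seed so the sketch is repeatable
population = list(range(1, 101))            # hypothetical units numbered 1..100
strata = {"small": population[:80],         # hypothetical strata
          "large": population[80:]}

srs = simple_random_sample(population, 10, rng)
systematic = systematic_sample(population, 10, rng)
stratified = stratified_sample(strata, {"small": 5, "large": 5}, rng)
```

Drawing five units from each stratum deliberately over-represents the small "large" stratum, which is the kind of control over subpopulation sample size that stratification is meant to give.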

Q3. Tabulate the following data:

Age: 20-40; 40-60; 60 and above

Departments: English, Hindi, Political Science, History, Sociology

Degree level: Graduates; Postgraduates; PhD. Total students in age group and in degree level.

A3.:

                                        DEPARTMENTS
Age          Degree Level     Eng   Hin   Pol. Sci   History   Sociology   Total

20-40        Graduates
             Postgraduates
             PhD

40-60        Graduates
             Postgraduates
             PhD

60 & Above   Graduates
             Postgraduates
             PhD

Total

Q4. The data given below is the distribution of employees of a business according to their efficiency. Find the mean deviation and coefficient of mean deviation from Mean and Median:

Efficiency Index   22-26   26-30   30-34   34-38   38-42
Employees             25      35      15       5       2

A4.: Calculation of Mean Deviation from Mean:

EI       E (f)   x     fx     |D| = |x - 28.29|   f|D|
22-26    25      24    600     4.29               107.25
26-30    35      28    980     0.29                10.15
30-34    15      32    480     3.71                55.65
34-38     5      36    180     7.71                38.55
38-42     2      40     80    11.71                23.42
Total    N = 82        Σfx = 2320                 Σf|D| = 235.02

MD = (Σf|D|)/N

X̄ = Σfx/N = 2320/82 = 28.29

D = x - X̄

MD = 235.02/82 = 2.866

Coefficient of MD from mean = MD/X̄ = 2.866/28.29 = 0.1013

Calculation of MD from Median:

EI       f     Cf    x     |D| = |x - Me|   f|D|
22-26    25    25    24     3.83             95.75
26-30    35    60    28     0.17              5.95
30-34    15    75    32     4.17             62.55
34-38     5    80    36     8.17             40.85
38-42     2    82    40    12.17             24.34
Total    N = 82                              Σf|D| = 229.44

Me class = class containing the (N/2)th item = 82/2 = 41st item

Me class = 26-30

Me = l + [(N/2 - Cf)/f] × i = 26 + [(41 - 25)/35] × 4 = 27.83

MD = Σf|D|/N = 229.44/82 = 2.798

Coefficient of MD = MD/Me = 2.798/27.83 = 0.1005
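The hand calculation above can be cross-checked with a short Python sketch. The class intervals and frequencies are taken from the question; the function and variable names are illustrative only. Because the script keeps full precision for the mean and median rather than rounding them to two decimals first, its results can differ from the tabulated figures in the third decimal place.

```python
# Class intervals (efficiency index) and employee counts from the question.
classes = [(22, 26), (26, 30), (30, 34), (34, 38), (38, 42)]
freqs = [25, 35, 15, 5, 2]

n = sum(freqs)                                    # total number of employees
mids = [(lo + hi) / 2 for lo, hi in classes]      # class mid-points x
mean = sum(f * x for f, x in zip(freqs, mids)) / n

def grouped_median(classes, freqs):
    """Median of grouped data: l + ((N/2 - Cf) / f) * i."""
    half = sum(freqs) / 2
    cum = 0                                       # cumulative frequency before the class
    for (lo, hi), f in zip(classes, freqs):
        if cum + f >= half:
            return lo + (half - cum) / f * (hi - lo)
        cum += f

def mean_deviation(center):
    """Frequency-weighted mean absolute deviation of the mid-points from center."""
    return sum(f * abs(x - center) for f, x in zip(freqs, mids)) / n

median = grouped_median(classes, freqs)
md_mean = mean_deviation(mean)
md_median = mean_deviation(median)
coef_mean = md_mean / mean
coef_median = md_median / median
```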

Q5. What is conditional probability? Explain with an example.

A5.: Conditional probability is the probability of some event A, given the occurrence of some other event B. Conditional probability is written P(A | B), and is read "the (conditional) probability of A, given B" or "the probability of A under the condition B". When in a random experiment the event B is known to have occurred, the possible outcomes of the experiment are reduced to B, and hence the probability of the occurrence of A is changed from the unconditional probability into the conditional probability given B.

Joint probability is the probability of two events in conjunction. That is, it is the probability of both events together. The joint probability of A and B is written P(A ∩ B) or P(A, B).

Marginal probability is then the unconditional probability P(A) of the event A; that is, the probability of A, regardless of whether event B did or did not occur. If B can be thought of as the event of a random variable X having a given outcome, the marginal probability of A can be obtained by summing (or integrating, more generally) the joint probabilities over all outcomes for X. For example, if there are two possible outcomes for X with corresponding events B and B', this means that P(A) = P(A ∩ B) + P(A ∩ B'). This is called marginalization.

In these definitions, note that there need not be a causal or temporal relation between A and B. A may precede B or vice versa, or they may happen at the same time. A may cause B or vice versa, or they may have no causal relation at all. Notice, however, that causal and temporal relations are informal notions, not belonging to the probabilistic framework. They may apply in some examples, depending on the interpretation given to events.

Conditioning of probabilities, i.e. updating them to take account of (possibly new) information, may be achieved through Bayes' theorem. In such conditioning, the probability of A given only initial information I, P(A | I), is known as the prior probability. The updated conditional probability of A, given I and the outcome of the event B, is known as the posterior probability, P(A | B, I).

Introduction

Consider the simple scenario of rolling two fair six-sided dice, labelled die 1 and die 2. Define the following three events (not assumed to occur simultaneously):

A: Die 1 lands on 3.

B: Die 2 lands on 1.

C: The dice sum to 8.

The prior probability of each event describes how likely the outcome is before the dice are rolled, without any knowledge of the roll's outcome. For example, die 1 is equally likely to fall on each of its 6 sides, so P(A) = 1/6. Similarly P(B) = 1/6. Likewise, of the 6 × 6 = 36 possible ways that a pair of dice can land, just 5 result in a sum of 8 (namely 2 and 6, 3 and 5, 4 and 4, 5 and 3, and 6 and 2), so P(C) = 5/36.

Some of these events can both occur at the same time; for example events A and C can happen at the same time, in the case where die 1 lands on 3 and die 2 lands on 5. This is the only one of the 36 outcomes where both A and C occur, so its probability is 1/36. The probability of both A and C occurring is called the joint probability of A and C and is written P(A ∩ C), so P(A ∩ C) = 1/36. On the other hand, if die 2 lands on 1, the dice cannot sum to 8, so P(B ∩ C) = 0.

Now suppose we roll the dice and cover up die 2, so we can only see die 1, and observe that die 1 landed on 3. Given this partial information, the probability that the dice sum to 8 is no longer 5/36; instead it is 1/6, since die 2 must land on 5 to achieve this result. This is called the conditional probability, because it is the probability of C under the condition that A is observed, and is written P(C | A), which is read "the probability of C given A." Similarly, P(C | B) = 0, since if we observe die 2 landed on 1, we already know the dice can't sum to 8, regardless of what the other die landed on.

On the other hand, if we roll the dice and cover up die 2, and observe die 1, this has no impact on the probability of event B, which only depends on die 2. We say events A and B are statistically independent or just independent, and in this case

P(B | A) = P(B).

In other words, the probability of B occurring after observing that die 1 landed on 3 is the same as before we observed die 1.

Intersection events and conditional events are related by the formula:

P(C | A) = P(A ∩ C) / P(A).

In this example, we have:

1/6 = (1/36) / (1/6).

As noted above, P(B | A) = P(B), so by this formula:

P(B) = P(B | A) = P(A ∩ B) / P(A).

On multiplying across by P(A),

P(A) P(B) = P(A ∩ B).

In other words, if two events are independent, their joint probability is the product of the prior probabilities of each event occurring by itself.
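The dice probabilities in this introduction can be verified by enumerating all 36 equally likely outcomes. The sketch below uses exact fractions; the helper name prob and the lambda-based event definitions are illustrative choices, not part of the original text.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes as (die 1, die 2) pairs.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event, given=lambda o: True):
    """P(event | given), computed by counting outcomes that satisfy 'given'."""
    conditioned = [o for o in outcomes if given(o)]
    favourable = sum(1 for o in conditioned if event(o))
    return Fraction(favourable, len(conditioned))

A = lambda o: o[0] == 3        # die 1 lands on 3
B = lambda o: o[1] == 1        # die 2 lands on 1
C = lambda o: sum(o) == 8      # the dice sum to 8

p_a = prob(A)
p_b = prob(B)
p_c = prob(C)
p_a_and_c = prob(lambda o: A(o) and C(o))
p_a_and_b = prob(lambda o: A(o) and B(o))
p_c_given_a = prob(C, given=A)
p_c_given_b = prob(C, given=B)
p_b_given_a = prob(B, given=A)
```

Comparing p_c_given_a with p_a_and_c / p_a, and p_a_and_b with p_a * p_b, reproduces the two relations stated in the text.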

Definition

Given a probability space (Ω, F, P) and two events A, B ∈ F with P(B) > 0, the conditional probability of A given B is defined by

P(A | B) = P(A ∩ B) / P(B).

If P(B) = 0 then P(A | B) is undefined (see Borel-Kolmogorov paradox for an explanation). However it is possible to define a conditional probability with respect to a σ-algebra of such events (such as those arising from a continuous random variable).

For example, if X and Y are non-degenerate and jointly continuous random variables with density f_{X,Y}(x, y) then, if B has positive measure,

$$P(X \in A \mid Y \in B) = \frac{\int_{y \in B} \int_{x \in A} f_{X,Y}(x, y)\, dx\, dy}{\int_{y \in B} \int_{x \in \Omega} f_{X,Y}(x, y)\, dx\, dy}.$$

The case where B has zero measure can only be dealt with directly in the case that B = {y0}, representing a single point, in which case

$$P(X \in A \mid Y = y_0) = \frac{\int_{x \in A} f_{X,Y}(x, y_0)\, dx}{\int_{x \in \Omega} f_{X,Y}(x, y_0)\, dx}.$$

If A has measure zero then the conditional probability is zero. An indication of why the more general case of zero measure cannot be dealt with in a similar way can be seen by noting that the limit, as all δy_i approach zero, of

$$P\Big(X \in A \mid Y \in \bigcup_i [y_i, y_i + \delta y_i]\Big) \approx \frac{\sum_i \int_{x \in A} f_{X,Y}(x, y_i)\, dx\, \delta y_i}{\sum_i \int_{x \in \Omega} f_{X,Y}(x, y_i)\, dx\, \delta y_i}$$

depends on their relationship as they approach zero. See conditional expectation for more information.

Derivation

The following derivation is taken from Grinstead and Snell's Introduction to Probability.

Let Ω be a sample space with the probability P. Suppose the event E ⊂ Ω has occurred and an altered probability P({ω} | E) is to be assigned to the elementary events {ω} to reflect the fact that E has occurred. (In the following we will omit the curled brackets.)

For all ω ∉ E we want to make sure that the intuitive result P(ω | E) = 0 is true.

Also, without further information provided, we can be certain that the relative magnitude of probabilities is conserved:

for ω1, ω2 ∈ E: P(ω1 | E) / P(ω2 | E) = P(ω1) / P(ω2).

This requirement leads us to state:

for ω ∈ E: P(ω | E) = α P(ω),

where α is a positive real constant or scaling factor to reflect the above requirement.

Since we know E has occurred, we can state P(E) > 0 and:

$$1 = \sum_{\omega \in E} P(\omega \mid E) = \alpha \sum_{\omega \in E} P(\omega) = \alpha P(E).$$

Hence

$$\alpha = \frac{1}{P(E)}.$$

For another event F this leads to:

$$P(F \mid E) = \sum_{\omega \in E \cap F} P(\omega \mid E) = \sum_{\omega \in E \cap F} \frac{P(\omega)}{P(E)} = \frac{P(E \cap F)}{P(E)}.$$

Statistical Independence

Two random events A and B are statistically independent if and only if

P(A ∩ B) = P(A) P(B).

Thus, if A and B are independent, then their joint probability can be expressed as a simple product of their individual probabilities.

Equivalently, for two independent events A and B with non-zero probabilities,

P(A | B) = P(A)

and

P(B | A) = P(B).

In other words, if A and B are independent, then the conditional probability of A, given B, is simply the individual probability of A alone; likewise, the probability of B given A is simply the probability of B alone.

Mutual Exclusivity

Two events A and B are mutually exclusive if and only if A ∩ B = ∅. Then P(A ∩ B) = 0.

Therefore, if P(B) > 0 then P(A | B) is defined and equal to 0.

The Conditional Probability Fallacy

The conditional probability fallacy is the assumption that P(A | B) is approximately equal to P(B | A). The mathematician John Allen Paulos discusses this in his book Innumeracy, where he points out that it is a mistake often made even by doctors, lawyers, and other highly educated non-statisticians. It can be overcome by describing the data in actual numbers rather than probabilities.

The relation between P(A | B) and P(B | A) is given by Bayes' theorem:

P(B | A) = P(A | B) P(B) / P(A).

In other words, one can only assume that P(A | B) is approximately equal to P(B | A) if the prior probabilities P(A) and P(B) are also approximately equal.

An Example

In the following constructed but realistic situation, the difference between P(A | B) and P(B | A) may be surprising, but is at the same time obvious.

In order to identify individuals having a serious disease in an early curable form, one may consider screening a large group of people. While the benefits are obvious, an argument against such screenings is the disturbance caused by false positive screening results: if a person not having the disease is incorrectly found to have it by the initial test, they will most likely be quite distressed until a more careful test shows that they do not have the disease. Even after being told they are well, their lives may be affected negatively.

The magnitude of this problem is best understood in terms of conditional probabilities.

Suppose 1% of the group suffer from the disease, and the rest are well. Choosing an individual at random,

P(ill) = 1% = 0.01 and P(well) = 99% = 0.99.

Suppose that when the screening test is applied to a person not having the disease, there is a 1% chance of getting a false positive result and 99% chance of getting a true negative result, i.e.

P(positive | well) = 1%, and P(negative | well) = 99%.

Finally, suppose that when the test is applied to a person having the disease, there is a 1% chance of a false negative result and 99% chance of getting a true positive result, i.e.

P(negative | ill) = 1% and P(positive | ill) = 99%.

Now, one may calculate the following:

The fraction of individuals in the whole group who are well and test negative (true negative):

P(well ∩ negative) = P(well) × P(negative | well) = 99% × 99% = 98.01%.

The fraction of individuals in the whole group who are ill and test positive (true positive):

P(ill ∩ positive) = P(ill) × P(positive | ill) = 1% × 99% = 0.99%.

The fraction of individuals in the whole group who have false positive results:

P(well ∩ positive) = P(well) × P(positive | well) = 99% × 1% = 0.99%.

The fraction of individuals in the whole group who have false negative results:

P(ill ∩ negative) = P(ill) × P(negative | ill) = 1% × 1% = 0.01%.

Furthermore, the fraction of individuals in the whole group who test positive:

P(positive) = P(well ∩ positive) + P(ill ∩ positive) = 0.99% + 0.99% = 1.98%.

Finally, the probability that an individual actually has the disease, given that the test result is positive:

P(ill | positive) = P(ill ∩ positive) / P(positive) = 0.99% / 1.98% = 50%.

In this example, it should be easy to relate to the difference between the conditional probabilities P(positive | ill) (which is 99%) and P(ill | positive) (which is 50%): the first is the probability that an individual who has the disease tests positive; the second is the probability that an individual who tests positive actually has the disease. With the numbers chosen here, the last result is likely to be deemed unacceptable: half the people testing positive are actually false positives.
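The screening figures can be reproduced with Bayes' theorem in a few lines of Python. This is a minimal sketch using the percentages stated in the example; the variable names are illustrative, and exact fractions are used so that no rounding enters the result.

```python
from fractions import Fraction

# Inputs stated in the example.
p_ill = Fraction(1, 100)               # P(ill)
p_pos_given_ill = Fraction(99, 100)    # P(positive | ill)
p_pos_given_well = Fraction(1, 100)    # P(positive | well)

p_well = 1 - p_ill                     # P(well)

# Joint probabilities for the four possible outcomes.
true_positive = p_ill * p_pos_given_ill
false_positive = p_well * p_pos_given_well
true_negative = p_well * (1 - p_pos_given_well)
false_negative = p_ill * (1 - p_pos_given_ill)

# Total probability of a positive test, then Bayes' theorem.
p_positive = true_positive + false_positive
p_ill_given_positive = true_positive / p_positive
```

Changing p_ill to a more common or a rarer condition shows how strongly P(ill | positive) depends on the prior, even when the test itself is unchanged.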

Second Type of Conditional Probability Fallacy

Another type of fallacy is interpreting conditional probabilities of events (or a series of events) as (unconditional) probabilities, or seeing them as being in the same order of magnitude. A conditional probability of an event and its (total) probability are linked with each other through the formula of total probability, but without additional information one of them says little about the other. The fallacy of viewing P(A | B) as P(A), or as being close to P(A), is often related to some forms of statistical bias, but it can be subtle.

Here is an example: one of the conditions for the legendary wild-west hero Wyatt Earp to have become a legend was having survived all the duels he survived. Indeed, it is reported that he was never wounded, not even scratched by a bullet. The probability of this happening is very small, contributing to his fame because events of very small probabilities attract attention. However, the point is that the degree of attention depends very much on the observer. Somebody impressed by a specific event (here seeing a "hero") is prone to view effects of randomness differently from others who are less impressed.

In general it does not make much sense to ask after observation of a remarkable series of events "What is the probability of this?"; this is a conditional probability based upon observation. The distinction between conditional and unconditional probabilities can be intricate if the observer who asks "What is the probability?" is himself/herself an outcome of a random selection. The name "Wyatt Earp effect" was coined in an article "Der Wyatt Earp Effekt" (in German) showing through several examples its subtlety and impact in various scientific domains.

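Q6. The probability that a football player will play at Eden Garden is 0.6 and at Ambedkar Stadium is 0.4. The probability that he will get a knee injury when playing at Eden is 0.07 and at Ambedkar Stadium is 0.04. What is the probability that he would get a knee injury if he played at Eden?

A6.:

Let A = plays at Eden Garden, B = plays at Ambedkar Stadium, C = knee injury when playing at Eden, D = knee injury when playing at Ambedkar Stadium.

P(A) = 0.6, P(B) = 0.4, P(C) = 0.07, P(D) = 0.04

P(A ∩ C) = P(A) × P(C) = 0.6 × 0.07 = 0.042

Here 0.042 is the joint probability that he plays at Eden and gets a knee injury there; the conditional probability of a knee injury given that he plays at Eden is the stated 0.07.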

MBA SEMESTER 1

MB0040 STATISTICS FOR MANAGEMENT

Assignment Set - 2

Q1. A random sample of 6 sachets of mustard oil was examined and two were found to be leaking. A wholesaler receives seven hundred twenty six packs, each containing 6 sachets. Find the expected number of packs to contain exactly one leaking sachet.

A1.:

Probability that a sachet leaks: p = 2/6 = 1/3, so q = 1 - p = 2/3

n = 6 (each pack contains 6 sachets)

N = 726

Treating the number of leaking sachets in a pack as a binomial variable,

P(x = 1) = 6C1 × p × q^5 = 6 × (1/3) × (2/3)^5 = 192/729 = 0.2634

Expected number of packs containing exactly 1 leaking sachet:

E(A) = N × P(x = 1) = 726 × 0.2634 = 191.2, i.e. about 191 packs
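As a cross-check, the expected count can be computed under a binomial model. The sketch assumes that each sachet leaks independently with probability 2/6 (the proportion observed in the sample) and that a pack holds 6 sachets; the function name binomial_pmf is illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

p_leak = 2 / 6          # estimated from the sample: 2 leaking sachets out of 6
sachets_per_pack = 6
packs = 726

p_exactly_one = binomial_pmf(1, sachets_per_pack, p_leak)
expected_packs = packs * p_exactly_one
```

Under these assumptions the probability of exactly one leaking sachet per pack is 192/729, which gives roughly 191 packs out of 726.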

Q2. What do you mean by errors in statistics? Mention the measures to do so.

A2.: In statistics and optimization, statistical errors and residuals are two closely related and easily confused measures of the deviation of a sample from its theoretical value. The error of a sample is the deviation of the sample from the (unobservable) true function value, while the residual of a sample is the difference between the sample and the estimated function value.

The distinction is most important in regression analysis, where it leads to the concept of studentized residuals.

Suppose there is a series of observations from a univariate distribution and we want to estimate the mean of that distribution (the so-called location model). In this case the errors are the deviations of the observations from the population mean, while the residuals are the deviations of the observations from the sample mean.

A statistical error is the amount by which an observation differs from its expected value, the latter being based on the whole population from which the statistical unit was chosen randomly. For example, if the mean height in a population of 21-year-old men is 1.75 meters, and one randomly chosen man is 1.80 meters tall, then the error is 0.05 meters; if the randomly chosen man is 1.70 meters tall, then the error is -0.05 meters. The expected value, being the mean of the entire population, is typically unobservable, and hence the statistical error cannot be observed either.

The nomenclature arose from random measurement errors in astronomy. It is as if the measurement of the man's height were an attempt to measure the population mean, so that any difference between the man's height and the mean would be a measurement error.

A residual (or fitting error), on the other hand, is an observable estimate of the unobservable statistical error. Consider the previous example with men's heights and suppose we have a random sample of n people. The sample mean could serve as a good estimator of the population mean. Then we have:

- The difference between the height of each man in the sample and the unobservable population mean is a statistical error, whereas

- The difference between the height of each man in the sample and the observable sample mean is a residual.

Note that the sum of the residuals within a random sample is necessarily zero, and thus the residuals are necessarily not independent. The statistical errors on the other hand are independent, and their sum within the random sample is almost surely not zero.
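The difference between errors and residuals can be seen numerically. In the sketch below the population mean of 1.75 meters comes from the example in the text, while the five sample heights are hypothetical values chosen only for illustration.

```python
population_mean = 1.75                       # from the example (normally unobservable)
heights = [1.80, 1.70, 1.78, 1.69, 1.83]     # hypothetical random sample, in meters

sample_mean = sum(heights) / len(heights)

# Statistical errors: deviations from the population mean.
errors = [h - population_mean for h in heights]

# Residuals: deviations from the sample mean.
residuals = [h - sample_mean for h in heights]

sum_errors = sum(errors)          # generally non-zero
sum_residuals = sum(residuals)    # zero up to floating-point rounding
```

Because the residuals must sum to zero, knowing any four of them fixes the fifth, which is why they cannot be independent.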

One can standardize statistical errors (especially of a normal distribution) in a z-score (or standard score), and standardize residuals in a t-statistic, or more generally studentized residuals.

Standard error of the mean

The standard error of the mean (SEM) is the standard deviation of the sample mean estimate of a population mean. (It can also be viewed as the standard deviation of the error in the sample mean relative to the true mean, since the sample mean is an unbiased estimator.) SEM is usually estimated by the sample estimate of the population standard deviation (sample standard deviation) divided by the square root of the sample size (assuming statistical independence of the values in the sample):

SE = s / √n

where

s is the sample standard deviation (i.e., the sample based estimate of the standard deviation of the population), and

n is the size (number of observations) of the sample.

This estimate may be compared with the formula for the true standard deviation of the mean:

SD = σ / √n

where

σ is the standard deviation of the population.
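A minimal Python sketch of the SEM estimate, s divided by the square root of n, is given below. The sample values are hypothetical; statistics.stdev computes the sample standard deviation with the n - 1 divisor, matching s as described above.

```python
import math
import statistics

def standard_error_of_mean(sample):
    """Estimate the SEM as s / sqrt(n), with s the sample standard deviation."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

sample = [1.80, 1.70, 1.78, 1.69, 1.83]   # hypothetical heights in meters
sem = standard_error_of_mean(sample)
```

Quadrupling the sample size while keeping s fixed halves the SEM, since the estimate shrinks with the square root of n.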

Note 1: Standard error may also be defined as the standard deviation of the residual error term.

Note 2: Both the standard error and the standard deviation of small samples tend to systematically underestimate the population standard error and deviations: the standard error of the mean is a biased estimator of the population standard error. With n = 2 the underestimate is about 25%, but for n = 6 the underestimate is only 5%. Gurland and Tripathi (1971) provide a correction and equation for this effect. Sokal and Rohlf (1981) give an equation of the correction factor for small samples of n [...]

[...] (-1.96, 1.96) critical values, H0 is rejected.

Conclusion: The mean is not 300 ml; that is, the machine is not functioning properly.

Q5. Out of 2000 people surveyed, 1200 belong to urban areas and the rest to semi-urban areas. Among 1000 who visited other regions, 800 belonged to urban areas. Test at 5% level of significance whether area and visiting other states are dependent.

A5.:

N = 2000

P = 800/1000 = 0.8

α = 0.05

P0 = 1200/2000 = 0.6

Q0 = 1 - P0 = 0.4

H0: P = 0.6 (area and visiting other states are independent)

H1: P ≠ 0.6 (area and visiting other states are dependent)

z = (P - P0) / √(P0 Q0 / n) ~ N(0,1)

= (0.8 - 0.6) / √(0.6 × 0.4 / 1000)

= 12.91

|z| = 12.91, which is not in the interval (-1.96, 1.96).

H0 is rejected.

Conclusion: Area and visiting other states are dependent.
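The test statistic above can be checked with a short Python sketch of the one-sample z-test for a proportion. The figures are those given in the question; the function name and the hard-coded two-sided 5% critical value of 1.96 are illustrative choices.

```python
import math

def one_proportion_z(p_hat, p0, n):
    """z statistic for testing H0: P = p0 against a sample proportion p_hat."""
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

p_hat = 800 / 1000     # urban share among the 1000 who visited other regions
p0 = 1200 / 2000       # urban share among all 2000 surveyed
n = 1000

z = one_proportion_z(p_hat, p0, n)
reject_h0 = abs(z) > 1.96      # two-sided test at the 5% level
```

The computed z is far outside (-1.96, 1.96), so the hypothesis P = 0.6 is rejected at the 5% level.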

Q6. How is sta tistics useful for mo de rn m an ag ers? Give exam p les and explain.

A6.: Modern managers often join agencies because they seek to serve and help their communities and country. Not surprisingly, some managers are puzzled by the suggestion of engaging in research and statistics: research appears boring in comparison with developing and implementing new programs, and statistics seems, well, impossibly challenging with little payoff in sight.

In fact, analytical techniques involving research and statistics are increasingly in demand. Many decisions that modern managers make involve data and analysis, one way or another. Consider the following common uses of analysis and data:

First, data and objective analysis often are used to describe and analyze problems, such as the magnitude of environmental disasters (for example, oil spills), the extent of social and public health problems (such as homelessness or the AIDS epidemic), the extent of lawlessness, the level of economic prosperity or stagnation, or the impact of weather-related problems such as those brought on by hurricanes and snowstorms. For example, it matters whether the illiteracy rate among 12-year-olds is 3 percent or 30 percent, or somewhere in between. By describing the extent of these problems and their underlying causes accurately, managers are able to better formulate effective strategies for dealing with them. Policy analysis often begins by describing the extent and characteristics of problems, and the factors associated with them.

Second, data are used to describe policies and programs. What are programs and policies expected to achieve? How many services are programs expected to provide? What are some milestones of achievement? How much will a program cost? These questions involve quantifiable answers, such as the number of national guardsmen that are brought in to assist with search and rescue efforts after a major hurricane, or the number of evacuees for whom officials expect to provide refuge. Policies and programs can be described in quite detailed ways, involving distinct program activities, the duration and geographic scope of activities, and staffing levels and area program budget data.

Third, programs produce much routine, administrative data that are used to monitor progress and prevent fraud. For example, hospitals produce a large amount of data about patient visits, who attended them, their diagnosis, billing codes, and so on. Schools produce vast amounts of data about student achievement, student conduct, extracurricular activities, support and administrative services, and so on. Regulatory programs produce data about inspections and compliance. In many states, gaming devices (such as slot machines) are monitored electronically to ensure that taxes are collected and that they are not tampered with. Managers are expected to be familiar with the administrative data in their lines of business.

Fourth, analysis is used to guide and improve program operations. Data can be brought to bear on problems that help managers choose among competing strategies. For example, what-if analysis might be used to determine the cost-effectiveness of alternative courses of action. Such analysis often is tailored to unique situations and problems. In addition, client and citizen surveys might be used to inform program priorities by assessing population needs and service satisfaction. Systematic surveys provide valid and objective assessments of citizen and client needs, priorities, and perceptions of programs and services. Systematic surveys of citizens and clients are used increasingly and are considered a valuable tool of modern management.

Fifth, data are used to evaluate outcomes. Legislatures and citizens want to know what return they are getting from their tax dollars. Did programs and policies achieve their aims? Did they produce any unexpected results? Most grant applications require modern managers to be accountable for program outcomes. Modern managers must demonstrate that their programs are producing effective outcomes and that they are doing so in cost-effective ways. This demand for outcome evaluation and monitoring far exceeds any requirement of proper funds management. Analysis can also be used to determine the impact of different conditions on program effectiveness, leading to suggestions for improving programs.

Data and analysis are omnipresent in programs and policies. They are there at every stage, from the inception of programs and policies to their very end. Of course, decisions are also based on personal observation, political consensus, anecdotal and impressionistic descriptions, and the ideologies of leaders. Yet data and analysis often are present, too, one way or another. This is because analysis is useful. Specifically, quantitative analysis aids in providing an objective, factual underpinning of situations and responses. Analysis, along with data, helps quantify the extent of problems and solutions in ways that other information seldom can. Analysis can help quantify the actual or likely impact of proposed strategies, for example, helping to determine their adequacy. At the very least, a focus on facts and objective analysis might reduce judgment errors stemming from overly impressionistic or subjective perceptions that are factually incorrect. So managers are expected to bring data and analysis to the decision-making table.
