Pearson Curves Frequency


  • 8/6/2019 Pearson Curves Frequency


    GENERALISATION OF SOME TYPES OF THE FREQUENCY CURVES OF PROFESSOR PEARSON.

    By PROFESSOR V. ROMANOVSKY (of the University of Turkestan).

    I. It is well known that the frequency curves of Prof. Pearson are derived from the differential equation

    $$\frac{1}{y}\frac{dy}{dx} = \frac{x + c}{c_0 + c_1 x + c_2 x^2} \qquad (1).$$

    Denoting by $x_0$ the mean of the values of $x$ given by some observations, and by $\mu_1, \mu_2, \mu_3, \ldots$, as usual, their successive moment-coefficients about $x_0$, $y$ denoting the frequencies of the values of $x$, we shall obtain from (1) the following equations, which give us the constants $c$, $c_0$, $c_1$, $c_2$ in terms of $\mu_1 = 0$, $\mu_2$, $\mu_3$, $\mu_4$:

    […]

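    Let now

    $$\beta_1 = \frac{\mu_3^2}{\mu_2^3}, \qquad \beta_2 = \frac{\mu_4}{\mu_2^2}, \qquad k = \frac{\beta_1(\beta_2 + 3)^2}{4(2\beta_2 - 3\beta_1 - 6)(4\beta_2 - 3\beta_1)}.$$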
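The quantities just defined are easily computed from the moments. The following is a minimal sketch, assuming the usual Pearson definitions $\beta_1 = \mu_3^2/\mu_2^3$, $\beta_2 = \mu_4/\mu_2^2$ and the criterion $k = \beta_1(\beta_2+3)^2 / [4(2\beta_2 - 3\beta_1 - 6)(4\beta_2 - 3\beta_1)]$; the function name `pearson_criteria` is hypothetical and is not taken from the paper.

```python
def pearson_criteria(mu2, mu3, mu4):
    """Return (beta1, beta2, k) from the moments mu2, mu3, mu4 about the mean.

    Assumes the usual Pearson definitions of beta1, beta2 and the criterion k.
    """
    beta1 = mu3 ** 2 / mu2 ** 3
    beta2 = mu4 / mu2 ** 2
    denom = 4 * (2 * beta2 - 3 * beta1 - 6) * (4 * beta2 - 3 * beta1)
    # k is undefined when the denominator vanishes (for instance the normal
    # curve, beta1 = 0 and beta2 = 3); return None rather than divide by zero.
    k = None if denom == 0 else beta1 * (beta2 + 3) ** 2 / denom
    return beta1, beta2, k


if __name__ == "__main__":
    # Illustrative moments only, not data from the paper.
    print(pearson_criteria(1.0, 0.5, 3.5))
```

The values of $\beta_1$, $\beta_2$ and $k$ returned here are the constants by which the type of the curve is selected.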

    Then we know that we shall have the following seven types of the frequency curves due to Prof. Pearson, obtained according to the various corresponding values of the constants $\beta_1$, $\beta_2$ and $k$:

    […]


    […] having the properties:

    (a) $u_0, u_1, u_2, \ldots$ are certain definite one-valued functions of $x$, and $A_0, A_1, A_2, \ldots$ are certain constants.

    (b) Abridging the series on the right-hand side of the above equation and retaining only the first member of it, we shall obtain the equation

    $$y = A_0 u_0 \qquad (3a),$$

    which represents one of the curves I, II, III or VII, according to the nature of the series of functions $u_0, u_1, u_2, \ldots$

    (c) Retaining more members in the abridged series, we shall have equations of the form

    $$y = A_0 u_0 + A_1 u_1 + \ldots + A_s u_s \qquad (3b),$$

    which go always closer to the law of distribution we are considering as $s$ increases (we suppose, of course, that this law exists).

    For Types IV, V and VI it was not possible to find a series which possessed the same properties as those for Types I, II, III and VII.

    II. Let us consider now the curve of Type I:

    […]

    The constants of this curve are to be calculated as follows. Let

    […] (5).

    Then

    […] (6),

    $S_0$ being the total frequency of the distribution considered, or $S_0 = \int_{-a}^{b} y\,dx$. These values of $\nu$, $a$, $b$, $y_0$ are obtained from the equations

    […] (7)

    and

    $$S_0 = \int_{-a}^{b} y\,dx \qquad (8).$$


    We shall now consider the function

    $$u_0 = (a + x)^{\alpha}(b - x)^{\beta} \qquad (9),$$

    where

    $$\alpha > -1, \qquad \beta > -1 \qquad (10),$$

    and $x$ is an independent real variable. This function is well defined, one-valued and continuous in the interval $(-a, b)$.

    Denoting the $k$th derivative of $u_0$ by $D^k u_0$, we introduce the functions

    $$u_k = D^k (a + x)^{\alpha + k}(b - x)^{\beta + k} \qquad (k = 1, 2, 3, \ldots) \qquad (11).$$

    It is easy to show that

    $$u_k = u_0\,\phi_k \qquad (12),$$

    where

    $$\phi_k = (\alpha + 1)_k (b - x)^k - k\,(\alpha + 2)_k (\beta + k)_k (b - x)^{k-1}(a + x) + \ldots \qquad (13),$$

    $$(\alpha + t)_k = (\alpha + t)(\alpha + t + 1)\cdots(\alpha + k), \qquad (\beta + h)_k = (\beta + h)(\beta + h + 1)\cdots(\beta + k).$$

    The polynomials $\phi_k$ are well known: they are the polynomials of Jacobi in slightly altered form.

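    Let us show that

    $$\int_{-a}^{b} u_0\,\phi_h\,\phi_k\,dx = \begin{cases} 0 & (h \neq k), \\ J_k & (h = k), \end{cases} \qquad (15)$$

    $J_k$ being a quite definite constant, depending on $a$, $b$, $\alpha$, $\beta$.

    Consider first the case $h \neq k$. Then we can put $h < k$ and write

    $$\int_{-a}^{b} u_0\,\phi_h\,\phi_k\,dx = \int_{-a}^{b} \phi_h\, D^k (a + x)^{\alpha + k}(b - x)^{\beta + k}\,dx.$$

    Integrating by parts and noting that

    $$\left[\phi_h\, D^{k-1} (a + x)^{\alpha + k}(b - x)^{\beta + k}\right]_{-a}^{b} = 0,$$

    we shall find

    $$\int_{-a}^{b} u_0\,\phi_h\,\phi_k\,dx = -\int_{-a}^{b} D\phi_h \cdot D^{k-1} (a + x)^{\alpha + k}(b - x)^{\beta + k}\,dx.$$

    New integration by parts gives us

    $$\int_{-a}^{b} u_0\,\phi_h\,\phi_k\,dx = \int_{-a}^{b} D^2\phi_h \cdot D^{k-2} (a + x)^{\alpha + k}(b - x)^{\beta + k}\,dx,$$

    and so on. Finally we come to

    $$\int_{-a}^{b} u_0\,\phi_h\,\phi_k\,dx = (-1)^{h+1}\int_{-a}^{b} D^{h+1}\phi_h \cdot D^{k-h-1} (a + x)^{\alpha + k}(b - x)^{\beta + k}\,dx = 0,$$

    $\phi_h$ being a polynomial of degree $h$ and, therefore, $D^{h+1}\phi_h$ being identically zero.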
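The vanishing of these integrals for $h \neq k$ can be checked numerically. The sketch below assumes $u_0 = (a+x)^{\alpha}(b-x)^{\beta}$ and $u_k = D^k (a+x)^{\alpha+k}(b-x)^{\beta+k}$ as in (11), and obtains $\phi_k = u_k/u_0$ by Leibniz's rule for the $k$th derivative of a product. The names `falling`, `phi` and `inner` are hypothetical, Simpson's rule is merely a convenient choice of quadrature, and $\alpha, \beta \ge 0$ is assumed so that the integrand is finite at the ends of the range.

```python
from math import comb


def falling(x, n):
    """Falling factorial x (x - 1) ... (x - n + 1); equals 1 for n = 0."""
    out = 1.0
    for i in range(n):
        out *= x - i
    return out


def phi(k, x, a, b, alpha, beta):
    """phi_k = u_k / u_0, with the k-th derivative expanded by Leibniz's rule."""
    total = 0.0
    for j in range(k + 1):
        # j derivatives fall on (a + x)^(alpha + k), k - j on (b - x)^(beta + k).
        total += (comb(k, j) * (-1) ** (k - j)
                  * falling(alpha + k, j) * falling(beta + k, k - j)
                  * (a + x) ** (k - j) * (b - x) ** j)
    return total


def inner(h, k, a, b, alpha, beta, n=4000):
    """Composite Simpson estimate of the integral of u_0 phi_h phi_k over (-a, b).

    n must be even; alpha, beta >= 0 is an assumption of this sketch.
    """
    def f(x):
        u0 = (a + x) ** alpha * (b - x) ** beta
        return u0 * phi(h, x, a, b, alpha, beta) * phi(k, x, a, b, alpha, beta)

    step = (a + b) / n
    s = f(-a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(-a + i * step)
    return s * step / 3


if __name__ == "__main__":
    # Illustrative constants only, not values from the paper.
    a, b, alpha, beta = 1, 2, 1, 2
    for h in range(3):
        print([round(inner(h, k, a, b, alpha, beta), 6) for k in range(3)])
```

With the illustrative constants above, the off-diagonal entries of the printed table should be negligible beside the diagonal ones, the latter being numerical values of the constants $J_k$.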


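    Secondly, let $h = k$. Then, integrating $k$ times by parts, we can obtain

    $$\int_{-a}^{b} u_0\,\phi_k^2\,dx = (-1)^k D^k\phi_k \cdot J_k',$$

    where

    $$J_k' = \int_{-a}^{b} (a + x)^{\alpha + k}(b - x)^{\beta + k}\,dx.$$

    […]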


    Let us write

    $$y = \sum_{k=0}^{\infty} A_k u_k \qquad (21),$$

    and suppose that the series on the right-hand side is uniformly convergent in the interval $(-a, b)$. Then we can integrate this series by terms after having multiplied it by any of the polynomials $\phi_0 = 1, \phi_1, \phi_2, \ldots$:

    $$\int_{-a}^{b} y\,\phi_k\,dx = \int_{-a}^{b} \sum_{h=0}^{\infty} A_h u_h\,\phi_k\,dx = \sum_{h=0}^{\infty} A_h \int_{-a}^{b} u_0\,\phi_h\,\phi_k\,dx = A_k J_k,$$

    according to (15). Suppose that we can evaluate the integral $\int_{-a}^{b} y\,\phi_k\,dx$ and that we denote it by $S_k$. Then we get

    $$A_k = \frac{S_k}{J_k} \qquad (k = 0, 1, 2, \ldots) \qquad (22),$$

    and our expansion (21) takes the form

    $$y = \frac{S_0}{J_0}u_0 + \frac{S_1}{J_1}u_1 + \frac{S_2}{J_2}u_2 + \ldots \qquad (23),$$

    where

    $$S_k = \int_{-a}^{b} y\,\phi_k\,dx \qquad (24).$$

    Remembering (19), we can write (23) in the form

    […] (25),


    which will approximate indefinitely to the function $y$ if we increase the number of members retained. So, at any rate theoretically, we can fit as closely as we please any given distribution, taking the expansion (25) and retaining therein a sufficient number of members. This procedure may be expected to be the more successful if the distribution in question is of Type I. But to this remark we shall return later, and we shall consider now

    […] (26),

    the first approximation to the given function $y$.
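The determination of the coefficients by $A_k = S_k/J_k$ may be illustrated on a function whose expansion is known beforehand. The sketch below assumes $u_0 = (a+x)^{\alpha}(b-x)^{\beta}$, $S_k = \int_{-a}^{b} y\,\phi_k\,dx$ and $J_k = \int_{-a}^{b} u_0\,\phi_k^2\,dx$; the names `phi`, `simpson` and `coefficients`, the test function and its constants are hypothetical, and the quadrature by Simpson's rule is our own choice, with $\alpha, \beta \ge 0$ assumed.

```python
from math import comb


def falling(x, n):
    """Falling factorial x (x - 1) ... (x - n + 1); equals 1 for n = 0."""
    out = 1.0
    for i in range(n):
        out *= x - i
    return out


def phi(k, x, a, b, alpha, beta):
    """phi_k = u_k / u_0, obtained from Leibniz's rule applied to u_k."""
    return sum(comb(k, j) * (-1) ** (k - j)
               * falling(alpha + k, j) * falling(beta + k, k - j)
               * (a + x) ** (k - j) * (b - x) ** j
               for j in range(k + 1))


def simpson(f, lo, hi, n=4000):
    """Composite Simpson rule on (lo, hi) with an even number n of intervals."""
    step = (hi - lo) / n
    s = f(lo) + f(hi)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(lo + i * step)
    return s * step / 3


def coefficients(y, s, a, b, alpha, beta):
    """Return [A_0, ..., A_s] with A_k = S_k / J_k."""
    out = []
    for k in range(s + 1):
        S_k = simpson(lambda x: y(x) * phi(k, x, a, b, alpha, beta), -a, b)
        J_k = simpson(lambda x: (a + x) ** alpha * (b - x) ** beta
                      * phi(k, x, a, b, alpha, beta) ** 2, -a, b)
        out.append(S_k / J_k)
    return out


if __name__ == "__main__":
    # Hypothetical test function y = u_0 (0.5 phi_0 + 0.02 phi_1 - 0.001 phi_2).
    a, b, alpha, beta = 1, 2, 1, 2
    true = [0.5, 0.02, -0.001]

    def y(x):
        u0 = (a + x) ** alpha * (b - x) ** beta
        return u0 * sum(c * phi(k, x, a, b, alpha, beta)
                        for k, c in enumerate(true))

    print(coefficients(y, 2, a, b, alpha, beta))
```

Since $y$ was built from three members of the series, the three coefficients returned should agree with those put in, up to the error of the quadrature.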

    We can write (26) as follows:

    […]

    if we remember that

    […]

    and attend also to the fundamental property of the $\Gamma$-function. Further we shall suppose that

    […]


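    Now consider

    $$\int_{-a}^{b} (a + x)^k u_h\,dx = \int_{-a}^{b} (a + x)^k D^h (a + x)^{\alpha + h}(b - x)^{\beta + h}\,dx$$

    for $h \le k$. Integrating by parts, we easily obtain

    $$\int_{-a}^{b} (a + x)^k u_h\,dx = (-1)^h k(k-1)\cdots(k-h+1)\int_{-a}^{b} (a + x)^{k-h}(a + x)^{\alpha + h}(b - x)^{\beta + h}\,dx$$

    $$= (-1)^h k(k-1)\cdots(k-h+1)\int_{-a}^{b} (a + x)^{\alpha + k}(b - x)^{\beta + h}\,dx$$

    $$= (-1)^h k(k-1)\cdots(k-h+1)\,(a + b)^{\alpha + \beta + h + k + 1}\int_0^1 t^{\alpha + k}(1 - t)^{\beta + h}\,dt,$$

    or

    $$\int_{-a}^{b} (a + x)^k u_h\,dx = (-1)^h k(k-1)\cdots(k-h+1)\,(a + b)^{\alpha + \beta + h + k + 1}\,\frac{\Gamma(\alpha + k + 1)\,\Gamma(\beta + h + 1)}{\Gamma(\alpha + \beta + h + k + 2)}.$$

    Remembering further the value of […] in

    […]

    it is easy to see that for $0 \le h \le k$

    […]

    Summing these values from $h = 0$ to $h = k$, we get

    […] (34),

    […] (35).

    Whence

    […] (36).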


    Note now that the $\mu_k'$'s are moments about $x = -a$ and the $\mu_k$'s are moments about $x = x_0$, and therefore are connected by the relation

    $$\mu_k' = (a + x_0)^k + k\,(a + x_0)^{k-1}\mu_1 + \binom{k}{2}(a + x_0)^{k-2}\mu_2 + \ldots + \mu_k,$$

    whence, as $\mu_1 = 0$,

    $$\mu_1' = a + x_0, \qquad \mu_2' = (a + x_0)^2 + \mu_2, \qquad \mu_3' = (a + x_0)^3 + 3(a + x_0)\mu_2 + \mu_3,$$

    $$\mu_4' = (a + x_0)^4 + 6(a + x_0)^2\mu_2 + 4(a + x_0)\mu_3 + \mu_4.$$

    Substituting herein in the place of $\mu_2$, $\mu_3$, $\mu_4$ their values from (7), we shall obtain

    […]

    Multiplying these equations by $S_0$ and then subtracting from (36) correspondingly, we find:

    […]

    Then we see that really

    $$S_1 = S_2 = S_3 = S_4 = 0,$$

    if the constants $a$, $b$, $\alpha$, $\beta$ in $u_0$ are determined as they are for the frequency curve of Type I.
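The passage from moments about the mean to moments about the start of the range, used in the foregoing paragraph, is a simple binomial transfer. A minimal sketch follows, assuming that the moments about $x = -a$ are obtained from those about the mean $x_0$ by expanding $\{(x - x_0) + (a + x_0)\}^k$; the function name `moments_about` and the numerical values are hypothetical.

```python
from math import comb


def moments_about(d, central):
    """Moments about a point lying a distance d below the mean.

    central[i] is the i-th moment about the mean, so central[0] = 1 and
    central[1] = 0. For the text above, d = a + x_0.
    """
    return [sum(comb(k, i) * d ** (k - i) * central[i] for i in range(k + 1))
            for k in range(len(central))]


if __name__ == "__main__":
    # Illustrative values: d = 2, mu_2 = 3, mu_3 = 1, mu_4 = 30.
    print(moments_about(2, [1, 0, 3, 1, 30]))
```

For $k = 4$ the sum reduces to $d^4 + 6d^2\mu_2 + 4d\mu_3 + \mu_4$, the term in $\mu_1$ dropping out because $\mu_1 = 0$.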

    V. We see that the expansion (25) fulfils all the statements (a), (b), (c) made in paragraph I. In particular, we see that, choosing the constants appropriately, we can bring it into the form

    […]

    The applications of this result depend on the evaluation of moments of orders 5, 6 and so on, which is not always desirable seeing the high probable errors of these moments. But in some cases, if we abridge the series on the right-hand side of our equation to two or three terms, the equations obtained may be of use. To this point I shall return in a later article, when I have the time for the necessary calculations.

    I would make another remark about the expansion (25):

    […]

    Biometrika XVI


    One knows that by evaluating the constants $a$, $b$, $\alpha$, $\beta$ entering in the coefficients of this series according to the relations (6), one finds as a rule non-integer numbers, which must be taken with many decimals in order to secure the needed accuracy of the results to be drawn from the equation of the curve. I now suggest that one can use the expansion (25) in this way: we may take, instead of the usual values of $a$, $b$, $\alpha$, $\beta$, those calculated to their nearest integer. Then

    […]

    for this curve is only a special case of the foregoing one: make $a = b$, $\alpha = \beta = m$ and we obtain (37). On this assumption the expansion (25) gives a generalised form of the curve (37). If we abridge it to its first member, we shall obtain (37). If we choose the constants as follows:

    $$m = \frac{5\beta_2 - 9}{2(3 - \beta_2)}, \qquad […]$$

    then in the expansion (25), duly rewritten, $S_1$, $S_2$, $S_3$ and $S_4$ will be zero.

    VII. Consider now the curves of Type III:

    $$y = y_0\left(1 + \frac{x}{a}\right)^{\nu a} e^{-\nu x} \qquad (38).$$

    They are fitted to the distribution given by aid of the relations

    […] (39),

    where

    $$\alpha = \nu a \qquad (40).$$

    We shall take in this case

    $$u_0 = (a + x)^{\alpha} e^{-\nu x}, \qquad u_k = D^k (a + x)^{\alpha + k} e^{-\nu x} \qquad (41),$$

    where

    $$\alpha > -1, \qquad \nu > 0 \qquad (42).$$

    We find

    $$u_k = u_0\,\phi_k \qquad (43),$$

    […] (44).

    It is known that the polynomials defined by the relation

    […]

    are the polynomials of Laguerre. Thus the $\phi_k$ are the same polynomials in generalised form.
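For the range $(-a, \infty)$ the corresponding integrals can be evaluated in closed form, every term being a $\Gamma$-integral. The sketch below assumes $u_0 = (a+x)^{\alpha}e^{-\nu x}$ and $u_k = D^k (a+x)^{\alpha+k}e^{-\nu x}$ as in (41), and writes $\phi_k = u_k/u_0$ as a polynomial in $t = a + x$ by Leibniz's rule. The names `phi_coeffs` and `inner` and the numerical constants are hypothetical.

```python
from math import comb, exp, gamma


def falling(x, n):
    """Falling factorial x (x - 1) ... (x - n + 1); equals 1 for n = 0."""
    out = 1.0
    for i in range(n):
        out *= x - i
    return out


def phi_coeffs(k, alpha, nu):
    """Coefficients c[m] of phi_k = sum of c[m] t^m, where t = a + x."""
    c = [0.0] * (k + 1)
    for j in range(k + 1):
        # j derivatives fall on t^(alpha + k), k - j on exp(-nu x).
        c[k - j] += comb(k, j) * falling(alpha + k, j) * (-nu) ** (k - j)
    return c


def inner(h, k, a, alpha, nu):
    """Integral of u_0 phi_h phi_k over (-a, infinity), term by term."""
    ch, ck = phi_coeffs(h, alpha, nu), phi_coeffs(k, alpha, nu)
    total = 0.0
    for p, cp in enumerate(ch):
        for q, cq in enumerate(ck):
            m = alpha + p + q
            # Integral of t^m exp(-nu t) over (0, infinity) is
            # Gamma(m + 1) / nu^(m + 1).
            total += cp * cq * gamma(m + 1) / nu ** (m + 1)
    # exp(-nu x) = exp(nu a) exp(-nu t), since x = t - a.
    return exp(nu * a) * total


if __name__ == "__main__":
    # Illustrative constants only, not values from the paper.
    a, alpha, nu = 1.5, 2.3, 0.8
    for h in range(3):
        print([round(inner(h, k, a, alpha, nu), 6) for k in range(3)])
```

With these illustrative constants the off-diagonal entries should vanish to within rounding, so that the $\phi_k$ form an orthogonal system on $(-a, \infty)$ with the weight $u_0$.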


    By the same method as was described above we easily find that

    $$\int_{-a}^{\infty} u_0\,\phi_h\,\phi_k\,dx = 0 \qquad (h, k = 0, 1, 2, \ldots;\ h \neq k) \qquad (45),$$

    […] (46).

    We can show that for a certain class of functions $y$ of the independent variable $x$, one-valued and definite in the interval $(-a, \infty)$, an expansion of the form

    $$y = A_0 u_0 + A_1 u_1 + A_2 u_2 + \ldots$$

    exists, uniformly convergent in the above interval.

    Among these functions are the continuous functions, i.e., of course, continuous in the interval $(-a, \infty)$. The coefficients $A_0, A_1, A_2, \ldots$ of this series are readily obtainable: multiply both sides by $\phi_k$ and integrate from $-a$ to $\infty$, then

    […] (47),

    where

    $$S_k = \int_{-a}^{\infty} y\,\phi_k\,dx \qquad (48).$$

    Thus we have

    […] (49).

    It is easy to show that for the constants $\nu$, $\alpha$, $a$ defined by the aid of equations (39) and (40), we shall have

    $$S_1 = S_2 = S_3 = 0 \qquad (50),$$

    and, therefore, the expansion (49) takes in this case the form

    […]

    If we abridge this infinite series and take only its first member, we shall obtain the frequency curve of Prof. Pearson's Type III.

    We can repeat now all the remarks that were made in paragraph V. We may add to them that in the case of distributions of Type III we work under more favourable conditions than in the case of distributions of Types I or II. Indeed, for the evaluation of constants in the present case we need first only three moments of distribution. The fourth one can be evaluated in most cases with sufficient accuracy, and thus $S_4$ can be obtained with accuracy of the same order; therefore we can expect a better fit to the data by aid of the curve

    […] (52).