
Efficient Derivative Codes through Automatic Differentiation and Interface Contraction: An Application in Biostatistics*

by

Paul Hovland, Christian Bischof, Donna Spiegelman, and Mario Casella

Abstract

Developing code for computing the first- and higher-order derivatives of a function by hand can

be very time-consuming and is prone to errors. Automatic differentiation has proven capable of producing derivative codes with very little effort on the part of the user. Automatic differentiation avoids the truncation errors characteristic of divided-difference approximations. However, the derivative code produced by automatic differentiation can be significantly less efficient than one produced by hand. This shortcoming may be overcome by utilizing insight into the high-level structure of a computation. This paper focuses on how to take advantage of the fact that the number of variables passed between subroutines frequently is small compared with the number of variables with respect to which we wish to differentiate. Such an interface contraction, coupled with the associativity of the chain rule for differentiation, allows us to apply automatic differentiation in a more judicious fashion, resulting in much more efficient code for the computation of derivatives. A case study involving a program for maximizing a logistic-normal likelihood function developed from a problem in nutritional epidemiology is examined, and performance figures are presented. We conclude with some directions for future study.

1 Introduction

Many problems in computational science require the evaluation of a mathematical function, as well as the derivatives of that function with respect to certain independent variables. Automatic differentiation provides a mechanism for the automatic generation of code for the computation of derivatives, using the program for the evaluation of the function as input [9,19]. However, when automatic differentiation is applied without insight into the program being processed, the derivative computation can be almost as expensive as divided differences, especially if the so-called forward mode is being used. Nonetheless, in Section 4, we show that if the user has high-level knowledge about the structure of a program, automatic differentiation can be employed more judiciously, resulting in codes whose performance rivals those produced by many person-hours of hand-coding. In particular, we show how one can exploit interface contraction, that is, instances where the number of variables passed between subroutines is small compared with the number of variables with respect to which

* This work was supported by the Office of Scientific Computing, U.S. Department of Energy, under Contract W-31-109-Eng-38; the National Aerospace Agency under Purchase Order L25935D; the U.S. Department of Defense through an NDSEG fellowship; the Centers for Disease Control through Cooperative Agreement K01 OH00106; the National Institutes of Health through Cooperative Agreement R01 CA50597; and the National Science Foundation through NSF Cooperative Agreement No. CCR-9120008.

Mathematics and Computer Science Division, Argonne National Laboratory, 9700 S. Cass Avenue, Argonne, IL 60439, {hovland,bischof}@mcs.anl.gov.

Departments of Epidemiology and Biostatistics, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, stdls@gauss.med.harvard.edu.

Department of Epidemiology, Harvard School of Public Health, 677 Huntington Avenue, Boston, MA 02115, [email protected].

derivatives are desired. Combining our knowledge of this interface contraction with the chain rule for differentiation enables us to compute derivatives considerably more efficiently than is possible using "black-box" automatic differentiation. We discuss this technique and describe how it was applied to a program segment that maximizes a likelihood function used for statistical analyses investigating the link between dietary intake and breast cancer.

The organization of this paper is as follows. The next section provides a brief introduction to automatic differentiation and explains some of its advantages in comparison with other techniques used for computing derivatives. Section 3 presents interface contraction and explains how it can be used to reduce the computational cost of a derivative computation. Sections 4 and 5 describe the results of our experiments using the application from biostatistics. Section 6 provides a brief summary and discusses possible directions for future study.

2 Automatic Differentiation

Traditionally, scientists who wish to compute the derivatives of a function have chosen one of two approaches: derive an analytic expression for the derivatives and implement this expression as a computer program, or approximate the derivatives using divided differences, for example,

\[ \frac{\partial f}{\partial x} \approx \frac{f(x+h) - f(x)}{h} \]

for small h. The former approach suffers from being tedious and prone to errors, while the latter can produce large errors if the size of the perturbation h is not carefully chosen; even in the best case, half of the significant digits will be lost. For problems of limited size, symbolic manipulators, such as Maple [7], are available. These programs can simplify the task of deriving an expression for derivatives and converting this expression into code, but they are typically unable to handle functions that are large or contain branches, loops, or subroutines.

An alternative to these techniques is automatic differentiation [9,19]. Automatic differentiation techniques rely on the fact that every function, no matter how complicated, is executed on a computer as a sequence of elementary operations, such as addition and multiplication, and elementary functions, such as square root and log. By applying the chain rule, for example,

\[ \frac{\partial}{\partial x} f(g(x)) = \frac{\partial f}{\partial g} \cdot \frac{\partial g}{\partial x}, \]

repeatedly to the composition of those elementary operations, one can compute derivatives of f exactly and in a completely mechanical fashion.

For example, the short code segment

    y = sin(x)
    z = y*x + 5

could be augmented to compute derivatives as

    y = sin(x)
    ∇y = cos(x)*∇x
    z = y*x + 5
    ∇z = y*∇x + x*∇y

where ∇var is a vector representing the derivatives of var with respect to the independent variable(s). Thus, if x is the scalar independent variable, then ∇x is equal to 1 and ∇z represents dz/dx. This example uses the so-called forward mode of automatic differentiation, wherein derivatives of intermediate variables with respect to the independent variables are propagated. There is also a reverse mode of automatic differentiation, which propagates derivatives of the dependent variables with respect to the intermediate variables.
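For contrast, here is a minimal sketch of what the reverse mode would produce for the same segment. This sketch is our illustration, not taken from the paper: it assumes z is the sole dependent variable, and the adjoint names zbar, ybar, and xbar are our notation for the derivative of z with respect to each variable:

    y = sin(x)
    z = y*x + 5
    zbar = 1
    ybar = x * zbar
    xbar = y * zbar + cos(x) * ybar

After the reverse sweep, xbar holds dz/dx = y + x*cos(x), obtained at a cost proportional to that of the function itself, independent of the number of independent variables.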


Several tools have been developed that use automatic differentiation for the computation of derivatives [16]. In particular, we mention GRESS [13], PADRE2 [17], Odyssee [20], and ADIFOR [2] for Fortran programs and ADOL-C [11] and ADIC [4] for C programs. We employed the ADIFOR tool in our experiments.

ADIFOR (Automatic Differentiation of Fortran) [2] provides automatic differentiation for programs written in Fortran 77. Given a Fortran subroutine (or collection of subroutines) describing a function, and an indication of which variables in parameter lists or common blocks correspond to independent and dependent variables with respect to differentiation, ADIFOR produces portable Fortran 77 code that allows the computation of the derivatives of the dependent variables with respect to the independent ones.

3 Interface Contraction

Automatic differentiation tools such as ADIFOR produce derivative code that typically outperforms divided-difference approximations (see, for example, [1,3,5,6,18]) but, not surprisingly, is usually much less efficient than a hand-derived code probably could be (in most cases, no hand-derived derivative code was available for comparison). We introduce a technique, called interface contraction, that can dramatically reduce the runtime and storage requirements for computing derivatives via automatic differentiation. This technique takes advantage of a programmer's understanding of which subroutines encapsulate the majority of the computation and knowledge of the number of variables passed to these subroutines. In counting the number of variables, we count the number of scalar variables; that is, a 10 x 10 array would be counted as 100 variables. We now introduce interface contraction in detail.

Consider a function f : R^n -> R^p, and denote its cost by C(f). The cost of a computation is usually measured in terms of memory and floating-point operations, and the following argument applies to either. If the derivatives of f(x) with respect to x, ∂f/∂x, are computed using the so-called forward mode of automatic differentiation, the additional cost of computing these derivatives is approximately n x C(f), because the derivative code includes operations on vectors of length n. Now, suppose that f(x) can be interpreted as the composition of two functions h : R^m -> R^p and g : R^n -> R^m such that f(x) = h(g(x)). To compute ∂g/∂x and ∂h/∂g independently, the computational cost is approximately n C(g) and m C(h), respectively. Thus, the total cost of computing ∂f/∂x is n C(g) + m C(h) plus the cost of performing the matrix multiplication (∂h/∂g)(∂g/∂x). The cost of the matrix multiplication will usually be small compared with the cost of computing the partial derivatives. If m < n, then the cost of computing derivatives is reduced by using this method. Furthermore, if C(h) >> C(g), this method has a cost of approximately m C(f), which may be substantially less than n C(f). The value of m is said to be the width of the interface between g and h. Hence, if m is less than n, we have what we call interface contraction. An even more pronounced effect can be seen in the case of Hessians, since the cost of computing derivatives using the forward mode is quadratic in the number of independent variables. Hence, the potential speedup due to interface contraction is (n/m)^2 rather than just n/m.
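In display form, the comparison reads (our restatement of the cost argument above, not an equation from the original text):

\[
\underbrace{n\,C(g) + m\,C(h)}_{\text{forward mode with interface contraction}}
\quad\text{versus}\quad
\underbrace{n\,\bigl(C(g) + C(h)\bigr)}_{\text{black-box forward mode}},
\]

so when C(h) dominates, the ratio of the two costs approaches m/n for gradients, and (m/n)^2 for Hessians.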

We note that a similar approach to interface contraction was mentioned by Iri [15] as a "vertex cut" when considering automatic differentiation as applied to a computational graph representation of a program.


3.1 Microscopic Interface Contraction

A simple case of interface contraction occurs for every complex assignment statement in a program. Consider a simple example, the following statement, which computes the product of five terms:

    f = y(1) * y(2) * y(3) * y(4) * y(5)



Using temporaries r1, r2, and r3, we could rewrite this as

    r1 = y(1) * y(2)
    r2 = y(3) * r1
    r3 = y(4) * r2
    f  = y(5) * r3

and apply the forward mode of automatic differentiation to yield, for n independent variables,

    r1 = y(1) * y(2)
    ∇r1(1:n) = y(1) * ∇y(1:n,2) + y(2) * ∇y(1:n,1)
    r2 = y(3) * r1
    ∇r2(1:n) = y(3) * ∇r1(1:n) + r1 * ∇y(1:n,3)
    r3 = y(4) * r2
    ∇r3(1:n) = y(4) * ∇r2(1:n) + r2 * ∇y(1:n,4)
    f = y(5) * r3
    ∇f(1:n) = y(5) * ∇r3(1:n) + r3 * ∇y(1:n,5)

But, if we notice that this single statement is a scalar function of a vector y ∈ R^5, which is itself a function of n independent variables, we have the situation described above, where p = 1, m = 5, and n = n. Thus, if we compute ∂f/∂y first, we can compute ∇f = (∂f/∂y) x ∇y more efficiently. For computing the derivatives of scalar functions, the reverse mode of automatic differentiation is more efficient than the forward mode [9,19], so we can use it to compute the values of ∂f/∂y. The code for computing ∂f/∂y using the reverse mode is

    r1 = y(1) * y(2)
    r2 = r1 * y(3)
    r3 = r2 * y(4)
    f  = r3 * y(5)
    y5bar = r3
    r2bar = y(4) * y(5)
    y4bar = r2 * y(5)
    r1bar = y(3) * r2bar
    y3bar = r1 * r2bar
    y2bar = y(1) * r1bar
    y1bar = y(2) * r1bar

where each yibar represents ∂f/∂y(i). We can then perform the matrix multiplication

    ∇f(1:n) = y1bar * ∇y(1:n,1) + y2bar * ∇y(1:n,2)
            + y3bar * ∇y(1:n,3) + y4bar * ∇y(1:n,4)
            + y5bar * ∇y(1:n,5)

This hybrid mode of automatic differentiation is employed by ADIFOR [2]. We see that interface contraction is mainly responsible for the lower complexity of ADIFOR-generated code compared with divided-difference approximations. We also note that, for a moderate number of variables on the right-hand side, we would still come out ahead if we used the forward mode to compute ∂f/∂y instead of the reverse mode.
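The pieces above can be assembled into one self-contained routine. The following Fortran 77 sketch is our own illustration of the hybrid scheme; the name dprod5 and its argument list are hypothetical and are not ADIFOR output:

      subroutine dprod5(n, y, gy, f, gf)
c     Compute f = y(1)*y(2)*y(3)*y(4)*y(5) and its gradient gf(1:n)
c     with respect to n independent variables, given gy(1:n,1:5),
c     the derivatives of y with respect to those variables.
      integer n, i, j
      real y(5), gy(n,5), f, gf(n)
      real r1, r2, r3, r1bar, r2bar, dfdy(5)
c     forward sweep through the product
      r1 = y(1) * y(2)
      r2 = r1 * y(3)
      r3 = r2 * y(4)
      f  = r3 * y(5)
c     reverse sweep: dfdy(i) holds the partial of f w.r.t. y(i)
      dfdy(5) = r3
      r2bar   = y(4) * y(5)
      dfdy(4) = r2 * y(5)
      r1bar   = y(3) * r2bar
      dfdy(3) = r1 * r2bar
      dfdy(2) = y(1) * r1bar
      dfdy(1) = y(2) * r1bar
c     interface contraction: gf = gy * dfdy  (n x 5 times 5 x 1)
      do 20 i = 1, n
         gf(i) = 0.0
         do 10 j = 1, 5
            gf(i) = gf(i) + dfdy(j) * gy(i,j)
   10    continue
   20 continue
      return
      end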

3.2 Macroscopic Interface Contraction

An analogous situation exists for larger program units, in particular, subroutines. Suppose we have a subroutine subf that computes a function z = f(x) and that simply calls two subroutines, subg and subh, as follows:


    subroutine subf(x,z,n)
    integer n
    real x(n), z(n)
    real y(2)
    call subg(x,y,n)
    call subh(y,z,n)
    return
    end

Subroutine subg computes y from x, and subh computes z from y. Applying ADIFOR to this subroutine would result in the creation of subroutines g$subf, g$subg, and g$subh, each of which would work with derivative vectors of length n, representing derivatives with respect to x. If we process subh separately to get a subroutine g$subh that computes ∂z/∂y, use g$subg to compute ∂y/∂x, and then multiply (∂z/∂y)(∂y/∂x), we get ∂z/∂x as before. However, the additional computational cost of g$subh is no longer n times the cost of subh but merely two times the cost of subh (y is a vector of length 2, so m = 2). If subh is computationally demanding, this may be a great savings.

The exploitation of interface contraction in this example is illustrated in Figure 1. The width of an arrow corresponds to the amount of information passing between and computed within the various subroutines. When automatic differentiation is applied to the whole program (Before), the gradient objects have length n. Thus, large amounts of data must be computed and stored, resulting in large runtimes and memory requirements. If interface contraction is exploited by processing subg and subh separately (After), the amount of data computed within g$subh is greatly reduced, resulting in reduced computational demands within this subroutine. If subh is expensive, this approach results in greatly improved performance and reduced memory requirements.

[Figure not reproduced: "Before" and "After" call diagrams.]

Figure 1: Schematic of subroutine calls before and after interface contraction


Note that this process requires a little more work on the user's part than simply applying a black-box automatic differentiation tool. While previously we just applied the automatic differentiation tool to subf and the subroutines it called, we now must apply it separately to subg and subh, and we must provide the matrix-matrix multiply "glue code" as well. Nonetheless, this approach produces in a short amount of time, and with a fairly low likelihood of human error, a derivative code with significantly lower complexity than that derived from black-box application of an automatic differentiation tool.
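As a concrete illustration, the following Fortran 77 sketch shows what such glue code might look like for the subf example above. It is our own reconstruction under stated assumptions: the routine names g_subg and g_subh (standing in for ADIFOR's g$-prefixed routines) and their argument lists are simplified placeholders, not actual ADIFOR-generated interfaces:

      subroutine gluef(x, z, dzdx, n)
c     Compute z = f(x) and the n x n Jacobian dz/dx of subf by
c     chaining the two subroutine Jacobians: dy/dx (2 x n) from the
c     differentiated subg, and dz/dy (n x 2) from subh differentiated
c     with respect to its own two input variables only.
      integer n, i, j, k
      real x(n), z(n), y(2)
      real dydx(2,n), dzdy(n,2), dzdx(n,n)
c     hypothetical interfaces for the separately processed routines
      call g_subg(x, y, dydx, n)
      call g_subh(y, z, dzdy, n)
c     glue code: dzdx = dzdy * dydx  (n x 2 times 2 x n)
      do 30 i = 1, n
         do 20 j = 1, n
            dzdx(i,j) = 0.0
            do 10 k = 1, 2
               dzdx(i,j) = dzdx(i,j) + dzdy(i,k) * dydx(k,j)
   10       continue
   20    continue
   30 continue
      return
      end

The inner loop runs only over the interface width m = 2, which is where the savings described above come from.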

4 An Application of Interface Contraction

The macroscopic version of interface contraction can be used advantageously in the development of derivative codes for two log-likelihood functions used for biostatistical analysis. These functions were motivated by a problem in nutritional epidemiology investigating the relationships of age and dietary intakes of saturated fat, total energy, and alcohol with the four-year risk of developing breast cancer in a prospective cohort of 89,538 nurses [21]. The likelihood functions take into account measurement error in total energy and binary misclassification in saturated fat and alcohol intake, and include an integral that is evaluated numerically by using the algorithm of Crouch and Spiegelman [8]. The Hessian and gradient are needed for optimization of these nonlinear likelihood functions, and the Hessian is again needed to evaluate the variance-covariance matrix of the resulting maximum likelihood estimates. Two likelihood functions were fit to these data, a 33-parameter function and a 17-parameter function.

Although the current version of ADIFOR does not support second derivatives, we were able to apply ADIFOR twice to produce code for computing the Hessian (see [14] for more details). This method for computing second derivatives is somewhat tedious, and not optimal from a complexity point of view, but does produce correct results. Direct support for second derivatives will be provided in a future release of ADIFOR.

The initial results using ADIFOR exhibited the expected linear increase in computational complexity for gradients and quadratic increase for Hessians. As a result, computing the Hessian for the 33-parameter function required approximately six hours on a SPARCstation 20. However, this problem turns out to be very suitable for exploiting interface contraction. The subroutine defining the function to be differentiated, lglik3, calls subroutine integr, whose computational expense dominates the computation (approximately 85% of the overall computation time is spent in integr). The input variable for subroutine lglik3, xin, is a vector of length 17 (33), while the length of the input variable for subroutine integr, x, is 2. The output variables for the routines are xlogl (a scalar) and f (a scalar), respectively. So, using the notation of our previous example, we have n = 17 (33), m = 2, and p = 1. The difference between this situation and the one examined in Section 3.2 is that rather than being preceded by another subroutine call, integr is surrounded by code and is within a loop. In addition, ADIFOR is used to generate code for computing a Hessian, rather than a gradient, as was the case in our previous examples.

Modifying the ADIFOR-generated routines for computing Hessians to accommodate interface contraction is straightforward. The subroutine integr and the subroutines it calls are processed separately from lglik3, making it possible to compute derivatives with respect to x instead of xin. Since we are computing Hessians, this reduces the computational complexity of the derivatives for integr by a factor of approximately (17 x 17)/(2 x 2) ≈ 72 for the 17-parameter problem and (33 x 33)/(2 x 2) ≈ 272 for the 33-parameter problem.

5 Experimental Results

Figures 2 and 3 summarize performance results for the 17- and 33-parameter problems, respectively, on a SPARCstation 20 at the Channing Laboratory, Harvard Medical School, and Brigham and Women's Hospital, Boston, Massachusetts. This workstation is equipped with a 50 MHz SuperSPARC processor and 48 MB of memory. All code was compiled by using Sun's f77 compiler with the -O option.


Task   Method                         Time      #feval    Factor   Max Error
Func   Analytic                        10.94      1.0        -         -
Grad   Analytic                        36.65      3.35      1.00      0
       Black-Box AD                   186.70     17.07      5.09      6.5e-14
       ADIFOR & Interface Contr.       54.32      4.97      1.48      6.1e-14
       Analytic & Interface Contr.     49.31      4.51      1.34      6.1e-14
Hess   Analytic                       185.80     16.98      1.00      0
       Black-Box AD                  5636.00    515.17     30.30      9.9e-11
       ADIFOR & Interface Contr.      609.70     55.73      3.28      9.9e-11
       Analytic & Interface Contr.    608.90     55.66      3.28      9.9e-11

Figure 2: Experimental results for the 17-parameter problem

Task   Method                         Time      #feval    Factor   Max Error
Func   Analytic                        11.67      1.0        -         -
Grad   Black-Box AD                   394.25     33.78      4.15      0
       ADIFOR & Interface Contr.      104.56      8.96      1.10      8.3e-14
       Analytic & Interface Contr.     95.00      8.14      1.00      6.1e-14
Hess   Black-Box AD                 21130.00   1810.62      7.09      0
       ADIFOR & Interface Contr.     3049.10    261.28      1.02      7.4e-14
       Analytic & Interface Contr.   2979.30    255.30      1.00      2.7e-13

Figure 3: Experimental results for the 33-parameter problem

The times provided are the total CPU time in seconds, averaged over several runs. The column labeled #feval shows the number of function evaluations that could be completed in that time. The CPU time required for a given method divided by the execution time of the fastest method is reported in the Factor column. The maximum (over all components of the gradient or Hessian) relative error is reported in the Max Error column, compared with values obtained by the method for which the error is listed as 0 in the figures. The column labeled Task indicates whether the function (Func), its gradient (Grad), or its Hessian (Hess) is being computed. We used four methods to compute derivatives:

Analytic: For the 17-parameter problem, code was developed over the course of two years of person-time to compute the derivatives analytically. No such code exists for the 33-parameter problem.

Black-Box Automatic Differentiation: This method corresponds to the code generated by ADIFOR. Thus, even for the smaller (17-parameter) problem, the code generated via automatic differentiation is much slower than the code generated by hand. However, it also took almost no person-time to develop.

To implement the interface contraction (IC) approach, we used two different versions of the derivative code for integr:

Analytic and Interface Contraction: Code already existed for analytically computing the derivatives of the output variables of integr with respect to its input variables, and we employed this code to compute the derivatives of integr.


ADIFOR and Interface Contraction: We employed the derivative code generated by ADIFOR for integr.

As a rough estimate, it can be expected that, for n independent variables, computing derivatives by using the forward mode of automatic differentiation requires n and n^2 times the cost of computing the function for the gradient and Hessian, respectively. For our problems, this would amount to the equivalent of 17 (33) and 289 (1089) function evaluations to compute the gradient and Hessian of the 17- (33-) parameter function. As can be seen from the #feval column, the gradient computed by ADIFOR conforms well to this rough estimate, while the Hessian computation performs worse by a factor of less than two. This is due to the fact that the Hessian code (which was derived by applying ADIFOR to the ADIFOR-generated first-order derivative code) cannot internally exploit the symmetry inherent in Hessian computations.

There is little difference in the performance of the two implementations of interface contraction with analytic and ADIFOR-generated derivative code for integr, respectively. The fact that the ADIFOR-generated Hessian code for integr does not exploit symmetry is of little importance here, since we compute only a 2 x 2 Hessian. For the 17-parameter problem, interface contraction allows us to obtain the gradient only a factor of less than 1.5 slower than the hand-derived version, corresponding to the time taken by about five function evaluations. The gradient for the 33-parameter problem is obtained at the cost of about eight function evaluations. The 17 x 17 (33 x 33) Hessian is obtained at the cost of about 55 (260) function evaluations.

The fact that interface contraction does worse, in comparison with the analytic approach, for Hessians is due to an effect akin to that described by Amdahl's law [12]. For the 17-parameter problem, the cost of the black-box version of the first- (second-) derivative code of integr can be expected to be approximately 17/2 = 8.5 ((17/2)^2 ≈ 72) times that of the interface contraction version. The speedup of about 2.8 (10) that we measured is due to the fact that some small amount of time is spent in parts of the code outside of integr. In the interface contraction version of the derivative code, the ratio of time outside of the derivative code for integr to time within increases, resulting in a smaller speedup. For example, if the black-box version spends 85% of execution time computing the second derivatives of subroutine integr and 15% of execution in the rest of the program, the interface contraction version will require .15 + .85/72 ≈ .161 of the execution time of the black-box version, for a speedup factor of about 6.2. By the same token, interface contraction leads to a smaller decrease in computation time for the 33-parameter problem, as the portion of the execution time outside of integr increases.
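The same computation in display form (our restatement, using the 85%/15% split quoted above):

\[
\frac{T_{\mathrm{IC}}}{T_{\mathrm{BB}}} \approx 0.15 + \frac{0.85}{(17/2)^2} \approx 0.161,
\qquad
\text{speedup} \approx \frac{1}{0.161} \approx 6.2 .
\]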

    The exploitation of interface contraction significantly improved performance.

6 Conclusions

Although automatic differentiation is an excellent tool for developing accurate derivative codes quickly, the resultant code can sometimes be significantly more expensive than analytic code written by a programmer. We introduced an approach for reducing derivative complexity that capitalizes on chain rule associativity and what we call interface contraction. By combining the technique of automatic differentiation with techniques capitalizing on high-level program-specific information, such as interface contraction, a user can create codes in the span of a few days that rival the performance of codes that require months to develop by hand. Application to a biostatistics problem validated this promise.

Computational differentiation (CD) is an emerging discipline that, in addition to black-box automatic differentiation, exploits insight into program structure and underlying algorithms. Its goal is the generation of efficient derivative codes with minimal human effort. Future research in CD will further explore how a computational scientist can exploit knowledge about a program (such as interface contraction) or an underlying algorithm (such as the solution of a nonlinear system of equations [10]) to reduce the cost of computing derivatives. Research in CD will also lead to the


development of a tool infrastructure to make it easier to employ this knowledge in building derivative codes.

Acknowledgments

We thank Alan Carle for his instrumental role in the ADIFOR project, Andreas Griewank for stimulating discussions, and Eugene Demidenko for his valuable assistance in program development. We thank Dr. Frank Speizer for making available the facilities of the Channing Laboratory for performing the analyses described in this paper.

References

[1] Brett Averick, Jorge Moré, Christian Bischof, Alan Carle, and Andreas Griewank. Computing

large sparse Jacobian matrices using automatic differentiation. SIAM Journal on Scientific Computing, 15(2):285-294, 1994.

[2] Christian Bischof, Alan Carle, George Corliss, Andreas Griewank, and Paul Hovland. ADIFOR:

Generating derivative codes from Fortran programs. Scientific Programming, 1(1):11-29, 1992.

[3] Christian Bischof, George Corliss, Larry Green, Andreas Griewank, Kara Haigler, and Perry

Newman. Automatic differentiation of advanced CFD codes for multidisciplinary design. Computing Systems in Engineering, 3(6):625-638, 1992.

[4] Christian Bischof and Andrew Mauer. Private communication. Argonne National Laboratory, 1995.

[5] Christian Bischof, Greg Whiffen, Christine Shoemaker, Alan Carle, and Aaron Ross. Application of automatic differentiation to groundwater transport models. In Alexander Peters et al., editors, Computational Methods in Water Resources X, pages 173-182, Dordrecht, 1994. Kluwer Academic Publishers.

[6] Alan Carle, Lawrence Green, Christian Bischof, and Perry Newman. Applications of automatic differentiation in CFD. In Proceedings of the 25th AIAA Fluid Dynamics Conference, AIAA Paper 94-2197. American Institute of Aeronautics and Astronautics, 1994.

[7] Bruce W. Char, Keith O. Geddes, Gaston H. Gonnet, Michael B. Monagan, and Stephen M. Watt. MAPLE Reference Manual. Watcom Publications, Waterloo, Ontario, Canada, 1988.

[8] E. A. C. Crouch and D. Spiegelman. The evaluation of integrals of the form ∫ f(t)e^{-t²} dt: Application to logistic-normal models. Journal of the American Statistical Association, 85:464-469, 1990.

[9] Andreas Griewank. On automatic differentiation. In Mathematical Programming: Recent Developments and Applications, pages 83-108, Amsterdam, 1989. Kluwer Academic Publishers.

[10] Andreas Griewank, Christian Bischof, George Corliss, Alan Carle, and Karen Williamson. Derivative convergence of iterative equation solvers. Optimization Methods and Software, 2:321-355, 1993.

[11] Andreas Griewank, David Juedes, and Jay Srinivasan. ADOL-C, a package for the automatic differentiation of algorithms written in C/C++. Preprint MCS-P180-1190, Mathematics and Computer Science Division, Argonne National Laboratory, 1990.

[12] J. J. Hack. Peak vs. sustained performance in highly concurrent vector machines. Computer, 19(9):9-19, 1986.


[13] Jim E. Horwedel. GRESS: A preprocessor for sensitivity studies on Fortran programs. In Andreas Griewank and George F. Corliss, editors, Automatic Differentiation of Algorithms: Theory, Implementation, and Application, pages 243-250. SIAM, Philadelphia, 1991.

[14] Paul Hovland, Christian Bischof, and Alan Carle. Using ADIFOR to compute Hessians. Technical Report ANL/MCS-TM-192, Mathematics and Computer Science Division, Argonne National Laboratory, 1995 (in press).

[15] Masao Iri. History of automatic differentiation and rounding estimation. In Andreas Griewank and George F. Corliss, editors, Automatic Differentiation of Algorithms: Theory, Implementation, and Application, pages 1-16. SIAM, Philadelphia, 1991.

[16] David Juedes. A taxonomy of automatic differentiation tools. In Andreas Griewank and George Corliss, editors, Proceedings of the Workshop on Automatic Differentiation of Algorithms: Theory, Implementation, and Application, pages 315-330, Philadelphia, 1991. SIAM.

[17] Koichi Kubota. PADRE2, a FORTRAN precompiler yielding error estimates and second derivatives. In Andreas Griewank and George F. Corliss, editors, Automatic Differentiation of Algorithms: Theory, Implementation, and Application, pages 251-262. SIAM, Philadelphia, 1991.

[18] Seon Ki Park, Kelvin Droegemeier, Christian Bischof, and Tim Knauff. Sensitivity analysis of numerically-simulated convective storms using direct and adjoint methods. In Preprints, 10th Conference on Numerical Weather Prediction, Portland, Oregon, pages 457-459. American Meteorological Society, 1994.

[19] Louis B. Rall. Automatic Differentiation: Techniques and Applications, volume 120 of Lecture Notes in Computer Science. Springer-Verlag, Berlin, 1981.

[20] Nicole Rostaing, Stephane Dalmas, and Andre Galligo. Automatic differentiation in Odyssee. Tellus, 45A(5):558-568, October 1993.

[21] W. C. Willett et al. Dietary fat and fiber in relation to risk of breast cancer. Journal of the American Medical Association, 268:2037-2044, 1992.

DISCLAIMER

This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government nor any agency thereof, nor any of their employees, makes any warranty, express or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or any agency thereof. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.
