University of Hertfordshire

Sharing storage using dirty vectors

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: Computational Differentiation: Techniques, Applications and Tools, Procs of the 2nd SIAM Workshop on Computational Differentiation, Santa Fe, New Mexico, 1996.
Publisher: Society for Industrial and Applied Mathematics (SIAM Press)
Pages: 107-115
ISBN (Print): 0898713854
Publication status: Published - 1996

Abstract

Consider a computation F with n inputs (independent variables) and m outputs (dependent variables) and suppose that we wish to evaluate the Jacobian of F. Automatic differentiation commonly performs this evaluation by associating vector storage either with the program variables (in the case of forward-mode automatic differentiation) or with the adjoint variables (in the case of reverse). Each vector component contains a partial derivative with respect to an independent variable, or a partial derivative of a dependent variable, respectively. The vectors may be full vectors, or they may be dynamically managed sparse data structures. In either case, many of these vectors will be scalar multiples of one another. For example, any intermediate variable produced by a unary operation in the forward mode will have a derivative vector that is a multiple of the derivative for the argument. Any computational graph node that is read just once during its lifetime will have an adjoint vector that is a multiple of the adjoint of the node that reads it. It is frequently wasteful to perform component multiplications explicitly. A scalar multiple of another vector can be replaced by a single multiplicative "scale factor" together with a pointer to the other vector. Automated use of this "dirty vector" technique can save considerable memory management overhead and dramatically reduce the number of floating-point operations required. In particular, dirty vectors often allow shared threads of computation to be reverse-accumulated cheaply. The mechanism permits a number of generalizations, some of which give efficient techniques for preaccumulation.
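
The forward-mode case described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `DirtyVector`, `Dual`, `scaled`, `axpy`, `materialize`, and `seed` are hypothetical, and it assumes dense base vectors that are never mutated in place. A derivative vector is held as a scale factor plus a reference to a shared base vector, so a unary operation costs one scalar multiplication rather than one per component.

```python
import math


class DirtyVector:
    """Derivative vector stored as scale * base (hypothetical illustration)."""

    def __init__(self, base, scale=1.0):
        self.base = base      # shared list of floats, never mutated in place
        self.scale = scale    # deferred scalar multiplier

    def scaled(self, factor):
        # O(1): share the base and fold the factor into the scale.
        return DirtyVector(self.base, self.scale * factor)

    def materialize(self):
        # O(n): carry out the deferred component multiplications.
        return [self.scale * c for c in self.base]

    def axpy(self, a, other, b):
        """Return a*self + b*other as a DirtyVector."""
        if self.base is other.base:
            # Both are multiples of the same vector: combine scales only.
            return DirtyVector(self.base, a * self.scale + b * other.scale)
        sa, sb = a * self.scale, b * other.scale
        return DirtyVector([sa * x + sb * y
                            for x, y in zip(self.base, other.base)])


class Dual:
    """Forward-mode value paired with a dirty derivative vector."""

    def __init__(self, value, grad):
        self.value = value
        self.grad = grad

    def __mul__(self, other):
        # d(u*v) = v*du + u*dv
        return Dual(self.value * other.value,
                    self.grad.axpy(other.value, other.grad, self.value))


def sin(x):
    # Unary operation: the derivative is a multiple of the argument's.
    return Dual(math.sin(x.value), x.grad.scaled(math.cos(x.value)))


def exp(x):
    e = math.exp(x.value)
    return Dual(e, x.grad.scaled(e))


def seed(values):
    """Create independent variables with unit derivative vectors."""
    n = len(values)
    return [Dual(v, DirtyVector([1.0 if i == j else 0.0 for j in range(n)]))
            for i, v in enumerate(values)]
```

In this sketch, `exp(sin(x))` never touches the components of `x.grad`: the result shares the same base list and only its scale changes. Component arithmetic happens only when two vectors with different bases are combined, or when `materialize()` is called to read out a Jacobian row.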

ID: 102752