Commit 589bc49

New page added for Research Area "Automatic Differentiation" (#163)
1 parent 659b7ba commit 589bc49

File tree

4 files changed: +174 -1 lines changed

_pages/automatic_differentiation.md

Lines changed: 173 additions & 0 deletions
@@ -0,0 +1,173 @@
---
title: "Automatic Differentiation"
layout: gridlay
excerpt: "Automatic Differentiation is a general and powerful technique for
computing partial derivatives (or the complete gradient) of a function
expressed as a computer program."
sitemap: true
permalink: /automatic_differentiation
---

## Automatic Differentiation

Automatic Differentiation (AD) is a general and powerful technique for
computing partial derivatives (or the complete gradient) of a function
expressed as a computer program.

It takes advantage of the fact that any computation can be represented as a
composition of simple operations / functions. This is generally represented
in a graphical format and referred to as the [computation
graph](https://colah.github.io/posts/2015-08-Backprop/). AD works by
repeatedly applying the chain rule over this graph.
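
For example, consider the function f(x1, x2) = x1 * x2 + sin(x1) (a
hypothetical example, not taken from this page). A minimal C++ sketch of its
computation graph and a hand-applied chain rule could look like the following;
an AD tool automates exactly this bookkeeping:

```cpp
#include <cmath>
#include <cstdio>

// f(x1, x2) = x1 * x2 + sin(x1), written as the nodes of its computation graph.
double f(double x1, double x2) {
  double w1 = x1 * x2;       // node w1 = x1 * x2
  double w2 = std::sin(x1);  // node w2 = sin(x1)
  double w3 = w1 + w2;       // node w3 = w1 + w2 (the output)
  return w3;
}

// Chain rule applied by hand over the same graph: each node's local
// derivative is combined with the derivatives of the nodes it depends on.
double df_dx1(double x1, double x2) {
  double dw1_dx1 = x2;            // d(x1 * x2)/dx1
  double dw2_dx1 = std::cos(x1);  // d(sin(x1))/dx1
  return dw1_dx1 + dw2_dx1;       // d(w1 + w2)/dx1 = x2 + cos(x1)
}

int main() {
  std::printf("f(2, 3) = %f, df/dx1 = %f\n", f(2, 3), df_dx1(2, 3));
  return 0;
}
```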

### Understanding Differentiation in Computing

Efficient computation of gradients is a crucial requirement in scientific
computing and machine learning, where approaches like [Gradient
Descent](https://en.wikipedia.org/wiki/Gradient_descent) are used to
iteratively converge on the optimal parameters of a mathematical model.
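
Each gradient descent iteration moves every parameter against its partial
derivative. A minimal sketch of one update step (function and parameter names
are illustrative assumptions):

```cpp
#include <cstddef>
#include <vector>

// One gradient descent step: params[i] <- params[i] - learning_rate * grad[i].
// In practice the gradient would be supplied by an AD tool.
void gradient_descent_step(std::vector<double>& params,
                           const std::vector<double>& grad,
                           double learning_rate) {
  for (std::size_t i = 0; i < params.size(); ++i)
    params[i] -= learning_rate * grad[i];
}
```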

Within the context of computing, there are several methods of
differentiation:

- **Manual Differentiation**: This consists of manually applying the rules of
differentiation to a given function. While straightforward, it can be
tedious and error-prone, especially for complex functions.

- **Numerical Differentiation**: This method approximates the derivatives
using finite differences. It is relatively simple to implement, but it can
suffer from numerical instability and inaccuracy, and it does not scale well
with the number of inputs to the function (see the sketch after this list).

- **Symbolic Differentiation**: This approach uses symbolic manipulation to
compute derivatives analytically. It provides accurate results but can lead
to lengthy expressions for large computations. It requires the computer
program to be representable as a closed-form mathematical expression, and
thus does not handle control flow (conditionals and loops) in the program
well.

- **Automatic Differentiation (AD)**: Automatic Differentiation is a general
and efficient technique that works by repeated application of the chain
rule over the computation graph of the program. Given its composable nature,
it scales easily to computing gradients over a very large number of
inputs.
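
The sketch referenced above: a central-difference approximation of a gradient.
It is a generic illustration (not part of this page's tooling) that makes the
step-size dependence and the cost of two function evaluations per input
explicit:

```cpp
#include <cstddef>
#include <cstdio>
#include <functional>
#include <vector>

// Central-difference approximation of the gradient of f at the point x.
// The accuracy depends on the step size h, and the full gradient requires
// two evaluations of f per input.
std::vector<double> numerical_gradient(
    const std::function<double(const std::vector<double>&)>& f,
    std::vector<double> x, double h = 1e-6) {
  std::vector<double> grad(x.size());
  for (std::size_t i = 0; i < x.size(); ++i) {
    double orig = x[i];
    x[i] = orig + h;
    double fp = f(x);  // f(x + h * e_i)
    x[i] = orig - h;
    double fm = f(x);  // f(x - h * e_i)
    x[i] = orig;
    grad[i] = (fp - fm) / (2 * h);
  }
  return grad;
}

int main() {
  auto f = [](const std::vector<double>& x) { return x[0] * x[0] + x[1] * x[1]; };
  std::vector<double> g = numerical_gradient(f, {3.0, 4.0});
  std::printf("approximate gradient: (%f, %f)\n", g[0], g[1]);  // close to (6, 8)
  return 0;
}
```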

### Forward and Reverse Mode AD

Automatic Differentiation works by applying the chain rule and merging the
derivatives at each node of the computation graph. The direction of this graph
traversal and derivative accumulation results in two approaches:

- Forward Mode (Tangent Mode): starts the accumulation at the input
parameters and moves towards the output parameters of the graph. This means
that the chain rule is applied to the inner functions first. This approach
calculates the derivatives of the output(s) with respect to a single input
variable (a minimal sketch follows after this list).

![Forward Mode](/images/ForwardAccumulationAutomaticDifferentiation.png)

- Reverse Mode (Adjoint Mode): starts at the output node of the graph and moves
backward towards all the input nodes. For every node, it merges all paths that
originated at that node. It tracks how every node affects one output. Hence,
it calculates the derivative of a single output with respect to all inputs
simultaneously - the gradient.

![Reverse Mode](/images/ReverseAccumulationAutomaticDifferentiation.png)
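
The forward-mode sketch referenced above uses dual numbers, where every value
carries its derivative with respect to one chosen input. This is an
operator-overloading illustration of the idea (with a hypothetical function),
not Clad's source-transformation approach:

```cpp
#include <cmath>
#include <cstdio>

// A dual number: the value of a node and its derivative with respect to the
// seeded input, propagated together through every operation.
struct Dual {
  double val;  // value of the node
  double dot;  // derivative of the node w.r.t. the seeded input
};

Dual operator*(Dual a, Dual b) { return {a.val * b.val, a.dot * b.val + a.val * b.dot}; }
Dual operator+(Dual a, Dual b) { return {a.val + b.val, a.dot + b.dot}; }
Dual sin(Dual a) { return {std::sin(a.val), std::cos(a.val) * a.dot}; }

int main() {
  Dual x1{2.0, 1.0};  // seed: differentiate with respect to x1
  Dual x2{3.0, 0.0};
  Dual y = x1 * x2 + sin(x1);  // f(x1, x2) = x1 * x2 + sin(x1)
  std::printf("f = %f, df/dx1 = %f\n", y.val, y.dot);  // df/dx1 = x2 + cos(x1)
  return 0;
}
```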

### Automatic Differentiation in C++

Automatic Differentiation implementations are based on [two major techniques]:
Operator Overloading and Source Code Transformation. The Compiler Research
Group's focus has been on exploring the [Source Code Transformation] technique,
which involves constructing the computation graph and producing the derivative
at compile time.

[The source code transformation approach] enables optimization by retaining
all the complex knowledge of the original source code. The computation graph is
constructed during compilation and then transformed to generate the derivative
code. The drawback of this approach is that many implementations rely on a
custom parser to build the code representation and produce the transformed
code. This is difficult to implement (especially in C++), but it is very
efficient, since many computations and optimizations can be done ahead of
time.
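
Conceptually, a source-transformation tool reads a function definition and
emits a new function for its derivative at compile time, so the derivative is
ordinary code that the compiler can optimize. A hand-written sketch of the
idea with a hypothetical function (the code an actual tool generates will look
different):

```cpp
// Original source, as written by the user.
double square(double x) { return x * x; }

// Derivative source that a source-transformation tool could emit alongside
// the original, compiled and optimized like any other function.
double square_dx(double x) { return 2 * x; }
```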

### Advantages of using Automatic Differentiation

- Automatic Differentiation can calculate derivatives without any [additional
precision loss].

- It is not confined to closed-form expressions.

- It can take derivatives of algorithms involving conditionals, loops, and
recursion.

- It can easily be scaled to functions with a very large number of inputs.

### Automatic Differentiation Implementation with Clad - a Clang Plugin

Implementing Automatic Differentiation from the ground up can be challenging.
However, several C++ libraries and tools are available to simplify the
process. The Compiler Research Group has been working on [Clad], a C++ library
that enables Automatic Differentiation using the LLVM compiler infrastructure.
It is implemented as a plugin for the Clang compiler.

[Clad] operates on the Clang AST (Abstract Syntax Tree) and is capable of
performing C++ Source Code Transformation. When Clad is given the C++ source
code of a mathematical function, it can algorithmically generate C++ code for
computing the derivatives of that function. Clad has comprehensive coverage of
the latest C++ features and a well-rounded fallback and recovery system in
place.
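
A short usage sketch, based on the pattern shown in Clad's documentation and
demos (exact headers and signatures may differ between Clad versions, and the
file must be compiled with Clang with the Clad plugin loaded):

```cpp
#include "clad/Differentiator/Differentiator.h"
#include <cstdio>

double f(double x, double y) { return x * x + y * y; }

int main() {
  // Forward mode: derivative of f with respect to x.
  auto df_dx = clad::differentiate(f, "x");
  std::printf("df/dx at (3, 4) = %f\n", df_dx.execute(3, 4));

  // Reverse mode: the full gradient of f in one call.
  auto df_grad = clad::gradient(f);
  double dx = 0, dy = 0;
  df_grad.execute(3, 4, &dx, &dy);
  std::printf("gradient at (3, 4) = (%f, %f)\n", dx, dy);
  return 0;
}
```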

**Clad's Key Features**:

- Support for both Forward Mode and Reverse Mode Automatic Differentiation.

- Support for differentiation of built-in C arrays, built-in C/C++
scalar types, functions with an arbitrary number of inputs, and functions
that return a single value.

- Support for loops and conditionals.

- Support for generation of single derivatives, gradients, Hessians, and
Jacobians.

- Integration with CUDA for GPU programming.

- Integration with Cling and ROOT for high-energy physics data analysis.

### Clad Benchmarks (while using Automatic Differentiation)

[Benchmarks] show that Clad is significantly faster than conventional
Numerical Differentiation methods, providing Hessians that are up to 450x
(roughly dim/25 times) faster. [General benchmarks] demonstrate a 3378x
improvement in speed with Clad compared to Numerical Differentiation based on
central differences.

For more information on Clad, please view:

- [Clad - Github Repository](https://github.com/vgvassilev/clad)

- [Clad - ReadTheDocs](https://clad.readthedocs.io/en/latest/)

- [Clad - Video Demo](https://www.youtube.com/watch?v=SDKLsMs5i8s)

- [Clad - PDF Demo](https://indico.cern.ch/event/808843/contributions/3368929/attachments/1817666/2971512/clad_demo.pdf)

- [Clad - Automatic Differentiation for C++ Using Clang - Slides](https://indico.cern.ch/event/1005849/contributions/4227031/attachments/2221814/3762784/Clad%20--%20Automatic%20Differentiation%20in%20C%2B%2B%20and%20Clang%20.pdf)

- [Automatic Differentiation in C++ - Slides](https://compiler-research.org/assets/presentations/CladInROOT_15_02_2020.pdf)


[Clad]: https://compiler-research.org/clad/

[Benchmarks]: https://compiler-research.org/assets/presentations/CladInROOT_15_02_2020.pdf

[General benchmarks]: https://indico.cern.ch/event/1005849/contributions/4227031/attachments/2221814/3762784/Clad%20--%20Automatic%20Differentiation%20in%20C%2B%2B%20and%20Clang%20.pdf

[additional precision loss]: https://compiler-research.org/assets/presentations/CladInROOT_15_02_2020.pdf

[Source Code Transformation]: https://compiler-research.org/assets/presentations/V_Vassilev-SNL_Accelerating_Large_Workflows_Clad.pdf

[two major techniques]: https://compiler-research.org/assets/presentations/G_Singh-MODE3_Fast_Likelyhood_Calculations_RooFit.pdf

[The source code transformation approach]: https://compiler-research.org/assets/presentations/I_Ifrim-EuroAD21_GPU_AD.pdf

_pages/research.md

Lines changed: 1 addition & 1 deletion
@@ -90,7 +90,7 @@ only improves performance but also simplifies code development and debugging
 processes, offering a more efficient alternative to static binding methods.


-[Automatic Differentiation ↗]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p2072r0.pdf
+[Automatic Differentiation ↗]: https://compiler-research.org/automatic_differentiation

 [Interactive C++]: https://blog.llvm.org/posts/2020-12-21-interactive-cpp-for-data-science/
