Skip to content

Conversation

oxinabox
Copy link
Member

@oxinabox oxinabox commented Nov 25, 2019

This PR includes my attempt to explain the math behind chainrules.
In particular behind how we define a Differential

Preview Docs Build

TODO:

@oxinabox
Copy link
Member Author

The definition of valid differential would be much clearer if written in terms of type unions instead of sets of types.

Copy link
Member

@willtebbutt willtebbutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some typos :)

docs/src/math.md Outdated

-----

### REWRITE: Differential Type
Copy link
Member

@willtebbutt willtebbutt Nov 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we simplify even further by saying something like the following:

A type \mathcal D is a valid differential type for primal type \mathcal P if

  • addition between a primal in \mathcal P and differential in \mathcal D yields another primal in \mathcal P
  • addition of two differentials in \mathcal D produces another differential in \mathcal D. We say that \mathcal D is closed under addition.

This avoids a lot of mathematical complexity that I'm not sure we really need (unless I'm missing something...)

Copy link
Member Author

@oxinabox oxinabox Nov 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the ability to add 2 differentials of different types and come back with another differential, potentially of a 3rd type.
Common example would be [1.0,2.0,3.0] + Zero() = [1.0,2.0,3.0]

another isi

Composite{Foo, NamedTuple{(:x,),Tuple{Float64}})(x=2.0) 
+ 
Composite{Foo, NamedTuple{(:y,),Tuple{Float64}})(y=3.0)
=
 Composite{Foo, NamedTuple{(:x,:y),Tuple{Float64,Float64}})(x=2.0, y=3.0)


It is written around the Julia idea of types.
Which do not nessicarily correspond too closely with the idea of Programming Language Theory (static) types,
nor with type-theory types.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be brought up a line? I think the standard is one sentence per line.

There's also a couple other cases that break this standard throughout.

Comment on lines +17 to +22
We thus can have defined different ``+``
for different input types, and so will not distinguish between them.
The functions thus stand alone from the types,
except that a set of input types may special case them.
Where as most similar definitions might decribe a object as a type and some operations on the type,
we can consider them seperately.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this reads a bit strange. I understand what you mean, but with multiple dispatch we would be distinguishing between different types. Just the higher level abstraction of Base.+ would remain the same to the end user.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a fun thing in a lot of math write ups that say:

"A vector space over a field F is a set V together with two operations * and + that satisfy the eight axioms"

and treat the * and + as part of the vector space.
And then for a different vector space, it will be defined with *' and +' etc,
completely distrinct.

Where as we are saying that we will treat + as always a thing that exists.

But yeah this bit does read weirdly

@oxinabox oxinabox changed the title [WIP] ChainRules Manifesto [WIP] ChainRules Mathematical Manifesto Nov 29, 2019
Co-Authored-By: willtebbutt <[email protected]>
@simeonschaub
Copy link
Member

Looks very promising! Just a couple of additional thoughts:

  1. Should we introduce Zero here formally as well? It's mentioned above, but it isn't really explained, how this fits into the framework.
  2. Do we want to formalize multiplication of a differential with a scalar as well? Seems that we need this in a lot of cases, too. (Oh, I see this is already on your TODO)
  3. I'm wondering, whether we should make use of more conventional mathematical terminology alongside the ChainRules-specific one. What you're describing as a Differential Type here seems to be quite closely related to the concept of vector spaces. (Although floating point arithmetic does not actually describe a field, so things get a bit muddier) I guess, your concept of a Primal Type is also analogous to the concept of a manifold in differential geometry, where your Differential Types would correspond to differential one-forms on a tangent space. I think that would help people coming from a more mathematical background to better understand why we need this extra terminology and where exactly it differs from the more abstract mathematical concepts.



- If there exists a type-union ``\mathbb U``, with ``\mathcal D <: \mathbb U``,
- if for all ``u :: \mathbb U`` and for all ``p :: \mathcal P``, there exists a ``q :: \mathcal P`` such that `u + p = p + u = q`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, Rationals wouldn't form a Primal Type according to this definition, since typemax(Int)//1 + typemax(Int)//1 doesn't correspond to any value q, since it throws an error. We might want to loosen this and allow for some edge cases throwing exceptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I think a disclaimer at the top about for purposes of these definitions one can ignore overflow underflow etc

Comment on lines +25 to +26
We use some notation here based closely off of how JuliaLang indicates type relationships. </br>
- ``d::\mathcal D``: a value ``d`` of type ``\mathcal D``, or the assertion that the value ``d`` has type ``\mathcal D`` </br>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the </br> are necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it just fixes how Atom renders it

@oxinabox
Copy link
Member Author

Looks very promising! Just a couple of additional thoughts:

1. Should we introduce `Zero` here formally as well? It's mentioned above, but it isn't really explained, how this fits into the framework.

Indeed we should explain it here. added to the TODO

3. I'm wondering, whether we should make use of more conventional mathematical terminology alongside the ChainRules-specific one.

Yes, we should I just don't know them.

(Although floating point arithmetic does not actually describe a field, so things get a bit muddier)

Yeah, but we will ignore that in this write up.

I guess, your concept of a Primal Type is also analogous to the concept of a manifold in differential geometry, where your Differential Types would correspond to differential one-forms on a tangent space.

Yeah, basically.
The concept of a Primal Type is a vector, I guess one might say 1-vector (people don't say that, do they?).
The concept of a Differential Type is indeed a 1-form.
I thought 1-fotm and differential were synonomous?
(Maybe one has to ignore orintation/coordinate-ish stuff, which we do do)

@oxinabox
Copy link
Member Author

oxinabox commented Dec 4, 2019

Here is all the math that I want in the PR.
I wrote it on a bus.
I will rewrite it into this PR

https://photos.google.com/share/AF1QipO1fqEhTGO7BwFKGczb8ebCHf0McWpbPrWgO4UVLjjuhYCt4mdwE722MrldJzIyMA?key=MHBFZVBsTWgwRl9hVUlJSUpQQlEzVmhQMzZaQ01B

@oxinabox
Copy link
Member Author

oxinabox commented Dec 4, 2019

Pages 1 and 2, of 9 pages are written

@nickrobinson251
Copy link
Contributor

xref JuliaDiff/ChainRulesCore.jl#138

@willtebbutt
Copy link
Member

@oxinabox what's the status of this PR. Has it been superceded?

@oxinabox
Copy link
Member Author

oxinabox commented Jul 6, 2020

@oxinabox
Copy link
Member Author

oxinabox commented Jul 6, 2020

Closing this because it is on the wrong package now

@oxinabox oxinabox closed this Jul 6, 2020
@oxinabox oxinabox deleted the ox/explainmath branch August 17, 2020 07:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants