add ArnoldiInfluenceFunction #1187
Conversation
This pull request was exported from Phabricator. Differential Revision: D42006733
Force-pushed from 8d51ca1 to c816129
Summary: This diff implements `ArnoldiInfluenceFunction`, which was described, along with `NaiveInfluenceFunction`, in D40541294. Please see that diff for a detailed description. Previously, implementations of both methods had been in 1 diff. Now, `ArnoldiInfluenceFunction` is separated out for easier review. Reviewed By: vivekmig Differential Revision: D42006733
Force-pushed from c816129 to ad0577c
Force-pushed from ad0577c to fd67835
Force-pushed from fd67835 to e04efe1
Force-pushed from e04efe1 to 461aedd
Force-pushed from 461aedd to 25d42d7
Force-pushed from 25d42d7 to c2d960f
Force-pushed from c2d960f to d73bfe6
Force-pushed from 95cf231 to 9768dd9
Force-pushed from 9768dd9 to d28d683
Force-pushed from d28d683 to 6a0597d
Force-pushed from 6a0597d to 9a78dad
Force-pushed from 9a78dad to 3d34ad9
Force-pushed from 3d34ad9 to 4fd4950
Force-pushed from 4fd4950 to d89e918
Force-pushed from d89e918 to 031426e
Summary:
Pull Request resolved: pytorch#1214
Pull Request resolved: pytorch#1186

# Overview
This diff, along with D42006733, adds 2 different implementations that both calculate the "infinitesimal" influence score as defined in the paper ["Understanding Black-box Predictions via Influence Functions"](https://arxiv.org/pdf/1703.04730.pdf).
- `NaiveInfluenceFunction`: a computationally slow but exact implementation that is useful for obtaining "ground-truth" (though, note that influence scores themselves are an approximation of the effect of removing a training example and then retraining). Several papers actually use this approach, e.g. ["Learning Augmentation Network via Influence Functions"](https://openaccess.thecvf.com/content_CVPR_2020/papers/Lee_Learning_Augmentation_Network_via_Influence_Functions_CVPR_2020_paper.pdf), ["Quantifying and Mitigating the Impact of Label Errors on Model Disparity Metrics"](https://openreview.net/forum?id=RUzSobdYy0V), and ["Achieving Fairness at No Utility Cost via Data Reweighting with Influence"](https://proceedings.mlr.press/v162/li22p/li22p.pdf).
- `ArnoldiInfluenceFunction`: a computationally efficient implementation described in the paper ["Scaling Up Influence Functions"](https://arxiv.org/pdf/2112.03052.pdf) by Schioppa et al. These [slides](https://docs.google.com/presentation/d/1yJ86FkJO1IZn7YzFYpkJUJUBqaLynDJCbCWlKKglv-w/edit#slide=id.p) give a brief summary of it.

This diff is rebased on top of D41324297, which implements the new API. Again, note that the 2 implementations are split across 2 diffs for easier review, though they are jointly described here.

# What is the "infinitesimal" influence score
This "infinitesimal" influence score approximately answers the question: if a given training example were infinitesimally down-weighted and the model re-trained to optimality, how much would the loss on a given test example change? Mathematically, this influence score is given by `\nabla_\theta L(x)' H^{-1} \nabla_\theta L(z)`, where `\nabla_\theta L(x)` is the gradient of the loss, considering only training example `x`, with respect to (a subset of) model parameters `\theta`; `\nabla_\theta L(z)` is the analogous quantity for a test example `z`; and `H` is the Hessian of the loss over the (subset of) model parameters at a given model checkpoint.

# What the two implementations have in common
Both implementations compute a low-rank approximation of the inverse Hessian, i.e. a tall and skinny (with width k) matrix `R` such that `H^{-1} \approx RR'`, where k is small. In particular, let `L` be the matrix of width k whose columns contain the top-k eigenvectors of `H`, and let `V` be the k by k diagonal matrix whose diagonal contains the corresponding eigenvalues. Both implementations let `R = LV^{-1/2}`, so that `RR' = LV^{-1}L' \approx H^{-1}`. Thus, the core computational step is computing the top-k eigenvalues / eigenvectors. This approximation is useful for several reasons:
- It avoids numerical issues associated with inverting small eigenvalues.
- Since the influence score `\nabla_\theta L(x)' H^{-1} \nabla_\theta L(z)` is approximated by `(\nabla_\theta L(x)' R)(\nabla_\theta L(z)' R)'`, we can compute an "influence embedding" for a given example `x`, namely `\nabla_\theta L(x)' R`, such that the influence score of one example on another is approximately the dot product of their respective embeddings. Because k is small, e.g. 50, these influence embeddings are low-dimensional.
- Even for large models, we can store `R` in memory, provided k is small. This means influence embeddings (and thus influence scores) can be efficiently computed by doing a backwards pass to compute `\nabla_\theta L(x)` and then multiplying by `R'`. This is orders of magnitude faster than the earlier LISSA approach of Koh et al., which, to compute the influence score involving a given example, needs to compute Hessian-vector products involving on the order of 10^4 examples.
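To make the embedding picture concrete, here is a minimal, hedged sketch (not Captum's actual API; the function names are illustrative) of how influence scores reduce to dot products of low-dimensional embeddings once `R` is available:

```python
import torch

# Minimal sketch, assuming flattened per-example gradients and a
# precomputed d x k projection R with H^{-1} ~= R @ R.T.
def influence_embedding(grad: torch.Tensor, R: torch.Tensor) -> torch.Tensor:
    # grad: (d,) flattened loss gradient for one example; R: (d, k)
    return grad @ R  # (k,) influence embedding

def influence_score(grad_train: torch.Tensor, grad_test: torch.Tensor,
                    R: torch.Tensor) -> torch.Tensor:
    # grad_train' H^{-1} grad_test ~= <grad_train' R, grad_test' R>
    return influence_embedding(grad_train, R) @ influence_embedding(grad_test, R)
```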
The implementations differ in how they compute the top-k eigenvalues / eigenvectors.

# How `NaiveInfluenceFunction` computes the top-k eigenvalues / eigenvectors
It is "naive" in that it computes the top-k eigenvalues / eigenvectors by explicitly forming the Hessian, converting it to a 2D tensor, computing its eigenvectors / eigenvalues, and then sorting. See the documentation of the `_set_projections_naive_influence_function` method for more details.

# How `ArnoldiInfluenceFunction` computes the top-k eigenvalues / eigenvectors
The key novelty of the approach by Schioppa et al. is that it uses the Arnoldi iteration to find the top-k eigenvalues / eigenvectors of the Hessian without explicitly forming the Hessian. In more detail, the approach first runs the Arnoldi iteration, which only requires the ability to compute Hessian-vector products, to find a Krylov subspace of moderate dimension, e.g. 200. It then finds the top-k eigenvalues / eigenvectors of the restriction of the Hessian to the subspace, where k is small, e.g. 50. Finally, it expresses the eigenvectors in the original basis. This approach is justified by a property of the Arnoldi iteration: the Krylov subspace it returns tends to contain the top eigenvectors. A sketch of the idea appears after this section.

This implementation does incur some one-time overhead in `__init__`, where it runs the Arnoldi iteration to calculate `R`. After that overhead, calculation of influence scores is quick, only requiring a backwards pass and a multiplication per example. Unlike `NaiveInfluenceFunction`, this implementation does not flatten any parameters, as the 2D Hessian is never formed, and PyTorch's Hessian-vector implementation (`torch.autograd.functional.hvp`) allows the input and output vector to be a tuple of tensors. Avoiding flattening / unflattening parameters brings scalability gains.
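Below is a minimal, hedged sketch of Arnoldi-based top-k eigenpair extraction on flat vectors; the actual `_parameter_arnoldi` instead works on tuples of tensors, and the names, dimensions, and tolerance here are illustrative:

```python
import torch

def arnoldi_topk(hvp, dim, arnoldi_dim=30, k=5):
    # Build an orthonormal Krylov basis Q and the restriction H of the
    # operator to that subspace, using only matrix-vector products.
    Q = torch.zeros(dim, arnoldi_dim + 1)
    H = torch.zeros(arnoldi_dim + 1, arnoldi_dim)
    q0 = torch.randn(dim)
    Q[:, 0] = q0 / q0.norm()
    m = arnoldi_dim
    for j in range(arnoldi_dim):
        w = hvp(Q[:, j])
        for i in range(j + 1):            # Gram-Schmidt orthogonalization
            H[i, j] = w @ Q[:, i]
            w = w - H[i, j] * Q[:, i]
        H[j + 1, j] = w.norm()
        if H[j + 1, j] < 1e-8:            # found an invariant subspace; stop early
            m = j + 1
            break
        Q[:, j + 1] = w / H[j + 1, j]
    # Eigen-decompose the small restricted matrix; for a symmetric operator
    # it is (numerically) symmetric tridiagonal, so eigh applies.
    vals, vecs = torch.linalg.eigh(H[:m, :m])
    idx = vals.argsort(descending=True)[:k]
    # "Distill" step: lift the top-k eigenvectors back to the original space.
    return vals[idx], Q[:, :m] @ vecs[:, idx]

# Example on an explicit symmetric matrix standing in for the Hessian:
A = torch.randn(100, 100)
A = A + A.T
top_vals, top_vecs = arnoldi_topk(lambda v: A @ v, dim=100, arnoldi_dim=60, k=5)
```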
# High-level organization of the two implementations
Because of the common logic of the two implementations, they share the same high-level organization.
- Both implementations accept a `hessian_dataset` initialization argument. This is because "infinitesimal" influence scores depend on the Hessian, which in practice is computed not over the entire training data, but over a subset of it, which is specified by `hessian_dataset`.
- In `__init__`, `NaiveInfluenceFunction` and `ArnoldiInfluenceFunction` compute `R` using the private helper methods `_set_projections_naive_influence_function` and `_set_projections_arnoldi_influence_function`, respectively.
- `R` is used by their respective `compute_intermediate_quantities` methods to compute influence embeddings.
- Because influence scores (and self-influence scores) are computed by first computing influence embeddings, the `_influence` and `self_influence` methods of both implementations call the `_influence_helper_intermediate_quantities_influence_function` and `_self_influence_helper_intermediate_quantities_influence_function` helper functions, which both assume the implementation implements the `compute_intermediate_quantities` method.

# Reason for inheritance structure
`InfluenceFunctionBase` refers to any implementation that computes the "infinitesimal" influence score (as opposed to `TracInCPBase`, which computes the checkpoint-based definition of influence score). Thus the different "base" classes implement differently-defined influence scores, and children of a base class compute the same influence score in different ways. `IntermediateQuantitiesInfluenceFunction` refers to implementations of `InfluenceFunctionBase` that implement the `compute_intermediate_quantities` method. The reason we don't let `NaiveInfluenceFunction` and `ArnoldiInfluenceFunction` directly inherit from `InfluenceFunctionBase` is that their implementations of `influence` and `self_influence` are actually identical (though for logging reasons, we cannot just move those methods into `IntermediateQuantitiesInfluenceFunction`). In the future, there may be implementations of `InfluenceFunctionBase` that do *not* inherit from `IntermediateQuantitiesInfluenceFunction`, e.g. the LISSA approach of Koh et al.

# Key helper methods
- `captum._utils._stateless.functional_call` is copied from the [PyTorch 1.13.0 implementation](https://github.com/pytorch/pytorch/blob/17202b363780a06ae07e5cecceffaae6418ad6f8/torch/nn/utils/stateless.py) so that the user does not need the latest PyTorch version. It turns a PyTorch `module` into a function whose inputs are the parameters of the `module` (represented as a dictionary). This function is used to compute the Hessian in `NaiveInfluenceFunction`, and Hessian-vector products in `ArnoldiInfluenceFunction`.
- `_compute_dataset_func` is used by `NaiveInfluenceFunction` to compute the Hessian over `hessian_dataset`. This is done by calculating the Hessian over individual batches, and then summing them up. One complication is that `torch.autograd.functional.hessian`, which we use to compute Hessians, does not return the Hessian as a 2D tensor unless the function we seek the Hessian of accepts a 1D tensor. Therefore, we need to define a function of the model's parameters whose input is the parameters, *flattened* into a 1D tensor (and a batch). This function is given by the factory returned by `naive_influence_function._flatten_forward_factory`.
- `_parameter_arnoldi` performs the Arnoldi iteration and is used by `ArnoldiInfluenceFunction`. It differs from a "traditional" implementation in that the Hessian-vector function it accepts does not map from 1D tensor to 1D tensor. Instead, it maps from tuple of tensors to tuple of tensors, because the "vector" in this case represents a parameter setting, which PyTorch represents as a tuple of tensors. Therefore, all the operations work with tuples of tensors, which required defining various operations for tuples of tensors in `captum.influence._utils.common`. This method returns a basis for the Krylov subspace, and the restriction of the Hessian to it.
- `_parameter_distill` takes the output of `_parameter_arnoldi` and returns the (approximate) top-k eigenvalues / eigenvectors of the Hessian. This is what is needed to compute `R`. It is used by `ArnoldiInfluenceFunction`.
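The functional-call plus Hessian-vector-product pattern these helpers build on can be sketched as follows. This is a hedged example: `torch.func.functional_call` (available in recent PyTorch) stands in for the copied `captum._utils._stateless.functional_call`, and the model, data, and loss are placeholders:

```python
import torch
from torch.autograd.functional import hvp

# Placeholder model and batch; the real code works with user-supplied modules.
model = torch.nn.Linear(3, 1)
x, y = torch.randn(8, 3), torch.randn(8, 1)
names = [n for n, _ in model.named_parameters()]

def loss_given_params(*params):
    # View the module as a function of its parameters (a dictionary).
    out = torch.func.functional_call(model, dict(zip(names, params)), (x,))
    return torch.nn.functional.mse_loss(out, y)

params = tuple(p.detach() for p in model.parameters())
v = tuple(torch.randn_like(p) for p in params)  # a "vector" is a tuple of tensors
_, hv = hvp(loss_given_params, params, v)        # hv has the same tuple structure
```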
# Tests
We create a new test file `tests/influence/_core/test_arnoldi_influence.py`, which defines the class `TestArnoldiInfluence`, implementing the following tests:

#### Tests used only by `NaiveInfluenceFunction`, i.e. appearing in this diff:
- `test_matches_linear_regression` compares the influence scores and self-influence scores produced by a given implementation with analytically-calculated counterparts for a model where the exact influence scores are known - linear regression (the analytical ground truth is sketched after this list). Different reductions for the loss function - 'mean', 'sum', 'none' - are tested. Here, we test the following implementation:
  - `NaiveInfluenceFunction` with `projection_dim=None`, i.e. we use the inverse Hessian, not a low-rank approximation of it. In this case, the influence scores should equal the analytically calculated ones, modulo numerical issues.
- `test_flatten_unflattener`: a common operation is flattening a tuple of tensors and unflattening it (the inverse operation). This test checks that flattening and unflattening a tuple of tensors gives back the original tuple.
- `test_top_eigen`: a common operation is finding the top eigenvectors / eigenvalues of a possibly non-symmetric matrix. Since `torch.linalg.eig` doesn't sort the eigenvalues, we use a wrapper that does. This checks that the wrapper is working properly.

#### Tests used only by `ArnoldiInfluenceFunction`, i.e. appearing in the next diff:
- `test_parameter_arnoldi` checks that `_parameter_arnoldi` is correct. In particular, it checks that the top-`k` eigenvalues of the restriction of `A` to a Krylov subspace (the `H` returned by `_parameter_arnoldi`) agree with those of the original matrix. This is a property we expect of the Arnoldi iteration that `_parameter_arnoldi` implements.
- `test_parameter_distill` checks that `_parameter_distill` is correct. In particular, it checks that the eigenvectors corresponding to the top eigenvalues it returns agree with the top eigenvectors of `A`. This is the property we require of `distill`, because we use the top eigenvectors (and eigenvalues) of the (implicitly-defined) `A` to calculate a low-rank approximation of its inverse.
- `test_matches_linear_regression`, where the implementation tested is the following:
  - `ArnoldiInfluenceFunction` with `arnoldi_dim` and `projection_dim` set to a large value. The Krylov subspace should contain the largest eigenvectors because `arnoldi_dim` is large, and `projection_dim` is not too large relative to `arnoldi_dim`, but still large on an absolute level.
- When `projection_dim` is small, `ArnoldiInfluenceFunction` and `NaiveInfluenceFunction` should produce the same influence scores, provided `arnoldi_dim` for `ArnoldiInfluenceFunction` is large, since in this case the top-k eigenvalues / eigenvectors for the two implementations should agree. This agreement is tested in `test_compare_implementations_trained_NN_model_and_data` and `test_compare_implementations_random_model_and_data`, for a trained and an untrained 2-layer NN, respectively.
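A hedged sketch of the linear-regression ground truth such a test can check against (shapes and names are illustrative, not the test's actual code): for squared loss with 'sum' reduction, the Hessian in the weights is `2 X'X`, so exact influence scores are available in closed form.

```python
import torch

# Illustrative ground truth for influence scores in linear regression
# (sum-reduction squared loss); not the actual test code.
n, d = 20, 3
X = torch.randn(n, d)
y = X @ torch.randn(d) + 0.1 * torch.randn(n)

# Optimally-trained weights via the normal equations.
w = torch.linalg.solve(X.T @ X, X.T @ y)

H = 2 * X.T @ X                    # Hessian of sum_i (x_i @ w - y_i)^2
def grad(i):                       # gradient of example i's squared loss at w
    return 2 * (X[i] @ w - y[i]) * X[i]

# Exact "infinitesimal" influence of example 0 on example 1.
influence_01 = grad(0) @ torch.linalg.solve(H, grad(1))
```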
# Minor changes / functionalities / tests
- `test_tracin_intermediate_quantities_aggregate`, `test_tracin_self_influence`, and `test_tracin_identity_regression` are applied to both implementations.
- `_set_active_params` now extracts the layers to consider when computing gradients and sets their `requires_grad`. This refactoring is done since the same logic is used by `TracInCPBase` and `InfluenceFunctionBase`.
- Some helpers are moved from `tracincp` to `captum.influence._utils.common`.
- A separate `test_loss_fn` initialization argument is supported, and both implementations are now tested in `TestTracinRegression.test_tracin_constant_test_loss_fn`.
- `compute_intermediate_quantities` for both implementations supports the `aggregate` option. This means that both implementations can be used with D40386079, the validation influence FAIM workflow.
- Given the aforementioned tests, testing now generates multiple kinds of models / data. The ability to do so is added to `get_random_model_and_data`. The specific model (and its parameters) is specified by the `model_type` argument. Before, the method only supported the random 2-layer NN. Now, it also supports an optimally-trained linear regression, and a 2-layer NN trained with SGD.
- `TracInCP` and implementations of `InfluenceFunctionBase` all accept a `sample_wise_grads_per_batch` option, and have the same requirements on the loss function. Thus, `_check_loss_fn_tracincp`, which previously performed those checks, is renamed to `_check_loss_fn_sample_wise_grads_per_batch` and moved to `captum.influence._utils.common`. Similarly, those implementations all need to compute the jacobian, with the method depending on `sample_wise_grads_per_batch`. The jacobian computation is moved to the helper function `_compute_jacobian_sample_wise_grads_per_batch`.

Differential Revision: https://www.internalfb.com/diff/D40541294?entry_point=27
fbshipit-source-id: cd94a98782d0aa2f012c9cf36e31ed13d58dc1d4
Force-pushed from 031426e to b2c3b96
This pull request has been merged in 68d88cf.