This repository contains the code for reproducing the results of "On the Expressivity Role of LayerNorm in Transformers' Attention" (Findings of ACL 2023) [PDF].
Make sure you have a wandb.ai account and that you are logged in on your machine.
Install the required Python packages:
pip install -r requirements.txt
Gurobi is needed to find unselectable keys, and it requires a license. See the Gurobi website for licensing details.
In general, all experiments can run on either GPU or CPU.
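The prerequisites above (a wandb login, the Python packages, and Gurobi) can be checked before launching an experiment. The following is a minimal sketch and not part of this repository: the helper names `missing_packages` and `wandb_credentials_present` are hypothetical, the import names `wandb` and `gurobipy` are assumed to be what requirements.txt provides, and the credential check is only a best-effort guess based on the `WANDB_API_KEY` environment variable or a `~/.netrc` file that mentions `api.wandb.ai`. It does not verify that a Gurobi license is valid.

```python
import importlib.util
import os
from pathlib import Path


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def wandb_credentials_present(netrc_path=None):
    """Best-effort check for wandb credentials (assumption, not an official check).

    Returns True if WANDB_API_KEY is set, or if the netrc file exists and
    mentions api.wandb.ai. It does not confirm that the credentials are valid.
    """
    if os.environ.get("WANDB_API_KEY"):
        return True
    path = Path(netrc_path) if netrc_path is not None else Path.home() / ".netrc"
    if not path.is_file():
        return False
    return "api.wandb.ai" in path.read_text(errors="ignore")


if __name__ == "__main__":
    # Assumed import names; adjust to match requirements.txt.
    missing = missing_packages(["wandb", "gurobipy"])
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All checked packages are importable.")
    if not wandb_credentials_present():
        print("No wandb credentials found; try logging in to wandb first.")
```

Saving this as a standalone script and running it with the same interpreter used for the experiments only reports what it finds; it does not install anything or change any configuration.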
- The majority subdirectory contains the files needed to reproduce the results of the Majority task (Figures 1a, 1b, 2, 3).
- The unselectable subdirectory contains the files needed to reproduce the results of the unselectable experiments (Figures 1c, 1d, 4; Tables 1, 2).
On the Expressivity Role of LayerNorm in Transformers' Attention
@article{brody2023expressivity,
title={On the Expressivity Role of LayerNorm in Transformers' Attention},
author={Brody, Shaked and Alon, Uri and Yahav, Eran},
journal={arXiv preprint arXiv:2305.02582},
year={2023}
}
