In the Interpreting Bert Layers section (cell 26):
lc = LayerConductance(squad_pos_forward_func, model.bert.encoder.layer[i])
layer_attributions_start = lc.attribute(
    inputs=input_embeddings,
    baselines=ref_input_embeddings,
    additional_forward_args=(token_type_ids, position_ids, attention_mask, 0))[0]
layer_attributions_end = lc.attribute(
    inputs=input_embeddings,
    baselines=ref_input_embeddings,
    additional_forward_args=(token_type_ids, position_ids, attention_mask, 1))[0]
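For context, my reading of squad_pos_forward_func (defined earlier in the tutorial) is roughly the following; this is paraphrased, and predict is the tutorial's helper that invokes the model:

def squad_pos_forward_func(inputs, token_type_ids=None, position_ids=None,
                           attention_mask=None, position=0):
    # `inputs` is whatever Captum perturbs: input_ids in the earlier cells,
    # or the precomputed embeddings once bert.embeddings has been wrapped.
    pred = predict(inputs,
                   token_type_ids=token_type_ids,
                   position_ids=position_ids,
                   attention_mask=attention_mask)
    pred = pred[position]  # position 0 -> start logits, 1 -> end logits
    return pred.max(1).values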
where input_embeddings and ref_input_embeddings are computed in construct_whole_bert_embeddings (cell 7):
input_embeddings = interpretable_embedding.indices_to_embeddings(input_ids, token_type_ids=token_type_ids, position_ids=position_ids)
ref_input_embeddings = interpretable_embedding.indices_to_embeddings(ref_input_ids, token_type_ids=token_type_ids, position_ids=position_ids)
and interpretable_embedding is configured in cell 25:
interpretable_embedding = configure_interpretable_embedding_layer(model, 'bert.embeddings')
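My mental model of what this wrapping does is sketched below. This is a deliberate simplification with a hypothetical class name, not Captum's actual implementation:

import torch.nn as nn

class IdentityEmbeddingWrapper(nn.Module):  # hypothetical name, for illustration
    def __init__(self, original_embeddings):
        super().__init__()
        self.original_embeddings = original_embeddings  # the real BertEmbeddings

    def indices_to_embeddings(self, *args, **kwargs):
        # Runs the real BertEmbeddings layer, combining the word, position,
        # and token type embeddings (this is what cell 7 calls).
        return self.original_embeddings(*args, **kwargs)

    def forward(self, embeddings, *args, **kwargs):
        # After wrapping, the model's embedding stage is a pass-through:
        # the precomputed embeddings flow straight to the encoder, and any
        # extra arguments the model forwards here appear to be ignored.
        return embeddings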
To my understanding, the input_embeddings and ref_input_embeddings obtained this way are outputs of the BertEmbeddings layer, i.e. the sum of the word, position, and token type embeddings. They are not the "inputs" arg that squad_pos_forward_func would normally receive, which should be input_ids. If this is true, why are token_type_ids and position_ids still needed in additional_forward_args?
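To make the question concrete: under the pass-through reading sketched above, I would expect the following two calls to agree (a hypothetical check, not from the tutorial):

out_with_ids = squad_pos_forward_func(
    input_embeddings, token_type_ids, position_ids, attention_mask, 0)
out_without_ids = squad_pos_forward_func(
    input_embeddings, None, None, attention_mask, 0)
# I would expect torch.allclose(out_with_ids, out_without_ids) to hold.
# Is that correct, or do the ids still matter somewhere downstream?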
- Da Xiao