turkish-qa-transformers

Exploring the Effectiveness of Pre-Trained Transformer Models for Turkish Question Answering

This repository contains the implementation of our research article titled "Exploring the Effectiveness of Pre-Trained Transformer Models for Turkish Question Answering." The study evaluates several pre-trained transformer architectures for their performance on a Turkish question answering (QA) benchmark dataset.

📄 Paper

If you use this code in your research, please cite the following paper:

Abdullah Talha Kabakus. Exploring the Effectiveness of Pre-Trained Transformer Models for Turkish Question Answering. [Under Review].

📊 Dataset

We use the publicly available gold-standard Turkish QA dataset released by Soygazi et al. (2021). It is structured in SQuAD-style JSON format and contains:

	•	14,221 QA pairs for training
	•	3,114 QA pairs for evaluation
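
For orientation, here is a minimal sketch of reading a SQuAD-style file and flattening it into one record per question. It assumes the dataset follows the standard SQuAD v1.1 nesting (`data` → `paragraphs` → `qas` → `answers`); the exact field names in the Turkish dataset have not been verified here. The `load_squad` and `flatten_squad` helpers and the in-memory sample are illustrative only and are not part of this repository.

```python
import json


def load_squad(path):
    """Read a SQuAD-style JSON file (the path is supplied by the caller)."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def flatten_squad(squad):
    """Flatten SQuAD-style nested JSON into one record per question."""
    records = []
    for article in squad["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                answers = qa.get("answers", [])
                records.append({
                    "id": qa.get("id"),
                    "question": qa["question"],
                    "context": context,
                    "answer_texts": [a["text"] for a in answers],
                    "answer_starts": [a["answer_start"] for a in answers],
                })
    return records


# Tiny made-up sample in the assumed layout (not taken from the dataset).
sample = {
    "data": [{
        "title": "Ankara",
        "paragraphs": [{
            "context": "Ankara, Türkiye'nin başkentidir.",
            "qas": [{
                "id": "q1",
                "question": "Türkiye'nin başkenti neresidir?",
                "answers": [{"text": "Ankara", "answer_start": 0}],
            }],
        }],
    }]
}

records = flatten_squad(sample)
print(records[0]["question"], "->", records[0]["answer_texts"])
```

With a real file, `flatten_squad(load_squad(path))` would produce the same flat structure, which is convenient to pass to the `datasets` library or a tokenizer.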

Note: We do not host the dataset directly due to licensing. You can download it from the original source.

🔍 Models Evaluated

	•	electra-base-turkish-cased-discriminator
	•	XLM-RoBERTa
	•	bert-base-turkish-cased
	•	DistilBERT
	•	T5-Small

All models were evaluated using the Hugging Face transformers and datasets libraries.

⚙️ Installation

git clone https://github.com/talhakabakus/turkish-qa-transformers.git
cd turkish-qa-transformers
pip install -r requirements.txt

🚀 How to Run
	1.	Download the dataset and place it in the data/ folder.
	2.	Launch the notebook:

jupyter notebook train-models.ipynb

	3.	Follow the steps to:
	•	Load data
	•	Preprocess using tokenizers
	•	Fine-tune transformer models
	•	Evaluate using F1 Score, Exact Match, and BLEU
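
The preprocessing step for extractive QA usually involves converting each answer's character offsets into token positions. The sketch below illustrates that alignment with a whitespace tokenizer as a stand-in, so it runs without downloading a model. With Hugging Face fast tokenizers, comparable offsets can be requested via `return_offsets_mapping=True`. The helper names are hypothetical, and the notebook's actual preprocessing may differ.

```python
import re


def whitespace_tokenize_with_offsets(text):
    """Stand-in tokenizer: (token, (start, end)) for each whitespace-delimited token."""
    return [(m.group(), (m.start(), m.end())) for m in re.finditer(r"\S+", text)]


def char_span_to_token_span(offsets, answer_start, answer_end):
    """Return (first, last) indices of tokens overlapping [answer_start, answer_end)."""
    covered = [
        i for i, (start, end) in enumerate(offsets)
        if start < answer_end and end > answer_start
    ]
    if not covered:
        return None  # answer lies outside the tokenized text (e.g. truncated context)
    return covered[0], covered[-1]


# Made-up example; in SQuAD-style data, answer_start comes from the JSON file.
context = "Ankara, Türkiye'nin başkentidir."
answer_text = "Türkiye'nin"
answer_start = context.index(answer_text)
answer_end = answer_start + len(answer_text)

tokens = whitespace_tokenize_with_offsets(context)
offsets = [span for _, span in tokens]
span = char_span_to_token_span(offsets, answer_start, answer_end)
print(span, [tok for tok, _ in tokens[span[0]:span[1] + 1]])
```

Returning `None` for uncovered answers mirrors a common choice in QA pipelines, where examples whose answer falls outside a truncated context window are either skipped or labeled as unanswerable.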

📈 Evaluation Metrics

We use:
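	•	F1 Score: Harmonic mean of token-level precision and recall between the predicted and reference answers
	•	Exact Match (EM): Share of predictions that match the reference answer string exactly
	•	BLEU: N-gram precision-based similarity between the predicted and reference answers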
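
As an illustration of the first two metrics, here is a dependency-free sketch of Exact Match and token-level F1 in the style of the SQuAD evaluation script. The normalization (lowercasing and whitespace collapsing) is an assumption; the notebook may normalize differently. BLEU is left out to keep the example short.

```python
from collections import Counter


def normalize(text):
    """Assumed normalization: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token-level precision and recall."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Made-up prediction/reference pair.
print(exact_match("başkent Ankara", "Ankara"))
print(round(token_f1("başkent Ankara", "Ankara"), 4))
```

One caveat for Turkish text: Python's `str.lower()` is locale-independent, so it maps `I` to `i` rather than `ı`, and `İ` to `i` followed by a combining dot. A Turkish-aware lowercasing step may be needed if casing varies between predictions and references.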

🧪 Reproducibility

The entire workflow can be reproduced by running the Jupyter notebook from start to finish. Evaluation uses the original dataset's predefined train/test split.

⸻

⭐ Acknowledgments

Thanks to Soygazi et al. (2021) for providing the Turkish QA dataset.
