---
datasets:
- deepset/germanquad
language:
- de
metrics:
- squad
pipeline_tag: question-answering
---

# mdeberta-base-v3 GermanSQuAD Model

This repository contains the mdeberta-base-v3 model fine-tuned on the GermanSQuAD dataset for extractive question answering (QA). Given a German question and a German context passage, the model extracts the span of the context that answers the question.

## Overview

- **Language model**: mdeberta-base-v3
- **Language**: German
- **Downstream-task**: Extractive Question Answering
- **Training data**: GermanSQuAD
- **Evaluation data**: SQuAD (translated to German for consistent evaluation)

## Model Training

The model was trained with the following hyperparameters:

- **Batch size**: 12
- **Number of epochs**: 4
- **Base language model**: mdeberta-base-v3
- **Learning rate**: 2e-5
- **Learning rate schedule**: Linear with Warmup
- **Warmup proportion**: 0.1

These hyperparameters were selected to balance training efficiency against the quality of the resulting model on the extractive QA task.

## Results

The model achieved the following results on the evaluation data:

- **Exact Match (EM)**: 64.56%
- **F1 Score**: 82.51%

Exact Match is the share of predictions that are identical to a reference answer. F1 measures token-level overlap between the predicted and reference answers, so it also credits predictions that are only partially correct.

## Usage

To use this model for extractive question answering in German, load it with the Hugging Face Transformers library.
Below is a quick example:

```python
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "adresolo/mdeberta-v3-base-germansquad"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

# Example question and context
# (English: "What is the main goal of the QA task?")
question = "Was ist das Hauptziel der QA-Aufgabe?"
context = "Die Hauptaufgabe der Fragebeantwortung (QA) ist es, aus einem gegebenen Kontext die genaue Antwort auf eine gestellte Frage zu extrahieren."

inputs = tokenizer(question, context, return_tensors="pt")

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)

# Most likely start and end token positions of the answer span
answer_start = torch.argmax(outputs.start_logits)
answer_end = torch.argmax(outputs.end_logits) + 1  # +1 because the slice end is exclusive

# Decode the predicted token span back into text
answer = tokenizer.decode(inputs["input_ids"][0][answer_start:answer_end], skip_special_tokens=True)
print("Predicted answer:", answer)
```
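
To make the Exact Match and F1 figures in the Results section easier to interpret, the following is a minimal sketch of how SQuAD-style scores are commonly computed for a single prediction. It is an illustration under stated assumptions, not the evaluation script used for this model: the helper names `normalize`, `exact_match`, and `f1_score` are hypothetical, and the normalization shown (lowercasing, removing ASCII punctuation, collapsing whitespace) is a simplification. The official SQuAD script additionally strips English articles, which is omitted here.

```python
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop ASCII punctuation, collapse whitespace (assumed normalization)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between a predicted and a reference answer."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # If either side is empty, score 1.0 only when both are empty
        return float(pred_tokens == ref_tokens)
    # Count tokens shared by both answers, respecting multiplicity
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical prediction that captures only part of the reference answer
reference = "die genaue Antwort"
prediction = "genaue Antwort"
print(exact_match(prediction, reference))         # 0.0
print(round(f1_score(prediction, reference), 2))  # 0.8
```

In this example the prediction misses one of the three reference tokens, so it scores 0 on Exact Match but 0.8 on F1. This is why the reported F1 is typically higher than the reported EM. When a question has several reference answers, SQuAD-style evaluation usually takes the maximum score over the references and then averages over all questions.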