MamaBot-Llama / README.md

HelpMum-Personal

Update README.md

aca319c verified 18 days ago

preview code

raw

history blame contribute delete

No virus

5.9 kB


	---
	library_name: transformers
	tags:
	- maternal-healthcare
	- causal-language-model
	- Llama
	- transformers
	- healthcare-chatbot
	- open-source
	- fine-tuning
	---

	# Model Card for MamaBot-Llama

	MamaBot-Llama is an opensource fine-tuned large language model developed by HelpMum to assist with maternal healthcare by providing accurate and reliable answers to questions about pregnancy, maternal and child health. The model has been fine-tuned on Llama 3.1 8b using a dataset of maternal healthcare questions and answers.

	## Model Details

	- Developed by: HelpMum
	- Shared by : HelpMum
	- Model type: Causal Language Model (Llama 3.1 8b)
	- Language(s) (NLP): English
	- License: Apache-2.0
	- Finetuned from model: Llama 3.1 8b

	### Model Sources

	- Repository: [MamaBot-Llama on Hugging Face](https://huggingface.co/HelpMumHQ/MamaBot-Llama)

	## Uses

	### Direct Use

	MamaBot-Llama can be directly used to provide answers to maternal healthcare questions, offering guidance and support to mothers during pregnancy, maternal and child health.

	### Downstream Use

	The model can be integrated into healthcare applications, chatbots, or other systems that aim to provide maternal healthcare support.

	### Out-of-Scope Use

	The model is not intended for use in medical diagnosis or treatment without the supervision of a qualified healthcare professional. It should not be used for malicious purposes or misinformation.

	## Bias, Risks, and Limitations

	The model was trained on a specific dataset related to maternal healthcare. While it aims to provide accurate and supportive information, users should be aware of the following:

	- Bias: The model may reflect biases present in the training data, which could affect the quality and impartiality of the responses.
	- Risks: Users should not rely solely on the model for critical medical decisions. Always consult with a healthcare professional for medical advice.
	- Limitations: The model's responses are based on the data it was trained on and may not cover all possible scenarios or latest medical guidelines.

	### Recommendations

	Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. It is recommended to use the model as a supplementary tool and not as a primary source of medical advice.

	## How to Get Started with the Model

	Use the code below to get started with the model (make sure you have access to the model).

	```python
	!pip install -q -U transformers bitsandbytes

	from huggingface_hub import HfFolder
	HfFolder.save_token('hf_...')

	from transformers import AutoModelForCausalLM, AutoTokenizer
	tokenizer = AutoTokenizer.from_pretrained('HelpMumHQ/MamaBot-Llama')
	model = AutoModelForCausalLM.from_pretrained('HelpMumHQ/MamaBot-Llama')

	def generate_response(user_message):
	tokenizer.chat_template = "{%- for message in messages %}{{ bos_token + '[INST] ' + message['content'] + ' [/INST]' if message['role'] == 'user' else ' ' + message['content'] + ' ' + eos_token }}{%- endfor %}"
	messages = [{"role": "user", "content": user_message}]
	prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	inputs = tokenizer(prompt, return_tensors='pt', truncation=True).to("cuda")
	outputs = model.generate(**inputs, max_length=150, num_return_sequences=1)
	text = tokenizer.decode(outputs[0], skip_special_tokens=True)
	response = (text[text.find('[/INST]') + len('[/INST]'):text.find('[INST]', text.find('[/INST]') + len('[/INST]'))] if text.find('[INST]', text.find('[/INST]') + len('[/INST]')) != -1 else text[text.find('[/INST]') + len('[/INST]'):]).strip().split('[/INST]')[0].strip()
	return response

	# Sample usage
	user_message = "Why might mothers not realize they are already pregnant in the first two weeks?"
	response = generate_response(user_message)
	print(response)
	```

	## Training Details

	### Training Data

	The training data consists of a HelpMum-created dataset of maternal healthcare questions and answers covering all stages of pregnancy up to birth.

	### Training Procedure

	#### Preprocessing

	The dataset was cleaned and formatted to align with the required input format for the model.

	#### Training Hyperparameters

	- Training regime: torch.bfloat16
	- Optimizer: paged_adamw_32bit
	- Learning rate: 2e-4

	## Evaluation

	### Testing Data, Factors & Metrics

	#### Testing Data

	The testing data is a subset of the training dataset, split into training and testing sets.

	#### Factors

	The evaluation considered the training and validation losses.

	#### Metrics

	The model was evaluated based on training loss and validation loss metrics.

	### Results

	- Training Loss: 0.4654
	- Validation Loss: 0.5168

	#### Summary

	The model showed consistent performance with a training loss of 0.4654 and a validation loss of 0.5168, indicating its effectiveness in answering maternal healthcare questions.

	## Environmental Impact

	- Hardware Type: GPU

	## Technical Specifications

	### Model Architecture and Objective

	The model is based on the Llama 3.1 8b architecture and aims to provide accurate and supportive responses to maternal healthcare questions.

	### Compute Infrastructure

	#### Hardware

	The model was trained using GPUs to handle the computational load of fine-tuning a large language model.

	#### Software

	The training and inference were conducted using the Hugging Face Transformers library and other associated tools.

	## Citation

	BibTeX:

	```bibtex
	@misc{mamabot-llama,
	author = {HelpMum},
	title = {MamaBot-Llama},
	year = {2024},
	howpublished = {\url{https://huggingface.co/HelpMumHQ/MamaBot-Llama}},
	}
	```

	APA:

	HelpMum. (2024). MamaBot-Llama. Retrieved from https://huggingface.co/HelpMumHQ/MamaBot-Llama

	## Model Card Contact

	For more information, please contact [tech@helpmum.org](mailto:tech@helpmum.org).