	---
	license: mit
	language:
	- it
	---

	# Model Card for Modello Italia 9B INT4 group-size 128 GPU-optimized
	This is an UNOFFICIAL conversion/quantization of the OFFICIAL model checkpoint of "Modello Italia 9B", a Large Language Model (LLM) developed by [iGenius](https://it.igenius.ai/) in collaboration with [CINECA](https://www.cineca.it/).
	* More information about Modello Italia: [click here](https://it.igenius.ai/language-models).

	This model has been quantized to INT4 with group size 128 and optimized for inference on GPU.
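
To make "INT4, group-size 128" concrete, here is a minimal sketch of group-wise quantization in plain Python. It assumes simple asymmetric round-to-nearest with one scale and one minimum stored per group of 128 weights; the function names are hypothetical, and this is not the auto-round implementation, which additionally tunes the rounding.

```python
def quantize(weights, group_size=128, bits=4):
    """Split weights into groups and quantize each group to `bits`-bit codes.

    Illustrative round-to-nearest only; returns a list of
    (codes, scale, minimum) tuples, one per group.
    """
    qmax = (1 << bits) - 1  # 15 for INT4
    packed = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        scale = (hi - lo) / qmax or 1.0  # fall back to 1.0 for a constant group
        codes = [min(qmax, max(0, round((w - lo) / scale))) for w in group]
        packed.append((codes, scale, lo))
    return packed


def dequantize(packed):
    """Rebuild approximate float weights from the per-group codes."""
    restored = []
    for codes, scale, lo in packed:
        restored.extend(q * scale + lo for q in codes)
    return restored
```

Each group keeps its own scale, so an outlier in one group of 128 weights does not degrade the resolution of the others. The reconstruction error per weight is at most half of that group's scale.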

	## 🚨 Disclaimers
	* This is an UNOFFICIAL quantization of the OFFICIAL model checkpoint released by iGenius.
	* This model is also based on the conversion made for HF Transformers by [Sapienza NLP, Sapienza University of Rome](https://huggingface.co/sapienzanlp).
	* The original model was developed using LitGPT; therefore, the weights need to be converted before they can be used with Hugging Face transformers.

	## 🚨 Terms and Conditions
	* Note: By using this model, you accept iGenius' [terms and conditions](https://secure.igenius.ai/legal/italia_terms_and_conditions.pdf).

	## 🚨 Reproducibility
	This model has been quantized using Intel [auto-round](https://github.com/intel/auto-round), which is based on the [SignRound technique](https://arxiv.org/pdf/2309.05516v4).

	```
	git clone https://github.com/fbaldassarri/model-conversion.git

	cd model-conversion

	mkdir models

	cd models

	huggingface-cli download --resume-download --local-dir sapienzanlp_modello-italia-9b --local-dir-use-symlinks False sapienzanlp/modello-italia-9b
	```

	Then run the quantization script:

	```
	python3 ./examples/language-modeling/main.py \
	--model_name ./models/sapienzanlp_modello-italia-9b \
	--device 0 \
	--group_size 128 \
	--bits 4 \
	--iters 1000 \
	--deployment_device 'gpu' \
	--output_dir "./models/sapienzanlp_modello-italia-9b-int4" \
	--train_bs 2 \
	--gradient_accumulate_steps 8
	```
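
For readers curious what the tuning iterations are doing, the following is a heavily simplified sketch of the idea behind SignRound: learn a small rounding offset per weight with signed gradient descent so that the layer output stays close to the full-precision output. It assumes a single weight vector, a handful of calibration inputs, symmetric INT4, and a straight-through gradient estimate. The names `tune_rounding`, `fake_quant`, and `output_error` are hypothetical, and the learnable clipping range and block-wise reconstruction described in the paper are omitted. It is not the auto-round code.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fake_quant(w, v, scale, qmax=7):
    """Quantize to signed INT4 using rounding offsets v, then dequantize."""
    return [scale * max(-qmax - 1, min(qmax, round(wi / scale + vi)))
            for wi, vi in zip(w, v)]


def output_error(w, w_q, inputs):
    """Squared difference between the outputs x.w and x.w_q over all inputs."""
    return sum((dot(x, w_q) - dot(x, w)) ** 2 for x in inputs)


def tune_rounding(w, inputs, iters=200, lr=0.005):
    """Search for rounding offsets in [-0.5, 0.5] that lower the output error."""
    scale = max(abs(wi) for wi in w) / 7 or 1.0
    v = [0.0] * len(w)  # v = 0 is plain round-to-nearest
    best_v, best_err = v[:], output_error(w, fake_quant(w, v, scale), inputs)
    for _ in range(iters):
        w_q = fake_quant(w, v, scale)
        residuals = [dot(x, w_q) - dot(x, w) for x in inputs]
        # Straight-through estimate: treat round() as identity,
        # so d(w_q[j]) / d(v[j]) is approximated by scale.
        grad = [2 * scale * sum(r * x[j] for r, x in zip(residuals, inputs))
                for j in range(len(w))]
        # Signed gradient step: only the sign of the gradient is used.
        v = [max(-0.5, min(0.5, vj - lr * ((g > 0) - (g < 0))))
             for vj, g in zip(v, grad)]
        err = output_error(w, fake_quant(w, v, scale), inputs)
        if err < best_err:  # keep the best offsets seen so far
            best_v, best_err = v[:], err
    return best_v, best_err, scale
```

Because the search starts from round-to-nearest and keeps the best result, the returned error is never worse than the round-to-nearest baseline on the calibration inputs. The `--iters 1000` flag in the command above presumably sets the number of such tuning steps.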

	## 🚨 Biases and Risks
	From iGenius' terms and conditions for Modello Italia (translated from Italian):
	> Modello Italia is designed to be used by everyone and to adapt to a wide range of use cases. It was designed with the aim of being accessible to people from different backgrounds, experiences, and perspectives. Modello Italia addresses users and their needs without inserting superfluous judgments or normativity, while recognizing that even content that is potentially problematic in certain contexts can serve valid purposes in others. Respect for the dignity and autonomy of all users, especially in terms of freedom of thought and expression, is a fundamental pillar of its design. However, being a new technology, Modello Italia carries risks associated with its use. The tests conducted so far were carried out in Italian and could not cover all possible situations. Therefore, as with all LLMs, it is not possible to predict the outputs of Modello Italia in advance, and the model may in some cases generate inaccurate, biased, or otherwise objectionable responses. Before using Modello Italia in any context, developers are strongly encouraged to perform safety testing and tuning specific to their applications.

	We are aware of the biases and potentially problematic or toxic content that current pretrained large language models exhibit. More specifically, as probabilistic models of language (here Italian and English), they reflect and amplify the biases of their training data.

	For more information about this issue, please refer to our survey paper:
	* [Biases in Large Language Models: Origins, Inventory, and Discussion](https://dl.acm.org/doi/full/10.1145/3597307)

	## Model architecture
	* The model architecture is based on GPT-NeoX.
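
Since the architecture is GPT-NeoX, the quantized checkpoint can in principle be loaded through the generic Hugging Face `transformers` Auto classes. The sketch below assumes the output directory from the quantization command, that `transformers`, `accelerate`, and a GPTQ-compatible backend are installed, and that the exported format is one `transformers` recognizes. The `generate` helper is a hypothetical name, and none of this has been verified against this specific checkpoint.

```python
def generate(model_dir, prompt, max_new_tokens=64):
    """Load a quantized checkpoint from `model_dir` and complete `prompt`.

    Assumes a CUDA GPU plus the quantization backend required by the
    checkpoint format; imports are deferred so the module loads without them.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_dir)
    # device_map="auto" (requires accelerate) places the weights on the available GPU.
    model = AutoModelForCausalLM.from_pretrained(model_dir, device_map="auto")
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call might look like `generate("./models/sapienzanlp_modello-italia-9b-int4", "La capitale d'Italia è")`, where the path is the `--output_dir` used in the Reproducibility section.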

	## Results
	Modello Italia 9B INT4 group-size 128 GPU-optimized has not yet been evaluated on standard benchmarks.
	If you would like to contribute an evaluation, please feel free to submit a pull request.