	---
	base_model:
	- tiiuae/falcon-11B
	library_name: transformers
	tags:
	- mergekit
	- merge
	- lazymergekit
	license: apache-2.0
	language:
	- de
	---
	# sliced

	This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

	## Merge Details
	### Merge Method

	This model was merged using the passthrough merge method.

	### Models Merged

	The following models were included in the merge:
	* [tiiuae/falcon-11B](https://huggingface.co/tiiuae/falcon-11B)

	### Configuration

	The following YAML configuration was used to produce this model:

	```yaml
	slices:
	  - sources:
	      - model: tiiuae/falcon-11B
	        layer_range: [0, 24]
	  - sources:
	      - model: tiiuae/falcon-11B
	        layer_range: [55, 59]
	merge_method: passthrough
	dtype: bfloat16
	```
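To make the effect of the configuration concrete, the following minimal sketch lists which source layers survive the passthrough merge. It assumes that `layer_range` is half-open (start included, end excluded); the helper name `kept_layers` is hypothetical and not part of mergekit.

```python
# Layer ranges copied from the configuration above.
# Assumption: each range is half-open, i.e. [start, end).
slices = [(0, 24), (55, 59)]


def kept_layers(slices):
    """Return the source-layer indices retained by the passthrough merge."""
    layers = []
    for start, end in slices:
        layers.extend(range(start, end))
    return layers


layers = kept_layers(slices)
print(f"Retained layers: {len(layers)}")
print(f"First block: {layers[0]}..{layers[23]}, second block: {layers[24]}..{layers[-1]}")
```

Under that assumption the merged model stacks the first 24 layers of tiiuae/falcon-11B followed by four layers taken from near the end of the network.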

	[PruneMe](https://github.com/arcee-ai/PruneMe) was used to analyze layer similarity on 2,000 samples from the German (de) subset of wikimedia/wikipedia. The layer ranges to prune were chosen based on this analysis, with the aim of maintaining performance while reducing model size.

	![Layer Similarity Plot](https://cdn-uploads.huggingface.co/production/uploads/660c0a02cf274b3ab77dd6b7/k9VKXgqUuUr0EjGZf7Ick.png)
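The idea behind a layer-similarity analysis can be sketched as follows: if the representation entering a block of consecutive layers is nearly identical to the one leaving it, that block is a candidate for pruning. The sketch below is an illustration under stated assumptions, not PruneMe's actual code. It uses angular distance as the similarity measure, toy 2-dimensional vectors in place of real hidden states averaged over samples, and the hypothetical helper names `angular_distance` and `most_similar_block`.

```python
import math


def angular_distance(u, v):
    """Angular distance in [0, 1] between two vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cosine) / math.pi


def most_similar_block(hidden_states, n):
    """Find the n-layer block whose input and output representations are closest.

    hidden_states[l] is the representation entering layer l.
    Returns (start_layer, distance) for the block with the smallest distance.
    """
    best = None
    for start in range(len(hidden_states) - n):
        dist = angular_distance(hidden_states[start], hidden_states[start + n])
        if best is None or dist < best[1]:
            best = (start, dist)
    return best


# Toy stand-ins for per-layer hidden states (illustrative values only).
states = [[1.0, 0.0], [0.6, 0.8], [0.59, 0.81], [0.58, 0.82], [0.0, 1.0]]
start, dist = most_similar_block(states, n=2)
print(f"Most redundant 2-layer block starts at layer {start}")
```

In a real analysis the vectors would be hidden states collected from the model on the sampled text, and the distance would be averaged over all samples before choosing a block.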

	```python
	import torch
	import transformers
	from transformers import AutoTokenizer

	model = "ssmits/Falcon2-5.5B-German"

	# Load the tokenizer and build a text-generation pipeline in bfloat16.
	tokenizer = AutoTokenizer.from_pretrained(model)
	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model,
	    tokenizer=tokenizer,
	    torch_dtype=torch.bfloat16,
	)

	# Sample one completion with top-k sampling; max_length counts the prompt tokens too.
	sequences = pipeline(
	    "Can you explain the concepts of Quantum Computing?",
	    max_length=200,
	    do_sample=True,
	    top_k=10,
	    num_return_sequences=1,
	    eos_token_id=tokenizer.eos_token_id,
	)
	for seq in sequences:
	    print(f"Result: {seq['generated_text']}")
	```

	💥 Falcon LLMs require PyTorch 2.0 for use with `transformers`!
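A quick way to guard against an older installation is to compare the installed version string with the required minimum. The sketch below is one possible check; the helper `meets_minimum` is hypothetical, and it assumes version strings of the form `major.minor.patch` with an optional local suffix such as `+cu118`.

```python
def meets_minimum(version, minimum=(2, 0)):
    """Return True if a version string like '2.1.0+cu118' is at least `minimum`."""
    release = version.split("+")[0]  # drop a local build suffix, if any
    major, minor = (int(part) for part in release.split(".")[:2])
    return (major, minor) >= minimum


# Example version strings (illustrative, not read from an installation).
for candidate in ["1.13.1", "2.0.0", "2.1.0+cu118"]:
    print(candidate, meets_minimum(candidate))
```

With PyTorch installed, the same check can be applied to `torch.__version__` before loading the model.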

	For fast inference with Falcon, check out [Text Generation Inference](https://github.com/huggingface/text-generation-inference)! Read more in this [blog post](https://huggingface.co/blog/falcon).

	## Direct Use
	Research on large language models; as a foundation for further specialization and finetuning for specific use cases (e.g., summarization, text generation, chatbots).

	## Out-of-Scope Use
	Production use without adequate assessment of risks and mitigation; any use cases which may be considered irresponsible or harmful.

	## Bias, Risks, and Limitations
	Falcon2-5.5B is trained mostly on English, but also on German, Spanish, French, Italian, Portuguese, Polish, Dutch, Romanian, Czech, and Swedish. It will not generalize appropriately to other languages. Furthermore, as it is trained on large-scale corpora representative of the web, it will carry the stereotypes and biases commonly encountered online.

	## Recommendations
	We recommend that users of Falcon2-5.5B consider finetuning it for their specific tasks of interest, and that guardrails and appropriate precautions be put in place for any production use.