---
license: other
license_name: falcon-llm-license
license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html
datasets:
- PleIAs/common_corpus
base_model:
- tiiuae/Falcon3-10B-Base
---

Falcon3-Continued-0.3-10B-Base is built using artificial intelligence technology from the Technology Innovation Institute.

This model was produced by continued pretraining of Falcon3-10B-Base with QLoRA and Unsloth on an additional 30,720 rows from PleIAs/common_corpus, trained in cycles.

Each cycle trained on 2048, 4096, or 8192 rows, using a cosine-decay learning rate schedule. A merged model was saved and tested every 10,240 rows.
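As a rough illustration of what cosine decay means for cycles of different sizes, here is a minimal sketch. The peak learning rate, the number of rows per optimizer step, the absence of warmup, and the decay to zero are all assumptions made for the example; the card does not state them.

```python
import math


def cosine_decay_lr(step: int, total_steps: int, peak_lr: float, min_lr: float = 0.0) -> float:
    """Learning rate at `step` for one cosine-decay cycle without warmup."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp so that steps outside the cycle map to its start or end.
    progress = min(max(step, 0), total_steps) / total_steps
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Hypothetical values: neither is given in the model card.
PEAK_LR = 2e-4
ROWS_PER_STEP = 16

for rows_in_cycle in (2048, 4096, 8192):
    total_steps = rows_in_cycle // ROWS_PER_STEP
    start = cosine_decay_lr(0, total_steps, PEAK_LR)
    middle = cosine_decay_lr(total_steps // 2, total_steps, PEAK_LR)
    end = cosine_decay_lr(total_steps, total_steps, PEAK_LR)
    print(f"{rows_in_cycle} rows: {total_steps} steps, lr {start:.2e} -> {middle:.2e} -> {end:.2e}")
```

Under these assumptions every cycle starts at the peak rate and ends near zero, so a larger cycle simply stretches the same curve over more steps.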

Adapter rank ranged from 32 to 128, with ranks 64 and 128 being the most common. Weight decay was 0.01.
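To give a sense of how adapter rank scales the number of trainable parameters, the sketch below counts the weights LoRA adds to a single linear layer. The layer shape is a hypothetical placeholder, not the actual Falcon3-10B-Base dimensions, and the card does not say which modules were adapted.

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters LoRA adds to one linear layer.

    The update is B @ A, with A of shape (rank, d_in) and B of shape (d_out, rank).
    """
    return rank * (d_in + d_out)


# Hypothetical layer shape, for illustration only.
D_IN, D_OUT = 4096, 4096

for rank in (32, 64, 128):
    added = lora_param_count(D_IN, D_OUT, rank)
    frozen = D_IN * D_OUT
    print(f"rank {rank}: {added:,} adapter params ({added / frozen:.2%} of {frozen:,} frozen weights)")
```

The count grows linearly with rank, so a rank 128 adapter has four times the trainable parameters of a rank 32 adapter on the same layers.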

Trained context length ranged from 4096 to the full 32768 tokens, with 32768 being the most common. Sample packing was not used. Long documents, if present, were truncated.
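The difference between truncating each row and packing rows together can be sketched on toy token lists. Both helper functions are hypothetical illustrations, not the training code used for this model; `pack_rows` is shown only for contrast, since packing was not used.

```python
from typing import List


def truncate_rows(rows: List[List[int]], max_length: int) -> List[List[int]]:
    """Keep every row as its own example and drop tokens past max_length."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    return [row[:max_length] for row in rows]


def pack_rows(rows: List[List[int]], max_length: int) -> List[List[int]]:
    """For contrast only: concatenate all rows and cut fixed-size chunks."""
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    flat = [token for row in rows for token in row]
    return [flat[i:i + max_length] for i in range(0, len(flat), max_length)]


# Toy token-id sequences of length 5, 12, and 3.
rows = [list(range(5)), list(range(12)), list(range(3))]

print([len(r) for r in truncate_rows(rows, max_length=8)])  # [5, 8, 3]
print([len(r) for r in pack_rows(rows, max_length=8)])      # [8, 8, 4]
```

With truncation, each example contains text from exactly one document, at the cost of discarding the tail of long documents and leaving short rows shorter than the context window.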

Training continued until this method no longer improved the eq_bench score. Most other benchmarks stayed close to the base model's results.

hf (pretrained=Lambent/Falcon3-Continued-0.3-10B-Base,dtype=auto,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: 0, batch_size: auto

| Tasks  |Version|Filter|n-shot|     Metric      |   | Value  |   |Stderr|
|--------|------:|------|-----:|-----------------|---|-------:|---|-----:|
|eq_bench|    2.1|none  |     0|eqbench          |↑  | 64.2105|±  |2.1413|
|        |       |none  |     0|percent_parseable|↑  |100.0000|±  |0.0000|

hf (pretrained=Lambent/Falcon3-Continued-0.3-10B-Base), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: auto:4

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.8105|±  |0.0108|
|     |       |strict-match    |     5|exact_match|↑  |0.8036|±  |0.0109|

hf (pretrained=Lambent/Falcon3-Continued-0.3-10B-Base), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: auto:4 (4,64,64,64)

|    Tasks    |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|-------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_challenge|      1|none  |     0|acc     |↑  |0.5401|±  |0.0146|
|             |       |none  |     0|acc_norm|↑  |0.5648|±  |0.0145|
|piqa         |      1|none  |     0|acc     |↑  |0.7873|±  |0.0095|
|             |       |none  |     0|acc_norm|↑  |0.7954|±  |0.0094|
|sciq         |      1|none  |     0|acc     |↑  |0.9620|±  |0.0060|
|             |       |none  |     0|acc_norm|↑  |0.9500|±  |0.0069|
|winogrande   |      1|none  |     0|acc     |↑  |0.7332|±  |0.0124|
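The `hf (...)` header lines above record the lm-evaluation-harness settings for each run. A sketch of how the eq_bench settings might be passed to the harness from Python follows. The helper names `build_eval_kwargs` and `run_eval` are hypothetical, and the exact harness version used for this card is not stated, so argument handling may differ between releases.

```python
from typing import Any, Dict, List, Optional


def build_eval_kwargs(
    pretrained: str,
    tasks: List[str],
    num_fewshot: Optional[int] = None,
    batch_size: str = "auto",
    extra_model_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Assemble keyword arguments for lm_eval.simple_evaluate from header-line settings."""
    model_args = {"pretrained": pretrained, **(extra_model_args or {})}
    return {
        "model": "hf",
        # The harness accepts model arguments as a comma-separated key=value string.
        "model_args": ",".join(f"{key}={value}" for key, value in model_args.items()),
        "tasks": tasks,
        "num_fewshot": num_fewshot,
        "batch_size": batch_size,
    }


def run_eval(**kwargs: Any) -> Dict[str, Any]:
    """Run the harness. This downloads the model and needs lm_eval plus suitable hardware."""
    import lm_eval  # imported lazily so build_eval_kwargs works without the package

    return lm_eval.simple_evaluate(**kwargs)["results"]


# Settings copied from the eq_bench header line above.
EQ_BENCH_KWARGS = build_eval_kwargs(
    "Lambent/Falcon3-Continued-0.3-10B-Base",
    tasks=["eq_bench"],
    num_fewshot=0,
    extra_model_args={"dtype": "auto", "trust_remote_code": True},
)
print(EQ_BENCH_KWARGS["model_args"])
```

Calling `run_eval(**EQ_BENCH_KWARGS)` would then start the evaluation. The other runs differ only in the task list, `num_fewshot`, and a `batch_size` of `"auto:4"`.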

MuSR:

| Status | Model | Task | Prompt style | Correct / Total | Accuracy (%) |
|--------|-------|------|--------------|----------------:|-------------:|
| RUNNING | Lambent/Falcon3-Continued-0.3-10B-Base | murder mysteries | regular | 134 / 250 | 53.6 |
| RUNNING | Lambent/Falcon3-Continued-0.3-10B-Base | object placements | regular | 130 / 256 | 50.8 |
| RUNNING | Lambent/Falcon3-Continued-0.3-10B-Base | team allocation | regular | 100 / 250 | 40.0 |
| RUNNING | Lambent/Falcon3-Continued-0.3-10B-Base | murder mysteries | cot+ | 145 / 250 | 58.0 |
| RUNNING | Lambent/Falcon3-Continued-0.3-10B-Base | object placements | cot+ | 83 / 256 | 32.4 |
| RUNNING | Lambent/Falcon3-Continued-0.3-10B-Base | team allocation | cot+ | 112 / 250 | 44.8 |
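The last number in each MuSR row above is the percentage of correct answers, rounded to one decimal place. A small sketch that recomputes it from the correct and total counts (the function name is hypothetical):

```python
def accuracy_percent(correct: int, total: int) -> float:
    """Percentage of correct answers, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * correct / total, 1)


# (task, prompt style, correct, total) for Lambent/Falcon3-Continued-0.3-10B-Base,
# copied from the MuSR rows above.
musr_rows = [
    ("murder mysteries", "regular", 134, 250),
    ("object placements", "regular", 130, 256),
    ("team allocation", "regular", 100, 250),
    ("murder mysteries", "cot+", 145, 250),
    ("object placements", "cot+", 83, 256),
    ("team allocation", "cot+", 112, 250),
]

for task, style, correct, total in musr_rows:
    print(f"{task} ({style}): {correct} / {total} = {accuracy_percent(correct, total):.1f}%")
```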

Original model (tiiuae/Falcon3-10B-Base) under the same conditions:

hf (pretrained=tiiuae/Falcon3-10B-Base,dtype=auto,trust_remote_code=True), gen_kwargs: (None), limit: None, num_fewshot: 0, batch_size: auto

| Tasks  |Version|Filter|n-shot|     Metric      |   | Value  |   |Stderr|
|--------|------:|------|-----:|-----------------|---|-------:|---|-----:|
|eq_bench|    2.1|none  |     0|eqbench          |↑  | 60.9913|±  |2.2402|
|        |       |none  |     0|percent_parseable|↑  |100.0000|±  |0.0000|

hf (pretrained=tiiuae/Falcon3-10B-Base), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: auto:4

|Tasks|Version|     Filter     |n-shot|  Metric   |   |Value |   |Stderr|
|-----|------:|----------------|-----:|-----------|---|-----:|---|-----:|
|gsm8k|      3|flexible-extract|     5|exact_match|↑  |0.8188|±  |0.0106|
|     |       |strict-match    |     5|exact_match|↑  |0.8105|±  |0.0108|

hf (pretrained=tiiuae/Falcon3-10B-Base), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: auto:4 (4,64,64,64)

|    Tasks    |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|-------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_challenge|      1|none  |     0|acc     |↑  |0.5520|±  |0.0145|
|             |       |none  |     0|acc_norm|↑  |0.5887|±  |0.0144|
|piqa         |      1|none  |     0|acc     |↑  |0.7873|±  |0.0095|
|             |       |none  |     0|acc_norm|↑  |0.7949|±  |0.0094|
|sciq         |      1|none  |     0|acc     |↑  |0.9610|±  |0.0061|
|             |       |none  |     0|acc_norm|↑  |0.9360|±  |0.0077|
|winogrande   |      1|none  |     0|acc     |↑  |0.7364|±  |0.0124|

MuSR:

| Status | Model | Task | Prompt style | Correct / Total | Accuracy (%) |
|--------|-------|------|--------------|----------------:|-------------:|
| RUNNING | tiiuae/Falcon3-10B-Base | murder mysteries | regular | 144 / 250 | 57.6 |
| RUNNING | tiiuae/Falcon3-10B-Base | object placements | regular | 124 / 256 | 48.4 |
| RUNNING | tiiuae/Falcon3-10B-Base | team allocation | regular | 126 / 250 | 50.4 |
| RUNNING | tiiuae/Falcon3-10B-Base | murder mysteries | cot+ | 140 / 250 | 56.0 |
| RUNNING | tiiuae/Falcon3-10B-Base | object placements | cot+ | 139 / 256 | 54.3 |
| RUNNING | tiiuae/Falcon3-10B-Base | team allocation | cot+ | 118 / 250 | 47.2 |
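One way to read the lm-evaluation-harness scores of the two models side by side is to divide each difference by the combined standard error. The sketch below does that for a few metrics copied from the tables above. It treats the two runs as independent samples, which is a simplifying assumption because both models were scored on the same test items, so the result is only a rough guide.

```python
import math


def z_score(value_a: float, stderr_a: float, value_b: float, stderr_b: float) -> float:
    """Difference between two scores in units of their combined standard error."""
    return (value_a - value_b) / math.sqrt(stderr_a ** 2 + stderr_b ** 2)


# (metric, continued value, continued stderr, base value, base stderr),
# copied from the tables above.
comparisons = [
    ("eq_bench eqbench", 64.2105, 2.1413, 60.9913, 2.2402),
    ("gsm8k strict-match", 0.8036, 0.0109, 0.8105, 0.0108),
    ("arc_challenge acc_norm", 0.5648, 0.0145, 0.5887, 0.0144),
    ("sciq acc_norm", 0.9500, 0.0069, 0.9360, 0.0077),
]

for name, cont, cont_se, base, base_se in comparisons:
    z = z_score(cont, cont_se, base, base_se)
    print(f"{name}: delta {cont - base:+.4f}, z = {z:+.2f}")
```

Under this assumption, none of the four differences exceeds two combined standard errors in either direction, including the eq_bench gain that training was tuned for.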
