Mpt-Instruct-DotNet-XS / README.md

Kabumbus

Usage example

59929e0 about 1 year ago


	---
	license: cc-by-sa-3.0
	language:
	- en
	pipeline_tag: text-generation
	tags:
	- csharp
	- mpt
	- instruct
	- 1b
	- llm
	- .net
	---
	Upsides:
	- C# code generation and explanation quality close to (slightly below) the 7B [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S),
	- 1B parameters (2.6 GB, fine-tuned in bfloat16),
	- about 6x smaller than the 7B model,
	- more than 4x faster than the 7B model


	Downsides:
	- Sometimes suffers from repetitive, non-terminating responses to general discussion questions
	- Slightly worse at code generation than the 7B model
	- No GGML/llama.cpp support for running on CPU yet
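
One common way to curb the repetition noted above is to pass repetition controls to `model.generate`. The sketch below only builds the keyword arguments. `repetition_penalty` and `no_repeat_ngram_size` are standard Hugging Face `generate` parameters, but the helper name `build_generation_kwargs` and the values are illustrative assumptions, not settings tested on this model; they may also hurt code output, where repeated tokens are legitimate.

```python
# Hypothetical helper: values are illustrative, not tuned for this model.
def build_generation_kwargs(prompt_len, context_limit=1024, max_new=512):
    """Return kwargs for model.generate that discourage repeated output."""
    # Keep prompt + completion within context_limit, but always allow 1 token.
    budget = max(1, min(max_new, context_limit - prompt_len))
    return {
        "max_new_tokens": budget,
        "do_sample": False,            # greedy decoding, as in the usage example
        "repetition_penalty": 1.2,     # assumed value; >1.0 penalizes repeats
        "no_repeat_ngram_size": 4,     # assumed value; blocks repeated 4-grams
    }


kwargs = build_generation_kwargs(prompt_len=40)
print(kwargs)
```

The resulting dictionary could be unpacked into a call such as `model.generate(input_tokens, **kwargs)`.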

	Based on [mosaicml/mpt-1b-redpajama-200b-dolly](https://huggingface.co/mosaicml/mpt-1b-redpajama-200b-dolly)

	Same data sources as in [Nethermind/Mpt-Instruct-DotNet-S](https://huggingface.co/Nethermind/Mpt-Instruct-DotNet-S)

	Usage example:
	```python
	import torch
	import transformers
	from transformers import AutoTokenizer

	# Load the fine-tuned model in bfloat16 and move it to the first GPU.
	model_name = "Nethermind/Mpt-Instruct-DotNet-XS"
	model = transformers.AutoModelForCausalLM.from_pretrained(
	    model_name,
	    torch_dtype=torch.bfloat16,
	    trust_remote_code=True,
	)
	model.to('cuda:0')
	model.eval()

	# The model uses the GPT-NeoX tokenizer; reuse EOS as the padding token.
	tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
	tokenizer.pad_token = tokenizer.eos_token

	INSTRUCTION_KEY = "### Instruction:"
	RESPONSE_KEY = "### Response:"
	PROMPT_FOR_GENERATION_FORMAT = """{system}
	{instruction_key}
	{instruction}
	{response_key}
	""".format(
	    system="{system}",
	    instruction_key=INSTRUCTION_KEY,
	    instruction="{instruction}",
	    response_key=RESPONSE_KEY,
	)


	def output_loop(input_tokens):
	    # Cap new tokens at 512 and keep prompt + completion within 1024 tokens.
	    max_new_tokens = max(1, min(512, 1024 - input_tokens.shape[1]))
	    print(input_tokens.shape[1], max_new_tokens)
	    # Greedy decoding: with do_sample=False, top_k/top_p would be ignored.
	    return model.generate(input_tokens.to('cuda:0'), max_new_tokens=max_new_tokens, do_sample=False)


	def give_answer(
	    instruction="Create a loop over [0, 6, 7, 77] that prints its contents",
	    system="Below is an instruction that describes a task. Write a response that appropriately completes the request.",
	):
	    # Fill the prompt template, tokenize, generate, and decode.
	    question = PROMPT_FOR_GENERATION_FORMAT.format(system=system, instruction=instruction)
	    tokenized_question = tokenizer.encode(question, return_tensors='pt')
	    outputs = output_loop(tokenized_question)
	    answer = tokenizer.batch_decode(outputs, skip_special_tokens=True)
	    print(answer)
	    return answer

	give_answer("What is the main difference between a struct and a class in C#?")
	```

	Output:
	```
	A struct is a value type, which means it can only hold a few values. It is often used as a placeholder for other data types. A class, on the other hand, is a reference type, which means it can hold references to other data types.
	```
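
For a decoder-only model, the decoded text from `generate` typically contains the prompt followed by the completion. A minimal sketch of isolating the answer, assuming the `### Response:` marker from the usage example appears in the decoded text; `extract_response` is a hypothetical helper, not part of the model's code:

```python
RESPONSE_KEY = "### Response:"  # same marker as in the usage example


def extract_response(decoded_text, response_key=RESPONSE_KEY):
    """Return the text after the first response marker, or all text if absent."""
    _, marker, tail = decoded_text.partition(response_key)
    return tail.strip() if marker else decoded_text.strip()


# Hand-written stand-in for a decoded generation (not real model output).
decoded = (
    "Below is an instruction that describes a task.\n"
    "### Instruction:\n"
    "What is a struct?\n"
    "### Response:\n"
    "A struct is a value type."
)
print(extract_response(decoded))
```

With the usage example, this could be applied to each string in the list returned by `give_answer`.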

	Generation time on an RTX 4090 by number of new tokens:
	- 2 s for 128 tokens
	- 5 s for 256 tokens
	- 11 s for 512 tokens
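
The timings above can be converted into throughput. This short sketch just divides the listed token counts by the listed seconds; the helper name `tokens_per_second` is illustrative:

```python
# (new tokens, seconds) pairs copied from the RTX 4090 timings listed above
timings = [(128, 2), (256, 5), (512, 11)]


def tokens_per_second(new_tokens, seconds):
    """Average generation throughput for one measurement."""
    return new_tokens / seconds


for new_tokens, seconds in timings:
    rate = tokens_per_second(new_tokens, seconds)
    print(f"{new_tokens} tokens: {rate:.1f} tokens/s")
```

This works out to roughly 64, 51, and 47 tokens per second, so throughput drops as the generated sequence gets longer.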

	Code generation prompt:
	> Generate code to answer the question.
	>
	> How would you retrieve and analyse the fee history for the last 100 blocks and determine the average gas price?

	Example of code output:
	```csharp
	public async Task<decimal> GetFeeHistoryGasPriceAverage()
	{
	    // Get the fee history
	    ResultWrapper<FeeHistoryResults> result = await _ethRpc.eth_feeHistory(100, BlockParameter.Latest,
	        new double[] { 50, 75, 90 });
	    // Check if the API call succeeded
	    if (result.Result!= Result.Success)
	    {
	        throw new Exception("Failed to retrieve fee history");
	    }
	    // Get the gas price average
	    decimal averageGasPrice = result.Data.BaseFeePerGas.Average();

	    return averageGasPrice;
	}
	```
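
The code-generation prompt shown earlier could be assembled with the same instruction template as in the usage example. The sketch below assumes the first line of that prompt is the system text and the second is the instruction; the README does not say how the two were split, and `build_prompt` is a hypothetical helper.

```python
INSTRUCTION_KEY = "### Instruction:"
RESPONSE_KEY = "### Response:"


def build_prompt(system, instruction):
    """Fill the instruction template used in the usage example."""
    return f"{system}\n{INSTRUCTION_KEY}\n{instruction}\n{RESPONSE_KEY}\n"


prompt = build_prompt(
    system="Generate code to answer the question.",  # assumed system line
    instruction=(
        "How would you retrieve and analyse the fee history for the last "
        "100 blocks and determine the average gas price?"
    ),
)
print(prompt)
```

The resulting string would then be tokenized and passed to `model.generate` in the same way as in the usage example.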