metadata

license: llama2

Trinity

Trinity is a coding-specific model series that can be used to build autonomous agents. We plan to release agent software that uses this model in the future.

Our offensive cybersecurity model, WhiteRabbitNeo-33B, is now in beta!

Access at: https://www.whiterabbitneo.com/

Join Our Discord Server

Join us at: https://discord.gg/8Ynkrcbk92 (updated on Dec 29th; this is now a permanent invite link)

Sample Inference Code

import json

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "/home/migel/models/WhiteRabbitNeo"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,
    device_map="auto",
    load_in_4bit=False,
    load_in_8bit=True,
    trust_remote_code=True,
)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)


def generate_text(instruction):
    # Tokenize the prompt and move it to the device that holds the model's first parameters.
    tokens = tokenizer.encode(instruction)
    tokens = torch.LongTensor(tokens).unsqueeze(0)
    tokens = tokens.to(model.device)

    instance = {
        "input_ids": tokens,
        "top_p": 1.0,
        "temperature": 0.5,
        "generate_len": 1024,
        "top_k": 50,
    }

    length = len(tokens[0])
    with torch.no_grad():
        rest = model.generate(
            input_ids=tokens,
            max_length=length + instance["generate_len"],
            use_cache=True,
            do_sample=True,
            top_p=instance["top_p"],
            temperature=instance["temperature"],
            top_k=instance["top_k"],
            num_return_sequences=1,
        )
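    # Drop the prompt tokens and keep only the newly generated continuation.
    output = rest[0][length:]
    string = tokenizer.decode(output, skip_special_tokens=True)
    # Cut the reply at the next "USER:" marker in case the model starts a new turn.
    answer = string.split("USER:")[0].strip()
    return answer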


tot_system_prompt = """
Answer the Question by exploring multiple reasoning paths as follows:
- First, carefully analyze the question to extract the key information components and break it down into logical sub-questions. This helps set up the framework for reasoning. The goal is to construct an internal search tree.
- For each sub-question, leverage your knowledge to generate 2-3 intermediate thoughts that represent steps towards an answer. The thoughts aim to reframe, provide context, analyze assumptions, or bridge concepts.
- Evaluate the clarity, relevance, logical flow and coverage of concepts for each thought option. Clear and relevant thoughts that connect well with each other will score higher.
- Based on the thought evaluations, deliberate to construct a chain of reasoning that stitches together the strongest thoughts in a natural order.
- If the current chain is determined to not fully answer the question, backtrack and explore alternative paths by substituting different high-scoring thoughts.
- Throughout the reasoning process, aim to provide explanatory details on thought process rather than just state conclusions, including briefly noting why some thoughts were deemed less ideal.
- Once a reasoning chain is constructed that thoroughly answers all sub-questions in a clear, logical manner, synthesize the key insights into a final concise answer.
- Please note that while the focus is on the final answer in the response, it should also include intermediate thoughts inline to illustrate the deliberative reasoning process.
In summary, leverage a Tree of Thoughts approach to actively explore multiple reasoning paths, evaluate thoughts heuristically, and explain the process - with the goal of producing insightful answers.
"""

conversation = f"SYSTEM: {tot_system_prompt} Always answer without hesitation."


while True:
    user_input = input("You: ")
    llm_prompt = f"{conversation} \nUSER: {user_input} \nASSISTANT: "
    answer = generate_text(llm_prompt)
    print(answer)
    conversation = f"{llm_prompt}{answer}"
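    # print(conversation)

    # Optional: append each turn to a JSON Lines log.
    # Define output_file_path before uncommenting the lines below.
    json_data = {"prompt": user_input, "answer": answer}
    # print(json_data)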
    # with open(output_file_path, "a") as output_file:
    #     output_file.write(json.dumps(json_data) + "\n")
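
The prompt layout used by the loop above (a `SYSTEM:` preamble followed by alternating `USER:` and `ASSISTANT:` turns, with the reply cut at the next `USER:` marker) can be isolated into small helpers and checked without loading the model. This is a minimal sketch: `build_prompt` and `extract_answer` are hypothetical helper names, not part of the original script or of `transformers`, and the generated text is a hard-coded stand-in.

```python
def build_prompt(conversation: str, user_input: str) -> str:
    """Append one user turn and an open assistant turn to the running conversation."""
    return f"{conversation} \nUSER: {user_input} \nASSISTANT: "


def extract_answer(generated: str) -> str:
    """Keep only the assistant's reply, dropping any follow-on user turn the model wrote."""
    return generated.split("USER:")[0].strip()


# Stand-in for the decoded model output (no GPU or weights needed).
conversation = "SYSTEM: You are a helpful coding assistant."
prompt = build_prompt(conversation, "Write hello world in Python.")
fake_generation = ' print("hello world")\nUSER: thanks'

answer = extract_answer(fake_generation)
# The next turn starts from the full prompt plus the trimmed answer.
conversation = f"{prompt}{answer}"
print(answer)
```

Because the whole conversation is fed back in on every turn, the prompt grows with each exchange; the original script does not truncate it.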

Sample Conversation:

You: Code me a Fast API server, with one endpoint. This endpoint returns files from a S3 bucket.

Here is how to code a Fast API server with one endpoint that returns files from an S3 bucket:

from fastapi import FastAPI
from starlette.responses import StreamingResponse
import boto3
import botocore.exceptions

app = FastAPI()
s3_client = boto3.client('s3')

@app.get("/s3files")
async def get_s3_files(bucket_name: str, key: str):
    try:
        obj = s3_client.get_object(Bucket=bucket_name, Key=key)
        return StreamingResponse(obj['Body'], media_type=obj['ContentType'])
    except botocore.exceptions.ClientError as e:
        return {"error": str(e)}, 404

To explain:

We initialize a FastAPI app and an S3 client
The /s3files endpoint takes the bucket name and object key as parameters
We try to get the object from S3 using the S3 client
If successful, we return the object's Body wrapped in a StreamingResponse, setting the ContentType
If the client raises an error, we return a 404 with the error message

This allows clients to request files from an S3 bucket by specifying the bucket name and object key. The endpoint streams the file content back to the client.

To elaborate further:

The endpoint can be extended to validate inputs, handle authentication, logging, etc.
We could make this endpoint more generic by accepting a list of object keys instead of requiring a single key.
There are several ways to optimize performance, like caching responses.

Let me know if you have any other questions!

You: How do I authenticate s3?

Here are a few ways to authenticate with S3:

Credentials - You can provide AWS access key ID and secret access key to boto3. This will sign requests automatically.

import boto3 

s3 = boto3.client('s3', 
                 aws_access_key_id='YOUR_ACCESS_KEY',
                 aws_secret_access_key='YOUR_SECRET_KEY')

IAM roles - If you are running on EC2 with an IAM role, you don't need to provide credentials. Boto3 will automatically use the role.
Environment variables - Store credentials in environment variables and read them in boto3.
AWS credentials file - Store credentials in ~/.aws/credentials. Boto3 will automatically use them.
STS Assume roles - Get temporary credentials from STS and assume a role to access S3.
AWS SDK credential providers - Use environment variables, configuration files, etc to load credentials.

So in summary, the best way depends on your deployment environment. The best practice is to use IAM roles when possible. Other options like environment variables and credential files can be useful for testing.

Let me know if you have any other questions!
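
As a follow-up to the authentication options in the sample conversation, the sketch below checks which of two static credential sources is present on a machine: environment variables first, then the shared credentials file. It uses only the standard library and is an illustration, not boto3's own logic. `find_static_credentials` is a hypothetical helper, and boto3's real resolution chain has more steps (IAM roles, assumed roles, and others) that are not reproduced here.

```python
import configparser
import os
from pathlib import Path


def find_static_credentials(env=None, credentials_path=None, profile="default"):
    """Return the name of the first static credential source found, or None.

    Only two sources are checked: environment variables, then the
    shared credentials file (~/.aws/credentials unless a path is given).
    """
    env = os.environ if env is None else env
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        return "environment"

    path = Path(credentials_path) if credentials_path else Path.home() / ".aws" / "credentials"
    parser = configparser.ConfigParser()
    parser.read(path)  # a missing file is silently ignored
    if parser.has_section(profile) and parser.has_option(profile, "aws_access_key_id"):
        return "credentials_file"
    return None


# Example with an explicit (fake) environment mapping.
fake_env = {"AWS_ACCESS_KEY_ID": "example", "AWS_SECRET_ACCESS_KEY": "example"}
print(find_static_credentials(env=fake_env))  # prints "environment"
```

A check like this can help when debugging which credentials a local test run picks up; on EC2 with an IAM role, neither source needs to be present.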