
Malaysian Llama2 Sentiment Analysis Model (GGUF Version)

Overview

This repository contains a GGUF (GPT-Generated Unified Format) version of the kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2 model, specifically adapted for sentiment analysis of Malay text from social media. This GGUF version allows for efficient inference on various platforms and devices.

Model Details

Usage

Prompt

Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
---
### Teks: tulis teks, tweet, atau ayat yang anda ingin analisa di ruangan ini.
---
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif.
Jawab dengan hanya satu perkataan: "positif" atau "negatif".

Sentimen:

Example:

Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
---
### Teks: alhamdulillah terima kasih sis support saya 🥹
---
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif.
Jawab dengan hanya satu perkataan: "positif" atau "negatif".

Sentimen:
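The template above can also be assembled programmatically. A minimal sketch (`PROMPT_TEMPLATE` and `build_prompt` are our own illustrative names, not part of the model card):

```python
# Template for the Malay sentiment-analysis prompt shown above.
PROMPT_TEMPLATE = """Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
---
### Teks: {text}
---
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif.
Jawab dengan hanya satu perkataan: "positif" atau "negatif".

Sentimen:"""

def build_prompt(text: str) -> str:
    """Return the full prompt for one tweet or sentence."""
    return PROMPT_TEMPLATE.format(text=text.strip())

print(build_prompt("alhamdulillah terima kasih sis support saya"))
```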

1. Using with llama.cpp

  1. Clone the llama.cpp repository and build it:

    git clone https://github.com/ggerganov/llama.cpp.git
    cd llama.cpp
    make

    (Newer releases of llama.cpp have replaced the Makefile with CMake; if make fails, follow the build instructions in the llama.cpp README.)
    
  2. Download the GGUF model file from this repository.

  3. Save the prompt shown above to a text file (e.g., sentiment-prompt.txt), then run inference:

    ./main -m path/to/your/model.gguf -n 8 --temp 0 -f sentiment-prompt.txt

    Replace path/to/your/model.gguf with the actual path to the downloaded GGUF file. A small token budget (-n 8) is enough, since the model should answer with a single word. (In newer llama.cpp builds the binary is named llama-cli instead of main.)
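If you prefer Python over the CLI, the same GGUF file can be loaded with the llama-cpp-python bindings (pip install llama-cpp-python). This is a sketch, not part of the original card; the model path is a placeholder and the helper names are our own:

```python
def build_prompt(text: str) -> str:
    """Wrap a piece of text in the sentiment-analysis prompt shown above."""
    return (
        "Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.\n"
        "---\n"
        f"### Teks: {text}\n"
        "---\n"
        "Kenal pasti sama ada teks ini secara keseluruhannya mengandungi "
        "sentimen positif atau negatif.\n"
        'Jawab dengan hanya satu perkataan: "positif" atau "negatif".\n\n'
        "Sentimen:"
    )

def classify(model_path: str, text: str) -> str:
    """Load the GGUF model and return the model's one-word answer."""
    from llama_cpp import Llama  # imported here; requires llama-cpp-python

    llm = Llama(model_path=model_path, n_ctx=4096, verbose=False)
    out = llm(build_prompt(text), max_tokens=8, temperature=0.0)
    return out["choices"][0]["text"].strip()

# Example (requires the downloaded GGUF file):
# print(classify("path/to/your/model.gguf", "alhamdulillah terima kasih sis support saya"))
```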

2. Using with UI-based Systems

This GGUF model can be used with various UI-based systems for an easier, more user-friendly experience:

  1. GPT4All:

    • Download GPT4All from https://gpt4all.io/
    • In the application, go to "Model Explorer"
    • Click on "Add your own GGUF model"
    • Select the downloaded GGUF file
    • Start chatting with the model
  2. Jan.AI:

    • Download Jan.AI from https://jan.ai/
    • In the application, go to the Models section
    • Click on "Add Model" and select "Import local model"
    • Choose the downloaded GGUF file
    • Once imported, you can start using the model in conversations
  3. Ollama:

    • Install Ollama from https://ollama.ai/
    • Create a model file named Modelfile (a plain-text file, no extension) with the following content:
      FROM /path/to/your/model.gguf
      
    • Replace /path/to/your/model.gguf with the actual path to this downloaded GGUF file.
    • Run the command: ollama create malaysian-sentiment -f Modelfile
    • Start chatting with: ollama run malaysian-sentiment
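Once the model has been created, Ollama also serves a local REST API (POST /api/generate on port 11434 by default), so the same prompt can be sent from Python using only the standard library. This is a sketch with our own helper names; the model name matches the ollama create step above:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(text: str) -> dict:
    """Assemble the /api/generate request body for one piece of text."""
    prompt = (
        "Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.\n"
        "---\n"
        f"### Teks: {text}\n"
        "---\n"
        "Kenal pasti sama ada teks ini secara keseluruhannya mengandungi "
        "sentimen positif atau negatif.\n"
        'Jawab dengan hanya satu perkataan: "positif" atau "negatif".\n\n'
        "Sentimen:"
    )
    return {"model": "malaysian-sentiment", "prompt": prompt, "stream": False}

def classify(text: str) -> str:
    """Send the prompt to a locally running Ollama server and return the answer."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()

# Example (requires the Ollama server to be running locally):
# print(classify("alhamdulillah terima kasih sis support saya"))
```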

3. Using directly with Python (unsloth library)

If you prefer Python, the following code loads the original Hugging Face checkpoint (not the GGUF file) with the unsloth library and runs inference. Note that this requires a CUDA-capable GPU:

from unsloth import FastLanguageModel

# Model configuration
max_seq_length = 4096  # Sequence length for inference; the base model supports longer contexts via RoPE Scaling
dtype = None  # Auto-detection (Float16 for Tesla T4, V100; Bfloat16 for Ampere+)
load_in_4bit = True  # Use 4-bit quantization to reduce memory usage

# Load the model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="kaiimran/malaysian-llama2-7b-32k-instructions-lora-sentiment-analysis-v2",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

# Enable faster inference
FastLanguageModel.for_inference(model)

# Prepare the prompt template
alpaca_prompt = """Lakukan analisis sentimen bagi teks di dalam tanda sempang berikut.
---
### Teks: {}
---
Kenal pasti sama ada teks ini secara keseluruhannya mengandungi sentimen positif atau negatif.
Jawab dengan hanya satu perkataan: "positif" atau "negatif".
Sentimen:
{}"""

# Example tweet for analysis
tweet = """
alhamdulillah terima kasih sis support saya ☺️ semoga sis dimurahkan rezeki dipanjangkan usia dan dipermudahkan segala urusan https://t.co/nSfNPGpiW8
"""

# Tokenize input
inputs = tokenizer(
    [alpaca_prompt.format(tweet, "")],
    return_tensors="pt"
).to("cuda")

# Generate output
outputs = model.generate(**inputs, max_new_tokens=10, use_cache=True)

# Print result
print(tokenizer.batch_decode(outputs)[0])
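Note that batch_decode returns the prompt together with the completion, so the one-word label still needs to be extracted. A small helper sketch (extract_sentiment is our own name, not part of the model card):

```python
def extract_sentiment(decoded: str) -> str:
    """Return 'positif' or 'negatif' from the decoded model output.

    The decoded string contains the full prompt followed by the
    completion, so only the text after the final 'Sentimen:' marker
    is inspected.
    """
    answer = decoded.rsplit("Sentimen:", 1)[-1].lower()
    if "positif" in answer:
        return "positif"
    if "negatif" in answer:
        return "negatif"
    return "unknown"

# Example usage with the generation code above:
# print(extract_sentiment(tokenizer.batch_decode(outputs)[0]))
```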

Notes

  • This model is specifically trained for sentiment analysis of Malay text from social media.
  • The base model uses RoPE Scaling to extend Llama 2's context window (the checkpoint name indicates support for up to 32k tokens); the Python example above uses a 4096-token sequence length.
  • 4-bit quantization is used by default to reduce memory usage, but this can be adjusted.
  • The GGUF format allows for efficient inference on various platforms and devices.

Contributing

Feel free to open issues or submit pull requests if you have suggestions for improvements or encounter any problems.

Acknowledgements

  • Thanks to the creators of the base model and the Malaysian tweets sentiment dataset.
  • This project was inspired by and follows the methodology outlined in this tutorial.
  • Thanks also to the developers of llama.cpp, GPT4All, Jan.AI, and Ollama for making GGUF models easy to run without writing code.
Model Specifications

  • Format: GGUF (16-bit)
  • Model size: 6.74B params
  • Architecture: llama