Model Card for Khalsa

A fine-tuned Gemma model, developed on the Intel Developer Cloud and trained on an Intel Max 1550 GPU.

Model Details

Model Description

A fine-tuned Gemma model developed on the Intel Developer Cloud.

  • Developed by: Manik Sethi, Britney Nguyen, Mario Miranda
  • Model type: Language model
  • Language(s) (NLP): English
  • License: apache-2.0
  • Parent Model: gemma-2b
  • Resources for more information: Intel Developer Cloud

Uses

The model is intended for people who struggle to understand the information in important documents. More specifically, the target audience includes immigrants and visa holders who struggle with English. When they receive documentation from employers, government agencies, or healthcare providers, the model should be able to answer questions they have about it.

Direct Use

The user uploads a PDF to the application, where it is parsed by our model. The user can then ask questions about the content of that document.
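As a rough illustration of the question-asking step, the sketch below combines already-extracted document text with a user question into a single prompt. The function name `build_prompt`, the prompt wording, and the `max_chars` budget are all assumptions made for this example; the application's actual parsing and prompt format are not described in this card.

```python
def build_prompt(document_text: str, question: str, max_chars: int = 4000) -> str:
    """Combine extracted document text and a user question into one prompt.

    Hypothetical helper: the prompt template and the character budget are
    illustrative assumptions, not the application's real format.
    """
    # Collapse runs of whitespace that PDF text extraction often leaves behind.
    cleaned = " ".join(document_text.split())
    # Keep only the first max_chars characters as a crude context budget.
    excerpt = cleaned[:max_chars]
    return (
        "Answer the question using only the document below.\n\n"
        f"Document:\n{excerpt}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


if __name__ == "__main__":
    sample = "Your appointment   is on\nMay 3 at 9:00 AM. Bring a photo ID."
    print(build_prompt(sample, "What should I bring?"))
```

Truncating by characters is the simplest possible budget; a real application would more likely count tokens or select the most relevant passages.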

Out-of-Scope Use

Relying on the model for legal advice is a misuse; it is not intended to provide legal advice.

Bias, Risks, and Limitations

Significant research has explored bias and fairness issues with language models (see, e.g., Sheng et al. (2021) and Bender et al. (2021)). Predictions generated by the model may include disturbing and harmful stereotypes across protected classes; identity characteristics; and sensitive, social, and occupational groups.

A current limitation is the number of languages in which the model can serve users.

Recommendations

To deliver the model's answers in a target language, we suggest first taking the output from the LLM and then translating it in a separate step. Asking the model to do both simultaneously may result in flawed responses.
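The recommendation can be sketched as a two-step pipeline: generate the answer first, then translate it. Both `generate` and `translate` are hypothetical callables supplied by the caller (for example, a wrapper around this model and a separate translation service); neither is an API provided by this model.

```python
from typing import Callable


def answer_then_translate(
    question: str,
    generate: Callable[[str], str],
    translate: Callable[[str, str], str],
    target_language: str,
) -> str:
    """Generate an answer, then translate it as a separate second step."""
    # Step 1: let the LLM answer without any translation instruction.
    answer = generate(question)
    # Step 2: hand the finished answer to a dedicated translator.
    return translate(answer, target_language)


if __name__ == "__main__":
    # Stand-in functions so the sketch runs without a model or translator.
    def fake_generate(q: str) -> str:
        return f"Answer to: {q}"

    def fake_translate(text: str, lang: str) -> str:
        return f"[{lang}] {text}"

    print(answer_then_translate("When is my visa due?", fake_generate, fake_translate, "es"))
```

Keeping the two steps separate also makes it possible to inspect the untranslated answer when a translated response looks wrong.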

Training Details

Training Data

The model was trained on the databricks-dolly-15k dataset. This dataset contains a diverse range of question-answer pairs spanning multiple categories. By focusing specifically on the question-answer pairs, fine-tuning adapts the model to give accurate and relevant responses to a variety of inquiries.

Training Procedure

Preprocessing

The dataset underwent preprocessing steps to extract question-answer pairs relevant to the "Question answering" category. This involved filtering the dataset to ensure that the model is fine-tuned on pertinent data, enhancing its ability to provide accurate responses.
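A minimal sketch of this kind of filtering is shown below. It assumes each record carries the `instruction`, `response`, and `category` fields used by databricks-dolly-15k, and it assumes that the "Question answering" category corresponds to the labels `open_qa` and `closed_qa`; the card does not state which labels were actually kept, so treat `QA_CATEGORIES` as a guess.

```python
from typing import Dict, Iterable, List

# Assumption: these labels stand in for the card's "Question answering" category.
QA_CATEGORIES = {"open_qa", "closed_qa"}


def extract_qa_pairs(records: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep only question-answering records and reduce them to Q/A pairs."""
    pairs = []
    for record in records:
        # Skip records from other categories (summarization, brainstorming, ...).
        if record.get("category") in QA_CATEGORIES:
            pairs.append(
                {"question": record["instruction"], "answer": record["response"]}
            )
    return pairs


if __name__ == "__main__":
    sample = [
        {"instruction": "What is a visa?", "response": "A travel permit.", "category": "open_qa"},
        {"instruction": "Write a poem.", "response": "Roses are red.", "category": "creative_writing"},
    ]
    print(extract_qa_pairs(sample))
```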

Speeds, Sizes, Times

Training ran for 25 epochs.

Evaluation

Testing Data, Factors & Metrics

Testing Data

We fed the following prompts into the model:

  • "What are the main differences between a vegetarian and a vegan diet?"
  • "What are some effective strategies for managing stress and anxiety?"
  • "Can you explain the concept of blockchain technology in simple terms?"
  • "What are the key factors that influence the price of crude oil in global markets?"
  • "When did Virgin Australia start operating?"
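One way to run these prompts and record the outputs is sketched below. The `generate` argument is a hypothetical callable wrapping the model; the card does not describe the actual evaluation harness, and no real model outputs are shown here.

```python
from typing import Callable, Dict, List

TEST_PROMPTS = [
    "What are the main differences between a vegetarian and a vegan diet?",
    "What are some effective strategies for managing stress and anxiety?",
    "Can you explain the concept of blockchain technology in simple terms?",
    "What are the key factors that influence the price of crude oil in global markets?",
    "When did Virgin Australia start operating?",
]


def collect_responses(
    prompts: List[str], generate: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Run each prompt through the model and pair it with its response."""
    return [{"prompt": p, "response": generate(p)} for p in prompts]


if __name__ == "__main__":
    # Placeholder generator so the sketch runs without loading a model.
    for row in collect_responses(TEST_PROMPTS, lambda p: "<model output here>"):
        print(row["prompt"], "->", row["response"])
```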

Results

More information needed

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: Intel Xeon hardware
  • Hours used: More information needed
  • Cloud Provider: Intel Developer Cloud
  • Compute Region: More information needed
  • Carbon Emitted: More information needed
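Once the missing fields are known, an estimate of this kind generally multiplies the energy consumed by the carbon intensity of the local grid. The sketch below shows that arithmetic under stated assumptions: the function name, the optional `pue` (data-center overhead) factor, and every number in the usage example are placeholders, not measurements for this model.

```python
def estimate_emissions_kg(
    power_watts: float,
    hours: float,
    carbon_intensity_kg_per_kwh: float,
    pue: float = 1.0,
) -> float:
    """Estimate CO2-equivalent emissions in kilograms.

    energy (kWh) = power (kW) * hours * PUE
    emissions (kg) = energy (kWh) * carbon intensity (kg CO2eq per kWh)
    """
    energy_kwh = power_watts / 1000.0 * hours * pue
    return energy_kwh * carbon_intensity_kg_per_kwh


if __name__ == "__main__":
    # Placeholder inputs only: the card does not report hours or region.
    print(estimate_emissions_kg(power_watts=500, hours=10, carbon_intensity_kg_per_kwh=0.4))
```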

Technical Specifications

Model Architecture and Objective

More information needed

Compute Infrastructure

More information needed

Hardware

The model was trained on an Intel Max 1550 GPU.

Software

The model was developed using the Intel Developer Cloud.

Model Card Authors

Manik Sethi, Britney Nguyen, Mario Miranda

Model Card Contact

More information needed

How to Get Started with the Model

Use the code below to get started with the model.

More information needed

  • Weights format: Safetensors
  • Model size: 2.51B params
  • Tensor type: BF16