dmatekenya committed on
Commit
82e1599
·
verified ·
1 Parent(s): abd7472

Update README.md

  1. README.md +6 -2
README.md CHANGED
@@ -34,14 +34,18 @@ Also, its worth noting that the model repo doesnt have a ```tokenizer.json```, a
34
  instead of AutoModel or other modules in Transformer.
35
 
36
  ## Source of Funding for this Work
37
- The dataset that was used to fine-tune this model as well as resources for compute were provided by
38
- [Opportunity Internation](https://www.globalcitizen.org/en/partners/opportunity-international/?gad_source=1&gbraid=0AAAAACnN8MzEIzvf0oKqHW5bw14A4IvGY&gclid=CjwKCAjw9p24BhB_EiwA8ID5Bptp-7RgECcozDIe_6Owjb2g0wClWOKv4-NsEdtXpKx4FGPvOlBPQBoC9SMQAvD_BwE). Mor
 
 
39
 
40
  ## Training and evaluation data
41
 
42
  More information needed
43
 
44
  ## Training procedure
 
 
45
 
46
  ### Training hyperparameters
47
 
 
34
  instead of AutoModel or other modules in Transformer.
35
 
36
  ## Source of Funding for this Work
37
+ The dataset used to fine-tune this model, as well as the compute resources, were provided by [Opportunity International](https://www.globalcitizen.org/en/partners/opportunity-international/?gad_source=1&gbraid=0AAAAACnN8MzEIzvf0oKqHW5bw14A4IvGY&gclid=CjwKCAjw9p24BhB_EiwA8ID5Bptp-7RgECcozDIe_6Owjb2g0wClWOKv4-NsEdtXpKx4FGPvOlBPQBoC9SMQAvD_BwE).
38
+ This was part of a project in Malawi aimed at supporting the deployment of an LLM-based chatbot for agriculture, with the capability to handle voice interactions in the local language, Chichewa.
39
+ A total of 30 hours of speech was collected for this dataset, but due to data quality issues only 25 hours were used.
40
+ About 30 minutes were also set aside as a hold-out set for further model evaluation.
41
 
42
  ## Training and evaluation data
43
 
44
  More information needed
45
 
46
  ## Training procedure
47
+ Most of the training for this model involved varying the speech dataset size (5 hours, 10 hours, up to 24 hours).
48
+ As such, the different model commits represent different data sizes. More details will be added to each model commit.
49
 
50
  ### Training hyperparameters
51