anezatra committed
Commit 62ccd52
1 Parent(s): f8a3f59

Update README.md

README.md CHANGED
@@ -18,4 +18,46 @@ Architecturally akin to its antecedent GPT-1 and progeny GPT-3 and GPT-4, GPT-2

## Training

The transformer architecture allows GPT models to be trained on much larger datasets than previous NLP (natural language processing) models. GPT-1 demonstrated the validity of this approach, and GPT-2 set out to investigate the emergent properties of networks trained on extremely large datasets. Common Crawl, a large corpus previously used to train NLP systems, was considered because of its size, but closer examination showed that much of its content was unintelligible. OpenAI therefore built a new dataset, WebText. Instead of scraping the web indiscriminately, WebText collected content only from pages linked to by Reddit posts that had received at least three upvotes prior to December 2017. The dataset was then cleaned: HTML documents were parsed into plain text, duplicate pages were removed, and Wikipedia pages were excluded because their prevalence in other datasets posed a risk of overfitting.

Anezatra retrained this model on the OpenWebText corpus and applied the DistilGPT approach (knowledge distillation) to produce a lighter, more efficient version. Distillation preserves most of the model's learning capability while reducing the number of parameters, which speeds up training and inference and uses resources more efficiently.
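
To make the distillation idea concrete, the sketch below shows the standard knowledge-distillation objective for a causal language model: a smaller student is trained to match both the ground-truth next tokens and the softened output distribution of a larger teacher. This is an illustrative sketch only, not the training code used for this model; the checkpoint names, temperature, and loss weights are assumptions.

```python
# Illustrative sketch of knowledge distillation for a causal LM.
# The checkpoints, temperature, and loss weights are assumptions,
# not the actual recipe used to train this model.
import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
teacher = AutoModelForCausalLM.from_pretrained("gpt2").eval()   # larger teacher (frozen)
student = AutoModelForCausalLM.from_pretrained("distilgpt2")    # smaller student

batch = tokenizer("An example OpenWebText-style training sentence.", return_tensors="pt")

with torch.no_grad():                     # the teacher only provides targets
    teacher_logits = teacher(**batch).logits

student_out = student(**batch, labels=batch["input_ids"])
T = 2.0                                   # softening temperature (assumed)

# Soft-target loss: pull the student's distribution towards the teacher's.
kd_loss = F.kl_div(
    F.log_softmax(student_out.logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * (T ** 2)

# Hard-target loss: ordinary next-token cross-entropy, already computed by the model.
loss = 0.5 * kd_loss + 0.5 * student_out.loss   # equal weights (assumed)
loss.backward()   # a real run loops over the corpus with an optimizer
```

In the DistilBERT/DistilGPT2 recipe the student is also initialized from a subset of the teacher's layers, which is part of why it retains most of the teacher's capability at a fraction of the size.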

## How to use

```python
# Install the dependencies first:
# pip install git+https://github.com/huggingface/transformers.git
# pip install accelerate
# pip install torch

from transformers import pipeline

# Load the model and tokenizer from the Hugging Face Hub.
text_generator = pipeline("text-generation", model="anezatra/chat-gpt2", tokenizer="anezatra/chat-gpt2")

# Prompts use the "question: ...\nanswer:" format shown in the example output below.
prompt = "question: About psychologists?\nanswer:"

generated_text = text_generator(prompt, max_length=1000, num_return_sequences=1)

print(generated_text[0]["generated_text"])
```
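
The pipeline forwards generation keyword arguments to `model.generate`, so decoding can be tuned without any other changes. The values below are example settings only, not recommended defaults for this model:

```python
# Optional: sampled decoding instead of the defaults used above.
# The parameter values are examples only, not tuned for this model.
generated_text = text_generator(
    prompt,
    max_length=200,
    do_sample=True,          # sample from the distribution instead of greedy decoding
    top_k=50,                # restrict to the 50 most likely next tokens
    top_p=0.95,              # nucleus sampling threshold
    temperature=0.8,         # lower values make the output more conservative
    num_return_sequences=1,
)
print(generated_text[0]["generated_text"])
```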

## Example Output

```
question: About psychologists answer:
We can list what I have to say about psychologists as follows:

1) There is no direct correlation between age and behavior that goes beyond a single issue or point. This can make the difference that if you have a good therapist in there to help you develop a functioning and functioning mental health system, chances of going through these issues are very low.
2) No one can make this question unanswerable.
3) This is not the case.
4) People are asked "Which psychiatrist was best for ADHD?" and "Which way did your patient get it?" What advice for them? What advice they give you about psychotherapy therapy? How do they give you therapy? Which therapy you are going to get? And what advice do they give you?
5) The answer is "Yes." In fact, people will ask more than just "who was best for ADHD," the answer is "who did the best for ADHD." People respond almost as likely as other professionals who are more likely. The question to be asked "Is that a good way to help you better?" "Is it a good way to help you improve mental health in a non-psychiatric setting?" And what advice do clinicians give you about psychotherapy therapy?
6) Some therapists are skeptical. And as many as one third of people will tell you, "I have to tell you whether there's a medical professional you can help with when you look in the mirror" about all of these questions. And it's important to note that all of these individuals answer "yes" as many times as possible. There is really no way to test the reliability of these questions with accurate information or even have a clear objective answer that will answer all of these questions.
7) Some therapists are in denial about their own mental health problems. One of the reasons I am so critical of professional psychotherapy is to identify them as people who are going through a variety of mental health issues with different mental health problems. These people are often struggling with addiction and are sometimes in denial about what they have done and the way they have done and what they do. The same cannot be said about mental illness.
8) There is something wrong with talking about the individual for years.
9) If you say, "It is my responsibility to tell you. Do I want it as much as I can?" You may sound off on some of them, but do you know what can be done? Here are some helpful things:
1. The answer is "Don't talk to other people.
```

**Authors**

- **Developed by:** Anezatra
- **Model type:** GPT2
- **Contacts:** https://github.com/anezatra