rpand002 committed
Commit d929d31
1 Parent(s): 601d23d

Update README.md

Files changed (1): README.md (+1 -1)
README.md CHANGED
@@ -240,7 +240,7 @@ for i in output:
 ```
 
 ## Training Data
-Starting from the base Granite model, this model was further pretrained on repository-level code data with per-language oversampling, allowing it to effectively utilize up to 128K tokens of context. This continued training stage focused on a curated selection of programming languages, such as Python, C, C++, Go, Java, JavaScript, and TypeScript.
+Starting from the base Granite model, this model was further pretrained on repository-level code data with per-language context-length oversampling, allowing it to effectively utilize up to 128K tokens of context. This continued training stage focused on a curated selection of programming languages, such as Python, C, C++, Go, Java, JavaScript, and TypeScript.
 
 ## Infrastructure
 We train the Granite Code models using two of IBM's supercomputing clusters, Vela and Blue Vela, outfitted with NVIDIA A100 and H100 GPUs, respectively. These clusters provide a scalable and efficient infrastructure for training our models over thousands of GPUs.
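The new wording, "per-language context-length oversampling," refers to drawing long repository-level samples more often within each target language so that continued pretraining actually exercises the 128K window. The commit and model card do not publish IBM's sampling recipe, so the following is a minimal sketch of the idea only; `LANG_WEIGHTS`, `sample_weight`, and the length scaling are illustrative assumptions, not the actual pipeline.

```python
# Sketch of per-language context-length oversampling for continued
# pretraining. All weights and the length-scaling rule are assumptions;
# the Granite training recipe is not published in this commit.
import random
from collections import defaultdict

# Hypothetical per-language oversampling factors for the focus languages.
LANG_WEIGHTS = {
    "Python": 2.0, "C": 1.5, "C++": 1.5, "Go": 1.5,
    "Java": 1.5, "JavaScript": 2.0, "TypeScript": 2.0,
}

CONTEXT_WINDOW = 131_072  # 128K tokens, per the README


def sample_weight(language: str, num_tokens: int) -> float:
    """Weight a repository-level example so that long sequences in the
    focus languages are drawn more often than short or off-list ones."""
    base = LANG_WEIGHTS.get(language, 1.0)
    # Upweight examples that fill a large fraction of the 128K window.
    length_factor = min(num_tokens / CONTEXT_WINDOW, 1.0)
    return base * (1.0 + length_factor)


def build_sampler(corpus):
    """corpus: list of dicts with 'language' and 'num_tokens' keys.
    Returns a function that draws one weighted example per call."""
    weights = [sample_weight(d["language"], d["num_tokens"]) for d in corpus]

    def draw():
        return random.choices(corpus, weights=weights, k=1)[0]

    return draw


if __name__ == "__main__":
    corpus = [
        {"language": "Python", "num_tokens": 120_000},
        {"language": "Go", "num_tokens": 8_000},
        {"language": "C", "num_tokens": 64_000},
    ]
    draw = build_sampler(corpus)
    counts = defaultdict(int)
    for _ in range(10_000):
        counts[draw()["language"]] += 1
    # Under these assumed weights, the long Python repository dominates.
    print(dict(counts))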