its5Q PRO

AI & ML interests

None yet

Organizations

Vikhr models, Social Post Explorers, AI Starter Pack

its5Q's activity

posted an update 9 days ago
Am I missing something, or is there still no way to filter by model size when searching for models? It has been a requested feature since 2022, but I haven't seen any updates since! With so many different models coming out, I think a size filter would be a great extension of the search functionality, especially when looking for smaller models, which are a lot less prevalent.
posted an update 4 months ago
Continuing my streak by releasing the Wikireading dataset: a large collection of scraped non-fiction books, predominantly in Russian.
its5Q/wikireading

Here are the highlights:
- ~7B tokens, or ~28B characters, making it a great candidate for use in pretraining
- Contains non-fiction works from many knowledge domains
- Includes both the original HTML and extracted text of book chapters
New activity in its5Q/wikireading 4 months ago

Update README.md

#1 opened 4 months ago by its5Q
reacted to clem's post with 🔥 5 months ago
Just crossed 200,000 free public AI datasets shared by the community on Hugging Face! Text, image, video, audio, time-series & many more... Thanks everyone!

http://hf.co/datasets
posted an update 5 months ago
Made public a dataset of scraped teletype articles.

Here's the overview:
- 3.3 million articles, predominantly in Russian and English
- Includes original HTML, extracted text and metadata
- All articles were run through language identification
- Includes all public articles up until April 2024

its5Q/teletype
New activity in Vikhrmodels/Vikhr-7B-instruct_0.3 9 months ago

max_position_embeddings

4
#2 opened 9 months ago by radm