Pietro Lesci

pietrolesci

https://pietrolesci.github.io/

AI & ML interests

I like developing and applying causal methods to study the effect of training choices on models’ behaviour, including memorisation, shortcut learning, and tokenisation.

Recent Activity

updated a dataset about 16 hours ago

pietrolesci/pile-deduped-pythia-preshuffled

updated a dataset about 17 hours ago

pietrolesci/pile-deduped-pythia-preshuffled

pietrolesci's activity

New activity in LLM360/AmberDatasets 3 days ago

🌟 Appreciation for providing seamless access to pre-processed pre-shuffled data

#2 opened 3 days ago by

pietrolesci

New activity in bigscience-data/README 3 days ago

Reconstructing pre-training data

#1 opened 3 days ago by

pietrolesci

New activity in JeanKaddour/minipile 5 months ago

Domain and provenance annotation

#1 opened over 1 year ago by

haukur

New activity in HuggingFaceTB/SmolLM-135M 7 months ago

Trapezoidal scheduler with cooldown phase

#4 opened 8 months ago by

maveriq

New activity in bias-amplified-splits/mnli 10 months ago

Bias annotation

#2 opened 10 months ago by

pietrolesci

New activity in EleutherAI/pythia-160m 10 months ago

Tokenizer `merges.txt` files

#5 opened 10 months ago by

pietrolesci

New activity in EleutherAI/pile-deduped-pythia-preshuffled about 1 year ago

Sequence "packing" logic

#2 opened about 1 year ago by

pietrolesci

Pad-only sequences from mmap'ed dataset after a certain index

#1 opened about 1 year ago by

pietrolesci

New activity in EleutherAI/pile-duped-pythia-random-sampled about 1 year ago

Add full sequences (beyond the first 64 tokens)

#1 opened about 1 year ago by

pietrolesci

New activity in pfb30/multi_woz_v22 over 2 years ago

Fix swapped start and exclusive_end fields

#3 opened over 2 years ago by

pietrolesci

New activity in mrm8488/PromptSource over 2 years ago

App down

#1 opened over 2 years ago by

pietrolesci

New activity in pfb30/multi_woz_v22 over 2 years ago

`start` and `exclusive_end` seems swapped

#1 opened over 2 years ago by

pietrolesci