Models: 2,138

Active filters: ppo

lucas-palmiro/ppo-early-stopping-LunarLander-v3

Reinforcement Learning • Updated 2 days ago

sighmon/ppo-cleanrl-LunarLander-v2

Reinforcement Learning • Updated 2 days ago

mrinaldi86/ppo-CartPole-v1

Reinforcement Learning • Updated 1 day ago

mrinaldi86/ppo-LunarLander-v3

Reinforcement Learning • Updated 1 day ago

takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_offline_nav_2nd

Reinforcement Learning • Updated 1 day ago • 3

takedakoji00/Llama-3.1-8B-Instruct-custom-qg-full_20250219-7th_random_pad_is_eos_ppo_3rd

Reinforcement Learning • Updated about 15 hours ago • 16

nasnoussi/ppo-Pixelcopter-v1

Reinforcement Learning • Updated 1 day ago

dragovoid/ppo-LunarLander-v2-u8

Reinforcement Learning • Updated about 4 hours ago