Privacy Preserving AI Hackathon (Zama, Hugging Face, Entrepreneur First)

Enterprise
community

Recent Activity

ppaihack's activity

regisss 
posted an update 1 day ago
Nice paper comparing the FP8 inference efficiency of the Nvidia H100 and Intel Gaudi2: An Investigation of FP8 Across Accelerators for LLM Inference (2502.01070)

The conclusion is interesting: "Our findings highlight that the Gaudi 2, by leveraging FP8, achieves higher throughput-to-power efficiency during LLM inference"

One aspect of AI hardware accelerators that is often overlooked is that they can consume less energy than GPUs. It's nice to see researchers starting to carry out experiments to measure this!
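For readers who want to compare devices this way themselves, here is a minimal sketch of a throughput-to-power metric. It assumes the metric is read as generated tokens per second divided by average power draw in watts, which may not match the paper's exact definition. The function name and all numbers below are hypothetical and are not results from the paper.

```python
def throughput_per_watt(tokens_generated: int, elapsed_s: float, avg_power_w: float) -> float:
    """Return tokens per second per watt (equivalently, tokens per joule).

    Assumption: avg_power_w is the average device power over the run,
    measured by whatever tool the accelerator vendor provides.
    """
    if elapsed_s <= 0 or avg_power_w <= 0:
        raise ValueError("elapsed_s and avg_power_w must be positive")
    throughput = tokens_generated / elapsed_s  # tokens per second
    return throughput / avg_power_w


# Hypothetical measurements for two devices (made-up numbers, not from the paper).
device_a = throughput_per_watt(tokens_generated=120_000, elapsed_s=60.0, avg_power_w=500.0)
device_b = throughput_per_watt(tokens_generated=120_000, elapsed_s=60.0, avg_power_w=400.0)

print(f"device A: {device_a:.1f} tokens/s/W")
print(f"device B: {device_b:.1f} tokens/s/W")
```

With equal throughput, the device drawing less power scores higher, which is why raw tokens per second alone can hide an efficiency difference.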

Gaudi3 results soon...
regisss 
posted an update about 2 months ago
regisss 
posted an update 4 months ago
Interested in performing inference with an ONNX model?⚡️

The Optimum docs about model inference with ONNX Runtime are now much clearer and simpler!

Want to deploy your favorite model from the Hub but don't know how to export it to the ONNX format? You can do it in one line of code as follows:
from optimum.onnxruntime import ORTModelForSequenceClassification

# Load the model from the hub and export it to the ONNX format
model_id = "distilbert-base-uncased-finetuned-sst-2-english"
model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)

Check out the whole guide 👉 https://huggingface.co/docs/optimum/onnxruntime/usage_guides/models
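Once exported, the model can be used for inference. The sketch below is one possible way to do it, assuming the `optimum[onnxruntime]` and `transformers` packages are installed and the Hub is reachable. The helper name `classify` is hypothetical and only wraps the snippet above with a Transformers pipeline.

```python
def classify(texts, model_id="distilbert-base-uncased-finetuned-sst-2-english"):
    """Export a Hub model to ONNX and run text classification with ONNX Runtime.

    Hypothetical helper: imports are done lazily so that merely defining
    the function does not require the heavy dependencies.
    """
    from optimum.onnxruntime import ORTModelForSequenceClassification
    from transformers import AutoTokenizer, pipeline

    # Load the model from the Hub and export it to the ONNX format
    model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # The ONNX Runtime model can be passed to a regular Transformers pipeline
    classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)
    return classifier(texts)
```

Calling `classify(["I love this!"])` should return a list with one dictionary per input text, each holding a `label` and a `score`. Exporting on every call is wasteful, so for repeated use it is probably better to save the exported model once with `model.save_pretrained(...)` and reload it from disk.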
jeremyzacch 
updated a Space 5 months ago
WenqingZhang 
updated a Space 5 months ago
Sckathach 
updated a Space 5 months ago
Nos7 
updated a Space 5 months ago
gregoiregllt 
updated a Space 5 months ago