AI & ML interests

Arabic NLP, computer vision, etc.

Recent Activity

Zaid updated a dataset 13 days ago
arbml/masader
abuelnasr updated a model 4 months ago
arbml/whisper-tiny-ar
abuelnasr updated a model 4 months ago
arbml/whisper-small-cv-ar

arbml's activity

not-lain posted an update about 2 months ago
ever wondered how you can make an API call to a visual-question-answering model without sending an image URL πŸ‘€

you can do that by converting your local image to base64 and sending it to the API.
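For reference, this is what that conversion looks like by hand, a minimal stdlib-only sketch (the file name cat.png is a placeholder; the data: URI prefix is the usual way OpenAI-style chat APIs accept inline images):

import base64

# read the image bytes and base64-encode them (file name is a placeholder)
with open("cat.png", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

# wrap the string in a data URI so it can go anywhere an image URL is expected
my_b64_img = f"data:image/png;base64,{encoded}"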

recently I made some changes to my library "loadimg" that make converting images to base64 a breeze.
πŸ”— https://github.com/not-lain/loadimg

API request example πŸ› οΈ:
from loadimg import load_img
from huggingface_hub import InferenceClient

# load a local path, URL, Pillow image, or numpy array and convert it to base64
my_b64_img = load_img(imgPath_url_pillow_or_numpy, output_type="base64")

client = InferenceClient(api_key="hf_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx")

messages = [
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe this image in one sentence."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": my_b64_img  # base64 allows using images without uploading them to the web
                }
            }
        ]
    }
]

stream = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=messages,
    max_tokens=500,
    stream=True
)

for chunk in stream:
    # some chunks may arrive with empty delta content, so guard against None
    print(chunk.choices[0].delta.content or "", end="")
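
stream=True prints tokens as they arrive; if you'd rather get the whole answer at once, a non-streaming call should look roughly like this (same client and messages as above):

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct",
    messages=messages,
    max_tokens=500
)
# without streaming, the full reply is available on the first (and only) choice
print(response.choices[0].message.content)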