AI-generated EAD/XML records with speech-to-text commands

Documentation

This application generates EAD/XML records from voice commands. Users record their instructions, highlight existing XML content to add it as context, and send the combined prompt to a model served by Ollama for processing.

To use the application:

  • Click "Start Recording" to begin capturing your voice instructions (they can be given in French).
  • Highlight any existing XML content and click "Add to Context" to include it in your prompt.
  • Use the "Prettify XML" button to format your XML content.
  • Click "Send Prompt" to generate the EAD/XML based on your instructions.
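
The way the recorded instruction and the highlighted XML end up in a single prompt can be sketched as follows. This is a minimal illustration, not the application's actual code: the function name buildPrompt and the "Context:" / "Instruction:" layout are assumptions made for the example.

```javascript
// Hypothetical helper: merges the transcribed instruction with the XML
// snippets the user added through "Add to Context".
function buildPrompt(transcript, contextSnippets = []) {
  const instruction = transcript.trim();
  if (instruction === '') {
    throw new Error('No instruction was recorded.');
  }
  // Drop empty selections so the prompt contains no blank context blocks.
  const snippets = contextSnippets
    .map((snippet) => snippet.trim())
    .filter((snippet) => snippet !== '');
  if (snippets.length === 0) {
    return instruction;
  }
  // Assumed layout: context first, instruction last.
  return `Context:\n${snippets.join('\n\n')}\n\nInstruction:\n${instruction}`;
}
```

Called with a transcript only, the helper returns the instruction unchanged; with one or more snippets, it places them before the instruction so the model reads the existing XML first.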

Audio is transcribed with the Xenova/whisper-small model, which runs in the browser through transformers.js and uses WebGPU for acceleration.
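
A transcription step of this kind might look like the sketch below. It assumes transformers.js v3, published as the package "@huggingface/transformers", where the pipeline accepts a device option; the helper names and the WASM fallback are illustrative and not taken from the application.

```javascript
// Assumption: transformers.js v3 ("@huggingface/transformers").
const MODEL_ID = 'Xenova/whisper-small';

// Options passed to the Whisper pipeline on each call.
function transcriptionOptions(language = 'french') {
  return { language, task: 'transcribe' };
}

// Loads the speech recognition pipeline on WebGPU, or on WASM as a fallback.
async function createTranscriber(useWebGPU = true) {
  const { pipeline } = await import('@huggingface/transformers');
  return pipeline('automatic-speech-recognition', MODEL_ID, {
    device: useWebGPU ? 'webgpu' : 'wasm',
  });
}

// `audio` is expected to be a Float32Array of mono samples at 16 kHz.
async function transcribe(transcriber, audio, language = 'french') {
  const result = await transcriber(audio, transcriptionOptions(language));
  return result.text.trim();
}
```

In a browser, the bare package name in the dynamic import would need a bundler or an import map. Typical use would be `const t = await createTranscriber();` followed by `await transcribe(t, samples)`.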

For EAD generation, the application uses the Q5_K_M quantized variant of the fine-tuned model Geraldine/FineLlama-3.2-3B-Instruct-ead. This model is designed to understand and generate EAD/XML records based on the user’s instructions and context.
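
Sending a prompt to a model served by Ollama could be sketched as below, using Ollama's /api/generate endpoint on its default port. The model tag is a hypothetical placeholder for whatever name the quantized model was given locally, and the helper names are assumptions for illustration.

```javascript
const OLLAMA_URL = 'http://localhost:11434/api/generate'; // Ollama's default port
const MODEL_TAG = 'finellama-ead:q5_k_m'; // hypothetical local tag, adjust to your setup

// Builds the fetch options for a non-streaming generation request.
function buildGenerateRequest(prompt, model = MODEL_TAG) {
  return {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ model, prompt, stream: false }),
  };
}

// Sends the prompt and returns the generated text (expected to be EAD/XML).
// `fetchImpl` is injectable so the function can be exercised without a server.
async function generateEad(prompt, fetchImpl = fetch) {
  const response = await fetchImpl(OLLAMA_URL, buildGenerateRequest(prompt));
  if (!response.ok) {
    throw new Error(`Ollama returned HTTP ${response.status}`);
  }
  const data = await response.json();
  return data.response; // with stream: false, the full completion is in `response`
}
```

With stream set to false the server replies with a single JSON object, which keeps the example short; a streaming variant would read the response body line by line instead.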

See this blog post for an explanation of the fine-tuning process.

Browser Requirements for WebGPU

To use the WebGPU features of this application, ensure that you are using a compatible browser. The following requirements must be met:
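
One requirement that can be verified directly is that the browser exposes the WebGPU API. A minimal runtime check, assuming only the standard navigator.gpu entry point (the function name detectWebGPU is illustrative):

```javascript
// Returns true when the WebGPU API is exposed and a GPU adapter is available.
// `nav` defaults to the global navigator; it is a parameter so the check can
// be exercised outside a browser.
async function detectWebGPU(nav = globalThis.navigator) {
  // Browsers without WebGPU support do not define navigator.gpu.
  if (!nav || !nav.gpu) {
    return false;
  }
  try {
    // requestAdapter() resolves to null when no suitable adapter exists.
    const adapter = await nav.gpu.requestAdapter();
    return Boolean(adapter);
  } catch {
    return false;
  }
}
```

A page could call `await detectWebGPU()` at startup and show a notice, or fall back to a slower backend, when the result is false.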

