Attaching Images

Click the paperclip icon, or drag and drop images into the chat. Local Assistant supports the PNG, JPG, GIF, and WebP formats.
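
As a rough illustration of the format list above, a file could be checked by extension before it is attached. This is only a sketch: the function name, the extension set, and the inclusion of ".jpeg" as a JPG variant are assumptions, not Local Assistant's actual code.

```python
from pathlib import Path

# Assumed mapping from the formats named above to common file extensions.
# ".jpeg" is included on the assumption that it is treated the same as ".jpg".
SUPPORTED_IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}


def is_supported_image(filename: str) -> bool:
    """Return True if the file extension matches a supported image format."""
    return Path(filename).suffix.lower() in SUPPORTED_IMAGE_EXTENSIONS
```

The check is case-insensitive, so "photo.JPG" and "photo.jpg" are treated the same way.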

Vision Models

To analyze images, you need a vision-capable model like LLaVA or Llama 3.2 Vision. These models can:

  • Describe image contents
  • Answer questions about images
  • Extract text from images (OCR)
  • Analyze charts and diagrams
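
For readers curious what a request to a vision model might look like, here is a minimal sketch. It assumes the model is served through an Ollama-style generate endpoint, which accepts base64-encoded images in an "images" list; this page does not say which backend Local Assistant uses. The function name and the default model tag "llava" are illustrative.

```python
import base64
import json


def build_vision_request(image_bytes: bytes, question: str, model: str = "llava") -> str:
    """Build a JSON request body that pairs a question with one image.

    Assumption: the backend follows the Ollama generate API, where images
    are passed as base64 strings alongside the prompt.
    """
    payload = {
        "model": model,
        "prompt": question,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete reply instead of a token stream
    }
    return json.dumps(payload)
```

The same body works for any of the tasks listed above; only the question changes, for example "Describe this image" or "What text appears in this image?".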

Auto Model Selection

With Auto-Select enabled, Local Assistant automatically switches to a vision model when you attach an image or video.
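
The switching behavior can be pictured as a simple rule. The sketch below is an assumption about how such a rule might look, not the application's real logic; the function name, the parameter names, and the default vision model are hypothetical.

```python
from pathlib import Path

# Assumed extension sets, based on the formats this page lists.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".webp"}
VIDEO_EXTENSIONS = {".mp4"}


def select_model(
    attachments: list[str],
    current_model: str,
    vision_model: str = "llava",
    auto_select: bool = True,
) -> str:
    """Pick the model for the next message.

    Switches to the vision model only when Auto-Select is on and at least
    one attachment is an image or a video; otherwise keeps the current model.
    """
    if not auto_select:
        return current_model
    visual = IMAGE_EXTENSIONS | VIDEO_EXTENSIONS
    needs_vision = any(Path(name).suffix.lower() in visual for name in attachments)
    return vision_model if needs_vision else current_model
```

With Auto-Select off, the function always returns the current model, so the choice of a vision model is left to the user.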

MP4 Video Support

Local Assistant supports MP4 video files. When you attach a video:

  • A video player preview appears in the chat
  • You can scrub through the video to find specific moments
  • You can select individual frames to extract and analyze
  • The AI can describe scenes, read text, or answer questions about the selected frame

Frame selection is useful for analyzing video content, extracting information from recordings, or getting descriptions of specific moments.
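
Outside the app, pulling a single frame from an MP4 file is commonly done with the ffmpeg command-line tool. The sketch below only builds such a command; whether Local Assistant uses ffmpeg internally is not stated here, and the function name and file paths are illustrative.

```python
def build_frame_extract_command(
    video_path: str, timestamp_seconds: float, output_path: str
) -> list[str]:
    """Build an ffmpeg command that saves one frame at the given timestamp.

    -ss seeks to the timestamp, and -frames:v 1 stops after one video frame.
    """
    if timestamp_seconds < 0:
        raise ValueError("timestamp_seconds must be non-negative")
    return [
        "ffmpeg",
        "-ss", f"{timestamp_seconds:.3f}",  # seek position in seconds
        "-i", video_path,
        "-frames:v", "1",                   # write exactly one frame
        output_path,
    ]
```

The resulting list can be passed to subprocess.run on a machine where ffmpeg is installed, and the saved image can then be attached like any other picture.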