AI Image Processing Toolkit


A collection of specialized scripts for AI image processing, dataset preparation, and model training workflows.

🛠️ Scripts Overview


WDV3 (Waifu Diffusion V3 Tagger)

An image tagging script built on the WD V3 tagger models. It supports multiple model architectures (ViT, SwinV2, ConvNeXt) and can process either a single image or a directory, recursively.

Features

  • Multiple model architecture support
  • Batch processing capabilities
  • Adjustable confidence thresholds
  • CUDA acceleration with FP16 support
  • JXL image format support
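
To make the adjustable confidence threshold concrete, here is a minimal sketch of how per-tag probabilities could be filtered. It is not taken from the wdv3 script: the function name `filter_tags`, the default threshold of 0.35, and the sample scores are all assumptions made for illustration.

```python
def filter_tags(scores, threshold=0.35):
    """Return (tag, score) pairs at or above threshold, highest score first.

    `scores` is a hypothetical mapping of tag name to probability; the
    default threshold is an assumed value, not the script's actual default.
    """
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    kept = [(tag, score) for tag, score in scores.items() if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up model output for one image: tag -> probability.
example_scores = {"outdoors": 0.91, "tree": 0.62, "watermark": 0.12, "solo": 0.40}

selected = filter_tags(example_scores, threshold=0.5)
caption = ", ".join(tag for tag, _ in selected)
print(caption)
```

Raising the threshold yields fewer, more reliable tags; lowering it yields more tags at the cost of more false positives.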

Training Functions (train_functions.zsh)

A set of ZSH functions for managing AI model training workflows:

  • Script execution management
  • Training variable setup
  • Git repository state tracking
  • Output directory management
  • Automatic cleanup of empty outputs
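
The cleanup of empty outputs can be pictured with a short Python sketch. The real functions are written in ZSH and may work differently; `remove_empty_dirs` and the directory names in the demo are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def remove_empty_dirs(root):
    """Delete empty directories below root and return the removed paths.

    Directories that only contained other empty directories are removed too.
    The root itself is never deleted.
    """
    root = Path(root)
    removed = []
    # Walk bottom-up so children are handled before their parents.
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        path = Path(dirpath)
        if path == root:
            continue
        if not any(path.iterdir()):
            path.rmdir()
            removed.append(path)
    return removed


# Demo in a throwaway directory with made-up run names.
with tempfile.TemporaryDirectory() as tmp:
    out = Path(tmp)
    (out / "run_a" / "samples").mkdir(parents=True)   # nothing was written here
    (out / "run_b").mkdir()
    (out / "run_b" / "model.safetensors").write_bytes(b"")  # run_b has an output file
    removed_names = sorted(p.relative_to(out).as_posix() for p in remove_empty_dirs(out))
    kept_names = sorted(p.name for p in out.iterdir())
```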

Git Wrapper (git-wrapper.zsh)

Enhanced Git functionality for dataset management:

  • Automatic submodule handling
  • LFS integration for JXL files
  • Dataset-specific Git attributes management

Check4sig (check4sig.zsh)

Utility that detects watermark mentions in dataset caption files:

  • Scans .caption files for watermark-related text
  • Batch processing support
  • Interactive editing with nvim
  • Recursive directory scanning
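
The scanning step could look roughly like the Python sketch below. The actual utility is a ZSH script, and the keyword list it uses is not documented here, so `WATERMARK_PATTERN` and `find_flagged_captions` are assumptions. The interactive nvim step is left out.

```python
import re
import tempfile
from pathlib import Path

# Assumed keywords; the real script may look for different text.
WATERMARK_PATTERN = re.compile(r"watermark|signature", re.IGNORECASE)


def find_flagged_captions(root):
    """Recursively collect .caption files whose text matches the pattern."""
    flagged = []
    for caption_file in sorted(Path(root).rglob("*.caption")):
        text = caption_file.read_text(encoding="utf-8", errors="replace")
        if WATERMARK_PATTERN.search(text):
            flagged.append(caption_file)
    return flagged


# Demo with made-up caption files in a temporary dataset directory.
with tempfile.TemporaryDirectory() as tmp:
    dataset = Path(tmp)
    (dataset / "sub").mkdir()
    (dataset / "a.caption").write_text("a fox, artist signature in the corner", encoding="utf-8")
    (dataset / "sub" / "b.caption").write_text("a wolf standing in snow", encoding="utf-8")
    (dataset / "sub" / "c.caption").write_text("Watermark at the bottom", encoding="utf-8")
    flagged_names = sorted(p.name for p in find_flagged_captions(dataset))
```

Each flagged path could then be opened in an editor for manual review.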

Gallery-dl Wrapper (gallery-dl.zsh)

Directory-aware wrapper for gallery-dl:

  • Automatically changes to ~/datasets directory
  • Maintains consistent download locations
  • Preserves original command functionality

JoyCaption (joy)

Image captioning system that combines CLIP with an LLM:

  • Multiple caption styles (descriptive, training prompts, art critic, etc.)
  • Custom image adapters
  • Tag-based caption generation
  • Batch processing support

PNG to MP4 Converter (png2mp4)

Training progress visualization tool:

  • Converts PNG sequences to MP4
  • Customizable frame rates and durations
  • Step counter overlay support
  • Multiple sample handling

XY Plot Generator (xyplot)

Image comparison grid generator:

  • Supports multiple image formats
  • Customizable grid layouts
  • Optional row/column labels
  • Automatic image padding and alignment
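
One way to think about the padding and alignment step is as a layout calculation: every cell takes the size of the largest image, and smaller images are centered inside their cell. The sketch below only computes positions and does no drawing. `grid_layout` is a hypothetical helper, not the xyplot script's code.

```python
def grid_layout(sizes, columns):
    """Compute a uniform grid for images of differing sizes.

    `sizes` is a list of (width, height) tuples in row-major order.
    Returns (canvas_size, offsets), where offsets[i] is the top-left
    position at which image i should be pasted to sit centered in its cell.
    """
    if columns < 1:
        raise ValueError("columns must be at least 1")
    if not sizes:
        raise ValueError("sizes must not be empty")
    cell_w = max(w for w, _ in sizes)
    cell_h = max(h for _, h in sizes)
    rows = -(-len(sizes) // columns)  # ceiling division
    offsets = []
    for index, (w, h) in enumerate(sizes):
        row, col = divmod(index, columns)
        x = col * cell_w + (cell_w - w) // 2
        y = row * cell_h + (cell_h - h) // 2
        offsets.append((x, y))
    return (columns * cell_w, rows * cell_h), offsets


# Three made-up image sizes arranged in two columns.
canvas, offsets = grid_layout([(100, 80), (60, 80), (100, 40)], columns=2)
```

The offsets could then be passed to an image library's paste operation. Row and column labels would need extra margins, which this sketch omits.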

Caption Concatenator (concat_captions)

Utility for combining multiple caption files:

  • Merges .caption and .tags files
  • Maintains original image associations
  • Batch processing support
  • Error handling for missing files
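
The merge can be sketched as follows. The output format (tags first, then the caption, joined by a comma) and the choice to skip files without a partner are assumptions, and `merge_caption_and_tags` is a hypothetical name. The real script may order or write the result differently.

```python
import tempfile
from pathlib import Path


def merge_caption_and_tags(directory):
    """Pair each .tags file with the .caption file sharing its stem.

    Returns (merged, skipped): merged maps the shared stem to the combined
    text, skipped lists .tags files that have no matching .caption file.
    """
    merged, skipped = {}, []
    for tags_file in sorted(Path(directory).glob("*.tags")):
        caption_file = tags_file.with_suffix(".caption")
        if not caption_file.exists():
            skipped.append(tags_file.name)  # report instead of crashing
            continue
        tags = tags_file.read_text(encoding="utf-8").strip()
        caption = caption_file.read_text(encoding="utf-8").strip()
        merged[tags_file.stem] = f"{tags}, {caption}"
    return merged, skipped


# Demo with made-up files; wolf.tags deliberately has no caption.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "fox.tags").write_text("fox, outdoors\n", encoding="utf-8")
    (folder / "fox.caption").write_text("A fox sits in the snow.\n", encoding="utf-8")
    (folder / "wolf.tags").write_text("wolf\n", encoding="utf-8")
    merged, skipped = merge_caption_and_tags(folder)
```

Keying the result by file stem keeps each merged text associated with its image.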

🚀 Installation


  1. Clone the repository: (optional)
git clone https://huggingface.co/k4d3/toolkit
  2. Add the repository to your PATH: (optional)
export PATH="$PATH:$HOME/path/to/toolkit"
  3. Add the .zshrc to your shell: (optional; you will need to adapt it to your setup)
source ~/path/to/toolkit/.zshrc
nano ~/.zshrc

📝 Requirements


  • Miniconda with an environment set up for training (sd-scripts, timm, etc.)
  • ZSH shell (optional)
  • CUDA-capable GPU (recommended)
  • Required Python packages:
    • torch
    • transformers
    • pillow
    • pillow-jxl
    • opencv-python
    • numpy
    • additional packages, depending on the script

🔧 Usage


Each script can be used independently or as part of a workflow. Here are some common usage examples:

JoyCaption

joy --feed-from-tags=10 --custom_prompt="Write a very long descriptive caption for this image in a formal tone. Do not mention feelings and emotions evoked by the image." .

png2mp4

png2mp4 --repeat 16

inject_to_txt

inject_to_txt 1_honovy "honovy"

replace_comma_with_keep_tags_txt

replace_comma_with_keep_tags_txt 1 1_honovy

📦 Directory Structure


~/
├── datasets/
├── output_dir/
├── models/
├── toolkit/

📄 License


WTFPL - Do what the fuck you want with it.

The included data and models are copyrighted by their respective owners with their own licenses.

🤝 Contributing


Contributions are welcome! For major changes, please open an issue first to discuss what you would like to change.

📚 Documentation


If the documentation of a script is missing, ask a language model about it.
