Flux1-Dev FP8 & NF4 & GGUF 6 Steps, Hybrid 4 Steps : SVDQuant-Int4-Flux.1-Dev : LoRA Models - Svdq-Int4
by ModelsLab

This upload serves as a demonstration of advanced image generation techniques using the publicly available svdq-int4_r32-flux.1-dev. Please note that I am not the original creator.
Installing Nunchaku for ComfyUI Portable (A "Survivor's" Guide)
(to use the single file svdq-int4_r32-flux.1-dev.safetensors)
This guide is based on a real-world troubleshooting process to get ComfyUI-Nunchaku working seamlessly with a ComfyUI portable installation. Many users face dependency issues, and this aims to help those "affected by the process."
Disclaimer: This guide is not official. It's a community-driven effort based on extensive troubleshooting. Always back up your files before making changes.
Why this guide? The official Nunchaku PyPI release can be outdated, and its direct installation can cause dependency conflicts, especially with filterpy and PyTorch versions. This guide focuses on using a specific development release that resolves these issues.
Target Environment:
ComfyUI Portable (with embedded Python)
Python 3.12
PyTorch 2.7.1+cu128 (or a similar +cu12x version)
NVIDIA GPU Compatibility Notes: NVIDIA categorizes GPU compatibility by architecture, not strictly by series numbers.
INT4/FP4 (e.g., Nunchaku's quantization): Generally more suited for newer architectures like Ada Lovelace (RTX 40 series) or Hopper, as they have dedicated INT4/FP4 hardware.
Ampere (RTX 30 series): Fully compatible with FP16 and generally works well with many Nunchaku features. While it can run INT4/FP4, the performance gains might not be as significant as on Ada Lovelace.
Older series (e.g., RTX 20 series or GTX 16 series): Compatibility for advanced features like INT4/FP4 might be limited or nonexistent, often requiring FP32 or FP16.
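As a rough rule of thumb, the architecture notes above map onto CUDA compute capability. The helper below is an illustrative sketch; the thresholds and labels are my own assumptions based on the notes above, not an official Nunchaku compatibility table:

```python
# Rough mapping of CUDA compute capability to INT4/FP4 suitability.
# Thresholds are illustrative assumptions, not an official table.
def int4_suitability(major: int, minor: int) -> str:
    cc = major + minor / 10
    if cc >= 8.9:   # Ada Lovelace (RTX 40 series) and Hopper/newer
        return "best"
    if cc >= 8.0:   # Ampere (RTX 30 series)
        return "works, smaller gains"
    return "limited or unsupported"  # Turing (RTX 20) and older

# With PyTorch installed you can query your own GPU:
#   import torch; print(int4_suitability(*torch.cuda.get_device_capability()))
print(int4_suitability(8, 9))  # RTX 40 series
```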
Step-by-Step Installation Guide:
1. Close ComfyUI: Ensure your ComfyUI application is completely shut down before starting.
2. Open your embedded Python's terminal: Navigate to your ComfyUI_windows_portable\python_embeded directory in your command prompt or PowerShell. Example: cd E:\ComfyUI_windows_portable\python_embeded
3. Uninstall problematic previous dependencies: This cleans up any prior failed attempts or conflicting versions.

```bash
python.exe -m pip uninstall nunchaku insightface facexlib filterpy diffusers accelerate onnxruntime -y
```

(Ignore "Skipping" messages for packages that are not installed.)
4. Install the specific Nunchaku development wheel: This is crucial, as it's a pre-built package that bypasses common compilation issues and is compatible with PyTorch 2.7 and Python 3.12.

```bash
python.exe -m pip install https://github.com/mit-han-lab/nunchaku/releases/download/v0.3.1dev20250609/nunchaku-0.3.1.dev20250609+torch2.7-cp312-cp312-win_amd64.whl
```

(Note: win_amd64 refers to 64-bit Windows, not AMD CPUs. It is also correct for Intel CPUs on 64-bit Windows systems.)
5. Install facexlib: After installing the Nunchaku wheel, the facexlib dependency for some optional nodes (like PuLID) might still be missing. Install it directly.

```bash
python.exe -m pip install facexlib
```
6. Install insightface: insightface is another crucial dependency for Nunchaku's facial features. It might not be fully pulled in by the previous steps.

```bash
python.exe -m pip install insightface
```
7. Install onnxruntime: insightface relies on onnxruntime to run ONNX models. Ensure it's installed.

```bash
python.exe -m pip install onnxruntime
```
8. Verify your installation:
   - Close the terminal.
   - Start ComfyUI via run_nvidia_gpu.bat or run_nvidia_gpu_fast_fp16_accumulation.bat (or your usual start script) from E:\ComfyUI_windows_portable\.
   - Check the console output: there should be no ModuleNotFoundError or ImportError messages related to Nunchaku or its dependencies at startup.
   - Check the ComfyUI GUI: click "Add Nodes" and verify that all Nunchaku nodes, including NunchakuPulidApply and NunchakuPulidLoader, are visible and can be added to your workflow. You should see 9 Nunchaku nodes.
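If you'd rather confirm the dependencies without launching ComfyUI, a small script like the one below (run with python.exe from the python_embeded folder) can check whether the modules are importable. The module list is my guess at the relevant import names, which can differ from the pip package names:

```python
# Minimal dependency check, e.g. saved as check_deps.py and run with:
#   python.exe check_deps.py
# Module names are *import* names, which may differ from pip package names.
import importlib.util

REQUIRED = ["nunchaku", "insightface", "facexlib", "onnxruntime"]

def missing_modules(names):
    """Return the subset of module names that cannot be found."""
    return [n for n in names if importlib.util.find_spec(n) is None]

if __name__ == "__main__":
    gaps = missing_modules(REQUIRED)
    print("All dependencies found" if not gaps else f"Missing: {gaps}")
```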
Important Notes:
The Nunchaku wheel installer node now included in ComfyUI-Nunchaku can update Nunchaku in the future, simplifying maintenance.

You can find example workflows in the workflows_examples folder located at E:\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI-nunchaku\. These JSON files can be loaded directly into ComfyUI to demonstrate how to use Nunchaku's nodes.

While performance optimizations like xformers exist, they can sometimes complicate installations due to strict version dependencies and the potential need for "rollback" procedures. For most users, the steps above are sufficient to get Nunchaku fully functional.
The svdq-int4_r32-flux.1-dev version is likely considered "much better," especially for "characters," because it uses a single file that combines INT4 and BF16 (bfloat16) layers.
Here's a breakdown of why this is significant:
INT4 (4-bit integer): This is a highly compressed data type. Its primary benefit is a significant reduction in memory usage and faster computation, especially during inference (when you're just generating images, not training the model). This means the model can run on GPUs with less VRAM or allow for larger image sizes/batch sizes. While it introduces some "quality loss" due to the extreme compression, it's often optimized to be minimal for many applications.
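To make the trade-off concrete, here is a toy symmetric INT4 quantizer in plain Python. Real schemes like SVDQuant are far more sophisticated (per-group scales, low-rank error correction), so this only sketches the basic idea of mapping floats to 4-bit integers and back:

```python
# Toy symmetric INT4 quantization: map floats to integers in [-8, 7]
# with a single shared scale, then reconstruct. Illustrative only.
def quantize_int4(weights):
    scale = max(abs(w) for w in weights) / 7  # symmetric range uses +/-7
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

w = [0.12, -0.53, 0.98, -0.06]
q, s = quantize_int4(w)
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(q, round(max_err, 3))  # the reconstruction error is bounded by ~scale/2
```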
BF16 (Bfloat16): This is a 16-bit floating-point format that offers a good balance between precision and range. It's often used in deep learning training because it helps maintain numerical stability, preventing issues like overflow or underflow that can occur with other lower-precision formats like FP16. BF16 typically provides better precision than INT4.
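You can emulate BF16's reduced mantissa in pure Python by truncating an FP32 bit pattern to its top 16 bits (BF16 keeps FP32's 8-bit exponent but only 7 stored mantissa bits). This uses truncation rather than round-to-nearest, so it is only an approximation of real BF16 hardware:

```python
import struct

def to_bf16(x: float) -> float:
    """Approximate BF16 by zeroing the low 16 bits of the FP32 pattern."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return struct.unpack(">f", struct.pack(">I", bits & 0xFFFF0000))[0]

print(to_bf16(3.14159))  # coarse mantissa: 3.140625
print(to_bf16(1e38))     # huge values survive; FP16 overflows past ~65504
```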
Why a single file combining both is better, especially for characters:
When int4 and BF16 are combined in a single file, it usually implies a "mixed-precision" approach. This means the model leverages the strengths of both data types:
Efficiency: The int4 layers likely handle parts of the model where extreme precision isn't as critical, drastically reducing the memory footprint and speeding up operations.

Quality for critical parts: The BF16 layers are probably used for parts of the model that are more sensitive to precision, such as those crucial for generating detailed and consistent characters. This ensures that the essential elements of the characters (like facial features, hands, etc.) maintain higher quality despite the overall compression.

Seamless integration: Having them in a single file suggests that the integration and transition between these different precision layers are optimized, leading to better overall performance and output quality compared to separate files or less refined mixed-precision implementations.
Overall improvement for characters: By intelligently applying int4 for efficiency and BF16 for critical elements, the model can render characters with more detail and accuracy while still being very efficient. This can result in sharper features, more consistent anatomy, and better overall artistic quality in character generation.
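A back-of-envelope calculation shows why the mixed split matters for VRAM. The parameter count and layer split below are illustrative guesses, not Flux.1-dev's actual architecture:

```python
# Rough memory footprint of a mixed INT4/BF16 model.
# INT4 = 0.5 byte/param, BF16 = 2 bytes/param (ignoring scales/overhead).
def model_bytes(n_params: int, int4_fraction: float) -> int:
    int4_bytes = n_params * int4_fraction * 0.5
    bf16_bytes = n_params * (1 - int4_fraction) * 2
    return int(int4_bytes + bf16_bytes)

n = 12_000_000_000  # ~12B params, roughly Flux.1-dev scale (assumed)
for frac in (0.0, 0.8, 1.0):
    print(f"{frac:.0%} INT4 -> {model_bytes(n, frac) / 2**30:.1f} GiB")
```

Even keeping a fifth of the weights in BF16 for the precision-sensitive layers leaves the file far smaller than an all-BF16 model.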
In essence, the new version likely achieves a better balance between model size, speed, and output quality, particularly benefiting character generation thanks to this optimized mixed-precision strategy.