llm-quantization

Here are 20 public repositories matching this topic...

Production-grade toolkit for LLM quantization, benchmarking, and edge deployment. Supports bitsandbytes INT8/INT4, GPTQ (Hessian-based calibration), AWQ (activation-aware weight quantization), and GGUF (Q2_K through Q8_0). Benchmarks along four dimensions: perplexity, throughput and latency (tokens per second, TPS, and time to first token, TTFT), VRAM profiling, and LLM-as-Judge quality scoring. Ready for the RTX 5090 (Blackwell, sm_120).

  • Updated Jun 14, 2026
  • Python
