Pull requests: NVIDIA/TransformerEngine


Pull requests list

Add Graph-safe NVFP4 CUTLASS Group GEMM to unified entry points (labels: community-contribution)
#3175 opened Jul 4, 2026 by cael-ling Contributor
8 tasks done
Disable cuDNN 9.23.0/9.23.1 for MXFP8 attention (labels: 2.17)
#3173 opened Jul 2, 2026 by cyanguwa Collaborator
8 of 13 tasks
Reverse MXFP8 quantization row raster (labels: community-contribution)
#3170 opened Jul 2, 2026 by sraman-rgb Contributor
13 tasks
Remove cuDNN frontend submodule (labels: 2.18)
#3169 opened Jul 2, 2026 by vcherepanov-nv Collaborator
3 of 13 tasks
Add fused multi-tensor kernel for 1D blockwise FP8 quantization (labels: community-contribution)
#3168 opened Jul 2, 2026 by shangxiaokang Draft
13 tasks
Bump transformers from 4.57.0 to 5.3.0 in /docs/examples/te_llama (labels: community-contribution, dependencies, python)
#3167 opened Jul 2, 2026 by dependabot Bot
Bump transformers from 4.55.0 to 5.3.0 in /docs/examples/te_gemma (labels: community-contribution, dependencies, python)
#3166 opened Jul 2, 2026 by dependabot Bot
Improve readability of dgrad overlap variable (labels: community-contribution)
#3165 opened Jul 2, 2026 by Prachi-kushwaha
4 of 13 tasks
[JAX] Add attention tutorials (labels: 2.18, documentation)
#3162 opened Jul 1, 2026 by KshitijLakhani Collaborator
5 of 13 tasks
[Common][PyTorch] Add strided batched GEMM in BF16/MXFP8 (labels: org-contribution)
#3160 opened Jul 1, 2026 by yaox12 Member
8 of 13 tasks
Migrate norms and softmax kernels to NVRTC (labels: community-contribution)
#3156 opened Jun 30, 2026 by CarlosGomes98 Contributor
2 of 13 tasks
[PyTorch][torch.compile] Add TensorProto mechanism
#3153 opened Jun 29, 2026 by pggPL Collaborator
4 of 13 tasks
[PyTorch][torch.compile] Make quantizers opaque value objects
#3152 opened Jun 29, 2026 by pggPL Collaborator
8 of 13 tasks
Enable FA4 for context-parallel attention
#3149 opened Jun 26, 2026 by sudhakarsingh27 Member Draft
7 of 13 tasks
[Draft] Use vendored cuDNN frontend for Python
#3148 opened Jun 26, 2026 by vcherepanov-nv Collaborator
1 of 13 tasks
Add MXFP8 support with cuBLASMp (labels: community-contribution)
#3145 opened Jun 25, 2026 by almogsegal Contributor
13 tasks
Add multi_tensor_raw_moments kernel (labels: community-contribution)
#3144 opened Jun 25, 2026 by philipcmonk Draft
6 of 13 tasks
[Common] Fix Build: NCCL EP build to respect MAX_JOBS
#3138 opened Jun 22, 2026 by phu0ngng Collaborator Draft
7 of 13 tasks
[Common] Experimental CuTeDSL MXFP8 backends in C++ via TVM-FFI
#3137 opened Jun 21, 2026 by kainzhong Collaborator Draft
13 tasks
Single-launch CUTLASS grouped GEMM for per-tensor NVFP4 (labels: community-contribution)
#3134 opened Jun 17, 2026 by cael-ling Contributor
9 of 13 tasks
Enable NVFP4 RHT amax for grouped SReLU MLP (labels: community-contribution)
#3133 opened Jun 16, 2026 by sraman-rgb Contributor
13 tasks
[Common] Support scaled & clamped swiglu, srelu for BF16 (labels: community-contribution)
#3132 opened Jun 16, 2026 by zhongbozhu Collaborator
13 tasks