NVIDIA / cutlass Public

Issues: NVIDIA/cutlass

84 Open 816 Closed

Issues list

[QST] GemmUniversal is slower than GemmSplitKParallel when M and N are small and K is large ? - Needs Triage question

#1586 opened Jun 12, 2024 by ken012git

[QST] CUTLASS support for sparse matrix multiplication for X*W=Y with GPU sparse tensor core ? - Needs Triage question

#1585 opened Jun 12, 2024 by YukeWang96

[QST] How to improve skinny matrix perf over Ampere like 3090?

#1582 opened Jun 11, 2024 by leiwen83

[QST] Unknown CMake command "cutlass_example_add_executable" ? - Needs Triage question

#1581 opened Jun 10, 2024 by soohyung-jang

[BUG] Circular Dependency in Header Files ? - Needs Triage bug

#1576 opened Jun 6, 2024 by gavinchen430

[DOC] Incorrect link in main README file ? - Needs Triage documentation

#1575 opened Jun 5, 2024 by gcunhase

[QST]What is the difference between WmmaTensorOp and TensorOp? ? - Needs Triage question

#1574 opened Jun 5, 2024 by sleepwalker2017

Int8 multiplication with pytorch extension: namespace "torch" has no member "I8 ? - Needs Triage bug

#1573 opened Jun 5, 2024 by MuhammedHasan

[QST] Is there grouped_gemv ? - Needs Triage question

#1572 opened Jun 4, 2024 by hanzz2007

[BUG] Failing to build on MSVC due to call to _div128 ? - Needs Triage bug

#1571 opened Jun 3, 2024 by drisspg

How to perform operations like crop, concat on tensors in CuTe? [QST] ? - Needs Triage question

#1570 opened Jun 3, 2024 by Ricky-KLA

[FEA] Add cuTensorMapEncodeTiled to CudaHostAdapter ? - Needs Triage feature request

#1566 opened May 31, 2024 by drisspg

[QST] GEMM Epilogue Fusion: Row-wise and Column-wise Multiplication ? - Needs Triage question

#1565 opened May 31, 2024 by Hongbosherlock

[QST]Why fp8 convert only has float2fp8 function without ptx ? ? - Needs Triage question

#1564 opened May 31, 2024 by WtDMaO

[QST] GEMM Epilogue Fusion: Element-wise Ops and Two-Tensor Element-wise Multiplication ? - Needs Triage question

#1563 opened May 30, 2024 by HanGuo97

Tiled copy misaligned, how to solve it? ? - Needs Triage question

#1561 opened May 30, 2024 by 4grass

Warp Group MMA vs Warp MMA ? - Needs Triage question

#1560 opened May 30, 2024 by OrenLeung

[QST/BUG] why cute kernel transfers so much data between L2 and gmen than cublas kernel ? - Needs Triage question

#1556 opened May 29, 2024 by irasin

[QST]How to implement different type between D0(D1) and D2 based on 45_dual_gemm example ? - Needs Triage question

#1555 opened May 29, 2024 by Sunny-bot1

[QST] The best way to do D = func(A x B) x C. ? - Needs Triage question

#1551 opened May 27, 2024 by amazingyyc

[QST] epilogue in HGEMM ? - Needs Triage question

#1550 opened May 27, 2024 by irasin

[QST] Hopper mixed precision gemm always worse than FP8 ? - Needs Triage question

#1549 opened May 24, 2024 by divchenko

[BUG] Cutlass Python API silently fails in (suspected) unsupported case ? - Needs Triage bug

#1547 opened May 23, 2024 by LucasWilkinson

[QST] Row major for int8 matrix multiplications? ? - Needs Triage question

#1533 opened May 10, 2024 by ken012git

[QST] cutlass::Array and cute::Tensor --- using CUTLASS utility structs/classes with CUTE (such as NumericArrayConverter) ? - Needs Triage inactive-30d question

#1532 opened May 10, 2024 by HanGuo97
