A retargetable MLIR-based machine learning compiler and runtime toolkit.
CUDA® is a parallel computing platform and programming model created by NVIDIA (first released June 23, 2007) for general-purpose computing on graphics processing units (GPUs). With CUDA, developers can dramatically speed up computing applications by harnessing the power of GPUs.
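As a minimal illustration of the CUDA programming model, the sketch below compiles and launches a small elementwise kernel from Python. CuPy is an assumption here (it is not one of the listed projects); the kernel body itself is plain CUDA C.

```python
# Hedged sketch: launching a CUDA kernel from Python via CuPy.
import cupy as cp

square = cp.RawKernel(r'''
extern "C" __global__
void square(const float* x, float* y, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;  // one thread per element
    if (i < n) y[i] = x[i] * x[i];
}
''', 'square')

n = 1 << 20
x = cp.arange(n, dtype=cp.float32)
y = cp.empty_like(x)
threads = 256
blocks = (n + threads - 1) // threads          # enough blocks to cover n
square((blocks,), (threads,), (x, y, n))       # grid dims, block dims, args
```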
Implementations of various simulations for integrate-and-fire models, as well as conductance-based models with synaptic neurotransmission
Open Voice OS Status Page
A high-throughput and memory-efficient inference and serving engine for LLMs
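For a flavor of what serving with vLLM looks like, here is a minimal offline-inference sketch; the model ID is a placeholder assumption, and any Hugging Face-compatible model would do.

```python
# Hedged sketch of offline batch inference with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder model ID
params = SamplingParams(temperature=0.8, max_tokens=64)

outputs = llm.generate(["What makes GPU inference fast?"], params)
for out in outputs:
    print(out.outputs[0].text)
```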
CEED Library: Code for Efficient Extensible Discretizations
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
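To show what "written in Triton" means in practice, below is a self-contained Triton vector-add kernel. It is purely illustrative and not taken from the repository itself.

```python
# Illustrative Triton kernel: elementwise addition of two vectors.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements           # guard the tail of the vector
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device="cuda")
y = torch.randn(4096, device="cuda")
out = torch.empty_like(x)
grid = (triton.cdiv(4096, 1024),)         # one program per 1024-element block
add_kernel[grid](x, y, out, 4096, BLOCK_SIZE=1024)
```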
A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper and Ada GPUs, providing better performance with lower memory utilization in both training and inference.
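As a brief sketch of the FP8 usage pattern with Transformer Engine's PyTorch API (shapes and recipe settings are arbitrary assumptions; a GPU with FP8 support, e.g. Hopper, is required):

```python
# Hedged sketch of an FP8 forward/backward pass with Transformer Engine.
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

layer = te.Linear(768, 768, bias=True).cuda()
fp8_recipe = recipe.DelayedScaling(margin=0, fp8_format=recipe.Format.E4M3)

x = torch.randn(16, 768, device="cuda")
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)   # matmul runs in FP8 under the autocast context
y.sum().backward()
```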
High-performance CUDA/Python library for computing quantum chemistry density-based descriptors for larger systems using GPUs.
NVIDIA GPU Operator creates, configures, and manages GPUs atop Kubernetes
PygmalionAI's large-scale inference engine
A fast, scalable, high-performance gradient boosting on decision trees library, used for ranking, classification, regression, and other machine learning tasks in Python, R, Java, and C++. Supports computation on CPU and GPU.
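Switching CatBoost training to the GPU is a matter of passing `task_type="GPU"`. A minimal sketch with synthetic data:

```python
# Hedged sketch of GPU-accelerated CatBoost training on synthetic data.
import numpy as np
from catboost import CatBoostClassifier

X = np.random.rand(1000, 10)
y = (X[:, 0] > 0.5).astype(int)

model = CatBoostClassifier(
    iterations=200,
    task_type="GPU",   # run boosting on the GPU
    devices="0",       # first CUDA device
    verbose=False,
)
model.fit(X, y)
print(model.predict(X[:5]))
```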
A high-performance inference system for large language models, designed for production environments.
CUDA C++ Core Libraries
(in progress) Implementation of a parallel construction algorithm for SAH kd-trees
Celeritas is a new Monte Carlo transport code designed to accelerate scientific discovery in high energy physics by improving detector simulation throughput and energy efficiency using GPUs.
HPC solver for nonlinear optimization problems