Issues: vllm-project/vllm
[Bug]: Prefix Caching with Multi-Lora Support
bug
Something isn't working
#5475
opened Jun 12, 2024 by
curiositywan
[Bug][v0.5.0]: Benign error reported by Python multiprocessing resource_tracker
bug
Something isn't working
#5468
opened Jun 12, 2024 by
mgoin
[Feature]: Allow user defined extra request args to be logged in OpenAI compatible server
feature request
#5467
opened Jun 12, 2024 by
davidgxue
[Bug]: Runtime Error: GET was unable to find an engine to execute this computation for LLaVa-NEXT
bug
Something isn't working
#5465
opened Jun 12, 2024 by
XkunW
[Bug]: Error when --tensor-parallel-size > 1
bug
Something isn't working
#5458
opened Jun 12, 2024 by
javi111717
[Bug]: vllm v0.5.0 internal assert failed
bug
Something isn't working
#5450
opened Jun 12, 2024 by
changshivek
[Usage]: How to serve an embedding model and an LLM at the same time
usage
How to use vllm
#5449
opened Jun 12, 2024 by
weiyunfei
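For the question in #5449: vLLM (as of v0.5.0) exposes offline embedding through the same LLM class used for generation, so one hedged approach is to run two engines side by side in one process. A minimal sketch, assuming both models fit on a single GPU; the model names and the 0.6/0.3 memory split are illustrative, not a recommendation:

    from vllm import LLM, SamplingParams

    # Cap each engine's share of GPU memory so both fit on one card
    # (the 0.6/0.3 split is an assumption; tune for your hardware).
    chat_llm = LLM(model="meta-llama/Meta-Llama-3-8B-Instruct",
                   gpu_memory_utilization=0.6)
    embed_llm = LLM(model="intfloat/e5-mistral-7b-instruct",
                    gpu_memory_utilization=0.3)

    # Generation goes through generate(), embeddings through encode().
    outputs = chat_llm.generate(["Hello!"], SamplingParams(max_tokens=32))
    embeddings = embed_llm.encode(["Hello!"])

The OpenAI-compatible server at this vintage hosts one model per process, so the usual online answer is to start two server processes on different ports.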
multilora_inference errors when calling qwen2-1.5b
documentation
Improvements or additions to documentation
#5445
opened Jun 12, 2024 by
zigangzhao-ai
[Bug]: v0.4.3 AsyncEngineDeadError
bug
Something isn't working
#5443
opened Jun 12, 2024 by
changshivek
[Bug]: TypeError: a bytes-like object is required, not 'str'
bug
Something isn't working
#5440
opened Jun 12, 2024 by
yaoyasong
[Bug]: get the degree of the outlines FSM compilation progress from vllm 0.5.0 engine (via a route)
bug
Something isn't working
#5436
opened Jun 12, 2024 by
syGOAT
[Feature]: PagedAttention for CPU-memory constrained environments?
feature request
#5434
opened Jun 12, 2024 by
peeteeman
[Feature]: Support [RecurrentGemmaForCausalLM]
new model
Requests for new models
#5431
opened Jun 12, 2024 by
sung-ho-moon
[Bug]: CUDA out of memory when setting prompt_logprobs with larger batch_size
bug
Something isn't working
#5424
opened Jun 11, 2024 by
qaz-wsx-1
[RFC]: Improve guided decoding (logit_processor) APIs and performance.
RFC
#5423
opened Jun 11, 2024 by
rkooo567
[Bug]: vllm deployment of GLM-4V reports KeyError: 'transformer.vision.transformer.layers.45.mlp.fc2.weight'
bug
Something isn't working
#5417
opened Jun 11, 2024 by
zhaobu
[Usage]: How do you specify a specific branch on huggingface to use when downloading a model?
good first issue
Good for newcomers
usage
How to use vllm
#5415
opened Jun 11, 2024 by
fake-name
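For #5415: the LLM constructor accepts a revision argument that pins the download to a specific branch, tag, or commit hash on the Hugging Face Hub. A minimal sketch; the model and branch names are placeholders:

    from vllm import LLM

    # `revision` selects a branch, tag, or commit on the Hugging Face Hub.
    llm = LLM(model="mistralai/Mistral-7B-v0.1", revision="main")

The same option is exposed on the OpenAI-compatible server command line as --revision.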
[Performance]: Qwen2-72B-Instruction-GPTQ-Int4 OpenAI Server Request Problem
performance
Performance-related issues
#5407
opened Jun 11, 2024 by
syngokhan
Hidden states from final (or middle) layers
feature request
#5406
opened Jun 11, 2024 by
janphilippfranken
[Bug]: The vLLM service takes two hours to start because of NCCL
bug
Something isn't working
#5405
opened Jun 11, 2024 by
zhaotyer
[Bug]: topk=1 and temperature=0 cause different output in vllm
bug
Something isn't working
#5404
opened Jun 11, 2024 by
rangehow
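For context on #5404: temperature=0 switches vLLM to greedy (argmax) decoding, while top_k=1 restricts sampling to the single highest-probability token; in exact arithmetic both should pick the same token, which is why differing outputs are reported as a bug. A minimal repro sketch, with an illustrative small model:

    from vllm import LLM, SamplingParams

    llm = LLM(model="facebook/opt-125m")  # illustrative model choice
    prompt = ["The capital of France is"]

    # Two ways to ask for greedy decoding; the report says they can disagree.
    out_temp = llm.generate(prompt, SamplingParams(temperature=0.0, max_tokens=32))
    out_topk = llm.generate(prompt, SamplingParams(temperature=1.0, top_k=1, max_tokens=32))

    print(out_temp[0].outputs[0].text == out_topk[0].outputs[0].text)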
0.4.3: CUDA error: an illegal memory access was encountered
bug
Something isn't working
#5376
opened Jun 10, 2024 by
maxin9966