microsoft / DeepSpeed Public

Pull requests: microsoft/DeepSpeed

Labels 32 Milestones 0

141 Open 2,633 Closed

Pull requests list

Unpin transformers version

#5650 opened Jun 12, 2024 by loadams

Fix memory leak from _hp_mapping

#5643 opened Jun 11, 2024 by chiragjn

reduce all-to-all communication volume when both expert and non-expert are tensor-parallel

#5626 opened Jun 7, 2024 by taozhiwei

3

Hybrid Offloading for ZeRO3

#5625 opened Jun 7, 2024 by tohtana • Draft

fix: quantization with DeepSpeed HE

#5624 opened Jun 6, 2024 by Atry

1

Add support for Phi-3 small to FastGen

#5614 opened Jun 4, 2024 by adk9 • Draft

[INF] Enable torch compile for inference

#5612 opened Jun 4, 2024 by oelayan7

3

Upgrade HPU image to v1.16.0.

#5610 opened Jun 4, 2024 by vshekhawat-hlab

4

Fixed Windows inference build.

#5609 opened Jun 3, 2024 by costin-eseanu

Add an argument to enable the injection of missing state during the conversion of universal checkpoints

#5608 opened Jun 3, 2024 by xylian86

[CPU] Allow deepspeed.comm.inference_all_reduce in torch.compile graph

#5604 opened Jun 3, 2024 by delock

state_dict_factory: llama checkpoint - support SWIGLU

#5601 opened Jun 2, 2024 by nelyahu

FastGen H100 MoE support: Add PyTorch multi-gemm MOE implementation

#5586 opened May 29, 2024 by HeyangQin

7

Update profiler.py

#5584 opened May 29, 2024 by gameofdimension

Remove compile wrapper to simplify access to model attributes

#5581 opened May 29, 2024 by tohtana

reduce cpu host overhead when using moe

#5578 opened May 29, 2024 by ranzhejiang

7

_exec_forward_pass: place zeros(1) on the same device as the param

#5576 opened May 28, 2024 by nelyahu

Reuse KV cache of prefixes

#5572 opened May 27, 2024 by tohtana • Draft

3

[CPU] SHM based allreduce improvement for small message size

#5571 opened May 27, 2024 by delock

Add support for Microsoft Phi-3 model to DeepSpeed-FastGen

#5559 opened May 21, 2024 by adk9

Add chatglm2 & chatglm3 autotp

#5540 opened May 16, 2024 by Yejing-Lai

2

Fix deadlock in PipeEngine._exec_recv_grads

#5518 opened May 10, 2024 by i4never

3

inference: remove unused _validate_args function

#5505 opened May 8, 2024 by nelyahu

5

Z3: optimizations for grad norm calculation and gradient clipping

#5504 opened May 7, 2024 by nelyahu

4

Update to ROCm6

#5491 opened May 1, 2024 by loadams

1
