Pull requests: InternLM/lmdeploy
Pull requests list
Split/tool call args json for qwen3coder tool calls (Qwen3.5)
#4433
opened Mar 19, 2026 by
lapy
feat: fully implement compressed-tensors gs32 support in TurboMind
enhancement
New feature or request
#4429
opened Mar 19, 2026 by
lapy
[Feature] Support n parameter in /v1/chat/completions and /v1/completions
#4419
opened Mar 17, 2026 by
ziyangliu-666
support cache_seqlen on recurrent-gdr and causal-conv1d-update
#4417
opened Mar 17, 2026 by
grimoire
Assign sequential api_server ports when proxy_url is unset
#4416
opened Mar 16, 2026 by
lvhan028
[Fix][Feat] Fix worker sorting with external pg bundles & Support persistent buffer for update_params
#4397
opened Mar 6, 2026 by
CyCle1024
Use pyupgrade and ruff to modernize LMDeploy Python Code
#4392
opened Mar 3, 2026 by
windreamer
Support MiniMax-M2 in TurboMind engine
enhancement
New feature or request
#4343
opened Feb 10, 2026 by
zh-nj
add preliminary support for EP(single-node) of turbomind backend
enhancement
New feature or request
#4332
opened Feb 6, 2026 by
irexyc
change ascend paged attention from BSH format to TND format for better performance
#4295
opened Jan 27, 2026 by
jinminxi104
Draft