Show HN: Serve 100 Large AI models on a single GPU with low impact to TTFT
(github.com)
7 points by leonheuler 3 months ago | 1 comment
billconan 3 months ago
Can you hot-swap a portion of an AI model if my GPU is not large enough to hold the entire model? That way I could run the first half of the model and then load the other half.