Hacker News
Train a LLM from Scratch (github.com)
3 points by linhns 6 hours ago | 1 comment
subtick 6 hours ago
Curious — how did you handle training stability early on? Was convergence an issue without heavy tuning?