2 points by ser_man 2 hours ago | 1 comment
  • ser_man 2 hours ago
    Hi HN, I'm Sergei, founder of Avaturn. Our team built AVTR-1, an open-weights avatar model built for real-time interaction: feed it one reference photo and any audio, and it generates a talking-head video frame by frame in real time, synthesizing every pixel of the face in every frame, with no pre-recorded clip and no mouth swap. Latency is sub-200ms end-to-end on one A100, and it also runs well on a 4060 or higher.

    Code: https://github.com/avaturn-live/avtr-1 | Weights: https://huggingface.co/avaturn-live/avtr-1 | Project page: https://avtr-1.avaturn.live | Live demo (no sign-up): https://avaturn.live

    Alongside the model, we are releasing our streamer, so in theory you can drop in any other open-weights real-time video model (not just ours), plug in any conversation backend, and ship a low-latency experience without rebuilding the orchestration.

    If we gain enough traction, our next step is to convince other avatar providers that running a public leaderboard is useful. As of now, no such leaderboard exists for real-time avatars. We think transparency is useful and can accelerate research for everyone. In the meantime, we ran our own benchmark against other real-time models, with SOTA results on 5 of 6 subtests. I would be happy for you to take a look and suggest any improvements to our benchmarks. (Full results are on our project page.)

    The model is open weights under a community license: free for personal use, research, and any commercial use under $10M ARR. I would most appreciate your feedback on it.