6 points by kfallah 8 hours ago | 3 comments

kfallah 8 hours ago
CLaaS is an open-source system that uses self-distillation to move feedback from context into model weights. Current approaches rely on system prompts and memory to personalize your model, but every token spent reminding the model is a token it can't use for the actual task. Instead, with every piece of feedback, CLaaS triggers a weight update while avoiding the catastrophic forgetting you get with standard fine-tuning. The updated LoRA adapter hot-reloads into vLLM, so your next response comes from a better model.
Right now it runs on a single consumer GPU (tested on an RTX 5090) with Qwen3-8B. It's easy to set up with Docker Compose alongside a locally hosted OpenClaw, but the API works with any local model.
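For anyone curious what the hot-reload step could look like, here is a minimal sketch. It is not taken from the CLaaS code. It assumes a vLLM OpenAI-compatible server started with LoRA enabled and `VLLM_ALLOW_RUNTIME_LORA_UPDATING=True`, and it uses vLLM's dynamic adapter endpoints. The adapter name, path, and URL are hypothetical placeholders.

```python
import json
import urllib.error
import urllib.request


def build_reload_requests(adapter_name, adapter_path):
    """Return (endpoint, payload) pairs that swap in a freshly trained adapter.

    The old copy is unloaded first so the same adapter name can point at
    the new weights. Names and paths here are illustrative assumptions.
    """
    return [
        ("/v1/unload_lora_adapter", {"lora_name": adapter_name}),
        ("/v1/load_lora_adapter",
         {"lora_name": adapter_name, "lora_path": adapter_path}),
    ]


def hot_reload(base_url, adapter_name, adapter_path, timeout=30.0):
    """POST the unload/load pair to a running vLLM server."""
    for endpoint, payload in build_reload_requests(adapter_name, adapter_path):
        request = urllib.request.Request(
            base_url.rstrip("/") + endpoint,
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        try:
            with urllib.request.urlopen(request, timeout=timeout) as response:
                response.read()
        except urllib.error.HTTPError:
            # Unloading fails if the adapter was never loaded; that is fine.
            # A failed load is a real error, so re-raise it.
            if endpoint.endswith("/load_lora_adapter"):
                raise
```

After a weight update writes a new adapter to disk, something like `hot_reload("http://localhost:8000", "user-adapter", "/adapters/user-adapter")` would make the next request that names `user-adapter` as its model use the updated weights.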
zerocks2503 3 hours ago
That's sick
matiszz 4 hours ago
Cool project!