2 points by AronDaron 7 hours ago | 1 comment
    Hey,

    I've been building side projects with Claude Code for a few months, but I'm completely new to fine-tuning — started experimenting maybe a week ago. From day one I wanted a GUI for the dataset side of the workflow, so this desktop app grew alongside my very first FT attempts.

    I know there are similar apps out there, but I wanted something simple that non-technical users could run with open-source models end-to-end.

    To sanity-check whether the datasets were actually useful, I fine-tuned Qwen2.5-Coder-7B-Instruct on them and ran HumanEval / HumanEval+ (pass@1, 5 runs). I picked these benchmarks because they match the dataset's focus and run fast on my machine:

    - Base: 55.5% / 49.0%
    - FT V2 (1135 samples from the app): 60.0% / 54.0%

    Error bars don't overlap, so it's at least not noise. Obviously HumanEval is only one slice — YMMV with other categories / criteria.
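
    For anyone curious what "error bars don't overlap" means here, a minimal sketch (the per-run scores below are hypothetical — the post only reports the means; stderr over the 5 runs is my assumption about the setup):

    ```python
    import statistics

    def summarize(scores):
        """Mean and standard error of pass@1 over repeated eval runs."""
        mean = statistics.mean(scores)
        sem = statistics.stdev(scores) / len(scores) ** 0.5
        return mean, sem

    # Hypothetical per-run pass@1 scores averaging to the reported means
    base  = [55.0, 55.8, 55.3, 56.0, 55.4]  # mean 55.5
    ft_v2 = [59.6, 60.3, 59.9, 60.2, 60.0]  # mean 60.0

    (m_b, s_b), (m_f, s_f) = summarize(base), summarize(ft_v2)
    # Non-overlapping bars: base's upper bound sits below FT's lower bound
    print(m_b + s_b < m_f - s_f)
    ```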

    Stack: Next.js 16 + FastAPI + SQLite, packaged as a standalone binary (Win/Linux).

    Code: https://github.com/AronDaron/dataset-generator
    Fine-tuned model: https://huggingface.co/AronDaron/Qwen2.5-Coder-7B-Instruct-D...
    Datasets: https://huggingface.co/datasets/AronDaron/dataset-gen-v1 / https://huggingface.co/datasets/AronDaron/dataset-gen-v2

    Happy to hear feedback, especially if something doesn't work on your setup or if the approach misses something obvious — this is my first public tool release.