The world needs products like this that are local-first and open source. Let me train an open-source LLM on my M2 MacBook with a desktop app, and then I'll consider giving you my money. App developers integrating LLMs need to be able to experiment and see the potential before storing everything in the cloud.
We've built the platform primarily for companies that serve LLMs in production, so even if we allowed you to fine-tune on device, sooner or later you will find yourself in a position where you want to deploy the model.
We want to streamline this whole process, end-to-end.
With that being said, I do agree that we shouldn't store everything in the cloud. Here's what we're doing about it:
1. Any data in FinetuneDB (evals, logs, datasets, etc.) can be exported or deleted.
2. Fine-tuned model weights for open-source models can be downloaded.
3. Using our inference stack is not a requirement. Many users are happy with only the dataset manager (which is 100% free).
4. We are exploring options to integrate external databases and storage providers with FinetuneDB, allowing datasets to be stored off our servers.
Feedback:
1. Since you support multiple models, it would be cleaner and more correct to give the API a name that doesn't start with OpenAI.
2. SDK examples in other languages like Python under `show code` would be nice.
3. It was a bit confusing to figure out how to fine-tune the model; it would be nice if that were explicitly available as a side pane.
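On the Python point: if the endpoint is OpenAI-compatible (which the current naming suggests), you don't strictly need an SDK. Here's a minimal stdlib-only sketch; the base URL, path, and auth scheme are assumptions on my part, not FinetuneDB's documented API:

```python
import json
import urllib.request

# Assumed values -- check the FinetuneDB docs for the real endpoint and auth.
BASE_URL = "https://api.finetunedb.example/v1"
API_KEY = "YOUR_API_KEY"

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but don't send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_chat_request("my-finetuned-model", "Hello!")
# To actually send it: urllib.request.urlopen(req) -- omitted here so the
# sketch stays offline.
```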
Questions:
1. Can you speak a bit about your tech stack, if that's alright?
2. How do you currently scale inference when more requests come in?
1. Where exactly did you see this? There are internal FinetuneDB API keys and external API keys like OpenAI's. It is confusing though, I agree!
2. Work in progress.
3. I agree, thanks for the feedback.
There are multiple components working together, so it's hard to define a single tech stack. When it comes to the web app, Remix is my framework of choice, and I can highly recommend it.
Does the platform also help speed up the labelling of semi-structured data? I have a use case where I need to take data in Word, PowerPoint, and PDF and label paragraphs/sections, which could then be used to fine-tune a model.
We utilize LoRA for smaller models and QLoRA (quantized LoRA) for 70B+ models to improve training speeds. So when downloading model weights, what you get is the adapter weights and `adapter_config.json`. It should work with llama.cpp!
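For anyone poking at the downloaded artifacts: `adapter_config.json` is the standard PEFT adapter config, and a quick sanity check before converting for llama.cpp might look like this. The sample field values below are illustrative, not what FinetuneDB actually emits:

```python
import json
import tempfile

# A sample PEFT-style adapter_config.json; values are illustrative only.
sample_config = {
    "peft_type": "LORA",
    "base_model_name_or_path": "meta-llama/Meta-Llama-3.1-8B",
    "r": 16,
    "lora_alpha": 32,
    "target_modules": ["q_proj", "v_proj"],
}

def summarize_adapter(path: str) -> dict:
    """Pull out the LoRA settings relevant when converting an adapter."""
    with open(path) as f:
        cfg = json.load(f)
    if cfg.get("peft_type") != "LORA":
        raise ValueError(f"Not a LoRA adapter: {cfg.get('peft_type')}")
    return {
        "base_model": cfg["base_model_name_or_path"],
        "rank": cfg["r"],
        "alpha": cfg["lora_alpha"],
        "target_modules": cfg["target_modules"],
    }

with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(sample_config, f)

print(summarize_adapter(f.name))
```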
Happy to discuss this in detail, how do you measure performance?
About Gemini Flash: we add new model providers entirely based on feedback, and Gemini is next on the roadmap!
You said that you only do LoRA fine-tuning, and your pricing for Llama 3.1 8B is $2/1M tokens. That seems high to me. I can do full fine-tuning (so not just a LoRA!) of Llama 3.1 8B for something like ~$0.2/M if I rent a 4090 on RunPod, and ~$0.1/M if I just get the cheapest 4090 I can find on the net.
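The back-of-the-envelope math behind those figures is straightforward; in this sketch the throughput number is an illustrative assumption, not a benchmark:

```python
def cost_per_million_tokens(gpu_dollars_per_hour: float,
                            tokens_per_second: float) -> float:
    """$/1M tokens = hourly GPU cost / tokens trained per hour, scaled to 1M."""
    tokens_per_hour = tokens_per_second * 3600
    return gpu_dollars_per_hour / tokens_per_hour * 1_000_000

# Illustrative: a ~$0.35/hr 4090 sustaining ~500 training tokens/s
print(round(cost_per_million_tokens(0.35, 500), 2))  # 0.19
```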
Once a model is fine-tuned, you can run inference on Llama 3.2 3B for as low as $0.12 per million tokens. This includes access to logging, evaluation, and continuous dataset improvement through collaboration, all without needing to set up GPUs or manage the surrounding infrastructure yourself.
Our primary goal is to provide the best dataset for your specific use case. If you decide to deploy elsewhere to reduce costs, you always have the option to download the model weights.
So your $2/M tokens for LoRA fine-tuning tells me that either you have a very inefficient (per dollar) fine-tuning pipeline (e.g. renting expensive GPUs from AWS) and need such a high price to make any money, or you're charging roughly 20x-30x more than it costs you. If it's the latter, fair enough; some people will pay a premium for all of the extra features! If it's the former, you might want to consider optimizing your pipeline to bring those costs down. (:
This is a very common topic, so I wrote a blog post that explains the difference between fine-tuning and RAG if you're interested: https://finetunedb.com/blog/fine-tuning-vs-rag
For a defined, specific use case it's certainly possible to beat their performance, but things get harder when you try to build a general solution.
To answer your question, the format of the data depends entirely on the use case and how many examples you have. The more examples you have, the more flexible you can be.
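As one concrete option, a common starting point is the chat-style JSONL format used by OpenAI-compatible fine-tuning pipelines: one JSON object per line, each holding a `messages` array. A minimal sketch of writing and sanity-checking such a line (the ticket-labelling example is made up):

```python
import json

# One hypothetical training example in chat-style JSONL format.
example = {
    "messages": [
        {"role": "system", "content": "You label support tickets."},
        {"role": "user", "content": "My invoice is wrong."},
        {"role": "assistant", "content": "billing"},
    ]
}

line = json.dumps(example)  # append one line like this per training example

def is_valid_example(raw: str) -> bool:
    """Minimal structural check for one JSONL training line."""
    try:
        msgs = json.loads(raw)["messages"]
    except (ValueError, KeyError):
        return False
    return (
        len(msgs) >= 2
        and all({"role", "content"} <= m.keys() for m in msgs)
        and msgs[-1]["role"] == "assistant"
    )

print(is_valid_example(line))  # True
```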
Question: when do you expect to release your Python SDK?
With that being said, feel free to email us with your use case; I could build the SDK within a few days!
More details here: https://docs.finetunedb.com/getting-started/pricing
Any specific features or use cases you're interested in?
Some minor feedback - I went to the website to look for pricing (scanned the header bar), and couldn't find it.
Didn't think to look in the docs, as it's almost always available from the homepage.
Appreciate you linking it here, but if I hadn't come from HN, I'd assume this is a "contact us for pricing" situation, which is a bit of a turnoff.