Leanstral 1.5(docs.mistral.ai)

165 pointsby vetronauta10 hours ago9 comments

Grimblewald2 hours ago
Got curious, sign up, add money to account, try to use. Can't, it's a labs model. Fine, let's enable labs. Can't, unspecified error. Fine, lets contact customer support as instructed, can't no customer support, just a half-assed FAQ, that seems vibe-coded and searched poorly, totally irrelevant answers coming up for all queries tried. Then it hit me:
If AI makes good customer support, then why does no AI company use theirs to provide customer support?
- gregman16 minutes ago
  “Don’t get high on your own supply” - I think it’s Microsoft’s motto.
- criticalfault40 minutes ago
  These guys don't answer emails. Same for qwant.
  Sample of two, but I'm assuming french companies don't like to being contacted n English.
- saghman hour ago
  No one ever thought it made good customer support. It makes cheap customer support, and quite a lot of companies already have shitty customer support because they don't care about it being good, so they're thrilled to get to cut costs further.
  It's "good" from the perspective of a company that's annoyed to have to spend money on actually fixing things.
- Mashimoan hour ago
  Further down someone said the support is great and they respond within the day.
- tmikaeldan hour ago
  This isn’t the first time. I’m amazed at how they manage to fumble releases over and over …
- adithyassekhar2 hours ago
  Because that AI will either expose their business or it will be so nerfed it’s useless.
henryrobbins003 hours ago
What a coincidence! I just released OpenATP earlier today. OpenATP is an open-source Python package and CLI for agentic automated theorem provers. It includes support for Leanstral with Mistral’s Vibe harness. The previous production Leanstral model was deprecated on May 22nd. I will update the package to point to Leanstral 1.5 ASAP!
GitHub: https://github.com/henryrobbins/open-atp
Docs: https://open-atp.henryrobbins.com
c7ban hour ago
I'm not sure I understand the Weights policy. This site says the weights are Apache-licensed, suggesting it's open weights. But I can't find a download link. Their Huggingface profile seems to only provide an earlier snapshot [0]. Any pointers on whether/where we can or will be able to download the weights?
[0] https://huggingface.co/mistralai/Leanstral-2603
__natty__8 hours ago
Discussion about Leanstral 1: https://news.ycombinator.com/item?id=47404796
pmarreck3 hours ago
Lean 4 and Idris 2 are underrated, and likely great for LLM's to code in (since they provide additional guarantees)
impodimium3 hours ago
Interesting that this only specialized for Lean4 and not for similar like Coq
- DoctorOetker3 hours ago
  I would have preferred actual proof objects, as in Metamath's: separate the actual proof from the heuristics used to find it (also valuable, but a different thing).
doctorpangloss7 hours ago
Real talk, does anyone use anything from Mistral because it performs the best, by whatever secular metric of your choosing? Or is it only used "because EU"? Just focus on answering the question. I wonder if anyone has observed it perform better on any objective metric in any rigorous setting.
- Confiks6 hours ago
  For writing and languange learning it's very decent, especially Mistral Large. The pricing is very good too. I really like the consistently low time to first token and good token per second. Claude, especially in the past, would be very inconsistent, often with outages. Mistral mostly just always works and is very fast.
  Technical questions are unfortunately hit or miss. I'm lately pretty much always using a system prompt that emphasizes short answers [1], and Opus regularly one-shots it while Mistral needs a follow up. I use big-AGI as a model router [2] (dumb name, great software), which makes switching midway very easy though. For coding I'm still using Claude Code mostly out of inertia (although I really want to move to an OSS harness) and the one time I tried their `vibe` tool months ago it was a bit rough.
  Mistral TTS with diarization is also great and cheap. That's the only thing for which I use their web UI.
  [1] Give a short but helpful answer to the question the user asks. When helping with a computer-related task, unless the user asks, don't give any installation or setup instructions, but just get straight to the point. When the user asks a follow up question, give a more complete and longer answer while still not overexplaining. When the user prefaces the question with "short mode off" in any question, give a full and well considered reply.
  [2] https://github.com/enricoros/big-AGI
  - BartjeD34 minutes ago
    Mistral doesn't have caching on batches. For me that meant they are 10x more expensive than Google.
    I think its dumb.
    Their support is hidden away in a chat bubble at the bottom. But they do respond promptly.
    Its decent, but after switching to Google i wouldn't go back
  - Skinney2 hours ago
    vibe has improved _a lot_ during the past few months, fyi.
    The new Mistral Medium 3.5 is also a big improvement over devstral-2
- ashenke6 hours ago
  I use their Voxtral Mini STT audio model to automatically transcribe my podcasts into markdown. Out of all the STT models I've tried, it's both the best performing and one of the cheapest! It's really accurate, feeding the episode notes and the podcast description ensures all names are properly spelled, and speaker diarization works really great. (I just do a Gemini flash pass at the end to identify the speakers, so it shows the host name instead of "Speaker 1")
- troyvit7 hours ago
  We are not Mistral's target audience. For instance I don't know if Leanstral performs the best as a "formal proof engineering model optimised for automated theorem proving and autoformalization" because I don't even know wth that is or who else does it.
  Mistral themselves focus more on b2b; financial services, manufacturing, stuff like that, and they get some big clients that way.
  Despite not being their target, I started using them because they have many open models. I continue using them because, yeah EU, but also because the community is great and the tool makes me think more than Claude does. Last, I stick with them because they are one of the few AI companies that are up-front about their environmental impact and are actually trying to minimize it while still providing a decent product.
  - computerex7 hours ago
    It's for mathematics. There is this programming language: https://lean-lang.org/
    If you can express a solution in Lean you can formally prove or disprove it. Formal verification is making a debut in traditional engineering toolkits.
- psalaun2 hours ago
  I use it as my workhorse for coding and general chat questions, because it's good enough 80% of the time, and indeed it's french/european (with heavy US capital tho...).
  We complain too much about not having enough major competitors in the IT space, to not support a burgeoning one even if it's less powerful than SOTA labs
  - barrenkoan hour ago
    Well, if you're a taxpayer in EU you're already supporting it implicitly.
- data-ottawa7 hours ago
  Mistral medium is considerably better at writing than Opus.
  I’ve also found it very good at pulling info from pdfs. Even a complicated festival with multiple venues and timetables.
  - tjwebbnorfolk4 hours ago
    Writing what? I found it worse than gemma4 at coding even though it's 4x the parameter size
- trentor7 hours ago
  I like the models for creative writing. They have a distinct voice that is different from the other llms.
  - SwellJoe7 hours ago
    I made a game (https://prose-or-con.com) where you pick whether writing is AI or human. Mistral is a bonkers weird writer. So weird I fell for it a couple of times because I thought, "No way a model writes this weird." Not, like, incorrect grammar or spelling or anything, just...off-kilter. Kinda sassy.
    vlian20883 hours ago
    needs a leaderboard of models most often mistaken for humans.
    SwellJoe3 hours ago
    Yes, it's on the todo list, but I need more data. Only a half dozen people have played it and submitted a score. I'm storing the hashes of passages people got right and wrong so I can make exactly that chart at some point. I think both "the most human-like AI" and "the most AI-like human" are both interesting pieces of data, but I don't know either yet.
    vlian2088an hour ago
    try posting it on r/localllama and r/sillytavernai
- Adrig7 hours ago
  A few months ago, I had some data cleaning to do; their small model was surprisingly efficient and got the job done for 0.2x what I expected to run (Anthropic Sonnet / Haiku). Their TTS / STT is also roughly at the frontier, at least for French.
  But I admit I only consider them because they're from France. Haven't seen a dimension where they're competitive for general users
- suprjami3 hours ago
  I still prefer Mistral Nemo 12B for text summarisation tasks. It has a nice style. The Mistral Small 24B is also decent. I have a YouTube transcript summariser which I like these for.
  However these days I usually have Qwen 3.6 27B already loaded so I mostly just use that instead.
- adev_7 hours ago
  > Mistral because it performs the best, by whatever secular metric of your choosing?
  I am. I use them primarily through their vibe CLI.
  Reason is simple: They are cheaper (by almost one order of magnitude compared to Claude) and still do the job pretty well.
  For small programming tasks, quick prototyping, refactoring or anything verbose and not requiring a context too large: I first go to Mistral and then eventually to Claude if I'm unsatisfied.
  I also found out some of their models to be more responsive than OpenAI ones (which is not so surprising considering the size).
  My tasks are mainly C++ and Python programming. People in other languages might not share my enthusiasm.
  - jatora6 hours ago
    Your reason can't be cost because there are superior models that are cheaper than Mistral models, for coding. So i re-ask the question
    adev_6 hours ago
    > Your reason can't be cost because there are superior models that are cheaper than Mistral models
    Nope. This is not my experience.
    Public pricing in token/$ is only part of the equation.
    Mistral tooling to consume significantly less tokens-per-given-task than the Anthropic ones.
    My bills currently reflects that.
    tjwebbnorfolk4 hours ago
    I think other commenter is talking about smaller/cheaper models like Qwen that outperform mistral on just about every metric
    greenavocado4 hours ago
    Compare to Xiaomi MiMo-V2.5 you will be shocked
- evilmonkey197 hours ago
  I use it because EU and API pricing is decent to me. And support is awesome also. They reply the same day or at most the next day, and they follow the ticket great. It isn't that bad, but neither the best.
  - jatora6 hours ago
    Why do you need support so often?
- hakunin6 hours ago
  I use it because it’s a simple, convenient and cheap OCR api. Specifically via my ringbinder[1] tool.
  [1]: https://github.com/maxim/ringbinder
- bee_rider3 hours ago
  I liked that their website didn’t ask for my phone number, IIRC.
- Hamuko2 hours ago
  >Just focus on answering the question.
  Are you trying to instruct me like an LLM?
- refulgentis6 hours ago
  OCR is off the charts good on every metric you can think of.
  LLMs are a near-afterthought at this point if you don’t have data residency requirements. I love them and they’re slightly underrated, their models are consistently well-trained, open, but as you note, behind. There is no metric that will say they’re ahead in anything.
esafak7 hours ago
Is this useful for specifying programs too or only theorems?
- zeckalpha6 hours ago
  Curry-Howard correspondence.
  - esafak2 hours ago
    It may be theoretically possible, but is it ergonomic and useful? Do you use Lean for your programs?
    nymaltan hour ago
    I used Lean for AoC last time and it’s really good.
    tsterinan hour ago
    use https://rocq-prover.org/ for that purpose
mertleee6 hours ago
[dead]