For more advanced use cases, I recommend looking at the llm_routing demo [2] and the currency_exchange demo [3].
We currently provide a seamless interface to major providers like OpenAI, Mistral, and DeepSeek, and also support hooking up to local providers like Ollama [4].
[1] - https://github.com/katanemo/archgw?tab=readme-ov-file#quicks...
[2] - https://github.com/katanemo/archgw/tree/main/demos/use_cases...
[3] - https://github.com/katanemo/archgw/tree/main/demos/samples_p...
[4] - https://github.com/katanemo/archgw/tree/main/demos/use_cases...
We took the tasks that are not business- or domain-specific and trained our models to offer SOTA performance at 1/10th the cost and 10x the speed. For example, Arch-Function can process 5k tokens per second.