Local Llama3 with Ollama

It’s been two days since the launch of Llama 3, and I’ve been itching to try it out. Here is a short blog post experimenting with Ollama to locally run this state-of-the-art (SOTA) large language model (LLM). But first, what is Llama 3? Meta’s Llama 3 is the most capable openly available LLM as of this writing. On April 18th, 2024, Meta released two models of this generation, with 8B and 70B parameters....

April 21, 2024 · Me

Streaming LLM Requests with Python

Today I was playing around with an LLM called Mistral 7B by running it locally with Ollama. Once installed, Ollama provides a chat interface and an API that you can call from anywhere. curl http://localhost:11434/api/generate -d '{ "model": "mistral", "prompt": "tell me a joke?" }' When running this API call, I noticed that responses were streamed back to the client in a way that appears to be token by token. Take a look at running the command....
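The streamed response from that endpoint arrives as newline-delimited JSON, one chunk per line, each carrying a small piece of the generated text. A minimal Python sketch of reassembling such a stream (the `response` and `done` field names follow Ollama's `/api/generate` output; the sample chunks below are made up for illustration, not a real model reply):

```python
import json

def collect_stream(lines):
    """Reassemble a streamed response from newline-delimited JSON chunks."""
    text = []
    for line in lines:
        chunk = json.loads(line)
        # Each chunk contributes a fragment of text, roughly a token.
        text.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(text)

# Simulated chunks in the shape Ollama streams back (illustrative only).
sample = [
    '{"response": "Why", "done": false}',
    '{"response": " did", "done": false}',
    '{"response": "...", "done": true}',
]
print(collect_stream(sample))  # Why did...
```

Against a live server, the same loop would iterate over the lines of the HTTP response body instead of a hardcoded list.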

April 14, 2024 · Me