Local Llama3 with Ollama

It’s been two days since the launch of Llama 3, and I’ve been itching to try it out. Here is a short blog post experimenting with Ollama to locally run this state-of-the-art (SOTA) large language model (LLM). But first, what is Llama 3? Meta’s Llama 3 is the most capable openly available LLM as of this writing. On April 18th, 2024, Meta released two models of this generation, with 8B and 70B parameters....

April 21, 2024 · Me

Streaming LLM Requests with Python

Today I was playing around with an LLM called Mistral 7B by running it locally with Ollama. Once installed, Ollama provides a chat interface and an API that you can call from anywhere. curl http://localhost:11434/api/generate -d '{ "model": "mistral", "prompt": "tell me a joke?" }' When running this API call, I noticed that responses were streamed back to the client in a way that appears to be token by token. Take a look at running the command....
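The streamed response from that endpoint arrives as newline-delimited JSON, one chunk per line, each carrying a small piece of the generated text. A minimal Python sketch of reassembling such a stream (the `response` and `done` field names follow Ollama's `/api/generate` output; the sample chunks below are made up for illustration, not a real model reply):

```python
import json

def collect_stream(lines):
    """Reassemble a streamed response from newline-delimited JSON chunks."""
    text = []
    for line in lines:
        chunk = json.loads(line)
        # Each chunk contributes a fragment of text, roughly a token.
        text.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(text)

# Simulated chunks in the shape Ollama streams back (illustrative only).
sample = [
    '{"response": "Why", "done": false}',
    '{"response": " did", "done": false}',
    '{"response": "...", "done": true}',
]
print(collect_stream(sample))  # Why did...
```

Against a live server, the same loop would iterate over the lines of the HTTP response body instead of a hardcoded list.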

April 14, 2024 · Me