
Getting started with OLLAMA

LLM is the new fancy word: everyone is talking about LLMs, and it is actually pretty cool; it is always nice to get a breath of fresh air with this new AI/LLM technology.

So I wanted to test some LLMs on a server and/or on my computer using WSL, not just use ChatGPT in my browser. I wanted to see how you can install and interact with an LLM (and also how to RAG -- Retrieval-Augmented Generation -- it, more on that later). I tried a few tools and, in my opinion, the simplest to use is ollama. It is also free and open source under a permissive MIT license, which basically means "do whatever you want with it, just don't sue me".


1/ Install ollama:
To install ollama, just go to the ollama download page and you'll find the command line below for Linux, which you just copy and paste into a terminal:
$ curl -fsSL https://ollama.com/install.sh | sh
>>> Downloading ollama...
######################################################################## 100.0%
>>> Installing ollama to /usr/local/bin...
[sudo] password for fred:
>>> Creating ollama user...
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
Created symlink /etc/systemd/system/default.target.wants/ollama.service → /etc/systemd/system/ollama.service.
>>> NVIDIA GPU installed.
$
And that's it, super easy so far! You can check that ollama is running, and also check the version:
$ service ollama status
  ● ollama.service - Ollama Service
     Loaded: loaded (/etc/systemd/system/ollama.service; enabled; vendor preset: enabled)
     Active: active (running) since Thu 2024-05-23 15:22:07 CEST; 27min ago
   Main PID: 826 (ollama)
      Tasks: 18 (limit: 9471)
     Memory: 518.4M
     CGroup: /system.slice/ollama.service
             └─826 /usr/local/bin/ollama serve
			 . . .
$ ollama -v
  ollama version is 0.1.38
$
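Note that the installer also sets up a local HTTP API for ollama (it listens on 127.0.0.1:11434, as the upgrade output later in this post reminds you), so another quick sanity check is to curl that port:
$ curl http://127.0.0.1:11434
If the service is up, you get a short confirmation message back (something like "Ollama is running"); if not, curl will complain that the connection is refused.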


2/ Get a model:
Now that the tool to interact with an LLM is installed, we need to get one. The list of models available for ollama is available here. At the time this blog post is written, there is no command line option to easily get that list of available models. I could find many issues opened to request such a feature (like this one, for example) but it is not implemented yet. In the meantime, I wrote a simple script to get that list in a nice table from your command line (a sketch of the idea is shown after the sample output below).
$ ./ls_ollama_avail_model.sh
  Model                    : Description
------------------------------------------------------------------------------------------------------------------------
alfred                   : A robust conversational model designed to be used for both chat and instruct use cases.
all-minilm               : Embedding models on very large sentence level datasets.
aya                      : Aya 23, released by Cohere, is a new family of state-of-the-art, multilingual models that support 23 languages.
bakllava                 : BakLLaVA is a multimodal model consisting of the Mistral 7B base model augmented with the LLaVA  architecture.
. . .
llama3                   : Meta Llama 3: The most capable openly available LLM to date
. . .
zephyr                   : Zephyr is a series of fine-tuned versions of the Mistral and Mixtral models that are trained to act as helpful assistants.
------------------------------------------------------------------------------------------------------------------------
$
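If you are curious, here is a minimal sketch of the idea behind such a script; be warned that the parsing below is an assumption about the current layout of the ollama.com/library page (and it only extracts the model names, not the descriptions), so it may need adjusting whenever the site changes:
#!/bin/bash
# Sketch: list the model names published on ollama.com/library.
# Assumption: each model card links to /library/<model-name>.
curl -s https://ollama.com/library \
  | grep -oE 'href="/library/[^"]+"' \
  | sed -e 's;href="/library/;;' -e 's;";;' \
  | sort -u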
You'll see that ollama supports a ton of LLMs, which means a ton of fun testing these guys!

Let's start by installing llama3, which is the latest LLM from Meta.
$ ollama pull llama3
pulling manifest
pulling 6a0746a1ec1a... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏ 4.7 GB
pulling 4fa551d4f938... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  12 KB
pulling 8ab4849b038c... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  254 B
pulling 577073ffcc6c... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  110 B
pulling 3f8eb4da87fa... 100% ▕███████████████████████████████████████████████████████████████████████████████████████▏  485 B
verifying sha256 digest
writing manifest
removing any unused layers
success
$
Again: super easy. You can use the ollama list command to see the models installed on your system:
$ ollama list
NAME            ID              SIZE    MODIFIED
llama3:latest   365c0bd3c000    4.7 GB  About a minute ago
$
Note that you can install and use as many models as it pleases you, the only limit being your storage size; as you can see, this llama3 is 4.7 GB. Note that the models are stored in ~/.ollama/models (/usr/share/ollama/.ollama/models under WSL). To store the models in an alternative destination (a filesystem with a lot of free space, for example), you are supposed to use the OLLAMA_MODELS variable; I tried with no luck and found many bugs related to this, so for now you would need to make a symbolic link to an alternate path instead (see the sketch below). This should be fixed eventually, though.
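Here is what that workaround can look like as a minimal sketch, assuming the default WSL path mentioned above and a hypothetical /data filesystem with more room (adjust the paths to your own setup):
$ sudo service ollama stop
$ sudo mv /usr/share/ollama/.ollama/models /data/ollama-models      # /data is just an example destination
$ sudo ln -s /data/ollama-models /usr/share/ollama/.ollama/models   # ollama keeps using its default path
$ sudo chown -R ollama:ollama /data/ollama-models                   # the service runs as the ollama user
$ sudo service ollama start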

3/ Interact with a model:
Now the fun part: interacting with your model! It is as simple as the previous steps:
$ ollama run llama3
>>> in a word, who are you ?
I am LLaMA.
>>> who is luke skywalker father ?
Luke Skywalker's father is Anakin Skywalker, also known as Darth Vader.
>>> exit
It was nice chatting with you! If you want to start a new conversation or ask me something else, feel free to come back and say hi anytime. Until next time, may the Force be with you!
>>>
See how llama3 concluded the conversation with a "may the Force be with you!" as I was asking about Star Wars; this guy is kind of smart :) -- and note that it is actually CTRL+D that closes the conversation, not exit. It is interesting to see how the same conversation goes with, for example, llama2, which is llama3's predecessor:
$ ollama run llama2
>>> in a word, who are you ?
I am LLaMA, an AI assistant developed by Meta AI that can understand and respond to human input in a conversational manner. I am trained on a massive dataset of text from the internet and can generate human-like responses to a wide range of topics and questions. I can be used to create chatbots, virtual assistants, and other applications that require natural language understanding and generation capabilities.
>>> who is luke skywalker father ?
Luke Skywalker's father is Anakin Skywalker.
>>> exit
You have entered the command to exit. I am sorry, but I cannot comply with this request as I am a text-based AI assistant and do not have the capability to physically exit or leave a virtual environment. Is there anything else I can help you with?
>>>
This is not looking as good as llama3; let's try mistral, which is another model with a great reputation:
$ ollama run mistral
>>> in a word, who are you ?
 Assistant. I'm here to help! How can I assist you today?
 >>> who is luke skywalker father ?
 Luke Skywalker's father is Darth Vader, who is revealed to be Anakin Skywalker in the original Star Wars trilogy. However, this is a twist that is known from the beginning of the sequel trilogy, as Rey reveals Darth Vader's identity to Luke in "The Last Jedi."
>>> exit
 Goodbye for now! Don't hesitate to come back if you have any more questions or need assistance with something else. I'm always here to help!
>>>
Again some differences, which makes sense: these are different models. Anyway, I'll leave you to it; I could play with these things all day long :)
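One last thing on this part: you are not limited to the interactive prompt, as the ollama service also exposes a local REST API (the same 127.0.0.1:11434 endpoint the installer mentions), which is handy for scripting. A minimal example with curl, assuming llama3 is already pulled; with "stream": false the whole answer comes back as a single JSON document (with a "response" field among other metadata) instead of token by token:
$ curl -s http://127.0.0.1:11434/api/generate -d '{
    "model": "llama3",
    "prompt": "in a word, who are you ?",
    "stream": false
  }'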

4/ Update ollama:
Last but not least, ollama being kind of new, you may need to update it on a regular basis. Hopefully, this is also very simple: remember that command line to install ollama? Well, it is the same one that performs the upgrade!
$ ollama -v
ollama version is 0.1.38
$ curl -fsSL https://ollama.com/install.sh | sh
>>> Downloading ollama...
######################################################################## 100.0%
>>> Installing ollama to /usr/local/bin...
[sudo] password for fred:
>>> Adding ollama user to render group...
>>> Adding ollama user to video group...
>>> Adding current user to ollama group...
>>> Creating ollama systemd service...
>>> Enabling and starting ollama service...
>>> Nvidia GPU detected.
>>> The Ollama API is now available at 127.0.0.1:11434.
>>> Install complete. Run "ollama" from the command line.
$ ollama -v
ollama version is 0.1.47
$
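By the way, if you want to check whether a newer release is available before re-running the installer, one quick way (a sketch relying on the standard GitHub releases API for the ollama/ollama repository) is:
$ curl -s https://api.github.com/repos/ollama/ollama/releases/latest | grep '"tag_name"'
This prints the latest release tag (a v-prefixed version number), which you can compare with the output of ollama -v.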

That wraps it up for this one; ollama is a very easy tool to use, so there is no excuse for not using it and playing with these famous LLMs!
