---
title: Full Guide - Install Ubuntu Server and Configure Ollama with CodeGemma and Phi-3 Mini
description: AI Project
published: true
date: 2025-06-06T12:27:30.919Z
tags: ollama, ai, guide, walk-through, ubuntu server, server
editor: markdown
dateCreated: 2025-06-06T12:27:28.985Z
---
# Install Ubuntu Server and Configure Ollama with CodeGemma and Phi-3 Mini
This guide provides step-by-step instructions to set up a headless **Ubuntu Server 24.04 LTS** on a PC with the following specs, install **Ollama** with **CodeGemma 7B** for user `arti` (Python coding assistance) and **Phi-3 Mini (3.8B)** for user `phixr` (system administration tasks), and restrict each user's SSH access to their respective interactive AI session:
- **GPU**: Radeon RX 6600 (8 GB VRAM)
- **CPU**: AMD Ryzen 7 2700 (8 cores, ~3.2 GHz)
- **RAM**: 64 GB (2133 MT/s)
- **Storage**: 465.8 GB NVMe SSD (`nvme0n1`), 2x 931.5 GB SSDs (`sda`, `sdb`)
The setup is command-line only, with no desktop environment or window manager, and assumes you're replacing any existing OS (e.g., Proxmox). Both models use Q4_K_M quantization to fit within 8 GB VRAM and <20 GB of disk space, leveraging ROCm for GPU acceleration.
---
## Step 1: Prepare for Ubuntu Server Installation
Let's prepare to install Ubuntu Server on your NVMe SSD, replacing any existing OS.
1. **Download Ubuntu Server 24.04 LTS**:
- On another computer, download the ISO:
```bash
wget https://releases.ubuntu.com/24.04/ubuntu-24.04-live-server-amd64.iso
```
- Or download manually from [ubuntu.com](https://ubuntu.com/download/server).
- Verify the ISO:
```bash
sha256sum ubuntu-24.04-live-server-amd64.iso
```
- Check the hash against [Ubuntu's checksums](https://releases.ubuntu.com/24.04/) (a scripted check is sketched at the end of this step).
2. **Create a Bootable USB Drive**:
- Use a USB drive (≥4 GB). Identify it with:
```bash
lsblk
```
- Write the ISO (replace `/dev/sdX` with your USB device):
```bash
sudo dd if=ubuntu-24.04-live-server-amd64.iso of=/dev/sdX bs=4M status=progress && sync
```
- **Warning**: Double-check `/dev/sdX` to avoid overwriting other drives.
- Alternatively, use Rufus (Windows) or Etcher (cross-platform).
3. **Backup Existing Data**:
- If replacing Proxmox or another OS, back up data to an external drive or another system:
```bash
scp -r /path/to/data user@other-machine:/destination
```
4. **Boot from USB**:
- Insert the USB, reboot, and enter the BIOS (usually `Del` or `F2`).
- Set the USB as the first boot device.
- Save and reboot to start the Ubuntu installer.
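Rather than comparing the hash by eye, you can have `sha256sum` check it against Ubuntu's published checksum file (a minimal sketch, assuming the ISO sits in your current directory):
```bash
# Fetch the official checksum list and verify only the server ISO entry
wget -q https://releases.ubuntu.com/24.04/SHA256SUMS
grep ubuntu-24.04-live-server-amd64.iso SHA256SUMS | sha256sum -c -
# Expected output: ubuntu-24.04-live-server-amd64.iso: OK
```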
---
## Step 2: Install Ubuntu Server 24.04 LTS
Let's install Ubuntu Server on the NVMe SSD (`nvme0n1`, 465.8 GB).
1. **Start the Installer**:
- Select “Install Ubuntu Server”.
- Set language (English), keyboard layout, and network (DHCP or static IP).
2. **Configure Storage**:
- Choose “Custom storage layout”.
- Partition `nvme0n1`:
- **EFI Partition**: 1 GB, `fat32`, mount at `/boot/efi`.
- **Root Partition**: 464.8 GB, `ext4`, mount at `/`.
- Example (in installer):
- Select `nvme0n1`, create partitions as above.
- Write changes and confirm.
- Optional: Use `sda` or `sdb` (931.5 GB SSDs) for additional storage (e.g., mount as `/data`); a post-install sketch appears at the end of this step.
3. **Set Up Users and SSH**:
- Set hostname (e.g., `ai-server`).
- Create an admin user (e.g., `admin`):
- Username: `admin`
- Password: Set a secure password.
- Enable “Install OpenSSH server”.
- Skip importing SSH keys unless needed.
4. **Complete Installation**:
- Select no additional packages (Ollama and ROCm will be installed later).
- Finish and reboot.
5. **Verify Boot**:
- Remove the USB, boot into Ubuntu, and log in as `admin` via a local terminal or SSH:
```bash
ssh admin@<server-ip>
```
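If you decide to use one of the SATA SSDs for extra storage, here is a minimal post-install sketch (it assumes you dedicate all of `sda` to `/data` and are fine erasing it; double-check the device name with `lsblk` first):
```bash
# Format the whole disk and mount it persistently at /data
sudo mkfs.ext4 /dev/sda
sudo mkdir -p /data
UUID=$(sudo blkid -s UUID -o value /dev/sda)   # mount by UUID so device renames do not break it
echo "UUID=$UUID /data ext4 defaults 0 2" | sudo tee -a /etc/fstab
sudo mount -a
```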
---
## Step 3: Install AMD ROCm for Radeon RX 6600
Let's set up ROCm to enable GPU acceleration for Ollama.
1. **Update System**:
```bash
sudo apt update && sudo apt upgrade -y
```
2. **Add ROCm Repository**:
- Install dependencies and add the ROCm 5.7 repository (if `apt-key` is unavailable on your release, see the keyring-based sketch at the end of this step):
```bash
sudo apt install -y wget gnupg
wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | sudo apt-key add -
echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/5.7 ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
```
3. **Install ROCm**:
```bash
sudo apt update
sudo apt install -y rocm-libs rocminfo
```
4. **Verify ROCm**:
- Reboot:
```bash
sudo reboot
```
- Check GPU:
```bash
rocminfo
```
- Look for “Navi 23 [Radeon RX 6600]”.
- Check VRAM:
```bash
rocm-smi --showmeminfo vram
```
- Expect ~8192 MB.
5. **Troubleshooting**:
- If no GPU is detected, verify:
```bash
lspci | grep -i vga
```
- Try ROCm 5.6:
```bash
echo 'deb [arch=amd64] https://repo.radeon.com/rocm/apt/5.6 ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
sudo apt update && sudo apt install -y rocm-libs
```
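Note that `apt-key` is deprecated on recent Ubuntu releases. If it is missing on your system, a keyring-based setup along these lines should work instead (a sketch; same repository, just referenced with `signed-by`):
```bash
# Store the ROCm signing key in a dedicated keyring and point the repo entry at it
sudo mkdir -p /etc/apt/keyrings
wget -qO - https://repo.radeon.com/rocm/rocm.gpg.key | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
echo 'deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/5.7 ubuntu main' | sudo tee /etc/apt/sources.list.d/rocm.list
sudo apt update
```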
---
## Step 4: Install Ollama and Models
Let's install Ollama and download both **CodeGemma 7B** and **Phi-3 Mini**.
1. **Install Ollama**:
```bash
curl -fsSL https://ollama.com/install.sh | sh
```
- Verify:
```bash
ollama --version
```
2. **Pull CodeGemma 7B**:
- Download Q4_K_M (~4.2 GB):
```bash
ollama pull codegemma:7b
```
- Verify:
```bash
ollama list
```
- Expect `codegemma:7b` (q4_k_m).
3. **Test CodeGemma**:
- Run:
```bash
ollama run codegemma:7b
```
- Prompt: “Debug: `x = [1, 2]; print(x[2])`.”
- Expected: “Check the list's length with `len(x)`.”
- Exit: `Ctrl+D`.
4. **Pull Phi-3 Mini**:
- Download Q4_K_M (~2.3 GB):
```bash
ollama pull phi3:mini
```
- Verify:
```bash
ollama list
```
- Expect `phi3:mini` (q4_k_m).
5. **Test Phi-3 Mini**:
- Run:
```bash
ollama run phi3:mini
```
- Prompt: “Walk me through configuring a firewall.”
- Expected: “Install `ufw` with `sudo apt install ufw`. Enable with `sudo ufw enable`.”
- Exit: `Ctrl+D`.
6. **Verify GPU Usage**:
- During a session, check:
```bash
rocm-smi
```
- CodeGemma: ~5–6 GB VRAM.
- Phi-3 Mini: ~3.5–4.5 GB VRAM.
7. **Enable Ollama Service**:
```bash
sudo systemctl enable ollama
sudo systemctl start ollama
```
- Verify the service is active (an end-to-end API check is sketched at the end of this step):
```bash
systemctl status ollama
```
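For an end-to-end check that goes beyond `systemctl status`, you can query Ollama's local HTTP API directly (it listens on `localhost:11434` by default); a quick sketch:
```bash
# Ask each model for a short reply via the REST API; "stream": false returns a single JSON object
curl -s http://localhost:11434/api/generate \
  -d '{"model": "codegemma:7b", "prompt": "Write a Python one-liner that reverses a string.", "stream": false}'
curl -s http://localhost:11434/api/generate \
  -d '{"model": "phi3:mini", "prompt": "Name one command that shows disk usage.", "stream": false}'
```
If both calls return JSON containing a `response` field, the service and both models are working; run `rocm-smi` alongside to confirm the GPU is being used.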
---
## Step 5: Configure User `arti` for CodeGemma 7B
Let's restrict `arti`'s SSH access to an interactive CodeGemma 7B session for Python coding.
1. **Create User `arti`**:
```bash
sudo adduser arti
```
- Set a secure password, optional details (e.g., full name: “Artificial Intelligence”).
2. **Restrict Home Directory**:
```bash
sudo chown arti:arti /home/arti
sudo chmod 700 /home/arti
```
- Verify:
```bash
ls -ld /home/arti
```
- Expect: `drwx------ arti arti`
3. **Create Shell Script**:
```bash
sudo nano /usr/local/bin/ollama-shell
```
- Add (a hardened `exec` variant is sketched at the end of this step):
```bash
#!/bin/bash
echo "Starting CodeGemma 7B interactive session..."
/usr/bin/ollama run codegemma:7b
```
- Save and exit.
- Make executable:
```bash
sudo chmod +x /usr/local/bin/ollama-shell
sudo chown root:root /usr/local/bin/ollama-shell
sudo chmod 755 /usr/local/bin/ollama-shell
```
4. **Set Shell**:
```bash
sudo usermod -s /usr/local/bin/ollama-shell arti
```
- Verify:
```bash
getent passwd arti
```
- Expect the last field to be `/usr/local/bin/ollama-shell`, e.g. `arti:x:1001:1001:Artificial Intelligence,,,:/home/arti:/usr/local/bin/ollama-shell` (your UID/GID may differ).
5. **Add GPU Access**:
```bash
sudo usermod -a -G render arti
```
6. **Restrict SSH**:
```bash
sudo nano /etc/ssh/sshd_config
```
- Add:
```bash
Match User arti
ForceCommand /usr/local/bin/ollama-shell
```
- Restart SSH:
```bash
sudo systemctl restart sshd
```
7. **Limit Permissions**:
- `-G` replaces the supplementary group list, so include `render` to keep the GPU access granted above:
```bash
sudo usermod -G nogroup,render arti
```
8. **Test SSH**:
```bash
ssh arti@<server-ip>
```
- Expect: `Starting CodeGemma 7B interactive session...`
- Prompt: “Debug: `x = '5'; y = 3; print(x + y)`.”
- Expected: “Check types with `type(x)`.”
- Exit: `Ctrl+D` (terminates SSH).
- Try: `ssh arti@<server-ip> bash` (should fail).
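As an optional refinement, the wrapper can use `exec` so the Ollama client replaces the wrapper process entirely; exiting the model then ends the SSH session with no leftover shell. A sketch of this variant (same path as above):
```bash
#!/bin/bash
# /usr/local/bin/ollama-shell (hardened sketch)
# exec replaces this script with the ollama client, so Ctrl+D in the model
# closes the SSH session immediately. Any command requested by the SSH client
# is ignored because sshd's ForceCommand always runs this script.
echo "Starting CodeGemma 7B interactive session..."
exec /usr/bin/ollama run codegemma:7b
```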
---
## Step 6: Configure User `phixr` for Phi-3 Mini
Let's restrict `phixr`'s SSH access to a Phi-3 Mini session for system administration.
1. **Create User `phixr`**:
```bash
sudo adduser phixr
```
- Set password, optional details (e.g., full name: “Phi-3 System Admin”).
2. **Restrict Home Directory**:
```bash
sudo chown phixr:phixr /home/phixr
sudo chmod 700 /home/phixr
```
- Verify:
```bash
ls -ld /home/phixr
```
- Expect: `drwx------ phixr phixr`
3. **Create Shell Script**:
```bash
sudo nano /usr/local/bin/ollama-phi3-shell
```
- Add:
```bash
#!/bin/bash
echo "Starting Phi-3 Mini interactive session..."
/usr/bin/ollama run phi3:mini
```
- Save and exit.
- Make executable:
```bash
sudo chmod +x /usr/local/bin/ollama-phi3-shell
sudo chown root:root /usr/local/bin/ollama-phi3-shell
sudo chmod 755 /usr/local/bin/ollama-phi3-shell
```
4. **Set Shell**:
```bash
sudo usermod -s /usr/local/bin/ollama-phi3-shell phixr
```
- Verify:
```bash
getent passwd phixr
```
- Expect the last field to be `/usr/local/bin/ollama-phi3-shell`, e.g. `phixr:x:1002:1002:Phi-3 System Admin,,,:/home/phixr:/usr/local/bin/ollama-phi3-shell` (your UID/GID may differ).
5. **Add GPU Access**:
```bash
sudo usermod -a -G render phixr
```
6. **Restrict SSH**:
```bash
sudo nano /etc/ssh/sshd_config
```
- Add (below `Match User arti`; additional lockdown directives are sketched at the end of this step):
```bash
Match User phixr
ForceCommand /usr/local/bin/ollama-phi3-shell
```
- Restart SSH:
```bash
sudo systemctl restart sshd
```
7. **Limit Permissions**:
- As with `arti`, include `render` so the GPU group added above is not dropped:
```bash
sudo usermod -G nogroup,render phixr
```
8. **Test SSH**:
```bash
ssh phixr@<server-ip>
```
- Expect: `Starting Phi-3 Mini interactive session...`
- Prompt: “Walk me through installing pfSense.”
- Expected: “Download the ISO from pfsense.org. Create a USB with `dd if=pfSense.iso of=/dev/sdX bs=4M`.”
- Exit: `Ctrl+D` (terminates SSH).
- Try: `ssh phixr@<server-ip> bash` (should fail).
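For extra lockdown, you can disable forwarding and tunnels for both restricted users and validate the configuration before restarting SSH. A sketch (the directives are standard `sshd_config` options; adjust to taste):
```bash
# Append extra restrictions for both AI-only users
sudo tee -a /etc/ssh/sshd_config > /dev/null <<'EOF'
Match User arti,phixr
    AllowTcpForwarding no
    X11Forwarding no
    PermitTunnel no
EOF
# Check the config for syntax errors, confirm the effective ForceCommand, then restart
sudo sshd -t
sudo sshd -T -C user=phixr,host=localhost,addr=127.0.0.1 | grep -i forcecommand
sudo systemctl restart sshd
```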
---
## Step 7: Optimize and Troubleshoot
Let's ensure optimal performance and address potential issues.
1. **Performance Optimization**:
- **CodeGemma 7B**: ~5–6 GB VRAM, ~8–12 tokens/second. Good for Python debugging.
- **Phi-3 Mini**: ~3.5–4.5 GB VRAM, ~10–15 tokens/second. Ideal for system administration guidance.
- **Prompting**:
- `arti`: “Debug this Python code: [snippet].”
- `phixr`: “Walk me through [task] step-by-step.”
- **Temperature**: For precise responses, set temperature to 0.2:
- For CodeGemma:
```bash
nano ~/.ollama/models/codegemma-modelfile
```
Add:
```
FROM codegemma:7b
PARAMETER temperature 0.2
```
Create:
```bash
ollama create codegemma-lowtemp -f ~/.ollama/models/codegemma-modelfile
```
Update `/usr/local/bin/ollama-shell` to use `ollama run codegemma-lowtemp` (a `sed` one-liner for both wrappers is sketched at the end of this step).
- For Phi-3 Mini:
```bash
nano ~/.ollama/models/phi3-modelfile
```
Add:
```
FROM phi3:mini
PARAMETER temperature 0.2
```
Create:
```bash
ollama create phi3-lowtemp -f ~/.ollama/models/phi3-modelfile
```
Update `/usr/local/bin/ollama-phi3-shell` to use `ollama run phi3-lowtemp`.
2. **Troubleshooting**:
- **No Session**:
- Check scripts:
```bash
ls -l /usr/local/bin/ollama-shell /usr/local/bin/ollama-phi3-shell
cat /usr/local/bin/ollama-shell
cat /usr/local/bin/ollama-phi3-shell
```
- **GPU Issues**: If generation is slow (~1–5 tokens/second), verify ROCm:
```bash
rocminfo
rocm-smi --showmeminfo vram
```
- Reinstall ROCm 5.6/5.7 if needed; if ROCm itself looks fine, see the `HSA_OVERRIDE_GFX_VERSION` sketch at the end of this step.
- **Shell Access**: If `arti` or `phixr` access Bash:
```bash
getent passwd arti
getent passwd phixr
```
- Confirm each shell field points at the correct wrapper script; if not, re-run the `sudo usermod -s` commands from Steps 5 and 6.
- **SSH Errors**:
```bash
sudo systemctl status sshd
```
- Restart: `sudo systemctl restart sshd`.
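To switch both wrappers to the low-temperature models without editing them by hand, a quick sketch (paths and model names as created above):
```bash
# Point the wrapper scripts at the low-temperature model variants
sudo sed -i 's/codegemma:7b/codegemma-lowtemp/' /usr/local/bin/ollama-shell
sudo sed -i 's/phi3:mini/phi3-lowtemp/' /usr/local/bin/ollama-phi3-shell
```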
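If `rocminfo` looks healthy but Ollama still runs on the CPU, consumer RDNA2 cards such as the RX 6600 (gfx1032) often need the `HSA_OVERRIDE_GFX_VERSION` workaround. A sketch using a systemd drop-in (the value `10.3.0` is the commonly reported override for this GPU family; treat it as an assumption to test, not a guarantee):
```bash
# Add the override to the ollama service environment and restart it
sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/rocm.conf > /dev/null <<'EOF'
[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=10.3.0"
EOF
sudo systemctl daemon-reload
sudo systemctl restart ollama
```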
---
## Expected Performance
- **Hardware Fit**: CodeGemma (~5–6 GB VRAM, ~4.2 GB disk) and Phi-3 Mini (~3.5–4.5 GB VRAM, ~2.3 GB disk) fit your Radeon RX 6600, Ryzen 7 2700, 64 GB RAM, and 465.8 GB NVMe SSD.
- **Use Case**:
- `arti`: Guides Python coding/debugging (e.g., “Check your list index with `len()`”).
- `phixr`: Provides detailed system administration instructions (e.g., “Download pfSense ISO, then use `dd`”).
- **Speed**: CodeGemma (~8–12 tokens/second), Phi-3 Mini (~10–15 tokens/second). Responses in ~1–2 seconds.
- **Restriction**: `arti` locked to CodeGemma; `phixr` to Phi-3 Mini. No Bash access.
## Example Usage
- **For `arti`**:
```bash
ssh arti@<server-ip>
>>> Debug: x = [1, 2]; print(x[2]).
The error suggests an invalid index. Check the list's length with `len(x)`.
```
- **For `phixr`**:
```bash
ssh phixr@<server-ip>
>>> Walk me through installing pfSense.
Download the ISO from pfsense.org. Create a USB with `dd if=pfSense.iso of=/dev/sdX bs=4M`. Check with `lsblk`.
```