AI Tips

Practical ways to train, run, or shrink AI models, explained for people new to AI. 7 new in the last 30 days.

New here? Each card answers one question: what is this and why should I care? Click a card to read the full explanation, including any new words. The command at the bottom is what you would type to try it on your own machine.

The AI flow — where each tip fits

Read left to right

An AI model goes through these five phases. Click a phase to see the tips that apply there.

  1. Pre-training

    A model first learns language by reading huge amounts of text. This costs millions of dollars and runs on thousands of GPUs.

  2. Fine-tuning

    You take that pre-trained model and teach it your own data, your own task, or your own writing style. Hours to days, on a few GPUs.

    e.g. QLoRA · Unsloth · DeepSpeed ZeRO-3 / FSDP

    See Training tips →
  3. Preference tuning

    After fine-tuning, you teach the model which answers humans prefer. This makes it polite, helpful, and on-topic.

    e.g. DPO / GRPO / KTO

    See Training tips →
  4. Quantization

    The trained model is huge. Quantization shrinks it about 4× (for example, going from 16-bit to 4-bit weights) by storing its numbers with less precision, so it fits on cheap hardware.

    e.g. GGUF + llama.cpp · AWQ / GPTQ · EXL2

    See Quantization tips →
  5. Inference / serving

    Running the model so users can ask it questions. This is what your app actually does in production.

    e.g. vLLM (PagedAttention) · Ollama · Speculative decoding

    See Inference tips →
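To make the quantization phase concrete, here is a toy sketch of what "storing numbers with less precision" can look like. It is a deliberately simplified symmetric scheme with one shared scale factor, not the actual method used by GGUF, AWQ, GPTQ, or EXL2. The function names and the sample weights are made up for illustration.

```python
def quantize_4bit(weights):
    """Map floats to small integers in [-7, 7] plus one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [round(w / scale) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


# Made-up example weights, not taken from any real model.
weights = [0.70, -0.42, 0.10, 0.03]
codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)

print("integer codes:", codes)
print("restored values:", restored)
# Each code fits in 4 bits instead of 16, hence the roughly 4x size reduction.
print("size ratio:", 16 / 4)
```

The restored values are close to the originals but not identical; that small error is the price of the smaller file. Real formats typically store a scale per small group of weights, so they use slightly more than 4 bits per weight on average.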
MLX — run big models on a Mac

Apple's framework that uses the Mac's shared memory. A 64 GB MacBook Pro can run a 4-bit Llama-3 70B at usable speed.

Cheap GPU · M2 / M3 / M4 Max 64GB+

On a typical PC, the GPU has its own memory (VRAM) and the CPU has its own (RAM), so data must be copied between them, and that copy is slow. Apple Silicon Macs share one memory pool between CPU and GPU, so no copy is needed. MLX is Apple's framework built to take advantage of this. A 64 GB M3 Max runs a 4-bit Llama-3 70B at about 10 tokens per second, which is usable for chat. The mlx-community organization on Hugging Face hosts popular models already quantized for MLX.
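A quick back-of-the-envelope check shows why the 4-bit version matters here. The sketch below estimates the size of the weights alone (parameters × bits ÷ 8), ignoring the extra memory needed for the context cache and the operating system, so treat it as a lower bound. The helper name is made up for illustration.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


PARAMS_70B = 70e9   # a 70-billion-parameter model
MEMORY_GB = 64      # unified memory on the Mac in the example above

fp16_gb = weight_memory_gb(PARAMS_70B, 16)
q4_gb = weight_memory_gb(PARAMS_70B, 4)

print(f"16-bit weights: {fp16_gb:.0f} GB, fits: {fp16_gb < MEMORY_GB}")
print(f" 4-bit weights: {q4_gb:.0f} GB, fits: {q4_gb < MEMORY_GB}")
```

At 16 bits the weights alone are more than twice the available memory; at 4 bits they leave headroom for everything else the machine needs to hold.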

Try it

pip install mlx-lm && mlx_lm.generate --model mlx-community/Llama-3.1-70B-Instruct-4bit --prompt 'hello'
Source
Simplify AI infrastructure at the edge with Cisco and Canonical · New · Auto

Test-time inference is shifting to the edge to reduce latency and bandwidth consumption.

Cheap GPU · CPU only

Legacy infrastructure was not designed for AI-era requirements. Large-scale model training remains centralized in data centers, while test-time inference is moving to the edge to reduce latency and bandwidth consumption.

Try it

# Illustrative only: my-ai-model:latest is a placeholder for your own container image
sudo docker run -d --name ai-edge-model my-ai-model:latest
Source
Optimize AI infrastructure for edge deployment · New · Auto

Shift test-time inference to the edge to reduce latency and bandwidth consumption

Cheap GPU · CPU only

Legacy infrastructure was not designed for AI requirements. Large-scale model training remains centralized in data centers, but test-time inference is rapidly shifting to the edge. This can help reduce latency and bandwidth consumption, which is crucial for real-time AI applications.

Try it

# edge-ai-optimization-tool is not a known Ubuntu package; start by checking the edge device's CPU and memory
lscpu && free -h
Source
Optimize AI energy consumption with Ubuntu 26.04 LTS · New · Auto

Ubuntu 26.04 LTS focuses on reducing energy consumption for AI workloads, which is crucial for cost and sustainability.

Cheap GPU · CPU only

Ubuntu 26.04 LTS is designed to maximize the value extracted from GPU clusters by focusing on energy efficiency, measured in tokens per watt (TpW). This metric helps CEOs and infrastructure teams manage the cost of AI workloads more effectively.
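The card does not spell out how tokens per watt is computed. One plausible reading, assumed here, is generation throughput divided by average power draw, which works out to tokens per joule of energy. The function name and all the numbers below are made up for illustration; they are not measurements.

```python
def tokens_per_watt(tokens_per_second, average_watts):
    """Throughput divided by average power draw (equivalently, tokens per joule)."""
    if average_watts <= 0:
        raise ValueError("average_watts must be positive")
    return tokens_per_second / average_watts


# Hypothetical figures: same throughput, lower power draw after tuning.
baseline = tokens_per_watt(tokens_per_second=2000, average_watts=500)
tuned = tokens_per_watt(tokens_per_second=2000, average_watts=400)

print(f"baseline: {baseline:.1f} tokens per watt")
print(f"tuned:    {tuned:.1f} tokens per watt")
```

The point of the metric is that two setups with identical speed can differ in running cost, so it is worth tracking power alongside throughput.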

Try it

# Ubuntu releases are not apt packages; upgrade an existing install with the release upgrader
sudo do-release-upgrade
Source
Linux 7.2 can boot on Apple M3 devices · New · Auto

The Linux 7.2 mainline kernel will support booting on Apple M3 devices, including iMac and MacBook models.

Cheap GPU · CPU only

This means that users with Apple M3 devices will be able to run Linux, although it may not be immediately useful for end-users due to ongoing development and compatibility issues.

Try it

# linux-image-7.2 is not a known package name; check which kernel version you are running instead
uname -r
Source
Linux 7.2 boots on Apple M3 devices · New · Auto

The Linux 7.2 mainline kernel will support booting on Apple M3 devices, including iMac and MacBook models.

Cheap GPU · Apple M3

This means that users with Apple M3 devices will be able to run Linux on their hardware, potentially improving the utility of these devices for users who prefer or require Linux.

Try it

# linux-image-7.2 is not a known package name; check which kernel version you are running instead
uname -r
Source
Use NVIDIA Vera CPU for agentic workloads in AI factories · New · Auto

Leverage NVIDIA Vera CPU to handle agentic workloads in AI factories.

Cheap GPU · NVIDIA Vera CPU

NVIDIA Vera CPU sets a new standard for agentic workloads by enabling AI factories to preprocess and analyze large datasets more efficiently, leading to improved AI model training and scaling.

Try it

# Hypothetical placeholder: nvidia_vera_run is not a real CLI, and the flags are illustrative only
nvidia_vera_run --task <task_name> --data <data_path>
Source
Cache Aware Scheduling to improve Linux kernel performance · New · Auto

Improves performance by reducing cache misses

Cheap GPU · CPU only

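CONFIG_SCHED_CACHE, the kernel option for cache-aware scheduling, has been merged into the mainline kernel. The scheduler tries to keep tasks that share data on cores that share a last-level cache, which should improve performance by reducing cache misses. This can be particularly beneficial for AI workloads running on CPU-only systems.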
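Whether a given kernel has this feature depends on how it was built. As a rough sketch, the helper below scans kernel config text for an option and reports whether it is switched on. The function name and the sample config are made up for illustration; only the option name CONFIG_SCHED_CACHE comes from the card above.

```python
def kernel_option_enabled(config_text, option):
    """Return True if `option` is set to y (built in) or m (module) in config text."""
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name == option:
            return value in ("y", "m")
    return False


# Made-up sample in the usual kernel config format.
sample_config = """\
CONFIG_SMP=y
# CONFIG_FOO is not set
CONFIG_SCHED_CACHE=y
"""

print(kernel_option_enabled(sample_config, "CONFIG_SCHED_CACHE"))
print(kernel_option_enabled(sample_config, "CONFIG_FOO"))
```

On Debian and Ubuntu systems the real config is usually a text file under /boot named after the running kernel release, so its contents could be passed to the same function.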

Try it

# The source reports CONFIG_SCHED_CACHE as enabled by default in Linux 7.2
# Check whether your running kernel was built with it
grep CONFIG_SCHED_CACHE /boot/config-$(uname -r)
Source
Improve Linux GPU Drivers for Better Gaming Experience · Auto

Valve is expanding their open-source Linux graphics driver team to enhance GPU drivers.

Cheap GPU · CPU only

Valve has hired a leading Mesa developer from AMD to join their team, aiming to improve the Linux GPU drivers for a better gaming experience. This move signifies the importance of optimizing GPU drivers for better performance and compatibility on Linux systems.

Try it

sudo apt-get install mesa-utils
Source
Enhancing Linux GPU Drivers for Better Gaming Experience · Auto

Valve hires top Mesa developer from AMD to improve Linux GPU drivers.

Cheap GPU · RTX 3090 24GB

Valve continues to expand their open-source Linux graphics driver team, securing top talent to enhance the Linux GPU drivers for a better gaming experience, which can also benefit AI developers running GPU-intensive tasks.

Try it

sudo apt-get install mesa-utils
Source
Solving Agentic AI's Scale-Up Problem with NVIDIA Vera Rubin Platform · Auto

NVIDIA Vera Rubin platform addresses the scale-up problem in agentic AI inference workloads.

Cheap GPU · NVIDIA Vera Rubin

Agentic inference has fundamentally changed the runtime dynamics of inference workloads by introducing non-deterministic trajectories. NVIDIA Vera Rubin platform is designed to solve the scale-up problem in agentic AI, enabling efficient inference on large models.

Try it

# No public command targets Vera Rubin specifically; this lists the NVIDIA GPUs visible on the machine
nvidia-smi -L
Source
Arm Mali G1 Pro support in PanVK and Panfrost drivers · Auto

PanVK Vulkan driver and Panfrost Gallium3D driver now support Arm Mali G1-Pro GPU hardware.

Cheap GPU · Arm Mali G1-Pro

This support enables AI developers to utilize Arm Mali G1-Pro GPUs with open-source drivers, expanding the range of affordable hardware options for AI development.

Try it

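# Panfrost and PanVK are built from the Mesa source tree with Meson
git clone https://gitlab.freedesktop.org/mesa/mesa.git && cd mesa && meson setup build -Dgallium-drivers=panfrost -Dvulkan-drivers=panfrost && ninja -C build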
Source
Improved support for older AMD GPUs on Linux · Auto

Valve's open-source Linux graphics driver team is improving support for aging AMD GCN 1.0/1.1 era graphics cards.

Cheap GPU · Older AMD GCN 1.0/1.1 GPUs

This improvement allows for better utilization of older AMD GPUs on Linux, potentially enabling AI developers to run models on more affordable hardware.

Try it

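# mesa-driver is not a known package name; on Debian and Ubuntu the Mesa drivers ship in these packages
sudo apt-get install mesa-vulkan-drivers libgl1-mesa-dri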
Source
Arm Mali G1 Pro support in open-source PanVK & Panfrost drivers · Auto

PanVK Vulkan driver and Panfrost Gallium3D driver now support Arm Mali G1-Pro GPU hardware.

Cheap GPU · Arm Mali G1-Pro

This support enables AI development on devices with Arm Mali G1-Pro GPUs, which are typically found in lower-cost or embedded systems.

Try it

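# Panfrost and PanVK are built from the Mesa source tree with Meson
git clone https://gitlab.freedesktop.org/mesa/mesa.git && cd mesa && meson setup build -Dgallium-drivers=panfrost -Dvulkan-drivers=panfrost && ninja -C build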
Source
Accelerate Page Migration for Better Performance · Auto

AMD engineers are working on patches for accelerating page migration in the Linux kernel.

Cheap GPU · CPU only

This patch series, originally started by an NVIDIA engineer in early 2025, aims to improve system performance by accelerating page migration. AMD's involvement suggests that this optimization could benefit a wide range of systems, not just those with AMD hardware.

Try it

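# amd_page_migration.patch is a placeholder name; save the patch series from the kernel mailing list first
git apply amd_page_migration.patch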
Source
Accelerate page migration for better performance · Auto

AMD engineers are working on patches to accelerate page migration for improved performance.

Cheap GPU · CPU only

This patch series, originally started by an NVIDIA engineer, is now being worked on by AMD to accelerate page migration, which can lead to better performance on Linux systems.

Try it

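# accelerated-page-migration.patch is a placeholder name; save the patch series from the kernel mailing list first
git apply accelerated-page-migration.patch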
Source