AI Tips
Practical ways to train, run, or shrink AI models — explained for people new to AI. 7 new in last 30d.
New here? Each card answers one question: what is this and why should I care? Click a card to read the full explanation, including any new words. The command at the bottom is what you would type to try it on your own machine.
The AI flow — where each tip fits
Read left to right. An AI model goes through these five phases. Click a phase to see the tips that apply there.
- 1. Pre-training
A model first learns language by reading huge amounts of text. This costs millions of dollars and runs on thousands of GPUs.
- 2. Fine-tuning
You take that pre-trained model and teach it your own data, your own task, or your own writing style. Hours to days, on a few GPUs.
e.g. QLoRA · Unsloth · DeepSpeed ZeRO-3 / FSDP
See Training tips →
- 3. Preference tuning
After fine-tuning, you teach the model which answers humans prefer. This makes it polite, helpful, and on-topic.
e.g. DPO / GRPO / KTO
See Training tips →
- 4. Quantization
The trained model is huge. Quantization shrinks it about 4× by storing its numbers with less precision, so it fits on cheap hardware.
e.g. GGUF + llama.cpp · AWQ / GPTQ · EXL2
See Quantization tips →
- 5. Inference / serving
Running the model so users can ask it questions. This is what your app actually does in production.
e.g. vLLM (PagedAttention) · Ollama · Speculative decoding
See Inference tips →
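To make the "about 4×" figure in the quantization phase concrete, here is a toy sketch of symmetric round-to-nearest 4-bit quantization in plain Python. It is an illustration only: the function names and sample weights are made up for this example, and real formats such as GGUF, AWQ, or GPTQ use more elaborate schemes (for instance, separate scales per group of weights).

```python
def quantize_4bit(weights):
    """Map floats to signed 4-bit integer codes in [-8, 7] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 7 if max_abs > 0 else 1.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [c * scale for c in codes]


def size_ratio(bits_before=16, bits_after=4):
    """How many times smaller the stored weights become (ignoring the scale itself)."""
    return bits_before / bits_after


# Hypothetical sample weights, chosen only for illustration.
weights = [0.12, -0.53, 0.98, -1.40, 0.07, 0.66]
codes, scale = quantize_4bit(weights)
restored = dequantize(codes, scale)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("codes:", codes)
print("largest rounding error:", max_error)
print("size ratio vs 16-bit:", size_ratio())
```

Each weight is stored as a 4-bit code instead of a 16-bit float, which is where the 4× saving comes from. The price is a rounding error of at most half a scale step per weight.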
MLX: run big models on a Mac
Apple's framework that uses the Mac's shared memory. A 64 GB MacBook Pro can run a Llama-3 70B at usable speed.
Cheap GPU · M2 / M3 / M4 Max 64GB+
On a normal PC the GPU has its own memory (VRAM) and the CPU has its own memory (RAM), and data has to be copied between them. That copy is slow. Apple Silicon Macs share one memory pool between CPU and GPU, so there is no copy. MLX is Apple's framework that takes advantage of this. A 64 GB M3 Max runs a 4-bit Llama-3 70B at about 10 tokens per second, which is usable for chat. The mlx-community organization on Hugging Face mirrors popular models pre-quantized for you.
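A quick back-of-the-envelope check shows why a 4-bit 70B model fits in 64 GB while the 16-bit version does not. This sketch counts only the model weights. It assumes exactly 70 billion parameters and ignores the KV cache, quantization metadata, and memory used by the operating system, so treat the result as a lower bound.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


params = 70e9          # assumed parameter count for a "70B" model
unified_memory_gb = 64  # the MacBook Pro configuration discussed above

fp16_gb = weight_memory_gb(params, 16)
q4_gb = weight_memory_gb(params, 4)

print(f"16-bit weights: {fp16_gb:.0f} GB, fits: {fp16_gb < unified_memory_gb}")
print(f" 4-bit weights: {q4_gb:.0f} GB, fits: {q4_gb < unified_memory_gb}")
```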
Try it
pip install mlx-lm && mlx_lm.generate --model mlx-community/Llama-3.1-70B-Instruct-4bit --prompt 'hello'

Simplify AI infrastructure at the edge with Cisco and Canonical
New · Auto
Test-time inference is shifting to the edge to reduce latency and bandwidth consumption.
Cheap GPU · CPU only
Legacy infrastructure was not designed for AI-era requirements. Large-scale model training remains centralized in data centers, while test-time inference is moving to the edge to reduce latency and bandwidth consumption.
Try it
# Example command to deploy an AI model on an edge device
# (ai-edge-model and my-ai-model:latest are placeholder names; substitute your own image)
sudo docker run -d --name ai-edge-model my-ai-model:latest

Optimize AI infrastructure for edge deployment
New · Auto
Shift test-time inference to the edge to reduce latency and bandwidth consumption.
Cheap GPU · CPU only
Legacy infrastructure was not designed for AI requirements. Large-scale model training remains centralized in data centers, but test-time inference is rapidly shifting to the edge. This can help reduce latency and bandwidth consumption, which is crucial for real-time AI applications.
Try it
# edge-ai-optimization-tool is a placeholder package name, not a confirmed Ubuntu package
sudo apt-get install -y edge-ai-optimization-tool

Optimize AI energy consumption with Ubuntu 26.04 LTS
New · Auto
Ubuntu 26.04 LTS focuses on reducing energy consumption for AI workloads, which is crucial for cost and sustainability.
Cheap GPU · CPU only
Ubuntu 26.04 LTS is designed to maximize the value extracted from GPU clusters by focusing on energy efficiency, measured in tokens per watt (TpW). This metric helps CEOs and infrastructure teams manage the cost of AI workloads more effectively.
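As a rough illustration of the metric, the sketch below assumes that tokens per watt means generation throughput (tokens per second) divided by average power draw (watts), which works out to tokens per joule. The source does not define the formula, and the two configurations and their numbers are hypothetical, not measurements.

```python
def tokens_per_watt(tokens_per_second, average_power_watts):
    """Throughput divided by average power draw (assumed definition of TpW)."""
    if average_power_watts <= 0:
        raise ValueError("average_power_watts must be positive")
    return tokens_per_second / average_power_watts


# Hypothetical configurations, for illustration only.
config_a = tokens_per_watt(tokens_per_second=1200, average_power_watts=400)
config_b = tokens_per_watt(tokens_per_second=900, average_power_watts=250)

print(f"config A: {config_a:.2f} tokens per watt")
print(f"config B: {config_b:.2f} tokens per watt")
print("more efficient:", "A" if config_a > config_b else "B")
```

Under this definition, the slower configuration can still be the more efficient one, which is the point of tracking efficiency separately from raw speed.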
Try it
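# Ubuntu releases are not installed as an apt package. On an existing Ubuntu system,
# the release upgrader is the usual route (assuming 26.04 is offered for your release).
sudo do-release-upgrade

Linux 7.2 can boot on Apple M3 devices
New · Auto
The Linux 7.2 mainline kernel will support booting on Apple M3 devices, including iMac and MacBook.
Cheap GPU · CPU only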
This means that users with Apple M3 devices will be able to run Linux, although it may not be immediately useful for end-users due to ongoing development and compatibility issues.
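If you want to check whether a machine is already running a kernel at or above the version mentioned here, a small sketch like the following can help. The helper names are made up for this example. It compares versions as number pairs rather than as strings, so that 7.10 is correctly treated as newer than 7.2.

```python
import platform
import re


def parse_kernel_version(release):
    """Extract (major, minor) from a release string such as '7.2.0-rc1'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if not match:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def is_at_least(release, minimum=(7, 2)):
    """True if the release is the given minimum version or newer."""
    return parse_kernel_version(release) >= minimum


if platform.system() == "Linux":
    running = platform.release()
    print(running, "is 7.2 or newer:", is_at_least(running))
else:
    print("Not running Linux; nothing to check.")
```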
Try it
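# Kernel package names vary by distribution; linux-image-7.2 is a placeholder, not a confirmed package name
sudo apt update && sudo apt upgrade -y && sudo apt install linux-image-7.2

Linux 7.2 boots on Apple M3 devices
New · Auto
The Linux 7.2 mainline kernel will support booting on Apple M3 devices, including iMac and MacBook products.
Cheap GPU · Apple M3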
This means that users with Apple M3 devices will be able to run Linux on their hardware, potentially improving the utility of these devices for users who prefer or require Linux.
Try it
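# Kernel package names vary by distribution; linux-image-7.2 is a placeholder, not a confirmed package name
sudo apt update && sudo apt upgrade -y && sudo apt install linux-image-7.2

Use NVIDIA Vera CPU for agentic workloads in AI factories
New · Auto
Leverage the NVIDIA Vera CPU to handle agentic workloads in AI factories.
Cheap GPU · NVIDIA Vera CPU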
NVIDIA Vera CPU sets a new standard for agentic workloads by enabling AI factories to preprocess and analyze large datasets more efficiently, leading to improved AI model training and scaling.
Try it
# Example command for running agentic workloads on NVIDIA Vera CPU
# This is a placeholder command and may vary based on actual usage
nvidia_vera_run --task <task_name> --data <data_path>

Cache Aware Scheduling to improve Linux kernel performance
New · Auto
Improves performance by reducing cache misses.
Cheap GPU · CPU only
CONFIG_SCHED_CACHE has been merged into the mainline kernel, which should improve performance by reducing cache misses. This can be particularly beneficial for AI workloads running on CPU-only systems.
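One way to see whether a given kernel build has the option turned on is to look it up in the kernel's config file. The sketch below parses config text for a single option. The helper name and the sample text are made up for illustration. On many Debian and Ubuntu systems the real file is at /boot/config-<kernel release>, but that location is an assumption and varies by distribution.

```python
def kernel_option_state(config_text, option):
    """Return the option's value (for example 'y' or 'm'), or None if unset or absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(option + "="):
            return line.split("=", 1)[1]
        if line == f"# {option} is not set":
            return None
    return None


# Made-up sample in the usual kernel .config format, for illustration only.
sample_config = """\
CONFIG_SMP=y
CONFIG_SCHED_CACHE=y
# CONFIG_EXAMPLE_FEATURE is not set
"""

print("CONFIG_SCHED_CACHE:", kernel_option_state(sample_config, "CONFIG_SCHED_CACHE"))
print("CONFIG_EXAMPLE_FEATURE:", kernel_option_state(sample_config, "CONFIG_EXAMPLE_FEATURE"))
```

To use it on a real system, read the config file into a string and pass that in place of the sample text.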
Try it
# CONFIG_SCHED_CACHE is enabled by default in Linux 7.2
# No specific command needed, just ensure your kernel is updated

Improve Linux GPU Drivers for Better Gaming Experience
Auto
Valve is expanding their open-source Linux graphics driver team to enhance GPU drivers.
Cheap GPU · CPU only
Valve has hired a leading Mesa developer from AMD to join their team, aiming to improve the Linux GPU drivers for a better gaming experience. This move signifies the importance of optimizing GPU drivers for better performance and compatibility on Linux systems.
Try it
sudo apt-get install mesa-utils

Enhancing Linux GPU Drivers for Better Gaming Experience
Auto
Valve hires top Mesa developer from AMD to improve Linux GPU drivers.
Cheap GPU · RTX 3090 24GB
Valve continues to expand their open-source Linux graphics driver team, securing top talent to enhance the Linux GPU drivers for a better gaming experience, which can also benefit AI developers running GPU-intensive tasks.
Try it
sudo apt-get install mesa-utils

Solving Agentic AI's Scale-Up Problem with NVIDIA Vera Rubin Platform
Auto
The NVIDIA Vera Rubin platform addresses the scale-up problem in agentic AI inference workloads.
Cheap GPU · NVIDIA Vera Rubin
Agentic inference has fundamentally changed the runtime dynamics of inference workloads by introducing non-deterministic trajectories. NVIDIA Vera Rubin platform is designed to solve the scale-up problem in agentic AI, enabling efficient inference on large models.
Try it
# Generic nvidia-smi sketch, not specific to Vera Rubin:
# set GPU 0 to exclusive-process compute mode
sudo nvidia-smi -i 0 -c EXCLUSIVE_PROCESS

Arm Mali G1 Pro support in PanVK and Panfrost drivers
Auto
The PanVK Vulkan driver and Panfrost Gallium3D driver now support Arm Mali G1-Pro GPU hardware.
Cheap GPU · Arm Mali G1-Pro
This support enables AI developers to utilize Arm Mali G1-Pro GPUs with open-source drivers, expanding the range of affordable hardware options for AI development.
Try it
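# Panfrost and PanVK are part of the Mesa tree and build with Meson; treat the options below as a sketch
git clone https://gitlab.freedesktop.org/mesa/mesa.git && cd mesa && meson setup build -Dgallium-drivers=panfrost -Dvulkan-drivers=panfrost && ninja -C build

Improved support for older AMD GPUs on Linux
Auto
Valve's Linux open-source graphics driver team enhances support for aging AMD GCN 1.0/1.1 era graphics cards.
Cheap GPU · Older AMD GCN 1.0/1.1 GPUs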
This improvement allows for better utilization of older AMD GPUs on Linux, potentially enabling AI developers to run models on more affordable hardware.
Try it
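# Mesa driver packages on Debian/Ubuntu (package names differ on other distributions)
sudo apt-get install mesa-vulkan-drivers libgl1-mesa-dri

Arm Mali G1 Pro support in open-source PanVK & Panfrost drivers
Auto
The PanVK Vulkan driver and Panfrost Gallium3D driver now support Arm Mali G1-Pro GPU hardware.
Cheap GPU · Arm Mali G1-Pro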
This support enables AI development on devices with Arm Mali G1-Pro GPUs, which are typically found in lower-cost or embedded systems.
Try it
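# Panfrost and PanVK are part of the Mesa tree and build with Meson; treat the options below as a sketch
git clone https://gitlab.freedesktop.org/mesa/mesa.git && cd mesa && meson setup build -Dgallium-drivers=panfrost -Dvulkan-drivers=panfrost && ninja -C build

Accelerate Page Migration for Better Performance
Auto
AMD engineers are working on patches for accelerating page migration in the Linux kernel.
Cheap GPU · CPU only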
This patch series, originally started by an NVIDIA engineer in early 2025, aims to improve system performance by accelerating page migration. AMD's involvement suggests that this optimization could benefit a wide range of systems, not just those with AMD hardware.
Try it
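# amd_page_migration.patch is a placeholder file name; run this from inside a kernel source tree
git apply amd_page_migration.patch

Accelerate page migration for better performance
Auto
AMD engineers are working on patches to accelerate page migration for improved performance.
Cheap GPU · CPU only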
This patch series, originally started by an NVIDIA engineer, is now being worked on by AMD to accelerate page migration, which can lead to better performance on Linux systems.
Try it
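# accelerated-page-migration.patch is a placeholder file name; run this from inside a kernel source tree
git apply accelerated-page-migration.patch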