Templates
Overview
We strive to make accessing and utilizing GPUs seamless and effective. One of the ways we achieve this is through our template system, which allows users to quickly deploy GPU workloads with minimal setup. This doc covers the two types of templates, the current list of supported managed templates, and how to apply templates to your instance.
What is a Template
GPU Trader templates are structured as Docker Compose files, ensuring flexibility and ease of deployment. Templates help with:
- Faster Deployment: Reduce setup time and focus on execution.
- Consistency: Ensure that environments remain consistent across deployments, reducing errors and compatibility issues.
- Scalability: Deploy workloads at scale with a few clicks, making it easy to expand your computing power as needed.
We offer two types of templates:
- Managed Templates: Pre-configured by GPU Trader, continuously updated, and optimized for performance on our platform.
- Custom Templates: User-defined templates that provide flexibility for specific workload needs.
Let’s dive into how these templates work and how they can enhance your experience on GPU Trader.
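Because templates are Docker Compose files, a minimal one might look something like the sketch below. The service name, image, and port here are purely illustrative, not an actual GPU Trader managed template:

```yaml
# Illustrative sketch of a Compose-based GPU template — names and
# values are examples only.
services:
  app:
    image: tensorflow/tensorflow:latest-gpu   # any GPU-enabled image
    ports:
      - "8888:8888"                           # expose the app's web port
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all                      # request all available GPUs
              capabilities: [gpu]
```

The `deploy.resources.reservations.devices` block is how Compose requests GPU access from the host.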
Managed Templates
Managed Templates are designed for users who want a quick and reliable way to deploy GPU workloads without worrying about configuration issues. These templates are:
- Maintained by GPU Trader: We ensure they are always up to date and compatible with our infrastructure.
- Optimized for Performance: Managed templates leverage best practices to ensure workloads run efficiently.
- Easy to Deploy: No need to configure settings; just select a template and start using it immediately.
With Managed Templates, users can focus on running their applications rather than spending time on defining and managing a complex setup process. Here is a list of current managed templates:
Last Updated: April 10, 2025
Template | Description |
---|---|
vLLM | vLLM is a fast and easy-to-use library for LLM inference and serving. |
Ubuntu Noble Numbat | Ubuntu 24.04 LTS “Noble Numbat” is the latest Long Term Support (LTS) release from Canonical, launched on April 25, 2024. |
Ubuntu Jammy Jellyfish | Ubuntu 22.04 LTS “Jammy Jellyfish” is a Long Term Support (LTS) release of the Ubuntu operating system, launched on April 21, 2022. |
Ubuntu Focal Fossa | Ubuntu 20.04 LTS “Focal Fossa” is a Long Term Support (LTS) release of the Ubuntu operating system, launched on April 23, 2020. |
NVIDIA Triton | NVIDIA Triton Inference Server is an open-source software platform designed to streamline and standardize the deployment of AI models in production environments. |
Tensorflow | TensorFlow is an open-source platform developed by Google for building and deploying machine learning and deep learning models. |
Red Hat ubi9 | Red Hat Universal Base Image 9 (UBI 9) is a freely redistributable, Open Container Initiative (OCI)-compliant base operating system image provided by Red Hat. |
Ray | Ray is an open source unified framework for scaling AI and Python applications. It provides a simple, universal API for building distributed applications that can scale from a laptop to a cluster. |
Pytorch | PyTorch is an optimized tensor library for deep learning using GPUs and CPUs. |
Oobabooga’s Web UI | Oobabooga is a simple web UI for interacting with open-source models. |
Open WebUI | Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI platform designed to operate entirely offline. It supports various LLM runners like Ollama and OpenAI-compatible APIs, with built-in inference engine for RAG, making it a powerful AI deployment solution. |
Apache MXNet | Apache MXNet is an open-source deep learning framework designed for both efficiency and flexibility, with a focus on scalability across multiple CPUs and GPUs. |
MLFlow | MLflow is an open-source platform for managing the end-to-end machine learning lifecycle, encompassing tracking experiments, packaging code, managing models, and deploying them, all in a reproducible and collaborative manner. |
Blender Kasm | Blender Kasm refers to the integration of Blender, a powerful open-source 3D creation suite, within Kasm Workspaces, a container streaming platform. This setup allows users to access Blender through a web browser, eliminating the need for local installations and enabling remote 3D modeling and animation work. |
Juice Labs Agent | Juice is GPU-over-IP: a software application that routes GPU workloads over standard networking, creating a client-server model where virtual remote GPU capacity is provided from Server machines that have physical GPUs (GPU Hosts) to Client machines that are running GPU-hungry applications (Application Hosts). This template allows users to add external GPUs to their existing pools. |
Hugging Face Transformers | Hugging Face Transformers is a widely-used open-source library that provides easy access to state-of-the-art natural language processing (NLP) and generative AI models, including models for text, vision, audio, and multimodal tasks. |
Fedora | Fedora is a community-driven Linux distribution sponsored by Red Hat. |
Cuda Devel Ubuntu | “cuda-devel” in Ubuntu refers to the package containing the development files for the NVIDIA CUDA Toolkit. This package provides the necessary headers and libraries for compiling CUDA applications. |
ComfyUI | ComfyUI is an open-source, node-based graphical user interface (GUI) designed for creating and managing complex workflows in generative AI models, particularly Stable Diffusion. |
Custom Templates
For users who require greater flexibility, Custom Templates allow for personalized configurations that fit specific workload needs. These templates can be based on existing GPU Trader managed templates by duplicating them, or you can start from scratch. Be aware, however, that due to security constraints, certain elements are restricted to maintain system integrity and prevent abuse. Here’s how they work:
- User-Defined Configurations: Choose the software stack, dependencies, and settings that suit your application.
- Reusable for Future Deployments: Once created, a custom template can be reused, saving time on repeated configurations.
- Fine-Tuned for Your Workloads: Customize resource allocation, libraries, and optimizations to match your unique requirements.
Learn how to create custom templates by following this tutorial.
Applying Templates to your Instance
Selecting the Instance
Navigate to My Instances or your instance’s detail page to add a template. You will be redirected to the detail page automatically after renting an instance. If you are somewhere else in the platform, click ‘My Instances’, then click ‘Setup’ on the instance you want to choose a template for.
If you have used the instance before and aren’t sure whether any stacks are running, click the instance card to see its details.
Finding a Template
From the instance detail page click ‘Setup’ or ‘Add Stack’ to select a template for your instance. You will be redirected to a page showing your account’s templates.
Choosing a Template
Highlight a template by clicking its inventory card, or select it by clicking the ‘Select’ button. In this example we will choose the TensorFlow managed template. This will allow you to review the template before applying it to your instance.
Configuring the Template
The template details page allows users to edit the template before instantiating it. Users can name their templates and give them a unique description, enable Basic Authentication, configure Environment Variables, edit the template YAML, and save the template as a custom template.
Name: This field can be changed from the default to identify the template more easily.
Description: This field can be changed from the default to provide context for a user or team.
Basic Authentication: There is an optional feature to enable or disable basic authentication using a toggle in the template details. Users should enable this feature if they want to protect access to a public-facing site in front of a web application service defined in the template. To learn more, read the basic authentication documentation.
Environment Variables: Use environment variables (referenced as `${VARIABLE_NAME}`) to configure dynamic settings in the stack’s containers. In the TensorFlow template there are two variables that need to be defined, `${TOKEN}` and `${PW}`. Fill out the Key and Value, and click ‘Add’ to associate the variables with your template.
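As a sketch of how such variables might appear inside a template’s Compose YAML (the service name and the settings that `${TOKEN}` and `${PW}` map to are assumptions for illustration; the actual TensorFlow template may differ):

```yaml
# Illustrative only — shows the ${VAR} interpolation style Compose uses.
services:
  tensorflow:
    image: tensorflow/tensorflow:latest-gpu-jupyter
    environment:
      - JUPYTER_TOKEN=${TOKEN}   # filled from the Key/Value pair named TOKEN
      - PASSWORD=${PW}           # filled from the Key/Value pair named PW
```

The Key you enter in the template details must match the variable name inside `${…}` for the value to be substituted at deploy time.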
Configuration: You may edit the YAML in the template directly. Examples might include changing the image tag from `latest` to another version. Keep in mind that changing the YAML makes the template custom, and GPU Trader doesn’t guarantee it will work.
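For example, pinning the image to a specific version instead of `latest` might look like this (the tag shown is illustrative; check the image’s registry for available tags):

```yaml
# Pinning a specific image version — tag is an example, not a recommendation.
services:
  tensorflow:
    image: tensorflow/tensorflow:2.16.1-gpu
```

Pinned tags make deployments reproducible, at the cost of not picking up upstream updates automatically.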
Save Template: If you make changes to the template that you wish to save for future use, click ‘Save Template’ to save it as a custom template.
Use Template: If you are satisfied with the template, click ‘Use Template’ to instantiate the stack. If the template is defined correctly you will be redirected back to the instance detail page.
Confirming the Template Deployed
The stack will appear under the instance details and will show a status of ‘Applying’ while the template is deploying. The size of the image and network speeds of your instance will affect the deployment time of your stack. Some templates will take seconds to deploy, while others, like Open WebUI, take several minutes due to their size. When the deployment is complete, the status will change to ‘Applied’ and you will see details to access the stack.
Keep reading to understand template management or continue to ‘Using your Instance’ to work with the stack you just deployed.
Template Management
To manage your templates, click the ‘Templates’ link in the navigation menu under ‘My Instances’. This page will generate a list of templates associated with your account. This list includes managed templates provided by GPU Trader and the templates you have created. You can filter the list of templates based on template type (managed or custom) to quickly find what you are looking for.
Use this interface to manage your templates when not working directly with an instance. You can duplicate existing templates and save a new version or start from scratch and create a template from nothing.
Use these template tutorials to get started.
Template Tutorials
- Create Custom Templates
- Install vLLM Template
- Run Llama 4 with Ollama
- Run DeepSeek with Ollama
- Run Qwen 3.2 with Ollama
- Add Capacity to Juice Labs Pools