Skip to main content
πŸŽ“ Claude Code Masterclass Learn AI-assisted development on Udemy β€” plus the companion book on Leanpub & Amazon. Start Learning
IBM Granite model context length and specifications
AI

IBM Granite Context Length and Model Specs (2026)

IBM Granite model context lengths, parameter counts, and deployment specs. Covers granite-3b-instruct, granite-8b-base, granite-8b-instruct and all.

LB
Luca Berton
Β· 2 min read

IBM Granite model family

IBM’s Granite models are open-source, enterprise-focused LLMs trained on curated, legally compliant datasets. They are a core part of RHEL AI and IBM watsonx.

Here is the complete context length and specification reference.

Granite 3.x model specifications

ModelParametersContext LengthLicenseNotes
granite-3.2-8b-instruct8B128,000 tokensApache 2.0Latest instruct model, thinking mode
granite-3.2-8b-instruct-preview8B128,000 tokensApache 2.0Preview with enhanced reasoning
granite-3.2-3b-instruct3B128,000 tokensApache 2.0Compact instruct model
granite-3.2-2b-instruct2B128,000 tokensApache 2.0Edge deployment target
granite-3.2-1b-instruct1B128,000 tokensApache 2.0Smallest instruct variant
granite-3.1-8b-instruct8B128,000 tokensApache 2.0Stable release
granite-3.1-8b-base8B128,000 tokensApache 2.0Base model for fine-tuning
granite-3.1-3b-instruct3B128,000 tokensApache 2.0Compact, function calling
granite-3.1-2b-instruct2B128,000 tokensApache 2.0Edge and mobile
granite-3.1-1b-instruct1B128,000 tokensApache 2.0Tiny but capable
granite-3.0-8b-instruct8B4,096 tokensApache 2.0Original release
granite-3.0-8b-base8B4,096 tokensApache 2.0Original base
granite-3.0-3b-instruct3B4,096 tokensApache 2.0Original compact

Granite code models

ModelParametersContext LengthLicense
granite-3.2-8b-instruct (code mode)8B128,000 tokensApache 2.0
granite-code-34b34B8,192 tokensApache 2.0
granite-code-20b20B8,192 tokensApache 2.0
granite-code-8b8B4,096 tokensApache 2.0
granite-code-3b3B2,048 tokensApache 2.0

Key details by model

granite-8b-base context length

The ibm-granite/granite-8b-base (Granite 3.0) has a 4,096 token context length. If you need longer context, upgrade to granite-3.1-8b-base or granite-3.2-8b-instruct which support 128,000 tokens.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ibm-granite/granite-3.2-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Check max context length
print(tokenizer.model_max_length)  # 128000

granite-3b-instruct context length

The ibm-granite/granite-3b-instruct (Granite 3.0) has a 4,096 token context length. The Granite 3.1 and 3.2 versions of the 3B model support 128,000 tokens.

granite-8b-instruct context length

The ibm-granite/granite-8b-instruct (Granite 3.0) has a 4,096 token context length. Upgrade to granite-3.2-8b-instruct for 128,000 token context.

Deploying Granite on NVIDIA NIM

Granite models are available through NVIDIA NIM:

docker run -d --gpus all \
  -e NGC_API_KEY=$NGC_API_KEY \
  -p 8000:8000 \
  nvcr.io/nim/ibm/granite-3.1-8b-instruct:latest

NIM profile selection for Granite:

ModelGPUProfilePrecision
granite-3.2-8b-instructA100 80GBdefaultBF16
granite-3.2-8b-instructL40SdefaultFP8
granite-3.2-8b-instructA10GdefaultFP8
granite-3.2-3b-instructT4defaultFP16

Deploying Granite on RHEL AI

Granite is the default model family for RHEL AI with InstructLab:

# Download Granite teacher model
ilab model download --repository ibm-granite/granite-3.2-8b-instruct

# Serve the model
ilab model serve --model-path models/granite-3.2-8b-instruct

# Fine-tune with InstructLab
ilab data generate --model models/granite-3.2-8b-instruct
ilab model train --model-path models/granite-3.2-8b-instruct

Granite vs other models

MetricGranite 3.2 8BLlama 3.1 8BMistral 7B
Context length128K128K32K
LicenseApache 2.0Llama 3.1Apache 2.0
Function callingYesYesYes
Code generationStrongStrongGood
RAG optimizedYesNoNo
Training data transparencyFullPartialPartial
Enterprise indemnity (via IBM)YesNoNo

Granite’s differentiator is legal compliance β€” trained on curated datasets with full provenance, making it the safest choice for regulated industries.

Free 30-min AI & Cloud consultation

Book Now