Fix Copilot response hit the length limit error
AI

Fix: "Sorry, the Response Hit the Length

Getting the "response hit the length limit" error in Microsoft Copilot or Bing Chat? Here are 7 fixes that help you get complete responses.

Luca Berton
· 2 min read

The error message “Sorry, the response hit the length limit. Please rephrase your prompt.” is one of the most common frustrations with Microsoft Copilot (formerly Bing Chat). Here is why it happens and how to fix it.

Why This Happens

Microsoft Copilot caps the number of tokens it can generate in a single response. When your question requires a long answer (detailed code, comprehensive lists, long explanations), the model hits this ceiling and truncates the output.

This is not a bug. It is a deliberate limit to manage compute costs and response times.
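To get a feel for when an answer is likely to hit the ceiling, a rough estimate can help. The sketch below assumes the common rule of thumb of about 4 characters per token for English text; real tokenizers differ by model and language, and the 4,000-token limit in the example is only an illustrative figure, not a documented value.

```python
import math

# Assumption: ~4 characters per token is a rule of thumb for English text.
# Real tokenizers vary by model and language, so treat results as rough.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Return a rough token estimate for `text`."""
    return math.ceil(len(text) / CHARS_PER_TOKEN)


def likely_to_truncate(expected_chars: int, output_limit_tokens: int) -> bool:
    """Guess whether an answer of `expected_chars` characters exceeds the limit."""
    return math.ceil(expected_chars / CHARS_PER_TOKEN) > output_limit_tokens


# Hypothetical example: a 20,000-character answer against a 4,000-token limit.
print(likely_to_truncate(20_000, 4_000))
```

If the estimate lands well above the limit, splitting the request up front tends to save time compared with recovering a truncated answer.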

Fix 1: Break Your Prompt into Smaller Parts

Instead of asking for everything at once:

Before (hits limit):

“Write a complete Python web application with authentication, database, API endpoints, tests, and deployment instructions”

After (works):

“Write the database models for a Python web app with user authentication”

Then follow up with:

“Now add the API endpoints for the models above”
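If you split requests like this often, the follow-up prompts can be generated from a list of components. This is a minimal sketch; the function name `build_prompt_sequence` and the prompt wording are illustrative assumptions, not anything Copilot requires.

```python
def build_prompt_sequence(project: str, parts: list[str]) -> list[str]:
    """Turn one large request into an ordered list of smaller prompts."""
    if not parts:
        return []
    # The first prompt sets the scene; later prompts build on earlier output.
    prompts = [f"Write the {parts[0]} for {project}"]
    for part in parts[1:]:
        prompts.append(f"Now add the {part} for the code above")
    return prompts


prompts = build_prompt_sequence(
    "a Python web app with user authentication",
    ["database models", "API endpoints", "tests", "deployment instructions"],
)
for prompt in prompts:
    print(prompt)
```

Paste the prompts into the chat one at a time, waiting for each response to finish before sending the next.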

Fix 2: Ask Copilot to Continue

When the response cuts off, simply type:

“Continue from where you left off”

Or:

“Continue”

Copilot will pick up where it stopped. You may need to do this 2-3 times for very long responses.
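When you copy the pieces of a continued answer into one file, the continuation sometimes repeats the last few words of the previous piece. The sketch below joins chunks and drops such a repeated overlap. The helper name `stitch` and its thresholds are assumptions chosen for illustration; very short overlaps are ignored to avoid removing text by accident.

```python
def stitch(chunks: list[str], min_overlap: int = 5, max_overlap: int = 200) -> str:
    """Join response chunks, removing text repeated at each chunk boundary."""
    result = ""
    for chunk in chunks:
        overlap = 0
        limit = min(len(result), len(chunk), max_overlap)
        # Look for the longest prefix of `chunk` that already ends `result`.
        for size in range(limit, min_overlap - 1, -1):
            if result.endswith(chunk[:size]):
                overlap = size
                break
        result += chunk[overlap:]
    return result


first = "def add(a, b):\n    return"
second = "    return a + b\n"
print(stitch([first, second]))
```

Review the joined text afterwards, since a continuation can also skip a few words instead of repeating them.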

Fix 3: Start a New Conversation

Copilot conversations accumulate context. Long conversation histories eat into the token budget, leaving less room for responses:

  1. Click the New Topic button (broom icon)
  2. Re-ask your question in a fresh conversation
  3. The response can then use the full token budget, so it is less likely to be cut off
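The effect of a long history can be illustrated with simple arithmetic. The sketch below assumes a simplified model in which the conversation history and the response share one context window and the response also has its own cap; all numbers are hypothetical, and Copilot's actual accounting is not public.

```python
def remaining_output_budget(
    context_window: int, history_tokens: int, output_cap: int
) -> int:
    """Tokens left for the response under a simplified shared-window model."""
    free = max(context_window - history_tokens, 0)
    return min(free, output_cap)


# Hypothetical figures: an 8,000-token window with a 4,000-token output cap.
print(remaining_output_budget(8_000, 7_000, 4_000))  # long conversation
print(remaining_output_budget(8_000, 0, 4_000))      # fresh conversation
```

Under these assumptions, the long conversation leaves 1,000 tokens for the answer, while the fresh one leaves the full 4,000.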

Fix 4: Request a Specific Format

Structured formats usually need fewer tokens than prose to convey the same information:

“Give me a bullet-point list of the top 10 Kubernetes security practices” (shorter than paragraph form)

“Summarize in a table with columns: Tool, Purpose, License” (compact output)

“Give me just the code, no explanations” (eliminates commentary)

Fix 5: Set Explicit Length Constraints

Tell Copilot how long you want the response:

“In under 500 words, explain…”

“Give me a brief overview (3 paragraphs max) of…”

“List the top 5 most important…”

Fix 6: Switch Conversation Style

Microsoft Copilot offers different conversation styles:

  • Creative — longer, more detailed responses
  • Balanced — default, moderate length
  • Precise — shorter, focused responses

Try Creative mode if you need longer outputs, or Precise if you want Copilot to stay concise and avoid hitting the limit.

Fix 7: Use Copilot in Different Contexts

The token limit varies by product:

Product | Typical Limit | Notes
Copilot (free web) | ~4,000 tokens output | Most restrictive
Copilot Pro | ~8,000 tokens output | Longer responses
M365 Copilot | Varies by app | Word/Excel have different limits
Copilot in VS Code | ~8,000 tokens | Code-optimized
GitHub Copilot Chat | ~4,096 tokens | Context window matters

If you are on the free tier, upgrading to Copilot Pro roughly doubles your response length, going by the approximate figures above.

For Developers: GitHub Copilot Chat Limits

If you are hitting this in GitHub Copilot Chat in VS Code:

# Use /fix or /explain for targeted responses
/explain this function
/fix the error in this file

# Use @workspace for scoped questions
@workspace how is authentication implemented?

# Break large refactoring into steps
"Refactor the database layer only"
"Now refactor the API routes to use the new database layer"

About the Author

I am Luca Berton, AI and Cloud Advisor. I help teams adopt AI tools productively. Book a consultation.
