Why AI Slows Down When You Give It Too Much to Read
- Patrick Law
Engineers often assume more context means better answers from ChatGPT. But here’s a frustrating truth: the more you give it, the worse it performs. If your AI assistant is slowing down, losing accuracy, or just acting weird — overloaded input might be why.
When More = Less
Large Language Models like ChatGPT or Claude can technically process huge documents — up to 100,000 words in some cases. But just because they can doesn’t mean they should. Research shows performance peaks at around 3,000–8,000 tokens (roughly a few pages).
Give it more than that, and you’re not adding context — you’re adding noise. The model takes longer, the answers get vaguer, and it often misses key points.
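A quick sanity check before pasting a document into a chat window is to estimate its token count. The sketch below is a heuristic, not a real tokenizer: it assumes roughly 1.3 tokens per English word, and the 3,000–8,000 band comes from the figures above.

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~1.3 tokens per whitespace-separated word.

    A real tokenizer (the model's own) will differ; this is only a
    quick sanity check before pasting text into a chat window.
    """
    return int(len(text.split()) * 1.3)


def in_sweet_spot(text: str, low: int = 3000, high: int = 8000) -> bool:
    """True if the input sits in the ~3k-8k token band the article cites."""
    return low <= estimate_tokens(text) <= high


sample = "word " * 4000           # ~4,000 words
print(estimate_tokens(sample))    # 5200
print(in_sweet_spot(sample))      # True
```

If the estimate lands well above the band, that is the signal to trim or split before asking your question.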
The ‘Lost-in-the-Middle’ Problem
AI has a short attention span. It prioritizes the beginning and end of the input — and tends to forget what’s in the middle. This is called the "lost-in-the-middle" effect. So if your critical data is buried halfway through a long spec sheet, there’s a good chance it gets ignored.
What We’re Seeing at Singularity
Truthfully, we’re still refining this. We’ve been uploading long engineering documents into ChatGPT — and yes, we’re hitting performance issues.
The fix? We’re shifting to smaller, cleaner inputs. Breaking large documents into chunks. Prioritizing what matters. Not just for speed, but clarity.
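Splitting on paragraph boundaries is one simple way to do the chunking described above. This sketch is an illustration under a word budget, not our production pipeline:

```python
def chunk_by_paragraphs(text: str, max_words: int = 800) -> list[str]:
    """Pack whole paragraphs into chunks of at most max_words words.

    Paragraphs are split on blank lines; a single paragraph longer than
    the budget becomes its own (oversized) chunk rather than being cut.
    """
    chunks, current, count = [], [], 0
    for para in (p.strip() for p in text.split("\n\n") if p.strip()):
        words = len(para.split())
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(para)
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


# Ten 100-word paragraphs with a 300-word budget -> chunks of 3, 3, 3, 1.
doc = "\n\n".join(["word " * 100] * 10)
print(len(chunk_by_paragraphs(doc, max_words=300)))  # 4
```

Keeping paragraphs intact matters: a chunk boundary in the middle of a sentence destroys exactly the context you were trying to preserve.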
Best Practices for Engineers
Don’t upload a whole document. Extract only what’s relevant.
Move key information to the top or bottom of your input.
Watch model response time — slow replies are a red flag.
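Putting those practices together, a prompt can be assembled so the extracted facts lead and the question is restated at the end, keeping both out of the "middle". The function and section labels here are illustrative, not from any particular tool:

```python
def build_prompt(question: str, key_facts: list[str], background: str = "") -> str:
    """Assemble a prompt with the critical content at the edges.

    Key facts go first, optional background sits in the middle, and the
    question is restated at the end -- the positions models attend to most.
    """
    parts = ["KEY FACTS:"]
    parts += [f"- {fact}" for fact in key_facts]
    if background:
        parts += ["", "BACKGROUND:", background]
    parts += ["", f"QUESTION: {question}"]
    return "\n".join(parts)


prompt = build_prompt(
    question="What is the design pressure for line P-101?",
    key_facts=["P-101 design pressure: 150 psig", "Material: A106 Gr B"],
    background="(only the spec-sheet sections that mention P-101)",
)
print(prompt)
```

The background slot is where a relevant chunk of the source document goes — not the whole file.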
Conclusion
AI doesn’t need more data. It needs better input. For more ways to sharpen your engineering workflows using AI, subscribe to our newsletter: https://www.singularityengineering.ca/general-4