What AI does to the minds of novice coders (www.youtube.com)
from HaraldvonBlauzahn@feddit.org to programming@programming.dev on 26 Jun 19:47
https://feddit.org/post/31816844

#programming

HaraldvonBlauzahn@feddit.org on 26 Jun 20:36

This is about AI and learning.

I have an interesting observation: over the last few years I had the opportunity to work with older legacy code bases. And I came to the conclusion that, for me, understanding them requires systematic learning, pretty much like learning a language.

Say you have 70,000 lines of code with about 8,000 to 10,000 identifiers, plus a number of implicit concepts, invariants and so on.

To work with the code, you need some basic understanding. To understand it, you need an idea of what these identifiers mean, or at least the important subset of them.

Now, remember that the human brain can only keep six or seven things in short-term memory. To store more, the meaning of all these identifiers, concepts, invariants etc. needs to go into long-term memory.

Which means repeating and memorizing around 10,000 things.

That is similar to learning a foreign language up to a pretty good level.

So you need that learning process to work with legacy code.

If AI use inhibits this, that means trouble.
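
To put a rough number on that vocabulary for a given code base, here is a minimal sketch. It assumes the sources are Python (the code bases I mean could be in any language), and the function name `distinct_identifiers` is made up for illustration. It counts every distinct non-keyword name, so builtins like `print` are included too.

```python
import io
import keyword
import tokenize


def distinct_identifiers(source: str) -> set[str]:
    """Collect the distinct non-keyword names used in a piece of Python source."""
    names = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        # NAME tokens cover both keywords and identifiers, so filter keywords out.
        if tok.type == tokenize.NAME and not keyword.iskeyword(tok.string):
            names.add(tok.string)
    return names


sample = "def add(a, b):\n    total = a + b\n    return total\n"
print(sorted(distinct_identifiers(sample)))  # ['a', 'add', 'b', 'total']
```

Running something like this over every file and taking the union of the sets gives the size of the vocabulary you would have to learn. It says nothing about the implicit concepts and invariants, which are not written down anywhere.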

Thorry@feddit.org on 26 Jun 22:01

Yeah, that’s spot on. AI is always shown doing so well at coding with small tutorial-like exercises or by generating demos. But in reality that’s almost never what devs actually work on. Usually it’s huge codebases with many different interconnected parts, stuff that takes quite a while to figure out. And AI simply isn’t capable of dealing with anything like that.

It has this thing called the context window, which holds pretty much all the knowledge the AI can have beyond what is baked into the model from its training data. These context windows are very limited in size, with beefy ones around 1M tokens. This is obviously not enough to hold the system prompt, the user prompt and all of the directly related code. It also has to store all of the responses and follow-ups, i.e. the entire conversation. Another issue is that when the context window fills up, the output quality suffers: there’s often too much information to give a concise answer, as the bot doesn’t actually understand anything that was said.
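
The budget problem can be sketched as simple arithmetic. This is only a rough illustration: the 4-characters-per-token ratio is a rule-of-thumb assumption (real tokenizers vary by model and by language), and the names `ContextBudget`, `estimate_tokens` and `fits_in_context` are made up for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class ContextBudget:
    window_tokens: int         # total size of the context window
    system_prompt_tokens: int  # tokens taken by the system prompt
    conversation_tokens: int   # prompts, responses and follow-ups so far

    def remaining(self) -> int:
        """Tokens left for source code after prompts and conversation."""
        used = self.system_prompt_tokens + self.conversation_tokens
        return max(0, self.window_tokens - used)


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Very rough token estimate; the ratio is an assumption, not a measurement."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(code: str, budget: ContextBudget) -> bool:
    """True if the estimated token count of `code` fits in what is left."""
    return estimate_tokens(code) <= budget.remaining()
```

The point is that everything shares the same window, so the longer the conversation gets, the less room is left for the code itself.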

So the bots use tricks where they only read parts of the files, using tools to quickly find related files and then reading only the relevant lines in those. They will also fall back on common patterns, which may or may not match how the code base is actually built. Another trick is summarizing the conversation: the AI condenses the key parts of the prompt, responses and follow-ups into a short form. Obviously this works kinda meh, since key details are often lost. And then you get heavily into diminishing-returns territory.
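
As a toy version of the "only read the relevant lines" trick: search for a keyword and keep just a few lines of context around each hit. This is a sketch under my own assumptions, not how any particular tool does it. Real tools use smarter search than a substring match, and the function names are made up.

```python
def relevant_line_ranges(lines, keyword, context=2):
    """Return merged (start, end) index ranges, end exclusive, around each hit."""
    ranges = []
    for i, line in enumerate(lines):
        if keyword in line:
            start = max(0, i - context)
            end = min(len(lines), i + context + 1)
            if ranges and start <= ranges[-1][1]:
                # Overlaps or touches the previous range: extend it instead.
                ranges[-1] = (ranges[-1][0], max(ranges[-1][1], end))
            else:
                ranges.append((start, end))
    return ranges


def extract_snippets(lines, keyword, context=2):
    """Return only the slices of the file that surround a keyword hit."""
    return [lines[s:e] for s, e in relevant_line_ranges(lines, keyword, context)]
```

Anything outside those slices never reaches the model, which is exactly where an invariant set up 200 lines earlier gets lost.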

In my experience, when working on larger codebases, AI coding tools are mostly useless. They have to be babied through simple stuff and corrected often. This is very frustrating and takes more time than just doing the work yourself.

And if you are working with older coding guidelines that differ from what’s normally used today, oh boy, that’s a bad time.

People are easily impressed by the thing getting a demo sorta right, which is admittedly impressive for any computer system to do. But it’s usually not actually right and very limited in scope. Most folks tend to extrapolate intelligence like they do with humans: if a human can do a demo well or ace an exam, they know their stuff. But with AI we have this thing termed jagged intelligence, where it can get an insanely hard, complex question absolutely spot on and then fail on very basic questions.

If it weren’t for the huge amount of marketing and shady business going on with AI, I doubt anyone would take it seriously. It’s a neat curiosity, a nice plaything, but not something we would actually call “AI”.

cloudy1999@sh.itjust.works on 27 Jun 05:09

Just wanted to say this is a great write-up, thanks.