Data > Hype

There is a ton of hype now about using GenAI for various tasks, especially for technical workers. There are lots of executives who would like to use AI to reduce their cost of labor, whether that’s getting more out of their existing staff or perhaps even reducing staff. Salesforce famously noted they weren’t hiring software engineers in 2025. I’m not sure they let engineers go, but it seems they did let support people go.

Many of us technical people know that the hype of a GenAI agent writing code is just that: hype. The agents can’t do the same job that humans do, or at least not the job that some humans do. We still need humans to prompt the AIs, make decisions, and maybe most importantly, stop the agents when they’re off track. I’m not sure anyone other than a trained software engineer can do that well.

I was listening to a podcast recently on software developers using AI, and there was an interesting comment: “Data beats hype every time,” which is something I hope most data professionals understand. We should form a hypothesis, run an experiment, measure outcomes, and then decide whether to continue in our direction or rethink our hypothesis.

Isn’t that how you query tune? You have an idea of what might reduce query time, you make a change, and check the results. Hopefully you don’t just rewrite certain queries using a pattern because this has helped improve performance in the past without testing your choice. Maybe you default to adding a new index (or a new key column/include column) to make a query perform better? I hope you don’t do those last two.
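As a rough illustration of that change-and-measure loop, here is a minimal sketch in Python. It assumes an in-memory SQLite database as a stand-in for whatever platform you actually tune, and the `orders` table, the index name, and the `median_seconds` helper are all hypothetical. The point is only the shape of the experiment: take a baseline, make one change, measure again, and confirm the results did not change.

```python
import sqlite3
import statistics
import time


def median_seconds(conn, sql, params=(), runs=5):
    """Run a query several times and return (median elapsed seconds, rows)."""
    timings = []
    rows = []
    for _ in range(runs):
        start = time.perf_counter()
        rows = conn.execute(sql, params).fetchall()
        timings.append(time.perf_counter() - start)
    return statistics.median(timings), rows


# Hypothetical table and generated data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 0.5) for i in range(50_000)],
)

query = "SELECT id, total FROM orders WHERE customer_id = ? ORDER BY id"

# Baseline measurement before the change.
before, rows_before = median_seconds(conn, query, (42,))

# The hypothesis under test: an index on customer_id reduces query time.
conn.execute("CREATE INDEX ix_orders_customer ON orders (customer_id)")

# Measure again, and check that the change did not alter the results.
after, rows_after = median_seconds(conn, query, (42,))

print(f"before: {before:.6f}s  after: {after:.6f}s  same rows: {rows_before == rows_after}")
```

The timings will vary by machine and run, so treat a single comparison as a hint rather than proof, and repeat the measurement under a realistic workload before keeping the change.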

AI technology can be helpful, but some thought needs to go into how to roll it out, how to set up and measure experiments, and how to get feedback on whether it actually produces better code and helps engineers, or whether it’s just hype that isn’t helping.

Ultimately, I think that this is especially true for data professionals, as the training of models on SQL code isn’t as simple or easy as it might be for Python, Java, C#, etc. For example, I find some models are biased more towards one platform (MySQL) than another (SQL Server). Your experiments should include using a few different models and finding out which ones work well and (more importantly) which ones don’t. We also need to learn where models actually produce better-performing code for our platforms.

If you’re skeptical of AI, then conduct some experiments. Try to learn to use the tool to help you, rather than replace you. Look for ways to speed up your development, or have an assistant handle tedious tasks. I have found that when I do that, I get benefits from AI that save a bit of typing.

The Pragmatic Engineer podcast suggested that the best way to deal with some of the hype on AI is with data: take a structured approach to rolling it out, run plenty of A/B tests with different groups or cohorts, evaluate, and see what works well. One of the things the guest noted was that the most highly regulated and structured groups are having the most success with AI, because they’re careful about rollout and they measure everything. They’ve been measuring time spent, accuracy of tasks, and more. Then they decide where and when to use AI, which might be the best advice you get.
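One way to picture the cohort measurement described here is a small Python sketch like the one below. Everything in it is an assumption for illustration: the `TaskResult` record, the cohort labels, and the `summarize` helper are hypothetical, and the sample rows are invented placeholders, not real measurements.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskResult:
    cohort: str     # hypothetical labels, e.g. "with_ai" or "without_ai"
    minutes: float  # time spent on the task
    correct: bool   # whether the result passed review


def summarize(results):
    """Group results by cohort and report task count, mean minutes, and accuracy."""
    by_cohort = {}
    for r in results:
        by_cohort.setdefault(r.cohort, []).append(r)
    return {
        cohort: {
            "tasks": len(items),
            "mean_minutes": mean(r.minutes for r in items),
            "accuracy": sum(r.correct for r in items) / len(items),
        }
        for cohort, items in by_cohort.items()
    }


# Invented placeholder rows, only to show the shape of the data.
results = [
    TaskResult("with_ai", 30.0, True),
    TaskResult("with_ai", 50.0, False),
    TaskResult("without_ai", 45.0, True),
    TaskResult("without_ai", 55.0, True),
]

summary = summarize(results)
for cohort, stats in summary.items():
    print(cohort, stats)
```

With real data you would want far more tasks per cohort and some check that the difference is not just noise before deciding where AI helps.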

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

About way0utwest

Editor, SQLServerCentral

6 Responses to Data > Hype

  1. If only this were true in the short term:

    “Data beats hype every time,”

    In aggregate, nothing beats hype, because it (like all marketing), plays to our emotions, our deepest fears, our joy. It takes work for data to even have a chance.

    The problem is really evident in all walks of life. Weight loss cures, cleanses, products we “need”, new software patterns “SQL is DEAD”, etc. And during the early stages of the hype cycle, EVERY win adds to the hype, and every failure to the data. Anyone touting any of the values of something being hyped is on the bandwagon. All naysayers are defeatist.

    Just like with LLMs… the data and the hype will eventually come to a crossroads. The data will show it has a lot of value, but nothing like the hype said. And at this point it becomes boring. You and I are old enough to have worked pre-search engine for a bit. It feels like that, only this is a much better tool. (Uh oh, I just added to the hype!)

  2. Steve – so glad you did another AI piece so I had a chance to share this. The other day I asked the Google AI a question and within the same answer it provided 2 entirely conflicting pieces of information about the stock performance of Chipotle. I thought I was just not reading the response correctly, so I had another person read it and they agreed it was 2 entirely conflicting statements, no doubt about it. I then asked the AI about this and it apologized for the mistake.

    These things are NOT ready for what companies are planning to use them for. They never will be ready short of some massive change in how they work because their model is flawed. They are merely providing a summary/analysis of information in approved-only sources, which means there will always be biases involved and, b/c humans are humans, lies and deception as well as just honest mistakes. They are going to propagate inaccurate info, and because people are being conditioned to trust them it will be damaging. Look at how long it took for people to realize Google could provide inaccurate info, and even now many still don’t realize this. This is why it’s a fool’s errand to just trust these to replace humans as opposed to using them as tools to let humans do more in less time. These corporate executives who can’t help themselves are going to bring us a very hard and difficult future because after they’ve broken the system with this foolishness, it will take time for the rest of us to fix what they broke and rebuild it all.

    • way0utwest says:

      They’re like other humans. They sometimes give good info, sometimes have responses not really backed by data. I think part of the challenge is there is so much data, that you’d hope an LLM regressed to a mean, but because they’re not deterministic, they sometimes lean one way or the other, depending on how the question is asked.

      More and more I find them amazing and horrific all at once
