Data > Hype

There is a ton of hype now about using GenAI for various tasks, especially for technical workers. There are lots of executives who would like to use AI to reduce their cost of labor, whether that’s getting more out of their existing staff or perhaps even reducing staff. Salesforce famously noted they weren’t hiring software engineers in 2025. I’m not sure they let engineers go, but it seems they did let support people go.

For many technical people, we know the hype of a GenAI agent writing code is just that: hype. The agents can't do the same job humans do, at least not the job that skilled humans do. We still need humans to prompt the AIs, make decisions, and, maybe most importantly, stop the agents when they're off track. I'm not sure anyone other than a trained software engineer can do that well.

I was listening to a podcast recently on software developers using AI, and there was an interesting comment: "Data beats hype every time," which is something I hope most data professionals understand. We should experiment with our hypotheses, measure outcomes, and then decide whether to continue in our direction or rethink our hypothesis.

Isn't that how you tune queries? You have an idea of what might reduce query time, you make a change, and you check the results. Hopefully you don't just rewrite queries using a pattern that has improved performance in the past without testing your choice. Maybe you default to adding a new index (or a new key column/include column) to make a query perform better? I hope you don't do those last two.
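As a sketch of that experimental approach (the table, column, and index names here are hypothetical, just for illustration), measure before and after a change rather than assuming the pattern helps:

```sql
-- Hypothetical example: measure a query before and after an index change.
SET STATISTICS TIME, IO ON;

-- 1. Baseline: run the query and note elapsed time and logical reads.
SELECT OrderID, OrderDate, TotalDue
FROM dbo.Orders
WHERE CustomerID = 42
ORDER BY OrderDate DESC;

-- 2. Make ONE change, e.g. a covering index for this predicate.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
    ON dbo.Orders (CustomerID, OrderDate DESC)
    INCLUDE (TotalDue);

-- 3. Re-run the same query and compare the measurements.
--    Keep the index only if the numbers actually improved.
```

The point isn't this particular index; it's that the decision to keep or drop the change comes from the measurements, not from the pattern having worked somewhere else.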

AI technology can be helpful, but there needs to be some thought put into how to roll it out, how to set up and measure experiments, and how to gather feedback on whether it actually produces better code and helps engineers, or whether it's just hype that isn't helping.

Ultimately, I think that this is especially true for data professionals, as the training of models on SQL code isn’t as simple or easy as it might be for Python, Java, C#, etc. For example, I find some models are biased more towards one platform (MySQL) than another (SQL Server). Your experiments should include using a few different models and finding out which ones work well and (more importantly) which ones don’t. We also need to learn where models actually produce better-performing code for our platforms.

If you’re skeptical of AI, then conduct some experiments. Try to learn to use the tool to help you, rather than replace you. Look for ways to speed up your development, or have an assistant handle tedious tasks. I have found that when I do that, I get benefits from AI that save a bit of typing.

According to the Pragmatic Engineer podcast, the best way to deal with some of the hype on AI is with data: take a structured approach to rolling it out, run plenty of A/B tests with different groups or cohorts, evaluate, and see what works well. One thing the guest noted was that the most highly regulated and structured groups are having the most success with AI, because they're careful about rollout and they measure everything: time spent, accuracy of tasks, and more. Then they decide where and when to use AI, which might be the best advice you get.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.


Back at Small Data SF in 2025

Today I'm in San Francisco at Small Data SF 2025. I went to the conference last year and thought it was a great event. Watching people talk about data, about how we might manage smaller systems, and about dealing with exploding volumes by querying, storing, and handling less data was fascinating. The event had me really thinking about ways we can build better-performing (and cheaper) systems.

To be clear, small data isn't tiny data. Often this is still hundreds of gigabytes, perhaps a few terabytes, but it's getting away from the idea that we'll be working on petabyte-sized big data systems, or that we even need to.

Last year there were lots of talks on data analysis, querying, and even AI, but using smaller sets of data in practical ways that provide value to organizations and individuals by judiciously choosing data sets, either recent or representative ones.

This helped me think of new ways for subsetting, which is something I’ve been pushing at Redgate for our TDM product.

I'm looking forward to the talks. This is a quick trip. I skipped the workshops yesterday, since they weren't that great last year (too many product/company pitches from Silicon Valley), and flew out last night after coaching kids in their first practice. I'll spend today at the conference listening to talks, do some networking at a get-together afterward, and fly back early tomorrow.

A quick trip, but I'm sure I'll have lots to write about (and think about) in the future.


Republish: Practice Until You Don’t Get It Wrong

I was in the UK last week, and since I was gone for a week, I decided to take a day off with my wife.

You get to read Practice Until You Don’t Get It Wrong, which is something I believe in. Or listen to the podcast: https://traffic.libsyn.com/voiceofthedba/PracticeUntilNotWrong_92_v2915.mp3



The Selfish Case for Learning AI

I ran across this article on a survey about AI usage recently. The headline is this: 55% of businesses admit wrong decisions in making employees redundant when bringing AI into the workforce.

That sounds a little ominous for those making these decisions, and a lot of you might be saying, “I could have told you that. Using AI to replace people is a bad decision.” On the surface, I agree. I dislike the idea that companies will opt for a semi-competent AI bot or agent to replace people, thereby further exacerbating the challenges faced by many workers in the modern world.

However, 55% means 45% didn't feel that way. That's almost a coin flip for executives deciding whether to terminate some employees and replace them with GenAI tools, especially workers who do things like customer service or tasks that are narrower in scope and "seem" like good fits for an LLM.

A lot of the stats in the piece are presented in a way that paints AI as risky, but the numbers are often less than 50%. I worry that this is a bet executives might make, especially when it could result in a bigger bonus or dividend for them personally. I never discount the selfish nature of executive decisions.

So, what is my selfish case for learning AI if I might get replaced? The big number is this one: “…80% of business leaders plan to reskill employees to use AI effectively…” For me, in my job, I want to be one of the people execs see as effectively using AI.

I need to spend time learning how to, and how not to, use AI tools. They can be helpful, but they can also cause problems. I need to examine where they help, how much time they save, and when to abandon them and just handle a task myself. That last skill might be the most important. I also need to ensure I learn to work efficiently with the tools to save time and become more effective. That takes some effort and focus to learn to use the tools well.

There are always going to be executives who will make the decision to let someone go. Your (selfish) job is to ensure that you aren’t the one chosen.

Steve Jones

