Half of All Engineers

The AI LLM boom shows no sign of slowing down. Each time I think we’ve reached peak craziness in usage or predictions, things take another turn. I still find myself pinging back and forth between thinking this will be amazingly good and thinking it will be horrifyingly bad.

Sometimes on the same day.

Today, I’m a little more down on AI. I was listening to Steve Yegge on the Pragmatic Engineer podcast, and they were discussing the curve of AI usage at companies. He points out that he’s mad that Amazon let 16,000 engineers go and might let more go. He worries that companies might let 50% of their engineers go, not necessarily because the top 50% will be more productive with AI than 100% of engineers without it. Rather, the concern is that companies will get rid of half their salaries to pay for the AI tokens for the other half.

Steve Yegge is an accomplished software engineer who has worked at Amazon and Google. Steve wrote Gastown and is someone who is not only successful at producing code but also thinks a lot about how we produce more software.

How capable and powerful are the new models? Can they really do a lot of the software work that we do today? Steve thinks so, but to be fair, he’s got a lot of experience and can architect and design software well, which means he can also guide AI LLMs and agents to write more code. He also thinks the latest models, like Opus 4.6, are way more capable than most people believe.

I also caught this post on X about a paper predicting that AI might cause economic collapse as fewer knowledge workers are used in various tasks. This might happen faster than we can absorb the workers laid off in the name of AI back into the economy in other positions. It’s a scary thought.

The positive side of this, at least for me, is that so many organizations move slowly, and so many people aren’t extremely competent software engineers, that we’ll get a lot of bad software written by non-technical people that doesn’t scale. We’ll have more database (and application) performance issues, and that will slow the use of YOLO vibe coding.

Plus, plenty of companies just aren’t implementing AI or looking to pay for lots of tokens. They’ll move slower, and the world will change, but not anywhere near the pace that Steve or others think it will. We’ll see, but let me know what you think.

For a more positive spin, I’ve been reading Reshuffle, which is a little less depressing about the future.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.


Upgrading SQL Server Containers on the Laptop

I don’t have SQL Server installed on my laptop. In an effort to keep things clean and smooth in case I need to rebuild things, I’ve gone with containers. I can easily copy a folder with all my docker compose files and data to another machine and be up and running.

One other benefit is upgrades. This post looks at the process of upgrading/patching SQL Server on my laptop.

Getting the Latest Version

A normal process for me in the past (and on my desktop) is to download a patch, run the installer, and then have SQL Server upgraded. Sometimes there’s a reboot involved as well. With a container, things are a little different. Here is the process:

  • Edit the docker-compose.yml
  • Restart the container

That’s it.

Once the image downloads and the container restarts, I have a new version. If I want to go back and test the previous CU/patch, I change my docker-compose file back. Most of the time, the database version hasn’t changed with a CU or GDR, so I can upgrade and downgrade easily.

Here’s how this works in practice.

There were a number of SQL Server patches released recently, including a few for SQL Server 2022, so I opened my docker-compose file to see what I was running.

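My actual file has a bit more in it, but a minimal sketch of a compose file like mine looks something like this (the container name, SA password, port, and volume path are illustrative; the image tag is the part that matters, and in a file like this it sits on line 5):

services:
  sql2022:
    container_name: sql2022
    hostname: sql2022
    image: mcr.microsoft.com/mssql/server:2022-CU18-ubuntu-22.04
    environment:
      - ACCEPT_EULA=Y
      - MSSQL_SA_PASSWORD=Str0ng!Passw0rd   # illustrative password
    ports:
      - "1433:1433"
    volumes:
      - ./data:/var/opt/mssql   # data lives outside the container, so it survives upgrades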

The latest version of SQL 2022 is CU24 + the April 2026 GDR (I’ve been remiss here in patching this machine). If I want to patch this, I update the image tag on line 5. There is a page on the Microsoft Artifact Registry that lists the tags for each CU, and it’s an easy Google search away.

I change from this:

image: mcr.microsoft.com/mssql/server:2022-CU18-ubuntu-22.04

to this:

image: mcr.microsoft.com/mssql/server:2022-CU24-GDR1-ubuntu-22.04

Now, when I restart my container, I’ll see the new image downloading. Since images are built in layers, this isn’t the complete SQL Server image, but just the layers that changed from the previous image, so the download size and time are smaller.

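If you manage the container from a terminal rather than Docker Desktop, the restart is just a couple of commands, run from the folder holding the compose file:

docker compose pull    # download the new image layers
docker compose up -d   # recreate the container on the new image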

As soon as this finishes, the container starts and SQL Server is patched.

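One way to confirm the new build is to query @@VERSION from inside the container. A sketch, using the illustrative container name and password from the compose file above (note that on older images, sqlcmd lives in /opt/mssql-tools/bin instead):

docker exec -it sql2022 /opt/mssql-tools18/bin/sqlcmd -S localhost -U sa -P 'Str0ng!Passw0rd' -C -Q "SELECT @@VERSION"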

Now I need to check the other SQL compose files I have for 2019 and 2025 and update those versions.

Video Walkthrough

Below is a video showing me doing this process.


Spring Connections and Learning at PASS On Tour

It’s just a few weeks until the spring PASS On Tour Chicago 2026. This is a smaller version of the PASS Summit, which comes to Seattle in November and has been running each fall for 26 years. Many other data professionals and I have found immeasurable value in the time we’ve spent with others at these events.

Connect. Share. Learn.

That’s the motto that PASS adopted many years ago, and I love it. We get to network with each other, talk about our experiences, challenges, and struggles, and hopefully find a way past them. We are stronger as a team, and the synergies that come from people working together allow us to accomplish more than we ever would alone.

The PASS Summit has been an event at which this has happened, and I’ve been very lucky to attend most of them. I’ve learned lots of tips and tricks in sessions, but much more by talking with others before or after the presentations. I’ve been inspired by others and warned about walking down paths where people have struggled. All of the knowledge shared with me has helped grow my career across the last 26 years.

In 2025 there were three tour events, and I was able to attend all of them. They were small, but I still had ample opportunities to connect with others, learn from them, share with them, and get out to a few dinners with friends who didn’t plan on coming to Seattle but made it to one of the regional events.

This year there is just one tour stop in the US: May 7-8 in Chicago, and I get to go again. There’s a fantastic lineup of speakers and some great pre-cons, all at a relatively low price compared to many other data events. Chicago is easy to get to, and it’s not too expensive to spend a few days in one of America’s classic cities.

I hope you’ll join me in Chicago. It’s short notice, but there are still chances to join, and I’ve got a discount code if you ping me 😉

Steve Jones


Questions and Answers from Running a Local LLM

I had a few random questions from my Running a Local LLM on Your Laptop session at the Houston AI-lytics 2026 event last week, so this post looks at a few of those questions and my answers.

Note: This stuff is changing rapidly, and there aren’t a lot of definitive answers. Much of what you should look for is guidance and rational reasons for leaning in one direction or another.

Questions below:

  • Do we need an NPU? (Or what do I think of NPUs)
  • How do we audit or test an AI LLM and know what is happening?
  • In which situations would you run a local model?
  • Which Model is Best?

Do we need an NPU? (Or what do I think of NPUs)

You don’t need an NPU to run a local LLM, but one helps with efficiency. An NPU is a Neural Processing Unit, a specialized processor designed to run the math behind AI applications more efficiently than a general-purpose CPU. This could be training a model or running LLM workloads.

I think an NPU is a great idea for efficiency. We already know AI applications use a lot of compute and power. Just look at all the concerns over power/water and investments being made in new data centers for AI. Being more efficient helps.

Just like a GPU helps with graphics and makes your laptop more efficient, an NPU will help, but it’s not required.

How do we audit or test an AI LLM and know what is happening?

First, LLMs aren’t deterministic, so they might not return the same thing every time. It’s hard to test a non-deterministic thing because testing is built on asserting that if a is passed in, b is returned. If I pass in a and sometimes get b, sometimes c, and rarely f, this is hard to test.

I have no idea how to rigorously test a model for behavior in this case. What you can do is run experiments: if the results are useful more often than not, keep using the model. If a model gets you useful results faster, it’s a better model. If it’s slower, more expensive, or less useful, it’s a worse one.
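As a rough sketch of what that experimenting can look like, you can fire the same prompt at a model several times and eyeball how much the answers vary. A hypothetical example against a local model using the Ollama CLI (this assumes Ollama is installed and the model is already pulled; the model name is illustrative):

# Ask the same question five times and count the distinct answer lines
for i in 1 2 3 4 5; do
  echo "Return only the capital of France." | ollama run llama3
done | sort | uniq -c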

Auditing is looking at what happened, which means reaching into the processing of these GPT-type tools. There are some tools to help (AuditLLM), but I can’t speak to whether these are a) worth the effort, b) effective, or c) junk. I’m still learning here, too.

In which situations would you run a local model?

This is a hard one, but there are a few situations in which I’d seriously consider a local model (or at least a private, dedicated one in a service like Amazon Bedrock, Azure AI, or Google Vertex).

First, when I’m worried about costs and want to control them. While the vendors give you some limits and throttles, it can get expensive. If I want a known, controllable spend, I might run a local model in some service because I can allocate capacity and know what is available, what it will cost, and who will be using it. Perhaps the cloud vendors will give us more controls and ensure we aren’t on “shared” systems, but any efficient use of hardware to do this will be for their benefit, not mine.

Second, when I’m really concerned about data security. Most companies might promise they won’t use your data and will delete sessions, but they might not, and they might make mistakes. Might they accidentally use my data, or hand it over in response to some legal subpoena? If I’m outside the US or really worried, I’d run local models.

Third, if I want to ensure that I have complete control over the training of the model or the prompts, I might use a local model where I know there aren’t any system prompts being injected into my context.
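If you want to feel out these trade-offs for yourself, trying a fully local model is cheap these days. A minimal sketch with the Ollama CLI (the model choice is illustrative):

ollama pull phi3    # download a small model once
ollama run phi3 "What's the difference between a CU and a GDR for SQL Server?"    # runs entirely on your own hardware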

Which Model is Best?

Yes.

There’s no good answer here. If you look at the list of models on Hugging Face, for example, there are lots and lots of models. None of us has time to test many of them, or even a small fraction. I think you have to depend on the community to help you decide whether any of these models is a better fit for your situation.

Think about what you want a model to do, what things are important to your problem space, and then look for a model that people think works well and does the type of things you want to do. Similar to how you interview a person for certain types of work, think about that for a model.

The nice thing outside of the large LLMs is that you can use smaller models for specific situations if you find you want to provide a capability widely in your organization. For example, I could see interpretation and linting of best practices in code handled by a smaller model that uses less compute and is trained, or fine-tuned, for your particular situation (and saves money).
