Password Guidance

I remember working at a large organization with a team of other IT Operations staffers. We rotated this one job every month amongst a few people, each taking turns, where we’d lose a day to update all the privileged passwords for our servers. This was before Managed Service Accounts and the cloud, when we were required to change these every 30 days and then store the new ones in an encrypted store.

What struck me when I got stuck with this wasn’t the requirement to change every 30 days; that seemed normal. The thing that bothered me was how manual this was. As a former developer, I wrote some scripts to automated this, pre-PowerShell, and make the task easier on my fellow sys admins. I had scripts to generate a password, change it in AD, then print the pwd to be copied into our secure storage (no API there). This ran in a loop so I didn’t lose a whole day to changing password.

These days, we have lots of alternatives to managing passwords, and in fact, much of modern guidance isn’t to require password changes so often. For systems, use an automated process such as an MSA or GMSA. For users, we’ve mostly given up on changes and are trying to get people to use decently long passwords and disparate ones across services.

Modern guidance from Microsoft says to avoid using common passwords (asdfasdf, password1, ec.) and don’t use the same password in multiple places. MFA is also recommended, but the anti-patterns for success are requiring long, complex passwords or frequent changes. Studies show these lead to less security because users do stupid things.

These days, I would guess many of you managing database systems use some sort of integrated security with AD, Entra, OAuth, etc. However, I know there are still places where passwords are in use. Do you require changes often? Do you change any of your passwords regularly?

Security is always hard, and it’s even harder when the recommendations and rules aren’t consistent or even enforced. I don’t know what to do, but I try to use disparate, long passwords and MFA wherever I can. So far that’s worked well.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

Posted in Editorial | Tagged | 3 Comments

Sparse Columns Can Use More Space: #SQLNewBlogger

I saw this as a question submitted at SQL Server Central, and wasn’t sure it was correct, but when I checked, I was surprised. If you choose to designate columns as sparse, but you have a lot of data, you can use more space.

This post looks at how things are stored and the impact if much of your data isn’t null.

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

Setting Up

Let’s create a couple of tables that are the same, but with sparse columns for one of them.

CREATE TABLE [dbo].[NoSparseColumnTest](
     [ID] [int] NOT NULL,
     [CustomerID] [int] NULL,
     [TrackingDate] [datetime] NULL,
     [SomeFlag] [tinyint] NULL,
     [aNumber] [numeric](38, 4) NULL,
  CONSTRAINT [NoSparseColumnsPK] PRIMARY KEY CLUSTERED 
(
     [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
CREATE TABLE [dbo].[SparseColumnTest](
     [ID] [int] NOT NULL,
     [CustomerID] [int] NULL,
     [TrackingDate] [datetime] SPARSE  NULL,
     [SomeFlag] [tinyint] SPARSE  NULL,
     [aNumber] [numeric](38, 4) SPARSE  NULL,
  CONSTRAINT [SparseColumnPK] PRIMARY KEY CLUSTERED 
(
     [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO

Once we have these, I used claude to help me fill this with data. That’s coming in another post, but I uploaded the script here. This is for the SparseTable Test, where I replaced the select on line 59 with NULL values. In the NoSparse table, this selected random data.

If I select data from the tables and count rows, I see 1,000,000 rows in each. However, the Sparse table is all NULL values in these columns.

2025-09_0228

Checking the Sizes

I can use sp_spaceused to check sizes. The results of running this is below, but here is the summary

  • NoSparse Columns – 42MB and 168KB for the index
  • Sparse Columns – 16MB and 72KB for the index

A good set of savings. Here is the raw data:

2025-09_0229

Adding Sparse Data

I’m going to update 10% of the rows to be not null in different columns. Not 10% total, but a random 10% amongst all the columns. Again, Claude gave me a script to do this and I have run it. This is the SparseTest_UpdateData.sql in the zip file above.

After running this, I have 900,000 nulls i the TRackingDate, as well as the other columns. You can see the counts below, and a sample of data.

2025-09_0230

If we re-run the size comparison, it’s changed. Now I have:

  • NoSparse Columns – 42MB and 168KB for the
    index
  • Sparse Columns – 33.7MB and 88KB for the index

Not bad, and still savings.

Let’s re-run the update script and aim not for 10% updates, but 65% updates. This gets me to only 315k NULL values in the tables, or a little over 70% of my sparse columns are full of data. My sizes now are:

  • NoSparse Columns – 42MB and 168KB for the
    index
  • Sparse Columns – 67MB and 192KB for the index

My sparse columns now use more space than my regular columns.

Beware of using the sparse option unless you truly have sparse data. I didn’t test to find out where the tipping point it, but I’d hope it was less than 50% of data being populated.

SQL New Blogger

This is another post in my series that tries to inspire you to blog. It’s a simple post looking at a concept that not a lot of people might get, but which might trigger a question in an interview. That’s why you blog. You can share knowledge, but you build your brand and get interviewers to ask you questions about your blog.

This post took a little longer, about 30 minutes to write, though the AI made it go quicker to actually generate the data for my tables. There were a few errors, which I’ll document, but pasting in the error got the GenAI to fix things.

This post showed me testing something I was wondering about. In a quick set of tests, I learned that I need to be careful if I use a sparse option. You could showcase this and update in 10% increments (or less) and keep testing sizes until you find when there is a tipping point. Bonus if you use a column from an actual table in your system.

https://learn.microsoft.com/en-us/sql/relational-databases/tables/use-sparse-columns?view=sql-server-ver17

Posted in Blog | Tagged , | 3 Comments

T-SQL Tuesday #190–Mastering a New Technical Skill

It’s time for T-SQL Tuesday again and this time Todd Kleinhans has a great invitation that is near and dear to my heart: mastering a new or existing technical skill. That’s been a lot of what I try to inspire people to do at SQL Server Central.

Make a plan and start learning. And respond to Todd’s invitation and write down your plan and share it. Start a blog, use Linked In, whatever. Spread the word on socials as well.

If you want to host, I’m always looking for hosts for T-SQL Tuesday. Ping me on Twitter/X, BlueSky, or LinkedIn.

Mastering a New Tech Skill

Like Todd, I’m interested in AI and I think it will dramatically change the world in the coming future. I also think it’s a bit of a technical skill that is important to learn. I wrote about this a bit in last month’s post.

How do I work with a GenAI model and improve my technical skills? The easy answer is more and more. I’ve been having more conversations with Claude, usually looking for ways to help me solve a problem or write code, and then ask the GenAI to explain things.

However, I wrote awhile ago about an experiment in helping someone else learn something: Can an AI Help Me Find a Job?. For me, I’ve been looking a bit more at DataBricks, as I hear this from clients all the time. I wanted to gain some skill here, so I decided to ask Claude to help me.

2025-09_0225

I got a good outline of things to do across a few months. I then asked for references and got some:

2025-09_0226

The next stage for me is to start embarking on this journey a few nights a week and learn some things that might help me both in my job, and potentially in a future position if I need one.

My Complete Outline from Claude

Here’s the end result, with links.

Databricks Learning Outline with Resources

Phase 1: Foundations (1-2 weeks)

Understanding the Basics

  • What is Databricks and why it’s used
  • Core concepts: clusters, notebooks, workspaces, and Apache Spark
  • Databricks architecture and modern data stack integration

Key Resources:

Phase 2: Getting Started (2-3 weeks)

Hands-on Basics

  • Navigating workspace interface
  • Creating and managing clusters
  • Working with notebooks
  • Basic data import methods

Key Resources:

Phase 3: Data Analysis Fundamentals (3-4 weeks)

Core Analytics Skills

  • Data exploration and cleaning
  • Working with different data formats
  • Basic SQL and PySpark operations

Key Resources:

Phase 4: Intermediate Techniques (4-5 weeks)

Advanced Analytics

  • Complex transformations and ETL
  • Streaming data and machine learning
  • Performance optimization

Key Resources:

Phase 5: Production and Best Practices (2-3 weeks)

Professional Development

  • Job scheduling and workflows
  • Security and monitoring
  • Integration patterns

Key Resources:

Additional Learning Platforms:
Certification Path:
Community and Support:
Getting Started Steps:
  1. Sign up for Databricks Free Edition: Visit the Databricks Free Edition signup page and pick your preferred signup method Databricks Free Edition | Databricks on AWS
  2. Access free training: Free unlimited access to training content on Databricks Academy provides on-demand learning experiences for all skill levels Databricks Launches Free Edition and Announces $100 Million Investment to Develop the Next Generation of Data and AI Talent – Databricks
  3. Join the community: Connect with other learners and experts
  4. Start with sample datasets: Practice with built-in datasets before using your own data

The great news is that all self-paced training across AI, data engineering, and more is now free for learners Databricks Training & Certification Programs | Databricks, making it easier than ever to get started with Databricks!

Posted in Blog | Tagged , , | 1 Comment

Getting Started with the MSSQL AI Agent in VS Code

Recently I was working in VS Code and I saw a walkthrough for the new Copilot chat features. I decided to give those a try in trying to get some information from my SQL Server instance.

This post walks through a few things I did with this GenAI agent. There is a video walkthrough at the end.

Note: I have copilot access set on VS Code as a part of my employer’s benefit.

This is part of a series of experiments with AI systems.

First Steps

When I start VS Code, I see something like this.

2025-08_0105

One of the Walkthroughts recently mentioned copilot. If I click the “More” at the bottom right, I’ll get this image. You might see something different, but Id’ expect you have a Copilot walkthrough if you can use Copilot. I choose the 4th one down (where the mouse pointer is).

2025-08_0106

This opened a Copilot pane. There were a few items, and you can see on the left in the image below, some have checkmarks. I’d explored these before.

2025-08_0107

If I scroll up, I see the one I wanted to get, which was “chat about your code”. I picked this one. This opened a blade to the left when I clicked the blue “Chat with Copilot” button.

2025-08_0109

I had read there are these @ agents (look up at the right side) and decided to type “@”. I saw a list of things.

2025-08_0110

Lots of places to work, but I choose the @mssql agent, since this is the place I tend to work. In the lower pane, I typed a question.

2025-08_0111

Above this (still in the left blade), I got a response.

2025-08_0112

Below this, I got some text and code explaining how to access a list of databases on various platforms. Not sure why MySQL is first, but I’m assuming this is alphabetical. For SQL Server, I saw this. This is a reasonable answer, with some help on how to execute it.

2025-08_0113

I then decided to connect to my local instance. I have the MSSQL extension, so I clicked that and got a connection.

2025-08_0114

Rerunning that query produced the same response. However, when I opened a query window, I got different results. Note the little database icon on the left, below my prompt, with “Untitled-1” next to it. This is the context, which I also saw added to the lower prompt box, just above where I would enter a prompt.

2025-08_0115

However, this didn’t work.  After a few minutes,  I got this.

2025-08_0116

and this. The LLM is trying, but can’t seem to get a query to run. It did try.

2025-08_0119

I then decide to move on.

Getting Results Back from Questions

This isn’t really the type of thing I’d do, but I decided to try and get some info from a database. The one above isn’t that interesting, so I switched to asking the model some questions. Here’s the first one, where I don’t remember the exact table name, but I ask.

2025-08_0120

It’s queried the database, and there isn’t a player table. However, it continues to look and finds dbo.players.

2025-08_0121

Even better, once it has the answer, it also provides a little more info. Maybe good, maybe bad. This reminds me of talking with a person that gives me more information than I asked for.

2025-08_0122

I try something else. Let’s get some metadata, since I clearly don’t remember what’s in this database.

2025-08_0125

I get a nice response, with some guesses about what information is contained inside these tables.

2025-08_0126

OK, can I query for information. I’ve always been a bit more of a hitter than a pitcher, so I’ll ask a question. This isn’t asking to join specific tables, but get me an answer.

2025-08_0127

It worked, though to be fair, I tabbed over to SSMS and wrote this query in the same time (with SQL Prompt) as the Copilot agent. Cool to see as I’d forgotten Thome and Vlad were up there.

2025-08_0134

While I got the answer, I didn’t get the query. I asked for it and got it, with an apology.

2025-08_0128

I’ll do something else. Who played the longest. Might be a somewhat funny query to write for a quick answer. I’d have to join a few tables and look for a sum.

2025-08_0130

It likely remembered I wanted the query, so that was included, with an explanation. However, it only looked at the batting table.

I asked other questions about fielding and pitching and got those answers (Nolan Ryan, 27 years with fielding stats and pitching stats. So I asked that:

2025-08_0151

Below this, I see the two players who tied, which Copilot noted.

2025-08_0152

The code provided only returned one player. I checked, which is something that you should always do. I asked if I could get better code. I got a few options, and I liked the RANK one, so I tried it and it worked.

2025-08_0180

Slightly annoying, but when I think about this type of conversation with someone else, especially a junior dev, I might have the same results and iterate this way.

At this point I also asked about databases, and I got a result. Maybe I needed a query to run first? Not sure why this works now.

2025-08_0133

Summary

This was an interesting set of things I could get done with this agent in VS Code. It’s not amazing, but it was helpful. I could tackle some light query tasks or db query ones while also handling some other work. In this case I wasn’t in the zone, trying to code or decode a database schema. Instead, I had a few things to try, and I let the agent work while I tabbed over to close some emails and chats.

In a job where I might need to find info from an unfamiliar database, this could be helpful in getting things done, though it’s hard to know if it’s slower for me if I were focused on ths all the time. The agent can find some info without me, but it also failed in a few cases. I started to try and get other things done when I noticed delays in responding and some hung queries.

Learning to use an AI agent to help you is a skill, and it’s one that takes time to develop.

I’ll look at some more practical tasks in the next post.

Video Walkthrough

Here’s a video walkthrough of most of the stuff in this post. It differs slightly as working with LLMs is not deterministic.

Posted in Blog | Tagged , , | 2 Comments