Editorial Republish: Evergreen SQL Server

I’ve been traveling a bit lately, but this week is a little crazy: two trips to two places, so I’m more out of touch than usual. You get a re-run of Evergreen SQL Server, which is a fun one to re-read. The idea of continuously upgraded database software is more common these days than it was in 2019.


SQL Saturday NYC 2025 Slides: Building an API with DAB

I forgot to do this Saturday, and was traveling all day yesterday, but I’m finally getting this done. Thanks to everyone who attended my talk on building an API with the Data API Builder.

Slides: BuildingAPI.pptx

Demo code repo: https://github.com/way0utwest/DAB-Experiments

Note, the repo is a little disorganized, but I’ve tried to add some guidance as READMEs in the Getting Started area. If you have questions, let me know.


T-SQL Tuesday #186–Agent Jobs

It’s that time of the month again, when the T-SQL Tuesday blog party takes place. I manage this site, and am looking for hosts all the time. This month I managed to convince Andy Levy to host, and I’m grateful for his participation.

His invitation asks about SQL Agent jobs and how they are managed. It’s focused, but he gives a lot of choices for how to examine this subsystem in SQL Server.

Note: if you work in Oracle or PostgreSQL or anything else, how do you schedule work in an automated fashion? Cron? Something else? You can still write a post.

If you want to host, ping me and I’ll get you a month.

Designing Jobs for an Enterprise

I used to work in a fairly large enterprise (5,000+ people, 500+ production SQL instances) with a small staff. It was 2-3 of us to manage all these systems, as well as respond to questions/queries/issues with dev/test systems. As a result, we depended heavily on SQL Agent.

We decided on a few principles which helped us manage jobs, with a (slow) refactoring of the existing jobs people had randomly created with no standards. A few of the things we did are listed below. This isn’t exhaustive, but these are the main things I remember.

Name schedules clearly

Scheduling gets crazy. As a result, we would try to include the days and times something ran in the schedule name. For days, we’d use SMTWRFSa. If something ran every day, that was in the name. If it ran on weekdays, it had MTWRF in the name. Thursday-only was R.

We’d include a time, such as 0200 or 1830, in there as well. If there were just one or two run times, we’d list those. If it ran more often, we used “every hour” or “every 15 minutes”.

This wasn’t perfect, but it made most schedules clear.
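As a sketch of how schedules following this convention might be created (the schedule names and the specific frequency values here are hypothetical examples, not our actual jobs), msdb’s sp_add_schedule lets the name encode the days and times:

```sql
USE msdb;
GO
-- Weekdays at 2:00 AM: the name "MTWRF - 0200" encodes days and start time.
EXEC dbo.sp_add_schedule
    @schedule_name          = N'MTWRF - 0200',
    @freq_type              = 8,      -- weekly
    @freq_interval          = 62,     -- Mon(2)+Tue(4)+Wed(8)+Thu(16)+Fri(32)
    @freq_recurrence_factor = 1,
    @active_start_time      = 20000;  -- HHMMSS, i.e. 02:00:00
GO
-- Every day, every 15 minutes: the name spells out the repeat interval.
EXEC dbo.sp_add_schedule
    @schedule_name       = N'SMTWRFSa - every 15 minutes',
    @freq_type           = 4,   -- daily
    @freq_interval       = 1,
    @freq_subday_type    = 4,   -- units of minutes
    @freq_subday_interval = 15;
```

With names like these, a glance at the schedule list in SSMS tells you when things run without opening each one.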

Job Names and Descriptions

We tried to make job names clear by starting with a noun (Backup, Maintenance, Sales), though this was a little overloaded: it was a DBA-oriented noun for most work that DBAs might run, and a department name for business-level things.

Job Steps

I tried desperately to get away from code embedded in the job step and use stored procedures instead. This helped us tune and monitor the things that ran, and it kept code in code places.

For DBA stuff, we had a DBA database on each instance for our procs. We’d put our code in there (Ola Hallengren’s maintenance procs, our own custom maintenance routines, checks, ETL, etc.). This way we could more easily run server-level work.

For business-level jobs or things related to a particular database, we wanted a proc in that database and had the job step call it. This also often let us keep a logging table alongside the proc where we could track progress.
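A minimal sketch of that pattern, using msdb’s sp_add_jobstep (the job, database, and proc names here are hypothetical), shows how thin the job step becomes when the logic lives in a stored procedure:

```sql
USE msdb;
GO
-- The step body is just a call to a proc in the DBA database,
-- rather than a page of inline T-SQL pasted into the job.
EXEC dbo.sp_add_jobstep
    @job_name      = N'Backup - User Databases',      -- hypothetical job
    @step_name     = N'Run backup proc',
    @subsystem     = N'TSQL',
    @database_name = N'DBADatabase',                  -- hypothetical DBA db
    @command       = N'EXEC dbo.BackupUserDatabases;';
```

Tuning, fixing, or reviewing the logic then means editing one proc, not hunting through job definitions.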

Alerts/Operators

Luckily we had a monitoring solution that notified us when jobs failed, so we didn’t use Agent alerts or operators. However, we did have an auditing report that queried the msdb job history, noted job failures, stored this data in a table (rolling 30 days), and used it to produce a daily report we archived in a folder.

This was for our ISO compliance, and the auditors loved it. We would store a daily report and then add a daily note of any actions we took. That way we knew what we did and had a record.
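A sketch of the kind of failure query such a report might run against the msdb job history tables (run_status = 0 marks a failed step; msdb.dbo.agent_datetime is an undocumented but long-standing helper that combines Agent’s integer date and time columns):

```sql
-- Failed job steps in the last 30 days, suitable for loading
-- into a rolling audit table for a daily compliance report.
SELECT  j.name      AS job_name,
        h.step_name,
        msdb.dbo.agent_datetime(h.run_date, h.run_time) AS run_datetime,
        h.message
FROM    msdb.dbo.sysjobhistory AS h
JOIN    msdb.dbo.sysjobs       AS j
        ON j.job_id = h.job_id
WHERE   h.run_status = 0   -- 0 = failed
  AND   msdb.dbo.agent_datetime(h.run_date, h.run_time)
            >= DATEADD(DAY, -30, GETDATE());
```

Persisting these rows nightly gives you the 30-day record auditors ask for without depending on msdb’s own history retention settings.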

Summary

We ignored most of the other options (categories, etc.). The goal was to keep things very simple and streamlined. We had a standard job we deployed to most servers as part of a build process.

We also drove as much activity in code off queries as possible, and only used a table to log exceptions. We might have a table that stored the “FinanceDW” db name as an exception. The backup process would get a list of all dbs, remove those in the exception table, and then run as normal.
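The exception-table pattern described above can be sketched like this, assuming a hypothetical dbo.BackupExceptions table in the DBA database:

```sql
-- Build the backup list from live metadata, minus logged exceptions.
-- dbo.BackupExceptions is a hypothetical one-column exception table.
SELECT  d.name
FROM    sys.databases AS d
WHERE   d.name <> N'tempdb'
  AND   NOT EXISTS (SELECT 1
                    FROM   dbo.BackupExceptions AS e
                    WHERE  e.database_name = d.name);

-- Excluding a database is then a single insert, no job edits needed:
-- INSERT dbo.BackupExceptions (database_name) VALUES (N'FinanceDW');
```

Because the list comes from sys.databases at run time, new databases are picked up automatically; only the exceptions need maintenance.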

K.I.S.S. worked very well for us.


Deleting a Database

Who among us has deleted a production database?

I’d hope it’s very few of you who have done this in your career. I’m sure a few of you have deleted (or truncated or updated all rows for) a table in production. I’ve done that a few times, but fortunately, I’ve been able to recover the data quickly. I had this happen in SQL 6.5 and was grateful I could start a single-table restore before my phone rang.

Here’s another question: which of you has had a storage admin delete or remove some remote storage and cause you database problems? Has anyone had that happen in their environment? I haven’t had this in production, but I have had this happen to test systems, and I was very irate with the storage people when it did. After that, I’m sure they were very cautious about changing any configuration for database servers. I’m sure that also contributed to my struggles in getting more space promptly, so I’m not sure I came out ahead in that situation.

A hospital system had issues after this happened to them: engineers deleted critical storage that connected to a database system. Fortunately, no services have stopped, and no patients are missing services, at least as far as we know. Everything must be moving slower, though, and that might mean staff are spending more time on “downtime procedures”, i.e. paper, than focusing on patients. Knowing a few medical professionals, this means they’re more stressed and working harder to be sure patients aren’t affected. Sucks for them, and I doubt the hospital compensates them for an engineer’s mistake.

I haven’t heard about this happening in a long time, and I’m surprised by that. Almost all storage these days for server systems is remote, especially in the cloud, and it would be easy to click the wrong button or select the wrong disk when re-configuring a system and remove critical storage. Maybe we’ve gotten better at popping up warnings that slow people down and prevent mistakes.

Or maybe we don’t delete disks and only add them to database systems 😉

In any case, I hope they can recover things quickly and easily. If you’ve seen this, let me know. If you haven’t, here’s a reminder that this could happen. You should be sure your backups are running AND you can perform a test restore.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.
