T-SQL Tuesday #177: Keeping track of database code


This month’s invitation is from Mala Mahadevan, who has hosted five times. This topic is near and dear to my heart, as I use source control most days and think it’s important for database code.

I’ll explain why below, but I’m glad to see this as a topic. We get a wide variety of technical topics proposed, but not many about software engineering as a discipline, which is what version control is part of: not building features, but better managing your process.

If you want to host a T-SQL Tuesday, ping me.

Capturing Database Code

There are many ways to capture code, but I work for Redgate Software, so I use Flyway. Since I work there, I get a paid version, but I work with customers all the time and see a variety of approaches. If I didn’t have Flyway, I’d likely use something like SQL Compare to script out database code, since that’s easy, or SMO through SSMS.

As for how I work with code, I use Git to store the code. Git is ubiquitous and I rarely find customers without Git. Sometimes the data teams aren’t using it or don’t know how, but that’s why I’ve written some Git articles on getting started.

I also try to work in branches, with a protected main branch. This means no one can commit code directly to main; instead, they need to commit it to another branch and use a PR (pull request) to move the code into main. I do this with a lot of customers, helping them understand how to use version control to manage their code.
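As a minimal sketch of that branch-and-PR workflow, here is what the local side looks like (the repo path, file names, and branch name are all illustrative; the actual protection of main is a branch protection rule configured in GitHub, not something Git enforces locally):

```shell
# A minimal sketch of the branch-and-PR workflow (paths and names are illustrative)
set -e
rm -rf /tmp/demo-db-repo
mkdir -p /tmp/demo-db-repo && cd /tmp/demo-db-repo
git init -q -b main
git config user.email "demo@example.com"
git config user.name "Demo User"

# Capture some database code on main
echo "CREATE TABLE dbo.Customers (Id int NOT NULL);" > Customers.sql
git add . && git commit -q -m "Initial schema capture"

# Day-to-day work happens on a branch, never directly on main
git switch -q -c feature/add-email-column
echo "ALTER TABLE dbo.Customers ADD Email nvarchar(256) NULL;" > alter_customers.sql
git add . && git commit -q -m "Add Email column"

# Push the branch and open a PR in GitHub; a branch protection rule on main
# blocks direct commits, so the merge only happens through the PR.
git switch -q main
git log --oneline
```

The key habit is that main only moves forward through reviewed pull requests, which gives you a reviewed, auditable history of every schema change.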

For my git work, I primarily work in GitHub in public repos. I’m at https://github.com/way0utwest, where I keep a lot of sample projects for things I work on with customers to demonstrate how to use Redgate tools, or just manage code better.

There are lots of ways to capture code, format it, and deploy it. However, you should use Git and learn to manage your code within a team. I’d also suggest you use Flyway to deploy the code. There is an OSS version, and because it supports many platforms, if your company adopts PostgreSQL or Databricks, you can still use a similar process to deploy code. Learn it and use it.
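As a sketch of what a Flyway project can look like, here is a minimal migrations folder using Flyway’s versioned naming convention (the file and object names are my own illustrations, and the flyway command in the comment assumes the CLI is installed with real connection details supplied):

```shell
# Minimal Flyway-style project layout (names are illustrative)
rm -rf /tmp/flyway-demo
mkdir -p /tmp/flyway-demo/migrations
cd /tmp/flyway-demo

# Flyway discovers versioned migrations by the V<version>__<description>.sql convention
cat > migrations/V1__create_customers.sql <<'SQL'
CREATE TABLE Customers (Id int NOT NULL PRIMARY KEY);
SQL

cat > migrations/V2__add_email.sql <<'SQL'
ALTER TABLE Customers ADD Email nvarchar(256) NULL;
SQL

# With the Flyway CLI installed, a deploy is one command (connection details omitted):
#   flyway -locations=filesystem:migrations -url=jdbc:... migrate
ls migrations
```

Because the migrations are just files in a folder, they version cleanly in Git, and the same folder deploys to SQL Server, PostgreSQL, or any other platform Flyway supports.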

But first, get code into a Git repo.

Posted in Blog | 7 Comments

A Checklist for Learning

One of the challenges many people have is focusing their learning efforts along some path. The best way to move forward is with steady effort that guides you through steps to build knowledge or skill. However, with so much information out there on the Internet, how do you decide where to focus your efforts?

Lots of people learn in a haphazard way, and the world is full of those people, many of whom never develop strong skills. That might be fine if you are a hobbyist guitarist or piano player, but it’s not the best way to approach your career.

Choosing what you want to learn is hard, but if you were trying to become a better Database Administrator, what would you think of this list: the ultimate checklist for Jr DBAs? It’s a long list of topics in various areas, such as basic concepts, indexing, backups, security, objects, etc.

I think it’s not bad, but it leaves you a lot of work to do on your own. You have to find places that explain these items, which can be a chore. If you find a link, how do you know it’s good? You can’t judge because, well, you don’t know. You’re learning here.

I wish there were more guidance in posts like this. Not necessarily more information in the post itself, but links that might help someone know where to look for good information. If you know these topics, what do you think is a good source of information? Maybe a post on your own blog, or an article that taught you something. I’m sure I don’t always provide background links, but I try to ensure I include links to help someone learn more.

Inside a company, I might include internal links to our policies or documentation. As an example, maybe we write down, as a standard, how often we think statistics should be updated and why, or what types of backup schedules are appropriate. This would be an easy way to help someone learn why we approach our jobs in certain ways, which is a very specific type of learning we need. Other, more general learning is important as well and would drive these decisions.

Would you make a list like this one for yourself? If you’re a junior DBA, you might find this helpful, though I’d suggest you ask others for input on where you learn about these concepts. For people looking to learn something else, such as Fabric or Snowflake, is this the type of list that would be helpful, assuming there are some links on where to learn more? Let us know today.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

Posted in Editorial | Comments Off on A Checklist for Learning

Grouping by Object Type in SQL Compare

A customer recently asked about grouping objects by type to see all the differences between two databases for one set of objects, like all stored procedures. This post shows how that works and how it changes the comparison view.

This is part of a series of posts on SQL Compare.

I have two databases, Compare_1_Source and Compare_2_Destination. I made a number of changes to the source database, and when I run SQL Compare between these, the default view shows me the objects that differ, as well as those that exist in one database and not the other. You can see this below.

[Screenshot: SQL Compare default view, grouped by type of difference]

This is the way most of us want to make changes, by deciding from this short list what changes to deploy. If you look closely, you can see there are 115 objects that are identical, so it’s nice to be able to see what has changed.

However, I have had a few people ask to see all the stored procs that have changed, so they can decide what needs to move. They aren’t ready for table changes.

In the upper right corner, above the destination database, there is a drop-down for grouping options. I can use the default (type of difference), switch to type of object, or have no grouping.

[Screenshot: the grouping options drop-down above the destination database]

If I choose object, I see this grouping, with the stored procedures expanded to show my differences. In the middle, near the checkboxes for object selection, I also see a count of how many objects are selected and how many have changes. However, I can’t tell the type of change.

[Screenshot: comparison grouped by object type, with stored procedures expanded]

If I modify one procedure and delete another, I see something slightly different. Now the object name appears on either side of the checkbox, so I can infer a change if the name is on both sides, or a deletion if there is an “x” next to the name.

[Screenshot: a modified procedure and a deleted procedure in the object-type grouping]

My default view of tables is shown below. Note that my table changes are mixed within non-changed tables.

[Screenshot: default table view, with changed and unchanged tables mixed together]

However, I can click on the Last modified column and re-sort the data. If I sort descending, my table changes appear at the top.

[Screenshot: tables sorted descending by the Last modified column, changes at the top]

Toggling these settings allows me to see different views. If I just want changes without groups or differences, then I can set no groups and see this (I’ve sorted by modified date).

[Screenshot: ungrouped view sorted by modified date]

I’ve had a request to hide the unchanged objects, but that’s not something we support now, nor am I sure we will change it. You can submit your own ideas on UserVoice and get some votes from friends.

SQL Compare is an amazing tool that millions of users have enjoyed for 25 years. If you’ve never tried it, give it an eval today and see what you think.

Posted in Blog | Comments Off on Grouping by Object Type in SQL Compare

Send Metrics Not Logs

This is part of a series on observability, a concept taking hold in modern software engineering.

One of the interesting things I saw in an engineering presentation on observability from Chick-fil-A was that they are sometimes bandwidth-constrained at remote sites. In an early version of their platform, they sent logs back to HQ, and the logs consumed all the available bandwidth, leaving them unable to process credit card transactions.

While most of us don’t deal with lots of remote offices sending data back to a central data warehouse, we do often work in distributed environments, and we may send data to/from a cloud or even employees’ remote offices. Or maybe we send a lot of data between components. Bandwidth is very good in many parts of the world, but it isn’t infinite.

In the presentation, they talked about a tool called Vector that can ingest lots of data; slice, dice, aggregate, or sample it; and then send the results to a sink. This works like many other ETL tools, with a source and a sink, along with various transforms that operate on the data.
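As an illustration of that source-transform-sink shape, a Vector configuration might look something like this. This is my own sketch, not their setup: the file paths, component names, and option values are all assumptions, so check the Vector documentation for the exact settings before using anything like it.

```toml
# Read application logs at the edge (source)
[sources.app_logs]
type = "file"
include = ["/var/log/app/*.log"]

# Convert log events into a counter metric (transform),
# so only the aggregate leaves the site, not the raw logs
[transforms.log_counts]
type = "log_to_metric"
inputs = ["app_logs"]

[[transforms.log_counts.metrics]]
type = "counter"
field = "message"
name = "log_events_total"

# Expose the metrics for collection (sink)
[sinks.metrics_out]
type = "prometheus_exporter"
inputs = ["log_counts"]
address = "0.0.0.0:9598"
```

The point of this shape is that the raw log lines never cross the wire; only the small, pre-aggregated counter does.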

It’s an interesting philosophy to try to send back metrics that might be useful to developers or Operations staff in understanding the performance of their system. By only sending metrics, the load on downstream systems is reduced. This also allows us to store less data and read metrics sooner, rather than storing all the data and processing it each time someone needs a metric.

The flip side of this approach is that the consumers of the metrics need to ensure they are getting useful and actionable information. Determining what is needed will be like any development project: something built, iterated on, re-tested, and repeated. This might even be an ongoing part of building software, as new features and logging are added to your system.

In general, I prefer to have more data over less, but the volumes of logging and instrumentation data have grown dramatically. Some systems are producing more log data than actual data on a daily basis. Like audit data, we likely need to reduce and limit the amount of data stored long-term. However, we want to keep the important data that we find useful.

I am looking forward to trying out Vector and seeing what’s possible. Having good CLI-based tools that can work with data is becoming more important all the time, especially as more of us move to DevOps flows, coding our systems’ operation in text, storing it in version control, and deploying on demand.

If you’ve used Vector, let us know what you think, and if you prefer another tool, share why today.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

Posted in Editorial | Comments Off on Send Metrics Not Logs