Voice of the DBA

Adding Manual Relationships Between Tables in the TDM Subsetter

Posted on February 12, 2025 by way0utwest

I wrote about getting the Redgate Test Data Manager set up in 10 minutes before, and a follow up post on using your own backup. One of the things I didn’t show from my own database was that it had no FKs, so the subsetting didn’t quite work as I wanted.

A previous post showed how add starting tables for the subsetter to look at, however that didn’t get me a good data set for testing. This post continues looking at the subsetter by adding manual relationships to our configuration.

This is part of a series of posts on TDM. Check out the tag for other posts.

Declaring a Relationship in the Options File

In my previous post, I’d picked a starting table and had reduced the dbo.players table from 16564 to 1800. However, I only had player information. If I query my subset database, I see there is a player, but I have no batting statistics for this player.

This is because my table has no declared FKs in it. If I check the dbo.batting table, I can see only a PK.

Let’s fix this.

Declaring Manual FK Relationships

In the options file documentation, there is a section that notes manual relationships can be declared with a key called “manualRelationships”. If I copy/paste the example section into my options file, I’ll see this:

I don’t have a SourceTest table, so let me edit things. I’ll set a relationship between dbo.players.playerID and dbo.batting.playerID. This gives me the following in my options file.

Before I run my subsetter, here are the row counts by table.

I’ll re-run this subset command, which includes my option file at the end.

rgsubset run --database-engine=sqlserver --source-connection-string="server=localhost;database=BB_FullRestore;Trusted_Connection=yes;TrustServerCertificate=yes" --target-connection-string="server=localhost;database=BB_Subset;Trusted_Connection=yes;TrustServerCertificate=yes" --target-database-write-mode Overwrite --options-file E:\Documents\git\TDM-Demos\rgsubset-options-bb.json

When I do that, I know see these row counts. Note I now have batting rows.

My player query won’t work, so I still need to declare another relationship with the dbo.teams table. That is shown below:

I can re-run the same command above, and then I see this set of rowcounts (original on left, subset on right).

There is teams data, and if I re-run my queries from the top, I can see stats now.

Now I have a dataset that I can perform development work with in terms of players, teams, and batting.

I can also add more relationships as needed, for example, I’ll add this section to include pitching, batting post, and fielding. Here’s my complete options file:

{
  "jsonSchemaVersion": 1,
  "startingTables": [
    { 
      "table":
      {
        "schema": "dbo",
        "name": "players"
      },
      "filterClause": "birthState = 'CA'"
    }
  ],
  "manualRelationships": [
    {
      "sourceTable": 
        { 
          "schema": "dbo", 
          "name": "players"
        },
      "sourceColumns": [ "playerID" ],
      "targetTable": 
        { 
          "schema": "dbo", 
          "name": "batting" 
        },
      "targetColumns": [ "playerID" ]
     },
     {
        "sourceTable": 
          { 
            "schema": "dbo",
 
            "name": "batting"
          },
        "sourceColumns": [ "teamID", "yearID", "lgID" ],
        "targetTable": 
          { 
            "schema": "dbo", 
            "name": "teams" 
          },
        "targetColumns": [ "teamID", "yearID", "lgID" ]
       },
       {
          "sourceTable": 
            { 
              "schema": "dbo", 
              "name": "players"
            },
          "sourceColumns": [ "playerID" ],
          "targetTable": 
            { 
              "schema": "dbo", 
              "name": "battingpost" 
            },
          "targetColumns": [ "playerID"]
         },
         {
            "sourceTable": 
              { 
                "schema": "dbo", 
                "name": "players"
              },
            "sourceColumns": [ "playerID" ],
            "targetTable": 
              { 
                "schema": "dbo", 
                "name": "pitching" 
              },
            "targetColumns": [ "playerID"]
           },
           {
              "sourceTable": 
                { 
                  "schema": "dbo", 
                  "name": "players"
                },
              "sourceColumns": [ "playerID" ],
              "targetTable": 
                { 
                  "schema": "dbo",
 
                  "name": "fielding"
                },
              "targetColumns": [ "playerID"]
             }
  ]
}

After re-running the subsetter, I have these row counts. Note there are rows in all the tables defined in the options file.

I can keep adding in more tables as needed to ensure the subsetter can walk down the data relationships I need in my database to produce a useable dev/test dataset that’s smaller than production.

TDM can help your devs build better software and with the subsetter, this can create lots of agility to ensure the data you need to accurately build this software is available.

Give TDM a try today from the repo and a trial, or contact one of our reps and get moving with help from our sales engineers.

Video Walkthrough

Check out a video of my demoing this below:

Posted in Blog | Tagged Redgate, subset, syndicated, TDM | Comments Off

Creating a GitHub Actions Self-Hosted Runner

Posted on February 10, 2025 by way0utwest

I had to demo the Flyway Autopilot system recently and created a GitHub Actions runner as a part of that. This post documents how this went.

First, if you go to the settings in a repo and click the Actions area, you see a Runners item. Click that. Notice I have no runners.

In the upper right corner, I can click a button to create one.

This gives me the instructions to get a new one. Note, these are PowerShell commands, and the first command doesn’t quite work right. Still, this is what I need.

I opened a CMD window and stared running these. Note, I need to repeat the change directory.

Now start PowerShell as the next commands are PoSh ones. When I copy the next command, it starts downloading a zip file. As of this writing, this is a 600-ish MB file.

Once this is done, you can run the next commands, which unzip and configure this. For the config, I just hit enter as I don’t have multiple groups or tags, and I leave it named as my machine. I also don’t bother to run this as a service.

The last command runs the runner agent.

If I go back to the Runner screen in Settings, I see I have an idle agent set up.

And that’s it. If I pick one of my workflows in the Actions tab, I can run it and I’ll see the job started in my runner folder. Here are my actions with the Run workflow button on the right.

If I click this, I see the job start in the CLI.

If I get back to the Actions, I’ll see things in progress. As you can see, I was slow here.

That’s about it. Now I can run local automations in my repo that connect to things like local databases, which can be handy.

Video Walkthrough

I’ve got a video of this process if you want to watch it.

Posted in Blog | Tagged DevOps, GitHub, syndicated | 2 Comments

Extended Event Comfort

Posted on February 10, 2025 by way0utwest

I saw an interesting thread recently in the SQL Server Community Slack where someone posted about extended events (XE). They were asking about whether XE would have a problem with a situation. The problem wasn’t so interesting, but a quote from one of the responders was. The quote was:

The best time to have learned Extended Events was ten years ago. The second best time is today.

I love that, and I tend to agree. If you need to trace what is happening inside your SQL Server, you need to learn how to capture information with Extended Events. That’s the best way to dig into the details of how queries affect your system.

It’s also hard. I know that whenever I need to use it, which is rare, I have to dig through some articles and docs to understand what thing I need to do. Even having some scripts hasn’t helped because it’s a sufficiently complex system that unless I use it regularly, I forget how all the filters, targets, events, etc. work.

On one hand, I think it’s amazing, and on the other, it’s too hard to use. Even when I try the Extended Events profiler, it’s so different from Profiler that I find myself getting frustrated at times trying to dig through the information.

I am curious how many of you think XE is easy to configure and if you use it often. What are the places it works well? For those of you that don’t use XE or haven’t learned, why not? Do you not have to trace what’s happening with queries in some detail? Or do you have another way that you dive deep into your system? Or do you not have the need?

If you do want to learn more, we have a short Stairway Series on Extended Events to help you get started, as well as a few other articles. If you’re an expert, we’d love a few more on using XE in specific situations.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

Posted in Editorial | Tagged administration, extended events | 4 Comments

No End to Programmers

Posted on February 8, 2025 by way0utwest

There’s an interesting piece at O’Reilly this week titled The End of Programming as We Know It. That’s actually a good title, but it doesn’t mean the end to programmers, developers, software engineers, of whatever people call themselves.

The piece looks back at history, starting with physically connecting circuits to program and moving to switch flippers, cards, compiled languages, and on to the web and mobile systems. Each of this brought more and more programmers into the industry because there’s no end to the software needs of the world and it’s hard to write good code.

I like that the web was seen as the end of programming as anyone could easily build a minimal application to share things. Even frameworks like WordPress let novices create applications that do all sorts of useful things. However, we still need programmers. Look around at how many WordPress consultants are willing to help you.

That’s because there’s a difference between configuring something off a shelf (or from a digital store) and actually having a working application for your situation. There’s no end to what people want to build, and really, they need someone to build it for them. Or at least modify, improve, or re-implement the proof of concept they created.

The world of AI LLMs, and chat-oriented programming, is no different. Anyone can ask for an app and get a reasonable prototype. However, I don’t think this means less programmers. While more people will build more PoCs or prototypes, that also means more people will need a professional to clean up their system and make it work, scale, and perform well. And probably help secure it better.

Programmers (developers, software engineers, etc.) will scale themselves with AI tech, as they’ll get the AI to do some scaffolding and initial work that they clean up. I’m sure the very best will build RAG AI systems that generate the type of code they want with a little training and input. I could see millions of AI assistants that help write basis code outlines, write tests, check for standards and code smells, and help the human shift left, catching small, simple, silly mistakes.

I think we’ll have more people producing software in the future, but I also think that those who know their industry well will be in more demand. They’ll recognize issues in AI-generated code, they’ll guide AIs better, and they’ll communicate more clearly with AIs. They’ll get their pick of the best jobs, and they might be better compensated.

Things they likely do today with their fellow humans.

Steve Jones

Posted in Editorial | Tagged career, Database Weekly | Comments Off

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30

Voice of the DBA

Adding Manual Relationships Between Tables in the TDM Subsetter

Declaring a Relationship in the Options File

Declaring Manual FK Relationships

Video Walkthrough

Creating a GitHub Actions Self-Hosted Runner

Video Walkthrough

Extended Event Comfort

No End to Programmers

Search this blog

2026 Redgate Summit – New York City

18 Year MVP Awardee

Tags

Search this blog

Steve’s Tweets

Older Posts

Meta

Recent Posts

Archives

Copyright Steve Jones 2018

Copyright 2016

Meta