Don’t Let Corner Cases Drive Your Design

If you graph computer/query cost against the size of data, you can get four quadrants:

  1. small data, small compute (most CRUD app queries)
  2. small data, big compute (complex BI queries for this quarter, most reporting)
  3. big data, small compute (logs, audit data)
  4. big data, big compute (complex BI queries across all our data)

If you examine the costs here, 1 is the cheapest, with 2 and 3 having a similar cost. Number 4 is expensive, and it’s why we often have big boxes running our database server software. However, where is most of our work? The majority is in quadrant 1, with 2 getting the second most action. 3 might rarely exist, as does 4, but we often design for 4. We have to as we don’t want phone calls, ever. What we want is to provision a system large enough that we don’t hear many complaints about performance. On premises, many of us have over-provisioned systems to handle the peak load to avoid phone calls.

Can we handle the peaks or the really important things that someone thinks are important? Everyone thinks their workload is important, and it is. To them. However, there are plenty of cases where someone could think about designing for specific types of workloads, rather than just aiming for quadrant 4. I’ve got an image of different types of workloads that I grabbed from the Small Data 2025 conference. For example, if I am working with things like Time Series data or streaming analytics, I might not need huge compute. I might be storing a lot of data, and I need space, but the compute is low. The analysis of that data, however, might be compute intensive.

This is a reason why we might separate analytic systems out as they often are in quadrants 2 and 4, and we might want serverless or scale up/down systems to handle the rare cases, and get a real cost for them. I found it particularly interesting that the Bronze tier might be where we have big data and big compute, but once we’ve moved to Silver or Gold, we might have lower compute and data requirements. This makes sense as Bronze is more staging, but it is a good reason why we might aim for a Gold layer in our organization and only keep that data for the long term; it’s more cost-effective.

Often, for simplicity, we build a bigger system for all types of queries. In other words, we are letting corner cases drive our design. That might be required, but it might not be. In this area of cost concerns, especially in the cloud, designing systems with appropriate resource usage is something that might override the analyst’s desire for queries across all data running as quickly as order lookups in an OLTP system. This might be even more true if we can predict some patterns in our workloads during system design. We can’t scale up or down instantly, but in a lot of places, I wish I had been able to scale financial or reporting systems up for a few days as we close out the period and scale them down for the rest of the month.

When building a system, think about the practical nature of your requirements and assign a cost to them. Let users know what workload you’re building a system to handle and set expectations on performance and cost. If you do that, you can let others decide when we handle corner cases and when we don’t. That’s often a much easier conversation when we have cost numbers to help customers understand the implications of their request.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

Posted in Editorial | Tagged | Leave a comment

Reverse Engineering a Physical Model Diagram

I recently wrote about a logical diagram with Redgate Data Modeler. That was interesting, but creating all the objects is a pain. I decided to try creating a physical diagram from an existing database. This post looks at the experience.

This is part of a series of posts on Redgate Data Modeler.

Getting Started

As with the logical model, I right click and choose New document.

Then I get a list of options. I’ll choose the middle one, Physical Model. This is what I mostly need to work with an existing system. Since I don’t often move models from one platform to the other, a physical diagram can work well for me.

2025-11_0143

Once I click next, I get a list of platforms, and I need to enter a name. I’ve done that. Notice that when I select SQL Server, I can alter the drop down for different versions of SQL Server.

2025-11_0145

Below this, I see a source box, and a file picker.

2025-11_0146

I need a file.

I’ll go to SSMS and right click my database. I’ll find the Generate Scripts task, as shown below.

2025-11_0147

I go through the wizard and mostly pick defaults.

2025-11_0183

I do script to one file, not separate ones. I choose all objects.

2025-11_0184

Back to Data Modeler with the file. I select this and upload it and see…an error.

2025-11_0185

When I open the script, I see lots of non-database stuff.

2025-11_0186

I’ll delete this stuff and also a bit at the bottom. I rename this to Westwind.sql, which is what I am importing. Then I was able to import the file and when I clicked “Start modeling”, I ended up with this:

2025-11_0215

Redgate Data Modeler (RDM) has detected quite a few relationships. You can see them in the diagram for the explicit ones defined. If I click one, such as the Order to Order Details item, I can see on the right that this is for OrderID in both tables..

2025-11_0216

One thing that threw me slightly was two Employees tables. I was wondering what was going on here, but when I clicked the lower one and looked through properties, I can see that this is for a different schema. I wish this were more visible, but it did get detected.

2025-11_0217

Cleaning up the Design

One of the things I like is that I can set areas in the model. This lets me organize things and even convey information to developers and others that work with the database.

At the top left, I have a series of icons. The last one on the right is the New Area icon. I’ll click this.

2025-11_0218

I can now draw an “area” somewhere. Notice the right when I do this. I have properties for this area.

2025-11_0219

I’ll add a name and change the color, which gives me way to easily see this area in my diagram. I’ll also drag in the Auditing.Employees table, as this is what I want people to know.

2025-11_0220

I can select this and move it (look at the video) and then I see this as a part of my diagram, but clearly separate from other parts. Developers can learn the light red is the auditing schema, which is separate from the rest of the diagram.

2025-11_0221

I can add other areas, not just for schemas, but for separating out parts of my database. Often I have a series of entities that I care about, or want to cluster together, and having areas with colors lets me separate these out.

2025-11_0222

I might run out of colors in a large database, but I could use lots of pale blues or grays to separate out areas, each of which has a name. That can be helpful to reduce the complexity of the model.

Adding a New Table

I can also add a new table if I want. One of the icons at the top is for new tables.

2025-11_0224

When I click in the model, a new table appears. The right opens up the properties with a default name (table14). I can change this and add the columns I need.

2025-11_0225

Once I’ve finished, I can add a relationship. There is an icon for this.

2025-11_0227

I’ll click this and then click and drag from Products to Discount. This gives me a new column in Discount by default, called Products_ProductID. Not a bad name, but in general I want a cleaner name. I’ll edit the relationship to use ProductID in both tables and delete the other column from the model.

2025-11_0226

Modeling with an image is a good way to start to visualize how things are setup and where you might be normalizing, or denormalizing data. I also want to know just how many things are in here and what is related to what.

Moving Forward

There are more things that can be done with tools like that to help ensure our databases are well designed and perform well. I’ve submitted feedback to the team and asked for some enhancements.

One of the things I might want to do from here is update my dev db from the model. I’ll show that in a future post.

For now, I like the idea of getting a model started from my SQL script, though clearly I need a clean script. I’ll do more testing with other scripts for both forward and reverse engineering.

Give Redgate Data Modeler a try and see if it helps you and your team get a handle on your database.

Posted in Blog | Tagged , , , | 1 Comment

What’s Your Theme Music?

A few weeks ago, I was at the Small Data SF 2025 Conference in San Francisco. I attended the inaugural event last year and decided to go back again. It’s a great chance to hear people thinking about data and its impact on the world in a different way, recognizing that building lager and larger systems isn’t always possible. Or a good idea. We might find that smaller systems fit well, especially smaller datasets, which can both serve our purposes and create agility. The manifesto of the conference says that “We champion the power of Small Data and smart AI, believing that less is truly more.” There’s a bit more, but that’s the idea.

The format for the conference is a little different, with 3-5 talks in a row, all on one stage, each about 25 minutes long. These are talks with or without slides, but no live demos, just speaking and expressing a point of view. What I found fun was that each person picked their own music to play as they walked onto stage (or ran/danced in the case of Glauber from Turso). It was a bit of fun, with the DJ letting the music play as the person made their way to the front and were welcomed by the audience. I heard rock, metal, hip hop, and more.

It’s Friday during the week of the PASS Data Community Summit, and I had the chance to deliver part of the keynote on Wednesday. I’ve done this before, and no one has every asked me if I wanted a pick a piece of music, but it got me thinking. What would I pick?

For a fun Friday during the holiday season, think about if you were going to give a presentation. Maybe to your team, maybe other groups in the company, or (for some of you) on a conference stage. What music would you choose to accompany your walk into the bright lights? Imagine you get between 10 and 30 seconds.

Have some fun, and remember this is a professional setting. My first thought was something from the Notorious BIG, but I realized I’d have to walk fast as most of his lyrics wouldn’t be appropriate. I’m not sure, but I lean towards one of these: one, two, three, or four.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

Posted in Editorial | Tagged | 3 Comments

Done is Better than Good

Mary Spender is a musician in the UK who I follow and hope to see live one day. She works hard producing content about music, that business, and, of course, songs. Recently she had a little essay on Instagram where talked about creative time and focus. In it she referenced Elizabeth Gilbert saying “done is better than good.”

My initial reaction was “that’s right.”

Then I thought about software, and poor queries impacting database performance, and thought, “No, that’s not right.” My next reaction was to think maybe it’s “done is better than great”. I do see plenty of engineers trying to build great software. Code that would impress their peers or their former professors. Or maybe their future self.

Then I thought, no, Elizabeth is right. If things don’t get done, then what’s the point?

At the same time, I think that “done” and “good” (or great) aren’t mutually exclusive. We can get things done and make them great, which is something to strive for. We can also get good things done.

Sometimes.

Sometimes we don’t have that luxury of time, for various reasons. If I had to make a trade, I push for good (or quality) as much as possible if the delay isn’t substantial. If it is, then done is likely the choice I’d make. I do try to return and refactor, improve, etc. to raise the quality over time, but I recognize that sometimes getting something done is important. Certainly, on the ranch, I need to fix things to get by, with the aim of doing a better job later. I have mixed success at returning to improve a patch with a better fix later, but so far, that’s worked well. I’d say the same thing has happened while building software or managing systems, with enough success to be comfortable with my choices.

What about you? What choices have you had to make about being done over delivering something that’s good? Or maybe great.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

Posted in Editorial | Tagged , | 2 Comments