Your Biggest Data Model Complaints

I’ve been working with databases for a long time and there are no shortage of things I’ve seen other people do that I don’t like. Sometimes I shake my head a little. Sometimes I might groan inwardly (hopefully not aloud), and sometimes I might make an effort to convince someone else to do something differently.

Sometimes I’m really annoyed (or angry) and don’t even know what to do.

I know that most people are trying to just get work done. They might rush through something and not do a good job, perhaps because of oversight, or perhaps they are naïve about the effects of their work. Maybe they have ingrained habits and are unwilling to change. Maybe there’s another reason (let me know if there is one).

However, no matter the reason, it can be very frustrating to work on poor database designs. There might be other things that bother you, but today I’m focused on the data model. Do you see poor naming of objects? Are there problems with the way they structure their entities? A lack of indexes?

What are your biggest complaints about the structures in your databases?

While I am looking forward to your stories, I want you to be professional. We’ve all made mistakes, and there is likely some (most?) code we’ve written that we wish we could redo. Don’t embarrass anyone or any organization, but let us know which types of problems or anti-patterns are your biggest complaints. Bonus points if you can do it in a humorous story.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

This entry was posted in Editorial and tagged database design. Bookmark the permalink.

6 Responses to Your Biggest Data Model Complaints

paschott says:

February 28, 2025 at 8:36 am

Biggest one – probably lack of indexes. That’s just an easy one to cause all sorts of issues. After that – data types and normalization usually compete for problems.

Lately, JSONify all the things seems to be winning out. We get so many data bits in JSON format and just … dump them into SQL Server. That’s all well and good, but at some point we have to get it out or filter on it and then we’re left parsing huge JSON strings to find a match somewhere in the string. I know that JSON indexes are being considered and should make it into Azure/SQL 2025, but until then – it’s a pain point to design around them.

Even worse – I regularly see developers reaching for JSON because “we could have different data”. For example – Address data may or may not have an additional Address3 or maybe a flag or category for a particular use case. But instead of just defining columns and leaving them empty/defaulted, the entire address is stored in JSON so we have to parse it out later. It’s kind of funny, but at the same time – it reminds me of the old “SSIS Hammer” that we used to have. Solving every problem by using SSIS. I will not confirm nor deny having ever done that. 🙂

LikeLiked by 1 person
- way0utwest says:
  
  February 28, 2025 at 8:53 am
  
  I find either a lack of indexes, or an over-abundance of them, sometimes from the same developer(s). Both are challenges.
  
  JSON/XML/et al are always a challenge. I get that deserializing (or serializing) is a pain for devs, but there are lots of tools for this. I think JSON works well for a lot of things, and I get that sometimes they don’t know what data is there, but also that smashing that into a document also makes it hard to find and aggregate. I get less worried about the address3/address4 issue and more worried when they want to stuff price, tax, discount, and more into a document without deserializing this into a format that can be manipulated well. Or crush all messages/replies into a document when we might often need to find who sent messages across many documents.
  
  LikeLiked by 1 person
Louis Davidson says:

February 28, 2025 at 8:46 am

As much as I would like to think that I (and people like Hugo Kornelis) have helped by speaking and writing about database design for many years, there seems to be two factors that just keep getting in the way:1. The tyranny of time

2. The fact that RDBMS software keeps getting faster at dealing with poor patterns

Normalized database are all typically going to work better, when backed up with a good data warehouse/dimensional design, but normalizing takes a few things that people don’t really have. Plenty of time, and adequate understanding of the details of the system they are creating.

LikeLiked by 1 person
- way0utwest says:
  
  February 28, 2025 at 8:55 am
  
  I’m not sure it takes a lot of time. I think more we need some knowledge and some patterns for putting entities together quickly. That and the desire to refactor quickly when we realize we’ve started to dig a hole in some way. I think many of these small changes to the data model in a database, and in the C#/Java model aren’t that hard, we just don’t do them.
  
  LikeLike
- m60freeman says:
  
  February 28, 2025 at 1:20 pm
  
  Lack of normalization is the problem I see most frequently.
  
  I am working on a system now where the same NVARCHAR column exists in 40+ tables where there are about a dozen values used. It is a type of status/state indicator. It should obviously be normalized, but it is involved in what I think is an even worse than that.
  
  The column is referenced in dozens of stored procedures (and probably in many places in the C# code) only using IN() predicates in queries. There are about a dozen permutations of which 3 to 5 of these strings are in any instance of these IN() predicates. There is no way to tell what the intended meaning is of any of these groupings.
  
  Fixing it now would require so much time and effort that it has been “kicked down the road” for 7 years of continuous application expansion, during which time ever more code has been written that should eventually be refactored.
  
  Indexing may be my biggest performance issue, but lack of normalization makes any needed indexes much bigger, and the second issue I mentioned means no equality predicates that could make the indexes more effective.
  
  This is a Code First Entity Framework Core project that had over 100 developers but no DBA involved until after the first few production releases.
  
  LikeLike
- way0utwest says:
  
  February 28, 2025 at 1:23 pm
  
  good point on normalization. I think so few people think about the impact of this, especially when we have updates for some data. I see this as a big problem in NoSQL systems, where we repeat some of this data in every document, and at times when an update is needed, it’s computationally very hard. Both impacts to storage and compute.
  
  LikeLiked by 1 person

Comments are closed.

M	T	W	T	F	S	S
					1	2
3	4	5	6	7	8	9
10	11	12	13	14	15	16
17	18	19	20	21	22	23
24	25	26	27	28