The Modern Algorithm of Chance

These days algorithms rule much of the world, from how supply chains are managed to how vehicles run their engines to the media that many of us watch on the various streaming services. I assume that most of you know that algorithms drive what you see on social media, on YouTube, and even in the search results you get, and that what you see might be different from what I see. There is a constant search for a perfect, or at least very targeted, way of getting you what you want.

Or at least what the algorithm thinks you want. However, is that the best way for algorithms to be designed? It is for the companies that want to profit from your attention, but is this intense personalization better for us?

There is an interesting article on music discovery that focuses on Spotify, since it is one of the largest streaming services. The article talks about the algorithm and how it tries to match selections to our tastes: basically a complex data analysis of our choices, along with metadata that’s been created around data that’s hard to classify. There are attributes assigned to songs, but are these the attributes that make sense? That’s a topic for another day. The result is that Spotify tends to recommend more of what we already listen to, which has also driven artists to change how they produce songs, since the algorithm matters.

This seems like a similar challenge to what I’ve seen with the written word. A long time ago many of us consumed words (with less choice) in newspapers, books, and other physical media. However, we often ran into random things that were different because of our physical paths in life. We might encounter books in a shop or library and be attracted to a cover for some random reason. We might pick up an unexpected work lying next to one we were interested in and discover something new.

The way we look at books, or anything, changes when we browse and randomly wander the world. These days, we have less of that, with algorithms in electronic systems that guide us further on a path we’re walking, not allowing for chance encounters, or even wildly different thoughts because we stumbled on something. Even in our social media, this doesn’t often happen. I’d hope that we might encounter a recommendation from another we wouldn’t otherwise see, but the promotion of certain feeds and the glut of viral re-sharing often ensures that we don’t see many random things. Instead, most of us see the same thing that many others do.

Those of us who have studied computer science know random things are hard to create in computer systems. Building algorithms that embrace randomness isn’t something many of us focus on, instead trying for matches that reinforce or duplicate something our clients already want/use/see/etc. That has helped create many businesses in the digital world, but I’m not sure that those businesses are always good for the world.
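As a rough illustration of what embracing randomness could look like, here is a minimal Python sketch that mixes a few random picks into an otherwise personalized list. Everything in it is an assumption made for the example: the function name `recommend_with_chance`, the `explore_rate` parameter, and the idea that the personalized ranking arrives as a simple list. It is not how Spotify or any other service works.

```python
import random


def recommend_with_chance(ranked, catalog, k=5, explore_rate=0.2, rng=None):
    """Return up to k picks: mostly top-ranked items, with random ones mixed in.

    ranked       -- items in the order a personalized algorithm would show them
    catalog      -- everything available, including items never ranked for us
    explore_rate -- hypothetical knob: chance that a slot goes to a random item
    """
    rng = rng or random.Random()
    ranked_set = set(ranked)
    # Items the personalized ranking would never surface on its own.
    unseen = [item for item in catalog if item not in ranked_set]
    rng.shuffle(unseen)

    top = iter(ranked)
    picks = []
    while len(picks) < k:
        if unseen and rng.random() < explore_rate:
            # A chance encounter: something outside our usual path.
            picks.append(unseen.pop())
            continue
        best = next(top, None)
        if best is not None:
            picks.append(best)
        elif unseen:
            # Ranked list exhausted; fall back to whatever is left.
            picks.append(unseen.pop())
        else:
            break
    return picks
```

With `explore_rate=0` this behaves like a pure "more of the same" recommender. Raising it trades some relevance for the kind of accidental discovery a bookshop shelf used to provide.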

I don’t have a good solution for random chance, other than talking with others, especially those who live different lives from you, and embracing the way they view the world. Hopefully that leads to a book, movie, or other chance encounter that you might not otherwise have.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.

Posted in Editorial

Denver Dev Day Oct 2024

This Friday is the second Denver Dev Day of 2024. If you’re in the Denver area and you want to come network with fellow developers (and a few data pros) and learn something, come down to the Microsoft office in the Tech Center and join us.

Register now and come Friday. Tickets are FREE.

I’ll be there talking about Continuous Integration in Azure DevOps with Local Agents running your code, but the schedule has a lot of interesting sessions. Quite a few data ones as well.

If you’ve never been, this is a fun event. You’ll get lunch and a chance to win prizes, as well as interact with lots of fellow Denver peers, so join me Friday.

Posted in Blog

Monday Monitor Tips: Knowing Your RPO

A customer was asking recently about the RPO for their estate, and I showed them a few things from the Estate tab in Redgate Monitor. This post covers a few highlights.

This is part of a series of posts on Redgate Monitor. Click to see the other posts.

Knowing Your RPO

It’s not often that a business user asks me about the Recovery Point Objective (RPO), though I’ve seen plenty of DBAs ask business people what they want as an RPO. If you don’t know what an RPO is, read this, and then think of it as the amount of data I could potentially lose.

Not that I will, but it’s possible.

In Redgate Monitor, the Estate tab shows you the RPO for all databases, as a graph. I’m showing this one from the demo site at monitor.red-gate.com. As you can see, I have most of my databases with an RPO of 1 hour or less. The vast majority are under 12 hours.

[Screenshot 2024-09_0096: RPO graph on the Estate tab]

Note: This graph doesn’t include databases that have never been backed up.

This is based on backups, not on clustering or AGs or anything else, but it does let me determine whether I think my estate is healthy. I wish I could alert on this or monitor it easily, but I can at least see if there are problems.
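To make the idea concrete, here is a minimal Python sketch of how a backup-based exposure figure could be worked out. It is not how Redgate Monitor calculates anything. The function names, the database names in the usage note, and the assumption that you already have the finish time of each database's last backup are all hypothetical.

```python
from datetime import datetime, timedelta


def current_exposure(last_backup_times, now):
    """Map each database to the time elapsed since its last backup finished.

    last_backup_times -- dict of database name -> datetime, or None if the
                         database has never been backed up (assumed input)
    """
    return {
        db: now - finished
        for db, finished in last_backup_times.items()
        # Never-backed-up databases are skipped here, as in the graph's note.
        if finished is not None
    }


def worst_first(exposure):
    """Sort (database, elapsed) pairs so the largest potential loss is on top."""
    return sorted(exposure.items(), key=lambda pair: pair[1], reverse=True)
```

For example, with `now = datetime(2024, 10, 1, 12, 0)` and a made-up `Reports` database last backed up three days earlier, `worst_first(current_exposure(...))` would list `Reports` first with a `timedelta` of three days.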

If I scroll down, I can sort the table of all databases by the worst RPO. I see this, which shows the SSC server at the top.

[Screenshot 2024-09_0102: table of databases sorted by worst RPO]

I’m not worried, as this is an AG setup and the one database is backed up regularly. Obviously we had a hiccup as 3 days is crazy, but I’m confident someone fixed this as I see a lower RPO with the current backups that have been completed.

If I filter by these servers, things look better.

[Screenshot 2024-09_0103: RPO view filtered by these servers]

At an estate level, there may be systems with a large RPO. Data warehouses or reporting systems might not be backed up, as they can be reloaded in an emergency.

This is a good way for you to keep a high-level view of your estate. In general, you will get a feel for what the estate looks like, and you can drill into individual systems or databases if you’re wondering. I wish I could filter by RPO, but maybe that’s coming one day.

Summary

This section of the Estate tab gives you a high-level view of backups and how you’ve configured them. Ultimately your backup schedule is less important than the RPO it produces, and this lets you keep an eye on what the RPO is for everything.

Redgate Monitor is a world-class monitoring solution for your database estate. Download a trial today and see how it can help you manage your estate more efficiently.

Posted in Blog

The Role of Databases in the Era of AI

I’m hosting a webinar tomorrow with this same title: The Role of Databases in the Era of AI. Click the link to register and you’ll get some other perspectives from Microsoft and Rie Merritt.

However, I think this is an interesting topic and decided to try to synthesize some thoughts into an editorial today, partly to prep for tomorrow and partly because I’m fascinated by AI and how this technology will be used in the future.

The title says the role of databases, not data professionals. You might worry an AI is going to take your job as a DBA or developer, or you might think there is no way an AI can do your job. I tend to think the latter, but only if you are above average in your role and you add value by understanding your employer’s business. In those cases, the AI will help you (as a co-pilot, not a pilot) and allow you to get more work done or get work done faster. You choose. If you churn out average or below-average work, or cut/paste from Stack Overflow or SQL Server Central or anywhere on the Internet, then yes, you should worry.

Databases store lots of information, and extracting it is hard. I see no shortage of poor data models, no shortage of overloaded data in fields, de-normalized structures, repeated information, and more. Humans jump through lots of hoops to build reports or screens or other interfaces to present to humans looking for answers. We may join data in Excel with values in a database or vice versa. I’m sure many of you have plenty of stories on how you get data to move between some data store and a text format. I’m sure you also have no shortage of frustrations from your efforts.

AIs will get good at this. At the Small Data 2024 conference, I saw many people working on using AI without a semantic layer, which I think is possible but will likely fail. We store data in too many crazy ways, and companies will need to make it easy for customers to create a semantic layer that describes what data is stored in each place. They’ll also get the AIs to help, not only with this but also with creating a way to simulate Master Data Management without requiring every application to use Redgate Software, Inc. as a name. We need to ensure Redgate, Red-gate, Redgate Software, and RG stored in different fields can all be joined as if they were the same value. Which they are.

Fuzzy matching is the domain where AIs can shine, as the models can do this quicker than humans, without getting annoyed and with fewer mistakes. AIs can adapt with our feedback as we find ways to train the models better and overload the AI prompts with semantics that help translate the (extremely) poor data models in our databases, data lakes, spreadsheets, and even PDF documents. Companies that require a semantic layer can ease the process of building one with AI assistance so that customers can quickly start to query their wide array of data sources.
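To show the shape of the problem, here is a small Python sketch that treats the Redgate variants as the same value. To be clear, this is classic string similarity from the standard library's `difflib`, not an AI model, and the suffix list, the alias table, and the `0.85` threshold are assumptions picked for this example. Note that the alias table is hand-written: an abbreviation like RG can't be recovered from string similarity alone, which is the sort of gap a semantic layer or a model would need to fill.

```python
import re
from difflib import SequenceMatcher

# Hypothetical noise words and abbreviations, chosen for this example only.
SUFFIXES = {"software", "inc", "ltd", "llc"}
ALIASES = {"rg": "redgate"}


def normalize(name):
    """Lowercase, join hyphenated parts, drop punctuation and company suffixes."""
    words = re.findall(r"[a-z0-9]+", name.lower().replace("-", ""))
    key = "".join(word for word in words if word not in SUFFIXES)
    return ALIASES.get(key, key)


def same_company(a, b, threshold=0.85):
    """Guess whether two stored names refer to the same company."""
    ratio = SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold
```

With these assumptions, `same_company("Redgate Software, Inc.", "Red-gate")` and `same_company("RG", "Redgate")` both return `True`, while `same_company("Redgate", "Microsoft")` returns `False`.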

The best use I’ve seen for AIs is as an easy-to-use, context-aware, powerful search engine. When we learn how to tune these for specific sets of data, such as all the datastores and spreadsheets in a company, we’ll start to see some amazing gains in information analysis. I don’t know that humans will analyze any better than they do today, but the process of getting the information to analyze will be easier. I think AIs will also help in the analysis phase, but that’s going to require more co-work between humans and AIs to improve the quality of analysis.

There are other things, but I see databases as incredible stores of information that AIs will make easy to access. I’m also positive AIs will be used to more easily update information in databases and assist in easily moving data from one format to another or one location to another.

Tune in to the webinar tomorrow to see what Microsoft thinks and ask any questions you have.

Steve Jones

Posted in Editorial