The Worst Data Breach

I noticed this week that Australia passed a law that requires companies to hand over user information, even if it’s encrypted. Quite a few articles point out that complying with the law might require backdoors to be built into communication systems. Companies are required to provide plain text user communications if they can, or to build tools that allow this if they don’t have the capability. The proponents of the bill argue this is necessary for criminal prosecution.

Perhaps they are right, but if this capability is required, it means companies will have backdoors built into their products that allow them to decrypt things you might have expected to remain encrypted. That’s disconcerting to me, not because Apple, Google, or someone else might read my communications, but because no company has really proven it can protect all the data it stores.

Can you imagine how many malicious actors might spend their efforts trying to find those backdoor encryption keys? What if there aren’t backdoor keys, but companies instead build some sort of key logger into their software that copies data before it’s encrypted? Can you imagine how problematic it might be to secure that data?

I’m also concerned because this would mean there could be a few keys that grant access to encrypted data, something like the “master keys” for door locks. In this case, the loss of a single key might mean problems for huge numbers of people. The other option would be lots of backdoor keys, potentially a different one for each customer or device, in which case we have a large data set that I’m sure will get leaked. At that point, how likely is it that we’ll be able to issue new keys to large numbers of people?

I sympathize with law enforcement. In some ways, their jobs are much harder. In others, however, I think they have many more tools, and weakening encryption doesn’t seem to be necessary. Many of us have a need to secure data, to protect it from unauthorized access. At a time when security is proving to be a challenge and record numbers of data breaches are occurring, do we really want tech companies to start building products with less security? I don’t.

Steve Jones

Automation at Work

I do worry about the future of work for large sections of the population. When I read pieces like this one in the Atlantic on automation, two things come to mind. First, we are mindlessly sticking with 19th century models of work in many cases. Second, there are opportunities to leverage computing power to dramatically reduce our need for humans in many tasks.

Far too often I’ve seen processes and procedures that exist strictly because of historical precedent. We developed some way of working, likely out of expediency. We needed something done, so we found a way for a human to do it. We continue to do it that way, often because of a factory mentality. We don’t trust workers, who come and go, to handle the process correctly, so we specify a way of doing things that we know works. Even if it doesn’t work well.

What’s amazing to me is that many of us still do this in technology jobs. I find lots of DBAs and infrastructure people that still do an amazing amount of manual work to check logs, jobs, backups, etc. They avoid automation for a variety of reasons, but often because of laziness and fear. They don’t want to think about and put time into changing a process, avoiding both the coding and the asking for permission. They also fear for their jobs, as shown in the article. Automate too much and maybe the company will replace you with a less skilled, far cheaper worker.

Perhaps I’m an outlier, but this has never been something I’ve seen in my career. When I automate things and free up time at work, I don’t sit and browse Reddit or play chess, as a few profiles from the article show. Instead, I’m more like Gary. I look for, and find, ways to improve other aspects of the company. I help others. I provide more “value” for my salary. This has worked well, even in companies that had a culture of “just do your job.” There are always a few managers that want thinkers and doers, not just people that mindlessly move through each day.

Automation is coming, more and more every day. As I look at the evolution of the data platform from Microsoft, the growth and capabilities of cloud services, and even the amazing third party products that free up our time, I know that the bar is constantly rising for the skills we require. What we might have expected only senior level people to do in 1999, we expect juniors to know now. Not everywhere, and certainly plenty of older management is stuck with their historical views of “just do this job,” but times are changing if you seek a new employer.

I want to see more scripting, more PowerShell, more Bash scripts, more DevOps pipelines, more systems doing tedious work. That’s because many of our scripts and flows are still rudimentary. They’re basic, expecting the happy path to work, with limited testing and error handling. I’d like to automate myself out of work, and then find ways to script more robust processes, with checks that verify my code is working, alerts when it’s not, and responses that are more intelligent than a simple IF..THEN statement.
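As a minimal sketch of the kind of check I mean (the one-day threshold and the focus on full backups are assumptions for illustration), a script can verify the backup history itself and raise an alert, instead of waiting for a human to scan a list:

-- Flag any database without a recent full backup, rather than
-- relying on someone to eyeball the backup history each morning
DECLARE @Unprotected INT;

SELECT @Unprotected = COUNT(*)
FROM sys.databases d
WHERE d.name <> 'tempdb'
  AND NOT EXISTS (SELECT 1
                  FROM msdb.dbo.backupset b
                  WHERE b.database_name = d.name
                    AND b.type = 'D' -- full backup
                    AND b.backup_finish_date > DATEADD(DAY, -1, GETDATE()));

IF @Unprotected > 0
    RAISERROR('%d database(s) have no full backup in the last day.', 16, 1, @Unprotected);

Run something like that from an Agent job that alerts on failure, and the tedious morning check disappears.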

We have lots of room to improve in how we structure systems and code, whether in application development or infrastructure management. Hopefully we’ll all start to embrace more automation, and look for new opportunities rather than being fearful of change.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (4.8MB) podcast or subscribe to the feed at iTunes and Libsyn.

MoSQL

Google is doing more SQL, or at least shifting towards relational SQL databases as a way of storing data. Some of their engineers, at least, see this as a better way to store data for many problems. Since I’m a relational database advocate, I found this interesting.

When Google first started to publish information on BigTable and other new ways of dealing with large amounts of data, I felt that these weren’t solutions I’d use or problems that many people had. The idea of MapReduce is interesting and certainly applicable to the problem Google had of maintaining a global database of sites, but that’s not a problem I’ve ever encountered. Instead, most of the data problems I’ve struggled with are still better addressed in a relational system.

Google feels the same way, and in a blog post, they talk about choosing strong consistency where possible. The post promotes their relational SQL database (Cloud Spanner), but there is a good discussion of why consistency matters and where popular NoSQL models, like eventual consistency, cause a lot of problems, both for developers and clients.

This quote caught my eye, and I may use it with developers that look to avoid RDBMS systems: “Put another way, data stores that provide transactions and consistency across the entire dataset by default lead to fewer bugs, fewer headaches and easier-to-maintain application code.” I think that’s true, as many of the advantages promoted for non-RDBMS systems quietly place a greater burden on the application developer than they realize. A burden that grows over time as the techniques used accumulate technical debt.
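The classic illustration, with a hypothetical dbo.Accounts table: in a transactional store the database guarantees the invariant, while in an eventually consistent store the application must cope with the in-between states on its own.

-- Both updates persist together or not at all; the database enforces
-- the invariant that money is moved, never created or lost
BEGIN TRANSACTION;
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
COMMIT TRANSACTION;
-- Under eventual consistency, a reader could see the debit without
-- the credit, and application code has to detect and reconcile that.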

I think more SQL based systems are the way to go for many problem domains. Google agrees, and if you read more about the Cloud Spanner team, you might agree as well. You’ll also find them to be incredibly smart people that think deeply about the problems that are both raised and solved by relational systems.

So go ahead and promote more SQL Server databases. Google thinks they’re good for many applications, and that’s good enough for most developers.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.1MB) podcast or subscribe to the feed at iTunes and Libsyn.

Lax Security is Harmful for Employment

Manure rolls downhill. Since I live on a horse ranch with some slight hills, I can attest this to be true. At least, it’s true for horses and for short distances. Manure isn’t very friction-free and often stops moving quickly. The same likely isn’t true for bull droppings, but I haven’t done much testing in that area.

Most of us would agree that those who are negligent in their jobs, especially with regard to security, ought to be punished. In some cases, this should lead to termination, though I think many of us technical people would prefer that the managers who don’t budget resources for security be the ones punished.

I mentioned that manure rolls downhill, and this article on the aftereffects of data breaches bears that out. Not only were there record numbers of incidents last year, but the typical cost is nearly $4 million. That average likely covers some very expensive breaches and lots of relatively inexpensive ones, but even the low cost ones probably feel expensive to the small companies that experience them. In the lists of breaches I’ve seen, lots of smaller firms (retail, law, etc.) are included, and tens of thousands of dollars might be a lot for them.

One thing that article points out is that an increasing number of C-level executives are being terminated after breaches. I’d like to think that’s good, but I’m somewhat pessimistic that the next hire will find ways to improve security. There are lots of impediments to fundamental change in most organizations, so I suspect this trend leads more to short term employment for CIOs and others, and likely to higher salary demands because of the risk of security issues inside the company. That further pressures budgets, which is another impediment to better security.

Note that it’s not just IT execs being let go, but non-IT staff as well. Maybe I’ll be wrong and this will make a difference. Of course, IT staff are let go too, often blamed for issues. There will always be some security issues, but I urge those of you with privileged accounts and access to sensitive data to be careful with your credentials and to work to improve security when you see issues. Get written documentation when someone refuses to allow security changes, in addition to noting your requests. This might not stop a data breach, but perhaps it will give you a better chance of not being blamed for security incidents.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (4.0MB) podcast or subscribe to the feed at iTunes and Libsyn.

Basic Sequences–#SQLNewBlogger

Another post for me that is simple and hopefully serves as an example for people trying to get started blogging as #SQLNewBloggers.

I haven’t used sequences much in my work, but I ran into a question recently on how they work, so I decided to play with them a bit.

Sequences are an object in SQL Server, much like a table or function. They belong to a schema and return numeric values. In fact, the default type is bigint, which I think is both good and interesting. Since the values will implicitly convert to an int or another numeric type when they fit, that’s a sensible default.

The sequence is created like this:

CREATE SEQUENCE dbo.SingleIncrement
  AS INT
  START WITH 1
  INCREMENT BY 1;
GO

These behave similarly to identity values; in fact, if I make 5 calls to this object, I’ll get the numbers 1-5 returned. Here I’ve made one call:

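SELECT NEXT VALUE FOR dbo.SingleIncrement;
-- the first call returns 1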

This is interesting: NEXT VALUE FOR is what accesses the sequence and returns values. I can use it in some handy ways. For example, if I have to insert values into a table, I can do this:

CREATE TABLE dbo.SequenceTest
( SequenceTestKey INT IDENTITY(1,1)
, SequenceValue INT
, SomeChar VARCHAR(10)
)
GO
INSERT dbo.SequenceTest
(
     SequenceValue,
     SomeChar
)
VALUES
   (NEXT VALUE FOR dbo.SingleIncrement, 'AAAA')
, (NEXT VALUE FOR dbo.SingleIncrement, 'BBBB')
, (NEXT VALUE FOR dbo.SingleIncrement, 'CCCC')
, (NEXT VALUE FOR dbo.SingleIncrement, 'DDDD')
, (NEXT VALUE FOR dbo.SingleIncrement, 'EEEE')

When I query the table, I see:

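SELECT SequenceTestKey, SequenceValue, SomeChar
FROM dbo.SequenceTest;

-- SequenceTestKey  SequenceValue  SomeChar
-- 1                2              AAAA
-- 2                3              BBBB
-- 3                4              CCCC
-- 4                5              DDDD
-- 5                6              EEEE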

Notice that the sequence number is off by one from the identity. This is because I first accessed the sequence above.

The sequence is independent of any table or column, unlike an identity. This means I can keep the sequence numbers going across tables. For example, let’s create another table.

CREATE TABLE dbo.NewSequenceTest
( NewSequenceKey INT IDENTITY(1,1)
, SequenceValue INT
, SomeChar VARCHAR(10)
)
GO

Now, we can run some inserts to both tables and see what we get.

INSERT dbo.NewSequenceTest VALUES (NEXT VALUE FOR dbo.SingleIncrement, 'FFFF')
INSERT dbo.SequenceTest    VALUES (NEXT VALUE FOR dbo.SingleIncrement, 'GGGG')
INSERT dbo.NewSequenceTest VALUES (NEXT VALUE FOR dbo.SingleIncrement, 'HHHH')
INSERT dbo.SequenceTest    VALUES (NEXT VALUE FOR dbo.SingleIncrement, 'IIII')
INSERT dbo.NewSequenceTest VALUES (NEXT VALUE FOR dbo.SingleIncrement, 'JJJJ')

After running the inserts, I’ll look at both tables. Notice that the sequence values are interleaved between the tables. The first insert to the new table has the value 7, which is the next value for the sequence after running the inserts into the first table.

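SELECT SequenceTestKey, SequenceValue, SomeChar FROM dbo.SequenceTest;
SELECT NewSequenceKey, SequenceValue, SomeChar FROM dbo.NewSequenceTest;

-- dbo.SequenceTest picks up sequence values 8 (GGGG) and 10 (IIII),
-- while dbo.NewSequenceTest gets 7 (FFFF), 9 (HHHH), and 11 (JJJJ)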

In these tests, I’ve used 11 values so far. I can continue to use values, not just for inserts, but elsewhere.

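For instance, one possible use is assigning the next value directly to a variable:

DECLARE @NextID INT = NEXT VALUE FOR dbo.SingleIncrement;
SELECT @NextID;
-- returns 12 here, the next value after the 11 used above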

This behavior is fun, handy, and useful, but also dangerous. A value is consumed whenever I query the sequence, whether or not the surrounding statement succeeds. Here’s a short test to look at this:

ALTER TABLE dbo.SequenceTest ADD CONSTRAINT SequencePK PRIMARY KEY (SequenceTestKey)
SELECT NEXT VALUE FOR SingleIncrement
SET IDENTITY_INSERT dbo.SequenceTest ON
INSERT dbo.SequenceTest VALUES (NEXT VALUE FOR SingleIncrement, 'ZZZZ')
SET IDENTITY_INSERT dbo.SequenceTest OFF
SELECT NEXT VALUE FOR SingleIncrement

This gives me an error, since with IDENTITY_INSERT on, the insert must supply an explicit value for the identity column, which requires a column list. The interesting part is that the last SELECT returns the next sequence value anyway. Even though the insert failed, it still consumed a value from the sequence.

There is a lot more to sequences, but I’ve gone on long enough here. This is a good set of basics from which to experiment further, which I’ll do in future posts.

SQLNewBlogger

This post went on longer than expected, though it was still only a 15-20 minute writeup as I set up a couple of quick examples, tore them down, and rebuilt them with screenshots for the post.

This is a place where I can show I’ve started to learn more, and by continuing with other items in this series, I’ll show some regular learning.


SQL in the City is Next Week

Next Wednesday, Dec 12, I’ll be back in the UK with Kathi, Kendra, and Grant for SQL in the City Streamed. You can register today and join us for a set of DevOps talks that will get you thinking about ways to improve your database development.


For once I have no demos. I’ll do the keynote, looking at the State of Compliance Database DevOps. I’m going to summarize some of the results and findings from other surveys, as well as observations from my conversations with customers and colleagues at events this year.

There are lots of other sessions, where we talk about deployments, security and compliance, and some Redgate tools.

Join me and register today for a nice break from work.


Revving the Error

I was pleased to see that the fix for the “string or binary data would be truncated” error got some development work in SQL Server 2019. If you haven’t read about this, the changes are described on MSDN, and I thanked the development team. What’s even better news is that the fix has been ported to SQL Server 2017 in CU12 and is also slated to appear in an upcoming SQL Server 2016 CU.

In working on this error, Microsoft provided some guidance about where the first truncation occurs: which column and which data value. This is invaluable in troubleshooting data movement, though it could still be a trial and error process if you have lots of possible errors in different rows and columns. I could see us working our way through the list of problem data cell by cell.

Microsoft didn’t change the existing error message, as that might have broken code. Instead, we get a new error message, number 2628, which allows us to slowly rev our error handling code. To me, this was a much better solution than merely revving the old error. We have a trace flag to turn the new behavior on and off, so we can work through the places our code is affected.
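As a quick sketch of how this looks (dbo.TruncateTest is just a throwaway example table; in SQL Server 2017 CU12 the switch is trace flag 460):

-- With trace flag 460 on, a truncation raises the new message
DBCC TRACEON(460);

CREATE TABLE dbo.TruncateTest (SomeChar VARCHAR(5));
INSERT dbo.TruncateTest VALUES ('This string is too long');
-- Instead of the old Msg 8152, we get Msg 2628, which names the
-- table, the column, and the value that would be truncated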

The plan is that at some point this will become the default behavior, and error 2628 will replace the all too familiar and frustrating error 8152. That makes sense, and I like the idea of limiting how long we can control the behavior of our systems. While we might have code that needs to be fixed, we need to fix that code sooner rather than deferring it endlessly. Even if you have an existing SQL Server 2016 or 2017 system, I’d hope that you’d be looking to patch it at some point, perhaps to a level where this is the default behavior.
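For error handling that inspects the error number, a transitional CATCH block might look for both numbers during the changeover, something like this sketch (reusing the throwaway table from above):

BEGIN TRY
    INSERT dbo.TruncateTest VALUES ('This string is too long');
END TRY
BEGIN CATCH
    -- Handle the old and the new truncation errors during the transition
    IF ERROR_NUMBER() IN (8152, 2628)
        PRINT 'Truncation: ' + ERROR_MESSAGE();
    ELSE
        THROW;
END CATCH;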

There are many reasons why and when we patch, and we might be forced to because of some security issue. If that happens, I hope you’ve been working to fix code that looks for error 8152 and are prepared for the change in SQL Server’s behavior.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.2MB) podcast or subscribe to the feed at iTunes and Libsyn.

Updating SSMS to 17.9.1

I went to download the SSMS v17.9.1 package the other day and noticed the prerequisites called out on the download page: a minimum version of the Visual C++ 2013 redistributable and a minimum .NET Framework version.

I remember some issues with a C++ redistributable during a SQL Server install (2016?) and decided to just check on this. I really wish these articles and items were crosslinked, so I could check them easily.

I decided to write this post to note how to do a few things, maybe to help you, but mostly so that I’ll remember things later.

I don’t know if I need a reboot, but I’d rather avoid one if I can, though I’m slightly concerned this stuff will force a reboot.

Checking C++ Redistributable

The first thing is to check the Visual C++ 2013 version. I found this KB article on the update, with downloads. The file information is:

Name: VCRedist_x86.exe

X64 location: %WinDir%\SysWow64

A search was slow, but it found files. Lots of them. Apparently this file is dropped in quite a few places by various software.


Let’s go to the location noted. In there, I find I don’t have the file.


Not good, and I’m glad I checked. I’ll download the US English version from: http://download.microsoft.com/download/0/5/6/056dcda9-d667-4e27-8001-8a0c6971d6b1/vcredist_x86.exe

Once that’s done, I’ll install it.


The download is version 12.0.40660, and the minimum for SSMS is 12.0.40649.5, so I should be good here. The installation is slow, especially for a 6MB download.

It finishes, and I’m out of luck: a reboot is required. I’ll reboot and continue this post.


Checking .NET

There’s a KB for this and it has various links, including viewing the registry or using code. I’ll use PoSh. The code they give is:

# PowerShell 5
Get-ChildItem 'HKLM:\SOFTWARE\Microsoft\NET Framework Setup\NDP\v4\Full\' | Get-ItemPropertyValue -Name Release | Foreach-Object { $_ -ge 394802 }

That last value is the minimum for .NET 4.6.2. If I remove the Foreach-Object, I get a result of 461808, which corresponds to .NET 4.7.2. I’m good here.

OS Updates

I’ll check Windows Update, and it shows a couple of items, none of which matter. I’m not sure what the SQL2016 item is. I know I have a couple of old installations that are broken, so maybe that’s for one of those. Fortunately I have a couple of working instances, should I need them.


Upgrade SSMS

Let’s install.


The process runs without issues, as I’d expect. For the most part I’ve been running SSMS 17.8, and it’s been fine. No real issues, and I’m not sure this update changes anything for me.

The list of fixed issues is small, and I haven’t hit any of them. Even the 17.9 list of issues isn’t anything that concerned me, so this was more updating for the sake of updating.

The update completed, and when I restart SSMS, I see what I expect: all my plugins load, and the tool connects to my instances.



GitHub Downtime

I didn’t notice any issues with GitHub, but others did. The majority of my interaction is just through the git protocol, so things tend to work fast, and I don’t touch their database layer. I rarely use Issues and the other parts of GitHub, which were affected when GitHub had a MySQL cluster failover. There’s a good writeup of the post-incident analysis that’s worth reading from a database perspective.

I’m not a big MySQL guy, only running an instance to power T-SQL Tuesday. The structure of one write primary and many read replicas that GitHub describes makes sense. It’s similar to what I’ve done in SQL Server, and certainly the quorum management, handled at GitHub by the Orchestrator software, is something that needs to be configured properly. Allan Hirt has talked about the complexities of quorum in large installations, and it’s not a simple thing to configure.

In reading about this, a couple of things strike me. First, the analysis talks about a degradation of service because East Coast applications had to send writes to West Coast database servers. There were some problems with the way the database servers were working, but it seems to me that some sort of application failover should be possible. If you can’t have an application and database fail over separately without customer impact, then there should be some way to fail the applications over too. Perhaps not, but if you’re responsible for designing HA for the database, make sure you talk to the application people and test for issues.

The second thing for me is that somehow there was a period of time when writes were occurring on the East Coast system that weren’t sent to the West Coast. My ignorance of how this HA stuff works in MySQL prevents me from making a big deal of this, but it isn’t something that should happen. If the quorum moves the primary to another node, it must stop writes to the first node. This could happen in SQL Server as well, but for me, this is the level of data loss I’d need to account for in my RPO.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.4MB) podcast or subscribe to the feed at iTunes and Libsyn.

Understanding a Database

I ran across a post that asked a good question, one which I want to ask you today: how do you learn about a database?

I’ve run into quite a few databases in my career. Some were third party systems, like Dynamics and JD Edwards World. Some were databases custom designed and built by developers and database modelers of widely varying skills. Some were well built, with normalized data and defined referential integrity; other databases were put together in a piecemeal fashion over time, lacking keys and consistent naming. I’ll leave it to you to guess whether there were more of the former or the latter.

When a developer or DBA comes across a database, how can they decode what the fields and columns mean? Certainly names help at times, especially when the purpose of the database is understood, but all too often the names don’t quite make sense. This is especially true in many vendor databases. The one common theme I’ve seen across many databases is that nobody provides a data dictionary.

Trying to understand a database has been a trial and error detective task for me in the past. Usually this starts when I need to do some work requested by users: write a report, change data, etc. In these cases, I often ask users to access certain data related to the change from their application while I run an Extended Events session and note which entities are accessed. I can then start looking for data elements and note which columns might map to which fields in the application.
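A minimal sketch of such a session (the session name, the MyAppUser login, and the file target are placeholders for your environment):

-- Capture the batches and procedure calls one application login sends
CREATE EVENT SESSION TraceAppQueries ON SERVER
ADD EVENT sqlserver.sql_batch_completed
    (ACTION (sqlserver.sql_text)
     WHERE (sqlserver.username = N'MyAppUser')),
ADD EVENT sqlserver.rpc_completed
    (ACTION (sqlserver.sql_text)
     WHERE (sqlserver.username = N'MyAppUser'))
ADD TARGET package0.event_file (SET filename = N'TraceAppQueries');
GO

ALTER EVENT SESSION TraceAppQueries ON SERVER STATE = START;

While the user works through the application screens, the captured sql_text shows which tables and columns each screen touches.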

Often I’ve built a data dictionary of sorts outside of the database, using something like ErWin, ER/Studio, or another tool. That has been somewhat flawed, as it’s hard to share the information with others. These days I think I’d make extensive use of Extended Properties to document what I learned, so that my knowledge is available to anyone else who needs to work on the system. They can just look at the properties of the various entities.
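Something like this sketch, using sp_addextendedproperty (the table, column, and description text are hypothetical):

-- Record what a cryptic column means, right in the database,
-- where sys.extended_properties and SSMS can surface it
EXEC sys.sp_addextendedproperty
    @name = N'MS_Description',
    @value = N'Customer status code, as decoded from the order entry screen',
    @level0type = N'SCHEMA', @level0name = N'dbo',
    @level1type = N'TABLE',  @level1name = N'Orders',
    @level2type = N'COLUMN', @level2name = N'StatusCode';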

If you’ve got other methods, share them with us today. I’m sure there are plenty of DBAs and developers out there that would like some tips and tricks for decoding a database design.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (3.4MB) podcast or subscribe to the feed at iTunes and Libsyn.