Save on IT/DevConnections this fall

For the first time in many years, I won’t be at IT/DevConnections this fall. I’m disappointed to miss the event, but by the time I heard I’d been chosen as a speaker, I had other commitments around that time, and this would have been the fourth event in four weeks. That’s a bit much, so I decided to pull out.

However, this is another of those great shows where you can learn about a number of different technologies besides SQL Server. I’ve liked having the variety and seeing different perspectives on technology, data related or not.

You don’t have to miss IT/DevConnections, however, and you can register at a discount for the next week. Register today and use the code MDS17 to save until June 1.


Restore to a point in time–#SQLNewBlogger

Another post for me that is simple and hopefully serves as an example for people trying to get started blogging as #SQLNewBloggers.

One of the things I had to show recently was a restore to a point in time, and I needed to actually look up the syntax. That’s a good excuse for a #SQLNewBlogger post, so here we go.

When you restore, the default is to restore to the end of the backup file, whether full, differential, or log. With a log restore, you have the option to stop early and restore only part of the log. In doing this, you have two choices:

  • restore to a time
  • restore to a marked transaction

Relatively few of us use marked transactions, but they can be useful; there’s a quick sketch of one below. This post, however, looks at restoring to a time.
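
As an aside, a marked transaction puts a named mark in the transaction log when it commits, and a restore can stop at (or just before) that mark. This is only a sketch: the transaction name, the dbo.Prices table, and the mark description are all hypothetical.

-- Put a named mark in the log so a restore can stop at it later.
-- UpdatePrices and dbo.Prices are hypothetical names for this sketch.
BEGIN TRANSACTION UpdatePrices WITH MARK 'Before the price update';
  UPDATE dbo.Prices SET Amount = Amount * 1.1;
COMMIT TRANSACTION;
GO
-- During a restore, stop just before the marked transaction.
RESTORE LOG RestoreTime FROM DISK = 'RestoreTime.trn'
 WITH RECOVERY, STOPBEFOREMARK = 'UpdatePrices';
GO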

Let’s do a few things and show how this works.

Setup

The first thing I need to do is get a database, with some data. I’ll create a database, add a table with a row, and then back up the database.

CREATE DATABASE RestoreTime;
GO
USE RestoreTime
GO
CREATE TABLE BackupLog
( logdate DATETIME2 DEFAULT SYSDATETIME()
, logmessage VARCHAR(200)
)
-- add a message
INSERT dbo.BackupLog
        ( logdate, logmessage )
    VALUES ( SYSDATETIME(), '1st Log, before a backup' )
GO
BACKUP DATABASE RestoreTime TO DISK = 'RestoreTime.bak'
GO

Next, we want to add some data to the table and get some log records. In this case, I’ll use a simple loop to add a new row to the table every second. This gives me some points in time to look for and use to show I am restoring to a point in time.

DECLARE @seconds INT = 10,
        @starttime DATETIME2 = SYSDATETIME();

WHILE (SYSDATETIME() < DATEADD( SECOND, @seconds, @starttime))
 BEGIN
  INSERT dbo.BackupLog
        ( logdate, logmessage )
    VALUES ( SYSDATETIME(), 'Log record entered at ' + CAST(SYSDATETIME() AS VARCHAR(30)) )
  WAITFOR DELAY '00:00:01'
 END

GO
BACKUP LOG RestoreTime TO DISK = 'RestoreTime.trn'
GO

I’ve added data, and my table looks like this:

(Screenshot: query results showing all the rows in dbo.BackupLog)

My log backup contains all these records. If I restore the full backup and the log now, by default I’ll end up in this state, with all these rows in my table. However, what if I only want the first 5 rows?
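
Before choosing a stop time, it can help to check the window the log backup actually covers. As a quick sketch, RESTORE HEADERONLY reads the backup file’s header without restoring anything:

-- List the backup sets in the file; the BackupStartDate and
-- BackupFinishDate columns show the window a STOPAT time must fall in.
RESTORE HEADERONLY FROM DISK = 'RestoreTime.trn';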

I can use the STOPAT syntax in my restore. I start by restoring the full backup with NORECOVERY, and then the log. For the log, I’ll choose a time that is after row 5 but before row 6. In this case, that’s 2017-05-23T15:22:57. Here’s the code:

USE master
GO
RESTORE DATABASE RestoreTime FROM DISK = 'RestoreTime.bak' WITH NORECOVERY, REPLACE
GO
RESTORE LOG RestoreTime FROM DISK = 'RestoreTime.trn' WITH RECOVERY, STOPAT = '2017-05-23T15:22:57'
GO
USE RestoreTime
GO
SELECT 
 *
 FROM dbo.BackupLog

I have my results below:

(Screenshot: query results showing only the rows from before the STOPAT time)

Only the log activity before this time is restored.
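
One more option, as a sketch beyond the original demo: if you aren’t sure of the exact stop time, restore the log WITH STANDBY instead of RECOVERY. The database stays read-only so you can inspect the data, and if you stopped too early, you can apply the same log backup again with a later STOPAT. The guessed time and the undo file path here are hypothetical.

-- Restore to a guessed time, leaving the database read-only (STANDBY)
-- so we can check the rows before recovering. The undo file path is made up.
RESTORE DATABASE RestoreTime FROM DISK = 'RestoreTime.bak' WITH NORECOVERY, REPLACE
GO
RESTORE LOG RestoreTime FROM DISK = 'RestoreTime.trn'
 WITH STANDBY = 'C:\Backups\RestoreTime_undo.dat', STOPAT = '2017-05-23T15:22:55'
GO
-- Query dbo.BackupLog here. If we stopped too early, roll forward
-- by applying the same log backup with a later STOPAT.
RESTORE LOG RestoreTime FROM DISK = 'RestoreTime.trn'
 WITH RECOVERY, STOPAT = '2017-05-23T15:22:57'
GO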

SQLNewBlogger

After I’d looked up the syntax, I spent only about 10 minutes setting up the demo and getting this post ready. Practice your skills and write about them. Show your next employer you are always learning and working.


Audio Attacks

There are running jokes about how Amazon Alexa and Google Home are susceptible to visitors in your home. It’s become a bit of a test for some geeks, who will walk into your house and call out “Hey, Alexa” to see if you have a device that’s available. I even had an accidental Siri activation while recording some content recently. I said something that started Siri listening on my phone, and then a few people nearby kept trying to duplicate the effect. Fortunately Siri wouldn’t come on again, so no one was able to cause too much trouble. I suspect that the speakers in the recording room weren’t positioned well enough to allow the control-room-based hacking.

I know that will change. In fact, even if Apple and other companies manage to get digital assistants to recognize specific people’s voices (as rumored), technology marches on. People are already starting to fake audio data, and the fidelity of digital recording and the capabilities of speakers improve constantly. This is going to be a greater and greater issue over time, and I have no idea what security technique will prevent it. Maybe we need two-factor authentication for audio commands? Won’t that defeat the purpose?

As data professionals, we are going to be dealing with more and more types of data, and trying to process, analyze, and act on information. This is one reason that I think understanding data lakes and being able to import and combine many types and formats of data will be a valuable skill for all data professionals. Whether you use something like Azure Data Lake or another platform, I expect to be combining data in all sorts of ways and providing information to users.

While speech recognition might not be something many of us worry about, will we want to extract information from audio or video and use it? Do we expect that audio files have more integrity than other sources? I worry that we lend some types of data more credibility than others, when all types are subject to hacking. Some of us may get audio files as data, and it’s only a matter of time before we get hacked, perhaps with fake audio.

One of the issues we have with some data is determining the source of record. If I record my voice as a sample, and you compare all future audio of me to that sample, you can verify my identity, right? What if someone can fake my voice with simple software? It sounds crazy, but those days are coming, especially if our systems are susceptible to a person stitching together words from different captures, such as some of Baracksdubs. What might be worse is when we find someone hacking a database and replacing the samples. What’s the source of record then?

As I spend more time in this business, I become more convinced that auditing and monitoring are more important than security. We want them all, but I’d rather know I have an issue than assume my systems are protected because the security doesn’t appear to be broken.

Steve Jones

 


Looking Back at Build–CosmosDB

Part of a series looking back at Build 2017, going over the 20+ pages of notes I took.

One of the big announcements at Build 2017 was the release of CosmosDB. This is a rebranding of DocumentDB, the document store in Azure, with some additional capabilities. Beyond documents, you can also choose to store data as:

  • Column-family (columnar)
  • Key-Value
  • Graph

The service is interesting in that it supports the MongoDB and Gremlin protocols for querying, which should allow some people to consider Azure instead of those platforms. The announcement was interesting from the data perspective. Microsoft focused a decent amount of time on this platform, and certainly people at the conference were very interested in the offering. Quite a few developers were thinking this could replace SQL Server, since it has some nice capabilities.

One of the big ones is that CosmosDB offers < 10ms latency for reads and < 15ms for writes. It scales up to millions of transactions/sec and grows to PBs in size. There are some good guarantees for the database.

They call it infinite, but that’s a marketing term. There’s a limit, and there’s certainly a limit to how much you’re willing to store in the service. Jet.com was the featured customer, and they talked about how much data they use and how they scale the platform. I assume they started with DocumentDB and moved over, but their story is in the keynote, if you want to watch. They run over 100 trillion, yes trillion, CosmosDB transactions per day. No matter what you do, a trillion of anything is a lot.

Perhaps the more interesting thing is that Azure is offering five different consistency models in CosmosDB. They are:

  • Strong
  • Bounded Staleness
  • Session
  • Consistent Prefix
  • Eventual

I don’t know how well these are supported in other platforms, but I like to see that developers have a choice. I’m not sure how many will understand the trade-offs and implications, and how many will get burned by choosing one over the others, but I’m glad the choice exists.

They are also offering a money-back guarantee. What that means, or how you actually get money back, remains to be seen. I haven’t always been thrilled with the disclosure in billing for new parts of Azure services, but perhaps there will be some way to request a credit, though I would hope that any telemetry showing issues results in a credit automatically.

The other good thing is that the SLA is guaranteed across multiple dimensions: latency, throughput, availability, and consistency.

Rimma Nehme, who demoed the product, had an interesting quote. It was something like “developers can concentrate on the app, not the database.” Forgive me if I’ve slightly misquoted, as I’m going from notes. While I cringe a bit as a data person, I do know that most developers don’t really want to work deep in the database, especially for many simple apps. If I were building a simple mobile app, I’d seriously look at CosmosDB, and I plan to learn a bit about it.


The History of the World, I Guess

The lighter side. This one comes from my kids; there’s some science in it, and it reminded me of the Big Bang Theory.

Have fun.


Naming Conventions for the Cloud

There are plenty of controversial topics in technology to write about. Today I want to combine two of them into one piece: the cloud and naming standards. I’ll leave tabs v spaces and language choices for another day.

I like the cloud, and I especially like Azure. I’m not sure that it fits most applications and organizations, and at times I question the value of some parts of the platform, but overall, the ideas are great. I tend to work more in Azure because I’m an MS stack person, but I suspect AWS and the Google Cloud Platform are similar. Many of the cloud platforms, and even the various services from vendors like Rackspace, offer lots of opportunity and an amazing set of platform services.

However, as there are more and more options, tracking the various items and services that you are using becomes an issue. This isn’t a new problem, and I’ve faced the same issue in large Fortune companies. Inventory and management of any technology at scale is hard. This brings me to naming standards. Many of you have probably dealt with naming standards in development and perhaps argued about the choices that your team uses.

There are also issues for infrastructure, which can be more complex and problematic in the cloud. Because we don’t have complete control over devices, we are limited to managing our objects through interfaces. At least in my data center I can fall back on labeling or connecting to a physical box to determine what services are running. Or even turning a system off to see who complains, something I’ve had to do in the past.

In the cloud, the world gets confusing quickly because we have many more types of services. Not only are there items like VMs and databases, but we can have network interfaces, disks, storage accounts, load balancers, and more. In fact, the list in Azure has gotten extremely long. The addition of things like Azure Functions and Azure Logic Apps has me worried that we’ll start to lose control of our infrastructure. The one thing that I appreciate is Resource Groups, which allow me to put groups of items together.

However, even with these tools, I sometimes find myself confused over what’s running. When I ran across John Paul Cook’s post on naming standards, I realized that I needed to get better organized with my resources. Having some sort of naming standard might help me quickly identify items and keep my portal organized. Even with the various icons, sometimes I get them confused because there are so many different possible icons and services. With a prefix and a resource group, maybe I’ll be better able to track which items I need to keep and which I don’t.

As you grow the number of anything, the overhead for organization and management grows. It becomes more important to track what you have and use, especially when you’re being charged by the resource. I expect that many of us will be answering questions and reporting on not only usage and performance, but also cost for some part of our jobs as more applications move to the cloud.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio (5MB) podcast or subscribe to the feed at iTunes and Libsyn.

 


A Busy Few Years

Last week on Twitter, a friend mentioned they were submitting to the PASS Summit and went through their previous speaking engagements. This person noted they had done over 40 talks in the last few years, which is a lot. I do more, but it’s part of my job. Ten talks a year in your spare time is quite a bit, and kudos to anyone who does that.

I have a Speaking CV page, but I decided to load a bit of the data into a database, where it should be but isn’t. Given time constraints, I only loaded 2015-2017 into the db, but I’ll get the rest loaded at some point.

In any case, I decided to run a few metrics and see where I stand.
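
The queries were simple aggregates. As a sketch (the Talks table and its columns are hypothetical, since the real schema isn’t shown here), they looked something like this:

-- Hypothetical table holding the speaking data
CREATE TABLE dbo.Talks
( TalkDate DATE
, EventName VARCHAR(200)
, TalkTitle VARCHAR(200)
);
GO
-- Talks per year
SELECT YEAR(TalkDate) AS TalkYear, COUNT(*) AS Talks
  FROM dbo.Talks
 GROUP BY YEAR(TalkDate)
 ORDER BY TalkYear;
-- Most-delivered talks
SELECT TalkTitle, COUNT(*) AS TimesGiven
  FROM dbo.Talks
 GROUP BY TalkTitle
 ORDER BY TimesGiven DESC;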

(Screenshots: results from a few metrics queries against the speaking data)

It’s been a busy few years.


Listening and Taking Action

One way most of us would like to work is to have others listen to our concerns, our requests, and our requirements, and then take action to help us. One of the principles of DevOps is just this: we work together and optimize the entire system, increasing the flow of work, not just trying to ensure our particular task or part is improved. I would hope that most of us would actually take time to listen, evaluate what we hear, and then move forward in a way that is a good decision for both parties. We certainly might debate about what action makes sense, but I would hope we could come to some consensus and a decision.

Microsoft is listening and responding as well. They aren’t building SQL Server based on everything we file on Connect or complain about, but they are certainly listening to some of our concerns and requests. There is a good post that looks at some of the community-driven enhancements being included in SQL Server 2017, customer and community requests that were given some attention and added to the product.

These don’t seem like big changes to me, but as I read through them, a few seem to go quite deep into the engine, and I’m sure there was quite a bit of testing to get these completed. You can test them in SQL Server 2017 today, helping to ensure they work correctly in the final product. Some of these will change the way you run backups or diagnose issues, perhaps even enough to consider an upgrade.

I especially like that the performance of smaller databases is addressed with backup improvements. I would still guess that most of the SQL Server databases out there are smaller than 100GB, maybe substantially smaller, and while we may consolidate and change hardware, we need these systems to perform well. The large, 100TB database changes are nice, and I know some customers need those improvements, but most of us don’t. Even though the percentage of people with 1TB systems grows regularly, they’re still a minority, so kudos to Microsoft for doing some work here.

Not every highly rated Connect item is addressed. The “String or binary data would be truncated” message hasn’t changed, and that’s been highly rated for years. However, the top JSON request made it into SQL Server 2016, and there are other items on the most-voted list that haven’t been addressed. I understand, though I don’t always like, the way Microsoft looks at the product. I can relate their decisions to my own when building software, realizing that the customer doesn’t always have the best view of what makes sense for commercial viability. However, I’d also like to see the platform continue to evolve and make development and administration easier for all of us.

I’ll keep voting on, and creating, Connect requests as I see a need, and I’d encourage you to do the same. Think about what you are really asking for and decide if it makes sense for large numbers of people, but continue to participate and let Microsoft know what changes are needed. Your voice can make a difference, so please use it in a constructive and positive way.

Steve Jones

 


Security and Patching Struggles

A couple of weeks ago, we had a huge security issue with the WannaCry ransomware racing across the world. I was out of the office, and mostly offline, so I was a bit out of touch. However, many organizations were affected, and I’m sure many infrastructure people were scrambling to patch and protect vulnerable systems, possibly even restoring affected ones. I certainly hope that most (or all) organizations didn’t pay any ransoms. For the future, my view is that it’s better to lose a little data and restore systems than to pay a ransom.

I can remember when we got patches at random times, as vendors wanted to provide more functionality or finally had enough bugs fixed to release a patch. Sysadmins struggled to deploy these patches, being out of practice from the infrequency and struggling with non-standardized ways of applying updates. There was also concern about the quality of patches, many of which introduced more bugs and issues. In the Windows world, I found many companies wanting to wait until an SP2 was released before applying it, or even before upgrading many systems.

That changed, with many companies moving to regular patches and standard ways of applying, or even slipstreaming, patches onto machines. I welcomed the Microsoft Patch Tuesdays, as this provided a regular release and an expectation, and both admins and users became comfortable with the idea of regular patches. I haven’t loved the auto patches in Windows 10, but I find myself agreeing with Troy Hunt that we should just be patching. In an organization you may want someone to be responsible, but for home users, just patch.

And, by the way, vendors: you need to do more work and be more responsive to any issues that come up from patches. Your quality issues lead to greater security issues.

If you want to read about the WannaCry issue, there’s a good general post, as well as some guidance from the SQL Server perspective. If you haven’t patched, that’s something you should do ASAP, and while you’re at it, be sure you have the latest security patches for SQL Server applied. I’ve got a series of Build Lists at SQLServerCentral, one for each version. I’m still nervous about applying Cumulative Updates too quickly, but I certainly would download and have them ready, perhaps applying each one a month later, once the early adopters have had a chance to report any major problems.
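
If you’re not sure where a given instance stands, a quick check with SERVERPROPERTY shows the build you’re running, which you can compare against the Build Lists. This is just a sketch; note that ProductUpdateLevel is only populated on relatively recent builds.

-- Check the version and patch level of the current instance, then
-- compare the build number against the Build Lists to see if you're current.
SELECT SERVERPROPERTY('ProductVersion')     AS BuildNumber,       -- e.g. 13.0.4422.0
       SERVERPROPERTY('ProductLevel')       AS ServicePack,       -- e.g. SP1
       SERVERPROPERTY('ProductUpdateLevel') AS CumulativeUpdate;  -- e.g. CU2 (recent builds only)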

Patching is a reality for the modern software world. We get regular patches for applications, but our core infrastructure (including servers, desktops, and various devices) also needs patches to the OS and platforms. There’s a balance between ensuring stability with known software versions and keeping up with patches to prevent problems. We need to find that balance, which is probably different for each organization, and periodically re-evaluate whether we are updating at an appropriate level. I lean towards fewer patches when I can, but I always want to keep up with security patches. We never know when someone will take advantage of those.

Steve Jones

La La Land Speaking at SQL Saturday #640

I love LA. In many ways, it’s a fun city to visit, and I love taking a few minutes for even just a 10-20 minute walk along the boardwalk at one of the beaches. I’ve looked forward to quite a few trips to LA in my life, and I often take advantage of offers to go back when I can. This shot is from my last trip, to Huntington Beach in 2015.

(Photo: Huntington Beach, 2015)

There are a few SQL Saturday events in LA, and the latest is #640 at the Microsoft Center in LA proper. I’m honored to have been accepted and am looking forward to the trip in June.

I’ll be presenting my Continuous Integration for Databases talk, looking at how we can build a CI pipeline for our database code in an hour. We’ll go into detail on what CI means and the general process. I’ll use a few tools, but there are many ways to set up CI, so I don’t recommend one over another.

I’m reworking this talk for another event, and this session will be a subset of that work, but hopefully I’ll show you a few things that will help you in your daily work and improve your database code.

There are lots of other great sessions, and this is a free day of training in LA, so pass the word and I’ll see you June 10.
