DevOps Basics – Ignoring Files in Git

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers. This is also a part of a basic series on git and how to use it.

One of the things you’ll run into at times is the need to keep some scratch files, or extra files, in your Git repository folder without tracking them. One common type of file for me when working with SQL is a .zip file. I may zip up code to share or copy to a friend (without giving them repo access).

Ignoring files is easy in Git. We just add a .gitignore file. This is a list of files that the git repository will not track or show in status. Essentially, we see them in our file system, but git doesn’t.

Creating .gitignore

To create a .gitignore file, the easiest method for me is to just create a text file. I can do it like this:

[Screenshot: creating the .gitignore file at the Windows command prompt]
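
For example, from a Windows command prompt (assuming the repo lives in a C:\GitTests folder), something like this works:

cd C:\GitTests
type nul > .gitignore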

This gives me a new file. Certainly VS Code, Sublime, etc. will make this easy as well. The format is simple, with a list of files and/or patterns to ignore. For example, I’ve got a .zip file in my repo.

[Screenshot: the GitTests repository folder showing the GitTests.zip file]

I don’t want to see this, but I do right now:

[Screenshot: git status listing GitTests.zip as an untracked file]

If I want to ignore this file, I’ll enter this in my .gitignore file:

GitTests.zip

If I want to ignore all zips, I’ll do this:

[Screenshot: the .gitignore file edited to ignore all .zip files]
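
The simplest pattern for that is a single wildcard entry in the .gitignore file:

*.zip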

This is a part of my repo, so I need to commit it.

[Screenshot: adding and committing the .gitignore file]
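
The commands look something like this (the commit message is just an example):

git add .gitignore
git commit -m "Ignore zip files"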

Now my status is clean.

[Screenshot: git status now reporting a clean working tree]

Generated .gitignore

Some applications will generate a .gitignore. For example, my C# project gets this file from Visual Studio.

[Screenshot: the .gitignore generated by Visual Studio, open in Visual Studio Code]

That’s a subset of the files that are often in a VS project but that we don’t want to track in a VCS: images, archives, executables, etc.
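
For reference, here are a few typical entries of the kind the Visual Studio template includes (a representative sample, not the full file):

# Visual Studio temporary files and build results
.vs/
[Bb]in/
[Oo]bj/
[Dd]ebug/
[Rr]elease/
*.user
*.suo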

You can customize this as you need, and it’s easy to just edit the text file and commit the changes.

Hopefully this helps you understand how to best work with git and keep your repo clean. This also means your git add --all is easy to use without adding unnecessary files.


Prepping for Summit 2017

It’s about time for the PASS 2017 Summit. The event essentially starts on Monday with pre-cons and the unofficial networking dinner. Be sure you RSVP and come to the dinner if you don’t have plans. The official event gets going later in the week.

Many people are already traveling and packing for the event. I feel a bit behind as I won’t leave until late Monday afternoon and arrive late Monday night. I’m sure I’m not alone, but it seems like everyone’s ready and I’ve still got a day of work Monday.

Redgate Software has a booth and a few presentations next week on Wednesday. We’d love to chat with you about ways to make database development easier, especially if you’re thinking DevOps. We also have some contests, swag and prizes.

I’ll be around Tuesday, Wednesday, and Friday, and of course, moving from Game Night to the Redgate Party Thursday night. I’ve got other commitments during the day Thursday, but hope to see lots of you there.


Physical or Virtual Storage

When I started working with SQL Server, every server had what we’d call das-dee, or DASD (Direct Access Storage Devices). These were hard drives inside the same physical case as the rest of the Windows computer. I’ve added lots of drives to various server systems over the years. As databases grew, we even had separate boxes in our racks that were attached to the main server, but only filled with drives.

Technology has changed, and today most of us work with SAN or NAS devices, where the storage is addressed across some type of network. Either a private one (copper or fiber), or the same Ethernet that connects the various computers together. A few of us might even have cloud storage that is located at Microsoft, Amazon, or elsewhere. The Stretch Database feature takes advantage of this last configuration. In all these cases, the storage that our databases see is often cobbled together from other disks that hide the underlying organization from the system.

Recently I read a piece from Randolph West that talked about recovering data from a RAID array. That reminded me of my early career, where I had to make decisions about how to structure storage. I’ve run RAID 1, 5, 10, 0+1, 6, and maybe more in my career to store data files. However, at some point I stopped worrying about the underlying configuration. I just expected, and trusted, the storage people to ensure that space was available. I even stopped thinking of the z: or y: drives on my database server as disks. Those drives were just storage that existed somewhere in the ether, just available for the database to use.

In thinking about Randolph’s experiences, I wondered how many of you out there might still deal with physical drives. Do you still make decisions about RAID levels? Do you even know what RAID levels are being used by your databases? If you’re a storage admin, you might, but for those of you that aren’t, do you know anything about your storage configuration?

Really, I’m speaking of production systems, not development ones. Certainly many of us might know there’s a development server with RAID 5 that holds a bunch of dev/test VMs, but I would expect that might even be rare. Outside of our own workstation, we likely don’t know the storage setup. Plenty of development systems these days probably even use a SAN, maybe even the same one as production, for storage.

For me, I have no idea how our systems are configured. I used to build the SQLServerCentral servers, and when Redgate took over that part of the business, I helped spec the initial machines we rented as physical hosts. At some point we moved to virtual machines, and while I was asked about the specifications, I didn’t care about any of the hardware. I just said that I wanted enough CPU, RAM, space, and IOPS to handle the load. Deciding what that was, and ensuring it was available, was someone else’s job.

If you spec hardware, or pay attention, let me know. There certainly are plenty of hardware geeks, like Glenn Berry, that pay attention and prefer particular configurations. Those are the people I’m glad I can ask for advice if I need it. I certainly ask for help with my personal systems, but for servers, I just need capacity. Do you feel the same way?

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 4.7MB) podcast or subscribe to the feed at iTunes and Libsyn.


Unstructured Data

Is unstructured data a bad term? I saw some data professionals complaining about this, saying all data is structured. That’s usually true. A CSV, even a ragged one, has structure. XML and JSON have structure, even if it might vary node to node. Certainly our relational tables are structured, and some formats can be rigidly mandated between organizations (like EDI). Even data in PDF, Word, MP3, MP4, or other audio/video media is structured in that we know the format.

Given that, is it a misnomer to use the term unstructured when describing flexible formats such as XML? Is it OK for a PDF? I have a presentation called Unstructured Data in SQL Server. This is primarily about FileStream, FileTable, and searching those objects. In the talk, I classify data in known formats as structured. These would be SQL Server tables and similar objects. At any point in time, we know what all the data in the table looks like, even though we can have NULLs or missing data in rows.

I call XML and JSON semi-structured formats. We can certainly determine the format for any node or section, but we wouldn’t know it without querying or examining the data. It’s semi-structured in that there is a hierarchy, but the structure from section to section (essentially row to row) can vary. Even the depths of the hierarchies can vary. In many ways, that makes these great formats for flexibility in data exchange.
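
As a quick illustration (an invented fragment), two documents in the same JSON feed can each be perfectly well formed and still carry different attributes and different nesting:

{ "name": "Anna", "phones": [ "555-0100", "555-0101" ] }
{ "name": "Bob", "address": { "city": "Denver", "state": "CO" } }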

I tend to view data in Word, PDF, and MP4 files as unstructured. We don’t necessarily know where the data is, or how to separate it. We can get pages in Word or PDF, but those can vary and don’t necessarily help us extract information. Word documents are XML under the covers, but the XML tags don’t relate to the content, unlike many other XML documents. Scenes or tracks in audio/video files might be separators, but those aren’t necessarily helpful in gathering information. Instead, we need other tools that can help deal with that data, finding words, concepts, or more inside of the binary stream.

I like the term unstructured data because it helps me understand where the information is. While the tables in a database might be full of nonsensical information in some rows, or be poorly designed with data combined into text fields, at least I know where the fields are. Actually, in that case, I’d argue the data in varchar(max) text fields is really unstructured. You might disagree, but give me a better term to describe where the information is stored in a data format.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 3.7MB) podcast or subscribe to the feed at iTunes and Libsyn.


Basic FORMATting – #SQLNewBlogger

Another post for me that is simple and hopefully serves as an example for people trying to get blogging as #SQLNewBloggers.

I saw the addition of FORMAT() to the T-SQL language, but didn’t play with it much. Recently it appeared in some code, and I decided to experiment a bit. I had assumed this was mainly for dates, but it’s a general format/culture function that handles numbers as well.
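
For instance, here’s a quick sketch of the culture side of things for a date (the cultures are just examples):

DECLARE @d date = '2017-10-09';
SELECT FORMAT(@d, N'd', 'en-US'),  -- 10/9/2017
       FORMAT(@d, N'd', 'de-DE');  -- 09.10.2017
GO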

On the docs page, there is a basic description of the parameters, which are NVARCHAR, so passing in VARCHAR causes an implicit conversion. It shouldn’t cost much, but there are already performance penalties with FORMAT (see Aaron Bertrand’s piece), so don’t add to the overhead.
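
In other words, of these two equivalent calls (a trivial sketch), the second passes a VARCHAR literal that gets implicitly converted:

SELECT FORMAT(5000, N'#,##0');  -- NVARCHAR format string
SELECT FORMAT(5000, '#,##0');   -- VARCHAR literal, implicitly converted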

One good thing to note is that if you pass in an invalid format string, a NULL is returned rather than an error (an invalid culture, though, raises an error). Since the format and culture strings aren’t completely intuitive, this might be a source of issues in your code.

This is a neat function, relying on CLR formatting rules. That means I can do fun things like:

DECLARE @i int = 5000;

SELECT FORMAT(@i, N'USD$#');

Which returns:

USD$5000

Or even:

DECLARE @i INT = 5000;
SELECT FORMAT(@i, N'# dahlahs');
GO

Which gives me:

5000 dahlahs

There are lots of formats, and certainly lots of nuances to numeric formatting strings. It’s worth reading up if you plan to use this, but again, beware of performance. I’d avoid using this if the data size is large, maybe more than a few hundred rows.

After all, the database server is a shared resource, and using this CPU to handle simple formatting may not be the best use of your system.


All Day DevOps Slides

The conference will upload the slides to Slide Share at some point, but if you want to get them now, here are the slides from my talk:

Including Database in DevOps – AllDayDevOps.pptx

If you have questions, drop a note at AllDayDevOps.Slack.com in the #modern-infrastructure channel. I’m way0utwest.


Redgate at Summit 2017

I’ll be heading out to the PASS Summit next week, spending Halloween in a conference center with a lot of other geeks. I know some of you won’t make it, and we’ll miss you, but I understand.

I’m meeting my colleagues from Redgate there, including the new editor of Simple Talk, Kathi Kellenberger. We’ll be at the event all week and we look forward to chatting with you about databases, DevOps, and more. We’ve also got a few events during the week that we hope you’ll attend.

Stop by the Redgate Booth in the expo center for a demo, or just to chat about SQL Server.

Adopting a DevOps Process for Your Database

I’ve got a session on Wednesday, Nov 1, at 1:30 that will provide an overview of how you might rub some DevOps on your database development. I’ll use some of the Redgate tools, but really I want to convince you that a smoother, more reliable database development process is possible.

Lower your risk, make your deployments reliable and repeatable. Come join me, see how the process can work, and ask lots of questions. I’ll have lots of answers.

How DevOps for the Database Helps with Compliance

Immediately after my DevOps talk, I’ll be joining Grant Fritchey and Richard Macaskill to talk about DevOps and compliance. We are seeing more and more organizations bound by rules and regulations that require them to be careful about who can see their production data and how it’s stored. This will mean that more of us need to be careful about how we manage data.

With the move to DevOps and releasing changes more often, this can present some challenges for data professionals. However, DevOps can actually help with compliance if done right. Join Grant, Richard, and me for a discussion of the challenges and potential solutions.

Redgate Rocks

If you are coming to the Summit, make sure you stop by the Redgate booth to pick up your #RedgateRocks ribbon for entrance to our party Thursday night. I’ll be there with the Redgate crew to celebrate the last evening of the Summit at 1927 Events.

This is your chance to get out of the Conference Center, head towards the ocean and have a good time in downtown Seattle with us.


Gigging for a Career

There are some people that like working for an organization, sacrificing some compensation and flexibility for stability and security. Others enjoy the chance to experience constantly changing environments and a rich variety of projects at the expense of regularly searching for new work. Neither is necessarily better or worse than the other, and these aren’t polar opposites. In the real world, each includes some of the advantages and disadvantages of the other.

For many of us that work with data, we realize there are some advantages to understanding the business meanings and implications of the information derived from data. Indeed, those that tend to work with the same types of data become fluent and comfortable manipulating and discussing how analysis, transformation, and more relate to that type of data. This leads me to think that someone that works for a company or stays within a particular industry might be preferred as a data professional for organizations in that space. Maybe this is someone that an organization wants to hire and retain over time.

I ran across a piece that discusses the Gig Economy from the perspective of data professionals. Their view is that many data professionals would rather work on interesting projects, analyzing data that appeals to them, than be stuck with a single organization. Certainly from the organization’s view, having very skilled professionals available for project work means lower costs for training, benefits, hiring, etc. While the cost per day might be high, there is no need for an ongoing, or at least not a constant, commitment.

I do think that many of the changes in technology have made it possible for talented workers to find plenty to keep them busy and earn a very good living. However, there’s a cost in spending time looking for projects on a regular basis. It takes not only time, but mental strength and a desire to be a bit of a salesperson and marketing professional. Some people may find long term clients that call them over and over, which isn’t that different from working for an organization unless you can interleave a number of clients together on a regular basis. However, most people that work at “gigs” are regularly spending time looking for work.

I would postulate that most of us prefer some amount of security and stability, choosing to work for an organization for a period of time. While we may change jobs at times, it’s often at a slower pace than those that prefer to work in the Gig Economy.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 3.6MB) podcast or subscribe to the feed at iTunes and Libsyn.


SQL Clone Server Service Permissions

SQL Clone is amazing, and it can really save time and disk space for many organizations. I’ve got a series posted here on various little things I’ve learned about the product. There are also a number of articles on the Redgate Community Hub.

I was recently helping a customer install the SQL Clone server, and one of the things the client wanted to know was the minimum set of permissions needed for the SQL Clone server.

When you install the SQL Clone server, the configuration dialog asks you for a Windows account and password. This is noted in the documentation as the account that configures and starts the server.

[Screenshot: the SQL Clone Server setup dialog asking for a Windows account and password]

This means that during the configuration, this account will:

  • Create a local service on the Clone Server OS
  • Connect to the SQL Server specified
  • Create a new database (or use the one that exists)
  • Map itself to dbo in the new database

If the SQL Server is remote from the SQL Clone server, a domain account is needed. If it’s a local SQL Server, then you can use a local account. The account does need to have local administrator privileges.

With that in mind, here’s what I did as a minimum permission set:

  • Create a new domain account, SQLCloneServer (I want to be able to use a remote SQL Server). I left this as just a member of Domain Users.
  • Add this account as a local administrator on the SQL Clone server host.
  • Add this AD user as a login to the SQL Server that will host the configuration database.
  • Give the login the dbcreator server role (you can remove this later and leave it with just its permissions inside the db; a cleanup script is at the end of this post).

That’s it.

Scripting

It’s always better to script. Here’s the AD part in PowerShell:

New-ADUser -Name "SQL Clone Server" -GivenName "SQL" -Surname "Clone Server" -SamAccountName "SQLCloneServer" -UserPrincipalName SQLCloneServer@mydomain.com

Here’s the local administrator permission on the SQL Clone web server host, using a local command. This could be in PoSh, but it’s not as clean (to me).

net localgroup Administrators "MyDomain\SQLCloneServer" /add

Here’s the SQL part:

USE [master]
GO
CREATE LOGIN [MYDOMAIN\MySQLCloneUser] FROM WINDOWS WITH DEFAULT_DATABASE=[master]
GO
ALTER SERVER ROLE [dbcreator] ADD MEMBER [MYDOMAIN\MySQLCloneUser]
GO
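
As noted above, once SQL Clone has created its configuration database, the dbcreator role is no longer needed. A sketch of that cleanup, using the same login name as above:

USE [master]
GO
ALTER SERVER ROLE [dbcreator] DROP MEMBER [MYDOMAIN\MySQLCloneUser]
GO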

All DevOps All Day

There is no shortage of webinar events taking place for technology professionals. From our monthly Redgate DevOps webinars to the GroupBy conference to PASS Virtual chapters, there’s a huge set of online events that you can attend and learn from. It’s gotten to the point where you could have a full time job watching webinars if you could find someone to fund your efforts.

Every year PASS does a 24 Hours of PASS, which is a neat event where 24 hour-long sessions are broadcast back to back. It’s a lot of work to get this going, and certainly not many people want to watch 24 hours in a row, but it does seem to be fun. That same idea is being taken a step further tomorrow with All Day DevOps 2017.

There’s still time to register and attend a few sessions, or even throw a party (or join one) with fellow technical professionals. Maybe this is a good way to take a break from work and still be productive. This might be one of the cheaper conferences you could attend this year. Maybe you can even convince the boss to get some t-shirts with the savings.

As you might have guessed, I’m a part of this event. I submitted to DevOps East, a conference in Orlando next month, but they also picked me up for the online event, so I’ll be watching and speaking tomorrow. This is an interesting format as it’s not just 24 sessions. It’s over 100 sessions, with 2, 3, or 6 sessions going on at the same time. There are some long sessions, some short ones, all covering different aspects of what we call DevOps.

Whether you think DevOps is a fad, the idea is silly, or are excited to start building software better and faster, I’d urge you to take a few minutes to scan the schedule and see if there is anything that catches your eye. You might learn a bit on how to change your culture, or improve security, or implement automation. There are even some testing sessions, which I’m looking forward to watching.

Whether you’re a believer or a skeptic, learn a bit more about DevOps and have an informed opinion. Don’t just judge the movement on the media hype. Listen to some real technical people talk about the ways that they are trying to improve software.

Steve Jones

The Voice of the DBA Podcast

Listen to the MP3 Audio ( 3.1MB) podcast or subscribe to the feed at iTunes and Libsyn.
