ETL v ELT

This is part of a series on my preparation for the DP-900 exam. This is the Microsoft Azure Data Fundamentals exam, which is part of a number of certification paths. You can read various posts I’ve created as part of this learning experience.

I don’t have an ELT tag, and I’m not likely to make one. I tend to think of ETL as loading data somewhere, even though I know it means more.

The important concepts for DP-900 here are that ELT is becoming more important and you need to understand what this means. I’ll cover these concepts, and also touch on where the different Azure services fit in.

ETL

For most of my career, the pattern for loading data was Extract-Transform-Load. In this pattern we:

  • grab data from a source
  • make changes to it (clean, reshape, etc.)
  • write to a target (or sink)
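As a rough illustration of those three steps, here is a minimal Python sketch. It is not how SSIS or any Azure service does it; the source rows, the sales table, and the function names are all made up for the example, and an in-memory SQLite database stands in for the target.

```python
import sqlite3

# Hypothetical source rows; in practice these would come from a file, API, or database.
SOURCE_ROWS = [
    {"name": " Alice ", "email": "ALICE@EXAMPLE.COM", "amount": "10.50"},
    {"name": "Bob", "email": "bob@example.com", "amount": "7"},
]


def extract():
    """Grab data from the source."""
    return list(SOURCE_ROWS)


def transform(rows):
    """Clean the data in flight, before it reaches the target."""
    return [
        (row["name"].strip(), row["email"].lower(), float(row["amount"]))
        for row in rows
    ]


def load(conn, rows):
    """Write the already-cleaned rows to the target (sink)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (name TEXT, email TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn):
    """Extract, then transform, then load, in that order."""
    load(conn, transform(extract()))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
    run_etl(conn)
    print(conn.execute("SELECT * FROM sales").fetchall())
```

The point to notice is that the target only ever sees cleaned data, and every row pays the transform cost before it lands.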

It’s how tools like SSIS work. They connect a source to a target and have a bunch of tasks or transforms in the middle that change the data in some way.

This is a good pattern for getting the work done when the target system is just built for querying data, such as a data warehouse. It is also good when you need to scrub some data, perhaps for privacy reasons.

This isn’t a good pattern when you are trying to load data quickly, as the transform process takes time.

ELT

This is the new way of doing things. In this pattern we Extract-Load-Transform, though really, the name doesn’t quite make sense, in that the process of moving the data just moves it.

Here we:

  • grab data from a source
  • write it to a target

Where’s the transform? Well, that happens on the target, often when someone queries the data. Modern analytic systems, like Snowflake and Synapse, can work with vast quantities of data, often stored in a data lake or blob system, and consume that with powerful computational capabilities. There could be some minor re-shaping of the data on write, but that’s not the idea.
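To make the "transform on the target" idea concrete, here is a small Python sketch. It is only an illustration under my own assumptions: an in-memory SQLite database stands in for a system like Synapse or Snowflake, and the raw_sales table and function names are invented for the example. The raw values are loaded untouched, and the cleanup happens in the query.

```python
import sqlite3

# Hypothetical raw rows, loaded exactly as they arrive (everything is text).
RAW_ROWS = [
    (" Alice ", "ALICE@EXAMPLE.COM", "10.50"),
    ("Bob", "bob@example.com", "7"),
]


def extract_and_load(conn, rows):
    """Move the data to the target without changing it."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS raw_sales (name TEXT, email TEXT, amount TEXT)"
    )
    conn.executemany("INSERT INTO raw_sales VALUES (?, ?, ?)", rows)
    conn.commit()


def query_clean_sales(conn):
    """Transform on the target, at query time, only for the rows being read."""
    return conn.execute(
        """
        SELECT TRIM(name), LOWER(email), CAST(amount AS REAL)
        FROM raw_sales
        ORDER BY TRIM(name)
        """
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    extract_and_load(conn, RAW_ROWS)
    print(query_clean_sales(conn))
```

The load step is fast because it does no work on the data. The cost of cleaning moves to whoever queries it, and it is only paid for data that actually gets read.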

This is good when you might not read all the data. Why process (transform) what isn’t being read? Before you complain that you should know what is used, none of us know if all our data is being used. Unless we write crappy SELECT * code with no WHERE clauses.

This is also good when we need to work at speed and privacy isn’t a concern. It’s great for the known formats of files sent to us, as the target system can project a table on top of a set of files.
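Projecting a table on top of a set of files is sometimes called schema-on-read. The Python sketch below is only a toy version of the idea, not how any particular analytic system implements it. The SCHEMA mapping, the read_table function, and the file names are hypothetical. The files stay as plain CSV, and the column types are applied only when rows are read.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical schema: column name -> type applied when a row is read.
SCHEMA = {"name": str, "amount": float}


def read_table(folder):
    """Yield typed rows from every CSV file in the folder, applying the schema on read."""
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="") as handle:
            for row in csv.DictReader(handle):
                yield {col: cast(row[col]) for col, cast in SCHEMA.items()}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # Files land as-is; nothing is transformed when they are written.
        Path(folder, "day1.csv").write_text("name,amount\nAlice,10.50\n")
        Path(folder, "day2.csv").write_text("name,amount\nBob,7\n")
        # The "table" only exists at read time.
        print(sum(row["amount"] for row in read_table(folder)))
```

Because read_table is a generator, a query that stops early never parses the remaining files, which is the same reason ELT is attractive when not all of the data gets read.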

ELT seems to be the direction in which many analytical and warehouse systems are heading.


Republish: The (Former) Complexity of PowerShell

I’m at SQL Bits today, teaching a pre-con with Grant, so I’m republishing The (Former) Complexity of PowerShell.


Daily Coping 9 Feb 2022

I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I’m adding my responses for each day here. All my coping tips are under this tag.

Today’s tip is to give positive compliments to as many people as you can today.

I do struggle with this tip whenever it comes up. I do always feel good when I compliment someone, but it’s not a habit for me. It ought to be, but it isn’t.

This is easy to do while coaching, at least for me, but outside of that it’s tough. When I saw this, I knew this would be a challenge.

I started in yoga. I not only thanked the teacher (easy), but I complimented this guy in class who must be a gymnast. He can not only balance on his hands, but he can alternately touch his toes to the ground on his left and right and balance on one hand. Freaking amazing.

I also saw a former teacher in the hall, greeted her and let her know it was nice to see her and I miss her teaching (easy, but I had to think of it).

I told the guy at the petrol station he looked happy. I remembered to tell my wife she looked nice, and I took a moment to ask my son about his day and then give him a positive comment on a thing he’d done.

Not a bad day.


T-SQL Tuesday #148–User Group Advice

This month the T-SQL Tuesday invitation is from Rie Merritt, and it’s one that means a lot to me. I don’t actually run a user group, but I think community is important. It’s a big part of my job and my life.

Rie asked me if she could host this month because she had a specific topic. If you’d like to host a T-SQL Tuesday, all you need is a blog and participation in another month by writing a post. You could even write a post today for T-SQL Tuesday #1.

If you’re interested, contact me.

Advice for User Groups

As I mentioned, I don’t run a group. In fact, I never have, but I have run events, attended lots of meetings, and I do speak at quite a few.

The pandemic has been hard on many groups, and a blessing to others. While I don’t like the virtual meetings, I understand how and why they work for some groups and not for others.

My big advice for organizers is twofold, and I’ll write a couple paragraphs about each.

Serve Your Community

My main advice is that in your work with groups or events, whether as a leader, speaker, or something else, you ought to make sure you are serving your community. What is best for them, or what do they want?

Sometimes I find leaders doing what’s best for them, without knowing what the community might prefer. If you want virtual meetings for your own personal reasons, whatever they are, make sure a large portion of your attendees and speakers agree. Same for in-person meetings. If your area wants to stay online, understand that.

This isn’t to say that you can’t experiment, or that you can’t make decisions on days, times, etc. It’s more advice to think about what helps people as we come out of this pandemic.

Go Slow

I’ve seen a lot of groups struggle over time to run a monthly meeting and find speakers. Speakers are easier online, but it’s still work.

My advice is that you can consider doing something less than monthly, especially if you move to hybrid or in-person. A quarterly or every-other-month pace might suit you (and others) better.

Also think about adding in some lunch meetings, perhaps just discussion ones without a presentation. Bond, be social, just talk, vent, and share what we love about data and technology.

Don’t kill yourself. I’d rather you enjoy running a group for 3 years than burn out after 6 months.
