Should I Learn PostgreSQL

I got asked this question recently:

I constantly see PostgreSQL on Microsoft slides, email, ads, etc. My MCADAA exam started with an entire section on it. I’m trying to determine if it’s worth focused study and training time. Roughly how much of the Azure cloud database space does PostgreSQL occupy? Should I include it in my personal training program?

It’s a good question, though I am assuming that MCADAA is the Microsoft Certified: Azure Database Administrator Associate cert. The page describing the exam only mentions Azure SQL (SQL Server), not PostgreSQL. I think Azure Database for PostgreSQL is good to learn, but the cert preparation material needs an update if that topic is on the exam.

In any case, should you learn PostgreSQL?

I’ll give you the DBA answer: it depends.

Ask yourself some questions:

  • Does your org use PostgreSQL or are they planning to do so?
  • Are you going to stick with your company for a few more years?
  • Are there more things you should be learning in the area you work in now?
  • Are there things you should become more skilled at that your company values?
  • Do you know what the opportunities are for people who know PostgreSQL well?

Depending on the answers, I may or may not recommend you spend time there. If there are other things to learn that would serve you better at work, either in the job you have or one you might want, then focus there. Whether or not your company values PostgreSQL influences your choice as well.

PostgreSQL is growing fast. DB-Engines shows growth across a few platforms, and PostgreSQL is doing well.

[Image: DB-Engines ranking chart]

The Stack Overflow developer survey shows similar results:

[Image: Stack Overflow developer survey results]

However, those are general results. Your specific situation is different. Think about it and put the questions above to lots of people.


The Vast Expansions of Hardware

At the Small Data conference recently, one of the talks looked at hardware advances. It was interesting to see a data perspective on hardware changes, as many of us only worry about the results of hardware: can I get my data quickly? In or out, most of us are more often worried about performance than specs. However, today I thought it might be fun to look at a few changes and numbers to get an idea of how our hardware has changed, in the march towards dealing with more and more data. Big data anyone?

In thinking about disks, I saw a chart that looked at the changes from HDD (hard disk drives) to SSD (solid state drives) to NVMe (Non-Volatile Memory Express). These show read speeds going through the list from 80MB/s to 200MB/s to 5000+MB/s. That’s a dramatic change, and not one found only in high-end arrays. There are off-the-shelf drives you can put in a desktop that read this fast. Compare that with some of the early IBM drives, which read at 8800b/s. Inside the timeline of our careers, disk read speed has grown by a few orders of magnitude.
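To put those read speeds in perspective, here is a small sketch that works out how long a full sequential read of 1TB would take at each speed, and how many orders of magnitude separate two speeds. The 1TB size, the decimal units (1GB = 1000MB), and the function names are my own assumptions for illustration. Real drives rarely sustain their peak numbers.

```python
import math

# Sequential read speeds in MB/s, taken from the figures quoted above.
READ_SPEEDS_MB_S = {"HDD": 80, "SSD": 200, "NVMe": 5000}


def seconds_to_read(size_gb: float, speed_mb_s: float) -> float:
    """Seconds to read size_gb gigabytes at speed_mb_s (assumes 1 GB = 1000 MB)."""
    return size_gb * 1000 / speed_mb_s


def orders_of_magnitude(slow: float, fast: float) -> float:
    """Base-10 logarithm of the speed-up from slow to fast."""
    return math.log10(fast / slow)


# Assumed workload: one full pass over 1 TB (1000 GB) of data.
for name, speed in READ_SPEEDS_MB_S.items():
    minutes = seconds_to_read(1000, speed) / 60
    print(f"{name}: {minutes:.1f} minutes to read 1 TB")

gap = orders_of_magnitude(READ_SPEEDS_MB_S["HDD"], READ_SPEEDS_MB_S["NVMe"])
print(f"HDD to NVMe: {gap:.2f} orders of magnitude")
```

The same two functions work for any pair of speeds, so you can plug in the numbers from your own storage to see where you sit.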

Write speed hasn’t grown as much, but capacity has. My early career work used HDDs with a 100MB capacity. These days we can get TB range storage on all of these media, with many laptops having 0.5TB or more on them. Desktops often have plenty more. My current workstation at home has 3.5TB of storage. Contrast that with the early IBM drive linked above, which had 5MB. These days people regularly demo queries against hundreds of TB, or even 1PB, in a database.

Many of us just expect the network to work well. In fact, I assume many of us won’t complain to network people since they are never at fault for performance issues. I started my career with ARCNET connections between machines. Those ran at 2.5Mbps. We were moving those and 4Mbps Token Ring to Ethernet at 10Mbps with Thicknet, Thinnet, and eventually RJ-45 connections. When we got 100Mbps bridges, I thought we were cutting edge for our SQL Server Central servers. If we look back 20 years, 1Gbps was more the standard then, but today we see growth up to 800Gbps with InfiniBand. While I don’t know many data centers doing that, there are plenty running in the 50Gbps range.

If we think about CPUs, I started my career on a 386 machine running at 25MHz. I helped upgrade some 286 machines, but most of our servers were 486 class machines at 25MHz. I still remember being excited about the early Pentium processors for a large system. There were many Pentium variants and later families of processors, but back in the 2000s, almost all machines were single-core. The first multi-core chips were released and slowly became more common over time. These days, many new laptops have multiple cores, including the new one I got, which has 12 cores. If you want, you can purchase an AMD Epyc 9004 processor with 96 cores. That’s on one chip. Since most servers can take more than one CPU, you can have hundreds of cores running if you want. If you want to get really crazy, the Nvidia Blackwell has thousands of cores for GPU-based AI calculations.

Memory has likewise grown, though it seems most servers have much less than a TB of RAM, which is much slower growth over time than storage and networking. Maybe because of those two changes, memory has had less of a reason to grow into common multi-TB-sized capacities in our systems. In fact, for you reading this, what are the common memory sizes you have in servers? I see many VMs and other machines set up with somewhere between 128GB and 1TB for memory, even as their data sizes have grown much, much larger. However, there are plenty that don’t have anything near 128GB.

That was one of the interesting things I realized about the Small Data conference, and one reason the event was created. Most of our data sets, especially usable sets, and most of our queries can run on a laptop if not a mobile device. The focus on big data seems overblown, especially as most of our companies don’t have anything approaching 100TB, much less 1PB. If you need it, there is hardware out there for you, but some of the amazing advances made over time are lost on me as the common, average capabilities out there on the majority of systems could handle the majority of my needs.

With some well-written queries.

Steve Jones

Listen to the podcast at Libsyn, Spotify, or iTunes.

Note, podcasts are only available for a limited time online.


A New Word: Hickering

hickering – n.  the habit of falling hard for whatever pretty new acquaintance happens to come along, spending hours wallowing in the handful of details you can gather about them, connecting the dots into elaborate constellations, even imagining an entire future together – images that have no particular purpose, except that they’re kind of fun to think about.

This sounds like a teenage love story habit, which I see from kids I coach, or from my kids in the past.

I’m not often enamored with new friends, and can be a bit wary or distant. I certainly get that from my parents, who were a little hesitant to add new friends. I do make new friends, and I find people very interesting in new ways, but I rarely feel hickering where I try to learn more about them.

I tend to flow in life and enjoy the interactions without looking to make them more or less important than they are in the moment.

From the Dictionary of Obscure Sorrows


Using Copilot to help me update SQL Saturday

An interesting AI experiment here with Copilot from GitHub in handling some code I don’t work with that often. Read on and watch.

This is part of a series of experiments with AI systems.

SQL Saturday Updates

I keep the sqlsaturday.com site updated with code updates through GitHub. The repository is here: https://github.com/sqlsaturday/sqlsatwebsite

I had hoped most organizers would fork this and use pull requests to update it, but surprisingly few data professionals are comfortable with Git. It’s fine, and I don’t mind.

However, sometimes I get text updates that are a bit cumbersome to make. Recently I had someone send me a bunch of table text to update, and as I was editing, I realized that Copilot in VS Code had improved dramatically. You can watch below how it worked.

Early in the Copilot days, I’d only get the next line added and sometimes the ending tag. Later, it started to suggest things, but from a guess standpoint, not reading what I’d pasted into the editor.

Now, it’s really, really helpful.

A cool way that AI can speed up your work, albeit one not a lot of data professionals might need.
