Reading a dataset in R

I got a dataset file the other day with an .rds extension. I had never seen this, but with a quick Google, I figured out this was a dataset exported from R. The format holds a single R object, such as a data frame, written out with saveRDS.

That’s interesting, but how do I get this data into my workspace? Does read.csv just work on this? It doesn’t, as you can see here:

[Screenshot: RStudio, 2020-05-29 15:36:04]

The format used is different, and there is a separate function to get this data. I searched and found this link, which describes loading data in two different ways. It looks easy, so let’s try it. I’ll use the readRDS function in code like this:

> qb.2016 <- readRDS(file="passingleaders2016.rds")

This works well, as we can see below. I get my data loaded in.

[Screenshot: RStudio, 2020-05-29 15:36:53]
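For anyone without an .rds file handy, here is a minimal sketch of the full round trip. The data frame and its values are made up for illustration (they are not the contents of my passing leaders file), and the file is written to a temporary path.

```r
# Hypothetical data standing in for a real dataset
passers <- data.frame(
  player = c("Player A", "Player B", "Player C"),
  yards  = c(4000L, 3500L, 3000L),
  stringsAsFactors = FALSE
)

# Write the object to a temporary .rds file
rds_path <- tempfile(fileext = ".rds")
saveRDS(passers, file = rds_path)

# Read it back into a new variable
qb <- readRDS(file = rds_path)

# Check that the object survived the round trip unchanged
identical(passers, qb)
```

Note that readRDS returns the object rather than creating a variable for you, so you choose the name when you assign the result.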

The first link above recommends this format for working in R, but I am often going from a database or other system to R, so I think I’ll mostly stick with CSV. It’s good to know how to load .rds files, though, in case I do need to work with them.
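One difference worth knowing when choosing between the two formats is that a CSV only stores text, so column types have to be guessed again on load, while an .rds file keeps the R object as it was. The sketch below uses a made-up data frame with a factor and a Date column to show this. All names and values here are hypothetical.

```r
# Hypothetical data with a factor column and a Date column
stats <- data.frame(
  team      = factor(c("A", "B", "C")),
  game_date = as.Date(c("2016-09-08", "2016-09-11", "2016-09-12")),
  yards     = c(178L, 423L, 334L)
)

csv_path <- tempfile(fileext = ".csv")
rds_path <- tempfile(fileext = ".rds")

# Save the same data in both formats
write.csv(stats, csv_path, row.names = FALSE)
saveRDS(stats, rds_path)

# Load both copies back
from_csv <- read.csv(csv_path)
from_rds <- readRDS(rds_path)

# Compare the column classes of each copy
sapply(from_csv, class)
sapply(from_rds, class)
```

In the CSV copy, game_date no longer comes back as a Date (it is text, or a factor in older versions of R), so it would need converting with as.Date. The .rds copy keeps the original classes.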


Daily Coping 8 June 2020

I’ve started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I’m adding my responses for each day here.

Today’s tip is to re-frame a worry and try to find a positive way to respond.

Today, I truly worry for the way forward in the US. I don’t think we’re headed for civil war, or anarchy, or the collapse of the economy, but I do worry that we continue to devolve into a place I no longer want to live.

I consider leaving the country more and more, something I hadn’t done before the last 2-3 years. Despite my differences with people, leaders, laws, systems, etc., I have often thought this was one of the best places to live.

In some ways, I still see that. I see tremendous creativity, activism, and effort from young people. In a way that I only read about from the 60s, and didn’t experience in my own generation, I see change and work coming from a younger one. I don’t always agree, but I appreciate their work.

In some sense, I miss the simple world before the Internet, but at the same time, I am constantly amazed by the way in which young and old people produce music, media, art, and sports, and share them with the world. They come up with things that would never have made it past some committee or an individual funding the work.

I’m also amazed by the enthusiasm and creativity of technology professionals. I’m certainly dismayed and disgusted by many, but the people that build software, that try to make something useful, that compile and distribute data, are doing incredible things.

All this gives me hope and positivity at a time when many things do not.


Double Half and Quarter

I’ve worked in a couple of very high performing organizations that adapted to changing conditions and built software well. I’ve worked in more poorly performing organizations that struggled to release updates and patches, causing tremendous stress for the IT staff. DevOps is designed to help improve our software delivery and quality, if you work on improvement in many areas.

I saw a post on LinkedIn from the Chief Architect at HSBC bank. This was interesting to me because I see HSBC ads constantly when I travel to the UK. They’re the 7th largest bank in the world and were founded in 1865. If ever there was an organization with lots of legacy everything, this is it. They have every reason to keep doing what they’ve been doing for years, since it’s worked out well.

The post notes that Jez Humble, well known DevOps author and co-founder of DORA, came to talk to them about software delivery and DevOps. For the author, the highlight of the day was the CIO giving this challenge: “… setting every team the goal to double, half and quarter every year: double the frequency of releases, half the number of low impact incidents, and quarter the number of high impact incidents.”

That’s an ambitious goal, and as the post notes, this results in exponential improvement year over year if the team can achieve this. I think there is likely some limit to this, based on team size and application complexity, but certainly when you’re going from a mid range performing software development group, this isn’t bad.
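To put rough numbers on that, here is a small sketch of how the three targets compound. The starting figures are invented for illustration and are not HSBC’s numbers.

```r
# Hypothetical starting point for one team (not real HSBC figures)
releases_per_year <- 12
low_impact        <- 64
high_impact       <- 16

# Year 0 is the baseline; each later year applies the goal once more
years <- 0:3

projection <- data.frame(
  year        = years,
  releases    = releases_per_year * 2^years,   # double every year
  low_impact  = low_impact * 0.5^years,        # halve every year
  high_impact = high_impact * 0.25^years       # quarter every year
)

print(projection)
```

After three years of hitting the goal, this imaginary team releases 8 times as often (96 releases), has one eighth of the low impact incidents (8), and has one sixty-fourth of the high impact incidents (0.25, or one every four years). That last figure is where a practical limit starts to show.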

I like that this goal was set not just to increase deployments, but to also prevent incidents. I think too many managers look at speed as the goal, without requiring quality to improve. This goal doesn’t quite address this, unless the impact incidents include bugs and poor performing software. It can be easy to limit incidents to downtime from deployments, and not necessarily the use of software.

The hands off management of this approach is good as well. The CIO doesn’t specify how this gets done, and leaves it to the technologists to get it done and hold each other accountable. With that kind of support from management, I’d hope most professionals would step up, improve process and quality, and take pride in their work. It seems to have worked at HSBC, as they’ve been written up a few times for their DevOps approach to IT. I think it can work anywhere with the right management approach.

Steve Jones

Listen to the podcast at Libsyn, Stitcher or iTunes.


The Social Impact of Data

Let the data drive your decisions.

This has been something of a mantra for many technical people, and even many business people, across the last twenty or so years. The allure of business intelligence is harnessing lots of data to make decisions that are rooted in some rational analysis of what has happened. Many companies use “data driven decisions” as a way of achieving success.

However, what about when the data is flawed? What about when deliberate or inadvertent actions give us data that isn’t quite as pure as we expect? In the last week we have seen many protests and complaints about the ways that many people feel they have been unfairly treated by police. That brutality, particularly for African Americans in the US, has been a problem for decades. Some of that is due to human biases, beliefs, and more. However, technology plays a part, and will for some time to come.

I have watched as algorithms have been used in sentencing, and I’ve questioned their use, as have others. There is this idea that computers will be more fair, looking at inputs and making a decision that isn’t encumbered by human biases. The problem is that humans that program the systems might have some bias. Maybe more disconcerting is that the data used to train systems is likely biased as well.

There is also the concern that as technology advances, it can be put to new uses, perhaps in ways that the inventor regrets. Oppenheimer regretted the violent use of his work, and I wonder if technology inventors will feel the same way. Surveillance technology is controversial. It can help retail companies prevent theft, but it can also be used in ways that might enhance and reinforce bias in police work. No matter how you may feel about the technology, there are moral questions of privacy and prejudgment that are worth debating.

The last couple weeks have saddened, upset, and angered me at different times. I am also confused and concerned, unsure of how to discuss and debate these topics. I find my position moving slightly with different stories and different information, as I should. I learn more and my views grow and change, shaped by what touches me. I do worry about how we use data in the future, and how it can be abused. There are ways in which more data can help improve our world, but the potential for abuse is high, and I do believe we need governance, transparency, and an independent appeal process for those wrongly impacted.

Steve Jones

 
