Representative Data Challenges

One of the areas where machine learning and artificial intelligence have had a lot of success is image work. Whether identifying people in pictures or helping cars stay on the road and out of each other’s way, this capability has worked well. It’s not perfect, and not necessarily as accurate as most humans, but it works well, or at least well enough. Sometimes it’s even better than humans.

There are issues, however, and I think some of them come from poor data sets. Last year when the pandemic hit, educators were challenged with how to conduct remote exams. While there are some solutions, they don’t always work well. Sometimes the algorithms don’t recognize people, especially non-Caucasians.

The issues raised reminded me of the issues with some bathroom gadgets. I have fairly dark skin, and I’ve always wondered why some sinks and soap dispensers wouldn’t work for me. I hadn’t thought much about it until I saw a few reports like the one listed above.

I don’t think there is anything malicious here, but I do think that teams often work on a happy path when building some new tool. They often test it themselves, but they don’t think widely about how a variety of customers will use things. While I’ve seen many personas, I rarely see anyone creating personas that consider something like skin color, or even a different culture. We often consider roles without deeply examining how those roles are implemented.

We need to work with representative data in whatever area we work, and that data should include some of the edge or corner cases that might come up. Our dev and test areas can start with small data sets, including those that we build, but at some point we need representative data. Whether we’re building OLTP software, sensors, or image recognition, our data should be well rounded.
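As a rough illustration of what checking for representative data might look like, here is a minimal Python sketch. It assumes each test sample carries a group label and that we have some idea of the share each group should have among real users. The function name, the labels, the expected shares, and the 50% tolerance are all hypothetical values chosen for the example, not figures from any real product.

```python
from collections import Counter


def underrepresented_groups(samples, expected_share, tolerance=0.5):
    """Flag groups whose share of the samples is below tolerance * expected share.

    samples: list of group labels, one per test sample (hypothetical labels).
    expected_share: dict mapping each group to its assumed share of real users.
    tolerance: assumed cutoff; 0.5 flags groups at under half their expected share.
    """
    counts = Counter(samples)
    total = len(samples)
    flagged = {}
    for group, expected in expected_share.items():
        actual = counts.get(group, 0) / total if total else 0.0
        if actual < expected * tolerance:
            flagged[group] = (actual, expected)
    return flagged


# Hypothetical test set for an image-based sensor, skewed toward one group.
test_images = ["light"] * 90 + ["medium"] * 8 + ["dark"] * 2
expected = {"light": 0.4, "medium": 0.3, "dark": 0.3}

flagged = underrepresented_groups(test_images, expected)
for group, (actual, wanted) in sorted(flagged.items()):
    print(f"{group}: {actual:.0%} of test data, expected about {wanted:.0%}")
```

A check like this won't say whether a product works for everyone, but it can show when a test set has drifted away from the customers it is supposed to represent.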

While systems don’t need to solve every issue, we ought to consider a large percentage of them. In the case of imaging, understanding the wide variety of people who might use a product would certainly seem to be important. Hopefully future teams won’t make the mistake of assuming that most of their customers look exactly like them.

Steve Jones

Listen to the podcast at Libsyn, Stitcher, Spotify, or iTunes.

Daily Coping 21 Jan 2021

I started to add a daily coping tip to the SQLServerCentral newsletter and to the Community Circle, which is helping me deal with the issues in the world. I’m adding my responses for each day here. All my coping tips are under this tag. 

Today’s tip is to connect with someone near you – share a smile or chat.

One of the things I’ve still been able to do this month is go to yoga at the gym. I try to go a couple of times a week, and then practice at home sometimes.

Recently, I went early and saw someone I knew slightly from class. Rather than just sit and start my own warmup, I said hi, asked how they were doing, and spent a couple of minutes engaging pleasantly with someone else.

That’s one of the things that doesn’t happen often during the pandemic, and it is something I appreciate taking the time to do.

Daily Coping 20 Jan 2021

Today’s tip is to switch off your tech 2 hours before bedtime.

I try to get away from my computer after work. I am lucky in that I have a separate office and PC. I typically don’t use a computer after hours, and I try not to use my phone too much at night. I do read on it, which I don’t count as “tech”, and have gotten better at not switching to check other things.

I decided recently to skip reading and instead just put my phone on the charger and ignore it. My wife and I do watch some TV at night, but I took time to hang out, play some cards, play some guitar, and relax without anything digital.

It was a nice break. Something I should do more often.

Daily Coping 19 Jan 2021

Today’s tip is to get moving. Do something physically active (ideally outdoors).

I am going to get active today. I’m taking a couple days off with my wife and heading to the mountains to ski. I went for the first time last week with a friend, and this will be my first time with my wife. We’re driving up early and will spend a couple days enjoying the outdoors, getting some exercise, and taking a break from life.

I hope you find a way to do something similar soon.
