Package Management - Supply Chain problems

As a millennial who develops software, I use other people’s libraries whenever I can to get the job done as quickly as possible. Unless it’s something particularly bespoke, there is always a library out there which does what I want to do, written better than I ever could. It’s harnessing the collaborative and open source communities that enables technology to advance so quickly. Often when developing software professionally, you are under time constraints. It seems to be very rare to be given time to learn, design, and develop software, and often you’re expected to just get the job done as quickly as possible.

Companies that develop software are likely to employ information assurance personnel who (in my experience) do not have development experience. They may hold CISSP, but it really just teaches the basics and I don’t believe it is enough to make a competent cyber professional. The thought process is often “create a policy that blocks TCP 3389 and 22”, glossing over the actual security concerns. Policy needs to be “block remote access from public sources” and not touch specific technologies. A port can be used for anything - just because a spec says that 3389 is reserved for RDP doesn’t mean that RDP is what’s running on it. It’s trivial to proxy ports or find workarounds (Guacamole?). What’s the answer? Block everything? Monitor everything? Zero trust? In a similar vein, assurance personnel may dictate the use of internal repository mirrors, under the illusion that it brings any security whatsoever. I strongly believe that security needs to come from the bottom up - contrary to popular belief. I know the security processes are bad, because I know how I’d get around them.

As I was developing some Python code, I realised that I really don’t have a clue what’s actually in the libraries I’m using. From a due diligence perspective, it doesn’t matter. I’m using libraries from approved internal mirrors. But I don’t control the mirrors and I don’t control the libraries. As an experiment, I thought I would create a Python package called ‘johndoe’ and upload it to PyPI. The thought process is that anyone (John Doe) can upload any Python code onto PyPI. They don’t check what’s being uploaded and they won’t. It’s not in their remit - they are a platform enabling collaboration.
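
To get a feel for how much third-party code is sitting in an environment, something like the following could be a starting point. It is only a rough sketch using the standard library’s `importlib.metadata`; the function name `summarize_installed` is my own invention, and a file count obviously says nothing about whether the code is trustworthy.

```python
from importlib import metadata


def summarize_installed():
    """Return (name, version, file_count) for each installed distribution."""
    rows = []
    for dist in metadata.distributions():
        # Broken installs may lack a Name field, so fall back to a placeholder.
        name = dist.metadata.get("Name", "<unknown>")
        # dist.files is None when the distribution has no record of its files.
        files = dist.files or []
        rows.append((name, dist.version, len(files)))
    return sorted(rows, key=lambda row: str(row[0]).lower())


if __name__ == "__main__":
    for name, version, file_count in summarize_installed():
        print(f"{name} {version}: {file_count} files")
```

Running it in a typical project environment tends to make the point on its own: there are usually far more packages (and files) than anyone has actually read.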

I haven’t updated the package since I made it last year. The johndoe package doesn’t do much, but it doesn’t really have to. Once the package is installed (and used) it can do pretty much anything. It could literally be ransomware. The point I’m trying to make isn’t to upload a malicious package - plenty of people have already done that by uploading packages with names that are slight misspellings of popular packages (although I do think PyPI have cracked down on that). All it takes is a malicious actor solving a problem on Stack Overflow using a dodgy package and the whole company could be ruined.

johndoe itself isn’t really malicious; all it does is print out some basic group policy information. There are plenty of methods of getting data out of a network. The obvious one is to just send it to a website; even if you have ingress/egress traffic monitoring and encryption, there may be ways around that. A method that I found really interesting was data exfiltration via DNS - https://dejandayoff.com/using-dns-to-break-out-of-isolated-networks-in-a-aws-cloud-environment/ Getting information out is the easy bit; it’s getting access to the systems in the first place which is “difficult”. Poor package management is one way in that seems to have slipped through a lot of security policies.
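
From the defender’s side, DNS exfiltration tends to show up as query names with unusually long or random-looking labels. Below is a minimal detection sketch, not a production rule: the function names and both thresholds (`max_label_len`, `entropy_threshold`) are assumptions I picked for illustration, and treating the last two labels as the registered domain is a crude shortcut that will be wrong for some TLDs.

```python
import math
from collections import Counter


def shannon_entropy(text):
    """Shannon entropy of a string, in bits per character."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_like_tunnelling(qname, max_label_len=40, entropy_threshold=3.5):
    """Flag query names whose subdomain labels are very long or look random.

    Thresholds are illustrative guesses and would need tuning on real traffic.
    """
    labels = [label for label in qname.rstrip(".").split(".") if label]
    # Assumption: the last two labels are the registered domain, so skip them.
    candidates = labels[:-2]
    for label in candidates:
        if len(label) > max_label_len:
            return True
        if len(label) >= 16 and shannon_entropy(label) > entropy_threshold:
            return True
    return False
```

Something like this would sit on top of resolver logs. It will produce false positives (CDNs love long hostnames) and a patient attacker can stay under any threshold, which is rather the point of the paragraph above.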

This problem is global and highlights the need for security design in development environments. It isn’t Python specific by any means. The answer is NOT “block all packages”; they are pivotal to modern software development. When you next get into your “secure” development environment, try to install johndoe. I’d bet that a majority of people have no problems downloading that package. Even if your policy is to download onto an air gapped machine and virus scan the software, it’s not going to find anything.
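
One small step that at least ties an install to a specific artifact somebody has looked at is pinning hashes. The sketch below checks a downloaded file against a pinned SHA-256 digest using only the standard library; the helper names `sha256_of` and `matches_pin` are hypothetical. It only proves the file is the same one that was reviewed, not that the reviewed file was safe.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pin(path, pinned_hex):
    """True if the file's SHA-256 digest equals the pinned hex digest."""
    return hmac.compare_digest(sha256_of(path), pinned_hex.strip().lower())
```

pip can enforce the same idea itself with `pip install --require-hashes -r requirements.txt`, where each requirement line carries a `--hash=sha256:...` value. A mirror that serves a different file then fails the install rather than silently succeeding.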

This post is licensed under CC BY 4.0 by the author.