Trust assumptions in non-reproducible FOSS applications

Yes, that’s the point of Hardware-Assisted Build Environments & provenance attestations.

1 Like

Is there any tutorial on how to verify this?
Because the workflow for reproducible builds is simple:

  1. Download the binary
  2. Download the source code
  3. Compile the source code as instructed
  4. Hash both the published and the compiled binary and compare the hashes
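Step 4 is mechanical enough to script. A minimal Python sketch (the file names here are placeholders; in practice they would be the published binary and your local rebuild):

```python
import hashlib

def sha256_of(path: str) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 16), b""):
            h.update(chunk)
    return h.hexdigest()

# Demo with two stand-in "binaries" (real usage: the published APK
# and the one you compiled yourself from source).
with open("published.bin", "wb") as f:
    f.write(b"\x7fELF...same bytes...")
with open("rebuilt.bin", "wb") as f:
    f.write(b"\x7fELF...same bytes...")

match = sha256_of("published.bin") == sha256_of("rebuilt.bin")
print("REPRODUCIBLE" if match else "MISMATCH")
```

If the hashes differ, you know only *that* the builds diverged, not *why*; that is where most of the hard work discussed below lives.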

For GitHub HABE outputs:

For my project, firestack, Actions verify (both Build output and Software Bill of Material) attestations before publishing the artefacts (build outputs) to Maven Central.

Reproducibility is not my idea of “simple”. Neither is “as instructed”. That isn’t to say Reproducibility isn’t a stronger guarantee, but for establishing “trust” in build outputs, it needn’t be the only solution and most certainly, isn’t the “simplest”.

What exactly is so complicated about making reproducible builds?

Projects like F-Droid successfully produce reproducible builds for all their apps, so it can’t be too hard to do this

1 Like

I hope you’re not trolling (:

Several reasons. The primary one (imo) is that some of the things (mirror) required for reproducibility are out of the maintainers’ control, and achieving it may even require disabling better build outputs (like those from Profile-Guided Optimizations, which are a first-class citizen in some popular ecosystems, like Android & Go).

If you’re technical enough, I recommend reading Debian’s journey published by IEEE: [2104.06020] Reproducible Builds: Increasing the Integrity of Software Supply Chains

If that’s too much, here’s a quote by Chris Lamb (mirror), one of the 4 stewards of the Reproducible Builds project at Software Freedom Conservancy, that sums it up (emphasis mine):

After considerable effort, Tails now offers fully reproducible and verifiable images, helping to protect the users of Tails but also the developers that volunteer their time to the project.

All that said, things across the board (toolchains, compilers, build environments, file systems, packagers, etc) have improved considerably since 2014/15 when the effort was started at Debian.

And that’s for projects like F-Droid that have a lot of overlap with the Debian community (and by extension, the Reproducible Builds community, too).

And even then, things like the number of CPU cores (mirror) trip up reproducibility.
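To see how something like parallelism can leak into outputs, here’s a toy Python illustration (not an actual build system): if archive entries are written in whatever order parallel workers happen to finish, the bytes differ from run to run, and sorting the inputs restores determinism.

```python
import hashlib, io, zipfile

FILES = {"b.txt": b"bravo", "a.txt": b"alpha"}

def build_zip(order):
    """Simulate a parallel build: workers finish in 'order', and each
    writes its output into the archive as it completes. Timestamps are
    pinned so that ordering is the only variable."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        for name in order:
            info = zipfile.ZipInfo(name, date_time=(1980, 1, 1, 0, 0, 0))
            z.writestr(info, FILES[name])
    return hashlib.sha256(buf.getvalue()).hexdigest()

# Two runs, different worker completion order -> different archive bytes.
run1 = build_zip(["a.txt", "b.txt"])
run2 = build_zip(["b.txt", "a.txt"])
print(run1 == run2)  # False: completion order leaked into the output

# Fix: sort the inputs before archiving, regardless of completion order.
run3 = build_zip(sorted(["a.txt", "b.txt"]))
run4 = build_zip(sorted(["b.txt", "a.txt"]))
print(run3 == run4)  # True: output no longer depends on scheduling
```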

Note that F-Droid defines “upstream reproducibility” (mirror), where the signature can be copied over from the developer-built APK to the F-Droid-built APK, as a higher standard, which I don’t think all that many apps meet.

4 Likes

Thank you for the reply, and no, I am not trolling 🙂

Could the things that make a build non-deterministic be written into a build log file that can be used as compiler instructions when reproducing outputs?
So that the compiler/build environment does everything exactly as described in the build log?
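Something like this does exist: Debian’s .buildinfo files record the build environment so rebuilders can recreate it. A toy sketch of the idea in Python (the JSON format and the particular knobs pinned here are my own illustration, not Debian’s actual format):

```python
import json, os, platform, time

def record_build_env(path="build-env.json"):
    """Write a minimal 'build log' pinning knobs that commonly make
    builds non-deterministic: timestamps, locale, timezone, toolchain."""
    env = {
        "SOURCE_DATE_EPOCH": os.environ.get("SOURCE_DATE_EPOCH",
                                            str(int(time.time()))),
        "LC_ALL": "C.UTF-8",   # locale affects sort order and messages
        "TZ": "UTC",           # timezone affects embedded timestamps
        "python": platform.python_version(),  # toolchain version
    }
    with open(path, "w") as f:
        json.dump(env, f, indent=2, sort_keys=True)
    return env

def replay_build_env(path="build-env.json"):
    """Re-apply the recorded knobs before rebuilding."""
    with open(path) as f:
        env = json.load(f)
    for key in ("SOURCE_DATE_EPOCH", "LC_ALL", "TZ"):
        os.environ[key] = env[key]
    return env

recorded = record_build_env()
replayed = replay_build_env()
print(recorded == replayed)  # True: the rebuild sees the same knobs
```

The catch, as discussed above, is that not every source of non-determinism can be captured this way (PGO profiles, CPU core counts, etc.).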

Isn’t this only possible if the original build is reproducible too?
And then there would be no need for an F-Droid build when the upstream build is identical?

Unsure what went on, but it is likely that a lot of stuff (including what you suspect) needed to happen for F-Droid to have all those Android apps (the ones written in Java/Kotlin, at least) built reproducibly (albeit on their own rebuilders).

You’d get a better answer asking about this on the F-Droid forums. Hans-Christoph Steiner, who leads the reproducibility effort, is active there.

Well, technically, yes.

But it is also not necessary that F-Droid’s build/rebuild corresponds to upstream’s build (“original build”). For example, upstream may depend on different dependencies for their own builds versus what they publish to F-Droid for packaging (and this example is one of the “easier” hurdles to clear).

1 Like

(I stumbled upon an article and thought I’d share here as it is super relevant)

Yeah, Eric Rescorla (ex-CTO Mozilla / TLS 1.3 author) argues the same thing in his essay:

“But what about open source software?” I hear you say. “I’ll just review the source code and determine whether it’s malicious”. I would make several points in response to this. The first is: “LOL”.

Any nontrivial program consists of hundreds of thousands to millions of lines of code, and reviewing any fraction of that in a reasonable period of time is simply impractical. The way you can tell this is that people are constantly finding vulnerabilities in programs, and if it were straightforward to find those vulnerabilities, then we would have found them all.

You’re certainly not going to review every program you run yourself, at least not in any way that’s effective. And that’s just the first step: the supply chain from “source code available” to “I actually trust this code” is very long and leaky.

Even if you did review the source, most software—even open source software—is actually delivered in binary form (when was the last time you compiled Firefox for yourself?) so what makes you think the binary you’re getting was compiled from the source code you reviewed?

If anything, open source may be better for security (as barriers for external reviewers are non-existent), but that alone is not nearly enough.

Eric’s blog post on software trust is just too good.

Eric Rescorla’s articles in general have been a big help to me in breaking down the complexity/layers of privacy and security topics. I had not read this article so thank you for sharing.

At a quick glance, it also confirms my general feeling:

Open source, audits, reproducible builds, and binary transparency are all good, but they don’t eliminate the need to trust whoever is providing your software and you should be suspicious of anyone telling you otherwise.

There is sometimes a strong opinion in FLOSS/Privacy communities that you don’t need to trust the software author/maintainer if it is open source and has a specified recipe for reproducible builds. In reality, as observed by ekr, it’s more complicated.

1 Like

The funniest of it all is “audits”, especially of closed source software.

[Skiff’s] transparency statement clearly says “Security audits”. This is different from privacy audits. You cannot audit privacy, since you can intentionally change the functionality of your software right after the audit.

For the same reason, you cannot [simply] share open-source version of your software and say that it respects privacy. That can be only said if you use reproducible builds, and for client software only.

From the white paper, it appears as if this system requires its users to trust t... | Hacker News

The intended audience don’t even know what they’re reading or looking at. To some, a mere audit is good enough, but do they realise that bad audits may never be made public? At this point, it feels like the firms are just doing marketing for each other with carefully authored audits fit for public release. Deeply unserious.

[Skiff hasn’t] published the reports, scope, and full findings. We don’t even know what Trail was testing. I don’t think the security audit stuff matters at all, and Trail is a fine firm, but you can’t use the mere existence of a pentest project this way.

What exactly got tested in each of these assessments, and what conclusions did those assessments draw? I asked this upthread and I’m asking here again, because “[Skiff has] had 4 audits” doesn’t mean anything without that detail.

From the white paper, it appears as if this system requires its users to trust t... | Hacker News

And sometimes, the audits made public are too limited in scope to make anything of them.

tbf, I understand where FOSS folks are coming from. Reproducibility is the most they could do, and they’re willing to go that far, which is positive. But I also understand that it is being sold as something it quite isn’t, just because FOSS folks may want to hammer home the point that for security, it is open source or bust. It is a necessary criterion[1] (which is a good argument with or without reproducibility) but not nearly sufficient, as Eric points out.


  1. Debatable, as for closed source software, there’s always source escrow services like IronMountain and provenance stores for Binary Transparency like SigStore. ↩︎

1 Like

I think at the moment I can’t really tell how strong the security guarantees are, so I definitely will read more into this.
But I still hope that most FLOSS software will be reproducible one day.

Thank you for all your replies

1 Like

Is Java/Kotlin better suited for reproducibility?
It could also be better for code auditability, because you could share much of the logic between different platforms, since Java/Kotlin can basically run everywhere, plus Kotlin’s syntax seems easy to read for a compiled language.
What do you think about this?

But if the upstream maintainers don’t publish all the information necessary to reproduce their binaries, how can someone make a matching build (without spending a lot of time testing build variables)?

BTW: Thank you for answering my (probably dumb) questions

1 Like

Don’t think so.

It is a given that the bigger a project is, and the more deps it pulls in and builds from source, the more complicated achieving reproducibility might be. This is why OSes/distros attempting this have been at it for years, if not a decade.

Kotlin as-is isn’t statically compiled, but Kotlin Native can be. And on Android, Kotlin may be Ahead-Of-Time (AOT) compiled. (From what I know, that’s the default setting in GrapheneOS; or at least, from the code I saw, it kicks in when you turn on Dynamic Code Loading protections. On most Android distros, though, Kotlin is only partially AOT’d due to disk space concerns.)

AOT may also result in hard to reproduce builds. I am not sure.

The way developers typically vend binaries on F-Droid is by writing a separate build recipe for it, different from the build recipes for other platforms. So it needn’t necessarily be the case that the developers are building the same stuff F-Droid is, and hence the build output F-Droid has will not match the build output the developers are producing. Like mentioned above, this is one of the “easier” hurdles to clear (that is, for some apps it may be straightforward for the developer to build the F-Droid recipe everywhere).

1 Like

Of course, if they pull in non-reproducible binaries, then it becomes harder to make the build reproducible, because you would also need to make all the dependencies reproducible.

But there are some distributions that are reproducible.
One example is GrapheneOS.

There are many Kotlin apps on F-Droid that are reproducible, so I don’t think Kotlin is preventing reproducibility.
If you just compile Kotlin with the kotlinc CLI to a .jar, it’s reproducible by default. So if there are issues, they are probably somewhere else in the build chain.
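And when a rebuilt .jar doesn’t match bit-for-bit, diffing the archive’s entry metadata is a quick way to find where the non-determinism crept in (a .jar is just a .zip). A rough Python sketch, using a fabricated timestamp mismatch as the demo:

```python
import zipfile

def fingerprint(path):
    """Per-entry view of a .jar: name, CRC32 of the contents, and the
    embedded timestamp. Diffing two fingerprints pinpoints which entries
    diverged and whether content or metadata is to blame."""
    with zipfile.ZipFile(path) as z:
        return [(i.filename, i.CRC, i.date_time) for i in z.infolist()]

# Demo: same class bytes in both "builds", but one embedded a
# different timestamp -- a classic source of non-reproducibility.
for path, stamp in [("build1.jar", (2020, 1, 1, 0, 0, 0)),
                    ("build2.jar", (2021, 6, 15, 12, 0, 0))]:
    with zipfile.ZipFile(path, "w") as z:
        z.writestr(zipfile.ZipInfo("Main.class", date_time=stamp),
                   b"\xca\xfe\xba\xbe fake class bytes")

mismatches = [(e1, e2) for e1, e2 in zip(fingerprint("build1.jar"),
                                         fingerprint("build2.jar"))
              if e1 != e2]
for e1, e2 in mismatches:
    print(f"{e1[0]}: content matches={e1[1] == e2[1]}, timestamp differs")
```

Here the CRCs match but the timestamps don’t, which tells you the fix is timestamp normalization rather than anything in the compiler output.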

F-Droid has their own build system, and I think you can only make a matching build when you’re also using it.

Yes, they are more trustworthy.

One crucial aspect is code smell. Some proprietary programs’ source has leaked in the past and revealed extreme cluelessness about implementing proper cryptographic code.

It’s not necessarily a bug or vulnerability, but it can show the developer has no idea what they’re doing and serious vulnerabilities are a question of when.