Embedded systems often have crappy compilers. And you sometimes have to pay crazy money to be abused, as well.
Years ago, we were building an embedded vehicle tracker for commercial vehicles. The hardware used an ARM7 CPU, GPS, and GPRS modem, running uClinux.
We ran into a tricky bug in the initial application startup process. The program that read from the GPS and sent location updates to the network was failing. When it did, the console stopped working, so we could not see what was happening. Writing to a log file gave the same results.
For regular programmers, if your machine won't boot up, you are having a bad day. For embedded developers, that's just a typical Tuesday, and your only debugging option may be staring at the code and thinking hard.
This board had no Ethernet and only two serial ports, one for the console and one hard-wired for the GPS. The ROM was almost full (it had a whopping 2 MB of flash: 1 MB for the Linux kernel, 750 KB for apps, and 250 KB for storage). The lack of an MMU meant no shared libraries, so every binary was statically linked and huge. We couldn't install much else to help us.
A colleague came up with the idea of running gdb (the text mode debugger) over the cellular network. It took multiple tries due to packet loss and high latency, but suddenly, we got a stack backtrace. It turned out `printf()` was failing when it tried to print the latitude and longitude from the GPS, which are floating point numbers.
A few hours of debugging and scouring five-year-old mailing list posts turned up a patch to GCC (never applied) that fixed a bug on the ARM7 affecting uClibc.
This made me think of how the folks who make the space probes debug their problems. If you can't be an astronaut, at least you can be a programmer, right? :-)
At least the debugger worked. The processor I used in embedded systems in college, the 68HC11, would stop doing conditional branches when the supply voltage was too low.
We had a battery-powered board with no brownout detection, and I was using rechargeable NiMH batteries to save money/waste. When the students with alkaline batteries had low batteries, the motor load would bring Vcc down far enough that the CPU would reset by itself. With NiMH, the batteries could still drive the motors and keep the CPU alive...
You could single step in the debugger, and see the flag register was set as expected, but the branch didn't happen. Just ran straight through. I can't remember if unconditional jump or call worked. After about the third time this happened, I got good at figuring it out.
> For embedded developers, that's just a typical Tuesday
I was trying to explain to my colleague the other day that I've spent an unhealthy amount of time rebooting devices while staring at an LED wondering why it won't turn on.
It is nuts to have a dev board that is as constrained as the final device. You should have had an additional serial port and 8x as much flash; it would have solved your problem immediately.
It is even better to do the bulk of the dev inside of an emulator if you can swing it. The GPS and GPRS could be tethered into the emulator instead of trying to get a debug link into the system board.
> For regular programmers, if your machine won't boot up, you are having a bad day. For embedded developers, that's just a typical Tuesday, and your only debugging option may be staring at the code and thinking hard.
Of course where it becomes even more fun is when it's a customer's unit in Peru and you can't replicate it locally :). But oh how I love it. I have definitely spent many a day staring at code piecing things together with what limited info we have.
But to get back on topic, I can definitely attest to the quality of most embedded compilers. It's a great day when I can just use normal old gcc. I've never run into anything explicitly wrong, but I see so many bits of weird codegen or missed optimisations that I keep the disassembly view open permanently, as a sanity check. The assembly never lies to you - until you find a silicon bug, at least.
Tuesday, indeed. :)
In the embedded world, correctly working hardware isn't a given, either. Part of the board bringup/hardware verification process is just determining that everything on the board actually works. Always fun when you have to figure out if a problem is in your code or in the hardware. (HINT: It's often both.)
It's rare that you need to break out the oscilloscope or logic analyzer, but when you absolutely have to know if that line went high or not, there's no substitute. :)
Were these commodity boards? Having to resort to using the cellular connection, instead of attaching a hardware debugging probe (J-Link?) seems like a recipe for a painful squandering of intellect.
One of the lovely "features" of embedded work is that after a while of doing this sort of thing, sometimes you get good enough at the crazy hacks that it becomes faster and easier to do something like this than to track down who has the J-Link (okay, they've usually got more than one) and can they spare it/where did they put it/why does that person have a J-Link at all/is the J-Link still alive....
> For regular programmers, if your machine won't boot up, you are having a bad day. For embedded developers, that's just a typical Tuesday, and your only debugging option may be staring at the code and thinking hard.
It seems to me that if you can still update and reboot said machine, you can do a bisect on your commits to pinpoint the regression. Once you spot the regression commit you can split it to check what introduced the regression.
It took them multiple tries just to use gdb; I don't think this is a scenario where you can easily reflash the image on the board.
Did the GCC patch get applied after that?
"Never" implies no, I guess. :-)
I’ve spent 30 years working on compilers.
They have bugs. Lots of them.
With that in mind, the article is correct that the vast majority of issues people think might be a compiler bug are in fact user errors and misunderstanding.
My experience actually working with users has been somewhat humorous in the past, including multiple instances of people completely freaking out when something they reported turned out to be a miscompile, to the point that they no longer felt any code could be trusted since it could have been miscompiled in some way.
Compilers are multimillion line programs, and they have an error rate which is commensurate with multimillion line programs.
That said, I think like half the bugs I see get filed against the compiler aren't actually compiler bugs but errors in user code--and this is already using the filter of "took the trouble to file a compiler bug." So it's a pretty good rule of thumb that it's not a compiler bug, unless you understand the compiler rules well enough to articulate why it can't be user error.
It's not quite half the bugs on GCC's bug tracker, but it's very high: https://gcc.gnu.org/bugzilla/report.cgi?x_axis_field=&y_axis...
It's around 10% invalid bugs and another 10% duplicates. A lot of them that I've seen, including one of mine, are a result of misinterpreting details of language standards.
Compilers have a huge advantage over other programs: they are fully deterministic, since they depend only on input files, command-line arguments, and a few environment variables. That makes bugs easier to reproduce and fix compared to interactive applications, programs with networking, multi-threading...
Pretty sure most modern compilers are multithreaded, and do exhibit a slew of practical nondeterminisms, which is how/why projects like Reproducible Builds were formed.
Most compilers are single-threaded for most of the compilation process--at the very least, compiling a single file (translation unit) is almost always done using just one thread.
However, nondeterminism does creep into various places in the compiler. Sorting an array by pointer value is an easy way to get nondeterminism. But the most common nondeterminism in a build system comes not from the compiler but from the filesystem--"for file in directory" usually returns files in inode order, which is effectively nondeterministic across different computers.
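A minimal sketch of that pointer-order nondeterminism (my C++ example; the names are illustrative). With ASLR, heap addresses differ between runs, so deterministic input still yields a different order from run to run:

    #include <algorithm>
    #include <cstdio>
    #include <functional>
    #include <vector>

    int main() {
        // Allocate a few objects; their addresses depend on the allocator and ASLR.
        std::vector<int*> items;
        for (int i = 0; i < 4; ++i)
            items.push_back(new int(i));

        // The sort key is the pointer value itself: std::less gives a total
        // order over pointers, but that order changes from run to run.
        std::sort(items.begin(), items.end(), std::less<int*>{});

        for (int* p : items)
            std::printf("%d ", *p);  // e.g. "2 0 3 1" one run, "1 3 0 2" the next
        std::printf("\n");

        for (int* p : items)
            delete p;
    }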
Yes, that's why I was so careful with the wording. Timestamps are another example.
I am very curious: if these bugs are that common, then why don't we see more programs with weird bugs when they are running, and especially why don't we see them documented? Is it because, when an unknown bug turns out to be a compiler bug and not a code error, it gets fixed right away and with little fanfare? Or is there some sort of resiliency built into the compiled code that can mask compiler bugs? Or is there some other factor?
Also, how easy is it to discover a compiler bug, and how easy is it to identify that a bug in your executable is due to a compiler bug?
A significant factor in my experience is that a lot of programs are quite similar from a compiler's perspective: they use a well-trodden set of features and combine them in a predictable way. Compiling those regular programs is well-tested and well-understood. Compiler bugs tend to be relegated to the exotic paths, hit when using language features in novel and interesting ways.
Large functions are a particular breeding ground.
Ages ago, working on PS2 games, one of our guys had a particularly huge "do-animations-and-interpolations-and-state-and-everything-for-the-hero-in-one-huge-switch" thingy (not uncommon to encounter in games) that crashed GCC; the function had to be split up.
In the sequel, I think a similar function grew enough that they not only split up the function but also spread it across multiple files to avoid miscompiles.
Most recently, I was generating an ORM binding (C#) from the database model of an ERP system, and for mysterious reasons the C# runtime was crashing without stack traces, etc. (no debugger help). Having seen things like this before, I realized that one of the auto-generated functions was huge, so I split it up into multiple units and, lo and behold, it worked.
(Having written a tiny JVM once, I also remembered that jump instructions there are limited to 64 KB; I'm not 100% sure if the .NET runtime inherited that... once it worked, I didn't put any effort into investigating the causes.)
Most of the time though compiler bugs aren't the worst (unless they help cause confusion in already hard scenarios).
> I am very curious, if these bugs are that common then why don’t we see more programs with weird bugs when they are running and especially having them be documented?
Any given program has N "native" bugs and M bugs introduced by the compiler. I think as long as N >> M you won't really notice. Even if you stumble across a compiler bug by chance, proving it is a nightmare: there's so much UB everywhere that any possible output is technically correct. Exceptions are compiler crashes but those are rare.
In my experience, most compiler bugs are found in well-tested and proven software during a compiler version update or when switching compilers. That roughly corresponds to the prerequisite that N is small.
Compiler projects run enormous regression suites, and the CI/git/bisect style of development has made bugs harder to check in and quicker to squash in a lot of cases, I would say.
I have found a number of compiler bugs in GCC and LLVM (and in GAS and LLVM's assembler). Almost without fail they have been in the use of new features (certain new instructions, a new ABI / addressing model) or esoteric things (linker script trickery, unusual use of extended inline asm), etc., where the compilers probably had no or very little "real" code to test against, other than presumably some simple things and basic unit tests when said features were checked in.
Unless you're doing _really_ unusual things, or exercising new paths that don't just get picked up when compiling existing code (e.g., like many/most optimizations would), it's just not that likely you'll write code that triggers some unique path / state that has a noticeable bug.
To identify that a bug is a compiler bug involving silent bad code generation, you basically assume the compiler is correct until you narrow the problem down to a state which should be impossible. After you put in enough assertions and breakpoints and logging (some of which might make the problem mysteriously go away) and reach the point of banging your head on the table, you start side-eyeing the compiler. If you know assembly, you might start looking at some assembly output. Or you would start trying to make a reduced reproducer. E.g., take the suspect function out on its own and write some unit tests for it. A tool like C-Reduce can sometimes help if it's not a relatively simple, small function.
How quickly you reach that point where you can actually start to narrow down on a possible compiler bug entirely depends on the problem. If it's causing some memory ordering or race condition or silent memory corruption that is only detected later or can only be reproduced at a customer sporadically, then who knows? Could be months, if ever. Others could be an almost immediate assert or error log or obvious bad result that you could debug and file a bug report in a day.
It's amazing how many compiler issues never translate into meaningful deviations at the level of application behaviour. Code tends to be highly resilient to small execution errors, seemingly by accident. I wonder what a language/runtime would look like if it were optimized to maximize that resilience, i.e. every line could miscompile in arbitrary ways. Is there a smarter solution than computational redundancy without an isolated verifier system?
I hit a similar issue in 2017, and it is still the case today: Python's builtin `random.shuffle` destroys numpy arrays passed into it [0]. This is apparently a design limitation within numpy and cannot be detected or fixed, so it still stands. I spent hours combing through my own code wondering where the bug was, because there was no way it was caused by numpy or Python, but eventually all the likely scenarios got ruled out...
[0] https://github.com/numpy/numpy/issues/10215
Back when I worked on the MPC-HC project we found a bug in the Visual Studio MSVC compiler. When we upgraded from VS2010 to VS2012 subtitles would fail to render.
We eventually traced it down to a small for loop that added 0.5 to double members of an anonymous struct. For some reason, the combination of three factors (an anonymous struct, double data types, and a for loop) caused those member variables to end up uninitialized.
We extracted this code into a small code sample to make it easily reproducible and reported it to Microsoft. Their compiler team called it one of the most helpful reports they'd gotten and confirmed it was a bug in their for-loop vectorization code. The compiler appeared to have messed up the SIMD instructions to write the results of the addition back to memory.
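Something with roughly this shape has the three ingredients described (a reconstruction under my own assumptions, not the actual MPC-HC repro; note that anonymous struct members are a Microsoft extension, which MSVC accepts):

    #include <cstdio>

    struct Timing {
        struct {          // anonymous struct holding double members
            double start;
            double stop;
        };
    };

    int main() {
        Timing t[8] = {};
        // Under VS2012's auto-vectorizer, a loop like this reportedly had its
        // SIMD write-back miscompiled, leaving the members effectively garbage.
        for (int i = 0; i < 8; ++i) {
            t[i].start += 0.5;
            t[i].stop += 0.5;
        }
        for (int i = 0; i < 8; ++i)
            std::printf("%f %f\n", t[i].start, t[i].stop);
        return 0;
    }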
As a compiler developer, I see plenty of bugs. So, it's sometimes a bug. But, in the case of C (and C++ by extension), it's often a language design bug that unfortunately has no fix and can only be worked around.
There are 830 open and confirmed wrong-code bugs in GCC at the time of writing. Compiler bugs aren't as rare as people think: https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=NEW&bug_...
I think it's just common for people to assume they're wrong and change things blindly rather than carefully checking the standard for their language (assuming their language even has a standard to check). It doesn't help that before AddressSanitizer and co. existed compilers would just do all sorts of nonsense when they detected possibly undefined code in C and C++.
Oh man. I uncovered a hash implementation bug in Go, ca. 2014 or so, and I spent like two days prepping my bug report and tests; I was so certain it was me. The team of course was super nice and like ‘good catch’. Victory lap day for any nerd.
The article is right: it is almost never a compiler bug. I have had that experience of reporting and being wrong. It sucks.
On the other hand, I have a confirmed bug in Clang [1] and a non-rejected bug in GCC [2], so it does happen.
[1]: https://github.com/llvm/llvm-project/issues/61133
[2]: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108448
No, it is very often a compiler bug. Just look at the gcc, clang or rustc tickets.
e.g. https://gcc.gnu.org/bugzilla/buglist.cgi?bug_status=__open__...
It's massive, and several gcc versions have to be blacklisted. The clang restrict bug is still not fixed; it never worked. rustc was never memory-, type- nor concurrency-safe.
I wonder if the bubble-sort implementation in this library helped prolong the life of this bug. Most people would choose another impl for performance reasons, and thus not find this bug.
Note... it's not really that "it's never a compiler bug," but more like "it's never a backend/codegen bug."
It's not particularly hard (for someone who knows the language rules, which are difficult for a language like C++) to make a widely-used compiler be erroneous in its acceptance or rejection of code.
What's much more difficult ("never" happens) is to make the compiler accept valid code and then generate an incorrect executable. It's possible (and I run into this maybe once a year doing unusual things) but it's really rare. If you think that's what's going on, it's very unlikely to be the case.
Codegen bugs are not particularly rare either, but you usually run into them when doing "weird stuff" (which hits an edge case somewhere within the compiler).
And the first instinct of most C++ programmers when seeing weird compiler behavior is to assume their weird code somehow triggered undefined behavior, so they refactor their program until it's less weird. But then it usually also no longer hits the edge case in the compiler's logic, so the program starts working correctly. Most developers then don't spend additional hours/days investigating whether it was truly undefined behavior or whether they hit a compiler bug.
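For illustration, here is my own example (not from the thread) of why "it's probably UB" is a reasonable first guess -- genuine undefined behavior can look exactly like a codegen bug:

    #include <climits>
    #include <cstdio>

    int main() {
        // Signed overflow is undefined, so an optimizer may assume `i` never
        // wraps, conclude `i > 0` stays true, and emit an infinite loop at
        // -O2, even though the code "obviously" terminates at -O0.
        for (int i = INT_MAX - 2; i > 0; ++i)
            std::printf("%d\n", i);
        std::puts("done");  // may never be reached once optimizations kick in
    }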
Round 'bout 10 years ago, I was working on this Python C extension and, after a distro upgrade, it started segfaulting. Dropping down into gdb, Python was fairly obviously calling the wrong C function. I didn't know if the linker, the compiler, or Python was at fault, and "it is never a compiler error" was at the forefront of my mind, so I never even tried to report the incorrect behavior, out of fear that maybe I was doing something stupid that caused gcc to compile an incorrect shared library without complaining.
IIRC, after the next Fedora release everything started working again, so maybe it was not me? Still don't know.
It really depends on which compiler you are testing and whether the version you are testing has just been released or has been around for some time. If the compiler is for a niche language, then it's possible to find bugs. If the compiler has just been released, it's even possible to be the first person to note the bug. But the bigger the language and the more time has passed, the less likely this is.
This is definitely a big factor. I've found one compiler bug, but it was in a feature that had been added all of two months earlier (optional chaining in Typescript 3.7).
Learning to code (C) I thought I found a compiler bug lots of times and was almost always wrong. It gave me the heuristic that if I thought I found a compiler bug, it was time to take a break, have a snack and go for a walk or something before looking again. It usually helped me find my mistake much faster.
The thing I disliked most about later learning PHP or JavaScript was that my previously usually-wrong reaction of "the compiler is insane" suddenly turned out to be commonly true. Even when it wasn't an actual bug, PHP and JavaScript were often so poorly designed that the intended behaviour wasn't much better than a bug.
Thanks for reminding me that the two programming languages I'm using are poorly designed :) Joke aside, JS is getting better, especially when paired with the right tools, like TypeScript, VS Code, ESLint, Prettier, React+JSX, etc. PHP has been evolving for a long time to be a bit safer, with more static analysis.
I'm not a fan of PHP's variables that are available in a bigger scope than they should be, arrays that can be filled without having been defined in the first place, or arrays that are falsy when empty. The solution is to not abuse these features (not use them at all, really) and code as if it was not PHP.
My worst slowdown ever was when a compiler failed because a bit had flipped somehow. After a month and a half I finally reinstalled it and everything worked perfectly.
That sucks, but feels good to have solved it.
This is why it is important to have portable build environments. And ECC and checksumming file systems.
Do MacBooks have that? What is the best checksumming file system I could use on mainstream Linux?
Only ones I know of are ZFS and BTRFS.
However, duckling the web got this workaround for EXT4:
https://serverfault.com/a/1153319
i.e. use a device-mapper or LVM setup to do the data integrity check at the block level, under the filesystem.
1. No, Macs don't have ECC to my knowledge
2. ZFS if supported by your distro, btrfs otherwise
While it's almost never a compiler error, it happens, and I have personal experience; I once found an error in the VAX/VMS Pascal compiler - and could demonstrate it as such by disassembling the compiler output - and had to work around it until DEC fixed it.
Idk, I report bugs on GCC / clang something like every few months. I used to do it for MSVC too, but there were honestly too many.
I crashed the Oracle HotSpot Java virtual machine back in 2017 with a totally innocuous program involving nested arrays. After reproducing and minimizing it, I filed a bug report. It got fixed quickly.
I'm not sure why the page is no longer publicly available: https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-818... (JDK-8181921)
At my first job, it actually was a compiler error, and I'm not sure if my manager ever believed me. We were using an internal gcc fork and cross-compiling, so who knows where the bug was, but the compiler team got back to me. Jump tables were sometimes broken, and we had to add a switch to disable them.
Not the right lesson to learn for a first job.
For anyone who worked in embedded programming in the bad old days of proprietary compilers, it sometimes felt like the compiler working correctly was the common case. One of my first jobs involved programming a smallish, embeddedish, ruggedized computer in C. IIRC I wasted several hours on a bug once before realizing that it was a compiler issue and I needed to try arbitrarily rearranging the buggy function until it generated code that at least appeared to work.
In the early days of C++11, I used to get unique ICEs in both GCC and Clang weekly. One particular annoyance was when a stable release of Debian decided to ship a point release with a regression (not looking it up, but it was something like: 4.6.1 and 4.6.3 worked, but 4.6.2 had completely broken user-defined literals (UDLs) in constant expressions or something). I had just converted the whole codebase to use UDLs aggressively, since they worked everywhere in my tests, not thinking I had to test every point release in between ...
Thankfully I don't think I ever had any miscompilations - that would require the code actually compile across several compiler versions in the first place.
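A sketch of the kind of code involved (my example, not the original codebase): a user-defined literal evaluated in a constant expression, the C++11 combination the point release reportedly broke:

    // A constexpr literal operator: `4_KiB` is evaluated at compile time.
    constexpr unsigned long long operator"" _KiB(unsigned long long n) {
        return n * 1024;
    }

    static_assert(4_KiB == 4096, "UDL usable in a constant expression");

    int main() { return 0; }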
Discussed at the time:
“It is never a compiler error” - https://news.ycombinator.com/item?id=15699675 - Nov 2017 (272 comments)
This brings back memories of XL calculating an address wrong because it lay on a boundary ≡ 0 (mod 2^32). Fortunately, the TOBEY (XL back-end) guys were in the same area of the building, so re-establishing our sanity was faster than it otherwise could have been...
>> It is not a compiler error. It is never a compiler error (2017)
No, not always true. Even in modern compilers -- ones as mature and modern as VS 2022 -- you can still get bugs.
I found one[0]. In my case it's easy to tell it's a compiler bug, because the program simply doesn't compile properly. But it's also not easy to reproduce, which just shows how well-tested compilers usually are.
0: https://github.com/dotnet/roslyn/issues/74872
Our infrastructure team keeps about 2 MSLOC building on several compilers and running on several architectures. They report a new compiler bug every 2-3 years.
I still have a recognition letter from Borland regarding a bug I found in Turbo Pascal 6.0:

    function BrokenResult: Integer;
    var
      BrokenResult: Integer; (* This should not happen *)
    begin
      BrokenResult := 42 (* The local variable gets assigned; the function result is whatever the compiler comes up with *)
    end;
When I started learning Turbo Pascal I came across a problem where an if-statement was obviously decided wrong. I saw the values in the debugger.
My rescue was a more experienced friend who knew that, IIRC, the compiler would choose the data type of the left operand of a comparison for the right operand as well, leading to potential sign switches.
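The analogous pitfall in C/C++ (my illustration; the original case was Turbo Pascal): the usual arithmetic conversions pull the signed operand over to unsigned, silently flipping the comparison:

    #include <cstdio>

    int main() {
        int a = -1;
        unsigned int b = 1;
        if (a < b)  // a converts to 0xFFFFFFFF, so the condition is false
            std::puts("-1 < 1, as expected");
        else
            std::puts("surprise: -1 is not less than 1u here");  // this prints
    }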
I’ve hit so many fun compiler bugs. Usually easy to work around though (yay modern / fp flavored languages). It certainly helps when it also crashes the compiler ;).
Miscompilation bugs are definitely nasty though. Especially if it's a self-bootstrapping compiler. Save your old build artifacts! :)
Back when I was using CodeWarrior to make a game for PlayStation 2, I found a compiler bug, but fortunately, it was one where it gave an error on valid code, rather than generating bad output. I can't remember the details, but I had some sort of equation that my co-workers agreed should have compiled with no problems. I was able to rewrite it a little to get the result I wanted without triggering any compiler errors.
Whoa, CodeWarrior was one of the worst compilers (and IDEs) I've had to use so far.
As the article shows, it's highly dependent on which compiler you're relying on. Always good to keep this in mind when assessing the likelihood of an error.
Just last week I tripped over a couple compilation bugs in (an old version of) bpftrace.
One was caught by internal checks somewhere, something about struct member offsets that I think was an alignment / padding issue and didn't seem to actually break anything. The other made it segfault during compilation, and I had to just tweak my code blindly until it decided to go away.
I’ve thought I’d found a compiler bug maybe 5 times in my life, and it has never actually been a compiler bug.
When I reflect on the ~25 years I’ve been programming C, all of the times I thought I’d found a compiler bug were in the first ~8 years. Dunning-Kruger hard at work :-/
I found one by accident with C++. It was a situation where class A had a protected field x, B inherited from A, and C was a friend of B. Can C access x?
GCC and Clang disagreed on this. Upon close reading, Clang was right, C should be able to access x.
(Why did I do this? C was a helper class for the purpose of running unit tests. Unit tests are supposed to poke around in stuff you wouldn't normally poke around in.)
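A minimal reconstruction of that scenario (A, B, C, and x are from the comment; the exact code is my guess):

    class A {
    protected:
        int x = 0;
    };

    class B : public A {
        friend class C;  // C gets the access of a member of B
    };

    class C {
    public:
        // A member of B may access the protected, inherited x through a B
        // object, and a friend of B gets that same access, so this is valid.
        int peek(B& b) { return b.x; }  // Clang accepted this; GCC did not
    };

    int main() {
        B b;
        C c;
        return c.peek(b);
    }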
I've encountered 1 in ~20 years. I don't even remember what it was, but I remember being shocked when I tracked it down and it actually was a compiler bug.
I found multiple compiler bugs at my first real programming job in 1997.
MSVC did not do a good job of maintaining the FPU stack in those days…
On the other hand, if you really focus on testing a compiler, particularly an immature one, it's remarkable how many bugs you can find.
Or if one is using newly introduced language features or accelerated instruction sets.
I wonder what % of compiler bugs go unidentified due to the user code-massaging them away in some fashion.
Similarly, I once ran into a broken implementation of a Dictionary type (in Mono, I think.) It was only comparing the keys' hash codes, not the keys themselves. In most scenarios this turned out to be more than good enough - for int32 keys obviously it will work, and for most strings it works too if the hash function is good - but I had a great many keys without an amazing hash function for them.
It's funny how sometimes a really glaring bug can hide in a stdlib for months or years just because by luck the stars never align to trigger it where somebody can notice it. In my case, the dictionary bug was causing recoverable errors, and I only noticed because I dug in instead of going "Mono's just broken".
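A C++ stand-in for that Dictionary bug (my sketch; the original was in Mono's C# class library): a key-equality predicate that compares hash codes instead of the keys themselves, which works by luck until two distinct keys collide:

    #include <cstdio>
    #include <functional>
    #include <string>
    #include <unordered_map>

    struct HashOnlyEqual {
        bool operator()(const std::string& a, const std::string& b) const {
            // BUG: distinct keys with colliding hashes compare "equal",
            // so one entry silently shadows the other.
            return std::hash<std::string>{}(a) == std::hash<std::string>{}(b);
        }
    };

    int main() {
        std::unordered_map<std::string, int,
                           std::hash<std::string>, HashOnlyEqual> m;
        m["alpha"] = 1;
        m["beta"] = 2;  // fine only while no hash collision occurs
        std::printf("%zu entries\n", m.size());
    }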
...unless it is. Compiler crashes are easy to see, but it can actually be nontrivial to identify miscompilations, as they may only trigger in certain code paths, and only with careful observation can you notice the second-order effects...
If you specifically look for them you might find quite a bit: https://web.cs.ucdavis.edu/~su/publications/emi.pdf [disclosure: an author]
In my case, it wasn't a compiler bug - it was a bug in the STL, before the STL was part of the compiler. It was a separate thing you downloaded. I found a bug, and emailed Stepanov (or Lee - I forget). Me, just some random nobody on the internet. I got a fix, and then an improved fix, and then a final fix, all within two hours. I was floored.
Thankfully, though, we can still look at the STL source easily, and presumably determine the source of a bug, trace behavior, or design test cases more easily, etc.
I was playing with Java 1.0.1, trying to make an app screen with a GridBagLayout. It made an utter hash of my layout, drawing things on top of each other, etc. Applying the First Rule of Compiler/Runtime Bugs, I double-checked and triple-checked and quadruple-checked my work, making sure I used the GridBagLayout API exactly according to spec. Eventually I posted to USENET comp.lang.java asking, "Is there a bug in GridBagLayout?"
The problem disappeared in Java 1.0.3.