LLVMs libcxxabi has a demangler which doesn't have the worst looking code in the world[0], has no external dependencies (outside of the C++ standard library), has lately been fuzzed[1] and is tested[2].
Switching languages is cool, but the Rust code is actually longer and still uses a hand written parser, so how can you be sure it is any more correct or won't eat all your memory?
[0] http://llvm.org/viewvc/llvm-project/libcxxabi/trunk/src/cxa_...
[1] http://llvm.org/viewvc/llvm-project/libcxxabi/trunk/fuzz/cxa...
[2] http://llvm.org/viewvc/llvm-project/libcxxabi/trunk/test/tes...
Coincidentally, I recently discovered that code I landed in the mozilla repo a couple of weeks ago crashes gdb when it tries to demangle (and c++filt on the command line). Looks to be infinite recursion. It's still in the current Nightly binary.
_ZN2js18CompartmentChecker5checkIN2JS8GCVectorI4jsidLm0ENS_15TempAllocPolicyEEEEEN7mozilla8EnableIfIXsrNS7_6IsSameIDTclptcvPT_LDn0E5beginEEDTclptcvSB_LDn0E3endEEEE5valueEvE4TypeERKSA_
coming from
template <typename Container>
typename mozilla::EnableIf<
mozilla::IsSame<
decltype(((Container*)nullptr)->begin()),
decltype(((Container*)nullptr)->end())
>::value
>::Type
check(const Container& container) { ... }
> Linkers only support C identifiers for symbol names.
This is only true of UNIX system linkers, before the FOSS and UNIX clones wave, it was common for each compiler to have its own language specific linker.
Can it help to implement C++ bindings for Rust?
How bad are the rules for MSVC compared to the Itanium ones and would you consider adding support for it too?
Not that I have a use case in mind or anything, just curious.
> These days, almost every C++ compiler uses the Itanium C++ ABI’s name mangling rules. The notable exception is MSVC, which uses a completely different format.
You stay classy, Microsoft.
> Its not just the grammar that’s huge, the symbols themselves are too. Here is a pretty big mangled C++ symbol from SpiderMonkey [...] That’s 355 bytes!
Here's a >4kB symbol I encountered while liberating some ebooks from an abandoned DRM app:
tetraphilia::transient_ptrs<tetraphilia::imaging_model::PixelProducer<T3AppTraits> >::ptr_type tetraphilia::imaging_model::MakeIdealPixelProducer<tetraphilia::imaging_model::XWalkerCluster<tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalkerList3<tetraphilia::imaging_model::const_UnifiedGraphicXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits>, 0ul, 0, 1ul, 0ul, 0, 0ul, 0ul, 0, 0ul, 1ul>, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::const_IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::OneXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::OneXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 0ul> > > >, tetraphilia::TypeList<tetraphilia::imaging_model::XWalkerCluster<tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalkerList3<tetraphilia::imaging_model::const_UnifiedGraphicXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits>, 0ul, 0, 1ul, 0ul, 0, 0ul, 0ul, 0, 0ul, 0ul>, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::const_IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::OneXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::OneXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 0ul> > > >, tetraphilia::TypeList<tetraphilia::imaging_model::XWalkerCluster<tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalkerList3<tetraphilia::imaging_model::const_UnifiedGraphicXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits>, 0ul, 0, 1ul, 0ul, 0, 0ul, 0ul, 0, 0ul, 1ul>, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::const_IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::const_IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> > > >, tetraphilia::Terminal> >, T3AppTraits, tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits>, tetraphilia::imaging_model::SeparableOperation<tetraphilia::imaging_model::ClipOperation<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> > > >(tetraphilia::ArgType<tetraphilia::TypeList<tetraphilia::imaging_model::XWalkerCluster<tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalkerList3<tetraphilia::imaging_model::const_UnifiedGraphicXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits>, 0ul, 0, 1ul, 0ul, 0, 0ul, 0ul, 0, 0ul, 1ul>, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::const_IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::OneXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::OneXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 0ul> > > >, tetraphilia::TypeList<tetraphilia::imaging_model::XWalkerCluster<tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalkerList3<tetraphilia::imaging_model::const_UnifiedGraphicXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits>, 0ul, 0, 1ul, 0ul, 0, 0ul, 0ul, 0, 0ul, 0ul>, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::const_IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::OneXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::OneXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 0ul> > > >, tetraphilia::TypeList<tetraphilia::imaging_model::XWalkerCluster<tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalkerList3<tetraphilia::imaging_model::const_UnifiedGraphicXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits>, 0ul, 0, 1ul, 0ul, 0, 0ul, 0ul, 0, 0ul, 1ul>, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::const_IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> >, tetraphilia::imaging_model::GraphicXWalker<tetraphilia::imaging_model::const_IgnoredRasterXWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 0ul, 0, 1ul, 1ul>, tetraphilia::imaging_model::const_SpecializedRasterXWalker<unsigned char, 2ul, -1, 3ul, 3ul> > > >, tetraphilia::Terminal> > > >, T3AppTraits::context_type&, tetraphilia::imaging_model::Constraints<T3AppTraits> const&, tetraphilia::imaging_model::SeparableOperation<tetraphilia::imaging_model::ClipOperation<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> > >, tetraphilia::imaging_model::const_GraphicYWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> > const*, tetraphilia::imaging_model::const_GraphicYWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> > const*, tetraphilia::imaging_model::const_GraphicYWalker<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> > const*, tetraphilia::imaging_model::SegmentFactory<tetraphilia::imaging_model::ByteSignalTraits<T3AppTraits> >*)
There's also this one https://github.com/ianlancetaylor/demangle
I really enjoy reading these pieces where someone rewrites something in rust and it turns out better than the old C version due to rusts safety features. Usually those kinds of projects are just a rewrite to someones favorite language of the month, for little tangible benefit other than their own satisfaction or education. These rust ones seem to show tangible benefits to the language itself.
Tom Tromey, a GNU hacker and buddy of mine, mentioned that historically, the canonical C++ demangler in libiberty (used by c++filt and gdb) has had tons of classic C bugs: use-after-free, out-of-bounds array accesses, etc, and that it falls over immediately when faced with a fuzzer. In fact, there were so many of these issues that gdb went so far as to install a signal handler to catch SIGSEGVs during demangling. It “recovered” from the segfaults by longjmping out of the signal handler and printing a warning message before moving along and pretending that nothing happened. My ears perked up. Those are the exact kinds of things Rust protects us from at compile time! A robust alternative might actually be a boon not just for the Rust community, but everybody who wants to demangle C++ symbols.
Then, later:
Additionally, I’ve been running American Fuzzy Lop (with afl.rs) on cpp_demangle overnight. It found a panic involving unhandled integer overflow, which I fixed. Since then, AFL hasn’t triggered any panics, and its never been able to find a crash (thanks Rust!) so I think cpp_demangle is fairly solid and robust.
That's what I like to see. Targeted useful reimplementations in Rust that play well to its strengths. In this case, as a double benefit to both the Rust ecosystem and to anyone that wants a robust demangling library.