C2 Lang design (2014) [pdf]

  • I understand why it's attractive to integrate more and more into the language proper, but I don't like it. Almost all tools which still embrace the unix way of doing things are written in C, and it's not just because that community is conservative. C feels like unix, because it's one of the only languages that abides by that philosophy. C-the-language is only part of the experience of programming in C. Programming in C includes DSLs and code generation and makefiles. Even what we consider C-the-language can be decomposed into preprocessor code and actual C.

    That said, C needs to be improved, but I think it should be done via the build system, adding layers on top of C, in a tasteful, thought-out way. Some ideas below:

    If we're expanding on C by adding build system complexity, Makefiles need to be improved. Make is a great tool, but it's unityped, like shell scripts. And it's essentially a preprocessor over shell, which leads to a mess of sigils. Maybe redesigning make as an embedded language in some lisp would do it. Additionally we could unify the notion of linking to a library and importing code into the makefile to allow dependencies to specify build steps. This could make some of the added build complexity implicit.

    Namespaces could probably be implemented as a preprocessor, taking module declarations and import statements and converting all identifiers into their prefix-qualified equivalents, emitting warnings when there would be a collision (like if module "foo" declared "bar_baz" and module "foo_bar" declared "baz").
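
    A minimal sketch of the collision case described above, assuming the preprocessor joins module and identifier with a single underscore. The helper names `qualify` and `collides` are hypothetical and exist only for illustration:

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper: builds the prefix-qualified name "<module>_<ident>"
     * that a namespace preprocessor might emit. Returns 1 on success, 0 if the
     * result does not fit in `out`. */
    int qualify(const char *module, const char *ident, char *out, size_t out_size)
    {
        assert(module != NULL && ident != NULL && out != NULL);
        int n = snprintf(out, out_size, "%s_%s", module, ident);
        return n >= 0 && (size_t)n < out_size;
    }

    /* Returns 1 when two (module, identifier) pairs map to the same
     * qualified name, which is the case the preprocessor should warn about. */
    int collides(const char *mod_a, const char *id_a,
                 const char *mod_b, const char *id_b)
    {
        char a[128], b[128];
        if (!qualify(mod_a, id_a, a, sizeof a) ||
            !qualify(mod_b, id_b, b, sizeof b))
            return 0; /* too long to compare in this sketch */
        return strcmp(a, b) == 0;
    }
    ```

    Here `collides("foo", "bar_baz", "foo_bar", "baz")` is true because both pairs flatten to `foo_bar_baz`. A separator that cannot appear in ordinary identifiers would avoid the ambiguity, at the cost of less readable symbol names.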

    Rust-style syntax-case macros can be implemented as a preprocessor.

    Go-style defer statements could be implemented in a preprocessor, and avoid the somewhat verbose goto-style error handling.
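
    For reference, this is roughly the goto-style cleanup pattern that a defer preprocessor would replace. It is a sketch only; `join_copy` is a hypothetical function invented for the example:

    ```c
    #include <assert.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical example: copies two strings into temporaries and returns
     * their concatenation. Every failure path jumps to one cleanup label. */
    char *join_copy(const char *a, const char *b)
    {
        char *result = NULL;
        char *tmp_a = NULL;
        char *tmp_b = NULL;

        assert(a != NULL && b != NULL);

        tmp_a = malloc(strlen(a) + 1);
        if (tmp_a == NULL)
            goto cleanup;
        strcpy(tmp_a, a);

        tmp_b = malloc(strlen(b) + 1);
        if (tmp_b == NULL)
            goto cleanup;
        strcpy(tmp_b, b);

        result = malloc(strlen(tmp_a) + strlen(tmp_b) + 1);
        if (result == NULL)
            goto cleanup;
        strcpy(result, tmp_a);
        strcat(result, tmp_b);

    cleanup:
        /* free(NULL) is a no-op, so partially completed runs are safe. */
        free(tmp_b);
        free(tmp_a);
        return result; /* caller frees; NULL on allocation failure */
    }
    ```

    With defer, each `free` would sit next to its `malloc`, and the preprocessor would emit the cleanup code at every exit from the function.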

    As I said, this approach requires a lot of care to avoid adding a huge amount of complexity to the language.

  • Thoughts as I read:

    1. "uninitialized var usage is error": unfortunately impossible without at least one of the following compromises: Automatically initialize variables (wastes CPU); False alarms (see Java); Built-in formal proof system; or, Require compilers to solve the halting problem.

    2. Removed keyword "static": kills one of my favorite tricks, "self-init'ing functions".

    3. New keyword "as": A good invention in Pythonland. Good call to bring this in.

    4. New keyword "nil": Redundant with NULL?

    5. Example - Base Types: Uses uint8 in place of char. This obscures intent and makes code less readable. Compare: int library_fnc(char *errmsg) versus int library_fnc(uint8 *errmsg). In the former it's clear errmsg is a string; in the latter it's not clear (it could be a pointer to a flag).

    6. Example - function types. Doesn't one usually typedef the function pointer, rather than the function itself? So making that require two lines is annoying. Aside from that, the author is right that C has confusing function pointer typedef syntax.

    7. Multi-part array initialization: Encourages unmaintainable code. Depending on what's in those "..."'s, might require compiler to solve halting problem?

    8. Multi-pass parsing: Trades maintainability for instant gratification.

    9. Symbol accessibility: The author makes "public" (and implicit "private") modify entire structs rather than individual fields...

    10. Multi-file module: May lead to unmaintainable code.

    11. I'm worried about the language arbitrarily defining things like "the results of building are stored in the 'output' directory". OTOH the recipe.txt idea could help standardize what amounts to a lot of ad hoc Makefile programming.

    12. Build process difference: Theoretically could speed up compilation. I'm worried for social reasons. In module-based languages, we tend to fall into module hell, one symptom being the infamous 20-page stack trace (see Java, Clojure, etc.). The nature of C's #include incentivizes shallow dependency trees (a very good thing).

    13. "Language scope": trades portability for convenience

    14. Tooling: This shouldn't be part of the language, it should be separate.
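
    On point 2, one plausible reading of "self-init'ing functions" is a function that uses function-local `static` state to set itself up on its first call. The sketch below assumes that reading; `popcount8` and the `init_calls` counter are hypothetical names used only to show the pattern:

    ```c
    #include <assert.h>

    /* Hypothetical counter, present only so the one-time setup is observable. */
    int init_calls = 0;

    /* Counts the set bits in a byte using a lookup table that the function
     * builds itself on first use. Not thread-safe as written. */
    unsigned popcount8(unsigned char x)
    {
        static unsigned char table[256]; /* zero-initialized by the language */
        static int initialized = 0;

        if (!initialized) {
            /* table[0] stays 0; each entry reuses the entry for i / 2. */
            for (int i = 1; i < 256; i++)
                table[i] = (unsigned char)((i & 1) + table[i / 2]);
            initialized = 1;
            init_calls++;
        }
        assert(initialized);
        return table[x];
    }
    ```

    Callers never see the table or the flag, and the setup cost is paid once. Without function-local `static`, both would have to move to file scope or into an explicit init function.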

  • Is there an advantage to this over, say, a more modern and safe language like Rust? It seems to be just reducing the complexity of the language, but doesn't look like it will reduce memory-related bugs.

  • This is cool. I've often toyed with a similar idea of creating a language that improves/fixes the things C messed up. If you aren't worried about safety (memory bugs can largely be avoided by changing how you do memory allocation, i.e. switching from individual mallocing to region-based memory management) then C is actually a pretty nice language, since it is simple enough to hold the entire language in your head. Plus it's nice to know how things are actually laid out in memory. The problems C2 solves are really the main things that frustrate me about C: header files, lack of a build system, no modules, spiraling type signatures.
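
    A minimal sketch of the region-based allocation mentioned above, assuming a single fixed-size block and a 16-byte alignment. The `Region` type and the `region_*` functions are hypothetical, not taken from C2 or any library:

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    #define REGION_ALIGN ((size_t)16) /* assumed alignment for this sketch */

    typedef struct {
        unsigned char *base; /* start of the backing block */
        size_t capacity;     /* total bytes in the block */
        size_t used;         /* bytes handed out so far */
    } Region;

    /* Reserves one block up front. Returns 1 on success, 0 on failure. */
    int region_init(Region *r, size_t capacity)
    {
        assert(r != NULL);
        r->base = malloc(capacity);
        r->capacity = (r->base != NULL) ? capacity : 0;
        r->used = 0;
        return r->base != NULL;
    }

    /* Bump-allocates `size` bytes, rounded up to REGION_ALIGN.
     * Returns NULL when the region is exhausted. */
    void *region_alloc(Region *r, size_t size)
    {
        assert(r != NULL);
        if (size > SIZE_MAX - (REGION_ALIGN - 1))
            return NULL; /* rounding up would overflow */
        size_t padded = (size + REGION_ALIGN - 1) & ~(REGION_ALIGN - 1);
        if (padded > r->capacity - r->used)
            return NULL;
        void *p = r->base + r->used;
        r->used += padded;
        return p;
    }

    /* Releases every allocation made from the region at once. */
    void region_free(Region *r)
    {
        assert(r != NULL);
        free(r->base);
        r->base = NULL;
        r->capacity = 0;
        r->used = 0;
    }
    ```

    Individual objects are never freed on their own; the whole region goes away in one `region_free` call. That removes the per-object bookkeeping behind many leaks and double frees, though a pointer kept past `region_free` still dangles.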

  • Looks extremely interesting. I'm skeptical of some of its claims (faster compilation when incremental compilation is removed sounds unlikely, for example), but nothing that worrisome.

    Anybody have any experience? Is it still basically a toy language?

  • Very similar to Ark: www.github.com/ark-lang/ark. Except Ark has no GC, has tagged enums, ownership is enforced, and a few other smaller differences.

  • There is an ISO standards committee (JTC1/SC22/WG14) for that, please!