The article talks about code complexity a lot. That, however, is not the chief source of complexity in many code bases. The number of tools needed to build things is the bane of the modern software developer. The use of microservices where none are needed also results in an enormous increase in complexity. Code complexity is relatively easy to fix compared to all of this.
The real answers about complexity come from thinking about why we even care. It's because our feeble minds have to build internal models of the code so we can work with it. The cognitive cost of building those models is why complexity matters.
What things make it more difficult to build those models? A partial list, mostly as others have mentioned:
- tool and library dependencies
- nested conditions (see the sketch below)
- loops, and especially nested loops
- asynchronous processing, callbacks, etc.
- non-descriptively named variables and functions
- using non-standard code patterns for standard functionality
- delocalized code, i.e. you have to navigate somewhere else to see it (throws off your working memory)
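As a toy sketch of the nested-conditions point (all names here are invented for illustration), here is the same check written with nesting and with guard clauses. The flat version lets you read and discard each condition instead of holding the whole branch stack in working memory:

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    paid: bool = False
    shipped: bool = False
    items: list = field(default_factory=list)

# Deeply nested conditions force the reader to hold every enclosing
# branch in mind at once before reaching the real work.
def can_ship_nested(order):
    if order is not None:
        if order.paid:
            if order.items:
                if not order.shipped:
                    return True
    return False

# The same logic with guard clauses: each condition can be read once
# and then forgotten, so the mental model stays shallow.
def can_ship_flat(order):
    if order is None:
        return False
    if not order.paid:
        return False
    if not order.items:
        return False
    return not order.shipped

print(can_ship_flat(Order(paid=True, items=["book"])))  # True
```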
By one study, developers using Eclipse for Java spent 27% of their time just doing code navigation.
The starting point for thinking about code complexity is how our minds work.
As many people have said, "it's easier to write a program than read it."
Some blog linked from here (maybe jvns.ca?) made the case that the depth of your project's software dependency tree is an important metric. The more crap you have to pull in, the more things can go wrong. You're better off with a large program with no dependencies than a somewhat smaller program with a ton of dependencies.
Language features on the other hand can let you develop complex programs quickly and reliably, by catching errors before the code gets deployed and so on.
One thing I hate is having something like 20 different Python repos for one small company. At most places I've worked, you basically do one thing business-wise, but it's split into what I believe are arbitrary repo delineations. This causes trouble with dependency resolution in your build systems and IDEs, and it increases cognitive load and merge complexity for changes that span repos.
Just put the whole folder structure into one repo! This would reduce build complexity, and you can set up the configs for your tools and CI once rather than 20 times. You can still run several services out of one repo if you want to, but it's easier to reason about and easier to change those service delineations later, whereas with 20 repos you end up having to "clone and cut code" to separate things. In one repo, you just move code around as you split or merge services.
I routinely create a "super repo" for myself at these companies using submodules, so that I can actually work with the code more easily, but that still requires me to check in maybe 5 or more PRs for one feature, so it's not ideal. This only solves the developer's problems with local tools and still requires more complex debugging since the services are not actually in one repo under one config for deployment.
I always liked the notions of coupling and cohesion because they are simple to understand, and you can see at a glance whether a particular bit of code would score well on them, without actually bothering with the metrics. E.g. a long list of parameters or imports == high coupling (sketched below); a large number of functions in a module == low cohesion. Specifying exactly how much isn't that useful; it's easier to think in terms of "relative to the rest of the code", and debating what is too high or too low is even less useful. But if you are struggling with a particular bit of code, being able to identify why it is hard to deal with is useful, especially if you know how to fix it.
But mostly metrics should not be telling you things you can't already see by looking at the code; if it looks complicated, it probably is. Metrics only become useful when you need to tell without looking. Sometimes that's useful.
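As a toy illustration of the "long parameter list == high coupling" heuristic (the function and field names are made up), here is the same interface before and after grouping the values that always travel together:

```python
from dataclasses import dataclass

# High coupling at a glance: every caller must know about seven separate
# values, and any change to them ripples through every call site.
def render_invoice(customer_name, customer_email, street, city, zip_code,
                   tax_rate, currency):
    ...

# Grouping related values narrows the interface to two parameters and
# keeps the knowledge of what belongs together in one place.
@dataclass
class Customer:
    name: str
    email: str
    street: str
    city: str
    zip_code: str

@dataclass
class BillingRules:
    tax_rate: float
    currency: str

def render_invoice_v2(customer: Customer, rules: BillingRules):
    ...
```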
There were attempts to predict bugs by looking at complexity metrics. As I recall, the research found that when you adjusted for code size, none of the metrics mattered. In other words, just use LOCs as your metric.
The last section on coupling reminded me of the concept of connascence[1] which I've found really helpful when talking about code.
My first job as a dev was for a consultancy, and I had to review a code base for a client to support their argument that it needed a rewrite. I had no idea about any of this, so I googled metrics, found cyclomatic complexity, and wrote a load of bullshit about how the results of analyzing the code base showed it was complex. It served its purpose - they got corporate to accept a rewrite - but I've never used those metrics again.
Number of lines of code is a good metric for complexity. The hard part is estimating how many lines of code is reasonable for a specific feature. Some complexity is necessary. The problem is unnecessary complexity.
I propose a challenge, similar to the Obfuscated C challenge, to devise code that is impenetrable to the human mind and yet fantastically clean to all metrics.
There are a lot of things you can do to lower these sort of metrics without addressing actual complexity, sweeping the problem under the rug. Perhaps a better metric for software complexity would be the amount of work the computer has to do, or the number of instructions it has to execute.
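If you wanted a rough, machine-oriented proxy along those lines in Python, one sketch is to count the bytecode instructions a function compiles to, using the standard `dis` module. Note this is a static count of the function's top-level code object, not the number of instructions actually executed at runtime, so it only gestures at the idea:

```python
import dis

def total_bytecode_instructions(func):
    """Rough static proxy for 'how much the machine has to do':
    count the bytecode instructions in the function's code object."""
    return sum(1 for _ in dis.get_instructions(func))

def example(xs):
    return [x * x for x in xs if x % 2 == 0]

print(total_bytecode_instructions(example))
```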
The Einstein quote is that "something should be as simple as needed, but no simpler", but how do you pin down "needed" in an objective / quantifiable way?
In some ways it is more important to measure "gratuitous" complexity: redundant complexity that is not justified by present or plausible future requirements...
The problem is that the code itself does not capture requirements, so code analysis can give you absolute indicators but never an "efficiency" measure (how efficient and justified the measured complexity is).
> Our work, as developers, pushes us to take many decisions, from the architectural design to the code implementation. How do we make these decisions? Most of the time, we follow what "feels right", that is, we rely on our intuition.
So, no engineering best practices. That explains the quality of SW very well.
The number one metric: Consistency. If an app is similar to itself in all places, it's very easy to understand. Better yet, if it's similar and consistent with how other things have been built, we can call it "clean code".
Back in the day we called these things architecture, but I'm old and salty.
The number of confused comments from PMs when engineering timelines are provided for "easy tasks." Why does moving an ad above the fold require 3 weeks? Because it implicates 6 different teams.
I didn't see anything on Function Point counts in the article. Not advocating it per se, but it was, at one time, considered one of the more useful ways to evaluate codebase complexity.
What if I told you that most software complexity doesn't come from the code but from the software requirements?
No mention of function points.
Verbosity.
Number of unnecessary abstractions.
tldr, blood pressure
If it looks ugly, or you have to navigate too much (long files or a lot of external dependencies/dependency chains)... it's complex.
That's all you need to know really
`The Math-based Grand Unified Programming Theory: The Pure Function Pipeline Data Flow with Principle-based Warehouse/Workshop Model` is a simple, systematic, math-based method. It makes development a simple task of serial and parallel functional pipelined "CRUD".
### Mathematical prototype
- Its mathematical prototype is the simple, classic, vivid elementary-school math problem of "water flowing into and out of a pool", widely used in social production practice.
### Basic quality control
- The code must meet the following three basic quality requirements before anything else can be discussed. These simple and reliable evaluation criteria are enough to eliminate most unqualified code.
- Function evaluation: Just look at the shape of the code (pipeline structure weight), and whether the function is a pure function.
- Functional pipelined dataflow evaluation: A data flow has at most two functions with side effects, and only at the beginning and the end (see the sketch after this list).
- System evaluation: Just look at the circuit diagram; you can treat each function as a black box, like an electronic component.
- Code Quality Visualization:
- For Lisp languages, an S-expression is already a contour graph; it can be transformed very simply into a contour map or a 3D mountain map.
- If the mountains are not high and their altitudes are similar, the quality of the code is good.
- For non-Lisp languages, you can convert the source code into an abstract syntax tree (AST), and then into a contour map, or a 3D mountain map.
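As I understand the "functional pipelined dataflow" rule above, a minimal Python sketch (with invented stage names, not taken from the referenced theory) would keep the side effects at the two ends and keep every stage in between pure:

```python
import json

# Side effect #1: the only input boundary (reads from disk).
def read_orders(path):
    with open(path) as f:
        return json.load(f)

# Pure stages: no I/O, no shared mutable state; output depends only on input.
def keep_paid(orders):
    return [o for o in orders if o.get("paid")]

def total_per_customer(orders):
    totals = {}
    for o in orders:
        totals[o["customer"]] = totals.get(o["customer"], 0) + o["amount"]
    return totals

# Side effect #2: the only output boundary (writes the report).
def write_report(totals, path):
    with open(path, "w") as f:
        json.dump(totals, f, indent=2)

def pipeline(in_path, out_path):
    # read -> pure -> pure -> write: at most two side-effecting functions,
    # and only at the beginning and the end of the data flow.
    write_report(total_per_customer(keep_paid(read_orders(in_path))), out_path)

# Example usage (assumes an orders.json file exists):
# pipeline("orders.json", "report.json")
```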
### Programming Aesthetics
Simplicity, unity, order, symmetry and definiteness.
---- Lin Pengcheng, Programming aesthetics
The chief forms of beauty are order and symmetry and definiteness,
which the mathematical sciences demonstrate in a special degree.
---- Aristotle, "Metaphysica"
My programming aesthetic standards are derived from the basic principles of science. Newton, Einstein, Heisenberg, Aristotle and other major scientists held this view. The aesthetics of non-art subjects are often complicated and mysterious, making them difficult to understand and learn.
The pure function pipeline data flow provides a simple, clear, scientific and operable demonstration.
Simplicity and Unity are the two guiding principles of scientific research and industrial production.
- Unification of theories is a long-standing goal of the natural sciences, and modern physics offers a spectacular paradigm of its achievement. Across the various disciplines, the more universally applicable a unified theory is, the simpler and more basic it is, and the greater it is.
- The simpler and more unified things are, the more suitable they are for large-scale industrial production.
- Only the simple can be unified, and only the unified can be truly simple.
In the IT field, only two systems fully comply with these 5 programming aesthetics:
- Binary system
The biggest advantage is that it brings calculation to the ultimate simplicity and unity; this made digital logic circuits possible, which in turn enabled the large-scale industrial production of computer hardware.
- The Math-based Grand Unified Programming Theory: The Pure Function Pipeline Data Flow with Principle-based Warehouse/Workshop Model

### Others
- Software and hardware are factories that manufacture data, so they have the same "warehouse/workshop model" and management methods as the manufacturing industry.
- From the perspective of system architecture, it is a warehouse/workshop model fractal system. It abstracts every system architecture into a warehouse/workshop model.
- From the perspective of component, it is a pure function pipeline fractal system. It abstracts everything into a pipeline.
- It adheres strictly to 10 principles and 5 aesthetics, and it consists of 5 basic components.
- It uses the "operational research" method to schedule the workshop to complete tasks in optimal order and maximum efficiency.
### Reference
The Math-based Grand Unified Programming Theory: The Pure Function Pipeline Data Flow with Principle-based Warehouse/Workshop Model
https://github.com/linpengcheng/PurefunctionPipelineDataflow
Ugh, no. I've worked in a codebase where CI would reject changes that had too much "code complexity". You'd constantly have to find clever ways to split up your code, when doing so did not make sense, to appease the complexity checker. Oh yeah, and if you ever make a one-liner change, you might end up being forced to do a full refactor because that one line pushed the complexity threshold over the edge. The results: PITA for developers and worse code. What a crock of shit.