On the Infosec.Exchange Mastodon there was a post asking what the worst programming language for security is, with a pretty lively discussion of whether it was C or C++ (and a handful of other contenders thrown in). First things first:
But a cheeky nerd debate aside, I think an interesting line of questioning is "What makes a language bad for security?" as this also helps us with the inverse and more impactful question "What makes a language good for security?". By exploring the considerations around this I believe we can make our lives securing applications easier by biasing towards certain programming languages up front, exploring how to improve the security of the languages we are using now, and guiding how we design better languages going forward.
Note that I will be using "language" somewhat ambiguously to mean both the core language semantics and standardized API/function surface, but ALSO the framework and library story when there is consistency or de facto standardization.
While undoubtedly incomplete since these are the considerations of a single person and not the collective expertise of many, I have banged my head against this problem for a couple of decades and think some of what I have learned has value to others. So below are the primary considerations that come to mind when trying to answer "what the worst language for security is":
Secure by Default Patterns & Decision Density
I'm not sure if I am stealing this from Brian Chess or Jacob West, but one of the two was the first person I heard describe these concepts, so props to whoever deserves the credit.
Secure by Default patterns in programming are patterns where, if the developer does what is most natural and easiest for them, the result is secure, and they need to take an explicit action to make it insecure (the antonym being Insecure by Default where the easiest thing is the insecure thing). A great example of this is the modern MVC frameworks that default to encoding all variable output, and the developer must call a specific API if they want to add the variable contents to the DOM unencoded. This has significantly reduced XSS from the server as the developer has to go out of their way to inject user input into the DOM without encoding.
Solid framework and library design is a great mechanism to eliminate or reduce classes of vulnerability, but the core philosophies of a language are as well. Managed languages almost completely eliminate the memory issues that are rampant in unmanaged languages by making it so that the easiest and most natural code is memory safe. In a managed language you can just assign one string to another, an operation that occurs with significant frequency, without concern for buffer overflows, while in an unmanaged language if you simply do string copying without explicit validation you will inevitably introduce buffer overflows.
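To make the unmanaged side of that contrast concrete, here is a minimal C sketch. The helper name `copy_bounded` is made up for illustration and is not a standard API. The natural one-liner, `strcpy(dst, src)`, is only safe if someone can prove that no oversized source ever reaches it, while the bounded version has to carry the destination size and handle truncation explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a standard API): copies src into dst without
 * ever writing more than dst_size bytes. Returns 0 on success, or -1 if
 * src did not fit, in which case dst holds a truncated but still
 * NUL-terminated string. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    /* Programmer-error preconditions; a production version would likely
     * return an error code instead of asserting. */
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t src_len = strlen(src);
    size_t to_copy = (src_len < dst_size) ? src_len : dst_size - 1;

    memcpy(dst, src, to_copy);
    dst[to_copy] = '\0';   /* always leave room for the terminator */

    return (src_len < dst_size) ? 0 : -1;
}
```

Every call site now has to pass the right size and check the return value, which is exactly the kind of repeated decision a managed string assignment never asks for.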
Similarly, strongly typed languages where every variable requires an explicit type decorator (string, integer, etc.) specifying what sort of data can be in the variable, and where there are multiple automatic controls to enforce that only data conforming to that type ever goes into that variable, are easier to write free of issues around unexpected/unanticipated data formats than weakly typed languages that allow most any data in any variable. In C# if the developer wants to ensure the quantity of an item the user selected when ordering an item makes sense they can simply use the
The easiest way to differentiate between a Secure by Default pattern and an Insecure by Default pattern is that typically, when trying to determine if the code is vulnerable, a person doing a code review will look for the PRESENCE of a condition in Secure by Default patterns, and look for the ABSENCE of a condition in Insecure by Default patterns. E.g. in ASP.NET MVC they will look for the presence of Html.Raw() to see if XSS is possible, while in C they will be looking for the absence of length checks and null terminator logic when working with buffers. It is much, MUCH easier for a reviewer (and automation) to notice the presence of something that renders things dangerous versus the absence of code that renders it safe.
Decision Density represents how frequently the developer deals with patterns that are Insecure by Default and must consciously make the correct decision to avoid introducing a vulnerability. The greater the decision density - the more frequently the developer needs to be perfect in their decision making while authoring code - the more frequently vulnerabilities will be introduced. This also increases the number of things code reviewers need to be mindful of when looking for vulnerabilities, and the number and complexity of detections analysis tools must implement to find vulnerabilities.
Decision Density is the combination of the number of Insecure by Default patterns present in the language/framework/library and the frequency with which they are used. C# has unsafe code blocks, which have a high number of Insecure by Default patterns, but unsafe represents both a tiny fraction of the overall C# language and a rarely necessary part of the language, so it has a negligible impact on the decision density (it also explicitly identifies the Insecure by Default code so that it is easily noticeable by code reviewers, and they can spend extra time on that code specifically). On the flip side, all C/C++ code has similar characteristics to C# unsafe, with direct memory management and pointer access, but these represent both a large amount of the overall languages AND almost universally occurring parts, so they have a tremendous impact on the decision density.
Another consideration with decision density is how esoteric the knowledge to make the correct decision is. The vast majority of cryptography libraries are problematic as they simultaneously expose the developer to many different decisions (which algorithm to use, what cipher mode, what key length, does it need an IV, does that IV need to be random, is the algorithm that generated the IV really random, where did the crypto key come from, is it stored securely, how is it rotated, is it also random, are you doing fixed time comparisons when appropriate, do you need integrity as well as confidentiality, etc.) AND the developers almost never have the expertise to understand the nuance of the decisions.
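As one small illustration of how esoteric these decisions get, consider the fixed time comparison item from that list. A typical `memcmp` may return as soon as it sees the first mismatching byte, so how long it takes can leak how much of a secret value an attacker has guessed correctly. The sketch below uses a made-up helper name, `equal_fixed_time`, and is only meant to show the idea. Real code should prefer a vetted library routine (OpenSSL's `CRYPTO_memcmp` is one example), since an optimizing compiler is free to undermine a hand-rolled loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (illustration only): returns 1 if the two len-byte
 * buffers are equal and 0 otherwise. It reads every byte regardless of
 * where the first difference is, instead of exiting early. */
int equal_fixed_time(const unsigned char *a, const unsigned char *b,
                     size_t len)
{
    assert(a != NULL && b != NULL);

    unsigned char diff = 0;
    for (size_t i = 0; i < len; i++) {
        /* Accumulate differences; no branch depends on secret data. */
        diff |= (unsigned char)(a[i] ^ b[i]);
    }
    return diff == 0;
}
```

Nothing about the obvious `memcmp` call looks wrong in review, which is the point: the developer has to already know the failure mode exists to make the right decision.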
Generally, memory managed, strongly typed languages used in conjunction with modern but mature frameworks and libraries will have the best ratio of Secure by Default to Insecure by Default elements, and the lowest decision density, and so will typically have the lowest rate of introduction of vulnerabilities. Changing any of those characteristics starts increasing the rate of vulnerabilities, though to varying degrees - there are weakly typed languages that have worked to make data validation easy, older frameworks that have tried to deprecate their more dangerous elements, and semi-managed languages that have tried to add stricter semantics that can be validated at compile time to minimize memory safety issues.
Regularness of the Language
As European languages go, English is by far the worst, and the reason is how irregular it is. This is because English is not the result of one language evolving, but rather numerous languages being smashed together: various Celtic, Germanic, and Latin root languages intertwining and then evolving interdependently. So we have multiple phonetic systems, multiple conjugation schemes, multiple tensing schemes, multiple singular/plural schemes, etc. before we even get into the dialects of the language. Learning English is learning multiple language systems at once, without a clear rubric for when to use one system over the other (quick, when does I come before E? Nope, you are wrong, there are 3 dozen words that break that rule). And this makes it an incredibly error prone language - a challenge to learn, and a perpetual challenge to use 100% "correctly" (sometimes with disagreement about what correctness even means).
Programming languages have similar characteristics. The more consistent a language is, within a single codebase, between different codebases, and between different processor Instruction Set Architectures (ISAs) / language VMs / OSes / compilers and interpreters, the less error prone it will be, the easier it will be for someone to learn, and the easier it will be for a person or automation to review. And on the flip side, the more frequently the answer to "what does this code do" is "well, it depends", the more error prone it will be, the harder to learn, and the harder it will be to detect the mistakes. What it "depends" on also matters - if "it depends on these other conditions also in the source code", mistakes are typically easier to detect than if "it depends on what data the user provides" or "it depends on the environment this runs in". For example:
x = y + w + z;
In a weakly typed language, whether that line performs numeric addition or string concatenation depends on what y, w, and z hold at runtime. But that relatively simple line of code has a second "it depends" - because some languages base their integer size on the size of the processor registers, it might be that this code only produces an integer overflow on 32-bit processors but never 64-bit. .NET is consistent about the size of an integer, but depending on a compiler flag an overflow will either generate an exception or simply wrap around the value and continue executing (which can get even more confusing if some of the assemblies are compiled with the flag and others aren't).
Another pivot of regularness is the consistency of the APIs/libraries/frameworks that would be commonly used. .NET's class libraries offer most of the common functionality that would be needed, authored with a "fairly consistent" API design philosophy (the fragmentation between Standard and Core was not great, but a lot has been done since to try to correct this and converge). This contrasts in its favor with other languages where it's necessary to piece together libraries from many different authors to get a similar collection of functionality. Even simple things, like using multiple libraries where some opt to throw exceptions on error and others opt to return an error code, make it significantly more likely that an unhandled error will occur. More complex scenarios, like simultaneously using the Angular and Bootstrap frameworks, create such different semantics within one codebase that it's effectively like using two different languages at the same time.
Legibility
Legibility is how much time and effort it takes to read code and understand both the intent of its author AND how it will behave when executing. Coding style definitely impacts legibility, as "clever" code is often unobvious code. For example, these two code snippets:
x ^= y;
y ^= x;
x ^= y;

temp = x;
x = y;
y = temp;
both do the same thing, swapping the values in x and y. The first example avoids the temporary variable (historically a way to save a register or a memory slot, though rarely a win with modern compilers), but it is MUCH less legible. If someone else reading the code is not aware of the esoterica of the XOR trick to swap values without a temporary, that code's function is going to be completely opaque to them (also, this is absolutely the sort of optimization a compiler should do for the developer behind the scenes, rather than the developer needing to be explicit about it).
Coding style and coding choices absolutely impact legibility, but languages can both add features that are inherently hard to intuit and encourage more opaque coding choices. The "regularness" of a language and its decision density impact legibility, as they require a reader to finely inspect the minutiae of the code. For example, when reviewing C and C++ it's often necessary to have a notepad at hand to track memory allocation, buffer length, buffer state, room for a null terminator, etc. on every buffer operation. It's necessary to do math fairly constantly in order to understand what will actually happen when a concatenation happens. It's similarly necessary to keep a running record of what a pointer is pointing at (especially with pointer math), and the state of that piece of memory - to read C/C++ you basically need to do symbolic execution on a piece of paper because the decision density and Insecure by Default code patterns are so prevalent that simply reading the code is frequently insufficient to understand what its actual runtime behavior will be; there is too much to track mentally. (Also, a disturbingly large number of C devs think that the harder their code is to read, the better they are as developers. These will also be the same people that argue that insecure C isn't from the language but from the people writing it. These are toxic viewpoints.)
There are a lot of different elements that impact legibility, and this list is not exhaustive, but here is a broad taxonomy of elements:
Understandability of the details of a variable - while reading through the code, how easy or hard it is to look at a variable in the code, understand what it is and does, what error states are possible for that variable, and the implications for assignments, operations, and usage of that variable at each point. There are many factors that impact this; a non-exhaustive list:
- how explicit the existence of the variable even is (for example, properties on dynamic objects have no certainty to even exist)
- type & format/range information (and how reliable it is and how reliable the casting protections are)
- related, how clear the unallocated/error states of a variable are. It may be pedantically correct to say null, undefined, and empty are different states, but it also means there is a much higher chance that guarding logic will forget to check for one of the states indicating "this variable isn't usable".
- how direct the reference to the underlying data really is (for example pointers, ESPECIALLY pointers to pointers, or pointers calculated based off of memory math, require more effort to understand)
- how clear and obvious the distinction between when a variable assignment is a reference to an existing object versus creating a copy of that object (which can be made more confusing when all assignments start as references, and become copies on object modification).
- how strict the language is about similarly named variables occupying different scopes, and the semantics for differentiating those variables. For example, it's not great that many languages allow for there to be different variables named "temp" simultaneously in the global, local function, and block scopes, all referenced just by the name "temp". It's even worse when a language requires a special decorator to reference a global variable locally and will automatically create a local variable if the decorator isn't present (glaring squarely at you PHP).
Object Clarity is a specific case of "understanding the details of a variable" above, revolving around considerations of how easy it is to understand the details of an object when reading through the code. The most confounding factor is when the language features dynamic object modification, as it is no longer possible to rely solely on the written object definition to understand what it does, but rather a person needs to be aware of any modification to the object definition that happened in the control flow between the object instantiation and a specific usage. While not as challenging, an unobvious inheritance model of objects can also create a lot of confusion with things like multiple inheritance, allowing circular inheritance, etc. (does the object inherit the implementation of this method from the first or last class in the inheritance list?).
Obviousness of Control Flow - how easy or hard it is to understand when different code executes and in what order. I'm treading on well worn ground here, and folks like Gerard Holzmann called out control flow patterns bad for legibility as a risk for safety critical code years ago. Specifically, complex flow patterns built on goto, recursion, pre-processor generated code (both macros and conditional compilation blocks), and function pointers make it really easy to author code that is challenging to read and whose runtime behavior is challenging to analyze. This doesn't mean that the presence of these elements will always be bad, but that they do take intentionality in order to create easily legible code (for example, if you do use goto, at least always only jump down, and stay within the same function definition).
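The "only jump down, stay in the same function" convention can be sketched with the common C cleanup idiom. The function name `duplicate_pair` and its contract are hypothetical, chosen only to give the gotos something to clean up. Every goto targets a single label further down in the same function, so control flow still reads top to bottom.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap-copies two strings. Returns 0 and hands both
 * copies to the caller (who must free them) on success; returns -1 and
 * leaves *out_a and *out_b untouched on failure. */
int duplicate_pair(const char *a, const char *b, char **out_a, char **out_b)
{
    assert(out_a != NULL && out_b != NULL);

    int rc = -1;
    char *copy_a = NULL;
    char *copy_b = NULL;
    size_t len_a, len_b;

    if (a == NULL || b == NULL)
        goto cleanup;                 /* forward jump only */

    len_a = strlen(a) + 1;            /* include the terminator */
    copy_a = malloc(len_a);
    if (copy_a == NULL)
        goto cleanup;
    memcpy(copy_a, a, len_a);

    len_b = strlen(b) + 1;
    copy_b = malloc(len_b);
    if (copy_b == NULL)
        goto cleanup;                 /* copy_a is released below */
    memcpy(copy_b, b, len_b);

    *out_a = copy_a;                  /* ownership moves to the caller */
    *out_b = copy_b;
    copy_a = NULL;
    copy_b = NULL;
    rc = 0;

cleanup:
    free(copy_a);                     /* free(NULL) is a no-op */
    free(copy_b);
    return rc;
}
```

Declaring every variable before the first goto keeps the jumps from skipping over an initialization, which is one more thing the reader does not have to check.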
Multi-Threading or Multi-Process is a clear challenge for legibility, as code can now execute in parallel OR can execute serially while blocking other threads, depending on process attributes and execution conditions at runtime. For most code, threading or a multi-process architecture is an intentional choice that can be planned for, though it's pretty common for end user apps to have a distinction between the main application logic and the UI eventing (in Win32 this is typically different threads, in ElectronJS they are full on different processes).
The increased prevalence of asynchronous programming patterns has dramatically increased the amount of code whose control flow is hard to intuit. From a technical perspective, with the number of network calls that so much code relies on, it's understandable why it would be desirable to not serially block on each one. However, it makes it incredibly challenging to understand the application state when a piece of code executes simply by looking at the source code.
Dynamic code generation, dynamic module loading, & code overwriting - a specific instance of control flow legibility challenges is when the code that is run isn't the code that is directly written. The various methods of runtime code creation obviously impact code legibility, as a reviewer needs to manually assemble the code that will be present at runtime rather than simply read the executing logic. Since the days of C it's been possible to dynamically generate code, but modern web programming patterns make it a regular occurrence. It's clearly useful (though I think a pretty good argument could be made that website design and UX have gotten worse in the last decade), but it means the only real way to know what code will execute at a point in time is to view the current page source in the browser at that point in time. It would be nice if we could just read it all in the source repository first.
Visual Parsability of the Code - Some languages force structure that makes it easier to visually understand things like scoping, while others have opted to make decisions that challenge a human's ability to easily understand code (LISP deciding that the only formatting punctuation would be () really does not do humans any favors - it should not be necessary to count open and close parens to understand what is happening). Most languages put thought into the punctuation that they use, but took pains to be un-opinionated about the actual code layout on the page. Python is the most obvious exception, making indentation a core part of its syntax, and it has its share of detractors who dislike that a language would be opinionated about code layout. However, source code that makes a reviewer work to simply parse its structure is bad source code, and opinionated style enforcement that makes details like scoping clear at a glance is good.
Analyzability
Humans aren't very good at reading and reviewing even the most legible code. I don't know whom to attribute the quote to, but there is a fairly true aphorism that if you give a code reviewer 100 lines of code to review they will find 100 problems, but give them 1000 lines and they will find 0 problems. Reading source code is tedious, and the more of it we are asked to do in one stretch the less our concentration will be up for the task. So we use automation - linters, compilers, static analyzers, dynamic analyzers, etc. - that doesn't suffer from the tedium to hunt for mistakes at scale. However, other than the visual parsability of the code, everything that impacts legibility for humans also impacts how easy or hard it is for automation to analyze the language for problems. Typically, the more effort a human would have to expend to find a problem, the more sophisticated the automation necessary to find it, and the less likely it will have comprehensive checks for all of the potential mistakes.
(As an aside, when you see a company that says "our analysis finds that these languages are more error prone", unless they are very specific about how they crafted the study and about how they benchmarked their analysis capability as being equally capable between languages, they are very likely actually saying "our analysis is better at finding errors in these languages and worse at these languages". I don't actually know how flawed the Veracode Language report is because I am not giving their sales team my contact info to read it, but I do know [this Dark Reading Article](https://www.darkreading.com/threat-intelligence/java-net-developers-frequent-vulnerabilities) based on it draws absolutely the wrong conclusions. Also, Dark Reading: most of the OWASP Top 10 are issues that lend themselves poorly to static analysis, which is what Veracode does, and in many cases lend themselves poorly to all forms of automation. If a company tells me their static analysis solution comprehensively addresses broken access control, security misconfiguration, integrity failures, logging and monitoring failures, insecure design, crappy identity and authentication, etc. as general problem sets, I know either that they do not understand the space or that they hope that I don't [in either case, screw them], and that in the very best case I am going to have to disable a ton of garbage detections in their product. Specific patterns that contribute to each problem lend themselves to static analysis [for example, when the security misconfiguration is from "Infrastructure as Code" or "Configuration as Code" it can be statically analyzed, but that's only a small subset of security misconfiguration mistakes], but part of the reason they even are in the OWASP Top 10 right now is because they aren't the sorts of things that lend themselves to being solved with static analysis as a general class of issue.)
While everything other than visual parsability impacts analyzability, certain things are especially thorny:
- While a human can often fairly easily take into account external factors (environment configuration where the code will run, trust and assumptions about external code, the overall threat model of the application, etc.) and make "reasonably safe" assumptions about them, it takes significant effort to create automation that can factor these elements in.
- Dynamic code states (either dynamic code generation, or code semantics that are dependent on runtime conditions such as the addition/concatenation example above) are a challenge both for static analysis (it's very hard to construct those states via just static information) AND dynamic analysis (it's very challenging to trigger all of the application flows that create those dynamic states).
- Same thing for complex code flow logic.
- The more context a language requires to identify a problem, and the less proximate the pieces of that context are to each other, the harder the analysis problem. For example, while MVC patterns have made many more things secure by default, they have also separated context into three components, making the remaining issues harder to analyze for (in the static analysis space this means you MUST use a tool with inter-procedural, arbitrary depth flow analysis capabilities to find issues with high accuracy; linters and local scope or limited data flow analyzers simply cannot connect all of the context).
- If analysis requires being aware of the state of specific memory addresses, it's going to be horrendously more complicated than if it does not.
- Contextualizing the meaning of the logic in condition statements (if/then, switch, looping operators, etc.) is very hard to generalize, and the more the security depends on assessing the guarding conditional logic, and the more complex that logic, the more likely flaws will go unnoticed.
- More broadly, language features whose failure modes don't lend themselves to easy generalization are hard to analyze for. For example, finding type confusion issues in COM isn't one, or two, or three detections - there are sooo many special cases that don't abstract and generalize well that it's very hard to detect the mistakes comprehensively.
So What's the Worst Programming Language?
If the first paragraph didn't give it away, C and C++ are in a clear tie for most insecure programming language in heavy use, as in each area above they are the worst, or nearly the worst. You could maybe squeeze Assembly into the mix, but I'd argue that its usage is too niche to really count (humans suck at writing assembly for any ISA, but fortunately rarely do). C and C++ have terrible decision density, the greatest number of Insecure by Default patterns, terrible regularity, and are atrocious for both legibility and analyzability. This does not mean you can't write good C and C++ (see below), but it means you have to work at it: you have to opt into the safe language features, heavily restrict all of the unsafe patterns, and show a level of discipline unnecessary in other languages. And very few people do those things.
So why hasn't DOM code brought about the cyber-Pearl Harbor-9/11-armageddon?
A couple of reasons - first, no matter how horrendous a memory managed language is, it cannot aspire to the level of dumpster fire that a completely unmanaged language operates in. In an unmanaged language every single object access and assignment, and every single string and buffer operation, is a potential serious vulnerability. Memory managed languages, no matter how many problems they have, will never have the vulnerability decision density of unmanaged languages.
Second, sandboxing. Given how effective sandboxing has been at limiting the exploitability of the DOM, there is a strong argument to be made that sandboxing should be designed into all future language runtimes, and likely taken further than with the DOM, to include sandboxing third party modules from the main code, and a granular permissioning engine to have explicit control over what resources a sandboxed module can touch.
Making Existing Languages more Secure
Knowing what lends a language to more secure outcomes is great if starting from scratch with a new code base, or if working on a new language / library / framework, but what about the billions of lines of code already in existence?
Well, depending on the language, rewriting it should be an option. While code rewrites are a significant cost, maintaining problematic code is a long term drain that can very easily be quite a bit more costly. Google and Microsoft have spent a couple of decades, and tremendous effort, working to secure C & C++, and both are backing Rust for very good reason. I appreciate that despite the long term costs of continuing with problematic code, the upfront cost of rewriting it can be too much; however, I think there is real hope here. The rate at which Generative AI models have advanced in such a short time suggests that very realistically we will have AI driven transpilers that can turn C++ into functionally equivalent Rust in the not too distant future (couple years maybe) - early versions will likely still require a decent amount of manual editing, but we don't need perfect, just cost effective.
But that's the (probably nearer than we appreciate) future; the options today are to adopt AND ENFORCE restrictions and conventions on hard to secure languages that make them easier to secure. C++ advocates will argue about all of the modern features that can create robust code, but the problem is that generally all of those advancements are optional. There is a very big difference between "developers can write type and memory safe C++" and "developers do write type and memory safe C++". The C++ Core Guidelines have some great things in them, but how many C++ developers are even fully aware of them, much less actively using them (and given that Microsoft's GSL is the most comprehensive implementation, how many are willing to set aside partisan ideology and adopt it)? Expecting developers to be aware of all new advancements, understand their security value, and consistently adopt and use them despite their existing familiarities is quite a lot. IDE/compiler/analysis technologies should be leveraged to enforce adoption of safer modern features and restriction of older unsafe language features.
In terms of what to enforce:
- Look for the language features that contribute to significant decision density, create safe alternatives, and ban the dangerous patterns. When the dangerous patterns cannot be outright banned, restrict them to clearly annotated areas that make it easy to identify their usage and audit them (e.g. the unsafe keyword in C# and Rust). It's ridiculous that strcpy/strcat/etc. only throw compiler errors in MSVC rather than in all C compilers (it's 2023 - we stopped using leaded gasoline, we can stop using absurdly error prone C APIs).
- Use strict stylistic and composition enforcement to drive consistent patterns, more regular code, simplified control flow, more obvious variable states, etc. Some of this can be done with a linter or optional compiler rules (for example, in .NET, the Roslyn compiler supports additional enforcement rules that can be defined in "analyzer" assemblies), but more powerful static analysis is sometimes necessary to enforce simpler control flow logic.
- Less frequently applicable, but Code Annotation Languages can be used on top of some languages to explicitly add context both for other developers (or just future versions of the author) and analysis tools. It's a shame that the rest of the world had such a viscerally negative reaction to MSVC's Source-code Annotation Language (SAL) out of partisan dislike, as SAL does a very good job contextualizing and codifying the intent of code that C & C++ otherwise lack.
strict features of the compiler to switch the optional TypeScript mandatory, some pretty sizable codebases have successfully fully migrated to safer patterns. It would be nice to see the same approach be deliberately applied to C and C++ (we can't just analyze our way out of their problems - we need to reduce, restrict, and mandate to limit what problems are possible in the first place) (or just wait until AI solves the problem for us).