[Read Paper USENIX 2019] Less is More: Quantifying the Security Benefits of Debloating Web Applications

Introduction#

This article is a precursor to a paper on PHP debloating presented at USENIX this year (referred to as LIM, Less is More).

profiles: configuration, scenarios; profiling: analysis (performance analysis, behavior analysis, etc.); profiler: analyzer

bloat: unnecessary code that inflates an application; debloat: removing that unnecessary code

ambient authority: a term from access control research. A subject is said to use ambient authority when it only needs to specify the name of the object it wants and the operation to perform on it for a permitted action to succeed.

monkey testing: Monkey testing is a technique where users test applications or systems by providing random inputs and checking behaviors or whether the application or system crashes. Monkey testing is often implemented as random automated unit testing.

Overview of paper#

The paper uses dynamic analysis to obtain code coverage for PHP web applications and deletes the code that is never executed, thereby debloating them.

In summary, it consists of the following parts:

Selecting CVE vulnerabilities and mapping them to web apps.
Selecting four different usage profiles and simulating the use of the app under each.
Recording code coverage and analyzing unused files/functions.
Performing debloating based on coverage and obtaining a debloated app (mainly file-level debloating and function-level debloating).
Simulating normal usage of the debloated app to evaluate whether functionality remains intact.
Conducting known CVE exploits on both the debloated app and the original app to assess the effectiveness of debloating (i.e., whether debloating removed critical code that caused vulnerabilities) and other evaluations and comparisons.

Figure 1 of the paper illustrates this pipeline (note that the figure contains a typo, "Expoits").

Background#

The principle of software debloating has been successfully applied to operating systems (removing unnecessary code from the Linux kernel), shared libraries, and compiled binary applications.

This paper is the first to evaluate whether debloating applies to web apps, that is, whether it can remove the critical code behind known vulnerabilities.

Motivation for web debloating#

The authors use Symfony's CVE-2018-14773 as an example.

The framework supported a legacy IIS header that lets a client override the path of the request URL, which can be abused to bypass access restrictions. If a server does not need that header, the supporting code can be removed, which is exactly what debloating does.

Target PHP web apps#

phpMyAdmin: Database management
WordPress: Blog management
MediaWiki: Wiki management
Magento: E-commerce management

Mapping vulnerabilities to source code#

For each web app, the authors selected the 20 most critical CVEs by CVSS score, all of them from 2013 or later.

Due to different affected versions for different CVEs, vulnerabilities had to be mapped across multiple versions (as shown in the table below).

The affected versions and line numbers for each CVE are recorded in the database.

Web Application	Version	Known CVEs(≥2013)
Magento	1.9.0, 2.0.5	10
MediaWiki	1.19.1, 1.21.1, 1.24.0, 1.28.0	111
phpMyAdmin	4.0.0, 4.4.0, 4.6.0, 4.7.0	130
WordPress	3.9.0, 4.0, 4.2.3, 4.6, 4.7, 4.7.1	131

Simulating web app usage#

Four methods are used to simulate app usage, with the goal of exercising the functionality as broadly and deeply as possible, i.e., maximizing code coverage.

General tutorials (executed using Selenium scripts)
Monkey testing
Crawling
Vulnerability scanning
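
As a minimal sketch of how the coverage from these four sources might be combined, the snippet below takes the union of executed lines per file. The data layout (file path mapped to a set of line numbers), the file names, and the function name `merge_coverage` are assumptions for illustration, not the paper's actual format.

```python
from collections import defaultdict

def merge_coverage(*profiles):
    """Union the executed line numbers of each file across usage profiles."""
    merged = defaultdict(set)
    for profile in profiles:
        for path, lines in profile.items():
            merged[path].update(lines)
    return dict(merged)

# Hypothetical coverage data: file path -> set of executed line numbers.
tutorials = {"index.php": {1, 2, 3}, "export.php": {10, 11}}
monkey = {"index.php": {3, 4}}
crawler = {"index.php": {1}, "search.php": {5}}
scanner = {"login.php": {7, 8}}

combined = merge_coverage(tutorials, monkey, crawler, scanner)
print(sorted(combined))
```

A file or line counts as used as soon as any one profile executes it, so adding a profile can only grow the set of code that is kept.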

Recording web app code coverage#

PHP profilers are provided as PHP extensions that hook into the PHP engine to collect code coverage. The one used in this paper is Xdebug.

The straightforward idea was to call xdebug_start_code_coverage() at the beginning of each PHP file and xdebug_get_code_coverage() at the end, but the authors ran into some difficulties.

Since any PHP file can call exit() or die() to terminate early, a coverage call placed at the end of a file may never run; the coverage would have to be collected before every such exit point.

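Additionally, a shutdown function that collects the coverage needs to be registered, and it must be placed at the end of the shutdown function queue so that it runs after the application's own shutdown functions.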

Finally, destructors are a problem: if an object is destroyed after the shutdown functions have run, the code in its destructor is not recorded, so the destructor was rewritten to register itself during execution.
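
The early-exit problem is not specific to PHP. The following is a Python analogue, not the Xdebug API: it records executed lines with `sys.settrace` and uses a `finally` clause in the role the shutdown function plays above, so the recorder is always stopped and its data kept even when the handler exits early. The names `record_lines` and `handler` are hypothetical.

```python
import sys
from contextlib import contextmanager

@contextmanager
def record_lines(executed):
    """Record (function name, line number) pairs while the block runs."""
    def tracer(frame, event, arg):
        if event == "line":
            executed.add((frame.f_code.co_name, frame.f_lineno))
        return tracer
    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        yield executed
    finally:
        # Runs even on early exit, like a shutdown function in PHP.
        sys.settrace(previous)

def handler():
    step = "started"
    sys.exit(0)              # early exit, comparable to exit()/die() in PHP
    step = "never reached"

executed = set()
try:
    with record_lines(executed):
        handler()
except SystemExit:
    pass

handler_lines = {line for name, line in executed if name == "handler"}
print(len(handler_lines))
```

Only the two lines of `handler` that ran before the exit are recorded, and that data survives the exit because the cleanup does not depend on reaching the end of the function.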

Debloating strategies#

File-level debloat: Remove PHP files that are not executed.
Function-level debloat: A finer-grained debloat than file-level, which also removes functions that are never executed, even inside files that are kept.

Here, debloating does not completely delete the code but replaces it with placeholders. If the code execution reaches these placeholders, the program will exit and log information about the missing functions.

This turned out to be very useful in practice: the logs revealed many files and functions that had been removed but should not have been.
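
The two strategies and the placeholder idea can be sketched as follows. This is an illustration under assumed data structures, not the paper's implementation: `plan_debloat`, `make_placeholder`, and `RemovedCodeError` are hypothetical names, and the placeholder raises an exception where the paper's placeholder exits the program.

```python
def plan_debloat(all_functions, executed_files, executed_functions):
    """Decide what to remove at file level and at function level.

    all_functions: dict mapping file path -> set of function names it defines.
    executed_files: set of file paths seen during profiling.
    executed_functions: set of (file path, function name) pairs seen.
    """
    # File level: drop every file that was never executed.
    remove_files = sorted(set(all_functions) - set(executed_files))
    # Function level: inside kept files, stub out functions never executed.
    stub_functions = sorted(
        (path, name)
        for path, names in all_functions.items()
        if path in executed_files
        for name in names
        if (path, name) not in executed_functions
    )
    return remove_files, stub_functions

class RemovedCodeError(RuntimeError):
    """Raised when execution reaches code that was debloated away."""

def make_placeholder(path, name, log):
    """Build a stub that logs the missing function and stops execution."""
    def placeholder(*args, **kwargs):
        log.append(f"{path}:{name}")
        raise RemovedCodeError(f"{name} was removed from {path}")
    return placeholder

# Hypothetical application layout and profiling result.
app = {
    "index.php": {"render", "export_pdf"},
    "legacy/import.php": {"import_csv"},
}
remove_files, stub_functions = plan_debloat(
    app, {"index.php"}, {("index.php", "render")}
)
print(remove_files, stub_functions)
```

The log filled by the placeholders is what makes it possible to notice, after the fact, that a removed function was actually needed.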

Experimental results#

Code size is measured not in raw lines of code but in Logical Lines Of Code (LLOC), which excludes comments, blank lines, and purely syntactic constructs.
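
LLOC definitions vary between tools, and the paper measures PHP. As a rough Python analogue only, the sketch below counts statement nodes in the syntax tree, so comments, blank lines, and line continuations do not contribute. The function name `logical_lines` is hypothetical.

```python
import ast

def logical_lines(source):
    """Count statements in the syntax tree instead of physical lines."""
    tree = ast.parse(source)
    return sum(isinstance(node, ast.stmt) for node in ast.walk(tree))

sample = '''
# a comment

def add(a, b):
    total = (a +
             b)
    return total
'''

physical = len([line for line in sample.splitlines() if line.strip()])
print(physical, logical_lines(sample))
```

The sample has five non-blank physical lines but only three statements: the function definition, the assignment, and the return.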

As expected, function-level debloating removes more code than file-level debloating; how much more depends on the coding practices of the four projects (for example, WordPress is less dependent on external packages, while Magento and MediaWiki are developed in a more modular way).

Reduction in cyclomatic complexity

Cyclomatic complexity (CC), also known as conditional complexity, is the number of linearly independent paths through a program. It can also be understood as an upper bound on the number of test cases needed to achieve full branch coverage.
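
To make the metric concrete, here is a simplified sketch that approximates CC for Python source as one plus the number of decision points. It is an approximation under stated assumptions: it ignores some constructs (for example, conditions inside comprehensions), and real tools for PHP apply their own counting rules.

```python
import ast

DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)

def cyclomatic_complexity(source):
    """Approximate CC as 1 + number of decision points in the source."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` contributes two extra branches.
            complexity += len(node.values) - 1
    return complexity

sample = '''
def classify(n):
    if n < 0:
        return "negative"
    for d in (2, 3):
        if n % d == 0 and n > d:
            return "composite"
    return "other"
'''

print(cyclomatic_complexity(sample))
```

The sample contains two `if` statements, one `for` loop, and one `and`, giving a complexity of 5. Removing a function removes all of its decision points at once, which is why debloating lowers this number.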

I learned this concept in my junior software engineering course.

During the debloat process, cyclomatic complexity also decreased, indicating that the debloating method could remove complex instructions and execution paths.

Reduction in CVEs after debloating

The results showed that file-level debloating removed 38% of the vulnerabilities, while function-level debloating removed between 10% and 60% depending on the application (phpMyAdmin and Magento pull in many external libraries, whereas WordPress is relatively monolithic).

Note: In this paper, a vulnerability counts as debloated only if all files/functions associated with it are deleted, not just one link in the chain (although in most cases removing a single link already breaks the exploitation chain and makes the vulnerability practically unexploitable).
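
The rule in the note above can be expressed in a few lines. The CVE identifiers, file names, and the function name `cve_removed` below are placeholders made up for illustration.

```python
def cve_removed(cve_locations, removed):
    """A CVE counts as removed only if every mapped location is gone."""
    return bool(cve_locations) and all(loc in removed for loc in cve_locations)

# Hypothetical mapping: CVE id -> set of (file, function) locations.
cve_map = {
    "CVE-A": {("export.php", "export_sql")},
    "CVE-B": {("core.php", "check_auth"), ("plugin.php", "render")},
}
# Hypothetical set of locations deleted by debloating.
removed = {("export.php", "export_sql"), ("plugin.php", "render")}

status = {cve: cve_removed(locs, removed) for cve, locs in cve_map.items()}
print(status)
```

Here CVE-B is still counted as present because one of its two locations survives, even though the exploit chain may already be broken. This makes the reported reduction a conservative figure.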

I feel that the author did not clearly explain the specific rules for how debloating was based on the previous four scenarios.

Because there are two situations: one is normal usage, which does not trigger vulnerabilities (like tutorials), and the other is intentionally exploiting vulnerabilities (like vulnerability scanning) or causing the application to enter an abnormal state, while monkey testing can produce both situations.

I tentatively assume that debloating is based on the following rules:

Conducting debloating while ensuring the program runs normally, i.e., prioritizing functionality over the potential for vulnerabilities.

Code outside of files/functions covered by normal usage should be deleted.

If the paths covered by malicious exploitation overlap with those covered by normal usage, the non-overlapping parts are deleted, while the overlapping parts are retained if they meet the requirements of the first rule.

In short, code that is not covered by normal usage needs to be deleted.

Impact of different vulnerability types

The degree of debloating varies for different types of vulnerabilities. For example, command execution and SQL injection vulnerabilities are easier to debloat (often found in less frequently used modules), while crypto and cookie-related vulnerabilities are harder to debloat (often found in core components that cannot be deleted).

Checking for POI vulnerabilities

POI, or PHP Object Injection, is essentially the PHP deserialization vulnerability familiar from CTFs.

The author used PHPGGC, a tool for generating POP exploit chains, to exploit the debloated app.

The results showed that function-level debloating successfully removed all vulnerabilities corresponding to the exploit chains present in PHPGGC (WordPress is not included here because it does not rely on external packages).

Improper introduction of dev packages

Composer by default places external software in the vendor directory. If this directory can be accessed due to server misconfiguration, it may be exploited for RCE (e.g., PHPUnit).

Experimental results indicate that phpMyAdmin and Magento have this issue.

Qualitative analysis of deleted code

Because so many files and so much code were deleted, the paper used the k-means clustering algorithm to group the removed files, applying TF-IDF with a maximum document-frequency cutoff so that path components appearing in more than 50% of file paths are ignored.
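
The frequency cutoff can be illustrated on its own. The sketch below covers only that filtering step, not TF-IDF weighting or k-means, and the tokenization, file paths, and function name `informative_tokens` are assumptions for illustration rather than the paper's tooling.

```python
import re
from collections import Counter

def informative_tokens(paths, max_df=0.5):
    """Tokenize file paths and drop tokens found in more than max_df of them."""
    tokenized = [set(re.findall(r"[a-z0-9]+", p.lower())) for p in paths]
    doc_freq = Counter(tok for toks in tokenized for tok in toks)
    limit = max_df * len(paths)
    return [sorted(t for t in toks if doc_freq[t] <= limit) for toks in tokenized]

# Hypothetical paths of removed files.
paths = [
    "vendor/phpunit/src/Runner.php",
    "vendor/phpunit/src/Assert.php",
    "vendor/twig/lib/Lexer.php",
    "libraries/export/Pdf.php",
]

tokens = informative_tokens(paths)
print(tokens[0])
```

Tokens such as `vendor` and `php` occur in more than half of the paths and are dropped, so the clustering would be driven by the distinguishing components like `phpunit` or `export`.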

Exploit testing on the debloated app

Finally, the authors collected CVEs for these four PHP web apps that have exploits in the Metasploit framework and wrote PoCs for them based on publicly available vulnerability information.

After verifying that the original versions of the web apps could be successfully exploited, tests were conducted on the debloated versions, with half failing (4 out of 8).

This result indicates that while debloating is not a panacea for web app security, it is effective.

Performance analysis#

Since code coverage tools increase performance overhead, this section discusses the overhead analysis of the XDebug tool, comparing Selenium scripts with and without XDebug.

The results show that, for all four web apps, enabling Xdebug increased execution time, CPU consumption, and memory consumption.

However, this overhead can be reduced by improving the coverage calculation method, such as calculating coverage offline, which will be discussed later.

Limitations and future work#

Summarizing the previous work, debloating can reduce hundreds of thousands of lines of irrelevant code, decrease cyclomatic complexity by 30% to 50%, and delete about half of the code related to CVEs that cause vulnerabilities. Even for vulnerabilities that cannot be deleted, debloating can remove some gadgets, making them harder to exploit.

The author believes that this work is still incomplete and has the following limitations:

Lack of exploitable vulnerabilities

Few exploits are publicly available, and working reproductions and detailed descriptions of vulnerabilities are scarce as well.

The authors also point out the absence of automated exploitation frameworks for web apps (like BugBox), which would greatly assist researchers' work.

Dynamic code coverage

Web debloating heavily relies on dynamic code coverage analysis, and even with four replicable and unbiased application configuration scenarios, it still cannot claim to cover all benign states of web apps.

In short, the depth of coverage is insufficient, and the author intends to follow up through crowdsourcing and user studies.

Additionally, since this pipeline removes unnecessary features for specified user groups, it cannot perform general static analysis work. However, the author proposed that static analysis could be conducted on the code after debloating to ensure that the required functionality for these user groups still exists.

Handling requests to deleted code

When real users request deleted code, how should it be handled? Simply exiting the application and returning an error is not enough; the deleted code should be reintroduced to handle user requests, and it must be determined whether the request is malicious beforehand.

Metrics for measuring debloating effectiveness

This paper uses reductions in cyclomatic complexity, logical lines of code (LLOC), CVEs, and POP chains as four metrics to measure effectiveness.

However, each line of code contributes differently to the program's attack surface, and the CVE standard does not apply to proprietary software. Additionally, CVEs need to be manually mapped to verify exploitability, which is labor-intensive.

Efficiency of debloating

The efficiency of debloating modular applications is significantly different from monolithic applications (like WordPress).

Here, the author mentions several static analysis debloating works, as well as debloating work for web clients (reducing the attack surface of Chrome), and a dynamic analysis work for custom PHP web applications (the limitation of this work is that it cannot quantitatively determine the number of vulnerabilities reduced because it is custom).

Conclusion#

Since my own thoughts have already been summarized in the previous Overview of paper, I will just paste the original abstract and conclusion here.

Abstract

As software becomes increasingly complex, its attack surface expands enabling the exploitation of a wide range of vulnerabilities. Web applications are no exception since modern HTML5 standards and the ever-increasing capabilities of JavaScript are utilized to build rich web applications, often subsuming the need for traditional desktop applications. One possible way of handling this increased complexity is through the process of software debloating, i.e., the removal not only of dead code but also of code corresponding to features that a specific set of users do not require. Even though debloating has been successfully applied on operating systems, libraries, and compiled programs, its applicability on web applications has not yet been investigated. In this paper, we present the first analysis of the security benefits of debloating web applications. We focus on four popular PHP applications and we dynamically exercise them to obtain information about the server-side code that executes as a result of client-side requests. We evaluate two different debloating strategies (file-level debloating and function-level debloating) and we show that we can produce functional web applications that are 46% smaller than their original versions and exhibit half their original cyclomatic complexity. Moreover, our results show that the process of debloating removes code associated with tens of historical vulnerabilities and further shrinks a web application’s attack surface by removing unnecessary external packages and abusable PHP gadgets.

Conclusion

In this paper, we analyzed the impact of removing unnecessary code in modern web applications through a process called software debloating. We presented the pipeline details of the end-to-end, modular debloating framework that we designed and implemented, allowing us to record how a PHP application is used and what server-side code is triggered as a result of client-side requests. After retrieving code-coverage information, our debloating framework removes unused parts of an application using file-level and function-level debloating. By evaluating our framework on four popular PHP applications (phpMyAdmin, MediaWiki, Magento, and WordPress) we witnessed the clear security benefits of debloating web applications. We observed a significant LLOC decrease ranging between 9% to 64% for file-level debloating and up to an additional 24% with function-level debloating. Next, we showed that external packages are one of the primary sources of bloat as our debloating framework was able to remove more than 84% of unused code in versions that used Composer, PHP’s most popular package manager. By quantifying the removal of code associated with critical CVEs, we observed a reduction of up to 60% of high-impact, historical vulnerabilities. Finally, we showed that the process of debloating also removes instructions and classes that are the primary sources for attackers to build gadgets and perform POI attacks. Our results demonstrate that debloating web applications provides tangible security benefits and therefore should be seriously considered as a practical way of reducing the attack surface of web-applications deployments.