[Paper Reading, USENIX 2019] Less is More: Quantifying the Security Benefits of Debloating Web Applications

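Preface#

This paper is the predecessor of a PHP debloating paper published at USENIX this year; it is commonly abbreviated as LIM (Less is More)

profile: a configuration or usage scenario; profiling: analysis (performance analysis, behavior analysis, etc.); profiler: the tool that performs the analysis

bloat: unnecessary code growth; debloat: removing that bloat

ambient authority: a term from access-control research. A subject is said to use ambient authority when it can complete an action merely by naming the object it needs and the operation it wants to perform on that object.

monkey testing: a technique in which an application or system is tested by feeding it random input and checking its behavior, or whether it crashes. Monkey testing is usually implemented as randomized, automated unit tests.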

Overview of paper#

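The paper uses dynamic analysis to record the code coverage of PHP web applications and removes the code that is never used, which is what debloating means here

In summary, the work has the following parts:

Select CVEs and map them onto the source code of the web apps
Choose four different sources of usage and simulate how the apps are used
Record code coverage and identify the unused files / functions
Debloat according to the coverage to obtain the debloated apps (file-level debloating and function-level debloating)
Run the normal usage simulation against the debloated apps to check that they still work
Run exploits for known CVEs against both the debloated and the original apps to evaluate the effect of debloating (that is, whether it removed the code responsible for the vulnerability), along with various other evaluations and comparisons

The paper illustrates this with Figure 1, which contains a typo: Expoits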

background#

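Software debloating has already been applied successfully to operating systems (removing unnecessary code from the Linux kernel), shared libraries, and compiled binary applications

This paper is the first to evaluate how well debloating applies to web apps, and whether it can remove the code responsible for vulnerabilities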

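Motivation for web debloating#

The authors use Symfony's CVE-2018-14773 as the motivating example

The framework supports a legacy IIS header that can be abused. If the server does not need that header, the code supporting it can be removed, which is exactly what debloating does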

Target PHP web apps#

phpMyAdmin: database management
WordPress: blog management
MediaWiki: wiki management
Magento: e-commerce management

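Mapping vulnerabilities to source code#

For each web app, the 20 most critical CVEs by CVSS score were selected, all of them from 2013 or later

Because different CVEs affect different versions, the vulnerabilities had to be mapped across several versions of each app in order to cover all of them (see the table below)

The affected versions and line numbers of every CVE are recorded in a database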

Web Application	Version	Known CVEs(≥2013)
Magento	1.9.0, 2.0.5	10
MediaWiki	1.19.1, 1.21.1, 1.24.0, 1.28.0	111
phpMyAdmin	4.0.0, 4.4.0, 4.6.0, 4.7.0	130
WordPress	3.9.0, 4.0, 4.2.3, 4.6, 4.7, 4.7.1	131

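Simulating web app usage#

Four methods are used to simulate usage of the apps, aiming for functional coverage (in other words, code coverage) that is as broad and deep as possible

Tutorials (executed as Selenium scripts)
Monkey testing
Crawlers
Vulnerability scanners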

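Recording web app code coverage#

PHP profilers are shipped as PHP extensions that hook into the PHP engine to collect code coverage; the paper uses XDebug

The straightforward idea is to call xdebug_start_code_coverage() at the beginning of every PHP file and xdebug_get_code_coverage() at the end, but the authors ran into some difficulties

First, any PHP file can call exit() or die() to terminate early, so coverage has to be collected and logged before those calls

Second, a shutdown function has to be registered and placed at the end of the shutdown function queue

Finally there are destructors: if an object is destroyed after the shutdown functions have run, that code is not covered, so the destructors were rewritten to register themselves when they execute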
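Once coverage is logged per request, the dumps have to be merged into one picture of what was executed. The following is a minimal Python sketch of that merge step, not the authors' implementation. It assumes each dump has the shape that xdebug_get_code_coverage() returns (file path mapped to line number mapped to a status, where a positive status means executed and negative values mark unused or dead lines); the function name merge_coverage and the sample paths are hypothetical.

```python
from collections import defaultdict


def merge_coverage(dumps):
    """Merge per-request coverage dumps into {file: set(executed line numbers)}."""
    merged = defaultdict(set)
    for dump in dumps:
        for path, lines in dump.items():
            for line_no, status in lines.items():
                # Assumption: positive status = executed, negative = unused/dead.
                if status > 0:
                    merged[path].add(int(line_no))
    return dict(merged)


# Two hypothetical requests hitting overlapping files.
request_1 = {"/app/index.php": {1: 1, 2: 1, 5: -1}}
request_2 = {"/app/index.php": {5: 1}, "/app/login.php": {3: 1, 4: -2}}

covered = merge_coverage([request_1, request_2])
print(covered)
```

A file that never shows an executed line does not appear in the result at all, which is what a later debloating step needs in order to treat it as unused.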

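Debloating strategies#

File-level debloating: remove the PHP files that were never executed
Function-level debloating: finer-grained than file-level; in addition to unexecuted files, functions that were never executed are removed from the files that are kept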
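The difference between the two strategies can be sketched as a small planning function. This is an illustrative Python sketch under stated assumptions, not the paper's code: inventory (every file and the functions it defines), executed_files, and executed_functions are hypothetical inputs that would be derived from the coverage data.

```python
def plan_debloating(inventory, executed_files, executed_functions):
    """Return (files to remove, (file, function) pairs to remove)."""
    # File-level: any file that was never executed is removed entirely.
    file_level = sorted(path for path in inventory if path not in executed_files)
    # Function-level: inside files that are kept, remove unexecuted functions.
    function_level = sorted(
        (path, name)
        for path, names in inventory.items()
        if path in executed_files
        for name in names
        if (path, name) not in executed_functions
    )
    return file_level, function_level


# Hypothetical application inventory and coverage.
inventory = {
    "index.php": ["render", "legacy_export"],
    "admin/import.php": ["import_csv"],
}
executed_files = {"index.php"}
executed_functions = {("index.php", "render")}

files, functions = plan_debloating(inventory, executed_files, executed_functions)
print(files)      # ['admin/import.php']
print(functions)  # [('index.php', 'legacy_export')]
```

Function-level debloating is a superset of file-level debloating here: it removes the same files and then additionally trims the files that survive.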

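Debloating here does not delete the code outright. The code is replaced with a placeholder; if execution reaches a placeholder, the application exits and logs information about the missing function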
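The placeholder idea can be illustrated with a short Python analogue. This is only a sketch of the behavior described above (log the missing function, then stop), with hypothetical names make_placeholder and missing_log; the paper's placeholders are PHP code, not Python.

```python
def make_placeholder(name, missing_log):
    """Build a stand-in for removed code that records the hit and exits."""
    def placeholder(*args, **kwargs):
        missing_log.append(name)  # remember which removed function was needed
        raise SystemExit(f"removed code reached: {name}")
    return placeholder


missing_log = []
# Hypothetical removed function, replaced by its placeholder.
legacy_export = make_placeholder("index.php::legacy_export", missing_log)

try:
    legacy_export("report.csv")
except SystemExit as stop:
    print(stop)

print(missing_log)  # ['index.php::legacy_export']
```

The log is what makes the approach self-correcting: every entry names code that some usage actually needed and that a later debloating round should keep.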

This later proved very effective and recorded many files/functions that should not have been removed

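Experimental results#

Code size is measured not in raw lines of code but in Logical Lines Of Code (LLOC), which excludes comments, blank lines, and lines that consist only of required syntactic structure

Unsurprisingly, function-level debloating removes more code than file-level debloating. The amount also depends on the coding practices of the four projects (for example, WordPress relies little on external packages, whereas Magento and MediaWiki are developed in a more modular way)

Reduction in cyclomatic complexity

Cyclomatic complexity (CC), also called conditional complexity, is the number of linearly independent paths through the code; it can also be read as the minimum number of test cases needed to exercise each of those paths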
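As a worked example, cyclomatic complexity can be computed from a control-flow graph with the standard formula M = E - N + 2P (edges, nodes, connected components). The graph below is a hypothetical function with one if/else followed by one loop; it is not taken from the paper.

```python
def cyclomatic_complexity(edges, nodes, components=1):
    """McCabe's formula: M = E - N + 2P."""
    return len(edges) - len(nodes) + 2 * components


# Control-flow graph of a hypothetical function: if/else, then a loop.
nodes = {"entry", "if", "then", "else", "loop", "body", "exit"}
edges = {
    ("entry", "if"),
    ("if", "then"),
    ("if", "else"),
    ("then", "loop"),
    ("else", "loop"),
    ("loop", "body"),
    ("body", "loop"),
    ("loop", "exit"),
}

cc = cyclomatic_complexity(edges, nodes)
print(cc)  # 3: two decision points (the if and the loop condition) plus one
```

Removing a function or a branch removes nodes and edges from this graph, which is why debloating lowers the measured complexity.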

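I realized I had learned this concept in my third-year software engineering course

Cyclomatic complexity also drops during debloating, which shows that the approach removes complex instructions and execution paths

Reduction in CVEs after debloating

The results show that 38% of the vulnerabilities are removed by file-level debloating, and 10% to 60% by function-level debloating (phpMyAdmin and Magento, with their many external libraries, and the more monolithic WordPress are the two extremes)

Note: the paper counts a vulnerability as removed only if all the files/functions covered by its exploit are removed, not just one link of the chain (even though in most cases removing one link already breaks the chain and makes the vulnerability unexploitable in practice)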
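That counting rule can be written down as a small check. The sketch below is a Python illustration of the rule as described in this note, with hypothetical CVE labels and locations; it is not the paper's tooling.

```python
def cve_removed(cve_locations, removed):
    """A CVE counts as removed only if every mapped location was removed."""
    return bool(cve_locations) and all(loc in removed for loc in cve_locations)


# Hypothetical mapping from CVE to the files/functions its exploit touches.
cve_map = {
    "CVE-A": {"admin/import.php", "lib/parser.php::parse"},
    "CVE-B": {"index.php::render", "lib/legacy.php"},
}
# Hypothetical set of locations that debloating removed.
removed = {"admin/import.php", "lib/parser.php::parse", "lib/legacy.php"}

results = {cve: cve_removed(locs, removed) for cve, locs in cve_map.items()}
print(results)  # {'CVE-A': True, 'CVE-B': False}
```

CVE-B is not counted even though one link of its chain (lib/legacy.php) is gone, which makes the reported reduction a conservative figure.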

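For this part, I do not think the authors explain very clearly the concrete rules by which debloating is derived from the four usage scenarios above

There are two kinds of usage. One is normal usage that does not trigger vulnerabilities (such as the tutorials); the other deliberately exploits vulnerabilities (such as the vulnerability scanner) or drives the application into abnormal states. Monkey testing produces both kinds

For now I assume debloating follows these rules:

Debloating must keep the program working, i.e. when functionality conflicts with the possibility of a vulnerability, functionality wins

Code outside the files/functions covered by normal usage is removed

If the path covered by malicious use partly overlaps the path covered by normal usage, the non-overlapping part is removed and the overlapping part is kept under the first rule

Put simply, everything not covered by normal usage is removed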

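Impact of vulnerability type

How much is debloated varies by vulnerability type. Command execution and SQL injection vulnerabilities, for example, are easily debloated (they often live in rarely used modules), while crypto- and cookie-related vulnerabilities are not (they usually live in the main cryptographic functions, which are core components and cannot be removed)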

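Checking POI vulnerabilities

POI stands for PHP Object Injection; in CTFs it is simply the PHP deserialization vulnerability

The authors use PHPGGC, a tool that generates POP exploit chains, to attack the debloated apps

The results show that function-level debloating removes the vulnerabilities corresponding to every exploit chain available in PHPGGC (WordPress is excluded, because it does not depend on external packages)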

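Improper inclusion of dev packages

Composer places external packages in the vendor directory by default. If a server misconfiguration makes that directory reachable, it may be exploitable for RCE (PHPUnit is one example)

The experiments show that phpMyAdmin and Magento have this problem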

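Qualitative analysis of the removed code

Because so many files and so much code are removed, the paper uses k-means clustering to group the files, with a TF-IDF maximum document frequency limit that ignores path components appearing in more than 50% of the file paths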
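The 50% document-frequency limit can be illustrated in isolation. The Python sketch below covers only that filtering step, on hypothetical file paths, and assumes paths are tokenized on slashes, dots, underscores, and hyphens; the TF-IDF weighting and the k-means clustering themselves are not shown, and the paper's exact tokenization is not known from this note.

```python
import re
from collections import Counter


def path_tokens(path):
    """Split a file path into lowercase components (assumed tokenization)."""
    return [tok for tok in re.split(r"[/._\-]+", path.lower()) if tok]


def distinctive_tokens(paths, max_df=0.5):
    """Drop tokens that occur in more than max_df of all paths."""
    doc_freq = Counter()
    for path in paths:
        doc_freq.update(set(path_tokens(path)))  # count each token once per path
    limit = max_df * len(paths)
    return {p: [t for t in path_tokens(p) if doc_freq[t] <= limit] for p in paths}


# Hypothetical removed files.
paths = [
    "vendor/phpunit/src/Runner.php",
    "vendor/symfony/yaml/Parser.php",
    "vendor/symfony/console/Command.php",
    "libraries/export/csv.php",
]

for path, tokens in distinctive_tokens(paths).items():
    print(path, tokens)
```

Tokens such as vendor and php appear in most paths and carry no grouping signal, so they are dropped, while symfony (in exactly half of the paths) is kept and can pull related files into the same cluster.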

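Exploit testing against the debloated apps

Finally, the authors collected the CVEs for these four PHP web apps that are present in the Metasploit framework and wrote PoCs for them based on public vulnerability information

After verifying that the original versions could all be exploited, they tested the debloated versions; half of the exploits failed (4 of 8)

This shows that debloating is not a silver bullet for web app security, but it is effective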

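Performance analysis#

Because the code coverage tool adds overhead, this section analyzes the overhead of XDebug by running the Selenium scripts with and without it

The results show that, for all four web apps, execution time, CPU consumption, and memory consumption all increase with XDebug enabled

This overhead can be reduced by changing how coverage is computed, for example by computing it offline; that work is left for the future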

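Limitations and future work#

To summarize the results so far: debloating removes hundreds of thousands of lines of unneeded code, reduces cyclomatic complexity by 30% to 50%, and removes the vulnerable code behind roughly half of the CVEs. Even for vulnerabilities that cannot be removed, debloating can remove some gadgets and make them harder to exploit

The authors consider the work incomplete and list the following limitations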

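Lack of exploitable vulnerabilities

Publicly available exploits are scarce, including exploit reproductions and detailed write-ups

The authors also mention the lack of automated exploit scripts for web apps (such as BugBox), which would greatly help researchers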

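Dynamic code coverage

Web debloating depends heavily on dynamic code coverage analysis, and even with four reproducible, unbiased usage profiles, the authors cannot claim to have covered every benign state of a web app

Put simply, the coverage is not deep enough; the authors plan to follow up with crowdsourcing and user studies

In addition, because the pipeline removes features that a specific set of users does not need, generic static analysis cannot be applied. The authors suggest, however, running static analysis on the debloated code to make sure the features those users need are still present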

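Handling requests to removed code

What should happen when a real user's request reaches removed code? Exiting the application and returning an error is not enough; the removed code should be reintroduced to serve the request, after first determining whether the request is malicious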

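Metrics for debloating effectiveness

The paper measures effectiveness with four metrics: reduction in cyclomatic complexity, reduction in logical lines of code (LLOC), reduction in CVEs, and reduction in POP chains

However, each line of code contributes differently to a program's attack surface. CVEs are also not applicable to proprietary software, and they must be mapped by hand before exploitability can be verified, which is a huge amount of work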

Efficiency of debloating

Debloating is clearly more effective on modular applications than on monolithic ones (such as WordPress)

Summary#

Since my own summary is already written in the Overview of paper section above, I will just paste the original abstract and conclusion here

Abstract

As software becomes increasingly complex, its attack surface expands enabling the exploitation of a wide range of vulnerabilities. Web applications are no exception since modern HTML5 standards and the ever-increasing capabilities of JavaScript are utilized to build rich web applications, often subsuming the need for traditional desktop applications. One possible way of handling this increased complexity is through the process of software debloating, i.e., the removal not only of dead code but also of code corresponding to features that a specific set of users do not require. Even though debloating has been successfully applied on operating systems, libraries, and compiled programs, its applicability on web applications has not yet been investigated.

In this paper, we present the first analysis of the security benefits of debloating web applications. We focus on four popular PHP applications and we dynamically exercise them to obtain information about the server-side code that executes as a result of client-side requests. We evaluate two different debloating strategies (file-level debloating and function-level debloating) and we show that we can produce functional web applications that are 46% smaller than their original versions and exhibit half their original cyclomatic complexity. Moreover, our results show that the process of debloating removes code associated with tens of historical vulnerabilities and further shrinks a web application’s attack surface by removing unnecessary external packages and abusable PHP gadgets.

Conclusion

In this paper, we analyzed the impact of removing unnecessary code in modern web applications through a process called software debloating. We presented the pipeline details of the end-to-end, modular debloating framework that we designed and implemented, allowing us to record how a PHP application is used and what server-side code is triggered as a result of client-side requests. After retrieving code-coverage information, our debloating framework removes unused parts of an application using file-level and function-level debloating.

By evaluating our framework on four popular PHP applications (phpMyAdmin, MediaWiki, Magento, and WordPress) we witnessed the clear security benefits of debloating web applications. We observed a significant LLOC decrease ranging between 9% to 64% for file-level debloating and up to an additional 24% with function-level debloating. Next, we showed that external packages are one of the primary source of bloat as our debloating framework was able to remove more than 84% of unused code in versions that used Composer, PHP’s most popular package manager. By quantifying the removal of code associated with critical CVEs, we observed a reduction of up to 60% of high-impact, historical vulnerabilities. Finally, we showed that the process of debloating also removes instructions and classes that are the primary sources for attackers to build gadgets and perform POI attacks.

Our results demonstrate that debloating web applications provides tangible security benefits and therefore should be seriously considered as a practical way of reducing the attack surface of web-applications deployments.

For1moc