Introduction to code review

This exercise covers the different ways to perform code review. It also contains a simple application to review to help you get started.

Free
Tier
Easy
--
0


Objective

The application we are going to work on is a simple PHP application that allows users to upload and download files. Think of it as an over-simplified Dropbox.

Introduction

Why bother doing a code review? There are multiple reasons:

  • It can be faster than penetration testing. Some issues are really easy to spot during a code review (for example, a weak encryption algorithm), where others can take a lot more time (an XSS for example).
  • Compliance may require you to perform a security code review (for example, as part of PCI DSS 6.3.2).
  • After doing penetration testing for a while; you want to do something different.
  • You want to find better bugs. Some bugs you will find during a code review can be surprisingly hard to discover with black-box testing.
  • You want to check if some code is backdoored (which is actually really hard to do).
  • You want to write an exploit for a bug.
Methodologies

There are plenty of ways to perform a code review. You can find some methods below:

  • String matching/Grep for bugs.
  • Following user inputs.
  • Reading source code randomly.
  • Read all the code.
  • Check one functionality at a time (login, password reset...).
String matching/Grep for bugs

This is probably the fastest way to find low-hanging fruits; you just try to find patterns of known vulnerabilities. For example, you can use grep to find calls to the PHP system function:

$ grep -R 'system($_' *

You can find a list of regular expressions to try on your code base in the GRaudit project.

This approach suffers from many limitations:

  • You don't get much coverage/assurance on the quality (and therefore security) of the source code. You only know that, based on your list of patterns, you couldn't find any issue.
  • You need to know all the dangerous functions/patterns.
  • You may end up using very complex regular expressions.
This approach works pretty well for timeboxed reviews where you don't have enough time. It can also help you get familiar with a code base as part of a longer review. However, it's probably not the best way to perform proper reviews.

Following user inputs

Another way to approach a review is to follow all the user-controlled inputs and identify all the ways to access the application (the routes/URIs available), for example in PHP:

  • $_POST/$_GET/$_REQUEST.
  • $_COOKIE/$_SERVER.
  • Data coming from the database (for stored XSSs and second-order injections for example).
  • Data read from a file or a cache.
  • ...

This method provides good coverage. However, you will need a good understanding of the framework/language used. Finally, you may end up reviewing the same function again and again if it's called multiple times.

Read everything

The more time-consuming way: just start reading the code one file at the time. A better way to do this is to try to find weaknesses, not vulnerabilities. From there, you can see if the weaknesses can become vulnerabilities on their own or by chaining them.

This method is obviously the most laborious way of working, but it brings excellent coverage. It's crucial to keep good notes when using this approach.

By functionality

Another common way to do a review is to pick one functionality (at a time), for example:

  • "Password reset".
  • "Database access".
  • "Authentication".

And review all the code associated with this functionality. This works especially well if you do this across multiple applications/frameworks, as they will all have different behaviors.

This approach gives you an excellent coverage of the functionality reviewed and will teach you what mistakes people typically make for a given functionality. However, you only have coverage of what you reviewed.

What to look for?

When doing a review, you need to look for everything:

  • Weird behaviors.
  • Missing checks.
  • Complexity.
  • Security checks already in place.
  • Differences between 2 functions/methods/classes.
  • Comparisons and conditions (if/else).
  • Regular expressions/string matching.
  • What is missing?

You will probably encounter functions, classes and methods you aren't currently aware of. To solve this issue, you need to:

  • Google them.
  • Test their behavior.

It's going to take time (especially early on), but the more code you review, the easier it gets.

Make sure you create a snippet with the function/class/method to test its behavior. This will make future reviews far simpler. To test it, you will need to run the snippet locally and try to find some edge cases that the developers may have missed.

Hands-on

For this exercise, you have a really simple web application. We are going to go with the "read everything" approach as there isn't that much source code to read.

To get started, don't focus on finding vulnerabilities, try to find weaknesses, some thoughts to keep in mind are:

  • "What doesn't seem right".
  • "What happens if I put [X] here".

Then, try to see if these weaknesses can become vulnerabilities. Either on their own or by chaining them.

The application

The application is straightforward and has a dozen security issues. As a user, you can only:

  • Register/Login/Logout.
  • Upload and retrieve files.

To get started, you can get the code:

  • Using git:
git clone https://github.com/PentesterLab/cr

You can use the tool cloc to get a better idea of the size of the application:

% cloc .
        14 text files.
        13 unique files.
        2 files ignored.

  github.com/AlDanial/cloc v 1.72  T=0.11 s (120.6 files/s, 46503.2 lines/s)
  -------------------------------------------------------------------------------
  Language                     files          blank        comment           code
  -------------------------------------------------------------------------------
  CSS                              2            676             11           3973
  PHP                             10             48              4            289
  SQL                              1              5              0              5
  -------------------------------------------------------------------------------
  SUM:                            13            729             15           4267
  -------------------------------------------------------------------------------
List of weaknesses

You can find below the list of issues present in the application:

  • Hardcoded credentials or secrets.
  • Information leak.
  • Missing security flags.
  • Weak password hashing mechanism.
  • Cross-Site Scripting.
  • No CSRF protection.
  • A directory listing.
  • A cryptographic issue.
  • A signature bypass.
  • An authentication bypass.
  • An authorization bypass.
  • A remote Code Execution.

This section is intentionally left blank.

The issues
Hardcoded credentials/secrets

It's pretty easy to spot the hardcoded credentials or secrets if you read all the code. A modern web application should store all secrets outside of its source code. Here we can see that the following files contain some sensitive information:

  • classes/db.php:
$lnk = mysql_connect("127.0.0.1", "pentesterlab", "pentesterlab");
  • classes/jwt.php:
return hash("sha256","donth4ckmebr0".$data);

During the deployment of the application, secrets and credentials should be loaded into the application (via environment variables or a configuration file). This will limit access to production secrets/credentials for developers.

Information leak

To find this issue, you need to think about the way the application will get deployed. The web root of the application is the main directory of the source code. Therefore, the following files and directories will be publicly available:

  • .git: this can be used to rebuild the full source code of the application.
  • deploy.sql: this contains the database password in cleartext and the hashed admin password.
Security flags on cookies

The application uses setcookie to set cookies for authentication. However, it doesn't set the secure and HttpOnly flags in the following files:

  • login.php:
setcookie("auth", User::createcookie($_POST['username'], $_POST['password']));
  • register.php:
setcookie("auth", User::createcookie($_POST['username'], $_POST['password']));
Weak password hashing mechanism

If you look at the register function in classes/user.php:

public static function register($user, $password) {
  $sql = "INSERT INTO users (login,password) values (\"";
  $sql.= mysql_real_escape_string($user);
  $sql.= "\", md5(\"";
  $sql.= mysql_real_escape_string($password);
  $sql.= "\"))";
  $result = mysql_query($sql);

You can see that the application uses md5 to hash passwords and the hashing is done without any seed. Furthermore, the hashing of the passwords is done by the database (as opposed to the application). This means that the customers' passwords are likely to end up in the database logs or transmitted in cleartext between the database and the application. Instead, the application should use scrypt, bcrypt or PBKDF2 to store passwords.

Cross-Site Scripting

The application uses the h function to escape data coming from the users. This function is just an alias for htmlentities defined in classes/phpfix.php:

function h() {
  return call_user_func_array("htmlentities", func_get_args());
}

We can see that h is not used in the following places:

  • index.php:
<span class="text text-danger"><b><?php echo $error; ?></b></span>
  • login.php:
<span class="text text-danger"><b><?php echo $error; ?></b></span>
  • register.php:
<span class="text text-danger"><b><?php echo $error; ?></b></span>

However, $error is not under the user's control. Therefore, those are only weaknesses, not vulnerabilities. They should still get fixed to future-proof the application. The exploitable XSS is located in index.php:

<form action="<?php echo $_SERVER['PHP_SELF']; ?>" method="post" enctype="multipart/form-data">
As $_SERVER['PHP_SELF'] is user-controlled, it can be used to trigger an XSS (using a path like /index.php/">script...).

There is also another XSS in classes/user.php that can be triggered by registering the same username multiple times (due to the call to mysql_error).

No CSRF protection

We can also see that none of the forms are protected by a CSRF token. This is an example of the "what is missing ?" bugs, you cannot see them because they are not there.

A directory listing

When users retrieve the list of their files, the following code is called (in classes/user.php):

public static function getfiles($user) {
  $base = "files/".$user;
  if (!file_exists($base)) {
    mkdir($base);
  }
  return array_diff(scandir($base), array('..', '.'));
}

This function gets the list of files (scandir) in files/[USERNAME], and removes the parent (..) and current (.) directory from the list using array_diff.

Nothing in this code (or during registration) prevents an attacker from using the following usernames: .., ../.., ../../... This will allow the attacker to retrieve a directory listing of any directory on the server hosting the application.

A cryptographic issue

The cryptographic issue is located in classes/jwt.php in the signing function:

public static function signature($data) {
  return hash("sha256","donth4ckmebr0".$data);
}

Here we can see that the application is using a hash instead of an HMAC. This creates multiple issues:

  • It's likely to be vulnerable to Length Extension.
  • It's not the RFC way to sign a JWT token. This implementation will not work with other applications.
A signature bypass

The signature bypass is located in classes/jwt.php:

public static function verify($auth) {
  list($h64,$d64,$sign) = explode(".",$auth);
  if (!empty($sign) and (JWT::signature($h64.".".$d64) != $sign)) {
    die("Invalid Signature");
}

We can see that the application gets three elements from the $auth variable:

  • $h64.
  • $d64.
  • $sign.

Then it checks the signature $sign using: JWT::signature($h64.".".$d64) != $sign. However, it only checks the signature if one is provided: if (!empty($sign) and....

An attacker can provide a malicious $auth variable that doesn't contain a signature. This will allow them to bypass the signature mechanism.

An authentication bypass

The authentication bypass is based on a real vulnerability in the Play framework covered in another PentesterLab exercise: Play Session Injection.

Due to the weird JSON parser:

public static function parse_json($str) {
  $data = explode(",",rtrim(ltrim($str, '{'), '}'));
  $ret = array();
  foreach($data as $entry) {
    list($key, $value) =  explode(":",$entry);
    $key = rtrim(ltrim($key, '"'), '"');
    $value = rtrim(ltrim($value, '"'), '"');
    $ret[$key] = $value;
  }
  return $ret;
}

and the way the JSON is created:

$header = str_replace("=","",base64_encode('{"alg":"HS256","iat":'.time().'}'));
$token = "{";
foreach($data as $key=>$value) {
  $token.= '"'.$key.'":"'.$value.'",';
}
$token .= "}";

It's possible to create a malicious user named test","username":"admin to become the admin in the application.

An authorization bypass

The authorization bypass is very simple. When a file is uploaded, it gets saved in the directory: /files/[USERNAME]/[FILENAME]:

public static function addfile($user) {
  $file = "files/".$user."/".basename($_FILES["file"]["name"]);
  if (!preg_match("/\.pdf/", $file)) {
    return  "Only PDF are allowed";
  } elseif (!move_uploaded_file($_FILES["file"]["tmp_name"], $file)) {
    return "Sorry, there was an error uploading your file.";
  }

It's trivial for an attacker to guess [USERNAME] and [FILENAME] to gain access to other users' files. The developer should always implement some form of access control, or make the links impossible to guess.

A remote Code Execution

The RCE is located in the addfile function:

public static function addfile($user) {
  $file = "files/".$user."/".basename($_FILES["file"]["name"]);
  if (!preg_match("/\.pdf/", $file)) {
    return  "Only PDF are allowed";
  } elseif (!move_uploaded_file($_FILES["file"]["tmp_name"], $file)) {
    return "Sorry, there was an error uploading your file.";
  }

We already know that the uploaded files are stored in the web root. We can also see that the developer restricts the uploadable file type to PDF (.pdf). However, the regular expression is missing a $ to match the end of the string. As a result, it's possible for an attacker to upload a file named exec.pdf.php. As the file's extension is .php and the file is in the web root, it will get executed when accessed by the attacker.

Conclusion

This exercise showed you how to perform your first code review to find potential security issues.

I hope the course provides you with enough information and the methodology to get started. As always with security, practice makes perfect. Try to regularly review some code, this could be from:

  • Changes/patches introduced to fix a CVE.
  • Applications you use every day.
  • Applications in a weird language.

Now it's time to apply what you've just learned!

I hope you enjoyed learning with PentesterLab.