by Phil Martin
Type your question here
Type your answer here
If the organization behind the site had access to sensitive information, they could also leverage this. For example, a bank that has access to a person’s credit history could ask the following:
What type of car was purchased at your address in the last five years?
Ford Fusion
Dodge Ram
Chevrolet Cruze
Toyota Sienna
Presumably, an attacker would not have easy access to this information and would not be able to answer correctly. To prevent an attacker from guessing until they hit on the correct answer, the questions usually change on each page load.
More secure than the question challenge approach is an automated out-of-band token sent to the user through a channel that only the user should be able to access. For example, a system generates a unique, one-time, difficult-to-guess token and emails it to the address on record. The user clicks on the link embedded in the email, the system validates the token, and the user can then change their password or have their account unlocked. This process requires a very complex token that cannot be guessed, such as a 32-character GUID.
Another out-of-band channel is a cell phone. In this process the system sends a simple token, usually a numerical value from 4 to 8 digits in length, as a text message to the cell phone already on record for the account in question. The user enters this value and proceeds to gain access to their account. While the email-based token is quite complex and can be valid for several days, the simpler phone-based token should only last for a matter of minutes, since it is much easier to guess by brute force (a 6-digit code has only one million possible values). Since an automated system can execute a brute-force attack very quickly, the expiration time needs to be judged against how fast the system will process each token entered. In some cases, a purposeful delay might need to be introduced to slow an automated attack, but care must be taken not to cause a self-inflicted DoS as a result.
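The two kinds of out-of-band token can be sketched roughly as follows. This is a minimal illustration, not a production design: the function names and the lifetimes (three days for the email token, five minutes for the text-message code) are assumptions chosen to mirror the paragraph above.

```python
import secrets
import time

EMAIL_TOKEN_TTL = 3 * 24 * 60 * 60   # assumed lifetime: three days, in seconds
SMS_CODE_TTL = 5 * 60                # assumed lifetime: five minutes, in seconds


def new_email_token():
    """Return a 32-hex-character token (128 random bits) and its expiry time."""
    return secrets.token_hex(16), time.time() + EMAIL_TOKEN_TTL


def new_sms_code(digits=6):
    """Return a zero-padded numeric code of the given length and its expiry time."""
    code = f"{secrets.randbelow(10 ** digits):0{digits}d}"
    return code, time.time() + SMS_CODE_TTL


def token_is_valid(submitted, issued, expires_at, now=None):
    """Accept a token only if it has not expired and matches the issued value."""
    now = time.time() if now is None else now
    # compare_digest takes the same time wherever the first mismatch occurs.
    matches = secrets.compare_digest(submitted.encode(), issued.encode())
    return now <= expires_at and matches
```

A real system would also invalidate the token after its first successful use and limit the number of attempts per account, which is where the purposeful delay mentioned above would apply.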
The last approach requires the user to contact a human, usually someone staffing a help desk. This is only feasible for intranet credentials or highly secure public applications. In this case, the system must already have sensitive information about the account holder on file, as the user will be required to prove they are the account owner by answering one or more questions over the phone.
We have previously discussed using tokens as a way to maintain a session across multiple transactions without requiring the end-user to provide their credentials for each transaction. For example, the user authenticates once, and the system generates a unique string of characters, called the token, that represents the authenticated credentials.
The concept of a token is often used when dealing with any kind of sensitive information, not just credentials. Anytime we wish there were a way to pass around secret information, we can instead just generate a random token that represents the secret and give that out instead. Of course, to be of any value, at some point the token needs to be given back to us so that we can turn it back into the secret and do something with it. Let’s use a payment system as an example, shown in Figure 84.
Let’s say you have a credit card already stored with WeAreSafe.com, a payment provider that you trust. You decide to purchase a pair of socks from SocksRUs.com, but you really don’t trust this merchant with your credit card information. So, you go to WeAreSafe and ask for a token that represents your credit card number. While checking out with SocksRUs, you give them the token, and they turn right around and give it back to WeAreSafe along with the dollar amount they would like charged to your credit card. WeAreSafe looks up the credit card associated with the token, charges it, and sends the funds to SocksRUs. During this entire process your precious credit card information never leaves the safety of the WeAreSafe database.
Sometimes the token will retain a small amount of the original information, but not enough to do any damage. For example, if your credit card number is 1234-567-8901, then an 11-character token might look like Aksj33h8901, leaving the last 4 digits intact for tracking purposes. Tokenization can be employed for any number of other uses such as banking transactions, stock trading, voter registrations, medical records, and criminal records.
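The payment flow above can be sketched as a small token vault. The class and method names are hypothetical and the storage is an in-memory dictionary; a real payment provider would use a hardened data store and would actually charge the card rather than return a message.

```python
import secrets
import string


class TokenVault:
    """Hypothetical in-memory stand-in for the payment provider's token store."""

    def __init__(self):
        self._cards = {}  # maps token -> card number (digits only)

    def tokenize(self, card_number):
        """Issue a random token that keeps only the last four card digits."""
        digits = "".join(ch for ch in card_number if ch.isdigit())
        alphabet = string.ascii_letters + string.digits
        prefix = "".join(secrets.choice(alphabet) for _ in range(len(digits) - 4))
        token = prefix + digits[-4:]
        self._cards[token] = digits
        return token

    def charge(self, token, amount):
        """Look up the card behind the token; the merchant never sees the card."""
        card = self._cards.get(token)
        if card is None:
            raise KeyError("unknown token")
        # A real provider would charge the card here; this sketch only reports it.
        return f"charged {amount:.2f} to card ending {card[-4:]}"
```

In this sketch the merchant holds only the token and the amount, and the lookup from token back to card number happens inside the vault.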
Figure 84: Example of Tokenization
Injection
An injection flaw occurs when user-supplied data is not validated before being processed by an interpreter. The attacker will provide data that is accepted and interpreted as a command, or a part of a command, allowing the attacker to essentially execute commands on a system in an unauthorized manner. For example, an attacker types in “’;drop table users;” into a user name field, and the vulnerable application concatenates the username value into a string of SQL and executes it, resulting in the Users table being deleted. The most common sources of injection vulnerabilities include URL query strings, form input and applets in web applications. The good news is that injection flaws can be easily discovered using code reviews and scanners. There are four common types of injection attacks:
SQL injection
OS command injection
LDAP injection
XML injection
SQL Injection
SQL injections are probably the most common form of injection attacks since databases are such prime targets. In this scenario, an attacker gets his own input to be executed as part of a SQL command. Let’s dig into the same use case of a username being used for a SQL injection attack we explored earlier.
Suppose a web form collects a username and password and constructs the following SQL string:
Now, if the attacker enters ‘ or 1=1 --’ as the username value, our SQL statement winds up being:
When using SQL Server T-SQL, everything after the ‘--’ is ignored so we wind up executing the following statement:
Which means the query will always return at least one record, allowing our attacker to authenticate without knowing a single username or password. Of course, bypassing authentication is not what an attacker is really after – once he has figured out the application is vulnerable to SQL injection, he will try to map the database and manipulate its contents.
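The SQL listings for this scenario are not reproduced above, so the following is only a sketch of the kind of statement being described. The table and column names (Users, UserName, Password) are assumptions, and it runs against an in-memory SQLite database rather than SQL Server; SQLite also treats everything after ‘--’ as a comment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserName TEXT, Password TEXT)")
conn.execute("INSERT INTO Users VALUES ('alice', 's3cret')")


def vulnerable_login(username, password):
    # Vulnerable on purpose: user input is concatenated into the SQL text.
    sql = ("SELECT * FROM Users WHERE UserName = '" + username +
           "' AND Password = '" + password + "'")
    return sql, conn.execute(sql).fetchall()


# The attacker supplies the payload from the text as the username.
sql, rows = vulnerable_login("' or 1=1 --", "anything")
print(sql)        # the password check now sits after the comment marker
print(len(rows))  # number of rows returned without valid credentials
```

With the payload, the WHERE clause reduces to UserName = '' or 1=1, so the single row in the table comes back even though no valid username or password was supplied.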
A SQL injection attack typically proceeds through the following three steps:
1) Explore an interface to see if it is susceptible to SQL injection by executing various queries.
2) Enumerate the internal database schema by forcing database errors.
3) Exploit the vulnerability by bypassing checks to modify, retrieve or delete data.
Step #2, enumerating the database schema, is where an attacker benefits most when the application has not been specifically coded to prevent information leakage under unexpected conditions, so suppressing database error messages is key to thwarting the attack. However, even if we take care not to leak information, an attacker can use blind SQL injection, in which he constructs simple Boolean expressions to iteratively probe the database. For example, with the previous vulnerability scenario he can enter “’; select * from roles” into the user name field. If an error is generated, he can deduce that there is no table called ‘roles’. He can also note how long a query takes to execute to help determine whether it succeeded.
OS Command Injection
An OS command injection results when an application takes user input and uses it to create an OS command that is then executed. This attack can be doubly dangerous when the least privilege principle is not applied, as a simple command interface can be used to cause all sorts of havoc. There are two types of OS command injections – single command and multiple command.
With a single command vulnerability, the programmer allows the user to specify arguments or conditions under which a pre-defined command will execute. In this situation, the programmer has assumed that the arguments provided by the user are trustworthy.
With a multiple command vulnerability, the programmer allows the user to type in a complete command which is then executed. Beyond being dangerous as a single complete command, the attacker could chain multiple commands together, resulting in a complete security breach. Again, the programmer has assumed that the application interface will never be accessed by a user that is not trustworthy.
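One way to limit the single command case is to pass user input as a discrete argument to a fixed command, without a shell, and to check it against a whitelist first. The sketch below is illustrative only: the child process is a stand-in for a pre-defined command such as ping, and the hostname pattern is an assumed whitelist.

```python
import re
import subprocess
import sys

# Assumed whitelist: letters, digits, dots and hyphens only.
HOSTNAME_RE = re.compile(r"[A-Za-z0-9.-]{1,253}")


def is_safe_hostname(value):
    """Accept only values made up entirely of whitelisted characters."""
    return HOSTNAME_RE.fullmatch(value) is not None


def echo_argument(user_arg):
    """Pass user input as one argument to a fixed command; no shell is involved."""
    # The child simply prints the argument it received.
    argv = [sys.executable, "-c", "import sys; print(sys.argv[1])", user_arg]
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()
```

Because the argument list is handed to the operating system directly, a value such as "example.com; echo pwned" reaches the child as one literal string instead of being split into two commands. The whitelist check would reject that value before the command ever runs.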
LDAP Injection
The Lightweight Directory Access Protocol, or LDAP, is used to access network directories that store information about users, hosts and other objects. If an application does not perform validation of input, this type of injection attack can reveal sensitive information about objects. As an example, if an attacker enters ‘*’ into a user name field which is then used to construct an LDAP query, it could result in the following:
This will result in a listing of all users in the directory. If the user enters “’abeth)(|password=*))’” into the username field, the LDAP query will yield the password for user ‘abeth’.
The best defense against LDAP injection attacks is to escape specific characters. Figure 85 lists the various characters and equivalent escape sequences.
User Input                    Character(s)               Escape Sequence Substitute
To create DN                  & ! | = < > + - “ ’ ; ,    \ (placed before the character)
As part of a search filter    (                          \28
                              )                          \29
                              \                          \5c
                              /                          \2f
                              *                          \2a
                              NUL                        \00
Figure 85: LDAP Character Escaping
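The search-filter escapes in Figure 85 can be applied with a small helper, written here in the backslash-plus-two-hex-digits form defined by RFC 4515. This is a sketch only: the function names are hypothetical, and the attribute name uid is an assumption, since the original LDAP query is not reproduced above.

```python
# Escape sequences for values placed inside an LDAP search filter.
_FILTER_ESCAPES = {
    "\\": r"\5c",
    "*": r"\2a",
    "(": r"\28",
    ")": r"\29",
    "/": r"\2f",
    "\x00": r"\00",
}


def escape_filter_value(value):
    """Replace each special character with its escape sequence."""
    # Working character by character avoids escaping a backslash twice.
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def user_filter(username):
    """Build a search filter from untrusted input (attribute name assumed)."""
    return f"(uid={escape_filter_value(username)})"
```

With the escaping applied, a lone ‘*’ is searched for as a literal asterisk rather than acting as a wildcard that matches every user, and injected parentheses can no longer close the filter early.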
XML Injection
XML injection, like all other injection attacks, results when an application does not properly validate and filter input prior to using it in an unsafe manner. XML injection attacks come in two flavors – XPATH and XQuery. XPATH is actually a subset of XQuery, so both approaches are vulnerable in much the same way. Without going into specifics with the XPATH syntax, a user could include “ ’ or ‘= ’” when entering a password, resulting in the following XPATH syntax:
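The XPath expression for this example is not reproduced above, so the following sketch only illustrates the general shape of the problem. The element names (user, name, password) and the payload ' or ''=' are assumptions, and the code builds the expression string without evaluating it.

```python
import re


def build_login_xpath(username, password):
    # Vulnerable on purpose: values are pasted straight into the expression.
    return (f"//user[name/text()='{username}' "
            f"and password/text()='{password}']")


# Assumed whitelist for credentials: letters, digits and underscore only.
SAFE_VALUE = re.compile(r"[A-Za-z0-9_]+")


def is_safe_value(value):
    """Reject anything containing quotes, spaces or operators."""
    return SAFE_VALUE.fullmatch(value) is not None


injected = build_login_xpath("abeth", "' or ''='")
print(injected)
```

Since ‘and’ binds more tightly than ‘or’ in XPath, the injected expression reads as (name and password check) or ''='', and the second half is always true. A whitelist check like is_safe_value would reject the payload before the expression is built.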
Mitigation
Regardless of the specific type of injection flaw – SQL, OS, LDAP or XML – all result from the same three common traits:
Input from the user is not sanitized or validated.
The query constructed is dynamically created from the user input.
The query is interpreted as a command at run-time.
The most common consequences resulting from injection flaws are the following:
Disclosure, alteration or destruction of data.
Compromise of the operating system.
Discovery of internal structures.
Enumeration of user accounts.
Circumvention of nested firewalls.
Authentication bypass.
Execution of extended procedures and privileged commands.
There are several approaches we can use to mitigate injection attacks. First, all input, regardless of the source, must be considered untrusted and suspect. The values must be sanitized and filtered using a whitelist of allowable characters, so that only positive matches are allowed. If we instead attempt to use a blacklist to avoid unwanted characters, we will face an uphill battle in forever updating the list to keep up with innovative attackers. Validation must happen on the backend and optionally on the front end; always assume an attacker can bypass the front end completely, because they can. Input must be validated for data type, range, length, format, values and canonical representations. SQL keywords such as UNION, SELECT, INSERT, UPDATE, DELETE or DROP should be filtered out, in addition to both single and double quotes and comment characters such as ‘--’.
Encode output, escape special characters and quote all input.
Use structured mechanisms to separate data from code. In other words, never hardcode text in source code.
Avoid dynamic query construction. The best way to ensure this is to use parameterized queries only. This prevents concatenating user-supplied input into a SQL string, and instead references variables as parameters. For example, instead of concatenating a username into a SQL string such as:
We would specify userId.Text as a parameter called ‘userIdValue’ for the following query:
In this way, even if an attacker tries to trip us up by using single quotes and logical statements, the database will treat all input as a single value instead of as part of the SQL statement itself. There is no reason NOT to use parameterized queries in modern languages. The use of parameterized queries is NON-NEGOTIABLE. Did I mention that parameterized queries are important? Always use parameterized queries!
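The query listings for this example are not reproduced above, so here is a minimal sketch of the idea using Python's built-in sqlite3 module. The parameter name userIdValue comes from the text; the table layout and function name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (UserId TEXT, Name TEXT)")
conn.execute("INSERT INTO Users VALUES ('u100', 'Alice')")


def find_user(user_id_text):
    # The SQL text is fixed; the driver binds the value to :userIdValue.
    sql = "SELECT * FROM Users WHERE UserId = :userIdValue"
    return conn.execute(sql, {"userIdValue": user_id_text}).fetchall()
```

Calling find_user("' or 1=1 --") returns no rows, because the whole string is compared against the UserId column as a single value and never becomes part of the SQL statement.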
Use a safe API that avoids the use of an interpreter or which provides escape syntax for special characters. One example is the ESAPI published by OWASP.
Display generic error messages to avoid information leakage.
Implement a failsecure by capturing all errors and redirecting the user to a generic error page, but be sure to log the error first!
Remove any unused functions or procedures from the database and, if possible, any extended procedures that allow a user to run a system command.
Implement least privilege by using views.
Log and audit queries along with execution times to discover injection attacks, particularly blind injection attacks.
Mitigate OS command injection vulnerabilities by running the code in a sandbox that enforces strict boundaries between the process and the operating system. Examples are Linux AppArmor and the Unix chroot jail. Managed code can sometimes provide sandbox protection as well.
Implement a whitelist of allowable commands and reject any command not on the whitelist.
Properly escape LDAP characters as shown in Figure 85.
It is possible that some code cannot be fixed, such as third-party components or legacy software for which it is not cost-effective to address. In these cases, an application layer firewall should be used to detect injection attacks as a compensating control.
Input validation
Perhaps you have heard President Ronald Reagan’s famous statement about nuclear arms reduction talks with the Soviet Union: “Trust, but verify.” Essentially, Reagan was saying to believe the best, but require proof that the best is really happening. In the security world, though, we use the phrase “Don’t trust, always verify.” In other words, always assume the user is an attacker and never trust input. Instead, we must validate all input to ensure the following four statements are true about the data:
It is of the correct data type and format.
It falls within the expected and allowed range of values.
It cannot be interpreted as code, such as the case with injection attacks.
It does not masquerade in alternative forms that can bypass security controls.
With input validation, we have to address the how, where and what. The ‘how’ is partially dependent on the capabilities of the chosen programming language and toolkits. Most languages provide a regular expression, or RegEx, capability that can be used to validate input. RegEx patterns can be quite difficult to understand, can easily increase the maintenance cost of code, and can be a large source of bugs, but they are a great way to implement either whitelist or blacklist filtering. A whitelist is a list of acceptable characters, commands or data patterns. As an example, when filtering an email address, the whitelist would allow only alphanumeric characters along with ‘@’ and ‘.’. The opposite approach is to use a blacklist, which contains a list of disallowed characters, commands or data patterns. Using the email example, a blacklist might contain ‘!#$%^&*()_+-=`~:;”<>,/?’, none of which the whitelist above accepts. Additionally, the blacklist might contain patterns known to be used in attacks. For example, a SQL injection blacklist might contain a single quote, the SQL comment characters ‘--’ or the pattern ‘1=1’. A whitelist is usually considered safer because a mistake in a whitelist most likely denies a valid user, whereas a mistake in a blacklist allows an attacker to proceed.
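The email whitelist described above could be expressed as a regular expression along these lines. The allowed characters come from the text; the overall shape (local part, ‘@’, dotted domain), the length cap and the function name are assumptions made for this sketch.

```python
import re

# Whitelist: alphanumerics plus '@' and '.', arranged as local@domain.tld.
EMAIL_WHITELIST = re.compile(r"[A-Za-z0-9.]+@[A-Za-z0-9]+(\.[A-Za-z0-9]+)+")
MAX_EMAIL_LENGTH = 254  # assumed upper bound on input length


def is_valid_email(value):
    """Return True only when the whole value matches the whitelist pattern."""
    if len(value) > MAX_EMAIL_LENGTH:
        return False
    return EMAIL_WHITELIST.fullmatch(value) is not None
```

This pattern is deliberately strict and will turn away some legitimate addresses, for example ones containing ‘+’ or ‘-’. That is the whitelist failure mode described above: a mistake denies a valid user instead of admitting an attacker.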
Now that we have discussed how to implement data validation, let’s discuss where it should be implemented. Input can be validated on the client, on the server, or both. Whatever the answer, input validation MUST BE implemented on the server regardless of whether it is implemented on the client. NEVER trust data coming from a client. Let’s take a moment here to explain why this is.
Being ‘security woke’, let’s assume that you put in JavaScript to prevent SQL injection text from being entered into an email change field. Let’s also assume that you put in a hidden form field with a static variable specific to this form to detect someone hand-crafting the form POST. Let’s also assume that you use TLS to encrypt the channel to prevent anyone from seeing the data. Even more, let’s assume that you encrypt the session token in a cookie, so an attacker can’t get to it. Sounds pretty secure, doesn’t it?
Here’s what happens. An attacker uses CSRF to get a user to click on a link, which then sends a forged POST request to your server along with the user’s own cookie. The attacker doesn’t need to decrypt the cookie because the browser sends it to you anyway. The forged POST skips your JavaScript injection checks for the email field entirely and includes the hidden field with the static variable, since the attacker has an account with you and has already discovered your ‘tricks’. TLS is great but doesn’t stop this attack, since CSRF sends the data within the legitimate user’s own communications channel. The end result is that the attacker has bypassed all of your cute client-side checks, and if you do not implement the proper server-side validations, you are doomed to fall victim to a SQL injection attack.