Hacking Databases, Part 1: Getting Started with Terms and Technologies

This is the first installment of a new series I am running here on Hackers Arise called “Hacking Databases”. We will start with the basics and progress to the more advanced techniques for hacking databases.

Although there is a multitude of different hacker types, the one target they all share is the database. I often refer to the database as the hacker’s Holy Grail, or the ultimate prize for an effective hack.

That’s because the database—almost by definition—is the repository of all the goodies that the hacker is pursuing. These goodies might include credit card data, personally identifiable information, usernames and passwords, and intellectual property. Each of these items can be sold for significant sums of money in the black market.

For instance, credit card numbers can be sold for between $5 to $50 depending upon the quality (gold and platinum cards are worth more than regular cards) and their credit limit. So, a hack that reaps 200,000 credit cards numbers is worth anywhere from $1 to $10 million!

At the same time, state sponsored hackers, like China, are after intellectual property. This would include designs, formulas, engineering, and other things that they can steal and bring to China for manufacturing by Chinese companies. This is happening every day and some consider it the greatest theft in the history of the world!

This is the first installment of a new Hackers Arise series on database hacking. Before we examine how to hack a database, we need to understand the basics of how databases work, what SQL is, and what vulnerabilities are common to databases.

So, let’s get started so that we can quickly get to some DB hacking!

What Is a Database?

A database is a system of software to store and retrieve information in a structured format. Early databases were flat files, kind of like a big Excel file. As databases got bigger and bigger, this simple structure proved inefficient.

As a result, a scientist at IBM, Dr. Codd, developed a structure that came to be known as the relational database model. It is this model that about 97% of all databases now use, and this includes all the major software companies.

The basics of the relational model is that data should be placed in separate tables with unique keys that link the tables to avoid data duplication and to ease the retrieval of this data.

The Structure of a Database

This relational database model links data from separate tables by using a shared column or “key”. The diagram below is of a simple relational database model where all the tables are linked by the column “ID”.

Major Vendors in the Database Market

The enterprise database market has multiple vendors offering products that can accomplish the same task, but in different ways. The major players in this market are:

Oracle. They are the behemoth in this market with nearly 50% market share. They own multiple different database software products, including their namesake and MySQL.
Microsoft SQL Server. Microsoft entered this market in the early ’90s by teaming up with Sybase to develop an enterprise database offering. As a result, MS SQL Server and Sybase still share many similarities. Originally, Microsoft was only a player in the small business market, but is slowly gaining traction in the larger enterprise market.
MySQL. This is an open-source database that you will find behind so many web sites, in part, because it’s free.
IBM DB2. IBM was the original enterprise database provider and made many the major developments in database design, but like everything about IBM, it has been in decline in recent decades.
sqllite – sqlite is an open source, lightweight relational database. Due to its lightweight and efficiency, it is being built into nearly every mobile device and browsers where data must be stored and retrieved, while using little power and CPU cycles. Although hacking these type of databases has not not been popular among hackers, I suspect as more and more data moves to our mobile devices, it will be come more popular and valuable.

Other major vendors include Sybase, SAS, PostgreSQL (open source), and many others. Generally, like any hack, we need to do good recon to determine the software and version to be successful as most of the database hacks are vendor specific.

Structured Query Language (SQL)

When IBM developed the early databases, they also developed a programming language for managing and manipulation this data. They called it “Structured Query Language,” or as it is generally known, SQL.

This is a simple language that uses English words in similar ways that humans who speak English use them. For instance…

SELECT means “select some data from columns in a table”
FROM means “get the data from this table”
WHERE means select the data that meets this condition (lastname = ‘smith’).

Furthermore, words such as UPDATE, INSERT, and DROP mean in SQL exactly what you would expect them to mean.

SQL is not picky about syntax, but it is picky about logic. Although best practice is to CAPITALIZE all keywords (SELECT, FROM, WHERE), it’s not required. In addition, white space is ignored. All but Microsoft, though, require that a SQL statement to end in a semicolon (;). On Microsoft products, it’s optional.

SQL is standardized by ANSI, but this standardization only includes about 80% of the language or the core of SQL. Software publishers are free to add additional commands and features that are not part of the standard. This can sometimes make it difficult to transport SQL code between DBMS. It also makes it critical to do good reconnaissance on the database to know the manufacturer and the version before attacking as the attacks are often specific to the manufacturer and the version.

Each of the DBMS can be used from a command line, but each has its own GUI. Recently, MySQL released a new GUI called Workbench as seen in the previous section.

Oracle, Microsoft, and the others have similar GUIs that allow the administrator to access their systems.

Basic SQL Query

When we need to extract data from the database, it’s said that we are “querying” the database. As databases are repositories for data, the ability to extract or query data is among the most important functions. As a hacker, we want to extract data, so the query is critical to getting the goods.

The basic structure of the query looks like this:

SELECT <columns> FROM <table> WHERE <conditions>

This statement says “give me the data in the columns listed in the SELECT statement from the table that comes after the FROM keyword, but only give me the rows that meet the conditions that come after the WHERE keyword.”

So, if we wanted to get first name, last name, username, and password from the staff table for employees with the last name of “Hillyer,” we could construct a statement like this:

SELECT first_name, last_name, username, password FROM staff WHERE last_name = ‘Hillyer”;

As you can see in the screenshot above, we have extracted Mike Hillyer’s record from the staff table with his username and password (the password is actually a hash of his password).

SQL Injection

SQL injection is the most common database attack and is probably responsible for the largest dollar volume of losses from cyber crime and advanced persistent threat (APT).

It basically involves putting SQL commands in the data forms of webpages that connect to a database. These forms then send these SQL commands back to the database and the database will either authenticate the user (yes, authentication is a database function) or give up the target data.

In future tutorials, we will spend quite a bit of time using SQL injection tools and executing SQL injection attacks.

Other Vulnerabilities

Besides showing you how to do SQL injection, we will examine some of the other of vulnerabilities in databases. These involve authentication, using the database to compromising the operating system, and several others.

Now that we having covered the basic terminology and technology related to databases, in future tutorials I will show you how to hack into these databases, so keep coming back!