Name: CS6035 Project 1- Database Security
SKU: 118558
Availability: InStock

5/5 - (2 votes)

2026 SOLUTION LINK click here!!

2025 SOLUTION LINK click here!!

2024 FALLSOLUTION LINK

2024 SUMMER SOLUTION: LINK

You will learn about two different types of database vulnerabilities and how each attack works. Inference attacks are security attacks that attempt to extract sensitive information from a database by making educated guesses based on the available information. Hackers use this prevalent practice to data mine troves of information to connect the dots to extract sensitive information.

In a SQL injection attack, an attacker injects malicious SQL code into a web application’s input fields to manipulate the database, gain unauthorized access to sensitive data, or execute harmful actions.

In particular, we will cover these learning topics:

Basic T-SQL queries
Basic JavaScript
Database Inference attacks
Database SQL injection attacks

The final deliverables:

A single JSON formatted file will be submitted to Gradescope. This file should be named **project_dbsec.json. _**A template can be found in the /home/dbsec/Desktop _folder.

See Submission Details for more information.

Important Reference Material:

All Encompassing:

SQL Test / SQL Fiddle
Chrome DevTool Documentation
Chrome Debug Javascript
HTML Table to CSV/Excel Converter
Microsoft Excel (you will need to use your Student Account and allow pop-ups)

Inference:

SQL Injection:

Submission:

Gradescope (auto-graded) – see Submission Details

Virtual Machine:

Apple M-based systems (i.e. not Intel Chip)
- You cannot complete this project on an M-based system.
Intel/AMD x64 version
- https://www.virtualbox.org/wiki/Downloads (Version 7.0/latest is required)
- For those curious, the RDBMS used to run the database for this project is Microsoft SQL Server developer edition running on Linux.
- Note: The DBSec project is in the master VM. The link will be provided on project release.

TABLE OF CONTENTS

Frequently Asked Question(s) (FAQ)

Q) Chrome is crashing.

disable/turn off the screen saver/lock
Make sure the VM Application is started
- Start the Container:
  - Click on the Menu Launcher Start Button (Hummigbird bottom).
  - Choose The Menu Option System Tools, then the Menu Link QTerminal.
  - Inside the Terminal window run the command: bash StartContainer.sh
  - You may close the Terminal Window after you see the output: Container Started, currrent datetime.
After opening Chrome, wait a few seconds to allow outside the network resources to load correctly

Q) I deleted or modified files by accident and can’t get them back

You will need to restore the VM from the OVA file

Q) How do I find my GTID?

Please see the Projects page for instructions to retrieve your GTID.

Q)I’m confused about how to start.

Start the Container: + Click on the Menu Launcher Start Button (Hummigbird bottom). + Choose The Menu Option System Tools, then the Menu Link QTerminal. + Inside the Terminal window run the command: bash StartContainer.sh + You may close the Terminal Window after you see the output: Container Started, currrent datetime.
Double-click on Chrome Web Browser Icon on the Desktop (this will open Chrome)
1. If Chrome does not open to http://www.gt-cs6035.com/intro.html automatically, browse to this website or look for the DBSec Bookmark.

Q) Menu Links do not load / Website(s) are not loading (We’re having trouble finding that site).

Wait a little while as the VM may just be spinning to catch up (typically will be seen if you rush into it after you Start the Container).
Stop the Container: + Click on the Menu Launcher Start Button (Hummigbird bottom). + Choose The Menu Option System Tools, then the Menu Link QTerminal. + Inside the Terminal window run the command: bash StopContainer.sh + You may close the Terminal Window after you see the output: Container Stopped, currrent datetime.
Stop the Container: + Click on the Menu Launcher Start Button (Hummigbird bottom). + Choose The Menu Option System Tools, then the Menu Link QTerminal. + Inside the Terminal window run the command: bash StartContainer.sh + You may close the Terminal Window after you see the output: Container Started, currrent datetime.
Double-click on Chrome Web Browser Icon on the Desktop (this will open Chrome)
1. If Chrome does not open to http://www.gt-cs6035.com/intro.html automatically, browse to the website as mentioned earlier or look for the DBSec Bookmark.
If you still experience issues, try restarting your VM.
1. Repeat Steps 2-4

Q)Task 0 submission fails.

Make sure your GTID is correct, with no beginning or ending spaces
Re-complete TASK 0

Q) Task 0 states, “You have reached the maximum # of entries; the VM will need to be restored.”

You will need to restore the VM from the OVA file.

Q)Do I need to re-do Task 0 every time I return to the VM to complete another Task?

No, you only need to do Task 0 once
- The exceptions to this are as follows:
  - You did not enter your correct GTID and your Gradescope Submission for Task0 fails.
  - The VM was in an unstable state and not saved correctly. When trying to complete any task you receive the following question in the FAQ.

Q)Task 1-4 states, “Task 0 needs to be completed before this Task is unlocked”. submission fails.

Complete TASK 0
If you already completed Task 0 and the system was reset, compare the hash you received upon your successful Gradescope submission with the hash you receive from re-completing Task 0

Q)I’m confused about how to solve Task(s) 1-4 and how I can solve this.

Read the Writeup and pay close attention to each Tasks section
- The Writeup can be accessed in the VM (http://www.gt-cs6035.com/intro.html) or outside the VM (https://github.gatech.edu/pages/cs6035-tools/cs6035-tools.github.io/Projects/DatabaseSecurity/) *You will need to log into GitHub with your GaTech credentials
Read through the corresponding DBSec Task in ED to see if a similar question has already been asked and answered
Post your question in the corresponding DBSec Task in ED
- Note: All posts made outside the preconfigured DBSec Task posts will be redirected. This includes private posts.
Extra hints for reading the FAQ (not in the Writeup):
- Report Columns for Tasks 1 and Task 2 are sortable (via click)
- Javascript debugging via Web Developer Tools is helpful for Tasks 3 and 4
- The lessons learned, and logic used in Task 1 will help you in Task 2
- The lessons learned, and logic used in Task 3 will help you in Task 4
- What does “remembering that they often use the same logic on the client and server side mean?
  - Once you access the client-side code via the browser’s Developer Tools and find out how to bypass its security implementation, you will need to do this a again to bypass the database’s security implementation. The database security logic is the exact same as the client-side security logic.

Q) Should my hashes be of different lengths?

flag0 will be your longest hash.
flag1-8 (same length) will be your shortest.
flag9-12 (same length) will be between flag0 & flag1-flag8.

Q) Gradescope will not allow any more submissions.

There is a limit of twenty submissions for this project
We are sorry if you have submitted your JSON file twenty times, but you may no longer submit for this project.
If you feel this is by error, open a private ED discussion asking a TA to review it. Note that any decision will be based on the data provided by the Gradescope audit logs.

Hi, please use this thread throughout the semester; we’ll keep it pinned for your reference.

This thread is where you can discuss any Virtual Machine (VM) troubleshooting you want to share with other students.

Most students have a trouble-free experience with our VM. We can only do so much to support Virtual Box installations; we might recommend you try another machine as a last resort.

Please be sure to read the VM FAQ first! And the Technical Requirements.

You will need a program to run virtual machines, but the TA team for this class only supports the latest version of VirtualBox. If you try other hypervisors, they might work, but if you run into trouble, the TA team will not assist you. Also, the Mx/ARM processors cannot run this VM and, therefore, will not be supported.

Downloads

VirtualBox can be downloaded from here – VirtualBox
The VM image may be downloaded from here – CS6035-Spring2024-rc3.ova

Hash/Checksum of VM Image

Algorithm: SHA256
Hash: D2A19F65BB8C850A4330AFFE7FC0A84DE7ED14B314C6C9E1A2A14E3BD5FF1D9A

To check the SHA-256 checksum of a file in Windows:

Open Windows Powershell
Change the path to the folder that contains the file you want to check
Type Get-FileHash CS6035-Spring2024-rc3.ova
Press Enter

To check the SHA-256 hash of a file on a Mac:

Open Terminal
Change the path to the folder that contains the file you want to check
Type shasum -a 256 CS6035-Spring2024-rc3.ova
Press Enter

To check the SHA256 hash of a file in Linux:

Open Terminal
Change the path to the folder that contains the file you want to check
Type sha256sum CS6035-Spring2024-rc3.ova
Press Enter

This post announces that the Database Security project will be released at midnight on Wednesday, Jan 24. You will have ten days to finish, and it will close on Saturday, Feb 3, at 11:59 PM EDT. Instructions are here – DbSec Project Instructions

The project (and most subsequent projects) will be performed on a class VM with all the projects. Each project will have its own username and password to log into the VM. If you have not already downloaded/set up the VM, you can read how to do so here – VM Download / Questions / Troubleshooting

For the DB Security project, this is the username and password:
Username: dbsec
Password: Iguazu-Falls

Please reference the FAQ & Troubleshooting pages for this project if you encounter issues:

Also, reference the discussion threads:

All questions on the project need to be asked in those locations. We aren’t doing private chats and will ignore questions posted outside those threads.

There is a limit of 20 submissions for the Database Security Project!

You will need your GTID for this and other projects.
This location has recently been changed and was not updated in the VM for the DbSec project.

Where do I find my GTID?

Please see the Projects page for instructions to retrieve your GTID.
Your GTID is a nine-digit numeric ID; it is not your username (DUO 2-factor authentication)

Flag 0

TASK 1: INFERENCE ATTACK- #1 (flags1-4 – 20 pts)

*NOTE – Task 1 has multiple flags (flag1, flag2, flag3, flag4)

Your first two attacks will involve what is called an inference attack. This is not an actual hack against a system, and it often doesn’t involve hackers at all. Data security is about keeping data confidential on a need-to-know basis, and most data at most companies is secured by permissions that limit who can see data. These permissions can be at the level of the table (the full list of a particular type of data, for instance the employee table contains the basic data about employees), the level of a row (a row in this case would correspond to the record for a single employee), or the level of a column (in this case a piece of information about an employee, like gender or position at work). However, it takes thought and care to maintain these controls, and often they can be inadvertently and carelessly circumvented. An inference attack usually involves access to standard reports available to a wide range of employees. Consider the sensitive area of salary. A company might consider that a column to restrict in terms of access and might well make an employee roster report that anyone could view that does not list people’s salaries. However, there may be other standard reports a company uses that are not carefully designed to protect those controls. By using two or more reports together, a rank-and-file employee can discover salary details they are not supposed to know by combining the information on the two reports and inferring the missing data, thus the name “inference attack”. You use multiple data sets to learn information that you are not supposed to know.

This is what has happened with these first reports you are looking at that were provided to you for testing. You have been given four internal reports for a single company. One report is a simple employee roster, one report lists out how long employees have been members of the organization, one report groups data on all employees together to provide the average salary for employees based upon the state that they live in, and the final report lists the average salary for employees based upon how long they have been a part of the company. Individually, these four reports do not violate the controls that the client has placed on access to sensitive HR data. However, as you do your analysis, you realize that there has been a hole opened in those controls to someone who puts the reports together and does a little analysis of the contents of the reports.

The audit reveals to start that at minimum at least one person’s salary is publicly available to all employees that can run these four reports. To successfully complete task 1, you need to find a hole in the protection where you can definitively find the actual salary of employees. After finding the first employee, continue looking for a hole in the protection to find the next employee whose salary you can definitively find. Continue this for 4 employees. Each employee will have a hash associated with them. This hash is unique to your VM and was generated when you completed task 0. Once you have identified which employees have the exposed salary, record the hash value displayed for that employee on the report in your JSON file for flags1-4. You will pass task 1 if you have the correct hash values. Remember that you only get 20 submissions for the entire project, so if you randomly try different hash codes, you will only hurt yourself later in the project.

For Task 1 – order matters (order of finding) for the flags (flag1 is the first employee found, flag 4 is the fourth employee found)!

To earn your hash for flags1-4, you must perform the following actions.

Hover over the Task 1 Menu, which will display four reports: Employee, Duration, Salary by State, and Salary by Duration. \

Click on the report you want to view (i.e., Employee Report). The report will open in a new tab.
Review the data on all four reports. You are searching through the data to see if you can find at least one employee whose salary you can positively identify exactly.
Once you have found an employee whose salary you know exactly, look at the left of their record on the employee report. In the ID column of the report, there will be a hash (ID) for them. Record this hash and enter it into your JSON file. NOTE: the hashes are generated in connection with your GTID so they will be unique to you and your account. You will need to append to this hash a _ followed by the salary of the employee (you may need to do some math).

Hints:

You need to find a place where data is isolated to a single person. Look throughout the four reports to see if there is anything that makes the data unique at an individual level. You might need to combine multiple reports to do this.
Once you isolate an employee, continue to see if you can isolate another employee. Rinse and repeat until you can no longer isolate an employee.
If it looks like a table and acts like a table, it is probably a table. While this is not necessary to complete the task, you can copy data from the report(s) into a spreadsheet program to manipulate it. This may assist you in tracking down the hole that exposes salaries.

Include your flags1-4 hashes_salaries into the JSON file, and now, onto Task 2!

TASK 2: INFERENCE ATTACK- #2 (flags5-8 – 30 pts)

**NOTE – Task 2 has multiple flags (flag5, flag6, flag7, flag8)

Now that you have seen how an inference attack can compromise data in a single company, we next consider how inference attacks can be used to compromise data by combining unrelated data sets.

For this attack, consider four completely unrelated data sources. All four are internet facing and therefore available to anyone with internet access. The first report is a sample report produced for a local hospital concerning types of procedures done within the last few years. The second report is a voter registration database, which has certain well-established fields in it (you could go to your local board of elections and get a report like this on all registered voters in your district most likely). The third report is a partially de-identified report for an insurance company for a sample data set for public use built by the same developers we have been dealing with. “Partially de-identified” means that though there were some attempts made to remove an obvious link to actual patients, there is still some data left on the report that might link back to a real person if looked at carefully. We also have provided a helpful list of medical codes to link to the hospital report.

The problem here comes from the incomplete deidentification of the data in the insurance company’s claim report. Because of this, it may be possible to join these four sources together somehow to find a specific person’s medical history to know that on a specific date, a specific person had a specific procedure done. This is of course a significant breach of HIPAA regulations (in fact, this exercise was inspired by the incident where the medical history of the governor of Massachusetts was made public in just such a fashion). Your objective in this inference attack is to find specific persons who had a procedure that you can link by name and identity to a specific attack in the medical data using all four data sources. You may want to look up the work done on the patient using the codes in the medical history on the medical codes list, which might be helpful in the process. You will find a hash next to each person in the voter report. Select the correct hash as your response to this task by listing the hash in your JSON file for flags5-8. Just as in Task 1, start by isolating the data to one person, then after that isolation, see if you can isolate down to one person again, and contine the process. You will pass task 2 if you have the correct hash values. Remember that you only get 20 submissions for the entire project, so if you randomly try different hash codes, you will only hurt yourself later in the project.

For Task 2 – order does not matters (order of finding) for the flags!

To earn your hash for flags5-8, you must perform the following actions.

Hover over the Task 2 Menu, which will display four sites: Medical History, Voter Registration, Insurance Claims, and Medical Codes. \

Click on the site you want to view (i.e., Medical History). The site will open in a new tab.
You will need to carefully examine all four sources to see where a patient’s history got compromised. It will be up to you to determine how to do this.
Once you have identified the exposed patient, look up the hash code (ID Column) on the voter registration report for the exposed patient and record it. NOTE: the hashes are specific to you and your GTID as provided in task 0 (and we will use that ID when we grade in Gradescope).You will need to append to this hash a _ followed by the cpt (procedure) code of the voter.

Hints:

As with the prior task, feel free to make liberal use of copying data from the VM into a spreadsheet or other similar program on your host, and feel free to do whatever analysis you need to do to figure out the record you seek. As in Task 1, this is not required to complete this task.
unlike Task1 where the data can be more easily narrowed to one record, here you will need to narrow to multiple records to then use logic to narrow to one record. Unlike Task 1 where the order matters to narrow down the data, the order does not matter here as it is logic based instead!
Once you isolate a voter, continue to see if you can isolate another voter. Rinse and repeat until you can no longer isolate a voter.

Include your flags5-8 hashes_cptcodes into the JSON file, and now, onto Task 3!

TASK 3: SQL INJECTION – #1 (flag9 – 10 pts / flag10 – 10 pts / flag11 – 10 pts)

*NOTE – Task 3 has 3 flags (flag9, flag10, flag11)

The next two tasks involve the most common database attack – SQL injection. This attack is one of the most common information security attacks in the world, and yet it is also one of the easiest to mitigate (in other words, lazy or careless programming is what causes this attack to be possible). SQL injection is possible when inputs are not sanitized. That sounds complicated, but what that means simply is that when you give someone a form or a web page to enter data, and there are slots for a user to type in, as a programmer you are responsible to ensure that somewhere in the process those input fields are inspected and any bad input is sanitized. SQL injection happens when the contents of the input fields without data sanitization are used in a text string to create a database query and sent to the database server in a form the developer did not intend.

To get into the basics of SQL injection, you can start by looking up online “SQL Injection Cheat Sheet” which is a helpful introduction to the topic. But what you will need to do to accomplish this task is figure out how to write some basic SQL code (the complexity won’t be in the SQL code, but in the bypass of the SQL Injection security) that can be placed into the input field of a form and passed to the website in such a way that it is a valid SQL statement that does things that the original developer did not intend. This can be as simple as simply bypassing security, collecting information you should not have, or at worst you will be making actual changes to the data of the website that the website owners do not want at all. Once the hole exists, your power to exploit it can be immense as long as you can guess the structure of the underlying database, or to map it out.

For Task 3, you have a website where your user ID is your GTID, as you entered on Task 0. The site wants to upgrade its old Legacy Login page. The legacy page’s main security is done by checking to see if there is a direct match to a user by the username (your GTID). You do not know and will not receive the password to enter the system. For flag9, the developers are asking you to determine the difficulty in bypassing the existing security, knowing that a SQL Injection attack is possible. The developers want to make an attempt to mitigate the risk of such an attack; however, they are wondering if they should be using just client-side data sanitization or both client and server-side data sanitization, with the server-side and client-side data sanitization being the same. For flag10, you will attempt to bypass the security using client-side data sanitization. For flag11, you will attempt to bypass the security using both client and server-side data sanitization. The client provided a snippet of the code they used on the legacy login page. You can use this code to try and figure out how to perform the SQL injection. When you succeed, you will log into the website using your GT ID and no password. The injection will bypass the password requirements and log you in immediately. Once logged in, the hash for flags9-11 keyed to your GT ID will be displayed. All three hashes for Task 3 will be different.

To earn your hash for flags9-11, you must perform the following actions.

Hover over the Task 3 Menu, which will display three links: Legacy Login, New Login – Option 1, New Login – Option 2. \

Click on a link. The page will open in a new tab.

For flag9 (Legacy Login) – this is a straight SQL Injection on the password field.
For flag10 (New Login – Option 1) – this is a client-side data sanitization SQL Injection on the password field.
For flag11 (New Login – Option 2) – this is both a client and server-side data sanitization (2x) SQL Injection of the password field.

Enter in your GTID in the username input field and any additional characters in the password input field you determine will cause the injection. Then press the login button to submit.

Where do I find my GTID?

- Please see the Projects page for instructions to retrieve your GTID.
If your information is correct, you will log into the system and receive your hash.

Hints:

For flag10 and flag11, you should use the browser’s Developer Tools to access and debug the client-side code.
For flag11, remember that there will be server-side data sanitization, that you do not have access to, but know the logic will be a duplicate of the client-side data sanitization.
There is difference between encoding, escaping, and sanitization. Understand the difference and the validity of using one or the other, remembering that the client/server-side code is doing sanitization.
Here is the login code that you were able to obtain from the legacy login page (both username and password have a character limit of 50):

function login($username, $password) {

$sql = “SELECT * FROM users WHERE eid=’$username’ AND password=’$password'”;

$userdata = $this->db->query($sql)->next();

if ($userdata) {

return true;

} else {

return false;

}

Include your flags9-11 hashes into the JSON file, and now, onto Task 4!

TASK 4: SQL INJECTION – #2 (flag12 – 15 pts)

*NOTE – Task 4 has 1 flag (flag12)

In this case, you will be looking at a search engine for a database of music albums for a music store. You have discovered that there is a page called “schema” which offers the user a view of the underlying metadata of the table used to populate the main report. By now, if you have been paying attention, the designers of these sites have followed a similar pattern for how their sites work, especially if you look at the hyperlinks that lead to the pages. You know therefore that there is probably a way to get admin access to that schema leveraging that knowledge of how they use links. And that means that there might be table definitions available to users who poke around and try to find things that they should not.

So your task then is first to figure out if you can find any additional information about tables in the database, and second to figure out how to leverage the information you find in the context of a SQL injection. If you are successful in crafting a SQL injection, you will find an actual database account appear on the report that you can use to log into the system using the “Login” screen on task 4. If you have the correct login information you can only obtain by SQL injection, you can enter the login information into that screen and get access to the management console. Of course, for our purposes “management console” is just the hash you need to get credit for the attack. Once the hash appears on the screen, you can enter it into your JSON file.

To earn your hash for flag12, you must perform the following actions.

Hover over the Task 4 Menu, which will display four links: Schema, Report, and Login.

Click on the link you want to view (i.e., Schema). The link will open in a new tab.
You can use the Schema link to see the schema for the table that is used to build the report. Consider based upon how the links have been designed to this point how you might be able to find out more information than this table. You might need admin access to see information.
Once you have found additional information, consider how you might be able to use it on the report page for a SQL injection.
The report search is a simple “SELECT…..FROM….WHERE” type of query using the schema you can see on the initial schema screen. Based on your previous testing, this probably means you can type in a search that will return rows from other tables in the database than intended. Once you have figured out how to execute the injection, type in your attack into the search field and click “Search”.
Once you have done the injection correctly, you will find a username and a password appear in the report data. Use these values to log into the management console on the “Login” screen. When you have the correct login, you will get into the system and your hash will be displayed. Record the hash into your JSON file and submit to Gradescope.

Hints:

You notice that the pages and urls for the different clients look very similar, with similar structure and conventions.
How can you add contents from another table to an existing SQL query to return one set of data in a report?
You have heard that this company has realized that client-side scripting is easily bypassable and offers a template for the server-side scripting and have started to remove the clien-side sanitization login. However, you are EXTREMELY confident that they are still using the same server-side sanitization logic/code they have previously used for other sites.
There is difference between encoding, escaping, and sanitization. Understand the difference and the validity of using one or the other, remembering that there is no client-side and only server-side sanitization.
Null works great for a SQLi when you don’t know the schema, and when the developers aren’t expecting it. The developers are looking for it!
The login bypass logic you used from Task 3 will not work on Task 4, although the choice to attempt is up to you. We explicitly prevented SQL injection on the login screen for this exercise.

Include your flag12 hash into the JSON file and submit it to Gradescope!