Tutorial answers

The tutorial questions in Appendix A are intended to be informative and thought-provoking. These tutorial answers are ever-evolving and should not be considered the final say in all cases. As we find better, more informative answers, we'll post them here. If you have an answer that you think is better than the ones posted here, please let us know, and we'll add it to the list or amend an existing answer, to make this the most interactive and comprehensive set of answers we can provide.

Exercises


Exercise 1: Beginner

1. How many categories of security vulnerabilities are listed for this application?

Twenty-four. They are:

  • Buffer Overflow
  • Directory Restriction
  • File Access Race Condition
  • Format String
  • Heap Inspection
  • Inconsistent Implementations
  • Insecure Compiler Optimization
  • Insecure Randomness
  • Insecure Temporary File
  • Integer Overflow
  • Log Forging
  • Memory Leak
  • Obsolete
  • Often Misused: Authentication
  • Often Misused: Path Manipulation
  • Often Misused: Privilege Management
  • Privacy Violation
  • Process Control
  • Resource Injection
  • String Termination Error
  • System Information Leak
  • Unchecked Return Value
  • Uninitialized Variable
  • Unreleased Resource

2. Starting with buffer overflow, how many vulnerability categories can you name?

We keep a regularly-updated answer to this question at http://vulncat.fortifysoftware.com

3. In your company, what categories of security vulnerabilities are most critical?

In our experience, organizations that program primarily in C or C++ cite issues that lead to memory corruption (like buffer overflow) as being the most critical. Organizations that create a lot of web-based applications in Java or .NET tend to be concerned with injection vulnerabilities (SQL injection, command injection) and cross-site scripting.

4. Can you think of (or write) a line of code that would be acceptable in one program but would cause a serious security problem in another program?

Consider the following line of C code:

system("cd /tmp && start &> /dev/null");

This is a perfectly safe thing to do if the program is only run by a user on their own behalf. However, if it is part of a setuid root program that is accessible to unprivileged users, it allows an attacker to gain root privileges. The problem is that the program relies on the user's PATH environment variable to choose which version of start to run. This is acceptable if the program is not privileged, but if the program is setuid root, it allows the user to run any command they wish by putting their own version of start first on their path.
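One common mitigation is to stop trusting the caller's environment before running any command by name. The following is a minimal sketch, assuming a POSIX system; the helper name use_trusted_path and the directory list in TRUSTED_PATH are hypothetical and would need to match the target platform.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* makes setenv() visible on strict compilers */
#endif
#include <stdlib.h>

/* Hypothetical trusted search path; not taken from the tutorial. */
#define TRUSTED_PATH "/usr/bin:/bin"

/*
 * Overwrite the inherited PATH so that commands run by name are
 * looked up only in directories the program trusts.
 * Returns 0 on success, -1 on failure (the setenv() result).
 */
int use_trusted_path(void)
{
    return setenv("PATH", TRUSTED_PATH, 1);
}
```

A privileged program would call use_trusted_path() before any call to system(). Invoking the command by absolute path is generally a stronger choice, and other inherited environment variables may need attention as well.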

Exercise 1: Advanced

1. Describe a scenario in which a security issue that is not currently exploitable can become a critical security issue in the future.

In 1990, Microsoft added a graphics file format named "WMF" to Windows 3.0. The WMF format included the ability to specify a set of instructions to be run if the user cancelled a print request. At that time, it was just as easy to give someone a floppy disk with a virus on it as it was to give them a floppy disk with an evil media file, so there was little security risk. When the World Wide Web became popular, the authors of Internet Explorer were smart enough to ignore any commands found in a WMF file. In 2005, fifteen years after the WMF format was invented, hackers connected the dots: they forced Internet Explorer to invoke the Windows Picture and Fax Viewer to render WMF files. Since the Windows Picture and Fax Viewer executes the commands in a WMF file, they could take control of a PC using a malicious WMF file viewed in Internet Explorer.

2. What are some common reasons that developers introduce security vulnerabilities?

  1. They don't understand what constitutes a vulnerability or how a bad guy might take advantage of their code.
  2. They think "no one would ever try to hack this."
  3. They think "no one will ever figure out that this is vulnerable."
  4. Many security vulnerabilities are just garden variety bugs that happen to be exploitable. Therefore many security vulnerabilities come into existence for the same reasons all other bugs come into existence: misunderstood specification, lack of specification, typo, misunderstanding about how an interface behaves, etc.

3. What makes one security issue more important than another? How do you determine the importance of a security issue?

The risk that a vulnerability represents is defined to be the probability that the vulnerability is exploited multiplied by the cost of the exploit. The higher the risk, the more important a vulnerability is.

The problem is that accurately determining probability and cost is difficult, and both values evolve as changes occur in the software, the underlying technology, the business, and our society. For example, until states began legislating that citizens had to be notified if their credit information was stolen, vulnerabilities that enable data theft were not nearly as risky as they are today. Risk analysis must be an ongoing process.

When you are in the trenches analyzing code, you may not have the luxury of initiating a full risk assessment program. Even so, you have to rank vulnerabilities according to your best estimate of the risk they represent.
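As a rough illustration of ranking by estimated risk, the sketch below applies the definition above (probability multiplied by cost) and sorts findings so the riskiest comes first. The struct layout and function names are our own assumptions, and the probability and cost values an auditor supplies are estimates, not measurements.

```c
#include <stdlib.h>

/* One finding with rough, auditor-supplied estimates. */
struct finding {
    const char *name;
    double probability;   /* estimated chance of exploitation, 0.0 to 1.0 */
    double cost;          /* estimated cost of a successful exploit */
};

/* Risk as defined above: probability multiplied by cost. */
double risk(const struct finding *f)
{
    return f->probability * f->cost;
}

/* qsort() comparator: highest risk first. */
static int by_risk_desc(const void *a, const void *b)
{
    double ra = risk((const struct finding *)a);
    double rb = risk((const struct finding *)b);
    return (ra < rb) - (ra > rb);
}

/* Sort findings in place so the riskiest comes first. */
void rank_findings(struct finding *items, size_t count)
{
    qsort(items, count, sizeof items[0], by_risk_desc);
}
```

Because both inputs drift over time, a ranking like this is only a snapshot and needs to be recomputed as estimates change.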

4. Once you have identified and corrected all exploitable security issues, what are the arguments for and against addressing non-exploitable security issues?

The cost of fixing an issue has several components:

  • The time required to make the change
  • The time required to test and release the change
  • The risk associated with the change (I may introduce a new bug).

The risks involved in not fixing an issue also have a number of components:

  • My assessment may be incorrect: the issue may be exploitable
  • Although my current assessment is correct, the issue may become exploitable in the future.

We find that software developers are often good at estimating the cost of fixing an issue but not so skilled at estimating the risks involved in ignoring an issue.


Exercise 2: Beginner

1. How many vulnerability patterns can you consciously look for as you are manually auditing the code—5, 10, 100, 1,000?

The number of vulnerability patterns you can recognize is directly related to the number you have been exposed to. If you have only a day of security training under your belt, you may recognize fewer than ten. If you have been auditing code for years, you may have a repertoire of several hundred.

2. What techniques would you use to keep track of paths across files?

Write notes at function call sites and at function declarations that explain the paths you are following. Keep a separate "master" set of notes that describe the things you learn and the ideas you try. All of these notes should reduce the amount of time you spend duplicating your previous efforts. Auditing a large codebase is a bit like exploring a maze.

3. How often should a security audit be performed? If you performed an audit today and fixed the problems, what would your confidence be in the code 90, 120, or 180 days later? How much new code would your developers write in 90, 120, or 180 days?

The frequency with which you should audit your code depends on two things:

  1. How quickly the code is changing
  2. How quickly the external factors affecting risk are changing.

Since new exploit techniques are developed at a regular pace, even code that is not being actively modified can experience new risks. For new code, developers both understand their latest work better and are more willing to make changes when code is new.

Exercise 2: Advanced

1. If you had to set up a process for manually auditing code in your company, how would you estimate the amount of effort and time required to do it effectively? How do the requirements scale as the size of the code base grows?

Perform a two-hour code audit of a small amount of code under the best possible conditions. This should give you an upper bound on the speed with which you can review code. Now determine the number of lines of code at your company. Assuming a linear auditing rate (X lines of code per hour), you can now estimate the number of hours required to audit all of your company's code. This number will almost certainly be a lower bound. Some pieces of code might only be effectively audited by certain people, some large blocks of code may be written in such a way that they cannot be split up among different auditors, etc.
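The arithmetic behind this estimate can be sketched as follows. The function name and parameters are illustrative assumptions, and the result inherits the linear-rate assumption described above, so it should be read as a lower bound on effort.

```c
/*
 * Estimate the hours needed to audit a code base, assuming the
 * linear rate measured in a timed sample audit.
 * Returns a negative value if the sample measurements are unusable.
 */
double estimate_audit_hours(double total_lines,
                            double sample_lines,
                            double sample_hours)
{
    if (sample_lines <= 0.0 || sample_hours <= 0.0)
        return -1.0;

    double lines_per_hour = sample_lines / sample_hours;
    return total_lines / lines_per_hour;
}
```

With made-up figures, a sample of 400 lines reviewed in 2 hours gives 200 lines per hour, so a code base of 1,000,000 lines would need at least 5,000 hours.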

2. What are the ideal skills for a security code auditor? How many people in your organization are well qualified? What jobs do they do today?

Good code auditors are:

  • Good proof readers
  • Expert at writing the kind of code they are reviewing
  • Conversant in the code they are auditing (the less work you have to do in order to follow along with the code, the more vulnerability patterns you'll be able to remember.)
  • Well-versed in all of the layers of the technology being used.
  • Experienced code auditors. The more you do, the better you are at it.

Developers who tend to take on the "fire fighter" or "jack of all trades" roles often make good code auditors. An interest in security doesn't hurt!

3. Why do people perform security code audits rather than simply testing the software?

As a group, software developers tend to make the same security mistakes over and over again. In many cases these mistakes can be hard to catch with testing because they require rare circumstances in order to manifest themselves. Some of these same mistakes that are hard to catch with testing are easy to catch in a code review. A call to the C function gets() is a good example. Any program that contains a call to gets() is vulnerable to a buffer overflow attack. Armed with this fact, a code auditor can do a good job of spotting the call in the code they examine. It is significantly harder to write tests that will guarantee the same level of coverage.
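For reference, the usual replacement for gets() is fgets(), which is told how large the buffer is. The wrapper below is a minimal sketch; the name read_line and its return convention are our own choices, not part of any library.

```c
#include <limits.h>
#include <stdio.h>
#include <string.h>

/*
 * Read one line into buf without ever writing past buf[size - 1].
 * Unlike gets(), fgets() receives the buffer size and stops in time.
 * Returns 0 on success, -1 on end of file, read error, or a bad size.
 */
int read_line(char *buf, size_t size, FILE *in)
{
    if (buf == NULL || size == 0 || size > INT_MAX)
        return -1;
    if (fgets(buf, (int)size, in) == NULL)
        return -1;

    buf[strcspn(buf, "\n")] = '\0';   /* drop the trailing newline, if any */
    return 0;
}
```

Input longer than the buffer is cut short rather than overflowing it; the remainder stays in the stream for the next read.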

4. If you cannot audit all of the code, how should you choose which section of code to audit? How confident are you about the results of the audit?

Picking a subset of code to audit is a dangerous game. If an application has:

  • areas that are accessible to anonymous users
  • areas that are accessible to authenticated users
  • and areas that are accessible only to administrative users

Then it might seem reasonable to audit the more widely available areas and forgo the areas that are only meant for more trusted users. However, a bug in the access control mechanism for the administrative interface could easily be the most important problem in the whole application. Any audit of a fraction of the application is likely to tell only part of the story.

5. Enumerate five programming styles or techniques that make auditing easier or harder.

  1. Comments, particularly comments about goals or assumptions, make auditing easier.
  2. Unintuitive macros, bad or absent naming conventions, and excessively long functions make code harder to understand and therefore harder to audit.
  3. Good encapsulation, where functionality is clearly divided between different program modules, makes a program easier to audit.
  4. Constructs such as function pointers and reflection can obscure the flow of control through a program and make it harder to audit.
  5. Use of standard conventions, patterns, and idioms makes code easier to audit.

Exercise 3

No exercises beyond the actions detailed in Appendix 1.


Exercise 4: Beginner

1. What are the benefits of integrating the SCA Engine into your environment as a compiler?

If you know how to use the compiler, then you also know how to use SCA, which makes it easier to start analyzing code. Also, if your program is large and complex, you likely already have automated scripts in place to compile it. By integrating SCA into your environment as a compiler, you can reuse these scripts to automate your analysis along with your compiles.

2. Why must you specify a compiler for C/C++ code but not for Java or .NET code?

Java and .NET are well-maintained standards, which ensures that all Java and .NET compilers have the same behavior and interface. This is not true of C and C++. While there are ISO standards for the languages, there are significant differences between implementations (different compilers). When you give SCA the name of the compiler, it can select the parsing and analysis behaviors that match your code.

3. Where is the log file used by the SCA Engine?

In Core/log under the Fortify installation which is in /Program Files/Fortify Software/SCAS-EEX.X.X/ by default.

Exercise 4: Advanced

1. If the SCA Engine cannot find some of the files for the software being built, what information is missing? Consider header files and source files.
How will the missing information affect the results?

If some of the source code is missing, SCA analyzes only a subset of the program rather than the whole, which may be fine for some uses. While some results may still be generated, missing code could lead to serious false negatives in a final audit before release. Even a small amount of missing code may contain a critical source of tainted data that flows to multiple security-critical sections of the program. A partial code analysis is perfectly fine for a development scan, but it should be followed by a complete analysis of all source code for a final audit prior to release.


Exercise 5: Beginner

1. If a single source code base is used to build multiple executable programs, how can you use the SCA Engine to evaluate the programs independently?

Fortify SCA Engine allows the build dependencies of different executables to be tracked independently. Thus analysis results accurately represent the security analysis for all the components that go into each executable and do not include extraneous modules that may be used by other executables. Consult your Fortify documentation for the appropriate build environment that you are using.

Exercise 5: Advanced

Coming soon


Exercise 6: Beginner

1. How many of the Fortify vulnerability categories can you describe in detail?

Try to answer this off the top of your head. Then test your knowledge by going to http://vulncat.fortifysoftware.com and reviewing the taxonomy of software security vulnerabilities. Perform web searches for some of the categories listed. What do the results look like? Observe the similarities and dissimilarities of the names and descriptions of software vulnerabilities when they emerge from specific exploits from the outside ("black box") versus the root causes inside source code ("white box").

Exercise 6: Advanced

1. Write a piece of code containing an issue that is identified by each of the different analyzers.

Dataflow Analyzer:

			#include <cstring>
			
			void mytest(char *s)
			{
				char str[30];
				strcpy(str, s);   // tainted argv[1] is copied with no length check
			}
			
			int main(int argc, char *argv[])
			{
				mytest(argv[1]);
				return 0;
			}
			
			

Control Flow Analyzer:

			import java.io.*;
			
			public class unrel {
			  
				  public void myunrel(String path) {
				  
					try {
					 DataInputStream dis = new DataInputStream(new FileInputStream(path)); 
					 String s = dis.readLine();
					}
					catch (IOException ie) {
						System.out.println("IO error");
					}
				 }
			
			}
			

Semantic Analyzer:

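			#include <stdio.h>
			#include <string.h>
			#include <unistd.h>
			
			#define MAX_SIZE 64
			int main(int argc, char **argv)
			{
					char in[MAX_SIZE];
					int chars;
			
					memset(in, 0, MAX_SIZE);
					chars = read(0, in, MAX_SIZE-1);
					printf("You just entered:");
					printf(in);   /* user input is used as the format string */
					return 0;
			}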
			

Configuration Analyzer:

			<!DOCTYPE web-app PUBLIC "-//Sun Microsystems, Inc.//
			DTD Web Application 1.2//EN"
			"http://java.sun.com/j2ee/dtds/web-app_2_2.dtd">
			<web-app>
			  
			  <display-name> My App </display-name>
			
			  <servlet>
				<servlet-name>myservlet</servlet-name>
				<servlet-class>servlets.myservlet</servlet-class>
			  </servlet>
			
			  <servlet-mapping>
				<servlet-name>myservlet</servlet-name>
				<url-pattern>myapp</url-pattern>
			  </servlet-mapping>
			
			</web-app>
			

2. Give an example in which a single issue will be found more than once.

Consider scenarios where the same corrupted data may travel via multiple different branching paths to the same vulnerable location in the application code. Fortify’s comprehensive analysis captures all those different paths. At the same time, for ease of viewing and auditing, the user may choose to combine multiple related issues.
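The shape of such code can be sketched as follows. All names here are hypothetical, and the sink is deliberately written with a bounded copy so that the example itself is safe; if store_name() used strcpy() instead, an analyzer that tracks data flow could plausibly report the same sink once for each path that reaches it.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* The single sink: every path below ends here. */
static void store_name(char *dst, size_t dst_size, const char *src)
{
    snprintf(dst, dst_size, "%s", src);   /* bounded copy */
}

/* Two branches carry the same caller-supplied data to the same sink. */
void save_name(const char *input, int upper, char *out, size_t out_size)
{
    if (upper) {
        char tmp[32];
        size_t i;

        for (i = 0; i + 1 < sizeof tmp && input[i] != '\0'; i++)
            tmp[i] = (char)toupper((unsigned char)input[i]);
        tmp[i] = '\0';
        store_name(out, out_size, tmp);     /* path 1: via the upper-cased copy */
    } else {
        store_name(out, out_size, input);   /* path 2: directly */
    }
}
```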

3. Give an example in which a single issue will be found by more than one analyzer.

			#include <cstdio>
			#include <cstring>
			
			void mytest(char *s)
			{
					  char str[30];
					  strcpy(str, s);   // unbounded copy of tainted input
					  printf(s);        // tainted input used as the format string
			}
			
			int main(int argc, char *argv[])
			{
				mytest(argv[1]);
				return 0;
			}
			

In the example above, both the semantic and dataflow analyzers will identify the potential issue with strcpy() and printf(). However, the Fortify SCA engine identifies the redundancy and only reports the higher confidence result. To see the effect without the dataflow override, run sourceanalyzer with the "-disable-analyzer dataflow" option.
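For comparison, here is one way the two flagged calls might be written safely. This is a sketch rather than a prescribed fix: bounded_copy and mytest_checked are hypothetical names, and rejecting oversized input (instead of truncating it) is a design choice we made for the example.

```c
#include <stdio.h>
#include <string.h>

/*
 * Copy src into dst only if it fits, terminator included.
 * Returns 0 on success, -1 if src is too long (dst is left untouched).
 */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
    size_t len = strlen(src);

    if (len >= dst_size)
        return -1;
    memcpy(dst, src, len + 1);
    return 0;
}

/* Counterpart of mytest() with a checked copy and a constant format string. */
int mytest_checked(const char *s)
{
    char str[30];

    if (bounded_copy(str, sizeof str, s) != 0)
        return -1;          /* reject input that does not fit */
    printf("%s", str);      /* input is passed as data, never as the format */
    return 0;
}
```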


Exercise 7: Beginner

Coming Soon

Exercise 7: Advanced

Coming Soon


Exercise 8: Beginner

1. Assuming that an attacker does not have your source code, what advantages do you have in finding vulnerabilities?

The primary advantage is that you get to go first. That is, you can start looking for vulnerabilities in your code from the first line of code as it is being written, while the attacker (in this scenario) must wait until your product has been shipped. If you have found and mitigated all the vulnerabilities during the development process, there simply will not be any remaining for the adversary to find. In addition, the team that developed the software has the advantage of access to and understanding of the architecture, and familiarity with the overall structure of the program.

Furthermore, depending on the language and runtime environment your application uses, your adversary will at the very least have to commit additional time and resources to reverse engineering your executable, and in the worst case will be burdened with trying to decipher decompiled code that has been stripped of all comments and of the names of variables and functions. While there are tools that can analyze binaries for security vulnerabilities available in the black-hat community and via open source, these shareware tools are unable to perform the sophisticated dataflow and control flow analysis that is critical to reducing false positives, which makes the adversary's auditing work much more manually intensive. If your application is written in Java or one of the Microsoft .NET languages, then the resulting binary will be fairly easy for an adversary to reverse engineer unless you make use of an obfuscator. While an obfuscator will not make it impossible to reverse engineer MSIL or Java byte code, it will significantly slow down an adversary's progress in understanding how the application behaves.

In the end, if you and the adversary are equally skilled in understanding security vulnerabilities, the ability to go first, together with access to and a detailed understanding of the source code, will be a decisive advantage.

2. How do you envision feeding back vulnerabilities found in Audit Workbench to the developers who will fix them?

Issues can be posted while auditing with AWB to any bug tracking system that supports command line invocation. This is done by utilizing the appropriate adapter for your bug tracking system and selecting the option to "file a bug" in the Audit Workbench. If your bug tracking system is not one of those supported out of the box, you can easily write a custom adapter to integrate it. Alternatively, you can bulk upload issues in the form of raw scan results (.fvdl) or audit findings (.fprj) by writing a simple script that converts the Fortify XML schema into the appropriate format for your bug tracking system.

3. If you only had the text output for a large project, how would you go through it without Audit Workbench?

Since security issues, unlike most bugs, tend to be the result of a sequence of events, it is often very useful to browse through the set of source files that contain the data or control flow in question. Without Audit Workbench (AWB) or one of the IDE plugins, a significant amount of time will be spent following the dataflow or control flow path of interest. This can be accomplished manually, but AWB helps prune these paths and provides effective navigation along each path to help you understand the issue being considered. Likewise, AWB offers significant potential time savings through quick access to detailed descriptions of each vulnerability along with code-specific examples (issue descriptions are dynamically generated using the code being analyzed for examples). Finally, without the issue resolution mechanism, an alternative will have to be devised. This can be facilitated by loading the scan results into a bug tracking system and setting up the appropriate resolutions or, for a smaller project, by simply putting the text output into a spreadsheet.

4. If the Source Code Analysis Engine runs on a build server but you run Audit Workbench on your local machine, will you run into problems?
How will you solve them?

If the local machine does not have a copy of the source files, then AWB (Audit Workbench) will ask for the path to the source files. Without resolving this path, you will only be able to see small snippets of the source code along the path of each issue (the Fortify .fvdl stores code snippets for each issue to help facilitate offline browsing in the Software Security Manager or in other cases where the source code is not present). To work around this limitation, you just need to provide access on the local machine to the source files analyzed on the server.

Exercise 8: Advanced

1. How many Source Code Analysis vulnerability categories can you describe in detail along with example exploitable code?

According to the Fortify Software Security Research Lab, there are over 100 types of vulnerability commonly found in the languages and platforms used by corporate developers and ISVs. This is of course an evolving list as new attacks and new APIs are introduced into the development community. To see the latest version of this taxonomy along with examples for each category, go to http://vulncat.fortifysoftware.com.

2. What kind of comments do you tend to use most often when you are auditing?

Different auditors have their favorites, but in general it is a good technique to scan the code for comments left by developers that may point to issues that were known but not addressed at development time. Many auditors will look for comments like "TODO", "FIX", or "???" and many variants thereof to gain useful insight into the code they are reviewing.
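A scan for such markers is easy to automate. The following minimal sketch checks a single line of source text; the function name and the marker list are illustrative, and the match is a plain case-sensitive substring search.

```c
#include <stddef.h>
#include <string.h>

/* Markers mentioned above; extend the list to suit the code base. */
static const char *const MARKERS[] = { "TODO", "FIX", "???" };

/* Return 1 if the line contains any marker, 0 otherwise. */
int has_audit_marker(const char *line)
{
    for (size_t i = 0; i < sizeof MARKERS / sizeof MARKERS[0]; i++) {
        if (strstr(line, MARKERS[i]) != NULL)
            return 1;
    }
    return 0;
}
```

Because "FIX" is matched as a substring, this also catches variants such as "FIXME", at the cost of occasional false hits on identifiers like "PREFIX".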

3. In the last 30 days, how many of these vulnerability categories have appeared on BugTraq?

See http://www.securityfocus.com/archive/1.

4. Name some vulnerability categories that have appeared on BugTraq that are not Fortify Source Code Analysis vulnerability categories.

There are occasionally vulnerabilities referred to on lists like BugTraq that are unique to a known product or platform. Likewise there are issues reported that have to do with the configuration of a product (poor selection of default administrative passwords for example). However, the majority of the issues reported daily on a wide variety of commercial products on lists like these fall into the well defined set of vulnerability categories detected by SCA.

5. Do you think an external attacker viewing the program as a black box would name vulnerability categories in the same manner as an internal auditor who is analyzing the source code (white box) from the inside, or would they be different?
Why?

Categories defined by black box attacks are going to take the form of observed stimulus response, whereas categories defined by analysis of the code are going to be more precise, including the source code constructs. A good example would be the Struts Duplicate Validation Forms vulnerability found in the Input Validation and Representation kingdom in the Fortify Software Security Taxonomy. This detailed description of a vulnerability, based on the observation of a known code construct, is impossible to differentiate from others based solely on watching the stimulus response of the running application.


Exercise 9: Advanced

1. Return to the first lesson, "Introducing the Audit Workbench," and locate the Buffer Overflow in the wu-ftpd-2.6.0 file using the SCA Engine and Audit Workbench.

You undoubtedly found these much more easily and in far less time (and probably more thoroughly) by letting the analyzer do the work as compared to reviewing the code manually in the first lesson. If you want to double check your findings against an audit performed by our Security Research Group, you can open the completed AWB project file (.fprj) located at Tutorial/understand_AWB/wu-ftpd.fprj in the Fortify Software home directory.

2. What other methods for identifying security vulnerabilities can you name? How do they overlap or complement source code analysis?

  • Red Teaming – This involves hiring a security expert to break the software and may or may not include the use of source code analysis depending on the skills and techniques of the expert.
  • Dynamic Testing (fuzzing) – This involves running the program and manipulating input data with unexpected values (fuzzing) in order to trigger an exploit and thereby reveal a vulnerability. This approach is rather limited by the fact that 100% code coverage is extremely hard to achieve in any dynamic test, as is the determination of exactly what input values will be required for a specific vulnerability. This approach will usually yield a large number of false positives, since it is hard to tell if a particular program behavior was indeed a vulnerability or some other benign effect. Since the set of unexpected values to feed into a program is essentially infinite and the number of vulnerabilities in any given code base rather small, this approach is likely to yield false negatives as well. When combined with source code analysis, these shortcomings can be overcome, as very targeted tests against suspect code paths can be constructed and executed based on the analysis, changing the Dynamic Testing from blind or "Black Box" tests to directed or "White Box" tests.
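To make the blind form of dynamic testing concrete, here is a deliberately naive sketch of a fuzzing loop. Everything in it is an assumption made for illustration: the fuzz_target callback type, the 64-byte input limit, the use of rand(), and the toy target first_byte_is_ff. Real fuzzers are considerably more sophisticated.

```c
#include <stddef.h>
#include <stdlib.h>

/* Code under test: returns nonzero when it detects a problem. */
typedef int (*fuzz_target)(const unsigned char *data, size_t len);

/* Toy target for demonstration: flags any input whose first byte is 0xFF. */
int first_byte_is_ff(const unsigned char *data, size_t len)
{
    return len > 0 && data[0] == 0xFF;
}

/*
 * Feed `iterations` pseudo-random inputs of 0 to 64 bytes to the target
 * and count how many it flags. A fixed seed makes a run repeatable,
 * which helps when investigating a flagged input.
 */
size_t fuzz(fuzz_target target, unsigned int seed, size_t iterations)
{
    unsigned char input[64];
    size_t flagged = 0;

    srand(seed);
    for (size_t i = 0; i < iterations; i++) {
        size_t len = (size_t)rand() % (sizeof input + 1);

        for (size_t j = 0; j < len; j++)
            input[j] = (unsigned char)(rand() % 256);
        if (target(input, len) != 0)
            flagged++;
    }
    return flagged;
}
```

A loop like this has no idea which inputs matter, which is why results from source code analysis are useful for steering the inputs toward suspect code paths.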
