Tag-Protector: An Effective and Dynamic Detection of Illegal Memory Accesses Through Compile-time Code Instrumentation

Programming languages permitting immediate memory accesses through pointers often result in applications having memory-related errors, which may lead to unpredictable failures and security vulnerabilities. A light-weight solution is presented in this paper to tackle such illegal memory accesses dynamically in C/C++ based applications. We propose a new and eﬀective method of instrumenting an application’s source code at compile time in order to detect illegal spatial and temporal memory accesses. It is based on creating tags, to be coupled with each memory allocation and then placing additional tag checking instructions for each access made to the memory. The proposed solution is evaluated by instrumenting applications from the BugBench benchmark suite and publicly available benchmark software, Runtime Intrusion Prevention Evaluator (RIPE), detecting all the bugs successfully. The performance and memory overheads are further analyzed by instrumenting and executing real world applications from various renowned benchmark suites. In addition, the proposed solution is also tested to analyze the performance overhead for multi-threaded applications in multi-core environments. Overall our technique can detect a wide range of memory bugs and attacks with reduced performance overhead and higher detection rate as compared to the similar existing countermeasures when tested under the same experimental set-up.


INTRODUCTION
Illegal memory accesses (IMAs) such as out-of-bound buffer read/write operations and dangling pointer dereferences are major concerns in applications written with programming languages like C/C++.These languages provide a powerful set of low-level features to software developers such as direct memory accesses and arithmetic operations on pointers.Normally in such languages, the starting address is assigned to a pointer when a memory area of required size is allocated, whereas an access is considered legal only when either its actual pointer or a pointer derived from it is used between the allocation and deallocation of a specific memory area.A pointer directing to a memory location that has already been deallocated is called a dangling pointer.A spatial IMA, which is more commonly known as buffer overflow or underflow, may occur when a pointer accesses memory outside the range of its allocated memory object.A temporal IMA, also known as dangling pointer dereference, occurs when a dangling pointer is used at the time of the access.
Typical programming errors, such as out-of-bound array indexing and dangling pointer dereferences, are common and cause indeterministic behavior because these operations can write to memory locations in ways not defined by the designer.For example, memory accesses can happen outside the intended range if the index calculation of an array is based on an erroneous formula.It is a difficult and tedious job to detect and diagnose such behavior using static analysis-based tools.Even when an application is tested intensively through these tools, such bugs can still exist as it is practically impossible to create all of the input combinations for an error to occur in the development or test phase.Furthermore, without required protection, many security threats like viruses, Trojans and worms can modify appli-0:2 cation data by gaining illegitimate access to secured blocks of the memory such as through buffer overflow attacks [1].Software-based attacks have become increasingly widespread and buffer overflow is one of the major causes of such security outbreaks.According to a report published by Sourcefire [2], buffer overflow based attacks are responsible for 14% of all and 35% of critical vulnerabilities over the past 25 years.Stack smashing [3] is a classic example of such an attack where an attacker can simply replace the return address of a function on the stack through a buffer overflow.In the case of unprotected execution, on function return, the control may be switched to the specific location where malicious code is placed.Similarly, return-oriented programming (ROP) [4] is a relatively new way to accomplish security exploits, after gaining control of the execution flow through a buffer overflow.ROP and Return-into-libc based attacks execute a group of instructions from the existing code to create new functionality.Different solutions already exist to detect such attacks that are either based on removing vulnerabilities statically through safe languages (e.g., [5], [6], [7], [8]) or inserting run-time checks for the detection of out-of-bounds memory accesses (e.g., [9], [10], [11], [12], [13], [14], [15], [16], [17], [18]).Moreover, some of the existing dynamic memory checkers do not cover all bugs while others provide complete protection at the cost of notable performance overhead.
In this paper, a fast and effective memory protection technique is presented to detect a wide range of IMA bugs and attacks dynamically such as out-of-bound (OOB) buffer and dangling pointer based memory accesses, which can be sub-categorized as buffer overreads and over-writes, buffer under-reads and under-writes, OOB accesses through direct indexing and dereferencing dangling pointers.Our tag-protection solution is built upon the well-known Jones and Kelly approach [10] of tracking each memory object pointed to by its referent pointer.Ruwase et al. carried forward this work and presented an enhanced solution called CRED [11].Their solution fails to detect overflows when buffers are allocated inside struct type data variables and for many library functions such as snprintf() and fscanf() under specific cases for memory objects allocated on stack/bss/data-segments.On the other hand, our proposed solution is generalized and traverses through all the memory allocations, linking the memory objects with tag marks and tracking the memory accesses through these tag marks.The base and end addresses of each memory object are calculated and stored in the corresponding tag marks.The address comparisons instructions are inserted for each instruction that either read or write to that particular memory object.Whenever a reserved memory area is accessed through an allocated pointer, our solution verifies the access by comparing the address being accessed with the bound addresses that are stored in the tag marks linked with that particular memory object.If any underflow or overflow occurs, the address of the accessed memory block would either fall behind or exceed the address values stored in the associated tag marks, then an IMA bug alert will be generated.The access is considered legal only if the address is found within the bounds.Furthermore, to detect dangling pointer dereferences our tag-protection solution creates a dedicated tag address and initializes the corresponding tag mark with it when a memory object is deallocated.On each memory access, to detect any dangling pointer dereference, the tag marks are also compared with that dedicated tag address.
The proposed solution does not require modifications to the application's source code and it is based on the automatic code instrumentation at compile time by adding new instructions without affecting the actual flow of data.We have placed the run-time checks through compile-time code instrumentation.The LLVM v3.4 [19] compiler infrastructure has been used for implementation purposes.Our proposed solution is different from existing solutions [15; 20; 17; 16] in a way that it does not perform table-up searches to load start and end addresses of a particular memory object.The proposed solution effectiveness and performance overhead is measured on real-world applications from several benchmark suites [21; 22; 23; 24] and test-bed codes [25].Based on the experimental results, it is shown that our approach has lower performance overhead when compared with the existing solutions and hence it is a good choice for testing applications that are written in C/C++.Our solution can detect various memory bugs and security attacks including out-of-bound read and write accesses, stack-overflows, stack-underflows, heap-overflows, heap-underflows, overflows and underflows in globally defined variables (data and bss segments), directindexed overflows/underflows and dangling pointer dereferences.
The paper is organized as follows: The current work related to the existing memory bug detectors is summarized in section 2. The general approach and implementation details of the proposed tag-protection solution are described in section 3. Section 4 presents the effectiveness, performance evaluation, and comparison with the similar existing solutions while section 5 concludes this paper.

RELATED WORK
Different solutions have been proposed to detect memory errors in C/C++ based applications.Memory errors are typically grouped into spatial errors and temporal errors.Various Hardware-based memory safety solutions [26], [27], [28] have been presented in the literature with reduced performance overhead as compared to software-based solutions.Such solutions are not generic and require either dedicated hardware modules or specific modifications in the processor pipeline or cache architecture.Similarly, dynamic information flow tracking (DIFT) based hardware-assisted solutions such as Secure Program Execution [29], Dynamic Tainting [30] and libdft [31] are based on tagging data coming from untrusted sources and then tracking their usage as the application executes.These techniques require modifications in application's data, processor architecture, and memory layout.Therefore, they are not feasible for such systems that use more common hardware design approaches.
As mentioned earlier, static analysis tools do not provide 100% guarantee, and hence, they have been substituted with dynamic techniques.Many software-based dynamic memory bug detection techniques have been proposed in the literature that vary in implementation level, memory utilization, run-time overhead, types of bugs detected, the probability of detecting bugs, supported architectures and many other features.Based on the implementation level, these tools can be classified into two types.For instance, solutions such as Purify [32], Valgrind [33], Dr. Memory [34], and Dynamic Tainting [30] operate at binary level while other techniques like Backwards-Compatible Bounds Checking [10], Dynamic Buffer Overflow Detector [11], Backwards-compatible Array Bounds Checking [12], SAFECode [13], BaggyBounds [16], SoftBound [15], PAriChecks [17], LBC [18] and AddressSanitizer [35] require source code to insert run-time checks through compile-time instrumentation.Some dynamic IMA bug detectors such as StackGuard [9], LibSafe [36], ProPolice [37] and Stack-Shield [38] provide protection for stack memory only, while WIT [14] detects invalid writes only.Other dynamic tools can detect a wide range of bugs successfully, but at the expense of large performance overhead.In this section, we have discussed only those solutions that are similar to our proposed solution.
Referent-object based approaches such as Backwards-Compatible Bounds Checking [10], CRED [11], PAriCheck [17] and BaggyBound [16] work at source-code level by maintaining a separate table, using different data structures, to record bounds of each memory allocation.This table is then used to verify memory accesses by performing table lookups at run-time.These techniques differ in the implementation and handling of record tables.Avijit et al. extended the solution of LibSafe [36] by implementing LibsafePlus and TIED [39; 40].Static allocations are handled by TIED whereas LibsafePlus deals with dynamic information about the stack size and heap allocations.These details are then used at run-time to detect any overflow.Furthermore, these solutions do not provide dangling pointer dereference detection capabilities and also require customized memory allocator libraries to achieve reduced performance overhead.SAFECode [13] which is also an object-based approach, operates at the source-code level.It instruments loads and stores to prevent illegal memory accesses, uses points-to analysis and type-inference to find type-safe regions of the heap, and partitions the heap into regions to eliminate load/store checks on type-safe heap regions.They have also presented a solution [41] that provides protection against dangling pointer dereferences.Similarly, PAriCheck [17] computes bounds and assigns label to each fixed-size memory block and stores these labels in a separate table at run-time.On each pointer arithmetic operation, the label is compared by values in the table.This solution has reported an average overhead of 9.5% for SPEC CINT2000 benchmark suite but does not detect dangling pointer dereferences and overflows in structures.There are also object-based approaches that have been applied on the whole operating system [42].
On the other hand, the pointer-based approaches such as MSCC [43], CCured [6], Soft-Bound [15], LBC [18] and others [44; 26; 45] associate base and size metadata with every pointer and insert run-time checks manipulating metadata information during load/store operations of pointer values.CCured [6] combines instrumentation with static analysis to insert run-time checks and removes redundant checks at compile time.CCured fails to work when un-instrumented pre-compiled libraries are in use.Unlike prior pointer-based (also known as fat pointer) approaches that modify pointer representations and object layouts [44; 46; 18], SoftBound records the bounds information in a disjoint metadata which is accessed via explicit table lookups on loads and stores of pointer values only.Nagarakatte et al. extend their SoftBound technique and presented another solution called CETS [20] to detect temporal safety errors.
AddressSanitizer [35] verifies whether each allocated memory block is safe to access by creating shadow memory around stack and global objects to detect overflows.The shadow memory is checked on each load and store request.The only drawback is that it also requires modified run-time library to create shadow memory around allocated heap regions.The current implementation is based on the LLVM compiler infrastructure.AddressSanitizer might fail to detect dangling pointer dereferences when a large amount of memory is allocated and deallocated (between the deallocation and its next use).
Among different techniques that operate at the binary level, Purify [32] is one of the early solutions in this area.It enforces the insertion of extra checking instructions directly into the application's object code and verifies every memory read and write operation performed by the application under execution.Valgrind [33] is an openly available instrumentation framework for building dynamic analysis tools.The Valgrind's MemCheck tool uses shadow memory to keep track of which memory areas have been allocated and pinpoints illegal accesses to uninitialized memory.Another approach presented by Doudalis et al. [30] associates unique taint marks with each pointer and its allocated memory block.These taint marks are then propagated and verified through taint checking instructions, whenever memory is accessed.This approach works at binary level but requires modification in run-time libraries to generate taint marks properly.Such tools do not require source code and recompilation but they increase the memory utilization and execution time overhead largely as the tool is first loaded into the memory and main application runs on the top of it.
Each of the above mentioned memory bug detection tools has its own strengths and limitations.For example, tools that operate at binary level detect bugs effectively without any source code requirement but at the cost of more execution time overhead.Moreover, solutions such as [30; 32] are not open source and therefore not available for performance comparison.As discussed earlier, the tools that operate at source-code level also require either modified run-time memory allocators or a dedicated compiler driver as it is the case in several existing solutions [30; 13; 35; 16; 17].Other software based techniques such as LBC [18] have presented much lower performance overhead but they require modifications in the source code, thus presenting compatibility issues.
Unlike SoftBound, our tag-protection solution is a combination of object and pointerbased approach.We create tag marks for each memory object and propagate these tag marks to all the pointers that are associated with that memory object.Moreover, we do not perform any table-lookup search at run-time which makes our solution more efficient.SoftBound with the CETS [20] extension is capable of detecting both temporal and spatial safety errors but with a much higher performance overhead.On the contrary, lower performance and memory overheads have been reported by our solution when it is compared with the openly available tools under the same experimental set-up.While our solution requires source code for instrumentation, it does not need customized run-time libraries and static analysis.Furthermore, our technique has presented higher detection rate by instrumenting all buffer allocations including buffers allocated inside struct type data variables which are left undetected by the existing solutions excluding SoftBound.

TAG-PROTECTION
The key concept behind our proposed tag-protection solution, detection of dangling pointer dereferences, current limitations and how it is implemented at compile time have been discussed in this section.

General Approach and Design
Tag-protection is a compile-time code instrumentation approach and it is based on storing memory objects bounds information in separately allocated tag marks and then inserting tag checking instructions to detect IMAs at run-time.For example, when a memory object created and a certain memory area is reserved for it, our technique creates two tag marks tag start and tag end.The base address and the end address of that memory object are calculated and stored in the tag start mark and thetag end mark respectively.Finally, when that memory object is accessed, the run-time check instructions compare the address being accessed with the bounds stored in its associated tag marks.In the case of any spatial illegal memory access, either buffer underflow or overflow, an alert signal will be generated before terminating the program execution.
Our solution is compatible with C/C++ source code as it does not change the memory layout and tag marks are generated and maintained separately.The source code of C/C++ based applications is converted into an intermediate representation form which is used to detect each memory allocation, create tag marks and insert run-time checks.The typical memory layout of a C/C++ program is shown in Fig. 1(a).For each memory object, whether it is globally, statically or dynamically allocated, the tag marks are created and memory object bounds are assigned to these tag marks as shown in Fig. 1(b).The record of tag marks along with their associated memory objects is kept separately in a specialized record table as depicted in Fig. 1(c).The tag marks are also created for sub-objects that are defined inside a structure memory object.In that case, our solution stores the sub-object information along with the main memory object in the third column of the record table.It should be noted that the record table is used at the time of code instrumentation only to place relevant check instructions and it is not part of the final executable file.
Unlike other compile-time code instrumentation solutions (such as SoftBound), which use record tables for bounds lookup at runtime, our solution utilizes this record table at compile time only and is deleted before generating final instrumented executable.The record table is created at the time of instrumentation and stores the address bounds.Using this record table, the tag checking instructions are inserted before each load or store instruction detected for the corresponding memory object.This approach has resulted in lower execution time as bound lookup step is performed at compile time and tags are accessed directly at run-time .During tag address comparison, the given memory access will be considered legal only if the address used to access the memory area is greater than the address stored in its tag end mark and less than the address contained by its tag end mark.In the case of any spatial IMA bug, the addresses stored in tag marks will be surpassed and tag check instructions will raise bug alarm and abort the application.In order to handle temporal IMA bugs, run-time checks are also inserted to detect dangling pointer dereferences.This is achieved by detecting memory deallocation function calls and assigning a dedicated tag address to the respective tag marks.The memory access will be considered dangling pointer dereference if the tag mark address matches with that dedicated tag address.The design of tag-protection solution is based on the following steps which are also presented in Fig. 2. 3.1.1.Function Duplication Enabling Inter-procedural Tag propagation.When memory objects are accessed by the pointers and these pointers are passed as function arguments then the corresponding tag marks must also be propagated with them as well.One solution is to allocate all the tag marks globally so that the tag marks can be accessed anywhere in the program without modifying the function arguments.However during testing, this approach failed when functions are called recursively and in multi-threaded applications where multiple threads call the same function with different arguments which in turn generate false IMA bug alarm.To handle this problem, the tag marks are created in the same memory segment where corresponding memory object is being allocated.Moreover, the function duplication technique is used to create a copy of the function and additional arguments are added to pass tag start and tag end marks for each pointer argument.In order to mark the function being duplicated the actual function name is used along with a unique attribute as presented in Algorithm 1.The details of each function being duplicated are saved in a separate table that is used in later stages of tag-protection pass to replace function calls.

Initialized data (data
3.1.2.Tag Creation for Global Memory Objects.For globally defined memory objects that are defined statically, such as buffers and structures, the memory is allocated directly at program entry level in data and bss segments as shown in Fig. 1(a).In order to instrument such global allocations the tag marks are created globally and a dedicated tag address is also allocated to handle dangling pointers dereferences as illustrated in Algorithm 2.
3.1.3.Tag Creation for Local Memory Objects.For the local memory objects that are defined at function level statically, the memory is reserved explicitly on stack and tag marks are also allocated on the stack for such memory objects.Our tag-protection computes the start and end address of such memory objects after which tag start and tag end marks are initialized respectively with these addresses.For each memory object pointer that is used to allocate memory dynamically, tag start and tag end mark pointers are also allocated ALGORITHM 1: Stage-1:Function duplication with modified argument list to enable tag marks for functions that take pointers as arguments Input: Uninstrumented LLVM-IR code α generated through LLVM llvm-link command Output: Instrumented LLVM-IR code β 1 generated through LLVM opt command using stage- and initialized with NULL.For such dynamically created objects, the memory is reserved on heap implicitly by calling special memory allocation functions (e.g, malloc, calloc, realloc, xcalloc etc.) and starting address is returned to a pointer variable.Our proposed solution intercept such function calls and the start address is assigned to its tag start mark pointer.The end address of the allocated object is also determined by our pass and it is assigned to its corresponding tag end mark pointer.To ensure that the tag marks are written atomically, the tag update instructions are placed in the same basic block1 where memory objects are actually allocated.In this way, all the tag marks will be thread-safe for multi-threaded applications.The record of tag marks, along with their initialized values and memory objects, is kept in a separate table.This step of tag creation is achieved by implementing the steps as defined in Algorithm 3.  As discussed earlier, it is very critical to allocate tag marks in the same memory segment(e.g., heap, stack, bss, data) where memory object is being created.The memory objects can be accessed inside the body of another function through pointers that are passed as arguments at the function call.To handle interprocedural tag marks propagation, functions containing pointers as arguments are duplicated as shown in Algorithm 4. In order to update function calls for these newly created functions, the function call instructions are detected by our pass and replaced with new instructions so that tag marks can be propagated separately without changing the data-flow of the application under instrumentation.3.1.5.Tag Checks Placements.In final step, the tag checks are created by following the steps as shown in Algorithm 5. Memory read and write accesses are performed through LOAD and STORE operations respectively at LLVM-IR level.Our tag-protection pass detects such instructions and uses record table to locate the memory object pointer and tag marks to be accessed.Tag check instructions, to compare the start and end address of memory object with its associated tag marks, are then inserted before each load and store instruction to detect spatial IMA bug.Memory accessed through LLVM intrinsic functions (e.g., memset, memcpy) are also instrumented and accordingly by detecting memory objects being accessed and interesting tag checking instructions accordingly.Furthermore, the tag marks are also compared with the dedicated tag address to detect any dangling pointer deference.In order to explain the threat model clearly, an example C code with possible IMAs bugs is presented in Fig. 1.The code consists of a function which simply copies strings into two local buffers and prints the results before exiting the function.Here at line 2, a buffer2 of length MAX size is allocated on stack and at line 3, a heap memory of the same size is allocated for buffer.The function calls on line 6 and 7 may initiate spatial IMA bugs through heap and stack overflow if the length of input strings, passed to the main function at run time, exceeds MAX size.The memcpy function at line 9 also causes dangling pointer dereference error as the respective memory area is deallocated at line 8  Save modified LLVM-IR code as a final instrumented LLVM-IR code γ longer be used.It is not possible to detect these kinds of IMAs through static analysis as the user inputs are not known at compile time.The details of instrumenting this example code are presented in the following subsection.

Handling Pointer Operations:
If a pointer is derived from another pointer, the tag mark pointer associated with the actual object pointer must also be propagated.Our proposed technique detects store instructions, at LLVM-IR level, that are used to pass address values from one pointer object to another pointer object.To illustrate this, consider the line 5 in Fig. 1, where pointer object ptr passes its contained address value to another pointer object buffer.In that case, the extra instructions are inserted by our pass to copy the tag mark pointer of ptr to the tag mark pointer of buffer.

Detecting Dangling Pointer Dereferences
To handle dangling pointer dereferences, our solution detects calls to dedicated memory deallocating functions such as free.The memory object to be deallocated is identified along with its tag mark pointer.Extra instructions are then inserted to assign dedicated tag address, globaltag, to the corresponding tag mark pointer of the memory object to be deallocated as mentioned in Algorithm 3. The check instructions, being inserted for each LOAD or STORE operation as mentioned in Algorithm 5, measure the address contained by the tag mark pointer and perform address comparison.If the memory object pointer has not been reallocated and it is used to access the deallocated memory, the tag mark pointer still points to the dedicated tag address,globaltag.In that case, check instructions being inserted by our tag-protection pass will generate the dangling pointer dereferences signal to terminate the application.
For example, consider a call to memcpy function at line 9 in Listing 1 where pointer object ptr is used.This memory object is deallocated at line 8 through pointer buffer.Our tag-protection solution will initialize its tag mark pointer with the dedicated tag address, globaltag.Check instructions inserted for the tag mark pointer of ptr before the function call memcpy will raise dangling pointer dereference bug alarm and terminate the application before this function call executes.

Implementation
The proposed tag-protection approach operates at the source-code level and it is loaded as an instrumentation pass at compile time.The current implementation is based on the LLVM v3.4 compiler infrastructure as shown in Fig. 3.The Clang compiler is used to compile C/C++ source file and generate Intermediate Representation (LLVM-IR) code.The LLVM Linker (llvm-link) is then used to link and generate a single LLVM-IR code file before running the tag-protection pass.In order to have minimum overhead, this pass is placed at the end of optimization pipeline and instruments only those memory operations that sustains other optimizations implemented by the LLVM Optimizer (opt).For instance, the memory operations such as accesses to local stack variables and objects created through LLVM code generator (e.g., debug information and metadata) will not be instrumented by our pass as these will be optimized out by the LLVM during pre-processing at compile time.After the LLVM-IR code has been instrumented by our tag-protection pass, it is passed again through LLVM optimization pipeline in order to simplify the tag marks propagation and checks.Furthermore, our tag-protection pass is independent of any specific Instruction Set

Limitations
Our proposed solution works primarily as an Intrusion Detection System (IDS) and prevents further damage by terminating the application's execution.The alert signal is generated followed by an exit call, whenever an overflow occurs either due to unintentional error or as a result of deliberate attack.Moreover, the current implementation of tag-protection pass requires C/C++ based applications source code and does not support instrumentation of code generated at run-time by dynamically interpreted languages (such as Python, Ruby, Perl, etc.).This means that the type of each memory object must be known at compile-time in order to detect its start and end address.
Calls to pre-compiled library functions where source code is not available (e.g., memcpy, strcpy, sscanf etc.) are also identified by our tag-protection pass.In such cases, it is not possible to insert tag address check instructions.Alternatively, our tag-protection pass inserts one tag value check instructions after such function calls by detecting memory objects passed as function arguments and loading tag mark values as defined in Algorithm 5. Any overflow that occurs as a result of the sequential write operation will overwrite the memory pointed by the tag mark.Such overflows will be eventually detected, on function return, by tag value check instructions placed after function call instruction.If the overflow attack occurs at the same memory location with different values over a period of time (such as brute-force or adaptive attacks), the tag mark can be overwritten with its initial value.In such cases, our address checking mechanism can detect these kinds of attacks only if the complete source code is provided.
Our tag-protection pass will not be able to detect any read overflow that occurs during execution of the un-instrumented pre-compiled library functions.Furthermore, our solution currently does not instrument functions variable argument functions, therefore any memory safety violations occurring inside such functions will not be reported by our solution.

EVALUATION
To evaluate the effectiveness and overhead of our tag-protection pass, we have instrumented applications from various benchmark suites with our tag-protection pass.All applications are compiled and instrumented using Clang with -O2 optimization level.The instrumented applications are executed in 64-bit mode on a DELL OPTIPLEX 780 machine with Intel core i5-2400 CPUs and 4GB RAM running Ubuntu 12.04 with kernel 3.11.0.26.
Please note that, for any solution that performs compile-time transformations, the changes made to the source code must not affect the actual flow of data and the generated binary executable should produce the expected output.In our case, the tag marks and the check instructions inserted by the tag-protector pass should not and will not generate any false alarms.In addition, inserting and executing extra checks always result in performance and memory overhead.In order to achieve an efficient solution, these overheads should be manageable.

Effectiveness
To measure the effectiveness of our technique, we have created different test programs, initializing various IMA bugs caused by illegal array indexing and invalid pointer arithmetic operations.These programs are instrumented with our tag-protection pass and all the bugs detected successfully.To test our solution on real-world applications that have been reported with buffer overflow vulnerabilities, C language based applications from BugBench benchmark suite [22] are compiled and instrumented with tag-protection pass.This benchmark suite [22] has a set of applications that contain various known software defects including buffer overflows, stack smashing, double frees, uninitialized reads, data races and atomic violations.We have selected only those applications that have buffer overflows as our solution targets these kinds of memory vulnerabilities.These applications are then executed using input sets, triggering each known IMA bug.Our proposed solution detected all the bugs successfully as presented in table I.The second column of this table represents the total lines of code compiled for each application and the third column provides the line number of the bug location in the given file.Many open-source tools provide a wide range of test programs.To further check the validity of our solution, it is also tested successfully on a set of diverse test-bed programs that come along with SAFECode source files [25].
We have also assessed our tag-protection pass using publicly available Wilander and Nikiforakis' benchmark software, runtime intrusion prevention evaluator (RIPE) [47].Various buffer-overflow vulnerabilities depending on the technique used to overflow the buffer, the kinds of attacks performed and the location of the buffer to be overwritten, have been covered by RIPE.For instance, RIPE covers four memory locations:Stack, Heap, BSS, and Data segment to allocate a buffer to be overflowed and uses a return address, old base pointer, function pointer, longjmp buffers and buffers inside the structs to as code target pointers.Our solution provides 100% accuracy by successfully detecting all the overflows.The execution of the attack code is prevented by terminating the program execution.This software is currently supported for 32-bit architecture so we have made some modifications to enable its execution on 64-bit architecture We have tested our tag-protection pass and other publicly available solutions on 64-bit Ubuntu 12.4 in order to measure detection rates.The comprehensive comparison with the existing countermeasures that are available openly, is presented in Table II.CRED and SAFECode failed to prevent direct stack/B-  [20] 36.26%SAFECode [13] 68.89% AddressSanitizer [35] 85.5% Proposed Tag-Protector 100% SS/databased overflows toward function pointers, longjmp buffers, and static arrays defined within structs 2 for many library functions such as sprintf(), snprintf(), sscanf(), and fscanf().
SoftBound has presented the detection rate of 36% only and on close examination of the RIPE source code and SoftBound transformation pass, it is found that the SoftBound fails to insert overflow checks for arrays that are being defined inside structs 2 based memory objects.Under specific circumstances, where direct out-of-bound access is made through absolute indexing, our proposed tag-protection solution can detect such IMA bugs whereas other existing tools like Valgrind and AddressSanitizer leave such bugs undetected.

Performance Overhead
The performance overhead is defined as the percentage increase in the execution time of instrumented binaries as compared to the execution time of the binaries built using the standard Clang compiler with -O2 optimization level.As our solution terminates the application's execution as soon as it detects any overflow or dangling pointer dereference, the performance overhead can not be measured using applications from BugBench benchmark suite [22] or RIPE [47].To measure the actual performance overhead of our solution, we have instrumented all the C/C++ based applications from SPEC CPU2006 benchmark suite [23] comprising over 1.11M lines of code.The applications in this benchmark are realworld applications and are assumed to be virtually bug-free due to their general usage and no bugs have been reported so far in the community.All the applications instrumented by tag-protection pass have been executed successfully without generating any IMA bug alert which further proves the effectiveness of our solution.
The applications are also instrumented with similar openly available solutions (such as AddressSanitizer and SAFECode) and executed on the same machine under similar experimental set-up.Our tag-protection pass on average incurs 26.42% increase in execution time which is still far lower than other solutions.Each application has been executed five times to get the average execution time.The complete performance overhead results along with the error bars, presenting standard deviation in the execution time, are shown in Fig. 4.
To compare the performance overhead of our solution with the SoftBound+CETS [20], the latest available source code [48] is downloaded and installed as per instructions provided.Various benchmark applications from SPEC CPU2006 and SPEC CPU2000 are instrumented and execution time overhead comparison is presented in Fig. 5.To have a fair comparison, this figure only presents execution time overhead for those applications as reported in the SoftBound+CETS [20] and as well as the percentage increase in the execution time of tag-protection enabled applications is shown in this figure.Our solution has presented lower performance overhead except two applications (lbm, crafty ) where the difference in percentage increased is still less than 3%.
In order to compare execution time overhead with SAFECode, the latest available source code [25] is downloaded and installed as per instructions.The whole-program analysis feature of SAFECode is also enabled through libLTO plug-in that performs these analyses and transformations.Fig. 6 presents performance overhead comparison in terms of percentage increase in execution time when C/C++ based benchmark applications from SPEC CPU2006 are executed after instrumenting with our proposed tag-protection pass, SAFE-Code and AddressSanitizer respectively.The SAFECode failed to instrument two applications, 403.gcc and 473.omnetpp , which are not included in this figure.Furthermore, two applications, 477.dealll and 483.xalancbmk , have generated false bug alarms when executed after instrumenting with SAFECode.This is probably due to the fact that certain optimizations such as type-safe load/store check elimination, static array bounds checking, automatic pool allocation and fast pool run-time checks for loads and stores, and the use of multiple splay trees, have not been enabled in the source code that is available online [25].
Intel Pointer Checker [45] tool has also been evaluated by installing the trail version as it is not available publicly.Applications from SPEC CPU2006 benchmark suite are instrumented using this tool by following the instructions carefully but it failed to instrument many applications completely and has presented very large performance overhead.
To make a fair comparison with similar existing solutions which are not available publicly such as PAriCheck [17], BaggyBound [16] and WIT [14], we have instrumented applications from the Olden [21] and SPEC CINT2000 [49] benchmarks as the indicated solutions have reported the performance overhead for these benchmark suites.Furthermore, our solution has detected a heap overflow bug in application "em3d" from Olden benchmark, when executed with same input set as used by PAriCheck.The same bug is also detected by AddressSanitizer and Valgrind's MemCheck tool whereas PAriCheck and other solutions have not reported this bug.Our solution has presented an average overhead of 26.64% for SPEC CINT2000 and 19.98% for Olden applications when compiled and executed with similar settings whereas the BaggyBound has reported 60% overhead for SPEC CINT2000 applications.Only PAriCheck [17] has reported lower execution time overhead than our technique but their solution fails to detect dangling pointer dereferences and overflows in pointer buffers declared inside structure objects.WIT [14] has also reported lower execution time but their solution has limited functionality as they provide protection against illegal writes only.

Memory Overhead
To measure peak memory usage for the instrumented applications we have examined the VmPeak field from /proc/(pid)/status file.From the table III, it is clear that our tagprotection solution has presented minimal memory utilization overhead.Under specific conditions, where memory usage is the main constraint, this overhead can be further reduced by disabling tag error reporting and allocating tag marks for array memory objects only.Table IV summarizes the increase in binary size when applications from SPEC CPU2006 benchmark suite are instrumented with tag-protection pass and other existing solutions under similar experiment setup as explained earlier.On average, the binary size is increased by 5.09x.This increase is still lower than the existing solutions as presented in this table.

Multi-threaded Applications
To evaluate our solution for multi-threaded applications in a multi-core environment, the PARSEC v2.1 benchmark suite [24] is used.Through successful instrumentation and execution of benchmark applications, it is shown that the our proposed solution is thread-safe 0:18 and it is suitable for multi-core systems.So far seven applications from this benchmark are instrumented successfully with our prototype solution whereas other applications require changes in the compiler drivers to integrate our tag-protection pass completely in order to achieve complete instrumentation.These issues will be addressed in our future work.The simulation results in terms of percentage increase in execution time are presented in Fig. 7.
In order to get performance comparison with the SAFECode and AddressSanitizer, for the multi-threaded applications, the instrumented binaries are thus executed using the same machine configuration as defined earlier.The performance comparison and execution time overhead is presented in Fig. 8. From these results, it is clear that even in the worst-case scenario, the tag-protection presents 25.4% performance overhead for which is the least as compared to the existing solutions.As shown in this figure,  one application, dedup , has generated false bug alarm while when an application, swaptions , failed to complete its execution in the correct manner as required when compiled with SAFECode.One application, freqmine , fails to complete its execution when instrumented with AddressSanitizer.On the other hand, our technique does not result in any false alarms and all the benchmark applications generated outputs as expected.
Unlike AddressSanitizer, which uses compact shadow mapping and customized run-time libraries, the tag-protection allocates 8-bit tag mark against each memory allocation which relatively uses less memory.Contrary to SAFECode, our approach does not require static analysis and customized compiler driver which increases the compilation time.The instrumentation done by SAFECode has resulted in slower execution time.This effect is visible from our results where binaries instrumented through SAFECode bear huge performance overhead.Furthermore, in our technique the tag checks are inserted only for those store instructions that write to the allocated memory areas, skipping pointer copy instructions, which result in less execution time overhead.

FUTURE WORK
The proposed solution has been designed for C/C++ based applications targeting embedded systems as these languages provide a powerful set of features such as low memory footprint, little run-time support, low-level direct memory accesses and arithmetic operations through pointers.Real-time embedded systems having hard deadlines can not afford the luxury of executing extra instructions along with the actual code.For such systems, the proposed solution can be improved further by exploring the possibility of executing run-time checks, being inserted by the tag-protection pass, through dedicated hardware module.This hardware module can be designed to run in parallel within the main processing core.In this way, the performance overhead will be reduced approximately to zero.For example, for FPGA-based embedded systems, such hardware module can be implemented by designing customized instruction-set architecture in order to differentiate between run-time checks and application instructions.The current implementation of the tag-protection pass requires single LLVM-IR code file, as shown in Fig. 3.The LLVM linker (llvm-link ) has failed to generate LLVM-IR code file for remaining applications.In our future work, the TPP will be integrated completely within the compiler framework in order to compile and generate instrumented executable files.

CONCLUSION
In this paper, a fast and effective tag-protection solution is presented to detect illegal memory accesses in the applications that are written in C/C++.It is implemented as an instrumentation pass using LLVM and operates at the source-code level.The effectiveness of the proposed solution is tested using several benchmarks and test applications.Through various experimental results, it is shown that our solution has less performance overhead when compared with the publicly available tools.The applications instrumented with the tag-protection pass incur only 26.42% and 12.48% performance overhead on average for the SPEC CPU2006 and the multi-threaded PARSEC v2.1 benchmark suites, respectively.Fur-thermore, through the execution of instrumented multi-threaded applications, it is shown that our proposed solutions are thread-safe and the performance overhead is minimal when these applications are executed with a higher number of threads in a multi-core system.This demonstrates that our proposed technique is a scalable solution for multi-core environments as well.

Fig. 1 :
Fig. 1: (a) Typical memory layout of a C program.(b) Memory objects coupled with tag start and tag end marks.(c) Record table layout used by tag-protection at the time of code instrumentation.

Fig. 2 :
Fig. 2: Flow chart used to implement the tag-protection solution

ALGORITHM 3 :
Stage-3:Tag creation for memory objects that are declared locally inside each function Input: Instrumented LLVM-IR code β 2 generated in stage-

Fig. 7 :Fig. 8 :
Fig. 7: Performance overhead in terms of percentage increase in execution time for PARSEC v2.1 Benchmark applications 1 of tag-protection pass Create f un map table for each function f un def declared in α do if f un def has definition in place and arguments contain pointer memory objects then Create a function clone f un def clone of f un def .Create two new pointer arguments(tag start arg and tag end arg) for each pointer argument in f un call clone argument list.Store function name of f un def clone and f un def in f un map table end end Save modified LLVM-IR code as semi instrumented LLVM-IR code β 1 ALGORITHM 2: Stage-2:Tag creation for globally declared memory objects by tag-protection pass Input: Instrumented LLVM-IR codeβ 1 generated in stage-1 of tag-protection pass Output: Instrumented LLVM-IR code β 2 generated through LLVM opt command using stage-2 of tag-protection pass Create dedicated tag address globaltag; Create T ag map table.
for each global memory object global var in α do if global var is not a pointer memory object then if global var is an array memory object then Find start and end address, create tag start and tag end mark pointers and assign them start and end addresses.Store memory object global var, its type and tag marks information in T ag map table end else Create tag start and tag end mark pointers and initialize them with NULL value.Store memory object global var,its type and tag information in T ag map table end if global var is a structure memory object struct then for each variable struct var in struct do if struct var is not a pointer memory object then if struct var is an array memory object then Find start and end address, create tag start and tag end mark pointers and assign them start and end addresses.Store memory object global var, its type, struct var index and tag marks information in T ag map table end else Create tag start and tag end mark pointers and initialize them with NULL value.Store memory object global var, its type, struct var index and 2 of tag-protection pass ; memory map table T ag map table; Dedicated tag address globaltag ; Output: Instrumented LLVM-IR code β 3 generated through LLVM opt command using stage-3 of tag-protection pass for each function definition f un def in β 3 do for each instruction f un inst in f un def do if f un inst is a memory object allocation instruction and does not create pointer object then if f un inst creates an array memory object then Find start and end address, create tag start and tag end mark pointers and assign them start and end addresses.Store memory object f un inst, its type and tag marks information in T ag map table end end if f un inst is memory allocation instruction and creates pointer object then Create local tag start and tag endmark pointers and initialize them with NULL value.Store memory object instruction f un inst,its type and tag information in T ag map table end if f un inst is heap memory allocation function call instruction then Find start and end address of heap allocation.Find respective memory object and retrieve tag start and tag end marks from T ag map table.Create new STORE instructions to assign start and end addresses to the tag marks.end if f un inst is heap memory deallocation function call instruction then Retrieve corresponding tag marks from T ag map table and initialize it with globaltag end if f un inst is a STORE instruction and updates an allocated memory object pointer address from source operand.then Retrieve respective tag start and tag end marks from T ag map table for source memory object .Retrieve respective tag start and tag end marks from T ag map table for destination memory object .Create store instructions to copy address values from source to destination tag marks.
3 of tag-protection pass ; memory map table T ag map table and f un map table ; Output: Instrumented LLVM-IR code β 4 generated through LLVM opt command using stage-4 of tag-protection pass for each function definition f un def in β 3 do for each instruction f un inst in f un def do if f un inst is a function call instruction then Get function name f un call name being called by f un inst instruction if function called by f un inst is present in f un map table.then Retrieve respective f un def clone from f un map table.Create new function call instruction f un inst new pointing to f un def clone.for each pointer function argument ptr arg in f un inst do Retrieve respective tag start and tag end marks from T ag map table and add them to f un inst new argument list.end Remove f un inst instruction and replace it with f un inst new.end end end end Delete memory map table f un map table Save modified LLVM-IR code as an instrumented LLVM-IR code β 4 and hence can no Stage-5:Tag checks placement.Input: Instrumented LLVM-IR code β 4 generated in stage-4 of tag-protection pass ; memory map table T ag map table;Dedicated tag address globaltag Output: Final Instrumented LLVM-IR code γ generated through LLVM opt command using stage-5 of tag-protection pass for each function definition f un def in β 3 do for each instruction f un inst in f un def do if f un inst is function call without definition and not a memory allocation or deallocation call then for each function argument f un arg in f un inst do Create two memory objects bef ore f un and af ter f un.Retrieve respective tag start and tag end marks from T ag map table.Read address location next to tagend address before f un inst instruction and store the read value in bef ore f un.Read address location next to tagend address after f un inst instruction and store the read value in af ter f un.Place tag check instruction after function call f un inst comparing bef ore f un and af ter f un memory objects.end end if f un inst is a STORE instruction and updates a memory object then Retrieve respective tag start and tag end marks from T ag map table and get address to be accessed address tobe accessed by the f un inst instruction.
Perform dangling pointer dereference check.compare tag end with the globaltag.Perform address comparison checks: address tobe accessed with the tag start and tag end.end if f un inst is a LOAD instruction and read from allocated memory object then Retrieve respective tag start and tag end marks from T ag map table and get address to be accessed by the f un inst instruction.Perform dangling pointer dereference check.compare tag end with the globaltag.Perform address comparison checks: address tobe accessed with the tag start and tag end.end end end Delete memory map table T ag map table.

Table I :
Effectiveness of the proposed tag-protection solution on different applications from BugBench benchmark suite

Table II :
Comprehensive Comparison with existing publicly available countermeasures when tested on Ubuntu 12.4 64-bit architecture

Table III :
Increase in memory utilization for instrumented SPEC CPU2006 Benchmark applications

Table IV :
Increase in binary size for instrumented SPEC CPU2006 Benchmark applications.