TASK MIGRATION SUBSYSTEM FOR MACRO-PROGRAMMED WIRELESS SENSOR NETWORK

Vitalijus Martusevicius, Egidijus Kazanavicius, Vaidas Jukavicius

Kaunas University of Technology, Department of Computer Engineering, Studentu st. 50-208, LT-51368, Kaunas, Lithuania, [email protected], [email protected], [email protected]

Abstract. Over the past years of research, various layers of programming abstraction for networked sensor systems have evolved, and the motivation for macro-programming is discussed. Among several recently proposed macro-programming techniques, the abstract task graph macro-programming framework was developed to ease the application developer's work when creating a wide range of wireless sensor network applications. However, its underlying runtime system currently supports only initial task placement onto the nodes of the target network. Since sensor networks operate in a dynamic environment, the runtime system should be improved to support the migration of tasks based on the energy situation in the networked sensor system. In this paper, an approach for task migration at runtime is proposed. To achieve this goal, the set of abstract task annotations is extended with migration elements, the runtime system is complemented with task migration components, and a destination node selection method for the task to be migrated is provided.

Keywords: wireless sensor network, macro-programming, abstract task graph, runtime system, task migration.

1 Introduction

A wireless sensor network (WSN) consists of many small, independent, battery-powered nodes with wireless communication and sensing capabilities. Such sensor networks have significant potential for applications deployed in diverse areas. Currently, a wide range of applications are created for habitat monitoring [23], environmental monitoring [14], target tracking [21], personal health [9], etc.

From the early days, application developers had to specify application functionality at the level of the individual node using programming languages like nesC [8] and galsC [6]. In addition, the application developer was responsible for ensuring the desired distributed application functionality. This involves reading values from local sensing interfaces, maintaining application level state in memory, sending messages to other nodes and processing incoming messages from other nodes. This approach is called node centric programming and is not easy to use in large sensor networks.

As opposed to the above, sensor network macro-programming provides the ability to specify applications at a high level of abstraction. Abstractions are used to specify the high level collaborative behavior at the system level, while the low level details are controlled by underlying runtime system and are hidden from the developer. Macro-programming languages provide abstractions to specify application behavior which is automatically synthesized into software for each node in the target deployment [3], [10], [17].

Our work is focused on the abstract task graph, called ATaG [3], and its underlying runtime system [2]. ATaG is a data driven programming model for application development on networked sensor systems. It is designed to create architecture independent specifications of the application functionality. The ATaG runtime system, called DART, performs software synthesis and enables integration of different protocols and services at the lower layers. The current ATaG compilation framework [18] supports only initial task mapping techniques on the target network. Since sensor networks operate in a dynamic environment, DART should be extended to support task migration based on the remaining energy in the networked sensor system. This involves addressing the questions of when to migrate tasks, which tasks to migrate and how to migrate them.

The current ATaG model supports only static logical scopes [16]. In order to specify tasks with migration capability, we extended the ATaG notation [3] with dynamic logical scopes. Dynamic logical scopes also enable more flexible application development. Furthermore, we suggested an improvement to the DART with the aim of implementing task migration at runtime. Additionally, a special method is proposed to select a destination node for the task to be migrated. Finally, we evaluated the advantages of task migration compared with the existing initial task mapping technique [19].

2 Related work

Sensor network programming approaches are classified into low and high level models. Low level programming models are focused on abstracting hardware and allowing flexible control of nodes. In this case, the developer has to translate global application behavior into tasks for each node and program these nodes individually. One of the earliest examples in this class is TinyOS [24], which allows application software to access hardware directly. The logical neighborhood concept is common in node centric programming, because neighborhood creation motivated the design of node level libraries [27], [28]. These libraries handle the low level details of control and provide an application programming interface to the developer. Middleware [11], [13] also raised the programming abstraction level by providing services to use virtual topologies, logical and dynamic relationships.

High level programming models use an application oriented view and are further divided into group and network level abstractions. Group level abstractions provide a set of programming elements to handle a group of nodes as a single object [26]. In network level abstractions a sensor network is treated as a single abstract machine. Furthermore, network level abstractions are classified into database abstractions and macro-programming languages. The database examples used for high level abstractions are TinyDB [4] and Cougar [5].

Macro-programming languages have greater flexibility compared with database approach and can be used to develop more complex applications. Various functional, data and control driven, imperative and declarative mechanisms are used as a basis for macro-programming languages. Regiment [17] is a declarative functional language with support for region-based functions like aggregation, filtering and mapping. Kairos [10] is an imperative, control driven macro-programming language that provides certain global abstractions and a mechanism to translate a centralized program into a distributed program for sensor nodes. ATaG [3] represents a data driven mechanism as a basis for architecture independent programming of sensor network applications.

A large body of work exists on the energy efficient task mapping problem in networked sensor systems. Task placement approaches for unconstrained graphs are presented in [25]. For high level applications a specific greedy solution [1] is suggested. Task placements are specified using roles in [7]. The authors of [19] present initial task mapping in terms of energy balance and total energy spent.

Finally, the MagnetOS [12] system statically partitions a monolithic program into components that are distributed over the sensor network nodes and performs component migration at runtime. We adapt the component migration approach for the DART [2] with the goal of enhancing application longevity in the WSN.

3 Dynamic scopes for abstract task graph

The abstract task graph is a macro-programming framework which provides a mixed declarative-imperative approach. As briefly discussed in [3], ATaG programs are composed of a set of abstract declarations, each of which can be one of three types: abstract task, abstract channel and abstract data. Abstract tasks and abstract channels have associated annotations. Furthermore, abstract task annotations consist of two parts: instantiation and firing rules. The firing rule specifies when the processing in a task must be triggered, while the instantiation rule controls the placement of tasks on the nodes.

Instantiation rules are described in the context of logical scopes. ATaG scoping is thoroughly described in [16], where three additional static constructs are mainly introduced. The authors also note that supporting dynamic scopes would complicate the programming model and lead to unpredictable application behavior. We take a different position and introduce conditional and task migration logical scopes.

Networked sensor systems monitor an area of interest using nodes equipped with a corresponding sensing device. Time varying properties of the nodes can be evaluated using dynamic scopes. Therefore we complemented particular instantiation rules described in [3] and corresponding scopes [16] with conditional operator using programming language syntax as depicted in expression 1.

nodes-per-instance: [/]n [@sensingDevice][&(condition)]    (1)

Dynamic scopes expressed via conditional operators make it easier to specify application functionality and make the declarative part of the ATaG model more flexible. Consequently, the application developer's work on the imperative part of the ATaG program is reduced.

Some macro-programmed WSN nodes execute more tasks, or more complicated tasks, than others, e.g. they collect and process information gathered from neighboring nodes. Such nodes run out of battery power sooner than desired and can shorten the application lifetime. To increase system and application longevity, we introduced the task migration construct |→ as shown in expression 2.

area-per-instance: [/]area [|→]    (2)

Thus, tasks with migration capability are specified by the application developer in the ATaG declaration. Migration capability is supported only for tasks that are assigned to an arbitrarily selected node from a specified area, region or partition. For such cases, the initial task mapping problem is solved in [19], but the authors do not provide a solution for task migration at runtime.

Sensor network applications usually perform data gathering of some kind. Thus, we created a sample environmental temperature monitoring application to introduce the syntax and usage of the dynamic scoping constructs. The application functionality is expressed by the ATaG program depicted in figure 1. Normally, data generated by sensors is collected inside the network before being sent out to the base station. Therefore our ATaG program uses a hierarchical approach for data collection and processing.

Figure 1. ATaG program for environmental temperature monitoring

Monitor task is instantiated on each WSN node equipped with temperature sensor. This task is responsible for periodic data reading from a temperature sensor every 5 minutes. If sampled temperature value exceeds a predefined threshold, which in our case is 40°C, then temperature data item is added into local data pool for further processing.

To ensure scalability, WSNs are often partitioned into areas. As specified by the instantiation rule of the collector task, the entire WSN is divided into non-overlapping areas of 100 square meters. Only one collector task is instantiated in each area, while the exact initial task placement is the responsibility of the compiler. Temperature data items produced on nodes within the same area are routed to the node that hosts the collector task. Furthermore, the task migration attribute indicates that the collector task can be migrated to any node in the area. Migration is controlled by the underlying runtime subsystem proposed in section 4.

The supervisor task is instantiated on a base station and gathers data from the nodes with the assigned collector task. It analyzes information about temperature violation notifications and triggers required actuators.
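As an illustration only, the firing behaviour of the monitor task described above could be sketched in Python as follows. The class DataPool, the function monitor_step and the data item name "Temperature" are hypothetical names chosen for this sketch and are not part of the ATaG or DART interfaces. The runtime is assumed to invoke monitor_step once every 5 minutes.

```python
from dataclasses import dataclass, field

THRESHOLD_C = 40.0  # temperature threshold taken from the text


@dataclass
class DataPool:
    """Hypothetical stand-in for the node-local data pool."""
    items: list = field(default_factory=list)

    def put(self, name, value):
        self.items.append((name, value))


def monitor_step(read_temperature, pool, threshold=THRESHOLD_C):
    """One periodic firing of the monitor task.

    Reads the sensor once and adds a data item to the local pool only
    when the sample exceeds the threshold. Returns True if an item was
    produced, False otherwise.
    """
    value = read_temperature()
    if value > threshold:
        pool.put("Temperature", value)  # item name is an assumption
        return True
    return False


if __name__ == "__main__":
    samples = iter([22.5, 41.3, 40.0, 57.0])
    pool = DataPool()
    for _ in range(4):
        monitor_step(lambda: next(samples), pool)
    print(pool.items)
```

Passing the sensor as a callable keeps the sketch independent of any particular sensing interface.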

4 Task migration subsystem

The ATaG runtime system [2], called DART, provides the required underlying mechanisms for communication and coordination between instances of abstract tasks. However, the current DART implementation makes no assumption about network changes at runtime. Moreover, ATaG programs are synthesized onto WSNs that are dynamic in nature, where nodes fail due to exhaustion of limited energy resources. Therefore, a task migration subsystem could significantly improve application longevity by using the available power within the migration area more effectively.

The entire DART functionality is divided into a set of modules to facilitate customization to various deployments. For the sake of completeness, each component is briefly described:

• UserTask – represents an application level task which corresponds to abstract task

• DataPool – manages abstract data items in the local data pool

• ATaGManager – stores the ATaG representation that is relevant to the particular node

• Dispatcher – handles the transmission of data items produced on the node

• NetworkArchitecture – maintains topology related information for the node

• NetworkStack – manages network interface of the node

In order to implement task migration in the runtime, we complemented DART with two modules:

• TaskMigration – determines task migration conditions, selects the destination node and initiates the migration process

• MigrationDataPool – manages all required information associated with task under migration, i.e. saves and restores task related state information during migration process, and controls the dispatch of task environment

Figure 2 depicts the intra-node and inter-node flow of control in the task migration process. Inter-node communication takes place between the source and destination nodes of the task under migration. Only the DART modules involved in the migration process are shown. The individual steps of task migration are explained below.

Figure 2. Control flow in the task migration process

Every node in the migration area regularly monitors the exhaustion of its battery power. When a node hosting a task with the migration attribute detects that its battery level is below the critical threshold, the destination node search procedure is activated. It consists of two main functional parts:

• Estimates whether task migration is expedient according to the residual battery power of nodes in the same area. To this end, the NetworkArchitecture module is invoked to obtain information about the node’s own location and the composition of its neighborhood in that area. In addition, nodes of the same area are identified by the task instantiation rules that are maintained in the ATaGManager module.

• If task migration is expedient, then destination node is selected using a special method that will be explained in section 5. In general, the destination node is chosen according to the residual battery power and distance from the source node.

Whenever destination node is detected, all required migration information is collected inside the local migration data pool and the runtime system is adapted accordingly. This is a complex process that consists of several actions among components of the runtime system:

• User task code is saved in the migration data pool for transmission to the destination node, while the task itself is disabled and later removed from the list of tasks running on this node.

• ATaGManager component is reconfigured – information about the declarative part of ATaG program corresponding to the particular task is preserved in the migration data pool and removed from the ATaGManager. This information is composed of input-output channel declarations and annotation of the task to be migrated.

• Data item associated with output channel of the task under migration is also preserved in the migration data pool.

• If source node contains data items that are associated only with the task under migration, then these data items are removed as redundant and their corresponding memory is de-allocated from the data pool.

• NetworkArchitecture component is reconfigured to cease gathering information about a neighborhood which is determined by channel annotations of the abstract task to be migrated.

Eventually, task migration process is activated and coordinated by synchronization object that controls acquisition and release operations of the data under migration. To handle transmission errors, a specific failure detection technique, similar to the mechanism of forwarding pointers in [12], is to be implemented.

When the migration data arrives at the destination node, it is preserved in the local migration data pool and the appropriate instantiation procedure is invoked:

• User task code is restored from the migration data pool, while the task itself is activated by adding it to the list of tasks placed onto this node.

• ATaGManager component is updated with the declarative part of ATaG program corresponding to the migrated task from the MigrationDataPool. Thus, ATaGManager is complemented with input-output channel declarations and annotation of the migrated task.

• If the destination node does not contain data items associated with input-output channels of the task, then additional memory is allocated in the data pool for missing data items. Furthermore, if the memory is allocated for a data item associated with the output channel, then its value is restored from the migration data pool.

• NetworkArchitecture component is reconfigured to collect information about the neighborhood, which is determined by channel annotations of the migrated task.
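The source-side and destination-side steps listed above can be illustrated with a simplified Python sketch. It is not the DART implementation: the Node class, the functions pack_task and restore_task, and the data item name "Alarm" are hypothetical, task code is represented by a plain string, and the reconfiguration of the NetworkArchitecture component and the transmission itself are omitted.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical, heavily simplified runtime state of one node."""
    tasks: dict = field(default_factory=dict)         # task name -> task code
    declarations: dict = field(default_factory=dict)  # task name -> channel declarations
    data_pool: dict = field(default_factory=dict)     # data item name -> value


def _items(decl):
    """All data items referenced by a task declaration."""
    return decl["inputs"] + decl["outputs"]


def pack_task(node, task):
    """Source side: collect the task environment and remove it from the node."""
    code = node.tasks.pop(task)          # disable and remove the task
    decl = node.declarations.pop(task)   # remove its declarative part
    # Preserve the values of data items on the task's output channels.
    outputs = {name: node.data_pool.get(name) for name in decl["outputs"]}
    # Drop data items that no remaining task on this node refers to.
    for name in _items(decl):
        still_used = any(name in _items(d) for d in node.declarations.values())
        if not still_used:
            node.data_pool.pop(name, None)
    return {"task": task, "code": code, "declaration": decl, "outputs": outputs}


def restore_task(node, bundle):
    """Destination side: re-instantiate the task from the migration data."""
    decl = bundle["declaration"]
    node.tasks[bundle["task"]] = bundle["code"]
    node.declarations[bundle["task"]] = decl
    # Allocate missing data items; newly allocated output items get the
    # value carried in the bundle, newly allocated input items stay empty.
    for name in _items(decl):
        if name not in node.data_pool:
            node.data_pool[name] = bundle["outputs"].get(name)
```

In this sketch the dictionary returned by pack_task plays the role of the content held in the migration data pool while the task is in transit.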

To sum up, we proposed a task migration subsystem intended for the DART. It allows a running task to be migrated from one node to another with minimal disruption to application functionality. The migration operation maintains complete task transactional integrity: it transfers the entire task environment, including task code, the associated declarative part, required data items and neighborhood adjustment rules. In the next section we provide a method to select a destination node for the task to be migrated.

5 Destination node selection method

As described in section 3, a task with migration capability can be transferred to any node in the migration area. If the remaining energy of the node executing such a task reaches the threshold level, the task is migrated to a particular node which is healthier. We propose a destination node selection method designed to find the most suitable node for the task with respect to the remaining energy of the node and the task transmission costs.

According to the authors of [15], the energy loss on the path $d_i$ is evaluated as follows:

$$E(d_i) = p_i \left( \alpha + \beta d_i^{\eta} \right) \tag{3}$$

where $d_i$ is the transmission distance on edge $e_i$, $i = 1 \ldots n$, and $n$ is the number of hops in the route $R$, which is selected by the routing protocol in use; $p_i$ is the total number of bits transmitted on edge $e_i$; $\eta$ is the path loss exponent; $\alpha$ and $\beta$ are the distance-independent and distance-dependent energy components for one-bit communication. This model is platform independent, because the values of $\alpha$, $\beta$ and $\eta$ are to be measured for the WSN platform in use.

Thus, we evaluate the total energy consumption in the route $R$ as follows:

$$C_R = \sum_{e_i \in R} E(d_i) = \sum_{e_i \in R} p_i \left( \alpha + \beta d_i^{\eta} \right) \tag{4}$$

Subsequently, for all the nodes in the migration area, the energy balance $E_B$ is calculated:

$$E_{B,j} = E_j - C_{R,j} \tag{5}$$

where $E_j$ is the remaining energy of node $j$, $j = 1 \ldots m$, and $m$ is the number of nodes in the task migration area. Thus, $E_{B,j}$ indicates the suitability of each node in the migration area for the task to be migrated.

Suitability of every node is evaluated according to the residual battery power and energy costs required to receive the task from the source node.

Finally, the best destination node $N_{dest}$ for the task under migration is selected as follows:

$$N_{dest} = \max_{j} \left\{ E_{B,j} \right\} \tag{6}$$

This method increases application longevity, because particular tasks are adaptive in respect of energy distribution in the WSN.
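The selection method of this section can be sketched in a few lines of Python. The function names and the layout of the input data are assumptions made for illustration. Each route is given as a list of (distance, bits) pairs, one per hop, and the function returns the node that attains the maximum energy balance.

```python
def hop_energy(distance, bits, alpha, beta, eta):
    """Energy spent sending `bits` over one hop of length `distance`."""
    return bits * (alpha + beta * distance ** eta)


def route_cost(hops, alpha, beta, eta):
    """Total energy over a route given as (distance, bits) pairs."""
    return sum(hop_energy(d, p, alpha, beta, eta) for d, p in hops)


def select_destination(candidates, alpha, beta, eta):
    """Pick the candidate with the largest energy balance.

    `candidates` maps a node id to a pair (remaining_energy, hops),
    where `hops` is the route from the source node to that candidate.
    Returns (best_node_id, balances), where `balances` maps each node
    id to its remaining energy minus the route cost.
    """
    balances = {
        node: energy - route_cost(hops, alpha, beta, eta)
        for node, (energy, hops) in candidates.items()
    }
    best = max(balances, key=balances.get)
    return best, balances


if __name__ == "__main__":
    # Toy values only; alpha, beta and eta must be measured per platform.
    candidates = {
        "a": (100, [(2, 10)]),
        "b": (110, [(2, 10), (1, 10)]),
    }
    print(select_destination(candidates, alpha=1, beta=1, eta=2))
```

In the toy data, node "b" has more remaining energy than node "a", but the longer route makes its energy balance lower, so node "a" is selected.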

6 Simulation results

The proposed task migration technique is implemented in a simulation environment and its advantages are explored. Basically, a node that reaches the critical level of energy transfers the task with migration capability to a healthier node before it dies. To ensure enough time to complete task migration before the node runs out of battery power, the migration process is initiated when the battery level of the node reaches a predefined critical threshold.

Sensor node parameters in the experiment are selected according to the capabilities of the Sun SPOT node [22]. It has a 180 MHz 32-bit ARM920T processor with 512K RAM and 4M Flash memory and is equipped with a temperature sensor. Moreover, a linear battery model [20] is used, as it allows the efficiency of task migration to be evaluated by providing a simple metric of energy consumption for computation and communication.

A sample of environmental temperature monitoring application discussed in section 3 is used in the research. Sensor network is simulated with 24 nodes randomly deployed in the area of 100×200 meters as depicted in figure 3. Maximum transmission range of a node is 50m and the links between nodes represent network connectivity.

Figure 3. Task migration example

According to the ATaG program depicted in figure 1, the entire network is divided into two non-overlapping areas of 100 square meters. It is assumed that all nodes are equipped with a temperature sensor and therefore host the monitor task. Furthermore, only two of these nodes host a collector task, which is assigned to one particular node in each area. Initial task assignment is thoroughly discussed in [19] and is outside the scope of this article.

To demonstrate the task migration process, the migration route of the collector task is illustrated in the first part of figure 3. This route is determined by the algorithm described in section 5. The collector task is initially assigned to node 6. This node runs out of battery power quicker than other nodes in the area, because it executes both monitor and collector tasks. Node 6 initiates the task migration process when its residual energy reaches 15%. Each node is assigned an initial energy of 9000 J.

Using the proposed task migration technique, at minute 1103 node 6 determines that task migration to another node is required before its energy is entirely depleted. The collector task migrates from node 6 through nodes 8, 9, 7 and 4, until the available power within the migration area is exhausted.
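The trigger condition used in the simulation can be made concrete with a short Python sketch. The function names are hypothetical. The average drain rate is computed under two assumptions: node 6 reaches the 15% threshold exactly at minute 1103, and energy drains at a constant rate in line with the linear battery model.

```python
INITIAL_ENERGY_J = 9000.0   # initial energy per node, from the text
THRESHOLD_FRACTION = 0.15   # critical threshold, from the text


def critical_level(initial_j=INITIAL_ENERGY_J, fraction=THRESHOLD_FRACTION):
    """Absolute energy level in joules at which migration is triggered."""
    return initial_j * fraction


def should_migrate(remaining_j, hosts_migratable_task,
                   initial_j=INITIAL_ENERGY_J, fraction=THRESHOLD_FRACTION):
    """True when a node hosting a migratable task has reached the threshold."""
    return hosts_migratable_task and remaining_j <= critical_level(initial_j, fraction)


def average_drain(initial_j, remaining_j, elapsed_min):
    """Average drain rate in J/min, assuming a constant (linear) drain."""
    return (initial_j - remaining_j) / elapsed_min


if __name__ == "__main__":
    level = critical_level()
    print(level)
    print(average_drain(INITIAL_ENERGY_J, level, 1103))
```

With the stated parameters the trigger level is 1350 J, which implies an average drain of roughly 6.9 J per minute on node 6 while it hosts both tasks.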

[Figure 4 plot: remaining energy (Joules, 0 to 10000) versus time (minutes, 0 to 3000) for Node6, Node8, Node9, Node7 and Node4]

Figure 4. Remaining energy of nodes in the task migration route

The remaining energy of the nodes that the collector task passes through is shown in figure 4. Owing to the task migration capability, the collector task operates until minute 2426, which is 80% longer than in the case without the task migration feature.
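The lifetime without migration is not reported directly, but it follows from the two figures above. Writing $T_{mig}$ for the lifetime with migration and $T_0$ for the lifetime without it (both symbols are introduced here for illustration only), an 80% increase means:

$$T_{mig} = 1.8 \, T_0 \quad\Rightarrow\quad T_0 = \frac{2426}{1.8} \approx 1348 \text{ minutes}$$

This value is an inference from the reported percentage, not a measured result.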

7 Conclusions

In this paper, an approach for task migration at runtime was presented. First, ATaG annotations were extended with conditional and migration elements to improve the flexibility of the declarative part and to specify tasks under migration. Secondly, the DART was complemented with a task migration subsystem. Finally, a corresponding destination node selection method was provided. This technique allows tasks to be moved transparently at runtime in order to achieve good utilization of the available energy in the migration area.

Using results from the simulation, it is verified that proposed task migration technique significantly extends the lifetime of application in the WSN. In addition, application longevity increases with the density of nodes in the migration area.

This work is focused particularly on task migration, but it can be extended to support distributed task execution, in which the task under migration is divided into subtasks that are executed on several nodes. In this way, even better energy utilization could be achieved.

References

[1] Abrams Z. and Liu J. Greedy is good: on service tree placement for in-network stream processing. Proceedings of the 26th IEEE international conference on distributed computing systems. 2006, pp. 72-72.

[2] Bakshi A., Pathak A., and Prasanna V. K. System-level support for macroprogramming of networked sensing applications. Proceedings of the international conference on pervasive systems and computing. 2005, pp. 3-11.

[3] Bakshi A., Prasanna V. K., Reich J., and Larner D. The Abstract Task Graph: A methodology for architecture-independent programming of networked sensor systems. Proceedings of the workshop on end-to-end, sense-and-respond systems, applications and services. 2005, pp. 19-24.

[4] Bonnet P., Gehrke J., and Seshadri P. Querying the physical world. IEEE Personal Communications. 2000, vol. 7, pp. 10-15.

[5] Bonnet P., Gehrke J., and Seshadri P. Towards sensor database systems. Proceedings of the 2nd international conference on mobile data management. 2001, vol. 1987, pp. 3-14.

[6] Cheong E. and Liu J. galsC: A language for event-driven embedded systems. Proceedings of design, automation and test in europe. 2005, vol. 2, pp. 1050-1055.

[7] Frank C. and Römer K. Solving generic role assignment exactly. Proceedings of the 20th IEEE international parallel and distributed processing symposium. 2006, pp. 10-10.

[8] Gay D., Levis P., von Behren R., Welsh M., Brewer E., and Culler D. The nesC language: A holistic approach to networked embedded systems. Proceedings of the ACM SIGPLAN 2003 conference on programming language design and implementation. 2003, pp. 1-11.

[9] Giorgetti G., Manes G., Lewis J. H., Mastroianni S. T., and Gupta S. K. S. The personal sensor network: a user-centric monitoring solution. Proceedings of the ICST 2nd international conference on body area networks. 2007.

[10] Gummadi R., Gnawali O., and Govindan R. Macro-programming wireless sensor networks using Kairos. Proceedings of the 1st international conference on distributed computing in sensor systems. 2005, vol. 3560, pp. 126-140.

[11] Heinzelman W. B., Murphy A. L., Carvalho H. S., and Perillo M. A. Middleware to support sensor network applications. IEEE Network. 2004, vol. 18, pp. 6-14.

[12] Liu H., Roeder T., Walsh K., Barr R., and Sirer E. G. Design and implementation of a single system image operating system for ad hoc networks. Proceedings of the 3rd international conference on mobile systems, applications, and services. 2005, pp. 149–162.

[13] Liu T. and Martonosi M. Impala: A middleware system for managing autonomic, parallel sensor systems. Proceedings of the 9th ACM SIGPLAN symposium on Principles and practice of parallel programming. 2003, pp. 107-118.

[14] Martinez K., Hart J. K., and Ong R. Environmental sensor networks. IEEE computer. 2004, vol. 37, pp. 50-56.

[15] Min R. and Chandrakasan A. A framework for energy-scalable communication in high-density wireless networks. Proceedings of international symposium on low power electronics and design. 2002, pp. 36–41.

[16] Mottola L., Pathak A., Bakshi A., Prasanna V. K., and Picco G. P. Enabling scope-based interactions in sensor network macroprogramming. Proceedings of the 4th IEEE international conference on mobile ad-hoc and sensor systems. 2007, pp. 1-9.

[17] Newton R. and Welsh M. Region streams: Functional macroprogramming for sensor networks. Proceedings of the 1st international workshop on data management for sensor networks. 2004, vol. 72, pp. 78-87.

[18] Pathak A., Mottola L., Bakshi A., Prasanna V. K., and Picco G. P. A Compilation framework for macroprogramming networked sensors. Proceedings of the 3rd international conference on distributed computing in sensor systems. 2007, vol. 4549, pp. 189-204.

[19] Pathak A., Prasanna V. K. Energy-efficient task mapping for data-driven sensor network macroprogramming. Proceedings of the 4th IEEE international conference on distributed computing in sensor systems. 2008, vol. 5067, pp. 516-524.

[20] Savvides A., Park S., and Srivastava M. B. On modeling networks of wireless micro sensors. Poster session at SIGMETRICS 2001. 2001.

[21] Simon G., Maróti M., Lédeczi Á., Balogh G., Kusy B., Nádas A., Pap G., Sallai J., and Frampton K. Sensor network-based countersniper system. Proceedings of the 2nd international conference on embedded networked sensor systems. 2004, pp. 1-12.

[22] Sun™ Small Programmable Object Technology (Sun SPOT), www.sunspotworld.com.

[23] Szewczyk R., Osterweil E., Polastre J., Hamilton M., Mainwaring A., and Estrin D. Habitat monitoring with sensor networks. Communications of the ACM. 2004, vol. 47, pp. 34-40.

[24] Szewczyk R., Woo A., Hollar S., Culler D., and Pister K. System architecture directions for networked sensors. Proceedings of the 9th international conference on architectural support for programming languages and operating systems. 2000, vol. 35, pp. 93-104.

[25] Tian Y., Ekici E., and Özgüner F. Energy-constrained task mapping and scheduling in wireless sensor networks. Proceedings of the IEEE international workshop resource provisioning and management in sensor networks. 2005, pp. 211-218.

[26] Welsh M. and Mainland G. Programming sensor networks using abstract regions. Proceedings of the 1st conference on symposium on networked systems design and implementation. 2004, vol. 1, pp. 3-3.

[27] Whitehouse K., Sharp C., Brewer E., and Culler D. Hood: a neighborhood abstraction for sensor networks. Proceedings of the 2nd international conference on mobile systems, applications, and services. 2004, pp. 99-110.

[28] Whitehouse K., Zhao F., and Liu J. Semantic streams: a framework for declarative queries and automatic data interpretation. Technical Report MSR-TR-2005-45, Microsoft Research. 2005.

ANALYSIS AND EVALUATION OF DISTRIBUTED DENIAL OF SERVICE ATTACKS IDENTIFICATION METHODS

Saulius Grusnys, Ingrida Lagzdinyte

Kaunas University of Technology, Department of Computer Networks, Studentu 50, Kaunas, Lithuania, [email protected], [email protected]

Abstract. Defending against Distributed Denial of Service (DDoS) attacks is one of the most important tasks to ensure service availability. In this paper we present a software system, which implements some of the available methods to detect DDoS attacks and creates firewall rules to stop the traffic from the hosts suspected to be participating in the attack. Implemented methods include Change Point Approach, Covariance model and Passive Measurement based Heuristics. The system enables to analyze characteristics of implemented DDoS identification methods and evaluate their efficiency in different conditions.

Keywords: Distributed Denial of Service, DDoS, DDoS identification methods.

1 Introduction

1.1 Distributed Denial of Service Attacks

Distributed Denial of Service (DDoS) attacks are performed in order to interrupt Internet services by flooding the victim with a high volume of malicious packets originating from many different sources [1]. DDoS attacks are generally classified into two types: a) direct attacks; b) reflector attacks [2].

In direct attacks, all the packets are sent from the attacking hosts directly to the victim. Most of the time spoofed IP addresses are used, which makes the attack more effective as the victim repeatedly tries to send reply packets to non-existent hosts.

In reflector attacks, requests with the victim's spoofed source address are sent to many different servers. The victim gets flooded by the replies coming from those servers. It is very hard to stop such attacks, as the traffic comes from legitimate servers which may be needed to provide services on the victim side.

Due to the fact that a DDoS attack has to be detected on-line, the detection of an attack should be as quick as possible in order to prevent attack from the very beginning [3]. The detection of a DDoS attack is also very complicated by the fact that it is very similar to the traffic generated by increased number of legitimate users. Another problem with DDoS attacks is that the equipment of the victim or victim's ISP can be taken down by the amount of traffic generated by attacking hosts. No methods for detecting an attack are effective in such case.

The detection of the attack can be more effective if it is performed on ISP routers as close to the attackers as possible. Each node takes the part of the attack load and, if detected effectively, blocks it.

1.2 DDoS Identification Methods

There are a number of methods available which aim to detect DDoS attacks. Figure 1 presents the classification of available DDoS identification methods.

The first group of methods (see Figure 1 (a)) proposes measures to stop DDoS attacks on a global scale [2, 4]. The idea of such methods is that a number of systems detecting DDoS attacks are located in many different places on the Internet. When one system suspects an attack, it asks the other systems whether the traffic looks suspicious to them as well. If the attack is confirmed, routers are instructed to block packets from the attacking hosts [4].

The first advantage is that systems using those methods can obtain more information about the environment by communicating with each other. Another is that the amount of traffic each system must process is smaller. The disadvantage of such methods is that they are more costly to implement. Besides, it is more complicated to add traffic analyzing systems in various places of the network.


Another group of methods (see Figure 1 (b)) is available for detecting DDoS attacks on the victim side [1, 3, 5-9]. Most of these methods rely on monitoring traffic for some period of time when the traffic load is normal and no attacks are being performed. Normal traffic profiles are created by examining different traffic parameters in different ways.

Figure 1. DDoS Identification Methods Classification. (a) Distributed methods, (b) Single host methods

When an attack starts, the parameter values reflecting current traffic begin to differ from the values stored in the profiles of normal traffic. In this way the system knows that an attack has been launched.

The main advantage of such methods is that they can be implemented and run on one host. The second advantage is that the threshold values used by the methods can be tuned to the particular traffic that flows through the monitoring host. The main disadvantage of such methods is that a single system can easily be flooded by a large amount of traffic coming from many different hosts.

In the rest of this paper we focus on detecting DDoS attacks on a single host, assuming that the ISP and our hardware are able to cope with the load generated by the attack.

As single host DDoS detection methods operate differently, evaluate different parameters and are suitable for detecting different kinds of DDoS attacks, there is a need to identify which method or methods should be used under particular circumstances. In this article three methods are evaluated: Change Point Approach, Covariance Model and Passive Measurement based Heuristics. We think that these methods can be effective and easily implemented at the hardware level if necessary.

The rest of the paper is structured as follows: Section 2 discusses the main characteristics of the Change Point Approach, Covariance model and Passive Measurement based Heuristics methods. Section 3 describes the software that implements these methods and is used for their evaluation. Section 4 presents some experimental results. Finally, Section 5 gives the conclusions.

2 Change Point Approach, Covariance model and Passive Measurement based Heuristics methods

Change Point Approach, Covariance Model and Passive Measurement based Heuristics methods are presented and fully described in [3, 6, 8]. Here we will discuss only the most important aspects of these three methods.

2.1 Change Point Approach

Using the Change Point Detection algorithm, the number of packets that arrived during a certain period of time is measured. Then a cumulative sum is calculated for a predefined number of intervals. Drastic changes in the cumulative sum values mean that a change in the state of the traffic has occurred [8]. In this way the beginning of an attack can be identified quickly and effectively even if the number of legitimate packets arriving was high before the attack started.

The algorithm used to calculate the cumulative sum for any parameter can be illustrated by 5 steps: 1) define the number of intervals and the length of the interval; 2) count the number of packets matching a particular feature during each interval; 3) calculate the average number of packets over all the intervals; 4) starting with 0, compute the cumulative sum for each subsequent interval by adding the value of the current interval to the cumulative sum of the previous interval and subtracting the average; 5) a change is declared if the differences in cumulative sums between intervals exceed the maximum defined value.


The main drawback of this method is that an attack cannot be detected if the volume of traffic increases steadily without ever exceeding the maximum difference between cumulative sum values. The difference in cumulative sum values that triggers the attack alarm should be determined by monitoring legitimate traffic for some period of time and setting the maximum difference between cumulative sum values accordingly.
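The five steps above can be sketched in Python as follows. This is a minimal illustration, not the C implementation described in this paper: the function names `cusum` and `change_detected`, the sample packet counts, and the reading of step 5 as "the difference between two consecutive cumulative sums exceeds the maximum" are assumptions made for the example.

```python
def cusum(counts):
    """Steps 3-4: cumulative sums S_0 = 0, S_i = S_(i-1) + x_i - average."""
    average = sum(counts) / len(counts)
    sums = [0.0]
    for x in counts:
        sums.append(sums[-1] + x - average)
    return sums


def change_detected(counts, max_diff):
    """Step 5 (assumed reading): compare consecutive cumulative sums."""
    sums = cusum(counts)
    return any(abs(b - a) > max_diff for a, b in zip(sums, sums[1:]))


# Hypothetical packet counts per 10-second interval.
normal = [350] * 10                       # steady legitimate load
burst = [350] * 9 + [3000]                # sudden jump in the last interval
ramp = [350 + 10 * i for i in range(10)]  # slow, steady increase

print(change_detected(normal, 1000),
      change_detected(burst, 1000),
      change_detected(ramp, 1000))  # prints: False True False
```

With these made-up counts the sudden jump is flagged, whereas the slow ramp stays below the maximum difference, which is the drawback described above.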

2.2 Covariance model

The Covariance method used for identifying DDoS attacks is based on calculating the covariance between two parameters of arriving packets [3]. When an attack starts, the covariance value should differ from the value calculated during normal load. The covariance between two parameters in N measurements can be calculated by expression (1):

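$$\operatorname{cov}_{xy} = \frac{1}{N}\sum_{i=1}^{N} x_i y_i - \bar{x}\,\bar{y}, \qquad (1)$$

where: $N$ is the number of measurements; $x_i$ is the value of the first variable in the i-th measurement; $y_i$ is the value of the second variable in the i-th measurement, i ranging from 1 to N; $\bar{x}$ and $\bar{y}$ are the mean values of the two variables over the N measurements.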

This method is effective in detecting TCP SYN flooding attacks, because the number of packets with the SYN flag set and the number of packets with the FIN flag set diverge and are no longer related to each other. Tuning this method involves identifying the pairs of parameters to monitor and the normal values for the selected pairs. To be able to detect other types of attacks, the relations between various traffic parameters have to be identified. However, this is not always possible, which means that this method cannot be used to detect all kinds of flooding attacks.
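Expression (1) can be checked with a few lines of Python. The sketch below is only an illustration: the function name `covariance` and the SYN/FIN counts are hypothetical and were not measured with the system described in this paper.

```python
def covariance(xs, ys):
    """Covariance as in expression (1): mean of products minus product of means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y


# Hypothetical per-interval counts of packets with SYN and FIN flags set.
normal_syn, normal_fin = [30, 40, 35, 45], [29, 41, 34, 44]
attack_syn, attack_fin = [40, 400, 900, 2000], [40, 35, 30, 20]

print(covariance(normal_syn, normal_fin))  # positive: SYN and FIN rise together
print(covariance(attack_syn, attack_fin))  # negative: SYN grows while FIN does not
```

Under normal load the two counts move together and the covariance is positive; in the made-up flood the SYN count grows while the FIN count does not, and the sign flips.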

2.3 Passive Measurement based Heuristics

There is also a way to detect DDoS attacks by using heuristics [6]. This simple method counts the number of packets received during a predefined interval of time. Over some period of time the average and the maximum number of packets received are determined. If the number of packets received exceeds the predefined maximum, a flooding attack is detected.

The main advantage of this method is the short period of time needed to identify the attack. Quicker identification results in quicker reaction. However this method can produce a number of false positives if there are short high traffic peaks.
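The heuristic described above amounts to a single threshold comparison. The following Python sketch shows the idea under stated assumptions: the names `learn_baseline` and `is_flood` and the sample counts are hypothetical, and the learned maximum is used directly as the threshold.

```python
def learn_baseline(counts):
    """Return (average, maximum) of packet counts observed under normal load."""
    return sum(counts) / len(counts), max(counts)


def is_flood(count, max_allowed):
    """Flag an interval whose packet count exceeds the predefined maximum."""
    return count > max_allowed


# Hypothetical counts per 10-second interval during normal operation.
average, maximum = learn_baseline([390, 400, 407, 395])

print(is_flood(400, maximum))   # ordinary interval
print(is_flood(5000, maximum))  # flood
print(is_flood(450, maximum))   # short legitimate peak, flagged as well
```

The last call illustrates the false positive problem mentioned above: a short burst of legitimate traffic exceeds the threshold just like an attack does.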

3 Software for DDoS Identification Methods Analysis and Evaluation

In order to analyze and evaluate the Change Point Approach, Covariance Model and Passive Measurement based Heuristics methods, they were implemented and integrated into one testing system.

The testing system was programmed using the PCAP library interface and the C programming language. The structures representing protocol headers were taken from Linux include files, so for now the system can only be compiled on the Linux operating system. Each method was written as a separate module. The function of each module is called with every received packet passed as a parameter. After the evaluations are made, the internal variables of the system are updated and the decision is made whether to declare an attack or not. Figure 2 illustrates the main architecture of the implemented system.

Figure 2. The graphical view of the system


Every incoming packet is processed by the Linux kernel. After that the packet is passed to the appropriate user level processes. Once the packet is passed to the process of our system, it is evaluated against the LibPCAP filter to determine whether the packet should be processed further in order to detect an attack. Currently the filters are set to allow only packets arriving at TCP port 80 and all types of ICMP packets. Our system then checks its internal state as well as the parameters set by the user and decides which DDoS identification module should be called. The system can be set to evaluate each packet using all available methods to get more reliable results.

An attack can be registered either using the result of one of the methods or using the results of all three methods. Using the PCAP library is also helpful in testing, since it makes it much easier to set filters for incoming packets and to work with the particular packets that are needed for evaluation.
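The two ways of registering an attack can be pictured as a small decision function. This Python sketch is an assumption about how such a combination might look, not the logic of the C system itself; the name `attack_registered`, the `require_all` flag and the verdict dictionary are hypothetical.

```python
def attack_registered(verdicts, require_all=False):
    """Combine per-method verdicts (method name -> True if it signals an attack).

    require_all=False: any single method is enough to register an attack.
    require_all=True:  all methods must agree.
    """
    results = list(verdicts.values())
    if not results:
        return False
    return all(results) if require_all else any(results)


# Hypothetical verdicts for one evaluation round.
verdicts = {"change_point": True, "covariance": False, "passive_heuristics": True}

print(attack_registered(verdicts))                    # one method suffices
print(attack_registered(verdicts, require_all=True))  # all three must agree
```

Requiring agreement of all methods trades detection speed for fewer false alarms, while accepting any single verdict does the opposite.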

The main strength of the developed system is that it combines a number of different DDoS identification methods. Each method has its own strengths and weaknesses when working with a particular kind of attack. The system can be adjusted to use the most suitable attack identification method according to the type of attack being detected. The system is also easily extensible due to its modular structure.

4 Performance measurement

4.1 Experiment environment

The system is tested in a local area network using the BoNeSi DDoS botnet simulator [12]. Packets sent to the victim host are generated with random source IP addresses. The experimental environment is illustrated in Figure 3.

Figure 3. An experimental network topology

The server on which the implemented system was installed has a 1.8 GHz Intel Pentium 4 CPU and 512 MB of RAM. It runs the Linux 2.6.29.4 kernel. The iptables firewall is configured to allow all incoming and outgoing TCP traffic to and from the ports used in testing.

The system is first run for some time under normal load. Normal load is defined as 15 HTTP GET requests per second issued from 50 different source IP addresses during a one-minute interval. Values reflecting normal load are recorded for future use. All calculations are made using data collected in 10-second intervals. There are 60 intervals, so the system checks its state over the last 10 minutes and is able to detect an attack in less than 10 seconds. The nominal values are presented in Table 1.

Table 1. Parameters of normal traffic

(The last three columns are the DDoS identification methods.)

Traffic parameters | Value | Change Point Approach | Covariance model | Passive Measurement based Heuristics
http requests/incoming packets per second/number of clients | 15/40/50 | 2512.31 | > 0 | 407
Packet averages in 10 seconds interval | | 350 | SYN:350, FIN:350 | -

4.2 Performance Evaluation Metrics

Our system evaluates each method separately and all of them as one. Since all the methods used depend on preset values for normal traffic, we think there is no need to test the number of false positives or false negatives, because it would not reflect the actual environment where the system will be deployed. The methods will always give an alert if the preset values are exceeded or not matched. Instead we test the performance of each method by measuring CPU usage with different numbers of arriving packets. We also determine the values calculated by each method at those packet rates.

4.3 Experimental Results

The testing was started by identifying the maximum number of packets per second that the operating system can handle. The experiments showed that it can process 3000 requests per second while replying to every one of them. Sending more requests brings the operating system into a state where kernel processes and user processes are not given any CPU time. In such a case measuring the CPU usage of the system is meaningless.

The system was tested with 100, 500, 1000, 2000 and 3000 TCP SYN packets per second arriving at port 80. A TCP SYN flooding attack is an attack in which only SYN packets are sent and no replies from the victim are expected. The purpose of the attack is to run the target system out of resources. No service was listening on port 80, so the entire load was taken by the testing system. Each method was launched separately to identify its CPU usage. The system was also tested with all three methods activated. The results are provided in Figure 4.

Figure 4. CPU usage for each DDoS identification method

As can be seen in the chart, the CPU usage of the system when all three methods are active rises significantly under higher loads. However, it is slightly less than the sum of the usages of the three methods run separately. We can also see that the CPU usage of the Change-Point and Covariance methods grows faster as the number of packets increases, while the method based on Passive Heuristics maintains a constant relation to the number of packets being processed.

During the testing, the values calculated by the methods under each load were recorded. The values are presented in Table 2.

Table 2. Calculated values of Change Point and Passive Measurement based Heuristics methods

Method name | 100 packets/s | 500 packets/s | 1000 packets/s | 2000 packets/s | 3000 packets/s
Change point | 2110 | 10354 | 19811 | 37806 | 59757
Passive heuristics | 1000 | 5000 | 10000 | 20000 | 30000

The values of the Change Point method reflect the difference between the number of packets under normal load and the number of packets when the attack is launched. The values identified during the testing may be used to set the threshold value for detecting an increase in the number of packets. If we set the threshold to 10000, an attack is declared when the system receives more than 500 packets per second.

The values of the passive heuristics method depend directly on the values used in testing. Since we use an interval of 10 seconds in testing, the values we get are the number of packets per second multiplied by 10. As discussed in Section 2.3, this method is very fast, but the values are static and do not depend on any traffic context. The values in the table can also be used as thresholds to detect when the peaks of traffic are exceeded. Choosing a longer interval would produce coarser values and reduce the risk of false positives caused by very short bursts of incoming traffic.
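The relation between packet rate, interval length and the values in Table 2 can be reproduced with a short calculation. The Python sketch below copies the numbers from Table 2; the helper name `rates_exceeding` is hypothetical and serves only to show which tested rates exceed a chosen threshold.

```python
INTERVAL_SECONDS = 10  # interval length used in the experiments

rates = [100, 500, 1000, 2000, 3000]                  # packets per second
change_point = [2110, 10354, 19811, 37806, 59757]     # values from Table 2
passive = [rate * INTERVAL_SECONDS for rate in rates]  # packets per interval


def rates_exceeding(threshold, rates, values):
    """Return the tested rates whose calculated value exceeds the threshold."""
    return [rate for rate, value in zip(rates, values) if value > threshold]


print(passive)
print(rates_exceeding(10000, rates, change_point))
print(rates_exceeding(10000, rates, passive))
```

The computed passive heuristics values match the last row of Table 2, and the same threshold separates the tested rates at different points for the two methods.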


Values of the Covariance method are not provided, since in our testing only their sign (positive or negative) was of interest; because the testing was done with a SYN flooding attack, the value was always negative.

5 Conclusions

After running the tests we can see that all three methods can effectively detect flooding attacks. However, these methods are not intended to distinguish legitimate packets within the flow. They are also unable to track or log the IP addresses of the hosts participating in the attack.

The Covariance method is able to detect changes in the relations between selected traffic parameters. To get the most from this method, more observations of the relations between various traffic parameters should be made. Once the relations are identified, the method can be adapted to monitor more parameters, making it possible to detect various kinds of flooding attacks.

The Change-Point approach is able to detect the exact moment of an increase or decrease in traffic, which provides quick detection of an attack. When using this method, cases where the traffic drops below the nominal level must be recognized, because such a change is also registered and can lead to a false positive.

Using the Passive Heuristics method, attacks can be detected as soon as the last packet exceeding the threshold arrives. However, this may produce a number of false positives in cases of short peaks in traffic.

The CPU usage of every DDoS identification method is low when the number of requests does not exceed 1000 requests per second. It increases greatly when the number of requests per second is greater than 2000. The Change Point method shows the best CPU performance when the number of requests per second is in the interval from 700 to 2500. In other conditions the Passive Heuristics method is more effective.

The CPU usage of the system when all three methods are active rises significantly under higher loads. However, it is slightly less than the sum of the usages of the three methods run separately.

6 Future work

According to the tests performed, the system developed is still in need of improvement. First of all, the ability to distinguish legitimate traffic from malicious traffic during the attack is necessary. With such an ability, the system would be able to block malicious hosts when an attack is detected.

The system can also be extended by implementing other available DDoS detection methods and by improving the currently implemented ones.

References

[1] P. E. Ayres, H. Sun, H. J. Chao, W. C. Lau. ALPi: A DDoS Defense System for High-Speed Networks, IEEE Journal on Selected Areas in Communications, October 2006, Vol. 24, No. 10, pp. 1864-1876.

[2] K. Park, H. Lee. On the Effectiveness of Route-Based Packet Filtering for Distributed DDoS Attack in Power-Law Internets, SIGCOMM Comput. Commun. Rev., October 2001, Vol. 31, No. 4, pp. 15-26.

[3] S. Jin, D. S. Yeung. A Covariance Analysis Model for DDoS Attack Detection, IEEE International Communication Conference (ICC04), June 2004, Vol. 4, pp. 20-24.

[4] K. K. K. Wang, R. K. C. Chang. Engineering of a global defense infrastructure for DDoS attacks, Proceedings of IEEE International Conference on Networking: ICON2002, August 2002, pp. 419-427.

[5] Y. Xiang, Y. Lin, W.L. Lei, S.J. Huang. Detecting DDOS attack based on network self-similarity, Communications, IEE Proceedings, June 2004, Vol. 151, Issue: 3, pp. 292-295.

[6] C. Siaterlis, B. Maglaris. Detecting DDoS attacks with passive measurement based heuristics, Proceedings of the Ninth International Symposium on Computers and Communications 2004, 2004, Vol. 2, pp. 339-344.

[7] L. Feinstein, D. Schnackenberg, R. Balupari, D. Kindred. Statistical Approaches to DDoS Attack Detection and Response, Proceedings of DARPA Information Survivability Conference and Exposition, 2003, Vol. 1, pp. 303-314.

[8] Y. Chen, K. Hwang, W. S. Ku. Distributed Change-Point Detection of DDoS Attacks: Experimental Results on DETER Testbed, Proceedings of the DETER Community Workshop on Cyber Security Experimentation and Test 2007, August 2007, pp. 7-7.

[9] M. Lee, E. J. Kim, C. W. Lee. A Source Identification Scheme against DDoS Attacks in Cluster Interconnects, Proceedings of the 2004 International Conference on Parallel Processing Workshops, August 2004, pp. 354 – 361.

[10] A. Akella, A. Bharambe, M. Reiter, S. Seshan. Detecting DDoS Attacks on ISP Networks, ACM SIGMOD/PODS Workshop on management and processing of data streams (MPDS) FCRC, 2003, P. 3.

[11] R. K. C. Chang. Defending against Flooding-Based Distributed Denial-of-Service Attacks: A Tutorial, IEEE Communications Magazine, 2002, Vol. 40, No. 10, pp. 42-51.

[12] http://code.google.com/p/bonesi/


DYNAMIC RESPONSE ZONE ROUTING FOR MANET

Rimantas Plestys, Rokas Zakarevicius

Kaunas University of Technology, Department of Computer Networks, Studentu str. 50-416, Kaunas, Lithuania, [email protected], [email protected]

Abstract. A mobile Ad Hoc network (MANET) is made of mobile nodes connected to each other via wireless links. The nodes function both as routers and as host devices. Routing protocols are responsible for reestablishing network routes that break off due to topology changes. Any routing protocol generates the additional stream of control packets – the overhead. A response zone is a network space, where nodes send response control packets or forward request control packets further to the network. In this paper, the Dynamic Response Zone Routing algorithm is proposed, which operates by changing the response zone size according to the network structure in separate steps of routing process. Different cases of dynamic routing are simulated and results are presented in this paper.

Keywords: Ad Hoc networks, response zone, location-based routing, signal strength.

1 Introduction

A mobile Ad Hoc network (MANET) is made of mobile nodes capable of communicating with each other via wireless links. Ad Hoc network nodes function simultaneously as routers performing packet routing functions and as hosts sending and receiving data packets. The nodes can move freely or stay fixed within a limited area, and therefore the network may take different topologies. The applications of MANETs include corporate, home and personal area networking, sensor networks, emergency services, military communications, etc. Ad Hoc network routing protocols are responsible for reestablishing network routes that break off due to topology changes.

Any routing protocol generates the additional stream of control packets – the overhead, which temporarily reduces the network bandwidth. It is desirable to minimize the amount of control packets and at the same time to get routes reestablished as soon as possible.

Up to now, the term response zone has not been used in mobile Ad Hoc network research. A response zone is a space where nodes send response packets or forward route control packets further into the network, i.e. where the nodes react to the request packets they receive. A request zone is the space within the wireless signal transmission range of a node S; all the neighbour nodes inside this space therefore receive all the packets sent by S.

In this paper, the dynamic response zone routing algorithm for mobile Ad Hoc networks is proposed, which operates by dynamically changing the response zone in separate steps of the routing process in order to reduce the routing overhead.

2 Related work

Depending on their activity, routing protocols can be categorized as table driven (proactive) and on-demand (reactive) routing protocols [1]. Table driven (proactive) protocols maintain consistent routing information about each node in the network. This information is updated regularly and stored on each node. On-demand (reactive) routing protocols create routes only when desired by the source node. Ad-Hoc On Demand Distance Vector (AODV) [2] and Dynamic Source Routing (DSR) [3] are reactive routing protocols for Ad Hoc networks, operating on the on-demand basis [1], [2], [3]. They request a route only when needed and do not require nodes to maintain routes to destinations that are not currently communicating. On-demand routing protocols operate by flooding the network with route request (Rreq) packets in all directions. Such flooding-based routing protocols (AODV and DSR) generate a large network overhead, especially when the network is more dynamic and dense. A number of routing protocols have been proposed that seek to achieve efficient routing by decreasing the overhead of route discovery. Some of these routing protocols use network node location information to find the route, e.g. LAR [4], NB-GEDIR [5], GPSR [6]. The signal strength is also used in some routing protocol proposals [9], [10].

Location-based routing protocols use node location information to reduce the routing overhead of a network [4], [5], [6], [7]. The assumption is made that each node knows the current locations of all other network nodes. The Location-Aided Routing-1 (LAR-1) [4] algorithm operates by flooding a fixed rectangular response zone with route search packets. The nodes inside the response zone forward the Rreq packets to other nodes, while nodes outside the response zone ignore the Rreq packets. In the LAR-2 [4] case the response zone contains only the nodes that are closer to the destination node D than the node from which they received the route request packet. Each node in the request zone has to calculate the distance to the destination D to detect if it is inside the response zone. Another location-based routing protocol, No-Beacon Geographic Distance (NB-GEDIR) [5], also operates by calculating distances to the destination D. The source node S or some intermediate node requests location information from neighbour nodes. The nodes in the request zone respond by sending their coordinates to the requesting node. After receiving the location reply packets, the source (or intermediate) node determines its next-hop node as the one with the minimum distance to the destination D. The Greedy On-demand routing using location information (GOLI) [7] protocol operates in a similar way to NB-GEDIR, but it also complements the route discovery process by defining a certain threshold within the radio range. The threshold helps to avoid selecting a next-hop node that is very close to the boundary of the radio range, in order to find a more stable route to the destination.

The routing overhead can be reduced even further by limiting the response zone during the routing process [8]. The Limited Response Zone Routing (LRZR) [8] algorithm is based on the NB-GEDIR protocol and operates by applying the limit radius r inside the request zone radius R in order to reduce the response zone and thus decrease the routing overhead. The radius r is inserted into the location request packet, and neighbour nodes check whether they are in the response zone by using distance calculations. That is, location reply packets are sent only by the nodes that are farther from the source node S than the distance r.
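The LRZR membership check described above reduces to one distance comparison. The Python sketch below is an illustration under assumptions: the function name `in_lrzr_response_zone`, planar coordinates in metres and the sample radii are hypothetical and are not taken from [8].

```python
import math


def in_lrzr_response_zone(node, source, r, R):
    """True if the node is inside the request zone (distance <= R)
    but farther from the source than the limit radius r."""
    distance = math.dist(node, source)
    return r < distance <= R


source = (0.0, 0.0)
r, R = 60.0, 100.0  # hypothetical limit radius and request zone radius

print(in_lrzr_response_zone((30.0, 40.0), source, r, R))   # 50 m from S: too close
print(in_lrzr_response_zone((60.0, 80.0), source, r, R))   # 100 m from S: replies
print(in_lrzr_response_zone((90.0, 120.0), source, r, R))  # 150 m from S: out of range
```

Only the nodes in the ring between r and R reply, which is how the response zone, and with it the number of location reply packets, is reduced.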

The Signal Stability-based Adaptive Routing (SSA) protocol [9] uses signal strength and stability of individual hosts as route selection criteria. Every node maintains a signal stability table by recording the signal strength values of its neighbour nodes and classifying the records as strongly connected or weakly connected. The signal strength is retrieved from link layer beacons that are sent once every time quantum. The strongly connected nodes also have entries in the routing table, which is dynamically updated by route search packets. Historical signal strength has been added as a factor to the location-based Beacon-Less Routing Algorithm (BLR), which helps to avoid routing into sparse areas [10]. The signal strength is obtained from neighbouring nodes within the latest historical period and is considered a measure of the neighbouring density. It is assumed that the stronger the neighbouring signal strength, the denser the area the node is likely to be in. As the BLR routing algorithm is efficient mostly in dense areas, the use of signal strength allows it to choose the route direction towards a denser rather than a sparser area. Apart from the sources mentioned, network node signal strength is not used as much in Ad Hoc network routing as other techniques, such as location-based or flooding-based routing. Therefore, there is a lot of room for research in this field.

3 Dynamic Response Zone Routing (DRZR) algorithm

As mentioned in the introduction of this paper, a request zone is the space within the wireless signal transmission range of a node. When the network nodes are distributed in an open space without surrounding obstacles and every Ad Hoc network node has the same transmitter power, receiver sensitivity and antennas, the request zone of a node S is a circle around the node with the radius Rs (Figure 1a). In such an ideal case the request zone radius R is equal for every node. In Figure 1a, the signal strength of the node S decreases evenly in all directions as the distance increases, and $x_1 > x_2 > S'_{min}$, where $x_1$ and $x_2$ are the received signal strength values of the node S (or D) at the corresponding points in the request zone. $S'_{min}$ is the marginal signal strength corresponding to the request zone, i.e. it is the lowest signal strength possible for successful radio communication. However, there are often cases when terrain, transceiver and antenna diversity have a significant impact on the actual request zone of a particular network node. Therefore, the request zone can become irregularly shaped, and even a short distance between nodes may lower the signal strength significantly (Figure 1b).

Figure 1. Request zones: circular (a) and irregularly shaped (b).


The mobile Ad Hoc network node density can be heterogeneous. When route request packets are being sent into the sparse network area, the use of fixed response zone limitations for location-based routing (as in [8]) can lead to failed route search as there can be no nodes sending back reply packets due to low density. Therefore, the routing algorithm has to change the response zone dynamically in order to adapt to network density changes.

Most location-based routing algorithms use only geographical distances among the network nodes during the route search process. Request zones can become irregularly shaped in networks with surrounding obstacles. Therefore, routing towards the shortest distance to a destination is not always the best choice, as it may lead to choosing a next-hop node with a weak signal level. The received signal strength values could be used to avoid routing to network nodes with low quality radio links.

In order to implement all these features in routing, the Dynamic Response Zone Routing (DRZR) algorithm for mobile Ad Hoc networks (MANETs) has been developed. It is a location-based routing algorithm which operates by dynamically changing the response zone in different steps of the routing process. The request zone nodes decide whether or not they are in the response zone by evaluating the received signal strength levels of network nodes.

The node location information is needed for DRZR operation. Initially it is assumed that the source S knows the geographic location of the destination node D, and each network node knows its own location. The techniques of receiving and distributing the location information are not discussed in this paper, because they are beyond the scope of this research.

In DRZR the response zone is set by applying the signal strength range $[S_k; s_k]$, where $S'_{k\_min} \le S_k < s_k$ and k is the current network node. The signal strength s limits the response zone from the inside and S from the outside (Figure 2 and Figure 3). The signal strength values S and s are calculated by the sending node according to the Free Space loss model.
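How a sending node might derive the two limits from distances can be sketched with the standard free-space path loss formula, FSPL(dB) = 20 log10(4πdf/c). The Python code below is an illustration under assumptions not stated in this paper: isotropic antennas with unit gain, a 2.4 GHz carrier, a 15 dBm transmit power, and hypothetical function names.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def free_space_path_loss_db(distance_m, frequency_hz):
    """Free-space path loss in dB for unit-gain antennas."""
    return 20 * math.log10(4 * math.pi * distance_m * frequency_hz / SPEED_OF_LIGHT)


def received_power_dbm(tx_power_dbm, distance_m, frequency_hz):
    """Signal strength expected at the given distance from the sender."""
    return tx_power_dbm - free_space_path_loss_db(distance_m, frequency_hz)


def signal_range_dbm(tx_power_dbm, d_inner_m, d_outer_m, frequency_hz):
    """Return (S, s): the outer (weaker) and inner (stronger) limits."""
    s = received_power_dbm(tx_power_dbm, d_inner_m, frequency_hz)
    S = received_power_dbm(tx_power_dbm, d_outer_m, frequency_hz)
    return S, s


# Hypothetical values: 15 dBm sender, response zone between 50 m and 100 m.
S, s = signal_range_dbm(15.0, 50.0, 100.0, 2.4e9)
print(round(S, 2), round(s, 2))
```

Under this model doubling the distance lowers the received signal strength by about 6 dB, so the inner limit s is always the larger of the two values.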

The DRZR algorithm operates according to the steps below:

a) The source node S broadcasts the location request (Lreq) packet to all neighbour nodes within its wireless signal transmission range. The request contains the following data: $(x_S; y_S)$, $(x_D; y_D)$, $[S_S; s_S]$, where $(x_S; y_S)$ and $(x_D; y_D)$ are the locations of the source S and the destination D.

b) On receipt of the Lreq packet, every node M in the request zone checks whether it is in the response zone by measuring the received signal strength $p_M$ of the sender S and comparing it with the signal strength range $[S_S; s_S]$. If $\Delta(M, D) \le \Delta(S, D)$ and $S_S \le p_M \le s_S$, where $\Delta$ denotes the distance between two nodes, the node transmits its location information $(x_M; y_M)$ to the source S in a location reply (Lrep) packet.

c) After the receipt of Lrep packets the source S chooses the node $M_i$ that is closest to the destination D, i.e. $\Delta(M_i, D) = \min \Delta(M, D)$, and sends a route request (Rreq) packet to the node $M_i$. Now the node $M_i$ becomes the next-hop node and the route search process is repeated. The node $M_i$ is added to the routing table as the next-hop node for the destination D. If there are no nodes sending reply packets, the signal strength limit value $s_S$ is increased and set to some new value $s_S = s_n$, where $s_n > s_{n-1}$. Then the actions from step a) are repeated.

d) There are a number of different schemes for changing the $s_S$ and $S_S$ values. Usually $s_S$ is changed and $S_S$ is fixed. For example, the scheme $s_S = \{(s_1) \to (s_1 + 6) \to (s_1 + 12) \to (P_S)\}$ works by changing the $s_S$ value in three smaller steps, whereas the scheme $s_S = \{(s_1) \to (P_S)\}$ has only one big step. In all cases of increasing $s_S$ values, the algorithm finally transforms into NB-GEDIR, where $s_S = P_S$ and $S_S = S'_{S\_min}$. $P_S$ is the signal transmitting power of the node S. However, it is desirable to keep $S_S$ higher than $S'_{S\_min}$ as long as possible to avoid selecting a next-hop node with low signal strength, as signal receiving errors can occur in such a case. If there are still no nodes sending reply packets, the route search process is aborted and the path is considered not found.

e) When the Rreq packet is received by the destination D, the path is considered found. As the Rreq packet has travelled all the way through a number of intermediate nodes, this chain of nodes is considered the shortest route from the source S to the destination D.

There are cases when routes are not found by the DRZR algorithm even though they exist in the network. This happens when there is no node Mi such that $\Delta(M_i, D) \le \Delta(S, D)$, even though some nodes lying in the opposite direction could be used to build a route. In such cases, DRZR aborts the route search process and reports that the route was not found. Here flooding-based protocols (AODV or DSR) should be used, because they always find a route if one exists in the network, although flooding the network with a large number of route request packets can reduce the quality of some network services.
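Steps b) and c) can be illustrated with a minimal Python sketch. It is not the authors' MatLab simulator: the function names, the dictionary layout of `neighbours` and the use of Euclidean distance for Δ are assumptions made only for illustration.

```python
import math

def distance(a, b):
    """Euclidean distance between two (x, y) positions (assumed form of Delta)."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def in_response_zone(node_pos, rx_power, src_pos, dst_pos, S_low, s_high):
    """Step b): node M replies only if it is no farther from D than S is
    and its received signal strength lies inside [S_low, s_high]."""
    closer = distance(node_pos, dst_pos) <= distance(src_pos, dst_pos)
    return closer and S_low <= rx_power <= s_high

def choose_next_hop(neighbours, src_pos, dst_pos, S_low, s_high):
    """Step c): among the replying neighbours pick the one closest to D.
    `neighbours` maps a node name to (position, received power in dBm).
    Returns None when nobody replies, so the caller can raise s_high."""
    replies = {
        name: pos
        for name, (pos, power) in neighbours.items()
        if in_response_zone(pos, power, src_pos, dst_pos, S_low, s_high)
    }
    if not replies:
        return None
    return min(replies, key=lambda name: distance(replies[name], dst_pos))
```

A caller that receives `None` would raise the limit according to the chosen scheme and repeat step a), or abort the search once the limit has reached the transmitting power.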


Figure 2. DRZR algorithm. Response zone is the shaded area. Request zone is circle-shaped.

In Figure 2, ρ1, ρ2, ρ3 are the distances from the nodes to the destination D, i.e. the radii of the arcs that limit the response zone from inside the request zone. The node Mj will not send its location information since it is outside the zone, i.e. $p_{M_j} > s_S$. The nodes Ma and Mb will not send their location information, because $\Delta(M_a, D) > \Delta(S, D)$ and $\Delta(M_b, D) > \Delta(S, D)$. The node Mk will not send its location information since it is outside the signal threshold $S_S$, i.e. $p_{M_k} < S_S$. The nodes Mi and Mf will reply with Lrep packets, because $\Delta(M_i, D) \le \Delta(S, D)$ and $S_S < p_{M_i} < s_S$, and $\Delta(M_f, D) \le \Delta(S, D)$ and $S_S < p_{M_f} < s_S$. The node Mi is elected to be the next-hop node, because it is closest to the destination D, i.e. $\Delta(M_i, D) = \min \Delta(M, D)$.

Figure 3. DRZR algorithm. Response zone is the shaded area. Request zone in bold is irregularly-shaped.

Due to surrounding obstacles, request zones are irregularly-shaped (Figure 3). The node Mf will not send its location information since the signal strength of the node S received at the node Mf is lower than the threshold specified in the Lreq packet, i.e. $p_{M_f} < S_S$. If distance were used to set the response zone instead of the signal strength $S_S$ (as in the LRZR algorithm [8]), the node Mf would fall into the response zone even with a weak signal level. Thus, the limit $s_S$ and the use of signal strength for response zone reduction help to avoid routing to network nodes with low quality radio links. The node Mi will reply with an Lrep packet, because $\Delta(M_i, D) \le \Delta(S, D)$ and $S_S < p_{M_i} < s_S$. It is elected to be the next-hop node, because it is the only node that replied to the location request query.


4 Routing protocol simulations

The simulation programs were written in MatLab to implement the newly proposed DRZR (Dynamic Response Zone Routing) algorithm. The NB-GEDIR routing algorithm was also implemented in the simulator, following its authors [5]. The purpose of the research was to simulate the routing algorithms on the Ad Hoc network model in order to analyze the routing overhead and the algorithm dynamics in separate steps of the routing process.

4.1 The network model

A universal network model is required in order to evaluate Ad Hoc network operation and simulate the performance of network routing protocols. A rectangular grid structure was chosen, which can be described by the matrix $T = (t_{ij})$, where $i = \overline{1, m}$, $j = \overline{1, n}$. Network nodes can be in two states: “On” ($t_{ij} = 1$) or “Off” ($t_{ij} = 0$). The “Off” state can also indicate the absence of the node in the network. Two main network model types should be emphasized: the regular grid structure and the shaped network structure.

The regular grid network can be described by the matrix $T = (t_{ij} = 1)$, where $i = \overline{1, m}$, $j = \overline{1, n}$. Each matrix element corresponds to a network node, and the distances between adjacent nodes in perpendicular directions are equal. The network nodes are fixed. The request zone radius R and the distance d between the network nodes are set before the simulation.

In the shaped network case, some nodes can be in the “Off” state, i.e. $t_{ij} = 0$. Therefore, the network model can imitate different network topologies: circle, line, scattered network etc. Simple shapes can be built from the regular grid network by setting appropriate nodes to the “Off” state. In the scattered network case, such a network structure can be created by using uniformly distributed pseudorandom numbers. The elements $a_{ij}$ of the matrix A are pseudorandom numbers from the range [0; 1]. Suppose the number of elements of the matrix A is $N = m \cdot n$, and Z is the approximate desired number of network nodes. Then the threshold value h = Z/N is set, and the network matrix T is built using formula (1).

$$t_{ij} = \begin{cases} 1, & \text{when } a_{ij} < h \\ 0, & \text{when } a_{ij} \ge h \end{cases} \qquad (1)$$

The network node exists in the network when $t_{ij} = 1$, and does not exist when $t_{ij} = 0$. Network mobility can be simulated by changing the coordinates of each network node in every step of the routing process.
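As a rough illustration of formula (1), the scattered network matrix can be generated as in the following Python sketch. The original simulator was written in MatLab; the function names and the `seed` parameter here are assumptions introduced for the example.

```python
import random

def scattered_network(m, n, z, seed=None):
    """Build an m x n matrix T with t_ij = 1 when a_ij < h, else 0,
    where a_ij is uniform on [0, 1) and h = Z / N with N = m * n."""
    rng = random.Random(seed)
    h = z / (m * n)
    return [[1 if rng.random() < h else 0 for _ in range(n)]
            for _ in range(m)]

def node_count(t):
    """Number of nodes in the "On" state."""
    return sum(sum(row) for row in t)
```

Because each element is switched on independently with probability h, `node_count` only approximates Z, which matches the description of Z as the approximate desired number of nodes.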

4.2 The simulations of routing dynamics

In order to analyze and illustrate the performance of the DRZR algorithm, simulations were run on a randomly generated network. This network model imitates the network with a circle-shaped response zone. The signal strength threshold value was set to the minimum, i.e. $S = S'_{min}$ dBm, so only the limiting signal strength s (dB) was dynamically changed during the simulations. S and s are calculated by the sending node: $s = P - L(d_1)$ and $S = P - L(d_2)$. P is the transmitting signal power in dBm. $L(d_n)$ is the path loss (in dB) for a particular distance $d_n$ in free space, calculated using formula (2), where d is the distance in kilometres, f is the radio frequency in MHz, and the constant A = 32.45.

$$L(\mathrm{dB}) = A + 20 \log_{10}(d) + 20 \log_{10}(f) \qquad (2)$$
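Formula (2) and the relation between transmitting power and signal level can be checked numerically with a short Python sketch. The function names are hypothetical, and the 2400 MHz frequency used in the usage note is an assumed value, since the paper does not state the simulated frequency.

```python
import math

A = 32.45  # constant for d in kilometres and f in MHz

def free_space_loss_db(d_km, f_mhz):
    """Free space path loss L in dB, following formula (2)."""
    return A + 20 * math.log10(d_km) + 20 * math.log10(f_mhz)

def received_level_dbm(p_dbm, d_km, f_mhz):
    """Signal level at distance d for transmitting power P: P - L(d)."""
    return p_dbm - free_space_loss_db(d_km, f_mhz)
```

For example, `received_level_dbm(0, 0.5, 2400)` gives the level s that corresponds to a distance of 500 m under these assumed values. Doubling the distance adds about 6 dB of loss, so a 6 dB change in the limit corresponds to a factor of two in distance under the free space model.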

The purpose of this simulation is to analyze the number of control packets (the overhead) generated in the network by the operation of the DRZR algorithm, depending on the scheme used for maintaining the response zone size. Three DRZR algorithm operation schemes as well as the NB-GEDIR algorithm were simulated in two network cases, R=500m and R=300m, on the same randomly generated network with approximately 400 network nodes. If the radio signal transmission range is taken as the reference, the size of the request zone corresponds to the density of the network: a larger request zone contains more neighbour nodes. Therefore, the network with R=500m is denser than the one with R=300m. The source node S was in the middle of the network matrix, and the destination node D was at the upper right corner.

The simulation results for R=500m are presented in Figure 4, and those for R=300m in Figure 5. The diagrams show the cumulative number of packets generated during a single route search process. As seen in the figures, there are three schemes of DRZR algorithm operation. $s_1$ is the initial signal strength value, used to reduce the response zone from inside. The DRZR $s = \{(s_1) \to (P)\}$ and DRZR $s = \{(s_1) \to (s_1 + 6) \to (s_1 + 12) \to (P)\}$ schemes can change the response zone size in one direction only: the signal strength limit s is raised to the next level of the scheme (e.g. by +6 dB) when no Lrep packets are received, so the response zone grows. However, the DRZR $s = \{(s_1) \leftrightarrow (s_1 + 6) \leftrightarrow (s_1 + 12) \leftrightarrow (P)\}$ scheme changes the response zone size in two directions: s is increased when no Lrep packets are received, and decreased again after the successful receipt of any Lrep packet. Therefore, the algorithm seeks to maintain the optimal size of the response zone by decreasing the s value after it has been increased in the previous step of the routing process. The DRZR algorithm schemes differ in the number of steps the algorithm performs until it finally transforms into the NB-GEDIR algorithm, where $s = P$ and $S = S'_{min}$.
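The difference between the one-directional and the bidirectional schemes can be sketched as a small state machine. This is one possible reading of the description above, not the authors' implementation; the class and function names, and the example values of -70 dBm for $s_1$ and 0 dBm for P, are assumptions.

```python
class ResponseZoneSchedule:
    """Steps the limit s through a fixed list of levels (in dBm)."""

    def __init__(self, levels, bidirectional=False):
        self.levels = list(levels)
        self.bidirectional = bidirectional
        self.index = 0

    @property
    def s(self):
        """Current value of the signal strength limit."""
        return self.levels[self.index]

    def on_failure(self):
        """No Lrep received: widen the zone. Returns False when the last
        level (s = P) has already been tried, i.e. the search is aborted."""
        if self.index == len(self.levels) - 1:
            return False
        self.index += 1
        return True

    def on_success(self):
        """Lrep received: only a bidirectional scheme shrinks the zone again."""
        if self.bidirectional and self.index > 0:
            self.index -= 1

def make_levels(s1, p, steps_db=(0, 6, 12)):
    """Levels s1, s1 + 6, s1 + 12 followed by the transmitting power P."""
    return [s1 + step for step in steps_db] + [p]
```

With `make_levels(-70, 0)` the schedule visits -70, -64, -58 and 0 dBm; passing `steps_db=(0,)` gives the single-step scheme. The bidirectional variant steps back one level after each successful reply, which is what keeps the response zone small at the price of extra Lreq retransmissions.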

[Plot: number of packets (vertical axis, 0 to 70) against steps of the routing process (horizontal axis, 0 to 24). Curves: NB-GEDIR (s=P, S=S'); DRZR $s = \{(s_1) \to (P)\}$; DRZR $s = \{(s_1) \to (s_1 + 6) \to (s_1 + 12) \to (P)\}$; DRZR $s = \{(s_1) \leftrightarrow (s_1 + 6) \leftrightarrow (s_1 + 12) \leftrightarrow (P)\}$.]

Figure 4. The simulation results of the DRZR algorithm, when the request zone R=500.

The routing overhead of the DRZR algorithm was in all cases smaller than that of the NB-GEDIR algorithm. However, the route search process usually takes longer, because the location requests are re-sent when the response zone size is changed. The scheme DRZR $s = \{(s_1) \to (s_1 + 6) \to (s_1 + 12) \to (P)\}$ creates lower overhead than the scheme DRZR $s = \{(s_1) \to (P)\}$, so it is better to increase the size of the response zone gradually than to remove the response zone limitations on the first failure to get an Lrep packet from neighbour network nodes. The DRZR algorithm with the operation scheme $s = \{(s_1) \leftrightarrow (s_1 + 6) \leftrightarrow (s_1 + 12) \leftrightarrow (P)\}$ performed best in this simulation, as the response zone was kept as small as possible during all the steps of the route search process. However, its route search takes the longest time of all the simulated cases. This is because frequent switching between smaller and bigger response zones requires more route search steps, as Lreq packets are re-sent after each change.

[Plot: number of packets (vertical axis, 0 to 70) against steps of the routing process (horizontal axis, 0 to 50). Curves: NB-GEDIR (s=P, S=S'); DRZR $s = \{(s_1) \to (P)\}$; DRZR $s = \{(s_1) \to (s_1 + 6) \to (s_1 + 12) \to (P)\}$; DRZR $s = \{(s_1) \leftrightarrow (s_1 + 6) \leftrightarrow (s_1 + 12) \leftrightarrow (P)\}$.]

Figure 5. The simulation results of the DRZR algorithm, when the request zone R=300.


It can be clearly seen from Figure 4 and Figure 5 that the efficiency of the dynamic response zone routing algorithm depends on the request zone size, i.e. the network density. The denser the network, the greater the reduction in overhead that can be achieved by changing the response zone forwards and backwards in every step of the route search process, compared with other routing protocols.

5 Conclusions

A stream of routing control packets is generated in the network during the route search process. It is desirable to minimize the number of control packets and at the same time to get routes reestablished as soon as possible. Much research is devoted to achieving efficient routing by decreasing the overhead of route discovery. Some of these routing protocols use network node location information to find the route, but the response zone is fixed during a route search.

In this paper, the location-based Dynamic Response Zone Routing (DRZR) algorithm for mobile Ad Hoc networks is proposed, which operates by dynamically changing the response zone in separate steps of the routing process. As mobile Ad Hoc networks can be affected by surrounding obstacles or other factors, it is not always effective to rely only on geographical distances among the network nodes. Therefore, the signal strength is used in DRZR to reduce response zones. The signal strength values are calculated according to the free space loss model. Each network node decides whether it is in a response zone by evaluating the received signal strength of the requesting node.

The software simulations were made on the randomly generated network model in order to analyze and illustrate the performance of the DRZR algorithm, depending on the scheme used for maintaining the response zone size. They indicate that using the DRZR algorithm to adapt the response zone size in every step of the routing process results in lowest routing overhead. Adaptation is achieved by increasing or decreasing the response zone, depending on success or failure of getting the location information. However, the route search delay may increase, because of additional re-sending of location request packets after each change of a response zone. The efficiency of the DRZR algorithm depends on network density, and it is higher in denser networks.

References

[1] Jayakumar G., Gopinath G., Ad Hoc Wireless Networks Routing Protocols – A Review, Journal of Computer Science 3 (8): 2007, pages 574-582.

[2] Perkins C. E., Royer E. M., Ad-Hoc On-Demand Distance Vector Routing, Mobile Computing Systems and Applications, 1999. Proceedings. WMCSA '99. Second IEEE Workshop, Feb. 1999, pages 90-100.

[3] Johnson D. B., Maltz D. B., Dynamic Source Routing in Ad Hoc Wireless Networks, Mobile Computing, T. Imielinski and H. Korth, Eds. Kluwer Academic Publishers, 1996, ch. 5, pages 153-181.

[4] Ko Y-B., Vaidya N.H., Location-Aided Routing (LAR) in Mobile Ad Hoc Networks, Conference Proceedings, Mobile Computing MOBICOM, 1998, pages 66-75.

[5] Watanabe M., Higaki H., No-Beacon GEDIR: Location-Based Ad Hoc Routing with Less Communication Overhead, ITNG '07. Fourth International Conference on Information Technology, 2007, pages 48-55.

[6] Karp B., Kung H.T., GPSR: Greedy Perimeter Stateless Routing for Wireless Networks, Proceedings of the Sixth Annual ACM/IEEE International Conference on Mobile Computing and Networking (MobiCom 2000), August, 2000, pages 243-254

[7] Nakagawa, H. Ishida, K. Ohta, T. Kakuda, Y., GOLI: Greedy On-Demand Routing Scheme Using Location Information for Mobile Ad Hoc Networks, 26th IEEE International Conference on Distributed Computing Systems Workshops (ICDCSW’06)., 2006.

[8] Plestys R., Zakarevicius R., Variable Response Zone Routing for Ad Hoc Networks, Information Technologies 2009: proceedings of the 15th International Conference on Information and Software Technologies, KTU, Kaunas, Lithuania, 2009, pages 158-164.

[9] Dube R., Rais C. D., Wang K., Tripathi S. K., Signal Stability based adaptive routing (SSA) for ad hoc mobile networks, IEEE Personal Communication, Feb. 1997

[10] Chen G., Itoh K., Sato T., Beaconless Location-Based Routing with Signal Strength Assisted for Ad-Hoc Network, IEEE VTC 2007 fall, Baltimore. USA, Oct.2007.


INTERNET CONTENT SEARCH USING FPGA’S

Inan Aydin1, Ali Ziya Alkar2

1Karel A.Ş., Ankara, Turkey, [email protected] 2Hacettepe University, Department of Electrical Eng, Beytepe, Ankara, Turkey, [email protected]

Abstract. Network Intrusion Prevention Systems (IPS) are used in computer communication to block and deter intrusions by taking the precautions needed to make networks more secure. By analyzing information such as the IP address, protocol, port number and payload contained in the packets, systems can be defended against network attacks. The IPS workload peaks when hundreds of packets have to be evaluated at high network speed. In this research, the Aho Corasick (AC) algorithm is chosen as the character matching algorithm. In order to keep up with high network speeds, the matching algorithm is designed to work in parallel in hardware. All variables used in the system are loaded into the hardware through an interface software program, without the need for a hard coded program. The design is highly reconfigurable and is placed in a low cost Field Programmable Gate Array (FPGA).

Keywords: Computer Network Security, IDS, IPS, FPGA, Firewalls, TCP/IP.

1 Introduction

With the growing use of the internet, the variety of applications that use it as a medium and the number of connected systems have increased almost exponentially. These systems are under constant threat from malicious programs. The attacks range from simple SMTP e-mail worms to Trojan programs embedded in seemingly innocent executable files or even screensavers. Some cause the loss of invaluable data on the computer, or hijack the computer for use in other attacks [22][7].

As the ADSL and cable modem usage has increased throughout the years, the extended bandwidth demand is doubling every 100 days [11]. Due to this, intrusion detection and counter attack and defense studies have moved from the end user to the network connection points and from the side packet forwarders to the central packet forwarders.

Intrusion Detection Systems (IDS) are quite promising for the protection of systems on the network and the internet. The functionality of an IDS can be likened to a sophisticated alarm system built for homes and offices. Just as an alarm system reacts to anomalies such as motion, noise or a break-in, the IDS constantly monitors the contents of the packets, looks for attack signatures and, upon detection, takes the necessary measures against the offending actions. Searching every packet in its entirety is a major step towards protecting computer networks, at the cost of reduced network performance. With an IDS the network traffic can be monitored in real time; an attack can be neutralized proactively, logged and added to the signature database, and the system administrators can be warned about the attack so that they can act immediately if necessary.

Firewalls placed at the network entry/exit points simply check the headers of the packets. However, attacks can be generated with conforming header information that deceives such a system. IDSs check the payloads as well as the header information. This enables them to prevent attacks that can penetrate classical firewall systems. By inspecting both the header and the data payload and matching them against signatures already in the database, IDSs provide better protection from network threats.

In order for the IDS to spot anomalies, some criteria need to be defined and introduced to the system. These criteria include the types of packets to be treated, the character set included in the payload and the location of the payload within the packet. The actions to be taken when a packet matches all the defined criteria are defined separately.

IDS systems can be ported to a variety of operating systems. Client type IDSs can monitor the incoming network traffic for an attack. Initially IDSs were placed at certain locations along the network for diagnostic purposes. Once they became popular, they were installed in central and side packet managers. An IPS (Intrusion Prevention System) has all the features of a good IDS, but can also stop malicious traffic from invading the enterprise. An IPS sits inline with traffic flows on a network, actively shutting down attempted attacks as they are sent over the wire. In other words, an IPS is proactive compared to IDS systems. In our implementation, we designed an IDS with a proactive defence system built into it.

As network speeds and the diversity of attacks have increased over time, the algorithms also need to become more effective. Software methods by their nature may not be as effective as dedicated hardware when performance is the criterion. Client-side IDS performance is measured in terms of the general success rate, whereas packet managers are evaluated with respect to their worst case success rate [6]. This success rate can be increased considerably by using effective algorithms and running them in parallel [17] where possible. Hardware methods are well suited for parallel execution of string matching algorithms [15]. The use of FPGAs is increasing as well due to their fast reconfigurability and price [2][8-13][16][18].

In the next section, string matching algorithms, and in particular our choice of algorithm, are described. In Section 3 Snort is introduced. In Section 4 our application platform is explained. In the final two sections tests and conclusions are presented.

2 String matching algorithms

At the core of a modern IPS there is a string matching algorithm running in its intrusion detection engine. The contents of the packets contain the “fingerprints” of an attack. The algorithm should therefore be capable of sustaining very high network speeds while checking packets against a number of different parameters. This makes the string matching algorithm the bottleneck of intrusion detection [5].

The importance of string matching algorithms can be shown by analyzing Snort, a freely available and frequently cited IDS software solution. Snort maintains a large database of known anomalies and of rules derived from other suspicious network activities. These rules are written by experts who analyze the packets in their entirety, including the payloads.

Figure 1. The snort rule increase through the years.

A rule states, for example, that when a string is matched in a given field an action is taken. These actions may include logging the packet, informing the system administrator, ignoring the whole packet or activating other rules. There are several commercial IPSs on the market that use the rules defined by Snort. The growth in the number of Snort rules from 1999 to the publication date is illustrated in Figure 1. The growth is quite dramatic, reaching about 16000 rules today.

Figure 2. Byte lengths histogram in the rules [21]


The byte lengths used in matching are shown in Figure 2. The 4 byte point corresponds to the rules that depend on the IP address. Byte lengths for most of the rules vary from 5 to 20, but some rules exceed 50 bytes. Therefore the operation time of the string matching algorithm needs to be independent of the byte length for consistent performance.

The following can be concluded from Figures 1 and 2. String matching is reported to account for 70% of the total Snort workload [23]. Linear search techniques, whose operation time grows with the number and length of the rules, are therefore of little use. A search algorithm needs to sustain a constant rate for stable performance. The Aho Corasick algorithm can meet this demand [1][21]; in addition, its structure is inherently parallel, which suits a hardware implementation.

Aho Corasick works by forming states from each character of a string. During execution each incoming character is compared against the transitions of the current state, and on a match a state transition occurs. When the final state of a keyword is reached, the keyword match is complete. A typical example of Aho Corasick algorithm execution is illustrated in Figure 3.

Figure 3. Aho-Corasick State Table
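The state construction and the single pass over the input can be sketched in software as follows. This is a textbook formulation of the Aho Corasick automaton in Python, given only to illustrate the idea; it is not the hardware design described in this paper, and the function names are chosen for the example.

```python
from collections import deque

def build_automaton(keywords):
    """Return (goto, fail, out): trie transitions, failure links and the
    keywords recognised on entering each state. State 0 is the root."""
    goto, fail, out = [{}], [0], [set()]
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(word)
    queue = deque(goto[0].values())  # depth-1 states fail to the root
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]  # inherit matches ending here
    return goto, fail, out

def search(text, automaton):
    """Scan text once; return sorted (end_index, keyword) pairs."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits.extend((i, word) for word in out[state])
    return sorted(hits)
```

Each input character is consumed exactly once regardless of how many keywords are loaded, which is the constant-rate property that motivates the choice of this algorithm.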

3 Snort Rule Structure

As mentioned in the introduction, Snort consists of a database of rules that describe the attacks to be detected. The basic Snort rule structure is as follows:

alert tcp $EXTERNAL_NET any -> 212.148.21.4/24 80 (msg:"Sample alert";)

The header of the rule has the form “alert tcp any any -> any any” and contains the following information:

• Action (log, alert): Specifies the action to be taken when a suspected packet is spotted.

• Protocol (ip, tcp, udp, icmp, any): Defines the packet protocol to which the rule applies.

• Src IP & Port: Specifies the source of the packet. This field may consist of variables such as ($HOME_NET), individual IP addresses, CIDR (Classless Inter-Domain Routing) blocks or a list of these.

• Dst IP & Port: Specifies the destination of the packet. Port numbers may be individual port numbers (such as 1024) or a range of port numbers (80:85).

More complicated rules can be formed by adding various options on top of the basic rule structure. These rules can be built by analyzing previously spotted attack packets.
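The header fields listed above can be separated with a few lines of Python. This is an illustrative sketch only: `parse_rule_header` is a hypothetical helper, not part of Snort, and it handles just the single `->` direction used in the examples of this section.

```python
def parse_rule_header(rule):
    """Split the part of a rule before '(' into its seven header fields."""
    header = rule.split("(", 1)[0].replace("->", " -> ")
    fields = header.split()
    if len(fields) != 7 or fields[4] != "->":
        raise ValueError("unexpected rule header: " + header.strip())
    names = ("action", "protocol", "src_ip", "src_port",
             "direction", "dst_ip", "dst_port")
    return dict(zip(names, fields))
```

Applied to the basic rule above, the function returns `alert` as the action, `tcp` as the protocol, `$EXTERNAL_NET` and `any` as the source address and port, and `212.148.21.4/24` and `80` as the destination address and port.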

The body of the rule (msg:"Sample alert";) is enclosed in parentheses “()”. The options are separated from each other by semicolons “;”. There are a total of 5 types of options for the rule body:

• Metadata,

• Payload detection,

• Non payload identification,

• Post Detection,

• Thresholding and suppression

Metadata options provide the IDS with information about the rule itself. For example, “msg” is a human-readable warning message. “reference” indicates a URL for further information. “classtype” and “priority” inform about the attack type and the priority of the event: “classtype” assigns the default priority of the class, and “priority” overrides this default. “sid” and “rev” identify the rule and its revision, and every “sid” needs to be unique. An example rule that includes these options is as follows:

alert tcp $EXTERNAL_NET any -> 192.168.3.0/24 80 (msg:"Sample alert"; classtype:web-application-activity; reference:url,http://www.vorant.com/advisories/20060405.html; sid:2000123; rev:1;)


The payload detection options are used to search the data field of the packet rather than the header. As mentioned in Section 1, this is the feature that distinguishes IDS systems from a typical firewall. The “content” option is used for string search. If “nocase” is used, the search becomes case insensitive. The “offset” option makes the search skip a given number of bytes at the start of the payload. An example rule built with these options is as follows:

alert tcp $EXTERNAL_NET any -> 192.168.3.0/24 80 (msg:"Sample alert"; content:"http|3a|//www.vorant.com/test.cgi?id=pwn3d"; nocase; offset:12; classtype:web-application-activity; reference:url,http://www.vorant.com/advisories/20060405.html; sid:2000123; rev:1;)

This rule searches for a case-insensitive occurrence of the string “http://www.vorant.com/test.cgi?id=pwn3d” in all network traffic that matches the header specified in the rule. To increase efficiency, the first 12 bytes of the payload are skipped.
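The effect of the “content”, “nocase” and “offset” options on a payload can be mimicked with the following Python sketch. It is a simplified model of the behaviour described above, not Snort code; the function names are hypothetical. In the content string, the text between pipe characters is read as hexadecimal byte values, so `|3a|` stands for the colon.

```python
def decode_content(pattern):
    """Turn a content string such as 'http|3a|//' into bytes; text between
    pipes is a run of hexadecimal byte values."""
    out = bytearray()
    for i, part in enumerate(pattern.split("|")):
        if i % 2:                      # odd chunks sit between two pipes
            out += bytes.fromhex(part)
        else:
            out += part.encode("ascii")
    return bytes(out)

def content_match(payload, pattern, nocase=False, offset=0):
    """True if the decoded pattern occurs in payload[offset:]."""
    needle = decode_content(pattern)
    haystack = payload[offset:]
    if nocase:
        needle, haystack = needle.lower(), haystack.lower()
    return needle in haystack
```

With `nocase=True` and `offset=12`, a payload that contains the URL in upper case anywhere from byte 12 onward is reported as a match, while the same payload is ignored when `nocase` is left off.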

4 Application platform

In our hardware application, the strings specified by the user are loaded into the hardware system and searched for within the streaming network traffic. As in the basic operation of an IDS, the header and payload information are checked against different variables, so a Snort-like implementation has been realized with this prototype. The aim of this hardware implementation is to search the header and payload of the packet traffic at network speed. The search has been implemented using the Aho Corasick algorithm, which provides network-speed search capability in hardware, with the only identifiable worst case scenario that we know of [21].

The application is implemented with a laptop connected to the internet through a cable ADSL modem. The first stage of the packet search is applied to the headers, which are compared against the user-defined variables. The second stage searches for user-defined variables in the payload. The system includes 8 parallel search engines that implement string matching. Our goal was to avoid degrading network throughput: at a minimum, the network speed had to remain satisfactory as the number of characters to be searched for increased. The architecture of the system is described in two parts: the hardware, and the data entry interface as the software.

4.1 Hardware Architecture

The application platform is placed between the ADSL modem and the computer as shown in Figure 4. This allows internet traffic to flow through it transparently for constant monitoring, and this structure forms the hardware based IPS. The system does not require any modifications or setup program on the client side, which is an important feature in terms of its applicability.

Figure 4. The hardware architecture of the system

The hardware platform consists of a slightly modified version of a Dragon PCI FPGA platform (footnote 1) as shown in Figure 5.

Figure 5. Modified Dragon development platform.

Dragon is an FPGA card that can be connected to a PC via PCI and/or USB ports. It includes an XC2S100 FPGA from the Spartan-II series. This low cost platform is suitable for many applications without the necessity of large, expensive and complicated systems. The USB interface on the platform provides a 1 MB/s data rate. It supplies power to the card and also carries the variables that the user enters through the user interface.

1 KNJN LLC Electronic Development Products

4.2 Data Entry Interface

A program has been prepared to send the variables determined by the user to the hardware implementation. This software lets the hardware follow changing user needs without a complicated hardware load process for the FPGA. The program is written in the Microsoft Visual Studio .NET 2003 development environment. The Dragon card is controlled through the USB connection, and the programming is done through the same interface. The data entry program sends the variables entered in the user interface to the system over USB.

With the interface program the user can supply the variables for the internet layer, the transport layer and the application layer of the TCP/IP model [3][4][14][19][20]. A snapshot is given in Figure 6.

Figure 6. Data Entry User Interface

There are three logical address slots to restrict the traffic from the IP addresses entered. The variables entered in these slots are searched for in the header field of the packets.

In the transport layer, one of the two basic transport protocols, TCP or UDP, can be selected from a drop-down list. In addition to the protocol, the port number used to deliver packets to the right application can be selected in this layer, so that traffic from that particular port is blocked. Since the application specific port numbers are located in the source port field of the incoming packets, the IDS is able to take action against source port numbers. As an example, an HTML data request by a client might have the source port number 38755, while the destination port number on the server side has to be 80, since the requested data is sent as an HTTP packet. The server in turn has to use 38755 as the destination port of its reply. Port number 80 on the server side indicates that the packet is an HTTP packet.

At the top, the “Application Layer” includes slots for the strings to be searched. In the initial application each string is limited to 11 characters, with up to 4 different strings. The variables defined in these slots are used to generate the states of the Aho–Corasick algorithm and are searched for in the payload field of the packets.

For demonstration purposes, a warning LED, indicated by a red arrow in Figure 5, is lit whenever the transport and application layer parameters of an incoming packet match the parameters defined by the user for each layer. If no keyword is chosen, the indicated port/protocol is blocked. In addition, the network IP addresses are independent parameters that can be blocked regardless of the transport and application layer values. In normal operation the LED is not used; instead the traffic is simply blocked to demonstrate the IPS characteristics.

Wildcard characters are supported in the search mechanism. The user can place a ‘?’ at any position where any character is acceptable. As an example, for the string “ca?st” the system accepts any character following the ‘c’ and ‘a’ characters and then looks for the ‘s’ and ‘t’ characters that follow it. As a result, all strings that fit this pattern, such as “carst”, “canst” and “cavst”, will cause the alarm to go off.
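The wildcard behaviour can be illustrated with a naive software scan. This sketch only demonstrates the matching rule for ‘?’; it does not reflect how the wildcard is realized in the Aho Corasick hardware states, and `wildcard_find` is a hypothetical helper.

```python
def wildcard_find(text, pattern, wildcard="?"):
    """Return the index of the first match of pattern in text, or -1.
    The wildcard stands for exactly one arbitrary character."""
    span = len(pattern)
    for start in range(len(text) - span + 1):
        window = text[start:start + span]
        if all(p == wildcard or p == c for p, c in zip(pattern, window)):
            return start
    return -1
```

Because the wildcard consumes exactly one character, “carst” and “canst” match the pattern “ca?st”, whereas “cast” does not.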

During the search, if the string is required to be an isolated word, a blank space needs to be added before and after the word. As an example, the word “ sleep ” is differentiated from “sleeper” and does not cause a trigger when “sleep” occurs only as part of a longer string.

If only a port number and protocol are to be blocked, a single ‘?’ in the keyword slot is sufficient to block that particular port number and protocol.
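The blocking rules described in this section can be combined into one decision function. The sketch below is an assumption-laden illustration: the `FilterRule` structure, its field names and the single `port` field are hypothetical, and the naive keyword scan stands in for the hardware matcher.

```python
from dataclasses import dataclass, field


@dataclass
class FilterRule:
    protocol: str                                   # e.g. "TCP" or "UDP"
    port: int                                       # port number to act on
    keywords: list = field(default_factory=list)    # application layer strings
    blocked_ips: set = field(default_factory=set)   # blocked independently


def keyword_hit(payload, keyword):
    """Naive scan; '?' in the keyword matches any single character."""
    n = len(keyword)
    return any(
        all(k == "?" or k == c for k, c in zip(keyword, payload[i:i + n]))
        for i in range(len(payload) - n + 1)
    )


def should_block(rule, src_ip, protocol, port, payload):
    # IP addresses are blocked regardless of the other layers.
    if src_ip in rule.blocked_ips:
        return True
    if protocol != rule.protocol or port != rule.port:
        return False
    # No keyword, or a lone '?', blocks the whole port/protocol pair.
    if not rule.keywords or "?" in rule.keywords:
        return True
    return any(keyword_hit(payload, k) for k in rule.keywords)
```

With the keyword “ sleep ” (including the surrounding blanks), a payload containing “sleeper” is not blocked, whereas one containing the isolated word is.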

5 Tests performed

Colasoft Packet Builder 1.0 is used to simulate the system. The program generates a certain amount of network packets and simulates the transport and network layer functionality. In addition, the packet contents are loaded with keywords contained in the application layer filter. The response time of the system to the simulated attacks is then measured. Besides this test platform, randomly selected HTML pages are used to test the entire system.

The tests performed on the platform are designed to push the Aho–Corasick hardware implementation to its limits. The keyword length is gradually increased to the point where the number of states formed causes a bottleneck in the matching algorithm, so that network transmission fails and retransmissions are required. The number of retransmissions is bounded at 3 for the operating system used in our setup; the server shuts down the connection after 3 retries. The RTT (Round Trip Time) between our client and “www.wikipedia.org”, 64 ms, is the basis of our tests. The adaptive RTO (Retransmission Timeout) goes up to 96 ms.

When the device under test is gradually pushed to the limit of 847 characters, the server drops the connection because of late response times.

96 ms / 847 states ≈ 0.11 ms/state (1)

As a result, about 0.11 ms is spent on matching each character. This value is achieved with the parallel processing structure of the FPGA.

A second test is performed to test the effectiveness of the parallel execution. The test engines are serialized. A character length of 478 is found to bottleneck the system.

96 ms / 478 states ≈ 0.20 ms/state (2)

From these results we can conclude that the serial engines require roughly twice the time per character match compared with the parallel matching we initially implemented.
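The two per-state figures and their ratio can be rechecked with a few lines of arithmetic. The sketch below uses only the numbers reported above (96 ms RTO, 847 and 478 states); the variable and function names are illustrative.

```python
RTO_MS = 96.0  # adaptive retransmission timeout reported in the tests


def ms_per_state(state_limit):
    """Time budget per state when the timeout is spread over all states."""
    return RTO_MS / state_limit


parallel = ms_per_state(847)   # parallel engines, equation (1)
serial = ms_per_state(478)     # serialized engines, equation (2)
ratio = serial / parallel      # how much slower the serial engines are

print(f"parallel: {parallel:.2f} ms/state")
print(f"serial:   {serial:.2f} ms/state")
print(f"ratio:    {ratio:.2f}")
```

The ratio comes out a little below 1.8, which is the basis for describing the serial engines as roughly twice as slow.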

6 Conclusions

In this work, we aimed to design a low-cost, fast, portable and easily reconfigurable hardware IPS system. The research aimed at supporting intrusion detection at all layers of TCP/IP. Parallelism has been incorporated to provide faster string searching than approaches that use serial engines, including software approaches. This design can be enhanced to include more rules covering a wider spectrum of network threats; for instance, more Snort rules can be added. In addition, intrusion attempts can be stored for future monitoring.

The application platform is a simple, modified, low-cost FPGA dragon card with a Xilinx Spartan XC2S100 FPGA chip. The hardware platform is supported by a user-friendly interface that loads the search parameters into the existing hardware in less than a second. The RAM on the development card is kept minimal for cost reasons. The total RAM or the size of the FPGA can be increased, at a cost, to raise the number of states available to the algorithm. The major limitation of RAM can also be eased by using bit compression techniques [23].

References

[1] Aldwairi, M., Hardware Efficient Pattern Matching Algorithms and Architectures for Fast Intrusion Detection, 2006.

[2] Baker, Z. K., Prasanna, V. K., A Methodology for Synthesis of Efficient Intrusion Detection Systems on FPGAs, Proceedings of the 12th Annual IEEE Symposium on Field-Programmable Custom Computing Machines, Pages: 135 – 144, 2004

[3] Behrouz, A., F., TCP/IP Protocol Suite, McGraw-Hill.

[4] Braden, R. T., Borman, D. A., Partridge, C., 1988, RFC1071: Computing the Internet checksum, 2003.

[5] Cisco, Interconnecting Cisco Network Devices, CA: Cisco System, Inc., 1999

[6] Fide, S., Jenks, S., A Survey of String Matching Approaches in Hardware, 2006

[7] Floyd, B., The Changing Face of Network Security Threats, 2006

[8] Graczyk, R., Pozniak, K. T., Romaniuk, R. S., FPGA based, modular, configurable controller with fast synchronous optical network, TESLA Report 2006-09, 2006.

[9] Hegarty, D., McDonald, S., An FPGA-based Configurable Network Interface System, Networking, International Conference on Systems and International Conference on Mobile Communications and Learning Technologies, 2006. ICN/ICONS/MCL 2006.

[10] Kachris, C., Design and Implementation of a TCP/IP core for reconfigurable logic, 2001.

[11] Lockwood, J. W., An Open Platform for Development of Network Processing Modules in Reprogrammable Hardware, 2001.

[12] Lu, W., Designing TCP/IP Functions In FPGAs, 2003.

[13] Moscola, J., Lockwood, J., Loui, R. P., Pachos, M., Implementation of a content-scanning module for an Internet firewall, Field-Programmable Custom Computing Machines, 2003. FCCM 2003. 11th Annual IEEE Symposium, 2003.

[14] Önal, H., TCP/IP Ağlarda Parçalanmış Paketler, Parçalanmış Paketler ve Güvenlik Sistemlerine Etkileri, 2009.

[15] Pagiamtzis, K., Sheikholeslami, A., Content-Addressable Memory (CAM) Circuits and Architectures: A Tutorial and Survey, IEEE Journal of Solid-State Circuits, Vol. 41, NO. 3, 2006.

[16] Sidhu, R., Prasanna, V. K., Fast Regular Expression Matching using FPGAs, Field-Programmable Custom Computing Machines, 2001. FCCM '01. The 9th Annual IEEE Symposium.

[17] Soewito, B., Weng, N., Methodology for Evaluating DNA Pattern Searching Algorithms on Multiprocessor, Proceedings of the 7th IEEE International Conference on, 14-17 Oct. 2007, Page(s) 570 – 577.

[18] Sourdis, I., Pnevmatikatos, D., Fast, Large-Scale String Match for a 10Gbps FPGA-Based Network Intrusion, 2003.

[19] Stevens, W. R., TCP/IP Illustrated, Volume 1, Addison-Wesley, 1994.

[20] Tanenbaum, A. S., Computer Networks, Chapter 2, 3, 4, 5, 6, 7, 8, 2003.

[21] Tuck, N., Sherwood, T., Calder, B., Varghese, G., Deterministic Memory Efficient String Matching Algorithms for Intrusion Detection, INFOCOM 2004. Twenty-third Annual Joint Conference of the IEEE Computer and Communications Societies, 2004.

[22] Whitman, M. E., Mattord, H. J., Management of Information Security, Boston, MA: Thomson Course Technology, 2004, ch 9.

[23] Xyan, Z., Sahni, S., Highly compressed Aho-Corasick automata for efficient intrusion detection, IEEE Symposium on Computers and Communications (ISCC 2008), 6-9 July 2008, Page(s) 298 – 303.
