Techpush | Agent Based Efficient Anomaly Intrusion Detection System in Adhoc networks IEEE Project



Mobile Agent Based IDS
Intrusion Detection System

Agent Based Efficient Anomaly Intrusion Detection System in Adhoc networks




Three-Dimensional Password for More Secure Authentication


Abstract—Current authentication systems suffer from many weaknesses. Textual passwords are commonly used; however, users do not follow their requirements. Users tend to choose meaningful words from dictionaries, which makes textual passwords easy to break and vulnerable to dictionary or brute-force attacks. Many available graphical passwords have a password space that is less than or equal to the textual password space. Smart cards or tokens can be stolen. Many biometric authentications have been proposed; however, users tend to resist using biometrics because of their intrusiveness and the effect on their privacy. Moreover, biometrics cannot be revoked. In this paper, we present and evaluate our contribution, i.e., the 3-D password.
The 3-D password is a multifactor authentication scheme. To authenticate the user, we present a 3-D virtual environment in which the user navigates and interacts with various objects. The sequence of actions and interactions toward the objects inside the 3-D environment constructs the user’s 3-D password. The 3-D password can combine most existing authentication schemes such as textual passwords, graphical passwords, and various types of biometrics into a 3-D virtual environment. The design of the 3-D virtual environment and the type of objects selected determine the 3-D password key space.

Index Terms—Authentication, biometrics, graphical passwords, multifactor, textual passwords, 3-D passwords, 3-D virtual environment.

Introduction:
THE DRAMATIC increase of computer usage has given rise to many security concerns. One major security concern is authentication, which is the process of validating that you are who you claim to be. In general, human authentication techniques can be classified as knowledge based (what you know), token based (what you have), and biometrics (what you are).
Knowledge-based authentication can be further divided into two categories as follows: 1) recall based and 2) recognition based [1]. Recall-based techniques require the user to repeat or reproduce a secret that the user created before. Recognition-based techniques require the user to identify and recognize the secret, or part of it, that the user selected before [1]. One of the most common recall-based authentication schemes used in the computer world is textual passwords. One major drawback of the textual password is its two conflicting requirements: the selection of passwords that are easy to remember and, at the same time, hard to guess.

Related Work
Many graphical password schemes have been proposed [6]–[8], [10]–[12]. Blonder [6] introduced the first graphical password scheme. Blonder’s idea is that, given a predetermined image, the user selects or touches regions of the image, and the sequence and location of the touches construct the user’s graphical password. After Blonder [6], the notion of graphical passwords was developed further. Existing graphical passwords can be categorized into two categories as follows: 1) recall based and 2) recognition based [1]. Dhamija and Perrig [7] proposed Déjà Vu, which is a recognition-based graphical password system that authenticates users by having them choose their portfolio images from among decoy images. These portfolios consist of randomized art images. Each image is derived from an 8-byte seed. Therefore, an authentication server does not need to store the whole image; it simply needs to store the 8-byte seed. Another recognition-based graphical password is Passfaces [8]. Passfaces works by having the user select a subgroup of k faces from a group of n faces. For authentication, the system shows m faces, one of which belongs to the user’s subgroup of k faces. The user has to repeat the selection many times to complete the authentication process. Another scheme is the Story scheme [9], which requires the selection of pictures of objects (people, cars, foods, airplanes, sightseeing, etc.) to form a story line.

Our Approach
3-D PASSWORD SCHEME
In this section, we present a multifactor authentication scheme that combines the benefits of various authentication schemes. We attempted to satisfy the following requirements.
1) The new scheme should not be either recall based or recognition based only. Instead, the scheme should be a combination of recall-, recognition-, biometrics-, and token-based authentication schemes.
2) Users ought to have the freedom to select whether the 3-D password will be solely recall-, biometrics-, recognition-, or token-based, or a combination of two schemes or more. This freedom of selection is necessary because users are different and they have different requirements. Some users do not like to carry cards. Some users do not like to provide biometrical data, and some users have poor memories. Therefore, to ensure high user acceptability, the user’s freedom of selection is important.
3) The new scheme should provide secrets that are easy to remember and very difficult for intruders to guess.
4) The new scheme should provide secrets that are not easy to write down on paper. Moreover, the scheme secrets should be difficult to share with others.
5) The new scheme should provide secrets that can be easily revoked or changed.
Based on the aforementioned requirements, we propose

3D Password Overview
The 3-D password is a multifactor authentication scheme. The 3-D password presents a 3-D virtual environment containing various virtual objects. The user navigates through this environment and interacts with the objects. The 3-D password is simply the combination and the sequence of user interactions that occur in the 3-D virtual environment. The 3-D password can combine recognition-, recall-, token-, and biometrics-based systems into one authentication scheme. This can be done by designing a 3-D virtual environment that contains objects that request information to be recalled, information to be recognized, tokens to be presented, and biometrical data to be verified. For example, the user can enter the virtual environment and type something on a computer located at position (x1, y1, z1), then enter a room that has a fingerprint recognition device located at position (x2, y2, z2) and provide his/her fingerprint. Then, the user can go to the virtual garage, open the car door, and turn on the radio to a specific channel. The combination and the sequence of the previous actions toward the specific objects construct the user’s 3-D password.

(10, 24, 91) Action = Open the office door;
(10, 24, 91) Action = Close the office door;
(4, 34, 18) Action = Typing, “F”;
(4, 34, 18) Action = Typing, “A”;
(4, 34, 18) Action = Typing, “L”;
(4, 34, 18) Action = Typing, “C”;
(4, 34, 18) Action = Typing, “O”;
(4, 34, 18) Action = Typing, “N”;
(10, 24, 80) Action = Pick up the pen;
(1, 18, 80) Action = Drawing, point = (330, 130).
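
As a rough illustration of how such a sequence could be represented and compared, the sketch below stores each step as a (position, action) pair and derives a digest from the ordered sequence. The names Interaction, password_digest, and verify, and the choice of SHA-256, are assumptions made for this example and are not taken from the paper.

```python
import hashlib
import hmac
from typing import NamedTuple, Sequence, Tuple


class Interaction(NamedTuple):
    """One step of a 3-D password: where the user acted and what they did."""
    position: Tuple[int, int, int]
    action: str


def password_digest(steps: Sequence[Interaction]) -> str:
    """Hash the ordered interaction sequence (order matters, as in the text)."""
    h = hashlib.sha256()
    for step in steps:
        # repr() gives an unambiguous encoding of the position and the action.
        h.update(repr((step.position, step.action)).encode("utf-8"))
        h.update(b"\x00")  # separator between steps
    return h.hexdigest()


def verify(stored_digest: str, attempt: Sequence[Interaction]) -> bool:
    """Compare the stored digest with the attempt's digest in constant time."""
    return hmac.compare_digest(stored_digest, password_digest(attempt))


# Hypothetical enrolment using a few of the steps listed above.
enrolled = [
    Interaction((10, 24, 91), "Open the office door"),
    Interaction((10, 24, 91), "Close the office door"),
    Interaction((4, 34, 18), "Typing, F"),
    Interaction((10, 24, 80), "Pick up the pen"),
]
stored = password_digest(enrolled)
```

The same actions performed in a different order produce a different digest, which reflects the point that the secret is the sequence and not just the set of interactions. A real deployment would also need a salted, deliberately slow password hash; that is outside this sketch.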

Persuasive Cued Click-Points: Design, implementation, and evaluation of a knowledge-based authentication mechanism


Abstract—This paper presents an integrated evaluation of the Persuasive Cued Click-Points graphical password scheme,
including usability and security evaluations, and implementation considerations. An important usability goal for knowledge-based
authentication systems is to support users in selecting passwords of higher security, in the sense of being from an expanded
effective security space. We use persuasion to influence user choice in click-based graphical passwords, encouraging users to
select more random, and hence more difficult to guess, click-points.
Index Terms—authentication, graphical passwords, usable security, empirical studies


The problems of knowledge-based authentication,
typically text-based passwords, are well known.
Users often create memorable passwords that are easy
for attackers to guess, but strong system-assigned
passwords are difficult for users to remember [6].
A password authentication system should encourage
strong passwords while maintaining memorability.
We propose that authentication schemes allow
user choice while influencing users towards stronger
passwords. In our system, the task of selecting weak
passwords (which are easy for attackers to predict)
is more tedious, discouraging users from making
such choices. In effect, this approach makes choosing
a more secure password the path-of-least-resistance.
Rather than increasing the burden on users, it is
easier to follow the system’s suggestions for a secure
password — a feature lacking in most schemes.

Authentication Schemes for Session Passwords using Color and Images

Textual passwords are the most common method used for authentication. But textual
passwords are vulnerable to eavesdropping, dictionary attacks, social engineering and shoulder surfing.
Graphical passwords are introduced as alternative techniques to textual passwords. Most of the
graphical schemes are vulnerable to shoulder surfing. To address this problem, text can be combined
with images or colors to generate session passwords for authentication. Session passwords can be used
only once and every time a new password is generated. In this paper, two techniques are proposed to
generate session passwords using text and colors which are resistant to shoulder surfing. These methods
are suitable for Personal Digital Assistants.
Index Terms: Authentication, session passwords, shoulder surfing

Speech Oriented Computer System Handling


Abstract-- Our aim is to provide the computer with a natural
interface, including the ability to understand human speech. For this
purpose, we propose a way to operate the computer system with
voice commands. First, the user issues a command by
voice through the microphone; the software of the proposed
system then takes over to recognize the command. If recognition
succeeds and the input matches one of the defined voice commands, the
system performs the operation requested by the speaker. In our
proposed system we use the Microsoft Speech SDK for the
voice recognition process and Voice-XML for creating the voice
grammar in the software part. The system has the flexibility to work with the
speech of any user.
Keywords-- Dynamic Programming Algorithm, Hidden Markov
Model, Microphone, Microsoft speech SDK, Phonemes, Speech
recognition, Voice-XML.
I. INTRODUCTION
In recent years, the quality and performance of speech-based
human-machine interaction have improved steadily. The next
generation of speech-based
interface technology will enable easy-to-use automation
of new and existing communication services, making human-machine
interaction more natural. For disabled people, the
absence of databases and the diversity of articulatory
handicaps are major obstacles to the construction of reliable
speech recognition systems, which explains the scarcity of
speech recognition systems on the market for disabled people
[1]. If a person finds it difficult or is unable to handle
the mouse and the keyboard, or if the keyboard or
mouse is faulty, there have to be other ways to operate the
operating system. Speech may act as one of them. There is a
growing demand for systems capable of controlling the operating
system using only voice commands given by a person.
This paper presents a way to control the OS
using voice commands [5].
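
The flow described in the abstract (recognize the utterance, match it against the known commands, then perform the operation) can be sketched as a simple lookup. This is only an illustration: the recognition step is assumed to have already produced text, and the command phrases and function names below are hypothetical, not part of the Microsoft Speech SDK or of the paper.

```python
from typing import Callable, Dict, Optional


def open_editor() -> str:
    # Placeholder operation; a real system would launch a program here.
    return "opening text editor"


def shut_down() -> str:
    # Placeholder operation; a real system would call the OS here.
    return "shutting down"


# Hypothetical command grammar: spoken phrase -> operation.
COMMANDS: Dict[str, Callable[[], str]] = {
    "open editor": open_editor,
    "shut down": shut_down,
}


def handle_utterance(recognized_text: str) -> Optional[str]:
    """Run the operation matching the recognized phrase, if any.

    recognized_text stands in for the recognizer's output; the
    recognition step itself is outside this sketch.
    """
    # Normalise case and whitespace before looking the phrase up.
    phrase = " ".join(recognized_text.lower().split())
    operation = COMMANDS.get(phrase)
    if operation is None:
        return None  # no command matched, so nothing is executed
    return operation()
```

Keeping the phrase-to-operation mapping in one table means new voice commands can be added without touching the matching logic.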

Utilizing RSS feeds for crawling the Web


We present the “advaRSS” crawling mechanism, which
was created to support peRSSonal, a mechanism used to
create personalized RSS feeds. In contrast to common
crawling mechanisms, our system is focused on fetching the
latest news from major and minor portals worldwide by
utilizing their communication channels. The challenge that
distinguishes “advaRSS” from a usual crawler is that news is
produced in no fixed order at any time of the day, and thus the
freshness of the offline collection can be measured even in
minutes. This means that the system has to be updated with
news items every time they appear. In order to achieve this we
utilize the communication channels that exist on the modern
architecture of the WWW and more specifically in almost
every modern news portal. As the RSS feeds are used by every
major and minor portal it is possible to keep our crawler up to
date and retain a high freshness of the “offline content” that is
maintained in our system’s database by applying algorithms in
order to observe the temporal behaviour of each RSS feed.
Keywords: RSS crawling, web crawler, RSS analysis, offline
content.
I. INTRODUCTION
The World Wide Web has grown from a few thousand
pages in 1993 to more than three billion pages at present.
The consequence of the popularity of the Web as a global
information system is that it is flooded with a large amount
of data and information and hence finding useful
information on the Web is often a tedious and frustrating
experience. New tools and techniques are crucial for
intelligently searching for useful information on the Web.
However, the mechanisms invented to make the Web
seem less chaotic themselves need information and spend a great
amount of time collecting it. Web crawlers are an
essential component of all search engines and are
increasingly becoming important in data mining and other
indexing applications. Web crawlers are programs which
browse the Web in a methodical, automated manner. They
are mainly used to create a copy of all the visited pages for
future use by mechanisms which will index the downloaded
pages to provide fast searches and further processing.
Much research has been done on creating crawlers that
maintain a “fresh” collection of web pages. Web pages
change at different rates, which means that the crawler
should decide which pages to revisit by using an
efficient method [1]. This leads to the creation of crawlers that
have at least two basic modules, one for periodic crawling
(scheduled) and another for incremental crawling (updating
the most frequently changing pages). It is noted in [2] and [3]
that most web pages in the US are modified during
US working hours, a finding that is unsurprising.
In [4], Cho and Garcia-Molina show that different domains
have very different “page change” rates. Arasu et al. in [5]
report a half-life of 10 days for web pages, which they use to create
an algorithm for maintaining the freshness of their “offline
collection”.
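
One simple way to act on the observed temporal behaviour of a feed is to poll it at roughly the rate at which it publishes. The sketch below is an assumption made for illustration, not the advaRSS algorithm: the function name, the mean-gap heuristic, and the clamping bounds are all hypothetical.

```python
from typing import Sequence


def next_poll_interval(
    publish_times: Sequence[float],
    min_interval: float = 60.0,
    max_interval: float = 86400.0,
) -> float:
    """Suggest how many seconds to wait before polling a feed again.

    publish_times holds publication timestamps (in seconds) of recent
    feed items. The estimate is the mean gap between consecutive items,
    clamped to the range [min_interval, max_interval].
    """
    times = sorted(publish_times)
    if len(times) < 2:
        return max_interval  # too little history: poll rarely
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    mean_gap = sum(gaps) / len(gaps)
    return max(min_interval, min(max_interval, mean_gap))
```

A feed that publishes every ten minutes would be polled about every ten minutes, while a feed with bursts of items seconds apart is still polled no more often than the lower bound allows.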

Feature-based opinion mining and ranking


The proliferation of blogs and social networks presents a new set of challenges and
opportunities in the way information is searched and retrieved. Even though facts still
play a very important role when information is sought on a topic, opinions have become
increasingly important as well. Opinions expressed in blogs and social networks are
playing an important role influencing everything from the products people buy to the
presidential candidate they support. Thus, there is a need for a new type of search engine
which will not only retrieve facts, but will also enable the retrieval of opinions. Such a
search engine can be used in a number of diverse applications, from product reviews to
aggregating opinions on a political candidate or issue. Enterprises can also use such an
engine to determine how users perceive their products and how they stand with respect
to competition. This paper presents an algorithm which not only analyzes the overall
sentiment of a document/review, but also identifies the semantic orientation of specific
components of the review that lead to a particular sentiment. The algorithm is integrated
in an opinion search engine which presents results to a query along with their overall tone
and a summary of sentiments of the most important features.

Opinion Mining and Social Networks: a Promising Match Project


Opinion Mining and Social Networks:
a Promising Match
Abstract—In this paper we discuss the role and importance of social networks as preferred environments for opinion mining and sentiment analysis especially. We begin by briefly describing selected properties of social networks that are relevant with respect to opinion mining and we outline the general relationships between the two disciplines. We present the related work and provide basic definitions used in opinion mining. Then, we introduce our original method of opinion classification and we test the presented algorithm on real world datasets acquired from popular Polish social networks, reporting on the results. The  results are promising and soundly support the main thesis of the paper, namely, that social networks exhibit properties that make them very suitable for opinion mining activities.

Keywords: opinion mining, sentiment analysis, social
computing, social networks

I. INTRODUCTION
Graphs and networks rank among the most popular data representation models due to their universal applicability to various application domains. The need to analyze and mine interesting knowledge from graph and network structures has long been recognized, but only recently have advances in information systems enabled the analysis of graph structures at huge scales. Analysis of graph and network structures gained new momentum with the advent of social networks. While the analysis of social networks has been a field of intensive research, particularly in the domains of social sciences and psychology, economy or chemistry, it is the emergence of huge social networking services over the Web that spawned the research into large-scale structural properties of social networks. Social networks exhibit a very clear community structure. Such community structure partially stems from objective limitations (e.g., internal organizational structure of a company can be closely represented by the ties within a particular social network) or, to some extent, may result from subjective user actions and activities (e.g., bonding with other people who share one’s interests and hobbies). Unveiling the true structure of a social network and understanding the communities forming within the network is the key factor in understanding what the future structure of the network will be. The main goal of social network analysis is the study of structural properties of networks. Structural analysis of the social network investigates the properties of individual vertices and the global properties of the network as a whole. It answers two basic classes of questions about the network: what is the structural position of any given individual node, and what can be said about groups (communities) forming within the network. The main measurement of a node’s social power (also called the member’s prestige) is centrality, which makes it possible to determine a node’s relative and absolute importance in the network. There are several methods to determine a node’s centrality, such as degree centrality (the number of links that connect to a given node), betweenness centrality (the number of shortest paths between any pair of nodes in the network that traverse a given node) or closeness centrality (the mean length of the shortest paths to other nodes in the network). From the point of view of opinion mining, the ability to assess a node’s prestige is essential, as it makes it possible to differentiate between opinions of different individuals. More specifically, a node’s prestige makes it possible to assign different weights to opinions and to attach more importance to opinions expressed by prominent individuals.
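
The three centrality measures, taken literally from the definitions above, can be computed on a small graph with breadth-first search. This is a minimal sketch under stated assumptions: the graph is undirected, unweighted, and connected, the function names are hypothetical, and the betweenness variant is the raw count of shortest paths through a node (not the normalised fraction that graph libraries usually report).

```python
from collections import deque
from typing import Dict, List, Tuple

Graph = Dict[str, List[str]]  # undirected adjacency lists


def bfs(graph: Graph, source: str) -> Tuple[Dict[str, int], Dict[str, int]]:
    """Return hop distances from source and the number of shortest paths."""
    dist = {source: 0}
    paths = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in dist:  # first time v is reached
                dist[v] = dist[u] + 1
                paths[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:  # u lies on a shortest path to v
                paths[v] += paths[u]
    return dist, paths


def degree(graph: Graph, node: str) -> int:
    """Number of links that connect to the node."""
    return len(graph[node])


def mean_distance(graph: Graph, node: str) -> float:
    """Mean shortest-path length from the node to every other node."""
    dist, _ = bfs(graph, node)
    others = [d for n, d in dist.items() if n != node]
    return sum(others) / len(others)


def shortest_paths_through(graph: Graph, node: str) -> int:
    """Count shortest paths between other node pairs that traverse node."""
    info = {n: bfs(graph, n) for n in graph}
    dist_v, paths_v = info[node]
    others = sorted(n for n in graph if n != node)
    total = 0
    for i, s in enumerate(others):
        dist_s, paths_s = info[s]
        for t in others[i + 1:]:
            # node is on a shortest s-t path iff the distances add up.
            if dist_s[node] + dist_v[t] == dist_s[t]:
                total += paths_s[node] * paths_v[t]
    return total


# Two triangles (a, b, c) and (d, e, f) joined by the bridge c-d.
graph: Graph = {
    "a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"],
    "d": ["c", "e", "f"], "e": ["d", "f"], "f": ["d", "e"],
}
```

In this example node c has degree 3, only one more than node a, yet every shortest path between the two triangles passes through it, which is the kind of position the text associates with control over how opinions spread.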
Another factor that is often considered in opinion mining is the identification of influential individuals. An influential individual does not necessarily need a high degree centrality to influence the average opinion within the network. Usually, such individuals are characterized by high betweenness centrality, impacting the dissemination of opinion rather than its formation. For instance, an individual with high betweenness centrality can stop a negative opinion from spreading through the network, or, on the other hand, she can amplify the opinion. For psychological reasons, humans tend to form their opinions in such a way that the opinions conform to the norm established within a given social group. Thus, when mining opinions one has to take into consideration the influence of the context in which the opinion is forming, i.e., the social milieu of an individual. Social networks are highly effective in bolstering group formation


RELATED WORK
Literature related to social network analysis is extremely abundant and rich. The first proposals to perform social network analysis originated in the domains of social sciences and psychology [12] or economy [13]. Interestingly, much of this research rephrased what had previously been discussed in physics within the context of complex systems [14]. The most thorough summary of social network analysis topics, models and algorithms can be found in [17]. Opinion mining is a relatively new domain spanning the fields of data mining, machine learning and natural language processing. Sentiment analysis methods can be regarded as supervised [1][5] or unsupervised [6][15] learning methods, or as information retrieval methods [16][18]. Many works concerning opinion mining present approaches based on text documents modelled as sets of words [1] or as vectors, where dimensions represent words and values are the weights of words in the document [2]. In the vast majority of sentiment analysis methods, information about the connotations of a word with a positive or a negative class is used to calculate the document’s semantic orientation γ
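
The displayed equation did not survive in this copy. One form consistent with the symbol definitions that accompany it, given here as an assumption rather than as the authors' exact formula, averages the per-term scores over the document:

```latex
\gamma(d) = \frac{1}{|d|} \sum_{i=1}^{|d|} \mathrm{score}(t_i, C_P, C_N)
```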



where $t_i$ is the i-th term of the document d, $|d|$ is the number of terms appearing in the document d, $C_P$ and $C_N$ are positive and negative classes, respectively, and score() is a function that assigns positive or negative values to terms, depending on their relationship with the respective class. Semantic orientations of individual terms are aggregated using a dictionary method [5]. This method uses two small sets of manually identified positive and negative adjectives, which serve as seed sets. New terms are subsequently added to these sets if they are linked by semantically loaded conjunctions such as “and”, “but”, “however”, etc. Some opinion mining algorithms use the pointwise mutual information measure to determine semantic orientation of a term [3][4][6]. In this case semantic orientation of a term is inferred from the association between the term and a word (or a set of words) assigned unambiguously to only one class (positive or negative), e.g. excellent and poor. The pointwise mutual information of the term t and the word w is defined as
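
The equation is not preserved in this copy. The standard definition of pointwise mutual information, which is presumably what is meant here, is shown below; the base-2 logarithm and the probability notation are assumptions, since the source's own notation is lost.

```latex
\mathrm{PMI}(t, w) = \log_2 \frac{p(t, w)}{p(t)\, p(w)}
```

Here $p(t, w)$ is the probability that t and w occur together, and $p(t)$ and $p(w)$ are the probabilities of each occurring on its own. The value is positive when the two co-occur more often than they would if they were independent.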



OUR APPROACH
The method proposed in this paper for determining a term’s semantic orientation is a variant of the method used in [1]. The drawback of the original method is that it assigns the maximum or minimum value to all terms that occur in only one class, regardless of the number of occurrences. Therefore, we have proposed an alternative way of calculating the semantic orientation of a term. Our method is based on the ratio of term occurrence frequency in documents assigned to positive and negative classes. According to our approach the scoring function for assigning positive and negative scores to terms becomes

Example: Let us compute token polarity evaluation in
the way presented above. Let us assume the training set
contains 1000 positive and 200 negative examples, and token T
occurred 9 times in positive examples and 3 times in
negative examples.
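
The scoring function itself is missing from this copy, so the figures in the example cannot be pushed through the authors' formula. The sketch below only works out the per-class relative frequencies and their ratio, assuming that occurrence counts are normalised by the number of examples in each class; that normalisation and the function name are assumptions.

```python
def relative_frequency(occurrences: int, class_size: int) -> float:
    """Occurrences of a token per example in one class."""
    return occurrences / class_size


# Figures from the example: 1000 positive and 200 negative examples,
# token T seen 9 times in positive and 3 times in negative examples.
pos_rate = relative_frequency(9, 1000)  # 0.009
neg_rate = relative_frequency(3, 200)   # 0.015
ratio = pos_rate / neg_rate             # about 0.6
```

Although T appears three times as often in positive examples in absolute terms, its relative frequency is lower there than in the negative class, which shows why raw counts alone would be misleading on an unbalanced training set.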


Software and hardware requirements

4.2.2.1 Development Environment

·       Operating System: Windows 2000 Pro/NT/98/XP/7
The system will be built on a Windows-compatible environment. The application will be web based using ASP.NET technology.
·       Web Server: IIS (Internet Information Services)
·       Server-Side Application Software: Active Server Pages .NET (ASP.NET)
·       Client-Side Application Software: JavaScript, HTML
·       Database: SQL Server 2000/2005
The system requires SQL Server as a database; however, the system will be ODBC compliant so that it can work with any standard database.
·       Client Browsers: Internet Explorer 5.0 or Netscape Navigator 4.7
The system requires the Internet Explorer or Netscape Navigator browser on the client side.
·       Hardware: Pentium PCs with 128 MB RAM / 20 GB HDD.
4.2.2.2 Production Environment

·       Operating System: Windows 2000 Pro/NT/98/XP/7
The system will be built on a Windows-compatible environment. The application will be web based using ASP.NET technology.
·       Web Server: IIS (Internet Information Services)
·       Server-Side Application Software: ASP.NET
·       Client-Side Application Software: JavaScript, HTML
·       Database: SQL Server 2000/2005
The system requires SQL Server as a database; however, the system will be ODBC compliant so that it can work with any standard database.
·       Client Browsers: Internet Explorer 4.0 and above, or Netscape Navigator 4.0 and above
The system requires the Internet Explorer or Netscape Navigator browser on the client side.
·       Hardware: Pentium PCs with 128 MB RAM / 20 GB HDD.


An Automatic Answering System with Template Matching for Natural Language Questions Project


An Automatic Answering System with Template
Matching for Natural Language Questions

Abstract
Using computers to answer natural language questions is an interesting and challenging problem. Generally such problems are handled under two categories: open-domain problems and closed-domain problems. This paper presents a system that attempts to solve closed-domain problems.
Typically, in a closed domain, answers to questions are not publicly available and therefore cannot be found using a search engine. Hence answers have to be stored in a database by a domain expert. The challenge is then to understand the natural language question so that it can be matched to the respective answer in the database. We use a template matching technique to perform this matching. In addition, given that our target is to use this system with non-native English speakers, we developed a method to overcome the mismatches we might encounter due to spelling mistakes. The system is developed such that questions can be asked using short messages from a mobile phone, and therefore the system is designed to understand SMS language in addition to English. One of the main contributions of this paper is the reported outcome of a deployment of this system in a real environment.

Keywords: FAQ, Answering System, SMS, Template Matching

Introduction
Developing mechanisms for using computers to answer user questions is becoming an interesting problem with the increased use of computers. Such mechanisms allow users to ask questions in a natural language and receive a concise and accurate answer. Understanding user questions in natural languages requires Natural Language Processing (NLP). Being an active area of research, NLP plays a big role in ICT and in Question Answering (QA) systems.
Natural language processing is the computerized approach to analyzing text based on both a set of theories and a set of technologies. It will become important to be able to ask queries and obtain answers using natural language (NL) expressions rather than keyword-based retrieval mechanisms. QA systems can better satisfy the needs of users, as they provide an accurate, quick, convenient and effective way of answering user questions. The approach we have adopted in this project is an automated FAQ (Frequently Asked Question) answering system that replies with pre-stored answers to user questions asked in ordinary English, rather than keyword- or syntax-based retrieval mechanisms. This is achieved using a template matching technique together with other mechanisms such as disemvoweling, synonym matching, etc.


Related Work

         Q&A system research has received considerable attention from the research community through the Text Retrieval Conference (TREC) Q&A track since 1999.
         The original aim of the track was to systematically evaluate both academic and commercial Q&A systems. Maybury has discussed the characteristics of Q&A systems and the resources needed to develop and evaluate such systems.
         The main approaches to Q&A systems are covered in the literature, where the template-based approach is discussed in detail.
         Although most Q&A systems are based on Web environments, SMS has also been used as an environment in contexts such as learning and agriculture.


Our Approach
Main modules:
         Pre-processing
         Question template matching
         Answering
         SMS Abbreviation
         Stop Word
         Word Parser
         Synonyms Matcher
         Security
         Disemvoweling

Architecture

In this section we describe the architecture of our system. The overall architecture of the system can be subdivided into three main modules:
(1) pre-processing,
(2) question template matching, and
(3) answering.


A. Pre-Processing Module
The pre-processing module mainly consists of three operations: (1) converting SMS abbreviations into general English words, (2) removing stop words, and (3) removing vowels. Since the system is expected to process text in both natural and SMS language, it is necessary to replace SMS abbreviations with the corresponding English words before processing user questions further. This is done by referring to a pre-stored list of frequently used SMS abbreviations. Stop words are words whose removal does not change the meaning of a sentence; examples are "the", "a" and "and". Removing them increases the effectiveness of the system by saving time and disk space. The next step in this module is to remove vowels from the text in order to handle spelling mistakes. This process is called disemvoweling and is discussed in detail in later sections.
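
The three pre-processing operations can be sketched in Python as follows. This is a minimal illustration rather than the system's implementation: the abbreviation and stop-word tables, the function names, the tokenizing rule, and the choice to leave all-vowel words untouched are assumptions made for the example.

```python
import re

# Hypothetical lookup tables; the text only says that such lists are pre-stored.
SMS_ABBREVIATIONS = {"u": "you", "r": "are", "wat": "what", "plz": "please"}
STOP_WORDS = {"the", "a", "an", "and", "is", "are", "of", "to"}
VOWELS = set("aeiou")


def expand_sms(tokens):
    """Step 1: replace each known SMS abbreviation with its English word."""
    return [SMS_ABBREVIATIONS.get(t, t) for t in tokens]


def remove_stop_words(tokens):
    """Step 2: drop words that carry no meaning for matching."""
    return [t for t in tokens if t not in STOP_WORDS]


def disemvowel(word):
    """Step 3: drop vowels so that vowel misspellings collapse to one key."""
    stripped = "".join(ch for ch in word if ch not in VOWELS)
    # Assumption: a word made only of vowels is kept as-is instead of vanishing.
    return stripped or word


def preprocess(question):
    """Run the three operations in the order given in the text."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    tokens = expand_sms(tokens)
    tokens = remove_stop_words(tokens)
    return [disemvowel(t) for t in tokens]
```

With these sample tables, the misspelling "registeration" and the correct "registration" both reduce to "rgstrtn", which is the property disemvoweling relies on to tolerate spelling mistakes.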


B. Question-Template Matching Module
The pre-processed text is matched against every pre-stored template until the template that best matches the received text is found. To make this possible, templates are created according to a specific syntax, the details of which are described in section IV. Within this module, words that are considered to have synonyms are looked up in a synonym file. This synonym file can be modified to suit the relevant domain and is updated from a standard database such as WordNet [6]. It is worth noting that the templates here are for questions and not for answers. The main goal of this module is to identify the template that most closely matches the question received from the user.
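
As a rough illustration of best-match selection, the sketch below scores each template by the fraction of its keywords found in the question. The actual template syntax is defined in section IV and is not reproduced here, so the keyword-set representation, the synonym table, the scoring rule, and the threshold are all assumptions. Plain words are used for readability; in the full pipeline both the question and the templates would be in the same pre-processed form.

```python
# Hypothetical synonym file: maps a word to one canonical form.
SYNONYMS = {"cost": "fee", "price": "fee", "charge": "fee", "signup": "register"}

# Hypothetical templates: template id -> set of canonical keywords.
TEMPLATES = {
    "course_fee": {"fee", "course"},
    "register_date": {"register", "date"},
}


def canonical(word):
    """Map a word to its canonical synonym, or return it unchanged."""
    return SYNONYMS.get(word, word)


def score(template_keywords, tokens):
    """Fraction of the template's keywords present in the question."""
    words = {canonical(t) for t in tokens}
    return len(template_keywords & words) / len(template_keywords)


def best_template(tokens, threshold=0.5):
    """Return the id of the highest-scoring template, or None if too weak."""
    best_id, best_score = None, 0.0
    for template_id, keywords in TEMPLATES.items():
        s = score(keywords, tokens)
        if s > best_score:
            best_id, best_score = template_id, s
    return best_id if best_score >= threshold else None
```

Mapping synonyms to a canonical word before comparison keeps the templates short: one template with the keyword "fee" also covers questions that say "cost" or "price".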


C. Answering Module
Since every template representing a question is pre-stored in a database together with its answer, the corresponding answer is returned to the end user as soon as the best-matching template for the question is found.
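
The lookup step can be sketched as a single keyed query. The example below uses an in-memory SQLite database as a stand-in for the MySQL database used by the system; the table layout, the sample rows, and the fallback message for unmatched questions are hypothetical.

```python
import sqlite3

# Assumption: the reply sent when no template matches is not specified in the text.
FALLBACK = "Sorry, no matching question was found."


def build_store():
    """Create an in-memory store with one answer per question template."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE faq (template_id TEXT PRIMARY KEY, answer TEXT NOT NULL)"
    )
    # Placeholder rows for illustration only.
    conn.executemany(
        "INSERT INTO faq VALUES (?, ?)",
        [
            ("course_fee", "The course fee is listed on the registration form."),
            ("register_date", "Registration dates are announced by the department."),
        ],
    )
    return conn


def answer_for(conn, template_id):
    """Return the stored answer for the matched template, or the fallback."""
    if template_id is None:
        return FALLBACK
    row = conn.execute(
        "SELECT answer FROM faq WHERE template_id = ?", (template_id,)
    ).fetchone()
    return row[0] if row else FALLBACK
```

Because the answer is keyed by the template rather than by the question text, many differently worded questions can share one stored answer.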

Algorithms
         Disemvoweling
         SMS Abbreviation Replace
         Stop Words
         Template Matching
         MD5
         Top-down parser


Software and hardware requirements

         Hardware :
         GSM Modem

         Software :
         Java JDK 1.6
         Apache Tomcat 6
         MySQL 5
         NetBeans 7.0