dosrank.tex 2.67 KB
 yorn committed Jul 03, 2014 1 2 3 4 5 \chapter{DOSRank} \label{chp:dosrank} To test Giraph, two simple algorithms, DOSRank and ReverseDOSRank are created, which count the amount of respectively incoming and outgoing edges from or to a vertex. This proof of concept will count the amount of flows per \gls{service}.  yorn committed Jul 04, 2014 6 It can be used to detect if a machine is executing or suffering from a \gls{DoS} attack,  yorn committed Jul 03, 2014 7 8  as these are typically identified by a large number of flows. These two algorithms will count respectively the amount of outgoing and the amount of incoming edges.  yorn committed Jul 04, 2014 9 10 DOSRank~(\ref{eq:dosrank}), which will calculate outgoing edges (incoming connections) is trivial; it will set its value during the first superstep and then vote to halt. ReverseDOSRank~(\ref{eq:reversedosrank}) is a bit more complex;  yorn committed Jul 03, 2014 11 12 13 14 15 16 17 18 19 20 21 22 23  since incoming edges are not visible in Giraph, the first superstep is used to send a message over every edge. During the second superstep, every vertex will receive an amount of messages that is equal to its amount of incoming edges. \label{eq:dosrank} f(x) = N^{-}(x). \label{eq:reversedosrank} f(y) = N^{+}(y).  yorn committed Jul 04, 2014 24 \begin{algorithm}[h]  yorn committed Jul 03, 2014 25  \caption{DOSRank}  yorn committed Jul 04, 2014 26  \label{alg:dosrank}  yorn committed Jul 03, 2014 27 28 29 30 31  \begin{verbatim} result = vertex.getNumEdges(); vertex.voteToHalt(); \end{verbatim} \end{algorithm}  yorn committed Jul 04, 2014 32 \begin{algorithm}[h]  yorn committed Jul 03, 2014 33  \caption{ReverseDOSRank}  yorn committed Jul 04, 2014 34  \label{alg:reversedosrank}  yorn committed Jul 03, 2014 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57  \begin{verbatim} if superstep = 0 then begin result = 0; sendMessageToAllEdges(); end else begin while(messages.hasNext) do message.next(); result++; end; vertex.voteToHalt(); end. \end{verbatim} \end{algorithm} \section{Expected results} The algorithm will find IP addresses that are associated with large amounts of flows. Opening a large amount of flows is different from sending large amounts of data, as sending a large amount of data is often a completely legitimate thing to do: For example, sharing files via peer-to-peer protocols or serving large files over HTTP.  yorn committed Jul 04, 2014 58 59 60 A large amount of flows; however, may be an indication that something is amiss. It can indicate port scanning, or even targeted attacks like SYN flooding; however, a large amount of flows can also simply indicate a very active or popular host.  yorn committed Jul 03, 2014 61 62 63 64  \section{Purpose} These algorithms are just a proof-of-concept to show that graph systems can be used for traffic analysis. The algorithm itself is most likely easier to implement using other systems.  yorn committed Jul 04, 2014 65 66 A simple MapReduce algorithm could probably yield the same results a smaller amount of time~\cite{Morken352472}; however, this algorithm does show that Giraph can be used for traffic analysis,  yorn committed Jul 03, 2014 67 68 69  and it opens the door for more complex algorithms, like SpreadRank.