\chapter{Introduction}\label{intro}
Today's field of computer networking and its research is heavily centered around the underlying architecture of the Internet and its protocol suite, known as TCP/IP after its two most prominent protocols, the Transmission Control Protocol (TCP) and the Internet Protocol (IP). While TCP/IP has remained in use for several decades and appears to work as intended, there has been a growing trend in the research community of introducing new network architectures. This thesis aims to analyze several examples of such architectures and to contribute to the implementation of one of them.
Network simulation is an ideal approach for examining new network architectures since it provides a quick and efficient way of setting up test scenarios and observing all aspects of their behavior. The discrete event network simulation framework OMNeT++\footnote{http://www.omnetpp.org} has been chosen as the target implementation platform.
\section{Goals}
The theoretical part of this thesis aims to describe some alternatives to the currently prevalent network architectures. Since the Internet is by far the largest and most important example of an internetwork, its underlying architecture will be used as the baseline for comparison. This is only fitting, since nearly all recent network architecture research is directed towards improving the Internet's technology stack. The description of each architecture includes information about prototyping efforts, both on real-life platforms and in the field of network simulation.
The technical report describes the design and implementation of a component of one of the presented architectures, the Recursive InterNetwork Architecture (RINA), for the OMNeT++ framework.
\section{Thesis Structure}
Chapter~\ref{problems} describes the shortcomings and weak points of current Internet technology which create the need for alternative architecture research. This overview then serves in the next chapter as a reference point for evaluating the contributions of alternative architectures.
Chapter~\ref{archs} provides a brief analysis and evaluation of several new network architectures. It also documents their prototyping efforts, both in real settings and in simulation.
Chapter~\ref{forwarding} takes a closer look at parts of RINA that are related to the implementation task of this thesis.
Chapter~\ref{implementation} describes the implementation of RINA's Relaying and Multiplexing Task for OMNeT++.
Chapter~\ref{testing} presents an evaluation of the implementation in the form of sample test scenarios and their outputs.
\chapter{Problems of The Current Internet}\label{problems}
The Internet could be considered one of the most important technological achievements of the 20th century. It has brought a previously unimaginable degree of interconnection and information access to the whole world, and its importance keeps growing decades after its inception. Nevertheless, the very core of its technology was constructed over three decades ago, in the era of the first small experimental networks such as ARPANET and CYCLADES~\cite{Kurose}, when the demands on internetworking capabilities were nowhere near today's.
During the Internet's growth, whenever a problem required a solution, it was usually dealt with in a non-intrusive, evolutionary fashion by applying a new principle on top of the underlying technology. In other words, problems have mostly been solved by adding a new protocol to the TCP/IP protocol stack.
The Internet's evolution can be illustrated with EvoArch~\cite{EvoArch}, an abstract model for studying protocol stacks and their evolution. The model suggests that the Internet protocol stack resembles an hourglass (see Figure~\ref{fig:inet_hourglass}). While the top and bottom layers are often expanded with new protocols, the usage of the internet layer's IP protocol remains constant, because it acts as a common technology for all the different networks interconnected by the Internet.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.5\textwidth]{fig/problems_hourglass.pdf}
\caption{Internet's ``hourglass architecture''.}
\label{fig:inet_hourglass}
\end{center}
\end{figure}
The evolutionary approach to improving the Internet's base technology is convenient, since each paradigm shift in the foundations of the Internet (i.e. replacing the ``thin waist'' of the hourglass) can require a long and expensive migration of existing network configurations to the new technology. The most notable example is the internet layer protocol IPv6, which requires explicit firmware support from active network components. The problem of IPv4 address space exhaustion has been known since the first half of the 1990s~\cite{rfc1631} and the first formal IPv6 specification appeared in 1998~\cite{rfc2460}; yet, as of April 2015, four years after the top-level IPv4 pool exhaustion~\cite{ipv4_exhaustion}, IPv6 still represents only a minuscule fraction of the total traffic on the Internet. For example, Google's IPv6 adoption statistics indicate around 6\% coverage amongst the users of its services~\cite{ipv6stats}.
As such, some of the Internet's widely recognized problems are inherent to its base design, and it is usually difficult, if not impossible, to solve them in a non-intrusive and backward-compatible way. The following sections illustrate such problems.
\section{Incomplete Naming Scheme}\label{problems:naming}
In 1982, Jerome Saltzer, in his work ``On the Naming and Binding of Network Destinations''~\cite{rfc1498}, described the entities and relationships that make up a complete naming and addressing scheme in networks. According to Saltzer, there are four elements that need to be identified: \emph{applications}, \emph{nodes}, \emph{points of attachment} to the network (PoAs) and \emph{paths}. The relationships between them are illustrated in Figure~\ref{fig:saltzer}. At the time, network architectures such as CYCLADES, XNS, DECNET and OSI conformed to this scheme~\cite{internet_demo}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.8\textwidth]{fig/problems_saltzer.png}
\caption{Jerome Saltzer's view of computer networking.}
\label{fig:saltzer}
\end{center}
\end{figure}
TCP/IP does not follow this proposal: the layer for node naming is completely missing. While TCP/IP does work with two distinct layers with their own address scopes -- the link layer with physical addresses and the internet layer with IP addresses -- both effectively identify the same thing: a host interface, i.e. a PoA. This means that IP addressing is semantically overloaded to represent both identity and location. The need for an explicit identifier$\rightarrow$locator mapping was recognized as early as the ARPANET era, where it was addressed by maintaining a globally available file \texttt{HOSTS.TXT} containing mappings of alphabetic host names to IP addresses; this method was later superseded by the Domain Name System (DNS). However, both approaches move the matter of node naming into the application layer, forcing applications to work with location-dependent PoA addresses.
The fact that Internet forwarding is location-based instead of identity-based greatly increases the difficulty of multihoming\footnote{Multihoming refers to a computer or device connected to more than one computer network. Such a computer or device generally needs a separate interface for each network.} (Section~\ref{problems:multihoming}) and mobility (Section~\ref{problems:mobility}).
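A toy sketch illustrates the consequence (the host name and addresses are hypothetical, and the lookup is a stand-in for \texttt{HOSTS.TXT} or DNS, not a real resolver): resolution hands the application a bare PoA address, and nothing at the network level ties together the addresses of a multihomed node.

```python
# Toy model of TCP/IP name resolution: an application-level name maps
# straight to PoA (interface) addresses; no node identity exists between.

HOSTS = {  # analogous to the historical HOSTS.TXT file
    "server.example": ["192.0.2.10", "198.51.100.7"],  # a multihomed host
}

def resolve(name: str) -> str:
    """Return the first PoA address for a name, like a simple A-record lookup."""
    return HOSTS[name][0]

# The application now holds a location-dependent address. If interface
# 192.0.2.10 fails, nothing in the architecture relates it to 198.51.100.7.
addr = resolve("server.example")
```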
\section{Lack of Multihoming}\label{problems:multihoming}
Since IP addresses serve as points of attachment (i.e. they identify network interfaces) and routing is done exclusively on the internet layer, there is no inherent mechanism for distinguishing whether multiple IP addresses identify a common node. Thus, in effect, multihoming on IP alone is not feasible.
The insufficient base for multihoming support is one of the oldest recognized problems of the Internet: it became apparent back in 1972, when Tinker Air Force Base joined ARPANET and requested redundant connections to a single node to ensure reliability~\cite{Patterns}. In spite of this, the switch to a new protocol suite 11 years later (on January 1, 1983, the ``flag day'') did not bring any solution to this problem.
Since then, several attempts have been made to implement multihoming on top of the current architecture.
\begin{itemize}
\item \textbf{SCTP.}
Message-oriented transport protocol Stream Control Transmission Protocol (SCTP)~\cite{rfc4960} provides a partial support: two SCTP hosts are able to provide each other with lists of fallback IP addresses that may be used in case of the primary address going offline. However, thus far, multiple reasons have been preventing SCTP from becoming a widely known and used solution; the most notable disadvantage lies in the fact that due to TCP/IP not recognizing a distinct session layer on top of its transport layer (such as in ISO/OSI stack), the transport protocol has to be explicitly specified by the application using the BSD sockets Application Programming Interface (API). Therefore, SCTP adoption would require a rewrite of network-aware applications themselves. Other SCTP adoption issues include unsatisfactory operating system support (Microsoft Windows systems require a third-party kernel driver) and weak awareness of its existence outside the networking community.
\item \textbf{BGP Multihoming.} Another implementation of multihoming capabilities can be seen in Border Gateway Protocol (BGP)~\cite{rfc4271}, which provides a means for load-balancing and fallback over multiple links on T1 networks. To make use of such multihoming over the Internet, a public IP address range and an Autonomous System number are required. BGP Multihoming is one of the most significant causes contributing to the growth of the global Internet routing table.
\item \textbf{Multipath TCP.} The most recent TCP/IP multihoming initiative is the TCP extension Multipath TCP (MPTCP) which is currently in its experimental phase, although a large scale commercial deployment has been already made by Apple for its Siri network application in mobile operating system iOS 7~\cite{Apple_MP}.
\end{itemize}
\section{Lack of Mobility}\label{problems:mobility}
Since nodes are identified solely by the IP addresses of their interfaces (i.e. their points of attachment to the network), mobility support is essentially non-existent.
There have been three distinct approaches to solving the mobility problem~\cite{MobilityFirst}.
\begin{itemize}
\item \textbf{Mobility by indirection.} A fixed host or device is dedicated for keeping track of mobile devices and forwarding traffic to them. This leads to path inflation.
This approach is used by technologies such as Mobile IP or Locator/Identifier Separation Protocol (LISP).
\item \textbf{Global name resolution.} The endpoint identifier is resolved to its network location by looking up a logically centralized service.
This approach is used by technologies such as eXpressive Internet Architecture (described in Section~\ref{archs:xia}) or MobilityFirst (described in Section~\ref{archs:mf}).
\item \textbf{Name-based routing.} The network locator is not used at all and routing is done on names.
This approach is used by technologies such as Named Data Networking (described in Section~\ref{archs:ndn}).
\end{itemize}
The second and third approaches require a clean-slate architectural design.
\section{Lack of Security Mechanisms}\label{problems:security}
The specifications of the fundamental protocols of the TCP/IP stack -- most notably IP and TCP -- were completed in the early 1980s. The Internet has since turned into a massive world-wide internetwork connecting people of different types and agendas. Naturally, once the Internet began to be used for transferring sensitive data (especially by companies), cyber crime started to emerge as well, and attention turned to the security aspects of the Internet protocols (or the lack thereof).
As the IP protocol lacks any inherent verification mechanism, the internet layer is susceptible to hijacking and spoofing. This opens the door to many types of exploits, most notably man-in-the-middle attacks and IP address spoofing~\cite{rfc1948}.
Since security is not enforced by the architecture in any way, even the application layer with its plethora of protocols commonly suffers from security problems. Protocols like TCP remain widely used and vulnerable to attacks such as the TCP sequence prediction attack~\cite{rfc1948}. Today, application protocol security is usually achieved by ``wrapping'' protocols in cryptographic protocols such as Transport Layer Security (TLS) or Datagram Transport Layer Security (DTLS).
\section{Routing Table Size Growth}
The default-free zone (DFZ) is the collection of all Internet autonomous systems (ASs) that do not require a default route to forward a packet to any destination. Since these systems comprise the root of the Internet's routing infrastructure, their routing tables must be complete.
With the increasing number of hosts connected to the Internet, the DFZ routing table sizes grow as well (see Figure~\ref{fig:bgp-growth}).
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.6\textwidth]{fig/problems_bgp-growth.png}
\caption{Global Internet routing table size growth~\cite{bgpgrow}.}
\label{fig:bgp-growth}
\end{center}
\end{figure}
While the exponential growth observed during the 1990s was later mitigated by the mass deployment of Classless Inter-Domain Routing (CIDR) and BGP route aggregation, the number of entries is still increasing superlinearly and high-end router hardware needs to keep up, especially with the increasing use of BGP-based multihoming and IPv6. This can lead to scalability problems, as in August 2014, when multiple routers reaching their 512k-entry limit caused globally observable outages~\cite{512k_day}.
As of April 2015, the Internet routing table contains over 560k entries~\cite{bgpgrow}. There are concerns about whether high-end router technology will keep scaling along with Moore's law to keep route management efficient~\cite{rfc4984}.
\chapter{Alternative Network Architectures}\label{archs}
This chapter provides descriptions of several new network architectures.
Due to the limited scope of this thesis, the chapter describes and evaluates only a small representative subset of new architectures. The subset consists of the projects funded by the Future Internet Architecture -- Next Phase program~\cite{fia} (Named Data Networking~\cite{ndn} [Section~\ref{archs:ndn}], MobilityFirst~\cite{MobilityFirst} [Section~\ref{archs:mf}] and eXpressive Internet Architecture~\cite{xia} [Section~\ref{archs:xia}]) and the Recursive InterNetwork Architecture~\cite{Patterns} [Section~\ref{archs:rina}].
Special attention will be given to Recursive InterNetwork Architecture as one of its components is the implementation goal of this thesis.
\section{Design Approaches}
The networking research community has made many attempts to move the field forward. These research directions are usually classified into one of two groups:
\begin{itemize}
\item \textbf{Evolutionary design.} Backward-compatible solutions that are incrementally deployable on top of the current Internet (e.g. LISP or DiffServ).
\item \textbf{Clean slate design.} Completely new standalone architectures that are not constrained by Internet technology's limitations.
\end{itemize}
Considering the scope of this thesis, the focus will be exclusively on ``clean-slate'' architectures.
\section{Named Data Networking}\label{archs:ndn}
Named Data Networking is one instance of a more general research direction called \emph{Information-centric networking} (ICN)~\cite{icn}. ICN explores the possibilities of moving the Internet infrastructure away from its host-centric communication paradigm towards the idea of named content.
\subsection{Premise}
During its early years, ARPANET was heavily influenced by telecommunication technology, as the Public Switched Telephone Network (PSTN) was the only example of a large-scale network. Due to this, the technology of the Internet has been built on the paradigm of end-to-end communication between network addresses. However, while this base paradigm has remained constant over the decades, the way we use the Internet has changed considerably: the Internet now serves primarily as a content distribution network. Since data retrieval over the Internet is based on creating end-to-end communication channels and transferring data through them, the content itself is opaque to the network, which generates an enormous amount of data redundancy.
Named Data Networking proposes a solution better fitting today's needs: instead of working with source/destination addresses, the ``thin waist'' of the Internet should be based on names of data chunks (as illustrated in Figure~\ref{fig:ndn_waist}).
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.7\textwidth]{fig/archs_ndn-hourglass.pdf}
\caption{``Thin waists'' of the current Internet and NDN.}
\label{fig:ndn_waist}
\end{center}
\end{figure}
\subsection{Concepts}
\subsubsection{The Building Blocks}
The NDN architecture specifies:
\begin{itemize}
\item two types of packets: an \emph{interest packet} containing the name of desired data and a \emph{data packet} containing the requested data (shown in Figure \ref{fig:ndn-packets}),
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.6\textwidth]{fig/archs_ndn-packets.pdf}
\caption{NDN packet types.}
\label{fig:ndn-packets}
\end{center}
\end{figure}
\item two types of hosts: a \emph{consumer} (data requester) and a \emph{producer} (data provider), and
\item a \emph{router} maintaining three fundamental data structures:
\begin{itemize}
\item \emph{Forwarding Information Base} (FIB) (forwarding table)
\item \emph{Pending Interest Table} (PIT) (data request management)
\item \emph{Content Store} (data cache)
\end{itemize}
\end{itemize}
\subsubsection{The Communication Model}
Communication in NDN is driven by the data receiver, i.e. the consumer. The steps are:
\begin{enumerate}
\item The consumer sends out an interest packet containing the name of the desired data.
\item When a router receives the interest packet, it first consults its Content Store for requested data.
\begin{itemize}
\item If the data requested by the interest packet are present, they are returned in the direction of the requesting interface.
\item Otherwise, it looks up the PIT.
\begin{itemize}
\item If an entry is present for the named data request, the entry is updated by adding the originating interface to the list of requesting interfaces, thus aggregating the new request with the existing one.
\item Otherwise, a new entry is inserted, a FIB lookup is made and the interest packet is forwarded to the interface(s) returned by the FIB.
\end{itemize}
\end{itemize}
\end{itemize}
\item A data packet is returned to the router by either the producer or another router with cached data. The router finds a matching PIT entry and forwards the data to all interfaces listed in the PIT entry. The PIT entry is then removed and data are cached into the Content Store.
\end{enumerate}
\subsubsection{Naming}
NDN assumes data chunk names to be hierarchically structured. Consumers must be able to deterministically construct the name for a desired piece of data without having previously seen the name or data. This can be achieved by a deterministic algorithm allowing both consumer and producer to construct the same name based on data available to both.
The management of such namespace is not defined by the architecture itself and should be a subject of further research.
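One possible deterministic construction can be sketched in a few lines; the version/segment naming convention shown here is an illustrative assumption, not a fixed part of the NDN architecture:

```python
# Both consumer and producer derive the same hierarchical chunk name from
# shared knowledge: a content prefix, a version and a segment number.

def chunk_name(prefix: str, version: int, segment: int) -> str:
    return f"{prefix}/v{version}/s{segment}"

# A consumer can request successive segments without having seen the
# names (or the data) beforehand:
names = [chunk_name("/example/videos/demo.mpg", 1, s) for s in range(3)]
```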
\subsection{Current State of Implementation}
NDN's implementation efforts are open-source and available as a package called \texttt{NDN Platform}\footnote{http://named-data.net/codebase/platform/}. The package contains a C++ library (\texttt{ndn-cxx}), the NDN Forwarding Daemon (\texttt{NFD}), client libraries for C++, Python, Java and JavaScript, NLSR routing protocol, NDN repository and additional networking tools (a ping-like application, a traffic generator and a traffic capture tool).
\texttt{ndnSIM}\footnote{http://ndnsim.net/2.0/}, based on \texttt{ns-3}\footnote{https://www.nsnam.org/}, is a network simulator using \texttt{ndn-cxx} and \texttt{NFD} as the architecture backend. \texttt{ndnSIM} extends \texttt{ns-3} with a new network-layer protocol model which can be used atop any available link-layer protocol, thus providing a flexible solution for simulating various development scenarios.
\section{MobilityFirst}\label{archs:mf}
\subsection{Premise}
The current Internet was designed for interconnecting fixed endpoints and fails to address the dramatically increasing demands of mobile devices and services. MobilityFirst, as its name suggests, aims to provide a means for better mobility, while also introducing intrinsic security properties and facilitating services.
\subsection{Concepts}
MobilityFirst is based on three basic principles: the separation of locator and identifier (i.e. node name and PoA address), intrinsic security and global name resolution.
\subsubsection{Locator/Identifier Separation}
MobilityFirst's ``thin waist'' consists of location-independent names and a global name service mapping them to addresses. A name is a \emph{globally unique identifier} (GUID) that can identify a variety of \emph{principals}, such as an interface, a node, a service, an end-user or content. An example of a principal is a \emph{network address} (NA), a network identifier resembling the Internet's autonomous systems.
\subsubsection{Intrinsic Security}\label{archs:mf:security}
GUIDs are self-certifying, so any principal can authenticate another principal without relying on an external authority. This is achieved through a bilateral challenge-response mechanism.
For example, suppose principal \texttt{X} wants to verify the authenticity of principal \texttt{Y} before establishing a connection to it:
\begin{enumerate}
\item \texttt{X} sends a random nonce $n$ to \texttt{Y},
\item \texttt{Y} responds with $\{pubkey, privkey(n)\}$,
\item if $hash(pubkey)$ equals \texttt{Y}'s GUID and $pubkey(privkey(n)) = n$, \texttt{Y} is authenticated.
\end{enumerate}
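The exchange above can be made runnable with a toy model. Textbook RSA with tiny, insecure parameters is used here purely so the sketch executes end to end; an actual deployment would use a production signature scheme:

```python
import hashlib
import random

# Toy RSA key pair: n = 61 * 53 = 3233, phi = 3120, e * d = 1 (mod phi).
# Insecure by construction -- illustration only.
E, D, N = 17, 2753, 3233

def guid_of(pubkey):
    """Self-certifying GUID: a hash of the public key."""
    e, n = pubkey
    return hashlib.sha256(f"{e}:{n}".encode()).hexdigest()

Y_PUB, Y_PRIV = (E, N), (D, N)
Y_GUID = guid_of(Y_PUB)               # X knows Y only by this GUID

# 1. X sends a random nonce to Y.
nonce = random.randrange(2, N)

# 2. Y responds with {pubkey, privkey(n)}: the nonce signed with its private key.
signature = pow(nonce, Y_PRIV[0], Y_PRIV[1])

# 3. X checks hash(pubkey) against the known GUID and unwraps the signature.
authenticated = (guid_of(Y_PUB) == Y_GUID
                 and pow(signature, Y_PUB[0], Y_PUB[1]) == nonce)
```

The key property is that both checks need only public information: the GUID itself commits to the public key, so no certificate authority is consulted.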
\subsubsection{Name Resolution}
MobilityFirst defines a naming service for dynamic mapping of GUIDs to network addresses with real-time response latencies. Unlike today's DNS with its reliance on a single root authority (ICANN), the naming service is decentralized.
In addition to this, a principal can also be assigned an optional human-readable name which is bound to its public key by a name certificate. In this case, the certificate has to be obtained from a trusted certification authority.
The system encompassing both the name resolution service and the name certification service is called the \emph{Global Name System} (GNS).
\subsubsection{The Communication Model}
\begin{enumerate}
\item To contact a GUID, the sending endpoint queries the GNS to obtain an NA corresponding to a GUID (much like it queries DNS to obtain an IP address for a domain name).
\item The sending endpoint then begins sending data, using the tuple \texttt{[GUID, NA]} (which is a routable destination identifier) in packet headers.
\end{enumerate}
Senders can also send a packet addressed just to a GUID, thereby implicitly delegating to the first-hop router the task of querying the name service for an NA.
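This sending path can be sketched compactly; the GNS is mocked as a dictionary, and the GUID and NA values are hypothetical:

```python
# Late binding of GUID to network address: the sender queries the GNS and
# stamps packets with the routable [GUID, NA] tuple. If the lookup is
# skipped, the header carries only the GUID and the first-hop router is
# implicitly delegated the resolution.

GNS = {"guid:sensor-17": "NA:campus-east"}   # mocked name-service entries

def make_header(dst_guid, resolve=True):
    na = GNS.get(dst_guid) if resolve else None
    return (dst_guid, na)                    # na is None for GUID-only sends

hdr = make_header("guid:sensor-17")          # ("guid:sensor-17", "NA:campus-east")
```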
\subsection{Current State of Implementation}
The MobilityFirst prototype is available on request from the project leaders and consists of the following components:
\begin{itemize}
\item \texttt{msocket}, an endpoint socket library extending the BSD sockets API.
\item \texttt{Auspice}, a GNS implementation.
\item Two prototypes of the forwarding plane: one based on the Click modular router\footnote{http://www.read.cs.ucla.edu/click/click}, the other based on OpenFlow.
\end{itemize}
There is no tool available for simulation.
\section{eXpressive Internet Architecture}\label{archs:xia}
\subsection{Premise}
As presented in the previous sections, some of the future architecture research is centered around the idea of replacing the Internet's ``thin waist'' of end-to-end communication with a different principal or set of principals (e.g. NDN and its named content). eXpressive Internet Architecture (XIA) takes this approach one step further and proposes a novel principle: the ``thin waist'' should support multiple principals and be able to evolve by accommodating new principals over time.
\subsection{Concepts}
XIA is built around three basic principles: an evolvable thin waist (achieved by configurable principal types), intrinsic security and a flexible addressing mechanism (achieved by DAG addressing).
\subsubsection{Principal Types}
An \emph{XIA principal} is specified by the semantics of communication between principals of the same type, the processing required to forward traffic with addresses of its type, and a unique \emph{XIA identifier} (XID). The initial XIA architecture defines four basic principal types:
\begin{itemize}
\item \textbf{Host XID (HID).} HIDs support unicast host-based communication similar to IP where the host identifier is a hash of the host’s public key. HIDs define who you communicate with.
\item \textbf{Service XID (SID).} SIDs support communication with (typically replicated) services and realize an anycast forwarding scheme. SIDs define what entities do.
\item \textbf{Content XID (CID).} CIDs allow hosts to retrieve content from ``anywhere'' in the network, e.g., content owners, CDNs, caches, etc. CIDs are defined as the hash of the content, so the client can verify the correctness of the received content. CIDs define what it is.
\item \textbf{Network XID (NID).} NIDs specify a network, i.e., an autonomous domain, and they are primarily used for scoping. They allow an entity to verify that it is communicating with the intended network.
\end{itemize}
Apart from the above, other basic types have been introduced and experimented with, e.g. 4IDs replicating IPv4 addresses.
\subsubsection{Intrinsic Security}
Security properties are included in the principal type definitions, and entity validation is achieved through the use of self-certifying identifiers in a manner similar to MobilityFirst (see Section~\ref{archs:mf:security}).
\subsubsection{DAG Addressing}
Addressing in XIA is realized using Directed Acyclic Graphs (DAGs). In their simplest form, address DAGs may specify only a packet's destination ID, as in traditional network architectures (\ref{fig:xia_dag}a). In addition, however, they can contain scoping information (e.g. a target network plus a service located in that network [\ref{fig:xia_dag}b]) or fallback alternatives (e.g. a network XID to be used for forwarding when the given SID is not recognized [\ref{fig:xia_dag}c]).
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.4\textwidth]{fig/archs_xia-dag.pdf}
\caption{DAG-based addressing in XIA.}
\label{fig:xia_dag}
\end{center}
\end{figure}
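The fallback behavior can be modeled in a few lines. The XID values, the DAG shape and the routing tables below are hypothetical; edges are listed in priority order:

```python
# A DAG address as adjacency lists: each node lists its outgoing edges in
# priority order. A router forwards on the first edge it has a route for,
# so an unrecognized SID falls back to the scoping NID.

DAG = {
    "start": ["SID:video", "NID:example"],   # intent first, fallback second
    "NID:example": ["SID:video"],            # inside the network, retry the SID
    "SID:video": [],                         # final intent, no outgoing edges
}

def next_hop(dag, node, routable):
    """Pick the first outgoing XID present in this router's table."""
    for xid in dag[node]:
        if xid in routable:
            return xid
    return None

# A core router without a route to the SID still makes progress via the NID:
hop = next_hop(DAG, "start", {"NID:example"})    # -> "NID:example"
```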
\subsection{Current State of Implementation}
XIA's prototypes are open-source and publicly hosted on GitHub. One is a Click-based prototype which includes the XIA protocol stack, a routing daemon, a nameserver and an API. The other is a native Linux network stack implementation with attempts to port different architectures to XIA.
There is no tool available for simulation.
\section{Recursive InterNetwork Architecture}\label{archs:rina}
Recursive InterNetwork Architecture (RINA) is an architecture based on a set of principles described by John~Day in his book \emph{Patterns in Network Architecture}~\cite{Patterns}.
\subsection{Premise}
RINA introduces a new perspective on computer networking:
\begin{quotation}
\centering
Computer networking is a recursively scalable set of distributed applications specialized to do inter-process communication.
\end{quotation}
\subsection{Concepts}
RINA specifications define a rich set of new concepts, based mostly on the theory of distributed computing. They are described in the following sections.
\subsubsection{Distributed Application Facility}
\emph{Distributed Application Facility} (DAF) is a collection of two or more cooperating application processes in one or more computing systems, which exchange information using \emph{inter-process communication} (IPC) and maintain shared state.
\subsubsection{Distributed IPC Facility}
\emph{Distributed IPC Facility} (DIF) is a collection of two or more application processes cooperating to provide IPC. A DIF's application processes are called IPC processes; a DIF is thus a DAF that does IPC. The DIF provides IPC services to application processes via a set of API primitives that are used to initiate flows and exchange data with the application's peer.
Since DIFs are conceptually DAFs as well, their application processes can also exchange information using other DIFs. This yields a recursive structure.
A DIF is essentially RINA's equivalent of an abstraction layer. However, unlike the traditional network architectures such as the seven-layer ISO/OSI, RINA has only one layer which ``vertically repeats''. This also requires a different kind of notation: instead of using absolute layer identifiers such as \emph{Layer 2} or \emph{Layer 7}, RINA's DIFs are referred to in a relative manner in relation to a specific level, e.g. \emph{(N)-DIF}, \emph{(N+1)-DIF} or \emph{(N-2)-DIF}.
A computing system may be a member of zero or more DIFs and has a separate IPC process for each of them. Each (N)-DIF handles data coming from (N+1)-DIFs in the same way as data coming from an application.
A bare minimum for a computer internetwork consists of three levels of DIFs and three types of devices: a \emph{host}, an \emph{interior router} and a \emph{border router}. The internetwork is shown in Figure~\ref{fig:rina_internetwork}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.6\textwidth]{fig/archs_rina-net.png}
\caption{An example of a RINA internetwork with 3 levels of DIFs.}
\label{fig:rina_internetwork}
\end{center}
\end{figure}
As the architecture is recursively scalable, this model can be expanded further: for example, other DIFs can be added on top of the stack in hosts for creating another scope of communication (e.g. for VPN-like facilities).
A DIF operates in its own scope, isolated from other DIFs of the same level. Each DIF therefore maintains its own distinct namespace and configuration (such as policies related to security and data transfer). When a computing system wants to communicate with a system inside a foreign DIF, it first needs to go through a process of enrollment into that DIF.
\subsubsection{IPC Process}
Each IPC process executes routing, transport, security/authentication and management functions. The components of an IPC process responsible for providing these functions can be categorized under three decoupled parts operating at separate timescales: \emph{IPC transfer}, \emph{IPC control} and \emph{IPC management} (see Figure~\ref{fig:rina_ipcp}). The behaviour of each part can be configured via policies.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.6\textwidth]{fig/archs_rina-ipcp.pdf}
\caption{Parts of RINA's IPC process.}
\label{fig:rina_ipcp}
\end{center}
\end{figure}
\begin{itemize}
\item IPC transfer consists of following modules:
\begin{itemize}
\item \textbf{\emph{Delimiting.}} Marks boundaries on incoming (N+1)-SDUs.
\item \textbf{\emph{Error and Flow Control Protocol} (EFCP).} A Delta-T~\cite{deltat} based protocol that handles individual data flows. In general, EFCP instances exchange data with each other via PDUs.
\item \textbf{\emph{Relaying and Multiplexing Task} (RMT).} A stateless function that 1) retrieves PDUs from EFCP instances and management tasks and multiplexes them onto a common (N-1)-flow, and 2) takes incoming PDUs and either relays them within the current IPC process or passes them to outgoing port(s).
\end{itemize}
\item IPC control is an optional mechanism handling flow control (for example, TCP-like flow control could be implemented here via appropriate policies). Its functionality is encompassed in the EFCP module.
\item IPC management handles management tasks such as routing, resource allocation or access control. It consists of following modules:
\begin{itemize}
\item \textbf{\emph{Resource Allocator} (RA).} Monitors the resource allocation and performance of the IPC Process and makes adjustments to its operation to keep it within the specified operational range.
\item \textbf{\emph{Resource Information Base daemon} (RIB).} The heart of DIF management. Receives/sends management messages and notifies other submodules about management changes.
\end{itemize}
\end{itemize}
\subsubsection{Addressing}
Since each DAF operates within its own address scope, the architecture needs to provide a means of resolving (N)-application names to addresses of (N-1)-IPC processes. Such mappings are stored in IPC process's Directory. Directories are managed by a decentralized distributed \emph{Name Space Manager} (NSM) embedded in each DIF.
NSM maintenance is one of the tasks of IPC management; each IPC process keeps track of locally registered applications, and some processes are specialized to maintain an aggregated repository of non-local mappings to serve as forwarders (such processes are called \emph{Repository IPC Processes}). This is illustrated in Figure~\ref{fig:rina_nsm}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.6\textwidth]{fig/archs_rina-nsm.png}
\caption{Distributed mapping of applications to addresses in a RINA network.}
\label{fig:rina_nsm}
\end{center}
\end{figure}
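The Directory resolution just described can be condensed into a small sketch: a lookup first consults the locally registered applications and, on a miss, defers to a Repository IPC Process. The \texttt{Directory} type and its fields are illustrative assumptions, not RINA or RINASim API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Illustrative sketch of the NSM Directory described above: each IPC
// process knows its locally registered applications and, on a miss,
// defers to an aggregated repository (a Repository IPC Process).
using AppName = std::string;
using Address = std::string;

struct Directory {
    std::map<AppName, Address> local;       // locally registered applications
    const Directory* repository = nullptr;  // fallback Repository IPC Process

    // Resolve an (N)-application name to an (N-1)-IPC process address;
    // returns "" when the name is unknown to the whole NSM.
    Address resolve(const AppName& name) const {
        auto it = local.find(name);
        if (it != local.end()) return it->second;
        if (repository) return repository->resolve(name);
        return "";
    }
};
```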
\subsubsection{The Communication Model}\label{archs:rina:communication}
A connection between two applications in RINA needs to go through the initial process of flow allocation.
When an application named \texttt{App1} wants to create a flow with another application named \texttt{App2} reachable via a common DIF, the Allocate procedure proceeds as follows:
\begin{enumerate}
\item \texttt{App1} requests an IPC connection to \texttt{App2} with desired QoS requirements. This request is handled by Flow Allocator of the underlying IPC process with address \texttt{IPC 1}.
\item The Flow Allocator validates the request. If the request is well formed and the IPC process has enough resources to honor it, it is accepted and an EFCP instance is created. Otherwise, an error is returned.
\item The Flow Allocator asks Directory for address of the IPC Process to which \texttt{App2} is mapped. In this case, it is \texttt{IPC 2}.
\item The Flow Allocator asks \texttt{IPC 1}'s Resource Allocator to find a suitable (N-1)-flow to map the new flow to. If there is none, the RA requests an IPC connection to \texttt{IPC 2}.
\item The Flow Allocator instructs RIB to send out an \texttt{M\_CREATE} request to \texttt{IPC 2}.
\item Upon receiving the request, \texttt{IPC 2}'s Flow Allocator delivers the Allocate request to \texttt{App2}. If \texttt{App2} submits a positive response, the Flow Allocator creates an EFCP instance.
\item \texttt{IPC 2}'s Flow Allocator instructs the RIB to send an \texttt{M\_CREATE\_RESPONSE} message back to \texttt{IPC 1}.
\item \texttt{IPC 1}'s RIB receives the \texttt{M\_CREATE\_RESPONSE} message. If it is positive, the Allocate procedure is complete.
\item \texttt{App1} and \texttt{App2} can now use the IPC API to send and receive data SDUs to/from each other. When the communication is over, both of them can invoke the Deallocate call to release allocated resources.
\end{enumerate}
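As a rough single-process sketch of the exchange above (collapsing the distributed \texttt{M\_CREATE}/\texttt{M\_CREATE\_RESPONSE} messaging into direct calls; all type and field names are hypothetical, not the RINASim API), the decision points of the Allocate procedure can be modelled as:

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical, condensed model of the Allocate exchange; not RINASim code.
struct IpcProcess {
    std::string addr;
    int freeResources;                       // crude stand-in for step 2's check
    std::map<std::string, std::string> dir;  // Directory: app name -> IPC address
    bool appAccepting;                       // step 6: would the target app accept?
};

// Returns true when the flow is allocated end to end.
bool allocate(IpcProcess& src, IpcProcess& dst, const std::string& to) {
    if (src.freeResources <= 0) return false;     // step 2: request validation
    auto it = src.dir.find(to);                   // step 3: Directory lookup
    if (it == src.dir.end() || it->second != dst.addr) return false;
    // steps 4-7: (N-1)-flow setup plus M_CREATE/M_CREATE_RESPONSE, collapsed
    if (!dst.appAccepting) return false;          // step 6: App2's decision
    --src.freeResources;                          // EFCP instances now exist
    return true;                                  // step 8: Allocate completes
}
```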
\subsubsection{Policy Separation}
RINA heavily relies on the design approach of separating mechanism and policy: the parts related to authorization of operations and allocation of resources remain constant, while the decisions regarding how to use them are left to policies. In effect, there is only one application protocol (CDAP) and one error and flow control protocol (EFCP). A clear example of the policy separation is EFCP itself, which is a single mechanism configurable for both reliable and unreliable data transfer, as opposed to TCP and UDP, which are two distinct mechanisms.
Because of this, RINA can serve as a platform for evaluating multiple approaches to a given problem just by replacing policies. An example can be observed in RINA's approach to routing, which is a policy by itself, providing a platform for implementing well-known protocols (such as those based on distance vector or link-state algorithms) as well as experimental new paradigms (such as hierarchical and topological routing).
\subsection{Current State of Implementation}
\texttt{ProtoRINA} is Boston University's reference prototype, written in Java and publicly available for download from the project's pages. \texttt{IRATI Stack} is an open-source attempt to port RINA into the Linux kernel network stack.
\texttt{RINASim} is an open-source OMNeT++ implementation of RINA developed by FIT BUT under the PRISTINE\footnote{http://ict-pristine.eu} project. It is publicly hosted on GitHub.
\section{Evaluation}
In the previous sections, I have described four new network architectures. This section puts them into the context of the problems mentioned in Chapter~\ref{problems} and points out their main strengths and weaknesses.
\subsection{Named Data Networking}
The most significant advantage of NDN is its native support for caching all sorts of data inside the network itself. While this is beneficial mostly for static data such as web pages and images, dynamic content can take advantage of it as well, e.g. in the case of multicasting or packet retransmission after packet loss.
With its named data paradigm, NDN renders the problems of node naming, mobility and multihoming irrelevant, because data names remain the same regardless of location. The same holds for security, as the architecture itself requires each data chunk to be cryptographically signed.
Routing table size growth presents a challenge, as the NDN namespace is unbounded and the addressing of named data implies that many more identifiers need to be used for global routing (by some estimations, the number of items in the global routing table might end up as much as four orders of magnitude higher~\cite{NDNvsMF}).
\subsection{MobilityFirst}
MobilityFirst solves the problem of locator/identifier conflation by introducing routable GUIDs and a service for mapping them to network addresses. This in effect eliminates the problems of multihoming and mobility, as traffic may be easily rerouted when one of a host's interfaces becomes unavailable. Security is enforced by the architecture through its concept of self-certifying identifiers and a bilateral challenge-response mechanism.
The concept of routing on NAs and GUIDs appears to mimic the currently used routing on IP network prefixes and host identifiers. However, MobilityFirst attempts to improve on this by researching an internetwork routing design based on a small number of hierarchy levels.
\subsection{eXpressive Internet Architecture}
XIA's answer to the locator/identifier problem (and, in turn, the multihoming/mobility problem) is the HID principal type, which identifies a node by a hash of its public key. Furthermore, multiple approaches to ensuring mobility can be implemented thanks to the flexibility of DAG addressing. Security is achieved by self-certifying principals.
The number of principal types might rise over time, which presents a challenge with regard to routing table size growth. The XIA researchers are experimenting with a forwarding table scheme that should be capable of scaling to billions of entries while saturating 80 gigabytes per second~\cite{xia}.
The novel concept of evolvable ``thin waist'' supporting multiple paradigms at a time requires a control plane capable of handling diversity and incremental deployment. This will be the main aim of further XIA research efforts.
\subsection{Recursive InterNetwork Architecture}
In RINA, the problem of mobility/multihoming is solved inherently by providing a complete naming and addressing schema and by introducing a distributed service for mapping application names to IPC addresses. RINA's scaling by recursion promises much greater global routing scalability than the current Internet offers. Furthermore, the concept of autonomous DIFs implies security by isolation.
In comparison to the other presented architectures, RINA presents the most radical paradigm shift: it disregards a great deal of common computer networking knowledge and builds a new set of principles from the ground up. While this yields the most complete and architecturally clean solution (all of the problems presented in Chapter~\ref{problems} are solved inherently, without aiming for them from the start), it also means that the biggest hurdle facing the adoption of RINA might lie in the unwillingness of the networking community to accept its approach.
\chapter{Forwarding In RINA}\label{forwarding}
This chapter takes a closer look at components of Recursive InterNetwork Architecture that are related to the implementation target of this thesis. This includes a conceptual description of the forwarding and routing principles in RINA and what role RMT plays in it.
\section{Distinction of Forwarding And Routing}
Each IPC process has to solve the forwarding problem: given a set of EFCP PDUs and a number of (N-1)-flows leading to various destinations, to which flow(s) should each PDU be forwarded? In RINA, the decision is handled by the Relaying and Multiplexing Task and its forwarding policy. The action may consist of looking up the PDU's destination in a forwarding table (resembling the forwarding mechanism in traditional TCP/IP routers), but it is not a requirement; other experimental forwarding paradigms, such as forwarding based on topological addressing, may not require a forwarding table at all.
Generating the information necessary for forwarding is one of the tasks of the IPC process's Resource Allocator, namely of its subcomponent called the PDU Forwarding Generator. For this purpose, the Resource Allocator generally uses information provided by other sources, most notably the Routing Policy.
The Routing Policy exchanges information with other IPC processes in the DIF in order to generate a next-hop table (usually keyed by the destination address and the ID of the QoS class a PDU belongs to). The next-hop table is then converted into a PDU Forwarding Table by the Resource Allocator's PDU Forwarding Generator, which selects an (N-1)-flow for each ``next hop''. The Routing Policy may resemble the distance vector and link-state routing protocols used in today's Internet, but current research is also aimed at other paradigms such as topological/hierarchical routing, greedy routing or MANET-like routing.
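A minimal sketch of this conversion follows, assuming a next-hop table keyed by (destination address, QoS class) and a map from neighbor addresses to existing (N-1)-ports. All type names are illustrative, not the RINASim API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Illustrative sketch: routing produces a next-hop table keyed by
// (destination, QoS class); the PDU Forwarding Generator replaces each
// next-hop address with the (N-1)-port of a flow leading to it.
using Addr = std::string;
using QosId = std::string;
using Port = int;

using NextHopTable = std::map<std::pair<Addr, QosId>, Addr>;
using ForwardingTable = std::map<std::pair<Addr, QosId>, Port>;

ForwardingTable buildForwardingTable(const NextHopTable& nextHops,
                                     const std::map<Addr, Port>& flowToNeighbor) {
    ForwardingTable fwd;
    for (const auto& entry : nextHops) {
        auto it = flowToNeighbor.find(entry.second);
        if (it != flowToNeighbor.end())  // skip next-hops with no (N-1)-flow yet
            fwd[entry.first] = it->second;
    }
    return fwd;
}
```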
\section{Relaying and Multiplexing Task}
\subsection{Formal description}
RMT has, as its name suggests, two main responsibilities: relaying and multiplexing of PDUs. The goal of multiplexing is to pass outgoing PDUs (from EFCP instances and management tasks) to the appropriate (N-1)-flows and pass incoming PDUs (from (N-1)-flows) to EFCP instances and management tasks. Relaying handles incoming PDUs from (N-1)-ports\footnote{A handle for an (N-1)-flow, not unlike the traditional BSD socket.} that are not directed to its IPC process and forwards them to other (N-1)-ports using information provided by its forwarding policy. A conceptual model of RMT is presented in Figure~\ref{fig:rina:rmt:model}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.6\textwidth]{fig/fwding_rmt-module.pdf}
\caption{Diagram of Relaying and Multiplexing Task.}
\label{fig:rina:rmt:model}
\end{center}
\end{figure}
RMT instances in hosts and bottom layers of routers usually perform only the multiplexing task (Figure~\ref{fig:rina:rmt:devices}a), while RMTs in top layers of interior routers (Figure~\ref{fig:rina:rmt:devices}b) and border routers (Figure~\ref{fig:rina:rmt:devices}c) do both multiplexing and relaying.
\begin{figure}[H]
\begin{center}
\includegraphics[width=\textwidth]{fig/fwding_rmt-devices.pdf}
\caption{A simplified view of data flow direction in different types of devices.}
\label{fig:rina:rmt:devices}
\end{center}
\end{figure}
Each (N-1)-port handled by the RMT has its own set of input and output buffers. The number of buffers, their monitoring, their scheduling discipline and the classification of traffic into distinct buffers are all a matter of policies.
The RMT is a straightforward high-speed component. As such, most of its management (state configuration, forwarding policy input, buffer allocation, data rate regulation) is handled by the Resource Allocator, which makes its decisions based on observed IPC process performance.
\subsection{Policies}
Even though the Relaying and Multiplexing Task serves as a low-overhead component similar to the traditional view of router data plane, several policies are defined for modifying its behavior.
\begin{itemize}
\item \textbf{Scheduling policy.} A scheduling algorithm (also commonly known as ``network scheduler algorithm'' or ``queueing discipline'') that determines the order in which input and output buffers are serviced. This policy should be invoked each time a PDU needs to be taken from a queue for processing and works for both input and output directions. Examples of possible algorithms could be FIFO, LIFO or fair queueing.
\item \textbf{Monitoring policy.} A state-keeping queue monitoring algorithm that is invoked each time a PDU enters or leaves a queue. This policy should compute variables to be used in decision process of other policies. Examples of such variables could be average queue length or queue idle time, which are often used by congestion prevention mechanisms.
\item \textbf{MaxQ policy.} An algorithm that is invoked each time the number of PDUs waiting in a queue exceeds the queue's threshold. This policy is used mostly for implementing congestion avoidance mechanisms (e.g. by dropping or marking the last PDU in a queue).
\item \textbf{Forwarding policy.} An algorithm used for deciding where to forward a PDU. The policy is given the PDU's \emph{Protocol Control Information} (PCI) and in turn returns a set of (N-1)-ports to which the PDU has to be sent. This provides enough granularity to implement multiple communication schemes apart from unicast (such as multicast or load-balancing), because the decision is left to the policy. For example, a simple forwarding policy would return a single (N-1)-port based on the PDU's destination address and QoS-id, whereas a load-spreading policy with multiple (N-1)-ports leading to the same destination could split traffic by the PDUs' flow-ids and always return a single (N-1)-port from the set.
\end{itemize}
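To make the Monitoring policy concrete, the ``average queue length'' variable it might compute can be sketched as the exponentially weighted moving average used by RED-style congestion avoidance. The \texttt{QueueMonitor} type is an illustrative assumption, not the RINASim policy interface.

```cpp
#include <cassert>
#include <cmath>

// Illustrative sketch of a Monitoring policy variable: an exponentially
// weighted moving average (EWMA) of queue length, updated on every PDU
// arrival/departure. Names are hypothetical, not RINASim code.
struct QueueMonitor {
    double weight;   // EWMA smoothing factor (classic RED uses e.g. 0.002)
    double avg = 0;  // smoothed queue length

    // Invoked each time a PDU enters or leaves the queue.
    void onQueueChange(int currentLength) {
        avg = (1.0 - weight) * avg + weight * currentLength;
    }
};
```

A MaxQ policy could then compare \texttt{avg} against its thresholds instead of the instantaneous queue length, smoothing out short traffic bursts.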
\chapter{Implementation of Relaying and Multiplexing Task}\label{implementation}
This chapter documents the implementation of RINA's Relaying and Multiplexing Task for the RINASim library.
\section{OMNeT++}
OMNeT++ is an open-source discrete event simulation framework used primarily in the field of network simulation. In this context, the adjective ``network'' carries the more general meaning of the word: apart from modelling TCP/IP networks (especially in conjunction with the INET library), the framework also provides a means for modelling other networked systems such as on-chip networks or queueing networks. As we are implementing a clean-slate architecture from the ground up, this is an ideal approach.
OMNeT++ provides a component architecture for models. Components (simple modules) are programmed in C++, then assembled into larger components (compound modules) and interconnected using the high-level NED language. All events are message-driven. In theory, there are no scalability limits for networks modelled in NED; the only constraint is the processing power of the computing platform.
\section{RINASim}
RINASim, developed by a networking research group at the Faculty of Information Technology of Brno University of Technology, is an open-source OMNeT++ library developed for the PRISTINE project. The purpose of the library is to provide a framework for modelling RINA networks and observing their behavior. In the current stage of its development, RINASim is used primarily by other PRISTINE researchers to experiment with the architecture and efficiently evaluate their working theories.
The library is open-sourced with the MIT licence and publicly hosted on GitHub.
\section{Implementation Design}
In RINASim, all functionality of the RMT, including a policy architecture, is encompassed in a single compound module named \texttt{relayAndMux}, which is present in every IPC process. The module serves for (de)multiplexing, relaying and aggregating PDUs of data flows traversing the IPC process. RINASim's compound module for an IPC process can be seen in Figure~\ref{fig:rinasim:ipcp}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.5\textwidth]{fig/impl_rinasim-ipcp.png}
\caption{RINASim's IPC process}
\label{fig:rinasim:ipcp}
\end{center}
\end{figure}
\subsection{Module Structure}
\texttt{relayAndMux} (an instance of the compound module type \texttt{RelayAndMux}) consists of multiple simple modules of various types, some of which are instantiated only dynamically at runtime. Module type names start with a capital letter, while instance names start with a lowercase letter. Figure~\ref{fig:rinasim:rmt} presents the state of the \texttt{relayAndMux} module when working with two allocated (N-1)-flows and their queues.
\begin{figure}[H]
\begin{center}
\includegraphics[width=0.8\textwidth]{fig/impl_rinasim-rmt.png}
\caption{Contents of \texttt{relayAndMux} and one of its ports.}
\label{fig:rinasim:rmt}
\end{center}
\end{figure}
\subsubsection{Static modules}
\begin{itemize}
\item \texttt{rmt}, the central logic of the Relaying and Multiplexing Task, which decides what should be done with messages passing through the module.
\item \texttt{allocator}, a control unit providing an API for adding and deleting instances of dynamic modules (RMTQueue, RMTPort).
\item \texttt{schedulingPolicy}, the Scheduling policy of an RMT instance.
\item \texttt{monitorPolicy}, the Monitoring policy of an RMT instance.
\item \texttt{maxQueuePolicy}, the MaxQ policy of an RMT instance.
\end{itemize}
\subsubsection{Types of dynamic modules}
\begin{itemize}
\item \texttt{RMTPort}, a representation of one endpoint of an (N-1)-flow.
\item \texttt{RMTQueue}, a representation of either an input or an output queue. The number of RMTQueues per RMTPort is determined by Resource Allocator policies.
\item \texttt{RMTPortWrapper}, a compound module encapsulating an (N-1)-port and its queues.
\end{itemize}
A class diagram showing implementation of both static and dynamic modules can be seen in Appendix \ref{appendix:class}.
\subsection{Module Parameters}
The RMT module contains several user-configurable parameters that can be used to alter its behavior. These are presented in Table~\ref{fig:rmt_params}.
\begin{table}[H]
\begin{center}
\begin{tabular}{ | r | r | l | }
\hline
data type & name & function \\
\hline
\texttt{string} & \texttt{schedPolicyName} & name of the desired Scheduling policy \\
\texttt{string} & \texttt{qMonitorPolicyName} & name of the desired Monitoring policy \\
\texttt{string} & \texttt{maxQPolicyName} & name of the desired MaxQ policy \\
\texttt{string} & \texttt{ForwardingPolicyName} & name of the desired PDU Forwarding policy \\
\texttt{int} & \texttt{defaultMaxQLength} & default maximum length of instantiated queues \\
\texttt{int} & \texttt{defaultThreshQLength} & default threshold length of instantiated queues \\
\texttt{bool} & \texttt{pduTracing} & a switch for turning on PDU tracefile generation \\
\hline
\end{tabular}
\caption{RMT module parameters.}
\label{fig:rmt_params}
\end{center}
\end{table}
\subsection{Module Workflow}
The core function of RMT is driven by two algorithms: the \emph{decision algorithm} and the \emph{dispatcher algorithm}.
The decision algorithm decides what to do with the incoming PDU and executes appropriate policies. A state diagram representation can be seen in Figure~\ref{fig:rmt-fsm}. It is implemented by \texttt{RMT::handleMessage()} and methods that are subsequently called from it (\texttt{RMT::portToEFCPI()}, \texttt{RMT::portToPort()} etc.).
\begin{figure}[H]
\begin{center}
\makebox[\textwidth]{\includegraphics[width=0.9\textwidth]{fig/impl_rmt-decision.pdf}}
\caption{The RMT decision algorithm.}
\label{fig:rmt-fsm}
\end{center}
\end{figure}
The dispatcher algorithm ensures continuous invocation of the Scheduling policy whenever there are PDUs waiting in queues. The decision of which queue should be processed next is left to the policy. A Petri net representation of the algorithm, used for both the input and output directions, can be seen in Figure~\ref{fig:sched-petri}. The algorithm is implemented by the methods \texttt{onQueueArrival()}, \texttt{postQueueDeparture()}, \texttt{writeToPort()} and \texttt{readToPort()}.
\begin{figure}[H]
\begin{center}
\makebox[\textwidth]{\includegraphics[width=0.6\textwidth]{fig/impl_rmt-sched.pdf}}
\caption{A Petri net representation of the RMT dispatcher algorithm.}
\label{fig:sched-petri}
\medskip
\small
PDUs arrive at rate \texttt{x} into the port's queues. Each time the port is ready to read/write, the scheduling policy is invoked to choose which queue should be processed, then a PDU is transmitted for a time interval \texttt{t}. For output, \texttt{t} is determined by the PDU's size and the characteristics of the medium; for input, it is configurable. A port block/unblock may be requested by the RMT (for input) or by an (N-1)-EFCPI (for output); in such a case, the first request must be a block request, and block and unblock requests must alternate.
\end{center}
\end{figure}
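One dispatcher step can be condensed into the following sketch, using a longest-queue-first stand-in for the Scheduling policy (the same strategy as the \emph{LongestQFirst} sample policy). Function and type names are illustrative, not the RINASim API; transmission time \texttt{t} is not modelled.

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <vector>

// Illustrative sketch of the dispatcher: whenever the port is ready and
// some queue is non-empty, the Scheduling policy picks the queue to
// service next. Names are hypothetical, not RINASim code.
using Pdu = std::string;
using Queue = std::deque<Pdu>;

// Stand-in Scheduling policy: longest queue first.
int pickQueue(const std::vector<Queue>& queues) {
    int best = -1;
    size_t bestLen = 0;
    for (size_t i = 0; i < queues.size(); ++i)
        if (queues[i].size() > bestLen) { best = (int)i; bestLen = queues[i].size(); }
    return best;  // -1 when all queues are empty
}

// One dispatcher step, invoked each time the port becomes ready.
bool dispatchOne(std::vector<Queue>& queues, std::vector<Pdu>& wire) {
    int q = pickQueue(queues);
    if (q < 0) return false;            // nothing waiting: dispatcher goes idle
    wire.push_back(queues[q].front());  // "transmit" the PDU (time t elided)
    queues[q].pop_front();
    return true;
}
```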
\subsection{Module Management}\label{implementation:mgmt}
RMT's purpose in an IPC process is fairly straightforward: providing a stateless function for relaying PDUs to their predetermined destinations and multiplexing PDUs of multiple data flows onto a common predetermined medium. The entire management of the RMT is decoupled and exercised by the Resource Allocator.
As the Resource Allocator lacked a concrete specification at the time of implementation, I~have designed a set of management mechanisms, some of which are customizable by policies.
\subsubsection{Initial RMT Mode Setup}
When an RMT instance is located inside a bottommost IPC process that does not work with any further (N-1)-IPC processes, the RMT is switched to an \texttt{onWire} mode in which it operates over a single serializing medium instead of an IPC connection.
If an RMT instance is located inside an IPC process that has the \texttt{relay} configuration parameter set to \texttt{true}, the RMT's Relaying Task is enabled by calling \texttt{enableRelay()}. The default value of the \texttt{relay} parameter is \texttt{true} in the top layers of routers and \texttt{false} everywhere else.
\subsubsection{Forwarding Table Management}
The content of the PDU forwarding table is generated by the PDU Forwarding Generator policy module which generally accepts input from other sources such as the Routing policy module.
\subsubsection{Queue Allocation}
PDUs traversing an (N-1)-port may be momentarily buffered in input or output queues; the number of input and output queues per (N-1)-port and assignment of traffic classes to queues (e.g. all-in-one or fair queuing)
is determined by two Resource Allocator policies.
\begin{itemize}
\item \textbf{QueueAlloc.} The (N-1)-port queue allocation strategy. The interface contains a set of event hook methods (\texttt{onPolicyInit}, \texttt{onNM1PortInit}, \texttt{onNFlowAlloc}, \texttt{onNFlowDealloc}) that allow the user to specify how many queues should be allocated or deallocated in response to which events.
\item \textbf{QueueIDGen.} A companion classification policy to QueueAlloc which generates a queue ID for a given PDU. This policy is used by the RMT when it needs to determine which of the port's queues a PDU should be placed in.
\end{itemize}
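The interplay of the two policies can be sketched for a queue-per-flow strategy. The \texttt{PerFlowQueues} type and the \texttt{q\_default} fallback are illustrative assumptions, not RINASim names.

```cpp
#include <cassert>
#include <set>
#include <string>

// Illustrative sketch of a QueueAlloc/QueueIDGen pair for a queue-per-flow
// strategy: allocation registers a queue ID per (N)-flow, and the ID
// generator classifies each PDU back to it. Not RINASim code.
struct PerFlowQueues {
    std::set<std::string> queueIds;

    // QueueAlloc hooks: called on (N)-flow allocation/deallocation.
    void onNFlowAlloc(const std::string& flowId)   { queueIds.insert("q_" + flowId); }
    void onNFlowDealloc(const std::string& flowId) { queueIds.erase("q_" + flowId); }

    // QueueIDGen: map a PDU (identified here by its flow ID) to a queue.
    std::string queueIdForPdu(const std::string& pduFlowId) const {
        std::string id = "q_" + pduFlowId;
        return queueIds.count(id) ? id : "q_default";  // hypothetical fallback
    }
};
```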
\subsubsection{(N-1)-port Control}
Since the Resource Allocator manages the (N-1)-flows leading to other IPC processes, it also provides their (N-1)-ports (handles) to the RMT.
In some scenarios, an (N-1)-port may be required to stop or slow down sending or receiving data because of congestion. The Resource Allocator can momentarily disable distinct ports or reduce their data rate if this is required by EFCP instances.
\subsection{Statistics Collection}
OMNeT++ modules provide a means for declaring scalar or vector NED variables used for statistics collection. The processing of such statistical data (e.g. generating summaries and graphs) is decoupled from the act of data collection itself, so it is up to the user to pick which data to work with.
Since the Relaying and Multiplexing Task is the shared point of data flow traversal in the IPC process, it is well-suited for monitoring data flow performance. Several statistical variables have been defined for this very purpose:
\begin{itemize}
\item \textbf{(N-1)-port PDU traversal count.} Two scalar variables for both input and output containing the number of PDUs transferred through the port in each direction.
\item \textbf{RMT queue length.} A vector variable documenting number of PDUs in a queue over time.
\item \textbf{RMT queue drop count.} A scalar variable providing the number of PDUs dropped by a queue.
\end{itemize}
The RMT module also supports ns2-style tracefile generation. This can be enabled by setting the \texttt{pduTracing} parameter of \texttt{relayAndMux} to \texttt{true}.
\section{Sample policy implementations}\label{implementation:policies}
As the Relaying and Multiplexing Task follows the design principle of separation of mechanism and policy, most of its complexity lies in the policies. Hence, to demonstrate the use of the RMT's policy framework, I~have implemented a diverse set of simple RMT policies.
\begin{itemize}
\item \textbf{Scheduling policy}
\begin{itemize}
\item \emph{LongestQFirst.} Pick the queue which contains the most PDUs.
\end{itemize}
\item \textbf{Monitoring policy}
\begin{itemize}
\item \emph{REDMonitor.} Used in conjunction with REDDropper; a Random Early Detection implementation.
\end{itemize}
\item \textbf{MaxQ policy}
\begin{itemize}
\item \emph{ECNMarker.} If queue size $\geq$ threshold, apply ECN marking on new PDUs; if size $\geq$ max, drop.
\item \emph{ReadRateReducer.} If queue size $\geq$ allowed maximum, stop receiving data from input ports.
\item \emph{REDDropper.} Used in conjunction with REDMonitor; a Random Early Detection implementation.
\item \emph{TailDrop.} If queue size $\geq$ allowed maximum, drop new PDUs.
\item \emph{UpstreamNotifier.} If queue size $\geq$ allowed maximum, send a notification to the PDU sender.
\end{itemize}
\item \textbf{PDU Forwarding policy}
\begin{itemize}
\item \emph{SimpleTable.} A table with \{(dstAddr, QoS) $\rightarrow$ port\} mappings.
\end{itemize}
\end{itemize}
To demonstrate the abilities of Resource Allocator's RMT management, I~have also implemented an additional set of management policies (introduced in Section~\ref{implementation:mgmt}).
\begin{itemize}
\item \textbf{QueueAlloc}
\begin{itemize}
\item \emph{QueuePerNFlow.} Maintain a queue for each (N)-flow.
\item \emph{QueuePerNQoS.} Maintain a queue for each (N)-QoS cube.
\end{itemize}
\item \textbf{QueueIDGen}
\begin{itemize}
\item \emph{IDPerNFlow.} Companion policy for QueueAlloc::QueuePerNFlow.
\item \emph{IDPerNQoS.} Companion policy for QueueAlloc::QueuePerNQoS.
\end{itemize}
\end{itemize}
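As an illustration of the REDMonitor/REDDropper pair listed above, the classic Random Early Detection drop probability for a given averaged queue length can be sketched as follows. This is the textbook RED ramp, not the RINASim implementation; parameter names are illustrative.

```cpp
#include <cassert>
#include <cmath>

// Illustrative sketch of classic RED: the monitor keeps an averaged queue
// length; the dropper turns it into a drop probability. Below minTh no
// packets are dropped, above maxTh all are, and in between the probability
// ramps linearly up to maxP.
double redDropProbability(double avgQueueLen, double minTh, double maxTh,
                          double maxP) {
    if (avgQueueLen < minTh)  return 0.0;  // no congestion signal
    if (avgQueueLen >= maxTh) return 1.0;  // force drop
    return maxP * (avgQueueLen - minTh) / (maxTh - minTh);  // linear ramp
}
```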
\chapter{Testing and Evaluation}\label{testing}
I~have created a set of basic network topologies to demonstrate the implementation's features. The following tests put them into use.
Each test case contains a description of RMT-related events that happen during simulation. Each type of event is described only once to prevent needless repetition.
For the purposes of testing, RINASim contains a ping-like application called \texttt{AEPing} that can serve as both sender and receiver of a ping-like data frame. Each of the following scenarios consists of a NED topology and two instances of \texttt{AEPing} on different hosts. The first instance, with application name \texttt{App1}, allocates a flow to the other instance, with application name \texttt{App2}, pings the other instance multiple times in succession and then deallocates the flow. The simulation times of flow allocation, ping send events and flow deallocation are configurable via the \texttt{AEPing} NED module.
Additionally, each channel connecting two hosts can be configured to simulate delayed message delivery based on its bandwidth or a fixed delay. To keep the timestamps simple, channels in the following examples operate with a fixed delay expressed in seconds.
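The delivery time on such a channel can be sketched as the send time plus the fixed propagation delay plus an optional transmission term derived from the message size and bandwidth. The function below is an illustrative model of this computation, not RINASim or OMNeT++ code; the examples in this chapter use only the fixed-delay term.

```cpp
// Compute when a message sent at sendTime arrives at the far end of a
// channel with a fixed delay (seconds) and an optional bandwidth limit.
double deliveryTime(double sendTime, double delaySec,
                    long messageBits, double bandwidthBps) {
    // The transmission term applies only when a finite bandwidth is set;
    // a pure fixed-delay channel contributes delaySec alone.
    double transmission =
        bandwidthBps > 0.0 ? messageBits / bandwidthBps : 0.0;
    return sendTime + delaySec + transmission;
}
```

For instance, with the 2-second channels used below, a ping sent at t=35 arrives at t=37.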
\section{Basic Multiplexing}
This example presents the multiplexing function of RMT in a simple scenario. In doing so, it also demonstrates RINASim's implementation of the Allocate algorithm described in Section \ref{archs:rina:communication}.
\subsection{Topology}
The TwoCS topology is the most basic example of a computer network. It consists of two interconnected hosts, each with a single application and two levels of IPC processes. Details are depicted in Figure~\ref{fig:examples:muxing:events:topology}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=\textwidth]{fig/tests_twocs.png}
\caption{TwoCS simulation scenario}
\label{fig:examples:muxing:events:topology}
\end{center}
\end{figure}
\subsection{Scenario}
The configuration of this simulation scenario is described in Table~\ref{fig:examples:muxing:config}.
\begin{table}[H]
\begin{center}
\begin{tabular}{ | r | l | l | }
\hline
module & configuration directive & value \\
\hline
\texttt{host1}'s \texttt{AEPing} & flow allocation & 10 s \\
\texttt{host1}'s \texttt{AEPing} & first ping & 35 s \\
\texttt{host1}'s \texttt{AEPing} & flow deallocation & 200 s \\
\texttt{host1}'s \texttt{AEPing} & number of pings & 10 \\
\texttt{host1}$\rightarrow$\texttt{host2} channel & delay & 2 s \\
\hline
\end{tabular}
\caption{Configuration for Basic Multiplexing.}
\label{fig:examples:muxing:config}
\end{center}
\end{table}
\subsection{Simulation}
What follows is a simplified description of events related to RMT instances.
\begin{itemize}
\item \textbf{t=0}: The RMT instances undergo their initial setup. For the bottom IPC processes in both hosts, this consists of creating an RMTPort instance named \texttt{PHY} and its queues. Since the QueueAlloc policy is set to \texttt{SingleQueue}, the following queues are created: \texttt{outQ\_M} (management output), \texttt{inQ\_M} (management input), \texttt{outQ\_0} (data output) and \texttt{inQ\_0} (data input).
\item \textbf{t=10}: \texttt{App1} invokes \emph{Allocate(\texttt{App2})}. This in turn causes IPC \texttt{11}'s Resource Allocator to invoke \emph{Allocate(\texttt{22})}.
The RIB in IPC \texttt{1} sends out an \texttt{M\_CREATE} request addressed to IPC \texttt{2}. RMT stores the PDU to the \texttt{PHY}'s queue \texttt{outQ\_M}.
The arrival of the PDU in the queue triggers the active Monitoring policy and the MaxQ policy. If the MaxQ policy has not dropped the PDU, the Scheduling policy is invoked.
The Scheduling policy decides to release the PDU from \texttt{outQ\_M}. The PDU traverses through the port and gets sent to the other host via the medium.
\item \textbf{t=12}: IPC \texttt{2}'s RMT receives the PDU on port \texttt{PHY}. As this is a management PDU, it is stored into the management input queue \texttt{inQ\_M}.
The arrival of the PDU in the queue triggers the active Monitoring policy and the MaxQ policy. If the MaxQ policy has not dropped the PDU, the Scheduling policy is invoked by the dispatcher algorithm.
The Scheduling policy decides to release the PDU from \texttt{inQ\_M}. The PDU travels into the dispatcher which relays the PDU to the RIB daemon.
The RIB daemon sends out an \texttt{M\_CREATE\_RESPONSE} reply to IPC \texttt{1}.
\item \textbf{t=14}: The IPC \texttt{1}'s RIB daemon receives the \texttt{M\_CREATE\_RESPONSE} reply and the \emph{Allocate(\texttt{22})} procedure is successfully completed. The new connection is mapped to a newly instantiated RMTPort called \texttt{p0} and the previously suspended \emph{Allocate(\texttt{App2})} procedure resumes, carrying out the same series of events one level higher, using IPC \texttt{11}'s port \texttt{p0} and its data queues.
\item \textbf{t=18}: The \emph{Allocate(\texttt{App2})} procedure is successfully completed and \texttt{App1} is now connected to \texttt{App2}.
\item \textbf{t=35}: In \texttt{host1}, \texttt{App1} sends out the first ping. The ping traverses RMTs in IPC \texttt{11} and IPC \texttt{1} and then it is sent to the medium. This is repeated nine times in the following 9 seconds.
\item \textbf{t=37}: In \texttt{host2}, IPC \texttt{2}'s port \texttt{PHY} receives the first ping. The ping traverses the RMTs in IPC \texttt{2} and IPC \texttt{22} and is then handed to \texttt{App2}. This is repeated nine times in the following 9 seconds.
\item \textbf{t=200}: \texttt{App1} invokes \emph{Deallocate(\texttt{App2})}.
\item \textbf{t=204}: The connection between \texttt{App1} and \texttt{App2} is deallocated.
\end{itemize}
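The policy chain that fires on every queue arrival in the walkthrough above (Monitoring, then MaxQ, then Scheduling) can be sketched as follows. The hooks and types are illustrative placeholders, not the RINASim interfaces; the point is the ordering of the three policy invocations.

```cpp
#include <cstddef>
#include <functional>
#include <queue>

// Hypothetical PDU type for this sketch.
struct DataPDU { int id; };

class PolicyChain {
public:
    std::function<void(const DataPDU&)> onMonitor;   // Monitoring policy hook
    std::function<bool(std::size_t)>    shouldDrop;  // MaxQ policy hook
    std::function<void()>               onSchedule;  // Scheduling policy hook

    std::queue<DataPDU> q;

    // Mirrors the event order described above; returns false if dropped.
    bool pduArrived(const DataPDU& pdu) {
        onMonitor(pdu);                // Monitoring policy observes the arrival
        if (shouldDrop(q.size()))
            return false;              // MaxQ policy drops the PDU
        q.push(pdu);
        onSchedule();                  // Scheduling policy decides on release
        return true;
    }
};
```

Note that the Monitoring policy sees every arrival, including PDUs that the MaxQ policy subsequently drops, which is what allows monitors such as REDMonitor to maintain queue statistics.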
\subsection{Evaluation}
The implementation reflects the behavior expected from RINA specifications. The procedures of the basic communication model described in Section~\ref{archs:rina:communication} are executed properly and RMT's multiplexing functionality correctly handles traversing PDUs.
\section{Basic Relaying}
This example presents the relaying function of RMT in a simple scenario.
\subsection{Topology}
The SimpleRelay topology is the most basic example of a computer network with a relaying device. It consists of two hosts interconnected by a router with three IPC processes, as depicted in Figure~\ref{fig:examples:simplerelay}.
\begin{figure}[H]
\begin{center}
\includegraphics[width=\textwidth]{fig/tests_simplerelay.png}
\caption{SimpleRelay simulation scenario}
\label{fig:examples:simplerelay}
\end{center}
\end{figure}
\subsection{Scenario}
The configuration of this simulation scenario is described in Table~\ref{fig:examples:relaying:config}.
\begin{table}[H]
\begin{center}
\begin{tabular}{ | r | l | l | }
\hline
module & configuration directive & value \\
\hline
\texttt{host1}'s \texttt{AEPing} & flow allocation & 10 s \\
\texttt{host1}'s \texttt{AEPing} & first ping & 35 s \\
\texttt{host1}'s \texttt{AEPing} & flow deallocation & 200 s \\
\texttt{host1}'s \texttt{AEPing} & number of pings & 10 \\
\texttt{host1}$\rightarrow$\texttt{interiorRouter} channel & delay & 2 s \\
\texttt{interiorRouter}$\rightarrow$\texttt{host2} channel & delay & 2 s \\
\hline
\end{tabular}
\caption{Configuration for Basic Relaying.}
\label{fig:examples:relaying:config}
\end{center}
\end{table}
\subsection{Simulation}
\begin{itemize}
\item \textbf{t=10}: \texttt{App1} invokes \emph{Allocate(\texttt{App2})}. This in turn causes IPC \texttt{11}'s Resource Allocator to invoke \emph{Allocate(\texttt{33})}.
\item \textbf{t=14}: The IPC \texttt{1}'s RIB daemon receives the \texttt{M\_CREATE\_RESPONSE} reply and the \emph{Allocate(\texttt{33})} procedure is successfully completed. The new connection is mapped to a newly instantiated RMTPort called \texttt{p0} and the previously suspended \emph{Allocate(\texttt{App2})} procedure is triggered to continue. The \texttt{M\_CREATE} sent by IPC \texttt{11}'s RIB daemon is directed to IPC \texttt{33}'s RIB daemon.
\item \textbf{t=16}: The IPC \texttt{33}'s RIB daemon sees that the \texttt{M\_CREATE} requires access to \texttt{App2}, which is not a local application, so the request has to be forwarded to IPC \texttt{22}. It holds the request and waits for the Resource Allocator to process \emph{Allocate(\texttt{22})}.
\item \textbf{t=20}: The IPC \texttt{4}'s RIB daemon receives the \texttt{M\_CREATE\_RESPONSE} reply and the \emph{Allocate(\texttt{22})} procedure is successfully completed. The new connection is mapped to a newly instantiated RMTPort called \texttt{p1}.
The \texttt{M\_CREATE} which was withheld in t=16 then continues forward to IPC \texttt{22}.
\item \textbf{t=22}: The IPC \texttt{22}'s RIB daemon receives the \texttt{M\_CREATE} request and responds with an \texttt{M\_CREATE\_RESPONSE} reply, this time addressed directly to IPC \texttt{11}.
\item \textbf{t=26}: The \emph{Allocate(\texttt{App2})} procedure is successfully completed and \texttt{App1} is now connected to \texttt{App2} via \texttt{interiorRouter}.
\item \textbf{t=35}: In \texttt{host1}, \texttt{App1} sends out the first ping.
\item \textbf{t=37}: In IPC \texttt{33}, the RMT receives the ping PDU from port \texttt{p0}. It discovers that the PDU's destination address is not its own (IPC \texttt{33}), so it consults the PDU Forwarding policy for an entry matching the PDU's destination (IPC \texttt{22}) and QoS-ID (\texttt{1}).
The Forwarding policy returns a single RMTPort, \texttt{p1}. The QueueIDGen policy is given the PDU as a parameter and returns \texttt{outQ\_0}.
The dispatcher directs the PDU to \texttt{p1}'s queue \texttt{outQ\_0}, where it is immediately processed by policies and sent.
\item \textbf{t=39}: In \texttt{host2}, \texttt{App2} receives the ping and sends back an acknowledgment message.
\item \textbf{t=41}: In IPC \texttt{33}, the RMT dispatcher receives the ping reply PDU from port \texttt{p1}. It looks up the Forwarding Policy again and then sends out the PDU through \texttt{p0}.
\item \textbf{t=200}: \texttt{App1} invokes \emph{Deallocate(\texttt{App2})}.
\item \textbf{t=208}: The connection between \texttt{App1} and \texttt{App2} is deallocated.
\end{itemize}
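The forwarding step at t=37 relies on the \emph{SimpleTable} policy's \{(dstAddr, QoS) $\rightarrow$ port\} mapping. A minimal sketch of such a lookup table follows; the address and port representations are simplified placeholders rather than RINASim types.

```cpp
#include <map>
#include <string>
#include <utility>

// SimpleTable-style forwarding: (destination address, QoS-ID) pairs
// map to the name of an output RMTPort.
class SimpleForwardingTable {
public:
    void insert(const std::string& dstAddr, int qosId,
                const std::string& port) {
        table[{dstAddr, qosId}] = port;
    }

    // Returns the port name, or an empty string when no entry matches
    // (in which case the RMT would have to drop the PDU).
    std::string lookup(const std::string& dstAddr, int qosId) const {
        auto it = table.find({dstAddr, qosId});
        return it == table.end() ? std::string() : it->second;
    }

private:
    std::map<std::pair<std::string, int>, std::string> table;
};
```

In the relaying example, IPC \texttt{33}'s table would hold the entry (\texttt{22}, \texttt{1}) $\rightarrow$ \texttt{p1}, yielding the port through which the ping is forwarded.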
\subsection{Evaluation}
The implementation reflects the behavior expected from RINA specifications. The procedures of the basic communication model described in Section~\ref{archs:rina:communication} are executed properly and RMT's relaying functionality correctly forwards PDUs.
\section{Advanced examples}
The medium attached to this thesis contains more example simulation scenarios. They are mostly aimed at demonstrating the use of the default policy set described in Section \ref{implementation:policies} and more advanced network topologies.
\chapter{Conclusion}\label{conclusion}
In this thesis, I~have taken a brief look into the field of network architecture research. I~have described the motivations behind research efforts and analyzed several new network architectures.
The implementation goal of this thesis was to contribute to the prototyping efforts around one of the presented network architectures, RINA. This task has been successfully completed by implementing RINA's Relaying and Multiplexing Task in an existing library for the OMNeT++ simulation tool. The resulting solution is currently in use by multiple research groups for modelling RINA networks and experimenting with various policies.
\section{Own Contributions}
To be able to describe the principles of new network architectures, I~have studied the research articles written about them. However, to analyze their usefulness to the Internet, I needed to understand the driving factors behind their inception. I~have achieved this by studying the Internet and its history to learn about the choices that led to its current state, including its present problems. I~have noticed that some of the problematic choices in its architectural design were recognized very early in its deployment, yet they were eventually chosen as foundations for reasons other than technological ones.
To understand RINA, I had to learn about its design principles and adapt to its paradigm shift which abandons most of today's widely recognized principles of computer networking. This required extensive studying of John Day's book \emph{Patterns in Network Architecture}~\cite{Patterns} and bleeding-edge architectural specifications.
The implementation part of this thesis required me to learn the OMNeT++ programming framework and the RINASim library. As RMT's specifications were only brief and its other implementations provided only a limited set of functionality, I had to take the initiative in designing the implementation myself. This usually involved discussing architectural matters with RINA researchers. In the later phases of development, several researchers raised functionality requests; all of them were eventually implemented.
\section{Future Development}
The implementation of the Relaying and Multiplexing Task provides users with a policy framework that allows them to experiment with a virtually unlimited number of approaches to the problems of forwarding and congestion avoidance in RINA. Therefore, the potential for future development lies mainly in expanding the policy set.
Several examples of such new policies written by RINA researchers are already available in the GitHub source code repository at the time of writing this thesis. Such examples usually explore new directions in the areas of congestion avoidance, routing and distributed resource allocation.