Diffstat (limited to 'content')
-rw-r--r-- content/en/_index.html | 28
-rw-r--r-- content/en/blog/20191006-new-site.md (renamed from content/en/blog/news/20191006-new-site.md) | 0
-rw-r--r-- content/en/blog/20200116-hn.md (renamed from content/en/blog/news/20200116-hn.md) | 0
-rw-r--r-- content/en/blog/20200212-ecmp.md (renamed from content/en/blog/news/20200212-ecmp.md) | 0
-rw-r--r-- content/en/blog/20200216-ecmp.md (renamed from content/en/blog/news/20200216-ecmp.md) | 0
-rw-r--r-- content/en/blog/20200502-frcp.md (renamed from content/en/blog/news/20200502-frcp.md) | 0
-rw-r--r-- content/en/blog/20200507-python-lb.png (renamed from content/en/blog/news/20200507-python-lb.png) | bin 218383 -> 218383 bytes
-rw-r--r-- content/en/blog/20200507-python.md (renamed from content/en/blog/news/20200507-python.md) | 2
-rw-r--r-- content/en/blog/20201212-congestion-avoidance.md (renamed from content/en/blog/news/20201212-congestion-avoidance.md) | 2
-rw-r--r-- content/en/blog/20201212-congestion.png (renamed from content/en/blog/news/20201212-congestion.png) | bin 54172 -> 54172 bytes
-rw-r--r-- content/en/blog/20201219-congestion-avoidance.md (renamed from content/en/blog/news/20201219-congestion-avoidance.md) | 25
-rw-r--r-- content/en/blog/20201219-congestion.png (renamed from content/en/blog/news/20201219-congestion.png) | bin 189977 -> 189977 bytes
-rw-r--r-- content/en/blog/20201219-exp.svg (renamed from content/en/blog/news/20201219-exp.svg) | 0
-rw-r--r-- content/en/blog/20201219-ws-0.png (renamed from content/en/blog/news/20201219-ws-0.png) | bin 419135 -> 419135 bytes
-rw-r--r-- content/en/blog/20201219-ws-1.png (renamed from content/en/blog/news/20201219-ws-1.png) | bin 432812 -> 432812 bytes
-rw-r--r-- content/en/blog/20201219-ws-2.png (renamed from content/en/blog/news/20201219-ws-2.png) | bin 428663 -> 428663 bytes
-rw-r--r-- content/en/blog/20201219-ws-3.png (renamed from content/en/blog/news/20201219-ws-3.png) | bin 417961 -> 417961 bytes
-rw-r--r-- content/en/blog/20201219-ws-4.png (renamed from content/en/blog/news/20201219-ws-4.png) | bin 423835 -> 423835 bytes
-rw-r--r-- content/en/blog/20210320-ouroboros-rina.md | 930
-rw-r--r-- content/en/blog/20210402-multicast.md | 479
-rw-r--r-- content/en/blog/2021115-rejected.md | 61
-rw-r--r-- content/en/blog/20211226-2022andbeyond.md | 89
-rw-r--r-- content/en/blog/20211229-flow-vs-connection.md | 351
-rw-r--r-- content/en/blog/20211229-oecho-1.png | bin 0 -> 442563 bytes
-rw-r--r-- content/en/blog/20211229-oecho-2.png | bin 0 -> 543765 bytes
-rw-r--r-- content/en/blog/20211229-oecho-3.png | bin 0 -> 552489 bytes
-rw-r--r-- content/en/blog/20211229-oecho-4.png | bin 0 -> 352163 bytes
-rw-r--r-- content/en/blog/20211229-oecho-5.png | bin 0 -> 621822 bytes
-rw-r--r-- content/en/blog/20220206-hole-punching.md | 99
-rw-r--r-- content/en/blog/20220206-hole-punching.png | bin 0 -> 149682 bytes
-rw-r--r-- content/en/blog/20220212-mvc.png | bin 0 -> 58080 bytes
-rw-r--r-- content/en/blog/20220212-tcp-ip-architecture.md | 449
-rw-r--r-- content/en/blog/20220220-half-deallocated-flows.md | 113
-rw-r--r-- content/en/blog/20220228-flm-app.png | bin 0 -> 80382 bytes
-rw-r--r-- content/en/blog/20220228-flow-liveness-monitoring.md | 150
-rw-r--r-- content/en/blog/20220520-oping-flm.md | 87
-rw-r--r-- content/en/blog/20221207-loc-id-mobility-1.png | bin 0 -> 403093 bytes
-rw-r--r-- content/en/blog/20221207-loc-id-mobility-2.png | bin 0 -> 411592 bytes
-rw-r--r-- content/en/blog/20221207-loc-id-split.md | 216
-rw-r--r-- content/en/blog/20221207-loc-id.png | bin 0 -> 81278 bytes
-rw-r--r-- content/en/blog/20241110-auth.md | 140
-rw-r--r-- content/en/blog/_index.md | 8
-rw-r--r-- content/en/blog/news/20201024-why-better.md | 119
-rw-r--r-- content/en/blog/news/_index.md | 5
-rw-r--r-- content/en/blog/releases/_index.md | 8
-rw-r--r-- content/en/blog/releases/upcoming.md | 7
-rw-r--r-- content/en/docs/Concepts/broadcast_layer.png | bin 0 -> 176866 bytes
-rw-r--r-- content/en/docs/Concepts/dependencies.jpg | bin 12970 -> 0 bytes
-rw-r--r-- content/en/docs/Concepts/elements.md | 90
-rw-r--r-- content/en/docs/Concepts/fa.md | 2
-rw-r--r-- content/en/docs/Concepts/layers.jpg | bin 104947 -> 0 bytes
-rw-r--r-- content/en/docs/Concepts/model_elements.png | bin 0 -> 27515 bytes
-rw-r--r-- content/en/docs/Concepts/ouroboros-model.md | 635
-rw-r--r-- content/en/docs/Concepts/problem_osi.md | 174
-rw-r--r-- content/en/docs/Concepts/rec_netw.jpg | bin 63370 -> 0 bytes
-rw-r--r-- content/en/docs/Concepts/unicast_layer.png | bin 0 -> 206355 bytes
-rw-r--r-- content/en/docs/Concepts/unicast_layer_bc_pft.png | bin 0 -> 444657 bytes
-rw-r--r-- content/en/docs/Concepts/unicast_layer_bc_pft_split.png | bin 0 -> 688010 bytes
-rw-r--r-- content/en/docs/Concepts/unicast_layer_bc_pft_split_broadcast.png | bin 0 -> 894152 bytes
-rw-r--r-- content/en/docs/Concepts/unicast_layer_dag.png | bin 0 -> 36856 bytes
-rw-r--r-- content/en/docs/Concepts/what.md | 78
-rw-r--r-- content/en/docs/Contributions/_index.md | 23
-rw-r--r-- content/en/docs/Extra/ioq3.md | 7
-rw-r--r-- content/en/docs/Extra/rumba.md | 13
-rw-r--r-- content/en/docs/Intro/_index.md | 67
-rw-r--r-- content/en/docs/Overview/_index.md | 120
-rw-r--r-- content/en/docs/Releases/0_18.md | 109
-rw-r--r-- content/en/docs/Releases/0_20.md | 70
-rw-r--r-- content/en/docs/Releases/_index.md | 6
-rw-r--r-- content/en/docs/Start/_index.md | 220
-rw-r--r-- content/en/docs/Start/check.md | 49
-rw-r--r-- content/en/docs/Start/download.md | 28
-rw-r--r-- content/en/docs/Start/install.md | 57
-rw-r--r-- content/en/docs/Start/requirements.md | 76
-rw-r--r-- content/en/docs/Tools/_index.md | 7
-rw-r--r-- content/en/docs/Tools/grafana-frcp-constants.png | bin 0 -> 42048 bytes
-rw-r--r-- content/en/docs/Tools/grafana-frcp-window.png | bin 0 -> 107506 bytes
-rw-r--r-- content/en/docs/Tools/grafana-frcp.png | bin 0 -> 89571 bytes
-rw-r--r-- content/en/docs/Tools/grafana-ipcp-dt-dht.png | bin 0 -> 137833 bytes
-rw-r--r-- content/en/docs/Tools/grafana-ipcp-dt-fa.png | bin 0 -> 214580 bytes
-rw-r--r-- content/en/docs/Tools/grafana-ipcp-np1-cc.png | bin 0 -> 235720 bytes
-rw-r--r-- content/en/docs/Tools/grafana-ipcp-np1-fu.png | bin 0 -> 284669 bytes
-rw-r--r-- content/en/docs/Tools/grafana-ipcp-np1.png | bin 0 -> 282859 bytes
-rw-r--r-- content/en/docs/Tools/grafana-lsdb.png | bin 0 -> 27429 bytes
-rw-r--r-- content/en/docs/Tools/grafana-system.png | bin 0 -> 43809 bytes
-rw-r--r-- content/en/docs/Tools/grafana-variables-interval.png | bin 0 -> 75086 bytes
-rw-r--r-- content/en/docs/Tools/grafana-variables-system.png | bin 0 -> 43642 bytes
-rw-r--r-- content/en/docs/Tools/grafana-variables-type.png | bin 0 -> 50204 bytes
-rw-r--r-- content/en/docs/Tools/grafana-variables.png | bin 0 -> 14056 bytes
-rw-r--r-- content/en/docs/Tools/metrics.md | 298
-rw-r--r-- content/en/docs/Tools/rumba-topology.png | bin 0 -> 16656 bytes
-rw-r--r-- content/en/docs/Tools/rumba.md | 676
-rw-r--r-- content/en/docs/Tools/rumba_example.py | 41
-rw-r--r-- content/en/docs/Tutorials/tutorial-1.md | 18
-rw-r--r-- content/en/docs/Tutorials/tutorial-2.md | 2
-rwxr-xr-x content/en/docs/_index.md | 4
96 files changed, 5510 insertions, 728 deletions
diff --git a/content/en/_index.html b/content/en/_index.html
index 2cbf29d..d8e4c6c 100644
--- a/content/en/_index.html
+++ b/content/en/_index.html
@@ -6,15 +6,15 @@ linkTitle = "Ouroboros"
{{< blocks/cover title="Ouroboros" image_anchor="top" color="primary" >}}
<div class="mx-auto">
<a class="btn btn-lg btn-primary mr-3 mb-4"
- href="{{< relref "/docs/Overview" >}}">
+ href="/wiki">
Learn More <i class="fas fa-arrow-alt-circle-right ml-2"></i>
</a>
<a class="btn btn-lg btn-secondary mr-3 mb-4"
- href="/docs/start/">
+ href="/wiki/Ouroboros_Prototype">
Download <i class="fab fa-git ml-2 "></i>
</a>
<p class="lead mt-5">
- Decentralized packet networking rebuilt from the ground up
+ Packet networking rebuilt from the ground up
</p>
<div class="mx-auto mt-5">
{{< blocks/link-down color="info">}}
@@ -23,13 +23,15 @@ linkTitle = "Ouroboros"
{{< /blocks/cover >}}
{{% blocks/lead color="secondary" %}} Ouroboros is a <b>peer-to-peer
-transport network prototype</b> inspired by a <b>recursive network
-paradigm</b> and implemented according to a <b>UNIX design
-philosophy</b>. The aim is to provide a <b>secure and private
-networking</b> experience and to provide a simple API for writing
-distributed software and networked application libraries. Ouroboros
-provides a very <b>compact API</b> support
-both <b>unicast</b> <b>multicast</b> communications. All protocols
-carry <b>minimal header information</b>, with easy-to-enable
-<b>encryption</b>.
-{{% /blocks/lead %}}
+packet network prototype</b>. It unifies all packet communications
+-- whether it is two programs on the same machine or a set of programs on
+computers on different continents -- using a small set of
+abstractions, which we call <b>layers</b> and <b>flows</b>. The
+Ouroboros architecture improves security, privacy and efficiency
+through <b>simplicity</b>. It provides a very
+<b>compact API</b> for writing distributed software and networked
+application libraries, with support for both <b>unicast</b>
+and <b>multicast</b> communications. Being rebuilt from the ground up,
+Ouroboros is not directly compatible with IP or UNIX sockets, but it
+can run on top of and/or below <b>UDP/IP and Ethernet</b>.
+{{% /blocks/lead %}}
diff --git a/content/en/blog/news/20191006-new-site.md b/content/en/blog/20191006-new-site.md
index c04ff2d..c04ff2d 100644
--- a/content/en/blog/news/20191006-new-site.md
+++ b/content/en/blog/20191006-new-site.md
diff --git a/content/en/blog/news/20200116-hn.md b/content/en/blog/20200116-hn.md
index b80a7bd..b80a7bd 100644
--- a/content/en/blog/news/20200116-hn.md
+++ b/content/en/blog/20200116-hn.md
diff --git a/content/en/blog/news/20200212-ecmp.md b/content/en/blog/20200212-ecmp.md
index 019b40d..019b40d 100644
--- a/content/en/blog/news/20200212-ecmp.md
+++ b/content/en/blog/20200212-ecmp.md
diff --git a/content/en/blog/news/20200216-ecmp.md b/content/en/blog/20200216-ecmp.md
index ce632c9..ce632c9 100644
--- a/content/en/blog/news/20200216-ecmp.md
+++ b/content/en/blog/20200216-ecmp.md
diff --git a/content/en/blog/news/20200502-frcp.md b/content/en/blog/20200502-frcp.md
index 28c5794..28c5794 100644
--- a/content/en/blog/news/20200502-frcp.md
+++ b/content/en/blog/20200502-frcp.md
diff --git a/content/en/blog/news/20200507-python-lb.png b/content/en/blog/20200507-python-lb.png
index 89e710e..89e710e 100644
--- a/content/en/blog/news/20200507-python-lb.png
+++ b/content/en/blog/20200507-python-lb.png
Binary files differ
diff --git a/content/en/blog/news/20200507-python.md b/content/en/blog/20200507-python.md
index d4b3504..2b05c91 100644
--- a/content/en/blog/news/20200507-python.md
+++ b/content/en/blog/20200507-python.md
@@ -65,7 +65,7 @@ released after the weekend.
Oh, and here is a picture of Ouroboros load-balancing between the C (top right)
and Python (top left) implementations using the C and Python clients:
-{{<figure width="60%" src="/blog/news/20200507-python-lb.png">}}
+{{<figure width="60%" src="/blog/20200507-python-lb.png">}}
Can't wait to get the full API done!
diff --git a/content/en/blog/news/20201212-congestion-avoidance.md b/content/en/blog/20201212-congestion-avoidance.md
index f395a4f..0b010c5 100644
--- a/content/en/blog/news/20201212-congestion-avoidance.md
+++ b/content/en/blog/20201212-congestion-avoidance.md
@@ -334,7 +334,7 @@ big oscillations because of AIMD), when switching the flows on the
clients on and off which is on par with DCTCP and not unexpected
keeping in mind the similarities between the algorithms:
-{{<figure width="60%" src="/blog/news/20201212-congestion.png">}}
+{{<figure width="60%" src="/blog/20201212-congestion.png">}}
The periodic "gaps" were not seen at the ocbr endpoint application and
may have been due to tcpdump not capturing everything that those
diff --git a/content/en/blog/news/20201212-congestion.png b/content/en/blog/20201212-congestion.png
index 8e5b89f..8e5b89f 100644
--- a/content/en/blog/news/20201212-congestion.png
+++ b/content/en/blog/20201212-congestion.png
Binary files differ
diff --git a/content/en/blog/news/20201219-congestion-avoidance.md b/content/en/blog/20201219-congestion-avoidance.md
index e13fdba..240eb88 100644
--- a/content/en/blog/news/20201219-congestion-avoidance.md
+++ b/content/en/blog/20201219-congestion-avoidance.md
@@ -6,10 +6,11 @@ description: ""
author: Dimitri Staessens
---
-In my recently did some quick tests with the new congestion avoidance
-implementation, and thought to myself that it was a shame that
-Wireshark could not identify the Ouroboros flows, as that could give
-me some nicer graphs.
+I recently did some
+[quick tests](/blog/2020/12/12/congestion-avoidance-in-ouroboros/#mb-ecn-in-action)
+with the new congestion avoidance implementation, and thought to
+myself that it was a shame that Wireshark could not identify the
+Ouroboros flows, as that could give me some nicer graphs.
Just to be clear, I think generic network tools like tcpdump and
wireshark -- however informative and nice-to-use they are -- are a
@@ -34,7 +35,7 @@ First, a quick refresh on the experiment layout, it's the same
4-node experiment as in the
[previous post](/blog/2020/12/12/congestion-avoidance-in-ouroboros/#mb-ecn-in-action)
-{{<figure width="80%" src="/blog/news/20201219-exp.svg">}}
+{{<figure width="80%" src="/blog/20201219-exp.svg">}}
I tried to draw the setup as best as I can in the figure above.
@@ -51,12 +52,12 @@ generated with our _constant bit rate_ ```ocbr``` tool trying to send
about 80 Mbit/s of application-level throughput over the unicast
layer.
-{{<figure width="80%" src="/blog/news/20201219-congestion.png">}}
+{{<figure width="80%" src="/blog/20201219-congestion.png">}}
The graph above shows the bandwidth -- as captured on the congested
100Mbit Ethernet link --, separated for each traffic flow, from the
same pcap capture as in my previous post. A flow can be identified by
-a <destination address, endpoint ID> pair, and since the destination
+a (destination address, endpoint ID)-pair, and since the destination
is all the same, I could filter out the flows by simply selecting them
based on the (64-bit) endpoint identifier.
@@ -141,7 +142,7 @@ from the "wire":
## The network protocol
-{{<figure width="80%" src="/blog/news/20201219-ws-0.png">}}
+{{<figure width="80%" src="/blog/20201219-ws-0.png">}}
We will first have a look at packets captured around the point in time
where the second (red) flow enters the network, about 14 seconds into
@@ -193,7 +194,7 @@ endpoint for the ```ocbr``` server.
## The flow request
-{{<figure width="80%" src="/blog/news/20201219-ws-1.png">}}
+{{<figure width="80%" src="/blog/20201219-ws-1.png">}}
The first "red" packet that was captured is the one for the flow
allocation request, **FLOW REQUEST**[^6]. As mentioned before, the
@@ -234,7 +235,7 @@ relevant for this message.
## The flow reply
-{{<figure width="80%" src="/blog/news/20201219-ws-2.png">}}
+{{<figure width="80%" src="/blog/20201219-ws-2.png">}}
Now, the **FLOW REPLY** message for our request. It originates from our
machine, so you will notice that the TTL is the starting value of 60.
@@ -246,7 +247,7 @@ for this flow.
## Congestion / flow update
-{{<figure width="80%" src="/blog/news/20201219-ws-3.png">}}
+{{<figure width="80%" src="/blog/20201219-ws-3.png">}}
Now a quick look at the congestion avoidance mechanisms. The
information for the Additive Increase / Multiplicative Decrease algorithm is
@@ -255,7 +256,7 @@ active, they experience congestion since the requested bandwidth from
the two ```ocbr``` clients (180Mbit) exceeds the 100Mbit link, and the
figure above shows a packet marked with an ECN value of 11.
-{{<figure width="80%" src="/blog/news/20201219-ws-4.png">}}
+{{<figure width="80%" src="/blog/20201219-ws-4.png">}}
When the packets on a flow experience congestion, the flow allocator
at the endpoint (the one in our _uni-s_ IPCP) will update the sender with
diff --git a/content/en/blog/news/20201219-congestion.png b/content/en/blog/20201219-congestion.png
index 5675438..5675438 100644
--- a/content/en/blog/news/20201219-congestion.png
+++ b/content/en/blog/20201219-congestion.png
Binary files differ
diff --git a/content/en/blog/news/20201219-exp.svg b/content/en/blog/20201219-exp.svg
index 68e09e2..68e09e2 100644
--- a/content/en/blog/news/20201219-exp.svg
+++ b/content/en/blog/20201219-exp.svg
diff --git a/content/en/blog/news/20201219-ws-0.png b/content/en/blog/20201219-ws-0.png
index fd7a83a..fd7a83a 100644
--- a/content/en/blog/news/20201219-ws-0.png
+++ b/content/en/blog/20201219-ws-0.png
Binary files differ
diff --git a/content/en/blog/news/20201219-ws-1.png b/content/en/blog/20201219-ws-1.png
index 0f07fd0..0f07fd0 100644
--- a/content/en/blog/news/20201219-ws-1.png
+++ b/content/en/blog/20201219-ws-1.png
Binary files differ
diff --git a/content/en/blog/news/20201219-ws-2.png b/content/en/blog/20201219-ws-2.png
index 7cd8b7d..7cd8b7d 100644
--- a/content/en/blog/news/20201219-ws-2.png
+++ b/content/en/blog/20201219-ws-2.png
Binary files differ
diff --git a/content/en/blog/news/20201219-ws-3.png b/content/en/blog/20201219-ws-3.png
index 2a6f6d5..2a6f6d5 100644
--- a/content/en/blog/news/20201219-ws-3.png
+++ b/content/en/blog/20201219-ws-3.png
Binary files differ
diff --git a/content/en/blog/news/20201219-ws-4.png b/content/en/blog/20201219-ws-4.png
index 3a0ef8c..3a0ef8c 100644
--- a/content/en/blog/news/20201219-ws-4.png
+++ b/content/en/blog/20201219-ws-4.png
Binary files differ
diff --git a/content/en/blog/20210320-ouroboros-rina.md b/content/en/blog/20210320-ouroboros-rina.md
new file mode 100644
index 0000000..19bbc0a
--- /dev/null
+++ b/content/en/blog/20210320-ouroboros-rina.md
@@ -0,0 +1,930 @@
+---
+date: 2021-03-20
+title: "How does Ouroboros relate to RINA, the Recursive InterNetwork Architecture?"
+linkTitle: "Is Ouroboros RINA?"
+description: "A brief history of Ouroboros"
+author: Dimitri Staessens
+---
+
+```
+There are two kinds of researchers: those that have implemented
+something and those that have not. The latter will tell you that there
+are 142 ways of doing things and that there isn't consensus on which
+is best. The former will simply tell you that 141 of them don't work.
+ -- David Cheriton
+```
+
+When I talk to someone that's interested in Ouroboros, a question that
+frequently pops up is how the project relates to the
+[Recursive InterNet(work) Architecture](https://en.wikipedia.org/wiki/Recursive_Internetwork_Architecture),
+or **RINA**. I usually steer away from going into the technical
+aspects of how the architectures differ, mostly because not many
+people know the details of how RINA works. But the origin of Ouroboros
+definitely lies with our research and our experiences implementing
+RINA, so it's a good question. I'll address it as best as I can,
+without going overboard on a technical level. I will assume the reader
+is at least somewhat familiar with RINA. Also keep in mind that both
+projects are ongoing and should not be considered as "done"; things
+may change in the future. These are my -- inevitably subjective and
+undoubtedly somewhat inaccurate -- recollections of how it went down,
+why Ouroboros exists, and how it's different from RINA.
+
+If you're in a hurry, this is the TL;DR: We spent 4-5 years
+researching RINA in EU-funded projects and understand its concepts and
+ideas very well. However, we looked beyond the premises and the
+us-vs-them mentality of the RINA community and found areas for
+improvement and further simplification. And more than a couple of
+things in RINA that are just plain old wrong. While RINA insiders may
+suggest that Ouroboros is 'RINA-inspired' or use some other phrasing
+that insinuates our prototype is an inferior design or some watered
+down version of RINA: it is not.
+
+And a quick note here: Ouroboros _the network prototype_ has no
+relation to Ouroboros _the Proof-of-Stake protocol_ in the Cardano
+blockchain. That some of the Cardano guys are also interested in RINA
+doesn't help to ease any confusion.
+
+### IBBT meets RINA
+
+I first came into contact with RINA somewhere in 2012, while working
+as a senior researcher in the field of telecommunication networks at
+what was then known as IBBT (I'll save you the abbreviation). IBBT
+would soon be known as iMinds, and is now integrated into
+[IMEC](https://www.imec-int.com). A new research project was going to
+start and our research group was looking for someone to be responsible
+for the IBBT contributions. That project, called
+[IRATI](https://cordis.europa.eu/project/id/317814), was a relatively
+short (2-year) project in the "Future Internet Research and
+Experimentation" (FIRE) area of the _7th framework programme_ of the
+European Commission. I won't go into the details and strategies of
+research funding; the important thing to know is that the objectives
+of FIRE are "hands-on", aimed at building and deploying Internet
+technologies. Given that I had some experience deploying experiments
+(at that time OpenFlow prototypes) on our lab testbeds, I listened to
+the project pitch, an online presentation with Q&A given by the
+project lead, Eduard Grasa from [i2cat](https://i2cat.net/), who
+explained the concepts behind RINA, and got quite excited about how
+elegant this all looked. So I took on the project and read John Day's
+[Patterns in Network Architecture](https://www.oreilly.com/library/view/patterns-in-network/9780132252423/),
+which we later usually referred to as _PNA_. It was also the time
+when I was finishing my PhD thesis, so my PostDoc track was going to
+be for a substantial part on computer network architecture and RINA.
+Unifying
+[Inter-Process Communication](https://en.wikipedia.org/wiki/Inter-process_communication)
+(IPC) and networking. How exciting was that!
+
+IRATI -- Investigating RINA as an Alternative to TCP/IP -- was
+something different from the usual research projects, involving not
+only some substantially new and unfamiliar ideas, but it also relied
+very heavily on software development. Project work was performed as
+part of PhD tracks, so who would do the work? There was a PhD student
+under my guidance working mostly on OpenFlow, Sachin -- one of the
+kindest people I have ever met, and now a professor at TU Dublin --
+and we had a student with us, Sander Vrijders, who had just finished his
+master's thesis. We invited him to talk about a possible PhD track,
+aligned to ongoing and upcoming projects in our group. Sander decided
+to take on the challenge of IRATI and start a PhD track on RINA.
+
+### IRATI
+
+**IRATI** kicked off in January 2013 at i2cat in Barcelona. It was
+followed by a RINA workshop, bringing the project in touch with the
+RINA community, which had its epicenter at Boston University
+(BU). It's where I first met John Day, who gave a 2-day in-depth
+tutorial of RINA. Eduard also presented an outline of the IRATI
+objectives. The project promised an implementation of RINA in Linux
+_and_ FreeBSD/JunOS, with detailed comparisons of RINA against TCP/IP
+in various scenarios, and would also demonstrate interoperability with other
+RINA prototypes: the
+[TINOS prototype](https://github.com/PouzinSociety/tinos) and the
+[TRIA](http://trianetworksystems.com/) prototype. IRATI would also
+prepare the European FIRE testbeds for RINA experiments using the
+prototype. In 2 years, on 870k Euros in research funding. A common
+inside joke at project kick-off meetings in our field was to put a
+wager on the number of slides by which the presentation deck at the
+final project review meeting would differ from the slide decks presented at
+the initial kick-off meeting. IRATI was _not_ going to be one of those
+projects!
+
+With the RINA community gathered at the workshop, there were initial
+ideas for a follow-up research proposal to IRATI. Of course, almost
+every potential participant present was on board.
+
+Three partners were responsible for the implementation: i2cat, who had
+experience on RINA; [Nextworks](https://www.nextworks.it), a
+private-sector company with substantial experience in implementing
+networking solutions; and iMinds/imec, bringing in our testbed
+experience. Interoute (now part of [GTT](https://gtt.net)) validated
+the test scenarios that we would use for evaluations. Boston University
+had an advisory role in the project.
+
+The first work was determining the software design of the
+implementation. IRATI was going to build an in-kernel implementation
+of RINA. A lot of the heavy lifting on the design was already done
+during the project proposal preparation phase, and about 3 months into
+the project, the components to be implemented were
+[well-defined](https://core.ac.uk/download/pdf/190646748.pdf).
+Broadly speaking, there were 3 things to implement: the IPCPs that
+make up the RINA layers (Distributed IPC Facilities, DIFs), the
+component that is responsible for creating and starting these IPCPs
+(the IPC manager), and the core library to communicate between these
+components, called _librina_. The prototype would be built in 3 phases
+over the course of 2 years.
+
+i2cat was going to get started on most of the management parts (IPC
+Manager, based on their existing Java implementation; librina,
+including the Common Distributed Application Protocol (CDAP) and the
+DIF management functions in the normal IPCP) and the Data Transfer
+Protocol (DTP). iMinds was going to be responsible for the kernel
+modules that would allow the prototype to run on top of
+Ethernet. Nextworks was taking a crucial software-architectural role
+in kernel development and software integration. For most of these
+parts we had access to a rough draft of what they were supposed to do:
+John Day's RINA reference model, which we usually referred to as _the
+specs_.
+
+i2cat had a vested interest in RINA and was putting in a lot of
+development effort with 3 people working on the project: Eduard,
+Leonardo Bergesio and Miquel Tarz&aacute;n. Nextworks assigned
+Francesco Salvestrini, an experienced kernel developer, to the
+project. From iMinds, the development effort would come from
+Sander. My personal involvement in the project software development
+was limited, as I still had other ongoing projects (at least until the
+end of 2014) and my main role would be in the experimentation work,
+which was only planned to start after the initial development phase.
+
+The project established efficient lines of communications, mostly
+using Skype and the mailing lists, and the implementation work got
+underway swiftly. I have been fortunate to be a part of a couple of
+projects where collaboration between partners was truly excellent, but
+the level of teamwork in IRATI was unprecedented. There was a genuine
+sense of excitement in everybody involved in the project.
+
+So, Sander's first task was to implement the
+[_shim DIF over Ethernet_](https://ieeexplore.ieee.org/document/6798429).
+This is a Linux loadable kernel module (LKM) that wraps the Ethernet
+802.1Q VLAN with a thin software layer to present itself using the
+RINA API. The VLAN ID would be used as the layer name. No
+functionality would be added to the existing Ethernet protocol, so with
+only the src and dst address fields left, this _shim DIF_ was
+restricted to having only a single application registered at a time,
+and to a single RINA "flow" between the endpoints. We could deploy
+about 4000 of these _shim DIFs_ in parallel to support larger RINA
+networks. The name resolution for endpoint applications was planned to
+use the Address Resolution Protocol (ARP), which was readily
+available in the Linux kernel.
+
+Or so we thought. The ARP implementation in the kernel assumed IPv4 as
+the only L3 protocol (IPv6 doesn't use ARP), so it could not handle
+the resolution of RINA _application names_ to MAC addresses, which we
+needed for the shim DIF. So after some deliberation, we decided to
+implement an RFC 826 compliant version of ARP to support the shim DIF.
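+
+For reference, RFC 826 itself is flexible enough for this: the
+hardware and protocol address lengths are fields in the packet, so a
+protocol address doesn't have to be a 4-byte IPv4 address. A rough
+sketch of the header (illustrative, not the actual IRATI code):
+
+```c
+#include <stdint.h>
+
+/* RFC 826 ARP header; the variable-length sha/spa/tha/tpa address
+   fields follow it, with their lengths given by hln and pln. */
+struct arp_hdr {
+        uint16_t hrd; /* hardware type, 1 = Ethernet                */
+        uint16_t pro; /* protocol type being resolved               */
+        uint8_t  hln; /* hardware address length, 6 for a MAC       */
+        uint8_t  pln; /* protocol address length: free to choose,   */
+                      /* e.g. the length of a RINA application name */
+        uint16_t op;  /* operation, 1 = request, 2 = reply          */
+} __attribute__((packed));
+```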
+
+In the meantime, we also submitted a small 3-partner project proposal
+in the GEANT framework, tailored to researching RINA in an NREN
+(National Research and Education Network) environment. The project was
+led by
+us, partnering with i2cat, and teaming up with
+[TSSG](https://tssg.org/). [IRINA](https://i2cat.net/projects/irina/)
+would kick off in October 2013, meaning we'd have 2 parallel projects
+on RINA.
+
+The project had made quite some progress in its first 6 months: there
+were initial implementations for most of the components, and in terms
+of core prototype functionality, IRATI was quickly overtaking the
+existing RINA prototypes. However, the pace of development in the
+kernel was slower than anticipated and some of the implementation
+objectives were readjusted (and FreeBSD/JunOS was dropped in favor of
+a _shim DIF for Hypervisors_). With an eye on testbed deployments,
+Sander started work on the design of a second _shim DIF_, one that
+would allow us to run the IRATI prototype over TCP/UDP.
+
+In the meantime, the follow-up project that was conceived during the
+first RINA workshop took shape and was submitted. Led by our IRINA
+partner TSSG, it was envisioned to be a relatively large project,
+about 3.3 million Euros in EC contributions, running for 30 months and
+bringing together 13 partners with the objective to build the IRATI
+prototype into what was essentially a carrier network demonstrator for
+RINA, adding _policies_ for mobility, security and reliability.
+[**PRISTINE**](https://cordis.europa.eu/project/id/619305) got
+funded. This was an enormous boon to the RINA community, but also a
+bit of a shock for us as IRATI developers, as the software was already
+a bit behind schedule with a third project on the horizon. The
+furthest we could push back the start of PRISTINE was January 2014.
+
+As the IRATI project was framed within
+[FIRE](https://dl.acm.org/doi/10.1145/1273445.1273460), there was a
+strong implied commitment to get experimental results with the project
+prototype. By the last quarter of 2013, the experimentation work got
+started, and the prototype was getting its first deployment trials on
+the FIRE testbeds. This move to real hardware brought more problems to
+light. The network switches in the OFELIA testbed didn't agree very
+well with our RFC-compliant ARP implementation, dropping every ARP
+packet that didn't carry IPv4 network addresses. One of the testbeds also
+relied on VLANs to separate experiments, which didn't fare well with
+our idea to (ab)use them within an experiment for the _shim
+DIF_. While Sander did the development of the _shim DIFs_ using the
+actual testbed hardware, other components had been developed
+predominantly in a virtual machine environment and had not been
+subjected to the massive parallelism that was available on dual-Xeon
+hardware. The stability of the implementation had to be substantially
+improved to get stable and reliable measurements. These initial trials
+in deploying IRATI also showed that configuring the prototype was very
+time-consuming. The components used JSON configuration files that
+had to be created for each experiment deployment, causing substantial
+overhead.
+
+The clock was ticking and while the IRATI development team was working
+tirelessly to stabilize the stack, I worked on some (kernel) patches
+and fixes for the testbeds so we could use VLANs (on a different
+Ethertype) in our experiment. We would get deployment and stability
+testing done and (internally) release _prototype 1_ before the end of
+the year.
+
+### PRISTINE
+
+January 2014. The PRISTINE kick-off was organized together with a
+workshop, where John Day presented RINA, similar to the IRATI kick-off
+one year earlier, except this time it was in Dublin and the project
+was substantially bigger, especially in headcount. It brought together
+experts in various fields of networking with the intent of them
+applying that experience to developing policies for RINA. But many of
+the participants in the PRISTINE project were very new to RINA, still
+getting to grips with some of the concepts (and John didn't shy away
+from making that abundantly clear).
+
+The first couple of months of PRISTINE were mostly about getting the
+participants up-to-speed with the RINA architecture and defining the
+use-case, which centered on a 5G scenario with highly mobile end-users
+and intelligent edge nodes. It was very elaborate, and the associated
+deliverables were absolute dreadnoughts.
+
+During this PRISTINE ramp-up phase, development of the IRATI prototype
+was going on at a fierce pace. The second project brought in some
+extra developers to work on the IRATI core: Bernat Gaston (i2cat),
+Vincenzo Maffione (Nextworks), and Douwe de Bock (a master's student at
+iMinds). i2cat was focusing on management and flow control and also
+porting the Java user-space parts to C++, Vincenzo was focusing on the
+_shim Hypervisor_, which would allow communications between processes
+running on a VM host and guest, and we were building the shim layer
+to run RINA over TCP and UDP.
+
+By this time, frustrations were starting to creep in. Despite all the
+effort in development, the prototype was not in good shape. The
+development effort was also highly skewed, with i2cat putting in the
+bulk of the work. The research dynamic was also changing. At the start
+of IRATI, there were a lot of ongoing architectural discussions about
+what each component should do, to improve the _specs_, but due to the
+ever increasing time pressure, the teams were working more and more in
+isolation. Getting it _done_ became a lot more important than getting
+it _right_.
+
+All this development had led to very little dissemination output,
+which didn't go unnoticed at project reviews. The upshot of the large
+time-overlap between the two projects was that, in combination with
+the IRATI design paper that got published early-on in the project, we
+could afford to lose out a bit on dissemination in IRATI and try to
+catch up in PRISTINE. But apart from the relatively low output in
+research papers, this project had no real contributions to
+standardization bodies.
+
+In any case, the project had no choice but to push on with
+development, and, despite all difficulties, somewhere mid-2014 IRATI
+had most basic functionalities in place to bring the software, in a
+limited way, into PRISTINE so it could start development of the
+_PRISTINE software development kit (SDK)_ (which was developed by
+people also in IRATI).
+
+Mostly to please the reviewers, we tried to get some standardization
+going, presenting RINA at an ISO JTC1/SC6 meeting in London and also
+at IETF91. Miquel and myself would continue to follow up on
+standardization in SC6 WG7 on "Future Network" as part of PRISTINE,
+gathering feedback on the _specs_ and getting them on the track
+towards ISO RINA standards. I still have many fond memories of my
+experiences discussing RINA within WG7.
+
+The IRATI project was officially ending soon, and the development was
+now focusing on the last functions of the Data Transfer Control
+Protocol (DTCP) component of EFCP, such as retransmission logic
+(delta-t). Other development was now shifted completely out of IRATI
+towards the PRISTINE SDK.
+
+In the meantime, we also needed some experimental
+results. Experimentation with the prototype was a painful and very
+time-consuming undertaking. We finally squeezed a publication at
+Globecom 2014 out of some test results and could combine that with a
+RINA tutorial session.
+
+January 2015, another new year, another RINA workshop. This time in
+Ghent, as part of a Flemish research project called RINAiSense --
+which should be pronounced like the French _renaissance_ -- that would
+investigate RINA in sensor networks (which now falls under the moniker
+"Internet of Things" (IoT)). After the yearly _John Day presents RINA_
+sessions, this was also the time to properly introduce the IRATI
+prototype to everyone with a hands-on VM tutorial session, and to
+introduce [RINAsim](https://rinasim.omnetpp.org/), an OMNET++ RINA
+simulator developed within PRISTINE.
+
+After the workshop, it was time to wrap up IRATI. To an external
+observer, the project may have lacked impact and shown little output in
+publications, and it definitely didn't deliver a convincing case for _RINA as an
+alternative for TCP/IP_. But despite that, I think the project really
+achieved a lot, in terms of bringing for the first time some tools
+that can be used to explore RINA, and for the people that worked on
+it, an incredible experience and deep insights into computer networks
+in general. This would not have been possible without the enthusiasm
+and hard work put in by all those involved, but especially Eduard and
+the i2cat team.
+
+As IRINA was wrapping up, a paper on how the _shim DIF over
+Hypervisors_ could be used to [reduce complexity of VM
+networking](https://ieeexplore.ieee.org/document/7452280) was
+submitted to IEEE COMMAG.
+
+We're approaching the spring of 2015, and IRATI was now officially
+over, but there was no time to rest as the clock was ticking on
+PRISTINE. The project was already halfway through its anticipated 30-month
+runtime, and its first review, somewhere at the end of 2014, wasn't met with
+all cheers, so we had to step up. This was also the period where some
+of my other (non-RINA) projects were running out. Up to then, my
+personal involvement in RINA had been on the (software) design of our
+components, reviewing the _specs_, and the practical hands-on was in
+using the software: deploying it on the testbeds and validating its
+functionality. But now I could finally free up time to help Sander on
+the development of the IRATI prototype.
+
+Our main objective for PRISTINE was _resilient routing_: making
+sure the _DIF_ survives underlying link failures. This has been a
+long-time research topic in our group, so we pretty much knew
+_how_ to do it at a conceptual level. But there were three
+requirements: first and foremost, it needed _scale_: we needed to be
+able to run something that could be called a network, not just 3 or 4
+nodes and not just a _couple_ of flows in the network. Second, it
+needed _stability_: to measure the recovery time, we needed to send
+packets at small but -- more importantly -- steady intervals. And
+thirdly, we needed measurement _tools_.
+
+As part of IRINA, we developed a basic traffic-generator, which would
+be extended for PRISTINE and tailored to suit our needs. Stability was
+improving gradually over time. Our real problem was _scale_, where
+the biggest hurdle was the configuration of the IRATI stack. It was a
+complete nightmare. Almost anything and everything had to be
+preconfigured in _json_. I remember that by that time, Vincenzo had
+developed a tool called the _demonstrator_ based on tiny buildroot VMs
+to create setups for local testing, but this wasn't going to help us
+deploy it on the Fed4FIRE testbeds. So Sander developed one of the
+first orchestrators for RINA, called the _configurator_, for deploying
+IRATI on [emulab](https://www.emulab.net/portal/frontpage.php).
+
+Somewhere around that time, the _one-flow-only-limitation_ of the
+_shim DIF over VLAN_ was showing, and a _shim DIF over Ethernet Link
+Layer Control (LLC)_ was drafted and developed. By mapping endpoints
+to LLC Service Access Points (SAPs), this _shim DIF_ could support
+parallel flows (data flows and management flows) between the client
+IPCPs in the layer above.
+
+With the PRISTINE SDK released as part of "openIRATI" a good month
+after the January workshop, there was another influx of
+code into the prototype for all the new features
+(a.k.a. _policies_). Francesco, who had been managing a lot of the
+software integration, was also leaving the RINA projects. This is the
+point where I really noticed that Sander and Vincenzo were quickly
+losing faith in the future of the IRATI codebase, and the first ideas
+of branching off -- or even starting over -- began to emerge.
+
+The next Horizon-2020-proposal deadline was also approaching, so our
+struggles at that point inspired us to propose developing a more
+elaborate RINA orchestrator and make deployment and experimentation
+with (open)IRATI a much more enjoyable experience. That project,
+[ARCFIRE](https://ict-arcfire.eu/), would start in 2016.
+
+Now, we were still focusing on the basics: getting link state routing
+running, adding some simple _loop-free alternates_ policy to it, based
+on the operation of [IP FRR](https://tools.ietf.org/html/rfc5286) and
+running a bunch of flows over that network to measure packet loss when
+we break a link. Sander was focusing on the policy design and
+implementation; I was going to have a look at the IRATI code for
+scaling up the flow counts, which needed non-blocking I/O. I won't go
+into the details, but after that short hands-on stint in the IRATI
+codebase, I was on board with Sander to start looking at options for
+a RINA implementation beyond IRATI.
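+
+As an aside, the basic loop-free condition from
+[RFC 5286](https://tools.ietf.org/html/rfc5286) that such an LFA
+policy builds on fits in a few lines. A purely illustrative sketch --
+not the actual IRATI policy code:
+
+```c
+#include <stdbool.h>
+#include <stdint.h>
+
+/*
+ * Neighbour n of source s is a loop-free alternate for
+ * destination d iff dist(n, d) < dist(n, s) + dist(s, d),
+ * i.e. n won't loop packets for d back through s.
+ */
+static bool is_loop_free_alternate(uint64_t dist_n_d,
+                                   uint64_t dist_n_s,
+                                   uint64_t dist_s_d)
+{
+        return dist_n_d < dist_n_s + dist_s_d;
+}
+```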
+
+It was now summer 2015; PRISTINE would end in 12 months, and the
+project was committed to openIRATI, so at least for PRISTINE, we again
+had no choice but to plow on. A couple of frustrating months lay
+ahead of us, trying to get experimental results out of a prototype
+that was nowhere near ready for it, and with a code base that was also
+becoming so big and complex that it was impossible to fix for anyone
+but the original developers. This is unfortunately the seemingly
+inescapable fate of any software project whose development cycle is
+heavily stressed by external deadlines, especially deadlines set
+within the rigid timeline of a publicly funded research project.
+
+By the end of summer, we were still a long way off the mark in terms
+of what we hoped to achieve. The traffic generator tool and
+configurator were ready, and the implementation of LFA was as good as
+done, so we could deploy the machines for the use case scenarios,
+which were about 20 nodes in size, on the testbeds. But the deployment
+that actually worked was still limited to a 3-node PoC in a triangle
+that showed the traffic getting routed over the two remaining links if
+a link got severed.
+
+In the meantime, Vincenzo had started work on his own RINA
+implementation, [rlite](https://github.com/vmaffione/rlite), and
+Sander and myself started discussing, on a more and more regular
+basis, what to do. Should we branch off IRATI and try to
+clean it up? Keep only IRATI kernel space and rewrite user space? Hop
+on the _rlite_ train? Or just start over entirely? Should we go
+user-space entirely or keep parts in-kernel?
+
+In the second half of 2015, Sander was heading for a 3-month
+research stint in Boston to work on routing in RINA with John and the
+BU team. By that time, we had ruled out branching off of openIRATI.
+Our estimate was that cleaning up the code base would be more work
+than starting over. We'd have IRATI as an upstream dependency, and
+trying to merge contributions upstream would lead to endless
+discussions and further hamper progress for both projects. IRATI was
+out. Continuing on rlite was still a feasible option. Vincenzo was
+making progress fast, and we knew he was extremely talented. But we
+were also afraid of running into disagreements about how to proceed. In
+the meantime, Sander's original research plans in Boston got subverted
+by a 'major review' decision on the _shim Hypervisor_ article, putting
+priority on getting that accepted and published. When I visited Sander
+in Boston at the end of October, we were again assessing the
+situation, and agreed that the best decision was to start our own
+prototype, to avoid having _too many cooks in the kitchen_.
+Development was not part of some funded project, so we were free to
+evaluate and scrutinize all design decisions, and we could get
+feedback on the RINA mailing lists on our findings. When all
+considerations settled, our own RINA implementation was going to be
+targeting POSIX and be user space only.
+
+We were confident we could get it done, so we took the gamble. ARCFIRE
+was going to start soon, but the first part of the project would be
+tool development. Our experimentation contributions to PRISTINE were
+planned to wrap up by April -- the project was planned to end in June,
+but a 4-month extension pushed it to the end of October. But starting
+May, we'd have some time to work on Ouroboros relatively
+undisturbed. In the very worst case, if our project went down the
+drain, we could still use IRATI or rlite to meet any objectives for
+ARCFIRE. We named our new RINA-implementation-to-be _Ouroboros_, after
+the mythical snake that eats its own tail: it represents recursion, and also
+-- with a touch of imagination -- resembles the operation of a _ring
+buffer_.
+
+### ARCFIRE
+
+Another year, another RINA project kick-off, this time again in
+Barcelona, but without a co-located workshop. ARCFIRE (like
+IRATI before it) was within the FIRE framework, and the objective was
+to get some experiments running with a reasonable number of nodes (on
+the order of 100) to demonstrate stability and scale of the prototypes
+and also to bring tooling to the RINA community. The project was
+coordinated by Sven van der Meer (Ericsson), who had done significant
+work on the PRISTINE use cases, and would focus on the impact of RINA
+on network management. The industry-inspired use cases were brought by
+Diego L&oacute;pez (Telef&oacute;nica), _acteur incontournable_ in the
+Network Functions Virtualization (NFV) world. The project was of
+course topped off with i2cat, Nextworks, and ourselves, as we were
+somewhere in the process of integration into IMEC. The order at hand
+for us was to develop a fleshed-out testbed deployment framework for
+RINA, which we named [Rumba](https://gitlab.com/arcfire/rumba). (A
+rhumba is a bunch of rattlesnakes, and Ouroboros is a snake, and it
+was written in Python -- rhumba already existed, and rumba was an
+accepted alternate spelling).
+
+In early 2016, the RINA landscape was very different from when we
+embarked on IRATI in 2013. There were 2 open-source prototypes: IRATI
+was the de facto standard used in EC projects, but Vincenzo's rlite
+was also becoming available at the time and would be used in
+ARCFIRE. And soon, the development of a third prototype -- _ouroboros_
+-- would start. External perception of RINA in the scientific
+community had also been shifting, and not in a positive direction. At
+the start of the IRATI project, we had the position paper with project
+plans and outlines, and the papers on the _shims_ showed some ways in
+which RINA could be deployed. But other articles trying to demonstrate
+the benefits of RINA were -- despite all the efforts and good will of
+all people involved -- lacking in quality, mostly due to the
+limitations of the software. All these subpar publications did more
+harm than good, as the quality of the publications rubbed off on the
+perceived merits of the RINA architecture as a whole. We were always
+feeling this pressure to publish _something_, _anything_ -- and
+reviewers were always looking for a value proposition -- _Why is this
+better than my preferred solution?_, _Compare this in depth to my
+preferred solution_ -- that we simply couldn't support with data at
+that point in time. And not for lack of want or a lack of trying. But
+at least, ARCFIRE had 2 years to look forward to, a focused scope,
+and by now, the team had a lot of experience in the bag. But for the
+future of RINA, we knew the pressure was on -- this was a _now or
+never_ type of situation.
+
+### Ouroboros
+
+We laid the first stone of Ouroboros on Friday February 12th, 2016. At
+that point in time Ouroboros was still planned as a RINA
+implementation, so we started from the beginning: an empty git
+repository under our cursor, renewed enthusiasm in our minds, fresh
+_specs_ -- still warm from the printer and smelling of toner -- in our
+hands, and Sander's initial software design and APIs in colored marker
+on the whiteboard. Days were long -- we still had work to do on
+PRISTINE, mind you -- and evenings were short. I could now imagine the
+frustration of the i2cat people, who a couple of years prior were
+probably also spending their evenings and nights enthusiastically
+coding on IRATI while, for us, IRATI was still a (very interesting)
+job rather than a passion. We would feel no such frustrations as we
+knew from the outset that the development of Ouroboros was going to be
+a two-man job.
+
+While we were spending half our days gathering and compiling results
+from our _LFA_ experiments for PRISTINE, which -- fortunately or
+unfortunately depending on the way I look at it -- did not result in a
+publication, and half our days on the rumba framework, our early
+mornings and early evenings were filled with discussions on the RINA
+API used in Ouroboros. It was initially based on IRATI. Flow
+allocation used source and destination _naming information_ -- 4
+objects that the RINA _specs_ (correctly, might I add) say should be
+named: Application Process Name, Application Process Instance Id,
+Application Entity Name and Application Entity Instance Id. This
+_naming information_, as in IRATI, was built into a single structure --
+a 4-tuple -- and we were quickly running into a mess, because, while
+these names need to be identified, they are not resolved at the same
+time, nor in the same place. Putting them in a single struct and
+passing that around with NULL values all the time was really ugly. The
+naming API in Ouroboros changed quickly over time, initially saving
+some state in an _init_ call (the naming information of the current
+application, for instance) and later on removing the source naming
+information from the flow allocation protocol altogether, because it
+could so easily be filled with fake garbage that one shouldn't rely on
+it for anything. The four-tuple was then broken up into two
+(name, instance-id) 2-tuples, one for the Process, the other for the
+Entity. But we considered these changes to be just a footnote in the
+RINA service definition -- taste, one could take it or leave it, no
+big deal. Little did we know that these small changes were just the
+start -- the first notes of a gentle, breezy prelude that was slowly
+building towards a fierce, stormy cadenza that would signify the
+severance of Ouroboros from RINA almost exactly one year later.
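+
+To make the naming changes concrete, this is roughly the shape of the
+refactor in C -- hypothetical struct names, not the actual Ouroboros
+headers:
+
+```c
+/* IRATI-style: a single 4-tuple, passed around with NULL members
+   because its parts are resolved at different times and places.  */
+struct naming_info {
+        char * ap_name; /* Application Process Name        */
+        char * api_id;  /* Application Process Instance Id */
+        char * ae_name; /* Application Entity Name         */
+        char * aei_id;  /* Application Entity Instance Id  */
+};
+
+/* Broken up: two independent (name, instance) 2-tuples. */
+struct proc_info { char * name; char * instance; }; /* Process */
+struct ent_info  { char * name; char * instance; }; /* Entity  */
+```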
+
+Another such change was with the _register_ function. To be able to
+reach a RINA application, you need to register it in the _DIF_. When
+we were implementing this, it just struck us that this code was being
+repeated over and over again in applications. And just think about it,
+_how does an application know which DIFs there are in the system?_.
+And if new DIFs are created while the application is running, how does
+it get that information? That's all functionality that would have to be
+included in _every_ RINA application. IRATI has this as a whole set of
+library calls. But we did something rather different. We moved the
+registering of applications _outside_ of the applications
+themselves. It's _application management_, not _IPC_. Think about how
+much simpler this small change makes life for an application
+developer, and a network administrator. Think about how it would be if
+-- in the IP world -- you could create a socket on port 80 or port 443
+_from the shell_, and set options on that socket _from the shell_, and
+then tell your kernel that incoming connections on that socket should
+be sent to this Apache or that Nginx program _from the shell_, and all
+that the Apache or Nginx developers would need to do is call accept()
+and read/write/select/epoll etc calls, instead of having to handle
+sockets and all their options. That's what the bind() and register()
+calls in Ouroboros do for Ouroboros applications: you bind some
+program to a name _from the command line_, you register that name in
+the layer _from the command line_, and all the (server) program has
+to do is call _flow\_accept()_ and it will receive incoming flows. It
+is this change in the RINA API that inspired us to name our very first
+public presentation about Ouroboros, at FOSDEM 2018,
+[IPC in 1-2-3](https://archive.fosdem.org/2018/schedule/event/ipc/).
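+
+To give a feel for it -- with the caveat that the exact ```irm```
+command syntax has evolved over the years, so take the commands in
+the comment below as the gist rather than gospel -- a server looks
+something like this:
+
+```c
+/*
+ * Application management happens from the shell, not in the program,
+ * roughly (hypothetical syntax):
+ *
+ *   irm bind program oecho name oecho    # bind the binary to a name
+ *   irm register name oecho layer home   # register the name
+ *
+ * The program itself only does IPC; a sketch against the dev.h API.
+ */
+#include <ouroboros/dev.h>
+#include <sys/types.h>
+
+int main(void)
+{
+        char buf[128];
+
+        for (;;) {
+                int     fd  = flow_accept(NULL, NULL); /* wait for a flow */
+                ssize_t len = flow_read(fd, buf, sizeof(buf));
+                if (len > 0)
+                        flow_write(fd, buf, len);      /* echo it back    */
+                flow_dealloc(fd);
+        }
+}
+```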
+
+When we tried to propose them to the RINA community, these changes
+were not exactly met with cheers. Our interactions with that community
+were also beginning to change. RINA was the _specs_. Why are we now
+again asking questions about basic things that we implemented in IRATI
+years ago? IRATI shows it works. Want to change the _specs_: talk to
+John.
+
+We had also implemented our first _shim DIF_, which would allow us to run
+the Ouroboros prototype over UDP/IPv4. We started with a UDP shim
+because there is a POSIX sockets API for UDP. Recall that we were
+targeting POSIX, including FreeBSD and MacOS X to make the Ouroboros
+prototype more accessible. But programming interfaces into Ethernet,
+such as _raw sockets_, were not standard between operating systems, so
+we would implement an Ethernet _shim DIF_ later. Now, the Ouroboros
+_shim DIF_ stopped being a _shim_ pretty fast. When we were developing
+the _shim DIFs_ for IRATI, there was one very important rule: we were
+not allowed to add functionality to the protocol we were wrapping with
+the RINA API; we could only _map_ functions that already existed in the
+(Ethernet/UDP) protocol. This -- so the reasoning went -- would
+show that the protocol/layers in the current internet were
+_incomplete_ layers. But that also meant that the functions that were
+not present -- the flow allocator in particular -- would need to be
+circumvented through manual configuration at the endpoints. We weren't
+going to have any of that -- the Ouroboros IPCP daemons all implement
+a flow allocator. You may also be wondering why none of the prototypes
+have a _shim DIF_ directly over IP. It's perfectly possible! But the
+reason is simple: it would use a non-standardized value for the
+_protocol_ field in the IP header, and most IP routers simply drop
+such packets.
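+
+For the curious: a would-be _shim DIF over IP_ would have to pick a
+value for that protocol field, and the experimental numbers 253/254
+(RFC 3692) are about the only honest choice. A hedged sketch of what
+the very first line of such a shim would look like:
+
+```c
+#include <sys/socket.h>
+#include <netinet/in.h>
+
+int main(void)
+{
+        /* A raw IP socket with experimental protocol number 253:
+           perfectly legal, but most IP routers will simply drop
+           packets that aren't a well-known protocol like TCP/UDP. */
+        int fd = socket(AF_INET, SOCK_RAW, 253);
+
+        return fd < 0 ? 1 : 0;
+}
+```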
+
+Somewhere around April, we were starting the implementation of a
+_normal_ IPCP in Ouroboros, and another RINA component was quickly
+becoming a nuisance to me: the _Common Distributed Application
+Protocol_ or _CDAP_. While I had no problem with the objectives of
+CDAP, I was -- to put it mildly -- not a big fan of the
+object-oriented paradigm that was underneath it. Its methods,
+_read/write, create/destroy, start/stop_ make sense to many, but just
+like the HTTP methods PUT/GET/DELETE/POST/... there is nothing
+_fundamental_ about it. It might as well have just one method,
+_[execute](http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html)_.
+It's taste, and it definitely wasn't _my_ taste. I found that it only
+proved my long-held observation that for every engineer there are
+at least three overengineers. I made a bold prediction to Sander: one
+day, we would kick CDAP out of the Ouroboros prototype.
+
+Summer was approaching again. Most of the contributions to PRISTINE
+were in, so the ARCFIRE partners could start to focus on that
+project. There was a risk: ARCFIRE depended on the Fed4FIRE testbeds,
+and Fed4FIRE was ending and its future was not certain. The projected
+target API for _rumba_ was
+[jFed](https://jfed.ilabt.imec.be/).
+To mitigate the risk, we made an inventory of other potential
+testbeds, and to accommodate the wait for the results of the
+funding calls, we proposed (and got) an extension of ARCFIRE by 6
+months, to a 30-month project duration. In the end, Fed4FIRE+ was
+funded, ARCFIRE had some breathing space -- after all, we had to fire
+on all cylinders to get the best possible results and make a case for
+RINA -- and Sander and myself had some extra time to get Ouroboros up
+and running.
+
+Sander quickly developed an Ethernet LLC _shim DIF_ based on the UDP
+one, and after that, we both moved our focus to the key components in
+the _normal IPCP_, implementing the full flow allocator and building
+the data transfer protocol (DTP), and the routing and forwarding
+functionality. CDAP was getting more and more annoying, but apart from
+that, this part of the RINA _specs_ was fairly mature following the
+implementation work in IRATI, and the implementation progress was
+steady and rather uneventful. For now.
+
+Work on the PRISTINE project was wrapped up, and the final
+deliverables were submitted at the end of October. PRISTINE was a
+tough project for us, with very few outcomes. Together with Miquel,
+I did make some progress with RINA standardization in ISO
+JTC1/SC6. But Sander and myself could show few research results, and no
+published papers where we were the main authors. PRISTINE as a whole
+also fell short a bit in its main objectives: the RINA community
+hadn't substantially grown, and its research results were still --
+from an external vantage point -- mediocre. For us, it was a story of
+trying to do too much, too soon. Everyone tried their best, and I
+think we achieved what was achievable given the time and resources we
+had. The project definitely had some nice outcomes. Standardization at
+least got somewhere, with a project in ISO and also some traction
+within the Next Generation Protocols (NGP) group at
+[ETSI](https://www.etsi.org). RINAsim was a nice educational tool,
+especially for visualizing the operation of RINA.
+
+Our lack of publication output was also noticed by our direct
+superiors at the University, who got more and more anxious. The
+relationship deteriorated steadily; we were constantly nagged about
+publications, _minimum viable papers_, and the _value proposition_ of
+RINA: _killer features_, _killer apps_. For us, the simplicity and
+elegance of the design was all we needed as a motivation to
+continue. There were some suggestions to build a simulator instead of
+a full prototype. My feeling was that a simulator would be
+unconvincing to show any _benefits of RINA_ -- I can't express in
+words how much I hated that phrase. To prove anything, simulators need
+to be validated against the real thing. And there are certain pitfalls
+that can only be found in an implementation. This is the reason why I
+chose that particular quote at the top of this blog post. Both parties
+started to sound like broken records to each other; every meeting was
+devolving into a pointless competition in
+who-knows-the-most-workarounds. As the saying goes, arguing with an
+engineer is like wrestling a pig in the mud. There wasn't anything
+constructive or useful to those interactions, so we stopped giving a
+shit -- pardon my French. The Ouroboros prototype was coming along, we
+were confident that we knew what we were doing. All we needed was time
+to get it done. We'd write a paper on Ouroboros when we had one worth
+writing.
+
+By January 2017, we had a minimal working _normal_ IPCP. Sander was
+looking into routing, working on a component we called the _graph
+adjacency manager_ (GAM). As its name suggests, the GAM would be
+responsible for managing links in the network, what would be referred
+to as the _network topology_, and would get policies that instruct it
+how to maintain the graph based on certain parameters. This component,
+however, was short-lived and replaced by an API to connect IPCPs so
+the actual layer management logic could be a standalone program
+outside of the IPCPs instead of a module inside the IPCPs, which is
+far more flexible.
+
+### Ouroboros diverges from RINA
+
+In the meantime, I was implementing and revising _CACEP_, the Common
+Application Connection Establishment Phase that was accompanying CDAP
+in RINA. Discussions on CACEP between Sander and myself were
+interesting and sometimes heated -- whiteboard markers have
+experienced flight and sudden deceleration. CDAP was supposed to
+support different encoding schemes -- the OSI _presentation layer_. We
+were only going to implement Google Protocol Buffers, which was also
+used in IRATI, but the support for others should be there. The flow
+allocator and the RIB were built on top of our CDAP
+implementation. And something was becoming more and more obvious. What
+we were implementing -- agreeing on protocol versions, encoding etc --
+was something rather universal to all protocols. Now, you may
+remember that the flow allocator is passing something -- the
+information needed to connect to a specific Application Entity or
+Application Entity Instance -- that is actually only needed after the
+flow allocation procedure has completed. But after a
+while, it was clear to me that this information should be _there_ in
+that CACEP part, and was rather universal for all application
+connections, not just CDAP. After I presented this to Sander
+(_despair_) over IRC, he actually recognized how this -- to me
+seemingly small -- change impacted the entire architecture. I will
+never forget the exchange, and I saved that conversation as a text
+file. The date was February 24th, 2017.
+
+```
+...
+<despair> nice, so then dev.h is even simpler
+<despair> ae name is indeed not on the layer boundary
+<dstaesse> wait why is dev.h simpler?
+<despair> since ae name will be removed there
+<dstaesse> no
+<dstaesse> would you?
+<despair> yes
+<despair> nobody likes balls on the line
+<despair> it's balls out
+...
+```
+
+Now, RINA experts will (or should) gasp for air when reading this. It
+refers to something that traces back to John's ISO JTC1/SC6 days
+working on Open Systems Interconnection (OSI), when there was a heavy
+discussion ongoing about the "Application Entity": _where was it
+located_? If it was in the _application_, it would be outside of SC6,
+which was dealing with networks, if it was in the network, it would be
+dealt with _only_ in SC6. It was a turf battle between two ISO
+groups, and because Application Entities were usually drawn as a set
+of circles, and the boundary between the network and the application
+as a line, that battle was internally nicknamed -- boys will be boys
+-- the _balls-in, balls-out_ question. If you ever attended one of
+John's presentations, he would take a short pause and then continue:
+"this was the only time that a major insight came from a turf war":
+_the balls were on the line_. The Application Entity needed to be
+known in both the application and the network. Alas! Our
+implementation was clearly showing that this was not the case. The
+balls were _above_ the line: the _network_ (or more precisely: the flow
+allocator) doesn't need to know _anything_ about application entities!
+Then and there, we had found a mistake in RINA.
+
+Ouroboros now had a crisp and clear boundary between the flow in a
+_DIF_, and any connections using that flow in the layer above. Flow
+allocation creates a flow between _Application Instances_ and after
+that, a connection phase would create a _connection_ between
+_Application Entity Instances_. So roughly speaking -- without the
+OSI terminology -- first the network connects the running programs,
+and after that, the programs decide which protocol to use (which can
+be implicit). What was in the _specs_, what the RINA API was actually
+doing, was piggybacking these exchanges! Now, we have no issues with
+that from an operational perspective: _en effet_, the Ouroboros flow
+allocator has a _piggyback API_. But the contents of the piggybacked
+information in Ouroboros are _opaque_. And all this has another, even
+bigger, implication. One that I would only figure out via another line
+of reasoning some time later.
+
+With ARCFIRE rolling along and the implementation of the _rumba_
+framework in full swing, Sander was working on the link-state routing
+policy for Ouroboros, and I started implementing a _Distributed Hash
+Table (DHT)_ that would serve as the directory -- think of the
+equivalent of [DNS-SRV](https://en.wikipedia.org/wiki/SRV_record) for
+a RINA DIF -- a key-value store mapping _application names_ to
+_addresses_ in the layer. The link-state routing component was
+something that was really closely related to the Resource Information
+Base -- the RIB. That RIB was closely coupled with CDAP. Remember that
+prediction that I made about a year prior, somewhere in April 2016? On
+September 9th 2017, two weeks before the ARCFIRE RINA hackathon, CDAP
+was removed from Ouroboros. I still consider it the most satisfying
+[git commit](https://ouroboros.rocks/cgit/ouroboros/commit/?id=45c6615484ffe347654c34decb72ff1ef9bde0f3&h=master)
+of my life, removing 3700 lines of utter uselessness -- CDAP got 3 out
+of 4 characters right. From that day, Ouroboros could definitely not
+be considered a RINA implementation anymore.
+
+It was time to get started on the last big component: DTCP -- the
+_Data Transfer Control Protocol_. When implementing this, a couple of
+things were again quickly becoming clear. First, the implementation
+was proving to be completely independent of DTP. The RINA _specs_, you
+may recall, propose a shared state vector between DTP and DTCP. This
+solves the _fragmentation problem_ in TCP: if an IP fragment gets
+lost, TCP has to resend all fragments; to retransmit only the bytes
+in the lost fragment, TCP would need to know about the fragmentation
+in IP.
+But the code was again speaking otherwise. It was basically telling
+us: TCP was independent of IP. But fragmentation should be in TCP, and
+IP should specify its maximum packet size. Anything else would result
+in an intolerable mess. So that's how we split the _Flow and
+Retransmission Control Protocol_ (FRCP) and the _Data Transfer
+Protocol_ (DTP) in Ouroboros. Another mistake in RINA.
+
+With FRCP split from DTP roughly along the same line as TCP was
+originally split from IP, we had a new question: where to put FRCP?
+RINA has DTCP/DTP in the layer as EFCP. And this resulted in something
+that I found rather ugly: a normal layer would "bootstrap" its traffic
+(e.g. flow allocator) over its own EFCP implementation to deal with
+underlying layers that do not have EFCP (such as the _shim
+DIFs_). Well, fair enough I guess. But there is another thing. One
+that bugged me even more. RINA has an assumption on the _system_, one
+that has to be true. The EFCP implementation -- which is the guarantee
+that packets are delivered, and that they are delivered in-order -- is
+in the IPCP. But the application process that makes use of the IPCP is
+a _different process_. So, in effect, the transfer of data, the IPC,
+between the Application Process and the IPCP has to be reliable and
+preserve data order _by itself_. RINA has no control over this
+part. RINA is not controlling _ALL_ IPC; there is IPC _outside of
+RINA_. Another way of seeing it is this: If a set of processes
+(IPCPs) are needed to provide reliable state synchronization between
+two applications A and B, who is providing reliable state
+synchronization between A and the IPCP? If it's again an IPCP,
+that's _infinite_ recursion! Now -- granted -- this is a rather
+_academic_ issue, because most (all?) computer hardware does provide
+this kind of reliable, order-preserving IPC. However, to me, even
+theoretical issues were issues. I wanted Ouroboros to be able to
+guarantee _ALL_ IPC,
+even between its own components, and not make _any_ assumptions! Then,
+and only then, it would be universal. Then, and only then, the
+_unification of networking and IPC_ would be complete.
+
+The third change in the architecture was the big one. And in
+hindsight, we should already have seen that coming with our
+realization that the application entity was _above the line_: we moved
+FRCP into the application. It would be implemented in the library, not
+in the IPCP, as a set of function calls, just like HTTP
+libraries. Sander was initially skeptical, because to his taste, if a
+single-threaded application uses the library, it should remain
+single-threaded. How could it send acknowledgements, retransmit
+packets etc? And the RINA specs had congestion avoidance as part of
+EFCP/DTCP. At least that shouldn't be in the application!? I agreed,
+but said I was confident that we could make the single-threaded thing
+work by running the functionality as part of the IPC calls,
+read/write/fevent. And congestion avoidance logic should be in the
+IPCP in the flow allocator. And that's how it's implemented now. All
+this meant that Ouroboros layers were not DIFs, and we stopped using
+that terminology.
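+
+To make the library idea concrete, here is a minimal sketch --
+hypothetical code, not the actual Ouroboros implementation -- of how
+FRCP bookkeeping can run inside the blocking I/O calls themselves, so
+a single-threaded application stays single-threaded:
+
+```c
+#include <stddef.h>
+#include <sys/types.h>
+
+struct frcp;                                  /* per-flow FRCP state */
+
+void    frcp_retransmit_due(struct frcp * f); /* resend unacked pkts */
+void    frcp_send_due_acks(struct frcp * f);  /* send pending acks   */
+ssize_t frcp_rcv_in_order(struct frcp * f, void * buf, size_t count);
+
+ssize_t frcp_read(struct frcp * f, void * buf, size_t count)
+{
+        frcp_retransmit_due(f); /* protocol work happens here ...    */
+        frcp_send_due_acks(f);  /* ... as part of the read call      */
+
+        return frcp_rcv_in_order(f, buf, count); /* block + deliver  */
+}
+```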
+
+By now, the prototype was running stable enough for us to go _open
+source_. We got approval from IMEC to release it to the public under
+the GPLv2 / LGPL license, and in early 2018, almost exactly 2 years
+after we started the project, we presented the first public version of
+Ouroboros at FOSDEM 2018 in Brussels.
+
+But we were still running against the clock. ARCFIRE was soon to end,
+and Ouroboros had undergone quite some unanticipated changes that
+meant the implementation was facing the reality of [Hofstadter's
+Law](https://en.wikipedia.org/wiki/Hofstadter%27s_law).
+
+We were again under pressure to get some publications out: to meet
+the ARCFIRE objectives, and for Sander to meet the publication quota
+to finish his PhD. The design of Rumba was interesting enough for a
+[paper](https://www.geni.net/), and the implementation allowed us to
+deploy 3 Recursive Network prototypes (IRATI, rlite and Ouroboros) on
+testbeds using different APIs: jFed for Fed4FIRE and
+[GENI](https://www.geni.net/), Emulab for iMinds virtual wall testbed,
+QEMU using virtual machines, docker using -- well -- docker
+containers, and a local option only for Ouroboros. But we needed more
+publications, so for ARCFIRE Sander had implemented Loop-Free
+Alternates routing in Ouroboros and was getting some larger-scale
+results with them. And I reluctantly started working on a paper on
+Ouroboros -- I still felt the time wasn't right, and we first needed
+to have a full FRCP implementation and full congestion avoidance to
+make a worthwhile analysis. By then I had long had the feeling that my
+days at the university were numbered; it was time to move on, and I was
+either leaving after submitting a publication on Ouroboros, or without
+a publication on Ouroboros.
+
+In May 2018 there was another RINA workshop, where I presented
+Ouroboros. The feedback I got from John was characteristically short:
+_It's stupid_.
+
+We finished the experiments for ARCFIRE, but as with PRISTINE, the
+results were not accepted for publication. During the writing of the
+paper, a final realization came. We had implemented our link-state
+routing a while ago, and it was doing something interesting, akin to
+all link-state routing protocols: a link-state packet that came in on
+some flow, was sent out on all other flows. It was -- in effect --
+doing broadcast. But... OSPF is doing the same. Wait a minute. OSPF
+uses a multicast IP address. But of course! Multicast wasn't what it
+seemed to be. Multicast was broadcast on a layer; creating a multicast
+group was enrolment in that layer. A multicast IP address is a
+broadcast layer name! Let that one sink in. Based on the link-state
+routing code in the _normal IPCP_, I implemented the broadcast IPCP in
+a single night. The _normal IPCP_ was renamed _unicast IPCP_. It had
+all fallen into place, the Ouroboros architecture was shaped.
+
+But we had no value proposition to pitch, no value-added feature, no
+killer app, no unique selling point. Elegance? I received my notice on
+Christmas Eve 2018. Life as a researcher would be over. But what a
+ride those last 3 years had been. I'd do the same all over again.
+
+The [paper](https://arxiv.org/abs/2001.09707) was submitted in January
+2019. We haven't received any word about it since.
+
+With the GPL license on Ouroboros, Sander and myself decided to
+continue to update the prototype and build a bit of a website for
+it. So, if you made it all the way to the end of this blog post: thank
+you for your interest in the project, that's why we did what we did,
+and continue to do what we do.
+
+Stay curious,
+
+Dimitri \ No newline at end of file
diff --git a/content/en/blog/20210402-multicast.md b/content/en/blog/20210402-multicast.md
new file mode 100644
index 0000000..bce44f3
--- /dev/null
+++ b/content/en/blog/20210402-multicast.md
@@ -0,0 +1,479 @@
+---
+date: 2021-04-02
+title: "How does Ouroboros do anycast and multicast?"
+linkTitle: "Does Ouroboros do (any,multi)-cast?"
+description: >
+ Searching for the answer to the question: why do packet networks work?
+author: Dimitri Staessens
+---
+
+```
+Nothing is as practical as a good theory
+ -- Kurt Lewin
+```
+
+How does Ouroboros handle routing and how is it different from the
+Internet? How does it do multicast? That's a good subject for a blog
+post! I assume the reader to be a bit knowledgeable about the Internet
+Protocol (IP) suite. I limit this discussion to IPv4, but generally
+speaking it's also applicable to IPv6. Hope you enjoy the read.
+
+Network communication is commonly split up into four classes based on
+the delivery model of the message. If it is a single source sending to
+a single receiver, it is called _unicast_. This is the way most of the
+traffic on the Internet works: a packet is forwarded to a specific
+destination IP address. This process is then called _unicast routing_.
+If a sender is transmitting a message to _all_ nodes in a network,
+it's called _broadcast_. To do this efficiently, the network will run
+a bunch of protocols to construct some form of _spanning tree_ between
+the nodes in the network, a process referred to as _broadcast
+routing_. If the destination is a specific set of receivers, it's
+called _multicast_. Broadcast routing is augmented with a protocol to
+create groups of nodes, the so-called multicast group, to again create
+some form of a spanning tree between the group members, called
+_multicast routing_. The last class is _anycast_, when the destination
+of the communication is _any_ single member of a group, usually the
+closest.
+
+Usually these concepts are explained in an Internet/IP setting where
+the destinations are (groups of) IP addresses, but the concepts can
+also be generalized towards the naming system: resolving a (domain)
+name to a set of addresses, for instance, which can then be used in a
+multicast implementation called _multidestination
+routing_. Multidestination routing (i.e. specifying a bunch of
+destination addresses in a packet) doesn't scale well.
+
+Can we contemplate other classes? Randomcast (sending to a random
+destination)? Or stupidcast (sending to all destinations that don't
+need to receive the message)? All kidding aside, the 4 classes above
+are likely to be all the _sensible_ delivery models.
+
+### Conundrum, part I
+
+During the development of Ouroboros, it became clearer and clearer to
+us that the distinction based on the delivery model is not a
+_fundamental_ one. If I have to make a -- definitely imperfect --
+analogy, it's a bit like classifying animals by the number of eyes
+they have. Where two eyes is unicast, more is multicast and composite
+eyes broadcast. Now, it will tell you _something useful_ about the
+animals if they are in the 2, multi or composite-eye class, but it's
+not how biologists classify animals. Some animal orders -- spiders --
+have members with 2, 4, 6 and 8 eyes. There are deeper, more
+meaningful distinctions that can be made on more fundamental grounds,
+such as whether the species has a backbone or not, which traces back
+its evolution. What are those fundamental differences for networks?
+
+Take a minute to contemplate the following little conundrum. Consider a
+network that is linear, e.g.
+
+```
+source - node - node - node - node - destination
+```
+
+and imagine observing a packet traveling over every _link_ on this
+linear network, from source to destination. Was that communication
+anycast, unicast, multicast or broadcast? Now this may seem like a
+silly question, but it should become clearer why it's relevant, and --
+in fact -- fundamental. I will come back to this at the end of this
+post.
+
+But first, let's have a look at how it's done, shall we?
+
+### Unicast
+
+This is the basics. IP routers will forward packets based on the
+destination IP address (not in any special range) in their header to
+the host (in effect: an interface) that has been assigned that IP
+address. The forwarding is based on a forwarding table that is
+constructed using some routing protocol (OSPF/IS-IS/BGP/...). I'll
+assume you know how this works, and if not, there are plenty of online
+resources on these protocols.
+
+On unicast in Ouroboros, I will be pretty brief: it operates in a
+similar way as unicast IP: packets are forwarded to a destination
+address, and the layer uses some algorithm to build a forwarding table
+(or more general, a _forwarding function_). In the current
+implementation, unicast is based on IS-IS link-state routing with
+support for ECMP (equal-cost multipath). The core difference with IP
+is that there are _no_ special case addresses: an address is _always_
+uniquely assigned to a single network node. To scale the layer, there
+can be different _levels_ of (link-state) routing in a layer. It's
+very interesting in its own right, but I'll focus on the _modus
+operandi_ in this post, which is: packets get forwarded based on an
+address. I'll take a more in-depth look into Ouroboros addressing in
+(maybe the next) post (or you can find it in the
+[paper](https://arxiv.org/abs/2001.09707)).
+
+### Anycast
+
+IP anycast is a funny thing. It's pretty simple: it's just unicast,
+but multiple nodes (interfaces) in the network have the same address,
+and the shortest path algorithm used in the routing protocol will
+forward the packet to the nearest node (interface) with that
+address. The network is otherwise completely oblivious; there is no
+such thing as an _anycast address_, it's a concept in the mind of
+network engineers.
+
+Now, if your reaction is _that can't possibly work_, you're absolutely
+right! Equal length paths can lead to _route flapping_, where some
+packets would be delivered _over here_ and other packets _over
+there_. That's why IP anycast is not something that _anyone_ can do. I
+can't run this server somewhere in Germany and a clone of it in
+Denver, and yet another clone in Singapore, and give them the same IP
+address. IP anycast is therefore highly restricted to some select
+cases, most notably DNS, NTP and some big Content Delivery Networks
+(CDNs). There is a certain level of trust needed between BGP peers,
+and BGP routers are monitored to remove routes that exhibit
+flapping. In addition, NTP and DNS use protocols that are UDP-based
+with a simple request-response mechanism, so sending subsequent
+packets to a different server isn't a big problem. CDN providers go to
+great _traffic engineering_ lengths to configure their peering
+relations in such a way that the anycast routes are stable. IP anycast
+"works" because there are a lot of assumptions and it's engineered --
+mostly through human interactions -- into a safe zone of
+operations[^1]. In the case of DNS in particular, IP anycast is an
+essential part of the Internet. Being close to a root DNS server
+impacts response times! The alternative would be to specify a bunch of
+alternate servers to try. But it's easier to remember
+[9.9.9.9](https://www.quad9.net/) than a whole list of random IP
+addresses where you have to figure out where they are! IP anycast also
+offers some protection against network failures in case the closest
+server becomes unreachable, but this benefit is relatively small as
+the convergence times of the routing protocols (OSPF/BGP) are on the
+order of minutes (and should be). That's why most hosts usually have 2
+DNS servers configured, because relying on anycast could mean a couple
+of minutes without DNS.
+
+Now, about anycast in Ouroboros, I can again be brief: I won't allow
+multiple nodes with the same address in a layer in the prototype, as
+this doesn't _scale_. Anycast is supported by name resolution. A
+service can be registered at different locations (addresses) and
+resolving such a name will return a (subset of) address(es) from the
+locations. If a flow allocation fails to the closest address, it can
+be repeated to the next one. Name resolution is an inherent function
+of a _unicast layer_, and currently implemented as a Distributed Hash
+Table (DHT). When joining a network (we call this operation
+_enrolment_, Kademlia calls it _join_), a list of known DHT node
+addresses is passed. The DHT stores its &lt;name, address&gt; entries
+in multiple locations (in the prototype this number is 8) and the
+randomness of the DHT hash assignment in combination with caching
+ensures the _proximity_ of the most popular lookups with reasonable
+probability.
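+
+From the application's point of view, anycast is then just allocation
+by name plus a retry. A minimal sketch, assuming a hypothetical name
+"service" registered at multiple addresses (signatures as in the
+prototype's dev API, which may change):
+
+```c
+#include <ouroboros/dev.h>
+#include <time.h>
+
+int connect_anycast(void)
+{
+        struct timespec timeo = {2, 0}; /* 2 s per attempt           */
+        int    fd = -1;
+        int    i;
+
+        /* The layer resolves "service" to a nearby address; if the
+         * allocation fails, a retry can land on another location. */
+        for (i = 0; i < 3 && fd < 0; ++i)
+                fd = flow_alloc("service", NULL, &timeo);
+
+        return fd;
+}
+```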
+
+### Broadcast
+
+IP broadcast is also a funny thing, as it's not really IP that's doing
+the broadcasting. It's a coating of varnish on top of _layer 2_
+broadcast. So let's first look at Ethernet broadcast.
+
+Ethernet broadcast does what you would expect from a broadcasting
+solution. Note that switched Ethernets are confined to a loop-free
+topology by grace of the (Rapid) Spanning-Tree Protocol. A message to
+the reserved MAC address FF:FF:FF:FF:FF:FF will be _broadcast_ by
+every intermediate Layer 2 node to all nodes (interfaces) that are
+connected on that Ethernet segment. If VLANs are defined, the
+broadcast is confined to all nodes (interfaces) on that
+VLAN. Quite nice, no objections _your honor_!
+
+The semantics of IP broadcast are related to the scope of the
+underlying _layer 2_ network. An IP broadcast address is the last "IP
+address" in a _subnet_. So, for instance, in the 192.168.0.0/24
+subnet, the IP broadcast address is 192.168.0.255. When sending a
+datagram to that IP broadcast destination, the Ethernet layer will be
+sending it to FF:FF:FF:FF:FF:FF, and every node _on that Ethernet_
+which has an IP address in the 192.168.0.0/24 network will receive it.
+You'd be forgiven for thinking that an IP broadcast to 255.255.255.255
+should be spread to every host on the Internet, but for obvious
+reasons that's not the case. The semantics of 0.0.0.0/0 is "your own
+local IP subnet on that specific interface". The DHCP protocol, for
+instance, makes use of this. A last thing to mention is that, in
+theory, you could send IP broadcast messages to a _different_ subnet,
+but few routers allow this, because it invites some very obvious
+[smurf attacks](https://en.wikipedia.org/wiki/Smurf_attack).
+Excuse me for finding it more than mildly amusing that standards
+originally
+[_required_](https://tools.ietf.org/html/rfc2644)
+routers to forward directed broadcast packets!
+
+So, in practice, IP broadcast is a _passthrough_ interface towards
+layer 2 (Ethernet, Wi-Fi, ...) broadcast.
+
+In Ouroboros -- like in Ethernet -- broadcast is a rather simple
+affair. It is facilitated by the _broadcast layer_, for which each
+node implements a _broadcast function_: what comes in on one flow,
+goes out on all others. The implementation is a stateless layer that
+-- also like Ethernet -- requires the graph to be a tree. But it has
+no addresses -- in fact, it doesn't even have a _header_ at all!
+Access control is part of _enrolment_, where participants in the
+broadcast layer get read and/or write access to the broadcast layer
+based on credentials. Every message on a broadcast layer is actually
+broadcast. This is the only way -- at least that I know of -- to make
+a broadcast layer _scalable_ to billions of receivers![^2]
+
+So here is the first clue to the answer to the little conundrum at the
+beginning of this post. The Ouroboros model makes a distinction
+between _unicast layers_ and _broadcast layers_, based on the _nature
+of the algorithm_ applied to the message. If it's based on a
+_destination address_ in the message, we call the algorithm
+_FORWARDING_, and if it's sending on all interfaces except the
+incoming one, we call the algorithm _FLOODING_.
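+
+As a minimal sketch (hypothetical code, not the actual IPCP
+internals; the helpers are illustrative), the two algorithms look
+like this:
+
+```c
+#include <stdint.h>
+
+struct pkt { uint64_t dst_addr; /* payload omitted */ };
+
+extern int  n_flows;                    /* flows at this node        */
+extern int  flows[];                    /* their flow descriptors    */
+extern int  fwd_lookup(uint64_t addr);  /* hypothetical helpers      */
+extern void send_on_flow(int fd, struct pkt * p);
+
+/* FORWARDING (unicast layer): the output flow is chosen by looking
+ * up the destination address carried in the packet. */
+void forward(struct pkt * p)
+{
+        send_on_flow(fwd_lookup(p->dst_addr), p);
+}
+
+/* FLOODING (broadcast layer): no addresses needed; send the packet
+ * out on every flow except the one it arrived on. */
+void flood(struct pkt * p, int in)
+{
+        int i;
+
+        for (i = 0; i < n_flows; ++i)
+                if (flows[i] != in)
+                        send_on_flow(flows[i], p);
+}
+```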
+
+An application like 'ping', where one broadcasts a message to a bunch
+of remotes and each one responds back, requires _both_ a broadcast
+layer and a unicast layer of (at least) the same size, with the 'ping'
+application using both[^3]. Tools like _ping_ and _traceroute_ and
+_nmap_ are administrative tools which reveal network information. They
+should only be available to _administrators_.
+
+It's not prohibited to implement an IPCP that does both broadcast (to
+the complete layer) and unicast. In fact, the unicast IPCP in the
+prototype in some sense does it, as we only figured out broadcast
+layers _after_ we implemented the link-state protocol, which is
+effectively broadcasting link-state messages within the unicast
+layer. All it would take is to implement the _flow\_join()_ API in the
+unicast IPCP and send those packets like we send Link-State
+Messages. But I won't do it, for a number of reasons: the first is
+that it is rare to have broadcast layers and unicast layers to be
+exactly the same size. Usually broadcast layers will be much
+smaller. The second is that, in the current implementation, the
+link-state messages are stateful: they have the source address and a
+sequence number to be able to deal with loops in the meshed
+topology. This doesn't scale to the _full_ unicast layer. Creating a
+scalable _do-both-unicast-and-broadcast_ layer would require creating
+a "virtual tree-topology-network" within the unicast layer,
+which is adjacency management. This would require an adjacency
+management module (running something like a hypothetical RSTP that is
+able to scale to billions of nodes) as part of the unicast
+IPCP. Adjacency management is functionality that was removed -- we
+called it the _graph adjacency manager_ -- and the logic was put
+_outside_ of the IPCP and replaced with a _connect_ API so it could be
+scripted as
+part of network management. And the third, and most important, is
+that we like the prototype to reflect the _model_, as it is more
+educational. Unicast layers and broadcast layers _are_ different
+layers. Always have been, and always will be. Combining them in an
+implementation only obfuscates this fact. To make a long (and probably
+confusing) story short, combining unicast and broadcast in a single
+IPCP _can_ be done, but at present I don't see any real benefit in
+doing it, and I'm pretty sure it will be hard to avoid
+[_making a mess_](https://www.cs.utexas.edu/users/EWD/transcriptions/EWD13xx/EWD1304.html)
+out of it.
+
+This transitions us nicely into multicast. Because combining unicast
+and multicast in the same layer is exactly what IP tries to do.
+
+### Multicast
+
+Before looking at IP, let's first have a look at how one would do
+multicast in Ethernet, because it's simpler.
+
+The simplest way to achieve multicast within Ethernet 802.1Q is using
+a VLAN: Configure VLANs on the end hosts and switch, and then just
+broadcast packets on that VLAN. The Ethernet II (MAC) frame will look
+like this:
+
+```
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+| FF:FF:FF:FF:FF:FF | SRC | 0x8100 | TCI | Ethertype | ..
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
+```
+
+The 0x8100 in lieu of the Ethertype specifies that it's a VLAN, the
+Tag Control Information (TCI) has 12 bits that specify a VLAN ID, so
+one can have 4096 parallel VLANs. There are two fields needed to
+achieve the multicast nature of the communications: The destination
+set to the broadcast address, and a VLAN ID that will only be assigned
+to members of the group.
+
+Now, it won't come as a surprise to you, but IP multicast _is_ a funny
+thing. The gist of it is that there are protocols that do group
+management and protocols that assist in building a (spanning) tree.
+There is a range of IP addresses, 224.0.0.0 -- 239.255.255.255 (or in
+binary: starting with 1110), called _class D_, which are only
+allowed as destination addresses. This _class D_ is further subdivided
+in different ranges for different functionalities, such as
+source-specific multicast. An IPv4 multicast packet can be
+distinguished by a single field: a destination address in the _class
+D_ range.
+
+If we compare this with Ethernet above, the _class D_ IP address is
+behaving more like the VLAN ID than the destination MAC _address_. The
+reason IP doesn't need an extra destination address is that the
+_broadcast_ functionality is _implied_ by the _class D_ address range,
+whereas a VLAN also supports unicast via MAC learning.
+
+Ethernet actually also has a special address range for multicast,
+01:00:5E:00:00:00 to 01:00:5E:7F:FF:FF, that copies the last 23 bits
+of the IP multicast address when that host joins the multicast
+tree. The reasoning behind it is this: if there are multiple endpoints
+for an IP multicast tree on the _same_ Ethernet segment, instead of
+the IP router sending individual unicast messages to each of them,
+that last "hop" can use a single Ethernet broadcast message.
+
+Next, Ouroboros. From the discussion of the Ouroboros broadcast layer,
+you probably already figured out how Ouroboros does multicast. The
+same as broadcast! There is literally _zero_ difference. The only
+difference between multicast and broadcast is in the eye of the
+beholder when comparing a unicast layer and a broadcast layer.
+
+There is something else to remember about (Ouroboros) broadcast
+layers: the broadcast function is _stateless_, and _all_ broadcast
+IPCPs are _identical_ in function. The reason I mention this is in
+relation to the problem that I just mentioned above. What if I have a
+broadcast layer, for which a number of endpoints are also connected
+over a _lower_ broadcast layer? Can we, like IP/Ethernet, leverage
+this? And the answer is: no, there is no sharing of information
+between layers, and broadcast layers have no state. But we don't
+really need to! If there is a device with a broadcast IPCP in a lower
+broadcast layer, just add a broadcast IPCP to the higher level
+broadcast layer! It's not a matter of functionality, since the
+functionality for the higher level broadcast layer is _exactly_ the
+same as the lower one.
+
+While I am not eager to mix broadcast and unicast in a single IPCP
+program, I have few objections to creating a single program that
+behaves like multiple IPCPs of the same type. Especially for the
+stateless broadcast IPCP it would be rather trivial to make a single
+program that implements parallel broadcast IPCPs. And allowing
+something like _sublayers_ (like VLANs, with a single tag) is also
+something that can be considered for optimization purposes.
+
+### Conundrum, part II
+
+Now, let's look back at our little riddle, with a packet observed to
+move from source to destination over a linear network.
+
+```
+source - node - node - node - node - destination
+```
+
+Now, if we pedantically apply the definition of one-to-one
+communication given in most textbooks, it is unicast, since it has
+only a single source and a single destination. But what is going on at
+the routing level cannot be known from that observation alone. I hope
+you gave some thought to what information you'd need to be _able to
+tell_.
+
+Let's start with Ethernet. The Ethernet standard says that all MAC
+addresses are unique, so it's not anycast, and there is no _real_
+difference between multicast and broadcast. So, if the address is not
+the broadcast address or in some special range, it's _unicast_, else
+it's multi/broadcast. But really? What if the nodes were hubs instead
+of switches?
+
+What about IP? Bit harder. If it was anycast, it wouldn't have reached
+the destination if there was another node with the same address in
+this particular setup. But in a general IP network, it's not really
+possible to tell the difference between unicast and anycast without
+looking at all reachable node addresses. To know if it was broadcast
+or multicast, it would suffice to know the destination address in the
+packet.
+
+For Ouroboros, all you'd need to tell what was going on is the type of
+layer. To detect anycast, one would need to query the directory to
+check if it returns a single or multiple destination addresses (since
+we don't allow _anycast routing_), and, like Ethernet in a way, it
+makes the distinction between multicast and broadcast rather moot.
+
+### The Ouroboros model
+
+In a nutshell, what does the Ouroboros model say?
+
+First, all communication is composed of either unicast or broadcast,
+and these two types of communications are fundamentally different and
+give rise to distinct types of layers. In a _unicast_ layer, nodes
+implement _FORWARDING_, which moves packets based on a destination
+address. In a _broadcast_ layer, nodes implement _FLOODING_, which
+sends incoming packets out on all links except for the incoming link.
+
+If we leave out the physical layer (wiring, spectrum tuning etc),
+constructing a layer goes through 2 phases: adding a node to a network
+layer (enrolment) and adding links (by which I mean allowed
+adjacencies) between that node and other members of the layer
+(adjacency management). After this the node becomes active in the
+network. During enrolment, nodes can be authenticated and permissions
+are acquired such as read/write access. Both types of layers go
+through this phase. A unicast layer may, in addition, periodically
+disseminate information that enables _FORWARDING_. We call this
+dissemination function _ROUTING_, but if you know a better word that
+avoids confusion, we'll take it. _ROUTING_ is distinct from adjacency
+management, in the sense that adjacency management is administrative,
+and tells the network which links it is allowed to use -- which links
+_exist_. _ROUTING_ will make use of these links and make decisions
+when they are unavailable, for instance due to failures.
+
+Let's apply the Ouroboros model to Ethernet. Ethernet implements both
+types of layers. Enrolment and topology management are taken care of
+by the _spanning tree protocol_. It might be tempting to think that
+STP does _only_ topology management, but that's not really the
+case. Just add a new _root bridge_ to an Ethernet network: at some
+point, that network will go down completely. The default operation of
+Ethernet is as a _broadcast layer_: the default function applied to a
+packet is _FLOODING_. To allow unicast, Ethernet implements _ROUTING_
+via _MAC learning_. MAC learning is thus a highly specific routing
+protocol, that piggybacks its dissemination information on user
+frames, and calculates the forwarding tables based on the knowledge
+that the underlying topology is a _tree_. This brings a caveat: it
+only detects sending interfaces. If there are receivers on an Ethernet
+that never send a frame (but for which the senders know the MAC
+address), that traffic will always be broadcast. And in any case, the
+_first_ packet will always be broadcast.
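+
+A sketch of MAC learning seen this way (hypothetical code; the names
+are illustrative, not any real switch implementation):
+
+```c
+#include <stdint.h>
+
+struct frame { uint64_t src; uint64_t dst; /* payload omitted */ };
+
+extern void mac_table_set(uint64_t mac, int port);
+extern int  mac_table_get(uint64_t mac); /* -1 if not yet learnt     */
+extern void send_on_port(int port, struct frame * f);
+extern void flood(struct frame * f, int in_port);
+
+void handle_frame(struct frame * f, int in_port)
+{
+        int out;
+
+        /* Dissemination, piggybacked on user frames: the sender is
+         * reachable via the port this frame arrived on. */
+        mac_table_set(f->src, in_port);
+
+        out = mac_table_get(f->dst);
+        if (out >= 0)
+                send_on_port(out, f); /* learnt: unicast FORWARDING  */
+        else
+                flood(f, in_port);    /* unknown: fall back to FLOODING */
+}
+```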
+
+Next, VLAN. In Ouroboros speak, a VLAN is an implementation detail
+(important, of course) to logically combine parallel Ethernets. VLANs
+are independent layers, and indeed, must be enrolled (VLAN IDs set on
+bridge interfaces) and will keep their own states of (R)STP and MAC
+learning. Without VLAN, Ethernet is thus a single broadcast layer and
+a single unicast layer. With VLAN, Ethernet is potentially 4096
+broadcast layers and 4096 unicast layers.
+
+If we apply the Ouroboros model to IP, we again see that IP tries to
+implement both unicast and broadcast. A lot is configured
+manually. Enrolment and adjacency management are basically assigning
+IP addresses and, in addition, adding BGP routes and rules. IP has two
+levels of _ROUTING_, one is inside an autonomous system using
+link-state protocols such as OSPF, and on top there is BGP, which is
+disseminating routes as path vectors. Multicast in IP is building a
+broadcast layer, which is identified using a "multicast address",
+which is really the name of that broadcast layer. Enrolment into this
+virtual broadcast layer is handled via protocols such as IGMP, with
+adjacencies managed in many possible ways that involve calculating
+spanning trees based on internal topology information from OSPF or
+other means. The tree is then grafted into the routing table by
+labeling outgoing interfaces with the name of the broadcast
+layer. Yes, _that_ is what adding a multicast destination address to
+an IP forwarding table is _really_ doing! It's just hidden in plain
+sight!
+
+Now, my claim is that the Ouroboros model can be applied to _any_
+packet network technology. To conclude this post, let's take a real
+tricky one: Multi-Protocol Label Switching (MPLS).
+
+MPLS looks very different from Ethernet and IP. It doesn't have
+addresses at all, but uses _labels_, and it can swap and stack labels.
+
+Now, surely, MPLS doesn't fit the unicast layer, which says that every
+node gets an address, and forwards based on the destination address.
+Here's the solution to MPLS: it is a set of broadcast layers! The
+labels are a distributed way of identifying the layer _by its links_,
+instead of a single identifier for the whole layer, like a VLAN or a
+multicast IP address. RSVP / LDP (and their traffic engineering (-TE)
+cousins) are protocols that do enrolment and adjacency management.
+
+I hope this gave you a bit of an insight into the Ouroboros view of
+the world. Course materials on computer networks consist of a
+compendium of technologies and _how_ they work. The Ouroboros model is
+an attempt to figure out _why_ they work.
+
+Stay curious.
+
+Dimitri
+
+
+[^1]: I'm sure someone has or will propose some AI to solve it.
+
+[^2]: Individual links on a broadcast layer can be protected with
+ retransmission and using multi-path routing in the underlying
+ unicast layer.
+
+[^3]: Now that I'm writing this, I put it on my todo list to
+ implement this into the oping application.
diff --git a/content/en/blog/2021115-rejected.md b/content/en/blog/2021115-rejected.md
new file mode 100644
index 0000000..22bd82d
--- /dev/null
+++ b/content/en/blog/2021115-rejected.md
@@ -0,0 +1,61 @@
+---
+date: 2021-11-15
+title: "JACM paper rejected"
+linkTitle: "JACM paper rejected"
+description: "reflections"
+author: Dimitri Staessens
+---
+
+This weekend we got word about the paper we submitted to JACM in early
+2019. Not too surprised that it was rejected. Actually, rather
+surprised that we still heard of it after 3 years. So thanks to the
+reviewer for his/her time. The rejection was justified, and I got
+something useful out of it, despite a couple of the reviewer's
+comments being very wrong[^1].
+
+I've written over 30 research papers in my first years at university;
+most went from first conception to a paper in less than a month. I had
+only 2 rejects. That's because they contained only work and very few
+ideas. I was bored out of my skull. It took me months to write
+the Ouroboros paper. Because I had no clear-cut conclusion yet to work
+towards. And definitely no engineering results.
+
+Publish or perish. To write publications, you need results. To
+get results you need time. To get time you need funding. To get
+funding you need publications. The vicious circle ensuring that
+academics can't take on any long-term high-risk endeavour that doesn't fit the
+ever shortening funding cycles. What a waste of time... Rob Pike
+[saw it 20 years ago](http://doc.cat-v.org/bell_labs/utah2000/utah2000.html).
+
+There's a joke that in most jobs, people hope to win the lottery so
+they can quit. But in academia, they hope to win the lottery so they
+can keep their job.
+
+Carl Sagan famously said that extraordinary claims require
+extraordinary evidence. We've failed (and wasted tons of research
+time) trying to squeeze a
+paper out of this work-in-progress. As I detailed in a
+[previous blog post](/blog/2021/03/20/how-does-ouroboros-relate-to-rina-the-recursive-internetwork-architecture/),
+there is a lot of research and [implementation
+work](https://tree.taiga.io/project/dstaesse-ouroboros/epics) (not
+necessarily in that order) to be done before we can _comfortably_
+write a paper on these ideas. We'll just have to ride it out.
+
+Direction is more important than speed.
+
+Cheers,
+
+Dimitri
+
+[^1]: Especially comments regarding the math. The graph theory
+ definitions in the paper are based on Dieter Jungnickel's
+ sublime
+ [Graphs, Networks and Algorithms](https://link.springer.com/book/10.1007/978-3-642-32278-5).
+ I cannot recommend this work enough to anyone interested in
+ graph theory. The math in the paper has been reviewed before
+ submission by a professor that lectures discrete mathematics to
+ engineering students and additionally, because I wanted a second
+ opinion, a professor in pure mathematics (who had excellent
+ comments, that definitely improved the definitions). I'll take
+ the reviewer's notes as evidence that it was more than justified
+ to add the basic math definitions and build everything up from
+ scratch. I stand by the math in the paper. \ No newline at end of file
diff --git a/content/en/blog/20211226-2022andbeyond.md b/content/en/blog/20211226-2022andbeyond.md
new file mode 100644
index 0000000..e0e076a
--- /dev/null
+++ b/content/en/blog/20211226-2022andbeyond.md
@@ -0,0 +1,89 @@
+---
+date: 2021-12-26
+title: "A brief look into 2022"
+linkTitle: "Plans for 2022"
+description: "Quo vadis?"
+author: Dimitri Staessens
+---
+
+```
+A discipline doesn’t mean that you make sure that you have breakfast
+at eight o’clock in the morning and you are out of the house by half
+past eight. A discipline is that you… if you conceive some thing, then
+you decide whether or not it’s worth following through, and if it’s
+worth following through then you follow it through to its logical
+conclusion, and do it with the best… to the best of your
+ability. That’s a discipline, yes? -- David Bowie
+```
+
+With the end of the year in sight, it's time for a bit of reflection.
+
+2021 was still in the grip of the pandemic. The delta and omicron
+variants of SARS-CoV-2 prove to be some of the most contagious viruses
+known to date. Evolution driving emerging coronavirus variants that
+reduce the effectiveness of the vaccines is a big blow for all to
+bear. The fight against this strain can only be won if we all fight
+together, with respect for the health workers that have been standing
+in the front lines for almost two years now. Follow sound medical
+advice, and stay safe.
+
+But I'm here to reflect on the Ouroboros (O7s) project. 2021 was a bit
+of a slow year, not least because of my own motivation -- or
+lack of it. But some things are worth mentioning.
+
+We rewrote how O7s runs over UDP/IPv4. Instead of using UDP ports
+to mimic the concepts of port IDs as outlined in RINA, we caved in and
+realigned the O7s ipcpd-udp to work more as a "normal" UDP
+service. The UDP port is not re-used as an Endpoint ID; instead, we
+added a small header to the protocol stack, and use a UDP-UDP tunnel
+to transport O7s traffic. This means that clients connecting to
+an O7s network from behind a NAT firewall will not need to add
+port forwarding rules. This should make things easier for people
+trying Ouroboros over IP.
+
+We removed support for Raptor. Raptor was our NetFPGA demo to run
+Ouroboros point-to-point over Ethernet Layer 1 without the use of
+Ethernet MAC layer and addressing. The plan for Raptor was also to
+include a driver that maps user-space memory (like netmap, but then
+tailored to the O7s shared memory layout) to reduce context switches
+to the kernel when sending high packet rates and thus boost
+performance. However, given the niche hardware requirements and our
+current limited resources, we decided to remove it from the project
+instead of continuing to maintain it.
+
+The most important feature this year is probably the InfluxDB
+exporter, which adds a layer of observability to the project and
+allows us to better monitor the internals for much needed debugging so
+we can stabilize the implementation.
+
+2021 also brought an unexpected surprise, in that we received a review
+of the paper we submitted to the Journal of the ACM. After more than
+2 years, we were not expecting any review anymore. Getting review
+comments on an article always comes with mixed feelings. The paper was
+rejected, and I agree with that outcome. But the quality of the peer
+review was extremely disappointing. For a journal of such standing,
+having only a single reviewer doesn't feel acceptable to me. It's a
+waste of time -- both to me and to the reviewers -- to even try to
+improve and re-submit a paper. Eo Romam iterum crucifigi.
+
+So, what's up for 2022? Right now the focus is on **scalability of
+the O7s routing system**. To dynamically construct the addressing for
+a network, bottom-up. To build a global-scale network by continuously
+aggregating independent smaller networks, possibly with different
+routing protocols and different naming schemes, without causing
+disruptions to existing services in those networks. This will need
+some changes in the datapath implementation and support in the flow
+management for multiple data transfer components and multiple
+directories. If this sounds interesting, stay tuned.
+
+Finally, I'm infinitely grateful to anyone that took the time to give
+the prototype a go and then reported back on things that didn't go as
+they expected. This is a very small project; I have only so much time
+I can put in, and any bug reports and constructive feedback are really
+appreciated. That's how this can be taken forward. Do keep it coming.
+
+My best wishes for 2022.
+
+Stay healthy, and above all, stay curious.
+
+Dimitri
diff --git a/content/en/blog/20211229-flow-vs-connection.md b/content/en/blog/20211229-flow-vs-connection.md
new file mode 100644
index 0000000..3806dd2
--- /dev/null
+++ b/content/en/blog/20211229-flow-vs-connection.md
@@ -0,0 +1,351 @@
+---
+date: 2021-12-29
+title: "Behaviour of Ouroboros flows vs UDP sockets and TCP connections/sockets"
+linkTitle: "Flows vs connections/sockets"
+author: Dimitri Staessens
+---
+
+A couple of days ago, I received a very good question from someone who
+was playing around with Ouroboros/O7s. He started from the
+[_oecho_](https://ouroboros.rocks/cgit/ouroboros/tree/src/tools/oecho/oecho.c#n94) tool.
+
+_oecho_ is a very simple application. It establishes what we call a
+"raw" flow. Raw flows have no fancy features, they are the best-effort
+class of packet transport (a bit like UDP). Raw flows do not have an
+Flow-and-retransmission control protocol (FRCP) machine. This person
+changed oecho to use a _reliable_ flow, and slightly modified it, ran
+into some unexpected behaviour,and then asked: **is it possible to
+detect a half-closed connection?** Yes, it is, but it's not
+implemented (yet). But I think it's worth answering this in a fair bit
+of detail, as it highlights some differences between O7s flows and
+(TCP) connections.
+
+A bit of knowledge on the core protocols in Ouroboros is needed, and
+can be found [here](/docs/concepts/protocols/) and the flow allocator
+[here](/docs/concepts/fa/). If you haven't read these in a while, it
+will be useful to first read them to make the most out of this post.
+
+## The oecho application
+
+The oecho server is waiting for a client to request a flow, reads the
+message from the client, sends it back, and deallocates the flow.
+
+The client will allocate a _raw_ flow: the QoS parameter for the flow
+is _NULL_. Then it will write a message, read the response and also
+deallocate the flow.
+
+In a schematic, the communication for this simple application looks
+like this[^1]:
+
+{{<figure width="90%" src="/blog/20211229-oecho-1.png">}}
+
+All the API calls used are inherently _blocking_ calls. They wait for
+some event to happen and do not always return immediately.
+
+First, the client will allocate a flow to the server. The server's
+_flow\_accept()_ call will return when it receives the request, the
+client's _flow\_alloc()_ call will return when the response message is
+received from the server. This exchange agrees on the Endpoint IDs and
+possibly the flow characteristics (QoS) that the application will
+use. For a raw flow, this will only set the Endpoint IDs that will be
+used in the DT protocol[^2]. On the server side, the _flow\_accept()_
+returns, and the server calls _flow\_read()_. While the _flow\_read()_
+is still waiting on the server side, the flow allocation response is
+underway to the client. The reception of the allocation response
+causes the _flow\_alloc()_ call on the client side to return and the
+(raw) flow is established[^3].
+
+Now the client writes a packet, the server reads it and sends it
+back. Immediately after sending that packet, the server _deallocates_
+the flow. The response, containing a copy of the client message, is
+still on its way to the client. After the client receives it, it also
+deallocates the flow. Flow deallocation destroys the state associated
+with the flow and will release the EIDs for reuse. In this case of
+raw, unreliable flows, _flow\_dealloc()_ will return almost
+immediately.
+
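+For reference, a minimal client along the lines of the oecho tool
+might look roughly like this (a sketch: exact signatures and constants
+may differ between versions; check the linked oecho.c source):
+
+```c
+#include <ouroboros/dev.h> /* flow_alloc/read/write/dealloc         */
+
+#include <stdio.h>
+#include <string.h>
+
+int main(void)
+{
+        char   buf[1024];
+        char * msg = "Hey, there!";
+        int    fd;
+
+        /* Raw flow: NULL QoS spec; block indefinitely (NULL timeout). */
+        fd = flow_alloc("oecho", NULL, NULL);
+        if (fd < 0)
+                return -1;
+
+        flow_write(fd, msg, strlen(msg) + 1); /* send the message    */
+        flow_read(fd, buf, sizeof(buf));      /* wait for the echo   */
+        printf("Server replied: %s\n", buf);
+
+        flow_dealloc(fd); /* destroys only local flow state          */
+
+        return 0;
+}
+```
+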
+## Flows vs connections
+
+The most important thing to notice from the diagram for _oecho_ is
+that flow deallocation _does not send any messages_! It only cleans up
+_local_ state. Suppose that the server would send a message to destroy
+the flow immediately after it sends the response. What if that message
+to destroy the flow arrives _before_ the response? When do we destroy
+the state associated with the flow? Flows are not connections. Raw
+flows like the one used in oecho behave like UDP. No guarantees. Now,
+let's have a look at _reliable_ flows, which behave more like TCP.
+
+## A modification to oecho with reliable flows
+
+{{<figure width="90%" src="/blog/20211229-oecho-2.png">}}
+
+To use a reliable flow, we call a _flow\_alloc()_ from the client with
+a different QoS spec (qos_data). The flow allocation works exactly as
+before. The flow allocation request now contains a data QoS request
+instead of a raw QoS request. Upon reception of this request, the
+server will create a protocol machine for FRCP, the protocol in O7s
+that is in charge of delivering packets reliably, in-order and without
+duplicates. FRCP also performs flow control to avoid sending more
+packets than the server can process. When the flow allocation response
+arrives at the client, it will also create an FRCP protocol instance.
+When
+these FRCP instances are created, they are in an initial state where
+the Delta-t timers are _timed out_. This is the state that allows
+starting a new _run_. I will not explain every detail of FRCP here,
+these are explained in the
+[protocols](/docs/concepts/protocols/#flow-and-retransmission-control-protocol-frcp)
+section.
+
+Now, the client sends its first packet, with a randomly chosen
+sequence number (100) and the Data Run Flag (DRF) enabled. The meaning
+of the DRF is that there were no _previously unacknowledged_ packets
+in the currently tracked packet sequence, and it allows to avoid a
+3-way handshake.
+
+When that packet with sequence number 100 arrives in the FRCP protocol
+machine at the server, it will detect that DRF is set to 1, and that
+it is in an initial state where all timers are timed out. It will
+start accepting packets for this new run starting with sequence number
+100. The server almost immediately sends a response packet back. It
+has no active sending run, so a random sequence number is chosen (300)
+and the DRF is set to 1. This packet will contain an acknowledgment
+for the received packet. FRCP acknowledgements contain the lowest
+acceptable packet number (so 101). After sending the packet, the
+server calls _dealloc()_, which will block on FRCP still having
+unacknowledged packets.
+
+Now the client gets the return packet. It has no active incoming run
+(the receiver connection is in the initial timed-out state) and, like
+the server, it will see that the DRF is set to 1 and accept this new
+incoming run starting from sequence number 300. The client has no data
+packets anymore, so the deallocation will send a _bare_
+acknowledgement for 301 and exit. At the server side, the
+_flow\_dealloc()_ call will exit after it receives the
+acknowledgement. Not drawn in the figure, is that the flow identifiers
+(EIDs) will only time out internally after a full Delta-t timeout. TCP
+does something similar and will not reused closed connection state for
+2 * Maximum Segment Lifetime (MSL).
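+
+For completeness, the only client-side change needed for this
+behaviour is the QoS spec passed to _flow\_alloc()_. A sketch,
+assuming the _qos\_data_ preset from the prototype's headers (names
+may differ between versions):
+
+```c
+#include <ouroboros/dev.h> /* flow_alloc(), qosspec_t, qos_data     */
+
+int alloc_reliable(void)
+{
+        /* Reliable flow: pass the data QoS spec instead of NULL.   */
+        qosspec_t qs = qos_data;
+
+        return flow_alloc("oecho", &qs, NULL);
+}
+```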
+
+## Unexpected behaviour
+
+{{<figure width="90%" src="/blog/20211229-oecho-3.png">}}
+
+While playing around with the prototype, a modification was made to
+oecho as above: another _flow_read()_ was added to the client. As you
+can see from the diagram, there will never be a packet sent, and, if
+no timeout is set on the read() operation, after the server has
+deallocated the flow (and re-entered the loop to accept a new flow),
+the client will remain in limbo, forever stuck on the
+_flow\_read()_. And so, I got the following question:
+
+```
+I would have expected the second call to abort with an error
+code. However, the client gets stuck while the server is waiting for a
+new request. Is this expected? If so, is it possible to detect a
+half-closed connection?
+```
+
+## A _"half-closed connection"_
+
+So, first things first: the observation is correct, and that second
+call should (and soon will) exit on an error, as the flow is not valid
+anymore. Currently it will only exit if there is an error in the FRCP
+connection (packet retransmission fails to receive an acknowledgment
+within a certain timeout). It should also exit on a remotely
+deallocated flow. But how will Ouroboros detect it?
+
+Now, a "half closed connection" comes from TCP. TCP afficionados will
+probably think that I need to add something to FRCP, like
+[FIN](https://www.googlecloudcommunity.com/gc/Cloud-Product-Articles/TCP-states-explained/ta-p/78462)
+at the end of TCP to signal the end of a flow[^4]:
+
+```
+      TCP A                                                TCP B
+
+  1.  ESTABLISHED                                          ESTABLISHED
+
+  2.  (Close)
+      FIN-WAIT-1  --> <SEQ=100><ACK=300><CTL=FIN,ACK>  --> CLOSE-WAIT
+
+  3.  FIN-WAIT-2  <-- <SEQ=300><ACK=101><CTL=ACK>      <-- CLOSE-WAIT
+
+  4.                                                       (Close)
+      TIME-WAIT   <-- <SEQ=300><ACK=101><CTL=FIN,ACK>  <-- LAST-ACK
+
+  5.  TIME-WAIT   --> <SEQ=101><ACK=301><CTL=ACK>      --> CLOSED
+
+  6.  (2 MSL)
+      CLOSED
+```
+
+While FRCP performs functions that are present in TCP, not everything
+is so readily transferable. Purely from a design perspective, it's
+just not FRCP's job to keep a flow alive or detect whether the flow is
+alive. Its job is to deliver packets reliably, and all it needs to do
+that job is present. But would adding FINs work?
+
+Well, the server can crash just before the dealloc() call, leaving us
+in the current situation (the client won't receive FINs). To resolve
+that, FRCP would also need a keepalive mechanism. Yes, TCP also has a
+keepalive mechanism. And would adding that solve it? Not to my
+satisfaction, because Ouroboros flows are not connections: they don't
+always have an end-to-end protocol (FRCP) running[^5]. So if we add
+FIN and keepalive to FRCP, we would still need to add something
+_similar_ for flows that don't have FRCP. We would need to duplicate
+the keepalive functionality somewhere else. The main objective of O7s
+is to avoid functional duplication. So, can we kill all the birds with
+one stone? Detect flows that are down? Sure we can!
+
+## Flow liveness monitoring
+
+But we need to take a bird's-eye view of the flow first.
+
+On the server side, the allocated flow has a flow endpoint with
+internal Flow ID (FID 16), to which the oecho server writes using its
+flow descriptor, fd=71. On the client side, the client reads/writes
+from its fd=68, which behind the scenes is linked to the flow
+endpoint with ID 9. On the network side, the flow allocator in the
+IPCPs also reads and writes from these endpoints to transfer packets
+along the network. So, the flow endpoint marks the boundary between
+the application and the "network".
+
+{{<figure width="80%" src="/blog/20211229-oecho-4.png">}}
+
+This is drawn in the figure above. I'll repeat it because it is
+important: the data structure associated with a flow at the endpoints
+is this "flow endpoint". It forms the bridge between the application
+and the network layer. The role of the IRMd is to manage these
+endpoints and the associated data structures.
+
+Flow deallocation is a two-step process: both the IPCP and the
+application have a _dealloc()_ call. The endpoint is only destroyed if
+_both_ the application process and the IPCP signal they are done with
+it. So a _flow\_dealloc()_ from the application only ends its own use
+of the endpoint. This allows the IRMd to keep the endpoint alive until
+it has sent an OK to the IPCP to also deallocate the flow, and the
+IPCP signals it is done with it. Usually, if all goes well, the
+application will deallocate the flow first.
+
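+A rough sketch of that bookkeeping (the names are mine, not the
+actual IRMd code): the endpoint only disappears once both halves are
+done with it.
+
+```C
+#include <stdbool.h>
+
+/* Hypothetical endpoint bookkeeping, not the actual IRMd code. */
+struct flow_endpoint {
+        int  fid;       /* internal flow ID, e.g. 16 or 9 above */
+        bool app_done;  /* application deallocated or crashed   */
+        bool ipcp_done; /* IPCP deallocated or crashed          */
+};
+
+void destroy_endpoint(struct flow_endpoint * fep); /* frees state */
+
+/* Called when either party deallocates its half of the flow. */
+static void dealloc_half(struct flow_endpoint * fep, bool is_app)
+{
+        if (is_app)
+                fep->app_done = true;
+        else
+                fep->ipcp_done = true;
+
+        if (fep->app_done && fep->ipcp_done) /* both are done: */
+                destroy_endpoint(fep);       /* now it's gone  */
+}
+```
+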
+The IRMd also monitors all O7s processes. If it detects an application
+crashing, or an IPCP crashing, it will automatically perform that
+application's half of the flow deallocation, but not the complete
+deallocation. If an IPCP crashes, applications still hold the FRCP
+state and can recover the connection over a different flow[^6].
+
+**Edit: the below section is not correct, but it's interesting to read
+anyway**[^7]. There is a new post, documenting the
+[actual implementation](/blog/2022/02/28/application-level-flow-liveness-monitoring/).
+
+So, now it should be clear that the liveness of a flow has to be
+detected in the flow allocator of the IPCPs, not in the application
+(again, reminder: FRCP state is maintained inside the application).
+The IPCP will detect that its flow has been deallocated locally
+(either intentionally or because of a crash). It's paramount to do it
+here, because of the recursive nature of the network. Flows are
+everywhere, also between "router machines"! Routers usually restrict
+themselves to raw flows. No retransmissions, no flow control, no fuss;
+that's all too expensive to perform at high rates. But they need to be
+able to detect links going down. In IP networks, the OSPF protocol may
+use something like Bidirectional Forwarding Detection (BFD) to detect
+failed adjacencies. And then applications may use TCP keepalive and
+FIN. Or HTTP keepalive. All unneeded functional duplication, symptoms
+of a messy architecture, at least in my book. In Ouroboros, this flow
+liveness check is implemented once, in the flow allocator. It is the
+only place in the Ouroboros system where liveness checks are
+needed. It handles failed allocation, broken connections, terminated
+or crashed applications. Clean. Shipshape. Nice and tidy. Spick and
+span. We call it Flow Liveness Monitoring (FLM).
+
+If I recall correctly, we implemented an FLM in the RINA/IRATI flow
+allocator years ago when we were working on PRISTINE and were trying
+to get loop-free alternate (LFA) routes working, which required
+detecting flows going down. In Ouroboros it is not implemented yet.
+Maybe I'll add it in the near future. Time is in short supply; the
+items on my todo list are not.
+
+## Flows vs connections, a "layered" view
+
+To wrap it up, I tried to represent how O7s functionality is organized
+in a way similar to the OSI/TCP models. I omitted the "physical
+layer", which is handled by dedicated IPCP implementations, such as
+the ipcpd-local, ipcpd-eth, etc. It's not that important here. What is
+important is that O7s splits the functionality that TCP/IP packs into
+two layers (L3/L4) into **3 independent layers**[^8] (and protocols).
+Let's go through O7s from bottom to top.
+
+{{<figure width="80%" src="/blog/20211229-oecho-5.png">}}
+
+The network forwarding layer moves packets between (unicast) IPCP
+data transfer components (the forwarding elements in the model).
+
+The network end-to-end layer does flow monitoring (the FLM explained
+in this post) and also congestion control/avoidance (preventing
+applications from sending more traffic than the network can handle).
+The lifetime of a flow starts at flow allocation, and ends when one of
+the peers deallocates the flow or crashes (or an IPCP at the client or
+server crashes).
+
+The application end-to-end layer does flow control (preventing client
+applications from sending more than the server application can handle)
+and reliability (taken care of by FRCP). Integrity (e.g. a Cyclic
+Redundancy Check to detect packet corruption), authentication and
+encryption are also handled here. Each of these functions can be
+enabled/disabled independently (and is derived from the QoS
+specification passed to the _flow\_alloc()_ call). In essence the
+lifetime of an FRCP connection is _infinite_ (see Watson's Delta-t
+paper if this sounds weird), but FRCP is subdivided into "data runs". A
+failure of a data run (i.e. an FRCP connection record times out with
+unacknowledged packets) is the only thing that causes an FRCP
+connection to terminate. It is up to the application how to deal with
+this. An FRCP connection can last at most as long as an application
+flow. It can potentially recover from IPCP crashes, but not from
+application crashes.
+
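+As a rough illustration of what this looks like from the application:
+the sketch below would request a reliable flow with integrity
+checking and encryption. Note that the exact qosspec_t member names
+are an assumption here and depend on the prototype version; check the
+headers for the real ones.
+
+```C
+/* Decluttered fragment; field names below are indicative only. */
+int       fd;
+qosspec_t qs = qos_raw;  /* start from a raw flow                 */
+
+qs.in_order = 1;         /* reliable, in-order delivery (FRCP)    */
+qs.ber      = 0;         /* no acceptable bit errors: integrity   */
+                         /* check (e.g. CRC) on each packet       */
+qs.cypher_s = 256;       /* request a 256-bit cipher (encryption) */
+
+fd = flow_alloc("oecho", &qs, NULL);
+```
+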
+Finally, the application session layer takes care of establishing,
+maintaining, synchronizing and terminating application
+sessions. Application sessions can be shorter, as long as, or longer
+than the duration of an application flow.
+
+Probably long enough for a blog post. Have yourselves a wonderful new
+year, and above all, stay curious!
+
+Dimitri
+
+[^1]: We are omitting the role of the Ouroboros daemons (IPCPd's and
+ IRMd) for now. There would be a name resolution step for "oecho"
+ to an address in the IPCPds. Also, the IRMd at the server side
+ brokers the flow allocation request to a valid oecho server. If
+ the server is not running when the flow allocation request
+ arrives at the IRMd, O7s can also start the oecho server
+ application _in response_ to a flow allocation request. But
+      going into those details is not needed for this discussion. We
+ focus solely on the application perspective here.
+
+[^2]: Flow allocation has no direct analogue in TCP or UDP, where the
+ protocol to be used and the destination port are known in
+ advance. In any case, flow allocation should not be confused
+ with a TCP 3-way handshake.
+
+[^3]: I will probably do another post on how flow allocation deals
+ with lost messages, as it is also an interesting subject.
+
+[^4]: Or even more bluntly tell me to "just use TCP instead of FRCP".
+
+[^5]: A UDP server that has clients exit or crash is also left to its
+ own devices to clean up the state associated with that UDP
+ socket.
+
+[^6]: This has not been implemented yet, and should make for a nice
+ demo.
+
+[^7]: After implementing the solution below, it became apparent to me
+ that something was off. I needed to leak the FRCP timeout into
+ the IPCP, which is a layer violation. I noted this fact in my
+ commit message, but after more thought, I decided to retract my
+ patch... it just couldn't be right. This layer violation didn't
+      come up when we implemented FLM in the flow allocator in RINA,
+ because RINA puts the whole retransmission logic (called DTCP)
+ in the IPCP.
+
+[^8]: The "recursive layer boundary" in the figure uses the word layer
+ in the sense of a RINA DIF. We didn't adopt the terminology DIF,
+ since it has special meaning in RINA, and O7s' recursive layers
+ are not interchangeable or compatible with RINA DIFs. \ No newline at end of file
diff --git a/content/en/blog/20211229-oecho-1.png b/content/en/blog/20211229-oecho-1.png
new file mode 100644
index 0000000..b15ccd9
--- /dev/null
+++ b/content/en/blog/20211229-oecho-1.png
Binary files differ
diff --git a/content/en/blog/20211229-oecho-2.png b/content/en/blog/20211229-oecho-2.png
new file mode 100644
index 0000000..ea59987
--- /dev/null
+++ b/content/en/blog/20211229-oecho-2.png
Binary files differ
diff --git a/content/en/blog/20211229-oecho-3.png b/content/en/blog/20211229-oecho-3.png
new file mode 100644
index 0000000..9c9152e
--- /dev/null
+++ b/content/en/blog/20211229-oecho-3.png
Binary files differ
diff --git a/content/en/blog/20211229-oecho-4.png b/content/en/blog/20211229-oecho-4.png
new file mode 100644
index 0000000..07e30e8
--- /dev/null
+++ b/content/en/blog/20211229-oecho-4.png
Binary files differ
diff --git a/content/en/blog/20211229-oecho-5.png b/content/en/blog/20211229-oecho-5.png
new file mode 100644
index 0000000..b2f01b7
--- /dev/null
+++ b/content/en/blog/20211229-oecho-5.png
Binary files differ
diff --git a/content/en/blog/20220206-hole-punching.md b/content/en/blog/20220206-hole-punching.md
new file mode 100644
index 0000000..395d1d7
--- /dev/null
+++ b/content/en/blog/20220206-hole-punching.md
@@ -0,0 +1,99 @@
+---
+date: 2022-02-06
+title: "Decentralized UDP hole punching"
+linkTitle: "Decentralized hole punching"
+description: >
+ Can we make O7s-over-UDP scale with many nodes behind firewalls?
+author: Dimitri Staessens
+---
+
+Today, Max Inden from the libp2p project gave a very interesting
+presentation at FOSDEM 2022 about decentralized hole punching, project
+Flare.
+
+The problem is this: if servers A and B are each behind a (possibly
+symmetric) NAT firewall, they can't _directly_ communicate unless the
+firewall opens some ports from the external source to the internal LAN
+destination. Let's assume A's NAT has public address 1.1.1.1 and B's
+NAT has public address 2.2.2.2. If A runs a service, let's say a web
+server on its local LAN address 192.168.0.1 on port 443[^1] -- B cannot
+connect to this server directly. The firewall for A will need to
+forward some port on the public address 1.1.1.1:X to the internal
+address 192.168.0.1:443. If B is also behind a NAT firewall, that
+firewall will need to forward a port on 2.2.2.2:Y towards 1.1.1.1:X.
+In a symmetric NAT, the firewall rule is tied to the remote address,
+so once established, another node will not be able to send traffic to
+1.1.1.1:X, only B can from 2.2.2.2:Y. That's why centralized solutions
+like [STUN](https://en.wikipedia.org/wiki/STUN) may fail on symmetric
+NATs.
+
+What Max describes is basically a timing attack on a NAT firewall. I
+definitely recommend you
+[watch it](https://fosdem.org/2022/schedule/event/peer_to_peer_hole_punching_without_centralized_infrastructure/)
+when the talk becomes available. The specification can be found
+[here](https://github.com/libp2p/specs/blob/master/relay/DCUtR.md).
+Instead of using a central server, consider the following.
+
+If A sends a packet to 2.2.2.2:Y, it will open up a temporary hole in
+its firewall (1.1.1.1:X <-> 2.2.2.2:Y) for the response to
+arrive[^2]. If B sends a packet to 1.1.1.1:X, it will also create a
+temporary hole in its firewall (2.2.2.2:Y <-> 1.1.1.1:X). So, if both
+do this roughly _at the same time_, the packets can slip through, the
+firewall rules become established and B can communicate with A! Pretty
+nifty!
+
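+To make the trick concrete, here is a minimal sketch of one peer's
+side in C. It assumes the coordination already happened out-of-band
+(the agreed moment and the peer's public endpoint are known), that the
+socket is already bound to the local port that created the NAT
+mapping, and it omits all error handling; punch() is a made-up helper,
+not libp2p or O7s code.
+
+```C
+#include <arpa/inet.h>
+#include <stdint.h>
+#include <stdio.h>
+#include <string.h>
+#include <sys/socket.h>
+#include <unistd.h>
+
+/* Both peers run this at (roughly) the same time. */
+int punch(int s, const char * peer_ip, uint16_t peer_port)
+{
+        struct sockaddr_in peer;
+        char               buf[16];
+        int                i;
+
+        memset(&peer, 0, sizeof(peer));
+        peer.sin_family = AF_INET;
+        peer.sin_port   = htons(peer_port);
+        inet_pton(AF_INET, peer_ip, &peer.sin_addr);
+
+        for (i = 0; i < 10; i++) {
+                /* Our outbound packet opens the hole in our own
+                   firewall; the peer's packet slips through it. */
+                sendto(s, "punch", 5, 0,
+                       (struct sockaddr *) &peer, sizeof(peer));
+                if (recv(s, buf, sizeof(buf), MSG_DONTWAIT) > 0) {
+                        printf("Hole punched!\n");
+                        return 0;
+                }
+                sleep(1);
+        }
+
+        return -1;
+}
+```
+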
+Whether this is "decentralized" is a bit debatable, because there
+needs to be some coordination between A and B to get the timing
+right. And what I don't fully understand (yet) is how the ports X and
+Y are known at the time of the hole punching. I *think* there is some
+guesswork involved based on the ports that A and B used to contact the
+node(s) that provided the synchronization information, as NAT
+firewalls may use sequential allocation of these ports. I will try to
+find out more (or read the code).
+
+How would this benefit Ouroboros? Well, most likely in exactly the
+same way as libp2p: firewalls do not pose a connectivity issue, but
+they do pose a scalability issue.
+
+The ipcpd-udp allows running Ouroboros over UDP (over IPv4). What it
+does is create a point-to-point UDP datagram stream with another
+ipcpd-udp. We have redesigned the inner workings a couple of times --
+mainly how the ipcpd-udp juggles around UDP ports. At first, we wanted
+it to mimic how a real unicast IPCP works -- listening on a fixed port
+for incoming requests, and then using randomly chosen ports on either
+side for the actual Ouroboros data 'flow'. But that was quickly thrown
+out because of -- you guessed it -- firewalls, in favor of using the
+listening port also for the incoming O7s data flows. That way, all
+that was needed was to open up a single port on a firewall. Opening up
+that firewall port was also needed for creating outgoing connections;
+the reasoning being that we wanted anyone that would connect TO the
+network to also accept incoming connections FROM the network. This
+would ensure that we could create any mesh between the Ouroboros
+nodes. But after some further deliberation, we caved in and made the
+ipcpd-udp behave like a normal UDP service, allowing incoming
+connections even if the remote "client" ipcpd-udp was not publicly
+available.
+
+{{<figure width="40%" src="/blog/20220206-hole-punching.png">}}
+
+So, the current situation is shown above on the left. The red squares
+represent nodes that are not publicly reachable, the green ones nodes
+that are. By allowing the red nodes, the network will look less like a
+mesh, and more like a centralized 'star' network, putting extra load
+on the "central" green server. What this hole punching technique
+would allow us to do is add a (distributed) auxiliary program on top
+of the Ouroboros layer that coordinates the hole punching for the UDP
+connectivity, so we can add some 'direct' links at the UDP level.
+Definitely something I'll consider later on.
+
+So, if you haven't already, have a look at Max's
+[talk](https://fosdem.org/2022/schedule/event/peer_to_peer_hole_punching_without_centralized_infrastructure/).
+
+Cheers,
+
+Dimitri
+
+[^1]: It works for both TCP and UDP, so I will not specify this further.
+
+[^2]: Provided the firewall doesn't block all outbound traffic on port
+      Y or have some other rule that prevents it.
diff --git a/content/en/blog/20220206-hole-punching.png b/content/en/blog/20220206-hole-punching.png
new file mode 100644
index 0000000..742c064
--- /dev/null
+++ b/content/en/blog/20220206-hole-punching.png
Binary files differ
diff --git a/content/en/blog/20220212-mvc.png b/content/en/blog/20220212-mvc.png
new file mode 100644
index 0000000..6d4fff8
--- /dev/null
+++ b/content/en/blog/20220212-mvc.png
Binary files differ
diff --git a/content/en/blog/20220212-tcp-ip-architecture.md b/content/en/blog/20220212-tcp-ip-architecture.md
new file mode 100644
index 0000000..49fea7a
--- /dev/null
+++ b/content/en/blog/20220212-tcp-ip-architecture.md
@@ -0,0 +1,449 @@
+---
+date: 2022-02-12
+title: "What is wrong with the architecture of the Internet?"
+linkTitle: "What is wrong with the architecture of the Internet?"
+description: "A hard look at the root cause of most problems"
+author: Dimitri Staessens
+---
+
+```
+There are two ways of constructing a software design: One way is to
+make it so simple that there are obviously no deficiencies, and the
+other way is to make it so complicated that there are no obvious
+deficiencies. The first method is far more difficult. -- Tony Hoare
+```
+
+## Introduction
+
+There are two important design principles in computer science that are
+absolutely imperative in keeping the architectural complexity of any
+technological solution (not just computer programs) in check:
+[separation of concerns](https://en.wikipedia.org/wiki/Separation_of_concerns)
+and
+[separation of mechanism and policy](https://en.wikipedia.org/wiki/Separation_of_mechanism_and_policy).
+
+There is no simple 2-line definition of these principles, but here's
+how I think about them. _Separation of concerns_ allows one to break
+down a complex solution into *different subparts* that can be
+implemented independently and in parallel and then integrated into the
+completed solution. _Separation of mechanism_ and _policy_, when
+applied to software abstractions, allows the *same subpart* to be
+implemented many times in parallel, with each implementation applying
+different solutions to the problem.
+
+Both these design principles require the architect to create
+abstractions and define interfaces, but the emphasis differs a
+bit. With separation of concerns, the interfaces define an interaction
+between different components, while with separation of mechanism and
+policy, the interfaces define an interaction towards different
+implementations, basically separating the _what the implementation
+should do_ from the _how the implementation does it_. An interface
+that fully embraces one of these principles, usually embraces the
+other.
+
+One of the best known examples of separation of concerns is the
+_model-view-controller_ design pattern:
+
+{{<figure width="20%" src="/blog/20220212-mvc.png">}}
+
+The model is concerned with maintaining the state of the
+application, the view is concerned with presenting the state of the
+application, and the controller is concerned with manipulating the
+state of the application. The keywords associated with good separation
+of concerns are modularity and information hiding. The view doesn't
+need to know the rules for manipulating the model, and the controller
+doesn't need to know how to present the model.
+
+A very simple example of separation of mechanism and policy is the
+_mechanism_ sort -- returning a list of elements in some order -- which
+can be implemented by different _policies_: quick-sort, bubble-sort or
+insertion-sort. But that's not all there is to it. The key is to keep
+the policy details out of the interface to the mechanism. For sort
+this is simple: for instance, sort(list, order='descending') would be
+an obvious API for a sort mechanism. But it goes much further than
+that. Good separation of mechanism and policy requires abstracting
+every aspect of the solution behind an implementation-agnostic
+interface. That is far from obvious.
+
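+As it happens, the C standard library contains a textbook example of
+this separation: qsort() is the _mechanism_, and the comparison
+callback passed into it is the _policy_, hidden behind one uniform
+interface.
+
+```C
+#include <stdio.h>
+#include <stdlib.h>
+
+/* Policies: how two elements compare. */
+static int ascending(const void * a, const void * b)
+{
+        return *(const int *) a - *(const int *) b;
+}
+
+static int descending(const void * a, const void * b)
+{
+        return *(const int *) b - *(const int *) a;
+}
+
+int main(void)
+{
+        int list[] = { 3, 1, 2 };
+
+        /* Mechanism: qsort neither knows nor cares about order. */
+        qsort(list, 3, sizeof(int), ascending);
+        printf("%d %d %d\n", list[0], list[1], list[2]); /* 1 2 3 */
+
+        qsort(list, 3, sizeof(int), descending);
+        printf("%d %d %d\n", list[0], list[1], list[2]); /* 3 2 1 */
+
+        return 0;
+}
+```
+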
+## Trade-offs
+
+Violations of these design principles can cause a world of hurt. In
+most cases, they do not cause problems with functionality. Even bad
+designs can be made to work. They cause development friction and
+resistance to large-scale changes in the solution. Separation of
+concerns violations make the application less maintainable because
+changes to some part cascade into other parts, causing _spaghetti
+code_. Violating separation of mechanism and policy makes an
+application less nimble because some choices get anchored in the
+solution, for instance choosing a certain encryption library or
+a certain database solution and directly calling these proprietary
+APIs from all parts of the application. Such a tightly locked-in
+dependency can cause serious problems if these dependencies cease to
+be available (deprecation) or show serious defects.
+
+Good design lets development velocities add up. Bad design choices
+slow development because progress that should be independent starts to
+interlock. Ever tried running with your shoelaces knotted to someone
+else's? Whenever one makes a step forward, the other has to catch up.
+
+Often, violations against these 2 principles are made in the name of
+optimization. Let's have a quick look at the trade-offs.
+
+Separation of concerns can have a performance impact, so a choice has
+to be made between current performance and future development
+velocity. In most cases, code that violates separation of concerns is
+harder to adapt and (much) harder to thoroughly _test_. My
+recommendation for developers is to approach such situations by first
+creating the API and implementation _respecting_ separation of
+concerns, and then, after very careful consideration, creating a
+separate additional low-level optimized API with an optimized
+implementation. The optimized implementation can then be tested (and
+performance-evaluated) against the functionality (and performance) of
+the non-optimized one. If later on, functionality needs to be added to
+the implementation, having the non-optimized path will prove a
+timesaver.
+
+Separation of mechanism and policy usually has less of a direct
+performance impact, and the tradeoff is commonly future development
+velocity versus current development time. So if this principle is not
+respected by choice, the driver for it is usually time pressure. If
+only a single implementation is used, what is the point of abstracting
+the mechanism behind an API? More often than not, though, violations
+against mechanism/policy just creep in unnoticed. The negative
+implications are usually only felt a long way down the line.
+
+But we haven't even gotten to the _hardest_ part yet. A well-known
+phrase is that there are 2 hard things in computer science: cache
+invalidation and naming things (and off-by-one errors). I think it
+misses one: _identifying concerns_. Or in other words: finding the
+_right_ abstraction. How do we know when an abstraction is the right
+one? Designs with obvious defects will usually be discarded quickly,
+but most design glitches are not obvious. There is a reason that Don
+Knuth named his tome "The _Art_ of Computer Programming". How can we
+compare abstractions? Can we quantify elegance, or is it just _taste_?
+How much of the complexity of a solution is inherent in the problem,
+and how much complication is added because of imperfect abstraction?
+I don't have an answer to this one.
+
+A commonly used term in software engineering for all these problems is
+_technical debt_. Technical debt is to software as entropy is to the
+Universe. It's safe to state that in any large project, technical debt
+is inevitable and will only accumulate. Fixing technical debt requires
+investing a lot of time and effort and usually brings little immediate
+return on this investment. The first engineering manager that happily
+invests time and money towards refactoring has yet to be born.
+
+## Layer violations in the TCP/IP architecture
+
+Now what do these _software development_ principles have to do with
+the architecture of the TCP/IP Internet[^1]?
+
+I find it funny that the wikipedia page uses the Internet's layered
+architecture as an example for
+[separation of concerns](https://en.wikipedia.org/wiki/Separation_of_concerns)
+because I use it as an example of violations against it.
+The _intent_ is surely there, but the execution is completely lacking.
+
+Let's look at some examples of how the common TCP/IP/Ethernet stack
+violates the 2 precious design principles. In a layered architecture
+(like computer network architectures), they are called _layer violations_.
+
+Layer 1: At the physical layer, Ethernet has a minimum frame size,
+which is required to accurately detect collisions. For 10/100Mbit this
+is 64 bytes. Shorter frames must be _padded_. How to distinguish the
+padding from a packet which actually has zeros at the end of its data?
+Well, Ethernet has a _length_ field in the MAC header. But in DIX
+Ethernet that is an Ethertype, so a _length_ field in the IP header is
+used (both IPv4 and IPv6). A Layer 1 problem is actually propagated
+into Layer 2 and even Layer 3. Gigabit Ethernet has an even larger
+minimum frame size (512 bytes); there, however, the padding is
+properly (and efficiently!) taken care of at Layer 1 by a feature
+called Carrier Extension.
+
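+In code, the violation is easy to show: a receiver cannot tell from
+the Ethernet frame alone where the payload of a padded minimum-size
+frame ends; it has to peek into the IP header. A simplified sketch
+(Ethernet II + IPv4 only, no VLAN tags, no validation):
+
+```C
+#include <stddef.h>
+#include <stdint.h>
+
+/* Return the true length of the IP packet inside a (possibly
+   padded) frame, or 0 if we can't tell. */
+static size_t ip_packet_len(const uint8_t * frame, size_t frame_len)
+{
+        uint16_t ethertype;
+
+        if (frame_len < 14 + 20) /* MAC header + min. IP header */
+                return 0;
+
+        ethertype = (frame[12] << 8) | frame[13];
+        if (ethertype != 0x0800) /* not IPv4: L2 can't help us  */
+                return 0;
+
+        /* The IP "total length" field (bytes 2-3 of the IP header)
+           tells us where the packet ends and the padding begins:
+           a Layer 1 problem solved with Layer 3 information. */
+        return (size_t) ((frame[14 + 2] << 8) | frame[14 + 3]);
+}
+```
+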
+Layer 2: The Ethernet II frame has an
+[Ethertype](https://en.wikipedia.org/wiki/EtherType#Values),
+which is also a layer violation, specifying the encapsulated
+higher-layer protocol. 0x800 for IPv4, 0x86DD for IPv6, 0x8100 for
+tagged VLANs, etc.
+
+Layer 3: Similarly as the Ethertype, IP has a
+[protocol](https://en.wikipedia.org/wiki/List_of_IP_protocol_numbers)
+field, specifying the carried protocol. UDP = 17, TCP = 6. Other tight
+couplings between Layer 2 and Layer 3 are IGMP snooping and even
+basic routing[^2]. One thing worth noting, and often disregarded in
+course materials on computer networks, is that OSI's 7 layers each had
+a _service definition_ that abstracts the function of each layer away
+from the other layers so these layers can be developed
+independently. TCP/IP's implementation was mapped to the OSI layers,
+usually compressed to 5-layers, but TCP/IP _has no such service
+definitions_. The interfaces into Layer 2 and Layer 3 basically _are_
+the protocol definitions. Craft a valid packet according to the rules
+and send it along.
+
+Layer 4: My favorite.
+[Well-known ports](https://en.wikipedia.org/wiki/List_of_TCP_and_UDP_port_numbers).
+HTTP: TCP port 80, HTTPs: TCP port 443, UDP port 443 is now
+predominantly QUIC/HTTP3 traffic. This of course creates a direct
+dependency between application protocols and the network.
+
+Explaining these layer violations to a TCP/IP network engineer is like
+explaining inconsistencies and contradictions in the bible to a
+priest. So why do I care so much, while a lot of IT professionals
+brush this off as nitpicking? Let's first look at what I think are the
+consequences of these seemingly insignificant pet peeves of mine.
+
+## Network Ossification
+
+The term _ossification of the Internet_ is sometimes used to describe
+difficulties in making sizeable changes within the TCP/IP network
+stack -- a lack of _evolvability_. For most technologies, there is a
+steady cycle of innovation, adoption and deprecation. Remember DOS,
+OS/2, Windows 3.1, Windows 95? Right, that's what they are: once
+ubiquitous, now mostly memories. In contrast, "Next Generation
+Internet" designs are mostly "Current Generation Internet +". Plus AI
+and machine learning, plus digital ledger/blockchain, plus big data,
+plus augmented/virtual reality, plus platform as a service, plus
+ubiquitous surveillance. At the physical layer, there's the push for
+higher bandwidth, both in the wireless and wired domains (optics and
+electronics) and at planetary (satellite links) and microscopic
+(nanonetworks) scales. A lot of innovation at the top and the bottom
+of the 7-layer model, but almost none in the core "networking" layers.
+
+The prime example for the low evolvability of the 'net is of course
+the adoption of IPv6, which is now slogging into its third
+decade. Now, if you think IPv6 adoption is taking long, contemplate
+how long it would take to _deprecate_ IPv4. The reason for this is not
+hard to find. There is no separation between mechanism and policy --
+no service interface -- at Layer 3[^3]. Craft a packet from the
+application and send it along. A lot of applications manipulate IP
+addresses and TCP or UDP ports all over their code and
+configurations. The difficulties in deploying IPv6 have been taken as
+a rationale that replacing core network protocols is inherently hard,
+rather than the symptom of an obvious defect in the interfaces between
+the logical assembly blocks of the current Internet.
+
+For application programmers, the network itself has so little
+abstraction that the problem is basically bypassed altogether by
+implementing protocols _on top of_ the 7-layer stack. Far more
+applications are now developed on top of HTTP's connection model and
+its primitives (PUT/GET/POST, ...), resulting in so-called RESTful
+APIs, than on top of TCP. This alleviates at least some of the burden
+of server-side port management, as it can be left to a frontend web
+server application (Apache/Nginx). It is much easier to use a textual
+URI to reach an application than to assign and manage TCP ports on
+public interfaces and having to disseminate them across the
+network[^4]. This holds especially in a microservice architecture,
+where hundreds of small, tailored daemons, often distributed across
+many machines that themselves have interfaces in different IP subnets
+and different VLANs, work together to provide a scalable and reliable
+end-user service. Setting such a service up is one thing. When a
+reorganization in the datacenter happens, moving such a microservice
+deployment more often than not means redoing a lot of the network
+configuration.
+
+Innovating on top of HTTP instead of on top of TCP or UDP may be
+convenient for the application developer, but it is not the be-all and
+end-all solution. HTTP1/2 is TCP-based, and thus far from optimal for
+voice communications and other realtime applications such as
+augmented/virtual reality, now branded the _metaverse_.
+
+The additional complexity of developing applications that directly
+interface with network protocols, compared to the simplicity offered
+by developing on top of HTTP primitives, may drive developers away
+from even attempting it, choosing the 'easy route' and further
+reducing true innovation in networking protocols. Out of sight, out of
+mind. Since the money goes where the (perceived) value is, and it's
+hard to deprecate anything, the protocol real-estate between IP and
+HTTP that is not on the direct IP/TCP/HTTP (or IP/UDP/HTTP3) path may
+fall into further disarray.
+
+We experienced something similar when testing Ouroboros using our
+IEEE 802.2 LLC adaptation layer (the ipcpd-eth-llc). IEEE 802.2 is not
+used that often anymore: most 802.2 LLC traffic that we spotted on our
+networks came from network printers, and the wireless routers were
+forwarding 802.2 packets with all kinds of weird defects. Out of
+sight, out of mind. This brings us nicely to the next problem.
+
+## Protocol ossification
+
+Let's kick this one off with an example. HTTP3[^5] is designed on top
+of UDP. It could have run on top of IP. The reason why it's not is
+mentioned in the original QUIC protocol documentation,
+[RFC 9000](https://datatracker.ietf.org/doc/html/rfc9000):
+_QUIC packets are carried in UDP datagrams to better facilitate
+deployment in existing systems and networks_. What it's basically
+saying is also what we have encountered evaluating new network
+prototypes (RINA and Ouroboros) directly over IP: putting a
+non-standard protocol number in an IP packet will cause any router
+along the way to just drop it. If even _Google_ thinks it's futile...
+
+This is an example of what is referred to as
+[protocol ossification](https://en.wikipedia.org/wiki/Protocol_ossification).
+If a protocol is designed with a flexible structure, but that
+flexibility is never used in practice, some implementation is going to
+assume it is constant.
+
+Instead of the IP "Protocol" field in routers that I used in the
+example above, the usual examples are _middleboxes_ -- hardware that
+perform all kinds of shenanigans on unsuspecting TCP/IP packets. The
+reason why these boxes _can_ work is because of the violations of the
+two important design principles. The example from the wikipedia page,
+on how version negotiation in TLS1.3 was
+[preventing it from getting deployed](https://blog.cloudflare.com/why-tls-1-3-isnt-in-browsers-yet/),
+is telling.
+
+But it happens deeper down the network stack as well. When we were
+working on
+[the IRATI prototype](https://irati.eu/),
+we wanted to run RINA over Ethernet. The obvious thing to do is to use
+the ARP protocol. Its specification,
+[RFC826](https://datatracker.ietf.org/doc/html/rfc826),
+allows any protocol address (L3) to be mapped to a hardware address (L2).
+So we were going to map RINA names (capped at a length of 256 bytes
+to adhere to ARP) to Ethernet addresses.
+But in the Linux kernel,
+[ARP only supports IP](https://github.com/torvalds/linux/blob/master/net/ipv4/arp.c#L7).
+I can all but guarantee that, with all the architectural defects in
+the TCP/IP stack, the "future" mentioned in the code comment will
+never come. Sander actually implemented
+[an RFC826-compliant ARP Linux Kernel Module](https://github.com/IRATI/stack/blob/master/kernel/rinarp/arp826.h)
+when working on IRATI. And we had to move it to a
+[different Ethertype](https://github.com/IRATI/stack/blob/master/kernel/rinarp/arp826.h#L29),
+because the Ethernet switches along the way were dropping the RFC-compliant
+packets as suspicious!
+
+## A message falling on deaf ears
+
+So, why do we care so much about this, while so many in the network
+research community seem not to?
+
+The (continuing) journey that is Ouroboros has its roots in EC-funded
+research into the Recursive InterNetwork Architecture (RINA)[^6]. A couple
+of comments that we received at review meetings or some peer reviews
+from papers stuck with me. I won't go into the details of who, what,
+where and when. All these reviewers were, and still are, top experts
+in their respective fields. But they do present a bit of a picture of
+what I think is the problem when communicating about core
+architectural concepts within the network community[^7].
+
+One comment that popped up, many times actually, is _"I'm no software
+engineer"_. The research projects were very heavy on actual software
+development, so, since we had our interfaces actually implemented, it
+was only natural to us to present them from code. I'm the first to
+agree that _implementation details_ do not matter. There surely is no
+point going over every line of code. As long as we stuck to
+component diagrams and how they interact, everything was fine. But
+when the _interfaces_ came up -- the actual primitives that detailed
+what information was exchanged between components -- interest was
+gone. Those interfaces are what make the architecture shine. We spent
+_literally_ months refining them. At one review, when we started
+detailing these software APIs, there was a direct proposal from one of
+the evaluation experts to "skip this work package and go directly to
+the prototype demonstrations". I kid you not.
+
+This exemplifies something that I've often felt: a bit of a disdain,
+by those involved in research and standardization, for anything that
+even remotely smells like implementation work. Becoming adept in the
+principles of _separation of mechanism and policy_ and _separation of
+concerns_ is a matter of honing one's skill, not of accumulating
+knowledge. If software developers break the principles, it leads to
+spaghetti code. Breaking them at the level of standards leads to
+spaghetti standards. And there can't be a clean implementation of a
+spaghetti standard.
+
+The second comment I recall vividly is "I'm looking for the juicy
+bits", and its derivatives like "What can I take away from this
+project?". A new architecture was not interesting unless we could
+demonstrate new capabilities. We were building a new house on a
+different set of foundations. The reviewers would happily take a look,
+but what they were _really_ interested in, was knocking off the
+furniture. Our plan was really the same, but the other way
+around. Ouroboros (and RINA) aren't about optimizations and new
+capabilities. At least not yet. The point of doing the new
+architecture is to get rid of the ossification, so that when future
+innovations arrive, they can easily be adopted.
+
+## Wrapping up
+
+The core architecture of the Internet is not 'done'. As long as the
+overwhelming consensus is that _"It's good enough"_ that is exactly
+what it will not be. A house built on an unstable foundation can't be
+fixed by replacing the furniture. Plastering the walls might make it
+look more appealing, and fancy furniture might even make it feel
+temporarily like "home" again. But however shiny the new furniture,
+however comfortable the new queen-sized bed, at some point in time the
+once barely-noticeable rot seeping through the walls becomes ever
+more apparent, ever more annoying, ever harder to ignore,
+until the only remaining option is to move out.
+
+When that realization comes, know that some of us have already started
+building on a different foundation.
+
+As always, stay curious.
+
+Dimitri
+
+[^1]: I use Internet in a restrictive sense, meaning the
+ packet-switched TCP/IP network on top of the (optical) support
+ backbones, not for the wider ecosystem on top of (and including)
+ the _world-wide-web_.
+
+[^2]: How do IPv4 packets reach the default IP gateway? A direct
+ lookup by L3 into the L2 ARP table! And why would IPv6 even
+ consider including the MAC address in the IP address if these
+ layers were independent?
+
+[^3]: Having an API is of course no guarantee to fast paced innovation
+      or revolutionary breakthroughs. The slowing innovation in
+      operating systems architecture is partly because of the appeal
+ of compatibility with current standards. Rather than rethinking
+ the primitives for interacting with the OS and providing an
+ adaptation layer for backwards compatibility, performance
+ concerns more often than not nip such approaches in the bud
+ before they are even attempted. Optimization really is the root
+ of all evil. But at least, within the primitives specified by
+ POSIX, monokernels, unikernels, microkernels are still being
+ researched and developed. An API is better than no API.
+
+[^4]: As an example, you reach the microservice on
+ "https://serverurl/service" instead of on
+ "https://serverurl:7639/". This can then redirect to the service
+ on the localhost/loopback interface on the (virtual) machine,
+ and the (TCP) port assigned to the service only needs to be
+ known on that local (virtual) machine. In this way, a single
+ machine can run many microservice components and only expose the
+ HTTPS/HTTP3 port (tcp/udp 443) on external interfaces.
+
+[^5]: HTTP3 is really interesting from an architectural perspective as
+ it optimizes between application layer requests and the network
+ transport protocol. The key problem -- called _head of line
+ blocking_ -- in HTTP2 is, very roughly, this: HTTP2 allows
+ parallel HTTP requests over a single TCP connection to the
+ server. For instance, when requesting an HTML page with many
+ photographs, request all the photographs at the same time and
+ receive them in parallel. But TCP is a single byte stream, it
+ does not know about these parallel requests. If there is packet
+ lost, TCP will wait for the re-transmissions, potentially
+ blocking all the other requests for the other images even if
+ they were not affected by the lost packets. Creating multiple
+ connections for each request also has big overhead. QUIC, on the
+ other hand integrates things so that the requests are also
+ handled in parallel in the re-transmission logic. Interestingly,
+ this maps well onto Ouroboros' architecture which has a
+ distinction between flows and the FRCP connections that do the
+ bookkeeping for re-transmission. To do something like HTTP3
+ would mean allowing parallel FRCP connections within a flow,
+ something we always envisioned and will definitely implement at
+ some point, and mapping parallel application requests on these
+ FRCP connections. How to do HTTP3/QUIC within Ouroboros' flows
+ + parallel FRCP could make a nice PhD topic for someone. But I
+ digress, and I was already digressing.
+
+[^6]: This is the [story all about how](/blog/2021/03/20/how-does-ouroboros-relate-to-rina-the-recursive-internetwork-architecture/).
+
+[^7]: These few examples are meant to highlight what I think is a core
+ difference in priorities between what we tried to achieve with
+ the projects -- a flexible architecture in the long term --
+ versus what most current research and development is targeted at
+ -- fixes for urgent problems and improvements in the short
+ term. I want to stress that we were never treated unfairly by
+ any reviewer, and this section should not be read as a complaint
+ of any sort.
diff --git a/content/en/blog/20220220-half-deallocated-flows.md b/content/en/blog/20220220-half-deallocated-flows.md
new file mode 100644
index 0000000..e7d34dd
--- /dev/null
+++ b/content/en/blog/20220220-half-deallocated-flows.md
@@ -0,0 +1,113 @@
+---
+date: 2022-02-20
+title: "Half-deallocated flows"
+linkTitle: "Flows vs connections/sockets (2)"
+author: Dimitri Staessens
+---
+
+A few weeks back I wrote a post about Ouroboros flows vs TCP
+connections, and how "half-closed connections" should be handled in
+the Ouroboros architecture. This was very basic functionality that was
+sorely missing. You can refresh your memory on that
+[post](/blog/2021/12/29/behaviour-of-ouroboros-flows-vs-udp-sockets-and-tcp-connections/sockets/)
+if needed.
+
+Today I wrapped up an initial implementation without bells and
+whistles (fixed timeout at 120s), and I'll share a bit with you how it
+works.
+
+The modified oecho application looks as follows (decluttered). On the
+server side, we have:
+
+```C
+ while (true) {
+ fd = flow_accept(NULL, NULL);
+
+ printf("New flow.\n");
+
+ count = flow_read(fd, &buf, BUF_SIZE);
+
+ printf("Message from client is %.*s.\n", (int) count, buf);
+
+ flow_write(fd, buf, count);
+
+ flow_dealloc(fd);
+ }
+
+ return 0;
+```
+And on the client side, we have:
+
+```C
+ char * message = "Client says hi!";
+ qosspec_t qs = qos_raw;
+
+ fd = flow_alloc("oecho", &qs, NULL);
+
+ flow_write(fd, message, strlen(message) + 1);
+
+ count = flow_read(fd, buf, BUF_SIZE);
+
+ printf("Server replied with %.*s\n", (int) count, buf);
+
+ /* The server has deallocated the flow, this read should fail. */
+ count = flow_read(fd, buf, BUF_SIZE);
+ if (count < 0) {
+ printf("Failed to read packet: %zd.\n", count);
+ flow_dealloc(fd);
+ return -1;
+ }
+
+ flow_dealloc(fd);
+
+```
+
+Previously, the second flow_read would hang forever (unless a timeout
+was set on the read operation using fccntl, which we didn't do).
+
+Now the IPCP will detect the peer as gone, and mark the flow as DOWN
+to the application.
+
+```
+[dstaesse@heteropoda website]$ oecho
+Server replied with Client says hi!
+Failed to read packet: -1005.
+```
+
+We can see this in a simple test case over the
+loopback adapter:
+
+```
+feb 20 18:50:06 heteropoda irmd[70364]: irmd: Flow on flow_id 13 allocated.
+feb 20 18:50:06 heteropoda irmd[70364]: irmd: Flow on flow_id 12 allocated.
+feb 20 18:50:06 heteropoda irmd[70364]: irmd: Partial deallocation of flow_id 13 by process 70597.
+feb 20 18:50:06 heteropoda irmd[70364]: irmd: Completed deallocation of flow_id 13 by process 70534.
+feb 20 18:50:06 heteropoda irmd[70364]: irmd: New instance (70597) of oecho added.
+feb 20 18:50:06 heteropoda irmd[70364]: irmd: This process accepts flows for:
+feb 20 18:50:06 heteropoda irmd[70364]: irmd: oecho
+feb 20 18:52:13 heteropoda ipcpd-unicast[70405]: flow-allocator: Flow 66 down: Unresponsive peer.
+feb 20 18:52:13 heteropoda irmd[70364]: irmd: Partial deallocation of flow_id 12 by process 70598.
+feb 20 18:52:13 heteropoda irmd[70364]: irmd: Completed deallocation of flow_id 12 by process 70405.
+feb 20 18:52:13 heteropoda irmd[70364]: irmd: Dead process removed: 70598.
+```
+
+In the first 2 lines, the flow between the oecho client and server is
+allocated, creating a flow endpoint 13 at the server side, and flow
+endpoint id 12 at the client side. Then the server calls flow_dealloc
+and the flow is deallocated (lines 3 and 4). The server re-enters its
+accept loop, and it's ready for new incoming flow requests (lines
+5-7). About 2 minutes later, the flow liveness mechanism in the flow
+allocator at the client side detects that the remote is gone, and
+flags the flow as DOWN (line 8). After that, the client's read call
+terminates and the client calls dealloc, after which the flow is
+deallocated (lines 9-10) and the client exits (last line).
+
+Note that this works independently of the QoS of the flow. I'll add a
+configurable timeout soon, and it will work at any scale, from seconds
+to years. I thought seconds should be small enough, but if anyone
+makes a good case for timing out flows at sub-second timescales, I'll
+happily enable it.
+
+Stay curious,
+
+Dimitri \ No newline at end of file
diff --git a/content/en/blog/20220228-flm-app.png b/content/en/blog/20220228-flm-app.png
new file mode 100644
index 0000000..13df9bd
--- /dev/null
+++ b/content/en/blog/20220228-flm-app.png
Binary files differ
diff --git a/content/en/blog/20220228-flow-liveness-monitoring.md b/content/en/blog/20220228-flow-liveness-monitoring.md
new file mode 100644
index 0000000..e94d5a0
--- /dev/null
+++ b/content/en/blog/20220228-flow-liveness-monitoring.md
@@ -0,0 +1,150 @@
+---
+date: 2022-02-28
+title: "Application-level flow liveness monitoring"
+linkTitle: "Flows vs connections/sockets (3)"
+author: Dimitri Staessens
+---
+
+This week I completed the (probably final) implementation of flow
+liveness monitoring, but now in the application. In the next prototype
+version (0.19) Ouroboros will allow setting a keepalive timeout on
+flows. If there is no other traffic to send, either side will send
+periodic keepalive packets to keep the flow alive. If no activity has
+been observed for the keepalive time, the peer will be considered
+down, and IPC calls (flow_read / flow_write) will fail with
+-EFLOWPEER. This does not remove any flow state in the system, it only
+notifies each side that the peer is unresponsive (presumed dead,
+either it crashed, or deallocated the flow). It's up to the
+application how to respond to this event.
+
+The duration can be set using the timeout value on the QoS
+specification. It is specified in milliseconds, currently as a 32-bit
+unsigned integer. This allows timeouts up to about 50 days. Each side
+will send a keepalive packet at 1/4 of the specified period, so with
+the default timeout of 2 minutes, a keepalive goes out every 30
+seconds (this ratio is not configurable yet, but that may be useful at
+some point). To disable keepalive, set the timeout to 0. I've set the
+current default value to 2 minutes, but I'm open to other suggestions.
+
+The modified oecho application looks as follows (decluttered). On the
+server side, we have:
+
+```C
+ while (true) {
+ fd = flow_accept(NULL, NULL);
+
+ printf("New flow.\n");
+
+ count = flow_read(fd, &buf, BUF_SIZE);
+
+ printf("Message from client is %.*s.\n", (int) count, buf);
+
+ flow_write(fd, buf, count);
+
+ flow_dealloc(fd);
+ }
+
+ return 0;
+```
+
+And on the client side, the following example sets a keepalive of 4 seconds:
+```C
+ char * message = "Client says hi!";
+ qosspec_t qs = qos_raw;
+ qs.timeout = 4000;
+
+ fd = flow_alloc("oecho", &qs, NULL);
+
+ flow_write(fd, message, strlen(message) + 1);
+
+ count = flow_read(fd, buf, BUF_SIZE);
+
+ printf("Server replied with %.*s\n", (int) count, buf);
+
+ /* The server has deallocated the flow, this read should fail. */
+ count = flow_read(fd, buf, BUF_SIZE);
+ if (count < 0) {
+ printf("Failed to read packet: %zd.\n", count);
+ flow_dealloc(fd);
+ return -1;
+ }
+
+ flow_dealloc(fd);
+```
+
+Running the client against the server will result in the following
+(-1006 indicates EFLOWPEER):
+
+```
+[dstaesse@heteropoda website]$ oecho
+Server replied with Client says hi!
+Failed to read packet: -1006.
+```
+
+How does it work?
+
+In the
+[first post on this topic](/blog/2021/12/29/behaviour-of-ouroboros-flows-vs-udp-sockets-and-tcp-connections/sockets/),
+I explained my reasoning on how Ouroboros should deal with half-closed
+flows (flow deallocation from one side should eventually result in a
+terminated flow at the other side). The implementation should work
+with any kind of flow, which means we can't put it in the FRCP
+protocol. And thus, I argued, it had to be below the application, in
+the flow allocator. This is also where we implemented it in RINA a few
+years back, so it was easy to think this would directly translate to
+O7s. I was convinced it was right.
+
+I was wrong.
+
+After the initial implementation, I noticed that I needed to leak the
+FRCP timeout (remaining Delta-t) into the IPCP. I was not planning on
+doing that, as it's a _layer violation_. In RINA that is not as
+obvious, as DTCP is already in the IPCP. But in O7s, the deallocation
+first waits for Delta-t to expire in the application[^1] before
+telling the IPCP to get rid of the flow (where it's an instantaneous
+operation). This means that for flows with retransmission, the
+keepalive timeout will first wait for the peer's Delta-t timer to
+expire (because the flow isn't deallocated in the peer's IPCP until it
+does), and then again wait for the keepalive to expire in its own
+IPCP. With 2 minutes each, that means the application would only time
+out 4 minutes after the deallocation. To solve that with keepalive in
+the flow allocator, I would need to pass the timeout to the flow
+allocator, and on dealloc tell it to stop sending keepalives, and wait
+for the longest of [keepalive, delta-t] to expire before getting rid
+of the flow state. It would work; it wouldn't even be a huge mess to
+most eyes. But it bugged me tremendously. It had to be in the
+application, as shown in the figure below.
+
+{{<figure width="80%" src="/blog/20220228-flm-app.png">}}
+
+But this poses a different problem: how to distinguish keepalive
+packets from regular traffic. As I said many times before, it can't be
+in FRCP, as it wouldn't work with raw flows. It also has to work with
+encryption. Raw flows have no header, so I can't mark packets easily,
+and adding a header just for marking keepalives is also a bridge too
+far.
+
+I think I found an elegant solution: _0-length packets_. No header. No
+flags. Nothing. Nada. The receiver side gets notified of a packet with
+a length of 0 bytes on the flow, updates its last activity time, and
+drops the packet without waking up application reads. This works with
+any type of traffic on the flow. 0-byte reads on the receiver already
+have a semantic of a partial read that was completed with exactly the
+buffer size[^2]. An application can still send 0-length packets
+itself, but the effect will simply be that of a purposeful keepalive
+initiated at the sender.
+
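+In pseudo-C, the receive path then boils down to something like this
+(an illustrative sketch with made-up names, not the actual O7s code):
+
+```C
+#include <stddef.h>
+#include <stdint.h>
+#include <time.h>
+
+/* Illustrative names only, not the actual O7s internals. */
+struct fep {
+        time_t last_activity;
+};
+
+void deliver_to_app(struct fep * fep, const uint8_t * pkt, size_t len);
+
+static void fep_rcv(struct fep * fep, const uint8_t * pkt, size_t len)
+{
+        fep->last_activity = time(NULL); /* any packet proves life  */
+
+        if (len == 0)   /* keepalive: no header, no flags; drop it  */
+                return; /* without waking up application reads      */
+
+        deliver_to_app(fep, pkt, len);
+}
+```
+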
+[^1]: Logically in the application. After all packets are
+ acknowledged, the application will exit and the IRMd will just
+ wait for the remaining timeout before telling the IPCP to
+ deallocate the flow. This is also a leak of the timeout from the
+ application to the IRMd, but it's an optimization that is really
+ needed. Nobody wants to wait 4 minutes for an application to
+ terminate after hitting Ctrl-C. This isn't really a clear-cut
+ "layer violation" as the IRMd should be considered part of the
+ Operating System. It's similar to TCP connections being in
+ TIME_WAIT in the kernel for 2 MSL.
+
+
+[^2]: If flow\_read(fd, buf, 128) returns 128, it should be called
+ again. If it returns 0, it means that the message was 128 bytes
+      long. If it returns another value, it is still part of the
+ previous message. \ No newline at end of file
diff --git a/content/en/blog/20220520-oping-flm.md b/content/en/blog/20220520-oping-flm.md
new file mode 100644
index 0000000..6d4ce08
--- /dev/null
+++ b/content/en/blog/20220520-oping-flm.md
@@ -0,0 +1,87 @@
+---
+date: 2022-05-20
+title: "What is there to learn from oping about flow liveness monitoring?"
+linkTitle: "Learning from oping (1): cleaning up"
+author: Thijs Paelman
+---
+
+### Cleaning up flows
+
+While I was browsing through some oping code
+(trying to get a feeling about how to do [broadcast](https://ouroboros.rocks/blog/2021/04/02/how-does-ouroboros-do-anycast-and-multicast/#broadcast)),
+I stumbled upon the [cleaner thread](https://ouroboros.rocks/cgit/ouroboros/tree/src/tools/oping/oping_server.c?id=bec8f9ac7d6ebefbce6bd4c882c0f9616f561f1c#n54).
+As we can see, it was used to clean up 'stale' flows (code sanitized below):
+
+```C
+void * cleaner_thread(void * o)
+{
+ int deadline_ms = 10000;
+
+ while (true) {
+ for (/* all active flows i */) {
+
+ diff = /* diff in ms between last valid ping packet and now */;
+
+ if (diff > deadline_ms) {
+ printf("Flow %d timed out.\n", i);
+ flow_dealloc(i);
+ }
+ }
+ sleep(1);
+ }
+}
+```
+
+But since version 19.x we have flow liveness monitoring (FLM), which does this for us!
+So all this code could be thrown away, right?
+
+Turns out I was semi-wrong!
+It's all about semantics, or 'what do you want to achieve'.
+
+If this thread was there to clean up flows whose peers deallocated their end (and stopped sending keep-alives),
+then we could throw it away by all means, because FLM does that job!
+
+Or was it there to clean up valid flows whose peers didn't send any ping packets anymore (they *do* send keep-alives, otherwise FLM kicks in)?
+Then we should of course keep it, because this is a server-side decision to cut those peers off.
+This might protect for example against client implementations which connect, send a few pings, but then leave the flow open.
+Or a better illustration of the 'cleaner' thread might be to cut off peers after 100 pings (sketched below),
+showing that this decision to 'clean up' has nothing to do with flow timeouts.
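+
+Such a policy could look like this (a sketch, sanitized like the snippet above; `pings[]` is a hypothetical per-flow counter):
+
+```C
+/* A server-side policy decision, unrelated to flow timeouts. */
+pings[fd]++;
+if (pings[fd] > 100) {
+        printf("Flow %d used up its pings.\n", fd);
+        fset_del(server.flows, fd);
+        flow_dealloc(fd);
+        pings[fd] = 0;
+}
+```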
+
+### Keeping timed-out flows
+
+On the other side of the spectrum, we have those flows that are timing out (no keep-alives are coming in anymore).
+This is my proposal for the server-side parsing of messages:
+
+```C
+while(/* get next fd on which an event happened */) {
+ msg_len = flow_read(fd, buf, OPING_BUF_SIZE);
+ if (msg_len < 0) {
+ /* if-statement is the only difference with before */
+ if (msg_len == -EFLOWPEER) {
+ fset_del(server.flows, fd);
+ flow_dealloc(fd);
+ }
+ continue;
+ }
+ /* continue with parsing and responding */
+}
+```
+
+We can see here that the decision is taken to 'clean up' (= `flow_dealloc`) those flows that are timing out.
+But, as we can see, it's an application decision!
+We might just as well decide to keep a flow open for another 10 minutes, e.g. to see if the client (or the network in between) recovers from an interruption.
+
+We might for example use this mechanism to show the user that the peer seems to be down[^overleaf] and even take measures (like saving or removing state), but also allow the user to just wait until the peer is live again.
+
+### Conclusion
+
+As an application, you have total freedom (and responsibility) over your flows.
+Ouroboros will only inform you that your flow is timing out (and your peer thus appears to be down),
+but it's up to you to decide if you deallocate your side of the flow and when.
+
+Excited for my first blog post & always learning,
+
+Thijs
+
+
+[^overleaf]: I'm thinking about things like the Overleaf banner: `Lost Connection. Reconnecting in 2 secs. Try Now`
diff --git a/content/en/blog/20221207-loc-id-mobility-1.png b/content/en/blog/20221207-loc-id-mobility-1.png
new file mode 100644
index 0000000..87bb04a
--- /dev/null
+++ b/content/en/blog/20221207-loc-id-mobility-1.png
Binary files differ
diff --git a/content/en/blog/20221207-loc-id-mobility-2.png b/content/en/blog/20221207-loc-id-mobility-2.png
new file mode 100644
index 0000000..4fedee9
--- /dev/null
+++ b/content/en/blog/20221207-loc-id-mobility-2.png
Binary files differ
diff --git a/content/en/blog/20221207-loc-id-split.md b/content/en/blog/20221207-loc-id-split.md
new file mode 100644
index 0000000..bad82ac
--- /dev/null
+++ b/content/en/blog/20221207-loc-id-split.md
@@ -0,0 +1,216 @@
+---
+date: 2022-12-07
+title: "Loc/Id split and the Ouroboros network model"
+linkTitle: "On Loc/Id split"
+author: Dimitri Staessens
+---
+
+A few weeks back I had a drink with Thijs, who is now doing a master's
+thesis on Loc/Id split, so we dug into the concepts behind Locators
+and Identifiers to see if the idea matches or in any way interferes
+with the Ouroboros network model.
+
+For this, we started from the paper _Locator/Identifier Split
+Networking: A Promising Future Internet Architecture_[^1].
+
+# Loc/Id split?
+
+In a nutshell, Loc/Id split starts from the observation that the
+transport layer (TCP, UDP) is tightly coupled to network (IP)
+addresses via a certain TCP/UDP port.
+
+Assuming our IPv4 local address is 10.10.0.1 /24 and there is an SSH
+server on 10.10.5.253 /24 listening on port 22, after making a
+connection, our client application could be bound to 10.10.0.1 /24 on
+port 25406. If we move our laptop to another room that is on an access
+point in a different subnet, and we receive IP address 10.10.4.7 /24,
+our TCP connection to the SSH server will break.
+
+Loc/Id split suggests splitting the "address" into two parts: an
+Identifier that is location-independent and specifies the _who_ at the
+transport layer, and a Locator that is location-dependent and
+specifies the _where_ at the network layer. Since an IPv6 address has
+more than enough (128) bits, there's plenty of space to chop it up and
+attach some semantics to the individual pieces.
+
+Of course, after the split, identifiers need to be mapped to locators,
+so there is a mapping system needed to resolve the locator given the
+identifier. This mapping system resides in a Sub-Layer between the
+transport layer and the network layer. If this mapping system sounds a
+lot like DNS to you, then you're right, but then remember that TCP
+doesn't bind to a DNS name + port, but to an IP address + port. That's
+where the issue lies that the Identifier tries to solve.
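+
+As a toy sketch of both ideas -- the split and the mapping -- assume
+a 64/64 bit split and a flat lookup table (both are illustrative
+choices, not what any particular proposal prescribes):
+
+```C
+#include <stdint.h>
+#include <stddef.h>
+
+struct locid {
+        uint64_t loc; /* location-dependent: the _where_ */
+        uint64_t id;  /* location-independent: the _who_ */
+};
+
+/* Toy mapping system: resolve the locator given the identifier. */
+uint64_t resolve_loc(const struct locid * map, size_t n, uint64_t id)
+{
+        size_t i;
+
+        for (i = 0; i < n; ++i)
+                if (map[i].id == id)
+                        return map[i].loc;
+
+        return 0; /* not found */
+}
+```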
+
+Resolving the Locator from the Identifier usually happens in the
+end-host, but some Loc/Id split proposals may forward this
+responsibility to other nodes in the network. When only end-hosts
+perform Id->Loc resolution, it's called a host-based Loc/Id split
+architecture; if some other nodes perform Id->Loc resolution, it's
+called a network-based architecture. In a network-based architecture,
+the identifier MUST be part of the packet header (in a host-based
+architecture it's optional): the network nodes forward towards a
+resolver node based on the identifier, and then, once the locator is
+known, based on the locator towards the end-host. I have my doubts
+that this can ever scale, so in this article I'll focus on host-based
+Loc/Id split. Host-based architectures are summarized in the figure
+below, taken from the survey paper[^1].
+
+{{<figure width="60%" src="/blog/20221207-loc-id.png">}}
+
+My first reaction to seeing that was _sounds about right to me_; it's
+almost identical to what O7s proposes for a fully scalable and
+evolvable architecture. But before I get to that, let's first dig a
+bit deeper into those locators and identifiers. What _are_ these
+beasts?
+
+# Mobility in Loc/Id split
+
+{{<figure width="40%" src="/blog/20221207-loc-id-mobility-1.png">}}
+
+Let's assume the previous example where, from my laptop, I'm connected
+to some SSH server, but this time we're in a Loc/Id split network. So
+my laptop got a different address for its interface, an identifier,
+say C0FF33D00D, and, since I'm in the green network, a locator that is
+conveniently the IPv4 address for my wireless LAN interface,
+10.10.0.1 /24. The TCP connection in the SSH client is Loc/Id aware,
+and now bound to C0FF33D00D:25406. After I connect to the server at
+008BADF00D, it learns that I'm C0FF33D00D and that my locator is 10.10.0.1.
+
+When I move to another floor, the laptop WLAN interface gets a new
+locator, but my identifier stays the same. It's now
+C0FF33D00D:10.10.4.7. The OS is implementing a host-based Loc/Id split
+architecture, so I quickly send a _loc/id update_ message to the
+server at 10.10.5.253 that my locator for C0FF33D00D has changed to
+10.10.4.7, and it updates its mapping. The Loc/Id-aware TCP state
+machine in my laptop had some packet loss to deal with while I was in
+the elevator, but other than that, since it was bound to my identifier
+the connection remains intact.
+
+Nice! Splitting an address into a locator and identifier has a pretty
+elegant solution to mobility.
+
+Notice I didn't give the routers identifier parts in their
+addresses? That's on purpose.
+
+Let's take a little thought experiment.
+
+Instead of moving to the other floor, suppose I already have a laptop
+sitting there. Its WLAN interface has address C0FFEEBABE:10.10.4.7.
+
+{{<figure width="40%" src="/blog/20221207-loc-id-mobility-2.png">}}
+
+Now, what I do in this thought experiment is copy the entire _program
+state_ of my SSH client to that other laptop, _including_ the TCP
+state[^2] and fork it as a new process on the other laptop. What is
+needed to make it work from a network perspective?
+
+Well, like when actually moving with my laptop, I need to update the
+server that my identifier C0FF33D00D has moved to another locator at
+10.10.4.7. That should do the trick, quite easy.
+
+Unless there was already another application connected on port 25406
+on that destination laptop. Then there is no way for the receiving
+laptop to know where to deliver the packets. Unless the identifier
+is in the packet header. But host-based Loc/Id split made it
+optional? This seems to hint that host-based Loc/Id split supports
+device mobility but cannot fully support application mobility[^3].
+
+So, what is that identifier actually naming? Well, all that moved was
+the application state, and the identifier seemed to move with
+it... And since the routers in the example don't run "end-host"
+applications, they don't need identifiers.
+
+# What does the Ouroboros model say?
+
+Ouroboros[^4] gives each application process a name, which is mapped
+to an IPCP's address[^5]. The O7s application name basically
+corresponds to the _identifier_, and the IPCP's address maps to the
+_locator_.
+
+{{<figure width="30%" src="/blog/20220228-flm-app.png">}}
+
+Let's compare the architecture of Ouroboros above with the figure at
+the top.
+
+First, the similarities. The Ouroboros model conjectures a split of
+the transport layer into an _application end-to-end layer_ (roughly
+TCP without congestion avoidance) and a network end-to-end layer that
+includes the _flow allocator_.
+
+The _flow allocator_ in O7s performs the name <--> address mapping
+that is similar to the id <--> loc mapping. Interesting to note is
+that in O7s, the flow allocator is present in every IPCP, which is
+needed for congestion notifications. Given that identifiers map to
+application names, resolving name <--> address in nodes other than
+the source, as in network-based Loc/Id split, does not violate the
+O7s architecture. But we haven't considered this, as it doesn't look
+feasible from a scalability perspective.
+
+Now, the differences. First, the naming. The "identifier" in Ouroboros
+is a network/globally unique application name[^6]. Processes[^7] can
+be _bound_ to an application name. If a single process binds to an
+application name, it's unicast; if multiple processes on the same
+server bind to the same name, it provides per-connection
+load-balancing between these processes. If multiple processes on
+different servers bind to the same name, it provides a form of anycast
+name-based load-balancing.
+
+Second, Ouroboros endpoint identifiers (EIDs) are only known to the
+Flow Allocator at the endpoint and specify the application. The O7s
+EID can be viewed as a combination of the L3 _protocol_ field and the
+L4 _port_ field into a single field that sits in between L3 and L4
+(the Loc/Id proposed sublayer). This allows O7s to allocate a new flow
+(assigning new EIDs) while keeping the connection state in the process
+(FRCP) intact, and thus allowing full application mobility in addition
+to device mobility. Taking another look at the Loc/Id split figure,
+note that Ouroboros splits "network" from "application" just above the
+"Sub-layer", instead of above the "transport layer".
+
+# Wrapping up
+
+The discussions on Loc/Id split were quite interesting. A lot of the
+steps and solutions it proposes are in line with the O7s model. What
+strikes me most is that Loc/Id split is still not very well-defined as
+a _model_. What exactly _are_ identifiers? What exactly _are_
+locators? The thing that sets O7s apart is that the model consists of
+a limited number of objects (forwarding elements and flooding
+elements, which form Layers[^8], applications, processes, ...) that
+have well-defined names[^9] that are immutable and exist only for as
+long as the object exists.
+
+
+[^1]: https://doi.org/10.1109/COMST.2017.2728478
+
+[^2]: This is hard to do with TCP state being in the kernel, but let's
+ forget about that, and about memory addresses and other stuff, for
+ a moment, and assume the complete application state is a nice
+ containerized package.
+
+[^3]: The Ouroboros model does allow complete application
+ mobility. The problem in this Loc/Id proposal is that the port
+ is still part of the Transport Layer state (see the figure at
+ the start of the post).
+
+[^4]: This, and a lot of other things in O7s, were proposed in the
+ RINA architecture, that's where the credit should go.
+
+[^5]: To be accurate: we hash the application name.
+
+[^6]: At least, for a public Internetwork, they should be globally
+ unique.
+
+[^7]: In O7s, processes are named with a process name (which in the
+ implementation maps to the Linux process id (pid)). Process names
+ have only local (system) scope.
+
+[^8]: I capitalize Layers, as these Layers that are made up of
+ forwarding elements (unicast Layers) or flooding elements
+ (broadcast Layers) have a different meaning than the layers in
+ the discussion above. Maybe we should call them _strata_ instead
+ of Layers...
+
+[^9]: Synonyms are allowed, but they serve no function in the
+ architecture. As an example, application names are hashed (a
+ synonym) which has practical implications for security and
+ implementation simplicity, but the architecture is theoretically
+ identical without that hash. \ No newline at end of file
diff --git a/content/en/blog/20221207-loc-id.png b/content/en/blog/20221207-loc-id.png
new file mode 100644
index 0000000..51a046d
--- /dev/null
+++ b/content/en/blog/20221207-loc-id.png
Binary files differ
diff --git a/content/en/blog/20241110-auth.md b/content/en/blog/20241110-auth.md
new file mode 100644
index 0000000..5dd427d
--- /dev/null
+++ b/content/en/blog/20241110-auth.md
@@ -0,0 +1,140 @@
+---
+date: 2024-11-10
+title: "The other end of the authentication rabbit hole"
+linkTitle: "Authentication"
+author: Dimitri Staessens
+---
+
+```
+I know the pieces fit
+'Cause I watched them tumble down
+ -- Tool, Schism (2001)
+```
+
+Mariah Carey signals it's almost Christmas, so it seems like as good a
+time as any for a blog post. It's been a long time since my last one.
+
+My plan for this year for Ouroboros was to clean up the flow allocator
+code, then add authentication and clean up and fix packet loss
+handling (EFCP), fix congestion avoidance, and then leave the big
+missing part - the naming system - for later.
+
+As they say, plans never survive contact with the enemy. At least I
+got the clean-up of the flow allocator code done.
+
+So how did the authentication implementation go, you ask? Or maybe
+not, but I'm going to tell you anyway. Despite not having an
+implementation, the journey has been quite interesting.
+
+The idea was quite simple, actually. I've been told ad nauseam
+not to "reinvent the wheel", so my approach to authentication was
+going to be as boring as one could expect: establish a flow between
+the client and server, and then do some well-known public key
+cryptography magic to authenticate. It's how SSL does it, nothing
+wrong with that, is there?
+
+After an initial check, I validated that I could easily use OpenSSL
+and X509 certificates. This authentication implementation was going to
+be a walk in the park.
+
+But here I am, at least 5 months later, and I don't have this
+implemented.
+
+There was one thing I absolutely wanted to avoid: having to configure
+the certificates as part of the application. The library could of
+course add some configuration/command line options to each linked
+application (e.g. --certificate <cert.pem>), but that is also not to
+my liking. I know this is roughly how it is with OpenSSL certificates
+today, but it's a drag, and the O7s code was very emphatically hinting
+that it could be done without.
+
+But how? One way was to map programs to certificates in the IRM
+configuration, and when an application starts, the IRMd would pass it
+the certificate. As an example, the configuration has an entry
+/usr/bin/apache -> /etc/ouroboros/apache.crt, and when /usr/bin/apache
+starts, it would get that certificate loaded. That would kind of have
+worked, but it left another problem: what if I wanted the same
+application to start twice, but with different certificates? In other
+words, I'd need to map the certificate to a process, not to a program.
+That kept my mind busy for a while, as there seemingly was no obvious
+solution to this. I say seemingly, because the solution is as obvious
+as it is simple. But it's far from the current state of affairs.
+
+So, how could I get a process to load a certificate at runtime, but
+configure it in a central location, before the process is even
+running? That seems like a catch-22. First mapping the program,
+instantiating it, then configuring a new mapping and starting another
+process would be an error-prone mess riddled with race conditions.
+
+The answer was simple: don't map the process, map the service
+name. O7s already has the primitives to map programs and processes to
+service names. And this method is actually not that far off the
+current Internet approach: X509 certificates are usually tied to
+Domain Names. One part of the puzzle solved using this indirection,
+and I can still use standard X509 certificates, nice.
+
+So, why don't I have this yet? Glad you asked.
+
+Mapping certificates to names, and as such indirectly tying them to
+processes, allows for a significantly different approach to
+authentication. Why would we trust the peer to send us its
+certificate? We can get it from the name system, before establishing
+any communications with the peer. Translated to the current Internet:
+DNS could return the public key together with the IP address. So,
+instead of first allocating a flow, and then authenticating and
+deriving symmetric keys for end-to-end in-flight encryption - as is
+done with TLS - we can do something a lot more secure: get the public
+key of the peer from the naming system and use it to encrypt the
+initial (flow allocation) handshake.
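+
+To make that concrete, here is a minimal sketch of encrypting the
+initial handshake, assuming the naming system already handed us the
+peer's public key as a PEM file (this is plain OpenSSL, not existing
+O7s code):
+
+```C
+#include <stdio.h>
+#include <openssl/evp.h>
+#include <openssl/pem.h>
+
+/* Encrypt a flow allocation handshake with the peer's public key,
+ * retrieved from the naming system before any packet is sent.    */
+int encrypt_handshake(const char * pem_path,
+                      const unsigned char * msg, size_t msg_len,
+                      unsigned char * out, size_t * out_len)
+{
+        FILE *         f;
+        EVP_PKEY *     key;
+        EVP_PKEY_CTX * ctx;
+        int            ret = -1;
+
+        f = fopen(pem_path, "r");
+        if (f == NULL)
+                return -1;
+
+        key = PEM_read_PUBKEY(f, NULL, NULL, NULL);
+        fclose(f);
+        if (key == NULL)
+                return -1;
+
+        ctx = EVP_PKEY_CTX_new(key, NULL);
+        if (ctx != NULL &&
+            EVP_PKEY_encrypt_init(ctx) > 0 &&
+            EVP_PKEY_encrypt(ctx, out, out_len, msg, msg_len) > 0)
+                ret = 0;
+
+        EVP_PKEY_CTX_free(ctx);
+        EVP_PKEY_free(key);
+
+        return ret;
+}
+```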
+
+But this has another impact on my implementation plan: I need the
+naming system, as the public certificate needs to be retrieved from it
+for this to work.
+
+The naming system is a component that has not been written yet. It is
+a distributed application/database and it needs IPC with the IRMd.
+
+And so this has opened another debate in my head: should I start over
+with the implementation?
+
+O7s started in 2016 with the intention to be a RINA implementation,
+but as we went on we changed it quite a bit. Some core parts are a
+thorn in my side, most notably the synchronous RPC implementation. I
+don't want to build the new component (the name system) using that
+approach, and ripping it out of the current implementation is going to
+be messy.
+
+Starting over would allow using a more "modern" language than C. I've
+been looking at Rust, but from my initial survey, Rust doesn't seem
+like a good fit due to the lack of shared memory/robust mutex
+support. Golang inherits mutexes from C pthreads, but I'm not that
+fond of switching to Golang over C (and it might not work that
+transparently on other OSes like BSD or OS X if I rely on pthreads
+and shared memory). If C is the only viable language for this thing,
+ripping the guts out of the current implementation might be the best
+option after all.
+
+I've not reached a conclusion yet.
+
+Anyway, as always: Stay Curious. And have a nice end of year.
+
+Dimitri
+
+
diff --git a/content/en/blog/_index.md b/content/en/blog/_index.md
index 43820eb..34904e5 100644
--- a/content/en/blog/_index.md
+++ b/content/en/blog/_index.md
@@ -1,13 +1,7 @@
---
-title: "Docsy Blog"
+title: "Developer Blog"
linkTitle: "Blog"
menu:
main:
weight: 30
---
-
-
-This is the **blog** section. It has two categories: News and Releases.
-
-Files in these directories will be listed in reverse chronological order.
-
diff --git a/content/en/blog/news/20201024-why-better.md b/content/en/blog/news/20201024-why-better.md
deleted file mode 100644
index 5f9c173..0000000
--- a/content/en/blog/news/20201024-why-better.md
+++ /dev/null
@@ -1,119 +0,0 @@
----
-date: 2020-10-24
-title: "Why is this better than IP?"
-linkTitle: "Why is this better than IP?"
-description: "The problem with the Internet isn't that it's
-wrong, it's that it's almost right. -- John Day"
-author: Dimitri Staessens
----
-
-With the COVID-19 pandemic raging on at its current peak all over the
-world, it has been a weird couple of months for most of us. In the
-last few weeks I did a first implementation of the missing part of the
-FRCP protocol (retransmission and flow control, fragmentation is still
-needed), and I hope to get congestion control done by the end of the
-year, which is very needed because with the retransmission the
-prototype experiences congestion collapse on a bottleneck caused by
-the bandwidth of the shared memory allocator we implemented. Then I
-can finally get rid of the many annoying bugs, stabilize the
-prototype, and then implement a better allocator. But this is not the
-main thing I wanted to adress.
-
-I've also been pondering the question I always get from network
-engineers. It's the one that annoys me the most, because it seems that
-first this question needs to be answered for the questioner to even
-consider taking any attempt or spending any effort in trying to
-understand the body of work presented. This question is, of course,
-*Why is this better than IP?*
-
-Now,there are two things we did when building Ouroboros. The most
-visible is the prototype implementation, but behind it is a network
-model. This network model is not some arbitrary engineering excercise,
-we tried to find the minimum *required and sufficient conditions* for
-networking, mathematically. In this sense -- under the precondition
-that the model is correct[^2] -- every working network has to fulfill
-the model, like it or not. It's not a choice one can take or
-leave. I'll try to find more time (and to be perfectly honest: the
-motivation) in the next couple of weeks to get this aspect across with
-more clarity and precision than with which it is currently presented
-in the [paper](https://arxiv.org/pdf/2001.09707.pdf).
-
-Now, I would argue that the current Internet with all its technologies
-(TCP/IP, VPNs, MPLS, QUIC, ...) is closer to the model we derived than
-it is to the 7-layer-model. NASA's
-[DTNs](https://www.nasa.gov/content/dtn) don't violate the
-model[^1]. RINA is of course very close to the model, as it also
-predicts the recursive nature of networks. The prototype is an
-implementation of the model (and thus in some way, close to a minimal
-network).
-
-While the model predicts recursive layering, it doesn't forbid
-engineers to use different APIs for each layer, mix and combine layers
-into logical constructs, expose layer internals, rewrite packet
-headers, and do whatever they please. It may have economical benefits
-for an engineer to put a solution in whatever logical part of the
-network, hack in an API to expose what the solution needs and then
-sell it. But this engineering comes at the cost of adding complexity
-without functionality.
-
-I'll use an analogy from programming: a simple C for loop uses an
-index variable, typically named "i". It is generally seen as bad
-practice to modify this index variable inside the loop, something like
-this.
-
-```C
-for (int i = 1; i <= 10; ++i) {
- if(i == 2)
- i = 4;
- /* Do some work */
-}
-```
-
-But is such code "worse" than another solution to skip 2 and 3, such
-as this?
-
-```C
-for (int i = 1; i <= 10; ++i) {
- if(i == 2 || i == 3)
- continue;
- /* Do some work */
-}
-```
-
-The reason it is considered bad practice, is based on years of
-programming experience: thousands of programmers noticing that such
-constructs often lead to bugs and lower maintainability[^3]. I would
-argue that the Internet is full of this kind of bad practices -- it
-doesn't take a huge stretch of the imagination to take the example
-above as an analogy for Network Address Translation (NAT). Of course,
-NAT is probably the only way to stretch the IPv4 address space beyond
-its natural limits. But in an ideal networking world (which is our
-objective), one shouldn't break end-to-end and consider addresses
-immutable.
-
-And that's why I think Ouroboros is a "good" network model. If the
-minimal necessary and sufficient conditions for networking are
-understood as a consistent model for networks, we can better assess
-where engineering decisions have been made that add complexity without
-objectively adding functionality to the whole. Unfortunately, there
-aren't many scientists that design networking solutions from the
-ground up, so there isn't much experience to go around.
-
-So then, is Ouroboros better than IP? From an engineering perspective,
-maybe not, or not by much. But I'm fine with that, it's not my objective.
-
-Keep safe social distancing, take care and stay curious.
-
-Dimitri
-
-P.S. Of course, I was curious about what the GCC
-[compiler](https://godbolt.org) does with the source code examples
-above. It does seems like the "better" solution also optimizing a bit
-better:
-
-<iframe width="80%" height="800px" src="https://godbolt.org/e#g:!((g:!((g:!((h:codeEditor,i:(fontScale:14,j:1,lang:___c,selection:(endColumn:2,endLineNumber:18,positionColumn:2,positionLineNumber:18,selectionStartColumn:2,selectionStartLineNumber:18,startColumn:2,startLineNumber:18),source:'//+Type+your+code+here,+or+load+an+example.%0A%23include+%3Cstdio.h%3E%0A%0Avoid+func()+%7B%0A++++for+(int+i+%3D+1%3B+i+%3C%3D+10%3B+%2B%2Bi)+%7B%0A++++++++if(i+%3D%3D+2)%0A+++++++++++i+%3D+4%3B%0A++++++++printf(%22%25d%5Cn.%22,+i)%3B%0A++++%7D%0A%7D%0A%0Avoid+func2()+%7B%0A++++for+(int+i+%3D+1%3B+i+%3C%3D+10%3B+%2B%2Bi)+%7B%0A++++++++if(i+%3D%3D+2+%7C%7C+i+%3D%3D+3)%0A++++++++++++continue%3B%0A++++++++printf(%22%25d%5Cn.%22,+i)%3B%0A++++%7D%0A%7D'),l:'5',n:'0',o:'C+source+%231',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0'),(g:!((h:compiler,i:(compiler:cg102,filters:(b:'0',binary:'1',commentOnly:'0',demangle:'0',directives:'0',execute:'1',intel:'0',libraryCode:'1',trim:'1'),fontScale:14,j:1,lang:___c,libs:!(),options:'-O4',selection:(endColumn:25,endLineNumber:11,positionColumn:25,positionLineNumber:11,selectionStartColumn:25,selectionStartLineNumber:11,startColumn:25,startLineNumber:11),source:1),l:'5',n:'0',o:'x86-64+gcc+10.2+(Editor+%231,+Compiler+%231)+C',t:'0')),k:50,l:'4',n:'0',o:'',s:0,t:'0')),l:'2',n:'0',o:'',t:'0')),version:4"></iframe>
-
-
-[^1]: I came across this when thinking about the limits for the timers for retransmission. A DTN is basically two layers, one without retransmission on top of one with retransmission at very long timeouts.
-[^2]: Of all the comments from peer review, not a single one has addressed any technical issues -- let alone correctness -- of the model. Most are that I fail to justify why the reviewer should bother to read the article or make an effort to try to understand it as it doesn't fit current engineering workflows and thus has little chance of short-term deployment. Sorry, but I don't care.
-[^3]: With modern programming languages leaning more and more towards discouraging mutables alltogether.
diff --git a/content/en/blog/news/_index.md b/content/en/blog/news/_index.md
deleted file mode 100644
index c10cfa2..0000000
--- a/content/en/blog/news/_index.md
+++ /dev/null
@@ -1,5 +0,0 @@
----
-title: "News About Docsy"
-linkTitle: "News"
-weight: 20
----
diff --git a/content/en/blog/releases/_index.md b/content/en/blog/releases/_index.md
deleted file mode 100644
index b1d9eb4..0000000
--- a/content/en/blog/releases/_index.md
+++ /dev/null
@@ -1,8 +0,0 @@
-
----
-title: "New Releases"
-linkTitle: "Releases"
-weight: 20
----
-
-
diff --git a/content/en/blog/releases/upcoming.md b/content/en/blog/releases/upcoming.md
deleted file mode 100644
index f984380..0000000
--- a/content/en/blog/releases/upcoming.md
+++ /dev/null
@@ -1,7 +0,0 @@
----
-date: 2019-10-06
-title: "Plans for 0.16"
-linkTitle: "Ouroboros 0.16"
-description: "Ouroboros 0.16"
-author: Dimitri Staessens
----
diff --git a/content/en/docs/Concepts/broadcast_layer.png b/content/en/docs/Concepts/broadcast_layer.png
new file mode 100644
index 0000000..01079c0
--- /dev/null
+++ b/content/en/docs/Concepts/broadcast_layer.png
Binary files differ
diff --git a/content/en/docs/Concepts/dependencies.jpg b/content/en/docs/Concepts/dependencies.jpg
deleted file mode 100644
index eaa9e79..0000000
--- a/content/en/docs/Concepts/dependencies.jpg
+++ /dev/null
Binary files differ
diff --git a/content/en/docs/Concepts/elements.md b/content/en/docs/Concepts/elements.md
deleted file mode 100644
index a803065..0000000
--- a/content/en/docs/Concepts/elements.md
+++ /dev/null
@@ -1,90 +0,0 @@
----
-title: "Elements of a recursive network"
-author: "Dimitri Staessens"
-date: 2019-07-11
-weight: 2
-description: >
- The building blocks for recursive networks.
----
-
-This section describes the high-level concepts and building blocks are
-used to construct a decentralized [recursive network](/docs/what):
-layers and flows. (Ouroboros has two different kinds of layers, but
-we will dig into all the fine details in later posts).
-
-A __layer__ in a recursive network embodies all of the functionalities
-that are currently in layers 3 and 4 of the OSI model (along with some
-other functions). The difference is subtle and takes a while to get
-used to (not unlike the differences in the term *variable* in
-imperative versus functional programming languages). A recursive
-network layer handles requests for communication to some remote
-process and, as a result, it either provides a handle to a
-communication channel -- a __flow__ endpoint --, or it raises some
-error that no such flow could be provided.
-
-A layer in Ouroboros is built up from a bunch of (identical) programs
-that work together, called Inter-Process Communication (IPC) Processes
-(__IPCPs__). The name "IPCP" was first coined for a component of the
-[LINCS]
-(https://www.osti.gov/biblio/5542785-delta-protocol-specification-working-draft)
-hierarchical network architecture built at Lawrence Livermore National
-Laboratories and was taken over in the RINA architecture. These IPCPs
-implement the core functionalities (such as routing, a dictionary) and
-can be seen as small virtual routers for the recursive network.
-
-{{<figure width="60%" src="/docs/concepts/rec_netw.jpg">}}
-
-In the illustration, a small 5-node recursive network is shown. It
-consists of two hosts that connect via edge routers to a small core.
-There are 6 layers in this network, labelled __A__ to __F__.
-
-On the right-hand end-host, a server program __Y__ is running (think a
-mail server program), and the (mail) client __X__ establishes a flow
-to __Y__ over layer __F__ (only the endpoints are drawn to avoid
-cluttering the image).
-
-Now, how does the layer __F__ get the messages from __X__ to __Y__?
-There are 4 IPCPs (__F1__ to __F4__) in layer __F__, that work
-together to provide the flow between the applications __X__ and
-__Y__. And how does __F3__ get the info to __F4__? That is where the
-recursion comes in. A layer at some level (its __rank__), will use
-flows from another layer at a lower level. The rank of a layer is a
-local value. In the hosts, layer __F__ is at rank 1, just above layer
-__C__ or layer __E_. In the edge router, layer __F__ is at rank 2,
-because there is also layer __D__ in that router. So the flow between
-__X__ and __Y__ is supported by flows in layer __C__, __D__ and __E__,
-and the flows in layer __D__ are supported by flows in layers __A__
-and __B__.
-
-Of course these dependencies can't go on forever. At the lowest level,
-layers __A__, __B__, __C__ and __E__ don't depend on a lower layer
-anymore, and are sometimes called 0-layers. They only implement the
-functions to provide flows, but internally, they are specifically
-tailored to a transmission technology or a legacy network
-technology. Ouroboros supports such layers over (local) shared memory,
-over the User Datagram Protocol, over Ethernet and a prototype that
-supports flows over an Ethernet FPGA device. This allows Ouroboros to
-integrate with existing networks at OSI layers 4, 2 and 1.
-
-If we then complete the picture above, when __X__ sends a packet to
-__Y__, it passes it to __F3__, which uses a flow to __F1__ that is
-implemented as a direct flow between __C2__ and __C1__. __F1__ then
-forwards the packet to __F2__ over a flow that is supported by layer
-__D__. This flow is implemented by two flows, one from __D2__ to
-__D1__, which is supported by layer A, and one from __D1__ to __D3__,
-which is supported by layer __B__. __F2__ will forward the packet to
-__F4__, using a flow provided by layer __E__, and __F4__ then delivers
-the packet to __Y__. So the packet moves along the following chain of
-IPCPs: __F3__ --> __C2__ --> __C1__ --> __F1__ --> __D2__ --> __A1__
---> __A2__ --> __D1__ --> __B1__ --> __B2__ --> __D3__ --> __F2__ -->
-__E1__ --> __E2__ --> __F4__.
-
-{{<figure width="40%" src="/docs/concepts/dependencies.jpg">}}
-
-A recursive network has __dependencies__ between layers in the
-network, and between IPCPs in a __system__. These dependencies can be
-represented as a directed acyclic graph (DAG). To avoid problems,
-these dependencies should never contain cycles (so a layer I should
-not directly or indirectly depend on itself). The rank of a layer is
-defined (either locally or globally) as the maximum depth of this
-layer in the DAG.
diff --git a/content/en/docs/Concepts/fa.md b/content/en/docs/Concepts/fa.md
index d91cc00..b03e3f7 100644
--- a/content/en/docs/Concepts/fa.md
+++ b/content/en/docs/Concepts/fa.md
@@ -30,7 +30,7 @@ system has an Ouroboros IRMd and a unicast IPCP. These IPCPs work
together to create a logical "layer". System 1 runs a "client"
program, System 2 runs a "server" program.
-We are going to explain in some detail the steps that Ourobros takes
+We are going to explain in some detail the steps that Ouroboros takes
to establish a flow between the "client" and "server" program so they
can communicate.
diff --git a/content/en/docs/Concepts/layers.jpg b/content/en/docs/Concepts/layers.jpg
deleted file mode 100644
index 5d3020c..0000000
--- a/content/en/docs/Concepts/layers.jpg
+++ /dev/null
Binary files differ
diff --git a/content/en/docs/Concepts/model_elements.png b/content/en/docs/Concepts/model_elements.png
new file mode 100644
index 0000000..bffbca8
--- /dev/null
+++ b/content/en/docs/Concepts/model_elements.png
Binary files differ
diff --git a/content/en/docs/Concepts/ouroboros-model.md b/content/en/docs/Concepts/ouroboros-model.md
new file mode 100644
index 0000000..7daa95b
--- /dev/null
+++ b/content/en/docs/Concepts/ouroboros-model.md
@@ -0,0 +1,635 @@
+---
+title: "The Ouroboros model"
+author: "Dimitri Staessens"
+date: 2020-06-12
+weight: 2
+description: >
+ A conceptual approach to packet networking fundamentals
+---
+
+```
+Computer science is as much about computers as astronomy is
+about telescopes.
+ -- Edsger Wybe Dijkstra
+```
+
+The model for computer networks underlying the Ouroboros prototype is
+the result of a long process of gradual increases in my understanding
+of the core principles that underlie computer networks, starting from
+my work on traffic engineering packet-over-optical networks using
+Generalized Multi-Protocol Label Switching (G/MPLS) and the Path
+Computation Element (PCE), then Software Defined Networks (SDN), the
+work with Sander investigating the Recursive InterNetwork Architecture
+(RINA) and finally our implementation of what would become the
+Ouroboros Prototype. The way it is presented here is not a reflection
+of this long process, but a crystallization of my current
+understanding of the Ouroboros model.
+
+I'll start with the very basics, assuming no delay on links and
+infinite capacity, and then gradually add delay, link capacity,
+failures, etc. to assess their impact and derive _what_ needs to be
+added _where_ in order to come to the complete Ouroboros model.
+
+The main objective of the definitions -- and the Ouroboros model as a
+whole -- is to __separate mechanism__ (the _what_) __from policy__
+(the _how_) so that we have objective definitions and a _consistent_
+framework for _reasoning_ about functions and protocols in computer
+networks.
+
+### The importance of first principles
+
+One word of caution, because this model might read like I'm
+"reinventing the wheel" and we already know _how_ to do everything that
+is written here. Of course we do! The point is that the model
+[reduces](https://en.wikipedia.org/wiki/Reductionism)
+networking to its _essence_, to its fundamental parts.
+
+After studying most courses on Computer Networks, I could name the 7
+layers of the OSI model, knew how to draw TCP 3-way handshakes,
+could detail 5 different TCP congestion control mechanisms, calculate
+optimal IP subnets given a set of underlying Local Area Networks, draw
+UDP headers, chain firewall rules in iptables, calculate CRC
+checksums, and derive spanning trees given MAC addresses of Ethernet
+bridges. But after all that, I still feel such courses teach about as
+much about computer networks as cookbooks teach about chemistry. I
+wanted to go beyond technology and the rote knowledge of _how things
+work_ to establish a thorough understanding of _why they work_.
+During most of my PhD work at the engineering department, I spent my
+research time on modeling telecommunications networks and computer
+networks as _graphs_. The nodes represented some switch or router --
+either physical or virtual --, the links represented a cable or wire
+-- again either physical or virtual -- and then the behaviour of
+various technologies were simulated on those graphs to develop
+algorithms that analyze some behaviour or optimize some or other _key
+performance indicator_ (KPI). This line of reasoning, starting from
+_networked devices_ is how a lot of research on computer networks is
+conducted. But what happens if we turn this upside down, and develop a
+_universal_ model for computer networks starting from _first
+principles_?
+
+This sums up my problem with computer networks today: not everything
+in their workings can be fully derived from first principles. It also
+sums up why I was attracted to RINA: it was the first time I saw a
+network architecture as the result of a solid attempt to derive
+everything from first principles. And it’s also why Ouroboros is not
+RINA: RINA still contains things that can’t be derived from first
+principles.
+
+### Two types of layers
+
+The Ouroboros model postulates that there are only 2 scalable methods
+of distributing packets in a network layer: _FORWARDING_ packets based
+on some label, or _FLOODING_ packets on all links but the incoming
+link.
+
+We call an element that forwards packets a __forwarding element__,
+implementing a _packet forwarding function_ (PFF). The PFF has as
+input a destination name for another forwarding element (represented
+as a _vertex_), and as output a set of output links (represented
+as _arcs_) on which the incoming packet with that label is to be
+forwarded. The destination name needs to be in a packet header.
+
+We call an element that floods packets a __flooding element__, and it
+implements a packet flooding function. The flooding element is
+completely stateless: it has as input the incoming arc, and as output
+all non-incoming arcs. Packets on a broadcast layer do not need a
+header at all.
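+
+A minimal sketch of the flooding function (the node structure and
+`link_send` are hypothetical, just to pin the mechanism down):
+
+```C
+#include <stddef.h>
+
+struct node {
+        int   n_links;
+        int * links;
+};
+
+void link_send(int link, const void * pkt, size_t len); /* hypothetical */
+
+/* Flood: send on every adjacent link except the incoming one. */
+void flood(const struct node * n, int in_link,
+           const void * pkt, size_t len)
+{
+        int i;
+
+        for (i = 0; i < n->n_links; ++i)
+                if (n->links[i] != in_link)
+                        link_send(n->links[i], pkt, len);
+}
+```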
+
+Forwarding elements are _equal_, and need to be named, flooding
+elements are _identical_ and do not need to be named[^1].
+
+{{<figure width="40%" src="/docs/concepts/model_elements.png">}}
+
+Peering relationships are only allowed between forwarding elements, or
+between flooding elements, but never between a forwarding element and
+a flooding element. We call a connected graph consisting of nodes that
+hold forwarding elements a __unicast layer__, and similarly we call a
+connected _tree_[^2] consisting of nodes that house a flooding element
+a __broadcast layer__.
+
+The objective for the Ouroboros model is to hold for _all_ packet
+networks; our __conjecture__ is that __all scalable packet-switched
+network technologies can be decomposed into finite sets of unicast and
+broadcast layers__. Implementations of unicast and broadcast layers
+can be easily found in TCP/IP, Recursive InterNetworking Architecture
+(RINA), Delay Tolerant Networks (DTN), Ethernet, VLANs, Loc/Id split
+(LISP),... [^3]. The Ouroboros _model_ by itself is not
+recursive. What is known as _recursive networking_ is a choice to use
+a single standard API for interacting with all the implementations
+of unicast layers and a single standard API for interacting with all
+implementations of broadcast layers[^4].
+
+### The unicast layer
+
+A unicast layer is a collection of interconnected nodes that implement
+forwarding elements. A unicast layer provides a best-effort unicast
+packet service between two endpoints in the layer. We call the
+abstraction of this point-to-point unicast service a flow. A flow in
+itself has no guarantees in terms of reliability [^5].
+
+{{<figure width="70%" src="/docs/concepts/unicast_layer.png">}}
+
+A representation of a very simple unicast layer is drawn above, with a
+flow between the _green_ (bottom left) and _red_ (top right)
+forwarding elements.
+
+The forwarding function operates in such a way that, given the label
+of the destination forwarding element (in the case of the figure, a
+_red_ label), the packet will move to the destination forwarding
+element (_red_) in a _deliberate_ manner. The paper has a precise
+mathematical definition, but qualitatively, our definition of
+_FORWARDING_ ensures that the trajectory that packets follow through a
+network layer between source and destination
+
+* doesn't need to use the 'shortest' path
+* can use multiple paths
+* can use different paths for different packets between the same
+ source-destination pair
+* can involve packet duplication
+* will not have non-transient loops[^6][^7]
+
+The first question is: _what information does that forwarding function
+need in order to work?_ Mathematically, the answer is that all
+forwarding elements need to know the values of a valid __distance
+function__[^8] between themselves and the destination forwarding
+element, and between all of their neighbors and the destination
+forwarding element. The PFF can then select a (set of) link(s) to any
+of its neighbors that is closer to the destination forwarding element
+according to the chosen distance function and send the packet on these
+link(s). Thus, while the __forwarding elements need to be _named___,
+the __links between them need to be _measured___. This can be either
+explicit by assigning a certain weight to a link, or implicit and
+inferred from the distance function itself.
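+
+A small sketch of that selection rule (the `dist` callback and types
+are placeholders for any valid distance function; for simplicity it
+returns a single neighbor rather than a set):
+
+```C
+#include <stdint.h>
+
+typedef uint64_t (* dist_fn)(int a, int b);
+
+/* Forward: pick a neighbor strictly closer to dst than we are. */
+int select_next_hop(int self, int dst,
+                    const int * nbr, int n_nbr, dist_fn dist)
+{
+        int i;
+
+        for (i = 0; i < n_nbr; ++i)
+                if (dist(nbr[i], dst) < dist(self, dst))
+                        return nbr[i];
+
+        return -1; /* no next hop known */
+}
+```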
+
+The second question is: _how will that forwarding function know this
+distance information_? There are a couple of different possible
+answers, which are all well understood. I'll briefly summarize them
+here.
+
+A first approach is to use a coordinate space for the names of the
+forwarding elements. For instance, if we use the GPS coordinates of
+the machine in which they reside as a name, then we can apply some
+basic geometry to _calculate_ the distances based on this name
+only. This simple GPS example has pitfalls, but it has been proven
+that any connected finite graph has a greedy embedding in the
+hyperbolic plane. The obvious benefit of such so-called _geometric
+routing_ approaches is that they don't require any dissemination of
+information beyond the mathematical function to calculate distances,
+the coordinate (name) and the set of neighboring forwarding
+elements. In such networks, this information is disseminated during
+initial exchanges when a new forwarding element joins a unicast layer
+(see below).
+
+A second approach is to disseminate the values of the distance
+function to all destinations directly, constantly updating your
+own (shortest) distances from the values received from other
+forwarding elements. This is a very well-known mechanism and is
+implemented by what is known as _distance vector_ protocols. It is
+also well-known that the naive approach of only disseminating the
+distances to neighbors can run into a _count-to-infinity_ issue when
+links go down. To alleviate this, _path vector_ protocols include a
+full path to every destination (making them a bit less scalable), or
+distance vector protocols are augmented with mechanisms to avoid
+transient loops and the resulting count-to-infinity (e.g. Babel).
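+
+The core of a distance vector protocol is a tiny relaxation step; a
+sketch (the table layout and names are illustrative only):
+
+```C
+#include <stdint.h>
+
+#define MAX_ELEMENTS 1024
+#define DIST_INF     UINT64_MAX
+
+uint64_t dist[MAX_ELEMENTS];     /* our current distance to each dst */
+int      next_hop[MAX_ELEMENTS]; /* neighbor we forward to for dst   */
+
+/* A neighbor advertised its distance to dst: relax our entry. */
+void dv_update(int nbr, uint64_t link_cost, int dst, uint64_t nbr_dist)
+{
+        if (nbr_dist != DIST_INF && link_cost + nbr_dist < dist[dst]) {
+                dist[dst]     = link_cost + nbr_dist;
+                next_hop[dst] = nbr;
+        }
+}
+```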
+
+The third approach is to disseminate the link weights of neighboring
+links. From this information, each forwarding element can build a view
+of the network graph and again calculate the necessary distances that
+the forwarding function needs. This mechanism is implemented in
+so-called _link-state_ protocols.
+
+I will also mention MAC learning here. MAC learning is a bit
+different, in that it is using piggybacked information from the actual
+traffic (the source MAC address) and the knowledge that the adjacency
+graph is a _tree_ as input for the forwarding function.
+
+There is plenty more to say about this, and I will, but first, I will
+need to introduce some other concepts, most notably the broadcast
+layer.
+
+### The broadcast layer
+
+A broadcast layer is a collection of interconnected nodes that house
+flooding elements. Each node can have either, both, or neither of the
+sender and receiver roles. A broadcast layer provides a best-effort
+broadcast packet service from sender nodes to all (receiver) nodes in
+the layer.
+
+{{<figure width="70%" src="/docs/concepts/broadcast_layer.png">}}
+
+Our simple definition of _FLOODING_ -- given a set of adjacent links,
+send packets received on a link in the set on all other links in the
+set -- has a huge implication for the properties of a fundamental
+broadcast layer: the graph must always be a _tree_, or packets could
+travel along infinite trajectories with loops [^9].
+
+### Building layers
+
+We now define 2 fundamental operations for constructing packet network
+layers: __enrollment__ and __adjacency management__. These operations
+are very broadly defined, and can be implemented in a myriad of
+ways. These operations can be implemented through manual configuration
+or automated protocol interactions. They can be skipped (no-operation,
+(nop)) or involve complex operations such as authentication. The main
+objective here is just to establish some common terminology for these
+operations.
+
+The first mechanism, enrollment, adds a (forwarding or flooding)
+element to a layer; it prepares a node to act as a functioning element
+of the layer and establishes its name (in the case of a unicast
+layer). In addition, it may exchange some key parameters (for instance
+a distance function for a unicast layer), and it can involve
+authentication and setting roles and permissions. __Bootstrapping__ is
+a special case of enrollment for the _first_ node in a layer. The
+inverse operation is called _unenrollment_.
+
+After enrollment, we may add peering relationships by _creating
+adjacencies_ between forwarding elements in a unicast layer or between
+flooding elements in a broadcast layer. This will establish neighbors
+and, in the case of a unicast layer, may additionally define link
+weights. The inverse operation is called _tearing down adjacencies_
+between elements. Together, these operations will be referred to as
+_adjacency management_.
+
+Operations such as merging and splitting layers can be decomposed into
+these two operations. This doesn't mean that merge operations
+shouldn't be researched. To the contrary, optimizing this will be
+instrumental for creating networks on a global scale.
+
+For the broadcast layer, we already have most ingredients in
+place. Now we will focus on the unicast layer.
+
+### Scaling the unicast layer
+
+Let's look at how to scale implementations of the packet forwarding
+function (PFF). On the one hand, in distance vector, path vector and
+link state, the PFF is implemented as a _table_. We call it the packet
+forwarding table (PFT). On the other hand, geometric routing doesn't
+need a table and can implement the PFF as a mathematical equation
+operating on the _forwarding element names_. In this respect,
+geometric routing looks like a magic bullet to routing table
+scalability -- it doesn't need one -- but there are many challenges
+relating to the complexity of calculating greedy embeddings of graphs
+that are not static (a changing network where routers and end-hosts
+enter and leave, and links can fail and return after repair) that
+currently make these solutions impractical at scale. We will focus on
+the solutions that use a PFT.
+
+The way the unicast layer is defined at this point, the PFT scales
+_linearly_ with the number of forwarding elements (n) in the layer,
+its space complexity is O(n)[^10]. The obvious solution to any student
+of computer networks is to use a scheme like IP and Classless
+InterDomain Routing (CIDR) where the hosts _addresses_ are subnetted,
+allowing for entries in the PFT to be aggregated, drastically reducing
+its space complexity, in theory at least, to O(log(n)). So we should
+not use arbitrary names for the forwarding elements, but give them an
+_address_!
+
+Sure, that _is_ the solution, but not so fast! When building a model,
+each element in the model should be well-defined and named at most
+once -- synonyms for human use are allowed and useful, but they are
+conveniences, not part of the functioning of the model. If we
+subdivide the name of the forwarding element into different subnames,
+as is done in hierarchical addressing, we have to ask ourselves what
+element in the model each of those subnames is naming! In the
+geographic routing example above, we dragged the Earth into the model,
+and used GPS coordinates (latitude and longitude) in the name. But
+where do subnets come from, and what _are_ addresses? What do we drag
+into our model, if anything, to create them?
+
+#### A quick recap
+
+{{<figure width="70%" src="/docs/concepts/unicast_layer_bc_pft.png">}}
+
+Let's recap what a simple unicast layer that uses forwarding elements
+with packet forwarding table looks like in the model. First we have
+the unicast layer itself, consisting of a set of forwarding elements
+with defined adjacencies. Recall that the necessary and sufficient
+condition for the unicast layer to be able to forward packets between
+any (source, sink)-pair is that all forwarding elements can deduce the
+values of a distance function between themselves and the sink, and
+between each of their neighbors and the sink. This means that such a
+unicast layer requires an additional (optional) element that
+distributes this routing information. Let's call it the __Routing
+Element__, and assume that it implements a simple link-state
+routing protocol. The RE is drawn as a turquoise element accompanying
+each forwarding element in the figure above. Now, each routing element
+needs to disseminate information to _all_ other nodes in the layer, in
+other words, it needs to _broadcast_ link state information. The RE is
+inside of a unicast layer, and unicast layers don't do broadcast, so
+the REs will need the help of a broadcast layer. That is what is drawn
+in the figure above. Now, at first this may look weird, but an IP
+network does this too! For instance, the Open Shortest Path First
+(OSPF) protocol uses IP multicast between OSPF routers. The way that
+the IP layer is defined just obfuscates that this OSPF multicast
+network is in fact a disguised broadcast layer. I will refer to my
+[blog post on multicast](/blog/2021/04/02/how-does-ouroboros-do-anycast-and-multicast/)
+if you like a bit more elaboration on how this maps to the IP world.
+
+#### Subdividing the unicast layer
+
+```
+Vital realizations not only provide unforeseen clarity, they also
+energize us to dig deeper.
+ -- Brian Greene (in "Until the end of time")
+```
+
+Now, it's obvious that a single global layer like this with billions
+of nodes would buckle under its own size; we need to split things up
+into smaller, more manageable groups of nodes.
+
+{{<figure width="70%" src="/docs/concepts/unicast_layer_bc_pft_split.png">}}
+
+This is shown in the figure above, where the unicast layer is split
+into 3 groups of forwarding elements, let's call them __routing
+areas__, a yellow, a turquoise and a blue area, with each its own
+broadcast layer for disseminating the link state information that is
+needed to populate the forwarding tables. These areas can be chosen
+small enough so that the forwarding tables (which still scale linearly
+with respect to the number of participating nodes in the routing area)
+are manageable in size. It can also keep latency in disseminating the
+link-state packets in check, but we will deal with latency later, for
+now, let's still assume latency on the links is zero and bandwidth on
+the links is infinite.
+
+Now, in this state, there can't be any communication between the
+routing areas, so we will need to add a fourth one.
+
+{{<figure width="70%" src="/docs/concepts/unicast_layer_bc_pft_split_broadcast.png">}}
+
+This is shown in the figure above. We have our 3 original routing
+areas, and I numbered some of the nodes in these original routing
+areas. These are the numbers after the dot in the figure: 1, 2, 3, 4
+in the turquoise routing area, 5, 6, 10 in the yellow routing area,
+and 1, 5 in the blue area (I omitted some so as not to clutter the
+illustration).
+
+We have also added 4 new forwarding elements, each with their own
+(red) routing element, that have a client-server relationship (rather
+than a peering relationship) with other forwarding elements in the
+layer. These are the numbers before the dot: 1, 2, 2, and 3. This may
+look intuitively obvious, and "1.4" and "3.5" may look like
+"addresses", but let's stress the things that I think are important,
+noting that this is a _model_ and most certainly _not an
+implementation design_.
+
+Every node in the unicast layer above consists of 2 forwarding
+elements in a client-server relationship, but the ones that are
+not drawn all have the same name and are not functionally active; they
+are there in a virtual way to keep the naming in the layer unique.
+
+We did not introduce new elements to the model, but we did add a new
+client-server relationship between forwarding elements.
+
+This client-server relationship gives rise to some new rules for
+naming the forwarding elements.
+
+First, the names of forwarding elements that are within a routing area
+have to be unique within that routing area if they have no client
+forwarding elements within the node.
+
+Forwarding elements with client forwarding elements have the same name
+if and only if their clients are within the same routing area.
+
+In the figure, there are peering relationships between unicast nodes
+"1.4" and "2.5" and unicast nodes "2.10" and "3.5", and these four
+nodes disseminate forwarding information using the red broadcast
+layer[^11].
+
+Note that not all forwarding elements need to actively disseminate
+routing information. If the forwarding elements in the turquoise
+routing area were all (logically) directly connected to 1.4, they
+would not need the broadcast layer, this is like IP, which also
+doesn't require end-hosts to run a routing protocol.
+
+#### Structure of a unicast node
+
+The rules for allowed peering relationships relate to the structure of
+the client-server relationship. In its most generalized form, this
+relationship gives rise to a directed acyclic graph (DAG) between
+forwarding elements that are part of the same unicast node.
+
+{{<figure width="70%" src="/docs/concepts/unicast_layer_dag.png">}}
+
+We call the _rank_ of the forwarding element within the node the
+height at which it resides in this DAG. For instance, the figure above
+shows two unicast nodes with their forwarding elements arranged as DAGs.
+The forwarding elements with a turquoise and purple routing element
+are at rank 0, and the ones with a yellow routing element are at rank
+3.
+
+A forwarding element in one node can have peering relationships only
+with forwarding elements of other nodes that
+
+1) are at the same rank,
+
+2) have a different name,
+
+3) are in the same routing area at that rank,
+
+and only if
+
+1) the two unicast nodes do not already have a peering relationship at
+any forwarding element that is on a path towards the root of the DAG,
+and
+
+2) no lower-ranked peering relationship is possible.
+
+So, in the figure above, there cannot be a peering relationship at
+rank 0, because these elements are in different routing areas
+(turquoise and purple). The lowest possible peering relationship is at
+rank 1, in the shared routing area. If, at rank 1, the right node were
+in a different routing area, there could be 2 peering relationships
+between these unicast nodes, for instance at rank 2 in the green
+routing area and at rank 3 in the yellow routing area (or also at rank
+2 in the blue routing area).
+
+#### What are addresses?
+
+Let's end this discussion with how all this relates to IP addressing
+and CIDR. Each "IPv4" host has 32 forwarding elements with a straight
+parent-child relationship between them [^12]. The rules above imply
+that there can be only one peering relationship between two nodes. The
+subnet mask actually acts as a sort of short-hand notation, showing
+where the routing elements are in the same routing area: with mask
+255.255.255.0, the peering relationship is at rank 8; IP network
+engineers then say that the nodes are in a /24 network.
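+
+In other words, the rank of the peering relationship is simply the
+number of host bits (purely illustrative):
+
+```C
+/* A /24 network (mask 255.255.255.0) peers at rank 32 - 24 = 8. */
+int peering_rank(int prefix_len)
+{
+        return 32 - prefix_len;
+}
+```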
+
+Apart from building towards CIDR from the ground up, we have also
+derived _what network addresses really are_: they consist of names of
+forwarding elements in a unicast node and reflect the organisation of
+these forwarding elements in a directed acyclic graph (DAG). Now,
+there is still a (rather amusing) and seemingly never-ending discussion
+in the network community on whether IP addresses should be assigned to
+nodes or interfaces. This discussion is moot: you can write your name
+on your mailbox, that doesn't make it the name of your mailbox, it is
+_your_ name. It is also a false dichotomy caused by device-oriented
+thinking, looking at a box of electronics with a bunch of holes in
+which to plug some wires, and then thinking that we either have to
+name the box or the holes: the answer is _neither_. Just like a post
+office building doesn't do anything without post office workers (or
+their automated robotic counterparts), a router or switch doesn't do
+anything without forwarding elements. I will come back to this when
+discussing multi-homing.
+
+One additional thing is that in the current IP Internet, the layout of
+the routing areas is predominantly administratively defined and
+structured into so-called Autonomous Systems (ASes) that each receive
+a chunk of the available IP address space, with BGP used to
+disseminate routes between them. The layout of, and peering
+relationships between, these ASes are not necessarily optimal for the
+structure of the Internet. Decoupling the network addressing within an
+AS from the addressing and structure of an overlaying unicast layer,
+and how to disseminate routes in that overlay unicast layer, is an
+interesting topic that warrants further study[^13].
+
+### Do we really need routing at a global scale?
+
+An interesting question to ask is whether we need to be able to scale
+a layer to the scale of the planet, or -- some day -- the solar
+system, or even the universe. IPv6 was the winning technology to deal
+with the anticipated problem of IPv4 address exhaustion. But can we
+build an Internet that doesn't require all possible end users to share
+the same network (layer)?
+
+My answer is not proven and therefore not conclusive, but I think
+yes: any public Internetwork at scale -- where it is possible for any
+end-user to reach any application -- will always need at least one
+(unicast) layer that spans most of the systems on the network, and
+thus a global address space. In the current Internet, applications
+are identified by an IP address and a (well-known) port, and the
+Domain Name System (DNS) maps a host name to an IP address (or a set
+of IP addresses). In any general Internetwork, if applications were
+in private networks, we would need a system to find the (private
+network, node name in private network) pair for some application, and
+every end-host would need to reach that system, which -- unless I am
+missing something big -- means that system will need a global address
+space[^14].
+
+### Multi-homing
+
+
+[Under construction - [this blog post](/blog/2022/12/07/loc/id-split-and-the-ouroboros-network-model/)
+on Loc/ID split might be interesting]
+
+### Dealing with limited link capacity
+
+
+
+[Under construction]
+
+[^1]: In the paper we call these elements _data transfer protocol
+ machines_, but I think this terminology is clearer.
+
+[^2]: A tree is a connected graph with N vertices and N-1 edges.
+
+[^3]: I've already explored how some technologies map to the Ouroboros
+ model in my blog post on
+ [unicast vs multicast](/blog/2021/04/02/how-does-ouroboros-do-anycast-and-multicast/).
+
+[^4]: Of course, once the model is properly understood and a
+ green-field scenario is considered, recursive networking is the
+ obvious choice, and so the Ouroboros prototype _is_ a recursive
+ network.
+
+[^5]: This is where Ouroboros is similar to IP, and differs from RINA.
+ RINA layers (DIFs) aim to provide reliability as part of the
+ service (flow). We found this approach in RINA to be severely
+      flawed, preventing RINA from being a _universal_ model for all
+ networking and IPC. RINA can be modeled as an Ouroboros network,
+ but Ouroboros cannot be modeled as a RINA network. I've written
+      about this in more detail in my blog post on
+ [Ouroboros vs RINA](/blog/2021/03/20/how-does-ouroboros-relate-to-rina-the-recursive-internetwork-architecture/).
+
+[^6]: Transient loops are loops that occur due to forwarding functions
+ momentarily having different views of the network graph, for
+ instance due to delays in disseminating information on
+ unavailable links.
+
+[^7]: Some may think that it's possible to build a network layer that
+ forwards packets in a way that _deliberately_ takes a couple of
+ loops between a set of nodes and then continues forwarding to
+ the destination, violating the definition of _FORWARDING_. It's
+ not possible, because based on the destination address alone,
+ there is no way to know whether that packet came from the loop
+ or not. _"But if I add a token/identifier/cookie to the packet
+ header"_ -- yes, that is possible, and it may _look like that
+ packet is traversing a loop_ in the network, but it doesn't
+ violate the definition. The question is: what is that
+ token/identifier/cookie naming? It can be only one of a couple
+ of things: a forwarding element, a link or the complete
+ layer. Adding a token and the associated logic to process it,
+ will be equivalent to adding nodes to the layer (modifying the
+ node name space to include that token) or adding another
+ layer. In essence, the implementation of the nodes on the loop
+ will be doing something like this:
+
+      ```
+      if logic_based_on_token(token) == X:
+          # behave like node (token, X)
+      elif logic_based_on_token(token) == Y:
+          # behave like node (token, Y)
+      else:
+          # ... and so on
+      ```
+
+      When taking the transformation into account, the resulting
+      layer(s) will follow the fundamental model as presented
+      above. Also observe that adding such tokens may drastically
+      increase the address space in the Ouroboros representation.
+
+[^8]: For the mathematically inclined, the exact formulation is in the
+      [paper](https://arxiv.org/pdf/2001.09707.pdf), section 2.4.
+
+[^9]: Is it possible to broadcast on a non-tree graph by pruning in
+      some way, shape or form? There are some things to
+      consider. First, if the pruning is done to eliminate links in
+      the graph, say in the way that STP prunes links on an Ethernet
+      or VLAN, then this operation is equivalent to creating a new
+      broadcast layer. We call this enrollment and adjacency
+      management; it will be explained in the next sections. Second
+      is trying to get around loops by adding the name of the
+      (source) node plus a token/identifier/cookie as a packet header
+      in order to detect packets that have traveled in a loop, and
+      dropping them when they do. This kind of network fits neither
+      the broadcast layer nor the unicast layer. But the thing is: it
+      also _doesn't scale_, as all packets need to be tracked, at
+      least in theory, forever. Assuming packet ordering is preserved
+      inside a layer is a big no-no. Another line of thinking may be
+      to add a decreasing counter to avoid loops, but it goes down a
+      similar rabbit hole. How large to set the counter? This also
+      doesn't scale. Such things may work for some use cases, but
+      they don't work _in general_.
+
+[^10]:In addition to the size of the packet forwarding tables, link
+ state, path vector and distance vector protocols are also
+ limited in size because of time delays in disseminating link
+ state information between the nodes, and the amount to be
+ disseminated. We will address this a bit later in the discourse.
+
+[^11]:The functionality of this red routing element is often
+ implemented as an unfortunate human engineer that has to subject
+ himself to one of the most inhuman ordeals imaginable: manually
+ calculating and typing IP destinations and netmasks into the
+ routing tables of a wonky piece of hardware using the most
+ ill-designed command line interface seen this side of 1974.
+
+[^12]:Drawing this in a full network example is way beyond my artistic
+ skill.
+
+[^13]:There is a serious error in the paper that states that this
+ routing information can be marked with a single bit. This is
+ only true in the limited case that there is only one "gateway"
+ node in the routing area. In the general case, path information
+ will be needed to determine which gateway to use.
+
+[^14]:A [paper on RINA](http://rina.tssg.org/docs/CAMAD-final.pdf)
+ that claims that a global address space is not needed, seems to
+ prove the exact opposite of that claim. The resolution system,
+ called the Inter-DIF Directory (IDD) is present on every system
+ that can make use of it and uses internal forwarding rules based
+ on the lookup name (in a hierarchical namespace!) to route
+ requests between its peer nodes. If that is not a global address
+ space, then I am Mickey Mouse: the addresses inside the IDD are
+ just based on strings instead of numbers. The IDD houses a
+ unicast layer with a global address space. While the IDD is
+      technically not a DIF, the DIF-DAF distinction is [severely
+ flawed](/blog/2021/03/20/how-does-ouroboros-relate-to-rina-the-recursive-internetwork-architecture/#ouroboros-diverges-from-rina).
diff --git a/content/en/docs/Concepts/problem_osi.md b/content/en/docs/Concepts/problem_osi.md
index 845de5e..66b0ad4 100644
--- a/content/en/docs/Concepts/problem_osi.md
+++ b/content/en/docs/Concepts/problem_osi.md
@@ -2,22 +2,45 @@
title: "The problem with the current layered model of the Internet"
author: "Dimitri Staessens"
-date: 2019-07-06
+date: 2020-04-06
weight: 1
description: >
- The current networking paradigm
+
---
+```
+The conventional view serves to protect us from the painful job of
+thinking.
+ -- John Kenneth Galbraith
+```
+
+Every engineering class that deals with networks explains the
+[7-layer OSI model](https://www.bmc.com/blogs/osi-model-7-layers/)
+and the
+[5-layer TCP/IP model](https://subscription.packtpub.com/book/cloud_and_networking/9781789349863/1/ch01lvl1sec13/tcp-ip-layer-model).
+
+Both models have common origins in the International Network
+Working Group (INWG), and therefore share many similarities. The
+TCP/IP model evolved from the implementation of the early ARPANET in
+the '70s and '80s. The Open Systems Interconnect (OSI) model was the
+result of a standardization effort in the International Organization
+for Standardization (ISO), which ran well into the nineties. The OSI
+model had a number of useful abstractions: services, interfaces and
+protocols, whereas the TCP/IP model was more tightly coupled to the
+Internet Protocol (IP).
+
+### A bird's-eye view of the OSI model
+
{{<figure width="40%" src="/docs/concepts/aschenbrenner.png">}}
-Every computer science class that deals with networks explains the
-[7-layer OSI model](https://www.bmc.com/blogs/osi-model-7-layers/).
Open Systems Interconnect (OSI) defines 7 layers, each providing an
-abstraction for a certain *function* that a network application may
-need.
+abstraction for a certain *function*, or _service_, that a networked
+application may need. The figure above shows what is probably
+[the first draft](https://tnc15.wordpress.com/2015/06/17/locked-in-tour-europe/)
+of the OSI model.
From top to bottom, the layers provide (roughly) the following
-functions:
+services:
The __application layer__ implements the details of the application
protocol (such as HTTP), which specifies the operations and data that
@@ -46,35 +69,112 @@ Finally, the __physical layer__ is responsible for translating the
bits into a signal (e.g. laser pulses in a fibre) that is carried
between endpoints.
+The benefit of the OSI model is that each of these layers has a
+_service description_, and an _interface_ to access this service. The
+details of the protocols inside the layer are of less importance, as
+long as they get the job -- defined by the service description --
+done.
+
This functional layering provides a logical order for the steps that
data passes through between applications. Indeed, existing (packet)
-networks go through these steps in roughly this order (however, some
-may be skipped).
-
-However, when looking at current networking solutions in more depth,
-things are not as simple as these 7 layers seem to indicate. Consider
-a realistic scenario for a software developer working
-remotely. Usually it goes something like this: he connects over the
-Internet to the company __Virtual Private Network__ (VPN) and then
-establishes an SSH __tunnel__ over the development server to a virtual
-machine and then establishes another SSH connection into that virtual
-machine.
-
-We are all familiar enough with this kind of technologies to take them
-for granted. But what is really happnening here? Let's assume that the
-Internet layers between the home of the developer and his office
-aren't too complicated. The home network is IP over Wi-Fi, the office
-network IP over Ethernet, and the telecom operater has a simple IP
-over xDSL copper network (because in reality operator networks are
-nothing like L3 over L2 over L1). Now, the VPN, such as openVPN,
-creates a new network on top of IP, for instance a layer 2 network
-over TAP interfaces supported by a TLS connection to the VPN server.
-
-Technologies such as VPNs, tunnels and some others (VLANs,
-Multi-Protocol Label switching) seriously jumble around the layers in
-this layered model. Now, by my book these counter-examples prove that
-the 7-layered model is, to put it bluntly, wrong. That doesn't mean
-it's useless, but from a purely scientific view, there has to be a
-better model, one that actually fits implementations.
-
-Ouroboros is our answer towards a more complete model for computer networks. \ No newline at end of file
+networks go through these steps in roughly this order.
+
+### A bird's-eye view of the TCP/IP model
+
+{{<figure width="25%" src="https://static.packt-cdn.com/products/9781789349863/graphics/6c40b664-c424-40e1-9c65-e43ebf17fbb4.png">}}
+
+The TCP/IP model came directly from the implementation of TCP/IP, so
+instead of each layer corresponding to a service, each layer directly
+corresponded to a (set of) protocol(s). IP was the unifying protocol,
+not caring what was below it at layer 1. The HOST-HOST protocols offered
+a connection-oriented service (TCP) or a connectionless service (UDP)
+to the application. The _TCP/IP model_ was retroactively made more
+"OSI-like", turning into the 5-layer model, which views the top 3
+layers of OSI as an "application layer".
+
+### Some issues with these models
+
+When looking at current networking solutions in more depth,
+things are not as simple as these 5/7 layers seem to indicate.
+
+#### The order of the layers is not fixed.
+
+Consider, for instance, __Virtual Private Network__ (VPN) technologies
+and SSH __tunnels__. We are all familiar enough with these
+technologies to take them for granted. But a VPN, such as OpenVPN,
+creates a new network on top of IP. In _bridging_ mode this is a Layer
+2 (Ethernet) network over TAP interfaces; in _routing_ mode it is a
+Layer 3 (IP) network over TUN interfaces. In both cases they are
+supported by a Layer 4 connection (using, for instance, Transport
+Layer Security) to the VPN server that provides the network
+access. Technologies such as VPNs and various so-called _tunnels_
+seriously jumble around the layers in this layered model.
+
+#### How many layers are there exactly?
+
+Multi-Protocol Label Switching (MPLS), a technology that allows
+operators to establish and manage circuit-like paths in IP networks,
+typically sits in between Layer 2 and IP and is categorized as a
+_Layer 2.5_ technology. So are there 8 layers? Why not revise the
+model and number them 1-8 then?
+
+QUIC is a protocol that performs transport-layer functions such as
+retransmission, flow control and congestion control, but works around
+the initial performance bottleneck of starting a TCP connection
+(3-way handshake, slow start) and adds some other optimizations
+dealing with re-establishing connections for which security keys are
+already known. But QUIC runs on top of UDP. If UDP is Layer 4, then
+what layer is QUIC?
+
+One could argue that UDP is an incomplete Layer 4 protocol and QUIC
+adds its missing Layer 4 functionalities. Fair enough, but then what
+is the minimum functionality for a complete Layer 4 protocol? And what
+is a minimum functionality for a Layer 3 protocol? What do IP, ICMP
+and IGMP have in common that makes them Layer 3, beyond the arbitrary
+consensus that they should be available on a brick of future e-waste
+that is sold as a "router"?
+
+#### Which protocol fits in which layer is not clear-cut.
+
+There is a whole slew of protocols that are situated in Layer 3:
+ICMP, IGMP, OSPF... They don't really need the features that Layer 4
+provides (retransmission, ...). But again, they run on _top of Layer
+3_ (IP). They get assigned a protocol number in the IP header, instead
+of a port number in the UDP header. But doesn't the protocol number in
+the Layer 3 header indicate a Layer 4 protocol? Apparently only in
+some cases, but not in others.
+
+The Border Gateway Protocol (BGP) performs (inter-domain)
+routing. Routing is a function that is usually associated with Layer
+3. But BGP runs on top of TCP, which is Layer 4, so is it in the
+application layer? There is no real consensus on what layer BGP is
+in: some say Layer 3, some (probably most) say Layer 4 because it is
+using TCP, and some say it's in the application layer. But the
+consensus does seem to be that the BGP conundrum doesn't matter. BGP
+works, and the OSI and TCP/IP models are _just theoretical models_,
+not _rules_ that are set in stone.
+
+### Are these issues _really_ a problem?
+
+Well, in my opinion: yes! These models are pure [rubber
+science](https://en.wikipedia.org/wiki/Rubber_science). They have no
+predictive value, don't fit with observations of the real-world
+Internet most of us use every day, and are about as arbitrary as a
+seven-course tasting menu of home-grown vegetables. Their only uses
+are as technobabble for network engineers and as tools for university
+professors to gauge their students' ability to retain a moderate
+amount of stratified drivel.
+
+If there is no universally valid theoretical model, if we have no
+clear definitions of the fundamental concepts and no clearly defined
+set of rules that unequivocally lay out what the _necessary and
+sufficient conditions for networking_ are, then we are all just
+_engineering in the dark_, with progress in computer networks
+condemned to a Sisyphean effort of perpetual incremental fixes, its
+fate to remain a craft that builds on tradition, cobbling together an
+ever-growing jumble of technologies and protocols that stretch the
+limits of manageability.
+
+Not yet convinced? Read an even more in-depth explanation on our
+[blog](/blog/2022/02/12/what-is-wrong-with-the-architecture-of-the-internet/),
+about the separation-of-concerns design principle and layer
+violations, and about the separation of mechanism & policy and
+ossification.
diff --git a/content/en/docs/Concepts/rec_netw.jpg b/content/en/docs/Concepts/rec_netw.jpg
deleted file mode 100644
index bddaca5..0000000
--- a/content/en/docs/Concepts/rec_netw.jpg
+++ /dev/null
Binary files differ
diff --git a/content/en/docs/Concepts/unicast_layer.png b/content/en/docs/Concepts/unicast_layer.png
new file mode 100644
index 0000000..c77ce48
--- /dev/null
+++ b/content/en/docs/Concepts/unicast_layer.png
Binary files differ
diff --git a/content/en/docs/Concepts/unicast_layer_bc_pft.png b/content/en/docs/Concepts/unicast_layer_bc_pft.png
new file mode 100644
index 0000000..77860ce
--- /dev/null
+++ b/content/en/docs/Concepts/unicast_layer_bc_pft.png
Binary files differ
diff --git a/content/en/docs/Concepts/unicast_layer_bc_pft_split.png b/content/en/docs/Concepts/unicast_layer_bc_pft_split.png
new file mode 100644
index 0000000..9a4f9fb
--- /dev/null
+++ b/content/en/docs/Concepts/unicast_layer_bc_pft_split.png
Binary files differ
diff --git a/content/en/docs/Concepts/unicast_layer_bc_pft_split_broadcast.png b/content/en/docs/Concepts/unicast_layer_bc_pft_split_broadcast.png
new file mode 100644
index 0000000..fa66864
--- /dev/null
+++ b/content/en/docs/Concepts/unicast_layer_bc_pft_split_broadcast.png
Binary files differ
diff --git a/content/en/docs/Concepts/unicast_layer_dag.png b/content/en/docs/Concepts/unicast_layer_dag.png
new file mode 100644
index 0000000..010ad4f
--- /dev/null
+++ b/content/en/docs/Concepts/unicast_layer_dag.png
Binary files differ
diff --git a/content/en/docs/Concepts/what.md b/content/en/docs/Concepts/what.md
deleted file mode 100644
index ac87754..0000000
--- a/content/en/docs/Concepts/what.md
+++ /dev/null
@@ -1,78 +0,0 @@
----
-title: "Recursive networks"
-author: "Dimitri Staessens"
-
-date: 2020-01-11
-weight: 2
-description: >
- The recursive network paradigm
----
-
-The functional repetition in the network stack is discussed in
-detail in the book __*"Patterns in Network Architecture: A Return to
-Fundamentals"*__. From the observations in the book, a new architecture
-was proposed, called the "__R__ecursive __I__nter__N__etwork
-__A__rchitecture", or [__RINA__](http://www.pouzinsociety.org).
-
-__Ouroboros__ follows the recursive principles of RINA, but deviates
-quite a bit from its internal design. There are resources on the
-Internet explaining RINA, but here we will focus
-on its high level design and what is relevant for Ouroboros.
-
-Let's look at a simple scenario of an employee contacting an internet
-corporate server over a Layer 3 VPN from home. Let's assume for
-simplicity that the corporate LAN is not behind a NAT firewall. All
-three networks perform (among some other things):
-
-__Addressing__: The VPN hosts receive an IP address in the VPN, let's
-say some 10.11.12.0/24 address. The host will also have a public IP
-address, for instance in the 20.128.0.0/16 range . Finally that host
-will have an Ethernet MAC address. Now the addresses __differ in
-syntax and semantics__, but for the purpose of moving data packets,
-they have the same function: __identifying a node in a network__.
-
-__Forwarding__: Forwarding is the process of moving packets to a
-destination __with intent__: each forwarding action moves the data
-packet __closer__ to its destination node with respect to some
-__metric__ (distance function).
-
-__Network discovery__: Ethernet switches learn where the endpoints are
-through MAC learning, remembering the incoming interface when it sees
-a new soure address; IP routers learn the network by exchanging
-informational packets about adjacency in a process called *routing*;
-and a VPN proxy server relays packets as the central hub of a network
-connected as a star between the VPN clients and the local area
-network (LAN) that is provides access to.
-
-__Congestion management__: When there is a prolonged period where a
-node receives more traffic than can forward forward, for instance
-because there are incoming links with higher speeds than some outgoing
-link, or there is a lot of traffic between different endpoints towards
-the same destination, the endpoints experience congestion. Each
-network could handle this situation (but not all do: TCP does
-congestion control for IP networks, but Ethernet just drops traffic
-and lets the IP network deal with it. Congestion management for
-Ethernet never really took off).
-
-__Name resolution__: In order not having to remember addresses of the
-hosts (which are in a format that make it easier for a machine to deal
-with), each network keeps a mapping of a name to an address. For IP
-networks (which includes the VPN in our example), this is done by the
-Domain Name System (DNS) service (or, alternatively, other services
-such as *open root* or *namecoin*). For Ethernet, the Address
-Resolution Protocol maps a higher layer name to a MAC (hardware)
-address.
-
-{{<figure width="50%" src="/docs/concepts/layers.jpg">}}
-
-Recursive networks take all these functions to be part of a network
-layer, and layers are mostly defined by their __scope__. The lowest
-layers span a link or the reach of some wireless technology. Higher
-layers span a LAN or the network of a corporation e.g. a subnetwork or
-an Autonomous System (AS). An even higher layer would be a global
-network, followed by a Virtual Private Network and on top a tunnel
-that supports the application. Each layer being the same in terms of
-functionality, but different in its choice of algorithm or
-implementation. Sometimes the function is just not implemented
-(there's no need for routing in a tunnel!), but logically it could be
-there.
diff --git a/content/en/docs/Contributions/_index.md b/content/en/docs/Contributions/_index.md
index b5ffa5f..558298e 100644
--- a/content/en/docs/Contributions/_index.md
+++ b/content/en/docs/Contributions/_index.md
@@ -7,14 +7,23 @@ description: >
How to contribute to Ouroboros.
---
+### Ongoing work
+
+Ouroboros is far from complete. Plenty of things need to be researched
+and implemented. We don't really keep a list, but this
+[epic board](https://tree.taiga.io/project/dstaesse-ouroboros/epics) can
+give you some ideas of what is still on our mind and where you may be
+able to contribute.
+
### Communication
There are 2 ways that will be used to communicate: The mailing list
(ouroboros@freelists.org) will be used for almost everything except
-for day-to-day chat. For that we use the
-[slack](https://odecentralize.slack.com) (invite link in footer) and
-the #ouroboros channel on Freenode (IRC chat). The slack channel is a
-bit more active, and preferred. Use whatever login name you desire.
+for day-to-day chat. For that we use a public
+[slack](https://odecentralize.slack.com) channel (invite link in the
+footer), bridged to a
+[matrix space](https://matrix.to/#/#ODecentralize:matrix.org).
+Use whatever login name you desire.
Introduce yourself, use common sense and be polite!
@@ -22,7 +31,7 @@ Introduce yourself, use common sense and be polite!
The coding guidelines of the main Ouroboros stack are similar as those
of the Linux kernel
-(https://www.kernel.org/doc/Documentation/CodingStyle) with the
+(https://www.kernel.org/doc/html/latest/process/coding-style.html) with the
following exceptions:
- Soft tabs are to be used instead of hard tabs
@@ -96,8 +105,8 @@ real e-mail address.
#### Commit messages
-A commit message should follow these 10 simple rules (adjusted from
-http://chris.beams.io/posts/git-commit/):
+A commit message should follow these 10 simple rules, based on
+http://chris.beams.io/posts/git-commit/:
1. Separate subject from body with a blank line
2. Limit the subject line to 50 characters
diff --git a/content/en/docs/Extra/ioq3.md b/content/en/docs/Extra/ioq3.md
index db38d83..05a4626 100644
--- a/content/en/docs/Extra/ioq3.md
+++ b/content/en/docs/Extra/ioq3.md
@@ -41,8 +41,9 @@ With Ouroboros installed, build the ioq3 project in standalone mode:
$ STANDALONE=1 make
```
-You may need to install some dependencies like SDL2, see the [ioq3
-documentation](http://wiki.ioquake3.org/Building_ioquake3).
+You may need to install some dependencies like
+[SDL2](https://wiki.libsdl.org/SDL2/Installation); see the
+[ioq3 documentation](https://ioquake3.org/help/building-ioquake3/building-ioquake3-on-linux/).
The ioq3 project only supplies the game engine. To play Quake III Arena,
you need the original game files and a valid key. Various open source
@@ -66,7 +67,7 @@ $ unzip -j openarena-0.8.8.zip 'openarena-0.8.8/baseoa/*' -d ./baseoa
```
Make sure you have a local Ouroboros layer running in your system (see
-[this tutorial](/tutorial-1/)).
+[this tutorial](/docs/tutorials/tutorial-1/)).
To test the game, start a server (replace <arch> with the correct
architecture extension for your machine, eg x86_64):
diff --git a/content/en/docs/Extra/rumba.md b/content/en/docs/Extra/rumba.md
deleted file mode 100644
index 5023f8e..0000000
--- a/content/en/docs/Extra/rumba.md
+++ /dev/null
@@ -1,13 +0,0 @@
----
-title: "Rumba"
-author: "Dimitri Staessens"
-date: 2019-10-06
-draft: false
-description: >
- Small orchestration framework for deploying recursive networks.
----
-
-Rumba is an __experimentation framework__ for deploying recursive
-network experiments in various network testbeds. It was developed as
-part of the [ARCFIRE](http://ict-arcfire.eu) project, and available on
-[gitlab](https://gitlab.com/arcfire/rumba) .
diff --git a/content/en/docs/Intro/_index.md b/content/en/docs/Intro/_index.md
deleted file mode 100644
index 7ca8160..0000000
--- a/content/en/docs/Intro/_index.md
+++ /dev/null
@@ -1,67 +0,0 @@
----
-title: "Welcome to Ouroboros"
-linkTitle: "Introduction"
-author: "Dimitri Staessens"
-date: 2019-12-30
-weight: 5
-description: >
- Introduction.
----
-
-```
-Simplicity is a great virtue but it requires hard work to achieve it and
-education to appreciate it.
-And to make matters worse: complexity sells better.
- -- Edsger Dijkstra
-```
-
-This is the portal for the ouroboros networking prototype. Ouroboros
-aims to make packet networks simpler, and as a result, more reliable,
-secure and private. How? By introducing strong, well-defined
-abstractions and hiding internal complexity. A bit like modern
-programming languages abstract away details such as pointers.
-
-The main driver behind the ouroboros prototype is a good ol' personal
-itch. I've started my academic research career on optical networking,
-and moved up the stack towards software defined networks, learning the
-fine details of Ethernet, IP, TCP and what not. But when I came into
-contact with John Day and his Recursive InterNetwork Architecture
-(RINA), it really struck home how unnecessarily complicated today's
-networks are. The core abstractions that RINA moved towards simplify
-things a lot. I was fortunate to have a PhD student that understood
-the implications of these abstractions, and together we just went on
-and digged deeper into the question of how we could make everything as
-simple as possible. When something didn't fall into place or felt
-awkward, we trace back to why it didn't fit, instead of plough forward
-and make it fit. Ouroboros is the current state of affairs in this
-quest.
-
-We often get the question "How is this better than IP"? To which the
-only sensible answer that we can give right now is that ouroboros is
-way more elegant. It has far fewer abstractions and every concept is
-well-defined. It's funny (or maybe not) how many times when we start
-explaining Ouroboros to someone, people immediately interrupt and
-start explaining how they can do this or that with IP. We know,
-they're right, but it's also completely besides our point.
-
-But, if you're open to the idea that the TCP/IP network stack is a
-huge gummed-up mess that's in need for some serious redesign, do read
-on. If you are interested in computer networks in general, if you are
-eager to learn something new and exciting without the need to deploy
-it tomorrow, and if you are willing to put in the time and effort to
-understand how all of this works, by all means: ask away!
-
-We're very open to constructive suggestions on how to further improve
-the prototype and the documentation, in particular this website. We
-know it's hard to understand in places. No matter how simple we made
-the architecture, it's still a lot to explain, and writing efficient
-documentation is a tough trade. So don't hesitate to contact us with
-any questions you may have.
-
-Above all, stay curious!
-
-```
-... for the challenge of simplification is so fascinating that, if
-we do our job properly, we shall have the greatest fun in the world.
- -- Edsger Dijkstra
-``` \ No newline at end of file
diff --git a/content/en/docs/Overview/_index.md b/content/en/docs/Overview/_index.md
index 9fd9970..06f5400 100644
--- a/content/en/docs/Overview/_index.md
+++ b/content/en/docs/Overview/_index.md
@@ -9,62 +9,93 @@ description: >
Ouroboros is a prototype **distributed system** for packetized network
communications. It is a redesign _ab initio_ of the current packet
-networking model -- from the programming API ("Layer 7") almost to the
-_wire_ ("Layer 1") -- without compromises. This means it's not
-directly compatible with anything currently available. It can't simply
-be "plugged into" the current network stack. Instead it has some
-interfaces into inter-operate with common technologies: run Ouroboros
-over Ethernet or UDP, or create tunnels over Ouroboros using tap or
-tun devices.
-
-From an application perspective, Ouroboros network operates as a "black
-box" with a
-[very simple interface](https://ouroboros.rocks/man/man3/flow_alloc.3.html).
-Either it provides a _flow_, a bidirectional channel that delivers data
-within some requested operational parameters such as delay and
+networking model -- from the programming API almost to the wire --
+without compromises. While the prototype is not directly compatible
+with IP or sockets, it has some interfaces to be interoperable with
+common technologies: we run Ouroboros over Ethernet or UDP, or create
+IP/Ethernet tunnels over Ouroboros by exposing tap or tun devices.
+
+From an application perspective, an Ouroboros network is a "black box"
+with a
+[simple interface](https://ouroboros.rocks/man/man3/flow_alloc.3.html).
+Either Ouroboros provides a _flow_, a bidirectional channel that delivers
+data within some requested operational parameters such as delay and
bandwidth and reliability and security; or it provides a broadcast
-channel.
+channel to a set of joined programs.
From an administrative perspective, an Ouroboros network is a bunch of
_daemons_ that can be thought of as **software routers** (unicast) or
**software _hubs_** (broadcast) that can be connected to each other;
again through
[a simple API](https://ouroboros.rocks/man/man8/ouroboros.8.html).
-Each daemon has an address, and they forward packets among each other.
-The daemons also implement their own internal name-to-address resolution.
-Some of the main _features_ are:
+Some of the main characteristics are:
+
+* Ouroboros is <b>minimalistic</b>: it has only the essential protocol
+  fields. It will also try to use the lowest possible network layer
+  (e.g. on a single machine, Ouroboros communicates directly over
+  shared memory; over a LAN it communicates over Ethernet; over IP it
+  communicates over UDP), in a way that is completely transparent to
+  the application.
+
+* Ouroboros enforces the _end-to-end_ principle. Packet headers are
+ <b>immutable</b> between the state machines that operate on their
+ state. Only two protocol fields change on a hop-by-hop (as viewed
+ within a network layer) basis: [TTL and
+ ECN](/docs/concepts/protocols/). This immutability can be enforced
+ through authentication (not yet implemented).
-* Ouroboros is minimal: it only sends what it needs to send to operate.
+* Ouroboros has _external_ and _dynamic_ server application
+  binding. Socket applications leave it to the application developer
+  to manage binding from within the program (typically a bind() call
+  to either a specific IP address or to all addresses, 0.0.0.0),
+  leaving all configuration application- (or library-) specific. When
+  shopping for network libraries, typical questions are "Can it bind
+  to multiple IP addresses for high availability?" and "Can I run
+  multiple servers in parallel on the same port for scaling?".
+  Ouroboros makes all this management external to the program: server
+  applications only need to call flow_accept(). The _bind()_ primitive
+  allows a program (or running process) to be bound from the command
+  line to a certain (set of) service names, and when a flow request
+  arrives for that service, Ouroboros acts as a broker that hands off
+  the flow to any program that is bound to that service. Binding is
+  N-to-M: multiple programs can be bound to the same service name, and
+  programs can be bound to multiple names. This binding is also
+  _dynamic_: it can be done while the program is running, and will not
+  disrupt existing flows. In addition, the _register()_ primitive
+  allows external and dynamic control over which network a service
+  name is available over -- again, while the service is running, and
+  without disrupting existing flows (see the command sketch below
+  this list).
-* Ouroboros adheres to the _end-to-end_ principle. Packet headers are
- immutable between the program components (state machines) that
- operate on their state. Only two protocol fields change on a
- hop-by-hop (as viewed within a network layer) basis:
- [TTL and ECN](/docs/concepts/protocols/).
+* The Ouroboros end-to-end protocol performs flow control, error
+ control and reliable transfer and is implemented as part of the
+ _application library_. This includes sequence numbering, ordering,
+ sending and handling acknowledgments, managing flow control windows,
+ ...
* Ouroboros can establish an encrypted flow in a _single RTT_ (not
including name-to-address resolution). The flow allocation API is a
2-way handshake (request-response) that agrees on endpoint IDs and
- performs an ECDHE key exchange. The end-to-end protocol
+ performs an ECDHE key exchange. The end-to-end protocol is based on
+ Delta-t and
[doesn't need a handshake](/docs/concepts/protocols/#operation-of-frcp).
-* The Ouroboros end-to-end protocol performs flow control, error
- control and reliable transfer and is implemented as part of the
- _application library_. Sequence numbers, acknowledgments, flow
- control windows... The last thing the application does (or should
- do) is encrypt everything before it hands it to the network layer
- for delivery. With this functionality in the library, it's easy to
- force encryption on _every_ flow that is created from your machine
- over Ouroboros regardless of what the application programmer has
- requested. Unlike TLS, the end-to-end header (sequence number etc)
- is fully encrypted.
+* Ouroboros allows encrypting everything before handing it to the next
+ layer for delivery. With this functionality in the library, it's
+ easy to force encryption on _every_ flow that is created from your
+ machine over Ouroboros regardless of what the application programmer
+ has implemented. Unlike TLS, the end-to-end header (sequence number
+ etc) can be fully encrypted.
+
+* Ouroboros congestion control operates at the network level. It does
+  not (_cannot!_) rely on acknowledgements. This means all network
+ flows are automatically congestion controlled.
* The flow allocation API works as an interface to the network. An
Ouroboros network layer is therefore "aware" of all traffic that it
- is offered. This allows the layer to shape and police traffic, but
- only based on quantity and QoS, not on the contents of the packets,
- to ensure _net neutrality_.
+  is offered. This allows the layer to shape and police traffic, but
+  only based on quantity and QoS, not on the contents of the packets,
+  to ensure _net neutrality_.
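+
+As a rough sketch of how external binding and registration look
+operationally, these are assumed long forms of the abbreviated irm
+commands shown in the release notes (the service name here is a
+made-up example):
+
+```bash
+# make the service name reachable over the layer named "udp"
+irm name register my.example.service layer udp
+# bind the oping binary to that name; 'auto' tells the IRMd to start
+# the program on demand, passing it the arguments after '--'
+irm bind prog oping name my.example.service auto -- -l
+```
+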
For a lot more depth, our article on the design of Ouroboros is
accessible on [arXiv](https://arxiv.org/pdf/2001.09707.pdf).
@@ -72,9 +103,16 @@ accessible on [arXiv](https://arxiv.org/pdf/2001.09707.pdf).
The best place to start understanding a bit what Ouroboros aims to do
and how it differs from other packet networks is to first watch this
presentation at [FOSDEM
-2018](https://archive.fosdem.org/2018/schedule/event/ipc/) (it's over
-two years old, so not entirely up-to-date anymore), and have a quick
-read of the [flow allocation](/docs/concepts/fa/) and [data
-path](/docs/concepts/datapath/) sections.
+2018](https://archive.fosdem.org/2018/schedule/event/ipc/) but note
+that this presentation is over three years old, and very outdated in
+terms of what has been implemented. The prototype is now capable of
+asynchronous flow handling, retransmission, flow control, congestion
+control...
+
+Next, have a quick read of the
+[flow allocation](/docs/concepts/fa/)
+and
+[data path](/docs/concepts/datapath/)
+sections.
{{< youtube 6fH23l45984 >}}
diff --git a/content/en/docs/Releases/0_18.md b/content/en/docs/Releases/0_18.md
new file mode 100644
index 0000000..c489d33
--- /dev/null
+++ b/content/en/docs/Releases/0_18.md
@@ -0,0 +1,109 @@
+---
+date: 2021-02-12
+title: "Ouroboros 0.18"
+linkTitle: "Ouroboros 0.18"
+description: "Major additions and changes in 0.18.0"
+author: Dimitri Staessens
+---
+
+With version 0.18 come a number of interesting updates to the prototype.
+
+### Automatic Repeat-Request (ARQ) and flow control
+
+We finished the implementation of the base retransmission
+logic. Ouroboros will now send, receive and handle acknowledgments
+under packet loss conditions. It will also send and handle window
+updates for flow control. The operation of flow control is very
+similar to the operation of window-based flow control in TCP, the main
+difference being that our sequence numbers are per-packet instead of
+per-byte.
+
+The previous version of FRCP had a partial implementation of the
+ARQ functionality, such as piggybacking ACK information on _writes_
+and handling sequence numbers on _reads_. But now, Ouroboros will also
+send (delayed) ACK packets without data if the application is not
+sending, and will finish sending unacknowledged data when a flow is
+closed (this can be turned off with the FRCTFLINGER flag).
+
+Recall that Ouroboros has this logic implemented in the application
+library; it's not a separate component (or kernel) that is managing
+transmit and receive buffers and retransmission. Furthermore, our
+implementation doesn't add a thread to the application. If a
+single-threaded application uses ARQ, it will remain single-threaded.
+
+It's not unlikely that in the future we will add the option for the
+library to start a dedicated thread to manage ARQ as this may have
+some beneficial characteristics for read/write call durations. Other
+future additions may include fast-retransmit and selective ACK
+support.
+
+The most important characteristic of Ouroboros FRCP compared to TCP
+and derivative protocols (QUIC, SCTP, ...) is that it is 100%
+independent of congestion control, which allows it to operate at
+real RTT timescales (e.g. microseconds in datacenters) without fear of
+RTT underestimates severely capping throughput. Another characteristic
+is that the RTT estimate is really measuring the responsiveness of the
+application, not the kernel on the machine.
+
+A detailed description of the operation of ARQ can be found
+in the [protocols](/docs/concepts/protocols/#operation-of-frcp)
+section.
+
+### Congestion Avoidance
+
+The next big addition is congestion avoidance. The unicast layer's
+default configuration will now congestion-control all client traffic
+sent over it[^1]. As noted above, congestion avoidance in
+Ouroboros is completely independent of the operation of ARQ and flow
+control. For more information about how this all works, have a look at
+the developer blog
+[here](/blog/2020/12/12/congestion-avoidance-in-ouroboros/) and
+[here](/blog/2020/12/19/exploring-ouroboros-with-wireshark/).
+
+### Revision of the flow allocator
+
+We also made a change to the flow allocator: the Endpoint IDs now use
+64-bit identifiers. The reason for this change is to make these
+endpoint identifiers harder to guess. In TCP, applications can listen
+on sockets that are bound to a port on a (set of) IP addresses. You
+can't imagine how many hosts are trying to brute-force SSH login
+passwords on TCP port 22. To make this at least a bit harder,
+Ouroboros has no well-known application ports, and after this patch
+they are roughly equivalent to a 32-bit random
+number. Note that in an ideal Ouroboros deployment, sensitive
+applications such as SSH login should run on a different layer/network
+than publicly available applications.
+
+### Revision of the ipcpd-udp
+
+The ipcpd-udp has gone through some revisions during its lifetime. In
+the beginning, we wanted to emulate the operation of an Ouroboros
+layer, having the flow allocator listen on a certain UDP port, and
+mapping endpoint identifiers to random ephemeral UDP ports. So, as an
+example, the source would create a UDP socket, e.g. on port 30927,
+and send a request for a new flow to the fixed, known Ouroboros UDP
+port (3531) at the receiver. The receiver would also create a socket
+on an ephemeral UDP port, say 23705, and send a response back to the
+source on UDP port 3531. Traffic for the "client" flow would then be
+on the UDP port pair (30927, 23705). This was giving a bunch of
+headaches with computers behind NAT firewalls, rendering that scheme
+only useful in lab environments. To make it more usable, the next
+revision used a single fixed incoming UDP port for the flow allocator
+protocol, using an ephemeral UDP port from the sender side per flow,
+and added the flow allocator endpoints as a "next header" inside
+UDP. So traffic would always be sent to destination UDP port 3531. The
+benefit was that only a single port was needed in the NAT forwarding
+rules, and that anyone running Ouroboros would be able to receive
+allocation messages, somewhat pushing all users towards participating
+in a mesh topology. However, opening a certain UDP port is still a
+hassle, so in this (most likely final) revision, we just run the flow
+allocator in the ipcpd-udp as a UDP server on a (configurable)
+port. No more NAT firewall configuration is required if you want to
+connect (but if you want to accept connections, opening UDP port 3531
+is still required).
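+
+For example, if your machine happens to run ufw, opening that port
+could look like this (an assumed example for one particular firewall;
+adapt it to your own setup):
+
+```bash
+# allow inbound UDP on the flow allocator port
+sudo ufw allow 3531/udp
+```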
+
+The full changelog can be browsed in
+[cgit](/cgit/ouroboros/log/?showmsg=1).
+
+[^1]: This is not a claim that every packet inside a layer is
+      congestion-controlled: internal management traffic to the layer (flow
+ allocator protocol, etc) is not congestion-controlled. \ No newline at end of file
diff --git a/content/en/docs/Releases/0_20.md b/content/en/docs/Releases/0_20.md
new file mode 100644
index 0000000..7f2ff9a
--- /dev/null
+++ b/content/en/docs/Releases/0_20.md
@@ -0,0 +1,70 @@
+---
+date: 2023-09-21
+title: "Ouroboros 0.20"
+linkTitle: "Ouroboros 0.20"
+description: "Major additions and changes in 0.20.0"
+author: Dimitri Staessens
+---
+
+Version 0.20 brings some code refactoring and a slew of bugfixes to
+the prototype to improve stability, but the main quality-of-life
+addition is config file support in TOML format. This removes the need
+for bash scripts to configure the prototype on reboots/restarts; a
+very basic feature that was long overdue.
+
+As an example, before v0.20, this server had Ouroboros running as a
+systemd service, and it was configured using the following irm commands:
+
+```bash
+# bootstrap an ipcpd-udp on the server's public IP address
+irm i b t udp n udp l udp ip 51.38.114.133
+# register the name ouroboros.rocks.oping with that layer
+irm n r ouroboros.rocks.oping l udp
+# bind /usr/bin/oping to that name; 'auto' starts it on demand
+irm b prog oping n ouroboros.rocks.oping auto -- -l
+```
+
+These bootstrap a UDP layer to the server's public IP address,
+register the name "ouroboros.rocks.oping" with that layer and bind the
+program binary /usr/bin/oping to that name, telling the irmd to start
+that server automatically if it wasn't running before.
+
+While pretty simple to perform, if the service was restarted or the
+server was rebooted, we needed to re-run these commands (we could have
+added them to some system startup script, of course).
+
+Now the IRMd will load the config file at
+/etc/ouroboros/irmd.conf. The IRMd configuration to achieve the above
+is (I renamed the UDP layer to "Internet", but that name doesn't
+really matter if there is only one ipcpd-udp in the system):
+```bash
+root@vps646159:~# cat /etc/ouroboros/irmd.conf
+### Ouroboros configuration file
+[name."ouroboros.rocks.oping"]
+prog=["/usr/bin/oping"]
+args=["-l"]
+
+[udp.internet]
+bootstrap="Internet"
+ip="51.38.114.133"
+reg=["ouroboros.rocks.oping"]
+```
+
+To enable config file support, tomlc99 is needed. Install via
+
+```bash
+git clone https://github.com/cktan/tomlc99
+cd tomlc99
+make
+sudo make install
+```
+
+and then reconfigure cmake and build Ouroboros as usual.
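+
+For example, assuming a build directory created as described in the
+Getting Started section:
+
+```bash
+$ cd build
+$ cmake ..
+$ sudo make install
+```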
+
+More information on how to use config files is in the example
+configuration file, installed in /etc/ouroboros/irmd.conf.example, or
+you can have a quick look in the
+[repository](/cgit/ouroboros/tree/irmd.conf.in).
+
+The full git changelog can be browsed in
+[cgit](/cgit/ouroboros/log/?showmsg=1).
+
+
+
diff --git a/content/en/docs/Releases/_index.md b/content/en/docs/Releases/_index.md
new file mode 100644
index 0000000..8328c33
--- /dev/null
+++ b/content/en/docs/Releases/_index.md
@@ -0,0 +1,6 @@
+---
+title: "Releases"
+linkTitle: "Release notes"
+weight: 120
+---
diff --git a/content/en/docs/Start/_index.md b/content/en/docs/Start/_index.md
index 963b9f1..735511b 100644
--- a/content/en/docs/Start/_index.md
+++ b/content/en/docs/Start/_index.md
@@ -1,7 +1,225 @@
---
title: "Getting Started"
-linkTitle: "Getting Started"
+linkTitle: "Getting Started/Installation"
weight: 20
description: >
How to get up and running with the Ouroboros prototype.
---
+
+### Get Ouroboros
+
+**Packages:**
+
+For ArchLinux users, the easiest way to try Ouroboros is via the [Arch
+User Repository](https://aur.archlinux.org/packages/ouroboros-git/),
+which will also install all dependencies.
+
+**Source:**
+
+You can clone the [repository](/cgit/ouroboros) over https or
+git:
+
+```bash
+$ git clone https://ouroboros.rocks/git/ouroboros
+$ git clone git://ouroboros.rocks/ouroboros
+```
+
+Or download a [snapshot](/cgit/ouroboros/) tarball and extract it.
+
+### System requirements
+
+Ouroboros builds on most POSIX compliant systems. Below you will find
+instructions for GNU/Linux, FreeBSD and OS X. On Windows 10, you can
+build Ouroboros using the [Windows Subsystem for
+Linux](https://docs.microsoft.com/en-us/windows/wsl/install-win10).
+
+You need [*git*](https://git-scm.com/) to clone the
+repository. To build Ouroboros, you need [*cmake*](https://cmake.org/)
+and [*google protocol buffers*](https://github.com/protobuf-c/protobuf-c)
+installed, in addition to a C compiler ([*gcc*](https://gcc.gnu.org/) or
+[*clang*](https://clang.llvm.org/)) and
+[*make*](https://www.gnu.org/software/make/).
+
+Optionally, you can also install
+[*libgcrypt*](https://gnupg.org/software/libgcrypt/index.html),
+[*libssl*](https://www.openssl.org/),
+[*fuse*](https://github.com/libfuse), and *dnsutils*.
+
+On GNU/Linux you will need either libgcrypt (≥ 1.7.0) or libssl if your
+[*glibc*](https://www.gnu.org/software/libc/) is older than version
+2.25.
+
+On OS X, you will need [homebrew](https://brew.sh/).
+[Disable System Integrity Protection](https://developer.apple.com/library/content/documentation/Security/Conceptual/System_Integrity_Protection_Guide/ConfiguringSystemIntegrityProtection/ConfiguringSystemIntegrityProtection.html)
+during the
+[installation](#install)
+and/or
+[removal](#remove)
+of Ouroboros.
+
+### Install the dependencies
+
+**Debian/Ubuntu Linux:**
+
+```bash
+$ apt-get install git protobuf-c-compiler cmake
+$ apt-get install libgcrypt20-dev libssl-dev libfuse-dev dnsutils cmake-curses-gui
+```
+
+If during the build process cmake complains that the Protobuf C
+compiler is required but not found, and you installed the
+protobuf-c-compiler package, you will also need this:
+
+```bash
+$ apt-get install libprotobuf-c-dev
+```
+
+**Arch Linux:**
+
+```bash
+$ pacman -S git protobuf-c cmake
+$ pacman -S libgcrypt openssl fuse dnsutils
+```
+
+**FreeBSD 11:**
+
+```bash
+$ pkg install git protobuf-c cmake
+$ pkg install libgcrypt openssl fusefs-libs bind-tools
+```
+
+**Mac OS X Sierra / High Sierra:**
+
+```bash
+$ brew install git protobuf-c cmake
+$ brew install libgcrypt openssl
+```
+
+### Install Ouroboros
+
+When installing from source, go to the cloned git repository or
+extract the tarball and enter the main directory. We recommend
+creating a build directory inside this directory:
+
+```bash
+$ mkdir build && cd build
+```
+
+Run cmake providing the path to where you cloned the Ouroboros
+repository. Assuming you created the build directory inside the
+repository directory, do:
+
+```bash
+$ cmake ..
+```
+
+Build and install Ouroboros:
+
+```bash
+$ sudo make install
+```
+
+### Advanced options
+
+Ouroboros can be configured by providing parameters to the cmake
+command:
+
+```bash
+$ cmake -D<option>=<value> ..
+```
+
+Alternatively, after running cmake and before installation, run
+[ccmake](https://cmake.org/cmake/help/latest/manual/ccmake.1.html) to
+configure Ouroboros:
+
+```bash
+$ ccmake .
+```
+
+A list of all build options can be found [here](/docs/reference/compopt).
+
+### Remove Ouroboros
+
+To uninstall Ouroboros, simply execute the following command from your
+build directory:
+
+```bash
+$ sudo make uninstall
+```
+
+To check if everything is installed correctly, you can now jump into
+the [Tutorials](../../tutorials/) section, or you can try to ping this
+webhost over ouroboros using the name _ouroboros.rocks.oping_.
+
+Our webserver is of course on an IP network, and ouroboros does not
+control IP, but it can run over UDP/IP.
+
+To be able to contact our server over ouroboros, you will need to do
+some small DNS configuration, telling the ouroboros UDP system that
+the process "ouroboros.rocks.oping" is running on our webserver by
+adding the line
+
+```
+51.38.114.133 1bf2cb4fb361f67a59907ef7d2dc5290
+```
+
+to your ```/etc/hosts``` file[^1][^2].
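+
+The hash in that line is simply the MD5 digest of the service name
+(see footnote 1); assuming coreutils' md5sum, you can reproduce it
+with:
+
+```bash
+$ echo -n "ouroboros.rocks.oping" | md5sum
+1bf2cb4fb361f67a59907ef7d2dc5290  -
+```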
+
+Here are the steps to ping our server over ouroboros:
+
+Run the IRMd:
+
+```bash
+$ sudo irmd &
+```
+Then you will need to find your (private) IP address and start an ouroboros UDP
+daemon (ipcpd-udp) on that interface:
+```bash
+$ irm ipcp bootstrap type udp name udp layer udp ip <your local ip address>
+```
+
+Now you can ping our server:
+
+```bash
+$ oping -n ouroboros.rocks.oping
+```
+
+The output from the IRM daemon should look something like this (in DEBUG mode):
+```
+[dstaesse@heteropoda build]$ sudo irmd --stdout
+==01749== irmd(II): Ouroboros IPC Resource Manager daemon started...
+==01749== irmd(II): Created IPCP 1781.
+==01781== ipcpd/udp(DB): Bootstrapped IPCP over UDP with pid 1781.
+==01781== ipcpd/udp(DB): Bound to IP address 192.168.66.233.
+==01781== ipcpd/udp(DB): Using port 3435.
+==01781== ipcpd/udp(DB): DNS server address is not set.
+==01781== ipcpd/ipcp(DB): Locked thread 140321690191424 to CPU 7/8.
+==01749== irmd(II): Bootstrapped IPCP 1781 in layer udp.
+==01781== ipcpd/ipcp(DB): Locked thread 140321681798720 to CPU 6/8.
+==01781== ipcpd/ipcp(DB): Locked thread 140321673406016 to CPU 1/8.
+==01781== ipcpd/udp(DB): Allocating flow to 1bf2cb4f.
+==01781== ipcpd/udp(DB): Destination UDP ipcp resolved at 51.38.114.133.
+==01781== ipcpd/udp(DB): Flow to 51.38.114.133 pending on fd 64.
+==01749== irmd(II): Flow on flow_id 0 allocated.
+==01781== ipcpd/udp(DB): Flow allocation completed on eids (64, 64).
+==01749== irmd(DB): Partial deallocation of flow_id 0 by process 1800.
+==01749== irmd(II): Completed deallocation of flow_id 0 by process 1781.
+==01781== ipcpd/udp(DB): Flow with fd 64 deallocated.
+==01749== irmd(DB): Dead process removed: 1800.
+```
+
+If connecting to _ouroboros.rocks.oping_ failed, you are probably
+behind a NAT firewall that is actively blocking outbound UDP port
+3435.
+
+[^1]: This is the IP address of our server and the MD5 hash of the
+ string _ouroboros.rocks.oping_. To check if this is configured
+ correctly, you should be able to ping the server with ```ping
+ 1bf2cb4fb361f67a59907ef7d2dc5290``` from the command line.
+
+[^2]: The ipcpd-udp allows setting up a (private) DDNS server and
+ using the Ouroboros ```irm name``` API to populate it, instead
+ of requiring each node to manually edit the ```/etc/hosts```
+ file. While we technically could also set up such a DNS on our
+ server for demo purposes, it is just too likely that it would be
+ abused. The Internet is a nasty place.
diff --git a/content/en/docs/Start/check.md b/content/en/docs/Start/check.md
deleted file mode 100644
index 69c5bef..0000000
--- a/content/en/docs/Start/check.md
+++ /dev/null
@@ -1,49 +0,0 @@
----
-title: "Check installation"
-date: 2019-12-30
-weight: 40
-description: >
- Check if ouroboros works.
-draft: false
----
-
-To check if everything is installed correctly, you can now jump into
-the [Tutorials](../../tutorials/) section, or you can try to ping this
-webhost over ouroboros using the name _ouroboros.rocks.oping_
-
-Our webserver is of course on an IP network, and ouroboros does not
-control IP, but it can run over UDP.
-
-To be able to contact our server over ouroboros, you will need to do
-some IP configuration: to tell the ouroboros UDP system that the
-process "ouroboros.rocks.oping" is running on our webserver by adding
-the line
-
-```
-51.38.114.133 1bf2cb4fb361f67a59907ef7d2dc5290
-```
-
-to your /etc/hosts file (it's the IP address of our server and the MD5
-hash of _ouroboros.rocks.oping_).
-
-You will also need to forward UDP port 3435 on your NAT firewall if
-you are behind a NAT. Else this will not work.
-
-Here are the steps to ping our server over ouroboros:
-
-Run the IRMd:
-
-```bash
-$ sudo irmd &
-```
-Then you will need find your (private) IP address and start an ouroboros UDP
-daemon (ipcpd-udp) on that interface:
-```bash
-$ irm ipcp bootstrap type udp name udp layer udp ip <your local ip address>
-```
-
-Now you should be able to ping our server!
-
-```bash
-$ oping -n ouroboros.rocks.oping
-```
diff --git a/content/en/docs/Start/download.md b/content/en/docs/Start/download.md
deleted file mode 100644
index 0429ea1..0000000
--- a/content/en/docs/Start/download.md
+++ /dev/null
@@ -1,28 +0,0 @@
----
-title: "Download"
-date: 2019-06-22
-weight: 10
-description: >
- How to get ouroboros.
-draft: false
----
-
-### Get Ouroboros
-
-**Packages:**
-
-For ArchLinux users, the easiest way to try Ouroboros is via the [Arch
-User Repository](https://aur.archlinux.org/packages/ouroboros-git/),
-which will also install all dependencies.
-
-**Source:**
-
-You can clone the [repository](/cgit/ouroboros) over https or
-git:
-
-```bash
-$ git clone https://ouroboros.rocks/git/ouroboros
-$ git clone git://ouroboros.rocks/ouroboros
-```
-
-Or download a [snapshot](/cgit/ouroboros/) tarball and extract it. \ No newline at end of file
diff --git a/content/en/docs/Start/install.md b/content/en/docs/Start/install.md
deleted file mode 100644
index ea4a3f7..0000000
--- a/content/en/docs/Start/install.md
+++ /dev/null
@@ -1,57 +0,0 @@
----
-title: "Install from source"
-author: "Dimitri Staessens"
-date: 2019-07-23
-weight: 30
-draft: false
-description: >
- Installation instructions.
----
-
-We recommend creating a build directory:
-
-```bash
-$ mkdir build && cd build
-```
-
-Run cmake providing the path to where you cloned the Ouroboros
-repository. Assuming you created the build directory inside the
-repository directory, do:
-
-```bash
-$ cmake ..
-```
-
-Build and install Ouroboros:
-
-```bash
-$ sudo make install
-```
-
-### Advanced options
-
-Ouroboros can be configured by providing parameters to the cmake
-command:
-
-```bash
-$ cmake -D<option>=<value> ..
-```
-
-Alternatively, after running cmake and before installation, run
-[ccmake](https://cmake.org/cmake/help/latest/manual/ccmake.1.html) to
-configure Ouroboros:
-
-```bash
-$ ccmake .
-```
-
-A list of all options can be found [here](/docs/reference/compopt).
-
-### Remove Ouroboros
-
-To uninstall Ouroboros, simply execute the following command from your
-build directory:
-
-```bash
-$ sudo make uninstall
-``` \ No newline at end of file
diff --git a/content/en/docs/Start/requirements.md b/content/en/docs/Start/requirements.md
deleted file mode 100644
index 7615b44..0000000
--- a/content/en/docs/Start/requirements.md
+++ /dev/null
@@ -1,76 +0,0 @@
----
-title: "Requirements"
-author: "Dimitri Staessens"
-date: 2019-07-23
-weight: 10
-draft: false
-description: >
- System requirements and software dependencies.
----
-
-### System requirements
-
-Ouroboros builds on most POSIX compliant systems. Below you will find
-instructions for GNU/Linux, FreeBSD and OS X. On Windows 10, you can
-build Ouroboros using the [Windows Subsystem for
-Linux](https://docs.microsoft.com/en-us/windows/wsl/install-win10).
-
-You need [*git*](https://git-scm.com/) to clone the
-repository. To build Ouroboros, you need [*cmake*](https://cmake.org/) and
-[*google protocol buffers*](https://github.com/protobuf-c/protobuf-c)
-installed, in addition to a C compiler ([*gcc*](https://gcc.gnu.org/) or
-[*clang*](https://clang.llvm.org/)) and
-[*make*](https://www.gnu.org/software/make/).
-
-Optionally, you can also install
-[*libgcrypt*](https://gnupg.org/software/libgcrypt/index.html),
-[*libssl*](https://www.openssl.org/),
-[*fuse*](https://github.com/libfuse), and *dnsutils*.
-
-On GNU/Linux you will need either libgcrypt (≥ 1.7.0) or libssl if your
-[*glibc*](https://www.gnu.org/software/libc/) is older than version
-2.25.
-
-On OS X, you will need [homebrew](https://brew.sh/). [Disable System
-Integrity
-Protection](https://developer.apple.com/library/content/documentation/Security/Conceptual/System_Integrity_Protection_Guide/ConfiguringSystemIntegrityProtection/ConfiguringSystemIntegrityProtection.html)
-during the [installation](#install) and/or [removal](#remove) of
-Ouroboros.
-
-### Install the dependencies
-
-**Debian/Ubuntu Linux:**
-
-```bash
-$ apt-get install git protobuf-c-compiler cmake
-$ apt-get install libgcrypt20-dev libssl-dev libfuse-dev dnsutils cmake-curses-gui
-```
-
-If during the build process cmake complains that the Protobuf C
-compiler is required but not found, and you installed the
-protobuf-c-compiler package, you will also need this:
-
-```bash
-$ apt-get install libprotobuf-c-dev
-```
-
-**Arch Linux:**
-
-```bash
-$ pacman -S git protobuf-c cmake
-$ pacman -S libgcrypt openssl fuse dnsutils
-```
-
-**FreeBSD 11:**
-
-```bash
-$ pkg install git protobuf-c cmake
-$ pkg install libgcrypt openssl fusefs-libs bind-tools
-```
-
-**Mac OS X Sierra / High Sierra:**
-
-```bash
-$ brew install git protobuf-c cmake
-$ brew install libgcrypt openssl
-``` \ No newline at end of file
diff --git a/content/en/docs/Tools/_index.md b/content/en/docs/Tools/_index.md
new file mode 100644
index 0000000..578c47f
--- /dev/null
+++ b/content/en/docs/Tools/_index.md
@@ -0,0 +1,7 @@
+---
+title: "Tools"
+linkTitle: "Tools"
+weight: 35
description: >
+ Ouroboros tools and software.
+---
diff --git a/content/en/docs/Tools/grafana-frcp-constants.png b/content/en/docs/Tools/grafana-frcp-constants.png
new file mode 100644
index 0000000..19470bd
--- /dev/null
+++ b/content/en/docs/Tools/grafana-frcp-constants.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-frcp-window.png b/content/en/docs/Tools/grafana-frcp-window.png
new file mode 100644
index 0000000..5e43985
--- /dev/null
+++ b/content/en/docs/Tools/grafana-frcp-window.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-frcp.png b/content/en/docs/Tools/grafana-frcp.png
new file mode 100644
index 0000000..9b428af
--- /dev/null
+++ b/content/en/docs/Tools/grafana-frcp.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-ipcp-dt-dht.png b/content/en/docs/Tools/grafana-ipcp-dt-dht.png
new file mode 100644
index 0000000..cb6f1a9
--- /dev/null
+++ b/content/en/docs/Tools/grafana-ipcp-dt-dht.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-ipcp-dt-fa.png b/content/en/docs/Tools/grafana-ipcp-dt-fa.png
new file mode 100644
index 0000000..e7b0a93
--- /dev/null
+++ b/content/en/docs/Tools/grafana-ipcp-dt-fa.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-ipcp-np1-cc.png b/content/en/docs/Tools/grafana-ipcp-np1-cc.png
new file mode 100644
index 0000000..d1c0016
--- /dev/null
+++ b/content/en/docs/Tools/grafana-ipcp-np1-cc.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-ipcp-np1-fu.png b/content/en/docs/Tools/grafana-ipcp-np1-fu.png
new file mode 100644
index 0000000..b325438
--- /dev/null
+++ b/content/en/docs/Tools/grafana-ipcp-np1-fu.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-ipcp-np1.png b/content/en/docs/Tools/grafana-ipcp-np1.png
new file mode 100644
index 0000000..2fdf20b
--- /dev/null
+++ b/content/en/docs/Tools/grafana-ipcp-np1.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-lsdb.png b/content/en/docs/Tools/grafana-lsdb.png
new file mode 100644
index 0000000..fadd185
--- /dev/null
+++ b/content/en/docs/Tools/grafana-lsdb.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-system.png b/content/en/docs/Tools/grafana-system.png
new file mode 100644
index 0000000..a8d1f15
--- /dev/null
+++ b/content/en/docs/Tools/grafana-system.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-variables-interval.png b/content/en/docs/Tools/grafana-variables-interval.png
new file mode 100644
index 0000000..0c297be
--- /dev/null
+++ b/content/en/docs/Tools/grafana-variables-interval.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-variables-system.png b/content/en/docs/Tools/grafana-variables-system.png
new file mode 100644
index 0000000..d16e621
--- /dev/null
+++ b/content/en/docs/Tools/grafana-variables-system.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-variables-type.png b/content/en/docs/Tools/grafana-variables-type.png
new file mode 100644
index 0000000..b3f4a78
--- /dev/null
+++ b/content/en/docs/Tools/grafana-variables-type.png
Binary files differ
diff --git a/content/en/docs/Tools/grafana-variables.png b/content/en/docs/Tools/grafana-variables.png
new file mode 100644
index 0000000..26fdee6
--- /dev/null
+++ b/content/en/docs/Tools/grafana-variables.png
Binary files differ
diff --git a/content/en/docs/Tools/metrics.md b/content/en/docs/Tools/metrics.md
new file mode 100644
index 0000000..4c36533
--- /dev/null
+++ b/content/en/docs/Tools/metrics.md
@@ -0,0 +1,298 @@
+---
+title: "Metrics Exporters"
+author: "Dimitri Staessens"
+date: 2021-07-21
+draft: false
+description: >
+ Realtime monitoring using a time-series database
+---
+
+## Ouroboros metrics
+
+A collection of observability tools for exporting and visualising
+metrics collected from Ouroboros.
+
+It currently consists of one very simple exporter for InfluxDB, and
+provides additional visualisation via Grafana. More features will be
+added over time.
+
+### Requirements
+
+* Ouroboros version >= 0.18.3
+
+* [InfluxDB OSS 2.0](https://docs.influxdata.com/influxdb/v2.0/)
+
+* The Python influxdb-client, installed via
+
+```
+pip install 'influxdb-client[ciso]'
+```
+
+### Optional requirements
+
+* [Grafana](https://grafana.com/)
+
+### Setup
+
+Install and run InfluxDB, create a bucket in InfluxDB for exporting
+Ouroboros metrics, and create a token for writing to that bucket.
+Consult the [InfluxDB
+documentation](https://docs.influxdata.com/influxdb/v2.0/get-started/#set-up-influxdb)
+on how to do this.
+
+To use Grafana, install and run [Grafana open
+source](https://grafana.com/grafana/download); see the [getting
+started guide](https://grafana.com/docs/grafana/latest/?pg=graf-resources&plcmt=get-started).
+
+Go to the Grafana UI (usually http://localhost:3000) and set up
+InfluxDB as your datasource: go to Configuration -> Datasources ->
+Add datasource and select InfluxDB. Set "Flux" as the Query Language,
+and under "InfluxDB Details" set your Organization as in InfluxDB and
+copy/paste the token for the bucket into the Token field.
+
+To add the Ouroboros dashboard, select Dashboards -> Manage -> Import
+and then either upload the json file from this repository in
+```dashboards-grafana/general.json```, or copy the contents of that
+file to the "Import via panel json" textbox and click "Load".
+
+### Run the exporter
+
+Clone the repository:
+
+```
+git clone https://ouroboros.rocks/git/ouroboros-metrics
+cd ouroboros-metrics
+cd exporters-influxdb/pyExporter/
+```
+
+Edit the config.ini.example file and fill out the InfluxDB
+information (token, org). Save it as config.ini.
+
+Then run oexport.py:
+
+```
+python oexport.py
+```
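+
+For reference, here is a minimal sketch (not the actual oexport.py
+source) of the kind of write the exporter performs with the
+influxdb-client library; the URL, bucket, measurement and tag names
+below are made up for illustration:
+
+```
+from influxdb_client import InfluxDBClient, Point
+from influxdb_client.client.write_api import SYNCHRONOUS
+
+client = InfluxDBClient(url="http://localhost:8086",
+                        token="<my token>", org="Ouroboros")
+write_api = client.write_api(write_options=SYNCHRONOUS)
+
+# write one sample point, tagged with the originating system
+point = Point("ipcp").tag("system", "myhost").field("flows", 2)
+write_api.write(bucket="ouroboros", record=point)
+```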
+
+## Overview of Grafana general dashboard for Ouroboros
+
+The grafana dashboard allows you to explore various aspects of
+Ouroboros running on your local or remote systems. As the prototype
+matures, more and more metrics will become available.
+
+### Variables
+
+At the top, you can set a number of variables to restrict what is seen
+on the dashboard:
+
+{{<figure width="30%" src="/docs/tools/grafana-variables.png">}}
+
+* System allows you to specify a set of hosts/nodes/devices in the network:
+
+{{<figure width="30%" src="/docs/tools/grafana-variables-system.png">}}
+
+The list will contain all hosts that put metrics in the InfluxDB
+database in the last 5 days (unfortunately, there currently seems to
+be no option to restrict this to the selected time range).
+
+* Type allows you to select metrics for a certain IPCP type
+
+{{<figure width="30%" src="/docs/tools/grafana-variables-type.png">}}
+
+As you can see, all Ouroboros IPCP types are there, with the inclusion
+of an UNKNOWN type, which may briefly pop up when a metric is misread
+by the exporter.
+
+* Layer allows you to restrict the metrics to a certain layer
+
+* IPCP allows you to restrict metrics to a certain IPCP
+
+* Interval allows you to select a window in which metrics are aggregated.
+
+{{<figure width="30%" src="/docs/tools/grafana-variables-interval.png">}}
+
+Metrics will be aggregated from the actual exporter values (e.g. mean
+or last value) that fall in this interval. This interval should thus
+be larger than the exporter interval to ensure that each window has
+enough raw data.
+
+### Panels
+
+As you can see in the image above, the dashboard is subdivided in a
+bunch of panels, each of which focuses on some aspect of the
+prototype.
+
+#### System
+
+{{<figure width="80%" src="/docs/tools/grafana-system.png">}}
+
+The system panel shows the number of IPCPs and known IPCP flows in all
+monitored systems as a stacked series. This system is running a small
+test with 3 IPCPs (2 unicast IPCPs and a local IPCP) and a single
+flow between an oping server and client (which has one endpoint in
+each IPCP, so it shows 2 because this small test runs on a single
+host). The colors on the graphs sometimes do not match the labels,
+which is a grafana issue that I hope will get fixed soon.
+
+#### Link State Database
+
+{{<figure width="80%" src="/docs/tools/grafana-lsdb.png">}}
+
+The Link State Database panel shows the knowledge each IPCP has about
+the network routing area(s) it is in. The example has 2 IPCPs that are
+directly connected, so each knows one neighbor (the other IPCP), two
+nodes, and two links (each unidirectional arc in the topology graph is
+counted).
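+
+The link count follows directly from storing each adjacency as two
+directed arcs; a trivial sketch with hypothetical IPCP names:
+
+```
+links = [("n1.a", "n1.b")]  # one physical adjacency
+arcs = [(a, b) for a, b in links] + [(b, a) for a, b in links]
+print(len(arcs))  # -> 2
+```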
+
+#### Process N-1 flows
+
+{{<figure width="80%" src="/docs/tools/grafana-frcp.png">}}
+
+This is the first panel that deals with the [Flow-and-Retransmission
+Control
+Protocol](/docs/concepts/protocols#flow-and-retransmission-control-protocol-frcp)
+(FRCP). It shows metrics for the flows between the applications (this
+is not the same flow as the data transfer flow above, which is between
+the IPCPs). This panel shows metrics relating to retransmission. The
+first is the current retransmission timeout, i.e. the time after which
+a packet will be retransmitted. This is calculated from the smoothed
+round-trip time and its estimated deviation (well below 1ms), as
+estimated by FRCP.
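+
+The exact smoothing FRCP uses isn't shown on the dashboard; as a
+rough sketch of the general technique, here is the classic
+Jacobson/Karels estimator (the constants FRCP uses may differ):
+
+```
+def update_rto(srtt, rttvar, sample, alpha=0.125, beta=0.25):
+    # smooth the deviation first, then the round-trip time itself
+    rttvar = (1 - beta) * rttvar + beta * abs(srtt - sample)
+    srtt = (1 - alpha) * srtt + alpha * sample
+    return srtt, rttvar, srtt + 4 * rttvar  # new (srtt, rttvar, rto)
+```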
+
+The flow is created by the oping application that is pinging at a 10ms
+interval with packet retransmission enabled (so basically a service
+equivalent to running ping over TCP). The main difference with TCP is
+that Ouroboros flows are between the applications themselves. The
+oping server immediately responds to the client, so the client sees a
+response time well below 1 ms[^1]. The server, however, sees the
+client sending a packet only every 10ms and its RTO is a bit over
+10ms. The ACKs from the perspective of the server are piggybacked on
+the client's next ping. (This is similar to TCP "delayed ACK": the
+timer in Ouroboros is set to 10ms, so if I were to ping at 1 second
+intervals over a flow with FRCP enabled, the server would also see a
+10ms round-trip time.)
+
+#### Delta-t constants
+
+The second panel related to FRCP shows the Delta-t constants. Delta-t
+is the protocol on which FRCP is based. Right now, these constants
+are only configurable at compile time, but in the future they will
+probably be configurable using fccntl().
+
+{{<figure width="80%" src="/docs/tools/grafana-frcp-constants.png">}}
+
+A quick refresher on these Delta-t timers:
+
+* **Maximum Packet Lifetime** (MPL) is the maximum time a packet can
+ live in the network, default is 1 minute.
+
+* **Retransmission timer** (R) is the maximum time during which a
+  retransmission for a packet may be sent by the sender. The default
+  is 2 minutes. The first retransmission will happen after RTO,
+  then 2 * RTO, 4 * RTO and so on with an exponential back-off, but
+  no packets will be sent after R has expired. If this happens, the
+  flow is considered failed / down (see the sketch below this list).
+
+* **Acknowledgment timer** (A) is the maximum time within which a
+  packet may be acknowledged by the receiver. The default is 10
+  seconds. So a packet may be acknowledged immediately, or after 10
+  milliseconds, or after 4 seconds, but no longer once 10 seconds
+  have passed.
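+
+As a sketch of that back-off schedule (times in seconds; the real
+implementation works from actual timers, this just illustrates the
+arithmetic):
+
+```
+def retransmission_times(rto, r):
+    t, delay, times = 0.0, rto, []
+    while t + delay <= r:       # nothing is sent once R has expired
+        t += delay
+        times.append(t)
+        delay *= 2              # exponential back-off: RTO, 2*RTO, ...
+    return times
+
+print(retransmission_times(rto=0.5, r=120.0))
+# [0.5, 1.5, 3.5, 7.5, 15.5, 31.5, 63.5]
+```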
+
+#### Delta-t window
+
+{{<figure width="80%" src="/docs/tools/grafana-frcp-window.png">}}
+
+The third and (at least at this point) last panel related to FRCP is
+the window panel that shows information regarding Flow Control. FRCP
+flow control tracks the number of packets in flight. These are the
+packets that were sent by the sender, but have not been
+read/acknowledged yet by the receiver. Each packet is numbered
+sequentially starting from a random value. The default maximum window
+size is currently 256 packets.
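+
+In other words (a minimal sketch, with hypothetical variable names):
+
+```
+def can_send(next_seqno, lowest_unacked, window=256):
+    # a packet may be sent while the number of packets in flight
+    # (sent but not yet acknowledged) stays below the window size
+    return next_seqno - lowest_unacked < window
+```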
+
+#### IPCP N+1 flows
+
+{{<figure width="80%" src="/docs/tools/grafana-ipcp-np1.png">}}
+
+These graphs show basic statistics from the point of view of the IPCP
+that is serving the application flow. It shows upstream and downstream
+bandwidth and packet rates, and total sent and received packets/bytes.
+
+#### N+1 Flow Management
+
+{{<figure width="60%" src="/docs/tools/grafana-ipcp-np1-fu.png">}}
+
+These 4 panels show the management traffic sent by the flow
+allocators. Currently this traffic is only related to congestion
+avoidance. The example here is taken from a jFed experiment during a
+period of congestion. The receiver IPCP monitors packets for
+congestion markers and sends updates to the source IPCP to inform it
+to slow down. The panel shows the rate of these flow updates for
+multi-bit Explicit Congestion Notification. As you can see, there is
+still an issue where not all flow updates arrive, and there is a lot
+of jitter and burstiness at the receiving side for these (small)
+packets. I'm working on fixing this.
+
+#### Congestion Avoidance
+
+{{<figure width="80%" src="/docs/tools/grafana-ipcp-np1-cc.png">}}
+
+This is a more detailed panel that shows the internals of the MB-ECN
+congestion avoidance algorithm.
+
+The left side shows the congestion window width, which is the
+timeframe over which the algorithm is averaging bandwidth. This scales
+with the packet rate, as there have to be enough packets in the window
+to make a reasonable measurement. The biggest change compared to TCP
+is that this window width is independent of RTT. The congestion
+algorithm then sets a target for the maximum number of bytes to send
+within this window (the congestion window size). Dividing the number
+of bytes that can be sent by the width of the window yields the
+target bandwidth. The congestion was caused by a 100Mbit link, and
+the target set by the algorithm is quite near this value. The
+congestion level is a quantity that controls the rate at which the
+window scales down when there is congestion. The upstream and
+downstream views should be as close to identical as possible; the
+reason they are not is the jitter and loss in the flow updates
+observed above. Work in progress.
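+
+The arithmetic behind the target bandwidth is simple; with made-up
+numbers (not taken from the dashboard):
+
+```
+cwnd_bytes = 125000    # bytes that may be sent within the window
+window_width = 0.010   # congestion window width in seconds
+
+target_bw = cwnd_bytes * 8 / window_width
+print(target_bw / 1e6)  # -> 100.0 (Mbit/s)
+```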
+
+The graphs also show the number of packets and bytes in the current
+congestion window. The default target is set to min 8 and max 64
+packets within the congestion window before it scales up/down.
+
+And finally, the upstream packet counter shows the number of packets
+sent without receiving a congestion update from the receiver, and the
+downstream packet counter shows the number of packets received since
+the last time there was no congestion.
+
+#### Data transfer local components
+
+The last panel shows the (management) traffic sent and received by the
+IPCP's internal components, as measured by the forwarding engine (data
+transfer).
+
+{{<figure width="80%" src="/docs/tools/grafana-ipcp-dt-dht.png">}}
+
+The components that are currently shown on this panel are the DHT and
+the Flow Allocator. As you can see, the DHT didn't do much during this
+interval. That's because it is only needed for name-to-address
+resolution and it will only send/receive packets when an address is
+resolved or when it needs to refresh its state, which happens only
+once every 15 minutes or so.
+
+{{<figure width="80%" src="/docs/tools/grafana-ipcp-dt-fa.png">}}
+
+The bottom part of the local components is dedicated to the flow
+allocator. During the monitoring period, only flow updates were sent,
+so this is the same data as shown in the flow management traffic, but
+from the viewpoint of the forwarding element in the IPCP, so it shows
+actual bandwidth in addition to the packet rates.
+
+[^1]: If this still seems high, disabling CPU "C-states" and tuning
+ the kernel for low latency can reduce this to a few
+ microseconds.
diff --git a/content/en/docs/Tools/rumba-topology.png b/content/en/docs/Tools/rumba-topology.png
new file mode 100644
index 0000000..aa8ce7f
--- /dev/null
+++ b/content/en/docs/Tools/rumba-topology.png
Binary files differ
diff --git a/content/en/docs/Tools/rumba.md b/content/en/docs/Tools/rumba.md
new file mode 100644
index 0000000..28202b7
--- /dev/null
+++ b/content/en/docs/Tools/rumba.md
@@ -0,0 +1,676 @@
+---
+title: "Rumba"
+author: "Dimitri Staessens"
+date: 2021-07-21
+draft: false
+description: >
+ Orchestration framework for deploying recursive networks.
+---
+
+## About Rumba
+
+Rumba is a Python framework for setting up Ouroboros (and RINA)
+networks in a test environment; it was originally developed during
+the ARCFIRE project. Its main objectives are to configure networks and
+to get a feel for the impact of the architecture on configuration
+management and devops in computer and telecommunications networks. The
+original Rumba project page is
+[here](https://gitlab.com/arcfire/rumba).
+
+I still use Rumba to quickly (and I mean in a matter of seconds!) set
+up test networks for Ouroboros that are made up of many IPCPs and
+layers. I try to keep it up-to-date for the Ouroboros prototype.
+
+The features of Rumba are:
+
+ * easily define network topologies
+ * use different prototypes:
+ * Ouroboros[^1]
+ * rlite
+ * IRATI
+
+ * create these networks using different possible environments:
+ * local PC (Ouroboros only)
+   * docker containers
+   * virtual machines (qemu)
+ * [jFed](https://jfed.ilabt.imec.be/) testbeds
+ * script experiments
+ * rudimentary support for drawing these networks (using pydot)
+
+## Getting Rumba
+
+We forked Rumba to the Ouroboros website, and you should get this
+forked version for use with Ouroboros. It should work with most Python
+versions, but I recommend using the latest version (currently
+Python 3.9).
+
+To install system-wide, use:
+
+```bash
+git clone https://ouroboros.rocks/git/rumba
+cd rumba
+sudo ./setup.py install
+```
+
+or you can first create a Python virtual environment if you wish.
+
+## Using Rumba
+
+The Rumba model is heavily based on RINA terminology (since it was
+originally developed within a RINA research project). We will probably
+align the terminology in Rumba with Ouroboros in the near future. I
+will break down a typical Rumba experiment definition and show how to
+use Rumba in Python interactive mode. You can download the complete
+example experiment definition [here](/docs/tools/rumba_example.py).
+The example uses the Ouroboros prototype, and we will run the setup on
+the _local_ testbed since that is available on any machine and doesn't
+require additional dependencies. We use the local testbed a lot for
+quick development testing and debugging. I will also show the
+experiment definition for the virtual wall server testbed at Ghent
+University as an example for researchers who have access to this
+testbed. If you have docker or qemu installed, feel free to experiment
+with these at your leisure.
+
+### Importing the needed modules and definitions
+
+First, we need to import some definitions for the model, the testbed
+and the prototype we are going to use. Rumba defines the networks from
+the viewpoint of the _layers_ and how they are present on
+the nodes. This was a design choice by the original developers of
+Rumba.
+
+Three elements are imported from the **rumba.model** module:
+
+```Python
+from rumba.model import Node, NormalDIF, ShimEthDIF
+```
+
+* **Node** is a machine that is hosting the IPCPs, usually a server. In
+the local testbed it is a purely abstract concept, but when using the
+qemu, docker or jfed testbeds, each Node will map to a virtual machine
+on the local host, a docker container on the local host, or a virtual
+or physical server on the jfed testbed, respectively.
+
+* **NormalDIF** is (roughly) the RINA counterpart for an Ouroboros
+ *unicast layer*. The Rumba framework has no support for broadcast
+ layers (yet).
+
+* **ShimEthDIF** is (roughly) the RINA counterpart for an Ouroboros
+ Ethernet IPCP. These links make up the "physical network topology"
+ in the experiment definition. On the local testbed, Rumba will use
+ the ipcpd-local as a substitute for the Ethernet links, in the other
+ testbeds (qemu, docker, jfed) these will be implemented on (virtual)
+ Ethernet interfaces. Rumba uses the DIX ethernet IPCP
+ (ipcpd-eth-dix) for Ouroboros as it has the least problems with
+ cheaper switches in the testbeds that often have buggy LLC
+ implementations.
+
+You might have expected that IPCPs themselves would be elements of the
+Rumba model, and they are. They are not directly defined but, as we
+shall see shortly, inferred from the layer definitions.
+
+We still need to import the testbeds we will use. As mentioned, we
+will use the local testbed and jfed testbed. The commands to import
+the qemu and docker testbed plugins are shown in comments for reference:
+
+```Python
+import rumba.testbeds.jfed as jfed
+import rumba.testbeds.local as local
+# import rumba.testbeds.qemu as qemu
+# import rumba.testbeds.dockertb as docker
+```
+
+And finally, we import the Ouroboros prototype plugin:
+
+```Python
+import rumba.prototypes.ouroboros as our
+```
+
+As the final preparation, let's define which variables to export:
+
+```Python
+__all__ = ["exp", "nodes"]
+```
+
+* **exp** will contain the experiment definition for the local testbed
+
+* **nodes** will contain a list of the node names in the experiment,
+ which will be of use when we drive the experiment from the
+ IPython interface.
+
+### Experiment definition
+
+We will now define a small 4-node "star" topology of two client nodes,
+a server node, and a router node, that looks like this:
+
+{{<figure width="30%" src="/docs/tools/rumba-topology.png">}}
+
+In the prototype, there is a unicast layer which we call _n1_ (in
+Rumba, a "NormalDIF") and 3 point-to-point links ("ShimEthDIF"), _e1_,
+_e2_ and _e3_. There are 4 nodes, which we label "client1", "client2",
+"router", and "server". These are connected in a so-called star
+topology, so there is a link between the "router" and each of the 3
+other nodes.
+
+These layers can be defined fairly straightforwardly:
+
+```Python
+n1 = NormalDIF("n1")
+e1 = ShimEthDIF("e1")
+e2 = ShimEthDIF("e2")
+e3 = ShimEthDIF("e3")
+```
+
+And now the actual topology definition; the above figure will help
+in making sense of it.
+
+```Python
+clientNode1 = Node("client1",
+ difs=[e1, n1],
+ dif_registrations={n1: [e1]})
+
+clientNode2 = Node("client2",
+ difs=[e3, n1],
+ dif_registrations={n1: [e3]})
+
+routerNode = Node("router",
+ difs=[e1, e2, e3, n1],
+ dif_registrations={n1: [e1, e2, e3]})
+
+serverNode = Node("server",
+ difs=[e2, n1],
+ dif_registrations={n1: [e2]})
+
+nodes = ["client1", "client2", "router", "server"]
+```
+
+Each node is modeled as a Rumba Node object, and we specify which difs
+are present on that node (which will cause Rumba to create an IPCP for
+you) and how these DIFs relate to each other in that node. This is done
+by specifying the dependency graph between these DIFs as a dict object
+("dif_registrations") where the client layer is the key and the list
+of lower-ranked layers is the value.
+
+The endpoints of the star (clients and server) have a fairly simple
+configuration: they are connected to the router via an Ethernet layer
+(_e1_ on "client1", _e3_ on "client2" and _e2_ on "server"), and then
+_n1_ sits on top of that. So for node "client1" there are 2 layers
+present (difs=[_e1_, _n1_]) and _n1_ makes use of _e1_ to connect into
+the layer, or in other words, _n1_ is registered in the lower layer
+_e1_ (dif_registrations={_n1_: [_e1_]}).
+
+The router node is similar, but of course, all the ethernet layers are
+present and layer _n1_ has to be known from all other nodes, so on the
+router, _n1_ is registered in [_e1_, _e2_, _e3_].
+
+All this may look a bit unfamiliar and may take some time to get used
+to (and maybe an option for Rumba where the experiment is defined in
+terms of the IPCPs rather than the layers/DIFs might be more
+intuitive), but once one gets the hang of this, defining complex
+network topologies really becomes child's play, as the sketch below
+shows.
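+
+For instance, here is a hypothetical sketch (reusing only the
+constructs shown above) of what it would take to hang a second server
+off the router over a new link _e4_:
+
+```Python
+e4 = ShimEthDIF("e4")
+
+serverNode2 = Node("server2",
+                   difs=[e4, n1],
+                   dif_registrations={n1: [e4]})
+
+# the router definition would then also include e4:
+routerNode = Node("router",
+                  difs=[e1, e2, e3, e4, n1],
+                  dif_registrations={n1: [e1, e2, e3, e4]})
+```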
+
+Now that we have the experiment defined, let's set up the testbed.
+
+For the local testbed, there is almost nothing to it:
+
+``` Python
+tb = local.Testbed()
+exp = our.Experiment(tb,
+ nodes=[clientNode1,
+ clientNode2,
+ routerNode,
+ serverNode])
+```
+
+
+We define a local.Testbed and then create an Ouroboros experiment
+(recall we imported the Ouroboros plugin _as our_) using the local
+testbed and pass the list of nodes defined for the experiment. For the
+local testbed, that literally is it. The local testbed module will not
+perform installations on the host machine and assumes Ouroboros is
+installed and running.
+
+### An example on the Fed4FIRE/GENI testbeds using the jFed plugin
+
+Before using Rumba with jFed, you need to enable ssh-agent in each
+terminal.
+
+```
+eval `ssh-agent`
+ssh-add /path/to/cert.pem
+```
+
+To give an idea of what Rumba can do on a testbed with actual hardware
+servers, I will also give an example for a testbed deployment using
+the jfed plugin. This may not be relevant to people who don't have
+access to these testbeds, but it can serve as a taste of what a
+kubernetes[^2] plugin could achieve, which may come if there is enough
+interest in it.
+
+
+```Python
+jfed_tb = jfed.Testbed(exp_name='cc2',
+ cert_file='/path/to/cert.pem',
+ authority='wall1.ilabt.iminds.be',
+ image='UBUNTU16-64-STD',
+ username='<my_username>',
+ passwd='<my_password>',
+ exp_hours='1',
+ proj_name='ouroborosrocks')
+```
+
+The jfed testbed requires a bit more configuration than the local (or
+qemu/docker) plugins. First, the credentials for accessing jfed (your
+username, password, and certificate) need to be passed. Your password
+is optional, and if you don't like supplying it in plaintext, Rumba
+will ask you to enter it on certain occasions. A jFed experiment
+requires an experiment name that can be chosen at will, an expiration
+time (in hours) and also a project name that has to be created within
+the jfed portal and pre-approved by the jfed project. Finally, the
+authority specifies the actual test infrastructure to use; in this
+case wall1.ilabt.iminds.be is a testbed that consists of a large
+number of physical server machines. The image parameter specifies
+which OS to run; in this case, we selected Ubuntu 16.04 LTS. For IRATI
+we used an image that had the prototype pre-installed.
+
+More interesting than the testbed configuration is the additional
+functionality for the experiment:
+
+```Python
+jfed_exp = our.Experiment(jfed_tb,
+ nodes=[clientNode1,
+ clientNode2,
+ routerNode,
+ serverNode],
+ git_repo='https://ouroboros.rocks/git/ouroboros',
+ git_branch='<some working branch>',
+ build_options='-DCMAKE_BUILD_TYPE=Debug '
+ '-DSHM_BUFFER_SIZE=131072',
+ add_packages=['ethtool'],
+ influxdb={
+ 'ip': '<my public IP address>',
+ 'port': 8086,
+ 'org': "Ouroboros",
+ 'token': "<my token>"
+ })
+```
+
+For these more beefy setups, Rumba will actually install the
+prototype. You can specify a repository and branch (if not, it will
+use the master branch from the main ouroboros repository), build
+options for the prototype, and additional packages to install for use
+during the experiment. As a specific option for Ouroboros, you can
+pass the coordinates of an InfluxDB database, which will also install
+the [metrics exporter](/docs/tools/metrics) and allow realtime
+observation of key experiment parameters.
+
+This concludes the brief overview of the experiment definition; let's
+give it a quick try using the "local" testbed.
+
+### Interactive orchestration
+
+First, make sure that Ouroboros is running on your host machine, save
+the [experiment definition script](/docs/tools/rumba_example.py) to
+your machine and run a python shell in the directory with the example
+file.
+
+Let's first add some additional logging to Rumba so we have a bit more
+information about the process:
+
+```sh
+[dstaesse@heteropoda examples]$ python
+Python 3.9.6 (default, Jun 30 2021, 10:22:16)
+[GCC 11.1.0] on linux
+Type "help", "copyright", "credits" or "license" for more information.
+>>> import rumba.log as log
+>>> log.set_logging_level('DEBUG')
+```
+
+Now, in the shell, import the definitions from the example file. I
+will only put (and reformat) the most important snippets of the output
+here.
+
+```
+>>> from rumba_example import *
+
+DIF topological ordering: [DIF e2, DIF e1, DIF e3, DIF n1]
+DIF graph for DIF n1: client1 --[e1]--> router,
+ client2 --[e3]--> router,
+ router --[e1]--> client1,
+ router --[e3]--> client2,
+ router --[e2]--> server,
+ server --[e2]--> router
+Enrollments:
+ [DIF n1] n1.router --> n1.client1 through N-1-DIF DIF e1
+ [DIF n1] n1.client2 --> n1.router through N-1-DIF DIF e3
+ [DIF n1] n1.server --> n1.router through N-1-DIF DIF e2
+
+Flows:
+ n1.router --> n1.client1
+ n1.client2 --> n1.router
+ n1.server --> n1.router
+```
+
+When an experiment object is created, Rumba will pre-compute how to
+bootstrap the requested network layout. First, it will select a
+topological ordering, the order in which it will create the layers
+(DIFs). We only have 4 here, and the Ethernet layers need to be up and
+running before we can bootstrap the unicast layer _n1_. Rumba will
+create them in the order _e2_, _e1_, _e3_ and then _n1_.
+
+The graph for N1 is shown as a check that the correct topology was
+input. Then Rumba shows the order in which it will enroll the
+members of the _n1_ layer.
+
+As mentioned above, Rumba creates IPCPs based on the layering
+information in the Node objects in the experiment description. The
+naming convention used in Rumba is "<layer name>.<node name>". The
+algorithm in Rumba selected the IPCP "n1.client1" as the bootstrap
+IPCP. This is not explicitly printed, but can be derived since
+"n1.client1" is the only IPCP that is not enrolled with another node
+in the layer. It will enroll the IPCP on the router with the one on client1,
+and then the other 2 IPCPs in _n1_ with the unicast IPCP on the router
+node.
+
+Finally, it will create 3 flows between the members of _n1_ that will
+complete the "star" topology. Note that in Ouroboros, there will
+actually be 6, as it will have 3 data flows (for all traffic between
+clients of the layer, the directory (DHT), etc) and 3 flows for
+management traffic (link state advertisements).
+
+It is possible to print the layer graph (DIF graph) as an image (PDF)
+for easier verification that the topology is correct. For instance,
+for the unicast layer _n1_:
+
+```Python
+>>> from rumba_example import n1
+>>> exp.export_dif_graph("example.pdf", n1)
+>>> <snip> Generated PDF of DIF graph
+```
+
+This is actually how the image above was generated.
+
+The usual flow for starting an experiment is to call the
+
+```Python
+exp.swap_in()
+exp.install_prototype()
+```
+
+functions. The swap_in() function prepares the testbed by booting the
+(virtual) machines or containers. The install_prototype call will
+install the prototype of choice and all its dependencies and
+tools. However, we are now using a local testbed, and in this case,
+these two functions are implemented as _nops_, allowing the same
+script to be used on different types of testbeds.
+
+Now comes the real magic (output cleaned up for convenience). The
+_bootstrap_prototype()_ function will create the defined network
+topology on the selected testbed. For the local testbed, all hosts are
+the same, so client1/client2/router/server will actually execute on
+the same machine. The only difference in these commands, should for
+instance a virtual wall testbed be used, is that the 'type local'
+IPCPs would be 'type eth-dix' and be configured on an Ethernet
+interface, and of course be run on the correct host machine. It is
+also what a network administrator would have to execute if he or she
+were to create the network manually on physical or virtual
+machines.
+
+This is one of the key strengths of Ouroboros: it doesn't care about
+machines at all. It's a network of software objects, or even a network
+of algorithms, not a network of _devices_. It needs devices to run, of
+course, but neither the device nor the interface is a named entity in
+any of the objects that make up the actual network. The devices are a
+concern
+for the network architect and the network manager, as they choose
+where to run the processes that make up the network and monitor them,
+but devices are irrelevant for the operation of the network in itself.
+
+Anyway, here is the complete output of the bootstrap_prototype()
+command; I'll break it down a bit below.
+
+```Python
+>>> exp.bootstrap_prototype()
+16:29:28 Starting IRMd on all nodes...
+[sudo] password for dstaesse:
+16:29:32 Started IRMd, sleeping 2 seconds...
+16:29:34 Creating IPCPs
+16:29:34 client1 >> irm i b n e1.client1 type local layer e1
+16:29:34 client1 >> irm i b n n1.client1 type unicast layer n1 autobind
+16:29:34 client2 >> irm i b n e3.client2 type local layer e3
+16:29:34 client2 >> irm i c n n1.client2 type unicast
+16:29:34 router >> irm i b n e1.router type local layer e1
+16:29:34 router >> irm i b n e3.router type local layer e3
+16:29:34 router >> irm i b n e2.router type local layer e2
+16:29:34 router >> irm i c n n1.router type unicast
+16:29:34 server >> irm i b n e2.server type local layer e2
+16:29:34 server >> irm i c n n1.server type unicast
+16:29:34 Enrolling IPCPs...
+16:29:34 client1 >> irm n r n1.client1 ipcp e1.client1
+16:29:34 client1 >> irm n r n1 ipcp e1.client1
+16:29:34 router >> irm n r n1.router ipcp e1.router ipcp e2.router ipcp e3.router
+16:29:34 router >> irm i e n n1.router layer n1 autobind
+16:29:34 router >> irm n r n1 ipcp e1.router ipcp e2.router ipcp e3.router
+16:29:34 client2 >> irm n r n1.client2 ipcp e3.client2
+16:29:34 client2 >> irm i e n n1.client2 layer n1 autobind
+16:29:34 client2 >> irm n r n1 ipcp e3.client2
+16:29:34 server >> irm n r n1.server ipcp e2.server
+16:29:34 server >> irm i e n n1.server layer n1 autobind
+16:29:34 server >> irm n r n1 ipcp e2.server
+16:29:34 router >> irm i conn n n1.router dst n1.client1
+16:29:34 client2 >> irm i conn n n1.client2 dst n1.router
+16:29:34 server >> irm i conn n n1.server dst n1.router
+16:29:34 All done, have fun!
+16:29:34 Bootstrap took 6.05 seconds
+```
+
+First, the prototype is started if it is not already running:
+
+```Python
+16:29:28 Starting IRMd on all nodes...
+[sudo] password for dstaesse:
+16:29:32 Started IRMd, sleeping 2 seconds...
+```
+
+Since starting the IRMd requires root privileges, Rumba will ask for
+your password.
+
+Next, Rumba will create the IPCPs on each node. I will go more
+in-depth for client1 and client2 as they bring some interesting
+insights:
+
+```Python
+16:29:34 Creating IPCPs
+16:29:34 client1 >> irm i b n e1.client1 type local layer e1
+16:29:34 client1 >> irm i b n n1.client1 type unicast layer n1 autobind
+16:29:34 client2 >> irm i b n e3.client2 type local layer e3
+16:29:34 client2 >> irm i c n n1.client2 type unicast
+```
+
+First of all, there are two different kinds of commands: the
+**bootstrap** commands starting with ``` irm i b ``` and the
+**create** commands starting with ```irm i c```. If you know the CLI a
+bit (you can find out more using ```man ouroboros``` from the command
+line when Ouroboros is installed), you already know that these are
+shorthand for
+
+```
+irm ipcp bootstrap
+irm ipcp create
+```
+
+If the IPCP doesn't exist, the ```irm ipcp bootstrap``` call will
+automatically first create an IPCP behind the scenes using an ```irm
+ipcp create``` call, so this is nothing but a bit of shorthand.
+Ouroboros will create the IPCPs that will enroll, and immediately
+bootstrap those that won't. The Ethernet IPCPs are simple: they are
+always bootstrapped and cannot be _enrolled_ as the configuration is
+manual and may involve Ethernet switches; Ethernet IPCPs do not
+support the ```irm ipcp enroll``` method. For the unicast IPCPs that
+make up the _n1_ layer, the situation is different. As mentioned
+above, the first IPCP in that layer is bootstrapped, "n1.client1" and
+then other members of the layer are enrolled to extend that layer. So
+if you turn your attention back to the full listing of the steps
+executed by the bootstrap() procedure in Rumba, you will now see that
+there are only 3 IPCPs that are created using ```irm i c```: those 3
+that are selected for enrollment, which is the next step.
+
+Here Ouroboros deviates quite a bit from RINA, as what RINA calls
+enrollment is actually split into 3 different phases in Ouroboros. But
+as Rumba was intended to work with RINA (a requirement for the ARCFIRE
+project at hand) this is a single "step" in Rumba. In RINA, the DIF
+registrations are initiated by the IPCPs themselves, which means
+making APIs and what not to feed all this information to the IPCPs and
+let them execute this. Ouroboros, on the other hand, keeps things lean
+by moving registration operations into the hands of the network
+manager (or network management system). The IPCP processes can be
+registered and unregistered as clients for lower layers at will
+without any need to touch them. Let's have a look at the commands, of
+which there are 3:
+
+```
+irm n r # shorthand for irm name register
+irm i e # shorthand for irm ipcp enroll
+irm i conn # shorthand for irm ipcp connect
+```
+
+Rumba will need to make sure that the _n1_ IPCPs are known in the
+(Ethernet) layer below, and that they are operational before another
+_n1_ IPCP tries to enroll with them. There are interesting things to note:
+
+First, looking at the "n1.client1" IPCP, it is registered with the e1
+layer twice (I reformatted the commands for clarity):
+
+```
+16:29:34 client1 >> irm n r n1.client1 ipcp e1.client1
+16:29:34 client1 >> irm n r n1 ipcp e1.client1
+```
+
+Once under the "n1.client1" name (which is the name of the IPCP) and
+once under the more general "n1" name, which is actually the name of
+the layer.
+
+In addition, if we scout out the _n1_ name registrations, we see that
+it is registered in all Ethernet layers (reformatted for clarity) and
+on all machines:
+
+```
+16:29:34 client1 >> irm n r n1 ipcp e1.client1
+16:29:34 router >> irm n r n1 ipcp e1.router ipcp e2.router ipcp e3.router
+16:29:34 client2 >> irm n r n1 ipcp e3.client2
+16:29:34 server >> irm n r n1 ipcp e2.server
+```
+
+This is actually Ouroboros anycast at work, and this allows us to make
+the enrollment commands for the IPCPs really simple (reformatted for
+clarity):
+
+
+```
+16:29:34 router >> irm i e n n1.router layer n1 autobind
+16:29:34 client2 >> irm i e n n1.client2 layer n1 autobind
+16:29:34 server >> irm i e n n1.server layer n1 autobind
+```
+
+By using an anycast name (equal to the layer name) for each IPCP in
+the _n1_ layer, we can just tell an IPCP to "enroll in the layer" and
+it will enroll with any IPCP in that layer. This simplifies things for
+human administrators, who do not need to know the names of reachable IPCPs
+in the layer they want to enroll with (although, of course, Rumba does
+have this information from the experiment definition and we could have
+specified a specific IPCP just as easily). If the enrollment with the
+destination layer fails, it means that none of the members of that
+layer are reachable.
+
+The "autobind" directive will automatically bind the process to accept
+flows for the ipcp name (e.g. "n1.router") and the layer name
+(e.g. "n1").
+
+The last series of commands are the
+
+```
+irm ipcp connect
+```
+
+commands. Ouroboros splits the topology definition (forwarding
+adjacencies in IETF speak) from enrollment. So after an IPCP is
+enrolled with the layer and knows the basic information to operate as
+a peer router, it will break all connections and wait for a specific
+adjacency to be made for data transfer and for management. The command
+above just creates them both in parallel. We may create a shorthand to
+create these connections with the IPCP that was used for enrollment.
+
+Let's ping the server from client1 using the Rumba storyboard.
+
+```Python
+>>> from rumba.storyboard import *
+>>> sb = StoryBoard(experiment=exp, duration=1500, servers=[])
+>>> sb.run_command("server",
+ 'irm bind prog oping name oping_server;'
+ 'irm name register oping_server layer n1;'
+ 'oping --listen > /dev/null 2>&1 &')
+18:04:33 server >> irm bind prog oping name oping_server;
+ irm name register oping_server layer n1;
+ oping --listen > /dev/null 2>&1 &
+>>> sb.run_command("client1", "oping -n oping_server -i 10ms -c 100")
+18:05:26 client1 >> oping -n oping_server -i 10ms -c 100
+```
+
+### The same experiment on jFed
+
+The ```exp.swap_in()``` and ```exp.install_prototype()``` calls will
+reserve and boot the servers on the testbed and install the prototype
+on each of them. Let's just focus on the prototype itself and see if
+you can
+spot the differences (and the similarities!) between the (somewhat
+cleaned up) output for running the exact same bootstrap command as
+above using physical servers on the jFed virtual wall testbed compared
+to the test on a local machine.
+
+
+```Python
+>>> exp.bootstrap_prototype()
+18:26:15 Starting IRMd on all nodes...
+18:26:15 n078-05 >> sudo nohup irmd > /dev/null &
+18:26:15 n078-09 >> sudo nohup irmd > /dev/null &
+18:26:15 n078-03 >> sudo nohup irmd > /dev/null &
+18:26:15 n078-07 >> sudo nohup irmd > /dev/null &
+18:26:16 Creating IPCPs
+18:26:16 n078-05 >> irm i b n e1.client1 type eth-dix dev enp9s0f0 layer e1
+18:26:16 n078-05 >> irm i b n n1.client1 type unicast layer n1 autobind
+18:26:17 n078-09 >> irm i b n e3.client2 type eth-dix dev enp9s0f0 layer e3
+18:26:17 n078-09 >> irm i c n n1.client2 type unicast
+18:26:17 n078-03 >> irm i b n e3.router type eth-dix dev enp8s0f1 layer e3
+18:26:17 n078-03 >> irm i b n e1.router type eth-dix dev enp0s9 layer e1
+18:26:17 n078-03 >> irm i b n e2.router type eth-dix dev enp9s0f0 layer e2
+18:26:17 n078-03 >> irm i c n n1.router type unicast
+18:26:17 n078-07 >> irm i b n e2.server type eth-dix dev enp9s0f0 layer e2
+18:26:17 n078-07 >> irm i c n n1.server type unicast
+18:26:17 Enrolling IPCPs...
+18:26:17 n078-05 >> irm n r n1.client1 ipcp e1.client1
+18:26:17 n078-05 >> irm n r n1 ipcp e1.client1
+18:26:18 n078-03 >> irm n r n1.router ipcp e1.router ipcp e2.router ipcp e3.router
+18:26:18 n078-03 >> irm i e n n1.router layer n1 autobind
+18:26:20 n078-03 >> irm n r n1 ipcp e1.router ipcp e2.router ipcp e3.router
+18:26:20 n078-09 >> irm n r n1.client2 ipcp e3.client2
+18:26:20 n078-09 >> irm i e n n1.client2 layer n1 autobind
+18:26:20 n078-09 >> irm n r n1 ipcp e3.client2
+18:26:20 n078-07 >> irm n r n1.server ipcp e2.server
+18:26:20 n078-07 >> irm i e n n1.server layer n1 autobind
+18:26:20 n078-07 >> irm n r n1 ipcp e2.server
+18:26:20 n078-03 >> irm i conn n n1.router dst n1.client1
+18:26:24 n078-09 >> irm i conn n n1.client2 dst n1.router
+18:26:25 n078-07 >> irm i conn n n1.server dst n1.router
+18:26:25 All done, have fun!
+18:26:25 Bootstrap took 9.57 seconds
+```
+
+Anyone who has been configuring distributed services in datacenter and
+ISP networks should be able to appreciate the potential for the
+abstractions provided by the Ouroboros model to make the life of a network
+administrator more enjoyable.
+
+
+[^1]: I only support Ouroboros; it may not work anymore with rlite and
+ IRATI.
+
+[^2]: Hmm, why didn't I think of using _O7s_ as a shorthand for
+ Ouroboros before... \ No newline at end of file
diff --git a/content/en/docs/Tools/rumba_example.py b/content/en/docs/Tools/rumba_example.py
new file mode 100644
index 0000000..fc132b6
--- /dev/null
+++ b/content/en/docs/Tools/rumba_example.py
@@ -0,0 +1,41 @@
+from rumba.model import Node, NormalDIF, ShimEthDIF
+
+# import testbed plugins
+import rumba.testbeds.jfed as jfed
+import rumba.testbeds.local as local
+
+# import Ouroboros prototype plugin
+import rumba.prototypes.ouroboros as our
+
+__all__ = ["local_exp", "nodes"]
+
+n1 = NormalDIF("n1")
+e1 = ShimEthDIF("e1")
+e2 = ShimEthDIF("e2")
+e3 = ShimEthDIF("e3")
+
+clientNode1 = Node("client1",
+ difs=[e1, n1],
+ dif_registrations={n1: [e1]})
+
+clientNode2 = Node("client2",
+ difs=[e3, n1],
+ dif_registrations={n1: [e3]})
+
+routerNode = Node("router",
+ difs=[e1, e2, e3, n1],
+ dif_registrations={n1: [e1, e2, e3]})
+
+serverNode = Node("server",
+ difs=[e2, n1],
+ dif_registrations={n1: [e2]})
+
+nodes = ["client1", "client2", "router", "server"]
+
+local_tb = local.Testbed()
+
+exp = our.Experiment(local_tb,
+                     nodes=[clientNode1,
+                            clientNode2,
+                            routerNode,
+                            serverNode])
diff --git a/content/en/docs/Tutorials/tutorial-1.md b/content/en/docs/Tutorials/tutorial-1.md
index d3d24c0..2e98809 100644
--- a/content/en/docs/Tutorials/tutorial-1.md
+++ b/content/en/docs/Tutorials/tutorial-1.md
@@ -13,8 +13,12 @@ This tutorial runs through the basics of Ouroboros. Here, we will see
the general use of two core components of Ouroboros, the IPC Resource
Manager daemon (IRMd) and an IPC Process (IPCP).
-{{<figure width="50%" src="/docs/tutorials/ouroboros_tut1_overview.png">}}
+It is recommended to use a Debug build for this tutorial to show extra
+IRMd output. To do this, compile with the CMAKE_BUILD_TYPE set to
+"Debug". For a full list of build options and how to activate them, see
+[here](/docs/reference/compopt/).
+{{<figure width="50%" src="/docs/tutorials/ouroboros_tut1_overview.png">}}
We will start the IRMd, create a local IPCP, start a ping server and
connect a client. This will involve **binding (1)** that server to a
@@ -65,13 +69,21 @@ $ oping --listen
Ouroboros ping server started.
```
-The IRMd will notice that an oping server with pid 10539 has started:
+The IRMd will notice that an oping server has started. In our case it has pid 2337, but this will be different on your system:
```bash
-==02301== irmd(DB): New instance (10539) of oping added.
+==02301== irmd(DB): New instance (2337) of oping added.
==02301== irmd(DB): This process accepts flows for:
```
+If you are not running a debug build, you won't see this output and will
+have to look for the pid of the process using a Linux command such as ps.
+
+```
+$ ps ax | grep oping
+ 2337 pts/4 Sl+ 0:00 oping --listen
+```
+
The server application is not yet reachable by clients. Next we will
bind the server to a name and register that name in the
"local_layer". The name for the server can be chosen at will, let's
diff --git a/content/en/docs/Tutorials/tutorial-2.md b/content/en/docs/Tutorials/tutorial-2.md
index 5f52a5a..f043442 100644
--- a/content/en/docs/Tutorials/tutorial-2.md
+++ b/content/en/docs/Tutorials/tutorial-2.md
@@ -246,7 +246,7 @@ Our oping server is not registered yet in the normal layer. Let's
register it in the normal layer as well, and connect the client:
```bash
-$ irm r n oping_server layer normal_layer
+$ irm n r oping_server layer normal_layer
$ oping -n oping_server -c 5
```
diff --git a/content/en/docs/_index.md b/content/en/docs/_index.md
index 587d2af..8bed85f 100755
--- a/content/en/docs/_index.md
+++ b/content/en/docs/_index.md
@@ -8,6 +8,4 @@ menu:
weight: 20
---
-{{% pageinfo %}}
-Table of Contents.
-{{% /pageinfo %}}
+We are [moving the documentation to a wiki](https://ouroboros.rocks/wiki). These pages are left here for the time being, but will be deprecated.