Overview
RFCs have played a pivotal role in formalising ideas and requirements for much of the Internet’s design and engineering. They have facilitated peer review amongst engineers, researchers and computer scientists, which in turn has resulted in the specification of key Internet protocols and their behaviours, so that developers can implement those protocols in products and services with a degree of certainty around correctness of design and interoperability between implementations. Security considerations were not present in RFCs from the outset; rather, they evolved over time as the Internet grew in size and complexity, and as our understanding of security concepts and best practices matured. Arguably, the treatment of security requirements across the corpus of RFCs (almost 8,900 at the time of writing) has been inconsistent, which may help explain how and when security vulnerabilities manifest themselves, both in protocol design and in subsequent implementation.
We have explored the security properties of RFCs, analysing how security is (or isn’t) prescribed, in an attempt to understand more specifically how and why security vulnerabilities manifest themselves on the path from design to implementation. To help us in this endeavour we have used several methods of analysis, including graph databases to explore and query relationships between different properties of RFCs. Our ultimate intention is to use the key observations and insights from this research to stimulate further thought and discussion on how and where security improvements could be made to the RFC process, maximising security assurance at the protocol specification and design stage so as to facilitate security and defence-in-depth. More broadly, we propose that using graph databases to assess bodies of knowledge such as RFCs, and their interrelationships, is a useful way of performing analysis and deriving new insights.
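To give a flavour of what querying relationships between RFC properties can look like, here is a minimal sketch in plain Python. It is not the code from our repository: the identifiers (`RFC-A` to `RFC-D`), the `has_security_section` property, the relationship names and the helper functions are all hypothetical and chosen purely for illustration. It models RFCs as nodes with properties and typed edges between them, then asks two simple questions of the graph.

```python
from collections import defaultdict

# Hypothetical toy data: these identifiers and properties are invented
# for illustration and do not describe real RFCs.
rfcs = {
    "RFC-A": {"year": 1985, "has_security_section": False},
    "RFC-B": {"year": 1998, "has_security_section": True},
    "RFC-C": {"year": 2012, "has_security_section": True},
    "RFC-D": {"year": 1990, "has_security_section": False},
}

# Directed, typed edges: (source, relationship, target).
# The relationship names are assumptions, not a fixed schema.
edges = [
    ("RFC-B", "OBSOLETES", "RFC-A"),
    ("RFC-C", "UPDATES", "RFC-B"),
    ("RFC-C", "REFERENCES", "RFC-D"),
    ("RFC-B", "REFERENCES", "RFC-D"),
]


def edges_of_type(edges, relationship):
    """Return (source, target) pairs for one relationship type."""
    return [(s, t) for s, r, t in edges if r == relationship]


def secure_referencing_insecure(rfcs, edges):
    """Find RFCs with a security section that reference one without."""
    pairs = []
    for source, target in edges_of_type(edges, "REFERENCES"):
        if (rfcs[source]["has_security_section"]
                and not rfcs[target]["has_security_section"]):
            pairs.append((source, target))
    return sorted(pairs)


def reachable(edges, start):
    """Return every node reachable from start via any relationship."""
    adjacency = defaultdict(set)
    for s, _, t in edges:
        adjacency[s].add(t)
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return seen


if __name__ == "__main__":
    print(secure_referencing_insecure(rfcs, edges))
    print(sorted(reachable(edges, "RFC-C")))
```

In a graph database, each of these functions would typically collapse into a single pattern-matching query, which is a large part of the appeal once the corpus grows to thousands of documents.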
Our paper on this topic has been written in the style of an RFC, and can be accessed here. The following GitHub repository includes various Python scripts and Neo4j Cypher code used to generate the data and graph databases for our analysis: