Where you might want to start reading ...

Is there something wrong with software architecture - or with us?

I am a software architect (one of a few) for a 20-million LOC business software, with currently a few thousand installations, developed and ...

Showing posts with label rules. Show all posts

Sunday, June 18, 2017

Broad or narrow-spectrum prescriptions - that's the question

In my last posting, I have shown that writing precise rules for dependencies requires formulating a standard set of rules that allow using all the self-evident parts of the chosen runtime environment.

However, after we did that for the Archichect tool itself, there were still more than six thousand dependencies that violate the first, simplistic set of rules that I derived from a rough overview. What are the reasons for these violations? Essentially, there are two:
  1. Dependencies from Archichect to classes in external libraries;
  2. dependencies between items inside Archichect itself.
Here, I will concentrate on the first item, and derive an interesting and important question from it. The next posting will then deal with the second topic.

Before I start with the concrete rules, I would like to know how many of the present violations are of the first kind (i.e., references to external classes) and how many are of the second. For this, I simply add a rule
Archichect.** ---? Archichect.**
This rule allows every item inside Archichect to use any other Archichect item. Of course, I must take care to remove this rule from the final rule set afterwards. That's the reason why I chose the "question mark arrow", which will mark all matching dependencies as "questionable"—hopefully, I will remember to remove this rule later (an additional comment might also be helpful ...).

With this rule added, Archichect's output is "Writing 1438 violations and 4717 questionable dependencies". Thus, we will deal in this step with around 1400 dependencies; the rest will be tackled in the next posting.
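To make the effect of the two arrow types concrete, here is a minimal Python sketch of such a three-way classification. It is not Archichect's implementation: the rule tuples, the prefix-based matching in in_tree, and the precedence (any matching "--->" rule wins over a "---?" rule) are assumptions made purely for illustration.

```python
from collections import Counter


def in_tree(namespace, root):
    """True if namespace is root itself or one of its child namespaces."""
    return namespace == root or namespace.startswith(root + ".")


def classify(dependency, rules):
    """Classify one (using, used) namespace pair.

    rules is a list of hypothetical (using_root, arrow, used_root) tuples,
    where arrow is "--->" (allowed) or "---?" (questionable).
    Assumption: an allowing rule takes precedence over a questionable one.
    """
    using, used = dependency
    verdict = "violation"
    for using_root, arrow, used_root in rules:
        if in_tree(using, using_root) and in_tree(used, used_root):
            if arrow == "--->":
                return "allowed"
            verdict = "questionable"
    return verdict


if __name__ == "__main__":
    rules = [("Archichect", "---?", "Archichect")]
    dependencies = [
        ("Archichect.Reading", "Archichect"),
        ("Archichect.Rendering.GraphicsRendering", "System.Drawing"),
        ("Archichect", "Archichect.Transforming"),
    ]
    counts = Counter(classify(d, rules) for d in dependencies)
    print(counts["violation"], "violations and",
          counts["questionable"], "questionable dependencies")
```

With only the temporary "---?" rule in place, everything internal ends up in the "questionable" bucket, and only the references to external libraries remain as violations.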

Now, what about these "violations"? Among other things, Archichect can do the following (I will explain the reasons for these features later, maybe much later—right now, assume that these features are just a "given"):
  1. It can extract dependencies from .Net assemblies (".DLLs" and ".EXEs");
  2. it can watch files (e.g. rule files or source files) and automatically rerun a check in a new thread when a file changes;
  3. it can create some architectural diagrams;
  4. of course, it accesses files and directories at many places;
  5. it provides a small web server for accessing output via a browser;
  6. it can write rule violations in an XML format (but right now not JSON, which is an unforgivable oversight; I will repair this soon);
  7. it can load plugins from assemblies for reading, transforming, writing, and more;
  8. it caches items and strings internally for more efficient memory usage.
These functionalities are located in the following classes:
  1. Reading .Net assemblies: Classes in namespace Archichect.Reading.AssemblyReading;
  2. file watching: Class FileWatcher in namespace Archichect;
  3. diagram creation: Classes in namespace Archichect.Rendering.GraphicsRendering;
  4. file and directory access: many classes need this for reading (of dependencies or configurations) and writing (of violations, logs etc.); so I allow it just for all of Archichect.
  5. webserver: Classes in namespace Archichect.WebServing;
  6. XML writing: Class RuleViolationWriter in namespace Archichect.Rendering.TextWriting;
  7. loading plugins: Classes GlobalContext and ItemAndDependencyFactoryList in namespace Archichect;
  8. Caching of items: Classes in namespace Archichect and child namespaces Archichect.Reading and Archichect.Rendering; in turn, the classes implementing the cache use System.Threading.
The following rules capture the intention that these packages are allowed to use certain namespaces from external libraries:
Archichect.Reading.AssemblyReading     ---> Mono.(Cecil.**|Collections.Generic)
        // For reading .Net assemblies, Archichect uses Mono.Cecil,
        // including some general Mono collection classes
Archichect:FileWatcher                 ---> System.ComponentModel:Component
        // .Net's FileSystemWatcher derives from Component
Archichect:FileWatcher                 ---> System.Threading
Archichect.Rendering.GraphicsRendering ---> System.Drawing.**
Archichect.**                          ---> System.IO
Archichect.WebServing                  ---> System.Net
Archichect.Rendering.TextWriting:RuleViolationWriter.** ---> System.Xml.**
Archichect(.Reading.**|.Rendering.**)? ---> Gibraltar
        // String caching I copied from the web and generalized for arbitrary objects
Gibraltar                              ---> System.Threading
And now—ta-ta-ta-taa: Running Archichect with these rules leaves us with a mere 40 violations, down from the 1438 just minutes ago! It seems I have described an interesting part of Archichect's architecture almost perfectly (but, well, I invented the tool and wrote its code, so it would be fatal if I couldn't accomplish this, wouldn't it?)

What about these last 40? A quick scan of them exposes the following two reasons:
  • A somewhat widespread use of System.Reflection.MemberInfo, where its Name property is accessed. This results from code like ...GetType().Name, which I find totally ok. On the other hand, I do not want to allow use of all of System.Reflection everywhere, therefore I add the specific rule
    Archichect.** ---> System.Reflection:MemberInfo::get_Name
  • The GlobalContext class uses System.Threading's CancellationTokenSource and CancellationToken as well as System.Reflection's Assembly and AssemblyName classes. I could now add another two rules going from GlobalContext to ... but wait: Am I still doing architecture here, or is this going into nitty-gritty details that concern no one?
That's an interesting question: Should the prescriptive architecture be
  • "narrow", i.e., allow only access from and to items where we positively need this for current features?
  • Or should it be "broad", i.e., allow the use of large swaths of foreign libraries in most of the product so that future extensions and modifications can use them freely?
I have to confess that I have never managed to keep a "narrow regime" up for a long time: Developers feel, or actually are, too constrained to solve the problems they are supposed to solve. It seems to me that dependency rules should mostly "forbid what has to be forbidden", i.e., prevent definitely problematic design decisions. But they should not restrict something just because "there might be a reason against it, if we only think hard enough". Also, developers will find ways to subvert rules that are not common sense or are simply obstructive. So, my advice is: don't be narrow, be broad.

Still, threads and reflection are not something to use lightly. So, I will restrict the use of these two packages to the cancellation and assembly related classes, respectively; but allow that they be used everywhere in Archichect.

Of course, in your environment, there might be hard and fast rules that must be followed no matter what. So much the better, then, that one can define and check precise dependency rules with tools like Archichect (when and if it is completed).

However, these are still lacking for the more than 4000 open "violations" flagged inside Archichect. The next posting is a stepping stone to tackling them.

Saturday, June 17, 2017

27000 violations - that, by all means, is a huge ticket ...

In the previous posting, I gave a first set of example rules for the Archichect tool, with the sad result that almost 27000 of the 35500 or so dependencies read in from Archichect.exe were flagged as bad. I want to find out what's wrong here. In other words, we are now looking at the descriptive architecture, or whatever Archichect considers this to be.

If you are curious how to run Archichect on itself, here is a short how-to: Put the rules in some text file, say ArchichectRules.dep (.dep is the standard extension of Archichect rule files). Then, run the following command on the command line:
Archichect.exe -read Archichect.exe -configure CheckDeps { -rule-defaultfile ArchichectRules.dep } -transform CheckDeps -write RuleViolationWriter { -newline } ArchichectViolations.txt
There are nicer ways to do this (putting the commands into an .arch file is one), but I will explain them sometime later.

After running the command above, ArchichectViolations.txt contains all those 27-and-something-thousand bad dependencies. Let's take a look at the first (I have replaced some trailing information that we are not yet interested in with ellipses ...):
Bad dependency detected by global rule group:
DOTNETTYPE::<>f__AnonymousType0:Archichect;0.0.0.2;:...
  -- ;;1'_usesmember
  -> DOTNETITEM:System.Runtime.CompilerServices:CompilerGeneratedAttribute:...
  (at archichect.exe/.<>f__AnonymousType0)
Oh—the compiler produces things by itself! This is not an architectural artifact, and not even a design artifact, but an internal artifact of the concrete execution infrastructure—we simply do not want to see such things (typically). Before diving deeply into this complicated area, we simply write a catch-all rule:
** ---> System.Runtime.CompilerServices
Rerunning Archichect now gives us 26281 violations, which is about 800 fewer than before. Slow going. What's next?
Bad dependency detected by global rule group:
DOTNETTYPE::<>f__AnonymousType0:Archichect;0.0.0.2;:...
  -- ;;1'_usesmember
  -> DOTNETITEM:System.Diagnostics:DebuggerDisplayAttribute:mscorlib;4.0.0.0;:.ctor
  (at archichect.exe/.<>f__AnonymousType0)
A use of System.Diagnostics:DebuggerDisplayAttribute should certainly be allowed:
** ---> System.Diagnostics
Come to think of it, we also should allow using many more things from the System libraries, like int and bool and string. At least the whole System namespace should be ok—but not its sub-namespaces: We might want to control whether our software uses System.Windows.Forms, for example. Let's therefore add all of System:
** ---> System
11207 is the new violation count: more than half of the former "violations" were simply uses of the .Net framework. What else is there? Before stumbling over all these problems one by one, let me show here all the namespace rules that capture useful and certainly allowed dependencies and which I therefore allow everywhere (see Note 1):
// Dependencies allowed everywhere in .Net applications

** ---> System
** ---> System.Collections
** ---> System.Collections.Generic
** ---> System.Collections.Specialized
** ---> System.Diagnostics
** ---> System.Diagnostics.CodeAnalysis:ExcludeFromCodeCoverageAttribute
** ---> System.Globalization
** ---> System.Linq
** ---> System.Runtime.CompilerServices
        // CompilerGeneratedAttribute, ExtensionAttribute,
        // IteratorStateMachineAttribute, RuntimeHelpers
** ---> System.Text
** ---> System.Text.RegularExpressions
** ---> -:<>f__AnonymousType*
        // Anonymous types, e.g. in Linq expressions
** ---> -:<PrivateImplementationDetails>/*
        // the C# compiler emits such things from time to time

** ---> JetBrains.Annotations
        // If you use ReSharper's [NotNull], [CanBeNull]

(**) ---> \1
        // A namespace can use all of itself
        // (but excluding child and parent namespaces)
And instead of copy-pasting these rules everywhere, it makes sense to put them into a file e.g. named StandardRules.dep, and then include this file from other rule files (this is done with a simple + line). The ArchichectRules.dep rule file now looks as follows:
$ DOTNETITEM ---> DOTNETITEM

+ StandardRules.dep

Archichect.Reading      ---> Archichect
Archichect.Transforming ---> Archichect
Archichect.Rendering    ---> Archichect
Running Archichect a last time for this posting, we are still left with 6155 violations, which I leave for the next posting.
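The include mechanism can be pictured with a small Python sketch that flattens a rule file and everything it includes into one list of lines. Only the "+" include line is taken from the text above; the in-memory files dictionary standing in for the file system, the skipping of blank and "//" comment lines, and the guard against repeated includes are assumptions for illustration, not a description of how Archichect reads its files.

```python
def expand_includes(name, files, seen=None):
    """Return the rule lines of `name` with all "+" includes expanded in place.

    files maps a (hypothetical) file name to its text content.
    """
    seen = set() if seen is None else seen
    if name in seen:
        return []  # already included once; avoids endless recursion
    seen.add(name)
    expanded = []
    for line in files[name].splitlines():
        stripped = line.strip()
        if stripped.startswith("+"):
            included = stripped[1:].strip()
            expanded.extend(expand_includes(included, files, seen))
        elif stripped and not stripped.startswith("//"):
            expanded.append(stripped)
    return expanded


if __name__ == "__main__":
    files = {
        "StandardRules.dep": "// allowed everywhere\n** ---> System\n** ---> System.Linq\n",
        "ArchichectRules.dep": (
            "$ DOTNETITEM ---> DOTNETITEM\n\n"
            "+ StandardRules.dep\n\n"
            "Archichect.Reading ---> Archichect\n"
        ),
    }
    for rule in expand_includes("ArchichectRules.dep", files):
        print(rule)
```

Expanding in place keeps the order of the rules as written, which matters if a tool evaluates rules in sequence; whether Archichect does so is not stated here.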

But let me add an important, if obvious remark: In the process above, I fell deep into the claws of the descriptive model, i.e., into "what is".
But of course, with any software, there will always be some more prescriptive rules, i.e. "what we want", or at least "what we expect". It makes a lot of sense to write down such rules early.

Usually, a large part of the software will follow these rules, or rules that are just a little bit different from them: And during such an "architecture reconstruction", the rules, but soon also the software, can change so that the prescriptive and the descriptive models start to converge. But how do we make sure that they do not start to diverge the next day? I'll think a little bit about this in yet another posting.

Note 1: Of course, there are situations where these "minimal dependency rules" need tuning, e.g. for portable libraries. I trust that you can adapt them yourself in such cases.

A first example of rules - introducing Archichect

It's high time for a few examples. I'll draw them from two programs:
  • First, the medium-sized "Archichect" tool that I wrote for checking and exploration purposes (with substantial input some years ago from Thomas Freudenberg; and requests and ideas by colleagues at Pharmatechnik in Germany, my employer). Archichect is right now a proof-of-concept software, which means, among other things, that I change anything quite freely on a daily basis (if you nevertheless want to take a look, you can search it on GitHub and look into it).
  • Second, the large flagship IXOS product of our company, which I will use later to demonstrate large-scale architectural aspects.
Let me start with a small example from the prescriptive architecture of Archichect. Here is a rough sketch of some part of its intended (therefore, prescriptive) architecture:


Obviously, this is a "type 1 sketch": It is a mixture of definitive information ("there are three subpackages called Reading, Transforming, and Rendering") and illustrative, but incomplete information ("There is a class called DependencyChecker implementing interface ITransformer, but probably there are more implementors of this interface"). There is nothing at all wrong with such diagrams, except that one must be extremely careful when drawing conclusions from them: The "existential assumption" is usually ok ("All the things in the diagram will be there in reality"), but even this is sometimes risky ("oh, I just meant that as an example").

So let us write some rules for this part of Archichect that lift the information from the diagram to a "type 3 declaration". Here they are, in Archichect syntax:
$ DOTNETITEM ---> DOTNETITEM

(**)                    ---> \1
Archichect.Reading      ---> Archichect
Archichect.Transforming ---> Archichect
Archichect.Rendering    ---> Archichect
(You are not happy with the notation?—you would like to express this in code? Well, as I said, I am not too stubborn about notation, so I will show how to write the same rules with code in one of the next postings).

One can actually run these rules over Archichect itself and get a nice and very long result telling us that Archichect (in its current version) has 35505 dependencies, 26690 of which violate the rules above: "Sad", as some well-known guy would tweet. Obviously, transferring an architectural diagram to strict rules requires a little more work than just more or less faithfully copying it to text; but it's only a little more, I promise (and will show you).

Before that, however, let me explain the rules above, and the assumptions behind them, a little more.

First of all, the diagram did not spell out how the concept of a UML package is mapped to the language. Most modern languages have at least one concept of nestable groups for naming things; for example, Java has packages, and the .Net languages have namespaces. In addition to these logical nestable constructs, the physical units of runtime environments, e.g. JAR files or .NET assemblies, are typically named with filenames, which can again use a hierarchical naming concept. For example, in .Net, there are assemblies named System.dll, System.Threading.dll, System.Threading.Tasks.dll, System.Threading.Tasks.Dataflow.dll and System.Threading.Tasks.Parallel.dll etc.

In Archichect, I chose the standard approach of mapping packages to the naming concept of the implementation language, i.e., to .Net's (and C#'s) namespaces. Thus, I would put
  • Program, Item, Dependency as well as the three base interfaces into namespace Archichect;
  • DotNetAssemblyReader in namespace Archichect.Reading,
  • DependencyChecker in namespace Archichect.Transforming, and
  • ViolationsWriter in namespace Archichect.Rendering.
The type DOTNETITEM defined by Archichect's DotNetAssemblyReader nicely defines the fields Namespace, Class, Assembly.Name, Assembly.Version, Assembly.Culture, and Member.Name, and therefore the rules above refer to the Namespace field without any further syntactical ado.
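As an illustration of these fields, the following Python sketch splits an item string into a small record. The textual layout (colon-separated fields, with the three assembly parts separated by semicolons) is inferred from the violation output that Archichect prints, e.g. DOTNETITEM:System.Diagnostics:DebuggerDisplayAttribute:mscorlib;4.0.0.0;:.ctor; the names DotNetItem and parse_item are hypothetical and not part of Archichect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DotNetItem:
    """Hypothetical record mirroring the DOTNETITEM fields named in the text."""
    namespace: str
    class_name: str
    assembly_name: str
    assembly_version: str
    assembly_culture: str
    member_name: str


def parse_item(text):
    """Parse 'DOTNETITEM:Namespace:Class:Name;Version;Culture:Member'.

    Assumption: none of the fields contains ':' or ';' itself.
    """
    item_type, namespace, class_name, assembly, member_name = text.split(":")
    if item_type != "DOTNETITEM":
        raise ValueError("unsupported item type: " + item_type)
    assembly_name, assembly_version, assembly_culture = assembly.split(";")
    return DotNetItem(namespace, class_name, assembly_name,
                      assembly_version, assembly_culture, member_name)


if __name__ == "__main__":
    item = parse_item(
        "DOTNETITEM:System.Diagnostics:DebuggerDisplayAttribute:mscorlib;4.0.0.0;:.ctor"
    )
    print(item.namespace, item.class_name, item.assembly_name, item.member_name)
```

Seen this way, a rule pattern without colons addresses the first field (the namespace), and each additional colon moves one field to the right.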

Still, an alternative architecture decision could be to distribute the classes and interfaces into multiple assemblies, and then, the rules would have to be written differently, e.g. (there are more possibilities) as follows:
$ DOTNETITEM ---> DOTNETITEM

::(**)                    ---> ::\1
::Archichect.Reading      ---> ::Archichect
::Archichect.Transforming ---> ::Archichect
::Archichect.Rendering    ---> ::Archichect
The two colons indicate that the strings are assumed to refer to the third field of DOTNETITEM, i.e., the Assembly.Name field. Assembly rules collect, at least in .Net, also important architectural information and should therefore be part of the "rule set"; but I will ignore them for the moment and continue with namespace-based rules.

Secondly, regarding the diagram, it does not say anything about any dependency restrictions inside each package. A typical implicit assumption is that, on this level of granularity, each item inside a package may use any other item in the same package. To allow such dependencies, the rule
(**) ---> \1
is added. The \1 notation here is borrowed from regular expressions (and actually, internally, the rule checking is mostly done by creating regexes from the rules and then matching the read-in dependencies against them).
This Archichect rule says that items from some namespace can use items from exactly the same namespace, but not from a child or parent namespace. We will see in a later posting that this would, in some cases, prevent the useful organization of namespaces, and therefore the actual architecture rules of Archichect are somewhat different. For the moment, we leave it at that.
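The regex idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Archichect's actual translation: it only handles namespace patterns, it assumes that "**" matches any run of characters including dots and that a trailing ".**" also matches the parent namespace itself, and the function names pattern_to_regex and rule_allows are made up for this example.

```python
import re


def pattern_to_regex(pattern):
    """Translate a simplified namespace pattern into a regex string."""
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith(".**", i):
            out.append(r"(?:\..*)?")   # the namespace itself or any child
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")           # any namespace path
            i += 2
        elif pattern[i] in "()":
            out.append(pattern[i])     # keep capture groups as they are
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


def rule_allows(rule, using, used):
    """Check one 'left ---> right' rule against a (using, used) namespace pair."""
    left, right = (part.strip() for part in rule.split("--->"))
    match = re.fullmatch(pattern_to_regex(left), using)
    if match is None:
        return False
    # Replace \1, \2, ... on the right with the text captured on the left.
    resolved = re.sub(r"\\(\d)", lambda m: match.group(int(m.group(1))), right)
    return re.fullmatch(pattern_to_regex(resolved), used) is not None


if __name__ == "__main__":
    same = r"(**) ---> \1"
    print(rule_allows(same, "Archichect.Reading", "Archichect.Reading"))  # same namespace
    print(rule_allows(same, "Archichect.Reading", "Archichect"))          # parent namespace
    print(rule_allows("** ---> System", "Archichect", "System.IO"))       # sub-namespace
```

The first call yields True and the other two yield False, which mirrors the statement above: the back-reference admits exactly the same namespace, but neither its parent nor its children.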

Still, we should understand why our apparently so useful rules gave us so many invalid dependencies, shouldn't we? That's some stuff for another posting.

Wednesday, May 3, 2017

Purposes of architectural documentation disentangled

I have been a little unfair in my last posting: The eight pages on UML 2.0 in Gorton's "Essential Software Architecture" are more than a mere advertisement for that (then) new UML version 2.0; they do actually contain some core advice about how to document architectural aspects of a program. I'll try to extract a compact view of what architecture documentation is, in Gorton's view and, I think, that of the mainstream architecture textbooks, from these pages and the case study in chapter 7.

First of all, architecture documentation is a collection of artifacts for human beings only. This is in contrast to code, which is targeted both at the "machine" and at human readers. In the background, there looms the idea of model-driven architecture, where an architecture model is used to create code—essentially, a compiler for a new language on some "higher" level than standard programming languages. However, like the book, I will disregard this aspect right now and return to it somewhat later.

The clear target of providing information to humans has led most of us to the use of informal diagrams and standard prose to describe the architectural aspects of a software: "simple box-and-arrow diagrams", as Gorton calls them. He claims that there is "an appropriate diagram key to give a clear meaning to the notation used" in his examples, but most diagrams in his chapters 1 to 5 don't have such a key, and in any case, most people drawing such diagrams don't include one. The problem with this is that any plan to derive hard facts from such diagrams is then doomed.

Now, one purpose of architecture documentation is to give someone a "feeling of the interplay of things", and for this purpose, informal diagrams with textual or oral explanations are perfectly fine and, I am quite sure, even preferable: They appeal to our intuitive approach to most problems, which includes working with somewhat unclear terms and their relations in order to limit thinking about tricky consequences, so that our mind is free to "suck in the universe" of the problem area at hand.

Maybe it should be noted that formal clarity, precise meaning and even "simple" (mathematical) consistency entail, in almost all cases, "hard thought work", as the history of mathematics has shown:
  • Geometry in the plane seems like an easy subject, until you start trying to understand its base and algorithms from Euclid's axioms and definitions, well over 2300 years old: There is nothing easy with concepts like parallels or ratios of line segment lengths! And later formalizations, mainly from about the 1800s onwards, are even more intricate.
  • The other, apparently so "simple" basis of mathematics, namely the natural numbers, lost its simplicity also in ancient times with some prime number theory by the Greeks. It was and is by no means obvious what can emerge from simple addition and multiplication, let alone from the algebraic structures and formalizations extracted in the 19th century, leading to Gödel's mind-bending encodings and Turing's work.
Let me state this in my "Axiom 1": Mathematics, by and large, is not what we want in software documentation (and that from me, who majored in theoretical computer science ...).

Still, it seems we all want something more than the informal box-and-arrow-diagrams.

Gorton, like many others, proposes the use of UML. I cannot help the feeling that he is not really happy about it. The summary of chapter 6 has the following two sentences:
I’m a bit of a supporter of using UML-based notations and tools for producing architecture documentation. The UML, especially with version 2.0, makes it pretty straightforward to document various structural and behavioral views of a design.
"A bit of a supporter", "pretty straightforward": This does not really sound like wholehearted endorsement.

So, what is the problem?

The problem is, in my humble opinion, that there is no clear picture of what a notation for architectural documentation should do. The described use-cases typically oscillate between a "better notation" for those informal, easily comprehensible overviews over some aspects of a software system, and a more formal notation that can help derive hard knowledge about a system, with that implied goal of "generating code" in model-driven approaches.

I am, after many years in the field, now certain that we have to structure the use cases for architectural documentation in a threefold classification, with different notations for each area:
  1. Informal documentation, from which humans can learn easily and intuitively gather a common understanding and a useful overview about some aspects of the system. In the best case, such a documentation is part of a common culture about "how we name and see things." However, this documentation is not intended to derive any hard facts: Everything shown can be disputed and discussed and viewed differently, and the notation can be extended at will if it helps with that intuitive understanding. All must agree that formal arguments based on such documentation are futile and hence must be avoided.
  2. Formally sound and precise documentation that can be used to derive invariants and definitive properties of the documented system. If such documentation is used as the basis for a tool-supported model-driven approach, then there is no difference between a descriptive and a prescriptive architectural documentation for the aspects covered by the process. However, such an approach is very expensive in more than one respect:
    • First, especially without full tool support, keeping such a documentation in line with the system is much work, as even tiny changes on one or both sides require precise updates.
    • Second, as software can exhibit very complex behavior, the notation must be capable of describing many and, usually, deep concepts, which makes it hard and "mathematical" to understand and even harder to write. Such documentation therefore blatantly contradicts "Axiom 1".
    • Last, on a conceptual level, it is not really clear that such a documentation is actually "documentation" in the sense of "humanly accessible information relevant for many decisions in the software life-cycle". Rather, it might be more of a formal specification or even—when used in a model-driven process with code generation—part of the implementation, albeit (maybe) on some higher or "more compact" level than standard programming languages.
Thus, rich informal and deep formal notations are not sufficient for documenting and arguing about architectural aspects of a software.
  3. Therefore, we need notations that are somewhere in-between: Not informal, so that they can be used to derive and ensure hard facts. But equally, they must be easily usable so that they can be read and written by the average software engineer under average project circumstances. It should be obvious that this type of notation cannot be very rich and also not very abstract. Only then can it avoid, on the one hand, requiring an extensive semantics for formal derivations and, on the other hand, being too esoteric to be used for understandable documents. In other words, it must be a quite mundane notation. I'll show my preferred notation for this, and its uses, in later postings (just in case you think that this looks a little like the search for the holy grail).
UML, incidentally and unfortunately, does not work really well for any of these purposes if its complex semantics is taken seriously:
  1. For an informal notation, it carries too heavy a backpack of that formal semantics which no-one wants to remember when drawing informative diagrams in a running text (as, e.g., in the case study in Gorton's book).
  2. For a formal notation, it is too indirect: One needs to map UML propositions back to the underlying semantic model (like Petri nets or state machines), and only then can one formally draw conclusions; as far as I can tell, the number of publications that use UML as a formal base has declined quite a bit over the last years.
  3. Finally, as a simple but yet strict notation, UML is much too baroque, because it was lobbied to include every useful diagram and icon. This large notational size would recommend it for many different informal diagrams—if it weren't for that formal semantics ballast ...
But even if you think that UML does work well (or well enough) for one area, there is the danger of misinterpreting UML diagrams: Is a diagram which your team uses as a basis for a decision a "type 1." diagram? Then it conveys informal concepts, but does not limit the decision strictly or formally. A "type 2." or "type 3." diagram, on the other hand, would narrowly limit some choices you can make, and would definitely require a formally (for "type 2.") or at least collectively (for "type 3.") approved update of the diagram for any change in the software or the architecture. But most diagrams do not spell out explicitly their "conformance level".

Nonetheless, our analysts and some of our developers and architects (including me) are happy enough to use UML as a pool of symbols for sketching explanatory diagrams that help us to keep our complex machinery at least somewhat documented. So yes, I am, and we are also "a bit of a supporter of using UML-based notations and tools", as Ian Gorton puts it.

But now, I feel, I am starting to owe you an explanation of how to do architectural documentation better. The next posting ... well, after I wrote it, it turned out to still consider some general observations about software architecture and how we deal with it.