Saturday, December 31, 2011

Resolutions for 2012

So, after waking up from a great 12h sleep and taking a refreshing shower, I think it's a good moment to sit down and think about what I want to accomplish in the coming year. I've got a few technologies I would like to learn. My primary goals will be:

  • learn Clojure and write some bigger-than-hello-world app with it. I intended to learn it this year but didn't manage to. I really liked the Haskell course at the university and enjoy using tools such as Google Guava that bring a bit of functional programming into Java. Clojure seems to have an active community and integrates nicely with Java, so one day it might become useful in my daily work. And learning a programming paradigm other than OO is supposed to make me a better software developer :)
  • learn JavaScript. That's one of my dark secrets - I don't know JavaScript at all (except for things that you use with JSF, like alert or confirmation dialogs, but that doesn't count). I suppose it's a secret I share with many other Java devs who don't do much front-end development. Anyway, JS seems to be becoming incredibly popular nowadays, so it's high time for me to catch up with the mainstream.
  • Create a kind-of-real-world app using Event Sourcing (and possibly NoSQL). I have been reading/watching about CQRS and ES all the time and feel ashamed I haven't built an app using ES by now. We're now starting to have CQRS and events in our projects at work but there's still a long way to go to get to ES (suffering from the poison pill architecture that CQRS with events and ORM is). So - off to work. I even have an idea for such an app but have to think about it some more.
  • Become proficient with git. I'm using git for my pet projects, code katas etc. at home but never got beyond SVN-like usage. I've read a couple of good tutorials about git and now have to get deeply familiar with it, learn how to do feature branching etc. Git seems to be the future of VCSs so I can't afford to stay far behind.
  • Create a roguelike game. I'm a long time lurker at RGRD but haven't started even a hello-dungeon roguelike project. Enough of theory, time to get some @ running!

That would be it for my primary objectives. Hopefully I will get the mission completed during the next 12 months.
There's also a couple of other things I would like to do but they are not that high on my todo list:

  • learn Scala. It seems to be a cool, JVM-based language that is gaining much attention right now. Additionally there's Akka that I would like to learn (Alternatively I might try learning Erlang).
  • finally read the whole Java Language Specification. I have read excerpts from it but never all of it. Time to fill in the gaps in my knowledge about Java.
  • go through Spring 3 documentation and see the new and shiny things they've added since 2.5.6. I was following the release notes but don't think I have a good understanding of what has been added/changed.
  • try to do some development for the cloud. Just to start and learn the new possibilities and fallacies.

Some people say you should be creating habits, not a list of resolutions. Therefore I publicly admit that I want to develop the following habit this year: doing code katas daily. I tried to do it at some point this year but failed. I'm still doing katas from time to time but not regularly. And as an aspiring software craftsman I need to work on my skills every day.

Have a happy new year!

    Friday, December 30, 2011

    2011 Summary

    So 2011 is coming to an end and it might deserve a few minutes of consideration, a short analysis of what has happened during the last twelve months. A few points I can think of right now:

    • in April I went to the wonderful DDD&CQRS class run by Greg Young. Many thanks to my company for sponsoring my trip to Kraków (not that it was expensive - I would probably have attended even if I had to pay for it myself)! Since then we have started changing our architecture. It's still in progress and there might be a need to fight some resistance from above before we can implement more innovation but I hope it will go well.
    • as a result I signed up for Twitter. I was skeptical at first but it turned out to be an amazing tool for continuous learning and an almost complete replacement for my RSS reading list. I also found some brilliant folk to follow and what they tweet really affects my everyday work. Thanks tweeple! From my side I think I could tweet more and participate in some discussions (though they often happen when I'm asleep - damn you, Central European timezone!)
    • I also started blogging. Unfortunately I didn't write many posts. I should probably do it more often, in order to practice my written English if nothing else. That leaves some room for improvement in 2012.
    • As for my everyday work - I've become a team lead. That's not really something I enjoy but I'm trying not to flee from this responsibility - someone has to do the job. I think I'm slowly getting used to it and it's causing a bit less pain than it initially did. And I get more possibilities to change the way the team I am part of works, to improve it.
    • Oh, almost forgot - I also changed teams. The last team got its size decreased because of the BAs' inefficiency (not enough work for the devs). I joined one of the Hospital Information System teams, then switched to the team lead position in another one a few months later. I wasn't really happy about people around me changing so rapidly but now I'm okay with my new place.
    Okay, that would be it. Just a short summary of things I could think of. That wasn't a bad year, had many positive changes, some bad ones, but I can't really complain about it. Hopefully 2012 will be even better.

    Wishing everyone that might stumble upon this post a better new year! (especially those in Samoa for whom it's coming sooner than usual)

    Monday, December 12, 2011

    Why my job sucks? (rant)

    Fhtagn!

    It's been a long time since I've written anything here. I'm not a very extroverted guy so I rarely feel a need to write about what I feel/think and most of the time a short tweet is more than enough. However, tonight I feel a big urge to rant about why I think my current job sucks. I wonder whether that means I'm getting burned out? Hopefully not...

    1. I don't have a feeling of progress, of learning, any more. I've been with the company for almost 3.5 years and until recently (half a year maybe?) I felt I was learning a lot every day. Now I feel like I'm doing the same, repetitive tasks every day, and the first thought I get in the morning is "I wish I could get another hour of sleep" instead of "Yay, time to get some great stuff done".
      Maybe that was because I was working in a slow-paced project in which I had enough slack to allow some innovation? And that's funny because in that project I was complaining about not having enough to do. Now I see it was probably what gave me a chance to try out new things, to spend a lot of time on fixing stuff that had been done wrong etc.
    2. I don't get a chance to come up with complete solutions. That's because we've got an internal framework that's supposed to deliver every bit of technical solution we might need to create our business components. While it might seem that not having to deal with infrastructural stuff is an advantage (and I know some people who just hate such tasks), for me it makes my tasks feel "incomplete". I like to have the right balance between purely technical tasks and modelling my domain, and now one part has been taken away from me. And I don't want to move to the framework team because (of course that's just the tip of the iceberg) it would just make me work on technical solutions, separated from any business context (and that sucks terribly - I strongly believe that frameworks - at least internal ones - should be pulled out of existing code, not pulled into it).
    3. I'm a team leader now. And I sincerely don't like it. I have to do some things (reports, keeping an eye on some things, meeeeeeetings) which just feel like a waste of time for me. And that time cannot be spent on doing some useful programming work. I really enjoy being a technical lead, I love architecture design tasks, but I really hate to sit in meetings that often have no outcome that would be in any way important to the team I'm part of.

    Okay, the rant is over now. I could probably carry on but now my frustration is gone, replaced by sleepiness, and now I just want to go to bed and have those 7 hours of sleep before getting up again and thinking... will not repeat myself.

    May tomorrow be better than today and may the stars be right!
    /|\(°,,°)/|\

    Tuesday, July 12, 2011

    FizzBuzz of Doom

    Today a colleague had to interview a bunch of students (in their 3rd or 4th year at the university, I believe) wanting to do their internship at our company. Just before he went to talk to them I mentioned the FizzBuzz problem (huh?) as a nice example of a simple screening question and he decided to put it to the candidates.
    When I first heard about it I couldn't really believe this could be a problem for anyone (I suppose I was in my 3rd year of studies at the time). Today I got hard proof that I was terribly wrong. Out of 6 students only 2 gave an acceptable answer to the problem.

    Ouch! I don't really know what to think about it. The thought that FizzBuzz can actually be a problem for CS students makes me feel uneasy. Or maybe there's nothing to worry about? Time will tell.

    Tuesday, July 5, 2011

    Coding kata

    The day before yesterday I did a coding kata (what is it? a short programming exercise that you ideally perform daily: write some code and throw it away - for a more detailed explanation see here) and decided to start doing it daily. It was Roy Osherove's TDD kata 1. On Sunday I did it in Java; it took me about 20 minutes to complete (including the advanced part). Yesterday I did it again, this time in Python.
    I learned a bit of Python while studying and wrote a couple of simple networking apps in it (curses-based IM, Tkinter mail client etc), but that's where my adventure with this fine language ended. Since then I've used it only for small scripts to handle boring tasks, and eventually my knowledge of it has faded.
    The basic scope of the String Calculator kata took me 1h to complete! I felt quite ashamed of my performance so I decided to re-learn Python. There's a couple of different katas I want to try (including Uncle Bob's Bowling Kata), and I have downloaded JetBrains' PyCharm IDE to get decent tooling support (especially documentation, as I have forgotten most of the standard Python library functions). The experience I had with it yesterday was satisfying and I think I will continue evaluating it and learning Python in the process.
    Hopefully I have enough self-discipline to do it!

    Tuesday, June 28, 2011

    My CQRS training

    I've run an internal CQRS training for my team mates. Took 2 hours but we didn't manage to go through all topics I wanted to present. Hallway & canteen to the rescue - had another hour talking about CQRS and event sourcing. Here are my slides: http://bit.ly/kADhjZ.


    I suppose it's now time for a real world CQRS+ES project, huh? :D

    Wednesday, April 20, 2011

    Notes from DDD & CQRS training - day 3

    Here's the last part of my notes from Greg Young's class in Poland. Enjoy!

    Push vs Pull integration
    Pull is:
    Accounting <-- getAccountBalance() --- Sales
               -------- balance --------->
    Push is:
    Accounting -- AccountBalanceChanged--> Sales



    • With the pull model we get tight coupling of the systems. With push, systems are loosely coupled. The system receiving events uses denormalizers to build whatever structural model it needs.
    • Why is push generally better than pull?
      • What if we put Accounting system in Poland and Sales in South Africa? 
        • with pull, performance will suck, as every query has to call the remote Accounting system
        • with push, performance won't be affected, as Sales will build its own view model that can be queried without calling the Accounting system
      • weakest link antipattern will hurt systems using pull integration
      • web services (think -> pull) cause Bounded Contexts boundaries to blur - my team needs to understand how other applications look at my system
      • push reduces coupling between project teams - we don't have to wait for other teams to implement their functionality
      • doing push means that we don't pollute our system with concepts of other systems
      • replacing a system with a new one
        • hard in PULL model (have to support how everyone sees our system)
        • easy in PUSH (have to support only events)
      • push keeps us from having a huge, messy canonical model
    • With push integration we apply the same pattern we did for aggregates - reducing coupling through redundancy
    • When can pull be beneficial?
      • when complex calculations must be performed on the data and we don't want to put such logic in every system
      • data from other system is vital for the business
      • it's hard to emulate PUSH with an adapter on top of another system
    • out of events coming from other systems we can build any possible structural model we need
    • a system publishes a language other systems can listen to
    • PUSH should be the default integration model
    • we can degrade our SLAs in order to achieve higher uptime
      • it's better to degrade the SLA than to be down
      • having errors is often better than being down
      • we introduce eventual consistency
        • if risk goes too high because of stale data the business can hit the red button to bring the system down                 
    • people are afraid of push integration because they are control freaks
      • they like to have a central point that manages everything
    • sending heartbeat messages ("hey, i'm still alive") to let other systems know that we're running fine so that they can act accordingly in case we are down
    • with push we can do remote calculations without pissing off the users
    • push makes eventual consistency explicit (we still have it implicit in PULL but prefer not to think about it)
    • doing push == applying OO principles between systems
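A minimal Python sketch of the push model from the notes above. It assumes an in-process `EventBus` as a hypothetical stand-in for real messaging middleware; the class names (other than the `AccountBalanceChanged` event) are made up for illustration and do not come from the training. Accounting only publishes events, and Sales keeps its own denormalized view that it can query locally:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccountBalanceChanged:
    """Event published by Accounting (name taken from the diagram above)."""
    account_id: str
    new_balance: int


class EventBus:
    """Hypothetical in-process stand-in for real messaging middleware."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def publish(self, event):
        for handler in self._subscribers:
            handler(event)


class Accounting:
    """Owns the balances and announces every change; knows nothing about Sales."""

    def __init__(self, bus):
        self._bus = bus
        self._balances = {}

    def deposit(self, account_id, amount):
        balance = self._balances.get(account_id, 0) + amount
        self._balances[account_id] = balance
        self._bus.publish(AccountBalanceChanged(account_id, balance))


class SalesBalanceView:
    """Denormalizer on the Sales side: builds the local model Sales needs."""

    def __init__(self):
        self.balances = {}

    def handle(self, event):
        self.balances[event.account_id] = event.new_balance

    def can_afford(self, account_id, price):
        # Answered from local (possibly slightly stale) data, no call to Accounting.
        return self.balances.get(account_id, 0) >= price


bus = EventBus()
accounting = Accounting(bus)
sales_view = SalesBalanceView()
bus.subscribe(sales_view.handle)

accounting.deposit("acc-1", 100)
accounting.deposit("acc-1", 50)
print(sales_view.can_afford("acc-1", 120))  # True: the local balance is 150
```

Note that `can_afford` never touches `Accounting`, so Sales keeps answering even if Accounting is slow or down; the price is that the answer may be based on slightly old data.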


    Versioning "is dead simple"

    • wouldn't it be easy if we only added things?
    • Let's consider version 1: 
    class InventoryItem {
      void deactivate() {
        // ...
        Apply(new InventoryItemDeactivated(id));
      }
    }
    class InventoryItemDeactivated:Event{
      public readonly Guid id;
      InventoryItemDeactivated(Guid id){...}
    }
    • We'll move to version 2:
    // don't change existing event!
    class InventoryItemDeactivated:Event{
      public readonly Guid id;
      InventoryItemDeactivated(Guid id){...}
    }
    // instead just copy & paste & rename:
    public class InventoryItemDeactivated_V2:Event{
      public final Guid id;
      public final String comment;
      InventoryItemDeactivated_V2(Guid id,String comment){...}
    }
    class InventoryItem {
      void deactivate(String comment) {
        if(comment == null)
          throw new ArgNullEx();
        Apply(new InventoryItemDeactivated_V2(id, comment));
        // ...
      }
    }

    • copy & paste the apply() method to handle V2 event
      • but - as no business logic needs the comment, we don't even copy it into the aggregate
    • what about V57? gets a little dirty...
    • new version of event is convertible from the old version of event
      • if i can't transform v1 to v2 it's not the same event type!!!!
      • new fields get default value in case of old version events
      • let's have a method that converts event to newer version
    static InventoryItemDeactivated_V2 convert(
      InventoryItemDeactivated e){
      return new InventoryItemDeactivated_V2(e.id, "BEFORE COMMENTS");
      // or another default value
    }
      • now we can delete code that deals with old versions of events
    • we have to version our commands with exactly the same pattern
    class DeactivateInventoryItem:Command{
      public final Guid itemId;
      public final int originalVersion;
      // constructor...
    }
    class DeactivateInventoryItem_V2:Command{
      public final Guid itemId;
      public final int originalVersion;
      public final String comment;
      // constructor
    }
    //let's jump into command handler:
    [Deprecated("13/04/2011")] // pseudocode marker: old handler, to be deleted later
    public void handle(DeactivateInventoryItem m) {
      var item = repo.getById(m.itemId);
      item.deactivate(""); // V1 carries no comment, so pass a default
    }
    public void handle(DeactivateInventoryItem_V2 m) {
      var item = repo.getById(m.itemId);
      item.deactivate(m.comment);
    }

    • we don't need any support for versioning in our serialization infrastructure
    • generally we keep 2-3 versions of a command and delete old versions (both handler and command) after some time
      • "how many of you test your web pages with IE4? why? don't you wanna support them?"
    • keeping multiple versions running concurrently lets the clients do the transition
    • we never change events!! 
      • we add a new event
      • a deleting change example: v3 without the comment:
    class InventoryItemDeactivated_V3:Event {
      public final Guid id;
      // removed: public final String comment;
      InventoryItemDeactivated_V3(Guid id){...}
    }

    //in the convert() function just don't copy the comment!

    • snapshots (using memento pattern):
      • do it like commands - add a new handling method and keep it until it's no longer needed, then delete it
    • to prevent events & commands from being changed
      • don't write them, generate them from XSD
      • use some tool to detect changes made to XSD and reject checkins
    • bigger problem: we realize that our aggregate boundaries were wrong, what now?
      • write a little script to break events apart:
      • build the original aggregate, build a new aggregate from it and save it (keep the reference (id) to the old aggregate)
      • this is an annoying task but it doesn't happen very often
      • keeping the reference to the original aggregate helps other systems integrated in the PUSH way (like our read model?) keep their model intact
    • prefer flat events over those containing little data objects - this is a trade-off between coupling and duplication 
      • it's harder to measure coupling than duplication so normally we don't see those problems
      • most of the time we introduce coupling to avoid duplication because duplication is easier to spot
      • flat events don't have problems when a data object definition changes (how would we version that?)
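The convert() idea from the versioning notes above can be chained, so that stored events of any old version are upgraded step by step when they are read. A minimal Python sketch, assuming a hypothetical `CONVERTERS` registry and an `upgrade` helper (neither is from the training; the event names follow the notes):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InventoryItemDeactivated:
    id: str


@dataclass(frozen=True)
class InventoryItemDeactivatedV2:
    id: str
    comment: str


@dataclass(frozen=True)
class InventoryItemDeactivatedV3:
    id: str  # the comment was removed again in V3


def v1_to_v2(event):
    # A new field gets a default value when converting an old event.
    return InventoryItemDeactivatedV2(event.id, "BEFORE COMMENTS")


def v2_to_v3(event):
    # A deleting change: just don't copy the comment.
    return InventoryItemDeactivatedV3(event.id)


# Hypothetical registry: event type -> converter to the next version.
CONVERTERS = {
    InventoryItemDeactivated: v1_to_v2,
    InventoryItemDeactivatedV2: v2_to_v3,
}


def upgrade(event):
    """Apply converters until the event is at its latest known version."""
    while type(event) in CONVERTERS:
        event = CONVERTERS[type(event)](event)
    return event


stored = [
    InventoryItemDeactivated("item-1"),
    InventoryItemDeactivatedV2("item-2", "broken"),
]
print([upgrade(e) for e in stored])
```

With something like this in front of the aggregate, only the handler for the newest version has to stay in the domain code; the stored events themselves are never rewritten.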

    Merging


    • how to get an optimal level of concurrency?
      • merging prevents most of the problems with optimistic concurrency

    public class MergingHandler<T> : Consumes<T> {
      public MergingHandler(Consumes<T> next) {...}
      public void consume(T message) {
        // events committed since the version the client based its command on
        var commit = eventStore.getEventsSinceVersion(
          message.AggregateId, message.ExpectedVersion);
        foreach(var e in commit) {
          if(conflictsWith(message, e))
            throw new RealConcurrencyEx();
        }
        next.consume(message);
      }
    }

    • doesn't comparing commands to events seem wrong?
      • duplicates the business logic from the domain (aggregate)

    // following code assumes usage of UOW
    public class MergingHandler<T> : Consumes<T> {
      public MergingHandler(Consumes<T> next) {...}
      public void consume(T message) {
        var commit = eventStore.getEventsSinceVersion(
          message.AggregateId, message.ExpectedVersion);
        next.consume(message);
        foreach(var e in commit) {
          // events that have been created by the aggregate during the operation
          foreach(var attempted in UnitOfWork.Current.PeekAll()) {
            if(conflictsWith(attempted, e))
              throw new RealConcurrencyEx();
          }
        }
      }
    }


    • we can often have general rules for generic conflict detection, like:
      • events of same type tend to conflict
    • unfortunately, the above example still misses an important thing...

    public class MergingHandler<T> : Consumes<T> {
      public MergingHandler(Consumes<T> next) {...}
      public void consume(T message) {
        BEGIN: // label placed before the try so the goto in the catch block can reach it
        try {
          var commit = eventStore.getEventsSinceVersion(
            message.AggregateId, message.ExpectedVersion);
          next.consume(message);
          foreach(var e in commit) {
            foreach(var attempted in UnitOfWork.Current.PeekAll()) {
              if(conflictsWith(attempted, e))
                throw new RealConcurrencyEx();
            }
          }
          //normally that would be in another cmd handler:
          UnitOfWork.Current.commit();
        }catch(ConcurrencyException e) {
          // someone else committed in the meantime: re-read the events and try to merge again
          goto BEGIN; // don't do that in production :)
        }
      }
    }

    • this is simple because we store events - try doing it on sql database with current state data!
    • in case of conflict rules that are not generic but domain-specific we usually add a conflictsWith(Event another) method on the event
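For comparison, the same merging idea can be written as a bounded retry loop instead of a goto. This is a simplified Python sketch under stated assumptions: the in-memory store, the exception names and `commit_with_merge` are all hypothetical, and the events the aggregate produced are passed in directly instead of re-running the command on every attempt as the handler above does. Conflict detection uses the generic rule from the notes (events of the same type tend to conflict):

```python
from dataclasses import dataclass


class ConcurrencyError(Exception):
    """The stream moved on since the version the writer expected."""


class RealConflictError(Exception):
    """The concurrently committed events really conflict with ours."""


@dataclass(frozen=True)
class ItemCreated:
    pass


@dataclass(frozen=True)
class ItemRenamed:
    name: str


@dataclass(frozen=True)
class ItemDeactivated:
    pass


class InMemoryEventStore:
    def __init__(self):
        self._streams = {}

    def events_since(self, aggregate_id, version):
        return list(self._streams.get(aggregate_id, [])[version:])

    def append(self, aggregate_id, expected_version, events):
        stream = self._streams.setdefault(aggregate_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError()
        stream.extend(events)


def conflicts_with(attempted, committed):
    # Generic rule: events of the same type tend to conflict.
    return type(attempted) is type(committed)


def commit_with_merge(store, aggregate_id, expected_version, attempted, max_retries=3):
    """Append `attempted`; on a version mismatch, merge unless a real conflict exists."""
    for _ in range(max_retries):
        try:
            store.append(aggregate_id, expected_version, attempted)
            return expected_version + len(attempted)
        except ConcurrencyError:
            missed = store.events_since(aggregate_id, expected_version)
            for committed in missed:
                for event in attempted:
                    if conflicts_with(event, committed):
                        raise RealConflictError(f"{event} conflicts with {committed}")
            # No real conflict: retry on top of the events we missed.
            expected_version += len(missed)
    raise ConcurrencyError("gave up after too many retries")


store = InMemoryEventStore()
store.append("item-1", 0, [ItemCreated()])

# Two writers both loaded the aggregate at version 1.
v_a = commit_with_merge(store, "item-1", 1, [ItemRenamed("A")])
v_b = commit_with_merge(store, "item-1", 1, [ItemDeactivated()])  # merged
print(v_a, v_b)  # 2 3

try:
    commit_with_merge(store, "item-1", 1, [ItemRenamed("B")])
except RealConflictError as error:
    print("rejected:", error)
```

The second writer succeeds even though its expected version was stale, because a rename and a deactivation do not conflict under the generic rule; the third one is rejected because another rename was already committed.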


    Eventual consistency


    • don't ask experts: "does the data need to be eventually consistent?"
      • ask: "is it ok to have data that is X time old"

    NEVER USE WORD "INCONSISTENT" WITH BUSINESS PERSON. SAY "OLD", "STALE" ETC

    • for business people inconsistent=wrong
    • how to get around problems with eventual consistency:
      • easy thing: "your comment is waiting for moderation"
      • last thing to do when everything else fails: fake the changes in the client. make it look like things have happened for the user making the changes
      • UI design & correct user's expectations
        • educate the user: 
          • tell them that sometimes software takes a second to think about what it's doing.
          • if the data is not there immediately, wait 2 seconds and press F5. 
          • if it's still not there, immediately call tech support
          • after 1st week users get the point and will wait a bit longer if required
          • "they're not all idiots"
        • use task-based UIs to make system look consistent (maximize time between sending commands and issuing a query on the client)
    • do we have to handle everything in the same pipe? maybe we can have high- and low-priority pipes for different things in the system?


    • Set-based validation
      • what about validating that all usernames must be unique?
      • we only have consistency within a single AR
        • do we want an AllUsers aggregate? erm, maybe not...
        1. ask: how bad is it if two users get created with the same username within 500ms of each other?
        2. we can see that something is wrong in an event handler (not a part of read model) and for example send an email?
        3. if we don't trust our clients we can put a validating layer on top of command endpoint checking the constraints in the read layer (but anyway - if they don't behave well they just get bad user experience)
      • more often than not if you ask about this topic you'll get redirected to this post
      • REMEMBER: solve problems in a business-centric way
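Option 2 from the list above (noticing the problem in an event handler and reacting afterwards) could look roughly like this. A Python sketch only; the `UserCreated` event, the `DuplicateUsernameDetector` class and the `notify` callback (standing in for "send an email") are hypothetical names, not something shown in the class:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UserCreated:
    user_id: str
    username: str


class DuplicateUsernameDetector:
    """Event handler (not part of the read model) that flags duplicate usernames."""

    def __init__(self, notify):
        self._notify = notify  # e.g. a function that sends an email
        self._owner_by_username = {}

    def handle(self, event):
        # The first user to claim a username is remembered as its owner.
        first_owner = self._owner_by_username.setdefault(event.username, event.user_id)
        if first_owner != event.user_id:
            self._notify(
                f"username '{event.username}' is used by both "
                f"{first_owner} and {event.user_id}"
            )


alerts = []
detector = DuplicateUsernameDetector(alerts.append)

# Two registrations with the same username slipped through within a few ms.
detector.handle(UserCreated("u-1", "cthulhu"))
detector.handle(UserCreated("u-2", "cthulhu"))
detector.handle(UserCreated("u-3", "dagon"))
print(alerts)
```

Nothing is rejected here: both users exist, and a human (or a follow-up process) decides what to do about it, which is the business-centric way of handling a rare collision.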

    Never going down (the write side)

    • put a queue in front of the command handlers
      • traffic spikes won't overload the system
      • but we can't ACK/NACK the command - we say we accepted the command and assume it will work
        • client has to be "pretty damn certain that the command won't fail"
        • might want to provide some minimal validation just before putting cmd into the queue
      • most people just don't need such an architecture, but the one-way command pattern is extremely valuable when they do
    • most message-oriented middleware isn't a service bus
    • point-to-point == observer pattern
      • easy, great choice with only a few queues to set up
      • gets complex with many connections, not scalable in this case
    • hub & spoke - middle-man observer pattern
      • we end up buying tibco or biztalk and start putting a lot of logic into it (workflows ...) and it quickly becomes a tangled mess
      • watching messages flow within organization is easy (debugging too)
      • single point of failure - when hub is down everything is down
    • service bus
      • we distribute the routing information
      • single point of failure no longer exists
      • can be hard to manage from network perspective
      • is a gross overkill in most cases
      • debugging message flows becomes a pain
      • extra features offered by service buses cause lots of logic to be put into transport
    • a bit of humour: IP over Avian Carriers
      • big lol but...
      • "never underestimate the throughput of a truck full of DVDs - highly latent, huge bandwidth"
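The one-way command pattern from the start of this section (a queue in front of the command handlers) can be sketched in a few lines of Python. The sketch assumes a single process and uses the standard `queue` module in place of real messaging; `CommandEndpoint` and `drain` are hypothetical names:

```python
import queue
from dataclasses import dataclass


@dataclass(frozen=True)
class DeactivateInventoryItem:
    item_id: str
    comment: str


class CommandEndpoint:
    """Accepts commands one-way: cheap validation, enqueue, return immediately."""

    def __init__(self, command_queue):
        self._queue = command_queue

    def send(self, command):
        # Minimal validation before queueing; the real work happens later.
        if not command.item_id:
            raise ValueError("item_id is required")
        self._queue.put(command)
        # No ACK/NACK of the business outcome: "accepted" only means "queued".


def drain(command_queue, handler):
    """Worker side: process whatever is queued, at the handler's own pace."""
    processed = 0
    while True:
        try:
            command = command_queue.get_nowait()
        except queue.Empty:
            return processed
        handler(command)
        processed += 1


commands = queue.Queue()
endpoint = CommandEndpoint(commands)
deactivated = []

# A traffic spike: the endpoint only enqueues, so it stays responsive.
for n in range(1000):
    endpoint.send(DeactivateInventoryItem(f"item-{n}", "discontinued"))

print(commands.qsize())  # 1000 commands waiting
print(drain(commands, lambda c: deactivated.append(c.item_id)))  # 1000 processed
```

The caller of `send` learns nothing about whether the command eventually succeeds, which is why the client has to be pretty certain up front that it won't fail.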

    Sagas


    • what is a saga?
      • long-running business process? "long" can mean different things ;)
      • something that spans multiple transaction boundaries and ensures a process of getting back to a known good state if we fail in one of the transactions
    • (got some hand-made drawings but don't feel like trying to re-create them in GIMP. Why can't I find something on Linux as easy to use as M$ Paint?)
    • most companies get their competitive advantage not from a single system but from a bunch of interoperating systems
    • we need a facilitator instead of a bunch of business experts from specific domains
      • the PHBs in suits talking about kanban & lean (process optimization person - we don't want to act as one in this situation)
    • sagas do not contain business logic
    • set up a set of dependencies: 
      • who 
      • needs 
      • what 
      • when?
    • sagas move data to the right place at the right time for someone else to do the job
    • saga always starts in response to a single event coming out of domain model
    • choreographs the process and makes sure we reach the end
    • use a correlation id to know which events are related
      • most of the cases it's a part of the message. 
      • we might have multiple correlation ids.
    • sagas are state machines 
      • but we don't have to implement it as one (few people think in state machines)
    • between events saga goes to sleep ( join calculus (think: wait, Future etc, continuations))
    • saga does the routing logic
      • it does not create data, just routes it between systems
    • some things have to happen before some amount of time passes
      • like in the movie Memento
      • no long term memory, have someone else providing information
      • use alarm clock for that - pass it a message that is an envelope for the message saga will send (?)
      • we want to avoid having state if possible, it should appear when we need it
    • types of sagas:
      • request-response based sagas
      • document based sagas
    • commands & events from individual systems become (are a starting point for) the ubiquitous language
    • a saga often starts another saga (for example for handling rollbacks)
    • dashboards might be easily created from sagas data store (select * from sagastate ...)
    • if such a process is really important for our business why don't we model it (explicitly)?
    • sagas are extremely easy to test
      • small DSL for describing sagas
        • prove that you always exit correctly
        • generate all possible paths to exit
    • document oriented process
      • like with paper documents multiple persons use & fill with more info
      • most processes we try to implement have already been done before computers, on paper
      • but we forgot how we did it (and do the analysis again)
      • document based sagas are what you need in such cases
      • in case of big documents we don't send the whole document back and forth, we set up some storage for them and only send the links
    • RULE OF THUMB FOR VERSIONING SAGAS
      • when i release a new version all sagas already running stay in old version, all new will be run in new version (unless we've found a really bad bug in old implementation)
      • changing running sagas is dangerous and should be avoided
      • this rule makes versioning simple
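To make the saga bullets above a bit more concrete, here is a small Python sketch of a saga as a state machine keyed by a correlation id, with an "alarm clock" envelope for the timeout. The order/payment process and every name in it are hypothetical examples chosen for illustration (the class did not use this domain), and the alarm clock itself is not implemented, only the envelope message sent to it:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    amount: int


@dataclass(frozen=True)
class PaymentReceived:
    order_id: str


@dataclass(frozen=True)
class PaymentTimedOut:
    order_id: str


@dataclass(frozen=True)
class RequestPayment:
    order_id: str
    amount: int


@dataclass(frozen=True)
class ShipOrder:
    order_id: str


@dataclass(frozen=True)
class CancelOrder:
    order_id: str


@dataclass(frozen=True)
class WakeMeUp:
    """Envelope for the alarm clock: deliver `message` back after `delay_s` seconds."""
    delay_s: int
    message: object


class OrderSaga:
    """Routes data between systems; contains no business logic of its own."""

    def __init__(self, send):
        self._send = send
        self.states = {}  # correlation id (order_id) -> state name

    def handle(self, event):
        key = event.order_id  # the correlation id is part of the message
        state = self.states.get(key)
        if isinstance(event, OrderPlaced) and state is None:
            self.states[key] = "awaiting_payment"
            self._send(RequestPayment(key, event.amount))
            self._send(WakeMeUp(3600, PaymentTimedOut(key)))
        elif isinstance(event, PaymentReceived) and state == "awaiting_payment":
            self.states[key] = "completed"
            self._send(ShipOrder(key))
        elif isinstance(event, PaymentTimedOut) and state == "awaiting_payment":
            self.states[key] = "cancelled"
            self._send(CancelOrder(key))
        # Anything else (e.g. a timeout arriving after payment) is ignored.


outbox = []
saga = OrderSaga(outbox.append)
saga.handle(OrderPlaced("o-1", 99))
saga.handle(PaymentReceived("o-1"))
saga.handle(PaymentTimedOut("o-1"))  # too late, the order is already completed
print(saga.states)
print(outbox)
```

Because the saga only reacts to messages and emits messages, testing it comes down to feeding it events and comparing the outbox, and `states` is the kind of data a dashboard could be built from.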


    Scaling writes


    • we only guarantee CA out of CAP on the write side so we can't partition it
    • we can do real-time systems with CQRS
    • stereotypical architecture: single db, multiple app servers with load balancer in front
      • pros
        • fault tolerance
        • can do software upgrade without going down
        • knowledge about it is widespread
      • cons
        • app servers must be stateless!
        • can't be scaled (just buy a bigger database)
        • database remains a single point of failure
        • database might be a performance bottleneck
      • it's good but has limitations
    • let's replace the database with an event store!
      • there's no functional difference between this solution and previous one
      • loading aggregates on each request increases latency
    • we might split event store into multiple stores, based on aggregate ID (sharding)
      • this can (theoretically) go as far as having a single event store per aggregate
      • problem happens when one of the datastores goes down
        • we could multiply them with a master-slave pattern
        • but: each slave increases latency
      • this allows scaling out our event store
    • in order to reduce latency we can switch from stateless to stateful app servers
      • we have a message router (with fast, in-memory routing info) which knows which aggregate resides in each app server
      • loaded aggregate stays in memory of the app server
      • over time event store becomes write-only
      • when an app server goes down the message router must distribute its job among other servers
        • this can cause latency spike unacceptable for some real-time systems
    • to solve the problem we can use a warm replica 
      • just as in previous example but:
        • when message is routed to a server another server is told to shadow the aggregate that the message was directed to
          • shadowing server loads the AR and subscribes to its events
        • events are delivered to shadowing systems by a publisher
          • stays ~100ms behind original write
          • can use UDP multicast for publishing events
        • when a server goes down shadowing server is only 100ms behind it and requires small operation to catch up with current state
        • this greatly reduces the latency spike when a server is going down
        • but...
        • we can get rid of the spike completely!
        • when shadowing server receives first command it can act as if it was up-to-date
          • but still listen to events from event store!
          • until it gets events it created itself it tries to merge
            • same code as regular events merging!
          • when it does get its own events it unsubscribes
        • many businesses will accept the risk of possible merging problems to avoid latency spikes
      • with this architecture there are no more reads from the event store!
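Two of the building blocks above, sharding the event store by aggregate ID and a message router that remembers which stateful app server holds which aggregate, can be sketched in Python as follows. This is an in-memory illustration under my own assumptions: `shard_for`, `ShardedEventStore` and `MessageRouter` are hypothetical names, the hash-modulo mapping is just one simple choice, and replication and warm replicas are left out entirely:

```python
import hashlib


def shard_for(aggregate_id, shard_count):
    """Stable mapping from an aggregate ID to a shard index."""
    digest = hashlib.sha256(aggregate_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


class ShardedEventStore:
    """Several independent stores; each aggregate lives in exactly one of them."""

    def __init__(self, shard_count):
        self._shards = [dict() for _ in range(shard_count)]

    def _shard(self, aggregate_id):
        return self._shards[shard_for(aggregate_id, len(self._shards))]

    def append(self, aggregate_id, event):
        self._shard(aggregate_id).setdefault(aggregate_id, []).append(event)

    def load(self, aggregate_id):
        return list(self._shard(aggregate_id).get(aggregate_id, []))


class MessageRouter:
    """Keeps in-memory routing info: which app server holds which aggregate."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._location = {}

    def route(self, aggregate_id):
        if aggregate_id not in self._location:
            index = shard_for(aggregate_id, len(self._servers))
            self._location[aggregate_id] = self._servers[index]
        return self._location[aggregate_id]

    def server_down(self, server):
        self._servers.remove(server)
        # Forget aggregates that lived on the dead server; they get re-routed
        # (and reloaded from the event store) on their next message.
        self._location = {a: s for a, s in self._location.items() if s != server}


store = ShardedEventStore(shard_count=4)
store.append("item-1", "created")
store.append("item-1", "deactivated")
print(store.load("item-1"))  # ['created', 'deactivated']

router = MessageRouter(["app-1", "app-2", "app-3"])
home = router.route("item-1")
router.server_down(home)
print(home, "->", router.route("item-1"))
```

The reload after `server_down` is exactly the latency spike the notes mention; the warm-replica approach exists to avoid it. A plain modulo mapping also reshuffles most aggregates whenever the number of shards changes, so a real setup would likely need something smarter.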


    Occasionally connected systems
    My notes here are barely readable drawings on paper with some (even less readable) text here and there. Will unfortunately have to skip it (I'm certainly NOT doing those drawings in GIMP!) but...
    Greg already had a presentation on this subject recorded. It covers the same topics (I watched it a few days before the class).

    The interesting thing here is the conclusion: CQRS is nothing other than plain, good old MVC (as initially done in Smalltalk) brought to the architectural level.
    None of these ideas are new.
    Isn't it cool?
    The important lesson is:
    Review what you have already done.

    == END OF DAY 3 ===
    and unfortunately of the whole training. A pity, I wouldn't mind at all spending a few more days attending such a great class! Thanks a lot for it, Greg!

    Tuesday, April 19, 2011

    Notes from DDD & CQRS training - day 2

    That's when things got really interesting. The topics covered were more or less the same as in the 6.5h video from one of Greg's previous trainings, which I had watched some time before the training during a long, lonely night at a hotel in Germany, but I was still listening with 100% attention. Without further babble, here are my notes:

    Read model

    • simple, hard to screw up
    • to be done by low value/junior developers
    • can be outsourced
      • it's hard to find a better thing to outsource
    • uses whatever model is appropriate
    Command handlers
    public interface Consumes<T> where T : Message {
      // T is a command (for command handlers)
      // or an event (for event handlers/projections)
      void consume(T message);
    }
    public interface Message {
      // just a marker interface
    }
    • they are the application services in CQRS-based systems, the external edge of the domain model
    • should contain no logic, not even a simple if statement
    • can implement cross-cutting concerns
      • logging
      • transactions
      • authentication
      • authorization
      • batch commands
      • exceptions handling
      • merging
    • handle cross-cutting concerns not directly in the same class that invokes aggregate method, but using composition (think decorator pattern).
    class DeactivateInventoryItemCommandHandler : 
      Consumes<DeactivateInventoryItemCommand> {
      /* constructor-injected repository */
      void consume(DeactivateInventoryItemCommand msg) {
        var item = repository.getById(msg.id);
        item.deactivate(msg.comment);
        // this makes batch cmd processing impossible
        repository.save(item); 
      }
    }

    class LoggingHandler<T> : Consumes<T> where T : Message {
      private readonly Consumes<T> next;
      public LoggingHandler(Consumes<T> next) {
        this.next = next;
      }
      public void consume(T message) {
        Logger.write("received message:" + message);
        next.consume(message);
      }
    }

    var handler = new LoggingHandler<DeactivateInventoryItemCommand>(
      new DeactivateInventoryItemCommandHandler(
        repo)); // yay, we've got logging!

    class AuthorizingHandler<T> : Consumes<T> where T : Message {
      AuthorizingHandler(Consumes<T> next) {...}
      void consume(T message) {
        // check authorization then do:
        next.consume(message);
      }
    }
    • make command handler wrapping automatic with reflection:
    [RequiresPermission("admin")]
    class DeactivateInventoryItemCommandHandler : 
      Consumes<DeactivateInventoryItemCommand> {....}

    • we can make our code our configuration
    • the above is equivalent to functional composition (with interfaces); it could also be done explicitly:
    // let's have a lambda:
    return x => DeactivateInventoryItemCommandHandler(
      new TestRepository<InventoryItem>(), x);
      // this is DI in a functional language
      // using function currying - that's so cool!

    // the handler itself is a plain function taking its dependency as the first argument
    public void DeactivateInventoryItemCommandHandler
      (Repository<InventoryItem> repo,
       DeactivateInventoryItemCommand cmd) {...}
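
A rough Java sketch of the same "DI by binding an argument" idea, assuming Java 8 lambdas. The repository is bound up front, which leaves a function of the command alone, and logging is added by wrapping one function in another. All names here (`Repository`, `InMemoryRepository`, `Handlers.bind`, `Handlers.withLogging`) are made up for this illustration and are not from the training material.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Consumer;

// Hypothetical command and aggregate, reduced to the bare minimum.
final class DeactivateInventoryItem {
    final int id;
    final String comment;
    DeactivateInventoryItem(int id, String comment) { this.id = id; this.comment = comment; }
}

final class InventoryItem {
    boolean active = true;
    void deactivate(String comment) { active = false; }
}

// Hypothetical repository; a real one would talk to the event store.
interface Repository<T> {
    T getById(int id);
}

final class InMemoryRepository implements Repository<InventoryItem> {
    private final Map<Integer, InventoryItem> items = new HashMap<>();
    InMemoryRepository put(int id, InventoryItem item) { items.put(id, item); return this; }
    @Override public InventoryItem getById(int id) { return items.get(id); }
}

final class Handlers {
    // The "real" handler is a two-argument function: dependency + command.
    static void deactivate(Repository<InventoryItem> repo, DeactivateInventoryItem cmd) {
        repo.getById(cmd.id).deactivate(cmd.comment);
    }

    // Partial application: fix the first argument, get a one-argument handler back.
    static <D, C> Consumer<C> bind(BiConsumer<D, C> handler, D dependency) {
        return command -> handler.accept(dependency, command);
    }

    // A cross-cutting concern expressed as a function wrapping another function.
    static <C> Consumer<C> withLogging(Consumer<C> next, List<String> log) {
        return command -> {
            log.add("received " + command.getClass().getSimpleName());
            next.accept(command);
        };
    }
}

final class CurryingDemo {
    public static void main(String[] args) {
        InventoryItem item = new InventoryItem();
        Repository<InventoryItem> repo = new InMemoryRepository().put(5, item);
        List<String> log = new ArrayList<>();

        Consumer<DeactivateInventoryItem> plain = Handlers.bind(Handlers::deactivate, repo);
        Consumer<DeactivateInventoryItem> handler = Handlers.withLogging(plain, log);

        handler.accept(new DeactivateInventoryItem(5, "discontinued"));
        System.out.println("active=" + item.active + ", log=" + log);
    }
}
```

Swapping `InMemoryRepository` for another `Repository` implementation changes nothing in the handler, which is the whole point of injecting it as an argument.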
    Projections
    • consume many events to update a view model
    • an important explicit concept
    • will have multiple methods, each handling another type of event
    • are in 1-to-1 relation with tables (sometimes, but rarely, 1-N)
    class InventoryItemCurrentCountProjection : 
      Consumes<InventoryItemDeactivated/*Event*/>, 
      Consumes<InventoryItemCreated>, ...
      // more events needed to update the view model
      // can't directly translate to Java :(
    {
      void consume(InventoryItemDeactivated message) {
        // do sth
      }
      void consume(InventoryItemCreated message) {
        // do sth else
      }
    }
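
The reason this does not translate directly to Java is type erasure: a class cannot implement `Consumes<A>` and `Consumes<B>` at the same time. One possible workaround, sketched below under that assumption, is to register one handler per event class inside the projection. The class and method names (`ActiveItemCountProjection`, `register`) are hypothetical, and an `int` field stands in for a row in a read-model table.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

interface Event {}

final class InventoryItemCreated implements Event {
    final int id;
    InventoryItemCreated(int id) { this.id = id; }
}

final class InventoryItemDeactivated implements Event {
    final int id;
    InventoryItemDeactivated(int id) { this.id = id; }
}

// One projection, many event types: handlers are registered per event class
// instead of implementing a generic interface once per type.
final class ActiveItemCountProjection {
    private final Map<Class<?>, Consumer<Event>> handlers = new HashMap<>();
    private int activeItems = 0; // stands in for a row in a read-model table

    ActiveItemCountProjection() {
        register(InventoryItemCreated.class, e -> activeItems++);
        register(InventoryItemDeactivated.class, e -> activeItems--);
    }

    private <E extends Event> void register(Class<E> type, Consumer<E> handler) {
        // The cast is safe because consume() looks handlers up by exact class.
        handlers.put(type, e -> handler.accept(type.cast(e)));
    }

    void consume(Event event) {
        Consumer<Event> handler = handlers.get(event.getClass());
        if (handler != null) {
            handler.accept(event); // events this view does not care about are ignored
        }
    }

    int activeItems() { return activeItems; }
}

final class ProjectionDemo {
    public static void main(String[] args) {
        ActiveItemCountProjection projection = new ActiveItemCountProjection();
        projection.consume(new InventoryItemCreated(1));
        projection.consume(new InventoryItemCreated(2));
        projection.consume(new InventoryItemDeactivated(1));
        System.out.println(projection.activeItems()); // 1
    }
}
```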
    BOOK TO READ: The Little LISPer


    CQRS can be done using a single data store for writes & reads, for example by building the read model on SQL views. But we can drive the specialization of the write & read sides even further. After all, they have totally different characteristics.
    And reports run on 3NF database are so sloooow.
    Enter:
    Events
    • verbs in past tense - they are things that have already happened, actions completed in the past (think passé composé)
    • listeners can disagree with them but can't say NO
      • can only compensate
    • can be used for synchronizing multiple different models
    So, shall we have our domain model implemented with (N)Hibernate emit events, so that we can keep our beloved 3NF database and denormalize into multiple read models (to get near-infinite scalability)? With just a 2PC transaction between the write DB and a queue?

    Nope. This is guaranteed to fail.
    Why? 
    • ORM creates series of deltas
    • we have to prove that Δ(Hibernate) = Δ(events) - not easy
    • models can get out of sync in case of a bug
      • such problems can be hard to spot
      • impossible to fix data model broken this way
    So, what shall we do?
    • get rid of the ORM so our events are our only source of truth
    • we can have projections populating our 3NF model
      • but is it worth the costs and increased size of our code base?
    • this is the poison pill architecture
      • will get you to event sourcing
      • getting rid of 3NF model will let you get rid of 
        • your DBA freaks
        • costs of DB licences (business will like it!)
    Event sourcing
    At last!
    • in functional programming terms: current state = left fold of past behaviours
    • existing business systems (systems of problem domain, not necessarily computer systems, think: accounting) use history, not current state (like bank account balance)
    • deriving state from events allows us to change implementation of domain model easily, without affecting object persistence = disconnects the domain model from storage
    • events cannot be vetoed but we can compensate:
      • partial compensating actions (difficult, we don't want to go this way)
      • full compensating actions (accounting people - and developers! - prefer it)
        • compensate the whole transaction & add another one, correct
    • ES gives you an additive (append)-only behavioural model
    • we don't lose any information that we would lose with a structural model
      • we can build any structural model from our events
      • the event log lets you see the system as it was at any point in time
        • this means you can go back in time
        • which is extremely valuable for debugging!
      • you can re-run everything that the system has ever done on latest (or any) version of software
    • when using MongoDB or Cassandra (or sth similar) aggregates can become documents you append events to
    • the user's intention should be carried through from commands to events
    • events are not equal to commands, even if from implementation point of view they might be identical
    • events can be enriched with results of business operations (authorization code of credit card operations, sales tax etc)
      • this prevents duplication of business logic between various places
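
A small Java sketch of "current state = left fold of past behaviours", using the bank account balance mentioned above. The event classes (`Deposited`, `Withdrawn`) and the amounts are invented for the example. Replaying only a prefix of the history is what "going back in time" amounts to.

```java
import java.util.Arrays;
import java.util.List;

interface AccountEvent {}

final class Deposited implements AccountEvent {
    final long cents;
    Deposited(long cents) { this.cents = cents; }
}

final class Withdrawn implements AccountEvent {
    final long cents;
    Withdrawn(long cents) { this.cents = cents; }
}

final class Balance {
    // Pure step function: (state, event) -> next state.
    static long apply(long balance, AccountEvent event) {
        if (event instanceof Deposited) {
            return balance + ((Deposited) event).cents;
        }
        if (event instanceof Withdrawn) {
            return balance - ((Withdrawn) event).cents;
        }
        return balance; // unknown events leave the state unchanged
    }

    // Current state = left fold of the step function over the history.
    static long replay(List<? extends AccountEvent> history) {
        long balance = 0;
        for (AccountEvent event : history) {
            balance = apply(balance, event);
        }
        return balance;
    }
}

final class FoldDemo {
    public static void main(String[] args) {
        List<AccountEvent> history = Arrays.asList(
            new Deposited(10_000), new Withdrawn(2_500), new Deposited(500));
        System.out.println(Balance.replay(history));               // 8000
        System.out.println(Balance.replay(history.subList(0, 2))); // 7500, the state after two events
    }
}
```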
    Event sourced aggregates
    • a base class is OK
    • important methods:
      • applyChange(event)
        • calls an event handling method of the aggregate (apply) for the concrete event type
        • registers events that have happened if it's a new event
      • public methods defining the business interface of the aggregate
        • business logic, conditionals live here
      • private methods defined in concrete aggregate classes handling events
        • no conditionals
        • only setting data
      • loadFromHistory(IEnumerable<Event>)
        • accepts an event stream to restore the aggregate from the history
        • calls the apply method for each event (that's why those methods don't have behaviour, only set the data)
    • repository
      • saves only uncommitted changes of the aggregate and marks them as committed in the aggregate (clears the uncommitted events list)
    • a command makes aggregate produce 0..N events
    • you need a unit-of-work to support batch command processing
      • if you don't need it, an explicit call to repository.save() in your command handler should be OK
      • the UOW could be configured to accept events from only 1 aggregate, with that setting changed to allow batch processing
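
The base class described above could look roughly like this in Java. It is a sketch under assumptions: the method names follow the notes (`applyChange`, `loadFromHistory`, `getUncommittedChanges`), while `markChangesAsCommitted` and the `instanceof` dispatch inside `apply` are illustrative choices, not something shown in the training.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

interface Event {}

final class InventoryItemCreated implements Event {}
final class InventoryItemDeactivated implements Event {}

abstract class AggregateRoot {
    private final List<Event> uncommitted = new ArrayList<>();

    // Concrete aggregates only set data here: no business rules, no validation.
    protected abstract void apply(Event event);

    // Called by business methods: mutate state and remember the new event.
    protected void applyChange(Event event) {
        apply(event);
        uncommitted.add(event);
    }

    // Replays history; nothing is registered as a new event.
    void loadFromHistory(Iterable<? extends Event> history) {
        for (Event event : history) {
            apply(event);
        }
    }

    List<Event> getUncommittedChanges() { return new ArrayList<>(uncommitted); }

    // The repository would call this after a successful save.
    void markChangesAsCommitted() { uncommitted.clear(); }
}

final class InventoryItem extends AggregateRoot {
    private boolean activated;

    // Business interface: the rules and conditionals live here.
    void deactivate() {
        if (!activated) {
            throw new IllegalStateException("item is already deactivated");
        }
        applyChange(new InventoryItemDeactivated());
    }

    @Override
    protected void apply(Event event) {
        // The instanceof chain is only dispatch by event type; each branch just sets data.
        if (event instanceof InventoryItemCreated) {
            activated = true;
        } else if (event instanceof InventoryItemDeactivated) {
            activated = false;
        }
    }
}

final class AggregateDemo {
    public static void main(String[] args) {
        InventoryItem item = new InventoryItem();
        item.loadFromHistory(Arrays.asList(new InventoryItemCreated()));
        item.deactivate();
        System.out.println(item.getUncommittedChanges().size()); // 1
    }
}
```

Because `apply` never rejects anything, replaying old events cannot fail even after the business rules in `deactivate()` have changed.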
    What if our aggregates have so many events that restoring aggregate state from them becomes a serious performance problem?
    Rolling snapshots
    • event log changes: [1,2,3,4,5,6,7] becomes [1,2,3,4,5, snapshot, 6, 7]
    • don't use direct serialization of aggregates
    • build snapshots in a separate snapshotter process
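
A toy Java illustration of restoring from a snapshot plus the events recorded after it. Events are simplified to plain `Long` increments and the aggregate to a running total. `CounterSnapshot` is a hypothetical memento class, kept separate from the aggregate so that nothing is serialized directly.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical memento: a versioned copy of state, decoupled from the aggregate class.
final class CounterSnapshot {
    final int version; // number of events already folded into this snapshot
    final long total;
    CounterSnapshot(int version, long total) { this.version = version; this.total = total; }
}

final class Counter {
    private long total;
    private int version;

    // Pass null as the snapshot to replay the whole log from the beginning.
    static Counter restore(CounterSnapshot snapshot, List<Long> eventLog) {
        Counter counter = new Counter();
        if (snapshot != null) {
            counter.total = snapshot.total;
            counter.version = snapshot.version;
        }
        // Only replay the events recorded after the snapshot was taken.
        for (long increment : eventLog.subList(counter.version, eventLog.size())) {
            counter.total += increment;
            counter.version++;
        }
        return counter;
    }

    CounterSnapshot snapshot() { return new CounterSnapshot(version, total); }
    long total() { return total; }
}

final class SnapshotDemo {
    public static void main(String[] args) {
        List<Long> log = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L, 7L);
        // A snapshotter process would do this asynchronously, after event 5.
        CounterSnapshot afterFive = Counter.restore(null, log.subList(0, 5)).snapshot();
        System.out.println(Counter.restore(afterFive, log).total()); // 28, replaying only events 6 and 7
        System.out.println(Counter.restore(null, log).total());      // 28, replaying everything
    }
}
```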
    Testing with Event sourcing

    • DDD testing - no asserting against getters, just the behaviour
    • an example scenario:

    public class when_deactivating_an_deactivated_inventory_item :
      AggregateSpecification<InventoryItem> {
      public override IEnumerable<Event> given() {
        yield return New.inventoryItemCreatedWithId(5);
        yield return New.inventoryItemDeactivatedWithId(5);
      }
      public override void when() {
        aggregate.deactivate();
      }
      [Then]
      public void an_invalid_argument_exception_is_thrown() {
        Assert.isType<InvalidArgumentException>(caught);
      }
      [Then]
      public void no_events_are_produced() {
        Assert.isEmpty(events);
      }
    }
    • there's no magic in it, the base test class is dead simple:
    public abstract class AggregateSpecification<T>
      where T : AggregateRoot, new() {
      public abstract IEnumerable<Event> given();
      public abstract void when();
      protected T aggregate;
      protected Exception caught;
      // stays empty when when() throws, so "no events" can still be asserted
      protected List<Event> events = new List<Event>();

      [Setup]
      public void Setup() {
        try {
          aggregate = new T();
          aggregate.loadFromHistory(given());
          when();
          events = new List<Event>(
            aggregate.getUncommittedChanges());
        } catch (Exception ex) {
          caught = ex;
        }
      }
    }
    • documentation can be generated from those tests
    • or: write the tests in natural language and generate test classes from them
      • then you (and business people) can see your progress as you make test cases pass
      • such tests can be used as communication tool
      • generate such docs in html or whatever format on every CI build so that business can see them at any time
      • override toString() on every event to get human-readable output that can be used in such tests
    • personal note: this is f*cking awesome!
    • we could also do it like:
    public class when_deactivating_an_deactivated_inventory_item :
      AggregateSpecification<InventoryItem> {

      public override IEnumerable<Event> given() {
        yield return New.inventoryItemCreated.WithId(5);
        yield return New.inventoryItemDeactivated.WithId(5);
      }

      public override Command when() { // difference here!
        return New.DeactivateInventoryItem.WithId(5);
      }

      [Then]
      public void an_invalid_argument_exception_is_thrown() {
        Assert.isType<InvalidArgumentException>(caught);
      }
      [Then]
      public void no_events_are_produced() {
        Assert.isEmpty(events);
      }
    }
    • entire testing can be expressed with events & commands
    • or maybe have a DSL to express those tests in platform-independent way? like:
    <Given>
    <!-- events serialized to XML -->
    </Given>
    <When>
    <!-- command serialized to XML -->
    </When>
    <Expect>
    <!-- assertions expressed in XML -->
    </Expect>
    • then get (for example) Ruby to make it even nicer to look at
    • HINT: give business people a comfortable way to share their knowledge with the development team
    • calling a method of an object == sending a message to an object
    • refucktoring
      • changing a test is making a _new_ test
    • versioning -> new tests with new versions
    • HINT: you don't want your devs to understand the framework - you want them to understand the CONCEPT
    Building an event store on top of a SQL db

    • there's a detailed explanation available at cqrsinfo.com
    • RDBMS provides transactions out-of-the-box
    • with multiple event stores you can only guarantee events ordering within aggregate boundary (you can do global ordering with single event store)
    • metadata can be stored along with events (server the event originated in, security context, user, timestamp etc)
    • uses optimistic concurrency and carries the version between server & client
    • for storing events a stored procedure is recommended to avoid multiple server-db roundtrips:
    BEGIN
      -- expectedVersion is the version the client last saw
      var s = SELECT currentversion FROM AGGREGATES
        WHERE aggregateid = @1
      if (s == null) {
        s = 0
        INSERT INTO AGGREGATES ....
      }
      if (s != expectedVersion)
        throw new ConcurrencyException();
      foreach (event e) {
        s++
        INSERT INTO EVENTLOG ...
      }
      UPDATE AGGREGATES SET currentversion = s
        WHERE aggregateid = @1
    END
    • snapshotting 
      • brings in another table (to avoid concurrency exceptions with the servers writing real events into the store)
      • snapshotting is done asynchronously by a snapshotter
      • when to snapshot? when we've got a certain number of events not included in the last snapshot (different aggregates can have different snapshotting rules depending on the type etc)
      • snapshots are NOT a necessity for most systems, only a heuristic brought in when we need a performance boost on the write side
      • snapshots can be versioned differently from the domain model (thanks to the use of the Memento pattern for snapshots)
      • don't do snapshots by default
    • event store is a queue (little mutant freak database-queue hybrid baby)
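
The optimistic concurrency check from the stored procedure above can be tried out without a database. The Java sketch below is an in-memory stand-in, not a real event store: events are plain strings, the stream length plays the role of `currentversion`, and `synchronized` stands in for the transaction. `InMemoryEventStore` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ConcurrencyException extends RuntimeException {
    ConcurrencyException(String message) { super(message); }
}

// In-memory stand-in for the AGGREGATES and EVENTLOG tables.
final class InMemoryEventStore {
    private final Map<String, List<String>> streams = new HashMap<>();

    // synchronized plays the role of the database transaction.
    synchronized void append(String aggregateId, int expectedVersion, List<String> events) {
        List<String> stream = streams.computeIfAbsent(aggregateId, id -> new ArrayList<>());
        int currentVersion = stream.size(); // one version per stored event
        if (currentVersion != expectedVersion) {
            throw new ConcurrencyException(
                "expected version " + expectedVersion + " but was " + currentVersion);
        }
        stream.addAll(events);
    }

    synchronized List<String> eventsFor(String aggregateId) {
        return new ArrayList<>(
            streams.getOrDefault(aggregateId, Collections.<String>emptyList()));
    }
}

final class EventStoreDemo {
    public static void main(String[] args) {
        InMemoryEventStore store = new InMemoryEventStore();
        store.append("item-5", 0, Arrays.asList("created"));
        store.append("item-5", 1, Arrays.asList("renamed", "deactivated"));
        try {
            // A client that still believes the aggregate is at version 1.
            store.append("item-5", 1, Arrays.asList("stale write"));
        } catch (ConcurrencyException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(store.eventsFor("item-5"));
    }
}
```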
    CQRS vs CAP theorem
    • CQRS doesn't break the CAP theorem
    • we don't get all 3 properties at the same time
    • domain (write) side needs C and A
    • read model needs A and P
    events, commands and DTOs are very strong boundaries allowing us to specialize within them

    CQRS from business perspective
    • are all developers created equal?
      • if your answer is yes you're just wrong
      • if your answer is no - why do the same people work on domain, UI & data storage?
    • how much time (%) do you really spend working with your domain? 25%? 30%? 40%?
    • reasons to create systems in private sector:
      • make money
      • save money
      • manage risk (let's have it just in case)
    • organizations with high level of maturity can have bigger teams
    • CQRS can get more people into the team (up to 2.5x) without decreasing maturity level
      • people can work in 3 teams independent of each other
      • communication between teams is low
      • create schema (think XSD) describing DTOs, commands & events in the estimation phase
        • if you can't - what the hell are you estimating?!
    • don't allow features to cross iteration boundaries - we want a working system, not components, at the end of the iteration
      • keep teams working on the same feature at the same time
    • when working with UI you can mock out read model & commands endpoint
      • same with developing other parts of the system
    • there are 4 parts of every task/story: domain, GUI, read model, integration
    Moving to CQRS+ES architecture
    • one aggregate at a time
    • ask yourself: how are we gonna kill the system?
      • when ES-based system dies the events log is all that is left behind
        • you can migrate from ES to different system by creating a projection matching target data model
    CQRS vs stereotypical architecture
    • CQRS
      • writing to read side sucks
      • reading is easy
    • stereotypical architecture
      • writes are easy
      • queries suck
    • we're making a trade-off
    • both architectures produce an eventually consistent system
    • what about integration?
      • a CQRS+ES system has an integration model built in!
        • our read model is in fact integrating with our domain
        • so we have actually tested our integration model!
        • we have a nice, push integration model
          • not an ugly, pull model
    === END OF DAY 2 ===

    That's it for now, advanced topics coming next in notes from day 3.

    Nighty-night!

    Monday, April 18, 2011

    Notes from DDD & CQRS training - day 1

    My notes from the 1st day (11/04/2011) of the DDD/CQRS training by Greg Young, just as I took them - very little post-processing applied so it might be of little help for anyone but me (or maybe other participants of the training).

    UIs:

    • CRUD (these suck)
    • task-based (users like those)
    Aggregate:
    • a group of objects we treat together as a whole
    • affect only a single aggregate - that lets you avoid distributed transactions (think horizontal partitioning/sharding)
    • put the method next to the state it operates on
    • denormalization helps to get the design right
    Books to read:
    • Streamlined Object Modeling
      • time interval object
      • make implicit explicit
    • Object-oriented software construction 2nd edition by Bertrand Meyer
      • describes CQS 
    saving two objects = bad

    • business doesn't care about consistency
    • breaking bidirectional relationships
      • ask: do those things need to be consistent? 
      • drop consistency of invariant
    • domain model != data model
    • if needed, a Domain Service can ensure consistency (this should really be used only as a last resort!)
    • collection of Transaction objects can have a domain meaning
    • AggregateRoot (AR) name makes sense for the entire aggregate
    • too much magic is bad (think ORM)
    • between aggregates use soft links (IDs) instead of references
    TIP: keeping track of Optimistic Concurrency Exceptions makes an interesting statistic

    EXERCISE: test-drive Probability value object class with methods like combine(Probability), not() etc, encapsulating a Java's BigDecimal (.NET's Decimal?). The tricky part: you can't have any kind of accessor methods to expose the internal state. What do you test first?

    And now... suppose that standard BigDecimal implementation is too slow for your system. You have to change the implementation of the Probability class but retain the API. How many tests do you have to change?

    personal note: this turned out to be an easy, yet an interesting exercise. Funny, how it changes the way you write code when you don't have those evil getters around. I really, really liked it!
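
One possible shape of the result in Java, offered only as a sketch and not as the solution from the class. It assumes that `combine` means "both independent events happen" (multiplication), and it exposes the value only through `equals`, so tests compare whole `Probability` objects instead of reading a getter.

```java
import java.math.BigDecimal;

final class Probability {
    private final BigDecimal value;

    Probability(String value) {
        BigDecimal parsed = new BigDecimal(value);
        if (parsed.signum() < 0 || parsed.compareTo(BigDecimal.ONE) > 0) {
            throw new IllegalArgumentException("probability must be within [0, 1]: " + value);
        }
        this.value = parsed;
    }

    // Internal constructor for results that are in range by construction.
    private Probability(BigDecimal value) { this.value = value; }

    // Assumption: P(A and B) for independent events.
    Probability combine(Probability other) {
        return new Probability(value.multiply(other.value));
    }

    // P(not A).
    Probability not() {
        return new Probability(BigDecimal.ONE.subtract(value));
    }

    // Equality is the only way to observe the value: there are no getters.
    @Override
    public boolean equals(Object o) {
        return o instanceof Probability
            && value.compareTo(((Probability) o).value) == 0; // ignores scale: 0.5 equals 0.50
    }

    @Override
    public int hashCode() {
        return value.stripTrailingZeros().hashCode(); // consistent with the scale-insensitive equals
    }
}

final class ProbabilityDemo {
    public static void main(String[] args) {
        Probability half = new Probability("0.5");
        System.out.println(half.combine(half).equals(new Probability("0.25"))); // true
        System.out.println(half.not().equals(half));                            // true
    }
}
```

Since the tests never touch the internal representation, replacing `BigDecimal` with something faster should not require changing any of them, which answers the "how many tests do you have to change?" question.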

    Repository:
    • Evans: works on aggregates, provides domain language to persistence infrastructure
    • Fowler: purely technical stuff

    • make contracts in the domain:
      • as narrow as possible
      • as explicit as possible
    • this will lower the conceptual coupling
    Service:
    • any piece of procedural code
    • can be:
      • infrastructure
      • domain
      • application
    • but in the end they are all facades
    • if you do things right you might never need services
    • interface segregation
      • single method interfaces
      • role interfaces
    Hexagonal architecture - ports & adapters

    TIP:
     why not check check-ins for illegal dependencies (like domain depending on something else) and reject those that don't follow the rules?

    On SOLID principles:
    • they are just heuristics
    • don't try to stick to them no matter the cost (duplication sometimes can be a good thing!)
    EXAMPLE: When not to adhere to Interface Segregation?
    • when all methods go to the same source
      class Stream implements ICanSeek,ICanRead,ICanWrite
      // client code:
      void DoesSomething(ICanSeek seeker, ICanRead reader) {
        seeker.seekTo(0);
        Object x;
        while ((x = reader.read()) != null) {
          // ...
        }
      }
      ICanSeekRead extends ICanSeek,ICanRead != Foo implements ICanSeek, ICanRead
      • DI/IoC:
        • ServiceLocator is totally OK when resolving things at the same layer
        • about injecting into entities: most dependencies match the lifecycle of methods, not objects 
      void Submit(ISearchDriverLicences s) {
        s.searchFor("something");
      }
      void F() {//coupling from F to G
        G.Something();
      }
      interface ISomething{
        void something();
      }
      void F(ISomething s) { 
        s.something();
      }
      class ISomethingImpl:ISomething {
        // sometimes DI is too much:
        void something(){
          Console.WriteLine("Hello world");
        }
      }
      • we're overusing tools, frameworks 
        frameworks pollute our brains

        Back to services:

        • ApplicationServices 
          • should be role interfaces, one for every use case of the system
          • you should have no business logic in them (not even an if statement!)
        isValid() antipattern
        • pure evil!
        • don't do that!
        • causes GIGO (Garbage In, Garbage Out)
        • entities end up being in one of 3 possible states:
          • valid
          • invalid
          • have no frakking clue
        • encapsulation is about protecting state - don't let people jam it!
        Specification
        • (wikipedia)
        • predicate logic (think Prolog)
        • might need getters (protected/internal) exposed
          • but people will start using them as soon as they see them
        • composite specification
        public class AService {
          private readonly IEnumerable<Specification<Customer>> rules;

          AService(IEnumerable<Specification<Customer>> rules) {
            this.rules = rules;
          }

          void deactivate(Customer c) {
            if (!rules.areAllValid(c)) {
              throw new IllegalArgException();
            }
            ...
          }
        }
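
A composite specification can be sketched in Java 8 with default methods. The rules below (`IS_ACTIVE`, `HAS_OPEN_ORDERS`) and the `Customer` fields are invented for the example. The package-private fields play the role of the "protected/internal" getters mentioned above.

```java
interface Specification<T> {
    boolean isSatisfiedBy(T candidate);

    // Composite: both specifications must hold.
    default Specification<T> and(Specification<T> other) {
        return candidate -> isSatisfiedBy(candidate) && other.isSatisfiedBy(candidate);
    }

    // Negation of this specification.
    default Specification<T> not() {
        return candidate -> !isSatisfiedBy(candidate);
    }
}

final class Customer {
    // Package-private on purpose: visible to specifications, not to the outside world.
    final boolean active;
    final int openOrders;
    Customer(boolean active, int openOrders) {
        this.active = active;
        this.openOrders = openOrders;
    }
}

final class DeactivationRules {
    static final Specification<Customer> IS_ACTIVE = c -> c.active;
    static final Specification<Customer> HAS_OPEN_ORDERS = c -> c.openOrders > 0;

    // Small predicates composed into one named business rule.
    static final Specification<Customer> CAN_BE_DEACTIVATED =
        IS_ACTIVE.and(HAS_OPEN_ORDERS.not());
}

final class SpecificationDemo {
    public static void main(String[] args) {
        System.out.println(
            DeactivationRules.CAN_BE_DEACTIVATED.isSatisfiedBy(new Customer(true, 0))); // true
        System.out.println(
            DeactivationRules.CAN_BE_DEACTIVATED.isSatisfiedBy(new Customer(true, 2))); // false
    }
}
```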

        === END OF DAY 1 ===
            
        Unfortunately, those notes don't show how absolutely awesome the training was. Really got my eyes wide open on many issues that I was somehow missing before.

        On a related note - people really do use functional programming in real-world applications! After hearing that from Greg I started learning Clojure. I had a Prolog & Haskell course back at the university. I didn't like the Prolog part but really enjoyed writing minimized code in Haskell. Now I just have to find some time to refresh my skills in functional programming. Or better - find some use for it so I can justify re-learning it at work ;)