Talk at the DevOps Meetup Stuttgart - DEV vs. OPS

Johannes Mainusch and I gave a meetup talk at Breuninger in Stuttgart on September 6 on the topic “Vertikale Organisation: wer ist hier OPS-verantwortlich?” (vertical organization: who is responsible for OPS here?).

Well, you can’t really call it a talk. It was more like the bickering of an old married couple: the developer and the ops guy. The whole thing ended in a fishbowl discussion and finally at the Mata Hari, talking shop over a beer.

Sebastian Dörner from the Breuninger digital team recorded the talk. Here’s an impression of the evening:

You can find the slides here.

Update

Here’s the talk in full length:

Your HTTPS Setup is Broken III: Patch it with HPKP

In the first blog post of this series, I showed you how easy it is for an attacker to eavesdrop on the SSL/TLS connection between you and your client. This is not a theoretical issue and happens to customers every day. Even strong ciphers and encryption settings don’t help. Why? The problem is trust: if your client trusts any server, it doesn’t matter which ciphers your server is using.

These serious issues (mainly caused by bad Certificate Authorities (CAs)) haven’t stopped since my first post. They’re getting even worse: Blue Coat, a company suspected of selling its TLS man-in-the-middle products to repressive regimes, has obtained an intermediate CA certificate from Symantec, which would allow them to create valid certificates for any domain.

At the end of June, Comodo exposed an HTML injection vulnerability which allowed any user to obtain a valid certificate for any domain for free.

The first thing you should do is enforce HTTPS usage via HTTP Strict Transport Security (HSTS), as shown in my second post. It tells the browser to accept only encrypted connections, which mitigates downgrade attacks to plain HTTP.
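
HSTS is enabled by a single response header sent over HTTPS. A minimal example (the one-year max-age is just a common choice, not a requirement):

Strict-Transport-Security: max-age=31536000; includeSubDomains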

The second thing you should do is enable HPKP (HTTP Public Key Pinning), which protects your HTTPS connections from attacks involving such fraudulent certificates.
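
HPKP is also just a response header: it lists base64-encoded SHA-256 hashes of the public keys the browser may accept for your domain. A sketch with placeholder pin values (the spec requires at least one backup pin):

Public-Key-Pins: pin-sha256="PRIMARY_PIN_BASE64="; pin-sha256="BACKUP_PIN_BASE64="; max-age=5184000; includeSubDomains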

Your HTTPS Setup is Broken II: Patch it with HSTS

This post is the second in the “Your HTTPS Setup is Broken”-series. Previously, I’ve described how easy it is for an attacker to eavesdrop on your “secure” communication. In this post I’ll show you how to enforce encrypted communication, so an attacker cannot downgrade the connection to unencrypted HTTP.

The attack vector we want to mitigate is that somebody intercepts your traffic before it gets switched to HTTPS. Nearly nobody enters “https://www.example.com” into the address bar. Instead, they enter the domain name “example.com” and your server redirects them to HTTPS.

To prevent the user’s browser from switching to HTTPS, a man-in-the-middle could simply strip out all HTTPS links, redirects, and form actions. As the man-in-the-middle is effectively a proxy, it can rewrite any resource links from HTTPS to unencrypted HTTP in order to intercept any requests to those links. All traffic from the man-in-the-middle to the server then gets encrypted within an established SSL session with the server, while all traffic to the user goes over the wire via unencrypted HTTP.

Your HTTPS setup is broken!

So, you use HTTPS to encrypt communication with your customers. Maybe you even use the latest encryption ciphers and algorithms. But you may still have a very big issue in your setup. In this first blog post about HTTPS security, I’ll show that trust is at least as important as encryption when securing communication. Furthermore, I’ll show how untrustworthy the current Certificate Authority infrastructure is.

Type Class 101: A practical guide to Monad Transformers (Example)

The last episode of this series covered the motivation behind Monad Transformers and gave some examples of their usage. Now it is time to show a small real-world application. By chance I stumbled across this section of code in an open source project:

private[hajobs] def retriggerJobs(): Future[List[JobStartStatus]] = {
    def retriggerCount: (JobType) => Int = jobManager.retriggerCounts.getOrElse(_, 10)
    jobStatusRepository.getMetadata(limitByJobType = retriggerCount).flatMap { jobStatusMap =>
      val a = jobStatusMap.flatMap { case (jobType, jobStatusList) =>
        triggerIdToRetrigger(jobType, jobStatusList).map { triggerId =>
          logger.info(s"Retriggering job of type $jobType with triggerid $triggerId")
          jobManager.retriggerJob(jobType, triggerId)
        }
      }.toList
      Future.sequence(a)
    }
  }

It does not matter what the code does. We will just follow the types to improve it in small steps. Before I do that, we should ask why we should improve it at all. For starters, I had a hard time understanding what this thing does. And when I do not understand code, I have a little list of things to look for:

  • There are a couple of flatMaps and maps; such code frequently becomes more readable using for comprehensions.
  • Obviously something is mapped around and then sequenced in the final step. That screams for Future.traverse instead of sequence (see the sketch after this list).
  • Looking at the code in an IDE reveals that some implicit conversions from scala.Predef happen, in particular conversions of Option to List.
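
To illustrate the sequence/traverse point, here is a minimal, self-contained sketch (the names ids and lookup are made up for illustration and are not part of the project):

import scala.concurrent.{ExecutionContext, Future}

def sequenceVsTraverse(ids: List[Int], lookup: Int => Future[String])(implicit ec: ExecutionContext): Unit = {
  // map + sequence: first build a List[Future[String]], then flip it into a Future[List[String]]
  val viaSequence: Future[List[String]] = Future.sequence(ids.map(lookup))
  // Future.traverse does both steps in one go and usually reads better
  val viaTraverse: Future[List[String]] = Future.traverse(ids)(lookup)
}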

Before we start digging in our crates, here is a short lowdown of the types involved:

  • jobStatusRepository.getMetadata(limitByJobType = retriggerCount) returns Future[Map[JobType, List[JobStatus]]]
  • triggerIdToRetrigger(jobType : JobType, jobStatusList : List[JobStatus]) returns Option[UUID]
  • jobManager.retriggerJob(jobType : JobType, triggerId : UUID) returns Future[JobStartStatus]
  • the temporary val a has the type List[Future[JobStartStatus]]
  • The final result is of type Future[List[JobStartStatus]]

The first approach to improving readability would be to try to unify the types a bit and make them clearer:

  • The result of jobStatusRepository.getMetadata(...) could be treated as a Future[List[(JobType, List[JobStatus])]]
  • If I had a list of triggerIds I could do Future.traverse(triggerIds)(triggerId => jobManager.retriggerJob(jobType, triggerId))

This yields

private[hajobs] def retriggerJobs(): Future[List[JobStartStatus]] = {
    def retriggerCount: (JobType) => Int = jobType => jobManager.retriggerCounts.getOrElse(jobType, 10)

    for {
      metaDataList <- jobStatusRepository.getMetadata(limitByJobType = retriggerCount).map(_.toList)
      triggerIds <- ???
      jobStartStatusList <- Future.traverse(triggerIds) { triggerId =>
        logger.info(s"Retriggering job of type $jobType with triggerid $triggerId")
        jobManager.retriggerJob(jobType, triggerId)
      }
    } yield {
      jobStartStatusList
    }
  }

Now we have:

  • metaDataList of type List[(JobType, List[JobStatus])]
  • triggerIds must be of type List[UUID]
  • the means to get UUIDs is triggerIdToRetrigger(jobType : JobType, jobStatusList : List[JobStatus])
  • the right-hand side of the for comprehension (marked as ???) must return a Future[List[UUID]]

So we get:

triggerIds <- Future.successful {
  for {
    (jobType, jobStatusList) <- metaDataList
    triggerId <- triggerIdToRetrigger(jobType, jobStatusList).toList
  } yield triggerId
}

or as a final complete version:

private[hajobs] def retriggerJobs(): Future[List[JobStartStatus]] = {
    def retriggerCount: (JobType) => Int = jobType => jobManager.retriggerCounts.getOrElse(jobType, 10)

    for {
      metaDataList <- jobStatusRepository.getMetadata(limitByJobType = retriggerCount).map(_.toList)
      triggerIds <- successful {
        for {
          (jobType, jobStatusList) <- metaDataList
          triggerId <- triggerIdToRetrigger(jobType, jobStatusList).toList
        } yield triggerId
      }
      jobStartStatusList <- Future.traverse(triggerIds) { triggerId =>
        logger.info(s"Retriggering job of type $jobType with triggerid $triggerId")
        jobManager.retriggerJob(jobType, triggerId)
      }
    } yield {
      jobStartStatusList
    }
  }

To recap - we now have a slightly better version (in my personal opinion) of the code, without implicit conversions. We use for comprehensions throughout. On the left-hand side we always have Lists, and on the right-hand side we always have Future[List[XXX]]. Now, if the whole code did not use futures at all, a solution would be very straightforward:

for {
  (jobType, jobStatusList) <- jobStatusRepository.getMetadata(limitByJobType = retriggerCount).toList
  triggerId <- triggerIdToRetrigger(jobType, jobStatusList).toList
  jobStartStatus <- {
    logger.info(s"Retriggering job of type $jobType with triggerid $triggerId")
    jobManager.retriggerJob(jobType, triggerId)
  }
} yield jobStartStatus

With this realization we can finally turn to Monad Transformers to the rescue. We choose ListT[Future, ?] because we want our Future[List[?]] values to behave as if they were plain lists.

The final result could look like this:

private[hajobs] def retriggerJobs(): Future[List[JobStartStatus]] = {
    def retriggerCount: (JobType) => Int = jobType => jobManager.retriggerCounts.getOrElse(jobType, 10)

    (for {
      (jobType, jobStatusList) <- ListT(jobStatusRepository.getMetadata(limitByJobType = retriggerCount).map(_.toList))
      triggerId                <- ListT(triggerIdToRetrigger(jobType, jobStatusList).toList.point[Future])
      jobStartStatus           <- jobManager.retriggerJob(jobType, triggerId).liftM[ListT]
    } yield jobStartStatus).run
  }

An alternative version, which avoids having to lift everything into the ListT monad, is to use flatMapF (which takes a function A => F[List[B]]) like this:

ListT(jobStatusRepository.getMetadata(limitByJobType = retriggerCount).map(_.toList))
      // jobType has to be carried along so it is still in scope for retriggerJob below
      .flatMapF { case (jobType, jobStatusList) =>
        triggerIdToRetrigger(jobType, jobStatusList).toList.map(triggerId => (jobType, triggerId)).point[Future]
      }
      .flatMapF { case (jobType, triggerId) => jobManager.retriggerJob(jobType, triggerId).map(List(_)) }
      .run

I used some helper functions from scalaz, such as liftM. Look them up in the scaladocs; it’s fun. Other helper functions were deliberately not used, to keep the concepts clearer. If you have questions, just ask in the comments.

For further articles in this series: TypeClass101

Type Class 101: A practical guide to Monad Transformers

Let’s say you are a typical Scala programmer, making plenty of use of Futures in your code. Sooner or later you end up with APIs like the following:

 case class Article(id: Int, ..., metaInformationId : Int)
 case class MetaInformation(id: Int, key: String, value: String)

 // get an Article from the DB if it is there.
 def getArticle(id: Int) : Future[Option[Article]] = ???
 // retrieve MetaInformation. If an article references it, it must be in the DB.
 def getMetaInformation(id: Int) : Future[List[MetaInformation]] = ???

And for starters, let’s say you want to retrieve 3 articles and return something like Future[Option[(Article, Article, Article)]], which means you want Some tuple if all three articles could be retrieved, None if any of the articles could not be found, and a failed Future if any of the database accesses failed.
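
Composing those three lookups by hand, without a transformer, looks roughly like this (a minimal sketch using only the standard library and the getArticle signature above; getThreeArticles is a made-up name):

import scala.concurrent.{ExecutionContext, Future}

def getThreeArticles(id1: Int, id2: Int, id3: Int)(implicit ec: ExecutionContext): Future[Option[(Article, Article, Article)]] =
  getArticle(id1).flatMap {
    case None     => Future.successful(None)
    case Some(a1) =>
      getArticle(id2).flatMap {
        case None     => Future.successful(None)
        case Some(a2) =>
          getArticle(id3).map {
            case None     => None
            case Some(a3) => Some((a1, a2, a3))
          }
      }
  }

This is exactly the kind of nesting that a Monad Transformer such as OptionT later removes.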

Cassandra - to BATCH or not to BATCH

This post is about Cassandra’s batch statements and about which kinds of batch statements are okay to use and which are not. Often, when batch statements are discussed, it’s not clear whether a particular statement refers to single-partition or multi-partition batches or to both - which is the most important question IMO (you should know why after you’ve read this post).

Instead, the distinction is often drawn between logged and unlogged batches ([1], [2], [3]). For one thing, logged/unlogged may cause misunderstandings (more on this below). For another, unlogged batches (for multiple partitions) have been deprecated since Cassandra 2.1.6 ([1], [4], [5]). Therefore I think the logged/unlogged distinction is no longer that useful.

In the following I’m describing multi-partition and single-partition batches: what they provide, what they cost, and when it’s okay to use them.

TL;DR: Multi-partition batches should only be used to achieve atomicity for a few writes on different tables. Apart from that, they should be avoided because they’re too expensive. Single-partition batches can be used to get atomicity plus isolation; they’re not much more expensive than normal writes.
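
To make the distinction concrete, here is a small CQL sketch (table and column names are made up): in the first batch all rows share the same partition key (user_id = 42), so it is a single-partition batch; the second batch touches two different partition keys and is therefore a multi-partition batch.

BEGIN BATCH
  INSERT INTO user_events (user_id, event_id, payload) VALUES (42, now(), 'a');
  INSERT INTO user_events (user_id, event_id, payload) VALUES (42, now(), 'b');
APPLY BATCH;

BEGIN BATCH
  INSERT INTO user_events (user_id, event_id, payload) VALUES (42, now(), 'a');
  INSERT INTO user_events (user_id, event_id, payload) VALUES (43, now(), 'b');
APPLY BATCH;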

Integrating Self-Contained Systems - be careful with your SSIs!

In this post I’m going to describe an issue we experienced with nginx and its handling of Server Side Includes (SSIs). We saw that nginx first decodes the SSI URI path and then re-encodes it when loading the resource - and in some cases, the URI path re-encoded by nginx differed from the original one. The solution is easy (use query parameters if in doubt), but I thought I’d share this so that others don’t run into this issue and/or can see how to debug such things.
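
As a concrete illustration of the workaround (the fragment paths and parameter names here are made up): keep the variable, possibly percent-encoded part of the URI in a query parameter instead of a path segment, e.g.

<!--# include virtual="/fragments/header?tenant=foo%2Fbar" -->

rather than

<!--# include virtual="/fragments/foo%2Fbar/header" -->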

Event Sourcing/CQRS with Akka Persistence and Eventuate

Here’s a short post with linked slides and the recording of our first Reactive Systems Hamburg Meetup, where Martin Krasser compared the Event Sourcing/CQRS tools Akka Persistence (which he also authored, as the successor to his Eventsourced lib) and Eventuate (which he’s now building for Red Bull Media House to support a globally distributed system).

First, a bit of our meetup history: in Hamburg we had three meetups/user groups related to Scala and the Typesafe stack: the Scala Meetup itself, the Akka Meetup (organized by Lutz Hühnken and Markus Jura from Typesafe) and our Playframework Meetup (run by my colleague Markus Klink and me, Martin Grotzke). The Akka and Playframework Meetups often had overlapping topics, so we decided to bring them together and combine our forces. One major topic of many meetups/talks was this Reactive Thing (which also happens to be a major architectural driver in our current projects at inoio), so it was fairly obvious what the resulting meetup should be about - the Reactive Systems Hamburg Meetup was born.

One of the main advantages we see is that we’re no longer focusing on a specific product or framework, but on an architectural approach instead. So now we’re looking forward to seeing great talks about other tools such as Vert.x, ReactiveX, Reactor, etc., and also about general approaches to the different reactive traits (responsive, resilient, elastic and message-driven).

For our first meetup we invited Martin Krasser to talk about Event Sourcing and CQRS in distributed systems - Martin is an expert in this area. He is the original author of Akka Persistence (an event sourcing implementation based on Akka) as well as of Eventuate (which adds optional partition tolerance and support for globally distributed systems, via a causal consistency model).

Here is a short overview of Martin’s (really great!) talk:

  • After talking about the history of Akka Persistence and Eventuate, he showed the similarities between the two (Scala, Akka, Streams, plus storage backends)
    • As part of this Martin showed the life cycle and relevant command processing / state recovery related methods of event-sourced actors
  • He explained the differences between the two tools
    • In terms of CAP, Akka Persistence chooses CP while Eventuate also allows you to choose AP - both support strong consistency, and Eventuate additionally allows relaxation to causal consistency
    • Martin showed how state replication of EventsourcedActors works in Eventuate, including handling of network partitions and automated and interactive conflict resolution. As part of this he also explained causal consistency (really nice!)
    • Then Martin pointed out differences regarding the event log, event routing and event collaboration (consumption of events from different event-sourced actors)
    • Afterwards he talked about the query-side differences (and how Eventuate provides causal consistency there)
  • As the last part, Martin did a quick comparison of Akka Distributed Data and Eventuate, and their approaches to CRDTs (conflict-free replicated data types)

In summary this was an amazing talk, and afterwards Martin answered many questions and we had a great discussion! Thanks a lot to Martin and to all attendees - the first Reactive Systems Hamburg Meetup couldn’t have been better!

So now you probably want to check out Martin’s slides and enjoy the video!

Many thanks to my colleague Matthias Brandt for organizing the recording together with aussenborder.TV!