CAVE - an overview Val Dumitrescu Paweł Raszewski

A quick overview of CAVE, a managed service for monitoring infrastructure, platform, and application metrics, to provide visibility into your system's performance and operational levels. CAVE is built at GILT, using Scala, Play and Akka.

Page 1: CAVE Overview

CAVE - an overview

Val Dumitrescu
Paweł Raszewski

Page 2: CAVE Overview

Another monitoring solution?

At GILT:
● OpenTSDB + Nagios
● DataDog
● NewRelic
● Notifications with PagerDuty

Page 3: CAVE Overview

What is CAVE?

Continuous Audit Vault Enterprise

Page 4: CAVE Overview

What is CAVE?

A monitoring system that is:
● secure
● independent
● proprietary
● open source

Page 5: CAVE Overview

Requirements

● horizontally scalable to millions of metrics, alerts
● multi-tenant, multi-user
● extensible HTTP-based API
● flexible metric definition
● data aggregation / multiple dimensions
● flexible and extensible alert grammar
● pluggable notification delivery system
● clean user interface for graphing and dashboarding

Pages 6-12: CAVE Overview

Architecture

Page 13: CAVE Overview

Alert Grammar

Metric has name and tags (key-value pairs), e.g.
orders [shipTo: US]
response-time [svc: svc-important, env: prod]
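
To make the metric notation concrete, here is a minimal sketch that reads a metric reference into a name and a tag map using plain regular expressions. The names `MetricSyntax`, `Metric` and `parse` are hypothetical and are not taken from CAVE; the character rules (a letter followed by letters, digits, dots or hyphens) are an assumption about what a name or tag may contain.

```scala
object MetricSyntax {
  // Hypothetical model of a metric reference: a name plus key-value tags.
  final case class Metric(name: String, tags: Map[String, String])

  // A name, optionally followed by "[key: value, key: value]".
  private val MetricPattern =
    """\s*([a-zA-Z][a-zA-Z0-9.-]*)\s*(?:\[(.*)\])?\s*""".r
  private val TagPattern =
    """\s*([a-zA-Z][a-zA-Z0-9.-]*)\s*:\s*([a-zA-Z][a-zA-Z0-9.-]*)\s*""".r

  def parse(text: String): Option[Metric] = text match {
    case MetricPattern(name, rawTags) =>
      // rawTags is null when the bracketed part is absent.
      val pieces = Option(rawTags).map(_.trim).filter(_.nonEmpty)
        .map(_.split(",").toList).getOrElse(Nil)
      val tags = pieces.map {
        case TagPattern(key, value) => Some(key -> value)
        case _                      => None
      }
      // Reject the whole input if any tag is malformed.
      if (tags.forall(_.isDefined)) Some(Metric(name, tags.flatten.toMap)) else None
    case _ => None
  }
}
```

For example, `MetricSyntax.parse("orders [shipTo: US]")` yields a `Metric` named `orders` with the single tag `shipTo -> US`, and a bare `orders` yields the same name with no tags.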

Page 14: CAVE Overview

Alert Grammar

Aggregated Metric has metric, aggregator and period of aggregation, e.g.
orders [shipTo: US].sum.5m
response-time [svc: svc-important, env: prod].p99.5m

Supported aggregators:
count, min, max, mean, mode, median, sum, stddev, p99, p999, p95, p90
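
As an illustration of what an aggregator does with the data points that fall inside one aggregation period, the sketch below computes a few of the listed aggregators over an in-memory window. It is not CAVE's implementation: the object and method names are hypothetical, the percentiles use the nearest-rank definition as an assumption, and mode, median and stddev are left out for brevity.

```scala
object Aggregators {
  // Nearest-rank percentile: the smallest sample such that at least
  // p percent of the samples are less than or equal to it.
  def percentile(values: Seq[Double], p: Double): Option[Double] =
    if (values.isEmpty) None
    else {
      val sorted = values.sorted
      val rank = math.ceil(p / 100.0 * sorted.size).toInt
      val index = math.min(math.max(rank, 1), sorted.size) - 1
      Some(sorted(index))
    }

  // Applies the named aggregator to one window of samples.
  // Returns None for an unknown name, or for an empty window
  // (except count, which is 0 for an empty window).
  def aggregate(name: String, values: Seq[Double]): Option[Double] = name match {
    case "count"              => Some(values.size.toDouble)
    case _ if values.isEmpty  => None
    case "sum"                => Some(values.sum)
    case "min"                => Some(values.min)
    case "max"                => Some(values.max)
    case "mean"               => Some(values.sum / values.size)
    case "p90"                => percentile(values, 90)
    case "p95"                => percentile(values, 95)
    case "p99"                => percentile(values, 99)
    case "p999"               => percentile(values, 99.9)
    case _                    => None
  }
}
```

With this sketch, `orders [shipTo: US].sum.5m` would correspond to `Aggregators.aggregate("sum", window)`, where `window` holds the `orders` values tagged `shipTo: US` from the last five minutes.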

Page 15: CAVE Overview

Alert Grammar

Alert Condition contains one expression with two terms and an operator. Each term is a metric, an aggregated metric or a value, e.g.
orders [shipTo: US].sum.5m < 10
orders [shipTo: US].sum.5m < ordersPredictedLow [shipTo: US]
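
One way to picture how such a condition could be evaluated is sketched below: each term is resolved to a number, then the operator compares the two. The types and the `readings` map (current values keyed by the term's text) are hypothetical stand-ins, and only `<` and `>` are modelled because those are the operators shown in the examples; the full operator set is not stated here.

```scala
object AlertCondition {
  // A term is either a literal value or a reference to a (possibly aggregated) metric.
  sealed trait Term
  final case class Value(value: Double) extends Term
  final case class MetricRef(key: String) extends Term

  sealed trait Operator
  case object LessThan extends Operator
  case object GreaterThan extends Operator

  final case class Condition(left: Term, op: Operator, right: Term)

  // Resolves a term against the current readings; None when there is no data.
  def resolve(term: Term, readings: Map[String, Double]): Option[Double] = term match {
    case Value(v)       => Some(v)
    case MetricRef(key) => readings.get(key)
  }

  // Some(true) when the condition holds, Some(false) when it does not,
  // None when either side cannot be resolved.
  def isBroken(c: Condition, readings: Map[String, Double]): Option[Boolean] =
    for {
      l <- resolve(c.left, readings)
      r <- resolve(c.right, readings)
    } yield c.op match {
      case LessThan    => l < r
      case GreaterThan => l > r
    }
}
```

Returning `Option[Boolean]` keeps "no data" distinct from "condition not met", which matters because missing data has its own alert format.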

Page 16: CAVE Overview

Alert Grammar

An optional number of times the threshold is broken, e.g.
response-time [svc: svc-team, env: prod].p99.5m > 3000 at least 3 times
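
The slide does not say whether "at least 3 times" means three breaches anywhere in the evaluated samples or three in a row. The sketch below shows both readings for a `>` threshold so the difference is visible; the function names are hypothetical and neither is claimed to be CAVE's behaviour.

```scala
object Repeater {
  // Reading 1: the threshold is exceeded in at least `times` of the samples.
  def brokenAtLeast(samples: Seq[Double], threshold: Double, times: Int): Boolean =
    samples.count(_ > threshold) >= times

  // Reading 2: length of the longest run of consecutive samples above the threshold.
  def longestRunAbove(samples: Seq[Double], threshold: Double): Int = {
    val (best, _) = samples.foldLeft((0, 0)) { case ((bestSoFar, current), sample) =>
      val next = if (sample > threshold) current + 1 else 0
      (math.max(bestSoFar, next), next)
    }
    best
  }
}
```

For the p99 samples 2900, 3100, 3200, 2800, 3500 and a threshold of 3000, the first reading fires for "at least 3 times" (three breaches in total), while the longest consecutive run is only two.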

Page 17: CAVE Overview

Alert Grammar

Special format for missing data, e.g.
orders [shipTo: US] missing for 5m
heartbeat [svc: svc-important, env: prod] missing for 10m
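
A missing-data check can be sketched as a comparison between the time of the last received data point and the configured duration. The function below is a hypothetical illustration, assuming timestamps in epoch milliseconds and treating a metric that has never reported as missing.

```scala
import scala.concurrent.duration._

object MissingData {
  // True when no data point has arrived within `window` before `nowMillis`.
  // A metric that was never seen (None) counts as missing.
  def isMissing(lastSeenMillis: Option[Long], nowMillis: Long, window: FiniteDuration): Boolean =
    lastSeenMillis.forall(seen => nowMillis - seen > window.toMillis)
}
```

Under these assumptions, `heartbeat [svc: svc-important, env: prod] missing for 10m` would translate to `MissingData.isMissing(lastSeen, now, 10.minutes)`.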

Page 18: CAVE Overview

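Alert Grammar

// Imports needed by the snippets on this and the following slides
import scala.concurrent.duration._
import scala.util.parsing.combinator.JavaTokenParsers

trait AlertParser extends JavaTokenParsers {
  // A term of an alert condition: a literal value, a metric, or an aggregated metric
  sealed trait Source
  case class ValueSource(value: Double) extends Source
  case class MetricSource(
    metric: String, tags: Map[String, String]) extends Source
  case class AggregatedSource(
    metricSource: MetricSource,
    aggregator: Aggregator,
    duration: FiniteDuration) extends Source

  // The two kinds of alert the grammar can express
  sealed trait AlertEntity
  case class SimpleAlert(
    sourceLeft: Source,
    operator: Operator,
    sourceRight: Source,
    times: Int) extends AlertEntity
  case class MissingDataAlert(
    metricSource: MetricSource,
    duration: FiniteDuration) extends AlertEntity
  …
}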

Page 19: CAVE Overview

Alert Grammar

trait AlertParser extends JavaTokenParsers {
  …
  def valueSource: Parser[ValueSource] = decimalNumber ^^ {
    case num => ValueSource(num.toDouble)
  }

  def word: Parser[String] = """[a-zA-Z][a-zA-Z0-9.-]*""".r

  def metricTag: Parser[(String, String)] = (word <~ ":") ~ word ^^ {
    case key ~ value => key -> value
  }

  def metricTags: Parser[Map[String, String]] = repsep(metricTag, ",") ^^ {
    case list => list.toMap
  }
  …
}

Page 20: CAVE Overview

Alert Grammar

trait AlertParser extends JavaTokenParsers {
  …
  def metricSourceWithTags: Parser[MetricSource] =
    (word <~ "[") ~ (metricTags <~ "]") ^^ {
      case metric ~ tagMap => MetricSource(metric, tagMap)
    }

  def metricSourceWithoutTags: Parser[MetricSource] = word ^^ {
    case metric => MetricSource(metric, Map.empty[String, String])
  }

  def metricSource = metricSourceWithTags | metricSourceWithoutTags
  …
}

Page 21: CAVE Overview

Alert Grammar

trait AlertParser extends JavaTokenParsers {
  …
  def duration: Parser[FiniteDuration] = wholeNumber ~ ("s"|"m"|"h"|"d") ^^ {
    case time ~ "s" => time.toInt.seconds
    case time ~ "m" => time.toInt.minutes
    case time ~ "h" => time.toInt.hours
    case time ~ "d" => time.toInt.days
  }

  def aggregatedSource: Parser[AggregatedSource] =
    (metricSource <~ ".") ~ (aggregator <~ ".") ~ duration ^^ {
      case met ~ agg ~ dur => AggregatedSource(met, agg, dur)
    }

  def anySource: Parser[Source] = valueSource | aggregatedSource | metricSource
  …
}

Page 22: CAVE Overview

Alert Grammar

trait AlertParser extends JavaTokenParsers {
  …
  def missingDataAlert: Parser[MissingDataAlert] =
    metricSource ~ "missing for" ~ duration ^^ {
      case source ~ _ ~ d => MissingDataAlert(source, d)
    }

  // Without a repeater, the alert fires the first time the condition holds
  def simpleAlert: Parser[SimpleAlert] = anySource ~ operator ~ anySource ^^ {
    case left ~ op ~ right => SimpleAlert(left, op, right, 1)
  }

  def repeater: Parser[Int] = "at least" ~ wholeNumber ~ "times" ^^ {
    case _ ~ num ~ _ => num.toInt
  }

  def simpleAlertWithRepeater: Parser[SimpleAlert] =
    anySource ~ operator ~ anySource ~ repeater ^^ {
      case left ~ op ~ right ~ num => SimpleAlert(left, op, right, num)
    }
}

Page 23: CAVE Overview

Alert Grammar

trait AlertParser extends JavaTokenParsers {
  …
  def anyAlert: Parser[AlertEntity] =
    missingDataAlert | simpleAlertWithRepeater | simpleAlert
}

Usage:

class Something(conditionString: String) extends AlertParser {
  …
  parseAll(anyAlert, conditionString) match {
    case Success(SimpleAlert(left, op, right, times), _) => …
    case Success(MissingDataAlert(metric, duration), _) => …
    case Failure(message, _) => …
  }
}

Page 24: CAVE Overview

Slick

Functional Relational Mapping (FRM) library for Scala

Slick <> Hibernate

Page 25: CAVE Overview

Slick

compile-time safety
no need to write SQL
full control over what is going on

Page 26: CAVE Overview

Scala Collections API

case class Person(id: Int, name: String)

val list = List(Person(1, "Pawel"),
                Person(2, "Val"),
                Person(3, "Unknown Name"))

Page 29: CAVE Overview

Scala Collections API

case class Person(id: Int, name: String)

val list = List(Person(1, "Pawel"),
                Person(2, "Val"),
                Person(3, "Unknown Name"))

list.filter(_.id > 1).map(_.name)

SELECT name FROM list WHERE id > 1

Page 30: CAVE Overview

Schema

ORGANIZATIONS TEAMS

Page 31: CAVE Overview

Entity mapping

/** Table description of table orgs. */
class OrganizationsTable(tag: Tag) extends Table[OrganizationsRow](tag, "organizations") {
  ...

  /** Database column id AutoInc, PrimaryKey */
  val id: Column[Long] = column[Long]("id", O.AutoInc, O.PrimaryKey)

  /** Database column name */
  val name: Column[String] = column[String]("name")

  /** Database column created_at */
  val createdAt: Column[java.sql.Timestamp] = column[java.sql.Timestamp]("created_at")

  …

  /** Foreign key referencing Organizations (database name token_organization_fk) */
  lazy val organizationsFk = foreignKey("token_organization_fk", organizationId, Organizations)(
    r => r.id,
    onUpdate = ForeignKeyAction.NoAction,
    onDelete = ForeignKeyAction.NoAction)
}

Page 32: CAVE Overview

CRUD

val organizationsTable = TableQuery[OrganizationsTable]

// SELECT * FROM ORGANIZATIONS
organizationsTable.list

// SELECT * FROM ORGANIZATIONS WHERE ID > 10 OFFSET 3 LIMIT 5
organizationsTable.filter(_.id > 10).drop(3).take(5).list

// INSERT
organizationsTable += OrganizationsRow(1, "name", "email", "notificationUrl", ... , None, None)

// UPDATE ORGANIZATIONS SET name = 'new org name' WHERE ID = 10
organizationsTable.filter(_.id === 10).map(_.name).update("new org name")

// DELETE FROM ORGANIZATIONS WHERE ID = 10
organizationsTable.filter(_.id === 10).delete

Page 33: CAVE Overview

Queries - JOINS

val organizationsTable = TableQuery[OrganizationsTable]
val teamsTable = TableQuery[TeamsTable]
val name = "teamName"

val result = for {
  t <- teamsTable.sortBy(_.createdAt).filter(t => t.deletedAt.isEmpty)
  o <- t.organization.filter(o => o.deletedAt.isEmpty && o.name === name)
} yield (t.name, o.name)

SELECT t.name, o.name FROM TEAMS t
LEFT JOIN ORGANIZATIONS o ON t.organization_id = o.id
WHERE t.deleted_at IS NULL AND o.deleted_at IS NULL AND o.name = 'teamName'
ORDER BY t.created_at
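
Following the collections analogy from the earlier slides, the same query shape can be tried out on in-memory lists without a database. This is only an illustrative sketch: the `Team` and `Organization` case classes and their fields are hypothetical stand-ins for the Slick table rows, timestamps are simplified to `Long`, and the navigation `t.organization` is replaced by an explicit match on `organizationId`.

```scala
object InMemoryJoin {
  // Hypothetical row types; a deletedAt of None means "not deleted".
  final case class Organization(id: Long, name: String, deletedAt: Option[Long])
  final case class Team(name: String, organizationId: Long, createdAt: Long, deletedAt: Option[Long])

  // Names of non-deleted teams, oldest first, paired with the name of their
  // non-deleted organization, restricted to organizations called `orgName`.
  def activeTeamsOf(orgName: String,
                    teams: List[Team],
                    orgs: List[Organization]): List[(String, String)] =
    for {
      t <- teams.sortBy(_.createdAt) if t.deletedAt.isEmpty
      o <- orgs if o.id == t.organizationId && o.deletedAt.isEmpty && o.name == orgName
    } yield (t.name, o.name)
}
```

The for-comprehension has the same two generators and the same `yield` as the Slick version; the difference is that Slick translates the comprehension to SQL, whereas this sketch runs it over lists in memory.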

Page 34: CAVE Overview

SELECT t.name, o.name FROM TEAMS t
LEFT JOIN ORGANIZATIONS o ON t.organization_id = o.id
WHERE t.deleted_at IS NULL AND o.deleted_at IS NULL AND o.name = 'teamName'
ORDER BY t.created_at

val result: List[(String, String)]

Page 35: CAVE Overview

Connection pool and transactions

val ds = new BoneCPDataSource

val db = {
  ds.setDriverClass(rdsDriver)
  ds.setJdbcUrl(rdsJdbcConnectionString)
  ds.setPassword(rdsPassword)
  ds.setUser(rdsUser)
  Database.forDataSource(ds)
}

db.withTransaction { implicit session =>
  // SLICK CODE GOES HERE
}