CosmosDb

2020

Follow along at:

http://itsnull.com/

http://itsnull.com/presentations/cosmos2020/

Created by Kip Streithorst / @itsnull

Overview

  • Overview
  • Demo application
  • Live polls for the audience
  • Deep dive

Questions

  • Ask questions, please
  • If I just wanted to record a video, I would have.
  • This is better than a video: you can ask questions.

What is Cosmos?

  • Globally available, replicated, high-performance NoSQL storage solution
  • Store schema-less documents
  • Turn-key global replication
  • Supports cloud scale by automatically scaling horizontally (both storage and throughput)
  • Billed via RUs
  • SLAs for read and write performance

Poll Time

What is NoSQL?

  • Store a document, think JSON payload
    {
      "FirstName": "Jane",
      "LastName": "Doe"
    }
                
  • Schema-less, each document can be completely different from every other document.

What is NoSQL?

  • A document can contain complex structured data, think
    {
      "FirstName": "Jane",
      "LastName": "Doe",
      "Children": [
        {
          "FirstName": "Bobby",
          "LastName": "Doe",
          "Age": "14",
          "Hobbies": ["Video Games"]
        }
      ]
    }
                
  • Data duplication is normal
  • Enforcement of data constraints happens up a layer (at application-level)
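
Since the store will accept any document shape, a check like the following has to live in application code. This is a minimal sketch in Python; `validate_person` and the specific rules are assumptions made up for illustration, using the field names from the sample document above.

```python
def validate_person(document):
    """Return a list of problems; an empty list means the document is acceptable."""
    problems = []
    for field in ("FirstName", "LastName"):
        # Required string fields: the database itself would not enforce this.
        if not isinstance(document.get(field), str) or not document[field].strip():
            problems.append(f"{field} is required")
    for index, child in enumerate(document.get("Children", [])):
        if not isinstance(child.get("FirstName"), str):
            problems.append(f"Children[{index}].FirstName is required")
    return problems


print(validate_person({"FirstName": "Jane", "LastName": "Doe"}))
print(validate_person({"FirstName": "Jane"}))
```

The application would call a validator like this before every write and refuse to store documents that report problems.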

CosmosDb

  • Now, let's get into Cosmos specifics

Document Overview

  • Every document is required to have two fields set, Id and PartitionKey
  • A document is located using (Id and PartitionKey)
  • Id can be a string of any kind up to 255 characters
  • PartitionKey will be explained more later
  • Some built-in fields are provided
    • _ts - timestamp of last update
    • _etag - used to tell if document has been changed
    • _rid, _self - internal fields
    • _attachments - have not used myself
    • ttl - time to live in seconds, set by you (note: no leading underscore); preferred method to delete documents

SLAs and Performance

  • Read Document
    • 10ms SLA
    • Cost is 1 Request Unit (RU) to read 1 KB document
  • Write Document
    • 10ms SLA
    • This could insert, upsert, delete, overwrite
    • Cost depends on the operation, size of document and which properties are indexed

SLAs and Performance

  • Query documents
    • No performance SLA
    • Cost depends greatly on whether the query is single-partition or cross-partition, and on which properties are indexed
    • Generally 100x-1000x or more costly and slower than either reading or writing a document
    • Avoid it at all costs
  • You pre-provision request units per second (RU/s). If you exceed that rate, Cosmos will throttle your requests (i.e. HTTP 429 Too Many Requests)
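
The throttling rule can be sketched with a toy model. This is not the Cosmos SDK: `RuBudget` and `ThrottledError` are hypothetical names, and resetting the allowance on whole-second boundaries is a simplifying assumption about how the service meters RU/s.

```python
class ThrottledError(Exception):
    """Stands in for an HTTP 429 response (hypothetical name)."""


class RuBudget:
    """Toy model of provisioned throughput: a fixed RU allowance per one-second window."""

    def __init__(self, provisioned_ru_per_sec):
        self.provisioned = provisioned_ru_per_sec
        self.window = None
        self.used = 0.0

    def charge(self, now_seconds, request_units):
        window = int(now_seconds)
        if window != self.window:
            # A new second started, so the allowance resets.
            self.window = window
            self.used = 0.0
        if self.used + request_units > self.provisioned:
            raise ThrottledError("request rate too large")
        self.used += request_units


budget = RuBudget(provisioned_ru_per_sec=400)
served = 0
throttled = 0
# 500 point reads of a 1 KB document (1 RU each) arriving within the same second.
for _ in range(500):
    try:
        budget.charge(now_seconds=0.5, request_units=1)
        served += 1
    except ThrottledError:
        throttled += 1
print(served, throttled)  # 400 100
```

A real client would wait and retry the throttled requests; a single query costing hundreds of RUs would drain the same budget far faster than point reads do.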

Physical Partitions

  • A new Cosmos database starts with 1 Physical Partition (i.e. one server)
    • Max of 10,000 RU/s
    • Min of 400 RU/s
    • Max of 20 GB of storage (previously was 10 GB limit)
  • Details of physical partition are hidden and auto-managed by the service

Physical Partitions

  • If you provision more than 10,000 RU/s or load more than 20 GB of data, then at some point (the secret sauce) the service will split your single physical partition (i.e. server) in two. The result is two new physical partitions, and the old parent partition is retired.
  • A document will only live in 1 physical partition, regardless of how many there are.
  • Each physical partition maintains an index for only those documents stored on that partition.

Physical Partitions

  • When reading or writing a document, you must provide both the PartitionKey and the Id
    • The SDK will directly contact the physical partition (i.e. server) responsible for holding that document. All other partitions (i.e. servers) receive ZERO load.
  • var cosmosDb = new Dictionary<string, Dictionary<string, object>>();
  • Write is cosmosDb[partitionKey][id] = new { Name = "Jane Doe" };
  • Read is cosmosDb[partitionKey][id]
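
The dictionary analogy can be extended to show the routing. The sketch below is in Python and uses an in-memory stand-in, not the SDK; `ToyContainer` is a hypothetical name, and hashing the partition key with SHA-256 modulo the partition count is an assumption for illustration (the service uses its own hashing scheme).

```python
import hashlib


class ToyContainer:
    """In-memory stand-in for a container spread over several physical partitions."""

    def __init__(self, physical_partition_count):
        # Each physical partition is its own {partition_key: {id: document}} mapping.
        self.partitions = [{} for _ in range(physical_partition_count)]
        self.requests_served = [0] * physical_partition_count

    def _route(self, partition_key):
        # Stable hash so the same partition key always lands on the same partition.
        digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
        return int.from_bytes(digest[:4], "big") % len(self.partitions)

    def write(self, partition_key, doc_id, document):
        index = self._route(partition_key)
        self.requests_served[index] += 1
        self.partitions[index].setdefault(partition_key, {})[doc_id] = document

    def read(self, partition_key, doc_id):
        index = self._route(partition_key)
        self.requests_served[index] += 1
        return self.partitions[index][partition_key][doc_id]


container = ToyContainer(physical_partition_count=4)
container.write("Ohio", "42", {"Name": "Jane Doe"})
print(container.read("Ohio", "42"))
print(container.requests_served)  # one partition served both requests, the rest served none
```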

Physical Partitions

  • When querying documents, you can provide a PartitionKey
    • The SDK will directly contact the one physical partition (i.e. server) that holds that key to execute the query. All other partitions (i.e. servers) receive ZERO load.
  • When querying documents, you can instead specify that you want a cross-partition query
    • The SDK will simultaneously query multiple physical partitions (i.e. servers). Each executes the query against its own stored data and returns the relevant results (possibly zero for a given partition). The SDK then stitches the results together as they are returned.
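
A cross-partition query is a fan-out followed by a merge. The Python sketch below models that with threads over toy in-memory data; the function names are hypothetical, and for simplicity it merges results in partition order rather than arrival order.

```python
from concurrent.futures import ThreadPoolExecutor

# Each physical partition holds only its own documents (toy data).
physical_partitions = [
    [{"State": "Ohio", "LastName": "Doe"}, {"State": "Ohio", "LastName": "Smith"}],
    [{"State": "Texas", "LastName": "Doe"}],
    [{"State": "North Carolina", "LastName": "Jones"}],
]


def query_partition(documents, predicate):
    """Run the filter against one partition's local data."""
    return [doc for doc in documents if predicate(doc)]


def cross_partition_query(partitions, predicate):
    """Fan out to every partition in parallel, then stitch the results together."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        per_partition = pool.map(lambda docs: query_partition(docs, predicate), partitions)
        return [doc for results in per_partition for doc in results]


matches = cross_partition_query(physical_partitions, lambda doc: doc["LastName"] == "Doe")
print(matches)
```

Every partition does work here, including the one that returns nothing, which is why the cross-partition form costs more than the single-partition form.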

Poll Time

Logical Partitions

Partition Key     Documents
North Carolina    1,500
Ohio              4,500
Texas             2,500
  • Currently everything lives on 1 physical partition
  • Now, increase load but keep only these 3 partition keys. Cosmos could create at most 1 physical partition per logical partition, so you end up with at most 3 physical partitions. That gives an upper limit of 20 GB * 3 = 60 GB and 10,000 RU/s * 3 = 30,000 RU/s

Logical Partitions

Partition Key     Documents
North Carolina    1,500
Ohio              4,500
Texas             2,500
  • Let's add 47 more states, while increasing load.
  • We could now have 50 physical partitions, i.e. up to 20 GB * 50 = 1 TB and 10,000 RU/s * 50 = 500,000 RU/s. Each state, though, can still hold only 20 GB of data and serve only 10,000 RU/s of traffic. Might California and New York exceed capacity?
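
The ceiling arithmetic from the last two slides, as a small Python sketch. The two limits are the figures quoted in this deck; `container_ceiling` is a hypothetical helper name.

```python
MAX_GB_PER_PARTITION = 20        # storage limit quoted in this deck
MAX_RU_PER_PARTITION = 10_000    # throughput limit quoted in this deck


def container_ceiling(logical_partition_count):
    """Upper bound when every logical partition gets its own physical partition."""
    return (
        logical_partition_count * MAX_GB_PER_PARTITION,
        logical_partition_count * MAX_RU_PER_PARTITION,
    )


for keys in (3, 50):
    storage_gb, ru_per_sec = container_ceiling(keys)
    print(f"{keys} partition keys: {storage_gb} GB, {ru_per_sec:,} RU/s")
# 3 partition keys: 60 GB, 30,000 RU/s
# 50 partition keys: 1000 GB, 500,000 RU/s
```

The totals grow with the number of distinct partition keys, but the per-key limits never move.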

Logical Partitions

  • What if we changed PartitionKey to City? Might some cities exceed 20 GB and 10,000 RU/s? Maybe a different PartitionKey?
  • Remember though you need both the PartitionKey and Id to read or write a document
  • Choosing a PartitionKey can be very difficult and will directly correlate to how well CosmosDb will work as a solution for you.
  • With a good PartitionKey, the system is basically limitless.

Logical Partitions

  • Cosmos decides when and how to split physical partitions. In general, I have NEVER seen it rejoin partitions.
  • It will never split a logical partition into two (e.g. all documents belonging to a single state will always reside on a single physical partition or server).
  • Hot partitions result from a bad PartitionKey (e.g. all requests go to the same physical server and Cosmos can't split a single logical partition). Since it can't split the partition, traffic will be throttled.
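
A hot partition is easy to see once requests are tallied per partition key. The workload below is made up for illustration (hypothetical request counts and a flat 5 RU cost per request); the per-partition limit is the figure quoted in this deck.

```python
from collections import Counter

MAX_RU_PER_PARTITION = 10_000  # per-partition limit quoted in this deck

# Hypothetical one-second workload: (partition key, RU cost) pairs.
requests = [("Ohio", 5)] * 2_500 + [("Texas", 5)] * 300 + [("North Carolina", 5)] * 200

ru_by_key = Counter()
for partition_key, cost in requests:
    ru_by_key[partition_key] += cost

for partition_key, ru in ru_by_key.most_common():
    status = "throttled" if ru > MAX_RU_PER_PARTITION else "ok"
    print(f"{partition_key}: {ru} RU/s ({status})")
# Ohio: 12500 RU/s (throttled)
# Texas: 1500 RU/s (ok)
# North Carolina: 1000 RU/s (ok)
```

The total is only 15,000 RU/s, well under what three physical partitions could serve together, yet Ohio is throttled because its logical partition cannot be split.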

Consistency

  • Writing a single document is atomic, the write will fully succeed or fully fail. This goes for creating a document as well.
  • When mutating a document, you must provide the entire document contents.

Consistency

  • By default, the SDKs operate in last-write-wins mode.
  • However, each document maintains an _etag property. You can provide that _etag back with a write operation and Cosmos will only permit the write operation if the provided _etag value matches the value in the current document. This gives you optimistic concurrency to ensure two users don't overwrite each other's changes.
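
The _etag check can be modeled in a few lines of Python. This is an in-memory sketch, not the SDK: `ToyStore`, `PreconditionFailed`, and the `if_match` parameter name are assumptions for illustration (the real service answers a failed check with HTTP 412).

```python
import uuid


class PreconditionFailed(Exception):
    """Stands in for an HTTP 412 response (hypothetical name)."""


class ToyStore:
    """In-memory store that changes a document's _etag on every write."""

    def __init__(self):
        self.documents = {}

    def read(self, key):
        return dict(self.documents[key])

    def replace(self, key, document, if_match=None):
        current = self.documents.get(key)
        if if_match is not None and current is not None and current["_etag"] != if_match:
            raise PreconditionFailed("document changed since it was read")
        stored = dict(document)
        stored["_etag"] = uuid.uuid4().hex  # new version marker
        self.documents[key] = stored


store = ToyStore()
store.replace("jane", {"Name": "Jane Doe"})

alice_copy = store.read("jane")
bob_copy = store.read("jane")

alice_copy["Name"] = "Jane Smith"
store.replace("jane", alice_copy, if_match=alice_copy["_etag"])  # succeeds

bob_copy["Name"] = "Janet Doe"
try:
    store.replace("jane", bob_copy, if_match=bob_copy["_etag"])
    bob_result = "overwrote Alice's change"
except PreconditionFailed:
    bob_result = "rejected, must re-read and retry"
print(bob_result)  # rejected, must re-read and retry
```

Leaving `if_match` out gives the default last-write-wins behavior: Bob's stale copy would silently replace Alice's change.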

Consistency

  • What if you want to update multiple documents atomically?
  • You must use Cosmos Stored Procedures. They are written in JavaScript and execute on a physical partition.
  • To execute, you must provide a PartitionKey value and the entire logical partition is locked during the stored procedure execution.
  • Many documents belonging to the same logical partition can then be updated in an ACID compliant manner.
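
The all-or-nothing behavior can be modeled in Python. Real stored procedures are JavaScript running inside the service; the sketch below only illustrates the commit-or-rollback semantics within one logical partition, and `run_in_partition`, `transfer_votes`, and the document shapes are hypothetical.

```python
import copy


def run_in_partition(container, partition_key, procedure):
    """Apply `procedure` to one logical partition; commit all changes or none."""
    working_copy = copy.deepcopy(container[partition_key])
    procedure(working_copy)          # any exception leaves the original untouched
    container[partition_key] = working_copy


def transfer_votes(documents):
    """Move 3 votes between two documents; fails if the source would go negative."""
    documents["option-a"]["Votes"] -= 3
    if documents["option-a"]["Votes"] < 0:
        raise ValueError("not enough votes to move")
    documents["option-b"]["Votes"] += 3


container = {"poll-1": {"option-a": {"Votes": 2}, "option-b": {"Votes": 5}}}
try:
    run_in_partition(container, "poll-1", transfer_votes)
except ValueError:
    pass
print(container["poll-1"])  # {'option-a': {'Votes': 2}, 'option-b': {'Votes': 5}}
```

The failed run leaves both documents exactly as they were, even though the first one had already been modified inside the procedure.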

Poll Time

Demo Code

  • Let's look at code for the Polling Application
  • https://github.com/kstreith/poll-app/

Cosmos Change Feed

  • Native feature of Cosmos
  • Provides a feed of each change to a document in a collection
  • Provides changes in order by modification time within each logical partition.
  • Changes can be synchronized from any point-in-time or all changes from the beginning.
  • Only guaranteed to have the latest version of a document (intermediate changes may be dropped over time).
  • Document deletes do NOT appear. Instead, update the document with a Time To Live (i.e. the ttl property) and let it expire

Cosmos Change Feed

  • Check-pointing must be done in your code; the Cosmos service doesn't handle it for you.
  • The Cosmos v3 SDK provides check-pointing logic.
  • Can be used to replicate data to other systems in real-time.
  • Can be used to implement CQRS or Event Sourcing patterns.
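
Consumer-side check-pointing can be sketched as follows. This is a toy Python model, not the SDK's change feed processor: `ToyChangeFeed` and `read_from` are hypothetical, and an integer list offset stands in for the continuation state a real consumer would persist.

```python
class ToyChangeFeed:
    """Append-only list of changes for one partition (toy model)."""

    def __init__(self):
        self.changes = []

    def append(self, document):
        self.changes.append(document)

    def read_from(self, position):
        """Return changes after `position` and the new position to checkpoint."""
        batch = self.changes[position:]
        return batch, position + len(batch)


feed = ToyChangeFeed()
checkpoint = 0          # the consumer, not the server, remembers this
processed = []

feed.append({"id": "1", "Answer": "A"})
feed.append({"id": "2", "Answer": "B"})
batch, checkpoint = feed.read_from(checkpoint)
processed.extend(doc["id"] for doc in batch)

feed.append({"id": "3", "Answer": "A"})
# After a restart, resuming from the saved checkpoint skips what was already handled.
batch, checkpoint = feed.read_from(checkpoint)
processed.extend(doc["id"] for doc in batch)
print(processed, checkpoint)  # ['1', '2', '3'] 3
```

Starting again from position 0 would replay every change from the beginning, which is the other synchronization option the feed offers.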

Poll App - Change Feed

  • A WebJob runs continuously against Cosmos Change Feed.
  • For any PollResponse documents, accumulates the value into the PollResult document.
  • Demo
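
The accumulation step might look roughly like this in Python. It is a sketch under assumptions, not the demo's actual code: the `Type`, `PollId`, and `Answer` field names and the `apply_changes` helper are hypothetical.

```python
from collections import Counter


def apply_changes(poll_results, changed_documents):
    """Fold PollResponse documents into a running tally per poll."""
    for doc in changed_documents:
        if doc.get("Type") != "PollResponse":
            continue  # the feed carries every document type in the collection
        poll_results.setdefault(doc["PollId"], Counter())[doc["Answer"]] += 1
    return poll_results


changes = [
    {"Type": "PollResponse", "PollId": "p1", "Answer": "Yes"},
    {"Type": "PollResult", "PollId": "p1"},
    {"Type": "PollResponse", "PollId": "p1", "Answer": "No"},
    {"Type": "PollResponse", "PollId": "p1", "Answer": "Yes"},
]
results = apply_changes({}, changes)
print(dict(results["p1"]))  # {'Yes': 2, 'No': 1}
```

Skipping non-PollResponse documents matters here: writing the tally back as a PollResult document produces its own change feed entry, which must not be counted again.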

Poll App - Metrics

  • Let's look at metrics for the CosmosDb instance.
  • The free tier of CosmosDb is 400 RU/s and 5 GB of storage

Pricing

  • You pay for storage
  • You pay for RU/s
  • The standard model was to pre-provision RU/s and pay for that amount regardless of the amount being used.
  • You can buy RUs in bulk ahead to get a discount (15% to 65% discount)

Pricing

  • Just announced Autoscale pricing
  • Set a maximum RU/s; the system will autoscale between 10% of max and max
  • You pay for the highest RU/s the system scaled to in a given hour.
  • Cost is 1.5x the cost of pre-provisioned RU/s
  • In the next few months, a fully pay-per-request model (i.e. serverless) will be in public preview.
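
One way to read the autoscale bullets as arithmetic, sketched in Python. The 10% floor and 1.5 multiplier come from this slide; the helper names and the exact clamping are assumptions, not the official billing formula.

```python
AUTOSCALE_MULTIPLIER = 1.5   # relative to pre-provisioned RU/s, per this deck


def billed_ru_per_sec(max_ru_per_sec, peak_consumed_ru_per_sec):
    """RU/s billed for one hour: the hour's peak, clamped to [10% of max, max]."""
    floor = max_ru_per_sec / 10
    return min(max(peak_consumed_ru_per_sec, floor), max_ru_per_sec)


def relative_hourly_cost(max_ru_per_sec, peak_consumed_ru_per_sec):
    """Cost as a fraction of pre-provisioning `max_ru_per_sec` for that hour."""
    billed = billed_ru_per_sec(max_ru_per_sec, peak_consumed_ru_per_sec)
    return AUTOSCALE_MULTIPLIER * billed / max_ru_per_sec


quiet_hour = relative_hourly_cost(max_ru_per_sec=4000, peak_consumed_ru_per_sec=100)
busy_hour = relative_hourly_cost(max_ru_per_sec=4000, peak_consumed_ru_per_sec=4000)
print(quiet_hour, busy_hour)
```

Under this reading, a quiet hour costs 15% of the pre-provisioned price and a fully loaded hour costs 150%, so autoscale pays off when the hourly peak usually stays below two thirds of the maximum.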

Queries

  • If we shouldn't query, why does Cosmos provide a SQL query API?
  • You can use queries, just be aware your RU consumption will be much more unpredictable.
  • What is Cosmos? Is it your transactional store?
  • Or is it your query store?

Queries

  • In preview is the Analytical data store
  • A decoupled columnar storage is allocated
  • The data is auto synchronized from your Cosmos collection to the analytical data storage
  • Query using Apache Spark or SQL using Azure Synapse
  • Queries have no impact on your transactional performance

Key Take-Aways

  • Treat Cosmos as a Dictionary<string, object> - combine id and partitionKey together into a single key in app code
  • Avoid Querying Cosmos
  • Allocate RUs at the Database level, not the collection level
  • Store multiple document types in a collection
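
One possible way to combine the two values into a single application-level key, sketched in Python. The length-prefixed format and the `make_key` / `split_key` names are assumptions for illustration; any reversible encoding would do.

```python
def make_key(partition_key, doc_id):
    """Pack both lookup values into one opaque string for URLs or references."""
    # The length prefix avoids ambiguity when either value contains the separator.
    return f"{len(partition_key)}:{partition_key}{doc_id}"


def split_key(key):
    """Recover (partition_key, doc_id) from a key produced by make_key."""
    length, rest = key.split(":", 1)
    size = int(length)
    return rest[:size], rest[size:]


key = make_key("Ohio", "42")
print(key)             # 4:Ohio42
print(split_key(key))  # ('Ohio', '42')
```

Handing out the combined key means every later lookup already has both values needed for a cheap point read, so no query is required to find the document.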

Things We Skipped

  • Multiple consistency models
    • Strong
    • Bounded Staleness
    • Session
    • Consistent Prefix
    • Eventual
  • Multi-region support

Things We Skipped

  • There are multiple API front-ends for Cosmos
    • Gremlin
    • MongoDb
    • Cassandra
    • Table
    • Etcd
  • I only showed the SQL API

Things We Skipped

  • Stored procedures
  • Triggers
  • User-defined functions
  • Security
  • Backup
  • Index Configuration

Thanks, Any Questions?