Azure CosmosDB Pricing Intricacies

The Azure CosmosDB system has the potential to be a great storage layer for your solution. It automatically scales to maintain performance by splitting data into partitions. It can geo-replicate to minimize data transfer latencies and offers multiple consistency models to suit your different needs.

But…

In CosmosDB you pay for two things within a data center.
1 – The storage you use, which is a flat fee per gigabyte.
2 – The amount of RUs (Request Units) you would like to provision per second for performance. An RU is an abstract unit roughly equivalent to the cost of reading a 1 KB document.
If you replicate your dataset to another data center for redundancy or to reduce latency, these costs are PER data center.

Now for the trickiness: when you provision performance, let’s say 10k RU/sec, you are saying that you would like the entire dataset to be served at that performance level. Then comes a more complicated subject: partitions. A partition is basically a split of your container into multiple parts that represent different sections of your data. For example, if your partition key is a person’s family name, you might end up with partitions for people whose last names start with [A-G], [H-J], [K-Q], [R-Z]. In this scenario there are 4 partitions, which must share the throughput equally, hence 2,500 RU/sec per partition. Note that globally the performance level is the same, but if, for a given amount of time, only one partition is solicited, it will appear as if only 2,500 RU/sec were available.
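To make the arithmetic concrete, here is a minimal sketch (in Python, with the numbers from the example above) of how provisioned throughput gets divided across partitions:

```python
def ru_per_partition(provisioned_ru: int, partition_count: int) -> float:
    """Each physical partition gets an equal share of the provisioned RU/sec."""
    return provisioned_ru / partition_count

# 10k RU/sec provisioned across the 4 partitions from the example:
share = ru_per_partition(10_000, 4)
print(share)  # 2500.0 RU/sec available to any single partition
```

If all of your hot traffic lands on one partition, that 2,500 figure is your real ceiling, regardless of what the container as a whole was provisioned at.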

For small datasets this might not be so dramatic, as partition splits might never occur… but for larger datasets that can grow, CORRECTLY choosing a partition key becomes of paramount importance…

With this understanding, you might think that CosmosDB is great because you pay the same amount per month for a given performance. But two scenarios might arise…

1 – Partitions might exhaust their share of the total RU/sec faster than anticipated due to bad partition keys or specific usage patterns. Your usage of CosmosDB must be resilient to the fact that the system becomes unavailable (requests are throttled) once the RU/sec have been exhausted.

2 – There is a minimum amount of RU/sec required to host a partition. If the container splits to the point where a partition would have less than 100 RU/sec (the current value), then RU/sec will be added to your bill in order to guarantee that minimum per partition.
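As a hedged sketch of that second scenario (the 100 RU/sec floor is the value quoted above and may change), the billed throughput is the greater of what you provisioned and the per-partition minimum times the partition count:

```python
def billed_ru(provisioned_ru: int, partition_count: int,
              min_ru_per_partition: int = 100) -> int:
    """You are billed at least the per-partition floor for every partition."""
    floor = partition_count * min_ru_per_partition
    return max(provisioned_ru, floor)

# 1,000 RU/sec provisioned, but the container has split into 20 partitions:
print(billed_ru(1_000, 20))  # 2000 -- the floor (20 * 100) exceeds the provisioned amount
```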

Hope that clears a few things up !



VS Code Addins

A long time ago, I was working with companies helping them create add-ons for Visual Studio. This was NOT an easy task. We had to write code, implement COM interfaces, figure out registration, packaging, development instances of VS… It was CRAZY!

Now a new beast is in town: Visual Studio Code. It is a code editor based on Electron, a framework for packaging JavaScript apps as desktop applications. Although I still love my Visual Studio, I have to say that I spend a lot of time in Visual Studio Code.

As it turns out, one of my clients needed an easy way to update data for his web site, so we created a JSON file in an accessible location. Next, we needed an easy way to edit these files, so we thought: how about editing them in Visual Studio Code with a custom add-on?

The two commands required to get started are (install Yeoman and the extension generator, then scaffold the base structure):

npm install -g yo generator-code
yo code

Then you open Visual Studio Code in that directory and hit F5! Your add-on will be loaded into a second instance of VS Code, and you can even debug it!

Now go on and create something! Here is a link to the docs!


Microsoft .NET Orleans

The Microsoft Orleans project (http://dotnet.github.io/orleans/) is a .NET framework for building systems on the Actor Model paradigm.

A typical transactional system receives a command and executes it. Executing it usually means fetching data from a database, modifying it and then saving it.

The reason I was interested in Orleans for To-Do.Studio was that each action in the app generates a command; there is no save button, no transactional boundary. This naturally creates a chatty interface, where many small commands are sent to the server, which must, for each one: get the data, modify it, and save it. Combine that with the fact that NOSQL components such as CosmosDB make you pay for reads, and what you have is an expensive bottleneck.

The Actor Model in Orleans would have fixed this for me by doing the following. The first time a command executes that references a particular domain object, a grain is activated and the data is read. Each subsequent command modifies the data in the grain. Eventually, when no one uses the grain, the system deactivates it, causing it to save its state.
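Orleans grains are .NET classes, but the lifecycle described above can be sketched language-agnostically (here in Python, with hypothetical names; this is an analogy, not Orleans' actual API). The point is that the read happens once at activation and the write once at deactivation, no matter how many commands arrive in between:

```python
class Grain:
    """Simplified sketch of the activate / mutate / deactivate-and-save lifecycle."""
    def __init__(self, key, storage):
        self.key = key
        self.storage = storage
        self.state = storage.read(key)            # single read at activation

    def handle_command(self, mutate):
        mutate(self.state)                        # commands mutate in-memory state only

    def deactivate(self):
        self.storage.write(self.key, self.state)  # single write at deactivation

class DictStorage:
    """Stand-in for a real store such as CosmosDB."""
    def __init__(self): self.data = {}
    def read(self, key): return dict(self.data.get(key, {}))
    def write(self, key, state): self.data[key] = state

storage = DictStorage()
grain = Grain("todo-list-42", storage)
grain.handle_command(lambda s: s.update(title="Groceries"))
grain.handle_command(lambda s: s.update(items=["milk", "eggs"]))
grain.deactivate()  # only now does storage see one consolidated write
```

Two commands, but only one read and one write hit the (billed) store; that is the whole economic argument.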

This would have the added benefit of minimizing our data access costs and offering speedups similar to adding a caching layer.

As with all magic though, we lose some control:

  1. First, as the system scales to thousands of users and tens of thousands of grains, we have to think about system scalability (Orleans doesn’t yet deactivate resources based on memory pressure, only on an “unused delay”)
  2. The deployment model is not fully integrated with Azure.
    1. Hosting is possible within a VM – not PAAS enough for me
    2. Hosting is possible within worker roles – sounds interesting but not exactly what I want
    3. Hosting is possible within Service Fabric (which is another implementation of an Actor Model, from the Azure team rather than the .NET team) – it doesn’t feel seamless yet, but this would be my ideal hosting option
    4. Hosting is possible in a Linux container scaled automatically by Kubernetes – to be honest, I am not a container fan; I do see their advantages for certain workloads, but it feels like the PAAS of infrastructure people.

Anyway, my best option would be hosting on top of Service Fabric. It would need to be 100% integrated with the dotnetcore stuff, though.

I would also recommend having a dedicated PAAS offering for Orleans (kind of like what happened with SignalR).

Finally, Orleans should be upgraded to support memory-pressure-based deactivations, as well as some sort of queuing mechanism when grains start swapping, to keep the amount of work executing in parallel under control.


Web Push Notifications

As time goes by, the web becomes more and more powerful. It’s been around for a while, but I think that with Chrome and Edge finally supporting PWAs (and native apps starting to become mobile web apps), it is time to embrace this technology. It is time to create web pages that can push notifications directly to the desktop, even if the browser is closed.

This works right now on any device running Microsoft Edge, as well as devices that run Google Chrome (Android devices and desktops running Chrome).

Architecturally, the Service Worker part of your website (running in the background) creates a subscription with a push notification service provider and then that “token” is sent to your application server, which will use it to send you a notification.

What I didn’t grasp is that the push notification service provider is not just any server running software that speaks a specific protocol; it is tied directly to the browser. For example, if the website is being viewed within Google Chrome, the ONLY push notification service provider is FCM (Firebase); in Firefox it is Mozilla’s push service, and in Microsoft Edge it is Windows WNS. In fact, these are the same technologies that are used by real native apps on those platforms.

Soooo…

  1. The API to subscribe and get some sort of token to enable push notifications is defined by internet standards, so it is safe to use in any browser on any website.
  2. Each browser implements those methods using its own push notification service provider.
  3. Sending a push notification will be different depending on who the push notification service provider is.
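Point 3 is the practical gotcha: your application server has to branch on the provider. The provider can usually be inferred from the subscription’s endpoint URL. Here is a hedged sketch (the endpoint hosts below match what the browsers used at the time of writing; verify against the current docs before relying on them):

```python
from urllib.parse import urlparse

def detect_push_provider(endpoint: str) -> str:
    """Guess the push service provider from a subscription endpoint URL."""
    host = urlparse(endpoint).hostname or ""
    if host.endswith("googleapis.com"):
        return "FCM"        # Google Chrome
    if host.endswith("mozilla.com"):
        return "Mozilla"    # Firefox
    if host.endswith("notify.windows.com"):
        return "WNS"        # Microsoft Edge
    return "unknown"

print(detect_push_provider("https://fcm.googleapis.com/fcm/send/abc123"))  # FCM
```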

Here is a link to a functional demo that works everywhere : https://webpushdemo.azurewebsites.net/


Office365 Two Factor Authentication

Two-factor authentication is an important thing; we should all know that by now. For a long time, I’ve had it activated on my Microsoft Account and Google Account. Today I decided to turn it on for the modelon.net Office365 tenant: I went into the admin tools and turned it on. Easy enough.

As expected, I had to remove the accounts from Outlook and Windows 10 and re-add them. But surprise: Outlook 2016 didn’t want to connect. It turns out some manual intervention was needed.

I am no IT admin and no PowerShell guru, but following the articles here saved my life:

https://support.office.com/en-us/article/Enable-or-disable-modern-authentication-in-Exchange-Online-58018196-f918-49cd-8238-56f57f38d662

https://docs.microsoft.com/en-us/powershell/exchange/exchange-online/connect-to-exchange-online-powershell/mfa-connect-to-exchange-online-powershell?view=exchange-ps

By using the instructions in the second link, you get some sort of Office365-configured local PowerShell thing. Once you open it, all you have to do is run these two commands to make everything work:

Connect-EXOPSSession -UserPrincipalName erik.renaud@modelon.net
Set-OrganizationConfig -OAuth2ClientProfileEnabled $true

VSTS – Build & Release for To-Do.Studio’s web site

So my startup To-Do.Studio is advancing, and we started getting rolling on our informational website.

The first thing was to create a “web site” project with Visual Studio, and get it into GIT.

As you can see, it’s just a bunch of HTML, CSS and stuff, with a small web.config so that we can host on IIS (we are using Azure App Services for this). The first abnormal thing, unlike a normal ASP.NET project, is that I do not have a CSPROJ describing my project; instead, I have a PUBLISHPROJ, and it gets ignored by GIT. So how did we get here?

Since Visual Studio relies on things like MSBUILD to “publish” projects, it needs some sort of project file; in this case, Visual Studio created the PUBLISHPROJ the first time I created the publish profile. This allows me to publish from the command line and more.

Although the file is GIT-ignored by default, I had to add it to GIT in order for the VSTS build system to be able to publish this project.

The other modification we had to make was to add to GIT a special file, in the solution folder, called “after.ToDoStudio.StaticWebFront.sln.targets”. It turns out MSBUILD looks for these files automatically and includes them in your build process, without the need to modify the solution file. This is what we put in there:

<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Deploy Website" AfterTargets="Build">
    <Message Text="Starting Website deployment" Importance="high" />
    <MSBuild Projects="$(MSBuildProjectDirectory)\ToDoStudio.StaticWebFront\website.publishproj"
             BuildInParallel="true" />
  </Target>
</Project>

What this does is ensure that after the Build action (which by default builds nothing, because this is a project with nothing to build), the system automatically invokes the publish action using the PUBLISHPROJ.

Now we were ready to build this thing within VSTS.

As you can see here, our build is pretty standard with defaults used everywhere:

The release pipeline is also pretty standard, nothing fancy here:

 


Podcasts – Mine and Others

Let me be honest: I hadn’t listened to podcasts in a while, so I was pleasantly surprised when, while recording a podcast on CosmosDB with Mario Cardinal and Guy Barrette, Mario mentioned during the initial chitchat a great podcast he had started listening to, called After On.

Here is a link to the podcast that I recorded. It is an introduction to CosmosDB, a NoSQL service within Microsoft Azure built from the ground up to scale with the web. It is the kind of technology that could be used to build things like Facebook or Twitter.

As for After On, this is a podcast I instantly became hooked on. The discussions are out of this world, on diverse subjects, and very in-depth. Please go listen and let me know which is your favorite episode.


Shrinking USB keys to save an old Windows Tablet

At the first Microsoft Build event, they gave attendees a Samsung tablet, made for Windows 8. I loved that tablet (other than the fact that it was heavy) because it had a built-in cellular modem.

But sadly, last week, the thing crashed and was going into a boot loop. The only way to revive it was to reinstall everything…

This was going to be a challenge because this particular model required a 4 GB FAT32 USB key. Unfortunately, it is 2018, and the smallest key I could find was 16 gigabytes…

So I searched the internet. I didn’t find an exact recipe, but I did find something that could work, so I gave it a try. Open DiskPart and issue the commands:

list disk
select disk 1 (in my case it was 1, it might be different for you !)
clean
create partition primary size=4160
active

That’s it. Then I formatted it, copied the contents of this Windows 10 ISO onto it, and bingo: it booted, and in thirty minutes everything was up and running.

To restore the key to 16 GB, just run the same instructions without the size=4160 part.

And that’s how you shrink a USB key.


Azure AppService to FTP (not in Azure)

Oy, I just spent a crazy week learning this:

It is impossible for an AppService application to connect to an FTP server on the internet in passive mode. The reason is that each Azure App Service is assigned a pool of IP addresses for outgoing traffic, and Azure is free to choose a new outgoing IP for each connection (the NATting stuff), while FTP expects the data connection to come from the same IP as the control connection.

That said, in passive FTP the transfer is negotiated over the control channel connection. When the second connection is opened to transfer the data, the FTP server receives it from a potentially different IP address, which produces a 425-type error.
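The failure mode can be illustrated with a tiny simulation (hypothetical code, not a real FTP implementation): the server remembers the IP of the control connection and rejects a data connection arriving from any other address, which is exactly what happens when the NAT picks a different outbound IP:

```python
class PassiveFtpServer:
    """Toy model of a passive-mode FTP server that enforces matching IPs."""
    def __init__(self):
        self.control_ip = None

    def open_control(self, client_ip):
        self.control_ip = client_ip      # remember who opened the control channel
        return "220 ready"

    def open_data(self, client_ip):
        if client_ip != self.control_ip:
            return "425 Can't open data connection"
        return "150 opening data connection"

server = PassiveFtpServer()
server.open_control("40.1.2.3")      # App Service outbound IP #1 (made-up address)
print(server.open_data("40.1.2.4"))  # NAT chose a different outbound IP -> 425
```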

It took us 3 days to diagnose, one evening to write a basic HTTP web API that allows posting and getting files, a few hours to install it all, and a full day to rewrite the code that worked with FTP…

That said, it was something that no one on the team could have predicted. Live and learn!


Screen scraping

One thing I believe in is constant change, and constant learning, ideally one thing per day. Sometimes it’s learning how to cook the best eggs benedict ever for breakfast, or sometimes it’s helping out a friend with a special request.

Today I was asked by a colleague if I could help extract data from a web site. As an architect, the first thing I look for in the “code” is a clean separation between presentation and data. Obviously, I did not find that, which made me realise how frameworks that render HTML mixed with data are bad, bad, bad. Why can’t everything follow MVVM with some sort of binding?

Anyway, we needed a solution, and what we whipped up was screen scraping.

My first attempt was to write a small HTML page that loads jQuery, does an AJAX call to hit the webpage we needed data from, and then extracts it from its DOM… Turns out it was easy to write, but I was met with an error: CORS headers not found for file://dlg/extractData.html. GRRRRR

New strategy !

I opened Chrome, did a search for screen scraping extensions and, behold, I found this.

An extension that allows me to navigate to any page, look up how it is built, understand its usage of CSS selectors, and voilà. Any page that reuses the same CSS selector to represent repeating data (as in a list) can be extracted to a JSON or CSV file.
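The same idea the extension relies on (repeating CSS classes mark repeating data) can be sketched with Python’s standard library. This is a simplified class-attribute matcher, not a full CSS selector engine:

```python
from html.parser import HTMLParser

class ClassScraper(HTMLParser):
    """Collects the text of every element carrying a given class attribute."""
    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.capturing = False
        self.results = []

    def handle_starttag(self, tag, attrs):
        classes = dict(attrs).get("class", "").split()
        if self.class_name in classes:
            self.capturing = True
            self.results.append("")   # start a new captured item

    def handle_data(self, data):
        if self.capturing:
            self.results[-1] += data  # accumulate the element's text

    def handle_endtag(self, tag):
        self.capturing = False        # good enough for flat, non-nested items

html = '<ul><li class="price">10$</li><li class="price">25$</li></ul>'
scraper = ClassScraper("price")
scraper.feed(html)
print(scraper.results)  # ['10$', '25$']
```

Each repeated `class="price"` element becomes one row of extracted data, which is essentially what the extension exports as JSON or CSV.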

Well, thanks DL for getting me to learn something new today !