Data Mesh

It is very difficult not to hear or read about data mesh these days. I have read plenty of articles, listened to dozens of podcasts and watched numerous YouTube videos. What I was looking for was some practical guidance on how to implement a data mesh in an organization.

Technological, Cultural & Organizational Change

First of all, I think it is important to understand the problem a data mesh is trying to solve. In my understanding it is a set of principles that aims to overcome the bottleneck of central data platforms. These principles are:

  • domain-oriented decentralized data ownership and architecture
  • data as a product
  • self-serve data infrastructure as a platform
  • federated computational governance

Visit Data Mesh Principles and Logical Architecture (martinfowler.com) to read more about the four principles.

If I think about it in a more practical way, these principles can’t be applied through technology alone. It is clear to me that the harder part is changing the organization and culture in a way that overcomes the scalability issues of central data platforms.

Does an Agile organization help?

Personally, I find it difficult to envision a successful data mesh adoption in an organization that doesn’t have teams owning products end to end. In that sense, data mesh adoption could be seen as a logical next step to empower teams to own not only the operational services, but also the data products.

Proof of Concept

When I think about data mesh adoption, it is not enough to just do a technical PoC; we have to include the cultural and organizational aspects from day one. For instance, do a PoC that spans three different teams: consumer, provider and data platform.

Temperature and Humidity Tracker (or My first IoT Project connecting an ESP32 via WiFi to a Firebase RTDB)

Project

What I wanted to build was a little device that tracks the temperature and humidity and stores the sensor data in a database; and I wanted to build a simple UI to visualize the data. I am particularly curious to find out about the effects of venting or other activities in different rooms of the house. But let’s start simple! 🙂

This is not meant to be a step-by-step guide on how you can build your own project. You might find parts useful, specifically regarding the code (connecting to Firebase, WiFi, reading the DHT22 sensor): feel free to use it. Always be careful when dealing with batteries; lithium-ion batteries are dangerous if not handled correctly (e.g. short circuit).

The final project – so far.

Shopping List

  • ESP32 developer board – I use the Firebeetle ESP32 IOT from DFRobot
  • DHT22 temperature and humidity sensor
  • battery, to operate without cables (it works well via USB as well)

Wiring

The wiring couldn’t be simpler. I am using a DHT22 temperature and humidity sensor which is pre-calibrated and you only need to connect three pins:

  • VCC
  • Data (any GPIO pin, on the Firebeetle one of D0 to D9)
  • GND

It is advised to use a pull-up resistor together with a DHT22, but I have not integrated one. Please note I am using a breakout board; no specific reason, it’s just what I got when I ordered the sensor. The sensor itself has 4 pins, one of which is not connected (nc).

Software

I believe the software is the more interesting part of this project. The main steps are listed below; find the full sketch right after, and some explanations later.

  • read the sensor data
  • connect to my WiFi
  • set up a Firebase RTDB
  • connect the ESP32 to the Firebase RTDB
  • store the data in the RTDB
  • write an Angular app to visualize the data

#include <WiFi.h>
#include <Firebase_ESP_Client.h>
#include "DHT.h"

#define DHT22_PIN D2
#define DHTTYPE DHT22

// WiFi
const char* ssid = "xxxx";
const char* password = "xxxx";

DHT dht(DHT22_PIN, DHTTYPE);
FirebaseData fbdo;
FirebaseAuth auth;
FirebaseConfig config;
String chipId;

void setup()
{
  Serial.begin(115200);
  initWiFi();
  initDHT();
  initFirebase();
  registerDevice();
}

void initWiFi() {
  WiFi.mode(WIFI_STA);
  WiFi.begin(ssid, password);
  Serial.print("Connecting to WiFi ..");
  while (WiFi.status() != WL_CONNECTED) {
    Serial.print('.');
    delay(1000);
  }
  Serial.println(WiFi.localIP());
}

void initDHT() {
  dht.begin();
}

void initFirebase() {
  config.host = "xxxx";
  config.api_key = "xxxx";
  auth.user.email = "xxxx";
  auth.user.password = "xxxx";
  Firebase.begin(&config, &auth);
}

void registerDevice() {
  chipId = String((uint32_t)ESP.getEfuseMac(), HEX);
  chipId.toUpperCase();

  if (Firebase.ready() ) {
    Firebase.RTDB.setString(&fbdo, "/devices/" + chipId + "/name", chipId) ;
  }
}

void loop()
{
  float temp = dht.readTemperature();
  float hum = dht.readHumidity();

  if (Firebase.ready() ) {
    FirebaseJson json;
    json.set("t", temp);
    json.set("h", hum);
    Firebase.RTDB.pushJSON(&fbdo, "/devices/" + chipId + "/sensors/climate", &json);
    String dataPath = fbdo.dataPath() + "/" + fbdo.pushName();
    Firebase.RTDB.setTimestamp(&fbdo, dataPath + "/ts");
  }
  delay(5 * 60 * 1000); // only every 5min
  //delay(2000);
}

Required Libraries

  • DHT
  • Firebase
  • Unified Sensor support

Adding Libraries

In the Arduino IDE (Library Manager) I installed the three libraries listed above, one after the other (one of them I actually installed by adding the zip file manually, following the instructions here).

Create Firebase Project

I assume you have a basic understanding of Firebase and how to create a project. If not, there are many tutorials out there to learn more. For this particular project I had to:

  • create the project
  • manually create a user
  • create a web app
  • create a Realtime Database

In more detail:

  1. First, create a project and give it a meaningful name. I disabled Google Analytics for this project; it won’t take long to create your project.
  2. Add the Email/Password provider, enable it and save.
  3. Switch to users and add a user manually. Remember the password.
  4. Go to Realtime Database and hit “Create Database”. I usually go with test mode security rules when I start a new project. Keep in mind this allows everyone to read/write into your DB, specifically if you host your application and with it all the required Firebase configuration like the API key.
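
To lock the database down later, the test mode rules could be replaced with rules that only allow authenticated users. A minimal sketch of such rules (adjust to your own data model):

{
  "rules": {
    ".read": "auth != null",
    ".write": "auth != null"
  }
}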

Connect to WiFi

Connecting to my WiFi network is also quite simple:

void initWiFi() {
  WiFi.mode(WIFI_STA);
  WiFi.begin(ssid, password);
  Serial.print("Connecting to WiFi.");
  while (WiFi.status() != WL_CONNECTED) {
    delay(1000);
  }
  Serial.println(WiFi.localIP());
}

Obviously you should know which WiFi to connect to; just replace the ssid and the password accordingly. 😉

const char* ssid = "xxxxxxx";
const char* password = "xxxxxxx";

Connect to Firebase

With a stable WiFi connection I was able to connect to my Firebase RTDB. Using the library mentioned above this doesn’t require many lines of code.

void initFirebase() {
  config.host = "xxxx";
  config.api_key = "xxxx";
  auth.user.email = "xxxx";
  auth.user.password = "xxxx";
  Firebase.begin(&config, &auth);
}

The API key can be found in the project settings, look for Web API key.

The host is the URL of the Realtime Database; it is shown on the Realtime Database page in the console:

The user and password should be clear, as we created them ourselves earlier.

Store Data

Now that the connection to Firebase is set up, I can read or store data.

Setup

In the setup I read the chip ID and use it as the device name. First of all, it was a good first test for storing data, and secondly I intend to allow changing the name via Bluetooth later.

void registerDevice() {
  chipId = String((uint32_t)ESP.getEfuseMac(), HEX);
  chipId.toUpperCase();

  if (Firebase.ready() ) {
    Firebase.RTDB.setString(&fbdo, "/devices/" + chipId + "/name", chipId) ;
  }
}

Loop

In the loop function I read the sensor data and create a JSON object with the values. Then I add the current timestamp. Although the code is simple, I’m sure this can be done in a more efficient way. Mainly, I’m not convinced Firebase is the right choice to store sensor data, but it’s cool that you can connect an ESP32 to their DB and the API is easy to use.

void loop()
{
  float temp = dht.readTemperature();
  float hum = dht.readHumidity();

  if (Firebase.ready() ) {
    FirebaseJson json;
    json.set("t", temp);
    json.set("h", hum);
    Firebase.RTDB.pushJSON(&fbdo, "/devices/" + chipId + "/sensors/climate", &json);
    String dataPath = fbdo.dataPath() + "/" + fbdo.pushName();
    Firebase.RTDB.setTimestamp(&fbdo, dataPath + "/ts");
  }
  delay(5 * 60 * 1000); // every 5min

}

Client Application

Now that everything works and data is stored in Firebase, it is time to create a client application to visualize the sensor data. If I find some time I will describe this part in another blog post. But it looks like this in its current stage:

Humidity reacts quickly to people in the room and to venting

Next

It would be cool if I could just add another device in a room and it would automatically start collecting data and storing it. I guess similar products already exist anyway, but the fun lies in DIY. 🙂

Nevertheless, what I would like to improve:

  • 3D printed box to house the electronics and the battery
  • Provide a way to configure the WiFi credentials and the device name via Bluetooth
  • Find a better way to store the data, I don’t think Firebase RTDB is the right choice here (maybe ThingSpeak or AskSensors, …)
  • Rewrite the code to enable deep sleep mode between sensor reads (see the sketch after this list)
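
A minimal sketch of the deep sleep variant, assuming the same 5-minute interval as above (not tested on this setup): instead of the delay() at the end of loop(), the device would go to sleep and run setup() again after waking up.

// hypothetical helper, called at the end of loop() instead of delay()
void goToDeepSleep() {
  const uint64_t sleepSeconds = 5 * 60;                      // same 5 minute interval
  esp_sleep_enable_timer_wakeup(sleepSeconds * 1000000ULL);  // timer wake-up, in microseconds
  esp_deep_sleep_start();                                    // never returns; execution resumes in setup()
}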

Conclusion

Microcontroller programming is a topic I have had on my to-do list for many years. I tried different things, but I never got any project done. Meanwhile I have several breadboards, many (really many) resistors, a few ICs, a Raspberry Pi, an Arduino board and, since a few days ago, an ESP32 developer board!

I actually bought a starter kit with some sensors and a little book included; I have to say it made it easy to get started, and the ESP32 really got me hooked! Unfortunately that board isn’t alive anymore, so I had to switch to the Firebeetle. I can’t imagine an easier way to create an IoT project – WiFi and Bluetooth are on board! Thanks Fabian Merki for pointing me in this direction! Fabian is creating music boxes with cool light effects using the ESP32 and he was always helping in case I had some issues with my setup. 🙂

Overall the project was really fun and I learned quite a few things. It’s incredibly easy to connect to the internet, and connecting to a Firebase RTDB is just powerful.

Fun with WSL and local GitLab Runner

I was looking for a solution to run a GitLab pipeline locally. I haven’t really figured out an easy way, but apparently one can use the gitlab-runner tool to run individual jobs. Although you can install all the tools on Windows, I wanted to run them a bit more isolated. Therefore I decided to use WSL.

This is what I had to do!

  • install the Ubuntu distribution
  • install the gitlab-runner tool
  • install Docker
  • run the gitlab-runner commands

The list is quite short, but I spent quite some time figuring out how to make caching work.

In a nutshell, I run an Ubuntu VM using WSL in which I can execute my pipeline jobs using gitlab-runner. The runner spins up Docker containers to execute the jobs as declared in .gitlab-ci.yml.

Ubuntu / WSL

First I had to install the Ubuntu WSL distro. Although the command line tells me where to find the distros (i.e. the Microsoft Store), I had a bit of a hard time finding it. But the page WSL | Ubuntu helped me out, as it links directly to the proper distro.

I have a complete Ubuntu environment ready in seconds and the integration with Windows works really well. I start WSL by typing wsl -d Ubuntu in my command line.

Ubuntu, ready in seconds

Install the tools

First of all I installed gitlab-runner:

sudo apt install gitlab-runner

Then I installed docker, which is a bit of a pain if you just want to get started quickly. I basically followed this guide and it worked well: How To Install and Use Docker on Ubuntu 20.04 | DigitalOcean
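
Roughly, the commands from that guide boil down to the following (they may have changed in the meantime, so check the guide for the current version):

sudo apt update
sudo apt install apt-transport-https ca-certificates curl software-properties-common
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu focal stable"
sudo apt update
sudo apt install docker-ce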

WSL 2

I first tried to run docker on my VM, but it failed. I had to upgrade my distro to WSL 2 by invoking this command:

wsl --set-version Ubuntu 2

After launching the VM again, I was able to run docker commands.
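
A quick way to double-check which WSL version a distro uses is to list the distros from the Windows command line:

wsl -l -v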

Docker

When I run my GitLab pipeline, I want to use Docker as the executor. The GitLab runner basically spins up containers (as per the image defined in the .gitlab-ci.yml) and executes the job. The Docker daemon doesn’t start automatically; this is not hard to configure, but to test my setup first I started it manually by invoking sudo service docker start.
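
One possible way to avoid the manual start every time (just an idea, I haven’t set this up myself) would be to start the service from the shell profile if it isn’t running yet:

# hypothetical snippet for ~/.profile: start the Docker service if it is not running
if ! service docker status > /dev/null 2>&1; then
    sudo service docker start
fi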

I verified my setup by running docker run hello-world. If it works, it will print something like:

running a container in a VM running on Windows. Cool!

Running GitLab

Although it reads pretty simple, I spent quite some time understanding how to use the gitlab-runner tool. My main issue was ensuring the cache works between job executions. All builds run in containers, and my initial assumption that caching would just work was wrong. The tool tells me that a local cache is used instead of a distributed cache, but it never worked.

The trick is to mount a volume, so that the cache created inside the container is persisted on the host.

So, to actually run a job from my pipeline I navigated to a project with the .gitlab-ci.yml in it and executed the following command:

sudo gitlab-runner exec docker build-web --docker-volumes /home/gitlab-runner/cache/:/cache

Here build-web is the job I want to run and /home/gitlab-runner/cache is the directory on the host system where the cache should be stored. By default the runner puts the cache in the /cache directory inside the container.

Final Thoughts

I was hoping that I could execute the whole pipeline from the command line. It seems that with gitlab-runner I can only run a single job. Still, it is good for testing stuff – and definitely good for learning more about how GitLab runners work. And maybe this guide helps someone setting up their local GitLab runner.

Speed up builds and separating infrastructure (update on becoming an Azure Solution Architect)

It has been a while since I last posted an update on becoming an Azure Solution Architect. When I started this journey in 2020 I didn’t have a lot of hands-on experience with Azure. One year later I still feel like I’m learning new things every day. 🙂

Working on a real project helped me a lot in understanding things better, and automating the whole setup with Terraform and GitLab was a great experience. I really recommend thinking about CI/CD first when starting a new project, although it isn’t easy.

But it pays off very soon, as you just don’t have to care about infrastructure anymore and you can recreate your resources at any time. Just run terraform apply when starting to work on the project and terraform destroy at the end of the coding session to avoid unnecessary costs during development. It is pretty cool watching Terraform set up and tear down all the resources.

Terraform supports Azure quite well, although I encountered some limitations. The documentation is really good!

Separating Infrastructure and App Deployment (and sharing data)

One lesson I had to learn (thanks to the guidance from a colleague at work): it is better to separate the cloud infrastructure from the application build and deployment. It may sound tempting to put it all together, but it grows in complexity quite fast. I ended up having two projects with two pipelines:

  • my-project
  • my-project-infra

The infra project contains the Terraform declarations and a simple pipeline to run the Terraform commands. The client ID and client secret are provided via GitLab variables. This works very well, but you will typically require some keys, URLs, connection strings or the like when deploying the application. Terraform allows you to store and access the required attributes by declaring outputs:

output "storage_connection_string" {
  description = "Connection string for storage account"
  value       = azurerm_storage_account.my_storage.primary_connection_string
  sensitive = true
}

Terraform allows us to access the connection string at any time later by invoking Terraform commands, as the data is kept together with the state. This is where the concept clicked for me. I use the outputs in the pipeline like this, exporting them via a dotenv artifact:

terraform_output:
  stage: terraform_output
  image:
    name: hashicorp/terraform:1.0.8    
    entrypoint:
      - '/usr/bin/env'
      - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'    
  script:
    - terraform init  
    - echo "AZURE_STORAGE_CONNECTION_STRING=$(terraform output --raw storage_connection_string)" >> build.env        
  artifacts:
    reports:
      dotenv: build.env
  only:
    - master

When deploying the web app, I could then just access the connection string. For me this was not very intuitive; I think tools could support such use cases better, unless I’m just doing it wrong. 🙂 Happy to hear about better ways. But essentially this is how I could access the connection string as an environment variable in a later stage, using a different image.

deploy-web:
  stage: deploy
  image: mcr.microsoft.com/azure-functions/node:3.0-node14-core-tools	
  script:
   - az storage blob delete-batch --connection-string $AZURE_STORAGE_CONNECTION_STRING -s "\$web"
   - az storage blob upload-batch --connection-string $AZURE_STORAGE_CONNECTION_STRING -d "\$web" -s ./dist/my-app
  only:
    refs:
      - master        
  dependencies:
    - terraform_output
    - build-web

Optimize the build

A downside of the way we are building software today: there is no built-in incremental build support. At least my pipelines tend to be quite slow without optimization and proper caching, and it takes minutes to build and redeploy everything, even if the project is rather simple. So, knowing which parts of the build you can cache can save you a lot of time and money, but it may not be super intuitive.

That’s why I would like to share one pattern that I use for my Angular applications (and it should work for any node / npm based project).

Who doesn’t get sleepy waiting for npm to install all the project dependencies?

I have split up the job into two parts to only run npm install when really required, i.e. when something in the package-lock.json changes – and then cache the result for the next stage (and subsequent runs).

install_dependencies:
  stage: install_dependencies
  image: node:14-alpine
  cache: 
    key: $CI_COMMIT_REF_SLUG-$CI_PROJECT_DIR
    paths:
      - ./node_modules/
  script:
    - npm ci
  only:
    changes:
      - ./package-lock.json

only/changes will ensure the job only runs if the package-lock.json has been changed, for example when you add or upgrade a dependency.

The cache configuration then keeps the node_modules handy for the next job or stage:

build-web:
  stage: build
  image: node:14-alpine
  cache: 
    key: $CI_COMMIT_REF_SLUG-$CI_PROJECT_DIR
    paths:
      - ./node_modules
    policy: pull-push 
  script:
    - npm install -g @angular/cli
    - ng build
  artifacts:
    paths:
      - ./dist/my-app

Have fun speeding up your pipelines!

Typolino – Adding Features (one year later)

Coming back to a side project after a year can be pretty painful. I actually wanted to see how it goes with Typolino. Is the CI/CD pipeline still working? Can I easily upgrade my dependencies and add new features?

A product should grow with your clients and Typolino is no different. My kids get older and they want some new challenges. 🙂

CI/CD

Besides allowing teams to focus on writing code that adds business value, having a CI/CD pipeline is also great when coming back to an abandoned project. I didn’t have to do anything; it still works fine. I tend to tell my colleagues that we should think CI/CD first (like mobile first, API first, …) and I believe this pays off very quickly. I’m aware this can also go wrong, but I was lucky enough. Having a regular run to prove your build still works may be a good idea, so that you can fix issues right away.

Upgrade

Upgrading npm dependencies is always a bit cumbersome, especially if you want to use the latest major versions. There is still a lot of movement in the web area. I thought it would be cool to see how I can upgrade a pretty old Angular version to Angular 12. It worked quite well, but it was not painless. To be honest I think this should be easier; it is not something you want to think too much about, specifically if it affects the build system itself. I didn’t have any breaking API changes, but the build system was updated. It took me more than an hour of absolutely not-fun work. 🙂

As a side note: the Angular team decided to go “prod build by default”, which kind of killed the faster development builds. I need to do some more research on how to get this back, but waiting around 20 seconds to see the changes during development is a no-go. I’m sure I can fix it; it’s just a question of how much extra configuration in angular.json it will require.
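
Presumably the explicit development configuration brings the fast builds back, something along these lines (an assumption based on the Angular 12 configuration scheme, not yet tried on Typolino):

ng serve --configuration development
ng build --configuration development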

Features

With the CI/CD pipeline still working and the upgrade done, I had enough time to add some features.

  • Better styling: I added an Angular component that can render words as keyboard keys. Looks a bit fancier now.
  • Updated the colors to be a bit sharper.
  • Added a hard mode that blanks a letter in the word. This is great for my older one, as he now needs to know how to properly write the word.

Conclusion

Having a proper CI/CD pipeline frees me up from manual activities that would probably hold me back from additional changes in “legacy” code and lets me focus on adding actual value. The hurdles are so much lower in my view.

If you want to test the new features, visit Typolino (still German, sorry).

Getting Started with CVXPY to solve convex optimization problems

I don’t know much about convex optimization problems, but being a hands-on person I like to have everything ready to experiment. My final goal is to run some scripts on Azure Databricks. So I’m still on my journey to become an Azure Solutions Architect. 😉

Installation

First things first: to get started I had to install Python and all the required libraries. This was not super straightforward on Windows and it took me longer than expected. The main problem was that I needed to install some additional software (some Visual Studio stuff).

Python

I used to use Anaconda to manage my Python environments, but I didn’t have it ready on this machine. So when typing python in the command line of Windows 10, it basically directed me to the Microsoft Store.

So, the installation of Python was pretty straightforward and I was able to verify the installation in a new command line window.
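
Verifying is simply a matter of checking the versions in a fresh command line window:

python --version
pip --version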

CVXPY

The manual says: pip install cvxpy.

But of course it didn’t work. 😉 Reading the error message carefully revealed that I had to install some additional software: error: Microsoft Visual C++ 14.0 or greater is required. Get it with “Microsoft C++ Build Tools”: https://visualstudio.microsoft.com/visual-cpp-build-tools/

It took me two attempts, as I was not reading it carefully enough and missed installing the correct pieces. But following the link in the error and firing up the installer is actually simple and didn’t confront me with any issues. After a restart of my machine I was finally able to install CVXPY without errors.

Verification

Having all dependencies installed, I used one of the samples to verify my installation: the least-squares problem.
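
The least-squares sample from the CVXPY documentation looks roughly like this (the dimensions and the random data are arbitrary):

import cvxpy as cp
import numpy as np

# generate some random data for an overdetermined system A x = b
m, n = 20, 15
np.random.seed(1)
A = np.random.randn(m, n)
b = np.random.randn(m)

# define the least-squares problem and solve it
x = cp.Variable(n)
cost = cp.sum_squares(A @ x - b)
prob = cp.Problem(cp.Minimize(cost))
prob.solve()

print("optimal value:", prob.value)
print("residual norm:", cp.norm(A @ x - b, p=2).value)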

And it worked just fine!

Conclusion

It was not a big deal, but it took me some time, specifically setting up Python and installing the missing Visual Studio dependencies. I’m by no means a Python expert, I don’t use it every day, and getting back to this beautiful programming language with its rich set of powerful yet simple-to-use libraries is always nice.

Moving youcode.ch to Azure – DNS configuration update

The CNAME record for the subdomain www.youcode.ch works like a charm. I was able to add the domain and the site is reachable.

The validation via TXT record worked as well, but I don’t know how to configure the corresponding DNS record. I had to contact the support team. Worst case, I have to switch to another DNS server (e.g. Azure DNS).

Moving youcode.ch to Azure

I have had this idea of youcode.ch for a few years now. I never invested a lot into this side project after buying a WordPress theme and paying for the domain. I like WordPress for simple websites, but I never warmed to it for more complex stuff.

Anyhow, the website stopped working; apparently WordPress is pretty resource hungry, and the company where I host my stuff suggested going for another offering, which of course is more expensive. 🙂

Therefore, I decided to give it another try and move youcode.ch over to Azure.

In essence I want to host a static web app (Angular, custom domain) and add some magic using Azure functions (which may connect to other services like storage to read/write some data).

So the first thing to do is: setup a static web app and configure the custom domain.

Create Static Web App

Best watch my YouTube video on this topic. I followed the same steps for youcode.ch.

  1. Create a new ng app
  2. Set up the Azure Static Web App in VS Code

I am able to access my brand new web app, but the domain is not yet what I want. So let’s configure a custom domain.

Custom Domain

I already own youcode.ch – or at least I pay for it. So let’s see how easily we can get this configured! To do this, we first need to go to the Azure portal and open the static web app resource. Clicking on “Custom domains” will show us all registered domains.

Hitting the “Add” button opens up a sidebar, where we are guided through the necessary steps.

Let’s go for the TXT record type, as this will allow me to add the root domain youcode.ch (and not the subdomain www.youcode.ch). On my hosting provider it is pretty simple to add this record.

Every provider has its own ways to deal with DNS configuration. You can find a lot of useful information here:
Set up a custom domain in Azure Static Web Apps | Microsoft Docs

Now I have to wait a few hours, and I’m not sure it will work as expected as I haven’t seen a way to add ALIAS records at first sight. But let’s see!

How To Keep Up To Date

Recently I was asked what you should do as a software engineer to keep your skills up to date.

Of course, there are many things you can do. But one of the simplest and most effective practices in my view is reading.

There are three websites I visit and scan as part of my daily routine to learn about new trends in technology.

Personally I don’t limit myself to a specific topic. I just pick whatever looks interesting. The reason is that although I may not have a problem that I need to solve today, the gained knowledge may serve me very well later.

What I particularly like about dev.to is that you get a continuous feed of articles – and they are usually pretty short.

Stay curious!