Last Updated: 2019-12-11

This tutorial extends the excellent one by Carl Krauss on this topic:

https://medium.com/google-cloud/connecting-micropython-devices-to-google-cloud-iot-core-3680e632681e

What you'll build

In this codelab, you're going to wire up an ESP32 and gather data from it, streaming the readings to Google Cloud for analysis and visualization.

What you'll learn

Code blocks are glossed over and are provided for you to simply copy and paste.

What you'll need

Get the code

We've put everything you need for this project into a Git repo. To get started, you'll need to grab the code and open it in your favorite dev environment. For this codelab, we recommend using Git.

Strongly Recommended: Use Git

Using Git is the recommended method for working through this codelab.

  1. Open a new browser tab and go to https://git-scm.com/.
  2. Download and install Git.
  3. Open the console.
  4. Clone the repository: git clone https://github.com/nicolaguglielmi/ESP32-to-DataStudio.git
  5. Once the repo has been cloned, edit the config.py file, and update it with your SSID and Wi-Fi password.

Alternative: Download code

Download source code

  1. Unpack the downloaded zip file.
  2. Edit the config.py file, and update it with your SSID and Wi-Fi password.

Download MicroPython

Download the appropriate MicroPython version for your device from the MicroPython website:

http://micropython.org/download

Upload the firmware

To upload the firmware you need esptool, which you can install with pip:

pip install esptool

Alternatively, you can get the Windows binary executable by installing the corresponding ESP32 library inside the Arduino IDE, or from one of these links:

https://dl.espressif.com/dl/esptool-4dab24e-windows.zip

https://dl.espressif.com/dl/esptool-2.3.1-windows.zip

Esptool is a very powerful tool with a lot of useful functions, so it is worth spending a few minutes to take a look at it.

You can check whether the board is connected and detected correctly by issuing some commands (e.g. reading the chip ID of your board):

esptool.py --port COM5 chip_id
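Before flashing new firmware, the MicroPython install instructions also suggest erasing the flash first; COM5 below is just an example port, use your own:

esptool.py --chip esp32 --port COM5 erase_flash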

Follow the instructions for your specific board, and don't miss the offset value 0x1000 for the ESP32 board:

esptool.py --chip esp32 --port /dev/ttyUSB0 --baud 460800 write_flash -z 0x1000 esp32-idf3...

If you are in a Windows environment, use the correct COM port, reading it from the Device Manager:

esptool.py --chip esp32 --port COM5 --baud 460800 write_flash -z 0x1000 esp32-idf3...

After these steps, you should be able to connect to the board over serial; on Windows you can use PuTTY, but any serial terminal emulator will do the job.

Just set the right COM port and speed (115200 should work).

You will see the Python interactive prompt (REPL, Read-Evaluate-Print Loop):

Try some Python commands and they will be evaluated in the console:
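For example, a couple of throwaway commands are enough to confirm the REPL is alive:

>>> print("hello from MicroPython")
hello from MicroPython
>>> 21 + 21
42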

Other tools

Another very useful tool is ampy (Adafruit MicroPython Tool) from

https://github.com/scientifichackers/ampy

Ampy lets you interact with the board and its filesystem; it is very useful for uploading your scripts or resetting the board without opening the serial emulator.

You can easily install it with pip:

pip install adafruit-ampy
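For example, once installed, you can list the files on the board's filesystem or reset the board (again, replace COM5 with your port):

ampy -p COM5 ls

ampy -p COM5 reset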

Now it's time to clone the repository (if you haven't already):

git clone https://github.com/nicolaguglielmi/ESP32-to-DataStudio

First you need to generate a public/private key pair that you will use shortly:

openssl genrsa -out rsa_private.pem 2048

openssl rsa -in rsa_private.pem -pubout -out rsa_public.pem

Now you have two new files in the folder: rsa_private.pem and rsa_public.pem.

Due to the memory limits of the ESP32, you need a small workaround to enable the board to create the JWT.

You have to decode the private key into its components on your own machine; to do so, you can use the decode_rsa.py routine:

pip install rsa

python utils/decode_rsa.py >>config.py

The previous command appends the output of decode_rsa.py to config.py; now edit the file to restore the proper dictionary format. Note that the key components are passed as integers, not as strings:

config.py

jwt_config = {
        'algorithm': 'RS256',
        'token_ttl': 43200,  # 12 hours
        # Use utils/decode_rsa.py to decode the private PEM to PKCS#1 components.
        'private_key': (<PRIVATE_KEY>)
}
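For reference, decode_rsa.py essentially loads the PEM with the rsa library you just installed and prints its integer components; a minimal sketch of that idea (not necessarily the exact content of utils/decode_rsa.py) could be:

import rsa

# Load the PKCS#1 private key and print its integer components,
# ready to be pasted into jwt_config['private_key'].
with open('rsa_private.pem', 'rb') as f:
    key = rsa.PrivateKey.load_pkcs1(f.read())

print((key.n, key.e, key.d, key.p, key.q))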

You will come back to config.py after doing some tests.

You can try to write a simple script:

easy_counter.py

# Just an easy sample to test the time lib
import time

print("I can count the seconds:")
i = 1

while True:
        print(i, "seconds passed from the start...")
        i += 1
        time.sleep(1.0)  # 1 second delay

And to upload it to the board:

ampy -p COM5 put easy_counter.py

To start a script from the REPL you just have to import it:

import easy_counter

Just press CTRL+C to break the loop.

Script autoloading

If you want to execute your script when the board boots, you have to name it main.py and put it in the root of the board's filesystem.

The files that are executed at each boot of the board are boot.py and main.py.
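For example, to run the counter from the previous step at every boot, you could upload it under the name main.py:

ampy -p COM5 put easy_counter.py main.py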

Now some wiring is needed to connect the board and the sensor.

Here you see the BME280, a sensor made by Bosch that can measure temperature, humidity and pressure on a single board. It uses the I2C communication bus.

If you don't have the BME280 sensor, you can use whatever hardware you have or prefer, or try the board's embedded temperature sensor, just to have some data to stream to the cloud.

Start wiring the BME280: look for the pins on your board that are marked [Wire SCL] [Wire SDA] and connect those pins to the corresponding ones on the BME280 sensor.

In this case the pins are IO22 and IO21; you can find yours in the reference pinout of your board. You also need 3.3V and GND for the BME280, and you can use these pins:
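To recap the connections described above (pin numbers assume the default pinout used in this tutorial):

BME280 VIN -> 3.3V
BME280 GND -> GND
BME280 SCL -> IO22
BME280 SDA -> IO21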

If all the connections are ok, the hardware part is done and you can focus on the software.

Test the sensors

Let's test the sensor by loading the library folder onto the board with ampy:

ampy -p COM5 put third_party

It will take a while to load.

Make a quick test with the proof-of-concept script bmetest.py:

bmetest.py

# A quick proof of concept to check that the board gets values from the sensor
import machine
import time

from lib import BME280

# ESP32 - pin assignment for the I2C bus
i2c = machine.I2C(scl=machine.Pin(22), sda=machine.Pin(21), freq=10000)

while True:
    bme = BME280.BME280(i2c=i2c)
    temp = bme.temperature
    humi = bme.humidity
    pres = bme.pressure
    print("_____________________")
    print("Temperature :", temp)
    print("Humidity    :", humi)
    print("Pressure    :", pres)
    print("_____________________")
    time.sleep(10)

Upload to the board:

ampy -p COM5 put bmetest.py

Connect with the serial emulator and start the script by importing it.
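For example, from the REPL prompt:

import bmetest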

If everything is ok, you should see some data on the console!

Brief architecture description

There are several ways to design and deploy a data management pipeline using the cloud services.

In this tutorial we will focus on a fully serverless approach.

At the opposite end, you could run a VM or a Docker container with tools like RabbitMQ.

In a serverless approach, you can choose among many options, too, as you can see in the chart:

Our entry point is IoT Core; after that we will accumulate the sensor readings in Pub/Sub and process them with a Google Cloud Function (GCF).

With Cloud Dataflow you can design and execute, in a managed environment, the transformations your data needs before being loaded into BigQuery.

The approach is very easy and effective, but for low data volumes there is a drawback.

Each Dataflow job is executed on a Compute Engine Instance and you can start it as a batch job, to process the data in bulk or as a stream job to process data continuously.

Due to the nature of the Pub/Sub system, it requires stream processing, which prevents the Dataflow instance from terminating, keeping it running indefinitely and generating costs.

By contrast, Cloud Functions are executed on events, such as a new Pub/Sub message, and you can leave the function deployed because the cost of each invocation is tiny.

Google Cloud Platform

Point your browser to https://console.cloud.google.com/; if you don't have an account, you can create a free one with $300 of credit for one year.

Create a project (or select an existing one) and open the IoT Core menu:

You can find IoT Core under the Big Data section.

Now create a registry: choose a registry ID to identify the IoT Core endpoint. You can have multiple devices send data to this registry.

Choose a region close to the device that will stream the data and enable the protocol; for this tutorial you need just MQTT, but you can leave both enabled (MQTT/HTTP).

[Default telemetry topic]

Now it's time to create a Pub/Sub topic to collect the data coming from the device: select Create topic from the drop-down menu and type in a name for this Pub/Sub topic.

You can leave the key management to Google.

[Device state topic]

In a more complex setup you may wish to redirect device state change events to a separate Pub/Sub topic; for that you would use the Device state topic.

Select the logging level you wish; "Info" is enough, and you can change it later.

Keep in mind that these logs use Stackdriver services and will be billed.

Click on Create, and both the registry and the Pub/Sub topic will be created.

Now select the registry you created and click on "Devices" to configure the access for your device:

Choose a device ID to identify your device's data, set it up to allow communications, set authentication to "manual" with the key format RS256, and paste the content of the rsa_public.pem file you created before.

Let's stream some data to the cloud

You should have all the files of the repository you previously cloned inside a folder.

Edit the config.py settings (Wi-Fi, the registry ID and device ID you set during creation, and the SDA/SCL pins of your wiring).
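As a rough sketch, the relevant sections of config.py could look like the following; the exact field names may differ in the repo, so treat this as an illustration and keep the structure you find in the file:

wifi_config = {
    'ssid': '<YOUR_SSID>',
    'password': '<YOUR_WIFI_PASSWORD>'
}

google_cloud_config = {
    'project_id': '<YOUR_PROJECT_ID>',
    'cloud_region': '<YOUR_REGION>',  # e.g. europe-west1
    'registry_id': '<YOUR_REGISTRY_ID>',
    'device_id': '<YOUR_DEVICE_ID>',
    'mqtt_bridge_hostname': 'mqtt.googleapis.com',
    'mqtt_bridge_port': 8883
}

# I2C pins used for the BME280 (match your wiring)
device_config = {
    'scl_pin': 22,
    'sda_pin': 21
}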

Once the config.py file is ready you can upload all the files to the board.

ampy -p COM5 put config.py

ampy -p COM5 put iot-test.py main.py

With the second command you upload the file iot-test.py, renaming it main.py so that it auto-executes at each restart.

If everything is ok, you should see some data flowing in Pub/Sub.

To do a quick check, use the Stackdriver logs:

You should see some data:

Now it's time to process the incoming data.

The data should flow from IoT Core to Pub/Sub and accumulate until a subscription pulls it.

We will build a Google Cloud Function that is called each time new data arrives on Pub/Sub and does a simple data preparation before pushing it to BigQuery.

Create dataset

First of all, let's create a dataset and table in BigQuery to receive the incoming data: go to BigQuery from the Google Cloud Console.

You can use whatever names you prefer for the dataset and the table, just respect the naming constraints; write them down for use in the GCF.

Under "Resource", click on your Project Name and select "Create Dataset":

Type in the name IoT_test and select the region nearest to you (or to the streaming device):

It's time to create a table for the data!

Click on the newly created dataset, select "Create table" and define the fields you need on the table:
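A minimal schema matching the sensor readings could be the following (field names here are an assumption; they just have to match what your Cloud Function inserts):

timestamp    TIMESTAMP
temperature  FLOAT
humidity     FLOAT
pressure     FLOAT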

Now you are able to collect the data you are streaming.

Google Cloud Functions setup

The last step of the ingress-to-database process is to write and deploy a GCF that takes the data from your device as soon as it is published to Pub/Sub, does a small cleanup of the data, and sends it to BigQuery.

Go back to Google Cloud Console and open Cloud Functions from the left menu.

Click on Create Function and fill in some info, like the function's name, memory, etc.

You can have different memory size for your functions and you can choose different languages.

For this tutorial we use Python and 128MB should be enough.

The GCF will be executed each time an event triggers it, and there are several event triggers to choose from.

Under trigger, select Pub/Sub and select your topic.

Copy the code from the file gcfunctions.py and paste it in the textarea.

You can have many functions inside a single GCF and you have to specify which one should be executed on the trigger event: type the name of that function in "Function to execute":
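For reference, a minimal function of this kind might look like the sketch below. This is an illustration, not the exact content of gcfunctions.py: the function name iot_to_bigquery, the DATASET/TABLE environment variables and the payload field names are assumptions.

import base64
import json
import os

from google.cloud import bigquery

def iot_to_bigquery(event, context):
    # Triggered by a Pub/Sub message: decode the payload and insert it into BigQuery.
    payload = json.loads(base64.b64decode(event['data']).decode('utf-8'))

    client = bigquery.Client()
    table_id = '{}.{}.{}'.format(client.project, os.environ['DATASET'], os.environ['TABLE'])

    row = {
        'timestamp': context.timestamp,  # publish time of the Pub/Sub message
        'temperature': payload.get('temperature'),
        'humidity': payload.get('humidity'),
        'pressure': payload.get('pressure'),
    }

    errors = client.insert_rows_json(table_id, [row])
    if errors:
        print('BigQuery insert errors:', errors)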

If you need specific libs for your function, you can load them in the requirements.txt tab.

We need two additional libs for our project:

google-cloud

google-cloud-bigquery

Clicking on "Environment variables, networking, timeouts and more" you can access additional options, like the Region in which you want to deploy your function, some limit like time execution limit and maximum limit for concurrent functions you want to execute.

Under Environment variables you will define two variables for your dataset and your table:
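For example (the variable names are an assumption and must match the ones your function reads):

DATASET = IoT_test
TABLE = <your table name>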

Click on "Deploy", give it some seconds and go to check logs: open your function and click on "view logs".

You should see something like this:

If you see a terminating status "ok", the data has probably been processed correctly and you can check in BigQuery that everything is going well.

Open BigQuery in another tab, click on your dataset, then on the table, and click on "Preview":

Open Data Studio to create a data source, browsing to:

https://datastudio.google.com/

First, create a new data source pointing to your BigQuery data:

Select BigQuery, set a name for this data source and select the project, dataset and table:

Connect and make a little modification to the timestamp visualization field to also get hours and minutes:

Finally, click on "Create Report".

Just add a time-series chart to your dashboard, stretch it to maximum size, and add your metric, selecting Average as the aggregation:

Make some improvements, moving the Pressure series to a right-axis scale, make some little tweaks, and experiment with the options (don't be shy, you can't break anything! ;))

And finally, play with the styles to give a better look and feel to your dashboard, and add widgets and labels:

Congratulations, you've successfully built your first sensor gathering system in the Cloud!

You wired the circuit, flashed MicroPython onto a board, and set up BigQuery and Data Studio to refine and visualize the data.

You now know the key steps required to gather data from an ESP32 board and run some analysis on it.