Handling data for development and production in Google Cloud Datastore

Google Cloud Datastore is a highly scalable NoSQL database for your applications. It's a fully managed database hosted in the cloud (Surprise! Surprise!!) and offered as part of Google Cloud Platform. Recently I used this cloud database in one of my apps to store app-specific settings and environment variables. We already know that saving passwords or other sensitive credentials in the code is not safe at all. The best idea is to keep such critical information in environment variables. But when you use a container technology like Docker for your project, handling environment variables from your development machine to the Docker container, and then updating them on the production server and in the deployed container, is a huge pain in the neck. Using a cloud datastore really helps in this situation. Google also provides an emulator for local datastore access, which works like the production datastore and answers the same REST-like data access calls. The good thing is that you can transfer your tested data from your local machine to the datastore on the cloud using a few lines of Python code. Let's have a look at it.

To use Cloud Datastore and the other cloud features on Google Cloud Platform, you have to register, create a billing account, create an app, install the Google Cloud SDK, make the gcloud command work in your terminal, and so on. Those steps do not fall within the scope of this article. As you're reading this, I am guessing you have already done all of that and landed here for some tips regarding Cloud Datastore.

Also, you will need a service account to access the datastore on the emulator and on the cloud. If you don't already have a service account, you can create one here. After you have created the service account successfully, you will be prompted to download the credentials and private key, which come as a JSON file with a name like your-app-name-123456789c.json. Download and save this file on your local machine.
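A wrong or half-downloaded key file usually shows up much later as a confusing authentication error, so a quick sanity check on the file can save some time. Below is a minimal sketch of such a check. The function name check_key_file and the list of expected fields are my own assumptions for illustration, not something the Google libraries require you to do.

```python
import json
from pathlib import Path

# Fields I expect to find in a service account key file.
# This list is an assumption for illustration, not an official schema.
EXPECTED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_key_file(path):
    """Return the project_id stored in a key file, or raise ValueError."""
    data = json.loads(Path(path).read_text())
    missing = [name for name in EXPECTED_FIELDS if name not in data]
    if missing:
        raise ValueError(f"key file is missing fields: {missing}")
    if data["type"] != "service_account":
        raise ValueError(f"unexpected credential type: {data['type']}")
    return data["project_id"]
```

Calling check_key_file('path/to/credentials/your-app-name-123456789c.json') before anything else should return your project ID if the file looks sane.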

The datastore emulator is a component of the Google Cloud SDK's gcloud tool. Install it using the gcloud command in a terminal:

gcloud components install cloud-datastore-emulator

After the installation, start the emulator by running the following command in the terminal:

gcloud beta emulators datastore start --data-dir=app/datastore

Note: I have used --data-dir to create the datastore inside my app's datastore folder. Providing --data-dir is optional; if it is not given, the database will be created in ~/.config/gcloud/emulators/datastore/WEB-INF/appengine-generated/local_db.bin. For more info and other startup options, check here.

When everything goes smoothly, you will see in the terminal that your datastore emulator is now running.

After the emulator starts, you need to set up the environment variables so that the application connects to the local datastore instead of the cloud one. You can set the necessary environment variables by running the following command:

$(gcloud beta emulators datastore env-init)

To see which environment variables you are setting, you can also print them:

echo $(gcloud beta emulators datastore env-init)

For me, it was something like this:

export DATASTORE_DATASET=my-project-id export DATASTORE_EMULATOR_HOST=localhost:8081 export DATASTORE_EMULATOR_HOST_PATH=localhost:8081/datastore export DATASTORE_HOST=http://localhost:8081 export DATASTORE_PROJECT_ID=my-project-id
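If you would rather pick these values up from Python than from the shell, the printed text is easy to parse. Here is a rough sketch, assuming the output always has the export NAME=value shape shown above. The helper names parse_env_init and load_emulator_env are hypothetical, and load_emulator_env needs gcloud on your PATH.

```python
import os
import shlex
import subprocess


def parse_env_init(output):
    """Turn 'export NAME=value ...' text into a {name: value} dictionary."""
    variables = {}
    # shlex.split handles both the one-line and the one-per-line layout.
    for token in shlex.split(output):
        if token == "export" or "=" not in token:
            continue
        name, value = token.split("=", 1)
        variables[name] = value
    return variables


def load_emulator_env(environ=os.environ):
    """Run env-init and copy its variables into the given environment."""
    result = subprocess.run(
        ["gcloud", "beta", "emulators", "datastore", "env-init"],
        capture_output=True,
        text=True,
        check=True,
    )
    variables = parse_env_init(result.stdout)
    environ.update(variables)
    return variables
```

Calling load_emulator_env() at the top of a script, before the datastore client is created, should have the same effect as running the $(...) command in the shell first.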

Once the environment variables are set, you can access the datastore on your local machine. As I used Cloud Datastore to save my app-specific settings and environment variables for the Docker container, I need to save those values in the datastore. I have created a Python script to do this job. Before we get to the code, we have to install the Python module for Google Cloud Datastore:

pip install google-cloud-datastore

import os
from google.cloud import datastore

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/credentials/your-app-name-123456789c.json'

env_vars = {
    'FLASKAPP_DEMO_DB_NAME': 'db_flask_demo_gcloud',
    'FLASKAPP_DEMO_MYSQL_PASSWORD': 'root',
    'FLASKAPP_DEMO_MYSQL_USERNAME': 'root',
    'FLASK_APP': 'run.py',
    'FLASK_DEBUG': 1,
    'FLASK_ENV': 'development',
    'FLASK_RUN_PORT': '8080',
    'FLASK_SECRET_KEY_DEV': 'dev-secret-key',
    'FLASK_SECRET_KEY_PROD': 'prod-secret-key',
    'HOST_DOCKER_INTERNAL': 'host.docker.internal',
    'HTTP_GUEST_PORT': '8080',
    'HTTP_HOST_PORT': '8080',
    'MAIL_PASSWORD': 'email password',
    'MAIL_PORT': '587',
    'MAIL_SERVER': 'smtp.email.com',
    'MAIL_USERNAME': 'webmaster@mydomain.com',
    'MAIL_USE_TLS': '1',
    'RECAPTCHA_SECRET_KEY': 'recaptcha secret key',
    'RECAPTCHA_SITE_KEY': 'recaptcha site key',
    'SQLALCHEMY_ECHO': 1,
    'SQLALCHEMY_TRACK_MODIFICATIONS': 1
}

ds = datastore.Client()
entity = datastore.Entity(key=ds.key('flaskapp-demo-dev', 'environment-variables'))
entity.update(env_vars)
ds.put(entity)

So, the code above simply collects all the settings and environment variables you need in a Python dictionary, then uses the datastore client to create the data entity and save it into the datastore. Please note that the entity key is built from two parts for better scoping. The first part, flaskapp-demo-dev, is the kind of the entity, which here acts as the grouping for the project and environment. You might like to name it according to your project and environment type. The second part, environment-variables, is the name of the entity and denotes the type of variables it holds. You might create a more fine-grained separation for your app settings, like db-connection-strings, bulk-email-settings, payment-settings, etc., according to your needs.
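To make the fine-grained separation a bit more concrete, here is a small sketch that splits one flat dictionary into several named groups by key prefix and saves each group as its own entity under the same kind. The helpers group_settings and save_groups, and the prefix mapping you pass in, are hypothetical names of mine. The datastore import sits inside save_groups so that the grouping part can be tried without the library installed.

```python
def group_settings(settings, prefixes, default_group="environment-variables"):
    """Split a flat settings dict into {group_name: {key: value}}.

    prefixes maps a group name to a tuple of key prefixes. Keys that match
    no prefix land in default_group.
    """
    groups = {}
    for name, value in settings.items():
        group = next(
            (g for g, starts in prefixes.items() if name.startswith(starts)),
            default_group,
        )
        groups.setdefault(group, {})[name] = value
    return groups


def save_groups(client, kind, groups):
    """Save each group as one entity keyed by (kind, group_name)."""
    from google.cloud import datastore  # imported lazily on purpose

    entities = []
    for group_name, values in groups.items():
        entity = datastore.Entity(key=client.key(kind, group_name))
        entity.update(values)
        entities.append(entity)
    client.put_multi(entities)
```

For example, group_settings(env_vars, {'bulk-email-settings': ('MAIL_',)}) would move the MAIL_ keys into their own group and leave everything else under environment-variables, and save_groups(ds, 'flaskapp-demo-dev', groups) would then write one entity per group.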

Now, as we have created our data entity, we are going to retrieve it to see if it really worked. Here is the Python code to do it:

import os
from google.cloud import datastore
# Just to prettily print the retrieved entity object
from pprint import pprint

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/credentials/your-app-name-123456789c.json'

ds = datastore.Client()
key = ds.key('flaskapp-demo-dev', 'environment-variables')
env_vars = ds.get(key)

pprint(env_vars)

At this stage, hopefully you will see the retrieved data entity printed as a dictionary.
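Since the whole point is to feed these values to the app as environment variables, the retrieved entity still has to be copied into os.environ. Two details are easy to trip over: os.environ only accepts strings, while the dictionary above mixes strings and integers, and ds.get(key) returns None when the key does not exist. Here is a minimal sketch that handles both. The function name apply_settings and its overwrite flag are my own choices.

```python
import os


def apply_settings(entity, environ=os.environ, overwrite=False):
    """Copy settings from a retrieved entity into an environment mapping.

    Returns the list of variable names that were actually set.
    """
    if entity is None:
        raise LookupError("settings entity not found in the datastore")
    applied = []
    for name, value in entity.items():
        # Keep values that are already set unless asked to overwrite them.
        if not overwrite and name in environ:
            continue
        environ[name] = str(value)  # environment values must be strings
        applied.append(name)
    return applied
```

With the retrieval script above, apply_settings(env_vars) would go right after the ds.get(key) call.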

Please note that if you don't set the emulator environment variables as described above, your application will connect to the datastore on the cloud. Now you have all the necessary data on your local machine, and you want to copy it over to the cloud platform. Yup, you guessed it right. You just have to use the same data creation code, but with the environment variables you set earlier unset. It might get a little bit clumsy to set and unset environment variables to sync your data from local to cloud. So, I have created two separate scripts: one for creating data on the local machine and another for creating data on the cloud datastore.

For data creation on the local machine:

import os
from google.cloud import datastore

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/credentials/your-app-name-123456789c.json'

# Instead of creating environment variables using 
# $(gcloud beta emulators datastore env-init), I am 
# setting them manually here.
os.environ['DATASTORE_DATASET'] = 'my-project-id'
os.environ['DATASTORE_EMULATOR_HOST'] = 'localhost:8081'
os.environ['DATASTORE_EMULATOR_HOST_PATH'] = 'localhost:8081/datastore'
os.environ['DATASTORE_HOST'] = 'http://localhost:8081'
os.environ['DATASTORE_PROJECT_ID'] = 'my-project-id'

env_vars = {
    'FLASKAPP_DEMO_DB_NAME': 'db_flask_demo_gcloud',
    'FLASKAPP_DEMO_MYSQL_PASSWORD': 'root',
    'FLASKAPP_DEMO_MYSQL_USERNAME': 'root',
    'FLASK_APP': 'run.py',
    'FLASK_DEBUG': 1,
    'FLASK_ENV': 'development',
    'FLASK_RUN_PORT': '8080',
    'FLASK_SECRET_KEY_DEV': 'dev-secret-key',
    'FLASK_SECRET_KEY_PROD': 'prod-secret-key',
    'HOST_DOCKER_INTERNAL': 'host.docker.internal',
    'HTTP_GUEST_PORT': '8080',
    'HTTP_HOST_PORT': '8080',
    'MAIL_PASSWORD': 'email password',
    'MAIL_PORT': '587',
    'MAIL_SERVER': 'smtp.email.com',
    'MAIL_USERNAME': 'webmaster@mydomain.com',
    'MAIL_USE_TLS': '1',
    'RECAPTCHA_SECRET_KEY': 'recaptcha secret key',
    'RECAPTCHA_SITE_KEY': 'recaptcha site key',
    'SQLALCHEMY_ECHO': 1,
    'SQLALCHEMY_TRACK_MODIFICATIONS': 1
}

ds = datastore.Client()
entity = datastore.Entity(key=ds.key('flaskapp-demo-dev', 'environment-variables'))
entity.update(env_vars)
ds.put(entity)

When you're done with testing and ready to push your data to the cloud datastore, just remove the emulator environment variables from the script above, and it will access the datastore on the cloud:

import os
from google.cloud import datastore

os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path/to/credentials/your-app-name-123456789c.json'

env_vars = {
    'FLASKAPP_DEMO_DB_NAME': 'db_flask_demo_gcloud',
    'FLASKAPP_DEMO_MYSQL_PASSWORD': 'root',
    'FLASKAPP_DEMO_MYSQL_USERNAME': 'root',
    'FLASK_APP': 'run.py',
    'FLASK_DEBUG': 1,
    'FLASK_ENV': 'development',
    'FLASK_RUN_PORT': '8080',
    'FLASK_SECRET_KEY_DEV': 'dev-secret-key',
    'FLASK_SECRET_KEY_PROD': 'prod-secret-key',
    'HOST_DOCKER_INTERNAL': 'host.docker.internal',
    'HTTP_GUEST_PORT': '8080',
    'HTTP_HOST_PORT': '8080',
    'MAIL_PASSWORD': 'email password',
    'MAIL_PORT': '587',
    'MAIL_SERVER': 'smtp.email.com',
    'MAIL_USERNAME': 'webmaster@mydomain.com',
    'MAIL_USE_TLS': '1',
    'RECAPTCHA_SECRET_KEY': 'recaptcha secret key',
    'RECAPTCHA_SITE_KEY': 'recaptcha site key',
    'SQLALCHEMY_ECHO': 1,
    'SQLALCHEMY_TRACK_MODIFICATIONS': 1
}

ds = datastore.Client()
entity = datastore.Entity(key=ds.key('flaskapp-demo-dev', 'environment-variables'))
entity.update(env_vars)
ds.put(entity)
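The two scripts differ only in the emulator variables, so they could also be folded into one script with a switch. The following is just a sketch of that idea, not what I actually ran. The helper select_target and the target names local and cloud are hypothetical, and the values are the ones from my machine shown earlier.

```python
import os

# The emulator variables from my machine; adjust them to your own output.
EMULATOR_VARS = {
    "DATASTORE_DATASET": "my-project-id",
    "DATASTORE_EMULATOR_HOST": "localhost:8081",
    "DATASTORE_EMULATOR_HOST_PATH": "localhost:8081/datastore",
    "DATASTORE_HOST": "http://localhost:8081",
    "DATASTORE_PROJECT_ID": "my-project-id",
}


def select_target(target, environ=os.environ):
    """Point the process at the emulator ('local') or the cloud ('cloud')."""
    if target == "local":
        environ.update(EMULATOR_VARS)
    elif target == "cloud":
        # Unset everything the emulator needed, including any values that
        # were exported in the shell before the script started.
        for name in EMULATOR_VARS:
            environ.pop(name, None)
    else:
        raise ValueError(f"unknown target: {target!r}")
    return target
```

The call has to happen before datastore.Client() is created, for example select_target(sys.argv[1]) at the top of the script, so that the client sees the right variables.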

That is it. I hope integrating and testing an app that uses Google Cloud Datastore is now easier.