A Flask-based API to convert a Google Sheet to JSON. Structurally, this application is based on this example, which itself is a restructuring of this example and its accompanying article.
For both local and remote environments, you'll need to make sure the application has access to any Google Sheets data that it needs to load. In version 4 of the Sheets API, this happens through Service Accounts.
If you are authenticating this application with the Sheets API for the first time, you'll need to create a new Google Cloud project. Start by following this guide from Google. When you've finished Google's steps, you should have a new project.
Our specific Google Sheets integration uses the Sheetfu library, which has an authentication guide to finish this process. The screenshots are not necessarily up to date with the names Google uses for things.
Between these resources, you should follow these steps to create and access the authentication credentials:
- Create a new Google Cloud Platform project.
- Enable the Sheets and Drive APIs in the APIs & Services section of the Google Cloud Platform settings.
- Create a Service Account in the IAM & Admin section of the Google Cloud Platform settings.
- Download the new JSON-formatted key for that Service Account. Click the Actions menu for that account and choose the Manage keys option to create and download the key. Only use this key for one environment.
This new Service account will have an automatically-created email address. For this application, that email address must have at least Viewer-level access on any Google Sheets that it needs to access. It's best to give it that level of access on the folder level.
If this user is new or it is being given new access, it can take a few minutes for the changes to propogate.
If the Service Account user already exists in the Google Cloud Platform, you can access it at https://console.cloud.google.com/home/dashboard?project=[application-name].
If it hasn't been, you'll need your Google account added. An Administrator can do that at the correct dashboard URL by clicking "Add People to this Project."
Follow these steps to access the authentication credentials:
- Once you have access to the project's dashboard, click "Go to project settings" in the Project info box.
- Click Service Accounts in the IAM & Admin section of the Google Cloud Platform settings.
- If there is more than one service account, find the correct one.
- Click the Actions menu for that account and choose the Manage keys option.
- Click Add Key, choose Create new key, and choose JSON as the Key type. Click the Create button and download the key for that Service Account. Only use this key for one environment.
Once you have downloaded the JSON file with the credentials, you will use the values from it in the .env file or in the project's Heroku settings. See the sections of this readme that cover authentication for Google Sheets. Once you've authenticated successfully, you don't have to keep the JSON file around, unless you'd like to have a backup.
Enter the configuration values from the JSON key downloaded above into the .env file's values for these fields:
SHEETFU_CONFIG_TYPESHEETFU_CONFIG_PROJECT_IDSHEETFU_CONFIG_PRIVATE_KEY_IDSHEETFU_CONFIG_PRIVATE_KEYSHEETFU_CONFIG_CLIENT_EMAILSHEETFU_CONFIG_CLIENT_IDSHEETFU_CONFIG_AUTH_URISHEETFU_CONFIG_TOKEN_URISHEETFU_CONFIG_AUTH_PROVIDER_URLSHEETFU_CONFIG_CLIENT_CERT_URL
In the project's Heroku settings, enter the configuration values from the production-only JSON key downloaded above into the values for these fields:
SHEETFU_CONFIG_TYPESHEETFU_CONFIG_PROJECT_IDSHEETFU_CONFIG_PRIVATE_KEY_IDSHEETFU_CONFIG_PRIVATE_KEYSHEETFU_CONFIG_CLIENT_EMAILSHEETFU_CONFIG_CLIENT_IDSHEETFU_CONFIG_AUTH_URISHEETFU_CONFIG_TOKEN_URISHEETFU_CONFIG_AUTH_PROVIDER_URLSHEETFU_CONFIG_CLIENT_CERT_URL
This API supports storing JSON in either the application's Redis instance (whether locally or on Heroku) or in an Amazon S3 bucket. If you do not enable S3 usage, Redis will be used by default.
Before running the application, you'll need to run a Redis server for caching data. One way to do this is with Homebrew.
- Run
brew updatethenbrew install redis - If you want Redis to start automatically on login, run
brew services start redis. To stop it, runbrew services stop redis. If you want to run Redis in a terminal shell instead, you can runredis-server /usr/local/etc/redis.confinstead of using brew service. - Test if Redis is running with the command
redis-cli ping. Redis should respond with "PONG." - You shouldn't need a graphic interface for this project, but if you prefer to use one, Medis is free on the Mac App Store.
To run Redis on Heroku, installing the free Heroku Redis add-on should be sufficient.
Use the following fields in your .env or in your Heroku settings.
CACHE_TYPE = "RedisCache"CACHE_REDIS_HOST = "redis"CACHE_REDIS_PORT = "6379"(unless you are using a non-default port for Redis)CACHE_REDIS_DB = "0"(unless you need a separate Redis database. Redis creates databases in numeric order, so you can use other numbers)CACHE_REDIS_URL = "redis://127.0.0.1:6379/0"(make sure the:6379matches your port value, and that/0matches your Redis database number)CACHE_DEFAULT_TIMEOUT = "500"
To push JSON data from this application to Amazon AWS, we use the boto3 library.
To upload a file to S3, fill in these .env values to match your S3 account. If S3 is enabled, URLs for your files will be https://s3.amazonaws.com/[your bucket]/[your folder]/[this spreadsheet's cachekey].json.
AWS_ACCESS_KEY_IDAWS_SECRET_ACCESS_KEYAWS_BUCKETAWS_FOLDERUSE_AWS_S3use "true" or "false" for this; it will be converted to lowercase if you forget. "false" is the default value for this setting.
We use a basic setup of Flask-JWT-Extended to manage application access.
To generate API keys, run this code in a Python shell:
import secrets
secrets.token_hex(64)Do this for each valid API key that an application can use to access this API.
Store a list of valid API keys in the .env file or in Heroku configuration.
VALID_API_KEYS = '["key1", "key2"]'
Store a single secret key in the JWT_SECRET_KEY field of the .env file or in Heroku configuration.
There are a few general .env variables not already discussed. You can see these settings, and their default values (or the kind of value they expect), in the repository's .env-example file.
- Install
git - Get the code:
git clone https://github.com/MinnPost/google-sheet-to-json-api.git - Change the directory:
cd google-sheet-to-json-api - Create a
.envfile based on the repository's.env-examplefile in the root of your project. - Run
pipenv install. - Run
pipenv shell - Run
flask run --host=0.0.0.0. This creates a basic endpoint server at http://0.0.0.0:5000.
This application should be deployed to Heroku. If you are creating a new Heroku application, clone this repository with git clone https://github.com/MinnPost/google-sheet-to-json-api.git and follow Heroku's instructions to create a Heroku remote.
Currently, this application has three endpoints:
authorizeis used to get a token from a valid API key. The token is then required by the/parser/custom-overwrite/endpoint. This endpoint acceptsPOSTrequests./parser/is the main endpoint. It acceptsGETrequests, and will return JSON of that Google sheet's data and cache it. If there is a customized JSON structure that has already been cached and has not expired, it will return that instead./parser/custom-overwrite/receivesPOSTrequests. It receives custom formatted JSON, caches it, and returns it. APOSTrequest requiresapplication/jsonas theContent-Typeheader. It's a good idea to useapplication/jsonforGETrequests as well to make theAuthorizationheader easier, but you don't necessarily have to do this.
api_keyis required in thePOSTbody for requests toauthorize. A request with a valid API key returns atoken.tokenis required as anAuthorizationheader value on/parser/and/parser/custom-overwrite/requests. A request without a valid token in that header will fail. It can be passed to aGETorPOSTendpoint.
To pass this header to an endpoint within Python, do this:
authorized_headers = {
"Authorization": f"Bearer {valid_token_value}"
}Both URL endpoints support many of the same data parameters, but they get passed on the URL (in the case of a GET endpoint) or in the post body (in the case of a POST endpoint). Most parameters are usable by both endpoints, but if they are end-point specific that will be noted.
spreadsheet_idis a required ID of a Google Sheet that the application user can access.worksheet_namesis an optional parameter of worksheet names, such asRaces|Candidates. If it is left blank, the endpoint will load the first worksheet in the spreadsheet. If there are multiple worksheets provided, they will be sorted alphabetically to keep consistency.
outputis a full, customized JSON output of the modified Google Sheet data as it should be stored. It is only accepted on the/parser/custom-overwriteendpoint as aPOSTparameter.
external_use_s3is an optional parameter that takes a value of "true" or "false", and determines whether JSON data is saved to Amazon S3 instead of in Redis. A value for this parameter will override the default setting in the.envfile.
All of these parameters are specific to the Redis cache. They are all optional, but have default values.
bypass_cachetakes a value of "true" or "false". "false" is the default value. If the value is "true" the application will not check the Redis cache for any stored data, but will attempt to parse the Google Sheet directly and return its data.delete_cachetakes a value of "true" or "false". "false" is the default value. If the value is "true" the application will delete the cached data for the Google Sheet.cache_datatakes a value of "true" or "false". "true" is the default value. If the value is "false" the application will not store data for the Google Sheet in the cache.cache_timeouttakes the number of seconds that the cache should last before it expires. If present, this will override the default value from the.envfile. This value only has an effect on requests where data is actually being cached.