Automating Google Sheets with Python using GCP & Streamlit

Seamless Data Management and Visualization: Unlocking the Power of Google Sheets with Google Clopud Platform (GCP) & Streamlit

Daksh Bhatnagar
6 min readJul 27, 2023

Introduction:

In this article, we will explore how to use Python to interact with Google Sheets, a cloud-based spreadsheet tool offered by Google, and how to visualize the data in a web application using Streamlit. We are going to be making this a continous stream of data flow since we will NOT manually download and then upload data anywhere.

Google Sheets is part of the larger Google Workspace suite of productivity tools and provides a convenient way to collaborate and store data in a tabular format.

Source: Google Images

We will be using the Google Auth library to access data from Google Sheets and Streamlit, a web application framework, to create an interactive data editor.

Setting up Google Sheets API and Credentials

For this section, we will be needing Google Cloud. Feel free to go ahead and enable billing and set up a project there.

Google Cloud is a comprehensive suite of cloud computing services and products offered by Google. It provides a wide range of cloud-based solutions for computing, storage, data analytics, machine learning, and more, enabling individuals and organizations to build, deploy, and scale applications and services in the cloud.

Now, we will have to enable the Drive and Google Sheets API on the Google Cloud Console, creating credentials and downloading the keys in a JSON format and then we can start importing libraries.

Source: Google Images

Next step involves setting up the necessary credentials to access Google Sheets API. We do this by importing the required libraries and defining the scope of access. In our case, the scope includes the ability to read spreadsheet data and access Google Drive.

#Importing the dependencies

#!pip install google-api-python-client oauth2client pandas streamlit --upgrade
import pandas as pd
from googleapiclient import discovery
from oauth2client.service_account import ServiceAccountCredentials
import streamlit as st


#setting the scope of the API
scope = ['https://spreadsheets.google.com/feeds','https://www.googleapis.com/auth/drive']
#Setting Up the Credentials
creds = ServiceAccountCredentials.from_json_keyfile_name('PATH_TO_THE_KEYS', scope)

Let’s go through the code step by step :-

- `import pandas as pd`: We import the Pandas library, which is a powerful data manipulation library in Python. It will be used to work with data retrieved from Google Sheets.

- `from oauth2client.service_account import ServiceAccountCredentials`: This line imports the necessary class to handle authentication using service account credentials.

- `scope`: This variable defines the scope of access for the Google Sheets API. It specifies the type of operations that our application is allowed to perform.

- `creds`: Here, we create credentials using the `ServiceAccountCredentials.from_json_keyfile_name` method. These credentials are based on a JSON key file (keys.json) that we need to obtain from the Google Cloud Console. We will need to pass this later to the API when making a request to fetch the data.

Creating a Service for Google Sheets API

In this section, we create a service object for Google Sheets API using the `googleapiclient.discovery` module.

#Instantiating a service object
service = discovery.build('sheets', 'v4', credentials=creds)

- `from googleapiclient import discovery`: We import the `discovery` module from the `googleapiclient` library, which enables us to build a service for the Google Sheets API.

- `service = discovery.build(‘sheets’, ‘v4’, credentials=creds)`: We create the service object using the `discovery.build` method. The first argument specifies the API name (‘sheets’ in our case), the second argument specifies the version (‘v4’), and the third argument provides the credentials required for authorization.

Retrieving Data from Google Sheets

Now that we have set up the API, we can retrieve data from a specific Google Sheets spreadsheet.

spreadsheet_id = ‘YOUR_SPREADSHEET_ID_GOES_HERE’
range_ = ' Sheet1!A1:G '
value_render_option = 'FORMATTED_VALUE'
date_time_render_option = ‘FORMATTED_STRING’
#making the request to get the data using our API
request = service.spreadsheets().values().get(
spreadsheetId=spreadsheet_id, range=range_,
valueRenderOption=value_render_option,
dateTimeRenderOption=date_time_render_option
)
#Executing the request or you won't get any values back
response = request.execute()
#get all the values from the response object
data = response[‘values’]
#storing the values in the dataframe
df = pd.DataFrame(data)
#Setting the names of the columns
df.columns = df.iloc[0]
df.drop([0], inplace=True)

- `spreadsheet_id`: This variable holds the unique identifier of the Google Sheets spreadsheet we want to access. Pls make sure to change the access from Restricted to Viewer if you aim to only read the data or if you would like to make some changes to the Google Sheets you can set the access permission to Editor

- `range_`: We define the range of cells we want to retrieve data from. In this case, ‘A1:G’ indicates all cells from column A to column G.

- `value_render_option`: Specifies how values should be represented in the response. ‘FORMATTED_VALUE’ returns the displayed value of the cells.

- `date_time_render_option`: Specifies how dates, time, and durations should be represented in the response. ‘FORMATTED_STRING’ returns them as formatted strings.

- We then use the `service.spreadsheets().values().get()` method to make a request to Google Sheets API and obtain the data. The response is stored in the `response` variable which is in JSON format from which we extract the actual data values into the `data` variable.

  • Finally, we convert the data into a Pandas DataFrame for easier manipulation and data analysis.

And If you want to write a dataframe to the Google Sheet Tab then you simply would have to send a Batch Request to the Sheets API. What you would want to do is convert the data to be written to a 2D array then specify the range and then call the batchupdate function to paste the values in the sheet. (Make sure to convert your dates and times into object data type or string to ensure you don’t get a serialising error)

#writing the dataframe to Google Sheet
input_values = [list(df.columns)] + df.values.tolist()
input_request = [{
'range': f"{sheet_name}!R1C1:R{df.shape[0]+1}C{df.shape[1]}",
'majorDimension' : "ROWS",
'values' : input_values
}]
body = {'valueInputOption' : "RAW", 'data' : input_request}

result = service.spreadsheets().values().batchUpdate(spreadsheetId = 'SPREADSHEET_ID',
body=body).execute()

Creating an Interactive Web Application with Streamlit

Streamlit is an open-source Python library for creating interactive web apps from data scripts. It works seamlessly with popular data libraries like Pandas and allows easy integration of charts, tables, sliders, and more.

With its user-friendly interface, Streamlit simplifies data visualization and analysis, making it ideal for rapid prototyping and sharing insights with others.

Source: Google Images

To make our data accessible and user-friendly, we will use Streamlit to create an interactive web application.

#setting the title of the app
st.title('My App')
#grouping the data for plotting purposes
data = df.groupby(['COL_NAME']).count()['ANOTHER_COL'].sort_values(ascending=False)
#Adding a markdown as a title for the chart
st.markdown("<h5 style='text-align: center; color: red;'>ANOTHER_COL COUNTPLOT</h1>",
unsafe_allow_html=True)
#visualizing the data
st.bar_chart(data)
#showing the entire data
st.data_editor(df)

- `import streamlit as st`: We import the Streamlit library, a user-friendly Python library for creating web applications.

- `st.title(‘My App’)`: This line displays the title “My App” on the web application. You can modify this to any title that suits your application.

- `st.data_editor(df)`: Streamlit’s `data_editor` function is used to display a data table that users can interact with. We pass the Pandas DataFrame `df` (which contains the retrieved Google Sheets data) as the argument to display the data table.

- `st.bar_chart(data)`: Streamlit’s `bar_chart` uses Altair as the default plotting library. We pass the Pandas Series`data` (which contains the grouped data) which we are then able to visualize on the app.

Make sure to run your Streamlit app using the command to start hosting the app locally on your machine

#command to run your streamlit app
python -m streamlit run your_app_name.py

Conclusion:

In this article, we have explored how to use Python along with GCP and Streamlit to interact with Google Sheets data and create an interactive web application. By combining the power of Google Sheets and Python libraries, we can seamlessly retrieve, analyze, and present data in a user-friendly way.

This integration is particularly helpful for data analysts, businesses, and individuals who wish to leverage the collaborative and cloud-based features of Google Sheets while utilizing the capabilities of Python for data manipulation and visualization.

As you continue to experiment and build upon this foundation, you can unlock the full potential of cloud-based data storage and analysis. You can also go full blown into deployment of the app using Streamlit and present insights to your stakeholders in much easier manner.

That’s all for now!

If you liked the article and it proved to be helpful to you, please consider giving it a clap and follow for more such articles!

HAPPY CODING!!

Resources:

  1. https://www.analyticsvidhya.com/blog/2020/07/read-and-update-google-spreadsheets-with-python/
  2. https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets.values/update
  3. https://docs.streamlit.io/library/get-started

--

--

Daksh Bhatnagar
Daksh Bhatnagar

Written by Daksh Bhatnagar

Data Analyst who talks about #datascience, #dataanalytics and #machinelearning

No responses yet