BigQuery is Google Cloud's fully managed data warehouse for analyzing data with SQL. Bigtable is a NoSQL database for data from gigabytes to exabytes. Previously, getting Bigtable data into BigQuery meant using ETL tools to copy it. Google has now introduced a Zero-ETL approach that lets BigQuery read Bigtable data directly, which makes querying that data faster.
BigQuery makes data analysis easier with built-in capabilities such as machine learning. It can query data where it lives, or store and analyze data within BigQuery itself. Features of BigQuery are:
- Data querying
- Database
- Managing data
- Data sharing
- Business intelligence
- Integration
- Cross-cloud analytics
Bigtable stores large amounts of data in a wide-column, key-value NoSQL database. Its use cases include real-time fraud detection, recommendations, personalization, and time series. Features of Bigtable are:
- High read and write throughput at low latency over large amounts of data
- Cluster resizing without downtime
- Flexible, automated replication to optimize workloads
Until now, ETL tools like Dataflow were used to copy data from Bigtable to BigQuery. That approach has shortcomings: low data freshness, the cost of storing the same data twice, and the effort of maintaining ETL pipelines. At petabyte scale, the duplicated data becomes expensive. The architecture below shows how BigQuery accesses data from Bigtable.
Figure: BigQuery accessing data from Bigtable
Zero-ETL approach
With the Zero-ETL approach, BigQuery queries Bigtable data in place, without copying or moving it, which narrows the gap between operational data and analytics. To make the data available inside BigQuery, create an external table over the Bigtable table and supply the Cloud Bigtable URI, which you can obtain from the Cloud Bigtable console. The URI format is described below.
The format of a Bigtable URI is https://googleapis.com/bigtable/projects/[project_id]/instances/[instance_id][/appProfiles/[app_profile]]/tables/[table_name], where:
- project_id: the project containing your Bigtable instance
- instance_id: the Bigtable instance ID
- (Optional) app_profile: the app profile ID that you want to use
- table_name: the name of the table you’re querying
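The components above can be assembled into a URI with a small helper. This is a minimal sketch: the function name `bigtable_uri` and the identifiers passed to it are hypothetical placeholders, and the path layout assumes the documented Cloud Bigtable URI format (`projects/.../instances/.../tables/...`, with an optional `appProfiles/...` segment).

```python
from typing import Optional


def bigtable_uri(project_id: str, instance_id: str, table_name: str,
                 app_profile: Optional[str] = None) -> str:
    """Assemble a Cloud Bigtable URI from its components (illustrative helper)."""
    # The three required components must be non-empty.
    for label, value in (("project_id", project_id),
                         ("instance_id", instance_id),
                         ("table_name", table_name)):
        if not value:
            raise ValueError(f"{label} must not be empty")

    parts = [
        "https://googleapis.com/bigtable",
        f"projects/{project_id}",
        f"instances/{instance_id}",
    ]
    # The app profile segment is optional and sits between instance and table.
    if app_profile:
        parts.append(f"appProfiles/{app_profile}")
    parts.append(f"tables/{table_name}")
    return "/".join(parts)


# Placeholder identifiers, not real resources.
print(bigtable_uri("my-project", "my-instance", "user_events"))
print(bigtable_uri("my-project", "my-instance", "user_events",
                   app_profile="analytics"))
```

In practice the URI is usually copied from the Cloud Bigtable console; building it in code is mainly useful when external tables are created by a script.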
The external table configuration tells BigQuery about the table's column families, column encodings, and data types. Once defined, the external table can be queried like any other table in BigQuery. Because Bigtable executes the query, you get the advantage of its high throughput and low latency and can quickly identify the relevant columns and rows from petabytes of data. Any part of the query that Bigtable does not support is executed by BigQuery. External tables also work with BigQuery's JDBC/ODBC drivers and with connectors for business intelligence and data visualization tools such as Data Studio, Looker, and Tableau, as well as with AutoML Tables for training machine learning models.
With this Zero-ETL approach, BigQuery reads Bigtable data directly, so you can query it faster and without maintaining a separate copy.
Consult Metclouds Technologies to adopt Google Cloud's Zero-ETL approach for querying Bigtable data from BigQuery.
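To make the external table configuration concrete, the sketch below generates a BigQuery `CREATE EXTERNAL TABLE` statement with `format = 'CLOUD_BIGTABLE'` and a `bigtable_options` JSON block describing column families and column types. The helper name `external_table_ddl`, the dataset and table names, the `profile` column family, and its columns are all assumptions made for illustration; the code only builds the SQL text and does not contact BigQuery.

```python
import json


def external_table_ddl(dataset: str, table: str, uri: str,
                       column_families: list) -> str:
    """Build a CREATE EXTERNAL TABLE statement for a Bigtable source (sketch)."""
    options = {
        # Column families, columns, types, and encodings that BigQuery should expose.
        "columnFamilies": column_families,
        # Expose the row key as STRING rather than BYTES.
        "readRowkeyAsString": True,
    }
    options_json = json.dumps(options, indent=2)
    return (
        f"CREATE EXTERNAL TABLE `{dataset}.{table}`\n"
        "OPTIONS (\n"
        "  format = 'CLOUD_BIGTABLE',\n"
        f"  uris = ['{uri}'],\n"
        f'  bigtable_options = """\n{options_json}\n"""\n'
        ");"
    )


# Hypothetical column family layout, for illustration only.
families = [
    {
        "familyId": "profile",
        "onlyReadLatest": True,
        "columns": [
            {"qualifierString": "name", "type": "STRING", "encoding": "TEXT"},
            {"qualifierString": "age", "type": "INTEGER", "encoding": "TEXT"},
        ],
    }
]

ddl = external_table_ddl(
    "analytics",  # placeholder dataset
    "users_bt",   # placeholder external table name
    "https://googleapis.com/bigtable/projects/my-project/instances/my-instance/tables/users",
    families,
)
print(ddl)

# Once the table exists, it is queried like any other BigQuery table.
# `rowkey` is the row key column of a Bigtable external table.
sample_query = "SELECT rowkey FROM `analytics.users_bt` WHERE rowkey LIKE 'user#%'"
print(sample_query)
```

The generated statement can be run in the BigQuery console or through any BigQuery client. Filtering on `rowkey`, as in the sample query, is the kind of predicate Bigtable can serve efficiently.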