Connecting Tableau to Amazon Athena
Tableau connects to Amazon Athena through a JDBC driver. Once connected, users can analyze and visualize data stored in Amazon S3 using Athena, a serverless interactive query service. After installing and configuring the JDBC driver, users can connect to Athena from Tableau and explore their data visually and interactively.
To learn how to create visualizations for your AWS Data Lake using Amazon Athena and Tableau, see this AWS Blog. It provides step-by-step instructions for building these visualizations.
Prerequisites
- An AWS Access Key ID and Secret Access Key with programmatic IAM access to the respective Athena workgroup, the underlying data stored in S3, and the Athena output S3 bucket.
- JDK 7 (Java 1.7) or later is required. Please install it from here.
- Tableau Desktop is installed on your local machine. For Windows & Mac: https://www.tableau.com/products/desktop/download
- Keep ports 443 (HTTPS API calls) and 444 (which Athena uses to stream query results) open to outbound traffic. Outbound access can be restricted to only these two ports for the specific Athena endpoint (athena.<region>.amazonaws.com).
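The outbound port requirement above can be checked from the machine that runs Tableau Desktop. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and a successful TCP connection only shows that the port is not blocked. It does not verify TLS, credentials, or IAM permissions.

```python
import socket


def athena_endpoint(region: str) -> str:
    """Return the regional Athena endpoint host, e.g. athena.us-east-1.amazonaws.com."""
    return f"athena.{region}.amazonaws.com"


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Try to open a TCP connection; True if it succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def check_athena_ports(region: str, ports=(443, 444)) -> dict:
    """Map each port to whether an outbound TCP connection could be opened."""
    host = athena_endpoint(region)
    return {port: port_is_reachable(host, port) for port in ports}
```

Calling `check_athena_ports("us-east-1")` (the region is a placeholder) returns a dictionary keyed by port number, with `False` for any port that a firewall or proxy is likely blocking.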
Note:
- For technical specifications, including supported platforms, please refer to this Tableau doc.
- For Tableau Plans and Pricing, please refer to this page.
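For the first prerequisite, the three resources named there (the Athena workgroup, the source data in S3, and the query output bucket) can be expressed as an IAM policy document. The sketch below builds one as a Python dictionary. It is illustrative and not exhaustive: the function name, the statement IDs, and the choice of actions are assumptions, and permissions on the data catalog (for example AWS Glue) are likely also needed but are not shown.

```python
def build_athena_policy(region, account_id, workgroup, data_bucket, output_bucket):
    """Return an illustrative IAM policy document (as a dict) for a Tableau user."""
    workgroup_arn = f"arn:aws:athena:{region}:{account_id}:workgroup/{workgroup}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Run queries and fetch (or stream) results in one workgroup.
                "Sid": "AthenaWorkgroupAccess",
                "Effect": "Allow",
                "Action": [
                    "athena:GetWorkGroup",
                    "athena:StartQueryExecution",
                    "athena:StopQueryExecution",
                    "athena:GetQueryExecution",
                    "athena:GetQueryResults",
                    "athena:GetQueryResultsStream",
                ],
                "Resource": [workgroup_arn],
            },
            {
                # Read-only access to the underlying data in S3.
                "Sid": "ReadSourceData",
                "Effect": "Allow",
                "Action": ["s3:GetBucketLocation", "s3:ListBucket", "s3:GetObject"],
                "Resource": [
                    f"arn:aws:s3:::{data_bucket}",
                    f"arn:aws:s3:::{data_bucket}/*",
                ],
            },
            {
                # Read and write access to the Athena query output bucket.
                "Sid": "QueryResultsBucket",
                "Effect": "Allow",
                "Action": [
                    "s3:GetBucketLocation",
                    "s3:ListBucket",
                    "s3:GetObject",
                    "s3:PutObject",
                ],
                "Resource": [
                    f"arn:aws:s3:::{output_bucket}",
                    f"arn:aws:s3:::{output_bucket}/*",
                ],
            },
        ],
    }
```

Passing the result to `json.dumps(policy, indent=2)` gives JSON that can be reviewed by whoever manages IAM in your account before anything is attached to a user.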
Steps to Connect
- Follow the steps in this external article to connect Tableau to Amazon Athena.
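Before opening the connection dialog, it can help to work out the values you will type in. The sketch below assumes the Tableau Amazon Athena connector asks for a server, a port, and an S3 staging directory alongside the access keys; the class and function names are hypothetical, and the linked article remains the authoritative description of the dialog. Credentials are deliberately left out so they are never written into a script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AthenaConnectionFields:
    server: str                # regional Athena endpoint
    port: int                  # HTTPS port used for API calls
    s3_staging_directory: str  # where Athena writes query results


def connection_fields(region: str, staging_bucket: str, prefix: str = "") -> AthenaConnectionFields:
    """Derive the non-secret connection values from a region and an output bucket."""
    path = f"s3://{staging_bucket}/{prefix.strip('/')}"
    if not path.endswith("/"):
        path += "/"  # normalize to a directory-style location
    return AthenaConnectionFields(
        server=f"athena.{region}.amazonaws.com",
        port=443,
        s3_staging_directory=path,
    )
```

For example, `connection_fields("us-west-2", "my-results", "tableau")` uses a placeholder bucket and prefix; substitute the output location configured for your workgroup.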
Live Mode vs Extract Mode
Tableau Desktop supports two types of data sources: Live Data Source and Extract Data Source.
Live Mode
In Live mode, Tableau sends each query directly to Athena and displays the results as they are returned. This means that changes made to the data in Athena show up as soon as the visualization queries Athena again. Live mode is useful when you need up-to-the-minute data and when the size of the data is relatively small.
However, there are a few things to keep in mind when using Live mode:
- Live mode can be slower than Extract mode, especially when working with large datasets or complex queries.
- Live mode requires a constant connection to the data source, which can be problematic if you have a slow or unstable network connection.
- Live mode can put a strain on your Athena resources, especially if you have many users querying the data simultaneously.
Extract Mode
In Extract mode, Tableau pulls data from Athena (all of it or a filtered subset) and stores it locally in a Tableau-specific format. Queries then run against the extract instead of Athena, which can improve performance and make larger datasets easier to work with. You can set up an extract to refresh automatically on a schedule or refresh it manually as needed.
Here are a few things to consider when using Extract mode:
- Extracts can be faster than Live mode, especially when working with large datasets or complex queries.
- Extracts allow you to work offline or with a slow network connection because the data is stored locally.
- Extracts can become stale if they are not refreshed frequently enough.
Ultimately, the choice between Live and Extract mode depends on your specific use case and requirements. If you need real-time data and have a small dataset, Live mode may be the best choice. If you have a large dataset or complex queries, or if you need to work offline or with a slow network connection, Extract mode may be the better option.
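The guidance above can be written down as a rough rule of thumb. The following sketch is only an illustration of that reasoning, not a Tableau feature; the function name and its parameters are hypothetical, and it returns "depends" whenever the guidance does not clearly favor one mode, for example real-time needs combined with a large dataset.

```python
def suggest_mode(needs_real_time: bool,
                 large_dataset: bool = False,
                 complex_queries: bool = False,
                 offline_or_slow_network: bool = False) -> str:
    """Return "live", "extract", or "depends" based on the trade-offs described above."""
    favors_extract = large_dataset or complex_queries or offline_or_slow_network
    if needs_real_time and not favors_extract:
        return "live"      # fresh data matters and nothing argues for an extract
    if favors_extract and not needs_real_time:
        return "extract"   # performance or connectivity matters more than freshness
    return "depends"       # conflicting or no signals: weigh the specific use case
```

A dashboard over a small table that must show current data maps to `suggest_mode(True)`, while a large historical dataset reviewed weekly maps to `suggest_mode(False, large_dataset=True)`.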