Amazon Redshift data sharing allows you to extend the ease of use, performance, and cost benefits that Amazon Redshift offers in a single cluster to multi-cluster deployments while still sharing data. Data sharing enables instant, granular, and fast data access across Amazon Redshift clusters without the need to copy or move data. It provides live access, so your users always see the most up-to-date and consistent information as it’s updated in the data warehouse. You can securely share live data with Amazon Redshift clusters in the same or different AWS accounts, and across Regions.
Amazon Redshift data sharing provides:
- A simple and direct way to share data across Amazon Redshift data warehouses.
- Instant, granular, and high-performance access without data copies or data movement.
- Live and transactionally consistent views of data across all consumers.
- Secure and governed collaboration within and across organizations and external parties.
How it works
Use cases
Workload isolation and chargeability
Share data from an ETL cluster with multiple, isolated BI and analytics clusters in a hub-and-spoke architecture to provide read workload isolation and optional chargeback for costs. Each analytics cluster can be sized according to its price-performance requirements, and new workloads can be onboarded easily.
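As a rough illustration of the hub-and-spoke setup described above, the following sketch builds the SQL statements a producer (ETL) cluster and a consumer (BI or analytics) cluster might run. It is a minimal sketch, not a definitive implementation: the helper names `producer_statements` and `consumer_statement`, and the share, schema, database, and namespace values, are all hypothetical placeholders, and the code only generates statement text without connecting to a cluster.

```python
def producer_statements(share, schema, consumer_namespaces):
    """Build SQL for a producer cluster to share one schema.

    All arguments are illustrative placeholders (assumptions),
    not values taken from any real deployment.
    """
    stmts = [
        f"CREATE DATASHARE {share};",
        f"ALTER DATASHARE {share} ADD SCHEMA {schema};",
        f"ALTER DATASHARE {share} ADD ALL TABLES IN SCHEMA {schema};",
    ]
    # One grant per consumer cluster namespace (one per spoke).
    for ns in consumer_namespaces:
        stmts.append(
            f"GRANT USAGE ON DATASHARE {share} TO NAMESPACE '{ns}';"
        )
    return stmts


def consumer_statement(local_db, share, producer_namespace):
    """Build SQL for a consumer cluster to expose the share as a local database."""
    return (
        f"CREATE DATABASE {local_db} FROM DATASHARE {share} "
        f"OF NAMESPACE '{producer_namespace}';"
    )


if __name__ == "__main__":
    # Hypothetical names: one ETL producer, two isolated consumers.
    for stmt in producer_statements("etl_share", "analytics", ["ns-bi", "ns-ds"]):
        print(stmt)
    print(consumer_statement("etl_db", "etl_share", "ns-etl"))
```

Each consumer queries the shared tables through its own local database name, so every spoke can be sized and billed independently of the producer.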
Cross-group collaboration
Share data among multiple business groups that each maintain separate Amazon Redshift clusters, so they can collaborate on broader analytics and data science. Each Amazon Redshift cluster can be a producer of some datasets and a consumer of others.
Data and analytics as a service
Share data as a service across different groups in the organization and with external parties outside the organization's boundaries.
Development agility
Share data between development, test, and production environments at any granularity, without having to take snapshots and restore them in full.
Customer success
FactSet
“Many FactSet clients are undertaking cloud transformation and technology modernization programs in an effort to reduce costs, consolidate their infrastructures and vendors, and eliminate duplicated data. To meet the evolving needs of our clients, FactSet provides flexible solutions that can be seamlessly integrated into a number of preferred workflow environments including AWS Redshift, making the adoption and implementation of our content and solutions turnkey. By leveraging Redshift’s data sharing capabilities, FactSet Standard DataFeeds are instantly available in our clients’ Redshift cluster. This allows them to outsource all ETL processes to FactSet, resulting in streamlined data, reduced time to market, more efficient data integrity, and a simplified process for data discovery, linking, and testing.”
Namita Jain, Product Owner – Cloud & Managed Services
Epsilon
“Prior to data sharing, our process for exchanging data with our clients using Amazon Redshift was not as efficient as it could be. We would typically spin up two additional clusters twice per week, restore, unload, copy, drop schemas, and grant privileges. With data sharing, we can share data with our clients with little to no down time. With less than 10 SQL statements, we securely achieve what used to be a much longer process. This feature gives us more flexibility, saves time and cost, and increases client satisfaction.”
Samantha Corkery, Principal Database Administrator - Epsilon
Warner Bros.
“At Warner Bros. Games, we build and maintain complex data mobility infrastructures to manage data movements across single game clusters and consolidated business function clusters. However, developing and maintaining this system monopolizes valuable team resources and introduces delays that impede our ability to act on the data with agility and speed. Using the Redshift data sharing feature, we can remove the entire subsystem we built for data copying, movement, and loading between Redshift clusters. This will empower all of our business teams to make decisions on the right datasets more quickly and efficiently. Additionally, Redshift data sharing will also allow us to re-architect compute provisioning to more closely align with the resources needed to execute those functions’ SQL workloads—ultimately enabling simpler infrastructure operations.”
Kurt Larson, Technical Director - Warner Bros. Analytics
Yelp
“The data sharing feature seamlessly allows multiple Redshift clusters to query data located in our RA3 clusters and their managed storage. This eliminates our concerns with delays in making data available for our teams, and reduces the amount of data duplication and associated backfill headache. We now can concentrate even more of our time making use of our data in Redshift and enable better collaboration instead of data orchestration.”
Steven Moy, Engineer - Yelp
Fannie Mae
“At Fannie Mae, we adopted a de-centralized approach to data warehouse management with tens of Amazon Redshift clusters across many applications. While each team manages their own dataset, we often have use cases where an application needs to query the datasets from other applications and join with the data available locally. We currently unload and move data from one cluster to another cluster, and this introduces delays in providing timely access to data to our teams. We have had issues with unload operations spiking resource consumption on producer clusters, and data sharing allows us to skip this intermediate unload to Amazon S3, saving time and lowering consumption. Many applications are performing unloads currently in order to share datasets, and we plan to convert all such processes to leveraging the new data sharing feature. With data sharing, we can enable seamless sharing of data across application teams and give them common views of data without having to do ETL. We are also able to avoid the data copies between pre-prod, research, and production environments for each application. Data sharing made us more agile and gave us the flexibility to scale analytics in highly distributed environments like Fannie Mae.”
Amy Tseng, Enterprise Databases Manager - Fannie Mae
Home24
“Shared storage allowed us to focus on what matters: making data available to end-users. Data is no longer stuck in a myriad of storage mediums or formats, or accessible only through select APIs, but rather in a single flavor of SQL.”
Marco Couperus, Engineering Manager - home24
Resources
Sharing Amazon Redshift data securely across Amazon Redshift clusters for workload isolation
Get started with Amazon Redshift
Follow these steps to load sample data and start analyzing it with Amazon Redshift.