Redshift materialized view refresh

2/26/2024

Amazon Redshift is a fully managed data warehousing service that currently helps tens of thousands of customers manage analytics at scale. It continues to lead price-performance benchmarks, and it separates compute and storage so each can be scaled independently and you only pay for what you need. It also eliminates data silos by simplifying access to your operational databases, data warehouse, and data lake with consistent security and governance policies.

With the Amazon Redshift streaming ingestion feature, it’s easier than ever to access and analyze data coming from real-time data sources. It simplifies the streaming architecture by providing native integration between Amazon Redshift and the streaming engines in AWS, which are Amazon Kinesis Data Streams and Amazon Managed Streaming for Apache Kafka (Amazon MSK). Streaming data sources like system logs, social media feeds, and IoT streams can continue to push events to the streaming engines, and Amazon Redshift simply becomes another consumer. Before Amazon Redshift streaming was available, we had to stage the streaming data first in Amazon Simple Storage Service (Amazon S3) and then run the COPY command to load it into Amazon Redshift. Eliminating the need to stage data in Amazon S3 results in faster performance and improved latency. With this feature, we can ingest hundreds of megabytes of data per second with a latency of just a few seconds.

Another common challenge for our customers is the additional skill required when using streaming data. In Amazon Redshift streaming ingestion, only SQL is required. We use SQL to do the following: define the integration between Amazon Redshift and our streaming engines by creating an external schema; create the streaming database objects, which are actually materialized views; generate new features that are used to predict delays using machine learning (ML); and perform inferencing natively using Amazon Redshift ML.

In this post, we build a near real-time logistics dashboard using Amazon Redshift and Amazon Managed Grafana. Our example is an operational intelligence dashboard for a logistics company that provides situational awareness and augmented intelligence for their operations team. From this dashboard, the team can see the current state of their consignments and their logistics fleet based on events that happened only a few seconds ago. It also shows the consignment delay predictions of an Amazon Redshift ML model, which helps them proactively respond to disruptions before they even happen.

This solution is composed of the following components, and the provisioning of resources is automated using the AWS Cloud Development Kit (AWS CDK). Multiple streaming data sources are simulated through Python code running in our serverless compute service, AWS Lambda. The streaming events are captured by Amazon Kinesis Data Streams, which is a highly scalable serverless streaming data service. We use the Amazon Redshift streaming ingestion feature to process and store the streaming data, and Amazon Redshift ML to predict the likelihood of a consignment getting delayed. We use AWS Step Functions for serverless workflow orchestration.
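To make the SQL-only part of streaming ingestion more concrete, here is a minimal sketch in Python (the language this post already uses for its simulated data sources) that assembles the three statements involved: an external schema over Kinesis Data Streams, a materialized view over a stream, and an explicit materialized view refresh. The schema, stream, and view names and the IAM role ARN are hypothetical placeholders and are not taken from this post. The statement shapes follow the documented Amazon Redshift syntax as we understand it, so check them against the current documentation before running them.

```python
"""Sketch: build the SQL statements used for Redshift streaming ingestion.

Every object name and the IAM role ARN passed in below is a hypothetical
placeholder chosen for illustration; none of them come from the post.
"""
from typing import List


def external_schema_sql(schema: str, iam_role_arn: str) -> str:
    # Maps a Redshift schema onto Kinesis Data Streams so that each stream
    # can be queried by name from inside that schema.
    return (
        f"CREATE EXTERNAL SCHEMA {schema} FROM KINESIS "
        f"IAM_ROLE '{iam_role_arn}';"
    )


def streaming_mv_sql(view: str, schema: str, stream: str,
                     auto_refresh: bool = False) -> str:
    # The materialized view is the landing object for the stream.
    # kinesis_data holds the raw record; JSON_PARSE turns it into SUPER.
    refresh = "YES" if auto_refresh else "NO"
    return (
        f"CREATE MATERIALIZED VIEW {view} AUTO REFRESH {refresh} AS "
        "SELECT approximate_arrival_timestamp, "
        "JSON_PARSE(kinesis_data) AS payload "
        f'FROM {schema}."{stream}" '
        "WHERE CAN_JSON_PARSE(kinesis_data);"
    )


def refresh_mv_sql(view: str) -> str:
    # Pulls newly arrived stream records into the materialized view.
    return f"REFRESH MATERIALIZED VIEW {view};"


def build_statements(schema: str, iam_role_arn: str,
                     stream: str, view: str) -> List[str]:
    # Returned in the order the statements would be executed.
    return [
        external_schema_sql(schema, iam_role_arn),
        streaming_mv_sql(view, schema, stream),
        refresh_mv_sql(view),
    ]


if __name__ == "__main__":
    for statement in build_statements(
        schema="logistics_stream",                                  # hypothetical
        iam_role_arn="arn:aws:iam::123456789012:role/example-role",  # hypothetical
        stream="consignment_stream",                                # hypothetical
        view="consignment_stream_mv",                               # hypothetical
    ):
        print(statement)
```

The sketch only builds strings; the statements would still need to be run through whichever SQL client is used with the cluster. With auto refresh turned off, as in the default here, the REFRESH MATERIALIZED VIEW statement has to be issued explicitly each time new stream data should become visible. Passing auto_refresh=True instead asks Amazon Redshift to refresh the view on its own.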
