Build a Data Lake Foundation on the AWS Cloud with AWS Services
The data lake foundation provides these features:
- Data submission, including batch submissions to Amazon S3 and streaming submissions via Amazon Kinesis Firehose
- Ingest processing, including data validation, metadata extraction, and indexing via Amazon S3 events, Amazon Simple Notification Service (Amazon SNS), AWS Lambda, Amazon Kinesis Analytics, and Amazon Elasticsearch Service (Amazon ES)
- Dataset management through Amazon Redshift transformations and Kinesis Analytics
- Data transformation, aggregation, and analysis through Amazon Athena and Amazon Redshift Spectrum
- Search, by indexing metadata in Amazon ES and exposing it through Kibana dashboards
- Publishing into an S3 bucket for use by visualization tools, and visualization with Amazon QuickSight
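As an illustration of the ingest processing step, here is a minimal sketch of how a Lambda function might extract object metadata from an S3 event, whether it arrives directly or wrapped in an Amazon SNS message. This is not the Quick Start's actual code: the function names `extract_metadata` and `handler` and the output field names (`size_bytes`, `event_time`, and so on) are assumptions made for this example, and the step that would write the documents to Amazon ES is left out.

```python
import json
from urllib.parse import unquote_plus


def extract_metadata(s3_event):
    """Turn an S3 event notification into a list of metadata documents.

    The output field names are hypothetical; pick whatever your index needs.
    """
    documents = []
    for record in s3_event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        obj = record["s3"]["object"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(obj["key"])
        documents.append({
            "bucket": bucket,
            "key": key,
            "size_bytes": obj.get("size"),
            "etag": obj.get("eTag"),
            "event_time": record.get("eventTime"),
            "event_name": record.get("eventName"),
        })
    return documents


def handler(event, context=None):
    """Hypothetical Lambda entry point for direct or SNS-wrapped S3 events."""
    documents = []
    for record in event.get("Records", []):
        if "Sns" in record:
            # SNS delivers the original S3 event as a JSON string in Message.
            inner_event = json.loads(record["Sns"]["Message"])
            documents.extend(extract_metadata(inner_event))
        elif "s3" in record:
            documents.extend(extract_metadata({"Records": [record]}))
    # A real function would now index these documents in Amazon ES.
    return documents
```

Keeping the extraction logic separate from the handler makes it easy to exercise locally with a hand-built event before wiring it to an S3 bucket or SNS topic.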
Once this foundation is in place, you can augment the data lake with independent software vendor (ISV) and software as a service (SaaS) tools.
The deployment also includes an optional wizard and a sample dataset that is loaded into the Amazon Redshift cluster and Kinesis streams. The data lake wizard uses the dataset to demonstrate data lake capabilities such as search, transforms, queries, analytics, and visualization.
Read the entire article, Build a Data Lake Foundation on the AWS Cloud with AWS Services, via the fine folks at Amazon Web Services.