Questions and Answers

Question 17yAKeHZ8hrdV7tVD31d

Question

A data engineer is processing and analyzing multiple terabytes of raw data that is in Amazon S3. The data engineer needs to clean and prepare the data. Then the data engineer needs to load the data into Amazon Redshift for analytics.

The data engineer needs a solution that will give data analysts the ability to perform complex queries. The solution must eliminate the need to perform complex extract, transform, and load (ETL) processes or to manage infrastructure.

Which solution will meet these requirements with the LEAST operational overhead?

Choices

  • A: Use Amazon EMR to prepare the data. Use AWS Step Functions to load the data into Amazon Redshift. Use Amazon QuickSight to run queries.
  • B: Use AWS Glue DataBrew to prepare the data. Use AWS Glue to load the data into Amazon Redshift. Use Amazon Redshift to run queries.
  • C: Use AWS Lambda to prepare the data. Use Amazon Kinesis Data Firehose to load the data into Amazon Redshift. Use Amazon Athena to run queries.
  • D: Use AWS Glue to prepare the data. Use AWS Database Migration Service (AWS DMS) to load the data into Amazon Redshift. Use Amazon Redshift Spectrum to run queries.

Question 2Rd77GRTCqBJVTIpJGvi

Question

A company uses an AWS Lambda function to transfer files from a legacy SFTP environment to Amazon S3 buckets. The Lambda function is VPC enabled to ensure that all communications between the Lambda function and other AWS services that are in the same VPC environment will occur over a secure network.

The Lambda function is able to connect to the SFTP environment successfully. However, when the Lambda function attempts to upload files to the S3 buckets, the Lambda function returns timeout errors. A data engineer must resolve the timeout issues in a secure way.

Which solution will meet these requirements in the MOST cost-effective way?

Choices

  • A: Create a NAT gateway in the public subnet of the VPC. Route network traffic to the NAT gateway.
  • B: Create a VPC gateway endpoint for Amazon S3. Route network traffic to the VPC gateway endpoint.
  • C: Create a VPC interface endpoint for Amazon S3. Route network traffic to the VPC interface endpoint.
  • D: Use a VPC internet gateway to connect to the internet. Route network traffic to the VPC internet gateway.
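For readers unfamiliar with the two endpoint types named in choices B and C, the following is a minimal sketch of how their request parameters differ. It only builds the keyword arguments that would be passed to the EC2 `create_vpc_endpoint` call (for example through boto3) and does not contact AWS. The function name `s3_endpoint_request` and all IDs are hypothetical and chosen for illustration.

```python
def s3_endpoint_request(vpc_id, region, endpoint_type,
                        route_table_ids=None, subnet_ids=None,
                        security_group_ids=None):
    """Build keyword arguments for an EC2 create_vpc_endpoint request.

    Hypothetical helper: it only assembles a dict and makes no AWS call.
    """
    params = {
        "VpcId": vpc_id,
        "ServiceName": f"com.amazonaws.{region}.s3",
        "VpcEndpointType": endpoint_type,
    }
    if endpoint_type == "Gateway":
        # A gateway endpoint is attached to route tables.
        params["RouteTableIds"] = list(route_table_ids or [])
    elif endpoint_type == "Interface":
        # An interface endpoint places network interfaces in subnets
        # and is protected by security groups.
        params["SubnetIds"] = list(subnet_ids or [])
        params["SecurityGroupIds"] = list(security_group_ids or [])
    else:
        raise ValueError(f"unsupported endpoint type: {endpoint_type}")
    return params


# Example values below are placeholders, not real resources.
gateway = s3_endpoint_request(
    "vpc-0123456789abcdef0", "us-east-1", "Gateway",
    route_table_ids=["rtb-0123456789abcdef0"],
)
interface = s3_endpoint_request(
    "vpc-0123456789abcdef0", "us-east-1", "Interface",
    subnet_ids=["subnet-0123456789abcdef0"],
    security_group_ids=["sg-0123456789abcdef0"],
)
```

The resulting dict could be unpacked into a client call such as `ec2.create_vpc_endpoint(**gateway)`, assuming an EC2 client named `ec2` has been created elsewhere.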

Question IzBIvnb6XUfBu4cUw1gs

Question

A company reads data from customer databases that run on Amazon RDS. The databases contain many inconsistent fields. For example, a customer record field that is named place_id in one database is named location_id in another database. The company needs to link customer records across different databases, even when customer record fields do not match.

Which solution will meet these requirements with the LEAST operational overhead?

Choices

  • A: Create a provisioned Amazon EMR cluster to process and analyze data in the databases. Connect to the Apache Zeppelin notebook. Use the FindMatches transform to find duplicate records in the data.
  • B: Create an AWS Glue crawler to crawl the databases. Use the FindMatches transform to find duplicate records in the data. Evaluate and tune the transform by evaluating the performance and results.
  • C: Create an AWS Glue crawler to crawl the databases. Use Amazon SageMaker to construct Apache Spark ML pipelines to find duplicate records in the data.
  • D: Create a provisioned Amazon EMR cluster to process and analyze data in the databases. Connect to the Apache Zeppelin notebook. Use an Apache Spark ML model to find duplicate records in the data. Evaluate and tune the model by evaluating the performance and results.

Question ATffmBNIqO3Vhjb7e0c0

Question

A finance company receives data from third-party data providers and stores the data as objects in an Amazon S3 bucket.

The company ran an AWS Glue crawler on the objects to create a data catalog. The AWS Glue crawler created multiple tables. However, the company expected that the crawler would create only one table.

The company needs a solution that will ensure the AWS Glue crawler creates only one table.

Which combination of solutions will meet this requirement? (Choose two.)

Choices

  • A: Ensure that the object format, compression type, and schema are the same for each object.
  • B: Ensure that the object format and schema are the same for each object. Do not enforce consistency for the compression type of each object.
  • C: Ensure that the schema is the same for each object. Do not enforce consistency for the file format and compression type of each object.
  • D: Ensure that the structure of the prefix for each S3 object name is consistent.
  • E: Ensure that all S3 object names follow a similar pattern.

Question IzQseynAwdztRgTfMupX

Question

An application consumes messages from an Amazon Simple Queue Service (Amazon SQS) queue. The application experiences occasional downtime. As a result of the downtime, messages within the queue expire and are deleted after 1 day. The message deletions cause data loss for the application.

Which solutions will minimize data loss for the application? (Choose two.)

Choices

  • A: Increase the message retention period.
  • B: Increase the visibility timeout.
  • C: Attach a dead-letter queue (DLQ) to the SQS queue.
  • D: Use a delay queue to delay message delivery.
  • E: Reduce message processing time.
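
As background for the queue settings mentioned in the choices above, here is a minimal sketch of how a retention period and a dead-letter queue are expressed as SQS queue attributes. It only builds the `Attributes` dict that would be passed to the SQS `set_queue_attributes` call and does not contact AWS. The helper name `build_queue_attributes`, the ARN, and the default `max_receive_count` are hypothetical values chosen for illustration.

```python
import json

SECONDS_PER_DAY = 24 * 60 * 60
MIN_RETENTION_SECONDS = 60                    # SQS lower limit: 1 minute
MAX_RETENTION_SECONDS = 14 * SECONDS_PER_DAY  # SQS upper limit: 14 days


def build_queue_attributes(retention_days, dlq_arn=None, max_receive_count=5):
    """Build an SQS Attributes dict (hypothetical helper, no AWS call)."""
    retention_seconds = retention_days * SECONDS_PER_DAY
    if not MIN_RETENTION_SECONDS <= retention_seconds <= MAX_RETENTION_SECONDS:
        raise ValueError("retention must be between 1 minute and 14 days")

    # SQS attribute values are passed as strings.
    attributes = {"MessageRetentionPeriod": str(retention_seconds)}

    if dlq_arn is not None:
        # The redrive policy is a JSON document naming the dead-letter
        # queue and how many receives are allowed before a message moves.
        attributes["RedrivePolicy"] = json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": max_receive_count,
        })
    return attributes


# The ARN below is a placeholder, not a real queue.
attrs = build_queue_attributes(
    retention_days=14,
    dlq_arn="arn:aws:sqs:us-east-1:123456789012:example-dlq",
)
```

The dict could then be supplied as `Attributes=attrs` to a call such as `sqs.set_queue_attributes(QueueUrl=..., Attributes=attrs)`, assuming an SQS client named `sqs` and a queue URL obtained elsewhere.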