Questions and Answers

Question lF89nQz3ngxh6PM1Yc1R

Question

A data engineer wants to improve the performance of SQL queries in Amazon Athena that run against a sales data table.

The data engineer wants to understand the execution plan of a specific SQL statement. The data engineer also wants to see the computational cost of each operation in a SQL query.

Which statement does the data engineer need to run to meet these requirements?

Choices

  • A: EXPLAIN SELECT * FROM sales;
  • B: EXPLAIN ANALYZE FROM sales;
  • C: EXPLAIN ANALYZE SELECT * FROM sales;
  • D: EXPLAIN FROM sales;
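
For context, EXPLAIN in Athena returns a statement's execution plan without running it, while EXPLAIN ANALYZE runs the statement and also reports the computational cost of each operation. A minimal sketch of submitting such a statement through the Athena API with boto3 (the database name and S3 output location are hypothetical):

    import boto3

    athena = boto3.client("athena")

    # EXPLAIN ANALYZE executes the query and returns per-operation cost statistics;
    # plain EXPLAIN would only return the plan without executing the query.
    response = athena.start_query_execution(
        QueryString="EXPLAIN ANALYZE SELECT * FROM sales",
        QueryExecutionContext={"Database": "sales_db"},  # hypothetical database
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},  # hypothetical bucket
    )
    print(response["QueryExecutionId"])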

Question ypALFrZLCZIyw8BIdHLU

Question

A data engineer needs to schedule a workflow that runs a set of AWS Glue jobs every day. The data engineer does not require the Glue jobs to run or finish at a specific time.

Which solution will run the Glue jobs in the MOST cost-effective way?

Choices

  • A: Choose the FLEX execution class in the Glue job properties.
  • B: Use the Spot Instance type in Glue job properties.
  • C: Choose the STANDARD execution class in the Glue job properties.
  • D: Choose the latest version in the GlueVersion field in the Glue job properties.
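
For context, the execution class is a property of the Glue job itself. A minimal sketch, assuming boto3 and hypothetical job, role, and script names, of creating a job with the FLEX execution class, which runs on spare capacity and suits jobs that do not need to start or finish at a specific time:

    import boto3

    glue = boto3.client("glue")

    # FLEX runs the job on spare capacity at a lower cost than STANDARD (the default),
    # trading guaranteed start times for savings on non-urgent workloads.
    glue.create_job(
        Name="daily-sales-etl",  # hypothetical job name
        Role="arn:aws:iam::111122223333:role/GlueJobRole",  # hypothetical role ARN
        Command={
            "Name": "glueetl",
            "ScriptLocation": "s3://example-glue-scripts/daily_sales_etl.py",  # hypothetical script
            "PythonVersion": "3",
        },
        GlueVersion="4.0",
        WorkerType="G.1X",
        NumberOfWorkers=10,
        ExecutionClass="FLEX",
    )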

Question XL69xrSDiTPxT7DGhuc4

Question

A company plans to provision a log delivery stream within a VPC. The company configured the VPC flow logs to publish to Amazon CloudWatch Logs. The company needs to send the flow logs to Splunk in near real time for further analysis.

Which solution will meet these requirements with the LEAST operational overhead?

Choices

  • A: Configure an Amazon Kinesis Data Streams data stream to use Splunk as the destination. Create a CloudWatch Logs subscription filter to send log events to the data stream.
  • B: Create an Amazon Kinesis Data Firehose delivery stream to use Splunk as the destination. Create a CloudWatch Logs subscription filter to send log events to the delivery stream.
  • C: Create an Amazon Kinesis Data Firehose delivery stream to use Splunk as the destination. Create an AWS Lambda function to send the flow logs from CloudWatch Logs to the delivery stream.
  • D: Configure an Amazon Kinesis Data Streams data stream to use Splunk as the destination. Create an AWS Lambda function to send the flow logs from CloudWatch Logs to the data stream.
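
For context, CloudWatch Logs can stream log events to a Kinesis Data Firehose delivery stream through a subscription filter, and Firehose supports Splunk as a delivery destination, so no custom code is needed. A minimal sketch of the subscription-filter side, assuming boto3 and hypothetical log group, delivery stream, and role names:

    import boto3

    logs = boto3.client("logs")

    # Forward VPC flow log events from CloudWatch Logs to an existing Firehose
    # delivery stream that is configured with Splunk as its destination.
    logs.put_subscription_filter(
        logGroupName="/vpc/flow-logs",  # hypothetical log group
        filterName="flow-logs-to-splunk",
        filterPattern="",  # empty pattern matches all events
        destinationArn="arn:aws:firehose:us-east-1:111122223333:deliverystream/splunk-delivery",  # hypothetical
        roleArn="arn:aws:iam::111122223333:role/CWLtoFirehoseRole",  # hypothetical role
    )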

Question Vn7FPeskwiBIURn1NckS

Question

A company has a data lake on AWS. The data lake ingests data from the company's business units. The company uses Amazon Athena for queries. The storage layer is Amazon S3, with the AWS Glue Data Catalog as the metadata repository.

The company wants to make the data available to data scientists and business analysts. However, the company first needs to manage fine-grained, column-level data access for Athena based on users' roles and responsibilities.

Which solution will meet these requirements?

Choices

  • A: Set up AWS Lake Formation. Define security policy-based rules for the users and applications by IAM role in Lake Formation.
  • B: Define an IAM resource-based policy for AWS Glue tables. Attach the same policy to IAM user groups.
  • C: Define an IAM identity-based policy for AWS Glue tables. Attach the same policy to IAM roles. Associate the IAM roles with IAM groups that contain the users.
  • D: Create a resource share in AWS Resource Access Manager (AWS RAM) to grant access to IAM users.
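
For context, AWS Lake Formation can grant column-level SELECT permissions on Data Catalog tables to IAM principals, and Athena enforces those grants at query time. A minimal sketch, assuming boto3 and hypothetical database, table, column, and role names:

    import boto3

    lakeformation = boto3.client("lakeformation")

    # Grant SELECT on only two columns of the sales table to an analyst role;
    # Athena queries that run under this role can read just those columns.
    lakeformation.grant_permissions(
        Principal={
            "DataLakePrincipalIdentifier": "arn:aws:iam::111122223333:role/BusinessAnalystRole"  # hypothetical
        },
        Resource={
            "TableWithColumns": {
                "DatabaseName": "sales_db",  # hypothetical database
                "Name": "sales",  # hypothetical table
                "ColumnNames": ["order_id", "region"],  # hypothetical columns
            }
        },
        Permissions=["SELECT"],
    )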

Question n5PPm7nmYeLrZ1gXh6P1

Question

A company has developed several AWS Glue extract, transform, and load (ETL) jobs to validate and transform data from Amazon S3. The ETL jobs load the data into Amazon RDS for MySQL in batches once every day. The ETL jobs use a DynamicFrame to read the S3 data.

The ETL jobs currently process all the data that is in the S3 bucket. However, the company wants the jobs to process only the daily incremental data.

Which solution will meet this requirement with the LEAST coding effort?

Choices

  • A: Create an ETL job that reads the S3 file status and logs the status in Amazon DynamoDB.
  • B: Enable job bookmarks for the ETL jobs to update the state after a run to keep track of previously processed data.
  • C: Enable job metrics for the ETL jobs to help keep track of processed objects in Amazon CloudWatch.
  • D: Configure the ETL jobs to delete processed objects from Amazon S3 after each run.
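
For context, AWS Glue job bookmarks persist state between runs so that a job reading from Amazon S3 with a DynamicFrame processes only objects it has not seen before. A minimal sketch of enabling them when starting a run, assuming boto3 and a hypothetical job name:

    import boto3

    glue = boto3.client("glue")

    # With job bookmarks enabled, Glue tracks which S3 objects each run has processed
    # and skips them on the next run, so only the daily incremental data is read.
    glue.start_job_run(
        JobName="daily-s3-to-rds-etl",  # hypothetical job name
        Arguments={"--job-bookmark-option": "job-bookmark-enable"},
    )

In the job script itself, the DynamicFrame source read also needs a transformation_ctx argument so the bookmark can track its state between runs.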