Questions and Answers

Question FrYN3PqCHR4dXLX5LARA

Question

An online retailer uses multiple delivery partners to deliver products to customers. The delivery partners send order summaries to the retailer. The retailer stores the order summaries in Amazon S3.

Some of the order summaries contain personally identifiable information (PII) about customers. A data engineer needs to detect PII in the order summaries so that the retailer can redact it.

Which solution will meet these requirements with the LEAST operational overhead?

Choices

  • A: Amazon Textract
  • B: Amazon S3 Storage Lens
  • C: Amazon Macie
  • D: Amazon SageMaker Data Wrangler
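
For context on the kind of setup the scenario describes, the sketch below shows how a one-time Amazon Macie classification job over an S3 bucket could be started with boto3. The account ID, bucket name, and job name are placeholders chosen for illustration; this is a sketch of the API call, not an answer key.

    import boto3

    macie = boto3.client("macie2")

    # Start a one-time Macie classification job that scans the bucket holding
    # the order summaries for sensitive data such as PII.
    # Account ID, bucket name, and job name are placeholders.
    response = macie.create_classification_job(
        jobType="ONE_TIME",
        name="order-summary-pii-scan",
        s3JobDefinition={
            "bucketDefinitions": [
                {"accountId": "111122223333", "buckets": ["order-summaries-bucket"]}
            ]
        },
    )
    print(response["jobId"])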

Question gGTCp95FPEoQvnluWJfq

Question

A retail company has a customer data hub in an Amazon S3 bucket. Employees from many countries use the data hub to support company-wide analytics. A governance team must ensure that the company’s data analysts can access data only for customers who are within the same country as the analysts.

Which solution will meet these requirements with the LEAST operational effort?

Choices

  • A: Create a separate table for each country’s customer data. Provide access to each analyst based on the country that the analyst serves.
  • B: Register the S3 bucket as a data lake location in AWS Lake Formation. Use the Lake Formation row-level security features to enforce the company’s access policies.
  • C: Move the data to AWS Regions that are close to the countries where the customers are. Provide access to each analyst based on the country that the analyst serves.
  • D: Load the data into Amazon Redshift. Create a view for each country. Create separate IAM roles for each country to provide access to data from each country. Assign the appropriate roles to the analysts.
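
As background on the row-level security option, the sketch below shows a minimal AWS Lake Formation data cells filter created with boto3, assuming a hypothetical catalog table named customers in a customer_data_hub database. The filter would still need to be granted to each analyst group’s principal (for example, with grant_permissions) before it takes effect.

    import boto3

    lf = boto3.client("lakeformation")

    # Define a row-level filter so that principals granted this filter see only
    # rows where country = 'US'. Catalog ID, database, table, and filter names
    # are placeholders.
    lf.create_data_cells_filter(
        TableData={
            "TableCatalogId": "111122223333",
            "DatabaseName": "customer_data_hub",
            "TableName": "customers",
            "Name": "us_analysts_only",
            "RowFilter": {"FilterExpression": "country = 'US'"},
            "ColumnWildcard": {},  # expose all columns; restrict rows only
        }
    )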

Question 2V920AXUufLcGYllQpMI

Question

A company is migrating on-premises workloads to AWS. The company wants to reduce overall operational overhead. The company also wants to explore serverless options. The company’s current workloads use Apache Pig, Apache Oozie, Apache Spark, Apache HBase, and Apache Flink. The on-premises workloads process petabytes of data in seconds. The company must maintain similar or better performance after the migration to AWS.

Which extract, transform, and load (ETL) service will meet these requirements?

Choices

  • A: AWS Glue
  • B: Amazon EMR
  • C: AWS Lambda
  • D: Amazon Redshift
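
For reference, the sketch below shows how an Amazon EMR cluster with the frameworks named in the scenario (Spark, HBase, Flink, Pig, Oozie) could be launched through boto3. The release label, instance types, subnet, and log location are assumptions chosen for illustration, not prescribed values.

    import boto3

    emr = boto3.client("emr")

    # Launch an EMR cluster with the Hadoop-ecosystem applications the existing
    # workloads already use. Subnet, log URI, and sizing are placeholders.
    response = emr.run_job_flow(
        Name="migrated-etl-cluster",
        ReleaseLabel="emr-6.15.0",
        Applications=[
            {"Name": "Spark"},
            {"Name": "HBase"},
            {"Name": "Flink"},
            {"Name": "Pig"},
            {"Name": "Oozie"},
        ],
        Instances={
            "InstanceGroups": [
                {"Name": "Primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "Core", "InstanceRole": "CORE",
                 "InstanceType": "m5.2xlarge", "InstanceCount": 4},
            ],
            "Ec2SubnetId": "subnet-0123456789abcdef0",
            "KeepJobFlowAliveWhenNoSteps": True,
        },
        JobFlowRole="EMR_EC2_DefaultRole",
        ServiceRole="EMR_DefaultRole",
        LogUri="s3://example-emr-logs/",
    )
    print(response["JobFlowId"])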

Question hCILu73hTymVEyFTxAnm

Question

A company has an Amazon Redshift data warehouse that users access by using a variety of IAM roles. More than 100 users access the data warehouse every day.

The company wants to control user access to database objects based on each user’s job role, permissions, and the sensitivity of the data.

Which solution will meet these requirements?

Choices

  • A: Use the role-based access control (RBAC) feature of Amazon Redshift.
  • B: Use the row-level security (RLS) feature of Amazon Redshift.
  • C: Use the column-level security (CLS) feature of Amazon Redshift.
  • D: Use dynamic data masking policies in Amazon Redshift.
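
To illustrate what Amazon Redshift role-based, object-level controls look like in practice, the sketch below runs a few RBAC statements through the Redshift Data API. The cluster, database, schema, role, and user names are all hypothetical.

    import boto3

    rsd = boto3.client("redshift-data")

    # Create a role, scope its object privileges, and attach it to a user.
    # Cluster, database, schema, role, and user names are placeholders.
    rsd.batch_execute_statement(
        ClusterIdentifier="analytics-cluster",
        Database="dev",
        DbUser="admin",
        Sqls=[
            "CREATE ROLE finance_analyst;",
            "GRANT USAGE ON SCHEMA finance TO ROLE finance_analyst;",
            "GRANT SELECT ON ALL TABLES IN SCHEMA finance TO ROLE finance_analyst;",
            "GRANT ROLE finance_analyst TO alice;",
        ],
    )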

Question rrab1DCpzsdjD8jv4jv3

Question

A company uses Amazon DataZone as a data governance and business catalog solution. The company stores data in an Amazon S3 data lake. The company uses AWS Glue with an AWS Glue Data Catalog.

A data engineer needs to publish AWS Glue Data Quality scores to the Amazon DataZone portal.

Which solution will meet this requirement?

Choices

  • A: Create a data quality ruleset with Data Quality Definition Language (DQDL) rules that apply to a specific AWS Glue table. Schedule the ruleset to run daily. Configure the Amazon DataZone project to have an Amazon Redshift data source. Enable the data quality configuration for the data source.
  • B: Configure AWS Glue ETL jobs to use an Evaluate Data Quality transform. Define a data quality ruleset inside the jobs. Configure the Amazon DataZone project to have an AWS Glue data source. Enable the data quality configuration for the data source.
  • C: Create a data quality ruleset with Data Quality Definition Language (DQDL) rules that apply to a specific AWS Glue table. Schedule the ruleset to run daily. Configure the Amazon DataZone project to have an AWS Glue data source. Enable the data quality configuration for the data source.
  • D: Configure AWS Glue ETL jobs to use an Evaluate Data Quality transform. Define a data quality ruleset inside the jobs. Configure the Amazon DataZone project to have an Amazon Redshift data source. Enable the data quality configuration for the data source.
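
As background on the standalone-ruleset path mentioned in the choices, the sketch below creates and runs a DQDL ruleset against an AWS Glue Data Catalog table with boto3; the database, table, ruleset name, and IAM role are placeholders. The other path the choices describe is to embed an Evaluate Data Quality transform inside the Glue ETL job itself.

    import boto3

    glue = boto3.client("glue")

    # Attach a DQDL ruleset to a Data Catalog table. Names are placeholders.
    glue.create_data_quality_ruleset(
        Name="orders_dq_ruleset",
        Ruleset='Rules = [ IsComplete "order_id", ColumnValues "quantity" > 0 ]',
        TargetTable={"DatabaseName": "sales_db", "TableName": "orders"},
    )

    # Run the ruleset against the table so data quality scores are produced.
    glue.start_data_quality_ruleset_evaluation_run(
        DataSource={"GlueTable": {"DatabaseName": "sales_db", "TableName": "orders"}},
        Role="arn:aws:iam::111122223333:role/GlueDataQualityRole",
        RulesetNames=["orders_dq_ruleset"],
    )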