Questions and Answers
Question 7mDRLgkNgFtYj8PDK1R6
Question
A company saves customer data to an Amazon S3 bucket. The company uses server-side encryption with AWS KMS keys (SSE-KMS) to encrypt the bucket. The dataset includes personally identifiable information (PII) such as social security numbers and account details.
Data that is tagged as PII must be masked before the company uses customer data for analysis. Some users must have secure access to the PII data during the pre-processing phase. The company needs a low-maintenance solution to mask and secure the PII data throughout the entire engineering pipeline.
Which combination of solutions will meet these requirements? (Choose two.)
Choices
- A: Use AWS Glue DataBrew to perform extract, transform, and load (ETL) tasks that mask the PII data before analysis.
- B: Use Amazon GuardDuty to monitor access patterns for the PII data that is used in the engineering pipeline.
- C: Configure an Amazon Macie discovery job for the S3 bucket.
- D: Use AWS Identity and Access Management (IAM) to manage permissions and to control access to the PII data.
- E: Write custom scripts in an application to mask the PII data and to control access.
Answer
Answer: AD | Answer_ET: AD | Community answer: AD (100%)
Discussion
Comment 1341195 by MerryLew
- Upvotes: 1
Selected Answer: AD. A will find and mask the PII; D controls access.
Comment 1330789 by HagarTheHorrible
- Upvotes: 1
Selected Answer: AD. A for data masking and D for access.
Comment 1317347 by emupsx1
- Upvotes: 1
Selected Answer: AD https://aws.amazon.com/tw/blogs/big-data/build-a-data-pipeline-to-automatically-discover-and-mask-pii-data-with-aws-glue-databrew/
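A minimal sketch of how the two selected options could be wired together with boto3, assuming a DataBrew masking recipe has already been authored (as in the blog post linked above): a recipe job applies the masking before analysis (option A), and an IAM policy limits reads of PII-tagged objects to the pre-processing role (option D). The bucket, dataset, recipe, role, and tag names are hypothetical.

```python
# Hypothetical sketch: attach an existing DataBrew masking recipe to a job (A)
# and scope PII access with an IAM policy (D). All names and ARNs are placeholders.
import json
import boto3

databrew = boto3.client("databrew")
iam = boto3.client("iam")

# A: run a DataBrew job that applies a previously authored PII-masking recipe
# (for example, hashing or redaction steps defined in the DataBrew console).
databrew.create_recipe_job(
    Name="mask-pii-job",                                        # hypothetical job name
    RoleArn="arn:aws:iam::123456789012:role/databrew-pii-role",  # placeholder role
    DatasetName="customer-raw",                                  # dataset over the raw S3 prefix
    RecipeReference={"Name": "pii-masking-recipe", "RecipeVersion": "1.0"},
    Outputs=[{"Location": {"Bucket": "analytics-masked", "Key": "customers/"}}],
)

# D: IAM policy that lets only the pre-processing role read objects tagged as PII.
pii_read_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject"],
        "Resource": "arn:aws:s3:::customer-raw/*",
        "Condition": {"StringEquals": {"s3:ExistingObjectTag/classification": "PII"}},
    }],
}
iam.create_policy(
    PolicyName="preprocessing-pii-read",
    PolicyDocument=json.dumps(pii_read_policy),
)
```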
Question GzSWlvFbGBQUgT5LhYAR
Question
A data engineer is launching an Amazon EMR cluster. The data that the data engineer needs to load into the new cluster is currently in an Amazon S3 bucket. The data engineer needs to ensure that data is encrypted both at rest and in transit.
The data that is in the S3 bucket is encrypted by an AWS Key Management Service (AWS KMS) key. The data engineer has an Amazon S3 path that has a Privacy Enhanced Mail (PEM) file.
Which solution will meet these requirements?
Choices
- A: Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Create a second security configuration. Specify the Amazon S3 path of the PEM file for in-transit encryption. Create the EMR cluster, and attach both security configurations to the cluster.
- B: Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for local disk encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Use the security configuration during EMR cluster creation.
- C: Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Use the security configuration during EMR cluster creation.
- D: Create an Amazon EMR security configuration. Specify the appropriate AWS KMS key for at-rest encryption for the S3 bucket. Specify the Amazon S3 path of the PEM file for in-transit encryption. Create the EMR cluster, and attach the security configuration to the cluster.
Answer
Answer: C | Answer_ET: C | Community answer: C (50%), D (50%)
Discussion
Comment 1360519 by italiancloud2025
- Upvotes: 1
Selected Answer: D. Yes, because it creates a single security configuration that specifies both at-rest encryption (with the KMS key) and in-transit encryption (using the PEM file), and attaches that configuration to the cluster during creation.
Comment 1317352 by emupsx1
- Upvotes: 1
Selected Answer: C https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-specify-security-configuration.html
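A minimal sketch of the chosen approach with boto3, assuming hypothetical names, ARNs, and S3 paths: a single EMR security configuration carries both the SSE-KMS at-rest settings and the PEM-based in-transit settings, and is referenced when the cluster is created.

```python
# Hypothetical sketch: one security configuration covering both at-rest (SSE-KMS)
# and in-transit (PEM certificates in S3) encryption. Key ARN, S3 path, and
# cluster details are placeholders.
import json
import boto3

emr = boto3.client("emr")

security_config = {
    "EncryptionConfiguration": {
        "EnableAtRestEncryption": True,
        "EnableInTransitEncryption": True,
        "AtRestEncryptionConfiguration": {
            # SSE-KMS for EMRFS data in the S3 bucket.
            "S3EncryptionConfiguration": {
                "EncryptionMode": "SSE-KMS",
                "AwsKmsKey": "arn:aws:kms:us-east-1:123456789012:key/example-key-id",
            }
        },
        "InTransitEncryptionConfiguration": {
            # Zipped PEM certificates stored at the given S3 path.
            "TLSCertificateConfiguration": {
                "CertificateProviderType": "PEM",
                "S3Object": "s3://example-bucket/certs/my-certs.zip",
            }
        },
    }
}

emr.create_security_configuration(
    Name="emr-at-rest-and-in-transit",
    SecurityConfiguration=json.dumps(security_config),
)

# Reference the single security configuration during cluster creation.
emr.run_job_flow(
    Name="encrypted-cluster",
    ReleaseLabel="emr-6.15.0",
    Instances={"InstanceCount": 3, "MasterInstanceType": "m5.xlarge",
               "SlaveInstanceType": "m5.xlarge"},
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
    SecurityConfiguration="emr-at-rest-and-in-transit",
)
```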
Question anmm1tWi66DStCtixWBQ
Question
A retail company is using an Amazon Redshift cluster to support real-time inventory management. The company has deployed an ML model on a real-time endpoint in Amazon SageMaker.
The company wants to make real-time inventory recommendations. The company also wants to make predictions about future inventory needs.
Which solutions will meet these requirements? (Choose two.)
Choices
- A: Use Amazon Redshift ML to generate inventory recommendations.
- B: Use SQL to invoke a remote SageMaker endpoint for prediction.
- C: Use Amazon Redshift ML to schedule regular data exports for offline model training.
- D: Use SageMaker Autopilot to create inventory management dashboards in Amazon Redshift.
- E: Use Amazon Redshift as a file storage system to archive old inventory management reports.
Answer
Answer: AB | Answer_ET: AB | Community answer: AB (100%)
Discussion
Comment 1341198 by MerryLew
- Upvotes: 1
Selected Answer: AB. A and B. Redshift ML for data exports? No. SageMaker Autopilot is for building, training, and deploying models, not dashboards. Redshift as a file storage system? No.
Comment 1319983 by emupsx1
- Upvotes: 1
Selected Answer: AB. The company wants to make real-time inventory recommendations, so select (A) for recommendations. The company also wants to make predictions about future inventory needs, so select (B) for predictions.
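A minimal sketch of options A and B, submitted through the Redshift Data API with boto3. The cluster, database, IAM role, table, column, and SageMaker endpoint names are hypothetical: the first statement trains a model in place with Redshift ML, and the second registers the existing real-time SageMaker endpoint so plain SQL can invoke it for predictions.

```python
# Hypothetical sketch of A (Redshift ML) and B (SQL invoking a SageMaker endpoint),
# run through the Redshift Data API. All identifiers are placeholders.
import boto3

rsd = boto3.client("redshift-data")

def run_sql(sql: str) -> None:
    # Submit a statement asynchronously against the inventory cluster.
    rsd.execute_statement(
        ClusterIdentifier="inventory-cluster",
        Database="dev",
        DbUser="awsuser",
        Sql=sql,
    )

# A: Redshift ML trains a model in place and exposes it as a SQL function.
run_sql("""
CREATE MODEL inventory_recommendation
FROM (SELECT item_id, stock_level, daily_sales, reorder_flag FROM inventory_history)
TARGET reorder_flag
FUNCTION predict_reorder
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftMLRole'
SETTINGS (S3_BUCKET 'redshift-ml-artifacts');
""")

# B: a "bring your own endpoint" model delegates inference to the existing
# real-time SageMaker endpoint, invoked from plain SQL.
run_sql("""
CREATE MODEL remote_inventory_forecast
FUNCTION predict_future_demand(int, float, float)
RETURNS float
SAGEMAKER 'inventory-forecast-endpoint'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftMLRole';
""")
run_sql("SELECT item_id, predict_future_demand(item_id, stock_level, daily_sales) FROM inventory;")
```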
Question CdT3G4xWMQKLJuduAHAd
Question
A company stores CSV files in an Amazon S3 bucket. A data engineer needs to process the data in the CSV files and store the processed data in a new S3 bucket.
The process needs to rename a column, remove specific columns, ignore the second row of each file, create a new column based on the values of the first row of the data, and filter the results by a numeric value of a column.
Which solution will meet these requirements with the LEAST development effort?
Choices
- A: Use AWS Glue Python jobs to read and transform the CSV files.
- B: Use an AWS Glue custom crawler to read and transform the CSV files.
- C: Use an AWS Glue workflow to build a set of jobs to crawl and transform the CSV files.
- D: Use AWS Glue DataBrew recipes to read and transform the CSV files.
Answer
Answer: D | Answer_ET: D | Community answer: D (100%)
Discussion
Comment 1330781 by HagarTheHorrible
- Upvotes: 1
Selected Answer: D. These are all more or less common operations, and all of them are available in DataBrew.
Comment 1317807 by emupsx1
- Upvotes: 1
Selected Answer: D https://docs.aws.amazon.com/databrew/latest/dg/recipes.html
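A minimal sketch of option D with boto3, assuming hypothetical dataset, bucket, role, and column names: a DataBrew recipe holds the column transformations and a recipe job writes the processed files to the new bucket. The step operation and parameter names below are illustrative and should be verified against the recipe action reference linked above.

```python
# Hypothetical sketch: define a DataBrew recipe in code and attach it to a job
# that writes processed CSVs to a new bucket. Names and step parameters are
# placeholders to be checked against the DataBrew recipe action reference.
import boto3

databrew = boto3.client("databrew")

databrew.create_recipe(
    Name="csv-cleanup-recipe",
    Steps=[
        # Rename a column.
        {"Action": {"Operation": "RENAME",
                    "Parameters": {"sourceColumn": "cust_id", "targetColumn": "customer_id"}}},
        # Remove columns that are not needed downstream.
        {"Action": {"Operation": "DELETE",
                    "Parameters": {"sourceColumns": '["ssn","internal_note"]'}}},
        # Further steps (skip the second row, derive a new column from the first
        # data row, filter on a numeric column) are added the same way.
    ],
)

databrew.create_recipe_job(
    Name="csv-cleanup-job",
    RoleArn="arn:aws:iam::123456789012:role/databrew-job-role",
    DatasetName="raw-csv-dataset",
    RecipeReference={"Name": "csv-cleanup-recipe"},
    Outputs=[{"Location": {"Bucket": "processed-csv-bucket", "Key": "clean/"},
              "Format": "CSV"}],
)
```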
Question D3tIYWgYonaVQ0E3J6JQ
Question
A company uses Amazon Redshift as its data warehouse. Data encoding is applied to the existing tables of the data warehouse. A data engineer discovers that the compression encoding applied to some of the tables is not the best fit for the data.
The data engineer needs to improve the data encoding for the tables that have sub-optimal encoding.
Which solution will meet this requirement?
Choices
- A: Run the ANALYZE command against the identified tables. Manually update the compression encoding of columns based on the output of the command.
- B: Run the ANALYZE COMPRESSION command against the identified tables. Manually update the compression encoding of columns based on the output of the command.
- C: Run the VACUUM REINDEX command against the identified tables.
- D: Run the VACUUM RECLUSTER command against the identified tables.
Answer
Answer: B | Answer_ET: B | Community answer: B (100%)
Discussion
Comment 1398880 by Ramdi1
- Upvotes: 1
Selected Answer: B. Amazon Redshift uses columnar storage with compression encoding to optimize query performance and reduce storage costs. Over time, sub-optimal encoding may lead to poor performance.
To determine the best compression encoding for a table, use the ANALYZE COMPRESSION command, which:
- Scans the table's data and suggests optimal encoding types for each column.
- Helps reduce storage size and improve query efficiency.
- Requires a manual column update because Amazon Redshift does not automatically apply new encodings.
Comment 1307120 by kupo777
- Upvotes: 2
Correct Answer: B
ANALYZE COMPRESSION Command: This command analyzes the data in the specified tables and provides recommendations for the best compression encoding for each column. It evaluates the current encoding and suggests more efficient options based on the actual data distribution.
Manual Update: After running the command, the data engineer can manually apply the recommended compression encodings to optimize storage and query performance.
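A minimal sketch of option B through the Redshift Data API with boto3, using hypothetical cluster, table, and column names: run ANALYZE COMPRESSION to get per-column recommendations, then apply a recommended encoding manually with ALTER TABLE.

```python
# Hypothetical sketch: get encoding recommendations, then update a column manually.
# Cluster, database, table, column, and encoding values are placeholders.
import boto3

rsd = boto3.client("redshift-data")

common = {"ClusterIdentifier": "warehouse-cluster", "Database": "dev", "DbUser": "awsuser"}

# Step 1: inspect the table; the result set suggests an encoding per column.
rsd.execute_statement(Sql="ANALYZE COMPRESSION sales;", **common)

# Step 2: after reviewing the recommendations, apply the suggested encoding
# column by column (Redshift does not change existing encodings automatically).
rsd.execute_statement(Sql="ALTER TABLE sales ALTER COLUMN quantity ENCODE az64;", **common)
```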