Questions and Answers
Question nXpfVxJb9ucmtzFhRX3l
Question
When scheduling Structured Streaming jobs for production, which configuration automatically recovers from query failures and keeps costs low?
Choices
- A: Cluster: New Job Cluster; Retries: Unlimited; Maximum Concurrent Runs: 1
- B: Cluster: New Job Cluster; Retries: Unlimited; Maximum Concurrent Runs: Unlimited
- C: Cluster: Existing All-Purpose Cluster; Retries: Unlimited; Maximum Concurrent Runs: 1
- D: Cluster: New Job Cluster; Retries: None; Maximum Concurrent Runs: 1
answer?
Answer: A Answer_ET: A Community answer A (100%) Discussion
Comment 1341676 by RandomForest
- Upvotes: 1
Selected Answer: A The correct answer is A: unlimited retries take care of query failures, while Maximum Concurrent Runs = 1 keeps costs low.
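The settings in answer A can be sketched as a job payload. This is an illustrative Python dict using field names from the Databricks Jobs API 2.1 (`max_concurrent_runs`, per-task `max_retries`, where `-1` means retry indefinitely); the job name, notebook path, and cluster sizing are made-up placeholders, not values from the question.

```python
# Illustrative job settings matching answer A: new job cluster,
# unlimited retries, and a single concurrent run.
job_settings = {
    "name": "streaming-ingest",              # hypothetical job name
    "max_concurrent_runs": 1,                # one run at a time keeps costs low
    "tasks": [
        {
            "task_key": "ingest",
            "notebook_task": {"notebook_path": "/Jobs/streaming_ingest"},
            "new_cluster": {                 # new job cluster, not all-purpose
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
            "max_retries": -1,               # -1 = retry indefinitely on failure
        }
    ],
}
```

A new job cluster is cheaper than keeping an all-purpose cluster running, the unlimited retries recover from query failures, and capping concurrent runs at 1 prevents overlapping streaming runs from doubling compute.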
Question J518MSBKdYmq6kts6s7u
Question
A Delta Lake table was created with the below query:
//IMG//
Realizing that the original query had a typographical error, the below code was executed:
ALTER TABLE prod.sales_by_stor RENAME TO prod.sales_by_store
Which result will occur after running the second command?
Choices
- A: The table reference in the metastore is updated.
- B: All related files and metadata are dropped and recreated in a single ACID transaction.
- C: The table name change is recorded in the Delta transaction log.
- D: A new Delta transaction log is created for the renamed table.
answer?
Answer: C Answer_ET: A Community answer C (100%) Discussion
Comment 1366230 by lakime
- Upvotes: 1
Selected Answer: C While the metastore is updated, the key mechanism for tracking changes in Delta Lake is the transaction log.
Question bwvxveoqZoTVk1M38Neu
Question
The data engineering team has configured a Databricks SQL query and alert to monitor the values in a Delta Lake table. The recent_sensor_recordings table contains an identifying sensor_id alongside the timestamp and temperature for the most recent 5 minutes of recordings.
The below query is used to create the alert:
//IMG//
The query is set to refresh each minute and always completes in less than 10 seconds. The alert is set to trigger when mean(temperature) > 120. Notifications are configured to be sent at most once every minute.
If this alert raises notifications for 3 consecutive minutes and then stops, which statement must be true?
Choices
- A: The total average temperature across all sensors exceeded 120 on three consecutive executions of the query
- B: The average temperature recordings for at least one sensor exceeded 120 on three consecutive executions of the query
- C: The source query failed to update properly for three consecutive minutes and then restarted
- D: The maximum temperature recording for at least one sensor exceeded 120 on three consecutive executions of the query
answer?
Answer: B Answer_ET: B Community answer B (100%) Discussion
Comment 1323181 by Thameur01
- Upvotes: 1
Selected Answer: B B, because the average temperature is calculated per sensor_id, not across all sensors.
Comment 1322902 by Ayomidetolu_A
- Upvotes: 1
Selected Answer: B B is the correct answer
Comment 1322878 by e904bf4
- Upvotes: 1
Selected Answer: B B is correct
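The alert query itself is only shown as an image, but the answer hinges on the aggregation being grouped by sensor_id. A plain-Python sketch of that logic, assuming a GROUP BY sensor_id query (the sensor IDs and temperatures below are made up for illustration):

```python
from statistics import mean

# Recent recordings as (sensor_id, temperature) pairs. Sample values only.
recordings = [
    ("s1", 115.0), ("s1", 118.0),   # sensor s1: mean 116.5 -> below threshold
    ("s2", 125.0), ("s2", 122.0),   # sensor s2: mean 123.5 -> above threshold
]

# Group temperatures by sensor, mirroring GROUP BY sensor_id.
by_sensor = {}
for sensor_id, temp in recordings:
    by_sensor.setdefault(sensor_id, []).append(temp)

per_sensor_mean = {s: mean(temps) for s, temps in by_sensor.items()}

# The alert fires if ANY per-sensor mean exceeds 120, so a notification
# only tells you at least one sensor crossed the threshold.
alert = any(m > 120 for m in per_sensor_mean.values())

# The overall average here is exactly 120.0, which would NOT trigger the
# alert -- this is why answer A is wrong and B is right.
overall_mean = mean(t for _, t in recordings)
```

In this sample the alert fires because sensor s2's mean is 123.5, even though the average across all sensors is only 120.0, matching answer B rather than A. D is wrong for the same reason: the alert tests the mean, not the maximum.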
Question qx4WW5rFC8h3DpzNFxaM
Question
The Databricks workspace administrator has configured interactive clusters for each of the data engineering groups. To control costs, clusters are set to terminate after 30 minutes of inactivity. Each user should be able to execute workloads against their assigned clusters at any time of the day.
Assuming users have been added to a workspace but not granted any permissions, which of the following describes the minimal permissions a user would need to start and attach to an already configured cluster?
Choices
- A: “Can Manage” privileges on the required cluster
- B: Cluster creation allowed, “Can Restart” privileges on the required cluster
- C: Cluster creation allowed, “Can Attach To” privileges on the required cluster
- D: “Can Restart” privileges on the required cluster
answer?
Answer: D Answer_ET: D Community answer D (86%) 14% Discussion
Comment 1329197 by UrcoIbz
- Upvotes: 4
Selected Answer: D “Can Restart” privileges are needed to start the cluster and attach a notebook.
“Can Attach To” does not grant enough privileges to start a cluster.
https://docs.databricks.com/en/security/auth/access-control/index.html#clusters
Comment 1326066 by Thameur01
- Upvotes: 1
Selected Answer: C “Can Attach To” Privileges:
The “Can Attach To” permission is sufficient for users to run workloads on an existing cluster. This permission allows users to attach notebooks or jobs to a cluster without needing additional management permissions.
Cluster Creation Allowed:
Allowing cluster creation ensures users can create clusters if needed. However, for this scenario it isn’t mandatory because clusters are already configured; the user needs only “Can Attach To” privileges.
Comment 1322518 by temple1305
- Upvotes: 2
Selected Answer: D “Can Restart” privileges - it was even discussed
Question ejZvjeE9crg30V9dsovh
Question
The data science team has created and logged a production model using MLflow. The following code correctly imports and applies the production model to output the predictions as a new DataFrame named preds with the schema “customer_id LONG, predictions DOUBLE, date DATE”.
//IMG//
The data science team would like predictions saved to a Delta Lake table with the ability to compare all predictions across time. Churn predictions will be made at most once per day.
Which code block accomplishes this task while minimizing potential compute costs?
Choices
- A: preds.write.mode("append").saveAsTable("churn_preds")
- B: preds.write.format("delta").save("/preds/churn_preds")
- C:
- D:
answer?
Answer: A Answer_ET: A Community answer A (100%) Discussion
Comment 1340342 by lene
- Upvotes: 1
Selected Answer: A batch+append
Comment 1337401 by mouthwash
- Upvotes: 1
Selected Answer: A A is the right answer.
Comment 1322521 by temple1305
- Upvotes: 3
Selected Answer: A You need:
- Batch operation since it is at most once a day
- Append, since you need to keep track of past predictions
A is the correct answer. You don’t need to specify “format” when you use saveAsTable.