Questions and Answers
Question MI5jKvqslZrq06vTK1eX
Question
A dataset has been defined using Delta Live Tables and includes an expectations clause:
CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW
What is the expected behavior when a batch of data containing data that violates these constraints is processed?
Choices
- A: Records that violate the expectation cause the job to fail.
- B: Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.
- C: Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
- D: Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.
answer?
Answer: C Answer_ET: C Community answer C (83%) A (17%) Discussion
Comment 1272972 by 9d4d68a
- Upvotes: 2
Repeated, Correct answer is C
Comment 1236566 by vigaro
- Upvotes: 2
Selected Answer: C ON VIOLATION DROP ROW
Comment 1230490 by 31cadd7
- Upvotes: 2
Selected Answer: C it’s C
Comment 1220760 by d39c1db
- Upvotes: 3
C. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
When a constraint defined using the EXPECT clause is violated, Delta Live Tables will drop the records that violate the expectation from the target dataset. Additionally, information about the dropped records and the reason for their exclusion will be recorded in the event log for audit and monitoring purposes. This ensures that only valid data meeting the specified constraints is included in the target dataset.
Comment 1215994 by PreranaC
- Upvotes: 1
Selected Answer: C C should be correct, A is for ON VIOLATION FAIL UPDATE
Comment 1215992 by PreranaC
- Upvotes: 1
Selected Answer: A A should be correct
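The drop-row semantics discussed above can be sketched in plain Python. This is a minimal simulation of what `EXPECT ... ON VIOLATION DROP ROW` does, not the actual Delta Live Tables runtime; the `records` batch and `event_log` list are hypothetical stand-ins:

```python
from datetime import date

# Hypothetical input batch; names and values are illustrative only.
records = [
    {"id": 1, "timestamp": date(2021, 6, 1)},
    {"id": 2, "timestamp": date(2019, 3, 15)},  # violates the expectation
]

event_log = []  # stands in for the DLT event log

def apply_expectation(rows, name, predicate):
    """Simulate EXPECT ... ON VIOLATION DROP ROW: keep only passing rows
    and record the dropped-row count in the event log."""
    kept = [r for r in rows if predicate(r)]
    event_log.append({"expectation": name, "dropped_records": len(rows) - len(kept)})
    return kept

target = apply_expectation(
    records, "valid_timestamp", lambda r: r["timestamp"] > date(2020, 1, 1)
)
```

The violating row never reaches `target`, but its drop is still visible in the event log, which is exactly the behavior answer C describes.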
Question hlo6no0YikmL1mMhHYQO
Question
A data engineer has a Job with multiple tasks that runs nightly. Each of the tasks runs slowly because the clusters take a long time to start.
Which action can the data engineer perform to improve the start up time for the clusters used for the Job?
Choices
- A: They can use endpoints available in Databricks SQL
- B: They can use jobs clusters instead of all-purpose clusters
- C: They can configure the clusters to autoscale for larger data sizes
- D: They can use clusters that are from a cluster pool
answer?
Answer: D Answer_ET: D Community answer D (100%) Discussion
Comment 1327323 by MultiCloudIronMan
- Upvotes: 1
Selected Answer: D The correct answer is D. They can use clusters that are from a cluster pool. Using clusters from a cluster pool can significantly reduce the start-up time because the clusters are pre-configured and ready to be used, which eliminates the need to wait for new clusters to be created and started.
Comment 1299624 by RandomForest
- Upvotes: 1
Selected Answer: D Pools are a set of idle, ready-to-use instances, which minimizes start-up times
Comment 1272971 by 9d4d68a
- Upvotes: 1
Repeated, Correct
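For reference, attaching a job cluster to a pool is done through the `instance_pool_id` field of the cluster spec in the Jobs API. A minimal sketch, where the pool ID and Spark version are placeholder values:

```python
# Sketch of a Jobs API cluster spec that draws nodes from a pool.
# The pool ID and Spark version below are placeholders, not real values.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",
    "instance_pool_id": "pool-1234-example",  # idle pool instances attach faster
    "num_workers": 2,
}
```

Because pool instances are already provisioned and idle, a job cluster built from this spec skips most of the cloud-provider startup wait.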
Question HmfSB6m4BqIbGg8ykrPX
Question
A data engineer has a single-task Job that runs each morning before they begin working. After identifying an upstream data issue, they need to set up another task to run a new notebook prior to the original task.
Which approach can the data engineer use to set up the new task?
Choices
- A: They can clone the existing task in the existing Job and update it to run the new notebook.
- B: They can create a new task in the existing Job and then add it as a dependency of the original task.
- C: They can create a new task in the existing Job and then add the original task as a dependency of the new task.
- D: They can create a new job from scratch and add both tasks to run concurrently.
answer?
Answer: B Answer_ET: B Community answer B (80%) C (20%) Discussion
Comment 1409737 by Billybob0604
- Upvotes: 1
Selected Answer: C No, the new notebook needs to run prior to the original task, meaning the original task depends on the new notebook, hence C.
Comment 1282733 by CommanderBigMac
- Upvotes: 1
Selected Answer: B B is correct. The new task needs to be a dependency of the original task.
Comment 1272970 by 9d4d68a
- Upvotes: 1
Correct Answer: B
Explanation: To set up the new task to run a new notebook prior to the original task in a single-task Job, the data engineer can use the following approach: In the existing Job, create a new task that corresponds to the new notebook that needs to be run. Set up the new task with the appropriate configuration, specifying the notebook to be executed and any necessary parameters or dependencies. Once the new task is created, designate it as a dependency of the original task in the Job configuration. This ensures that the new task is executed before the original task.
Comment 1227362 by hussamAlHunaiti
- Upvotes: 1
Selected Answer: B Answer is B. The new task runs prior to the original task.
Comment 1215997 by PreranaC
- Upvotes: 1
Selected Answer: B B is correct
Comment 1213790 by nmosq
- Upvotes: 1
B is correct, “needs to run prior to the original task”
Comment 1213719 by BharaniRaj
- Upvotes: 1
Selected Answer: B B is correct
Comment 1209951 by Kunka
- Upvotes: 1
B is correct, as new task runs first
Comment 1209378 by Ivan_Petrov
- Upvotes: 1
B is correct
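In Jobs API terms, the approach in answer B amounts to listing the new task in the original task's `depends_on` array. A sketch of the resulting tasks list, where the task keys and notebook paths are illustrative:

```python
# Sketch of a Jobs API tasks array after adding the upstream task.
# Task keys and notebook paths are illustrative placeholders.
tasks = [
    {
        "task_key": "run_new_notebook",
        "notebook_task": {"notebook_path": "/Repos/etl/fix_upstream_data"},
    },
    {
        "task_key": "original_task",
        "notebook_task": {"notebook_path": "/Repos/etl/morning_job"},
        # The original task now depends on the new task, so the
        # new notebook always runs first.
        "depends_on": [{"task_key": "run_new_notebook"}],
    },
]
```

The dependency edge points from the original task back to the new one, which is why answer C (making the original task a dependency of the new task) inverts the intended order.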
Question d23UJBhkUBBtyZFmFl3D
Question
A single Job runs two notebooks as two separate tasks. A data engineer has noticed that one of the notebooks is running slowly in the Job’s current run. The data engineer asks a tech lead for help in identifying why this might be the case.
Which approach can the tech lead use to identify why the notebook is running slowly as part of the Job?
Choices
- A: They can navigate to the Runs tab in the Jobs UI to immediately review the processing notebook.
- B: They can navigate to the Tasks tab in the Jobs UI and click on the active run to review the processing notebook.
- C: They can navigate to the Runs tab in the Jobs UI and click on the active run to review the processing notebook.
- D: They can navigate to the Tasks tab in the Jobs UI to immediately review the processing notebook.
answer?
Answer: C Answer_ET: C Community answer C (100%) Discussion
Comment 1327324 by MultiCloudIronMan
- Upvotes: 1
Selected Answer: C The correct answer is C. They can navigate to the Runs tab in the Jobs UI and click on the active run to review the processing notebook. This approach allows the tech lead to directly access and review the notebook that is currently running, helping to identify any issues causing it to run slowly.
Comment 1282734 by CommanderBigMac
- Upvotes: 1
Selected Answer: C The question states it is running slowly; nothing is wrong with the job itself, so the run needs to be checked.
Comment 1272964 by 9d4d68a
- Upvotes: 1
Repeated, Correct
Question zU0SrUhh0MdpRvtWVmW6
Question
Which of the following commands will return the location of database customer360?
Choices
- A: DESCRIBE LOCATION customer360;
- B: DROP DATABASE customer360;
- C: DESCRIBE DATABASE customer360;
- D: ALTER DATABASE customer360 SET DBPROPERTIES ('location' = '/user');
- E: USE DATABASE customer360;
answer?
Answer: C Answer_ET: C Community answer C (100%) Discussion
Comment 997916 by vctrhugo
- Upvotes: 8
Selected Answer: C C. DESCRIBE DATABASE customer360;
To retrieve the location of a database named “customer360” in a database management system like Hive or Databricks, you can use the DESCRIBE DATABASE command followed by the database name. This command will provide information about the database, including its location.
Comment 1262398 by 80370eb
- Upvotes: 1
Selected Answer: C C. DESCRIBE DATABASE customer360; this will show the location of the database.
Comment 1203172 by benni_ale
- Upvotes: 1
Selected Answer: C C is correct
Comment 1177193 by Itmma
- Upvotes: 1
Selected Answer: C C is correct
Comment 1113193 by SerGrey
- Upvotes: 1
Selected Answer: C Correct answer is C
Comment 1064793 by awofalus
- Upvotes: 1
Selected Answer: C Correct :C
Comment 1017351 by KalavathiP
- Upvotes: 1
Selected Answer: C C is correct
Comment 978232 by Akshay67364
- Upvotes: 1
Option C
Comment 972704 by Gowthamr02
- Upvotes: 1
Option C
Comment 876210 by Varma_Saraswathula
- Upvotes: 2
Option C - https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-aux-describe-database.html
Comment 859666 by surrabhi_4
- Upvotes: 2
Selected Answer: C option c
Comment 858863 by knivesz
- Upvotes: 2
Selected Answer: C Very easy
Comment 857992 by XiltroX
- Upvotes: 3
Selected Answer: C Correct answer
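To show what answer C returns, here is a sketch of the shape of a `DESCRIBE DATABASE customer360;` result and how the location would be read from it. The field names mirror the Spark 3.0 `DESCRIBE DATABASE` output linked above, and the location path is a placeholder:

```python
# Illustrative rows from `DESCRIBE DATABASE customer360;`
# (field names follow Spark 3.0's output; the path is a placeholder).
describe_output = [
    {"database_description_item": "Database Name",
     "database_description_value": "customer360"},
    {"database_description_item": "Location",
     "database_description_value": "dbfs:/user/hive/warehouse/customer360.db"},
]

# Pull the Location row out of the result, as a reader of the
# command's output would.
location = next(
    row["database_description_value"]
    for row in describe_output
    if row["database_description_item"] == "Location"
)
```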