Questions and Answers
Question MI5jKvqslZrq06vTK1eX
Question
A dataset has been defined using Delta Live Tables and includes an expectations clause:
CONSTRAINT valid_timestamp EXPECT (timestamp > '2020-01-01') ON VIOLATION DROP ROW
What is the expected behavior when a batch of data containing data that violates these constraints is processed?
Choices
- A: Records that violate the expectation cause the job to fail.
- B: Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.
- C: Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
- D: Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.
answer?
Answer: C Answer_ET: C Community answer C (83%) A (17%) Discussion
Comment 1272972 by 9d4d68a
- Upvotes: 2
Repeated, Correct answer is C
Comment 1236566 by vigaro
- Upvotes: 2
Selected Answer: C ON VIOLATION DROP ROW
Comment 1230490 by 31cadd7
- Upvotes: 2
Selected Answer: C it’s C
Comment 1220760 by d39c1db
- Upvotes: 3
C. Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
When a constraint defined using the EXPECT clause is violated, Delta Live Tables will drop the records that violate the expectation from the target dataset. Additionally, information about the dropped records and the reason for their exclusion will be recorded in the event log for audit and monitoring purposes. This ensures that only valid data meeting the specified constraints is included in the target dataset.
Comment 1215994 by PreranaC
- Upvotes: 1
Selected Answer: C C should be correct, A is for ON VIOLATION FAIL UPDATE
Comment 1215992 by PreranaC
- Upvotes: 1
Selected Answer: A A should be correct
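The drop-row semantics discussed above can be sketched in plain Python. This is a minimal simulation of what `EXPECT ... ON VIOLATION DROP ROW` does, not the actual Delta Live Tables runtime; the `records` batch and `event_log` list are hypothetical stand-ins:

```python
from datetime import date

# Hypothetical input batch; names and values are illustrative only.
records = [
    {"id": 1, "timestamp": date(2021, 6, 1)},
    {"id": 2, "timestamp": date(2019, 3, 15)},  # violates the expectation
]

event_log = []  # stands in for the DLT event log

def apply_expectation(rows, name, predicate):
    """Simulate EXPECT ... ON VIOLATION DROP ROW: keep only passing rows
    and record the dropped-row count in the event log."""
    kept = [r for r in rows if predicate(r)]
    event_log.append({"expectation": name, "dropped_records": len(rows) - len(kept)})
    return kept

target = apply_expectation(
    records, "valid_timestamp", lambda r: r["timestamp"] > date(2020, 1, 1)
)
```

The violating row never reaches `target`, but its drop is still visible in the event log, which is exactly the behavior answer C describes.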
Question hlo6no0YikmL1mMhHYQO
Question
A data engineer has a Job with multiple tasks that runs nightly. Each of the tasks runs slowly because the clusters take a long time to start.
Which action can the data engineer perform to improve the start up time for the clusters used for the Job?
Choices
- A: They can use endpoints available in Databricks SQL
- B: They can use jobs clusters instead of all-purpose clusters
- C: They can configure the clusters to autoscale for larger data sizes
- D: They can use clusters that are from a cluster pool
answer?
Answer: D Answer_ET: D Community answer D (100%) Discussion
Comment 1327323 by MultiCloudIronMan
- Upvotes: 1
Selected Answer: D The correct answer is D. They can use clusters that are from a cluster pool. Using clusters from a cluster pool can significantly reduce the start-up time because the clusters are pre-configured and ready to be used, which eliminates the need to wait for new clusters to be created and started.
Comment 1299624 by RandomForest
- Upvotes: 1
Selected Answer: D Pools are a set of idle, ready-to-use instances, which minimizes start-up times
Comment 1272971 by 9d4d68a
- Upvotes: 1
Repeated, Correct
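For reference, attaching a job cluster to a pool is done through the `instance_pool_id` field of the cluster spec in the Jobs API. A minimal sketch, where the pool ID and Spark version are placeholder values:

```python
# Sketch of a Jobs API cluster spec that draws nodes from a pool.
# The pool ID and Spark version below are placeholders, not real values.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",
    "instance_pool_id": "pool-1234-example",  # idle pool instances attach faster
    "num_workers": 2,
}
```

Because pool instances are already provisioned and idle, a job cluster built from this spec skips most of the cloud-provider startup wait.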
Question HmfSB6m4BqIbGg8ykrPX
Question
A data engineer has a single-task Job that runs each morning before they begin working. After identifying an upstream data issue, they need to set up another task to run a new notebook prior to the original task.
Which approach can the data engineer use to set up the new task?
Choices
- A: They can clone the existing task in the existing Job and update it to run the new notebook.
- B: They can create a new task in the existing Job and then add it as a dependency of the original task.
- C: They can create a new task in the existing Job and then add the original task as a dependency of the new task.
- D: They can create a new job from scratch and add both tasks to run concurrently.
answer?
Answer: B Answer_ET: B Community answer B (80%) C (20%) Discussion
Comment 1409737 by Billybob0604
- Upvotes: 1
Selected Answer: C No, the new notebook needs to run prior to the original task, meaning the original task depends on the new notebook, hence C.
Comment 1282733 by CommanderBigMac
- Upvotes: 1
Selected Answer: B B is correct. The new task needs to be a dependency of the original task.
Comment 1272970 by 9d4d68a
- Upvotes: 1
Correct Answer: B
Explanation: To set up the new task to run a new notebook prior to the original task in a single-task Job, the data engineer can use the following approach: In the existing Job, create a new task that corresponds to the new notebook that needs to be run. Set up the new task with the appropriate configuration, specifying the notebook to be executed and any necessary parameters or dependencies. Once the new task is created, designate it as a dependency of the original task in the Job configuration. This ensures that the new task is executed before the original task.
Comment 1227362 by hussamAlHunaiti
- Upvotes: 1
Selected Answer: B Answer is B. The new task runs prior to the original task.
Comment 1215997 by PreranaC
- Upvotes: 1
Selected Answer: B B is correct
Comment 1213790 by nmosq
- Upvotes: 1
B is correct, “needs to run prior to the original task”
Comment 1213719 by BharaniRaj
- Upvotes: 1
Selected Answer: B B is correct
Comment 1209951 by Kunka
- Upvotes: 1
B is correct, as new task runs first
Comment 1209378 by Ivan_Petrov
- Upvotes: 1
B is correct
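In Jobs API terms, the approach in answer B amounts to listing the new task in the original task's `depends_on` array. A sketch of the resulting tasks list, where the task keys and notebook paths are illustrative:

```python
# Sketch of a Jobs API tasks array after adding the upstream task.
# Task keys and notebook paths are illustrative placeholders.
tasks = [
    {
        "task_key": "run_new_notebook",
        "notebook_task": {"notebook_path": "/Repos/etl/fix_upstream_data"},
    },
    {
        "task_key": "original_task",
        "notebook_task": {"notebook_path": "/Repos/etl/morning_job"},
        # The original task now depends on the new task, so the
        # new notebook always runs first.
        "depends_on": [{"task_key": "run_new_notebook"}],
    },
]
```

The dependency edge points from the original task back to the new one, which is why answer C (making the original task a dependency of the new task) inverts the intended order.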
Question d23UJBhkUBBtyZFmFl3D
Question
A single Job runs two notebooks as two separate tasks. A data engineer has noticed that one of the notebooks is running slowly in the Job’s current run. The data engineer asks a tech lead for help in identifying why this might be the case.
Which approach can the tech lead use to identify why the notebook is running slowly as part of the Job?
Choices
- A: They can navigate to the Runs tab in the Jobs UI to immediately review the processing notebook.
- B: They can navigate to the Tasks tab in the Jobs UI and click on the active run to review the processing notebook.
- C: They can navigate to the Runs tab in the Jobs UI and click on the active run to review the processing notebook.
- D: They can navigate to the Tasks tab in the Jobs UI to immediately review the processing notebook.
answer?
Answer: C Answer_ET: C Community answer C (100%) Discussion
Comment 1327324 by MultiCloudIronMan
- Upvotes: 1
Selected Answer: C The correct answer is C. They can navigate to the Runs tab in the Jobs UI and click on the active run to review the processing notebook. This approach allows the tech lead to directly access and review the notebook that is currently running, helping to identify any issues causing it to run slowly.
Comment 1282734 by CommanderBigMac
- Upvotes: 1
Selected Answer: C The question states it is running slowly; nothing is wrong with the job itself, so the run needs to be checked.
Comment 1272964 by 9d4d68a
- Upvotes: 1
Repeated, Correct
Question zU0SrUhh0MdpRvtWVmW6
Question
Which of the following commands will return the location of database customer360?
Choices
- A: DESCRIBE LOCATION customer360;
- B: DROP DATABASE customer360;
- C: DESCRIBE DATABASE customer360;
- D: ALTER DATABASE customer360 SET DBPROPERTIES ('location' = '/user');
- E: USE DATABASE customer360;
answer?
Answer: C Answer_ET: C Community answer C (100%) Discussion
Comment 997916 by vctrhugo
- Upvotes: 8
Selected Answer: C C. DESCRIBE DATABASE customer360;
To retrieve the location of a database named “customer360” in a database management system like Hive or Databricks, you can use the DESCRIBE DATABASE command followed by the database name. This command will provide information about the database, including its location.
Comment 1262398 by 80370eb
- Upvotes: 1
Selected Answer: C C. DESCRIBE DATABASE customer360; this will show the location of the database.
Comment 1203172 by benni_ale
- Upvotes: 1
Selected Answer: C C is correct
Comment 1177193 by Itmma
- Upvotes: 1
Selected Answer: C C is correct
Comment 1113193 by SerGrey
- Upvotes: 1
Selected Answer: C Correct answer is C
Comment 1064793 by awofalus
- Upvotes: 1
Selected Answer: C Correct :C
Comment 1017351 by KalavathiP
- Upvotes: 1
Selected Answer: C C is correct
Comment 978232 by Akshay67364
- Upvotes: 1
Option C
Comment 972704 by Gowthamr02
- Upvotes: 1
Option C
Comment 876210 by Varma_Saraswathula
- Upvotes: 2
Option C - https://spark.apache.org/docs/3.0.0-preview/sql-ref-syntax-aux-describe-database.html
Comment 859666 by surrabhi_4
- Upvotes: 2
Selected Answer: C option c
Comment 858863 by knivesz
- Upvotes: 2
Selected Answer: C Very easy
Comment 857992 by XiltroX
- Upvotes: 3
Selected Answer: C Correct answer
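To show what answer C returns, here is a sketch of the shape of a `DESCRIBE DATABASE customer360;` result and how the location would be read from it. The field names mirror the Spark 3.0 `DESCRIBE DATABASE` output linked above, and the location path is a placeholder:

```python
# Illustrative rows from `DESCRIBE DATABASE customer360;`
# (field names follow Spark 3.0's output; the path is a placeholder).
describe_output = [
    {"database_description_item": "Database Name",
     "database_description_value": "customer360"},
    {"database_description_item": "Location",
     "database_description_value": "dbfs:/user/hive/warehouse/customer360.db"},
]

# Pull the Location row out of the result, as a reader of the
# command's output would.
location = next(
    row["database_description_value"]
    for row in describe_output
    if row["database_description_item"] == "Location"
)
```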