Questions and Answers
Question ghbaHcPaGQAYeTtYBYu2
Question
A junior data engineer has manually configured a series of jobs using the Databricks Jobs UI. Upon reviewing their work, the engineer realizes that they are listed as the “Owner” for each job. They attempt to transfer “Owner” privileges to the “DevOps” group, but cannot successfully accomplish this task.
Which statement explains what is preventing this privilege transfer?
Choices
- A: Databricks jobs must have exactly one owner; “Owner” privileges cannot be assigned to a group.
- B: The creator of a Databricks job will always have “Owner” privileges; this configuration cannot be changed.
- C: Only workspace administrators can grant “Owner” privileges to a group.
- D: A user can only transfer job ownership to a group if they are also a member of that group.
Answer: A | Community answer: A (100%)
Discussion
Comment 1238165 by 03355a2
- Upvotes: 5
Selected Answer: A This was the correct answer in a past Databricks version; however, newer versions do allow a group to be assigned as the owner of a job.
Comment 1222428 by imatheushenrique
- Upvotes: 1
A. Databricks jobs must have exactly one owner; “Owner” privileges cannot be assigned to a group. A Databricks job can only have a single user as its owner, not a group.
Question Qb4qn2yMUDK0p3nAoqjx
Question
A table named user_ltv is being used to create a view that will be used by data analysts on various teams. Users in the workspace are configured into groups, which are used for setting up data access using ACLs.
The user_ltv table has the following schema:
email STRING, age INT, ltv INT
The following view definition is executed:
//IMG//
An analyst who is not a member of the auditing group executes the following query:
SELECT * FROM user_ltv_no_minors
Which statement describes the results returned by this query?
Choices
- A: All columns will be displayed normally for those records that have an age greater than 17; records not meeting this condition will be omitted.
- B: All age values less than 18 will be returned as null values, all other columns will be returned with the values in user_ltv.
- C: All values for the age column will be returned as null values, all other columns will be returned with the values in user_ltv.
- D: All columns will be displayed normally for those records that have an age greater than 18; records not meeting this condition will be omitted.
Answer: A | Community answer: A (100%)
Discussion
Comment 1230603 by Isio05
- Upvotes: 3
Selected Answer: A Surely, it’s an A
Comment 1229325 by hpkr
- Upvotes: 2
Selected Answer: A option A is correct
Comment 1226709 by BrianNguyen95
- Upvotes: 2
Selected Answer: A Greater than 17
Comment 1222905 by Freyr
- Upvotes: 1
Selected Answer: A Correct answer: A. For integer ages, (>17) is equivalent to (>=18), so all records with age above 17 appear in the result and the others are omitted.
Comment 1222427 by imatheushenrique
- Upvotes: 1
A. All columns will be displayed normally for those records that have an age greater than 17; records not meeting this condition will be omitted.
Only option A matches the view's age >= 18 condition.
Comment 1221171 by MDWPartners
- Upvotes: 1
Selected Answer: A “Greater than 18” would start at age 19, so D is incorrect; A is the right reading.
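The view definition itself is redacted above (//IMG//). As a hedged reconstruction consistent with answer A and the discussion (only the view and table names come from the question; the body is an assumption), it would resemble:

```sql
-- Sketch: members of the auditing group see every row; everyone else only
-- sees rows where age is at least 18. is_member() is Databricks' built-in
-- group-membership predicate.
CREATE VIEW user_ltv_no_minors AS
SELECT email, age, ltv
FROM user_ltv
WHERE
  CASE
    WHEN is_member('auditing') THEN TRUE
    ELSE age >= 18
  END;
```

Because the filter sits in the WHERE clause, non-matching rows are omitted entirely rather than returned with null columns, which is why B and C are wrong.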
Question OMBF6p3aCn26Wx4ZKTOL
Question
All records from an Apache Kafka producer are being ingested into a single Delta Lake table with the following schema:
key BINARY, value BINARY, topic STRING, partition LONG, offset LONG, timestamp LONG
There are 5 unique topics being ingested. Only the “registration” topic contains Personally Identifiable Information (PII). The company wishes to restrict access to PII. The company also wishes to retain records containing PII in this table for only 14 days after initial ingestion. However, it would like to retain non-PII records indefinitely.
Which solution meets the requirements?
Choices
- A: All data should be deleted biweekly; Delta Lake’s time travel functionality should be leveraged to maintain a history of non-PII information.
- B: Data should be partitioned by the registration field, allowing ACLs and delete statements to be set for the PII directory.
- C: Data should be partitioned by the topic field, allowing ACLs and delete statements to leverage partition boundaries.
- D: Separate object storage containers should be specified based on the partition field, allowing isolation at the storage level.
Answer: C | Community answer: C (100%)
Discussion
Comment 1229327 by hpkr
- Upvotes: 2
Selected Answer: C C is correct
Comment 1222425 by imatheushenrique
- Upvotes: 1
C. Partitioning the data by the topic field allows the company to apply different access control and retention policies per topic. There is also a performance gain, since reads can be pruned to the relevant partition path.
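As a hedged sketch of option C (the table name kafka_raw and the purge job are assumptions, not from the question), partitioning on topic lets both ACLs and deletes align with partition boundaries:

```sql
-- Partition by topic so access controls and retention can target the
-- PII-bearing 'registration' topic in isolation. `partition` and `timestamp`
-- are backquoted because they collide with SQL keywords.
CREATE TABLE kafka_raw (
  key BINARY, value BINARY, topic STRING,
  `partition` LONG, offset LONG, `timestamp` LONG
)
USING DELTA
PARTITIONED BY (topic);

-- Scheduled purge: drop PII older than 14 days. Kafka timestamps are
-- milliseconds since the epoch, hence unix_millis().
DELETE FROM kafka_raw
WHERE topic = 'registration'
  AND `timestamp` < unix_millis(current_timestamp() - INTERVAL 14 DAYS);
```

Because topic is the partition column, the DELETE only rewrites files under the registration partition, and the non-PII topics are retained untouched indefinitely.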
Question fyl6VplQahCgrakimcQY
Question
The data governance team is reviewing code used for deleting records for compliance with GDPR. The following logic has been implemented to propagate delete requests from the user_lookup table to the user_aggregates table.
//IMG//
Assuming that user_id is a unique identifying key and that all users that have requested deletion have been removed from the user_lookup table, which statement describes whether successfully executing the above logic guarantees that the records to be deleted from the user_aggregates table are no longer accessible and why?
Choices
- A: No; the Delta Lake DELETE command only provides ACID guarantees when combined with the MERGE INTO command.
- B: No; files containing deleted records may still be accessible with time travel until a VACUUM command is used to remove invalidated data files.
- C: No; the change data feed only tracks inserts and updates, not deleted records.
- D: Yes; Delta Lake ACID guarantees provide assurance that the DELETE command succeeded fully and permanently purged these records.
Answer: B | Community answer: B (100%)
Discussion
Comment 1300603 by m79590530
- Upvotes: 1
Selected Answer: B The default Delta Lake VACUUM retention is 7 days, so deleted records remain accessible through previous table versions for up to 7 days, unless someone lowers this retention setting and runs VACUUM on the table earlier.
Comment 1222423 by imatheushenrique
- Upvotes: 3
B. No; files containing deleted records may still be accessible with time travel until a VACUUM command is used to remove invalidated data files.
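As a hedged illustration of why B is correct (the retention values shown are the documented defaults, not from the question), a DELETE alone leaves the old data files reachable via time travel until VACUUM removes them:

```sql
-- The DELETE commits a new table version, but the files holding the deleted
-- rows stay on storage and remain queryable with time travel
-- (VERSION AS OF / TIMESTAMP AS OF) until vacuumed.
VACUUM user_aggregates RETAIN 168 HOURS;  -- 168 hours = the 7-day default

-- Purging sooner requires disabling the safety check first (illustrative
-- only; a short retention window can break concurrent readers and writers):
SET spark.databricks.delta.retentionDurationCheck.enabled = false;
VACUUM user_aggregates RETAIN 0 HOURS;
```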
Question vgSrvrJl7Dp87ncVbdXW
Question
An external object storage container has been mounted to the location /mnt/finance_eda_bucket.
The following logic was executed to create a database for the finance team:
//IMG//
After the database was successfully created and permissions configured, a member of the finance team runs the following code:
//IMG//
If all users on the finance team are members of the finance group, which statement describes how the tx_sales table will be created?
Choices
- A: A logical table will persist the query plan to the Hive Metastore in the Databricks control plane.
- B: An external table will be created in the storage container mounted to /mnt/finance_eda_bucket.
- C: A managed table will be created in the DBFS root storage container.
- D: A managed table will be created in the storage container mounted to /mnt/finance_eda_bucket.
Answer: D | Community answer: D (90%), other 10%
Discussion
Comment 1555897 by a85becd
- Upvotes: 1
Selected Answer: B The table cannot be a managed table:
- If a database specifies a LOCATION, all tables created inside it inherit that LOCATION, and they are external tables.
- Managed tables cannot exist in a database where the storage location has been explicitly specified, as the managed table lifecycle is tied to Databricks’ internal storage system.
- Even if you don’t explicitly define the table as external, the database’s external location forces all tables to behave as external tables.
Comment 1326688 by Sriramiyer92
- Upvotes: 1
Selected Answer: D D. A managed table, since no LOCATION is given in the CREATE TABLE statement. It lands in the container mounted to /mnt/finance_eda_bucket because the first query creates the database there, and the second query creates the table inside that database.
Comment 1230606 by Isio05
- Upvotes: 3
Selected Answer: D It will be created in the database's location, but it will be a managed table (the CREATE TABLE statement has no LOCATION clause).
Comment 1229342 by hpkr
- Upvotes: 2
Selected Answer: D D is correct
Comment 1222909 by Freyr
- Upvotes: 2
Selected Answer: D Correct Answer: D The table is still managed by Spark SQL in terms of metadata, but the data files are stored in the specified path due to the database’s location setting.
Given the location inherited from the database: if the CREATE TABLE statement had explicitly specified its own LOCATION (or been written as CREATE EXTERNAL TABLE), it would definitely be an external table. Since it does neither, it creates a managed table.
Comment 1221172 by MDWPartners
- Upvotes: 1
Selected Answer: D Seems D
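Both statements in this question are redacted (//IMG//). As a hedged reconstruction consistent with answer D (the database name and source query are assumptions; only tx_sales and the mount point come from the question):

```sql
-- Database pinned to the mounted container:
CREATE DATABASE finance_eda_db
LOCATION '/mnt/finance_eda_bucket';

USE finance_eda_db;

-- No LOCATION clause here, so tx_sales is a managed table in the Hive
-- metastore; its data files still land under the database's location
-- (the mounted container) rather than the DBFS root.
CREATE TABLE tx_sales
AS SELECT * FROM sales_raw;  -- sales_raw is a hypothetical source table
```

Dropping a managed table created this way also deletes its underlying files, which is the practical difference from the external-table reading argued in the dissenting comment.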