Questions and Answers
Question ghbaHcPaGQAYeTtYBYu2
Question
A junior data engineer has manually configured a series of jobs using the Databricks Jobs UI. Upon reviewing their work, the engineer realizes that they are listed as the “Owner” for each job. They attempt to transfer “Owner” privileges to the “DevOps” group, but cannot successfully accomplish this task.
Which statement explains what is preventing this privilege transfer?
Choices
- A: Databricks jobs must have exactly one owner; “Owner” privileges cannot be assigned to a group.
- B: The creator of a Databricks job will always have “Owner” privileges; this configuration cannot be changed.
- C: Only workspace administrators can grant “Owner” privileges to a group.
- D: A user can only transfer job ownership to a group if they are also a member of that group.
Answer: A | Community answer: A (100%)
Discussion
Comment 1238165 by 03355a2
- Upvotes: 5
Selected Answer: A This was the correct answer in a past Databricks version; however, newer versions do allow a group to be assigned as the owner of a job.
Comment 1222428 by imatheushenrique
- Upvotes: 1
A. Databricks jobs must have exactly one owner; “Owner” privileges cannot be assigned to a group. A Databricks job can only have a single user as its owner, not a group.
Question Qb4qn2yMUDK0p3nAoqjx
Question
A table named user_ltv is being used to create a view that will be used by data analysts on various teams. Users in the workspace are configured into groups, which are used for setting up data access using ACLs.
The user_ltv table has the following schema:
email STRING, age INT, ltv INT
The following view definition is executed:
//IMG//
An analyst who is not a member of the auditing group executes the following query:
SELECT * FROM user_ltv_no_minors
Which statement describes the results returned by this query?
Choices
- A: All columns will be displayed normally for those records that have an age greater than 17; records not meeting this condition will be omitted.
- B: All age values less than 18 will be returned as null values, all other columns will be returned with the values in user_ltv.
- C: All values for the age column will be returned as null values, all other columns will be returned with the values in user_ltv.
- D: All columns will be displayed normally for those records that have an age greater than 18; records not meeting this condition will be omitted.
Answer: A | Community answer: A (100%)
Discussion
Comment 1230603 by Isio05
- Upvotes: 3
Selected Answer: A Surely, it’s an A
Comment 1229325 by hpkr
- Upvotes: 2
Selected Answer: A option A is correct
Comment 1226709 by BrianNguyen95
- Upvotes: 2
Selected Answer: A Greater than 17
Comment 1222905 by Freyr
- Upvotes: 1
Selected Answer: A Correct answer: A. For integer ages, (>17) is equivalent to (>=18), so all records with age above 17 appear in the result and the others are omitted.
Comment 1222427 by imatheushenrique
- Upvotes: 1
A. All columns will be displayed normally for those records that have an age greater than 17; records not meeting this condition will be omitted.
Only option A matches the view's age >= 18 condition.
Comment 1221171 by MDWPartners
- Upvotes: 1
Selected Answer: A “Greater than 18” would start at age 19, so D is incorrect; A is the right reading.
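The view definition itself is redacted above (//IMG//). As a hedged reconstruction consistent with answer A and the discussion (only the view and table names come from the question; the body is an assumption), it would resemble:

```sql
-- Sketch: members of the auditing group see every row; everyone else only
-- sees rows where age is at least 18. is_member() is Databricks' built-in
-- group-membership predicate.
CREATE VIEW user_ltv_no_minors AS
SELECT email, age, ltv
FROM user_ltv
WHERE
  CASE
    WHEN is_member('auditing') THEN TRUE
    ELSE age >= 18
  END;
```

Because the filter sits in the WHERE clause, non-matching rows are omitted entirely rather than returned with null columns, which is why B and C are wrong.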
Question OMBF6p3aCn26Wx4ZKTOL
Question
All records from an Apache Kafka producer are being ingested into a single Delta Lake table with the following schema:
key BINARY, value BINARY, topic STRING, partition LONG, offset LONG, timestamp LONG
There are 5 unique topics being ingested. Only the “registration” topic contains Personally Identifiable Information (PII). The company wishes to restrict access to PII. The company also wishes to retain records containing PII in this table for only 14 days after initial ingestion. However, it would like to retain non-PII records indefinitely.
Which solution meets the requirements?
Choices
- A: All data should be deleted biweekly; Delta Lake’s time travel functionality should be leveraged to maintain a history of non-PII information.
- B: Data should be partitioned by the registration field, allowing ACLs and delete statements to be set for the PII directory.
- C: Data should be partitioned by the topic field, allowing ACLs and delete statements to leverage partition boundaries.
- D: Separate object storage containers should be specified based on the partition field, allowing isolation at the storage level.
Answer: C | Community answer: C (100%)
Discussion
Comment 1229327 by hpkr
- Upvotes: 2
Selected Answer: C C is correct
Comment 1222425 by imatheushenrique
- Upvotes: 1
C. Partitioning the data by the topic field allows the company to apply different access control and retention policies per topic. There is also a performance gain, since reads can be pruned to the relevant partition path.
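As a hedged sketch of option C (the table name kafka_raw and the purge job are assumptions, not from the question), partitioning on topic lets both ACLs and deletes align with partition boundaries:

```sql
-- Partition by topic so access controls and retention can target the
-- PII-bearing 'registration' topic in isolation. `partition` and `timestamp`
-- are backquoted because they collide with SQL keywords.
CREATE TABLE kafka_raw (
  key BINARY, value BINARY, topic STRING,
  `partition` LONG, offset LONG, `timestamp` LONG
)
USING DELTA
PARTITIONED BY (topic);

-- Scheduled purge: drop PII older than 14 days. Kafka timestamps are
-- milliseconds since the epoch, hence unix_millis().
DELETE FROM kafka_raw
WHERE topic = 'registration'
  AND `timestamp` < unix_millis(current_timestamp() - INTERVAL 14 DAYS);
```

Because topic is the partition column, the DELETE only rewrites files under the registration partition, and the non-PII topics are retained untouched indefinitely.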
Question fyl6VplQahCgrakimcQY
Question
The data governance team is reviewing code used for deleting records for compliance with GDPR. The following logic has been implemented to propagate delete requests from the user_lookup table to the user_aggregates table.
//IMG//
Assuming that user_id is a unique identifying key and that all users that have requested deletion have been removed from the user_lookup table, which statement describes whether successfully executing the above logic guarantees that the records to be deleted from the user_aggregates table are no longer accessible and why?
Choices
- A: No; the Delta Lake DELETE command only provides ACID guarantees when combined with the MERGE INTO command.
- B: No; files containing deleted records may still be accessible with time travel until a VACUUM command is used to remove invalidated data files.
- C: No; the change data feed only tracks inserts and updates, not deleted records.
- D: Yes; Delta Lake ACID guarantees provide assurance that the DELETE command succeeded fully and permanently purged these records.
Answer: B | Community answer: B (100%)
Discussion
Comment 1300603 by m79590530
- Upvotes: 1
Selected Answer: B The default Delta Lake VACUUM retention is 7 days, so deleted records remain accessible through previous table versions for up to 7 days, unless someone lowers this retention setting and runs VACUUM on the table earlier.
Comment 1222423 by imatheushenrique
- Upvotes: 3
B. No; files containing deleted records may still be accessible with time travel until a VACUUM command is used to remove invalidated data files.
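As a hedged illustration of why B is correct (the retention values shown are the documented defaults, not from the question), a DELETE alone leaves the old data files reachable via time travel until VACUUM removes them:

```sql
-- The DELETE commits a new table version, but the files holding the deleted
-- rows stay on storage and remain queryable with time travel
-- (VERSION AS OF / TIMESTAMP AS OF) until vacuumed.
VACUUM user_aggregates RETAIN 168 HOURS;  -- 168 hours = the 7-day default

-- Purging sooner requires disabling the safety check first (illustrative
-- only; a short retention window can break concurrent readers and writers):
SET spark.databricks.delta.retentionDurationCheck.enabled = false;
VACUUM user_aggregates RETAIN 0 HOURS;
```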
Question vgSrvrJl7Dp87ncVbdXW
Question
An external object storage container has been mounted to the location /mnt/finance_eda_bucket.
The following logic was executed to create a database for the finance team:
//IMG//
After the database was successfully created and permissions configured, a member of the finance team runs the following code:
//IMG//
If all users on the finance team are members of the finance group, which statement describes how the tx_sales table will be created?
Choices
- A: A logical table will persist the query plan to the Hive Metastore in the Databricks control plane.
- B: An external table will be created in the storage container mounted to /mnt/finance_eda_bucket.
- C: A managed table will be created in the DBFS root storage container.
- D: A managed table will be created in the storage container mounted to /mnt/finance_eda_bucket.
Answer: D | Community answer: D (90%), other 10%
Discussion
Comment 1555897 by a85becd
- Upvotes: 1
Selected Answer: B The table cannot be a managed table:
- If a database specifies a LOCATION, all tables created inside it inherit that LOCATION, and they are external tables.
- Managed tables cannot exist in a database where the storage location has been explicitly specified, as the managed table lifecycle is tied to Databricks’ internal storage system.
- Even if you don’t explicitly define the table as external, the database’s external location forces all tables to behave as external tables.
Comment 1326688 by Sriramiyer92
- Upvotes: 1
Selected Answer: D D. A managed table, since no LOCATION is given in the CREATE TABLE statement. It lands in the container mounted to /mnt/finance_eda_bucket because the first query creates the database there, and the second query creates the table inside that database.
Comment 1230606 by Isio05
- Upvotes: 3
Selected Answer: D It will be created in the database's location, but it will be a managed table (the CREATE TABLE statement has no LOCATION clause).
Comment 1229342 by hpkr
- Upvotes: 2
Selected Answer: D D is correct
Comment 1222909 by Freyr
- Upvotes: 2
Selected Answer: D Correct Answer: D The table is still managed by Spark SQL in terms of metadata, but the data files are stored in the specified path due to the database’s location setting.
Given the location inherited from the database: if the CREATE TABLE statement had explicitly specified its own LOCATION (or been written as CREATE EXTERNAL TABLE), it would definitely be an external table. Since it does neither, it creates a managed table.
Comment 1221172 by MDWPartners
- Upvotes: 1
Selected Answer: D Seems D
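Both statements in this question are redacted (//IMG//). As a hedged reconstruction consistent with answer D (the database name and source query are assumptions; only tx_sales and the mount point come from the question):

```sql
-- Database pinned to the mounted container:
CREATE DATABASE finance_eda_db
LOCATION '/mnt/finance_eda_bucket';

USE finance_eda_db;

-- No LOCATION clause here, so tx_sales is a managed table in the Hive
-- metastore; its data files still land under the database's location
-- (the mounted container) rather than the DBFS root.
CREATE TABLE tx_sales
AS SELECT * FROM sales_raw;  -- sales_raw is a hypothetical source table
```

Dropping a managed table created this way also deletes its underlying files, which is the practical difference from the external-table reading argued in the dissenting comment.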