
DP-203 Questions and Answers

Question # 6

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction dataset requirements.

What should you create?

A.

a table that has an IDENTITY property

B.

a system-versioned temporal table

C.

a user-defined SEQUENCE object

D.

a table that has a FOREIGN KEY constraint
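
For context, a minimal T-SQL sketch of how a surrogate key is generated with an IDENTITY property in a dedicated SQL pool; the table and column names here are illustrative, not from the case study:

    -- Surrogate key generated by the pool; in a dedicated SQL pool,
    -- IDENTITY values are unique but not guaranteed to be sequential.
    CREATE TABLE dbo.DimRetailStore
    (
        StoreKey INT IDENTITY(1, 1) NOT NULL,  -- surrogate key
        StoreId VARCHAR(20) NOT NULL,          -- business key from the source system
        StoreName NVARCHAR(100) NOT NULL
    )
    WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);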

Question # 7

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.

Which three Transact-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.
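
For background, the usual Transact-SQL DDL sequence that exposes files in storage to a dedicated SQL pool is sketched below; the data source name, storage location, and table definition are assumptions for illustration:

    -- 1. Register the storage location that holds the files.
    CREATE EXTERNAL DATA SOURCE TwitterStorage
    WITH (LOCATION = 'abfss://data@contosolake.dfs.core.windows.net', TYPE = HADOOP);

    -- 2. Describe how the files are encoded.
    CREATE EXTERNAL FILE FORMAT ParquetFormat
    WITH (FORMAT_TYPE = PARQUET);

    -- 3. Create the external table over the files.
    CREATE EXTERNAL TABLE dbo.TwitterFeed
    (
        TweetId BIGINT,
        TweetText NVARCHAR(4000)
    )
    WITH (LOCATION = '/twitter/', DATA_SOURCE = TwitterStorage, FILE_FORMAT = ParquetFormat);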

Question # 8

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

Question # 9

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

A.

Azure-SSIS integration runtime

B.

self-hosted integration runtime

C.

Azure integration runtime

Question # 10

You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 11

You have an Azure Synapse Analytics dedicated SQL pool that contains a table named Sales.Orders. Sales.Orders contains a column named SalesRep.

You plan to implement row-level security (RLS) for Sales.Orders.

You need to create the security policy that will be used to implement RLS. The solution must ensure that sales representatives only see rows for which the value of the SalesRep column matches their username.

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
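
For reference, the documented row-level security pattern pairs an inline table-valued predicate function with a security policy; the Security schema and policy name below are illustrative:

    -- Predicate function: returns a row only when the SalesRep value
    -- matches the name of the user running the query.
    CREATE FUNCTION Security.fn_securitypredicate(@SalesRep AS NVARCHAR(128))
        RETURNS TABLE
    WITH SCHEMABINDING
    AS
        RETURN SELECT 1 AS fn_securitypredicate_result
        WHERE @SalesRep = USER_NAME();
    GO

    -- Security policy: applies the function as a filter predicate,
    -- so sales representatives see only their own rows.
    CREATE SECURITY POLICY SalesFilter
    ADD FILTER PREDICATE Security.fn_securitypredicate(SalesRep)
    ON Sales.Orders
    WITH (STATE = ON);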

Question # 12

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 13

You have an Azure Databricks workspace that contains a Delta Lake dimension table named Table1. Table1 is a Type 2 slowly changing dimension (SCD) table.

You need to apply updates from a source table to Table1.

Which Apache Spark SQL operation should you use?

A.

CREATE

B.

UPDATE

C.

MERGE

D.

ALTER
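
As background on why MERGE fits Type 2 processing, a minimal Spark SQL sketch that applies source changes to a Delta dimension table; the join key and columns are assumptions:

    -- Close out the current version of any row whose tracked attribute
    -- changed, and insert rows that are entirely new.
    MERGE INTO table1 AS target
    USING updates AS source
    ON target.CustomerId = source.CustomerId AND target.IsCurrent = true
    WHEN MATCHED AND target.Address <> source.Address THEN
      UPDATE SET target.IsCurrent = false, target.EndDate = current_date()
    WHEN NOT MATCHED THEN
      INSERT (CustomerId, Address, IsCurrent, StartDate, EndDate)
      VALUES (source.CustomerId, source.Address, true, current_date(), null);

A complete Type 2 load typically follows this with a second insert that adds the new version of each changed row; the sketch shows only the core MERGE.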

Question # 14

You have an Azure SQL database named Database1 and two Azure event hubs named HubA and HubB. The data consumed from each source is shown in the following table.

You need to implement Azure Stream Analytics to calculate the average fare per mile by driver.

How should you configure the Stream Analytics input for each source? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
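
For context on how the inputs are consumed once configured, a Stream Analytics query can join a streaming hub input to SQL reference data; the aliases and columns below are illustrative:

    -- HubA and HubB would be configured as stream inputs;
    -- Database1 as a reference input.
    SELECT r.DriverName, AVG(s.Fare / s.Miles) AS AvgFarePerMile
    FROM HubA AS s TIMESTAMP BY TripEndTime
    JOIN Database1 AS r ON s.DriverId = r.DriverId
    GROUP BY r.DriverName, TumblingWindow(minute, 5)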

Question # 15

You are building an Azure Stream Analytics job to retrieve game data.

You need to ensure that the job returns the highest scoring record for each five-minute time interval of each game.

How should you complete the Stream Analytics query? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
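
For orientation, the highest-scoring record per interval is usually expressed with the TopOne aggregate over a tumbling window; the input name and columns are assumptions:

    -- Returns the single highest-scoring event for each game in each
    -- five-minute window.
    SELECT GameId,
           TopOne() OVER (PARTITION BY GameId ORDER BY Score DESC) AS HighestScoringRecord
    FROM GameInput TIMESTAMP BY CreatedAt
    GROUP BY GameId, TumblingWindow(minute, 5)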

Question # 16

You have a self-hosted integration runtime in Azure Data Factory.

The integration runtime has the following status configurations:

  • Status: Running
  • Type: Self-Hosted
  • Version: 4.4.7292.1
  • Running / Registered Node(s): 1/1
  • High Availability Enabled: False
  • Linked Count: 0
  • Queue Length: 0
  • Average Queue Duration: 0.00s

The integration runtime has the following node details:

  • Name: X-M
  • Status: Running
  • Version: 4.4.7292.1
  • Available Memory: 7697MB
  • CPU Utilization: 6%
  • Network (In/Out): 1.21KBps/0.83KBps
  • Concurrent Jobs (Running/Limit): 2/14
  • Role: Dispatcher/Worker
  • Credential Status: In Sync

Use the drop-down menus to select the answer choice that completes each statement based on the information presented.

NOTE: Each correct selection is worth one point.

Question # 17

You are designing a solution that will copy Parquet files stored in an Azure Blob storage account to an Azure Data Lake Storage Gen2 account.

The data will be loaded daily to the data lake and will use a folder structure of {Year}/{Month}/{Day}/.

You need to design a daily Azure Data Factory data load to minimize the data transfer between the two accounts.

Which two configurations should you include in the design? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

A.

Delete the files in the destination before loading new data.

B.

Filter by the last modified date of the source files.

C.

Delete the source files after they are copied.

D.

Specify a file naming pattern for the destination.

Question # 18

You have an Azure Data Factory pipeline that contains a data flow. The data flow contains the following expression.

Question # 19

You have an Azure Databricks resource.

You need to log actions that relate to changes in compute for the Databricks resource.

Which Databricks services should you log?

A.

clusters

B.

workspace

C.

DBFS

D.

SSH

E.

jobs

Question # 20

You have an Azure Stream Analytics job that is a Stream Analytics project solution in Microsoft Visual Studio. The job accepts data generated by IoT devices in the JSON format.

You need to modify the job to accept data generated by the IoT devices in the Protobuf format.

Which three actions should you perform in Visual Studio in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Question # 21

You are designing a folder structure for the files in an Azure Data Lake Storage Gen2 account. The account has one container that contains three years of data.

You need to recommend a folder structure that meets the following requirements:

• Supports partition elimination for queries by Azure Synapse Analytics serverless SQL pools

• Supports fast data retrieval for data from the current month

• Simplifies data security management by department

Which folder structure should you recommend?

A.

\YYYY\MM\DD\Department\DataSource\DataFile_YYYYMMDD.parquet

B.

\Department\DataSource\YYYY\MM\DataFile_YYYYMMDD.parquet

C.

\DD\MM\YYYY\Department\DataSource\DataFile_DDMMYY.parquet

D.

\DataSource\Department\YYYYMM\DataFile_YYYYMMDD.parquet

Question # 22

You are designing a highly available Azure Data Lake Storage solution that will include geo-zone-redundant storage (GZRS).

You need to monitor for replication delays that can affect the recovery point objective (RPO).

What should you include in the monitoring solution?

A.

availability

B.

Average Success E2E Latency

C.

5xx: Server Error errors

D.

Last Sync Time

Question # 23

You are designing an anomaly detection solution for streaming data from an Azure IoT hub. The solution must meet the following requirements:

  • Send the output to Azure Synapse.
  • Identify spikes and dips in time series data.
  • Minimize development and configuration effort.

What should you include in the solution?

A.

Azure Databricks

B.

Azure Stream Analytics

C.

Azure SQL Database
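
Relevant background: Stream Analytics includes built-in machine learning functions for exactly this pattern, which is why it minimizes development and configuration effort. A minimal sketch with assumed input and column names:

    -- Scores each reading for spikes and dips against a sliding
    -- two-hour, 120-event history at 95% confidence.
    SELECT DeviceId, Reading,
           AnomalyDetection_SpikeAndDip(Reading, 95, 120, 'spikesanddips')
               OVER (PARTITION BY DeviceId LIMIT DURATION(hour, 2)) AS SpikeAndDipScores
    FROM IoTHubInput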

Question # 24

You are monitoring an Azure Stream Analytics job by using metrics in Azure.

You discover that during the last 12 hours, the average watermark delay is consistently greater than the configured late arrival tolerance.

What is a possible cause of this behavior?

A.

Events whose application timestamp is earlier than their arrival time by more than five minutes arrive as inputs.

B.

There are errors in the input data.

C.

The late arrival policy causes events to be dropped.

D.

The job lacks the resources to process the volume of incoming data.

Question # 25

You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a table named table1.

You load 5 TB of data into table1.

You need to ensure that column store compression is maximized for table1.

Which statement should you execute?

A.

ALTER INDEX ALL on table1 REBUILD

B.

DBCC DBREINDEX (table1)

C.

DBCC INDEXDEFRAG (pool1, table1)

D.

ALTER INDEX ALL on table1 REORGANIZE
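
For context on the syntax in play here, rebuilding the indexes on a table forces all rows into compressed columnstore rowgroups; a minimal sketch using the table name from the question:

    -- Recompresses every rowgroup, moving any rows still in the delta
    -- store into compressed columnstore segments.
    ALTER INDEX ALL ON table1 REBUILD;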

Question # 26

You are designing a database for an Azure Synapse Analytics dedicated SQL pool to support workloads for detecting ecommerce transaction fraud.

Data will be combined from multiple ecommerce sites and can include sensitive financial information such as credit card numbers.

You need to recommend a solution that meets the following requirements:

  • Users must be able to identify potentially fraudulent transactions.
  • Users must be able to use credit cards as a potential feature in models.
  • Users must NOT be able to access the actual credit card numbers.

What should you include in the recommendation?

A.

Transparent Data Encryption (TDE)

B.

row-level security (RLS)

C.

column-level encryption

D.

Azure Active Directory (Azure AD) pass-through authentication

Question # 27

You have the following Azure Stream Analytics query.

For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each correct selection is worth one point.

Question # 28

A company has a real-time data analysis solution that is hosted on Microsoft Azure. The solution uses Azure Event Hub to ingest data and an Azure Stream Analytics cloud job to analyze the data. The cloud job is configured to use 120 Streaming Units (SU).

You need to optimize performance for the Azure Stream Analytics job.

Which two actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

A.

Implement event ordering.

B.

Implement Azure Stream Analytics user-defined functions (UDF).

C.

Implement query parallelization by partitioning the data output.

D.

Scale the SU count for the job up.

E.

Scale the SU count for the job down.

F.

Implement query parallelization by partitioning the data input.
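
As background on query parallelization, a sketch of a query that aligns its grouping with a partitioned input so work spreads across streaming units; the input name and partition key are assumptions:

    -- With the input partitioned by PartitionId, each partition can be
    -- processed independently and in parallel.
    SELECT PartitionId, COUNT(*) AS EventCount
    FROM EventHubInput PARTITION BY PartitionId
    GROUP BY PartitionId, TumblingWindow(minute, 1)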

Question # 29

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:

  • A workload for data engineers who will use Python and SQL.
  • A workload for jobs that will run notebooks that use Python, Scala, and SQL.
  • A workload that data scientists will use to perform ad hoc analysis in Scala and R.

The enterprise architecture team at your company identifies the following standards for Databricks environments:

  • The data engineers must share a cluster.
  • The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
  • All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.

You need to create the Databricks clusters for the workloads.

Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs.

Does this meet the goal?

A.

Yes

B.

No

Question # 30

You have an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 contains a fact table named Table1. Table1 contains sales data. Sixty-five million rows of data are added to Table1 monthly.

At the end of each month, you need to remove data that is older than 36 months. The solution must minimize how long it takes to remove the data.

How should you partition Table1, and how should you remove the old data? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
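
For background, the fastest way to drop aged data from a partitioned fact table is a partition switch, which is a metadata-only operation; a minimal T-SQL sketch with an assumed staging table and partition number:

    -- Switch the oldest partition into an empty table that has an
    -- identical schema, then truncate the staging table. No rows are
    -- physically moved, so the operation is nearly instantaneous.
    ALTER TABLE dbo.Table1 SWITCH PARTITION 1 TO dbo.Table1_Stage;
    TRUNCATE TABLE dbo.Table1_Stage;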

Question # 31

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data.

You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a tumbling window, and you set the window size to 10 seconds.

Does this meet the goal?

A.

Yes

B.

No
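
For reference, a tumbling window divides the stream into fixed-size, non-overlapping intervals, so each event lands in exactly one window; a minimal sketch with assumed input and timestamp names:

    -- Each tweet is counted once because 10-second tumbling windows
    -- neither overlap nor leave gaps.
    SELECT COUNT(*) AS TweetCount
    FROM TwitterInput TIMESTAMP BY CreatedAt
    GROUP BY TumblingWindow(second, 10)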

Question # 32

You have an Azure Synapse Analytics workspace named WS1 that contains an Apache Spark pool named Pool1.

You plan to create a database named DB1 in Pool1.

You need to ensure that when tables are created in DB1, the tables are available automatically as external tables to the built-in serverless SQL pool.

Which format should you use for the tables in DB1?

A.

Parquet

B.

CSV

C.

ORC

D.

JSON
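
As context, tables that a Spark pool stores in Parquet format are synchronized automatically into the serverless SQL pool's metadata; a minimal Spark SQL sketch with assumed table and column names:

    -- A Parquet-backed table created in a Spark database becomes
    -- queryable from the built-in serverless SQL pool without extra DDL.
    CREATE DATABASE DB1;
    CREATE TABLE DB1.Sales (OrderId INT, Amount DECIMAL(10, 2)) USING PARQUET;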

Question # 33

You are designing a partition strategy for a fact table in an Azure Synapse Analytics dedicated SQL pool. The table has the following specifications:

• Contain sales data for 20,000 products.

• Use hash distribution on a column named ProductID.

• Contain 2.4 billion records for the years 2019 and 2020.

Which number of partition ranges provides optimal compression and performance of the clustered columnstore index?

A.

40

B.

240

C.

400

D.

2,400
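
As a hedged sizing note: a clustered columnstore index compresses best when each partition in each of the pool's 60 distributions holds at least roughly one million rows, so a common way to reason about this question is 2,400,000,000 rows ÷ 60 distributions ÷ 1,000,000 rows per partition = 40 partition ranges.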

Question # 34

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data.

You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a session window that uses a timeout size of 10 seconds.

Does this meet the goal?

A.

Yes

B.

No

Question # 35

You plan to use an Apache Spark pool in Azure Synapse Analytics to load data to an Azure Data Lake Storage Gen2 account.

You need to recommend which file format to use to store the data in the Data Lake Storage account. The solution must meet the following requirements:

• Column names and data types must be defined within the files loaded to the Data Lake Storage account.

• Data must be accessible by using queries from an Azure Synapse Analytics serverless SQL pool.

• Partition elimination must be supported without having to specify a specific partition.

What should you recommend?

A.

Delta Lake

B.

JSON

C.

CSV

D.

ORC

Question # 36

You need to implement an Azure Databricks cluster that automatically connects to Azure Data Lake Storage Gen2 by using Azure Active Directory (Azure AD) integration.

How should you configure the new cluster? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Question # 37

You are deploying a lake database by using an Azure Synapse database template.

You need to add additional tables to the database. The solution must use the same grouping method as the template tables.

Which grouping method should you use?

A.

business area

B.

size

C.

facts and dimensions

D.

partition style

Question # 38

You have an enterprise data warehouse in Azure Synapse Analytics named DW1 on a server named Server1.

You need to determine the size of the transaction log file for each distribution of DW1.

What should you do?

A.

On DW1, execute a query against the sys.database_files dynamic management view.

B.

From Azure Monitor in the Azure portal, execute a query against the logs of DW1.

C.

Execute a query against the logs of DW1 by using the Get-AzOperationalInsightsSearchResult PowerShell cmdlet.

D.

On the master database, execute a query against the sys.dm_pdw_nodes_os_performance_counters dynamic management view.
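
For context, per-distribution performance counters, including transaction log sizes, surface through a PDW dynamic management view queried from the master database; a minimal sketch (the counter name follows standard SQL Server performance-counter naming):

    -- One row per node and database instance; cntr_value reports the
    -- used log size in KB for each distribution.
    SELECT instance_name, cntr_value AS log_used_size_kb
    FROM sys.dm_pdw_nodes_os_performance_counters
    WHERE counter_name = 'Log File(s) Used Size (KB)';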

Question # 39

What should you do to improve high availability of the real-time data processing solution?

A.

Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B.

Deploy a High Concurrency Databricks cluster.

C.

Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D.

Set Data Lake Storage to use geo-redundant storage (GRS).

Question # 40

What should you recommend using to secure sensitive customer contact information?

A.

data labels

B.

column-level security

C.

row-level security

D.

Transparent Data Encryption (TDE)

Question # 41

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

A.

a server-level virtual network rule

B.

a database-level virtual network rule

C.

a database-level firewall IP rule

D.

a server-level firewall IP rule

Question # 42

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.
