Lead4Pass DP-100 dumps are verified and audited by a Microsoft professional team and meet the requirements of the DP-100 certification exam, covering more than 95% of the questions that appear in the exam.
They are offered in the two most popular study formats, DP-100 dumps PDF and DP-100 dumps VCE, and both contain the latest certification exam questions and answers.
The best exam solution is therefore to use the DP-100 dumps in PDF and VCE formats: https://www.leads4pass.com/dp-100.html (362 Q&A), which help you practice easily and achieve exam success.
Part of the Lead4Pass DP-100 dumps exam questions is also available online as a free download: https://drive.google.com/file/d/1hobZlQm6WcjQxrp8-BPRXUMOMOLT1Ma_/
You can also practice some of the Lead4Pass DP-100 dumps exam questions online:
| From | Number of exam questions | Exam name | Exam code |
| Lead4Pass | 15 | Designing and Implementing a Data Science Solution on Azure | DP-100 |
Question 1:
You need to implement a scaling strategy for the local penalty detection data. Which normalization type should you use?
A. Streaming
B. Weight
C. Batch
D. Cosine
Correct Answer: C
Post batch normalization statistics (PBN) is the Microsoft Cognitive Toolkit (CNTK) way of evaluating the population mean and variance of Batch Normalization, which can then be used in inference. In CNTK, custom networks are defined using the BrainScriptNetworkBuilder and described in the CNTK network description language “BrainScript.”
Scenario:
Local penalty detection models must be written by using BrainScript.
References:
https://docs.microsoft.com/en-us/cognitive-toolkit/post-batch-normalization-statistics
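As a rough illustration of what batch normalization computes, the following plain-Python sketch normalizes a mini-batch with its own mean and variance and then reuses those statistics for new samples at inference time. It does not use CNTK or BrainScript; the function names, the sample values, and the epsilon constant are assumptions made for the example.

```python
import math

def batch_stats(batch):
    """Return the mean and (population) variance of a list of numbers."""
    n = len(batch)
    mean = sum(batch) / n
    var = sum((x - mean) ** 2 for x in batch) / n
    return mean, var

def normalize(values, mean, var, eps=1e-5):
    """Shift and scale values using the given mean and variance."""
    return [(x - mean) / math.sqrt(var + eps) for x in values]

# Training: normalize a mini-batch with its own statistics.
batch = [2.0, 4.0, 6.0, 8.0]  # made-up feature values
mean, var = batch_stats(batch)
normalized = normalize(batch, mean, var)

# Inference: reuse statistics estimated from training data
# instead of recomputing them from the incoming samples.
new_samples = [5.0, 9.0]
scored = normalize(new_samples, mean, var)
```

Evaluating the statistics once over the training data and reusing them at inference is the idea that the PBN reference describes for CNTK models.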
Question 2:
You need to implement a feature engineering strategy for the crowd sentiment local models. What should you do?
A. Apply an analysis of variance (ANOVA).
B. Apply a Pearson correlation coefficient.
C. Apply a Spearman correlation coefficient.
D. Apply a linear discriminant analysis.
Correct Answer: D
The linear discriminant analysis method works only on continuous variables, not categorical or ordinal variables.
Linear discriminant analysis is similar to analysis of variance (ANOVA) in that it works by comparing the means of the variables.
Scenario:
Data scientists must build notebooks in a local environment using automatic feature engineering and model building in machine learning pipelines.
Experiments for local crowd sentiment models must combine local penalty detection data.
All shared features for local models are continuous variables.
Incorrect Answers:
B: The Pearson correlation coefficient, sometimes called Pearson’s R test, is a statistical value that measures the linear relationship between two variables. By examining the coefficient values, you can infer something about the strength of the relationship between the two variables, and whether they are positively correlated or negatively correlated.
C: Spearman’s correlation coefficient is designed for use with non-parametric and non-normally distributed data. Spearman’s coefficient is a nonparametric measure of statistical dependence between two variables and is sometimes denoted by the Greek letter rho. Spearman’s coefficient expresses the degree to which two variables are monotonically related. It is also called Spearman rank correlation because it can be used with ordinal variables.
Question 3:
You need to implement a model development strategy to determine a user’s tendency to respond to an ad. Which technique should you use?
A. Use a Relative Expression Split module to partition the data based on centroid distance.
B. Use a Relative Expression Split module to partition the data based on the distance traveled to the event.
C. Use a Split Rows module to partition the data based on the distance traveled to the event.
D. Use a Split Rows module to partition the data based on centroid distance.
Correct Answer: A
Split Data partitions the rows of a dataset into two distinct sets.
The Relative Expression Split option in the Split Data module of Azure Machine Learning Studio is helpful when you need to divide a dataset into training and testing datasets using a numerical expression.
Relative Expression Split: Use this option whenever you want to apply a condition to a number column. The number could be a date/time field, a column containing age or dollar amounts, or even a percentage. For example, you might want to divide your data set depending on the cost of the items, group people by age ranges, or separate data by calendar date.
Scenario:
Local market segmentation models will be applied before determining a user’s propensity to respond to an advertisement.
The distribution of features across training and production data is not consistent.
References:
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/split-data
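The idea behind a relative expression split, partitioning rows by a condition on one numeric column, can be sketched in plain Python. This is not the Azure Machine Learning Studio module itself; the column name `centroid_distance`, the sample rows, and the 1.0 cut-off are hypothetical values chosen for illustration.

```python
def split_by_condition(rows, column, predicate):
    """Partition rows into (matching, remaining) by a test on one numeric column."""
    matching, remaining = [], []
    for row in rows:
        (matching if predicate(row[column]) else remaining).append(row)
    return matching, remaining

# Made-up rows; "centroid_distance" is a hypothetical column name.
rows = [
    {"user": "a", "centroid_distance": 0.4},
    {"user": "b", "centroid_distance": 1.7},
    {"user": "c", "centroid_distance": 0.9},
    {"user": "d", "centroid_distance": 2.3},
]

# Rows whose distance is below the cut-off go to the first set.
near, far = split_by_condition(rows, "centroid_distance", lambda d: d < 1.0)
```

Every row lands in exactly one of the two sets, which matches the description of Split Data producing two distinct partitions.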
Question 4:
You need to resolve the local machine learning pipeline performance issue. What should you do?
A. Increase Graphic Processing Units (GPUs).
B. Increase the learning rate.
C. Increase the training iterations.
D. Increase Central Processing Units (CPUs).
Correct Answer: A
Question 5:
You need to select an environment that will meet the business and data requirements. Which environment should you use?
A. Azure HDInsight with Spark MLlib
B. Azure Cognitive Services
C. Azure Machine Learning Studio
D. Microsoft Machine Learning Server
Correct Answer: D
Question 6:
You need to implement a new cost factor scenario for the ad response models as illustrated in the performance curve exhibit. Which technique should you use?
A. Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.
B. Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.
C. Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.
D. Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.
Correct Answer: A
Scenario: Performance curves of current and proposed cost factor scenarios are shown in the following diagram:
The ad propensity model uses a cut threshold of 0.45, and retraining occurs if weighted Kappa deviates from 0.1 by +/- 5%.
Question 7:
You create a script that trains a convolutional neural network model over multiple epochs and logs the validation loss after each epoch. The script includes arguments for batch size and learning rate.
You identify a set of batch size and learning rate values that you want to try.
You need to use Azure Machine Learning to find the combination of batch size and learning rate that results in the model with the lowest validation loss.
What should you do?
A. Run the script in an experiment based on an AutoMLConfig object
B. Create a PythonScriptStep object for the script and run it in a pipeline
C. Use the Automated Machine Learning interface in Azure Machine Learning studio
D. Run the script in an experiment based on a ScriptRunConfig object
E. Run the script in an experiment based on a HyperDriveConfig object
Correct Answer: E
Reference: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-tune-hyperparameters
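Conceptually, a hyperparameter sweep runs the training script once per combination of values and compares the logged metric. The sketch below mimics that loop locally in plain Python; it does not call the Azure Machine Learning SDK, and `validation_loss` is a made-up stand-in for a real training run, as are the candidate values.

```python
from itertools import product

def validation_loss(batch_size, learning_rate):
    """Stand-in for a real training run; returns a made-up validation loss."""
    return abs(batch_size - 64) / 64 + abs(learning_rate - 0.01) * 10

# Candidate values to try (assumed for the example).
batch_sizes = [32, 64, 128]
learning_rates = [0.001, 0.01, 0.1]

# Try every combination and record the resulting loss.
results = {
    (bs, lr): validation_loss(bs, lr)
    for bs, lr in product(batch_sizes, learning_rates)
}

# Keep the combination with the lowest validation loss.
best_params = min(results, key=results.get)
```

In Azure Machine Learning, the HyperDriveConfig-based experiment plays the role of this loop, launching the runs on compute targets and tracking the primary metric for you.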
Question 8:
You need to select a feature extraction method. Which method should you use?
A. Mutual information
B. Mood’s median test
C. Kendall correlation
D. Permutation Feature Importance
Correct Answer: C
In statistics, the Kendall rank correlation coefficient, commonly referred to as Kendall’s tau coefficient (after the Greek letter τ), is a statistic used to measure the ordinal association between two measured quantities. It is a supported method of the Azure Machine Learning Feature selection.
Scenario: When you train a Linear Regression module using a property dataset that shows data for property prices for a large city, you need to determine the best features to use in a model. You can choose standard metrics provided to measure performance before and after the feature importance process completes. You must ensure that the distribution of the features across multiple training models is consistent.
References: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/feature-selection-modules
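To make the definition concrete, here is a minimal plain-Python sketch of Kendall's tau in its simplest form (tau-a), which counts concordant and discordant pairs. The property sizes and prices are made-up numbers, and this is not the Azure Machine Learning implementation.

```python
from itertools import combinations

def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Made-up property data: size and price mostly rise together.
size = [50, 80, 120, 200]
price = [100, 150, 140, 300]
tau = kendall_tau(size, price)
```

A value near 1 means the two columns rank the rows in almost the same order, and a value near -1 means the orders are nearly reversed.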
Question 9:
You need to select a feature extraction method. Which method should you use?
A. Mutual information
B. Pearson’s correlation
C. Spearman correlation
D. Fisher Linear Discriminant Analysis
Correct Answer: C
Spearman’s rank correlation coefficient assesses how well the relationship between two variables can be described using a monotonic function.
Note: Both Spearman’s and Kendall’s can be formulated as special cases of a more general correlation coefficient, and they are both appropriate in this scenario.
Scenario: The MedianValue and AvgRoomsInHouse columns both hold data in numeric format. You need to select a feature selection algorithm to analyze the relationship between the two columns in more detail.
Incorrect Answers:
B: The Spearman correlation between two variables is equal to the Pearson correlation between the rank values of those two variables; while Pearson’s correlation assesses linear relationships, Spearman’s correlation assesses monotonic relationships (whether linear or not).
References: https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/feature-selection-modules
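The relationship between the two coefficients can be checked with a short plain-Python sketch that computes Spearman's correlation as the Pearson correlation of the ranks. The values below are made up (they only borrow the AvgRoomsInHouse and MedianValue column names from the scenario), and the ranking helper assumes there are no ties.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Rank values from 1 to n; assumes no ties for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman(xs, ys):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Made-up data: monotonic but not linear.
avg_rooms_in_house = [1, 2, 3, 4, 5]
median_value = [1, 8, 27, 64, 125]

r_pearson = pearson(avg_rooms_in_house, median_value)
r_spearman = spearman(avg_rooms_in_house, median_value)
```

Because the relationship is monotonic but curved, Spearman's coefficient comes out at essentially 1 while Pearson's stays below 1.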
Question 10:
You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed. Which three Azure Machine Learning Studio modules should you use? Each correct answer presents part of the solution. NOTE: Each correct selection is worth one point.
A. Create Scatterplot
B. Summarize Data
C. Clip Values
D. Replace Discrete Values
E. Build Counting Transform
Correct Answer: ABC
B: To get a global view, use the Summarize Data module. Add the module and connect it to the data set that needs to be visualized.
A: One way to quickly identify outliers visually is to create scatter plots.
C: The easiest way to treat the outliers in Azure ML is to use the Clip Values module. It can identify and optionally replace data values that are above or below a specified threshold.
You can use the Clip Values module in Azure Machine Learning Studio, to identify and optionally replace data values that are above or below a specified threshold. This is useful when you want to remove outliers or replace them with a mean, a constant, or other substitute value.
References: https://blogs.msdn.microsoft.com/azuredev/2017/05/27/data-cleansing-tools-in-azure-machine-learning/
https://docs.microsoft.com/en-us/azure/machine-learning/studio-module-reference/clip-values
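The quantify-then-clip workflow can be sketched in plain Python. This is only an illustration of the idea, not the Clip Values module; the ages and the 0 to 120 thresholds are assumptions chosen for the example.

```python
def count_outliers(values, lower, upper):
    """Count values that fall outside the [lower, upper] range."""
    return sum(1 for v in values if v < lower or v > upper)

def clip_values(values, lower, upper):
    """Replace values outside [lower, upper] with the nearest threshold."""
    return [min(max(v, lower), upper) for v in values]

# Made-up Age column with two implausible entries.
ages = [23, 35, 41, 29, 250, 38, -4, 52]

# Quantify the outliers first, then clip them.
n_outliers = count_outliers(ages, lower=0, upper=120)
clipped = clip_values(ages, lower=0, upper=120)
```

Counting before clipping keeps a record of how many values were affected, which is the "quantify the outliers before the outliers are removed" requirement in the question.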
Question 11:
You are developing a hands-on workshop to introduce Docker for Windows to attendees.
You need to ensure that workshop attendees can install Docker on their devices.
Which two prerequisite components should attendees install on the devices? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.
A. Microsoft Hardware-Assisted Virtualization Detection Tool
B. Kitematic
C. BIOS-enabled virtualization
D. VirtualBox
E. Windows 10 64-bit Professional
Correct Answer: CE
C: Make sure your Windows system supports Hardware Virtualization Technology and that virtualization is enabled. Ensure that hardware virtualization support is turned on in the BIOS settings.
E: To run Docker, your machine must have a 64-bit operating system running Windows 7 or higher.
References:
https://docs.docker.com/toolbox/toolbox_install_windows/
Question 12:
Your team is building a data engineering and data science development environment. The environment must support the following requirements:
1. support Python and Scala
2. compose data storage, movement, and processing services into automated data pipelines
3. the same tool should be used for the orchestration of both data engineering and data science
4. support workload isolation and interactive workloads
5. enable scaling across a cluster of machines
You need to create the environment.
What should you do?
A. Build the environment in Apache Hive for HDInsight and use Azure Data Factory for orchestration.
B. Build the environment in Azure Databricks and use Azure Data Factory for orchestration.
C. Build the environment in Apache Spark for HDInsight and use Azure Container Instances for orchestration.
D. Build the environment in Azure Databricks and use Azure Container Instances for orchestration.
Correct Answer: B
In Azure Databricks, we can create two different types of clusters.
1. Standard: these are the default clusters and can be used with Python, R, Scala, and SQL
2. High-concurrency
Azure Databricks is fully integrated with Azure Data Factory.
Incorrect Answers:
D: Azure Container Instances is good for development or testing. Not suitable for production workloads.
Question 13:
You plan to build a team data science environment. Data for training models in machine learning pipelines will be over 20 GB in size. You have the following requirements:
1. Models must be built using Caffe2 or Chainer frameworks.
2. Data scientists must be able to use a data science environment to build machine learning pipelines and train models on their personal devices in both connected and disconnected network environments.
Personal devices must support updating machine learning pipelines when connected to a network.
You need to select a data science environment.
Which environment should you use?
A. Azure Machine Learning Service
B. Azure Machine Learning Studio
C. Azure Databricks
D. Azure Kubernetes Service (AKS)
Correct Answer: A
The Data Science Virtual Machine (DSVM) is a customized VM image on Microsoft’s Azure cloud built specifically for doing data science. Caffe2 and Chainer are supported by DSVM. DSVM integrates with Azure Machine Learning.
Incorrect Answers:
B: Use Machine Learning Studio when you want to experiment with machine learning models quickly and easily, and the built-in machine learning algorithms are sufficient for your solutions.
References: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview
Question 14:
You are implementing a machine learning model to predict stock prices.
The model uses a PostgreSQL database and requires GPU processing.
You need to create a virtual machine that is pre-configured with the required tools.
What should you do?
A. Create a Data Science Virtual Machine (DSVM) Windows edition.
B. Create a Geo AI Data Science Virtual Machine (Geo-DSVM) Windows edition.
C. Create a Deep Learning Virtual Machine (DLVM) Linux edition.
D. Create a Deep Learning Virtual Machine (DLVM) Windows edition.
Correct Answer: A
In the DSVM, your training models can use deep learning algorithms on hardware that’s based on graphics processing units (GPUs).
PostgreSQL is available for the following operating systems: Linux (all recent distributions), macOS (64-bit installers available for OS X version 10.6 and newer), and Windows (installers available for 64-bit versions; tested on the latest versions and back to Windows 2012 R2).
Incorrect Answers:
B: The Azure Geo AI Data Science VM (Geo-DSVM) delivers geospatial analytics capabilities from Microsoft’s Data Science VM. Specifically, this VM extends the AI and data science toolkits in the Data Science VM by adding ESRI’s market-leading ArcGIS Pro Geographic Information System.
C, D: DLVM is a template on top of the DSVM image. The packages, GPU drivers, and so on are all present in the DSVM image. The DLVM exists mostly for convenience during creation, because a DLVM can only be created on GPU VM instances on Azure.
Reference: https://docs.microsoft.com/en-us/azure/machine-learning/data-science-virtual-machine/overview
Question 15:
You are developing deep learning models to analyze semi-structured, unstructured, and structured data types. You have the following data available for model building:
1. Video recordings of sporting events
2. Transcripts of radio commentary about events
3. Logs from related social media feeds captured during sporting events
You need to select an environment for creating the model. Which environment should you use?
A. Azure Cognitive Services
B. Azure Data Lake Analytics
C. Azure HDInsight with Spark MLlib
D. Azure Machine Learning Studio
Correct Answer: A
Azure Cognitive Services expand on Microsoft’s evolving portfolio of machine learning APIs and enable developers to easily add cognitive features, such as emotion and video detection; facial, speech, and vision recognition; and speech and language understanding, into their applications. The goal of Azure Cognitive Services is to help developers create applications that can see, hear, speak, understand, and even begin to reason. The catalog of services within Azure Cognitive Services can be categorized into five main pillars: Vision, Speech, Language, Search, and Knowledge.
References: https://docs.microsoft.com/en-us/azure/cognitive-services/welcome
Lead4Pass shares two DP-100 study materials for free: a set of questions you can download and a set you can practice online.
Download the full DP-100 practice solution, the Lead4Pass DP-100 dumps in PDF and VCE formats: https://www.leads4pass.com/dp-100.html. It contains the 362 latest exam questions and answers to help you pass the exam.